Theory In_Out_Parameters_Ex

(*
 * Copyright (c) 2023 Apple Inc. All rights reserved.
 *
 * SPDX-License-Identifier: BSD-2-Clause
 *)

section "In-Out Parameters, Abstracting Pointers to Values"

theory In_Out_Parameters_Ex imports "AutoCorres" 
begin

consts abs_step:: "word32  word32  bool"

subsection ‹Overview›

text ‹We introduce a new phase in AutoCorres to replace pointer parameters by ‹in/out parameters›:
Instead of passing a pointer into a function we pass the initial value of the pointer 
(by dereferencing the pointer) into the function and return the value at the end of the function
as additional output. E.g. a function ‹void inc(unsigned *n)› becomes ‹unsigned inc(unsigned n)›.
The initial motivation for this phase was to get rid of explicit pointers to local variables 
and replace them by ordinary values instead.
This conversion has two main aspects:
 Function signatures may change: pointer parameters become value parameters and
  the function may return the value of the pointer at the end as an additonal
  return value (tupled with the ordinary return value). Functions with side effects on the heap
  may become pure functions on values by this transformation.
 As a result of the first transformation, pointers to local variables may be eliminated 
  and be replaced by bound variables. This means that conststack_heap_state.with_fresh_stack_ptr 
  disappears.

The new phase is called ‹IO› and is placed between L2 and HL. This means local variables are
already represented as bound variables in L2 and we still operate on a monolithic heap.

›

subsection ‹Building Blocks›

text ‹
What ingredients do we need for the abstraction relation in the refinement proof?
 Heaps: As we eliminate pointers, the original function manipulates the heap more 
  often then the resulting function. So we need a notion to relate the heap states and keep track
  of the relevant pointers. As the stack of local variable pointers is also modelled in the heap
  the stack free locations are also of interest. The heaps of the original and the resulting
  program may differ in these locations. The heap value of a stack free location should be 
  irrelevant for the program.

 Function signatures: The function signature changes, pointer parameters become ordinary values,
  the return value is tupled with output parameters.

In the following we refer to the original state (containing the heap) as terms and the
resulting / abstract state as termt.
As I first idea we want to relate states terms and termt such that the heaps are the
same except for some pointers we want to eliminate and the stack free locations. It would be natural
to describe this as a relation. However, it turns out that dealing with relations within the
refinement proof are hard to make admissible in order to support recursive functions. Having a
abstraction / lifting function (instead of a relation) from terms to termt makes admissibility
straightforward.

As a prerequisite we encourage the reader to consider the documentation about
pointers to local variables: 🗏‹pointers_to_locals.thy›

context heap_state
begin
subsubsection constframe

text ‹We want to express that certain portions of the heap (typing and values) are 'irrelevant',
in particular regarding (free) stack space. The notion of 'irrelevant' is a bit vague, it means
that the behaviour of the resulting program does not depend on those locations and also that it does not 
modify those locations. Moreover, we prefer an abstraction function rather than an 
relation between states to avoid admissibility issues for refinement.
The central property is termt = frame A t0 s, for termA being a set of addresses and
termt, termt0 and terms being states. Think of termA as the set of ‹allocated› addresses
containing those pointers we want to ‹abstract› aka. eliminate.›

thm frame_def

text ‹
The standard use case we have in mind is @{term "A  stack_free (htd s) = {}"}, hence
@{prop "A - stack_free (htd s) = A"}, but nevertheless the intuition is:
 stack free typing from terms is preserved, the framed sate has at least as many stack free
  addresses as the original one. So we can simulate any stack allocation.
 heap values for stack free and termA are taken from reference state termt0, this
  captures that we do not depend on the original values in terms for those addresses.
 typing for allocations in termA is taken from reference state termt0, this captures
  that we do not depend on the original typing in terms for those addresses.

By taking the same reference state termt0 to frame two states terms and terms',
we can express that the 'irrelevant' parts of the heap did not change in the respective
framed states termframe A t0 s and termframe A t0 s'.
›

subsubsection constrel_alloc, constrel_stack, constrel_sum_stack

text ‹The core invariant between the states that is maintained by the refinement is
termrel_alloc 𝒮 M A t0 s t:
 term𝒮 is a static set of addresses containing the stack. Stack allocation and stack release
  are contained within term𝒮.
 termM is the set of addresses that might be modified by the original program.
 termA is the set of addresses that are eliminated (abstracted) in the resulting program. Another 
  good intuition about the set termA is that the original function might read from and depend on 
  those addresses but the resulting function does not. Moreover, the resulting function does not
  change any of heap locations in termA. The resulting function is agnostic to those locations. 
  It neither reads nor modifies them.
  For any heap valuation of termA the abstract function simulates the original one.
 termt0 is the reference state explained above.
 terms is the state of the original program
 termt is the state of the resulting program.
›

thm rel_alloc_def [of 𝒮 M A t0]

text ‹
The properties of constrel_alloc are:
 termt = frame A t0 s, the resulting heap in termt is the same as the original in terms, 
  except for the addresses in termA and the stack free space.
 termstack_free (htd s)  𝒮: The stack free space is contained in term𝒮.
 termstack_free (htd s)  A = {}: We only want to eliminate properly allocated pointers, which
  are not contained in the stack free space.  
  termstack_free (htd s)  M = {}: We only mention properly allocated pointers to be modified. 
  Modifications within the stack free space are irrelevant, as they are abstracted by the framing
  anyways.
›
text ‹
For an original function termf and the abstract aka. lifted function termg the refinement
property has the following shape:
proprel_alloc 𝒮 M A t0 s t  refines f g s t (rel_stack 𝒮 M1 A s t0 (rel_xval_stack L R))

Note that the output relation of constrefines relates both the resulting states and return
values to each other. The return values of the error monad are related by constrel_sum_stack, taking a
relation termL for error values (termInl) and the termR for normal termination (termInr).

›
thm rel_sum_stack_def
thm rel_stack_def [of 𝒮 M1 A s t0 "rel_sum_stack L R"]

text ‹Consider the output states terms' and termt' and the output values termv and termw 
for the original and the abstracted function. Then we have:
 termrel_xval_stack L R (hmem s') v w: The result values are related, (see below).
 termrel_alloc 𝒮 M1 A t0 s' t': In particular termt' = frame A t0 s'. Note that we use
   the same termA and the same reference state termt0 as for the initial states. So this means
   that the abstract program termg does not change the heap values and typing in termA and does
   not depend on the values. 
 termequal_upto (M1  stack_free (htd s')) (hmem s') (hmem s): The original program only modifies
   pointers contained in termM1 or in the stack free portions of the heap 
    (e.g. by nested function calls).
 termequal_upto M1 (htd s') (htd s): Changes to the heap typing of the original program 
   are confined within set termM1.
 termequal_on 𝒮 (htd s') (htd s): The stack typing of the original program remains unchanged.
  Note that it might change temporarily by nested function calls but the stacking discipline
  restores the state.
By construction we have termM1  M.
›

paragraph ‹Stacking pointers: constrel_singleton_stack, constrel_push

text ‹
The relation on the output value captures the idea that portions of the heap are stacked / tupled 
into result values. On the boundaries of a function the relation on error values termL is always the
trivial termλ_. (=). Errors are terminal as there is no exception handling in C. Within
the boundaries of a function there can be more complex relations reflecting the nesting of
break / continue / return and goto. 
For ordinary values the values of pointers are tupled by constrel_singleton_stack, in case 
the original function returned no result (void), or nested by  constrel_push in case the
original function returned some result.
›
thm rel_singleton_stack_def
thm rel_push_def
text ‹
 A typical relation is termR = rel_singleton_stack p for some pointer termp. This means
that the abstract value can be obtained by looking into the heap at pointer termp:
termrel_singleton_stack p h () v means that termv = h_val h p. The heap of the original program
 is propagated down the into the relations.
Similar termrel_push p (λ_. (=)) h x (v, x) relates the result value termx to the tuple
term(v, x) where termv = h_val h p. Note that termrel_push can be nested represent multiple
output parameters.
Depending on the role of the pointers there are additional assumptions about the pointers, 
in particular things like termptr_span p  A or  termptr_span p  M and disjointness properties 
like termdistinct_sets [ptr_span p, ptr_span q].
›

subsubsection constIOcorres

text ‹
There is an additional predicate constIOcorres which represents the result of the refinement
proof in the canonical autocorres ideom that is used within the other phases. This form
is what is expected when constructing the final theorem combining all autocorres layers
@{thm ac_corres_chain_sims}. The termrefines version and the termIOcorres version can
be converted into each other by @{thm ac_corres_chain_sims}.
›
thm IOcorres_def
thm IOcorres_refines_conv
thm ac_corres_chain_sims
end

install_C_file "in_out_parameters.c"


subsection ‹Options›

text ‹
To control the in/out- parameter abstraction there are two options of autocorres:
 skip_io_abs› a boolean which is true› by default. This skips the complete phase for all 
  functions.
 in_out_parameters› which takes a list of in-out specifications for functions. As soon
  as there is at least one specification present the phase is enabled (taking
  precedence over skip_io_abs›. 

So to properly enable the IO phase you have two main choices to specify what should be done
either in @{command "init-autocorres"} or subsequent calls to @{command autocorres}. You can
only disable option skip_io_abs› in @{command "init-autocorres"} and then provide the
individual options for in_out_parameters› in subsequent @{command autocorres} invocations, or
you can already define all in_out_parameters› already in @{command "init-autocorres"}.

More details on the options are provided in the following examples.
›

subsection ‹Examples›


init-autocorres [ in_out_parameters = 
    compare(cmp:out) and
    inc(y:in_out) and
    inc_int(y:in_out) and
    dec(y:in_out) and
    swap(x:in_out, y:in_out) and
    swap_pair(x:in_out, y:in_out) and
    call_inc_ptr(p: in_out) and
    inc_twice(y:in_out) and
    inc2(y:in_out, z:in_out) and 
    heap_inc2_part(z:in_out) and
    heap_inc2_part_swap(y:in_out) and
    safe_add(result: in_out) and
    inc_pair(p:in_out) and
    get_arr_idx(arr:"in") and
    get_pair_arr_idx_second(arr:"in") and
    inc_pair(p:in_out) and
    xyz(x:in_out) and
    resab(res:in_out) and
    abcmp(cmpRst:in_out) and
    out(out:out) and
    out2(out:in_out) and
    out_seq(out:out) and
    inc_loop (x:"in", y:in_out) and
    call_inc_int_other(m: in_out) and
    inc_int_no_exit(y:in_out) and
    call_inc_int_other_mixed(m: in_out) and
    inc_int2(y:in_out, z:in_out) and
    call_inc_int2(n:in_out) and
    call_inc_int_pair(p:in_out) and
    call_inc_int_array(p:in_out) and
    call_compare_ptr(m:"out") and
    mixed_global_local_inc(q:"in_out") and
    g1(out:"in_out") and
    set_void(p:"data") and
    set_void2(p:"data") and
    set_byte(p:"data") and
    read_char(p:in_out, len: in_out) and
    goto_read_char1(p:in_out) and
    goto_read_char5(elem: "in"),
    in_out_globals =
      keep_inc_global_array
      inc_global_array
      keep_inc_global_array2
      inc_global_array2
      call_keep_inc_global_array
      call_inc_global_array
      mixed_global_local_inc shuffle global_array_update
      read_char call_read_char call_read_char_loop 
      goto_read_char1 goto_read_char2 goto_read_char3 goto_read_char4 goto_read_char5,
    ignore_addressable_fields_error, (* FIXME *)
    addressable_fields = 
      pair.first 
      pair.second 
      array.elements
      int_pair.int_first
      int_pair.int_second
     (* int_array.int_elements*)
(*
FIXME: open_types seems to have an issue that
lemma "typ_uinfo_t TYPE(32 signed word[3]) = typ_uinfo_t (TYPE(32 word[3])))"

*)
   ]"in_out_parameters.c"


text ‹In option in_out_parameters› you provide a parameter specification for the functions
you want to abstract. 
The parameter specs following the parameter name are
 in›: pointer is only used as input to the function. The referenced value does not change
  during the function call.
 out›: the pointer is only used to output a result from the function. The referenced value at
   the beginning is irrelevant, the abstracted function should return the referenced value at
   the end of the original function.
 in_out› the pointer is used to input a value and to return a result.

When no specification is given for a function, the list is considered empty. This means that
the signature of the function shall not be changed by the abstraction. Pointer parameters in 
the original function stay pointer parameters in the resulting function.
Note that this has a quite different effect from just skipping the phase, as autocorres is still attempting
to do a proper refinement proof. More details are provided by the examples.
›

locale keep_globals = in_out_parameters_global_addresses +
  assumes keep_global_array: "{ptr_val global_array ..+128}  𝒢"
begin
lemma global_array_contained1:
  assumes i_bound: "i < 12" 
  shows "ptr_span (global_array +p (int i))  𝒢"
proof -
  have "ptr_span (global_array +p (int i))  {ptr_val global_array ..+128}"
    using i_bound
    apply (clarsimp simp add: ptr_add_def intvl_def)
    subgoal for k
    apply (rule exI [where x= "i * 4 + k"])
      by force
    done
  with keep_global_array show ?thesis by blast
qed

lemma gobal_array_contained2[polish]:
  assumes i_bound: "numeral i < (12::nat)"
  shows "ptr_span (global_array +p (numeral i))  𝒢"
  using global_array_contained1
  by (metis i_bound of_nat_numeral)

end

autocorres  [
    (* base_locale=keep_globals *) (* experimental *)
    ts_force nondet = shuffle] "in_out_parameters.c"

context in_out_parameters_all_corres
begin

thm io_corres
thm ts_def
term ptr_valid
subsubsection constinc'

text ‹

The function ‹inc› increments the value which is passed as a pointer ‹y›.
‹
void inc (unsigned* y) {
  *y = *y + 1;
}
›

When we abstract this function to have an in-out parameter instead, we obtain a function like:

‹
void inc (unsigned y) {
  y = y + 1;
  return y;
}
›
Note that this is now a pure function, that does no longer depend on the heap. This is
also the final output of autocorres in @{thm inc'_def}:
›
thm inc'_def

text ‹
Let us zoom into the refinement step from L2 to IO.
›
thm l2_inc'_def
thm io_inc'_def
text ‹
Note that the type of parameter termy changes from a pointer type to a value type. Here
is the generated refinement theorem.
›
thm io_inc'_corres 
lemma "c_guard y 
  distinct_sets [ptr_span y] 
  ptr_span y  A 
  ptr_span y  M 
  globals.rel_alloc 𝒮 M A t0 s t 
  refines 
    (l2_inc' y) 
    (io_inc' (h_val (hrs_mem (t_hrs_' s)) y)) s t 
    (globals.rel_stack 𝒮 (ptr_span y) A s t0 (rel_xval_stack (rel_exit (λ_ _ _. False)) (rel_singleton_stack y)))"
  by (rule io_inc'_corres)

text ‹
Parameter y› was specified as in_out›. So function constio_inc' becomes the actual value
of the pointer passed in: term(h_val (hrs_mem (t_hrs_' s)) y). To discharge the termc_guard y
guards within the body of constl2_inc' we need the assumption that the guard holds at the beginning.
As we attempt to abstract pointer y› we have its pointer span included in set termA. Moreover,
the content of pointer y› is modified by constl2_inc' hence it appears in termM as well
as explicit in the final relation termglobals.rel_stack. 
Pointer parameters are assumed to be distinct. As we have only one parameter the assumption trivially
termdistinct_sets [ptr_span y] holds. 
Note that set termA is only constrained by termptr_span y  A. So we could instantiate it
with the universal set termUNIV::addr set. Recalling the meaning of termA this means that
the resulting function termio_inc' is a pure function, in the sense that it does not depend
on the heap anymore. The relation for the termination with ‹exit› is termrel_exit (λ_ _ _. False), 
which expresses
that this path is unreachable for the current function. The function will never terminate with
‹exit›.
›

subsubsection constcall_inc_parameter'

text ‹
Next we look at a function that calls our previously defined integer function on a pointer
to its parameter.
‹
unsigned call_inc_parameter(unsigned n) {
  inc(&n);
  return(n);
}
›
After autocorres the function is just a mere wrapper to the increment function
›
thm call_inc_parameter'_def

text ‹Here are the relevant definitions and the IO refinement theorem.›
thm l2_call_inc_parameter'_def
thm io_call_inc_parameter'_def
thm io_call_inc_parameter'_corres [no_vars]

thm io_call_inc_parameter'_corres
lemma "distinct_sets ([]::addr set list) 
  globals.rel_alloc 𝒮 M A t0 s t 
  refines 
     (l2_call_inc_parameter' n) 
     (io_call_inc_parameter' n) s t
     (globals.rel_stack 𝒮 {} A s t0 (rel_xval_stack (rel_exit (λ_ _ _. False)) (λ_. (=))))"
  by (rule io_call_inc_parameter'_corres)

text ‹Note that here we do not have any assumption on the set termA and the modified variables
 in the final state are empty. This is because we eliminate a pointer to a temporary stack variable that
 is out of scope after the function and in the beginning of the function it is part of the termstack_free 
 space. 

›


subsubsection ‹constkeep_inc'›

text ‹Now let us consider a copy of the increment function were we do not convert to in out parameters
but leave it as it is. Unsurprisingly, the resulting function still operates on pointers.›
thm keep_inc'_def

text ‹Let us look into the IO corres phase.›
thm l2_keep_inc'_def
thm io_keep_inc'_def
thm io_keep_inc'_corres

lemma "distinct_sets [ptr_span y] 
  ptr_span y  A = {} 
  ptr_span y  stack_free (hrs_htd (t_hrs_' s)) = {}  ― ‹FIXME: redundant assumption?, follows from the next ones›
  globals.rel_alloc 𝒮 M A t0 s t 
  ptr_span y  M 
  refines 
    (l2_keep_inc' y) 
    (io_keep_inc' y) s t
    (globals.rel_stack 𝒮 (ptr_span y) A s t0 (rel_xval_stack (rel_exit (λ_ _ _. False)) (λ_. (=))))"
  by (rule io_keep_inc'_corres) 
text ‹As pointer y› is an allocated pointer that we do not abstract it should not belong to
termA and must not collide with termstack_free space. As we might modify the pointer it
is contained in termM as well as the final modifies set.

Note that the bodies of constio_keep_inc' and constl2_keep_inc' are equivalent but the
refinement statement is not a trivial statement and in particular is not the same as when
skipping the IO corres phase. For example from the refinement statement we can derive that
any heap location different from termptr_span y is unchanged. 

Next let's look what happens when we call this function on a local parameter.
›

subsubsection constkeep_call_inc_parameter'

thm keep_call_inc_parameter'_def
thm io_keep_call_inc_parameter'_def
thm l2_keep_call_inc_parameter'_def
thm io_keep_call_inc_parameter'_corres [no_vars]

lemma "distinct_sets ([]::addr set list) 
  globals.rel_alloc 𝒮 M A t0 s t 
  refines 
    (l2_keep_call_inc_parameter' n) 
    (io_keep_call_inc_parameter' n) s t
    (globals.rel_stack 𝒮 {} A s t0 (rel_xval_stack (rel_exit (λ_ _ _. False)) (λ_. (=))))"
  by (rule io_keep_call_inc_parameter'_corres)

text ‹The local variable pointer is still present in the final function. Still the refinement
statement indicates that we have a `pure' functions, as we do not have constraints on termA
and also do not modify anything. The notion of pure we use here, is that the function does
not depend on proper heap pointers. The termstack_free space that we temporarily use is
excluded by our definitions of constglobals.frame, constglobals.rel_alloc and constglobals.rel_stack.
›

subsubsection constsafe_add'

text ‹This function is an example of an arithmetic function that checks for overflows. The
status of the check is the return value and the result of the operation is passed via pointer.
We make an in-out parameter for the result. So the final function returns a tuple of the
result value and the status (constrel_push in the refinement statement).
›
thm safe_add'_def
thm l2_safe_add'_def
thm io_safe_add'_def
thm io_safe_add'_corres


subsubsection ‹Recursion›

thm l2_fac'.simps
thm io_fac'.simps
thm io_fac'_corres

thm l2_even'.simps l2_odd'.simps
thm io_even'.simps io_odd'.simps
thm io_even'_corres io_odd'_corres

subsubsection constswap'

text ‹We abstract a function that swaps the values referenced by two input pointers by a function that
takes the values and returns the swapped values (as a pair). Note that in the
refinement statement we assume disjointness of the input pointers. The swap function would still 
be correct if the pointers are equal. To generalise the refinement proofs to `disjoint or equal' 
inputs is left as future work. ›

thm swap'_def
thm l2_swap'_def
thm io_swap'_def
thm io_swap'_corres

subsubsection ‹Out parameters›

text ‹The function compare› has an mere 'out' parameter. Until phase L2 the function has three parameters,
after phase IO the function only has two parameters.›

thm l2_compare'_def
thm io_compare'_def
thm compare'_def

thm l2_call_compare'_def
thm io_call_compare'_def
thm call_compare'_def


subsubsection ‹Pointers to compound types (structs / arrays)›

text ‹Here are some examples of function that work on (parts) of a compound type.›


thm get_arr_idx'_def
thm l2_get_arr_idx'_def
thm io_get_arr_idx'_def

thm inc_pair'_def
thm l2_inc_pair'_def
thm io_inc_pair'_def

subsubsection ‹Misc›

thm wa_def
thm ts_def
thm io_corres

subsection ‹Handling ‹exit› in function calls›

text ‹When we call a function that was refined by in-out-parameters there are various cases
we have to consider, e.g. do we pass in a heap pointer or a local pointer, do we eliminate the
local pointer or keep it? 
In general when we pass in a pointer that is not supposed to be eliminated as an in-out parameter
we first pass in the dereferenced value and after the function returns we take the value from the embellished
result and assign it to the pointer. So the pointer assignment that would take place
in the middle of the called function in the original program, is moved outside of the call in the
refined program. After this the heaps of the original program and the refined program are in sync
again at the pointer.
To establish the heap refinement it turns out that we also have to bring the heaps in sync 
again even if the function terminates with ‹exit›. We catch the ‹exit›, update the heap and then
re-throw the termexit. So the called function also returns embellished exit values which
hold the value of the relevant pointers. This extra work is a little annoying as the ‹exit› terminates the
complete program and usually we do not catch an ‹exit›. So this step might introduce
constL2_catch to wrap up the procedure call. Note that this is currently the only remaining
use case for constL2_catch, the other occurrences of constL2_catch were already transformed
to constL2_try as post processing in phase L2 when we move from `flat' exception handling to 
`nested' exception handling.

›

term inc_int'
thm inc_int'_def
term call_inc_int_ptr'
thm call_inc_int_ptr'_def

subsection ‹Global Heap Pointers›

text ‹
A core part of the IO phase is to keep track of pointers and in particular to do some 
bookkeeping of the parts of the heap that might get modified. This analysis and bookkeeping is
focused around the pointers that are visible in the signature of a function. This fails short
as soon as pointers to global heap are used within the body of a function, which are not
mentioned in the function signature. To handle some common use cases with global heap
we have a mechanism to over-approximate the impact of a function on the heap by referring to a
static set term𝒢 of addresses that should not be abstracted by the IO phase. We can 
annotate a function to make use of this mechanism by mentioning them in @{command "init-autocorres"}
with the option ‹in_out_globals›. 
For the refinement statement we assume that
 prop𝒢  A = {}: pointers in term𝒢 are not supposed to be abstracted, and
 prop𝒢  M: the function might modify any global variable.

Within the body of such a function (marked with  option ‹in_out_globals›), 
whenever a pointer is used which does not occur in the signature
as pointer parameter and is specified otherwise we assume that
this pointer a global pointer and assert this formally by adding
a guard propptr_span x  𝒢. If a function has multiple pointer parameters we also assert
disjointness of pointers.
›

thm io_keep_inc_global_array'_corres
thm io_keep_inc_global_array'_def
thm l2_keep_inc_global_array'_def

thm io_inc_global_array'_corres
thm io_inc_global_array'_def
thm l2_inc_global_array'_def

thm io_inc_global_array2'_corres
thm io_inc_global_array2'_def
thm l2_inc_global_array2'_def
thm ac_corres

text ‹Note that a function can also have both in-out parameters and ‹in_out_globals›.›

thm io_read_char'_corres
thm io_read_char'_def
thm l2_read_char'_def

thm io_call_read_char'_corres
thm io_call_read_char'_def
thm l2_call_read_char'_def

thm io_call_read_char_loop'_corres
thm io_call_read_char_loop'_def
thm l2_call_read_char_loop'_def


subsection ‹Disclaimer / Caution›

text ‹As we have seen in the examples the refinement theorems can contain assumptions on the
input pointers, in particular disjointness assumptions between different pointer parameters and 
also disjointness of pointer parameters to complete memory areas like termstack_free. When
a function is called these assumptions have to be discharged. When these pointer parameters are
eliminated (replaced by in-out value parameters) the assumptions disappear. So when you work
on the final output of autocorres you know nothing about pointers or disjointness.

In case a local pointer is eliminated these assumptions are be discharged locally as an intermediate
step in the proof. So this use case is fine.
However, if the function is called on a heap pointer these assumptions are propagated to the caller. So
the assumptions move up the call stack and might end up as implicit hidden assumptions on your 
toplevel functions (API).

Rule of thumb: In case your toplevel API function has more then one pointer parameter 
don't specify any in-out parameters on that function to avoid implicit assumptions.

›

subsection ‹Implementation Aspects›

text ‹
The proof of the refinement theorem requires to keep track of various pointers and changes
the signature of functions, including extending plain return values to tuples.
We decided to make a rather explicit forward proof for this, to keep tight control of the
various aspects. We make heavy use of ML antiquotations to match and build cterms.
An obstacle here is that the state record typglobals. which holds the heap,
is not yet defined but is only present as a locale. So our code is implemented within
the locale @{locale stack_heap_state}. We have introduced the concept of
‹interpretation data› (c.f. 🗏‹../lib/ml-helpers/interpretation_data.ML›) to get hold
of the interpretation we are interested in. 
The data is initialised in 🗏‹../AutoCorres.thy› as declaration within @{locale stack_heap_state}.
In particular this allows us to match against and build heap lookup and heap update expressions.

The interface is via @{ML In_Out_Parameters.operations}. This takes the morphism of the
interpretation as an argument to derive the instance we are interested in.
Note that we have crafted the main functions to benefit from the eager evaluation of ML.
In particular @{ML In_Out_Parameters.operations} takes the morphism phi› as its only parameter
and thus the instances of the main functions are only evaluated once.
›
end


subsection ‹pointer-parameters as data›

text ‹In a case where a pointer is not used to access the heap (e.g., set_void›), the
  "modifies" part of the corres theorem should be empty.
  Otherwise functions like set_byte›, that pass "arbitrarily" computed pointers fail to get
  IO-lifted.
  We enforce this by declaring pointer parameters as "data" in the in_out_parameters› option of
  autocorres.

  Future work: Automate this analysis to infer which pointer-parameters are treated as pure data.
  ›
context io_corres_set_void begin

lemma "distinct_sets ([]::addr set list)
  ― ‹should be able to get rid of this assumption, too (for multiple parameters)› 
globals.rel_alloc 𝒮 M A t0 s t 
refines (l2_set_void' p) (io_set_void' p) s t
 (globals.rel_stack 𝒮 {} A s t0
   (rel_xval_stack (rel_exit (λ_ _ _. False)) (λ_. (=))))"
  by (rule io_set_void'_corres)
end
context io_corres_set_void2 begin
thm io_set_void2'_corres
end
context io_corres_set_byte begin
thm io_set_byte'_corres
end

subsection ‹Future work / Open Ends›
text  Support function pointers
 support functions that might modify the heap typing (e.g. via exec_concrete›)
  currently our rel_alloc› / rel_stack› setup enforces that heap typing does not change at all
  this is to restrictive. Before we had that heap typing does not change in term𝒮 which was
  too weak to modify the A› frame to handle invocations on heap pointers. I guess a good
  approximation would be to say that heap typing is unchanged on all relevant addresses:
            𝒮 ∪ A ∪ M›
  Including M› might be too restrictive, e.g. consider an alloc function that zeros some 
  memory and adapts the heap-typing.
 Refined treatment of in_out_globals›: allow the user to add a guard that is inserted
  in front of the abstract program.
  E.G. locale for procedure: ptr_span (global_array + MAX_IDX) ⊆ 𝒢›
    + a Guard depending on a parameter idx›, e.g. idx < MAX_IDX›. Then we can automatically
    discharge local guards like ptr_span (global_array + idx) ⊆ 𝒢›. Not sure if this is 
  useful though
 Cleanup unused Generic_Data› that was replaced by interpretation data
 Remove struct-rewrite step from Heap Lifting. As we normalise pointers to root pointers in
  L2-Opt the pointers should already be in normal form and struct-rewrite does nothing.
 Add sanity checks to in_out› specs, especially if parameter names actually occur in signature
 Remove redundant assumptions for keep parameters:
  ptr_span y ∩ stack_free (hrs_htd (t_hrs_' s)) = {} ⟹
  globals.rel_alloc 𝒮 M A t0 s t ⟹
  ptr_span y ⊆ M ⟹ ...›
   The first one can be derived from the following ones

 Peephole: remove ptr_disjoint› of singletons
 Peephole simplification, especially Guard True. Not really necessary, will disappear in hl
  anyways.
 Eliminate pointers to global variables: When a pointer to a global variable is only used
  as a in_out parameter› it would be nice to abstract it to a record field in lifted_globals›  (or
  to something similar, i.e. lookup / update function in the heap with disjointnes and
  commutation of updates for other globals )
 Support more relaxed disjoint assumptions, especially disjoint_or_eq›, 
  e.g. swap also works if the input pointers are equal.
   All `keep` pointers are currently assumed disjoint and also are in the modifies set. We could
    be more relaxed here.
 HL phase has some explicit references to phase L2 instead of abstract prev_phase› (probably related
  to function pointers)
 Would be nice if synthesize rules would also participate in thm, and add / del stuff
  like named_rules›

end