Theory Chapter3_HoareHeap

(*
 * Copyright 2020, Data61, CSIRO (ABN 41 687 119 230)
 * Copyright (c) 2022 Apple Inc. All rights reserved.
 *
 * SPDX-License-Identifier: BSD-2-Clause
 *)

(*<*)
theory Chapter3_HoareHeap
imports "AutoCorres2.AutoCorres"
begin

(*>*)

subsection ‹\texttt{swap}›

text ‹

  Here, we use AutoCorres to verify a C program that reads and writes to the heap.
  Our C function, \texttt{swap}, swaps two words in memory:

\lstinputlisting[language=C, firstline=11]{swap.c}

  Again, we translate the program using the C parser and AutoCorres.
›

install_C_file "sources/swap.c"
autocorres [heap_abs_syntax, ts_rules = nondet] "swap.c"
(*<*)
context swap_all_corres begin
(*>*)

text ‹
  Most heap operations in C programs consist of accessing a pointer.
  AutoCorres abstracts the global C heap by creating one heap for
  each type. (In our simple \texttt{swap} example, it creates only
  a @{typ word32} heap.) This makes verification easier as long as
  the program does not access the same memory as two different types.

  There are other operations that are relevant to program verification,
  such as changing the type that occupies a given region of memory.
  AutoCorres will not abstract any functions that use these operations,
  so verifying them will be more complicated (but still possible).
›

text ‹
  The C parser expresses \texttt{swap} like this:
›

thm swap_body_def
text @{thm [display] swap_body_def}

text ‹
  AutoCorres abstracts the function to this:
›

thm swap'_def
text @{thm [display] swap'_def }

text ‹
  There are some things to note:

  The function contains guards (assertions) that the pointers
  \texttt{a} and \texttt{b} are valid. We need to prove that
  these guards never fail when verifying \texttt{swap}.
  Conversely, when verifying any function that calls
  \texttt{swap}, we need to show that the arguments are
  valid pointers.

  We saw a monadic program in the previous section, but here
  the monad is actually being used to carry the program heap.
›


(*<*)
context 
  fixes s:: lifted_globals
  fixes a::"32 word ptr"
begin
(*>*)
text ‹
  Now we prove that \texttt{swap} is correct. We use @{term x}
  and @{term y} to ``remember'' the initial values so that we can
  talk about them in the post-condition. The heap access syntax terms[a] is used
  to select the split heap termheap_w32 from state terms at pointer terma::32 word ptr.
  ‹For more details on the split heap model in autocorres see \autoref{sec:open_struct}.›
(*<*)
end
(*>*)

lemma "IS_VALID(32 word) s a; s[a] = x; IS_VALID(32 word) s b; s[b] = y 
         swap' a b  s
        λ_ s. s[a] = y  s[b] = x "
  apply (unfold swap'_def)
  apply runs_to_vcg
  apply clarsimp
  txt ‹
    The C parser and AutoCorres both model the C heap using functions,
    which takes a pointer to some object in memory. Heap updates are
    modelled using the functional update @{term fun_upd}:

      @{thm [display] fun_upd_def}

    To reason about functional updates, we use the rule fun\_upd\_apply.
›
  apply (simp add: fun_upd_apply)
  done

text ‹
  Note that we have ``only'' proved that the function swaps its
  arguments. We have not proved that it does \emph{not} change
  any other state. This is a typical \emph{frame problem} with
  pointer reasoning. We can prove a more complete specification
  of \texttt{swap}:
›
lemma "(x y s. P (s[a := x][b := y]) = P s);
       IS_VALID(32 word) s a; s[a] = x; IS_VALID(32 word) s b; s[b] = y ; P s  
         (swap' a b)  s
        λ_ s. s[a] = y  s[b] = x  P s "
  apply (unfold swap'_def)
  apply runs_to_vcg
  apply (clarsimp simp: fun_upd_apply)+
  done

text ‹
  In other words, if predicate @{term P} does not depend on the
  inputs to \texttt{swap}, it will continue to hold.

  \emph{Separation logic} provides a more structured approach
  to this problem.
›

(*<*)
end
(*>*)

section "Command Options and Invocation"

subsection "Session Structure"

text ‹
The supplied session structure has some peculiarities to accomodate the AFP. The AFP presents
the documentation of the session that is named like the AFP-entry. This is @{session AutoCorres2}.
This session also includes this documentation and other example application of AutoCorres.
When you want to use AutoCorres for your own projects you do not have to import these examples.
Thats why we also supply the leaner @{session AutoCorres2_Main} as entry point, which we recommend
as parent session for your applications.
›

subsection "C-Parser"

text ‹Options for @{command install_C_file},

    syntax ‹install_C_file <filename> [<option> = value, ...]›:

 ‹memsafe› (default false): add additional guards to ensure that welltypedness of pointers
 ‹c_types› (default true): import types to UMM model
 ‹c_defs› (default true): import function to SIMPL
 ‹roots›: (default all functions): List of C functions that are the root functions to import
 ‹prune_types› (default true): only import those types that are actually used in the program
 ‹machinety›: HOL type for ghost state modelling the machine
 ‹gostty›: HOL type for some additional ghost state
 ‹skip_asm› (default false): Skip inline assembler
›

text ‹ 
Moreover there are some Isabelle options for more feedback / tracing:
 ‹c_parser_feedback_level› (default 0), higher number means more tracing
 ‹option c_parser_verbose› (default false), enable for verbose messages
›

subsubsection ‹Multiple C Files›

text ‹Command @{command install_C_file} only has a single C file as argument. When your project
consists of several ‹.c› and ‹.h› you can register those dependencies first, with 
command @{command include_C_file}. The main C file, which is the argument to @{command install_C_file}, 
can then include all these files. If there is no such C file in your project you can create 
one to accomodate @{command install_C_file}.
›

subsubsection ‹Target Architecture Selection ‹L4V_ARCH›
text ‹
The Environment variable ‹L4V_ARCH› can be used to determine the target architecture
which influences the sizes of C integral types and pointer types. Supported platforms are
‹ARM› (default), ‹ARM64›, ‹ARM_HYP›,  ‹RISCV64› and ‹X64›.

For ‹ARM›, the sizes are:
 128 bits: ‹int128›
 64 bits: ‹long long›
 32 bits: pointers, ‹long›, ‹int›
 16 bits: ‹short›

For ‹X64›, ‹ARM64›, ‹ARM_HYP›:
 128 bits: ‹int128›
 64 bits: pointers, ‹long long›, ‹long›
 32 bits: ‹int›
 16 bits: ‹short›

For ‹ARM64› ‹char› is signed.

For example to build AutoCorres for ARM64:

    ‹L4V_ARCH=ARM64 isabelle build -d . AutoCorres_Main›

subsection "AutoCorres"

text ‹Options for @{command "init-autocorres"} / @{command "autocorres"}, 

    syntax ‹autocorres [option = value, ...] <filename>›:

 ‹phase›: perform autocorres up to specified phase only (L1, L2, HL, WA, IO, TS)
 ‹scope›: space separated list of functions to perform autocorres on. (Default all).
 ‹scope_depth›: depth of callees to also also include
 ‹single_threaded›: flag to disable parallel processing (e.g. to make more sense out of tracing messages)
 ‹no_heap_abs›: space separated list of functions that should be excluded from heap abstraction
 ‹in_out_parameters›: 'and' separated list of function specs e.g. ‹inc(y:in_out)›, 
  cf. 🗏‹../In_Out_Parameters_Ex.thy›
 ‹in_out_globals›: 'and' separated list of functions which modify global variables via pointers
 ‹skip_io_abs›: flag to disable IO abstraction
 ‹addressable_fields›: space separated list of paths to struct fields that should be addressable in
 the split heap,  cf. 🗏‹../open_struct.thy›
 ‹ignore_addressable_fields_error›: option to downgrade error to a warning
 ‹skip_heap_abs›: flag to disable heap abstraction (into split heap)
 ‹unsigned_word_abs›: space separated list of functions where unsigned words should be abstracted to typnat.
 ‹unsigned_word_abs_known_functions›: assume unsigned word abstraction for function pointers
 ‹no_signed_word_abs›: space separated list of functions where abstraction of unsigned word 
  to typint should be disabled.
 ‹no_signed_word_abs_known_functions›: don't assume unsigned word abstraction for function pointers
 ‹skip_word_abs›: flag to disable word abstraction
 ‹ts_rules›: space separated list of rules to consider during TS phase (pure, gets, option, nondet, exit).
 ‹ts_force <rule-name>›: space separated list of function names to put in the specified monad (pure, gets, option, nondet, exit).
 ‹ts_force_known_functions›: assume function pointers live in the specified monad (pure, gets, option, nondet, exit)
 ‹heap_abs_syntax›: enable some additional syntax for heap accesses
 ‹do_polish› (default true): flag for polish phase
 ‹L1_opt› (RAW | PEEP (default)): level for L1 optimisation
 ‹L2_opt› (RAW | PEEP (default)): level for L2 optimisation
 ‹HL_opt› (RAW | PEEP (default)): level for HL optimisation
 ‹WA_opt› (RAW | PEEP (default)): level for WA optimisation
 ‹trace_opt›: flag to enable some tracing
 ‹gen_word_heaps›: flag to generate word heaps even if the program does not use them 
 ‹keep_going›: continue despite some errors
 ‹lifted_globals_field_prefix›: custom prefix for split heap naming
 ‹lifted_globals_field_suffix›: custom suffix for split heap naming
 ‹function_name_prefix›: custom prefix for generated function names
 ‹function_name_suffix›: custom suffix for generated function names
 ‹no_c_termination›: flag to disable termination precondition for correspondence proofs
 ‹unfold_constructor_bind›: (Selectors (default) | Always | Never) to
  give some user level control to unfold certain "simple" binds.
   cf.: 🗏‹../../tests/proof-tests/unfold_bind_options.thy›
 ‹base_locale›: custom base locale for all autocorres locales
›

text ‹
Moreover there are some Isabelle options for more feedback / tracing:
 ‹option verbose› (default 0), higher value means more verbosity
 ‹verbose_timing› (default 0), higher value means more timing messages
 ‹timing_threshold› (default 3), threshold in milliseconds for timing messages
›

(*<*)
end
(*>*)