Theory Chapter2_HoareHeap

(*
 * Copyright 2020, Data61, CSIRO (ABN 41 687 119 230)
 *
 * SPDX-License-Identifier: BSD-2-Clause
 * Copyright (c) 2022 Apple Inc. All rights reserved.
 *)

(*<*)
theory Chapter2_HoareHeap
imports "AutoCorres2.AutoCorres"
begin

(*>*)

section  ‹More Complex Functions with AutoCorres›

text ‹

  In the previous section we saw how to use the C-Parser and AutoCorres
  to prove properties about some very simple C programs.

  Real life C programs tend to be far more complex however; they
  read and write to local variables, have loops and use pointers to
  access the heap. In this section we will look at some simple programs
  which use loops and access the heap to show how AutoCorres can
  allow such constructs to be reasoned about.

›

subsection ‹A simple loop: \texttt{mult\_by\_add}›

text ‹

  Our C function \texttt{mult\_by\_add} implements a multiply operation
  by successive additions:

\lstinputlisting[language=C, firstline=11]{mult_by_add.c}

  We start by translating the program using the C parser and AutoCorres,
  and entering the generated locale \texttt{mult\_by\_add}.
›

install_C_file "sources/mult_by_add.c"
autocorres [ts_rules = nondet] "mult_by_add.c"
(*<*)
context mult_by_add_all_corres begin
(*>*)

text ‹
  The C parser produces the SIMPL output as follows:
›

thm mult_by_add_body_def
text ‹@{thm [display] mult_by_add_body_def}›

text ‹
  Which is abstracted by AutoCorres to the following:
›

thm mult_by_add'_def
text ‹@{thm [display] mult_by_add'_def }›

text ‹

  In this case AutoCorres has abstracted \texttt{mult\_by\_add} into a
  \emph{monadic} form. Monads are a pattern frequently used in
  functional programming to represent imperative-style control-flow; an
  in-depth introduction to monads can be found elsewhere.

  The monads used by AutoCorres in this example is a
  \emph{non-deterministic state monad};  program state is implicitly
  passed between each statement, and results of computations may produce
  more than one (or possibly zero) results\footnote{Non-determinism
  becomes useful when modelling hardware, for example, where the exact
  results of the hardware cannot be determined ahead of time.}.

  The HOL type is called specification monad: @{typ "('e, 'a, 's) spec_monad"}, where
  ▪ \<^typ>‹'e› type for exception / error results,
  ▪ \<^typ>‹'r› type of the result and
  ▪ \<^typ>‹'s› type of the state.

  When \<^typ>‹'e› is instantiated with the unit type \<^typ>‹unit›, there is an abbreviation
  @{typ "('a, 's) res_monad"}, for ∗‹result monad›, When it is instantiated with a proper
  type we have an abbreviation @{typ "('e, 'a, 's) exn_monad"}, for ∗‹exception monad›.

  To uniformly model results and exceptions as values the type 
  @{typ "('e, 'a) exception_or_result"} is used. It is constructed as a sum type where we
  exclude a default value from the exception type \<^typ>‹'e›. So when \<^typ>‹'e› is instantiated
  with \<^typ>‹unit› the type is isomorphic to the result type \<^typ>‹'a›. Type \<^typ>‹unit› is only inhabited
  by the unique unit value \<^term>‹()› which is also default value. This instance is abbreviated with
  \<^typ>‹'a val›. For proper exceptions we instantiate \<^typ>‹'e› with an option type \<^typ>‹'x option›.
  This instance is abbreviated with \<^typ>‹('x, 'a) xval›.

  Constructors and pattern matching:
  ▪\<^typ>‹('e, 'a) exception_or_result›:\<^term>‹Exception e›, \<^term>‹Result v›, and pattern matching
    @{term [eta_contract=false] ‹λException e ⇒ f e | Result v ⇒ g v›}
  ▪\<^typ>‹'a val›:\<^term>‹Result v::'a val›, with pattern matching 
    @{term [eta_contract=false] "λRes v. g v"}
  ▪\<^typ>‹('e, 'a) xval›: \<^term>‹Exn e›, \<^term>‹Result v›, and pattern matching
    @{term [eta_contract=false] ‹λExn e ⇒ f e | Result v ⇒ g v›}

  This enconding allows us to give a uniform definition of the monadic \<^const>‹bind› which works for 
  the generic specification monad and thus also for its instances the result and the exception monad.
›

subsubsection ‹'Hoare Triples'›

text ‹
  The bulk of @{term "mult_by_add'"} is wrapped inside the @{term
  "whileLoop"} \emph{monad combinator}, which is really just a fancy way
  of describing the method used by AutoCorres to encode (potentially
  non-terminating) loops using monads.

  If we want to describe the behaviour of this program, we can use
  Hoare logic as follows:

    @{term [display] "P s ⟹ mult_by_add' a b ∙ s ⦃ Q ⦄"}

  This predicate states that, assuming @{term "P"} holds on the initial
  program state, if we execute @{term "mult_by_add' a b"}, then
  @{term "Q"} will hold on the final state of the program.

  There are a few details: while @{term "P"} has type
  @{typ "'s ⇒ bool"} (i.e., takes a state and returns true if
  it satisifies the precondition), @{term "Q"} has an additional
  parameter @{typ "'r ⇒ 's ⇒ bool"}. The additional parameter
  @{typ "'r"} is the \emph{return value} of the function; so, in
  our \<^const>‹mult_by_add'› example, it will be the result of
  the multiplication. Note that the precondition is not part of the 'hoare-triple' but is
  an ordinary Isabelle assumption on the initial state \<^term>‹s›. That way we can
  directly use Isabelle/Isar to decompose the precondition and can also refer to the initial state 
  within the postcondition.
  The foundational constant for this hoare triple is \<^const>‹runs_to›
  which means total correctness: we have to proof that the program terminates and that
  there is no undefined behaviour (all guards must hold).
  There is also a weaker notion for partial correctness \<^const>‹runs_to_partial› with syntax
  \<^term>‹f ∙ s ?⦃ Q ⦄›. 

  For example one useful property to prove on our program would
  be:

    @{term [display] " mult_by_add' a b ∙ s ⦃ λRes r _. r = a * b ⦄"}

  That is, for all possible input states, our \texttt{mult\_by\_add'}
  function returns the product of @{term "a"} and @{term "b"}.
 
  Our proof of @{term mult_by_add'} could then proceed as follows:

›

lemma mult_by_add_correct:
    "mult_by_add' a b ∙ s ⦃ λRes r _. r = a * b ⦄"
  txt ‹Unfold abstracted definition›
  apply (unfold mult_by_add'_def)
  txt ‹Annotate the loop with an invariant and a measure, by adding a specialised rule
    to the verification condition generator.›
  supply runs_to_whileLoop_res [where  
      I="λ(a', result) s. result = (a - a') * b" and 
      R="measure' (λ((a', result), s). a')",
      tags body post, ― ‹optional: tag resulting verification conditions›
      runs_to_vcg]
  txt ‹Run the ``verification condition generator''.›
  apply (runs_to_vcg)
  txt ‹Solve the program correctness goals.›
  subgoalsT body
    apply (simp add: field_simps)
    apply unat_arith
    done
  subgoalT post
    apply (auto simp: field_simps not_less)
    done
  done


text ‹
  The main tool is the proof method @{method runs_to_vcg}, a 
  verification condition generator for monadic programs. It uses
  weakest precondition style rules which are collected in named theorems collection
  @{thm[source] runs_to_vcg}.
  In that collection there is no rule for \<^const>‹whileLoop›. The verification
  generator will stop there unless you specify a rule to use.
  This was done in the proof above by instantiating the rule @{thm[source] runs_to_whileLoop_res}
  with an invariant and a measure and by adding it to the verification condition generator
  via attribute @{attribute "runs_to_vcg"}.
  We finally discharge the remaining subgoals left from
  @{method runs_to_vcg} with standard proof tools of Isabelle.

  In the next section, we will look at how we can use AutoCorres to
  verify a C program that reads and writes to the heap.

›

(*<*)
end
end
(*>*)