Theory CTranslationInfrastructure
chapter ‹C-Translation Infrastructure \label{chap:CTranslationInfrastructure}›
theory CTranslationInfrastructure
imports
CTranslation
begin
term guarded_spec_body
section ‹Local Variables›
text ‹
Local variables in the SIMPL outcome of the C-Parser are represented in \<^typ>‹locals›.
The natural number in the domain encodes the canonical position
in the scope of the C-function: input arguments, then return variable, then local variables.
The byte list in the range of the function is the encoding of the value according to the
UMM (unified memory model) \<^class>‹c_type›. The typing of variables is maintained as data in
@{ML_structure "CLocals"} and a typed view is achieved via \<^const>‹clookup› and \<^const>‹cupdate›.
In previous attempts local variables were represented first as a @{command record} and more
recently as @{command statespace}.
The representation as \<^typ>‹locals› tries to combine the best of all previous approaches:
▪ Uniform (program independent) representation of local variables.
▪ Positional 'naming' enables a canonical parameter passing for function calls,
even if the names of parameters are unknown, as in the case of function-pointers.
▪ Lightweight "typing" via ML infrastructure instead of the 'fixes' of @{command statespace}.
Unfortunately, the more flexible @{command statespace} representation for local variables turned
out to reveal some performance bottle neck in the locale infrastructure, which itself boils down to
the resources consumed by the omnipresent instantiation of types and terms.
›
declare [[ML_print_depth = 1000]]
subsection ‹Basic ML primitives›
text ‹
To define new variables the basic primitive is @{ML CLocals.define_locals} which takes a
scope (currently a two element list of qualifiers: program-name followed by function-name) and
the list of local variable declarations (name, type and kind).
Note that the order is relevant as it represents the canonical positional ordering.
›
setup ‹CLocals.define_locals ["prog", "myfun"] [
("x", @{typ ‹32 word›}, NameGeneration.Loc),
("p", @{typ ‹32 word ptr›}, NameGeneration.Loc),
("z", @{typ ‹64 word›}, NameGeneration.Loc)] ›
ML ‹CLocals.Data.get (Context.Proof @{context})›
text ‹
For each local variable a constant is defined with the position of the variable. The constant
name encodes the original name of the local variable. Whenever available we use this symbolic name
in favour of the plain literal number. However, at some places we use the plain numbers, in
particular for parameter passing. There is some simplifier setup which we explain later that
allows to use either representation.
To enable syntax translations from short names to the symbolic constants together with their typing
one has to enter the correct scope, e.g.:
›
local_setup ‹
Local_Theory.declaration {pervasive=false, syntax=true, pos = ⌂} (fn _ =>
CLocals.scope_map (K ["prog", "myfun"]))
›
term prog.myfun.p_'
thm prog.myfun.p_'_def
text ‹
Some remarks on the naming scheme for those symbolic constants.
The qualifiers for both program-name and function-name are mandatory.
The name mangling with the suffix serves two purposes:
▪ Avoid name clashes with user input via variable names in the C sources (in particular
for the artificial return variable)
▪ Avoid name clashes for bound or free variables (or new constants) in HOL. Note that
even a constant with mandatory qualifier e.g. ‹foo.c› would force the pretty printer to avoid
a bound variable being named just as ‹c›. The pretty printer only takes
the @{ML Long_Name.base_name} into account.
The general suffix ‹_'› avoids clashes with C variable names as the prime is not
a legal character for C identifiers. Syntax translations will truncate this suffix.
The artificial return variable ‹ret› that the C-Parser introduces is internally ‹ret'_'› to
avoid a potential conflict with a local variable named ‹ret›.
›
ML ‹CLocals.Data.get (Context.Proof @{context})›
subsection "Syntax"
text ‹Plain lookup and update in locals.›
term ‹l ⋅ p›
ML_val ‹\<^term>‹l ⋅ p››
term ‹s⟨x := y⟩›
term ‹s⟨x := y, p := k⟩›
ML_val ‹\<^term>‹s⟨x := y⟩››
ML_val ‹\<^term>‹s⟨x := y, p := k⟩››
text ‹Lookup and update in a complete Simpl state, selecting the \<^const>‹locals››
term ‹s ⋅⇩ℒ p›
ML_val ‹@{term ‹s ⋅⇩ℒ p›}›
term ‹s⟨x :=⇩ℒ y⟩›
term ‹s⟨x :=⇩ℒ y, p :=⇩ℒ k⟩›
subsection "Simplifier setup"
text ‹
The simplifier is setup to be able to deal with symbolic as well as literal numerals. Symbolic
ones are expanded on the fly employing the infrastructure form the code generator.
The trigger for this expansion is via a simplification procedure that. Conditional rules can be
configured via the attribute:
›
ML ‹@{attributes [code_simproc]}›
text ‹
The simplification procedure first checks wheter the precondition evaluates to true, via
the code-generator. In case of success the simplifier is invoked to replay that proof and
trigger the rule. @{ML Code_Simproc.code_simp_prems}, @{ML Code_Simproc.code_prove}.
Note that the attribute also consumes an optional name to index the simprocs. This name is
taken from @{command named_theorems}.
The relevant setup of the rules can be found in 🗏‹CLocals.thy›.
›
schematic_goal ‹(clookup 2 (cupdate prog.myfun.z_' (λ_. 2::64 word) l)) = (?X::64 word)›
apply simp
done
schematic_goal ‹(clookup prog.myfun.z_' (cupdate prog.myfun.z_' (λ_. 2::64 word) l)) = (?X::64 word)›
apply simp
done
schematic_goal "s⟨x := λ_. 2⟩ ⋅ x = ?X"
apply simp
done
schematic_goal "s⟨x := λ_. 2⟩ ⋅ p = ?X"
apply simp
done