Theory open_struct
section ‹Pointers into Structures in Split Heap \label{sec:open_struct}›
theory open_struct
imports "AutoCorres"
begin
subsection ‹Overview›
text ‹
Heap lifting in AutoCorres transforms the monolithic byte-level heap \<^typ>‹heap_mem› with explicit
term level heap type description \<^typ>‹heap_typ_desc› and typing constraints, like \<^const>‹c_guard›,
\<^const>‹h_t_valid› and \<^term>‹d ⊨⇩t p› to a more abstract split heap model with implicit HOL-typing.
The major advantage of the split heap model is that, updates to one particular heap do
not affect all other distinct heaps, removing the obligation to explicitly take care about aliasing.
Aliasing between different heaps is ruled out by the model already.
The question that arises is on which granularity do we split the byte-level heap?
The original AutoCorres splitted at the granularity of types. Each distinct c-type is mapped into
a separate heap. In that model you can still have intra-type aliasing (i.e. pointers of the
same type can alias), but no inter-type aliasing (i.e. pointers of different type are distinct, and
cannot alias). As an example a pointer to a list structure refers to a different (split) heap as
a pointer to a tree structure. So when a function modifies lists, no pointer to a tree is affected.
Of course not every C-program can be represented in this split-heap model. In those cases
the abstraction fails.
In particular think of a low-level memory allocator, that allocates some raw-bytes and
delivers a void-pointer, that is then retyped to point to the actual structure. To be able
to combine those byte-level functions with high-level typed functions (in the split heap) one
can switch between both layers of abstraction by \<^const>‹exec_concrete› and \<^const>‹exec_abstract›.
The good thing is that this gives us a very expressive tool to switch between byte-level and
typed view. We can do the specification and proofs of different parts of the program on
the adequate level and still combine the results. But switching between the layers can be
tedious and thus should be limited to the low-level functions that really inherently need
byte-level reasoning.
Unfortunately the type granularity of the split heaps also rules out programs dealing with
pointers into a structure. For example a structure containing a ‹int› field ‹count› cannot pass a
‹int› pointer to that field ‹count› to a another function, that might update it.
The complete structure has a separate
(split) heap that is distinct from the heap containing ‹int›s. This would mean that we
have to resort to the byte-level heap to reason about those functions.
We extended the split heap model to support this common use-case. The core idea is that
in addition to split heaps on the granularity of types you can also specify split heaps on the
granularity of structure fields. Conceptually you 'open' the structure into its
components (aka. fields) and each field gets a separate heap. Additionally you can specify which
fields should be ∗‹addressable› and thus shared. In the example you can specify that ‹count› is addressable,
so it is represented within the same heap as all other ‹int›. This means that an ‹int› pointer can
now point to a plain ‹int› or to a ‹count› field within a structure. You explicitly allow
this limited inter-type aliasing on ‹int› pointers. All other fields that are not
explicitly marked as ∗‹addressable› still have their own heap and thus cannot alias with another
pointer of the same field type. Note that mainly for performance reasons we actually avoid to
have a separate heap for each single field but try to minimise the number of heaps by clustering the
fields that are treated the same. Some more details on this construction are below.
›
subsection ‹Example Program and some Intuition›
text ‹We illustrate the approach with the following example specifying some structures,
with addressable and non-addressable fields. Note that for an addressable field that is
of an array type, each element of the array is considered to be addressable.
›