Theory M_02_Background

(*************************************************************************
 * Copyright (C) 
 *               2019-2023 The University of Exeter 
 *               2018-2023 The University of Paris-Saclay
 *               2018      The University of Sheffield
 *
 * License:
 *   This program can be redistributed and/or modified under the terms
 *   of the 2-clause BSD-style license.
 *
 *   SPDX-License-Identifier: BSD-2-Clause
 *************************************************************************)

(*<*)
theory "M_02_Background"
  imports "M_01_Introduction"
begin
(*>*)

chapter*[background::text_section]‹ Background›
section*[bgrnd1::introduction]‹The Isabelle System Architecture›

figure*[architecture::figure,relative_width="95",file_src="''figures/isabelle-architecture.pdf''"]‹ 
     The system architecture of Isabelle (left-hand side) and the 
     asynchronous communication between the Isabelle system and 
     the IDE (right-hand side). ›

text*[bg::introduction]‹
While Isabelle is widely perceived as an interactive theorem 
prover for HOL (Higher-order Logic)~\<^cite>‹"nipkow.ea:isabelle:2002"›, we would like to emphasize
the view that Isabelle is far more than that: it is the ∗‹Eclipse of Formal Methods Tools›.  This 
refers to the ``∗‹generic system framework of Isabelle/Isar underlying recent versions of Isabelle.  
Among other things, Isabelle provides an infrastructure for Isabelle plug-ins, comprising extensible 
state components and extensible syntax that can be bound to SML programs. Thus, the Isabelle 
architecture may be understood as an extension and refinement of the traditional `LCF approach', 
with explicit infrastructure for building derivative systems.›''~\<^cite>‹"wenzel.ea:building:2007"› 

The current system framework offers moreover the following features:
▪ a build management grouping components into to pre-compiled sessions,
▪ a prover IDE (PIDE) framework~\<^cite>‹"wenzel:asynchronous:2014"› with various front-ends, 
▪ documentation-generation,
▪ code generators for various target languages,
▪ an extensible front-end language Isabelle/Isar, and,
▪ last but not least, an LCF style, generic theorem prover kernel as 
  the most prominent and deeply integrated system component.
›
text‹
The Isabelle system architecture shown in @{figure ‹architecture›} comes with many layers, 
with Standard ML (SML) at the bottom layer as implementation  language. The architecture actually 
foresees a ∗‹Nano-Kernel› (our terminology) which resides in the SML structure \<^boxed_sml>‹Context›. 
This structure provides a kind of container called ∗‹context› providing an identity, an 
ancestor-list as well as typed, user-defined state for plugins such as \<^isadof>. 
On top of the latter, the LCF-Kernel, tactics, automated proof procedures as well as specific 
support for higher specification constructs were built.⁋‹We use the term ∗‹plugin› for a collection 
of HOL-definitions, SML and Scala code in order to distinguish it from the official Isabelle 
term ∗‹component› which implies a particular format and support by the Isabelle build system.›
›

section*[dof::introduction]‹The Document Model Required by \<^dof>›
text‹
  In this section, we explain the assumed document model underlying our Document Ontology Framework 
  (\<^dof>) in general. In particular we discuss the concepts 
   ∗‹integrated document›\<^bindex>‹integrated document›, ∗‹sub-document›\<^bindex>‹sub-document›,  
  ∗‹document-element›\<^bindex>‹document-element›, and ∗‹semantic macros›\<^bindex>‹semantic macros› occurring 
  inside document-elements. This type of document structure is quite common for scripts interactively
  evaluated in an incremental fashion. 
  Furthermore, we assume a bracketing mechanism that unambiguously allows to separate different 
  syntactic fragments and that can be nested. In the case of Isabelle, these are the guillemot 
  symbols ‹‹...››, which represent the begin and end of a ∗‹cartouche›\<^bindex>‹cartouche›.›


(*<*)
declare_reference*[docModGenConcr::figure]
(*>*)
text‹
  The Isabelle Framework is based on a ∗‹document-centric view›\<^bindex>‹document-centric view› of 
  a document, treating the input in its integrality as set of (user-programmable) ∗‹document element› 
  that may mutually depend on and link to each other; A ∗‹document› in our sense is what is configured 
  in a set of  ▩‹ROOT›- and ▩‹ROOTS›-files.

  Isabelle assumes a hierarchical document model\<^index>‹document model›, \<^ie>, an ∗‹integrated› document 
  consist of a hierarchy of ∗‹sub-documents›  (files); dependencies are restricted to be
  acyclic at this level (c.f. @{figure (unchecked) "docModGenConcr"}). 
  Document parts can have different document types in order to capture documentations consisting of 
  documentation, models, proofs, code of various forms and other technical artifacts.  We call the 
  main sub-document type, for historical reasons, ∗‹theory›-files.  A theory file\<^bindex>‹theory!file›
  consists of a ∗‹header›\<^bindex>‹header›, a ∗‹context definition›\<^index>‹context›, and a body 
  consisting of a sequence of document elements called
  ∗‹command›s (see @{figure (unchecked) "docModGenConcr"} (left-hand side)). Even
  the header consists of a sequence of commands used for introductory text elements not depending on 
  any context. The context-definition contains an \<^boxed_theory_text>‹import› and a 
  \<^boxed_theory_text>‹keyword› section, for example:
@{boxed_theory_text [display]‹
  theory Example         ―‹Name of the 'theory'›
    imports              ―‹Declaration of 'theory' dependencies›
      Main               ―‹Imports a library called 'Main'›
    keywords             ―‹Registration of keywords defined locally›
      requirement        ―‹A command for describing requirements› ›}
  where \<^boxed_theory_text>‹Example› is the abstract name of the text-file, \<^boxed_theory_text>‹Main› 
  refers to an imported theory (recall that the import relation must be acyclic) and 
  \<^boxed_theory_text>‹keywords› are used to separate commands from each other.
›

text*[docModGenConcr::float, 
      main_caption="‹A Representation of a Document Model.›"]
‹
@{fig_content (width=45, caption="Schematic Representation.") "figures/doc-mod-generic.pdf"
}\<^hfill>@{fig_content (width=45, caption="The Isar Instance.") "figures/doc-mod-isar.pdf"} 
›

text‹The body of a theory file consists of a sequence of ∗‹commands› that must be introduced
by a command keyword such as \<^boxed_theory_text>‹requirement› above. Command keywords may mark 
the the begin of a text that is parsed by a command-specific parser;  the end of the 
command-span is defined by the next keyword. Commands were used to define definitions, lemmas, 
code and text-elements (see @{float "docModGenConcr"} (right-hand side)).  ›

text‹ A simple text-element \<^index>‹text-element› may look like this:

  @{boxed_theory_text [display]‹
text‹ This is a simple text.››}
\ldots so it is a command ⬚‹text› followed by an argument (here in  ‹‹ ... ›› parenthesis) which 
contains characters. While ⬚‹text›-elements play a major role in this manual---document 
generation is the main use-case of \<^dof> in its current stage---it is important to note that there 
are actually three families of ``ontology aware'' document elements with analogous 
syntax to standard ones. The difference is a bracket with meta-data of the form:
@{theory_text [display,indent=5, margin=70] 
‹
text*[label::classid, attr⇩₁=E⇩₁, ... attr⇩_n=E⇩_n]‹ some semi-formal text ›
ML*[label::classid, attr⇩₁=E⇩₁, ... attr⇩_n=E⇩_n]‹ some SML code ›
value*[label::classid, attr⇩₁=E⇩₁, ... attr⇩_n=E⇩_n]‹ some annotated λ-term ›
›}

Other instances of document elements belonging to these families are, for example, the free-form
⬚‹Definition*› and ⬚‹Lemma*› as well as their formal counterparts ⬚‹definition*› and ⬚‹lemma*›,
which allow in addition to their standard Isabelle functionality the creation and management of
ontology-generated meta-data associated to them (cf. -@{text_section (unchecked) "writing_doc"}).

Depending on the family, we will speak about ∗‹(formal) text-contexts›,\<^bindex>‹formal text-contexts› 
∗‹(ML) code-contexts›\<^bindex>‹code-contexts› and ∗‹term-contexts›\<^bindex>‹term-contexts› if we refer 
to sub-elements inside the ‹‹...›› cartouches of these command families. 

Text- code- or term contexts may contain a special form comment, that may be considered as a
"semantic macro" or a machine-checked annotation: the so-called antiquotations\<^bindex>‹antiquotation›. 
Its Its general syntactic format reads as follows:

@{boxed_theory_text [display]‹ @{antiquotation_name (args) [more_args] ‹sub-context› }›}

The sub-context may be different from the surrounding one; therefore, it is possible
to switch from a text-context to a term-context, for example. Therefore, antiquotations allow
 the nesting of cartouches, albeit not all combinations are actually supported.⁋‹In the 
literature, this concept has been referred to ‹Cascade-Syntax› and was used in the 
Centaur-system and is existing in some limited form in some Emacs-implementations these days. › 
Isabelle comes with a number of built-in antiquotations for text- and code-contexts;
a detailed overview can be found in \<^cite>‹"wenzel:isabelle-isar:2020"›. \<^dof> reuses this general
infrastructure but ∗‹generates› its own families of antiquotations from ontologies.›

text‹ An example for a text-element \<^index>‹text-element› using built-in antoquotations 
may look like this:

@{boxed_theory_text [display]‹
text‹ According to the ∗‹reflexivity› axiom @{thm refl}, 
   we obtain in Γ for @{term "fac 5"} the result @{value "fac 5"}.››}
... so it is a command ⬚‹text› followed by an argument (here in  ‹‹ ... ›› parenthesis) which 
contains characters and a special notation for semantic macros \<^bindex>‹semantic macros› 
(here ⬚‹@{term "fac 5"}›).

The above text element is represented in the final document (\<^eg>, a PDF) by:

@{boxed_pdf [display]
‹According to the $\emph{reflexivity}$ axiom $\mathrm{x = x}$, we obtain in $\Gamma$ 
for $\operatorname{fac} \text{\textrm{5}}$ the result $\text{\textrm{120}}$.›
 }›


text‹Antiquotations seen as semantic macros are partial functions of type ‹logical_context → text›; 
  since they can use the system state, they can perform all sorts of specific checks or evaluations 
  (type-checks, executions of code-elements, references to text-elements or proven theorems such as 
  ‹refl›, which is the reference to the axiom of reflexivity).

  Therefore, semantic macros can establish ∗‹formal content› inside informal content; they can be 
  type-checked before being displayed and can be used for calculations before being 
  typeset. They represent the device for linking formal with the informal content. 
›

text*[docModOnto::float, 
      main_caption="‹Documents conform to Ontologies.›"]
‹
@{fig_content (width=47, caption="A Document with Ontological Annotations.") "figures/doc-mod-DOF.pdf"
}\<^hfill>@{fig_content (width=47, caption="Ontological References.") "figures/doc-mod-onto-docinst.pdf"} 
›

text‹Since Isabelle's commands are freely programmable, it is possible to implement  \<^dof> as an 
extension of the system. In particular, the ontology language of \<^dof> provides an  ontology 
definition language ODL\<^bindex>‹ODL› that ∗‹generates› anti-quotations and the infrastructure to check and evaluate 
them. This allows for checking an annotated document with respect to a given ontology, which may be 
specific for  a given domain-specific universe of discourse (see @{float "docModOnto"}). ODL will 
be described in  @{text_section (unchecked) "isadof_tour"} in more detail.›

section*[bgrnd21::introduction]‹Implementability of the Document Model in other ITP's›
text‹ 
  Batch-mode checkers for \<^dof> can be implemented in all systems of the LCF-style prover family, 
  \<^ie>, systems with a type-checked ‹term›, and abstract ‹thm›-type for theorems 
  (protected by a kernel).  This includes, \<^eg>, ProofPower, HOL4, HOL-light, Isabelle, or Coq
  and its derivatives. \<^dof> is, however, designed for fast interaction in an IDE. If a user wants
  to benefit from this experience, only Isabelle and Coq have the necessary infrastructure of 
  asynchronous proof-processing and support by a PIDE~\<^cite>‹"wenzel:asynchronous:2014" and 
  "wenzel:system:2014" and "barras.ea:pervasive:2013" and "faithfull.ea:coqoon:2018"› which 
  in many features over-accomplishes the required  features of \<^dof>. 
›

figure*["fig_dof_ide",relative_width="95",file_src="''figures/cicm2018-combined.png''"]‹ 
     The \<^isadof> IDE (left) and the corresponding PDF (right), showing the first page
      of~\<^cite>‹"brucker.ea:isabelle-ontologies:2018"›.›

text‹ 
  We call the present implementation of \<^dof> on the Isabelle platform  \<^isadof> . 
  @{figure  "fig_dof_ide"} shows a screenshot of an introductory paper on 
  \<^isadof>~\<^cite>‹"brucker.ea:isabelle-ontologies:2018"›: the \<^isadof> PIDE can be seen on the left, 
  while the generated presentation in PDF is shown on the right.

  Isabelle provides, beyond the features required for \<^dof>, a lot of additional benefits. 
  Besides UTF8-support for characters used in text-elements, Isabelle offers built-in already a 
  mechanism for user-programmable antiquotations \<^index>‹antiquotations› which we use to implement
  semantic macros \<^index>‹semantic macros› in \<^isadof> (We will actually use these two terms
  as synonym in the context of \<^isadof>). Moreover, \<^isadof> allows for the asynchronous 
  evaluation and checking of the document content~\<^cite>‹"wenzel:asynchronous:2014" and 
  "wenzel:system:2014" and "barras.ea:pervasive:2013"› and is dynamically extensible. Its PIDE 
  provides a  ∗‹continuous build, continuous check›  functionality, syntax highlighting, and 
  auto-completion. It also provides infrastructure for displaying meta-information (\<^eg>, binding 
  and type annotation) as pop-ups, while hovering over sub-expressions.  A fine-grained dependency 
  analysis allows the processing of individual parts of theory files asynchronously, allowing 
  Isabelle to interactively process large (hundreds of theory files) documents.  Isabelle can group 
  sub-documents into sessions, \<^ie>, sub-graphs of the document-structure that can be ``pre-compiled'' 
  and loaded instantaneously, \<^ie>, without re-processing, which is an important means to scale up. ›

(*<*) 
end
(*>*)