Theory Category
chapter "Category"
theory Category
imports Main "HOL-Library.FuncSet"
begin
text ‹
This theory develops an ``object-free'' definition of category loosely following
\<^cite>‹"AHS"›, Sec. 3.52-3.53.
We define the notion ``category'' in terms of axioms that concern a single
partial binary operation on a type, some of whose elements are to be regarded
as the ``arrows'' of the category.
The nonstandard definition of category has some advantages and disadvantages.
An advantage is that only one piece of data (the composition operation) is required
to specify a category, so the use of records is not required to bundle up several
separate objects. A related advantage is the fact that functors and natural
transformations can be defined simply to be functions that satisfy certain axioms,
rather than more complex composite objects.
One disadvantage is that the notions of ``object'' and ``identity arrow'' are
conflated, though this is easy to get used to. Perhaps a more significant disadvantage
is that each arrow of a category must carry along the information about its domain
and codomain. This implies, for example, that the arrows of a category of sets and
functions cannot be directly identified with functions, but rather only with functions that
have been equipped with their domain and codomain sets.
To represent the partiality of the composition operation of a category, we assume that the
composition for a category has a unique zero element, which we call ‹null›,
and we consider arrows to be ``composable'' if and only if their composite is non-null.
Functors and natural transformations are required to map arrows to arrows and be
``extensional'' in the sense that they map non-arrows to null. This is so that
equality of functors and natural transformations coincides with their extensional equality
as functions in HOL.
The fact that we co-opt an element of the arrow type to serve as ‹null› means that
it is not possible to define a category whose arrows exhaust the elements of a given type.
This presents a disadvantage in some situations. For example, we cannot construct a
discrete category whose arrows are directly identified with the set of \emph{all}
elements of a given type @{typ 'a}; instead, we must pass to a larger type
(such as @{typ "'a option"}) so that there is an element available for use as ‹null›.
The presence of ‹null›, however, is crucial to our being able to define a
system of introduction and elimination rules that can be applied automatically to establish
that a given expression denotes an arrow. Without ‹null›, we would be able to
define an introduction rule to infer, say, that the composition of composable arrows
is composable, but not an elimination rule to infer that arrows are composable from
the fact that their composite is an arrow. Having the ability to do both is critical
to the usability of the theory.
›
text ‹
A \emph{partial magma} is a partial binary operation ‹OP› defined on the set
of elements at a type @{typ 'a}. As discussed above,
we assume the existence of a unique element ‹null› of type @{typ 'a}
that is a zero for ‹OP›, and we use ‹null› to represent ``undefined''.
A \emph{partial magma} consists simply of a partial binary operation.
We represent the partiality by assuming the existence of a unique value ‹null›
that behaves as a zero for the operation.
›
locale partial_magma =
fixes OP :: "'a ⇒ 'a ⇒ 'a"
assumes ex_un_null: "∃!n. ∀t. OP n t = n ∧ OP t n = n"
begin
definition null :: 'a
where "null = (THE n. ∀t. OP n t = n ∧ OP t n = n)"
lemma null_eqI:
assumes "⋀t. OP n t = n ∧ OP t n = n"
shows "n = null"
using assms null_def ex_un_null the1_equality [of "λn. ∀t. OP n t = n ∧ OP t n = n"]
by auto
lemma null_is_zero [simp]:
shows "OP null t = null" and "OP t null = null"
using null_def ex_un_null theI' [of "λn. ∀t. OP n t = n ∧ OP t n = n"]
by auto
end
section "Partial Composition"
text ‹
A \emph{partial composition} is formally the same thing as a partial magma,
except that we think of the operation as an operation of ``composition'',
and we regard elements ‹f› and ‹g› of type @{typ 'a} as \emph{composable}
if their composition is non-null.
›
type_synonym 'a comp = "'a ⇒ 'a ⇒ 'a"
locale partial_composition =
partial_magma C
for C :: "'a comp" (infixr "⋅" 55)
begin
text ‹
An \emph{identity} is a self-composable element ‹a› such that composition of
any other element ‹f› with ‹a› on either the left or the right results in
‹f› whenever the composition is defined.
›
definition ide
where "ide a ≡ a ⋅ a ≠ null ∧
(∀f. (f ⋅ a ≠ null ⟶ f ⋅ a = f) ∧ (a ⋅ f ≠ null ⟶ a ⋅ f = f))"
text ‹
A \emph{domain} of an element ‹f› is an identity ‹a› for which composition of
‹f› with ‹a› on the right is defined.
The notion \emph{codomain} is defined similarly, using composition on the left.
Note that, although these definitions are completely dual, the choice of terminology
implies that we will think of composition as being written in traditional order,
as opposed to diagram order. It is pretty much essential to do it this way, to maintain
compatibility with the notation for function application once we start working with
functors and natural transformations.
›
definition domains
where "domains f ≡ {a. ide a ∧ f ⋅ a ≠ null}"
definition codomains
where "codomains f ≡ {b. ide b ∧ b ⋅ f ≠ null}"
lemma domains_null:
shows "domains null = {}"
by (simp add: domains_def)
lemma codomains_null:
shows "codomains null = {}"
by (simp add: codomains_def)
lemma self_domain_iff_ide:
shows "a ∈ domains a ⟷ ide a"
using ide_def domains_def by auto
lemma self_codomain_iff_ide:
shows "a ∈ codomains a ⟷ ide a"
using ide_def codomains_def by auto
text ‹
An element ‹f› is an \emph{arrow} if either it has a domain or it has a codomain.
In an arbitrary partial magma it is possible for ‹f› to have one but not the other,
but the ‹category› locale will include assumptions to rule this out.
›
definition arr
where "arr f ≡ domains f ≠ {} ∨ codomains f ≠ {}"
lemma not_arr_null [simp]:
shows "¬ arr null"
by (simp add: arr_def domains_null codomains_null)
text ‹
Using the notions of domain and codomain, we can define \emph{homs}.
The predicate @{term "in_hom f a b"} expresses ``@{term f} is an arrow from @{term a}
to @{term b},'' and the term @{term "hom a b"} denotes the set of all such arrows.
It is convenient to have both of these, though passing back and forth sometimes involves
extra work. We choose @{term "in_hom"} as the more fundamental notion.
›
definition in_hom ("«_ : _ → _»")
where "«f : a → b» ≡ a ∈ domains f ∧ b ∈ codomains f"
abbreviation hom
where "hom a b ≡ {f. «f : a → b»}"
lemma arrI:
assumes "«f : a → b»"
shows "arr f"
using assms arr_def in_hom_def by auto
lemma ide_in_hom [intro]:
shows "ide a ⟷ «a : a → a»"
using self_domain_iff_ide self_codomain_iff_ide in_hom_def ide_def by fastforce
text ‹
Arrows @{term "f"} @{term "g"} for which the composite @{term "g ⋅ f"} is defined
are \emph{sequential}.
›
abbreviation seq
where "seq g f ≡ arr (g ⋅ f)"
lemma comp_arr_ide:
assumes "ide a" and "seq f a"
shows "f ⋅ a = f"
using assms ide_in_hom ide_def not_arr_null by metis
lemma comp_ide_arr:
assumes "ide b" and "seq b f"
shows "b ⋅ f = f"
using assms ide_in_hom ide_def not_arr_null by metis
text ‹
The \emph{domain} of an arrow @{term f} is an element chosen arbitrarily from the
set of domains of @{term f} and the \emph{codomain} of @{term f} is an element chosen
arbitrarily from the set of codomains.
›
definition dom
where "dom f = (if domains f ≠ {} then (SOME a. a ∈ domains f) else null)"
definition cod
where "cod f = (if codomains f ≠ {} then (SOME b. b ∈ codomains f) else null)"
lemma dom_null [simp]:
shows "dom null = null"
by (simp add: dom_def domains_null)
lemma cod_null [simp]:
shows "cod null = null"
by (simp add: cod_def codomains_null)
lemma dom_in_domains:
assumes "domains f ≠ {}"
shows "dom f ∈ domains f"
using assms dom_def someI [of "λa. a ∈ domains f"] by auto
lemma cod_in_codomains:
assumes "codomains f ≠ {}"
shows "cod f ∈ codomains f"
using assms cod_def someI [of "λb. b ∈ codomains f"] by auto
end
section "Categories"
text‹
A \emph{category} is defined to be a partial magma whose composition satisfies an
extensionality condition, an associativity condition, and the requirement that every
arrow have both a domain and a codomain.
The associativity condition involves four ``matching conditions''
(‹match_1›, ‹match_2›, ‹match_3›, and ‹match_4›)
which constrain the domain of definition of the composition, and a fifth condition
(‹comp_assoc'›) which states that the results of the two ways of composing
three elements are equal. In the presence of the ‹comp_assoc'› axiom
‹match_4› can be derived from ‹match_3› and vice versa.
›
locale category = partial_composition +
assumes ext: "g ⋅ f ≠ null ⟹ seq g f"
and has_domain_iff_has_codomain: "domains f ≠ {} ⟷ codomains f ≠ {}"
and match_1: "⟦ seq h g; seq (h ⋅ g) f ⟧ ⟹ seq g f"
and match_2: "⟦ seq h (g ⋅ f); seq g f ⟧ ⟹ seq h g"
and match_3: "⟦ seq g f; seq h g ⟧ ⟹ seq (h ⋅ g) f"
and comp_assoc': "⟦ seq g f; seq h g ⟧ ⟹ (h ⋅ g) ⋅ f = h ⋅ g ⋅ f"
begin
text‹
Associativity of composition holds unconditionally. This was not the case in
previous, weaker versions of this theory, and I did not notice this for some
time after updating to the current axioms. It is obviously an advantage that
no additional hypotheses have to be verified in order to apply associativity,
but a disadvantage is that this fact is now ``too readily applicable,''
so that if it is made a default simplification it tends to get in the way of
applying other simplifications that we would also like to be able to apply automatically.
So, it now seems best not to make this fact a default simplification, but rather
to invoke it explicitly where it is required.
›
lemma comp_assoc:
shows "(h ⋅ g) ⋅ f = h ⋅ g ⋅ f"
by (metis comp_assoc' ex_un_null ext match_1 match_2)
lemma match_4:
assumes "seq g f" and "seq h g"
shows "seq h (g ⋅ f)"
using assms match_3 comp_assoc by auto
lemma domains_comp:
assumes "seq g f"
shows "domains (g ⋅ f) = domains f"
proof -
have "domains (g ⋅ f) = {a. ide a ∧ seq (g ⋅ f) a}"
using domains_def ext by auto
also have "... = {a. ide a ∧ seq f a}"
using assms ide_def match_1 match_3 by meson
also have "... = domains f"
using domains_def ext by auto
finally show ?thesis by blast
qed
lemma codomains_comp:
assumes "seq g f"
shows "codomains (g ⋅ f) = codomains g"
proof -
have "codomains (g ⋅ f) = {b. ide b ∧ seq b (g ⋅ f)}"
using codomains_def ext by auto
also have "... = {b. ide b ∧ seq b g}"
using assms ide_def match_2 match_4 by meson
also have "... = codomains g"
using codomains_def ext by auto
finally show ?thesis by blast
qed
lemma has_domain_iff_arr:
shows "domains f ≠ {} ⟷ arr f"
by (simp add: arr_def has_domain_iff_has_codomain)
lemma has_codomain_iff_arr:
shows "codomains f ≠ {} ⟷ arr f"
using has_domain_iff_arr has_domain_iff_has_codomain by auto
text‹
A consequence of the category axioms is that domains and codomains, if they exist,
are unique.
›
lemma domain_unique:
assumes "a ∈ domains f" and "a' ∈ domains f"
shows "a = a'"
proof -
have "ide a ∧ seq f a ∧ ide a' ∧ seq f a'"
using assms domains_def ext by force
thus ?thesis
using match_1 ide_def not_arr_null by metis
qed
lemma codomain_unique:
assumes "b ∈ codomains f" and "b' ∈ codomains f"
shows "b = b'"
proof -
have "ide b ∧ seq b f ∧ ide b' ∧ seq b' f"
using assms codomains_def ext by force
thus ?thesis
using match_2 ide_def not_arr_null by metis
qed
lemma domains_simp:
assumes "arr f"
shows "domains f = {dom f}"
using assms dom_in_domains has_domain_iff_arr domain_unique by auto
lemma codomains_simp:
assumes "arr f"
shows "codomains f = {cod f}"
using assms cod_in_codomains has_codomain_iff_arr codomain_unique by auto
lemma domains_char:
shows "domains f = (if arr f then {dom f} else {})"
using dom_in_domains has_domain_iff_arr domain_unique by auto
lemma codomains_char:
shows "codomains f = (if arr f then {cod f} else {})"
using cod_in_codomains has_codomain_iff_arr codomain_unique by auto
text‹
A consequence of the following lemma is that the notion @{term "arr"} is redundant,
given @{term "in_hom"}, @{term "dom"}, and @{term "cod"}. However, I have retained it
because I have not been able to find a set of usefully powerful simplification rules
expressed only in terms of @{term "in_hom"} that does not result in looping in many
situations.
›
lemma arr_iff_in_hom:
shows "arr f ⟷ «f : dom f → cod f»"
using cod_in_codomains dom_in_domains has_domain_iff_arr has_codomain_iff_arr in_hom_def
by auto
lemma in_homI [intro]:
assumes "arr f" and "dom f = a" and "cod f = b"
shows "«f : a → b»"
using assms cod_in_codomains dom_in_domains has_domain_iff_arr has_codomain_iff_arr
in_hom_def
by auto
lemma in_homE [elim]:
assumes "«f : a → b»"
and "arr f ⟹ dom f = a ⟹ cod f = b ⟹ T"
shows "T"
using assms in_hom_def domains_char codomains_char has_domain_iff_arr
by (metis empty_iff singleton_iff)
text‹
To obtain the ``only if'' direction in the next two results and in similar results later
for composition and the application of functors and natural transformations,
is the reason for assuming the existence of @{term null} as a special element of the
arrow type, as opposed to, say, using option types to represent partiality.
The presence of @{term null} allows us not only to make the ``upward'' inference that
the domain of an arrow is again an arrow, but also to make the ``downward'' inference
that if @{term "dom f"} is an arrow then so is @{term f}. Similarly, we will be able
to infer not only that if @{term f} and @{term g} are composable arrows then
@{term "C g f"} is an arrow, but also that if @{term "C g f"} is an arrow then
‹f› and ‹g› are composable arrows. These inferences allow most necessary
facts about what terms denote arrows to be deduced automatically from minimal
assumptions. Typically all that is required is to assume or establish that certain
terms denote arrows in particular homs at the point where those terms are first
introduced, and then similar facts about related terms can be derived automatically.
Without this feature, nearly every proof would involve many tedious additional steps
to establish that each of the terms appearing in the proof (including all its subterms)
in fact denote arrows.
›
lemma arr_dom_iff_arr:
shows "arr (dom f) ⟷ arr f"
using dom_def dom_in_domains has_domain_iff_arr self_domain_iff_ide domains_def
by fastforce
lemma arr_cod_iff_arr:
shows "arr (cod f) ⟷ arr f"
using cod_def cod_in_codomains has_codomain_iff_arr self_codomain_iff_ide codomains_def
by fastforce
lemma arr_dom [simp]:
assumes "arr f"
shows "arr (dom f)"
using assms arr_dom_iff_arr by simp
lemma arr_cod [simp]:
assumes "arr f"
shows "arr (cod f)"
using assms arr_cod_iff_arr by simp
lemma seqI [simp]:
assumes "arr f" and "arr g" and "dom g = cod f"
shows "seq g f"
proof -
have "ide (cod f) ∧ seq (cod f) f"
using assms(1) has_codomain_iff_arr codomains_def cod_in_codomains ext by blast
moreover have "ide (cod f) ∧ seq g (cod f)"
using assms(2-3) domains_def domains_simp ext by fastforce
ultimately show ?thesis
using match_4 ide_def ext by metis
qed
text ‹
This version of ‹seqI› is useful as an introduction rule, but not as useful
as a simplification, because it requires finding the intermediary term ‹b›.
Sometimes \emph{auto} is able to do this, but other times it is more expedient
just to invoke this rule and fill in the missing terms manually, especially
when dealing with a chain of compositions.
›
lemma seqI' [intro]:
assumes "«f : a → b»" and "«g : b → c»"
shows "seq g f"
using assms by fastforce
lemma compatible_iff_seq:
shows "domains g ∩ codomains f ≠ {} ⟷ seq g f"
proof
show "domains g ∩ codomains f ≠ {} ⟹ seq g f"
using cod_in_codomains dom_in_domains empty_iff has_domain_iff_arr has_codomain_iff_arr
domain_unique codomain_unique
by (metis Int_emptyI seqI)
show "seq g f ⟹ domains g ∩ codomains f ≠ {}"
proof -
assume gf: "seq g f"
have 1: "cod f ∈ codomains f"
using gf has_domain_iff_arr domains_comp cod_in_codomains codomains_simp by blast
have "ide (cod f) ∧ seq (cod f) f"
using 1 codomains_def ext by auto
hence "seq g (cod f)"
using gf has_domain_iff_arr match_2 domains_null ide_def by metis
thus ?thesis
using domains_def 1 codomains_def by auto
qed
qed
text‹
The following is another example of a crucial ``downward'' rule that would not be possible
without a reserved @{term null} value.
›
lemma seqE [elim]:
assumes "seq g f"
and "arr f ⟹ arr g ⟹ dom g = cod f ⟹ T"
shows "T"
using assms cod_in_codomains compatible_iff_seq has_domain_iff_arr has_codomain_iff_arr
domains_comp codomains_comp domains_char codomain_unique
by (metis Int_emptyI singletonD)
lemma comp_in_homI [intro]:
assumes "«f : a → b»" and "«g : b → c»"
shows "«g ⋅ f : a → c»"
proof
show 1: "seq g f" using assms compatible_iff_seq by blast
show "dom (g ⋅ f) = a"
using assms 1 domains_comp domains_simp by blast
show "cod (g ⋅ f) = c"
using assms 1 codomains_comp codomains_simp by blast
qed
lemma comp_in_homI' [simp]:
assumes "arr f" and "arr g" and "dom f = a" and "cod g = c" and "dom g = cod f"
shows "«g ⋅ f : a → c»"
using assms by auto
lemma comp_in_homE [elim]:
assumes "«g ⋅ f : a → c»"
obtains b where "«f : a → b»" and "«g : b → c»"
using assms in_hom_def domains_comp codomains_comp
by (metis arrI in_homI seqE)
text ‹
The next two rules are useful as simplifications, but they slow down the
simplifier too much to use them by default. So it is necessary to guess when
they are needed and cite them explicitly. This is usually not too difficult.
›
lemma comp_arr_dom:
assumes "arr f" and "dom f = a"
shows "f ⋅ a = f"
using assms dom_in_domains has_domain_iff_arr domains_def ide_def by auto
lemma comp_cod_arr:
assumes "arr f" and "cod f = b"
shows "b ⋅ f = f"
using assms cod_in_codomains has_codomain_iff_arr ide_def codomains_def by auto
lemma ide_char:
shows "ide a ⟷ arr a ∧ dom a = a ∧ cod a = a"
using ide_in_hom by auto
text ‹
In some contexts, this rule causes the simplifier to loop, but it is too useful
not to have as a default simplification. In cases where it is a problem, usually
a method like \emph{blast} or \emph{force} will succeed if this rule is cited
explicitly.
›
lemma ideD [simp]:
assumes "ide a"
shows "arr a" and "dom a = a" and "cod a = a"
using assms ide_char by auto
lemma ide_dom [simp]:
assumes "arr f"
shows "ide (dom f)"
using assms dom_in_domains has_domain_iff_arr domains_def by auto
lemma ide_cod [simp]:
assumes "arr f"
shows "ide (cod f)"
using assms cod_in_codomains has_codomain_iff_arr codomains_def by auto
lemma dom_eqI:
assumes "ide a" and "seq f a"
shows "dom f = a"
using assms cod_in_codomains codomain_unique ide_char
by (metis seqE)
lemma cod_eqI:
assumes "ide b" and "seq b f"
shows "cod f = b"
using assms dom_in_domains domain_unique ide_char
by (metis seqE)
lemma dom_eqI':
assumes "a ∈ domains f"
shows "a = dom f"
using assms dom_in_domains domain_unique by blast
lemma cod_eqI':
assumes "a ∈ codomains f"
shows "a = cod f"
using assms cod_in_codomains codomain_unique by blast
lemma ide_char':
shows "ide a ⟷ arr a ∧ (dom a = a ∨ cod a = a)"
using ide_dom ide_cod ide_char by metis
lemma dom_dom:
shows "dom (dom f) = dom f"
by (metis dom_null domains_char ideD(2) ide_dom dom_def)
lemma cod_cod:
shows "cod (cod f) = cod f"
by (metis arr_cod_iff_arr cod_def has_codomain_iff_arr ideD(3) ide_cod)
lemma dom_cod:
shows "dom (cod f) = cod f"
by (metis arr_cod_iff_arr cod_def has_codomain_iff_arr has_domain_iff_arr ideD(2)
ide_cod dom_def)
lemma cod_dom:
shows "cod (dom f) = dom f"
by (metis cod_null has_domain_iff_arr ideD(3) ide_dom dom_def)
lemma dom_comp [simp]:
assumes "seq g f"
shows "dom (g ⋅ f) = dom f"
using assms by (simp add: dom_def domains_comp)
lemma cod_comp [simp]:
assumes "seq g f"
shows "cod (g ⋅ f) = cod g"
using assms by (simp add: cod_def codomains_comp)
lemma comp_ide_self [simp]:
assumes "ide a"
shows "a ⋅ a = a"
using assms comp_arr_ide arrI by auto
lemma ide_compE [elim]:
assumes "ide (g ⋅ f)"
and "seq g f ⟹ seq f g ⟹ g ⋅ f = dom f ⟹ g ⋅ f = cod g ⟹ T"
shows "T"
using assms dom_comp cod_comp ide_char ide_in_hom
by (metis seqE seqI)
text ‹
The next two results are sometimes useful for performing manipulations at the
head of a chain of composed arrows. I have adopted the convention that such
chains are canonically represented in right-associated form. This makes it
easy to perform manipulations at the ``tail'' of a chain, but more difficult
to perform them at the ``head''. These results take care of the rote manipulations
using associativity that are needed to either permute or combine arrows at the
head of a chain.
›
lemma comp_permute:
assumes "f ⋅ g = k ⋅ l" and "seq f g" and "seq g h"
shows "f ⋅ g ⋅ h = k ⋅ l ⋅ h"
using assms by (metis comp_assoc)
lemma comp_reduce:
assumes "f ⋅ g = k" and "seq f g" and "seq g h"
shows "f ⋅ g ⋅ h = k ⋅ h"
using assms comp_assoc by auto
text‹
Here we define some common configurations of arrows.
These are defined as abbreviations, because we want all ``diagrammatic'' assumptions
in a theorem to reduce readily to a conjunction of assertions of the basic forms
@{term "arr f"}, @{term "dom f = X"}, @{term "cod f = Y"}, and @{term "in_hom f a b"}.
›
abbreviation endo
where "endo f ≡ seq f f"
abbreviation antipar
where "antipar f g ≡ seq g f ∧ seq f g"
abbreviation span
where "span f g ≡ arr f ∧ arr g ∧ dom f = dom g"
abbreviation cospan
where "cospan f g ≡ arr f ∧ arr g ∧ cod f = cod g"
abbreviation par
where "par f g ≡ arr f ∧ arr g ∧ dom f = dom g ∧ cod f = cod g"
end
end