Abstract
This development formalizes the Aho-Corasick multi-pattern string matching
algorithm over lists. The verified executable reference automaton uses the
canonical Aho-Corasick state: after reading a text prefix, the state is the
longest suffix of that prefix that is also a pattern prefix. The development
also proves a state-only transition theorem, a fold-based scan invariant, and
a recursive failure-link transition refining the canonical transition, together
with equivalent state-carrying and failure-link search procedures. The main
theorem shows that the search procedure reports exactly the pattern occurrences
ending at each text position.
License
Note
Codex with Gpt 5.5 on xhigh was used to help with proof engineering. Proof manually checked by the authors.