In the tables below (Figures 33–35), we can see an impressive list of formalizations from some of the above provers.
As shown in the above sections, even among just these 17 provers some use higher order logic, first order logic and ZFC set theory while others use Tarski-Grothendieck set theory which is a non-conservative extension of ZFC that includes the Tarski’s axiom to help prove more theorems. Some use Hilbert-style axiomatic systems while others use Jaskowski’s system of natural deduction where proofs are based on the use of assumptions which are freely introduced but dropped under some conditions. Some use classical logic systems while others use intuitionistic or constructive logic which is a restriction of classical logic wherein the law of excluded middle and double negation elimination have been removed. And one uses LUTINS (Logic of Undefined Terms for Inference in a Natural Styles), a variant of Church’s simple type theory
Type theory was originally introduced in contrast to set theory to avoid paradoxes and is known to be more expressive. But expressiveness tends to impact automation negatively. As noted in this article “A Survey on Formal Verification Techniques for Safety-Critical Systems-on-Chip”: (Figure-36)
“…increase in expressiveness has the inverse effect on automation. The more expressive a logic is, the less automated it can be. This means that tools for higher-order logic usually work iteratively, where the user is responsible for providing simpler proofs to help guide the verification process.”
With each choice there is a tradeoff. But to help tease these tradeoffs out in a foolproof or reliable and faster way, we need help. The turnaround time for humans to spot an error, correct it and re-check can be far too long. Around 1900, Bertrand Russell, a mathematician, logician and philosopher found a bug (Russell’s paradox) in set theory which eventually led to the discovery/invention of type theory. A hundred years later around 2000, Vladimir Voevodsky (a Fields medalist) finds an error in his own paper (written seven years earlier!) which leads to the discovery/invention of univalent foundations. Voevodsky notes in this article titled The Origins and Motivations of Univalent Foundations the following:
“In 1999–2000, again at the IAS, I was giving a series of lectures, and Pierre Deligne (Professor in the School of Mathematics) was taking notes and checking every step of my arguments. Only then did I discover that the proof of a key lemma in my paper contained a mistake and that the lemma, as stated, could not be salvaged. Fortunately, I was able to prove a weaker and more complicated lemma, which turned out to be sufficient for all applications. A corrected sequence of arguments was published in 2006. This story got me scared. Starting from 1993, multiple groups of mathematicians studied my paper at seminars and used it in their work and none of them noticed the mistake. And it clearly was not an accident. A technical argument by a trusted author, which is hard to check and looks similar to arguments known to be correct, is hardly ever checked in detail.”
Voevodsky goes on to propose a new homotypy type theory to provide univalent foundations for mathematics, based on relations between homotopy and type theory. This helped formalize a considerable amount of mathematics and proof assistants such as Coq and Agda moving the needle in ATP.
Choices such as set theory versus type theory are so core to automated reasoning that this field needs a much deeper understanding of the fundamentals of mathematics and hence needs a lot more mathematicians to work alongside computer scientists, logicians and philosophers to accelerate progress. In this very nicely written article titled Will Computers Redefine the Roots of Math?, the author talks about a recent cross disciplinary collaboration which we need more of:
Along with Thierry Coquand, a computer scientist at the University of Gothenburg in Sweden, Voevodsky and Awodey organized a special research year to take place at IAS during the 2012–2013 academic year. More than thirty computer scientists, logicians and mathematicians came from around the world to participate. Voevodsky said the ideas they discussed were so strange that at the outset, “there wasn’t a single person there who was entirely comfortable with it.” The ideas may have been slightly alien, but they were also exciting. Shulman deferred the start of a new job in order to take part in the project. “I think a lot of us felt we were on the cusp of something big, something really important,” he said, “and it was worth making some sacrifices to be involved with the genesis of it.”
One trend to note is that the complexity of theorem provers has slowly been growing from simple systems to system of systems, i.e. pulling in the best reasoners from multiple sources. But this also means that maintaining such systems (for performance, robustness, accuracy, reliability) is getting harder. Let’s take a sneak peak at the architectures of some of them: ACL2, Omega, LEAN, ALLIGATOR, iGeoTutor/GeoLogic, Matryoshka and Keymaera.
We saw ACL2 earlier. It has an IDE supporting several modes of interaction, it provides a powerful termination analysis engine and has automatic bug-finding methods. Most notably, ACL2 (and its precursor Nqthm) won the ACM Software System Award in 2005 (Boyer-Moore Theorem Prover) which is awarded to software systems that have had a lasting influence. Figure-37 shows a high-level flow.
We saw Omega earlier as well. Interesting to note here (Figure-38) is how external reasoners (like OTTER, SPASS for first-order or TPS, LEO for higher order) are invoked as needed.
LEAN is an open source theorem prover developed at Microsoft Research and CMU. It is based on the logical foundations of CiC (Calculus of Inductive Constructions) and has a small trusted kernel. It brings together tools and approaches into a common framework to allow for user interaction hence bringing ATP and ITP closer. Figure-39 shows some of the key modules.
ALLIGATOR is a natural deduction theorem prover based on dependent type systems. It breaks goals down and generates proof objects (text files with assertions leading to a conclusion), an interesting part of the architecture shown in Figure-40. As noted in the paper “The ALLIGATOR Theorem Prover for Dependent Type Systems: Description and Proof Sample”, the authors identify the following key advantages of having proof objects:
- Reliability: Satisfies de Bruijn’s criterion that an algorithm can check the proof objects easily.
- Naturalness: Based on natural deduction (closer to human reasoning — less axiomatic)
- Relevance: Proof objects can identify proofs which are valid but spurious (do not consume their premises)
- Justification of behavior: Proof objects provide direct access to the justifications that an agent has for the conclusions and the interpretations that it constructs, i.e. more explainability.
An important goal in efforts like iGeoTutor (See Automated Geometry Theorem Proving for Human-Readable Proofs, Figure-41) and GeoLogic is to generate more human readable proofs in geometry. GeoLogic is a logic system for Euclidean geometry allowing users to interact with the logic system and visualize the geometry (equal distances, point being on a line …). Here’s a detailed presentation by Pedro Quaresma on Geometric Automated Theorem Proving dating back to efforts in 1959 by H.Gelerntner (See Realization of a geometry-theorem proving machine)
Matryoshka fuses two lines of research — Automated Theorem Proving and Interactive Theorem Proving. These systems are based on the premise that first-order (FO) provers are best suited for performing most of the logical work but they go on to to enrich superposition (Resolution+Rewriting) and SMT (Satisfiability Module Theories) with higher-order (HO) reasoning to preserve their desirable properties (Figure-42 and Figure-43)
KeYmaera is a theorem prover for differential dynamic logic for verifying properties of hybrid systems (i.e. mixed discrete and continuous dynamics). Figure-44 shows three big parts to the system — the axiomatic core, the tactical prover and the user interface. The architecture gives us a sense of the multiple moving parts.