By Johannes K. Fichte, Daniel Le Berre, Markus Hecher, Stefan Szeider
Communications of the ACM,
June 2023,
Vol. 66 No. 6, Pages 64-72
10.1145/3560469

The propositional satisfiability problem (SAT) was the first problem shown to be NP-complete, by Cook and Levin. For decades, SAT remained the embodiment of theoretical worst-case hardness. In stark contrast to this theoretical hardness, however, SAT has emerged as a central target problem for efficiently solving a wide variety of computational problems. SAT solving technology has continuously advanced since a breakthrough around the millennium, which catapulted practical SAT solving ahead by orders of magnitude. Today, the many flavors of SAT technology can be found in all areas of technological innovation.
SAT asks whether a given propositional formula is satisfiable. That is, can we set the formula’s variables to values 1 (True) or 0 (False) in such a way that the entire formula evaluates to 1? F = (x1 ∨ x2 ∨ x3) ∧ (¬x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x2) ∧ (x2 ∨ x3) is a simple propositional formula in conjunctive normal form (CNF), where x1, x2, and x3 are propositional variables and ∨, ∧, and ¬ refer to the logical operators OR (disjunction), AND (conjunction), and NOT (negation), respectively. A variable xi or a negated variable ¬xi is a literal, and a disjunction of literals is a clause. So, the above formula F is a conjunction of four clauses. The formula is satisfiable; we can satisfy it by the truth assignment that sets x1 and x2 to 1, and x3 to 0: the first, third, and fourth clauses are satisfied by x2 = 1 because the clauses contain x2. The second clause is satisfied by x3 = 0 because it contains ¬x3. In consequence, all clauses are satisfied. A truth assignment naturally extends from variables to literals by setting ¬x to the opposite value of x. Hence, a formula is satisfiable if and only if there is a truth assignment that sets at least one literal in each clause to 1.
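To make this concrete, the satisfiability of such a small formula can be checked by brute force over all 2³ truth assignments. The following Python sketch (an illustration, not how real solvers work) represents each literal as a signed integer, i for xi and -i for ¬xi, a convention we will meet again in the DIMACS format below:

```python
from itertools import product

# Formula F as a list of clauses; integer i stands for xi, -i for ¬xi.
F = [[1, 2, 3], [-1, -2, -3], [-1, 2], [2, 3]]

def satisfies(assignment, clauses):
    # A clause is satisfied if at least one of its literals evaluates to 1.
    return all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in clauses)

# Enumerate all 2^3 truth assignments over x1, x2, x3.
for values in product([False, True], repeat=3):
    assignment = {i + 1: v for i, v in enumerate(values)}
    if satisfies(assignment, F):
        print("satisfying assignment:", assignment)
```

Running this prints four satisfying assignments, among them the one discussed above with x1 = x2 = 1 and x3 = 0.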
Example 1 shows a larger formula that is unsatisfiable—that is, not satisfied by any assignment. The focus on CNF formulas is not a restriction. The so-called Tseitin transformation39 efficiently transforms any propositional formula into CNF without affecting its satisfiability.
Figure. Example 1. Under each of the 2⁹ truth assignments to the variables x1, …, x9, at least one of G’s clauses evaluates to 0, making the formula G unsatisfiable.
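To give a flavor of how the Tseitin transformation works, the following minimal Python sketch shows its core step for a single OR gate (the function name and setting are ours, for illustration): a fresh variable y is introduced together with clauses enforcing y ↔ (a ∨ b), so that y can replace the subformula a ∨ b wherever it occurs. Applying this step bottom-up to every connective yields a CNF whose size is linear in the size of the original formula.

```python
# One Tseitin step: introduce a fresh variable y for the subformula
# (a OR b) and add clauses enforcing y <-> (a OR b).
def tseitin_or(a, b, fresh):
    """Return (y, clauses), where y is a new variable equivalent to a OR b."""
    y = fresh
    clauses = [
        [-y, a, b],  # y implies (a OR b)
        [y, -a],     # a implies y
        [y, -b],     # b implies y
    ]
    return y, clauses

# Example: define y = x4 as shorthand for (x1 OR x2).
y, definitional_clauses = tseitin_or(1, 2, fresh=4)
print(definitional_clauses)  # [[-4, 1, 2], [4, -1], [4, -2]]
```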
At first glance, the SAT problem looks inconspicuous since it is simple to state, does not look difficult to solve, and seems uninteresting for practical purposes. Still, Stephen Cook7 and Leonid Levin29 showed independently in the 1970s that SAT is NP-complete, making it the first NP-complete problem. So, suppose one could solve SAT in polynomial time on arbitrary input. In that case, one could also solve any NP-complete problem in polynomial time, and it would follow that P equals NP. Thus, in terms of worst-case complexity theory, SAT embodies computational hardness. Also, in modern complexity theory, SAT continues to serve as a hard benchmark problem in the form of the (Strong) Exponential Time Hypothesis.4
In stark contrast to its theoretical worst-case hardness, the SAT problem has emerged as an essential instrument for efficiently solving a wide variety of computational problems, ranging from hardware and software verification to planning, combinatorial design, and software dependency management.1,4 In this way, SAT significantly impacts today’s technological innovation. SAT is widely applied in knowledge representation, reasoning,19 and artificial intelligence (AI).9 Although SAT is mainly associated with symbolic AI, it contributes to non-symbolic AI by providing model-counting algorithms, which are essential tools for probabilistic reasoning,9 and it provides a backbone for neurosymbolic AI.24 Notably, long-standing open problems in mathematical combinatorics, such as the Pythagorean Triples Problem, have been solved using SAT.4,21 Solving such challenging problems with SAT requires the ability to efficiently translate the original problem into an instance of the SAT problem and the availability of computer programs, called SAT solvers, that efficiently evaluate SAT instances. Initial progress in practical SAT solving began in the early 1990s, leading to a breakthrough around the millennium. The last two decades have brought further enormous technological advancement and innovation to SAT solving. Today, SAT solvers are so powerful and robust that they have become primary tools for solving hard computational problems. Solvers have been embedded into complex procedures to solve more demanding problems, such as optimization problems (MaxSAT, Pseudo-Boolean Optimization) or quantified satisfiability (QSAT).
“The story of satisfiability is the tale of a triumph of software engineering, blended with rich doses of beautiful mathematics,” writes Donald E. Knuth in the preface to the second part of the fourth volume of The Art of Computer Programming,28 which contains a section on satisfiability that stretches well over 300 pages.
Given the incredible advances in computer hardware, with ever-faster processors, larger and more efficient memory, and massively parallel computing units, one might ask whether progress in SAT solving is simply the result of hardware advancement. A recent Time Leap Challenge addressed this question by running a race between 20-year-old SAT solvers on new computer hardware and modern SAT solvers on 20-year-old computer hardware.16 The experiments confirm Knuth’s statement on engineering and mathematics: although hardware improvements make old solvers faster, algorithmic progress dominates and drives today’s SAT solving.
This article aims to shed light on the continuing progress of practical SAT solving through a series of evolutionary innovations and improvements that have been ongoing since the revolutionary breakthrough around the millennium. Overall, we argue that SAT has earned the title of a silent (r)evolution. We tell the story of SAT divided into three eras: the pre-revolution, the revolution, and the evolution.
Eras of Practical SAT Solving
Era I: The pre-revolution. The building blocks of today’s SAT success are manifold, some dating back to the first half of the 20th century. We refer to other sources for a detailed description of SAT’s history4 and focus on a few important milestones of the modern era. Complete and incomplete solvers were the prevalent SAT solvers in the 1990s. Incomplete solvers, based on stochastic local search,23 were successfully applied to planning problems4 and satisfiable instances. In contrast, complete solvers, based on backtracking search, were used to solve combinatorial problems (such as puzzles, N-queens, and Latin squares) as well as uniform random k-SAT formulas (including unsatisfiable ones). These early complete solvers followed the general approach, called DPLL, which was first proposed by Davis, Logemann, and Loveland (DLL) as a memory-efficient refinement of an earlier algorithm by Davis and Putnam (DP).a,11,12
In modern terminology, DPLL is a backtracking algorithm that performs a depth-first exploration of a binary search tree on truth assignments (see Example 2). It repeatedly applies the following two optimization steps after a variable has been assigned, propagating the current partial assignment: if the current assignment sets all but one literal of a clause to 0 (the clause is a unit clause), then we can safely set the remaining unassigned literal to 1—unit propagation is the repeated application of this process; if the current assignment satisfies all clauses containing some unassigned literal, then we can safely set that literal to 0—the opposite literal is called a pure literal. Backtracking occurs when the current partial assignment sets all the literals of a clause to 0.
Figure. Example 2. Search Tree: Illustration of a possible run of the DPLL algorithm on the formula G from Example 1. Circles indicate decision variables, and unit propagation is represented by the list of propagated literals. After running into a conflict in one branch, DPLL chronologically backtracks to the last decision, which in our example causes the search to run into the same conflict several times.
Heuristic methods that decide when and how variables are assigned play a crucial role in the context of DPLL.8
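For concreteness, the following Python sketch gives a minimal, unoptimized rendering of the DPLL scheme just described; pure-literal elimination is omitted for brevity, and the branching heuristic simply picks the first free variable:

```python
def unit_propagate(clauses, assignment):
    # Repeatedly assign the remaining literal of every unit clause to 1.
    # 'assignment' is a set of literals currently set to 1.
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(l in assignment for l in clause):
                continue  # clause already satisfied
            free = [l for l in clause
                    if l not in assignment and -l not in assignment]
            if not free:
                return None  # conflict: all literals of the clause are 0
            if len(free) == 1:
                assignment = assignment | {free[0]}  # unit clause
                changed = True
    return assignment

def dpll(clauses, assignment=frozenset()):
    assignment = unit_propagate(clauses, assignment)
    if assignment is None:
        return None  # conflict: backtrack
    variables = {abs(l) for c in clauses for l in c}
    free = [v for v in variables
            if v not in assignment and -v not in assignment]
    if not free:
        return assignment  # every variable assigned: satisfiable
    v = free[0]  # naive branching heuristic
    for branch in (v, -v):  # decide v = 1 first, then v = 0
        result = dpll(clauses, assignment | {branch})
        if result is not None:
            return result
    return None  # both branches fail: unsatisfiable

F = [[1, 2, 3], [-1, -2, -3], [-1, 2], [2, 3]]
print(dpll(F))  # a set of literals set to 1, or None if unsatisfiable
```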
Early implementation challenges. Three SAT competitions were organized in the 1990s. The second competition, 1993’s DIMACS implementation challenge,4 introduced the standard ASCII input format for SAT solvers, which is still in use. This standardized input format supports reusing SAT solvers as black boxes and sharing benchmarks. The DIMACS CNF format describes a propositional formula. The preamble provides the format (CNF), the number of variables, and the number of clauses. The remaining lines denote clauses, where each literal is represented by a signed integer, using 0 as a separator. Example 3 illustrates our running example containing nine variables and 17 clauses in the DIMACS CNF format.
Figure. Example 3. DIMACS CNF representation of Example 1’s formula G. Try this example at http://bit.ly/406Lqc7 using, for instance, the kissat SAT solver (http://bit.ly/3oa8zx9) to solve it with a real SAT solver.
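Since Example 1’s formula G has 17 clauses, we illustrate the format here with the smaller formula F from the introduction instead. Its DIMACS CNF encoding reads:

```
c F = (x1 v x2 v x3) & (~x1 v ~x2 v ~x3) & (~x1 v x2) & (x2 v x3)
p cnf 3 4
1 2 3 0
-1 -2 -3 0
-1 2 0
2 3 0
```

Lines starting with c are comments, the preamble p cnf 3 4 announces three variables and four clauses, and 0 terminates each clause.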
Era II: The revolution. The first pillar of the revolution30 was the solver GRASP,31 which proposed a new architecture combining non-chronological backtracking, conflict analysis, and learning, today referred to as conflict-driven clause learning (CDCL). CDCL is more than just DPLL with learning, since it is no longer a pure backtracking search. It captures unit propagation in a directed acyclic implication graph that is used to perform conflict analysis and clause learning, which prominently drives the search. Modern SAT solvers use the trail data structure proposed by MiniSat13 to capture the search tree and the implication graph.
The second pillar was the CDCL-based solver Chaff,32 specially designed to solve large benchmark instances by taking the characteristics of the host computer into account, achieving an unprecedented balance between the sophistication of algorithms and data structures on the one side and practical efficiency on the other. Chaff introduced the watched-literals data structure, a “lazy” scheme for performing unit propagation that makes backtracking essentially cost-free. Its branching heuristic (VSIDS) and conflict-analysis procedure were carefully designed with a new tradeoff between reasoning power and computational cost: the emphasis on recent conflicts leads to a locality-based search.
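The watched-literals idea can be sketched as follows (a simplified, illustrative version; real solvers interleave this with the trail and conflict analysis): every clause watches two of its literals, and a clause is inspected only when one of its watched literals becomes 0, at which point the solver tries to move the watch to another non-false literal.

```python
from collections import defaultdict

def propagate(clauses, decisions):
    """Simplified two-watched-literals unit propagation (illustration only).
    Literals are nonzero integers; each clause (a list with at least two
    literals) watches its first two positions."""
    assigned = set()                        # literals currently set to 1
    watches = defaultdict(list)             # literal -> watching clauses
    for c in clauses:
        watches[c[0]].append(c)
        watches[c[1]].append(c)
    queue = list(decisions)
    while queue:
        p = queue.pop()
        if -p in assigned:
            return None, assigned           # conflicting decisions
        if p in assigned:
            continue
        assigned.add(p)
        pending, watches[-p] = watches[-p], []
        for c in pending:                   # clauses whose watch became 0
            if c[0] == -p:                  # keep the false watch at c[1]
                c[0], c[1] = c[1], c[0]
            for k in range(2, len(c)):      # search for a replacement watch
                if -c[k] not in assigned:   # c[k] is not false: watch it
                    c[1], c[k] = c[k], c[1]
                    watches[c[1]].append(c)
                    break
            else:                           # no replacement found
                watches[-p].append(c)       # keep watching -p
                if -c[0] in assigned:
                    return None, assigned   # all literals 0: conflict
                if c[0] not in assigned:
                    queue.append(c[0])      # unit clause: propagate c[0]
    return True, assigned

F = [[1, 2, 3], [-1, -2, -3], [-1, 2], [2, 3]]
print(propagate([list(c) for c in F], decisions=[1]))
# -> (True, {1, 2, -3}): deciding x1 = 1 propagates x2 = 1 and x3 = 0
```

Backtracking is essentially free in this scheme: when assignments are undone, the watches can stay where they are, since an unassigned literal is never false.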
We illustrate CDCL for our running example in Example 4. Many conflict clauses are learned during the search, which is quite demanding on memory. Modern CDCL solvers frequently delete learned clauses and heuristically predict the ones to keep.4 Various heuristics, such as VSIDS,32 EVSIDS,13 or LRB,4 are available for optimizing the search and assigning the variables. Those heuristics impose low overhead, generate almost no cost for backtracking, and are updated when learning clauses. To escape unlucky search tree exploration during the search, solvers frequently unassign all variables and restart the search (but keep the learned clauses). Common heuristic schemes that decide when to restart are Luby or Rapid Restarts.4 Also, machine-learning techniques have been used to optimize heuristics.4 Still, more careful analysis is needed to better understand how the components of the solver and their interaction affect its overall efficiency.14,27
Figure. Example 4. Propagation Graph: Trace of a possible CDCL run on the formula G from Example 1.
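To illustrate one of these restart schemes, the following sketch generates the Luby sequence via Knuth’s reluctant-doubling recurrence; solvers that use it typically restart after the next sequence element times a constant number of conflicts (the constant 512 below is an illustrative choice):

```python
from itertools import islice

def luby():
    """Generate the Luby sequence 1, 1, 2, 1, 1, 2, 4, ... using
    Knuth's reluctant-doubling recurrence."""
    u, v = 1, 1
    while True:
        yield v
        if (u & -u) == v:
            u, v = u + 1, 1
        else:
            v *= 2

print(list(islice(luby(), 12)))  # [1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1]
# A solver might restart after 512 * v conflicts for successive values v.
```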
Some critical applications drew much attention to practical SAT solving from academia and industry. For example, bounded model checking (BMC)4 provided numerous challenging benchmarks, which were soon tackled. BMC, which checks the correctness of sequential systems over a bounded number of steps, was proposed after the failure of standard model checking with binary decision diagrams (BDDs), a graph-based canonical representation of propositional formulas.6 The BMC benchmarks were extensive, with tens of thousands of variables and clauses. Combining conflict-driven heuristics with conflict analysis (CDCL) could solve BMC benchmarks up to two orders of magnitude faster than any other approach. These impressive results applied to other application benchmarks as well. The computer-aided verification (CAV) community acknowledged the central role of GRASP and Chaff for that outstanding progress by presenting their authors with the CAV 2009 award for “fundamental contributions to the development of high-performance Boolean satisfiability solvers.”
Era III: The evolution. While improving solvers remains an important goal, the community has also made substantial progress on new modeling techniques and on methods for certifying results.
Efficient SAT encodings. It is challenging to express a computational problem as a CNF formula that a solver can process efficiently. Encodings may blow up the resulting CNF formula considerably, incurring quadratic, cubic, or even worse overhead compared to the original problem instance. Usually, one faces a tradeoff between space and simplicity. Intuitively, one tries to keep the instance as small and concise as possible, balancing the number of variables against the number of clauses. If we have n variables and add m auxiliary variables, the potential search space grows from 2ⁿ to 2ⁿ⁺ᵐ. Still, in practice, adding such variables can reduce the number of clauses, thereby speeding up the search.4 A similar pattern was observed by early automatic test pattern-generation programs, such as path-oriented decision making (PODEM).4 This is particularly relevant when encoding properties that restrict how many variables from a given set may be set to 1, called cardinality constraints. We give a simple example in Example 5.
Figure. Example 5. Encoding techniques based on binary adders.37
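For intuition, the simplest cardinality encoding is the pairwise at-most-one constraint: one binary clause per pair of variables forbids both from being 1 simultaneously, which yields a quadratic number of clauses.

```python
from itertools import combinations

def at_most_one(variables):
    # Pairwise encoding: for every pair, at least one variable must be 0.
    return [[-a, -b] for a, b in combinations(variables, 2)]

print(at_most_one([1, 2, 3]))  # [[-1, -2], [-1, -3], [-2, -3]]
```

Encodings with auxiliary variables, such as sequential counters or the adder-based scheme of Example 5, reduce this to a linear number of clauses, exemplifying the space tradeoff discussed above.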
Assumptions. Using SAT solvers as “oracles” is far more complex in practice than just having an NP oracle in theoretical considerations. On the one hand, as successful as SAT solvers are, they are not NP oracles. SAT solvers need time for solving and may not even answer within a reasonable time. On the other hand, modern SAT solvers are more sophisticated than NP oracles. They admit stateful procedures, where clauses may be added or removed during solving. To control added clauses without invalidating learned clauses, modern solvers support assumptions, popularized by MiniSat,13 which are additional literals introduced for representing the state of clauses. It turned out that assumptions are highly versatile and useful. We demonstrate assumptions in Example 6.
Figure. Example 6. Assumptions make our running formula satisfiable.
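The mechanism can be sketched with the python-sat package (an assumption of this sketch; any incremental solver offering assumptions works similarly): adding a selector literal ¬s to a clause means the clause is active exactly when s is assumed, while assuming ¬s deactivates it without the solver discarding what it has learned.

```python
# Selector-based assumptions, sketched with the python-sat package.
from pysat.solvers import Glucose3

s = 10                       # fresh selector variable
solver = Glucose3()
solver.add_clause([1, 2])    # permanent clause: x1 OR x2
solver.add_clause([-1, -s])  # selectable clause: s implies NOT x1

# Assuming s activates the selectable clause, forcing x1 = 0:
print(solver.solve(assumptions=[s]), solver.get_model())

# Assuming NOT s satisfies the clause trivially, "removing" it
# for this call while keeping all learned clauses:
print(solver.solve(assumptions=[-s]))
solver.delete()
```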