
Introduction
This tutorial concerns the Boolean satisfiability or SAT problem. We are given a formula containing binary variables that are connected by logical relations such as
The tutorial is divided into three parts. In part I, we introduce Boolean logic and the SAT problem. We discuss how to transform SAT problems into a standard form that is amenable to algorithmic manipulation. We categorize types of SAT solvers and present two naïve algorithms. We introduce several SAT constructions, which can be thought of as common sub-routines for SAT problems. Finally, we present some applications; the Boolean satisfiability problem may seem abstract, but as we shall see it has many practical uses.
In part II of the tutorial, we will dig more deeply into the internals of modern SAT solver algorithms. In part III, we recast SAT solving in terms of message passing on factor graphs. We also discuss satisfiability modulo theory (SMT) solvers, which extend the machinery of SAT solvers to solve more general problems involving continuous variables.
Relevance to machine learning
The relevance of SAT solvers to machine learning is not immediately obvious. However, there are two direct connections. First, machine learning algorithms rely on optimization. SAT can also be considered an optimization problem and SAT solvers can find global optima without relying on gradients. Indeed, in this tutorial, we’ll show how to fit both neural networks and decision trees using SAT solvers.
Second, machine learning techniques are often used as components of SAT solvers; in part II of this tutorial, we’ll discuss how reinforcement learning can be used to speed up SAT solving, and in part III we will show that there is a close connection between factor graphs and SAT solvers and that belief propagation algorithms can be used to solve satisfiability problems.
Boolean logic and satisfiability
In this section, we define a set of Boolean operators and show how they are combined into Boolean logic formulae. Then we introduce the Boolean satisfiability problem.
Boolean operators
Boolean operators are standard functions that take one or more binary variables as input and return a single binary output. Hence, they can be defined by truth tables in which we enumerate every combination of inputs and define the output for each (figure 1). Common logical operators include:

Figure 1. Logical operators. Each of the first four operators (
- The
operator is written as and takes two inputs and . It returns if one or both of the inputs are and returns otherwise. - The
operator is written as and takes two inputs and . It returns if both the inputs are and otherwise. - The
operator is written as and evaluates whether the two inputs are consistent with the statement ‘if then ’. The statement is only disobeyed when is and is and so implication returns for this combination of inputs and otherwise. - The
operator is written as and takes two inputs and . It returns if the two inputs are the same and returns otherwise. - The
operator is written as and takes one input. It returns if the input is and vice-versa. We refer as the complement of .
Boolean logic formulae
A Boolean logic formula
For any combination of input variables
Boolean satisfiability and SAT solvers
The Boolean satisfiability problem asks whether there is at least one combination of binary input variables
A SAT solver is an algorithm for establishing satisfiability. It takes the Boolean logic formula as input and returns
Conjunctive normal form
To solve the SAT problem, we first convert the Boolean logic formula to a standard form that it is more amenable to algorithmic manipulation. Any formula can be re-written as a conjunction of disjunctions (i.e., the logical
Each term in brackets is known as a clause and combines together variables and their complements with a series of logical
The Tseitin transformation
The Tseitin transformation converts an arbitrary logic formula to conjunctive normal form. The approach is to i) associate new variables with sub-parts of the formula using logical equivalence relations, (ii) to restate the formula by logically
This process is most easily understood with a concrete example. Consider the conversion of the formula:
Step 1: We associate new binary variables
We work from the inside out (i.e., from the deepest brackets to the least deep) and choose sub-formulae that contain a single operator (
Step 2: We restate the formula in terms of these relations. The full original statement is now represented by
This is getting closer to the conjunctive normal form as it is now a conjunction (logical
Step 3: We convert each of these individual terms to conjunctive normal form. In practice, there is a recipe for each type of operator:
The first of these recipes is easy to understand. If
The remaining recipes are not obvious, but you can confirm that they are correct by writing out the truth tables for the left and right sides of each expression and confirming that they are the same. Applying the recipes to equation 5 we get the final expression in conjunctive normal form:
Literals
In the conjunctive normal form, each clause is a disjunction (logical
we write:
We collectively refer to the variables and their complements as literals and so this formula contains literals
-clauses and -SAT
When expressed in conjunctive normal form, we can characterise the problem in terms of the number of variables, the number of clauses and the size of those clauses. To facilitate this we introduce the following terminology:
- A clause that contains kk variables is known as a kk-clause. When a clause contains only a single variable, it is known as a unit clause.
- When all the clauses contain kk variables, we refer to a problem as kk-SAT. Using this nomenclature, we see that equation 9 is a 3-SAT problem.
Establishing satisfiability
SAT solvers are algorithms that establish whether a Boolean expression is satisfiable and they can be classified into two types. Complete algorithms guarantee to return
Here are two naïve algorithms that will help you understand the difference:
- An example of a complete algorithm is exhaustive search. If there are VV variables, we evaluate the expression with all 2V2V combinations of literals and see if any combination returns truetrue. Obviously, this will take an impractically long time when the number of variables are large, but nonetheless it is guaranteed to return either SATSAT or UNSATUNSAT eventually.
- An example of an incomplete algorithm is Schöning’s random walk. This is a Monte Carlo solver in which we repeatedly (i) randomly choose an unsatisfied clause, (ii) choose one of the variables in this clause at random and set it to the opposite value. At each step we test if the formula is now satisfied and if so return SATSAT. After 3V3V iterations, we return UNKNOWNUNKNOWN if we have not found a satisfying configuration.
When a solver returns
Bad news and good news
First, the bad news. The SAT problem is proven to be NP-complete and it follows that there is no known polynomial algorithm for establishing satisfiability in the general case. An important exception to this statement is 2-SAT for which a polynomial algorithm is known. However, for 3-SAT and above the problem is very difficult.
The good news is that modern SAT solvers are very efficient and can often solve problems involving tens of thousands of variables and millions of clauses in practice. In part II of this tutorial we will explain how these algorithms work.
Related problems to SAT
Until now we have focused on the satisfiability problem in which we try to establish if there is at least one set of literals that makes a given statement evaluate to
UNSAT: In the UNSAT problem we aim to show that there is no combination of literals that satisfies the formula. This is subtly different from SAT where algorithms return as soon as they find literals that show the formula is
Model counting: In model counting (sometimes referred to as #SAT or #CSP), our goal is to count the number of distinct sets of literals that satisfy the formula.
Max-SAT: In Max-SAT, it may be the case that a formula is
Weighted Max-SAT: This is a variation of Max-SAT in which we pay a different penalty for each clause when it is invalid. We wish to find the solution that incurs the least penalty.
For the rest of this tutorial, we’ll concentrate on the main SAT problem, but we’ll return to these related problems in part III of this tutorial when we discuss factor graph methods.
SAT Constructions
Most of the remainder of part I of this tutorial is devoted to discussing practical applications of satisfiability problems. Based on the discussion thus far, the reader would be forgiven for being sceptical about how this rather abstract problem can find real-world uses. We will attempt to convince you that it can! However, before we can do this, it will be helpful to review commonly-used SAT constructions.
SAT constructions can be thought of as subroutines for Boolean logic expressions. A common situation is that we have a set of variables
Same
To enforce the constraint that a set of variables
Note that this is not in conjunctive normal form (the
Exactly one
To enforce the constraint that only one of a set of variables
Then we add a constraint that indicates that both members of any pair of varaiables cannot be simultaneously
At least , Less than , Exactly
There are many standard ways to enforce the constraint that at least
The idea is straightforward. If we have

Figure 2. SAT construction for ‘at least
If there are at least
The table in figure 2d also shows us how to constrain the data to have exactly K

Figure 3. SAT construction for ‘at least K’ constraint. a) Consider a data vector
Armed with these SAT constructions, we’ll now present two complementary ways of thinking about SAT applications. The goal is to inspire the novice reader to see the applicability to their own problems. In the next section, we’ll consider SAT in terms of constraint satisfaction problems and in the section following that, we’ll discuss it in terms of model fitting.
Applications: constraint satisfaction
The constraint satisfaction viewpoint considers combinatorial problems where there are a very large number of potential solutions, but most of those solutions are ruled out by some pre-specified constraints. To make this explicit, we’ll consider the two examples of graph coloring and scheduling.
Graph coloring
In the graph coloring problem (figure 4) we are given a graph consisting of a set of vertices and edges. We want to associate each vertex with a color in such a way that every pair of vertices connected by an edge have different colors. We might also want to know how many colors are necessary to find a valid solution. Note that this maps to our description of the generic constraint satisfaction problem; there are a large number of possible assignments of colors, but many of these are ruled out by the constraint that neighboring colors must be different.

Figure 4. Graph coloring. a) We are given a graph consisting of a set of vertices and edges connecting them. We want to color the vertices in such a way that two vertices have different colors if they are connected by an edge. The graph coloring problem aims to establish the smallest number of colors for which this is possible and return a satisfying assignment of colors. For this graph, it can be achieved with three colors. b) The graph coloring problem has a practical application in coloring maps. Here each American state corresponds to a vertex and we add an edge if two states are adjacent. It has been famously proven that all such 2D maps require a maximum of four colors.
To encode this as a SAT problem, we’ll choose the number of colors
Having set up the problem, we run the SAT solver. If it returns
Scheduling
The graph coloring problem is a rather artificial computer science example, but many real-world problems can similarly be expressed in terms of satisfiability. For example, consider scheduling courses in a university. We have a number of professors, each of whom teach several different courses. We have a number of classrooms. We have a number of possible time-slots in each classroom. Finally, we have the students themselves, who are each signed up to a different subset of courses. We can use the SAT machinery to decide which course will be taught in which classroom and in what time-slot so that no clashes occur.
In practice, this is done by defining binary variables describing the known relations between the real world quantities. For example, we might have variables
SAT as function fitting
A second way to think about satisfiability is in terms of function fitting. Here, there is a clear connection to machine learning in which we fit complex functions (i.e., models) to training data. In fact there is a simple relationship between function-fitting and constraint satisfaction; when we fit a model, we can consider the parameters as unknown variables, and each training data/label pair represents a constraint on the values those parameters can take. In this section, we’ll consider fitting binary neural networks and decision trees.
Fitting binary neural networks
Binary neural networks are nets in which both the weights and activations are binary. Their performance can be surprisingly good, and their implementation can be extremely efficient. We’ll show how to fit a binary neural network using SAT.
Following Mezard and Mora (2008) we consider a one layer binary network with

Figure 5. A binary neural network takes a binary data example
where the unknown model parameters
Given a training set of
To encode these constraints, we create new variables
The predicted label is the sum of the elements
where
We have one such constraint for each training example and we logically
It is easy to extend this example to multi-layer networks and to allow a certain amount of training error and we leave these extensions as exercises for the reader.
Fitting decision trees
A binary decision tree also classifies data
Learning the binary decision tree can also be framed as a satisfiability problem. From a training perspective, we would like to select the tree structure so that the training examples
We’ll develop a simplified version of the approach of Narodytska et al. (2018). Incredibly, we can learn both the structure of the tree and which features to branch on simultaneously. When we run the SAT solver for a given number
We’ll describe the SAT construction in two parts. First we’ll describe how to encode the structure of the tree as a set of logical relations and then we’ll discuss how to choose branching features that classify the data correctly.
Tree structure: We create
- Each non-leaf node must branch on exactly one feature.
- The left and right ancestor variables at the root are all
. - The left ancestor variables
at a node are the same as the parent’s, but the index associated with the parents branching variable is also if we branched left to get here. - The right ancestor variables
at a node are the same as the parent’s, but the index associated with the parents branching variable is also if we branched right to get here. - You can’t branch on a variable twice in any one path to a leaf.
- A data example reaches a leaf node if the left and right ancestors match its pattern of
and elements as described above. - All data that reach a given leaf node must have the same class.
Any set of variables

Figure 6. Tree encoding with binary variables. a) Tree with 7 nodes. b) The parameters
Classification: The second part of the construction ensures that the data examples
We’ll also create several book-keeping variables that are needed to set this up as a SAT problem, but are not required to run the model once trained. We introduce ancestor variables

Figure 7. Binary decision tree construction. a) Tree with branching variables. b) The branching index
Using these variables, we build Boolean expressions to enforce the following constraints:
- Each non-leaf node must branch on exactly one feature.
- The left and right ancestor variables at the root are all
. - The left ancestor variables
at a node are the same as the parent’s, but the index associated with the parents branching variable is also if we branched left to get here. - The right ancestor variables
at a node are the same as the parent’s, but the index associated with the parents branching variable is also if we branched right to get here. - You can’t branch on a variable twice in any one path to a leaf.
- A data example reaches a leaf node if the left and right ancestors match its pattern of
and elements as described above. - All data that reach a given leaf node must have the same class.
Collectively, these constraints mean that all of the data must be correctly classified. When we logically
Conclusion
This concludes part I of this tutorial on SAT solvers. We’ve introduced the SAT problem, shown how to convert it to conjunctive normal form and presented some standard SAT constructions. Finally, we’ve described several different applications which we hope will inspire you to see SAT as a viable approach to your own problems.
In the next part of this tutorial, we’ll delve into how SAT solvers actually work. In the final part, we’ll elucidate the connections between SAT solving and factor graphs. For those readers who still harbor reservations about the applicability of a method based purely on Boolean variables, we’ll also consider (i) how to convert non-Boolean variables to binary form and (ii) methods to work with them directly using SMT solvers.
Further reading
If you want to try working with SAT algorithms, then this tutorial will help you get started. For an extremely comprehensive list of applications of satisfiability, consult SAT/SMT by example. This may give you more inspiration for how to re-frame your problems in terms of satisfiability.
Work with Us!
Impressed by the work of the team? RBC Borealis is looking to hire for various roles across different teams. Visit our career page now and discover opportunities to join similar impactful projects!
Careers at RBC Borealis