This post is an introduction to model theory, of sorts. Occasionally I get asked what model theory is, and I generally find it quite difficult to give someone who doesn’t already know any model theory a good answer to this question, that actually says anything useful about what model theory is really about without leaving them hopelessly lost. This is my attempt to provide a real taste of model theory in a way that should be accessible to a math grad student without a background in logic.
Let’s say I make a graph with the following procedure: I start with a countably infinite set of vertices. For each pair of vertices, I flip a fair coin. If the coin lands heads, I put an edge between those two vertices; if the coin lands tails, no edge.
Now you make another graph in a very similar manner. You also start with a countably infinite set of vertices. But instead of flipping a coin, you roll a fair standard six-sided die for each pair of vertices. If the die comes up 6, you put an edge between those two vertices; if it comes up anything from 1 through 5, no edge.
What is the probability that these two graphs are isomorphic?
For the numerical answer, paste “Gur zhygvcyvpngvir vqragvgl” into https://rot13.com/. An explanation will appear later in this post.
There are several cases in which we can identify a mathematical object up to isomorphism with a list of first-order properties it satisfies (I’ll tell you what that means in a sec) and some data about cardinality. Here’s a couple examples: All countable dense linear orders without endpoints are isomorphic. Any two algebraically closed fields of the same characteristic, which have transcendence bases of the same cardinality, are isomorphic. It turns out that the possibility of uniquely specifying a mathematical structure in this way corresponds to interesting structural properties of that structure.
First, the basic definitions:
A first-order language consists of a set of relation symbols, each of which is labeled with a number representing its arity (number of inputs it takes), a set of function symbols, each of which is also labeled with a number representing its arity, and a set of constant symbols (which could also just be thought of as 0-ary function symbols). For example, the language of linear orders has one binary relation and no functions or constants. The language of fields has constants and , binary functions and , a unary function (no unary function for reciprocal, because the functions should be total), and no relations.
A first-order structure in a given language is a set in which each constant symbol is interpreted as an element of , each n-ary function symbol is interpreted as a function , and each n-ary relation symbol is interpreted as a subset of (or alternatively, as a function ). So linear orders and fields are examples of structures in their respective languages.
We can compose function symbols, constant symbols, and variables into ways of pointing to elements of a structure, called terms. We have as many variables as we want, and they are terms. Constant symbols are terms. And for each n-ary function symbol and terms , is a term. So in the language of fields, we can construct terms representing integers by adding however many 1s together (and then negating to get negative numbers), and then combine these with variables using addition and multiplication to get terms representing polynomials in however many variables with integer coefficients. In the language of linear orders, since we have no functions or constants, the only terms are variables.
A first-order formula is a way of actually saying things about a first-order structure and elements of it represented by variables. If is an n-ary relation and are terms, then is a formula. If are terms, then is a formula (you can think of this as just meaning that languages always have the binary relation by default). Boolean combinations of formulas are formulas (i.e. if and are formulas, then so are , , and ), and if is a formula that refers to a variable , then and are formulas. Any variable that appears in a formula without being bound to a quantifier or is called a free variable, and if each free variable is assigned to an element of a structure, the formula makes a claim about them, which can be either true or false. For example, in a ring, is true iff is a unit.
A first-order formula with no free variables is called a sentence. These are true or false statements about a first-order structure. Many types of mathematical objects are defined by listing first-order sentences that are true of them. For instance, a linear order is a structure with a relation satisfying transitivity (), antisymmetry (), and totality (), and a linear order is dense without endpoints if it also satisfies and . These are all first-order sentences. Algebraically closed fields of a given characteristic are another example. The field axioms are first-order sentences. For each positive integer , we can formulate a first-order sentence saying that every polynomial of degree has a root: (the s represent the coefficients, with the leading coefficient normalized to ). So we just add in these infinitely many sentences, one for each . And we can say that the field has characteristic by saying (with ones), or say that it has characteristic by, for each prime , saying .
First-order sentences can tell us a lot about a structure, but not everything, unless the structure is finite.
Löwenheim–Skolem theorem: Given a countable set of first-order sentences (in particular, any set of sentences if the language is countable), if there is any infinite structure in which they are all true, then there are first-order structures of every infinite cardinality in which they are all true.
This is why the uniqueness results all have to say something about cardinality. You might also think of some examples of ways to identify an infinite mathematical object up to isomorphism with a list of axioms without saying directly anything about cardinality, but in all such cases, you’ll be using an axiom that isn’t first-order. For instance, all Dedekind-complete ordered fields are isomorphic to the reals, but Dedekind-completeness isn’t a first-order sentence. Same goes for any way of characterizing the natural numbers up to isomorphism that says something like “every set of natural numbers that contains 0 and is closed under successor contains all of the natural numbers”.
Let’s go back to the example of countable dense linear orders. If you don’t know the proof that all countable dense linear orders are isomorphic, here it goes: suppose we have two countable dense linear orders, and . Since they’re countable, we can label the elements of each of them with distinct natural numbers. We’re going to match elements of to elements of one at a time such that we get an isomorphism at the end. To ensure that every element of gets matched to something in , on odd-numbered steps, we’ll take the lowest-numbered element of that hasn’t been matched yet, and match it with an element of . Similarly, to ensure that every element of gets matched to something in , on even-numbered steps, we’ll take the lowest-numbered element of that hasn’t been matched yet, and match it with an element of . As for what we do on each step (suppose it’s an odd-numbered step; even-numbered steps are the same but with the roles of and reversed), at the start of the step, finitely many elements of have already been matched. We take the first element that hasn’t yet been matched. Call it . is either greater than all previously matched elements, less than all previously matched elements, or between two previously matched elements that don’t already have previously matched elements between them. Since is dense and has no endpoints, we know that in the first case, there will be something greater than all previously matched elements of , so we can match to it; in the second case, there will be something less than all previously matched elements of for us to match to; and in the third case, there will be something between the elements matched to the elements on either side of , which we can match to. By doing this, we continue to preserve the ordering at each step, so the bijection we get at the end is order-preserving, and thus an isomorphism.
Now let’s get back to the warm-up exercise. A graph can be viewed as a first-order structure whose elements are the vertices, with a single binary relation (the edge relation) that is symmetric and anti-reflexive (symmetry and anti-reflexivity are both first-order conditions). There are some more first-order sentences satisfied by both of our random graphs with probability 1. Given any two finite disjoint sets of vertices, we can find another vertex that’s connected to everything in the first set and not connected to anything in the second set. This is because each vertex has the same positive probability of having this property, they’re all independent, and there’s infinitely many of them, so there also must be some (in fact, infinitely many) that have all the desired edges and none of the undesired edges. To write this condition using first-order sentences, for each natural number and , we have a sentence
(the big conjunction before “” includes for each and , so that this says and are disjoint).
This is enough for us to construct an isomorphism, using essentially the same proof as for countable dense linear orders. Since we each started with countably many vertices, we can label each of our vertices with natural numbers, and then iteratively match the next available unmatched vertex in one graph to a vertex on the other, alternating between which graph we take the next available unmatched vertex from on each step, just like before. On each step, only finitely many vertices have been matched. The new vertex shares edges with some of the already matched vertices and doesn’t share edges with some others. We need to match it with a vertex in the other graph that shares exactly the same pattern of edges with previously matched vertices. And we know that somewhere in that graph, there must be such a vertex. So we can match the new vertex and keep going, and the bijection we get at the end preserves the edge relation, and is thus an isomorphism.
For the general argument that these are both special cases of, we’ll need the concept of a type (not to be confused with the identically-named concept from type theory). Given a first-order structure and , say that have the same type over if for every first-order formula (where are its free variables), holds iff does. So, for example, in a dense linear order without endpoints, if , then, in order for to have the same type as over , it must be the case that as well, since is a first-order formula, and and must satisfy exactly the same first-order formulas with parameters in . And as it turns out, this is enough; if and , then and have the same type over . In an infinite random graph, if vertices and have the same type over some other vertices , then must have an edge to each that has an edge to, and vice-versa. Again, this turns out to be enough to guarantee that they have the same type.
In both of these cases, for any finite set of elements , there are only finitely many types over . Let’s count them. Each is its own type, since is a formula, so if and have the same type over , then, since , as well. Let’s ignore these and count the rest. In a dense linear order without endpoints, we can assume WLOG that . There are nontrivial types over : , , and, for each , . In an infinite random graph, there are nontrivial types over vertices : for each , there’s a type of vertices that have edges to everything in and no edges to anything in .
Theorem: Let be a countably infinite first-order structure such that for every and , there are only finitely many types over . Then every countable structure satisfying the same first-order sentences that does is isomorphic to .
That is, our ability to specify a countable structure (up to isomorphism) by its first-order properties corresponds exactly to the condition that there are only finitely many different behaviors that elements of the structure can have in relation to any given finite subset. The proofs that all countable dense linear orders without endpoints are isomorphic and that all countable random graphs are isomorphic look the same because they both follow the proof of this theorem, which goes like so:
Suppose there are only finitely many types over . Let be one of those types, and let be the others. For each , there’s some formula that’s true for but not for . Then is a formula that holds only for . That is, the entire type is specified by a single formula; it wasn’t just coincidence that we were able to find such a formula for the types in each of our two examples.
Lemma: If and are two first-order structures in the same language which satisfy all the same sentences, and satisfy all the same formulas (i.e., for any , is true in iff is true in ), and has only finitely many types, then there’s a natural bijection between types in over and types in over . A formula is true for in some type over iff is true for in the corresponding type over .
Proof: Let be the types in over , and for each , let be a formula specifying that is the type of . These formulas specify types in over as well; for any other formula , either or in . These are first-order formulas, so again since satisfy the same first-order formulas that do, one of or is true in as well. So determines the truth value of every such formula; that is, it specifies the type of over , and formulas are true in this type iff they are true in the corresponding type in over . To show that these are all of the types in over , consider the formula . In , when we plug in , the formula is true. And satisfies all the same formulas as , so the same formula must also be true in when we plug in . That is, for any , there’s some such that is true, so every element of must have one of the above types.
Armed with this lemma, we can prove the theorem. Let and be countable structures satisfying the same first-order sentences, and suppose for every , there are only finitely many types over . We’ll match elements of to elements of one at a time, using the same back-and-forth trick from our two examples to ensure that we get a bijection at the end. After steps, we’ll have from matched with from , and we’ll want to ensure that and satisfy exactly the same first-order formulas. If we’ve done this, then on step , we’ll have an element either of or of , which we need to match with some element of the other one. We can match it to an element that has the corresponding type; that is, we’re matching and such that the type of over corresponds to the type of over . Then satisfy the same formulas that do, so by induction, and satisfy the same formulas for every (the assumption that and satisfy the same first-order sentences provides a base case). Thus, the bijection we get at the end preserves the truth-values of all formulas, so it is an isomorphism, and we’re done.
As it turns out, the converse of the theorem is also true. Given a set of first-order sentences for which there is, up to isomorphism, only one countable model, all models have only finitely many types over any finite list of elements. Whenever there’s infinitely many types, there will be some types (which cannot be specified by a single formula) that appear in some models but not in others.
Let’s turn to the other example I introduced at the beginning: any two algebraically closed fields of the same characteristic with transcendence bases of the same cardinality are isomorphic. Every field has a transcendence basis, so a corollary of this is that any two uncountable algebraically closed fields of the same characteristic and cardinality are isomorphic.
A sketch of the proof: Given algebraically closed fields and of the same characteristic, with transcendence bases and of the same cardinality, any isomorphism between a subfield of and a subfield of extends to a maximal such isomorphism (by Zorn’s lemma). Since and have the same cardinality, there’s a bijection between them, and since and have the same characteristic, this bijection extends to an isomorphism between the fields they generate. Thus there is a maximal isomorphism between subfields of and of which restricts to a bijection between and . Now we just need to show that these subfields are all of and . This is because, given any such isomorphism between subfields and with and , if , then let , and let be the minimal polynomial of . Applying the isomorphism to the coefficients gives us an irreducible polynomial over , which must have a root , and then by matching with , we get an isomorphism , contradicting maximality of the isomorphism.
Here’s another example: Any two vector spaces over the same vector space, with bases of the same cardinality, are isomorphic. Since every vector space has a basis, a corollary of this is that, over a countable field, any two uncountable vector spaces of the same cardinality are isomorphic. Citing Zorn’s lemma is overkill, since there’s only one way to extend a bijection between bases to an isomorphism. But the basic idea is the same in each case: We have an appropriate notion of basis, and we extend a bijection between bases to an isomorphism. And vector spaces are also first-order structures; the language has a binary operation , a constant , and, for each scalar , a unary operation for multiplication by .
The thing that unites both these cases is called strong minimality. A first-order structure is called minimal if every set defined by a first-order formula is either finite, or the complement of a finite set. More formally: is minimal if for every formula and , one of or is finite. We call a structure strongly minimal if every structure satisfying the same first-order sentences is also minimal. (This turns out to be equivalent to, for each , there’s a finite upper bound on the size of whichever of or is finite, as vary.)
Let’s go over the general notion of “basis” we’ll be using: Say that is algebraic over if there is a formula such that holds, and is finite. In algebraically closed fields, this corresponds to the usual notion of being algebraic over the subfield generated by . In vector spaces, this corresponds to being a linear combination of . Call independent if no element of is algebraic over any other elements of . In other words, you can’t pin down an element of to one of finitely many possibilities by using a single formula and other elements of . In a vector space, independence is linear independence. In an algebraically closed field, independence is algebraic independence. Now call a basis if it is a maximal independent set. If is minimal, this turns out to imply that every is algebraic over some . An increasing union of independent sets is independent, so by Zorn’s lemma, every structure has a basis.
Now let’s look at type spaces in minimal structures. Let be a minimal structure and . If is algebraic over , then there’s some formula such that holds and is as small as possible. Then and have the same type iff . So this type is implied by a single formula, and there are only finitely many elements of this type. There’s only one remaining type: the type of elements that aren’t algebraic over . If and are both non-algebraic over , then for every formula , since is minimal, one of and must have only finitely many solutions ; and , being non-algebraic, must both be solutions to the other one. This shows they have the same type over . This non-algebraic type is optional; in some cases, there might not be any elements that aren’t algebraic over .
Let and be minimal structures in the same language, which satisfy the same first-order sentences. They each have a basis. If those bases have the same cardinality, then and are isomorphic. Say a “partial isomorphism” between and is a bijection between a subset of and a subset of , such that whenever a formula is true about some elements of the subset of , then it is also true about the corresponding elements of the subset of , and vice-versa. If and are bases for and , respectively, then a bijection between and is a partial isomorphism (this is because if and satisfy all the same formulas, and and , then must have the unique non-algebraic type over , has the unique non-algebraic type over , and these unique non-algebraic types satisfy the same formulas, so it follows by induction on the number of variables that a formula is true of distinct elements of iff it is true of distinct elements of ). An increasing union of partial isomorphisms is a partial isomorphism, so by Zorn’s lemma, there’s a maximal partial isomorphism extending a bijection between and . If this maximal partial isomorphism is a bijection between and , and , then let . is algebraic over (since ), so there’s a single formula () that is true for , and which determines its type over (meaning, determines its type over for every ). Then, where correspond to under the partial isomorphism, there must be such that (since satisfies the same formulas do, and ). , because this can be expressed as part of the type of over , which is the same as the type of over . Thus we can extend the partial isomorphism by matching with . Thus, in our maximal partial isomorphism, , and for the same reason, , so it is an isomorphism.
So for a strongly minimal structure, the structures satisfying the same sentences are classified by the cardinality of a basis. This isn’t quite the end of the story; in some cases, a structure with too small a basis would be finite, and we could thus distinguish it from the rest with a first-order sentence saying that there are distinct elements (for large enough ). This isn’t the case for algebraically closed fields, which are infinite even when the empty set is a transcendence basis. But for vector spaces, the empty basis generates a one-element vector space, so an infinite vector space must have basis of size at least one.
And if the vector space is over a finite field, then its basis must be infinite. Another case where where the basis must be infinite is an infinite set. A set is a first-order structure in the language with no relations, no functions, and no constants. Every subset of a set is independent, so a basis for the set is just the entire set. In these cases where a basis must be infinite; there’s only one (up to isomorphism) countable model: the model with a countably infinite basis. You can check that both of these examples satisfy the finitely-many-types condition from the previous section for having a unique countable model.
So the general story, for a strongly minimal structure , is that there is some such that structures satisfying the same sentences as are classified by cardinalities that are at least , that being the cardinality of a basis. In a countable language, the cardinality of a structure is the maximum of and the cardinality of a basis, so it follows that an uncountable strongly minimal structure is isomorphic to all structures of the same cardinality satisfying the same sentences.
In the previous section, we had a converse, so you may ask, if an uncountable structure is isomorphic to all structures of the same cardinality satisfying the same sentences, is it strongly minimal? This is not quite true. For example, consider a vector space over a countable field, where we add two unary relations and to the language, each of which define subspaces of , which are disjoint and span , and then add a unary function to the language, which is a linear function such that , and is an isomorphism between and . Vector spaces like this are classified by the dimension of , so there is a unique one (up to isomorphism) of any given uncountable cardinality. It is not strongly minimal because itself is a formula picking out a set that is neither finite nor the complement of a finite set. But it is almost strongly minimal, in the sense that it is basically just the vector space , and is strongly minimal. It turns out that for any uncountable structure (in a finite or countable language) that is isomorphic to every structure of the same cardinality and satisfying the same sentences, there’s a formula defining a subset that is strongly minimal in an appropriate sense, such that the rest of the structure can be parameterized somehow using the subset.