This post is an introduction to model theory, of sorts. Occasionally I get asked what model theory is, and I generally find it quite difficult to give someone who doesn’t already know any model theory a good answer to this question, that actually says anything useful about what model theory is really about without leaving them hopelessly lost. This is my attempt to provide a real taste of model theory in a way that should be accessible to a math grad student without a background in logic.
Warm-up exercise
Let’s say I make a graph with the following procedure: I start with a countably infinite set of vertices. For each pair of vertices, I flip a fair coin. If the coin lands heads, I put an edge between those two vertices; if the coin lands tails, no edge.
Now you make another graph in a very similar manner. You also start with a countably infinite set of vertices. But instead of flipping a coin, you roll a fair standard six-sided die for each pair of vertices. If the die comes up 6, you put an edge between those two vertices; if it comes up anything from 1 through 5, no edge.
What is the probability that these two graphs are isomorphic?
For the numerical answer, paste “Gur zhygvcyvpngvir vqragvgl” into https://rot13.com/. An explanation will appear later in this post.
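If you want to experiment before reading on, here's a minimal sketch of the two constructions, necessarily truncated to finitely many vertices. The function name `sample_graph` and the choice of 500 vertices are mine, purely for illustration.

```python
import random

def sample_graph(n, p, seed=None):
    """Decide each pair of vertices 0..n-1 independently: with
    probability p the pair gets an edge, otherwise it doesn't."""
    rng = random.Random(seed)
    return {frozenset((i, j)) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p}

coin_graph = sample_graph(500, 1 / 2, seed=0)  # my graph: fair coin
die_graph = sample_graph(500, 1 / 6, seed=1)   # your graph: die shows 6
```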
Introduction
There are several cases in which we can identify a mathematical object up to isomorphism with a list of first-order properties it satisfies (I’ll tell you what that means in a sec) and some data about cardinality. Here are a couple of examples: All countable dense linear orders without endpoints are isomorphic. Any two algebraically closed fields of the same characteristic, which have transcendence bases of the same cardinality, are isomorphic. It turns out that the possibility of uniquely specifying a mathematical structure in this way corresponds to interesting structural properties of that structure.
First, the basic definitions:
A first-order language consists of a set of relation symbols, each of which is labeled with a number representing its arity (number of inputs it takes), a set of function symbols, each of which is also labeled with a number representing its arity, and a set of constant symbols (which could also just be thought of as 0-ary function symbols). For example, the language of linear orders has one binary relation $\leq$ and no functions or constants. The language of fields has constants $0$ and $1$, binary functions $+$ and $\cdot$, a unary function $-$ (no unary function for reciprocal, because the functions should be total), and no relations.
A first-order structure in a given language is a set $M$ in which each constant symbol is interpreted as an element of $M$, each $n$-ary function symbol is interpreted as a function $M^n \to M$, and each $n$-ary relation symbol is interpreted as a subset of $M^n$ (or alternatively, as a function $M^n \to \{\text{true}, \text{false}\}$). So linear orders and fields are examples of structures in their respective languages.
We can compose function symbols, constant symbols, and variables into ways of pointing to elements of a structure, called terms. We have as many variables as we want, and they are terms. Constant symbols are terms. And for each $n$-ary function symbol $f$ and terms $t_1, \dots, t_n$, $f(t_1, \dots, t_n)$ is a term. So in the language of fields, we can construct terms representing integers by adding however many 1s together (and then negating to get negative numbers), and then combine these with variables using addition and multiplication to get terms representing polynomials in however many variables with integer coefficients. In the language of linear orders, since we have no functions or constants, the only terms are variables.
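For instance, here's the polynomial $2x^2 + x + 1$ written out as a term of the language of fields, with the integer $2$ built from $1$s and the squaring spelled out with $\cdot$ (exponentiation isn't a symbol of the language):

$$\big((1 + 1) \cdot (x \cdot x)\big) + (x + 1)$$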
A first-order formula is a way of actually saying things about a first-order structure and elements of it represented by variables. If $R$ is an $n$-ary relation and $t_1, \dots, t_n$ are terms, then $R(t_1, \dots, t_n)$ is a formula. If $t_1$ and $t_2$ are terms, then $t_1 = t_2$ is a formula (you can think of this as just meaning that languages always have the binary relation $=$ by default). Boolean combinations of formulas are formulas (i.e. if $\varphi$ and $\psi$ are formulas, then so are $\varphi \wedge \psi$, $\varphi \vee \psi$, and $\neg \varphi$), and if $\varphi$ is a formula that refers to a variable $x$, then $\forall x\, \varphi$ and $\exists x\, \varphi$ are formulas. Any variable that appears in a formula without being bound to a quantifier $\forall$ or $\exists$ is called a free variable, and if each free variable is assigned to an element of a structure, the formula makes a claim about them, which can be either true or false. For example, in a ring, $\exists y\, (x \cdot y = 1)$ is true iff $x$ is a unit.
A first-order formula with no free variables is called a sentence. These are true or false statements about a first-order structure. Many types of mathematical objects are defined by listing first-order sentences that are true of them. For instance, a linear order is a structure with a relation $\leq$ satisfying transitivity ($\forall x \forall y \forall z\, ((x \leq y \wedge y \leq z) \rightarrow x \leq z)$), antisymmetry ($\forall x \forall y\, ((x \leq y \wedge y \leq x) \rightarrow x = y)$), and totality ($\forall x \forall y\, (x \leq y \vee y \leq x)$), and a linear order is dense without endpoints if it also satisfies $\forall x \forall y\, (x < y \rightarrow \exists z\, (x < z \wedge z < y))$ and $\forall x\, \exists y \exists z\, (y < x \wedge x < z)$, where $x < y$ abbreviates $x \leq y \wedge \neg(x = y)$. These are all first-order sentences. Algebraically closed fields of a given characteristic are another example. The field axioms are first-order sentences. For each positive integer $n$, we can formulate a first-order sentence saying that every polynomial of degree $n$ has a root:

$$\forall a_0 \cdots \forall a_{n-1}\, \exists x\, (x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0)$$

(the $a_i$s represent the coefficients, with the leading coefficient normalized to $1$, and $x^k$ abbreviates the $k$-fold product $x \cdot x \cdots x$). So we just add in these infinitely many sentences, one for each $n$. And we can say that the field has characteristic $p$ by saying $1 + 1 + \cdots + 1 = 0$ (with $p$ ones), or say that it has characteristic $0$ by, for each prime $p$, saying $\neg(1 + 1 + \cdots + 1 = 0)$.
First-order sentences can tell us a lot about a structure, but not everything, unless the structure is finite.
Löwenheim–Skolem theorem: Given a countable set of first-order sentences (in particular, any set of sentences if the language is countable), if there is any infinite structure in which they are all true, then there are first-order structures of every infinite cardinality in which they are all true.
This is why the uniqueness results all have to say something about cardinality. You might also think of some examples of ways to identify an infinite mathematical object up to isomorphism with a list of axioms without directly saying anything about cardinality, but in all such cases, you’ll be using an axiom that isn’t first-order. For instance, all Dedekind-complete ordered fields are isomorphic to the reals, but Dedekind-completeness isn’t a first-order sentence (it quantifies over sets of elements, not just elements). Same goes for any way of characterizing the natural numbers up to isomorphism that says something like “every set of natural numbers that contains 0 and is closed under successor contains all of the natural numbers”.
Countable structures
Let’s go back to the example of countable dense linear orders. If you don’t know the proof that all countable dense linear orders are isomorphic, here it goes: suppose we have two countable dense linear orders, $A$ and $B$. Since they’re countable, we can label the elements of each of them with distinct natural numbers. We’re going to match elements of $A$ to elements of $B$ one at a time such that we get an isomorphism at the end. To ensure that every element of $A$ gets matched to something in $B$, on odd-numbered steps, we’ll take the lowest-numbered element of $A$ that hasn’t been matched yet, and match it with an element of $B$. Similarly, to ensure that every element of $B$ gets matched to something in $A$, on even-numbered steps, we’ll take the lowest-numbered element of $B$ that hasn’t been matched yet, and match it with an element of $A$. As for what we do on each step (suppose it’s an odd-numbered step; even-numbered steps are the same but with the roles of $A$ and $B$ reversed), at the start of the step, finitely many elements of $A$ have already been matched. We take the first element that hasn’t yet been matched. Call it $a$. $a$ is either greater than all previously matched elements, less than all previously matched elements, or between two previously matched elements that don’t already have previously matched elements between them. Since $B$ is dense and has no endpoints, we know that in the first case, there will be something greater than all previously matched elements of $B$, so we can match $a$ to it; in the second case, there will be something less than all previously matched elements of $B$ for us to match $a$ to; and in the third case, there will be something between the elements matched to the elements on either side of $a$, which we can match $a$ to. By doing this, we continue to preserve the ordering at each step, so the bijection we get at the end is order-preserving, and thus an isomorphism.
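Here is the back-and-forth construction as runnable code, for two concrete countable dense linear orders: the rationals and the dyadic rationals. Everything here is an illustrative sketch; the enumerations, the function names, and the rule for choosing a partner are arbitrary choices of mine, and we can only run finitely many steps of a construction that really takes infinitely many.

```python
from fractions import Fraction

def rationals():
    """Enumerate Q with repetitions (e.g. 2/2 after 1/1); the matching
    loop below skips values that are already matched."""
    yield Fraction(0)
    s = 2
    while True:
        for d in range(1, s):
            yield Fraction(s - d, d)
            yield Fraction(-(s - d), d)
        s += 1

def dyadics():
    """Enumerate the dyadic rationals k / 2**m, again with repetitions."""
    yield Fraction(0)
    m = 0
    while True:
        for k in range(1, 4 ** (m + 1)):
            yield Fraction(k, 2 ** m)
            yield Fraction(-k, 2 ** m)
        m += 1

def find_partner(x, pairs):
    """Pick an element of the target order in the same position, relative
    to the matched pairs, that x occupies in the source order."""
    lower = [t for s, t in pairs if s < x]
    upper = [t for s, t in pairs if s > x]
    if not lower and not upper:
        return Fraction(0)
    if not upper:
        return max(lower) + 1               # greater than everything matched
    if not lower:
        return min(upper) - 1               # less than everything matched
    return (max(lower) + min(upper)) / 2    # in a gap: take the midpoint

def back_and_forth(steps=12):
    gA, gB = rationals(), dyadics()
    pairs = []  # (rational, dyadic) matches, built up back and forth
    for step in range(steps):
        if step % 2 == 0:  # odd-numbered steps in the text: next rational
            a = next(q for q in gA if all(q != s for s, _ in pairs))
            pairs.append((a, find_partner(a, pairs)))
        else:              # even-numbered steps: next dyadic
            b = next(q for q in gB if all(q != t for _, t in pairs))
            pairs.append((find_partner(b, [(t, s) for s, t in pairs]), b))
    return sorted(pairs)

print(back_and_forth())  # an order-preserving partial matching
```

Each new pair lands in the same position relative to the pairs already chosen, which is exactly the invariant the proof maintains; density and the lack of endpoints are what guarantee `find_partner` always has something to return.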
Now let’s get back to the warm-up exercise. A graph can be viewed as a first-order structure whose elements are the vertices, with a single binary relation $E$ (the edge relation) that is symmetric and anti-reflexive (symmetry and anti-reflexivity are both first-order conditions). There are some more first-order sentences satisfied by both of our random graphs with probability 1. Given any two finite disjoint sets of vertices, we can find another vertex that’s connected to everything in the first set and not connected to anything in the second set. This is because each vertex has the same positive probability of having this property, they’re all independent, and there are infinitely many of them, so there also must be some (in fact, infinitely many) that have all the desired edges and none of the undesired edges. To write this condition using first-order sentences, for each pair of natural numbers $m$ and $n$, we have a sentence

$$\forall x_1 \cdots \forall x_m\, \forall y_1 \cdots \forall y_n\, \Big( \bigwedge_{i, j} x_i \neq y_j \;\rightarrow\; \exists z\, \Big( \bigwedge_{i} E(z, x_i) \wedge \bigwedge_{j} (\neg E(z, y_j) \wedge z \neq y_j) \Big) \Big)$$

(the big conjunction before “$\rightarrow$” includes $x_i \neq y_j$ for each $i \leq m$ and $j \leq n$, so that this says $\{x_1, \dots, x_m\}$ and $\{y_1, \dots, y_n\}$ are disjoint).
This is enough for us to construct an isomorphism, using essentially the same proof as for countable dense linear orders. Since we each started with countably many vertices, we can label each of our vertices with natural numbers, and then iteratively match the next available unmatched vertex in one graph to a vertex on the other, alternating between which graph we take the next available unmatched vertex from on each step, just like before. On each step, only finitely many vertices have been matched. The new vertex shares edges with some of the already matched vertices and doesn’t share edges with some others. We need to match it with a vertex in the other graph that shares exactly the same pattern of edges with previously matched vertices. And we know that somewhere in that graph, there must be such a vertex. So we can match the new vertex and keep going, and the bijection we get at the end preserves the edge relation, and is thus an isomorphism.
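Here's the matching step as a (finite, so merely probabilistic) experiment, reusing `sample_graph` from the warm-up sketch; `witness` is my own name for the search that the extension property says must succeed in the infinite graphs.

```python
def witness(edges, n, want_edges, want_non_edges):
    """Search vertices 0..n-1 for one adjacent to everything in
    want_edges and to nothing in want_non_edges."""
    for v in range(n):
        if v in want_edges or v in want_non_edges:
            continue
        if all(frozenset((v, u)) in edges for u in want_edges) and \
           all(frozenset((v, u)) not in edges for u in want_non_edges):
            return v
    return None  # can happen in a finite sample; never in the infinite graph

# Suppose vertices 0, 1, 2 of each graph are already matched to each other,
# and the next unmatched vertex of coin_graph is joined to 0 and 2 but not 1.
# We need a counterpart in die_graph with that same edge pattern:
print(witness(coin_graph, 500, {0, 2}, {1}),
      witness(die_graph, 500, {0, 2}, {1}))
```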
For the general argument that these are both special cases of, we’ll need the concept of a type (not to be confused with the identically-named concept from type theory). Given a first-order structure $M$ and $A \subseteq M$, say that $b, c \in M$ have the same type over $A$ if for every first-order formula $\varphi(x, y_1, \dots, y_n)$ (where $x, y_1, \dots, y_n$ are its free variables) and all $a_1, \dots, a_n \in A$, $\varphi(b, a_1, \dots, a_n)$ holds iff $\varphi(c, a_1, \dots, a_n)$ does. So, for example, in a dense linear order without endpoints, if $a < b$, then, in order for $c$ to have the same type as $b$ over $\{a\}$, it must be the case that $a < c$ as well, since $a < x$ is a first-order formula, and $b$ and $c$ must satisfy exactly the same first-order formulas with parameters in $\{a\}$. And as it turns out, this is enough; if $a < b$ and $a < c$, then $b$ and $c$ have the same type over $\{a\}$. In an infinite random graph, if vertices $b$ and $c$ have the same type over some other vertices $a_1, \dots, a_n$, then $c$ must have an edge to each $a_i$ that $b$ has an edge to, and vice-versa. Again, this turns out to be enough to guarantee that they have the same type.
In both of these cases, for any finite set of elements $A$, there are only finitely many types over $A$. Let’s count them. Each $a \in A$ is its own type, since $x = a$ is a formula, so if $b$ and $a$ have the same type over $A$, then, since $a = a$, $b = a$ as well. Let’s ignore these and count the rest. In a dense linear order without endpoints, we can assume WLOG that $A = \{a_1, \dots, a_n\}$ with $a_1 < a_2 < \cdots < a_n$. There are $n + 1$ nontrivial types over $A$: $x < a_1$, $x > a_n$, and, for each $i < n$, $a_i < x < a_{i+1}$. In an infinite random graph, there are $2^n$ nontrivial types over $n$ vertices $a_1, \dots, a_n$: for each $S \subseteq \{a_1, \dots, a_n\}$, there’s a type of vertices that have edges to everything in $S$ and no edges to anything in $\{a_1, \dots, a_n\} \setminus S$.
Theorem: Let $M$ be a countably infinite first-order structure such that for every $n$ and every $A \subseteq M$ with $|A| = n$, there are only finitely many types over $A$. Then every countable structure $N$ satisfying the same first-order sentences that $M$ does is isomorphic to $M$.
That is, our ability to specify a countable structure (up to isomorphism) by its first-order properties corresponds exactly to the condition that there are only finitely many different behaviors that elements of the structure can have in relation to any given finite subset. The proofs that all countable dense linear orders without endpoints are isomorphic and that all countable random graphs are isomorphic look the same because they both follow the proof of this theorem, which goes like so:
Suppose there are only finitely many types over $A$. Let $p$ be one of those types, and let $p_1, \dots, p_k$ be the others. For each $i$, there’s some formula $\varphi_i$ (with parameters from $A$) that’s true for $p$ but not for $p_i$. Then $\varphi_1 \wedge \cdots \wedge \varphi_k$ is a formula that holds only for $p$. That is, the entire type is specified by a single formula; it wasn’t just coincidence that we were able to find such a formula for the types in each of our two examples.
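To make this concrete in the random graph (with $E$ the edge relation as before), the nontrivial type over $a_1, \dots, a_n$ corresponding to the subset $S$ is specified by the single formula

$$\varphi_S(x) \;=\; \bigwedge_{a_i \in S} E(x, a_i) \;\wedge\; \bigwedge_{a_i \notin S} \neg E(x, a_i) \;\wedge\; \bigwedge_{i} x \neq a_i.$$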
Lemma: If $M$ and $N$ are two first-order structures in the same language which satisfy all the same sentences, $\bar{a} = (a_1, \dots, a_n)$ from $M$ and $\bar{b} = (b_1, \dots, b_n)$ from $N$ satisfy all the same formulas (i.e., for any formula $\varphi$, $\varphi(\bar{a})$ is true in $M$ iff $\varphi(\bar{b})$ is true in $N$), and $M$ has only finitely many types over $\bar{a}$, then there’s a natural bijection between types in $M$ over $\bar{a}$ and types in $N$ over $\bar{b}$. A formula $\varphi(x, \bar{a})$ is true for elements of some type over $\bar{a}$ iff $\varphi(x, \bar{b})$ is true for elements of the corresponding type over $\bar{b}$.
Proof: Let $p_1, \dots, p_k$ be the types in $M$ over $\bar{a}$, and for each $i$, let $\varphi_i(x, \bar{a})$ be a formula specifying that $p_i$ is the type of $x$. These formulas $\varphi_i(x, \bar{b})$ specify types in $N$ over $\bar{b}$ as well; for any other formula $\psi$, either $\forall x\, (\varphi_i(x, \bar{a}) \rightarrow \psi(x, \bar{a}))$ or $\forall x\, (\varphi_i(x, \bar{a}) \rightarrow \neg \psi(x, \bar{a}))$ in $M$. These are first-order formulas, so again since $\bar{b}$ satisfies the same first-order formulas that $\bar{a}$ does, one of $\forall x\, (\varphi_i(x, \bar{b}) \rightarrow \psi(x, \bar{b}))$ or $\forall x\, (\varphi_i(x, \bar{b}) \rightarrow \neg \psi(x, \bar{b}))$ is true in $N$ as well. So $\varphi_i(x, \bar{b})$ determines the truth value of every such formula; that is, it specifies the type of $x$ over $\bar{b}$, and formulas are true in this type iff they are true in the corresponding type in $M$ over $\bar{a}$. To show that these are all of the types in $N$ over $\bar{b}$, consider the formula $\forall x\, (\varphi_1(x, \bar{y}) \vee \cdots \vee \varphi_k(x, \bar{y}))$. In $M$, when we plug in $\bar{a}$, the formula is true. And $\bar{b}$ satisfies all the same formulas as $\bar{a}$, so the same formula must also be true in $N$ when we plug in $\bar{b}$. That is, for any $c \in N$, there’s some $i$ such that $\varphi_i(c, \bar{b})$ is true, so every element of $N$ must have one of the above types.
Armed with this lemma, we can prove the theorem. Let $M$ and $N$ be countable structures satisfying the same first-order sentences, and suppose for every finite $A \subseteq M$, there are only finitely many types over $A$. We’ll match elements of $M$ to elements of $N$ one at a time, using the same back-and-forth trick from our two examples to ensure that we get a bijection at the end. After $n$ steps, we’ll have $a_1, \dots, a_n$ from $M$ matched with $b_1, \dots, b_n$ from $N$, and we’ll want to ensure that $(a_1, \dots, a_n)$ and $(b_1, \dots, b_n)$ satisfy exactly the same first-order formulas. If we’ve done this, then on step $n + 1$, we’ll have an element either of $M$ or of $N$, which we need to match with some element of the other one. We can match it to an element that has the corresponding type (such an element exists: by the lemma, the type is specified by a single formula $\varphi$, and $\exists x\, \varphi(x, a_1, \dots, a_n)$ is true in $M$, hence $\exists x\, \varphi(x, b_1, \dots, b_n)$ is true in $N$); that is, we’re matching $a_{n+1}$ and $b_{n+1}$ such that the type of $a_{n+1}$ over $\{a_1, \dots, a_n\}$ corresponds to the type of $b_{n+1}$ over $\{b_1, \dots, b_n\}$. Then $(a_1, \dots, a_{n+1})$ satisfy the same formulas that $(b_1, \dots, b_{n+1})$ do, so by induction, $(a_1, \dots, a_n)$ and $(b_1, \dots, b_n)$ satisfy the same formulas for every $n$ (the assumption that $M$ and $N$ satisfy the same first-order sentences provides a base case). Thus, the bijection we get at the end preserves the truth-values of all formulas, so it is an isomorphism, and we’re done.
As it turns out, the converse of the theorem is also true. Given a set of first-order sentences for which there is, up to isomorphism, only one countable model, all models have only finitely many types over any finite list of elements. Whenever there are infinitely many types, there will be some types (which cannot be specified by a single formula) that appear in some models but not in others.
Uncountable structures
Let’s turn to the other example I introduced at the beginning: any two algebraically closed fields of the same characteristic with transcendence bases of the same cardinality are isomorphic. Every field has a transcendence basis, and an uncountable algebraically closed field has the same cardinality as any transcendence basis of it, so a corollary of this is that any two uncountable algebraically closed fields of the same characteristic and cardinality are isomorphic.
A sketch of the proof: Given algebraically closed fields $K$ and $L$ of the same characteristic, with transcendence bases $B_K$ and $B_L$ of the same cardinality, any isomorphism between a subfield of $K$ and a subfield of $L$ extends to a maximal such isomorphism (by Zorn’s lemma). Since $B_K$ and $B_L$ have the same cardinality, there’s a bijection between them, and since $K$ and $L$ have the same characteristic, this bijection extends to an isomorphism between the fields they generate. Thus there is a maximal isomorphism between subfields of $K$ and of $L$ which restricts to a bijection between $B_K$ and $B_L$. Now we just need to show that these subfields are all of $K$ and $L$. This is because, given any such isomorphism between subfields $F \subseteq K$ and $G \subseteq L$ with $B_K \subseteq F$ and $B_L \subseteq G$, if $F \neq K$, then let $x \in K \setminus F$ (note $x$ is algebraic over $F$, since $B_K \subseteq F$), and let $p$ be the minimal polynomial of $x$ over $F$. Applying the isomorphism to the coefficients gives us an irreducible polynomial over $G$, which must have a root $y$ in $L$ (as $L$ is algebraically closed), and then by matching $x$ with $y$, we get an isomorphism $F(x) \cong G(y)$, contradicting maximality of the isomorphism.
Here’s another example: Any two vector spaces over the same field, with bases of the same cardinality, are isomorphic. Since every vector space has a basis, a corollary of this is that, over a countable field, any two uncountable vector spaces of the same cardinality are isomorphic. Citing Zorn’s lemma is overkill here, since there’s only one way to extend a bijection between bases to an isomorphism. But the basic idea is the same in each case: we have an appropriate notion of basis, and we extend a bijection between bases to an isomorphism. And vector spaces are also first-order structures; the language has a binary operation $+$, a constant $0$, and, for each scalar $c$, a unary operation for multiplication by $c$.
The thing that unites both these cases is called strong minimality. A first-order structure is called minimal if every subset of it defined by a first-order formula (in one free variable, with parameters) is either finite, or the complement of a finite set. More formally: $M$ is minimal if for every formula $\varphi(x, y_1, \dots, y_n)$ and $a_1, \dots, a_n \in M$, one of $\{x : \varphi(x, a_1, \dots, a_n)\}$ or $\{x : \neg \varphi(x, a_1, \dots, a_n)\}$ is finite. We call a structure strongly minimal if every structure satisfying the same first-order sentences is also minimal. (This turns out to be equivalent to: for each $\varphi$, there’s a finite upper bound on the size of whichever of $\{x : \varphi(x, a_1, \dots, a_n)\}$ or $\{x : \neg \varphi(x, a_1, \dots, a_n)\}$ is finite, as $a_1, \dots, a_n$ vary.)
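To see why algebraically closed fields are minimal, one can lean on quantifier elimination (a standard fact about algebraically closed fields that I'll take on faith here): every definable subset of such a field $K$ is a boolean combination of sets of the form

$$\{x \in K : p(x, a_1, \dots, a_n) = 0\}$$

for polynomials $p$. Each such set is either all of $K$ (if the polynomial vanishes identically) or has at most $\deg p$ elements, and boolean combinations of finite and cofinite sets are finite or cofinite.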
Let’s go over the general notion of “basis” we’ll be using: Say that $b$ is algebraic over $A$ if there is a formula $\varphi(x, a_1, \dots, a_n)$ (with $a_1, \dots, a_n \in A$) such that $\varphi(b, a_1, \dots, a_n)$ holds, and $\{x : \varphi(x, a_1, \dots, a_n)\}$ is finite. In algebraically closed fields, this corresponds to the usual notion of $b$ being algebraic over the subfield generated by $A$. In vector spaces, this corresponds to $b$ being a linear combination of elements of $A$. Call $A$ independent if no element of $A$ is algebraic over the other elements of $A$. In other words, you can’t pin down an element of $A$ to one of finitely many possibilities by using a single formula and other elements of $A$. In a vector space, independence is linear independence. In an algebraically closed field, independence is algebraic independence. Now call $A$ a basis if it is a maximal independent set. If $M$ is minimal, this turns out to imply that every $b \in M$ is algebraic over $A$. An increasing union of independent sets is independent, so by Zorn’s lemma, every structure has a basis.
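To make the vector space case concrete: writing $c_1$ and $c_2$ for the language's unary scalar-multiplication symbols, the formula

$$\varphi(x, a_1, a_2) \;:\quad x = c_1(a_1) + c_2(a_2)$$

has exactly one solution $x$, so every vector in the span of $\{a_1, a_2\}$ is algebraic over $\{a_1, a_2\}$; checking that vectors outside the span are not algebraic takes a bit more work, which I'll skip.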
Now let’s look at type spaces in minimal structures. Let $M$ be a minimal structure and $A \subseteq M$. If $b$ is algebraic over $A$, then there’s some formula $\varphi(x, a_1, \dots, a_n)$ (with $a_1, \dots, a_n \in A$) such that $\varphi(b, a_1, \dots, a_n)$ holds and $\{x : \varphi(x, a_1, \dots, a_n)\}$ is finite and as small as possible. Then $b$ and $c$ have the same type over $A$ iff $\varphi(c, a_1, \dots, a_n)$ holds. So this type is implied by a single formula, and there are only finitely many elements of this type. There’s only one remaining type: the type of elements that aren’t algebraic over $A$. If $b$ and $c$ are both non-algebraic over $A$, then for every formula $\varphi(x, a_1, \dots, a_n)$, since $M$ is minimal, one of $\varphi(x, a_1, \dots, a_n)$ and $\neg \varphi(x, a_1, \dots, a_n)$ must have only finitely many solutions $x$; $b$ and $c$, being non-algebraic, must both be solutions to the other one. This shows they have the same type over $A$. This non-algebraic type is optional; in some cases, there might not be any elements that aren’t algebraic over $A$.
Let $M$ and $N$ be minimal structures in the same language, which satisfy the same first-order sentences. They each have a basis. If those bases have the same cardinality, then $M$ and $N$ are isomorphic. Say a “partial isomorphism” between $M$ and $N$ is a bijection between a subset of $M$ and a subset of $N$, such that whenever a formula is true about some elements of the subset of $M$, then it is also true about the corresponding elements of the subset of $N$, and vice-versa. If $B_M$ and $B_N$ are bases for $M$ and $N$, respectively, then a bijection between $B_M$ and $B_N$ is a partial isomorphism (this is because if $a_1, \dots, a_n \in B_M$ and $b_1, \dots, b_n \in B_N$ satisfy all the same formulas, and $a_{n+1} \in B_M$ and $b_{n+1} \in B_N$ are elements not among them, then $a_{n+1}$ must have the unique non-algebraic type over $\{a_1, \dots, a_n\}$, $b_{n+1}$ has the unique non-algebraic type over $\{b_1, \dots, b_n\}$, and these unique non-algebraic types satisfy the same formulas, so it follows by induction on the number of variables that a formula is true of distinct elements of $B_M$ iff it is true of the corresponding distinct elements of $B_N$). An increasing union of partial isomorphisms is a partial isomorphism, so by Zorn’s lemma, there’s a maximal partial isomorphism extending a bijection between $B_M$ and $B_N$. If this maximal partial isomorphism is a bijection between $F \subseteq M$ and $G \subseteq N$, and $F \neq M$, then let $b \in M \setminus F$. $b$ is algebraic over $F$ (since $B_M \subseteq F$), so there’s a single formula $\varphi(x, a_1, \dots, a_n)$ (with $a_1, \dots, a_n \in F$) that is true for $b$, and which determines its type over $F$ (meaning, it determines the type of $b$ over every finite subset of $F$ containing $a_1, \dots, a_n$). Then, where $b_1, \dots, b_n \in G$ correspond to $a_1, \dots, a_n$ under the partial isomorphism, there must be $c \in N$ such that $\varphi(c, b_1, \dots, b_n)$ holds (since $b_1, \dots, b_n$ satisfy the same formulas $a_1, \dots, a_n$ do, and $\exists x\, \varphi(x, a_1, \dots, a_n)$ is true in $M$). $c \notin G$, because this can be expressed as part of the type of $b$ over $F$, which is the same as the type of $c$ over $G$. Thus we can extend the partial isomorphism by matching $b$ with $c$. Thus, in our maximal partial isomorphism, $F = M$, and for the same reason, $G = N$, so it is an isomorphism.
So for a strongly minimal structure, the structures satisfying the same sentences are classified by the cardinality of a basis. This isn’t quite the end of the story; in some cases, a structure with too small a basis would be finite, and we could thus distinguish it from the rest with a first-order sentence saying that there are at least $n$ distinct elements (for large enough $n$). This isn’t the case for algebraically closed fields, which are infinite even when the empty set is a transcendence basis. But for vector spaces, the empty basis generates a one-element vector space, so an infinite vector space must have a basis of size at least one.
And if the vector space is over a finite field, then its basis must be infinite. Another case where the basis must be infinite is an infinite set. A set is a first-order structure in the language with no relations, no functions, and no constants. Every subset of a set is independent, so a basis for the set is just the entire set. In these cases where a basis must be infinite, there’s only one countable model (up to isomorphism): the model with a countably infinite basis. You can check that both of these examples satisfy the finitely-many-types condition from the previous section for having a unique countable model.
So the general story, for a strongly minimal structure $M$, is that there is some minimum cardinal $\kappa$ such that structures satisfying the same sentences as $M$ are classified by cardinals at least $\kappa$, that being the cardinality of a basis. In a countable language, the cardinality of a structure is the maximum of $\aleph_0$ and the cardinality of a basis, so it follows that an uncountable strongly minimal structure is isomorphic to all structures of the same cardinality satisfying the same sentences.
In the previous section, we had a converse, so you may ask: if an uncountable structure is isomorphic to all structures of the same cardinality satisfying the same sentences, is it strongly minimal? This is not quite true. For example, consider a vector space $V$ over a countable field, where we add two unary relations $P$ and $Q$ to the language, each of which defines a subspace of $V$, with the two subspaces intersecting only at $0$ and spanning $V$, and then add a unary function $f$ to the language, which is a linear function such that $f(f(v)) = v$ for all $v$, and $f$ is an isomorphism between $P$ and $Q$. Vector spaces like this are classified by the dimension of $P$, so there is a unique one (up to isomorphism) of any given uncountable cardinality. It is not strongly minimal because $P(x)$ itself is a formula picking out a set that is neither finite nor the complement of a finite set. But it is almost strongly minimal, in the sense that it is basically just the vector space $P$, and $P$ is strongly minimal. It turns out that for any uncountable structure (in a finite or countable language) that is isomorphic to every structure of the same cardinality satisfying the same sentences, there’s a formula defining a subset that is strongly minimal in an appropriate sense, such that the rest of the structure can be parameterized somehow using the subset.