# First-order Model Theory

*First published Sat Nov 10, 2001; substantive revision Mon Dec 10, 2018*

First-order model theory, also known as classical model theory, is a branch of mathematics that deals with the relationships between descriptions in first-order languages and the structures that satisfy these descriptions. From one point of view, this is a vibrant area of mathematical research that brings logical methods (in particular the theory of definition) to bear on deep problems of classical mathematics. From another point of view, first-order model theory is the paradigm for the rest of model theory; it is the area in which many of the broader ideas of model theory were first worked out.

- 1. First-order languages and structures
- 2. Elementary maps
- 3. Five grand theorems
- 4. Three useful constructions
- 5. Three successful programmes
- Bibliography
- Academic Tools
- Other Internet Resources
- Related Entries

## 1. First-order languages and structures

Mathematical model theory carries a heavy load of notation, and HTML is
not the best container for it. In what follows, syntactic objects
(languages, theories, sentences) are generally written in roman or
greek letters (for example L, T, φ), and set-theoretic objects such
as structures and their elements are written in italic (*A*,
*a*). Two exceptions are that variables are italic (*x*,
*y*) and that sequences of elements are written with lower case
roman letters (a, b).

We recall and refine some definitions from the entries on
classical logic
and
model theory.
A *signature* is a set
of individual constants, predicate symbols and function symbols; each
of the predicate symbols and function symbols has an *arity*
(for example it is binary if its arity is 2). Each signature K gives
rise to a first-order language, by building up formulas from the
symbols in the signature together with logical symbols (including =)
and punctuation.

If K is a signature, then a *structure of signature* K, say
*A*, consists of the following items:

- A set called the
*domain*of*A*and written dom(*A*); it is usually assumed to be nonempty; - for each individual constant
*c*in K, an element*c*of dom(^{A}*A*); - for each predicate symbol
*P*of arity*n*, an*n*-ary relation*P*on dom(^{A}*A*); - for each function symbol
*F*of arity*n*, an*n*-ary function*F*from dom(^{A}*A*) to dom(*A*).

The *elements* of *A* are the elements of
dom(*A*). Likewise the *cardinality* or *power* of
*A* is the cardinality of its domain. Since we can recover the
signature K from the first-order language L that it generates, we can
and will refer to structures of signature K as L-*structures*.
We think of *c* as a name for the element
*c*^{A} in the structure *A*, and
likewise with the other symbols.

For example the field of real numbers forms a structure
* R* whose elements are the real numbers, with
signature consisting of the individual constant 0 to name the number
zero, a 1-ary function symbol − for minus, and two 2-ary function
symbols + and . for plus and times. At first sight we can’t add a
symbol to express 1/

*x*, since all the named functions have to be defined on the whole domain of the structure, and there is no such real number as 1/0. But on second thoughts this is not a serious problem; any competent mathematician puts the condition ‘

*x*is not zero’ before dividing by

*x*, and so it never matters what the value of 1/0 is, and we can harmlessly take it to be 42. But most model theorists are uncomfortable with any kind of division by zero, so they stick with plus, times and minus.

If L is the first-order language of signature K, then
Tarski’s model-theoretic truth
definition tells us when a sentence of L is true in *A*,
and when an assignment of elements of *A* to variables
satisfies a formula of L in *A*. Instead of talking of
assignments satisfying a formula, model theorists often speak of the
set of *n*-tuples of elements of *A* that
is *defined* by a formula
φ(*v*_{1},…,*v*_{n});
the connection is that an *n*-tuple
(*a*_{1},…,*a*_{n}) is
in the defined set if and only if the assignment taking each
*v*_{i} to *a*_{i}
satisfies the formula.

If φ is a sentence, we write

A⊨ φ

to mean that φ is true in *A*, or in other words, *A*
is a model of φ. If
φ(*v*_{1},…,*v*_{n})
is a formula with free variables as shown, we write

A⊨ φ[a]

to mean that the *n*-tuple a is in the set defined by φ.
(The entry on
classical logic
uses
the notation ‘*A,s*
⊨
φ’,
where *s* is any assignment to all the variables
of L that assigns to each variable *v*_{i} free
in φ the *i*-th element in the *n*-tuple a.)

Two L-structures that are models of exactly the same sentences of L
are said to be *elementarily equivalent*. Elementary equivalence
is an equivalence relation on the class of all L-structures. The set of
all the sentences of L that are true in the L-structure *A* is
called the *complete theory* of *A*, in symbols
Th(*A*). A theory that is Th(*A*) for some structure
*A* is said to be *complete*. (By the completeness
theorem for first-order logic, for which see the entry on
classical logic,
a theory is complete if
and only if it is maximal syntactically consistent.) The two structures
*A* and *B* are elementarily equivalent if and only if
Th(*A*) = Th(*B*).

To continue the example of the field * R* of
real numbers: It is often not at all obvious whether two given
structures are or are not elementarily equivalent. One of the greatest
achievements of the pre-history of model theory was Tarski’s
description in 1930 of Th(

*) (which he published in full only after the war; see his book in the Bibliography below). This description implied among other things that the structures elementarily equivalent to*

**R***are exactly the real-closed fields, a class of fields which was already known to the algebraists in its own right.*

**R**
When mathematicians introduce a class of structures, they like to
define what they count as the basic maps between these structures. The
basic maps between structures of the same signature K are called
*homomorphisms*, defined as follows. A *homomorphism from
structure A to structure B* is a function *f* from
dom(*A*) to dom(*B*) with the property that for every
atomic formula
φ(*v*_{1},…,*v*_{n})
and any *n*-tuple a =
(*a*_{1},…,*a*_{n}) of
elements of *A*,

A⊨ φ [a] ⇒B⊨ φ [b]

where b is
(*f*(*a*_{1}),…,*f*(*a*_{n})).
If we have ‘⇔’ in place of
‘⇒’ in the quoted condition, we say that *f* is
an *embedding* of *A* into *B*. Since the language
includes =, an embedding of *A* into *B* is always
one-to-one, though it need not be onto the domain of *B*. If it
is onto, then the inverse map from dom(*B*) to dom(*A*)
is also a homomorphism, and both the embedding and its inverse are said
to be *isomorphisms*. We say that two structures are
*isomorphic* if there is an isomorphism from one to the other.
Isomorphism is an equivalence relation on the class of all structures
of a fixed signature K. If two structures are isomorphic then they
share all model-theoretic properties; in particular they are
elementarily equivalent.

If *A* and *B* are structures of signature K with
dom(*A*) a subset of dom(*B*), and the interpretations in
*A* of the symbols in K are just the restrictions of their
interpretations in *B*, then we say that *A* is a
*substructure* of *B* and conversely *B* is an
*extension* of *A*. If moreover *B* has some
elements that are not in *A*, we say that *A* is a
*proper substructure* of *B* and *B* is an
*proper extension* of *A*. If *B* is a structure
and *X* is a nonempty subset of dom(*B*), then there is a
unique smallest substructure of *B* whose domain contains all of
*X*. It is known as the *substructure of B generated by
X*, and we find it by first adding to *X* all the
elements *c ^{B}* where

*c*are individual constants of K, and then closing off under the functions

*F*where

^{B}*F*are function symbols of K.

For example the substructure of the field
* R* generated by the number 1 consists of 1, 0
(since it is named by the constant 0), 1+1, 1+1+1 etc., −1,
−2 etc., in other words the ring of integers. (There is no need
to close off under multiplication too, since the set of integers is
already closed under multiplication.) If we had included a symbol for
1/

*x*too, the substructure generated by 1 would have been the field of rational numbers. So the notion of substructure is sensitive to the choice of signature.

## 2. Elementary maps

Let L be a first-order language and let *A* and *B* be
L-structures. Suppose *e* is a function which takes some
elements of *A* to elements of *B*. We say that
*e* is an *elementary map* if whenever a sequence of
elements *a*_{1}, …,
*a*_{n} in the domain of *e* satisfy a
formula
φ(*x*_{1},…,*x*_{n})
of L in *A*, their images under *e* satisfy the same
formula in *B*; in symbols

A⊨ φ(a_{1},…,a_{n}) ⇒B⊨ φ(e(a_{1}),…,e(a_{n})).

We say that *e* is an *elementary embedding* of
*A* into *B* if *e* is an elementary map and its
domain is the whole domain of *A*. As the name implies,
elementary embeddings are always embeddings.

If there is an elementary embedding from *A* to *B*
then *A* and *B* are elementarily equivalent. On the
other hand an embedding between elementarily equivalent structures, or
even between isomorphic structures, need not be elementary. (For
example, writing * Z* for the abelian group of
the integers with signature consisting of 0 and +, the embedding from

*to*

**Z***that takes each integer*

**Z***n*to 2

*n*is an embedding, and of course

*is isomorphic to itself; but this embedding is not elementary, since 1 satisfies the formula ¬∃*

**Z***y*(

*y*+

*y*=

*v*

_{1}), but 2 doesn’t.)

We say that *A* is an *elementary substructure* of
*B*, and *B* is an *elementary extension* of
*A*, if *A* is a substructure of *B* and the
inclusion map is an elementary embedding. It’s immediate from the
definitions that an elementary extension of an elementary extension of
*A* is again an elementary extension of *A*.

Elementary embeddings are natural maps to consider within
first-order model theory. Around 1950 Abraham Robinson was impressed
that maps between algebraic structures in general seem hardly ever to
be elementary, whereas some important maps (such as embeddings between
two algebraically closed fields, or between two real-closed fields)
turn out to be elementary. He was also surprised to find that this fact
about algebraically closed fields is another way of stating a
celebrated theorem called the Hilbert Nullstellensatz. These
observations of Robinson have had a huge effect on the development of
model theory. In Robinson’s terminology, a first-order theory is
*model-complete* if every embedding between models of the theory
is elementary. This notion has found many uses, and it often appears in
applications of model theory in algebra.

A notion that is closely related to model-completeness, but should not
be confused with it, is elimination of quantifiers. Suppose L is a
first-order language, T is a theory in L, and Φ is a set of
formulas of L. We say that T has *elimination of quantifiers down
to* Φ if for every formula
φ(*x*_{1},…,*x*_{n})
of L there is a formula
ψ(*x*_{1},…,*x*_{n})
in Φ such that in every model of T, φ and ψ are satisfied
by exactly the same *n*-tuples of elements
(*a*_{1},…,*a*_{n}). (The
‘method of elimination of quantifiers’ discussed in section 2.2
of Tarski’s truth definitions) was a
syntactic and pre-model-theoretic method for proving elimination of
quantifiers down to a particular set of formulas.) A theory is said
to *have quantifier elimination* if it has elimination of
quantifiers down to quantifier-free formulas.

The connection between model-completeness and elimination of quantifiers is as follows. Robinson showed that a theory is model-complete if and only if it has elimination of quantifiers down to existential formulas (i.e. formulas that either are quantifier-free or consist of one or more existential quantifiers followed by a quantifier-free formula). So theories that have quantifier elimination are model-complete, but the converse need not hold. Still, showing that a theory is model-complete is sometimes a useful first step towards showing that it has quantifier elimination.

To return to elementary embeddings: They have a number of properties that make them useful. We have space for four.

The downward Löwenheim-Skolem theorem:

Suppose L is a first-order language which has κ formulas,Ais an L-structure and λ is a cardinal which is at least κ but less than the cardinality ofA. Suppose also thatXis a set of at most λ elements ofA. ThenAhas an elementary substructure which has cardinality exactly λ and contains all the elements inX.

There is a proof of this in the entry on classical logic, using Skolem hulls. Note that λ must be infinite since every first-order language has infinitely many formulas.

The elementary chain theorem:

Suppose that L is a first-order language andA_{0},A_{1}, … is a sequence (of any length) of L-structures such that any structure in the sequence is an elementary substructure of all the later structures in the sequence. Then there is a unique smallest L-structureBwhich contains all the structures in the sequence as substructures; this structureBis an elementary extension of all the structures in the sequence.

The elementary amalgamation theorem:

Suppose L is a first-order language,Ais an L-structure andB,Care two elementary extensions ofA. Then there are an elementary extensionDofBand an elementary embeddingeofCintoDsuch that (i) for each elementaofA,e(a)=a, and (ii) ifcis an element ofCbut not ofA, thene(c)is not inB.

The elementary amalgamation theorem is a consequence of the compactness theorem in the next section.

The upward Löwenheim-Skolem theorem:

Suppose L is a first-order language which has κ formulas,Ais an L-structure whose cardinality is an infinite cardinal μ, and λ is a cardinal which is at least as great as both κ and μ. ThenAhas an elementary extension whose cardinality is λ.

This also follows from the compactness theorem, as shown in the entry on classical logic. The name of the theorem is a little unfortunate, since the theorem was first proved by Tarski, and Skolem didn’t even believe it (because he didn’t believe in uncountable cardinals).

There is another proof using the elementary amalgamation theorem and
the elementary chain theorem. One can show that the
structure *A* has a proper elementary
extension *A*′. (There is a proof of this using the
compactness theorem and the diagram lemma — see 3.1 and 3.2
below; another proof is by ultrapowers — see 4.1 below.) Now
use *A*′ and again *A*′ for the
structures *B* and *C* in the elementary amalgamation
theorem. Then *D* as in the theorem is an elementary extension
of *A*, and by (ii) in the theorem, it must contain elements
that are not in *A*′, so that it is a proper elementary
extension. Repeat to get a proper elementary extension of *D*,
and so on until you have an infinite elementary chain. Use the
elementary chain theorem to find an elementary extension of *A*
that sits on top of this chain. Keep repeating these moves until you
have an elementary extension of
*A* that has cardinality at least λ. Then if necessary
use the downward Löwenheim-Skolem theorem to pull the cardinality down
to exactly λ. This kind of argument is very common in
first-order model theory. By careful choice of the amalgams at the steps in the
construction, we can often ensure that the top structure has further properties
that we might want (such as saturation, see 4.2 below).

## 3. Five grand theorems

The five theorems reported in this section are in some sense the pillars of classical model theory. All of them are theorems about first-order model theory. A great deal of the work done in the third quarter of the twentieth century was devoted to working out the consequences of these theorems within first-order model theory, and the extent to which similar theorems hold for languages that are not first-order.

### 3.1 The compactness theorem

If T is a first-order theory, and every finite subset of T has a model, then T has a model.

There is a proof of this theorem in the entry on classical logic. The theorem has several useful paraphrases. For example it is equivalent to the following statement:

Suppose T is a first-order theory and φ is a first-order sentence. If T ⊨ φ then there is a finite subset U of T such that U ⊨ φ.

(See the entry on model theory for the notion ⊨ of model-theoretic consequence. To derive the second statement from the first, note that ‘T ⊨ φ’ is true if and only if there is no model of the theory T ∪ {¬ φ}.)

Anatolii Mal’tsev first gave the compactness theorem in 1938 (for first-order logic of any signature), and used it in 1940/1 to prove several theorems about groups; this seems to have been the first application of model theory within classical mathematics. Leon Henkin and Abraham Robinson independently rediscovered the theorem a few years later and gave some further applications. The theorem fails badly for nearly all infinitary languages.

### 3.2 The diagram lemma

If *A* is an L-structure, then we form the *diagram* of
*A* as follows. First add to L a supply of new individual
constants to serve as names for all the elements of *A*. (This
illustrates how in first-order model theory we easily find ourselves
using uncountable signatures. The ‘symbols’ in these
signatures are abstract set-theoretic objects, not marks on a page.)
Then using L and these new constants, the *diagram* of
*A* is the set of all the atomic sentences and negations of
atomic sentences that are true in *A*.

IfB′ is a model of the diagram ofA, andBisB′ with the new constants removed from the signature, then there is an embedding ofAintoB.

Namely, if an element of *A* is named by a new constant
*c*, then map that element to the element of *B*′
named *c*. A variant of this lemma is used in the proof of the
elementary amalgamation theorem.

### 3.3 The Lyndon interpolation theorem

This theorem may have the longest pedigree of any theorem of model
theory, since it generalises the Laws of Distribution for syllogisms,
which go back at least to the early Renaissance. The theorem is
easiest to state if we assume that our first-order languages have
symbols ∧, ∨ and ¬, but not → or ⇔. Then an
occurrence of a predicate symbol R in a formula φ is said to
be *positive* (resp. *negative*) if it lies within the
scope of an even (resp. odd) number of occurrences of ¬.

Suppose L and M are first-order languages, L ∪ M is the smallest first-order language containing both L and M, and L ∩ M is the language consisting of all the formulas which are in both L and M. Suppose T is a theory in L, U is a theory in M, and no (L ∪ M)-structure is both a model of T and a model of U. Then there is a sentence φ of L ∩ M which is true in all models of T and false in all models of U. (This sentence φ is called theinterpolant.) Moreover every predicate symbol with a positive occurrence in φ has a positive occurrence in some sentence of T and a negative occurrence in some sentence of U, and conversely every predicate symbol with a negative occurrence in φ has a negative occurrence in some sentence of T and a positive occurrence in some sentence of U.

There are several proofs of this theorem, and not all of them are model-theoretic. Without the last sentence, the theorem is known as Craig’s interpolation theorem, since William Craig proved this version a few years before Roger Lyndon found the full statement in 1959. As Craig noted at the time, his interpolation theorem gives a neat proof of Evert Beth’s definability theorem, which runs as follows.

Suppose that L is a first-order language and M is the first-order
language got by adding to L a new predicate symbol R. Suppose also that
T is a theory in M. We say that T *implicitly defines* R if it
is false that there are two M-structures which are models of T, have
the same elements and interpret all the symbols of L in the same way
but interpret the symbol R differently. We say that T *defines*
R *explicitly* if there is a formula
φ(*x*_{1},…,*x*_{n})
of L such that in every model of T, the formulas φ and
R(*x*_{1},…,*x*_{n}) are
satisfied by exactly the same *n*-tuples
(*a*_{1},…,*a*_{n}) of
elements. It is easy to see that if T defines R explicitly then it
defines R implicitly. (This fact is known as *Padoa’s method*;
Padoa used the failure of implicit definability as a way of proving the
failure of explicit definability.) Beth’s theorem is the converse:

Suppose that L, M, R and T are as above. If T defines R implicitly then T defines R explicitly.

### 3.4 The omitting types theorem

This theorem needs some preliminary definitions. Suppose L is a first-order language, T is a complete theory in L, and Φ is a set of formulas of L which all have the free variable*x*(and no other free variable). We say that an L-structure

*A*

*realises*Φ if there is an element of

*A*that satisfies all the formulas in Φ; if

*A*has no such element, we say that

*A*

*omits*Φ. A formula ψ(

*x*) of L, with just the free variable

*x*, is called a

*support*of Φ in T if the consequences of T include the sentence ∃

*x*ψ(

*x*) and the sentence ∀

*x*(ψ(

*x*) → φ(

*x*)) whenever φ(

*x*) is a formula in Φ. It is not hard to see that if there is a support of Φ in T then every model of T realises Φ. The converse is not always true, but the omitting types theorem tells us that it is true if we restrict ourselves to countable first-order languages:

Suppose L is a first-order language which has countably many formulas. Suppose T is a complete theory in L, and Φ is a set of formulas of L which all have the free variablex. Finally suppose that every model of T with just countably many elements realises Φ. Then there is a support of Φ in T. (In other words, there is a finite reason why the ‘type’ Φ can’t be omitted in any model of T.)

The omitting types theorem goes back to the mid 1950s. It very definitely
depends on the language being first-order and countable. It has several
useful generalisations, for example *model-theoretic forcing*,
which is an analogue of the forcing construction in
set theory.
In fact the games used for
model-theoretic forcing (see the entry on
logic and games)
can be adapted to prove the
omitting types theorem too. There are similar but more complicated
theorems for uncountable first-order languages; some of these can be
paraphrased as omitting types theorems for
infinitary languages.

### 3.5 The initial model theorem

A quantifier-free formula is said to be a *Horn formula* (after
Alfred Horn) if it has one of the three forms

- ψ,
- φ
_{1}∧ … ∧ φ_{n}→ ψ, - ¬(φ
_{1}∧ … ∧ φ_{n}),

where the formulas φ_{1}, …,
φ_{n}, ψ are all atomic. A *universal Horn
sentence* (also known to the computer scientists as a *Horn
clause*) is a sentence that consists of universal quantifiers
followed by a quantifier-free Horn formula; it is said to be
*strict* if no negation sign occurs in it (i.e. if it doesn’t
come from a quantifier-free Horn formula of the third kind).

Let T be a theory consisting of strict universal Horn sentences. Then T has a modelAwith the property that for every modelBof T there is a unique homomorphism fromAtoB. (Such a modelAis called aninitial modelof T. It is unique up to isomorphism.)

This theorem is a generalisation, due to Mal’tsev, of a group-theoretic
construction called *construction by generators and relations*.
It is the main idea behind *algebraic specification*, which is
one approach to the specification of systems in computer science. The
required behaviour of the system is written down as a set of strict
universal Horn sentences, and then the initial model of these sentences
is an abstract version of the required system.

## 4. Three useful constructions

A construction is a procedure for building a structure. We have already seen several constructions in the theorems above: for example the omitting types construction and the initial model construction. Here are three more.

### 4.1 Products and reduced products

If *A* and *B* are L-structures, we form their
*product* *C* = *A* × *B* as
follows. The elements of *C* are the ordered pairs
(*a*,*b*) where *a* is an element of *A*
and *b* is an element of *B*. The predicate symbols are
interpreted ‘pointwise’, i.e. so that for example

(a,b)is in P^{C}if and only ifais in P^{A}andbis in P^{B}.

The structures *A* and *B* are called the
*factors* of *A* × *B*. In the same way
we can form products of any number of structures. If all the factors of
a product are the same structure *A*, the product is called a
*power* of *A*. A theorem called the *Feferman-Vaught
theorem* tells us how to work out the complete theory of the
product from the complete theories of its factors.

This construction has some variants. We can define an equivalence
relation on the domain of a product *C*, and then take a
structure *D* whose elements are the equivalence classes; the
predicate symbols are interpreted in *D* so as to make the
natural map from dom(*C*) to dom(*D*) a homomorphism. In
this case the structure *D* is called a *reduced product*
of the factors of *C*. It is a *reduced power* of
*A* if all the factors are equal to *A*; in this case the
*diagonal map* from *A* to *D* is the one got by
taking each element *a* to the equivalence class of the element
(*a*,*a*,…).

Suppose we use a set *I* to index the factors in a product
*C*. An *ultrafilter* over *I* is a set *U*
of subsets of *I* with the properties

- if sets
*X*and*Y*are in*U*then so is their intersection*X ∩ Y*; - if
*X*is in*U*and*X ⊆ Y ⊆ I*then*Y*is in*U*; - for each subset
*X*of*I*, exactly one of*X*and its complement*I*\ X is in*U*.

If we have an ultrafilter *U* over *I*, then we can
construct a reduced product from *C* by making two elements of
*C* equivalent if and only if the set of indices at which they
are equal is a set in the ultrafilter *U*. This is indeed an
equivalence relation on the domain of *C*, and the resulting
reduced product is called an *ultraproduct* of the factors of
*C*. If *C* is a power of *A* then this
ultraproduct is called an *ultrapower* of *A*, and it is
sometimes written *U*-prod *A*. A theorem called
*Łoś’s theorem* describes what sentences are true in
an ultraproduct. Its most useful consequence is the following:

IfUis an ultrafilter then the diagonal map fromAtoU-prodAis an elementary embedding.

If the ultrafilter *U* is *nonprincipal*, i.e. contains
no finite sets, then the diagonal map is not onto the domain of
*U*-prod *A*, and in fact *U*-prod *A* is
generally much larger than *A*. So we have a way of constructing
large elementary extensions. The axiom of choice guarantees that every
infinite set has many nonprincipal ultrafilters over it. Ultrapowers
are an essential tool for handling large cardinals in set theory (see
the entry on
set theory).

A remarkable (but in practice not terribly useful) theorem of
Saharon Shelah tells us that a pair of structures *A* and
*B* are elementarily equivalent if and only if they have
ultrapowers that are isomorphic to each other.

### 4.2 Saturation

Suppose *A* is an L-structure, *X* is a set of elements
of *A*, *B* is an elementary extension of *A* and
*b*, *c* are two elements of *B*. Then *b*
and *c* are said to have *the same type over X* if for
every formula
φ(*v*_{1},…,*v*_{n+1})
of L and every *n*-tuple d of elements of *X*,

B⊨ φ[b,d] ⇔B⊨ φ[c,d].

We say that *A* is *saturated* if whenever *X* is
a set of elements of *A*, of cardinality less than that of
*A*, and *B* is any elementary extension of *A*,
we always have that every element of *B* has the same type over
*X* as some element already in *A*.

This rather heavy definition gives little clue how useful saturated structures are. If every structure had a saturated elementary extension, many of the results of model theory would be much easier to prove. Unfortunately the existence of saturated elementary extensions depends on features of the surrounding universe of sets. There are technical ways around this obstacle, for example using weakenings of the notion of saturation. We have two main ways of constructing elementary extensions with some degree of saturation. One is by ultrapowers, using cleverly constructed ultrafilters. The other is by taking unions of elementary chains, generalising the proof we gave for the upward Löwenheim-Skolem theorem.

The existence of partially saturated elementary extensions of the
field * R* of real numbers is the main technical
fact behind Abraham Robinson’s

*nonstandard analysis*. See Section 4 of the entry on model theory for more information on this. Though model theory provided the first steps in nonstandard analysis, this branch of analysis rapidly became a subject in its own right, and its links with first-order model theory today are rather slim.

### 4.3 Ehrenfeucht-Mostowski models

Let *A* be an L-structure, *X* a set of elements of
*A* and < a linear ordering of *X* (not necessarily
definable by a first-order formula). We say that (*X*,<) is
an *indiscernible sequence* in *A* if for every natural
number *n*, and all elements
*a*_{1},…,*a*_{n},*b*_{
1},…,*b* _{n} of *A* such
that *a*_{1} < … <
*a*_{n} and *b*_{1} <
… < *b*_{n}, the map taking each
*a*_{i} to the corresponding
*b*_{i} is an elementary map. If T is a theory
with infinite models, then T has models that are the Skolem hulls (see
the entry on
classical logic)
of
indiscernible sequences. These models are known as
*Ehrenfeucht-Mostowski models*, after the two Polish model
theorists who first carried out this construction in the mid 1950s.
These models tend to be the opposite of saturated; we can arrange that
very few types over sets of elements are represented among their
elements. Some important distinctions between different models of set
theory can be expressed in terms of the indiscernible sequences within
these models; see the entry on
set theory.

## 5. Three successful programmes

Every healthy branch of mathematics needs a set of problems that form a serious challenge for its researchers. We close with a brief introduction to some of the research programmes that drove first-order model theory forwards in the second half of the twentieth century. The book of Marcja and Toffalori in the bibliography gives further information about these programmes. There are other current programmes besides these; see for example the handbook edited by Yuri Ershov, which is about model theory when the structures are built recursively.

### 5.1. Categoricity and classification

In 1904 Oswald Veblen described a theory as *categorical* if it
has just one model up to isomorphism, i.e. it has a model and all its
models are isomorphic to each other. (The name was suggested to him
by John Dewey, who also suggested the name ‘disjunctive’
for other theories. This pair of terms come from traditional logic as
names of types of sentence.) The depressing news is that there are no
categorical first-order theories with infinite models; we can see this
at once from the upward Löwenheim-Skolem theorem. In fact if T is
a first-order theory with infinite models, then the strongest kind of
categoricity we can hope for in T is that for certain infinite
cardinals κ, T has exactly one model of cardinality κ, up
to isomorphism. This property of T is called
κ-*categoricity*.

Now there is a heuristic principle that many people have used, though it seems to have no simple formulation. We suggest ‘Few is beautiful’. The principle says that if a first-order theory T constrains its models (of a particular cardinality) to be all similar to each other, this can only be because the models of T have few irregularities and asymmetries. So there should be a good structural description of these models. One should expect that they are ‘good structures’ from the point of view of classical mathematics. As a first step, one easily sees from the upward and downward Löwenheim-Skolem theorems that if T is κ-categorical for some κ at least as large as the number of formulas in the language of T, then T must be a complete theory. From now on, T is a complete theory with infinite models; and for simplicity we will assume that the language of T is countable.

In 1954 Jerzy Łoś announced that he could only find three kinds of example of theories T that are κ-categorical. Namely:

- T is
*totally categorical*if it is κ-categorical for every infinite cardinal κ. A typical example is the complete theory of an infinite-dimensional vector space over a finite field. - T is
*uncountably categorical*(but not totally categorical) if it is κ-categorical precisely when κ is uncountable. Essentially the only example that Łoś could find was the complete theory of an algebraically closed field; this is uncountably categorical by a well-known theorem of Steinitz. - T is
*countably categorical*(but not uncountably categorical) if it is κ-categorical precisely when κ is countable. A typical example is the complete theory of a dense linear ordering with no first or last element; this is countably categorical by a well-known theorem of Cantor.

Łoś asked whether there are any other possibilities besides these three. (Of course most complete theories are not κ-categorical for any κ.)

This question of Łoś was a tremendous stimulus to
research, and it led to a classic paper of Michael Morley in 1965 which
showed that Łoś’s three possibilities are in fact the only
ones. One central idea of Morley’s analysis was that models of an
uncountably categorical theory have the smallest possible number of
types of element; this led directly to the branch of model theory
called *stability theory*, which studies theories that have a
limited number of element types. These theories have the remarkable
property that every infinite indiscernible sequence in any of their
models is indiscernible under any linear ordering whatever; so these
sequences are a kind of generalisation of bases of vector spaces.
Another idea implicit in Morley’s work, but much clarified by later
work of William Marsh, John Baldwin and Alistair Lachlan, was that in
any model of an uncountably categorical theory there is a central core
(called a *strongly minimal set*) which carries a dependence
relation obeying similar laws to linear dependence in vector spaces. In
terms of this dependence relation one can define a dimension for the
model, and what remains of the model outside the core is so closely
tied to the core that the dimension determines the model up to
isomorphism.

Saharon Shelah developed Morley’s ideas with great resourcefulness and
energy. His main aim was to stretch the ‘Few is beautiful’
idea by showing that there are clear dividing lines between kinds of
theory T. On one side of a dividing line are theories with some good
structural property that forces the number of nonisomorphic models of
a given cardinality to be small. On the other side, every theory has
(for example) two models of the same cardinality that are not
isomorphic but are extremely hard to tell apart. Shelah coined the
name *classification theory* for this research. The text of
Lascar listed below is an elegant introduction to this whole
programme, from Łoś to Shelah. Meanwhile Shelah himself has
extended it far beyond first-order logic. Even in the first-order
case, Shelah had to invent new set-theoretic techniques (such as
proper forcing) to carry out his constructions.

Classification theory has two related though fundamentally
distinct aims: to classify the models of a theory (or to show that
such a classification is impossible) by some relatively simple
combinatorial invariants and to classify theories themselves by the
presence or absence of some fundamental structures within their
models. With the second edition of Classification Theory, Shelah
dropped the subtitle “And the number of non-isomorphic
models” in order to emphasize the broader goals of the project.
In general, a class of theories may be recognized as classification
theoretically robust if it admits characterizations both in terms of
cardinals and with respect to some absolute condition on formulae
relative to the theory. For example, by definition, a theory T in a
language L is stable if there is some cardinal κ ≥ |L| so
that for every model M of T of cardinality at most κ, the number
of 1-types over M is at most κ. Equivalently, a theory T is
stable if no formula has the *order property* relative to T.
That is, there is no L-formula φ(*x*;*y*)
(where *x* and *y* may be tuples of variables) so that for
each natural number n it is consistent with T that there be
*a*_{1},
…, *a*_{n}; *b*_{1}, …,
*b*_{n} so that
φ(*a*_{i},*b*_{j}) holds
just in case *i* ≤ *j*.

### 5.2. Geometric model theory

Geometric model theory grew out of Michael Morley’s 1965 paper, but in
a different direction from Shelah’s work (though today it makes regular
use of technical tools developed by Shelah in his classification
programme). Morley had shown that models of an uncountably categorical
theory have structural properties that are interesting in their own
right, regardless of the complete theory of the structure; so it became
the custom to talk of *uncountably categorical structures*,
meaning models of uncountably categorical theories. (And likewise
*totally categorical structures*.) Independently Boris Zilber in
Siberia and Greg Cherlin in the United States noticed that any infinite
group that is definable in an uncountably categorical structure must
have many features in common with the algebraic groups studied by
algebraic geometers. Zilber in particular showed that many methods from
algebraic geometry generalise to the model-theoretic case. His secret
weapon was Bezout’s Theorem from geometry, which he used to guide him
to solutions of very difficult model theoretic problems; for example
his theorem that no totally categorical theory can be axiomatised by a
finite number of axioms. (It was secret in the sense that it guided his
intuition but never appeared explicitly in his results.) Zilber also
noticed an important difference between the first and second of Łoś’s
examples above. Namely, in a vector space the subspaces (i.e. the
subsets closed under linear dependence) form a modular lattice; but the
algebraically closed subsets of an algebraically closed field form a
lattice that is not modular.

Partly because of the difficulty of communications between Siberia
and the West, these results of Zilber took some time to digest, and in
part they had to be rediscovered in the West. But when the message did
finally get through, the result was a new branch of model theory which
has come to be known as *geometric model theory*. The programme
is broadly to classify structures according to (a) what groups or
fields are interpretable in them (in the sense sketched in the entry
on
model theory)
and (b) whether or not the
structures have ‘modular geometries’; and then to use this
classification to solve problems in model theory and geometry. From
the mid 1980s the leader of this research was Ehud Hrushovski. In
the early 1990s, using joint work with Zilber, Hrushovski gave a
model-theoretic proof (the first complete proof to be found) of the
geometric Mordell-Lang conjecture in all characteristics; this was a
conjecture in classical diophantine geometry. Bouscaren (ed.) 1998
is devoted to Hrushovski’s proof and
the necessary background in model theory. Both (a) and (b) are
fundamental to Hrushovski’s argument.

### 5.3. O-minimality

Of the three programmes described here, this is the oldest, since it
grew out of Tarski’s description of the complete theory of the field of
real numbers (which he proved by the method of quantifier elimination).
In the course of giving this description,
Tarski had shown that every first-order formula φ(*x*) in
the relevant language, possibly with parameters, is satisfied by
exactly the same assignments as some boolean combination of formulas of
the form *x < s* or *t < x* where *s*,
*t* are constant terms naming parameters. Another way of saying
this is that

Every set of elements definable by a first-order formula is a finite union of open intervals with named endpoints, together with some finite set of elements.

A linearly ordered structure with this property is said to be
*o-minimal*. (The idea of the name is that o-minimality is an
analogue of ‘strong minimality’, in a form that makes sense
for structures that carry a linear ordering, whence ‘o-’
for ordering.)

In 1982 Lou van den Dries showed that the fact that the field of
real numbers is o-minimal gives a large amount of useful information
about the definable sets of higher dimension, such as the family of
definable subsets of the real plane. Soon after this, Julia Knight,
Anand Pillay and Charles Steinhorn noticed that if a structure
*A* is o-minimal, then so is any structure elementarily
equivalent to *A*, and that van den Dries’ analysis of
higher-dimensional definable set applies to all these structures. These
results led to much activity on the frontier between model theory and
function theory. Several old problems from model theory and function
theory were solved. Alex Wilkie showed that the field of real numbers
with a symbol for exponentiation is o-minimal and has a model-complete
complete theory, and thereby gave a positive answer to Tarski’s old
problem of whether this structure allows a quantifier elimination,
though his method was very far from the syntactic analysis that Tarski
had in mind. (This is one case where we need to remember the difference
between being model-complete and having quantifier elimination;
see section 2 above. The question whether this particular theory
has quantifier elimination is much harder and is closely related to
a deep conjecture of number theory called Schanuel’s conjecture;
see the paper of Macintyre and Wilkie.)
We now know a wide range of ways of adding interesting
features to the field of real numbers in such a way that the resulting
structure is still o-minimal (and hence in some sense mathematically
tractable). Van den Dries has urged that o-minimal structures provide a
good setting for developing the ‘tame topology’ programme
of Alexander Grothendieck.

In 2006, Jonathan Pila and Alex Wilkie showed that provided one
removes the subsets defined using only polynomial inequalities,
subsets of *R*^{n} definable in an
o-minimal expansion of the real field have few rational points.
Subsequently, following a strategy first employed by Pila and Umberto
Zannier to reprove the Manin-Mumford conjecture various authors have
used this o-minimal counting theorem to solve some important open
problems in diophantine geometry.

Kobi Peterzil and Sergei Starchenko have developed a
theory of o-minimal *complex* analysis. Just as with
the classical approach to complex analysis, one may interpret
the complex numbers as the set of ordered pairs of real numbers
with addition and multiplication defined by the usual rules
involving their real and imaginary parts. Of their results in this
area, their algebraicity theorem, which asserts that if a subset of
*C*^{n} is complex analytic (meaning that it
is closed and is locally defined by the vanishing of finitely many
complex analytic functions) and definable in some o-minimal expansion
of the real field, then it must be algebraic, that is defined by the
vanishing of polynomial equations, is the most striking result and
has been strong consequences in the study of functional transcendence and
homogeneous dynamics.

All three of these programmes generated new techniques for proof, constructions
and classifications. As we should expect, researchers have explored the range of
application of each technique. One result of this has been the emergence of
several useful classes of first-order theories which relate to more than
one of the three programmes. For example a central tool of
Shelah’s classification theory was his notion of *forking*, a
far-reaching generalisation of earlier algebraic notions of dependence
relation. The class of *simple* theories is defined by the fact that
forking has certain nice properties while the class of
*rosy theories* is characterised by the existence of a good notion
of independence coming from a further generalisation of forking called
*þ-forking*; several natural examples of simple theories
came to light in geometric model theory, and the complete theories of
o-minimal structures are examples of rosy theories. In parallel with
these technical advances, first-order model theory continues to grow
more closely involved with problems in number theory, functional
analysis and other branches of pure and, and even applied,
mathematics.

In Shelah's programme of classification theory, a central
rôle is played by
*dividing lines*. That is, the class of all theories should be
divided into those having some property and those which do not and
this division into these two classes should be authentic in the sense
that the classes should admit various definitions of different
characters. For example, the class of stable theories may be
described in many facially dissimilar ways as, for example, those
theories with respect to which every type is definable, those with
respect to which no formula has the order property, or those for which
there is some infinite cardinal *κ* at least as large as
that of the language for which over every model of size *κ*
there are no more than *κ* many *1*-types. Many of
these classification theoretic classes are defined by forbidding the
existence of certain combinatorial configurations. For example, the
class of *dependent* or *NIP* (for "Not the Independence
Property") theories is defined by saying that for **no** formula
φ(*x*,*y*) is it possible to find a model
*M* and sequences
(*a _{i}*)

_{i=0}

^{∞}and (

*b*) where the

_{S}*b*sequence is indexed by the subsets of the natural numbers so that φ(

*a*) if and only if

_{i},b_{S}*i ∈ S*. Around the same time that Shelah isolated the independence property, Vladimir Vapnik and Alexey Chervonenkis discovered the same notion in machine learning theory. Consequences of the model theoretic analysis of NIP have since been drawn in theory of neural networks, extremal combinatorics, the theory of differential privacy, and machine learning theory more broadly.

## Bibliography

- Beth, E., 1953, “On Padoa’s method in the theory of
definition”,
*Nederlandse Akademie van Wetenschappen*, Proceedings (Series A), 56: 330–339. - Bouscaren, E. (ed.), 1998,
*Model Theory and Algebraic Geometry: An introduction to E. Hrushovski’s proof of the geometric Mordell-Lang conjecture*(Lecture Notes in Mathematics: Volume 1696), Berlin: Springer-Verlag. - Buechler, S., 1996,
*Essential Stability Theory*, Berlin: Springer-Verlag. - Chang, C. and Keisler, J., 1990,
*Model Theory*, Amsterdam: North-Holland. - Chatzidakis, Z.
*et al*. (eds.), 2008,*Model Theory with Applications to Algebra and Analysis*, Volumes 1 and 2, Cambridge: Cambridge University Press. - Dries, L. van den, 1998,
*Tame Topology and O-minimal Structures*, Cambridge: Cambridge University Press. - Ealy, C. and Onshuus, A., 2007, “Characterizing rosy
theories”,
*Journal of Symbolic Logic*, 72: 919–940. - Ehrig, H. and Mahr, B., 1985,
*Fundamentals of Algebraic Specification I: Equations and Initial Semantics*, Berlin: Springer-Verlag. - Ershov, Y. (ed.), 1998,
*Handbook of Recursive Mathematics I, Recursive Model Theory*, New York: Elsevier. - Hart, B., Lachlan, A. and Valeriote, M., 1996,
*Algebraic Model Theory*, Dordrecht: Kluwer. - Haskell, D., Pillay, A. and Steinhorn, C., 2000,
*Model Theory, Algebra, and Geometry*, Cambridge: Cambridge University Press. - Hodges, W., 1993,
*Model Theory*, Cambridge: Cambridge University Press. - –––, 1998, “The laws of distribution for
syllogisms”,
*Notre Dame Journal of Formal Logic*, 39: 221–230. - Karpinski, M. and A. Macintyre, 1997, Polynomial bounds for VC
dimension of sigmoidal and general Pfaffian neural networks, 1st
Annual Dagstuhl Seminar on Neural Computing (1994),
*Journal of Computer and System Sciences*, 54(1[2]): 169–176. - Lascar, D., 1986,
*Stability in Model Theory*, Harlow: Longman. - Macintyre, A. and Wilkie, A., 1996, “On the decidability of
the real exponential field”, in
*Kreiseliana: About and around Georg Kreisel*, P. Odifreddi (ed.), Wellesley MA : A. K. Peters, 441–467. - Marcja, A. and Toffalori, C., 2003,
*A Guide to Classical and Modern Model Theory*, Dordrecht: Kluwer. - Marker, D., 2002,
*Model Theory: An Introduction*, New York: Springer-Verlag. - Morley, M., 1965, “Categoricity in
power”,
*Transactions of the American Mathematical Society*, 114: 514–538. - Peterzil, Y. and S. Starchenko, Sergei, 2009, Complex analytic
geometry and analytic-geometric categories,
*Journal für die reine und angewandte Mathematik*, 626: 39–74. - Pillay, A., 1996,
*Geometric Stability Theory*, Oxford: Oxford University Press. - Pila, J., 2011, “O-minimality and the André-Oort
conjecture for
**C**^{n}”,*Annals of Mathematics*(2), 173(3): 1779–1840. - Pila, J. and Wilkie, A., 2006, “The rational points of a
definable set”,
*Duke Mathematics Journal*, 133(3): 591–616. - Pila, J. and Zannier, U., 2008, “Rational points in periodic
analytic sets and the Manin-Mumford conjecture”,
*Rendiconti Lincei Matematica E Applicazioni*, 19(2): 149–162. - Poizat, B., 2000,
*A Course in Model Theory*, New York: Springer. - Shelah, S., 1990,
*Classification Theory*, Amsterdam: North-Holland. - Tarski, A., 1951,
*A Decision Method for Elementary Algebra and Geometry*, Berkeley: University of California Press. - Vaught, R., 1974, “Model theory before 1945”,
in
*Proceedings of the Tarski Symposium*, L. Henkin,*et al*. (eds.), Providence RI : American Mathematical Society, 153–172. - Veblen, O., 1904, “A System of Axioms for Geometry”,
*Transactions of the American Mathematical Society*, 5(3): 343–384 - Wagner, F., 2000,
*Simple Theories*, Dordrecht: Kluwer Academic Publishers. - Wilkie, A., 1996, “Model completeness results for expansions of
the real field by restricted Pfaffian functions and the exponential
function”,
*Journal of the American Mathematical Society*, 9: 1051–1094.

## Academic Tools

How to cite this entry. Preview the PDF version of this entry at the Friends of the SEP Society. Look up topics and thinkers related to this entry at the Internet Philosophy Ontology Project (InPhO). Enhanced bibliography for this entry at PhilPapers, with links to its database.

## Other Internet Resources

- Hrushovski, E. and Loeser, F., 2010, Non-archimedean tame topology and stably dominated types, online manuscript at arXiv.org.
- Simon, P., 2012, Lecture notes on NIP theories, online manuscript at arXiv.org.
- Schematic representation of the classification of theories.