## Notes to Generalized Quantifiers

1.
The use of “quantifier” (German
“*Quantor*”, French
“*quantificateur*”, etc.) to denote \(\forall\) and
\(\exists\) became established in logic toward the end of the
1920s.

2. I use the standard convention of letting mathematical expressions denote themselves whenever convenient.

3.
As a rule of thumb I use typewriter font for quantifier expressions
and italics for the signified quantifiers. In logical languages, on
the other hand, it is convenient to abuse notation somewhat by using
the same symbol for both the expression and the quantifier, when no
confusion results. I sometimes do the same for predicate symbols, so
that the letters *A*, *B* …, *R*,… stand
for both the symbol and the set or relation it denotes.

4. See Parsons (1997 [2017]) for an illuminating account of the Aristotelian square of opposition and some modern misunderstandings about it, and Westerståhl (2012) for a comparison with the “modern square”, which differs from the Aristotelian one in that *all* doesn’t have existential import but *not all* does (for Aristotle and most of his medieval followers, the reverse held).

5. See Peters and Westerståhl (2006: Ch. 2.5), for a proof that there does not exist a semantics assigning individuals or sets of individuals even to phrases of the two forms *all A* and *some B* in a systematic (compositional) way.

6. Some such properties were considered even in the early days of predicate logic, for example, the quantifier \(\exists !\) meaning “there exists exactly one”.

7. Rather than seeing generalized quantifiers as mappings from universes to second-order relations, Lindström took them to be classes of models of the corresponding type. This is a negligible difference, since we have \[Q_M(R_1,\ldots,R_k) \iff (M,R_1,\ldots,R_k) \in Q\] (see section 6 for the notation on the right-hand side).

8.
It is often held that the idea of a totality of everything has been
shown to be incoherent by Russell’s paradox. Indeed, the paradox
proved Frege’s original system to be inconsistent, and shows
that there cannot be a *set* containing all sets. Williamson
(2003) claimed that an absolute notion of “everything” can
be made formally coherent, although a semantics in which
interpretations are not objects is needed. This sparked off a debate
about absolutely general quantification; see, e.g., Cartwright (1994),
Glanzberg (2004), Linnebo (2006), Rayo (2012), and the collection Rayo
and Uzquiano (2006). Filin Karlsson (2018) gives an overview and
suggests an account in terms of stratified set theory (NFU).

9. These are the modern variants. Aristotle considered *all* with existential import, i.e.,

\[(\textit{all}_{\,\text{ei}})_M(A,B) \iff \emptyset \neq A \subseteq B\]

and similarly for *not all*, which he took to be the negation of \(\textit{all}_{\,\text{ei}}\).
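The divergence between the two squares can be checked directly on finite sets. A minimal sketch (the function names are mine, chosen for illustration):

```python
# Sketch: modern "all" and "not all" versus Aristotle's all with
# existential import, as relations between finite sets.
def all_(A, B):           # all_M(A, B)  <=>  A ⊆ B
    return A <= B

def not_all(A, B):        # modern square: the negation of all_
    return not all_(A, B)

def all_ei(A, B):         # Aristotelian all, with existential import
    return bool(A) and A <= B

# The two squares diverge exactly when A is empty:
print(all_(set(), {1}), all_ei(set(), {1}))   # True False
```

On an empty restriction set, modern *all* is (vacuously) true while \(\textit{all}_{\,\text{ei}}\) is false, which is precisely the difference in existential import.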

10.
Below, we are interpreting *most* as
“more than half of the” (on finite universes). Often,
*most* rather seems to mean something
like “a large majority of the”. There is probably some
vagueness involved, as well as an ambiguity: the
“threshold” may be set at different values in different
contexts. We are not dealing with vagueness here, and we assume that a
*fixed context assumption* chooses a suitable meaning. By
default, we thus set the “threshold” to 1/2.
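The fixed context assumption can be modeled by treating the threshold as a parameter with default 1/2; a small sketch (the names and parameterization are mine):

```python
from fractions import Fraction

# Sketch: "most" on finite universes with a context-settable
# threshold, defaulting to 1/2 as in the text:
#   most_M(A, B) <=> |A ∩ B| > threshold * |A|
def most(A, B, threshold=Fraction(1, 2)):
    return len(A & B) > threshold * len(A)

dogs = {1, 2, 3, 4, 5}
barkers = {1, 2, 3}
print(most(dogs, barkers))                   # True: 3/5 > 1/2
print(most(dogs, barkers, Fraction(4, 5)))   # False under a stricter threshold
```

Raising the threshold yields the “large majority” readings mentioned above, while the quantifier itself stays sharply defined in each fixed context.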

11. A main source for the mathematics of model-theoretic logics is Barwise and Feferman (1985).

12. Here is a fact which is non-trivial but still relatively easy to prove:

\[\FO(\textit{MO}) \equiv \FO(Q_0,\textit{most})\]

See Peters and Westerståhl (2006) for proofs of this and similar facts.

13. Nor does it have the completeness property or the Tarski property, though it does have the Löwenheim property.

14. See Ebbinghaus and Flum (1995) for the mathematics, and Westerståhl (1989) or Peters and Westerståhl (2006) for surveys focused on linguistic applications.

15. For the coding of quantifiers of arbitrary monadic types, or polyadic quantifiers, it may be practical to use a few more symbols in the coding language.

16. See, for example, Hopcroft and Ullman (1979) for an introduction to automata theory.

17. In the sense that if a binary word is accepted, so are all permutations of that word (when coded as suggested above).
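For small words this closure condition can be checked by brute force; a sketch (the two example languages are mine):

```python
from itertools import permutations

# Sketch: a quantifier corresponds to a permutation-closed language
# of binary words, i.e., acceptance depends only on the number of
# 0s and 1s, not on their order.
def accepts_even(word):            # "an even number of 1s"
    return word.count("1") % 2 == 0

def accepts_initial_one(word):     # depends on order, not just counts
    return word.startswith("1")

def permutation_closed(accepts, word):
    results = {accepts("".join(p)) for p in permutations(word)}
    return len(results) == 1       # all permutations agree

print(permutation_closed(accepts_even, "00110"))      # True
print(permutation_closed(accepts_initial_one, "10"))  # False
```

The first language is permutation-closed and so corresponds to a quantifier; the second is not, since acceptance depends on the position of the 1.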

18.
Here 0 can be defined as the unique *y* in *N* such that
\(y + y = y\). For more results in this area, see M.
Mostowski (1998) and Steinert-Threlkeld
and Icard (2013).

19.
This is one way of providing for *recursive definitions* in
the logic. A simpler operator that one can add is the transitive
closure operator. If *R* is a binary relation, the *transitive
closure* of *R*, \(\TC(R)\), is the smallest transitive
relation containing *R*. It can also be defined as follows (cf.
the definition of *RECIP* in the list
(12)):

\[(a,b) \in \TC(R) \iff \exists n\, \exists x_1 \ldots \exists x_n\, (a = x_1 \wedge x_n = b \wedge x_1Rx_2 \wedge \cdots \wedge x_{n-1}Rx_n)\]

Note the quantification over *n*: \(\TC(R)\) is not in general
definable in *FO* from *R*. It can also be defined
recursively:

\[(a,b) \in \TC(R) \iff aRb \vee \exists c\,(aRc \wedge (c,b) \in \TC(R))\]

To be able to do this *inside* our logic we can add formulas of
the form \(\TC(x,y,\f)(u,v)\) whenever \(\f\) is a formula, and the
semantic rule that when \(a,b \in M\), \(\f(x,y,\zbar)\) has the free
variables shown, and \(\cbar\) corresponds to \(\zbar\),

\[M \models \TC(x,y,\f)(a,b,\cbar) \iff (a,b) \in \TC(\{(d,e) : M \models \f(d,e,\cbar)\})\]

This gives us the logic \(\FO(\TC)\); the LFP operator generalizes this to other forms of recursion. See Ebbinghaus and Flum (1995) for details about the definitions, and for results about these logics, including those mentioned in this and the next two paragraphs.
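On finite relations, the recursive characterization of \(\TC\) yields a simple fixed-point computation; a sketch:

```python
# Sketch: transitive closure of a finite binary relation, computed
# by repeatedly composing the relation with itself until no new
# pairs appear (a least-fixed-point computation).
def tc(R):
    closure = set(R)
    while True:
        new = {(a, d) for (a, b) in closure
                      for (c, d) in closure if b == c}
        if new <= closure:          # nothing new: fixed point reached
            return closure
        closure |= new

R = {(1, 2), (2, 3), (3, 4)}
print(sorted(tc(R)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

The quantification over *n* in the explicit definition shows up here as the unbounded `while` loop, which is exactly what a single first-order formula cannot express.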

20. Thereby contradicting common claims that natural languages are too unsystematic and chaotic to allow for a mathematically precise treatment; cf., for example, Russell’s “Misleading Form Thesis”. See Montague (1974), in particular the papers “Universal grammar” and “The proper treatment of quantification in ordinary English”.

21. Barwise & Cooper (1981), Keenan & Stavi (1986), Higginbotham & May (1981). The papers in van Benthem (1986) provided further logical and linguistic development of these ideas. Surveys of the whole area are Westerståhl (1989), Keenan & Westerståhl (2011), and Peters & Westerståhl (2006).

22. I use the traditional NP for “noun phrase” and N for “noun”; linguists today prefer DP and NP, respectively.

23.
(a) is practically immediate, and for (b) it is easy to verify that
\(Q{^{\text{rel}}}\) always satisfies Conserv
and Ext when *Q* is of type
\({\langle}1{\rangle}\). In the other direction, any Conserv
and Ext \(Q'\) has a
type \({\langle}1{\rangle}\) “counterpart” *Q*
defined by \(Q_{M}(B) \Leftrightarrow Q'_{M}(M,B)\); then, by Conserv and Ext,

\[Q{^{\text{rel}}}_M(A,B) \Leftrightarrow Q_A(A\cap B) \Leftrightarrow Q'_A(A,A\cap B) \Leftrightarrow Q'_A(A,B) \Leftrightarrow Q'_M(A,B)\]

so \(Q' = Q{^{\text{rel}}}\).
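For finite models, relativization can be sketched concretely (the representation of quantifiers as Python functions of (universe, arguments) is my choice):

```python
# Sketch: relativization of a type <1> quantifier Q, given as a
# function of (universe, B), to a type <1,1> quantifier:
#   (Q^rel)_M(A, B) <=> Q_A(A ∩ B)
def relativize(Q):
    return lambda M, A, B: Q(A, A & B)

def exist(M, B):                 # the existential quantifier ∃
    return bool(B & M)

some = relativize(exist)         # some_M(A, B) <=> A ∩ B ≠ ∅

print(some({1, 2, 3, 4}, {1, 2}, {2, 3}))   # True
# Conserv and Ext hold automatically: the value ignores M and
# depends only on A and A ∩ B.
print(some({9}, {1, 2}, {2, 3}))            # True (universe irrelevant)
```

That the universe argument `M` is simply ignored is the computational face of part (b): relativized quantifiers always satisfy Conserv and Ext.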

24.
But there again it is often natural to consider the Ext
quantifier \(W{^{\text{rel}}}\), where
\(W{^{\text{rel}}}_M(A,R)\) says that *R* is a well-ordering of
*A*.

25.
Indeed, the intersective quantifiers mentioned so far have the
stronger property of being *cardinal*; i.e., only the
cardinality of \(A\cap B\) matters. An example of an intersective but
non-cardinal quantifier is *no _ except Mary*, defined in the
next section.
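The cardinal/non-cardinal contrast can be made vivid with two finite examples (function names are mine; *no _ except Mary* is defined as in the next section):

```python
# Sketch: a cardinal quantifier depends only on |A ∩ B| ...
def at_least_two(A, B):
    return len(A & B) >= 2

# ... while "no _ except Mary" is intersective (only A ∩ B matters)
# but not cardinal: which individuals are in A ∩ B matters too.
def no_except_mary(A, B):
    return A & B == {"Mary"}

print(no_except_mary({"Mary", "Sue"}, {"Mary"}))   # True
print(no_except_mary({"Mary", "Sue"}, {"Sue"}))    # False, same |A ∩ B|
```

The last two calls have intersections of the same cardinality (one element) but different truth values, so no function of \(|A\cap B|\) alone can express this quantifier.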

26. See Keenan and Westerståhl (2011: sect. 19.2) for discussion.

27. For detailed discussion and further references, see Peters and Westerståhl (2006): Ch. 6.3 for existential there sentences, and Ch. 5 for much more on monotonicity, including the connection with polarity items.

28. For much more on this, and on the treatment of possessive and exceptive determiners in general, see Peters and Westerståhl (2006, 2013).

29.
This is the *universal* reading of *John’s*.
There is also an *existential*
reading, e.g., in

When John’s dogs escape, his neighbors usually catch them.

The antecedent doesn’t say that all of his dogs escape, only that some of them do.

30. See Bonnay (2008) and references therein for an overview of the discussion of conditions like Isom in the context of logicality.

31. For example, from their relativized versions. This suggestion about sameness or constancy is further discussed in Westerståhl (2017).

32.
van Benthem (1989) suggested that
(37)
could be interpreted as: “*R* includes a 1-1 function from
*A* to *B*”, which is not *FO*-definable.

33. This result is proved in Luosto (2000); the proof is quite difficult. More general results on the undefinability of resumption can be found in Hella, Väänänen, & Westerståhl (1997). Some discussion of the linguistic aspects appears in Peters & Westerståhl (2002).

34.
See Dalrymple et al. (1998) for an
extended discussion. *RECIP* is not definable in
*FO*.

35. Branching or partially ordered quantifiers is another way of generalizing (prefixes of) \(\forall\) and \(\exists\); it appeared in logic with Henkin (1961). Hintikka (1973) argued that partially ordered prefixes with \(\forall\) and \(\exists\) that are not first-order definable occur essentially in English too. The debate that followed Hintikka’s proposal was re-analyzed in Barwise (1981), who also suggested that (42) is a clearer example of branching in English than Hintikka’s original examples. Semantically, branching quantifiers are already subsumed under our notion of a generalized quantifier, since they can all be seen as polyadic quantifiers, like \(Br(Q_1,Q_2)\), although the special syntax is then lost.

Note that the construction in
(42)
only makes good sense when \(Q_1\) and \(Q_2\) are *right monotone
increasing*, i.e., \(Q_i(A,B)\) and \(B\subseteq B'\) entails that
\(Q_(A,B')\), \(i = 1,2\). Then

and one sees that (42) is a generalization of this. There has been some discussion about if and how (42) can be reformulated for other quantifiers; apart from Barwise (1979), see Westerståhl (1987) and Sher (1997).
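Assuming the usual witness-set definition of branching for increasing quantifiers, \(\exists X \exists Y\,[Q_1(A,X) \wedge Q_2(B,Y) \wedge X\times Y \subseteq R]\), it can be checked by brute force on small finite models; a sketch:

```python
from itertools import chain, combinations

# Sketch: branching of two right-increasing type <1,1> quantifiers:
#   Br(Q1,Q2)(A,B,R) <=> ∃X⊆A ∃Y⊆B [Q1(A,X) ∧ Q2(B,Y) ∧ X×Y ⊆ R]
def subsets(s):
    s = list(s)
    return (set(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1)))

def branch(Q1, Q2, A, B, R):
    return any(Q1(A, X) and Q2(B, Y) and
               all((x, y) in R for x in X for y in Y)
               for X in subsets(A) for Y in subsets(B))

def most_(A, B):                      # most: |A ∩ B| > |A|/2
    return len(A & B) > len(A) / 2

A = {1, 2, 3}
B = {"a", "b", "c"}
R = {(1, "a"), (1, "b"), (2, "a"), (2, "b")}
print(branch(most_, most_, A, B, R))  # True: X={1,2}, Y={"a","b"}
```

The single pair of witness sets *X*, *Y* is what distinguishes branching from the linear reading, where the *y*-witnesses may vary with *x*.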

36. Peters and Westerståhl (2006) argues that the difference between D- and A-quantification is mainly syntactic, and that all languages appear to be able to express Conserv and Ext type \({\langle}1,1{\rangle}\) quantifiers (whether by determiners or other means), even though some languages are claimed to lack phrases denoting type \({\langle}1{\rangle}\) quantifiers.

37. These 34 chapters are written by linguists or linguistics students, most of whom are native speakers of the respective language.

38. Szabolcsi agrees that some alleged shortcomings of GQ theory are orthogonal to its aims, but thinks the compositionality and scope problems are serious. As to compositional analysis of the meaning of complex determiners, she sees no principled problems with adding such analyses to current GQ theory. Her intricate discussion of scope is too complex to be sketched here.

39.
The ubiquity of this property, which is also called
*smoothness*, is discussed at length in Peters and
Westerståhl (2006: Ch. 5).

40. For example, is the premise true if most Americans who know three foreign languages speak at least one of them at home? Or must they speak all three, or in general most of the foreign languages they know, at home?

41.
In fact, not just monotonicity but also *exclusion*: being
able to reason from the fact that two predicates are disjoint, which
also comes naturally to speakers. Technically, the insight—which
essentially goes back to Keenan and Faltz
(1984)—is that while mere monotonicity only requires a
*pre-order* (reflexive and transitive), \(x \leq y\) entails
\(f(x) \leq f(y)\), the domains of the relevant functions here are
usually *bounded distributive lattices*, which enables one to
express other properties besides monotonicity, in particular
exclusion. A recent formulation of the monotonicity calculus is given
in Icard, Moss, and Tune (2017).

42.
Let *C* be the predicate “likes every clarinetist”.
So Pat is a *C* (second premiss). So if you like every *C*
you like Pat. But then, using the first premiss, if you like every
*C*, everyone likes *you*. And that’s what the
conclusion says. Each step in this argument can be construed as an
application of the monotonicity profile \(-\textit{every}+\).

43. As should be clear from the above, the monotonicity calculus and the axiomatized syllogistic fragments can be seen as different ways to approach the same phenomena, a point of view explored in Icard (2014).

44. More exactly, it is NP-complete. Further, Mostowski and Wojtyniak (2004) proved that the branching construction in Hintikka’s famous villagers sentence is also NP-complete:

Some relative of each villager and some relative of each townsman hate each other.

45.
This is supervised learning (back-propagation through a recurrent
neural network): the network is asked if \(Q_M(A,B)\) holds in simple
models \((M,A,B)\), reacts to feedback depending on the answer, and
tries again. The lack of an effect for Conserv
is explained by the fact that in the set-up used there is no
difference between *A* and *B* in the models: the sets
\(A-B\), \(A\cap B\), \(B-A\), and \(M-(A\cup B)\) are all on a
par.