Jean-Pierre Koenig’s current research program (09/01/08)

Speakers of a language know tens of thousands of words. A survey I conducted of
college-educated English speakers suggests they know about 4,000 verbs (Koenig et al. 2003). The sheer
size of one's lexicon raises several questions:

· What information is associated with all these words?
· How is it organized?
· How is it used when interpreting sentences?

My research focuses on
these questions. During my graduate studies and the early part of my career,
I focused on the organization of lexical knowledge. In particular, I
developed a model that accounts for the way word structure is similar to, but
not identical to, the structure of sentences, i.e., words may have the
creativity and productivity of syntax, but they can also be idiosyncratic
(see Koenig (1999) for a summary). I also showed that this model accounts for
the kinds of regularity that languages display between what words mean and
the contexts in which they occur (see Davis and Koenig (2000) and Koenig and
Davis (2001)).

In more recent years, while I have continued the investigation of general
principles that explain how the meaning of words affects the contexts in which
they occur, my research has expanded in three directions. The first concerns the
informational boundaries of word meaning (how complex verb meaning can be and
what verb meaning can include). The second concerns the extent to which word
meaning and syntactic structure mirror each other within and across languages.
The third concerns the role semantic information plays in the on-line
interpretation of sentences.

1. The cartography of verb meaning

Given the results of the
survey I just mentioned and that, on average, each English verb has between
three and four meanings (depending on the particular dictionary), speakers
know between 12,000 and 16,000 verb meanings. What is the meaning of all
these verbs made up of? This is one of the questions my recent research has
tried to answer.

How do we distinguish information that is part of the meaning of verbs from information that is not?

Verbs
describe situations; those situations include participants who play certain
roles (e.g., in an eating event, there is an entity acting to ingest some food that is then consumed). Linguists have traditionally distinguished information
about situation participants that is part of the meaning of verbs (or strongly
associated with verbs) from information about situation participants that is
not (or is only weakly associated with verbs). Roles that include the former kind of
information are called arguments; roles that include the latter kind of
information, adjuncts. Koenig et al. (2003) proposed a new model of the
difference between these two kinds of semantic information (participant
roles). Arguments are participant roles that are (i)
required by the situations a verb describes and (ii) specific to that
verb and a restricted set of other verbs (e.g., all situations occur somewhere,
but only some situations involve an entity that causes a change of state in
another entity). The relevance of the specificity of participant information
to its strength of association with situation-types, I claim, is the verbal
parallel of the special role that feature distinctiveness has been shown to
play in object categorization. A more information-theoretic way of
conceptualizing the model is that participant role information is associated
with the meaning of verbs to the degree that it co-occurs with the situations
the verb describes more often than expected by chance (the so-called pointwise
mutual information between the verb's denoted situation-type and the
participant role). The research I just outlined provides a model of the
factors that determine what information is part of, or is strongly associated
with, a verb's meaning.

How complex can the structure of verb meaning be?

There seems to be a limit to the
structural complexity of verb meanings. Intuitively, verb meanings typically
describe situations that are no more complex than a cause leading to a change
of state. Thus, there are many words like ‘kill’
(an activity that causes the death (change of state) of an entity), but there
appears to be no single word that would mean something like ‘pay (someone) to kill’ (an activity
that causes another entity to act so that a change of state occurs). In Koenig et al. (2008), we showed on
the basis of a survey of 1,800 verbs that verb meaning can be more complex
than hitherto believed when a verb describes situations that involve tool
manipulation, i.e. situations in which an agent uses an instrument to perform
an action (a typical hominoid behavior).

What kind of information differentiates verbs which share meaning structure?

Coarse-grained semantic structure
(whether the described situations are a state, or include a cause, a change
of state, or the use of a tool) leads to conceptual classes that include
hundreds of verb meanings. Out of the myriads of ways situation-types denoted
by verbs in each of these classes could vary, what are the ways in which
verbs typically vary? In other words, why do we have the
verbs we have in each of these classes? In contrast to research on
the overall structure of verb meaning, little research has been devoted to
this question. To delineate the kind of non-structural information verb
meanings can include, Koenig et al. (2008) semantically classified the
meaning of all the verbs that describe situations that must or may include
the use of a tool to induce a change of state (about 1,800 in total). This
study showed that these verbs fall into a small set of semantic classes and
that the idiosyncratic information distinguishing one verb meaning from the
other is not randomly distributed. Some aspects of the meaning of verbs that
describe situations that must or may include use of a tool are more finely
carved than others in ways that parallel research on goal/result focus in
adults' and children's event descriptions.

How are changes of state categorized differently across languages?

In several South Asian and
South-East Asian languages (Hindi, Tamil, Thai) one can felicitously say the
equivalent of English ‘He killed Lisi, but she didn't die’. In recent work, my student Liancheng Chief and I (Koenig and Chief (2008)) have argued that
such apparently odd verb meanings can be reduced to a variation on the
semantics of change of state verbs in better-known languages. Simply put,
these languages conceptualize the described situations as involving a change
that is non-null, but less than or equal to the maximum (i.e., paraphrasable
for Mandarin sha ‘kill’ as ‘hurt in a way that the change in vital signs is
less than or equal to the maximum, i.e., death’). We further show that only
the meaning of pairs of verbs which describe changes of state that can be
conceptualized as gradable (changes in health, persuasion, …)
vary between these languages and languages like English.

2. Non-uniformity in the syntax/semantics interface

The structure of
expressions in logical or artificial languages “mirrors” the
structure of the meaning they express (technically, one can define a
homomorphism between a syntactic and semantic algebra). In natural languages,
this parallel between syntax and semantics holds only partially at the surface
(i.e., for the apparent structure of sentences). Much of the research on
the interface between syntax and semantics over the last quarter century has
studied the extent of this “imperfection” in natural languages
and proposed explanations for it (see Koenig (2005) for a survey). To advance
our understanding of that issue, I have studied one particular semantic field
(aspect operators) and how it is realized syntactically. Aspect operators in
natural languages are expressions (typically verbs or verb forms) that
indicate whether an event is completed or merely stopped, whether an event is
on-going, or whether an event has consequences for the present (or some other
reference interval).

One dominant view is that
the syntax of aspect markers is uniform across languages, that there is a
single, universal mapping between aspectual operators and syntactic
positions. In Koenig and Muansuwan (2005), original
data from Thai challenges this view. Briefly put, the seventeen Thai aspect
markers are all verbs, but they fall into two distinct classes. Members of
the first class are syntactic heads that take following verb phrases as
complements; members of the second class are syntactic modifiers that modify
preceding verb phrases. We show that the fact that one and the same language
has two (symmetric) ways to map
aspect operators onto syntactic structure seriously undermines the uniformity
hypothesis. Our research supports claims that the architecture of natural
languages allows for a dissociation between semantic
operators and syntactic heads. Despite their common semantic operator status,
Thai aspect markers may, but need not, be syntactic heads. In my most recent
work (presented at conferences, but not yet published), Poornima
Shakthi and I have extended this argument on the
basis of data from Hindi. Hindi aspect markers combine with lexical verbs to
form complex predicates. Most interestingly, Hindi shows the same “dual”
mapping between syntax and semantics that Thai does. Some of the aspect
operators are syntactic heads that take preceding verbs as complements, and
some are syntactic modifiers that modify following verbs. (Hindi is a verb
final language, i.e. heads follow their complements.) Data from Hindi support
the same conclusion regarding the architecture of the syntax/semantics
interface as the Thai data.

Some of my recent research
has also expanded on the semantics of aspect operators. Koenig and Muansuwan (2001) show how semantic operators that select
not necessarily proper subparts of an event can be used to model the meaning
of Thai aspect classes. Nishiyama and Koenig (2008)
show that these kinds of operators can be used to provide a new approach to
the meaning of the English and Japanese perfect. They argue that the various
traditional interpretations of the perfect come from further specification of
an underspecified meaning, typically through pragmatic inferences, and
validate their hypothesis through a corpus study of a pseudo-random
sample of over 600 English and Japanese example discourses.

3. How much of the syntactic context in which words occur is truly semantically determined?

It is well known that the
meaning of words partially determines how participant roles that are included
in the meaning of verbs are realized syntactically. In the last few years,
expanding on Koenig and Davis (2001), I have re-examined to what degree the
meaning of verbs determines the realization of arguments. In Koenig and Davis
(2003) and Koenig and Davis (2006), we argue that previous research,
including our own, has overestimated the role of semantics by surreptitiously
positing unwarranted semantic representations. Koenig and Davis (2003) and
Koenig and Davis (2006) also use semantic underspecification
to properly bound the effects of verb meaning on the syntactic realization of
participant roles. Most recently, in joint
research with Karin Michelson (Koenig and Michelson (To appear)), I have
looked at the interplay of nominal reference and argument realization. Our
analysis of kin terms in Oneida (an Iroquoian language), which are at the same
time nominal and verbal stems, suggests that (i) the
issue of nominal reference (nominal index selection, technically) is in
principle distinct from the issue of the realization of the members of kin
relations, (ii) the range of properties relevant to argument realization
may include non-event-dependent properties (e.g., the absolute age of members
of a kin relation), and (iii) it does indeed seem to be a universal that only n-1
of a noun's arguments need be realized by phrases co-occurring with the noun.
Oneida kin noun stems only seem to
be an exception to this putative universal because they are also verb stems.

4. The influence of lexical meaning on sentence processing

Over the last ten years, my
research has combined theoretical linguistics, quantitative linguistics, and
experimental methodologies to answer the fundamental questions about
speakers' word knowledge I have just described. In most of my experimental
work to date, I have tried to determine what
kind of participant information is used in the on-line interpretation of
sentences and when it is used. In a
series of papers from 1999 through 2002, Gail Mauner
and I showed that whether a semantic argument is activated or not upon
reading a verb form (i.e. whether readers of sentences activate the notion of
an agent/cause …) depends on whether or not the
semantic argument is syntactically active (is part of the verb's argument
structure in Linguistics parlance). Critically, the activation of the
semantic argument does not depend on the realization of that argument, since
it was unexpressed in all our critical conditions. More importantly, we have
experimentally validated our model of participant role activation through a
series of experiments (see Koenig et al. (2003), Conklin et al. (2004),
Koenig et al. (Under revision)). In all these experiments, extensive corpus
work showed that the effect of strength of association between
situation-types and participant roles was not due to extraneous syntactic
factors, in particular how frequently that role is expressed.

In recent work (experiments
conducted, presented at conferences, and being written up for publication),
we have expanded this research on the basis of the semantic results
of the survey of verbs that require or allow instruments detailed in Koenig
et al. (2008). Verbs that require an instrument role tend to constrain the
range of fillers of that role more than verbs that allow an instrument
(compare the range of likely instruments for ‘chop’ and ‘injure’). In a
series of experiments conducted by Breton Bienvenue,
we showed that this subtle difference in the event information verbs include
can affect eye-movements: Participants launch earlier looks for highly
constraining verbs and use these semantic constraints to infer which of a set
of possible depicted objects are likely instruments of the
verb. Gail Mauner and I have also shown that
the presence, in a syntactically similar preceding sentence, of a verb that
belongs to the same narrow semantic class (e.g., the class of cutting verbs) can first facilitate
and later inhibit the processing of the verb and post-verb regions of an
immediately following sentence. Both results provide support for the semantic
classification discussed in Koenig et al. (2008). Finally, a former student of mine
and I have shown that part-of-speech information associated with preceding
words can affect the lexical processing of subsequent words above and beyond
any effect of semantic similarity between the two words (see Melinger and Koenig (2007)).
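
The information-theoretic formulation mentioned in section 1, in which a participant role is associated with a verb's meaning to the degree that the two co-occur more often than chance predicts, can be illustrated with a minimal sketch. The function name `pmi`, the variable names, and all counts below are hypothetical and invented for illustration only; they are not data or code from Koenig et al. (2003), and the sketch assumes simple relative-frequency estimates of the probabilities.

```python
import math


def pmi(joint_count: int, verb_count: int, role_count: int, total: int) -> float:
    """Pointwise mutual information in bits: log2( P(v, r) / (P(v) * P(r)) ).

    With relative-frequency estimates, the ratio simplifies to
    joint_count * total / (verb_count * role_count).
    """
    return math.log2(joint_count * total / (verb_count * role_count))


# Hypothetical counts over 1,000 described situations (invented for illustration).
TOTAL = 1000
VERB = 100  # situations of the type denoted by the verb of interest

# A location role accompanies every situation, so it is not specific to the verb.
location_pmi = pmi(joint_count=100, verb_count=VERB, role_count=1000, total=TOTAL)

# An instrument role occurs in 160 situations overall, 80 of them with this verb.
instrument_pmi = pmi(joint_count=80, verb_count=VERB, role_count=160, total=TOTAL)

print(f"location:   {location_pmi:.2f} bits")
print(f"instrument: {instrument_pmi:.2f} bits")
```

Under these invented counts the location role scores 0 bits (it co-occurs with the verb exactly as often as chance predicts), while the instrument role scores about 2.32 bits, which mirrors the contrast drawn in section 1 between roles that all situations share and roles that are specific to a restricted set of verbs.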