
A framework for computing extrasentential references

TOMEK STRZALKOWSKI AND NICK CERCONE Natural Language Group, Laboratory for Computer and Communications Research, School of Computing Science,

Simon Fraser University, Burnaby, B.C., Canada V5A 1S6 Received July 21, 1985

Accepted August 26, 1986

We are concerned with developing a computational method for selecting possible antecedents of referring expressions over sentence boundaries. Our stratified model, which uses a λ-categorial language for meaning representation, incorporates valuable features of Fregean-type semantics (à la Lewis, Montague, Partee, and others) along with features of situation semantics developed by Barwise and Perry. We consider a series of selected two-sentence stories which we use to illustrate referential interdependencies between sentences. We explain the conditions under which such dependencies arise, explain the conditions under which various translations can be performed, and formalize a set of rules which specify how to compute the reference. We restrict our discussion to two-sentence stories to avoid most of the problems inherent in where to look for the reference, that is, how to determine the proper antecedent. We restrict our considerations in this paper to situations where a reference, if it can be computed at all, has a unique antecedent. Thus we consider examples such as John wants to catch a fish. He (John) wants to eat it. and John interviewed a man. The man killed him (John). We then summarize the transformation which encompasses these rules and relate it to the stratified model. We discuss three aspects of this transformation that merit special attention from the computational viewpoint and summarize the contributions we have made. We also discuss the computational characteristics of the stratified model in general and present our ideas for a computer realization; there is no implementation of the stratified model at this time.

Key words: natural language processing, meaning representation, discourse analysis, anaphora.


Comput. Intell. 2, 159-179 (1986)

Introduction

By the time you have completed reading this sentence you will have understood its meaning. The pronominal reference its was resolved in the comprehension process. Not all sentences contain such "simple" examples of pronominal reference. Disambiguating forward versus backward references remains a thorny problem, as illustrated in the following contrasting pair of sentences:

John is at home. If he is not drunk Peter will take me there.
John is at home. If he is not drunk Peter will be surprised.

The correct solution to the forward reference problem is properly a matter of pragmatics, that is, indexing into the "correct" possible world. Although crucial for comprehension, this is a different problem than the one that we would like to consider in this paper. We would like to discuss how to get the reference, that is, how to select a possible antecedent for a referring expression, rather than focussing on where to look for the reference, that is, how to determine the proper antecedent actually meant by the speaker, when resolving multisentence references. The latter problem is investigated in Strzalkowski (1986c).

The founders of modern logic dealt with such reference problems. Frege wrote about them in 1892 in On Sense and Reference (Geach and Black 1960). Russell first considered these problems in his paper On Denoting (Russell 1905). Although these works were well known, it wasn't until the 1940s that Frege's work was studied in greater detail (Quine 1953) and special attention was paid to Russell's theory of descriptions (Smullyan 1948).

These philosophers (and numerous others) established the theoretical framework for studying meaning and meaning representations for natural language. Their pioneering efforts were substantiated by the further development of logic, especially modern research into intensional logic (Cresswell 1970, 1973; Montague 1974; Partee 1976; Dowty 1976).

The continued development of intensional logics, and the concern with problems of their interpretation and computational tractability, has had a curious effect. Artificial intelligence (AI)


researchers have shied away from intensional logic for lack of a feasible computational paradigm, although they have borrowed key notions from the theory shamelessly in an effort to imbue their representations with more general expressive power and formal interpretability. Other AI researchers have devoted effort to developing intermediate notational systems which fall somewhere between first-order logic and intensional logic, for example, Moore (1981), Schubert and Pelletier (1982), and Nash-Webber and Reiter (1977). Nevertheless, these writers accommodated few selected features of Montague's system.

Alternatives to Fregean-style semantics are currently under study and have been labeled situation semantics, vide Partee (1972) and Barwise and Perry (1983). The ongoing formalization of this approach, albeit intuitive, has not yet addressed computational issues in detail.

Although we are at a preliminary stage with this research, we believe the results we present are nonetheless promising and will prove significant. We start by introducing our model for processing natural language, the stratified model, and the different kinds of representations this model employs. Within this framework we introduce the basic use of a λ-categorial language A for meaning representation, a fuller account of which is given in Strzalkowski (1986c), which possesses adequate expressive power to represent discourse meaning at the appropriate level in the stratified model. We provide rule 1 (the basic translation rule) to translate expressions from the categorial language L (utilized in earlier stages of the stratified model) into the language A for representing intersentential dependencies in discourse. We then consider a series of selected two-sentence stories which we use to illustrate referential interdependencies between sentences. We explain the conditions under which such dependencies arise and formalize a set of rules (rules 2-10) which specify how to compute the reference. The discussion preceding each rule explains the conditions under which various translations can be performed, for example, imperfect contexts, attitude report contexts, referring to a name, conditional contexts, etc., and specifies a rule for carrying out the appropriate translation. At this time we restrict the discussion to two-sentence stories to avoid most of the problems inherent in where to look for the reference, that is, how to select the proper antecedent. We restrict our considerations in this paper to situations in which a reference, if it can be computed at all, has a unique antecedent. We then summarize the transformation F_{n-1}, which encompasses rules 2-10 among others, in more general terms to relate it more concretely to the stratified model. We discuss the computational characteristics of the stratified model and present our ideas for a computer realization of it; there is no implementation of the stratified model at this time. We conclude with a discussion of three aspects of the transformation F_{n-1} that merit special attention from a computational viewpoint and summarize the contributions we have made.

2. Processing natural language

We present a theory of stratified meaning representation as a framework for investigating problems of natural language understanding which integrates various levels of language processing with methods of modelling the language denotational base (the "universe"). The theory is formulated from the perspective of the hearer in some hypothetical discourse situation as he attempts to decode the meaning of a message delivered by the speaker. The hearer has the stratified model at his disposal, augmented with his individual knowledge base; other individuals are assumed to maintain such knowledge bases as well. This knowledge base contains appropriately encoded information about what its owner knows, believes, doubts, imagines, etc. about the universe. The actual organization of the knowledge base is not a concern at this point, beyond the fact that its content should be conveniently accessible to its owner. The stratified model provides tools for manipulating information incoming to or outgoing from the knowledge base. Information enters (and leaves) the model through numerous receptors (or ports) which, for now, we leave unspecified. Since we are interested in the linguistic type of data, we shall assume that, at a port, information has the form of discourse in some source language SL. Perhaps other types of data are processed in a similar fashion, but we ignore these beyond their final appearance in the knowledge base.
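The architecture just described — an individually owned knowledge base fed through ports that receive discourse in a source language SL — can be sketched in a few lines. This is purely illustrative: the paper leaves the organization of the knowledge base and the nature of the ports unspecified, and every class, method, and stand-in transformation below is our own invention.

```python
# Illustrative sketch only: the knowledge base's organization and the
# ports are left open in the text, so all names here are invented.

class KnowledgeBase:
    """Stores what its owner knows/believes about the universe."""
    def __init__(self):
        self.facts = []          # encoded information; format left open

    def assimilate(self, representation):
        self.facts.append(representation)

class Port:
    """Entry point: information arrives as discourse in a source language SL."""
    def __init__(self, kb, transformations):
        self.kb = kb
        self.transformations = transformations   # stand-ins for the F_i stages

    def receive(self, discourse):
        rep = discourse
        for f in self.transformations:           # lexical, syntactic, ... stages
            rep = f(rep)
        self.kb.assimilate(rep)
        return rep

kb = KnowledgeBase()
port = Port(kb, [str.lower, str.split])          # toy stand-in transformations
port.receive("John interviewed a man")
print(kb.facts)   # [['john', 'interviewed', 'a', 'man']]
```

The point of the sketch is only the data flow: discourse enters at a port, passes through a chain of transformations, and the result is assimilated into the owner's knowledge base.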

The discourse entering a port undergoes numerous transformations before it can be assimilated into the knowledge base. These transformations embody the full range of language processing from lexical and syntactic analysis to advanced semantic and pragmatic evaluation. The resulting representation should accurately reflect the meaning of the original speaker's communication as perceived by the hearer. This simplified account of natural language understanding can become significantly more complicated if we allow for conversational discourses where more than one party makes a contribution.

The stratified model employs several different kinds of representation. We use the categorial language L for representing syntactically preprocessed utterances, and the λ-categorial language A for representing intersentential dependencies in discourse. The final representation of discourse content has not been decided upon yet, but we advocate a form resembling that of an abstract situation (see Barwise and Perry (1983)).

The stratified model provides semantics for verifying acquired information with respect to the universe. This information is mapped into the universe (interpreted), and the resulting classification is stored in the knowledge base for future verification, whenever possible. The mapping into the universe (interpretation) is not a straightforward process. To perform this mapping efficiently an individual has to properly grasp the structure and organization of the universe. We assume that an intelligent individual is capable of deriving appropriate models of the universe, and that he directs the semantic mapping into these models rather than into the original universe.

It has often been assumed that there exists some universal semantics that can classify all linguistic (and other) information depending only on the actual state of the universe (see, for example, Lewis (1976)). Unfortunately, this kind of semantic classification is not always available to an individual because the individual's perception of the universe is subjective, and this fact is further reflected by the contents of his knowledge base. In this sense, different individuals "run" different instances of the stratified model, primarily with respect to the different universes they recognize, but possibly with respect to other factors as well. The stratified model we construct may not belong to anyone in particular, and we make no claims about whether or not it has any psychological significance. We shall assume, however, that our model is owned hypothetically by some individual who patiently participates in the various discourse situations we analyze in this research. The primary role of the model is to provide insight into the way an intelligent individual may come to understand natural language utterances, and to prepare for a computer implementation of this process.

We do not construct a complete stratified model in this paper. We concentrate on selected problems of discourse analysis and investigate a number of questions concerning discourse cohesion and derivation of a formal representation of discourse content. In the area of text cohesion we propose a general scheme for computing intersentential dependencies created by pronominals and definite descriptions. We discuss the phenomena of backward and forward references as well as direct and indirect references, and show how these can be accommodated within the general framework of the stratified model.

3. The stratified model

Any meaning representation must mediate between a universe (or a world) which is to be described and a language which is used to express (or represent) whatever one knows, observes, foresees, imagines, remembers, disagrees, believes, etc. about that universe. The language is equipped with syntax and semantics rich enough to be able to precisely represent everything we want to represent. In mathematical analysis, for example, we use a carefully designed symbolic language to communicate about functions, sets, and their properties and behavior. In everyday life, we speak our natural (human) language. Both situations can be included in the model outlined above, yet there is a significant difference between them. This difference lies in the "distance" between a meaning representation language and the universe in this bipolar model, depicted in Fig. 1. In the case of the symbolic language of mathematical analysis this distance is rather small, in the sense that no intermediate-level representation is necessary to map the language into the universe. Human languages, however, evolved over centuries and they are so complicated and sophisticated that they require some nontrivial decoding process in order to uncover the mapping between a language and "reality." We can therefore revise our bipolar model as illustrated in Fig. 2. We can now select some artificial, relatively small universe, for example, a blocks world, or an extensional database, and devise a language which directly manipulates that universe. Some translation program then maps a limited subset of natural language into this data manipulation language. Problems emerge soon after one attempts to enrich the structure of the universe to make things behave more like the real world. A more complex language is needed to properly express what is going on in the universe.
Subsequently, the language decoding process becomes considerably more complicated, and before long our translation program segments into a number of not necessarily sequential subtasks, each of which presents a separate research and programming challenge. As the gap between the ultimate meaning representation level and the universe widens, we can no longer directly manipulate the universe with the meaning representation language. Most of the early AI systems fell into this general framework.
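The small-universe strategy described above can be made concrete with a toy sketch: a blocks-world universe held as plain data, a minimal data manipulation language over it, and a pattern-based translation from a tiny fragment of English. The predicates and patterns are illustrative assumptions, not anything proposed in the paper.

```python
# Toy illustration of the strategy the text describes: a small artificial
# universe plus a direct manipulation language, with NL mapped onto it.

universe = {("on", "a", "b"), ("on", "b", "table"), ("color", "a", "red")}

def holds(pred, *args):
    """The 'data manipulation language': direct queries against the universe."""
    return (pred, *args) in universe

def translate(sentence):
    """Map a limited NL subset into the manipulation language (pattern-based)."""
    words = sentence.lower().rstrip("?").split()
    if words[:2] == ["is", "block"] and words[3] == "on":   # "Is block a on b?"
        return ("on", words[2], words[4])
    raise ValueError("outside the covered fragment")

query = translate("Is block a on b?")
print(holds(*query))   # True
```

Note how quickly the approach hits the wall the text predicts: any sentence outside the hand-coded patterns, or any enrichment of the universe, immediately demands a richer language and a more complex decoding process.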

The denotational base for human languages is so extremely complicated that the design of a full-size mapping from the decoding level into the universe is an enormous task.[1] Although we may not comprehend the universe directly, we can maintain an encoded image of it so that the mapping could be directed into it. Our model of meaning representation, therefore, has to be

[1] We assume that the decoding level preserves the source language's expressive power.

FIG. 1. The bipolar model of meaning representation.

modified once again. In Fig. 3 we illustrate two intermediate levels of source language approximation (SLA) and universe approximation (UA).[2] The idea of approximating reality is almost as old as science itself. When the universe did not fully cooperate we often created simplified models to instantiate one observation or another. If we chose our model wisely, we could discover some actual laws. Although these models gave perhaps only a stylized image of the universe, they were easier to manipulate and speculate upon than the original world. If we wanted our discoveries to have any practical significance, we had to provide a method to map the model back into the original universe. In general, the more accurate a model, the more credible our findings. For example, Newtonian mechanics once appeared to be a precise approximation of the real world until relativistic theory showed how simplified a model it was. Nevertheless, the Newtonian model can quite appropriately be mapped into a well-constrained part of the universe.

Analogously for language translation, we may find it easier to think of the universe approximation as a step-by-step process. We can build an n-degree approximation which does not necessarily have to be constructed sequentially. Let F: SL → SLA and G: UA → U be the language decoding transformation and the universe encoding transformation, respectively. If F and G can be defined, then the original problem of discovering the mapping between SL and U will have been reduced (probably) to a much easier mapping between SLA and UA. There are two major factors influencing the relative difficulty of the mapping we want to reduce. One is the inherent ambiguity of the source language. The other derives from the lack of a fragmentation of the universe sufficient for us always to be able to select a piece of it which exactly corresponds to a particular language utterance. We have no guarantee that the one-level approximation depicted in Fig. 3 will suffice. In general, one can imagine a sequence of subsequent (but not necessarily sequential) reductions F_1, ..., F_n and G_n, ..., G_1 such that F_i: SLA_{i-1} → SLA_i and G_i: UA_i → UA_{i-1},[3] and such that the bipolar model (SLA_n, UA_n) requires no further reductions. Ultimately we obtain a stratified mapping from SL into U, as shown schematically in [1],

where M is a mapping in the bipolar model.

The model presented above is quite general and idealized to some degree; we shall be more specific about it later. The structure presented in [1] (slightly simplified for expository reasons) can be considered as ever-busy machinery, possessed by some intelligent individual, which passes information through and back between SL and U. This information, accumulated at intermediate levels of the structure, will provide the necessary context, knowledge, beliefs, etc. for interpreting information in the further flow. In this sense the transformations on both sides can be steadily enriched. An alternative is to relegate knowledge and beliefs to some external knowledge base, thus making the model user-independent. In such a case, the structure in [1] is incomplete because, to make it work, one has to bring in some individual knowledge base and plug that knowledge base into a proper place in the model. Only by doing so can we guarantee that the transformations will get the necessary support. When talking of the transformations F_i and G_i, we shall always assume that an adequate knowledge base is being provided.

[2] The process of deriving the universe approximation is of course from U to UA, even if in Fig. 3 all arrows lead in one direction. In fact these arcs can be traversed in both directions, and the one we chose here stresses our main interest in language processing problems.

[3] It should be relatively clear that the transformations F_i (and G_i) are not functions in general; therefore the notation F_i ⊆ SLA_{i-1} × SLA_i may seem more appropriate at times.

[Fig. 2: source (natural) language SL → F → decoding level → M → universe U]
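If, for illustration, we treat each stage as a computable function (the text notes in footnote [3] that the F_i need not be functions in general), the stratified mapping of [1] amounts to function composition. The following sketch uses toy stand-in stages of our own devising, not anything from the paper:

```python
from functools import reduce

# Sketch of the stratified mapping in [1]: decoding stages F_1..F_n carry an
# SL expression down to SLA_n, M maps it into UA_n, and encoding stages
# relate that back toward the original universe U. All stages here are toy
# stand-ins; the paper fixes neither their number nor their content.

def compose(*fs):
    """Apply stages left-to-right in stage order: compose(F1, F2) = F2 o F1."""
    return lambda x: reduce(lambda acc, f: f(acc), fs, x)

F1 = str.lower                      # e.g., a lexical normalization stage
F2 = str.split                      # e.g., a tokenization/parsing stand-in
decode = compose(F1, F2)            # F_2 o F_1: SL -> SLA_n

M = tuple                           # toy bipolar mapping SLA_n -> UA_n
G1 = set                            # toy encoding stage toward U

stratified = compose(decode, M, G1)
print(stratified("John is at HOME"))   # the token set, e.g. {'john', 'is', 'at', 'home'}
```

The design point is the one the text makes: once each reduction is individually computable, the intractable end-to-end mapping between SL and U is replaced by a chain of much easier mappings.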

A question which arises quite naturally is how many strata (and therefore, how many transformations) there are or, perhaps, what the minimal number of strata is for a natural language system. It is not easy to state any concrete numbers; different stages of language processing require different amounts and quality of extralingual information for interpreting language expressions. Too much external knowledge, especially when applied at too early a stage, can be as undesirable as a lack of knowledge. Recall the difficulties faced by the "semantic grammar" approach to natural language understanding (Hendrix et al. 1978).

We concentrate on the left side of the stratified model in this paper, that is, the part to the left of the mapping M in [1]. More discussion concerning problems of universe modelling on the right side of the model can be found in Strzalkowski (1986c). There are many ways to model the left side. Specific instantiations may vary; however, from current AI practice one can expect something like the following. Some early F_i transformations (for i = 1, 2, 3, ...) would be concerned with phonology, lexical analysis, and syntactic parsing. Transformations closer to the centre of the model (i = ..., n-2, n-1, n) would be primarily devoted to semantic and pragmatic issues like anaphora resolution, intersentential dependencies, and general discourse problems. We do not insist that this segmentation has to be maintained. In fact, the transformations F_i are closely related to the changes in our perception of the universe which are instantiated as subsequent universe models on the right side. These changes are reflected in deriving appropriate (that is, conforming to the corresponding view of the universe) representations of utterances by the transformations on the left side of the stratified model. Consequently, we can impose no limits on the quality or quantity of information (linguistic and otherwise) that is to be used by an F_i, as long as the desired result is produced. Some degree of sequentiality of processing will probably be desirable, but the distinctions between different transformations may be defined along dimensions other than the traditional ones.

[Fig. 3: SL → F → decoding level (SLA) → M → encoding level (UA) → G → U]

A meaning representation language at any level SLA_i will be determined by the transformation F_i leading to it. The better understood a transformation, the more we can say about the various details of the meaning representation it builds. Recent work in AI, linguistics, and philosophy provides some details on what to expect at different SLA_i levels; for example, see Quine (1960), Cresswell (1973), Lewis (1976), Thomason (1976), Webber (1979), and Grosz and Sidner (1985).

We assume that at any level SLA_i all expressions of the meaning representation language of this level are unambiguous in terms of the universe UA_i corresponding to SLA_i in the stratified model. In other words, if α_i ∈ SLA_i then there is at most one β_i ∈ UA_i such that M_i(α_i) = β_i, where M_i is the semantic mapping available at the stratum (SLA_i, UA_i). This condition extends over every stratum for 0 ≤ i ≤ n. The English sentence (or an utterance of it) Every man loves a woman is unambiguous in the real world, although it describes more than one different "situation."[4] However, when we try to represent the sentence in first-order logic (at some level SLA_i) we obtain at least two different translations of it, each corresponding to a different "situation" at some UA_i. Before we can determine, if ever, which reading is meant by the speaker, we face the problem of an ambiguous statement. Therefore, ambiguity is in fact introduced by transformations. This fact also strongly suggests that the pair of coindexed transformations F_i and G_i are often performed simultaneously.
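The two first-order translations alluded to for Every man loves a woman can be written out explicitly; these are the standard quantifier-scope renderings, supplied here for illustration (the paper does not display them):

```latex
% Wide-scope universal: possibly a different woman for each man.
\forall x\,\bigl(\mathrm{man}(x) \rightarrow \exists y\,(\mathrm{woman}(y) \wedge \mathrm{loves}(x,y))\bigr)

% Wide-scope existential: one particular woman loved by every man.
\exists y\,\bigl(\mathrm{woman}(y) \wedge \forall x\,(\mathrm{man}(x) \rightarrow \mathrm{loves}(x,y))\bigr)
```

Each formula corresponds to a different "situation" at some UA_i, which is exactly how the translation step introduces the ambiguity the text describes.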

Let us now examine the structure of a stratum (SLA_i, UA_i), 0 ≤ i ≤ n. We have assumed that the language of the level SLA_i, the ith degree transformation of the source language SL, communicates about UA_i, the ith degree representation of the original universe U. In other words, there exists a mapping M_i such that M_i(SLA_i) = UA_i. Except for the ultimate mapping M_n = M in the stratified model, the relation M_i is too complex to be computed directly. It is quite common in current AI practice that a premature rendering of M_i is attempted for implementation. In effect we normally describe some submapping M'_i ⊆ M_i such that

[2] SLAi 3 SLA; +% UA:. C UAi

with the problems not covered by Mi given, at best, some ad hoc solutions. Nonetheless these methods resulted in a number of satisfactory, though limited utility, A1 systems, for example,

⁴We place the word “situation” in quotes here to avoid a reference to the theory of situations.

STRZALKOWSKI AND CERCONE 163

LUNAR (Woods et al. 1972), PLANES (Waltz et al. 1976), MARGIE (Riesbeck 1975), etc. Our ultimate goal can be roughly verbalized now: to find that ultimate stratum (SLAn, UAn) so that Mn could be entirely computed.

It would be most desirable, by the time we reach level SLAn when translating some expression a⁵ from SL = SLA0, that we get a single representation an of a at SLAn (not a set of possibilities), if one can be produced at all. That is,

[3] Fn ∘ Fn−1 ∘ … ∘ F1(a) = an ∈ SLAn

where Fi ∘ Fj is a compound transformation which combines the effects of the constituent transformations. Theoretically, the left side of [3] can be interpreted as a transitive closure. The level SLAn would therefore be the one at which every expression of SL gets a single, unambiguous representation. We must remember that the notion of ambiguity is relative to both the SLAn's language and the UAn structure. At one possible SLAn we may obtain a set An of an's as the translation of a, but at another, the same set will be considered as an atomic expression. This consideration introduces an element of subjectivity into the translation process which cannot be entirely avoided. In a sense the original problem of finding the mapping between SL and U satisfied [3]. Language, taken at face value, is unambiguous, but an utterance can rarely be taken at face value. When we utter anything we certainly mean something by the utterance, and if the hearer expects this, as he most probably does, our utterance is potentially ambiguous for him. Both speaker and addressee actually compute some steps in the stratified model to get the intended and presumed meaning of the utterance, respectively, by reaching, hopefully, the same ultimate stratum (SLAn, UAn).

4. Using a λ-categorial language for meaning representation

We assume that we have an arbitrarily selected English subset FMT ⊂ SL. Let L be that part of the language at level SLAn−2 in the stratified model where Fn−2 ∘ … ∘ F1(FMT) = L. We concentrate on the translation of some selected example expressions, sentences, and paragraphs of L into a λ-categorial language Λ defined at level SLAn−1. We assume that the reader is familiar with a categorial grammar of a simple fragment of English, such as FMT (Montague 1974). The language L may differ considerably from FMT since L is the product of n − 2 transformations already performed over FMT. Although the transformations from F1 to Fn−3 are neglected in this presentation, the transformation Fn−2, identified here with the categorial grammar CAT, provides appropriately “parsed” expressions, sentences, and paragraphs of FMT. It is not necessary that Fn−2 be a categorial grammar; perhaps some other syntactic system would be more suitable in practice. Nonetheless the simplicity and elegance of CAT make this grammar most suitable for this presentation. The transformation Fn−1 that we construct operates on the level SLAn−2 to which L belongs.

We sketch the definition of the language Λ as required for this presentation. Λ possesses adequate expressive power to represent discourse meaning at level SLAn−1. We also speculate that Λ promotes computational efficiency.

Definition 1 (lexicon)

The well-formed expressions of Λ are built of symbols which fall into the following six classes:

(a) VAR of variables: t, u, x, y, z (individual variables), P, Q, R, …, C1, C2, … (predicate variables);
(b) CON of constants: a, b, c, …, A, B, D, …;
(c) PAR of parentheses: ( );
(d) LAM of the lambda-abstractor: λ;
(e) LOP of logical operators: &, ∨, ⊃, ≡, ¬;
(f) QUA of quantifier symbols: ∃, ∀.

⁵Remember that a does not have to be any commonly understood syntactic unit and may stand for an entire discourse.

The nature of the sets VAR and CON is not uniform. In fact a structure of types has been superimposed over the language so that every element of VAR and CON is assigned to some type. Let α be such a type; then by VARα and CONα we understand the set of variables of type α and the set of constants of type α, respectively.

Definition 2 (types)

The set of types of expressions of Λ is the smallest set TYPES such that

(a) t, e ∈ TYPES;
(b) for any δ, σ ∈ TYPES, σ/δ ∈ TYPES. □
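Definition 2's recursion is easy to mirror in code. The following Python sketch (ours, not from the paper) encodes a basic type as the string "t" or "e" and a derived type σ/δ as the pair (σ, δ), and checks membership in TYPES:

```python
# A sketch (ours) of Definition 2: basic types are the strings "t" and "e";
# a derived type sigma/delta is encoded as the pair (sigma, delta).

def is_type(ty):
    """True iff ty encodes an element of TYPES."""
    if ty in ("t", "e"):                               # clause (a)
        return True
    if isinstance(ty, tuple) and len(ty) == 2:         # clause (b): sigma/delta
        sigma, delta = ty
        return is_type(sigma) and is_type(delta)
    return False

def show(ty):
    """Render a type with full parenthesization."""
    return ty if isinstance(ty, str) else "(" + show(ty[0]) + "/" + show(ty[1]) + ")"

# The type TV = (t/e)/(t/(t/e)) used later in Example 1:
TV = (("t", "e"), ("t", ("t", "e")))
print(is_type(TV), show(TV))   # True ((t/e)/(t/(t/e)))
```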

Here t and e are the basic types, which are not of the form σ/δ for any σ, δ ∈ TYPES. With this concept of type we can define the notion of well-formedness of expressions in Λ.

Definition 3 (syntax)

Let α, β ∈ TYPES. The set of well-formed expressions of type α is the smallest set WFEα such that

(a) if x ∈ VARα then x ∈ WFEα;
(b) if a ∈ CONα then a ∈ WFEα;
(c) if E ∈ WFEα and x ∈ VARβ then λxE ∈ WFEα/β;
(d) if E1 ∈ WFEα/β and E2 ∈ WFEβ then (E1 E2) ∈ WFEα;
(e) if E1, E2 ∈ WFEα then (E1 = E2) ∈ WFEt;
(f) if E1, E2 ∈ WFEt and u ∈ VAR then ¬E1, (E1 & E2), (E1 ∨ E2), (E1 ⊃ E2), (E1 ≡ E2), ∀uE1, ∃uE1 ∈ WFEt. □

Let Ln−2 and Ln−1 be fully developed languages defined at levels SLAn−2 and SLAn−1, respectively. Our first effort is to describe the transformation Fn−1 such that Fn−1(Ln−2) = Ln−1, or more precisely, to formulate a collection of rules Rn−1 = {R(n−1)1, R(n−1)2, …}, Rn−1 ⊂ Fn−1, such that Rn−1(L) = Λ. By saying that the collection of rules Rn−1 is a subset of the transformation Fn−1 it is understood that the sum of the domains of the rules within this set does not exhaust the domain of the transformation. Note that the transformation Fn−1, and any other transformation Fi, 1 ≤ i ≤ n, is not a function in the set-theoretic sense. In general, an application of a transformation to a language expression may result in more than one different translation. Rule 1 summarizes most of the translation rules suggested by Montague (1974), Partee (1976), and Dowty (1976), with Λ used in place of intensional logic (IL). We assume the straightforward correspondence between categories of L and types of Λ, that is, if α is a basic category in L, then Fn−1(α) = α is a type in Λ; if α is a derived category σ/δ, σ//δ, …, then Fn−1(α) = σ/δ is a type in Λ. The reader is reminded that B(α) and E(α) stand for the set of basic expressions of category α within L and the set of all expressions of category α within L, respectively (Montague 1974).

Rule 1 (the basic translation rule)

(i) Let α be any category in L, different than T = t/(t/e). If a ∈ B(α) then Fn−1(a) ∈ CONFn−1(α).

(ii) If a ∈ B(T) then Fn−1(a) = (λP(P a′)), where a′ ∈ CONe and P ∈ VARt/e.

164 COMPUT. INTELL. VOL. 2 . 1986

(iii) For any categories α, β of L, if σ is any of the following: α/β, α//β, …, and if E1 ∈ E(σ) and E2 ∈ E(β), then Fn−1(⟨E1, E2⟩) = E1E2 ∈ WFEFn−1(α), where ⟨…⟩ is the syntactic operation of L.

(iv) If E ∈ E(t//e) then

Fn−1(a/an E) = (λQ(∃x((Fn−1(E))x) & (Q x)))
Fn−1(every E) = (λQ(∀x((Fn−1(E))x) ⊃ (Q x)))
Fn−1(the E) = (λQ(∃x((Fn−1(E))x) & (C x) & (∀y(((Fn−1(E))y) & (C y)) ⊃ (x = y)) & (Q x)))

where x, y ∈ VARe, Q ∈ VARt/e, and C is a context, C ∈ WFEt/e. □
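Over a finite domain, the three determiner translations of rule 1 (iv) can be tried out directly. Here is a small model-theoretic sketch (ours, with invented individuals), treating a translated common noun as a membership test and each determiner as the generalized quantifier λQ(…) from the rule:

```python
# A model-theoretic sketch (ours) of rule 1 (iv) over a finite domain.
# A translated common noun E is a predicate; each determiner returns the
# quantifier lambda-Q from the rule, with quantification over DOMAIN.

DOMAIN = {"b1", "b2", "b3"}

def a_an(E):
    # (lambda Q (exists x ((E x) & (Q x))))
    return lambda Q: any(E(x) and Q(x) for x in DOMAIN)

def every(E):
    # (lambda Q (forall x ((E x) implies (Q x))))
    return lambda Q: all((not E(x)) or Q(x) for x in DOMAIN)

def the(E, C):
    # (lambda Q (exists x ((E x) & (C x) &
    #   (forall y (((E y) & (C y)) implies x = y)) & (Q x))))
    return lambda Q: any(
        E(x) and C(x)
        and all(not (E(y) and C(y)) or y == x for y in DOMAIN)
        and Q(x)
        for x in DOMAIN)

man  = lambda x: x in {"b1", "b2"}       # two men in this model
seen = lambda x: x == "b1"               # a context predicate C

print(every(man)(lambda x: True))                 # True: trivially satisfied
print(the(man, seen)(lambda x: x == "b1"))        # True: unique man in context
print(the(man, lambda x: True)(lambda x: True))   # False: context leaves two men
```

The last line shows why the context predicate C matters: without a restrictive context, the uniqueness conjunct of the `the` translation fails whenever the noun has more than one instance.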

Example 1

Suppose we want to translate the L sentence

(1) Bill interviewed every applicant.

into Λ. Suppose also the following translations hold:⁶

(i) interview → i
(ii) applicant → a
(iii) Bill → (λP(P B))

The constants i, a, and B above belong to types TV = (t/e)/(t/(t/e)), t/e, and e, respectively. Applying steps (iii) and (iv) of rule 1, we obtain (based on a correct syntactic analysis in CAT)

(iv) (1) → (∀x(a x) ⊃ (i B x)). □
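As a sanity check (ours, with an invented extension for each constant), the derived formula of Example 1 can be evaluated directly in a tiny model:

```python
# Evaluate (forall x ((a x) implies (i B x))) from Example 1 in a toy model.
# The individuals and the extension of `interviewed` are invented for the test.

DOMAIN = {"Bill", "Ann", "Carl"}
applicant = {"Ann", "Carl"}                        # extension of a
interviewed = {("Bill", "Ann"), ("Bill", "Carl")}  # extension of i

def formula_1():
    # (forall x ((a x) implies (i B x)))
    return all((x not in applicant) or (("Bill", x) in interviewed)
               for x in DOMAIN)

print(formula_1())   # True: Bill interviewed every applicant in this model
```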

Example 2

Suppose we have the following translations:

(i) man → m
(ii) woman → w
(iii) loves → l

Here, again, the constants m, w, and l are, respectively, of types t/e, t/e, and TV. The two possible grammatical analyses of the sentence

(2) Every man loves a woman.

lead to two different translations called the weak and strong readings of (2), respectively. These are

(iv) (2) → (∀x(m x) ⊃ (∃y(w y) & (l x y)))
(v) (2) → (∃y(w y) & (∀x(m x) ⊃ (l x y))). □

We now focus on selected examples of two-sentence

“stories” and try to discover and formalize referential interdependencies between them and the conditions in which such dependencies arise. Restricting the discussion to two-sentence “stories” avoids, at this stage, most of the problems of where to look for the reference; thus we concentrate entirely on the question of how to get the reference. A good example of the former is given below in the form of a three-sentence dialogue (in a restaurant, to a waiter):

speaker-A: I’ll have Pepsi.
speaker-B: I’ll have nothing.
speaker-C: I’ll have the same.
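Returning for a moment to Example 2: the weak and strong readings come apart in any model where every man loves some woman but no single woman is loved by all men. A quick check (ours, with an invented model):

```python
# The weak reading (iv) and strong reading (v) of (2), evaluated in a model
# (ours) built so that the two readings differ in truth value.

men   = {"m1", "m2"}
women = {"w1", "w2"}
loves = {("m1", "w1"), ("m2", "w2")}   # each man loves a different woman

# (iv) weak:   (forall x ((m x) implies (exists y ((w y) & (l x y)))))
weak = all(any((x, y) in loves for y in women) for x in men)

# (v) strong:  (exists y ((w y) & (forall x ((m x) implies (l x y)))))
strong = any(all((x, y) in loves for x in men) for y in women)

print(weak, strong)   # True False
```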

Although we are aware of this kind of problem, for now we consider only situations where a reference, if it can be made at all, has a unique antecedent. The fact that we restrict our discussion to two-sentential paragraphs should not be taken literally, especially when interpreting the translation rules (from rule 2 onward). Most rules will be written using two hypothetical “sentences” S1 and S2, where the latter contains a reference element to an object mentioned in the former. The sentence S1 will normally be considered to establish a context for making that reference. In fact it is not necessary that any single such sentence exists. We assume only that at the time S2 is expressed there is enough context information accumulated by different means so that the reference can properly be made, that is, the referent is uniquely identified. In our idealized model all this is reduced to the situations described by the sentence S1 or its elements, but in general S1 need be neither a single sentence nor a collection of sentences and may contain information acquired from other sources (observation, knowledge base, beliefs, etc.).

⁶We shall use the symbol → as the translation operation symbol in place of rule 1's Fn−1.

5. The the determiner

The traditional Russellian translation for the the determiner, which can be found in numerous works, for example, Montague (1974), Lewis (1976), and Partee (1976), tends to be

If α is a common noun which translates to α′ and F(α) denotes the α, then F(α) translates into λP[∃y(∀x[α′(x) ≡ x = y]) & P(y)].

When mapped into our Λ-notation the formula above becomes

the α → (λP(∃y(∀x(α′ x) ≡ (x = y)) & (P y)))

As explained by Partee (1976),

[the king denotes] the set of all properties such that there is a unique entity which is a king, and he has those properties.

According to the type structure of L we can obtain the translation of the the determiner from the above as

(THE1) the → (λQ(λP(∃y(∀x(Q x) ≡ (x = y)) & (P y))))

We show that translation THE1 fails when a context wider than a single sentence is used.

Example 3

Let us consider the following “story”:

(3a) John interviewed a man.
(3b) The man killed him.

(that is, him refers to John; we shall assume this henceforth). According to the traditional translation of the by formula THE1, (3b) translates into

(3b) → (∃y(∀x(m x) ≡ (x = y)) & (k y J))

where killed → k and man → m. This translation is questionable. Notice that it is not the fact of being a man that makes the entity unique. Observe also that even a context-less sentence

The queen is wealthy.

cannot be properly understood without a clear reference to the queen in question, that is,

The queen such that P(the queen) is wealthy.

where P belongs to {I know something of her, I can see her, …}.⁷

⁷There are sentences closely resembling the above which do not require restrictive context-setting situations because the use of the the phrase is generic in them, as in The tiger lives in the jungle. Refer, however, to Vendler (1971) for further discussion.


Thus it appears that the use of a definite description requires some context-setting situation C where its reference could be validated. The appropriate the translation could be modified to

(THE2) the → (λP(λC(λQ(∃x(P x) & (∀y((P y) & (C y)) ≡ (x = y)) & (Q x)))))

Using formula THE2, the translations of sentences (3a) and (3b) become

(3a) → (∃x(m x) & (i J x))
(3b) → (∃x(m x) & (∀y((m y) & (i J y)) ≡ (x = y)) & (k x J))
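The contrast between THE1 and THE2 is easy to exhibit in a toy model (ours, with invented individuals): with two men in the domain, THE1's uniqueness clause fails outright, while THE2, whose uniqueness is relativized to the context derived from (3a), still picks out the interviewed man:

```python
# THE1 vs THE2 (a sketch, ours).  With two men, THE1's "exactly one man"
# clause is false; THE2 restricts uniqueness to the context "man John
# interviewed" and succeeds.

DOMAIN = {"John", "man1", "man2"}
man = {"man1", "man2"}
interviewed_by_john = {"man1"}

def the1(noun, Q):
    # (exists y ((forall x ((noun x) iff x = y)) & (Q y)))
    return any(all((x in noun) == (x == y) for x in DOMAIN) and Q(y)
               for y in DOMAIN)

def the2(noun, C, Q):
    # (exists x ((noun x) & (forall y (((noun y) & (C y)) iff x = y)) & (Q x)))
    return any((x in noun)
               and all(((y in noun) and C(y)) == (x == y) for y in DOMAIN)
               and Q(x)
               for x in DOMAIN)

killed_john = lambda x: x == "man1"
C = lambda x: x in interviewed_by_john   # context derived from (3a)

print(the1(man, killed_john))        # False: two men, no unique man
print(the2(man, C, killed_john))     # True
```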

At first glance, the formula THE2 seems overstated. One can observe that a definite description may be used in a more or less “definite” manner so that, at one extreme, we do not need any external context to refer unambiguously to an object. Subsequently, the description

The man in a grey coat I saw yesterday in the library.

still seems amenable to proper translation by the formula THE1. Let us use, following Barwise and Perry (1983), the term described situation for the situation referred to by the utterance itself. We see that in the example above the described situation is contingently the same as the context-setting situation. Therefore we can consider formula THE1 as just a special case of the more general formula THE2. Consider, however, another controversial example.

Example 4

Suppose we have three different “stories” created by pairs of sentences (4a, 4b), (4a, 4c), and (4a, 4d).

(4a) John interviewed a man.
(4b) The bastard killed him.
(4c) The employee killed him.
(4d) The woman killed him.

(We assume that him refers to John, even if it needs not be the case in general.) Even a cursory glance at these sentences should convince us that the facts (P x) and (C y) in the THE2 translation formula cannot be specified arbitrarily, and actually their mutual compatibility has a significant influence on the anaphora referent-taking process. The the-referenced object must belong to a class at least as broad as the intended referent in the context. So in (4b) the bastard can be identified with a man, while the same process for the employee in (4c) is less obvious. In (4d) it is apparent that the woman and a man from (4a) operate in distinct contexts. This is not to say that the woman of (4d) cannot be taken as referring to the man described in (4a). If that was the case, the fragment would assert of some individual that it is both a man and a woman. Depending on a particular interpretation of these predicates, and certain other circumstances which we do not discuss here, such a reading may be discarded as inconsistent. These considerations lead to the final the translation THE3.

(THE3) the → (λP(λC(λQ(∃x(P x) & (C x) & (∀y((P y) & (C y)) ⊃ (x = y)) & (Q x)))))

Here the facts (P x), (Q x), and (C x) give characteristics of the referenced object. The part under the universal quantifier emphasizes the uniqueness of the x under the context C. The biconditional in THE2 can be dropped now, as we explicitly assert P and C of x. Observe that the literal (P y) in the part under the universal quantifier is often insignificant for fixing a unique reference for the object. Examine, for example, the story consisting of (4a) and (4b). The same cannot be said, however, of the instance (P x) outside the scope of ∀. In this case we acquire some additional knowledge about the already selected individual. In the rest of this discussion we shall often drop the literal (P y) whenever it does not lead to an ambiguous situation.

The next question to ask is how we can uniformly establish the context C. The following two cases are clearly perceptible.

(S1) The context is unknown, as in The queen (I can see her) is very wealthy. We must employ pragmatics to resolve this.
(S2) The context is known from a previous statement with a determiner. Having (∃x(P x) & (Q x)) as the context-setting sentence, we get the context for x as (λx(P x) & (Q x)).

We are ready now to present the first formal rule, the perfect context translation rule (rule 2), for translating two-sentence paragraphs.

Rule 2 (perfect context translation rule)

An object u referenced by the has been mentioned previously in a de re context; thus its existence is presupposed. Let S1(u) be the context sentence that mentions u. Let S2(u) be the sentence in question. We have

(i) S1(a P) → (∃u(P u) & (F u))

(ii) S2(the P1) → (∃u(C u) & (∀x((P1 x) & (C x)) ⊃ (x = u)) & (P1 u) & (F1 u))

The context C is derived from S1 as (λu(P u) & (F u)). □

Example 3 presented earlier in this section can now be modified as shown below.

Example 3 (modified)

Let (3a) be as given before, that is,

(3a) → (∃x(m x) & (i J x))

We derive the context C from (3a)

(λx(m x) & (i J x))

The incremental translation of (3b) follows.

the man → (λC(λQ(∃x(m x) & (C x) & (∀y((m y) & (C y)) ⊃ (x = y)) & (Q x))))

→ (λQ(∃x(m x) & ((λx1(m x1) & (i J x1)) x) & (∀y((m y) & ((λx1(m x1) & (i J x1)) y)) ⊃ (x = y)) & (Q x)))

the man (whom John interviewed) →
(λQ(∃x(m x) & (i J x) & (∀y((m y) & (i J y)) ⊃ (x = y)) & (Q x)))

the man (whom …) killed John →
(∃x(m x) & (i J x) & (∀y((m y) & (i J y)) ⊃ (x = y)) & (k x J))

where killed John → (λv(k v J)) as before.
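The incremental translation above is ordinary β-reduction: the THE3 term is applied first to the derived context and then to the property killed John, and the redexes are contracted. A minimal symbolic reducer (ours; it uses naive substitution, so the demo keeps bound and free names distinct to avoid variable capture) shows the mechanics:

```python
# Terms: ("var", name) covers variables and constants, ("lam", x, body) is a
# lambda-abstraction, ("app", f, a) is application.  beta() normalizes by
# contracting redexes; substitution is naive, so names must be kept distinct.

def subst(term, name, value):
    kind = term[0]
    if kind == "var":
        return value if term[1] == name else term
    if kind == "lam":
        return term if term[1] == name else ("lam", term[1], subst(term[2], name, value))
    return ("app", subst(term[1], name, value), subst(term[2], name, value))

def beta(term):
    kind = term[0]
    if kind == "var":
        return term
    if kind == "lam":
        return ("lam", term[1], beta(term[2]))
    f, a = beta(term[1]), beta(term[2])
    if f[0] == "lam":
        return beta(subst(f[2], f[1], a))
    return ("app", f, a)

def show(t):
    if t[0] == "var":
        return t[1]
    if t[0] == "lam":
        return "(λ" + t[1] + " " + show(t[2]) + ")"
    return "(" + show(t[1]) + " " + show(t[2]) + ")"

# Bill's translation (λP (P B)) applied to a predicate w reduces to (w B):
bill = ("lam", "P", ("app", ("var", "P"), ("var", "B")))
print(show(beta(("app", bill, ("var", "w")))))        # (w B)

# A (λC (λQ ...))-shaped term applied to a context and then a property,
# mimicking the two application steps of the incremental translation above:
the_shape = ("lam", "C", ("lam", "Q", ("app", ("var", "C"), ("var", "Q"))))
step = ("app", ("app", the_shape, ("var", "ctx")), ("var", "kJ"))
print(show(beta(step)))                               # (ctx kJ)
```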

Observe that if the literals P and P1 in the formula THE3 are identical, as will often happen in practice, we can drop P1 to make the translation shorter. Note, however, that if the woman was used in place of the man in (3b), the latter sentence would translate as (if we insisted that the woman and the man coreferred)

(3b) → (∃x(m x) & (w x) & (i J x) & (k x J) & (∀y((m y) & (w y) & (i J y)) ⊃ (x = y)))

This translation can be discarded right away when we add an interpretation to our language that precludes an individual from being both a man and a woman at the same time. In a practical implementation this feature will constitute an important criterion for selecting proper referents. It must be noted, however, that there are circumstances in which being a man and a woman is not contradictory, even without a change to the interpretation. These include the use of nonsingular descriptions such as generic or functional, which are discussed elsewhere (Strzalkowski 1986b,c).

The perfect context translation rule explains some aspects of translating two-sentence “stories,” but as we shall see it accounts for only the most straightforward referential situations which do not involve either imperfect or attitude report constructions.

6. Imperfect contexts

We distinguish two very general classes of verbs found in natural language sentences. These are imperfect verbs like seek, want to α, go, build, imagine, …, and perfect verbs like find, come, have, have written, have imagined, …. An informal definition of imperfect verbs follows.

Definition

A verb will be called imperfect if the immediate effects of the action or state described by this verb last at most as long as the action or state itself does, and its results on the surrounding world cannot be determined before the action or state is committed. □

The characteristics of imperfect verbs can be summarized as follows:

(a) They have no permanent influence on the situation surrounding an utterance: want to marry, must have, ….
(b) They can be committed, however, when turned into a perfect form: have married, have, ….
(c) They (but not only they) can create nonreferential translations of sentences.
(d) An imperfect verb v can be decomposed into an imperfect operator v̂ and the perfect form v̄. Thus sêek = try, sēek = find.⁸
(e) Complement-taking imperfect verbs act as imperfectness operators on the complement and its main verb, creating compound imperfect verbs. A special consideration will be given to the verbs want and must.

The imperfectness of imperfect verbs may be further stressed by contrasting them with the attitude report verbs like see that, imagine that, or believe that, which can also create nonreferential readings of sentences. Observe that we must use an imperfect verb in (5b), like must, to maintain a possible nonreferential reading of this sentence in the context of (5a).

(5a) John wants to marry a unicorn.
(5b) The unicorn must have a pink tail.

⁸In fact, the particle “to” is not a part of the imperfect operator such as “try to” or “want to”, etc. In contrast with Montague's (1974) classification of these verbs in the category (t/e)/(t/e), we need imperfect operators in the category (t/e)/t, with the particle “to” being a part of the complement phrase.

This is not, however, the case in (6b) as read in the context of (6a).

(6a) John imagines that a unicorn lives in the park.
(6b) The unicorn has a pink tail.

This is because imagines that, as a perfect attitude report verb, creates an abstract situation, that of John's image, which survives the utterance of (6a) and can be subsequently referred to directly by (6b). Imperfect verbs do not possess this property. We discuss attitude report contexts in the next section.

Table 1 presents some examples of imperfect verbs and their perfect counterparts. Observe how the context in which an imperfect verb is used influences the perfecting operation. Notice that an imperfect verb can be applied to a complement with another imperfect verb as in

John wants to seek a queen.

thus raising the imperfectness level. Two perfecting operations must be performed on the last example to get the perfect form of John has found a queen.

Example 5

Consider the following “story”:

(7a) John wants to marry a queen.
(7b) The queen must be wealthy.

As the reader has perhaps already observed, the existence of the queen in (7b) is not necessarily presupposed, as a consequence of the possible de dicto reading of (7a). Notice also that (7b) would have a completely different meaning when considered without the context supplied by (7a). We can paraphrase (7a) and (7b) as (in a possible reading)

The queen John would eventually marry, if any, must be wealthy,

or

John wants to marry a wealthy queen.

Examining example 5 and other similar “stories” we can differentiate two reference situations. The situation where the context-setting sentence has its referential reading (that is, there exists a particular queen John wants to marry) is correctly represented by the perfect context translation rule. This rule, however, cannot be used when both sentences have their nonreferential readings. To account for nonreferential readings in imperfect contexts we formulate a new rule called the imperfect context translation rule (rule 3).

Rule 3 (imperfect context translation rule)

An object u referenced by the has been recently mentioned in a de dicto environment, that is, its existence is not assumed. Let S1 and S2 be defined as in rule 2. Then

(i) S1(a P) → (imp(∃u(P u) & (F u)))⁹

where imp is the imperfect operator such that it is derived from the imperfect verb of S1.

(ii) S2(the P1) → (imp1(∃u(C u) & (∀x((P1 x) & (C x)) ⊃ (x = u)) & (P1 u) & (F1 u)))

where imp1 is the imperfect operator of S2, and the context C is derived from S1 as (λu(P u) & (F u)). □

⁹Here imp is a higher-order operator which stands for the imperfect operator, classified into category (t/e)/t, and other sentence elements which are not under its scope, if applicable. In the imperfect translation of (7a) we have “John wants” as imp.

TABLE 1. Imperfect verbs and their perfect forms

imperfect       perfect form    possible imperfect operator
seek            find            try (to)
go              come            try (to)
go to α         ᾱ               go (to)
want to α       ᾱ               want (to)
wish to α       ᾱ               wish (to)
be building     have built      try (to)

If a sentence with an imperfect verb has a referential reading in the form

(∃x(P′ x) & (F′ x))

then the nonreferential reading featured in rule 3 is obtained by realizing that imp = P′ and F = F′. To support the formulation of rule 3, recall sentences (7a) and (7b) from example 5. Notice that using the imperfect verb must (or will, with the same effect) in (7b), we extend the “imperfectness” of (7a), and at the same time the de dicto reading, onto (7b). Notice further that if we used a perfect verb in (7b) as in

(7c) The queen is wealthy.

we would resolve the de dicto/de re ambiguity and both sentences would have their de re readings. Thus the presence of an imperfect construction in (7b) is essential for preserving (7a)'s de dicto reading and for extending it over (7b). It should also be clear that rule 3 requires that the object referenced in the current utterance has nonreferential status in an imperfect context. In other words, the passage

(7a′) John married a queen.
(7b) The queen must be wealthy.

has only a referential translation, with must be wealthy considered as a perfect construction.¹⁰ The formal de dicto translation of (7b) in the context of (7a) can be sketched as follows.

step 1

Select a reference context from (7a)'s translation, that is, from (w J(∃x(q x) & (m J x))) we select (λx(q x) & (m J x)).

step 2

The base translation of (7b) is (according to (ii))

(λC(must(∃u(C u) & (∀x((q x) & (C x)) ⊃ (x = u)) & (wealthy u) & (q u))))

step 3

Apply the λ-expression obtained in step 2 to that selected in step 1. After a simple refinement we get the final translation of (7b) as

(7b) → (must(∃y(q y) & (m J y) & (wealthy y) & (∀x((q x) & (m J x)) ⊃ (x = y))))

Nevertheless, Partee (1972) argues that there exists a class of contextual situations closely resembling that of example 5 where no nonreferential reading is possible. According to Partee,

¹⁰The reader is reminded that we consider only singular interpretations of nominal descriptions in this paper.

if the underlined phrases in the following paragraphs are to co-refer, they must also have referential interpretations.

(8) John wants to marry a queen. Bill wants to marry the queen too.

(9) John is looking for a pen. Bill is looking for it too.

If (8) and (9) were indeed counterexamples, rule 3 would have to be rejected or at least reformulated in some narrower sense. But this is not the case. Both examples are somewhat unfortunate because they describe identical attitudes toward an individual which originate from different sources, and this is further stressed by the use of the word too. That is why the nonreferential readings are so deeply hidden from our intuition. They exist, however, and are quite legitimate. If we consequently use rule 3 then we can interpret the second sentences of (8) and (9), respectively, as

Bill wants to marry the queen John marries (if any) too (no matter who she is) (because, say, Bill always tries to do what John is doing).

Bill is looking for the pen John (eventually) finds (if any) (no matter what it is).

Observe that if we used the pen John is looking for as the antecedent for it in (9) we would have the referential reading all right; the pen's existence would be presupposed by the first sentence. However, when the latter sentence is used nonreferentially, the pen could materialize only when John eventually finds it; refer also to Montague (1974) and Heim (1982) for similar considerations. Heim is more concerned with truth-conditional aspects of meaning representation, but many of her observations are relevant to our research. Finally, compare (8) and (9) with (10) below, where the nonreferential reading for it as referring to the nonreferential use of a unicorn is clearly perceptible.

(10) John is looking for a unicorn. Bill wishes to see it (the unicorn John finds) (because he is curious how a unicorn could look).

We wish to avoid any confusion when interpreting the terms “referential use” or “nonreferential use” as applied to some language expressions. By referential use of a (definite) description we mean the use where the speaker intentionally assumes or believes that there exists something which fits that description. In this sense the definite description in

(11) The present king of France is bald.

is interpreted referentially if we share such a belief, or just lack information to the contrary. The question whether the description was used to point to some particular individual or only attributively (Donnellan 1971) is not relevant at this stage and this is not the distinction we are making here. Later we will see that a part of what Donnellan (1971) called attributive use of a definite description is just a special case of nonreferential (or rather semireferential) use in attitude report contexts. Other classifications in the use of definite descriptions, for example, inner attributive use, functional use (Barwise and Perry 1983), or generic use, belong to a different dimension than singular descriptions, and therefore must be treated differently (see Strzalkowski (1986)).

Alternatively, when we use the term nonreferential use we mean that the speaker intentionally does not refer to anything at all. That is, he knows or believes of nothing the description he


uses is pointing to, or even if he does believe it refers to something, the existence of such a referent is not relevant to what he is saying. Thus

(12) John wants to marry a queen. The queen must be wealthy.

with the queen used nonreferentially, the speaker does not assume that anything being a queen actually exists. But unlike Donnellan's attributive use, when we failed to refer to anything, thus causing our statement to be neither true nor false (see Donnellan (1971)), here (12) may be true or false of John (as part of his personal characteristic, for example).

7. Attitude report contexts

When a description is used consequently nonreferentially, the context-setting situation changes so significantly that we need a separate rule to account for these cases. In example 5 from Sect. 6, when a queen was used referentially we would further describe her as the queen John wants to marry, while when interpreting the “story” nonreferentially we could only speak of the queen John (eventually) marries (if any). This difference has been properly accommodated by rules 2 and 3.

We turn now to another class of verbs which can also create nonreferential readings. These are attitude report verbs (Barwise and Perry 1983) like imagine that, see that, believe that, etc. We restrict our attention to the perfect contexts, in particular to those that do not involve any imperfect operators. The attitude report verbs used in such situations will be called perfect attitude report verbs. Although we are giving just one example with the attitude report verb imagine that, the discussion below applies to other perfect attitude report verbs as well.

Example 6

Let us analyze the following “story” in detail.

(13a) John imagines that a unicorn lives in the park.
(13b) The unicorn has a pink tail.

Basically we can differentiate three reference situations between (13a) and (13b). As before we do not consider the trivial case where a unicorn of (13a) and the unicorn of (13b) do not co-refer. Assume first that we have the following translations as given:

unicorn → u
imagines that → imt
to have a pink tail → hpt
to live in the park → lp

Case 1. Suppose a unicorn in (13a) is used referentially, that is, its existence is presupposed. We obtain (we consider here just one possible referential reading)

(13a) → (∃x(u x) & (imt J(lp x)))
(13b) → (∃x(u x) & (C x) & (∀y((u y) & (C y)) ⊃ (x = y)) & (hpt x))

where the context C is (λx(u x) & (imt J(lp x))) according to the perfect context translation rule (rule 2).

Case 2. Suppose to the contrary that a unicorn in (13a) has been used nonreferentially. That is

(13a) → (imt J(∃x(u x) & (lp x)))

We have now two options for translating (13b). In one case the unicorn used in (13b) refers to some particular individual the speaker of (13b) knows but perhaps John does not. In this case

(13b) → (∃x(u x) & (C x) & (∀y((u y) & (C y)) ⊃ (x = y)) & (hpt x))

where C = (λx(u x) & (imt J(lp x))) as before.

This translation correctly emphasizes that, from the point of view of the speaker of (13b), a unicorn in (13a) has been used referentially. Therefore both sentences get their referential translations, and case 2 reduces to case 1. Observe that case 2 is that of misunderstanding the intention of the speaker, but it is how the hearer interprets the discourse at the moment. This situation may later be corrected to restore the original speaker's meaning if the assumed interpretation of discourse leads to an incoherent representation.

Case 3. Let us assume that the speaker of (13b) has the possibility of glancing into John's image of the unicorn and sees that it has a pink tail there. The speaker of (13b) is therefore extending our information of what John imagines to

(13c) John imagines that a unicorn that lives in the park has a pink tail.

Here, we should expect (13b) to translate as

(13b) → (imt J(∃x(u x) & (C x) & (∀y((u y) & (C y)) ⊃ (x = y)) & (hpt x)))

where C = (λx(imt J((u x) & (lp x)))), and (13c) can be read as

John imagines that the unicorn he imagines to live in the park has a pink tail.

The speaker of (13b) uses the unicorn semireferentially, taking the image of the unicorn as the context-setting situation. It does not mean that the speaker of (13b) takes the image of the unicorn as the referent. The image is the only thing he knows about this unicorn, but he addresses the unicorn itself, even if the latter does not exist. Moreover, the speaker of (13b) believes that the context image uniquely determines the unicorn, however nonexistent, which enables him to use the definite description. He does not need any imperfect operator in his utterance. This is because the abstract situation created by an utterance of (13a) persists for some (short) period of time after the utterance took place. This situation involves an implicit export of an attitude report operator from (13a) into (13b). That is, (13b) should be understood now as

John imagines that the unicorn has a pink tail

but it is not necessary that the actants in (13a) and (13b) must co-refer. The attitude report verb imagine that, although perfect (according to the definition given in Sect. 6), has the ability to create nonreferential readings which, once created, behave quite differently than an "ordinary" perfect verb (like those reported in Sect. 5). One may wonder whether other attitude report verbs would behave in the same manner, especially "epistemic" attitudes such as know that or see that. It has been widely assumed that such attitudes are always referential (see, for example, Barwise and Perry (1983)). It is important to avoid a confusion here. When a has the form of (∃x(P x) & (Q x)), then know that(a) does not say anything about the existence of x until we employ a general principle known as the veridicality principle

In short, the veridicality principle says that if att is an attitude report operator and a is a proposition, then att(a) implies a. The principle does not hold in the general case, but it is said to be true for epistemic attitudes (Barwise and Perry 1983).

STRZALKOWSKI AND CERCONE 169

and state that know that(a) ⊃ a. Veridicality of some attitude report constructions does not change things a bit. If the principle applies in certain cases, it will help to simplify some expressions and reduce the degree of ambiguity produced by the transformation. But before it is used, both translations (referential and nonreferential) are legitimate and have to be considered different.

This discovery calls for another translation rule, the attitude report context translation rule (rule 4).

Rule 4 (attitude report context translation rule)
An object u referenced by the P1 has been recently mentioned in a de dicto environment with an attitude report verb att. Let S1, S2 be defined as in rule 2. Then the only nonreferential reading of S2 with respect to u can be obtained as

(i) S1(a P) → (att(∃u(P u) & (F u)))
(ii) S2(the P1) → (att1(∃x(P1 x) & (C x) & (∀y((P1 y) & (C y)) ⊃ (x = y)) & (F1 x)))

The context C is derived from S1 as (λu(att((P u) & (F u)))), and att1 is the implicit attitude report operator imported from S1. □

Thus far it appears that case 3 presents the only situation when a nonreferential reading can be exported from an attitude report context. Montague (1974) and Partee (1972) seem to agree with this observation. Otherwise we would always find ourselves in case 2 unless, perhaps, the sentence S2 contains an imperfect construction. Rule 4 can be easily generalized over all cases where att1 is an explicit attitude report construction in S2 as in

John believes that a unicorn resembles Mary. He imagines that the animal has a single horn.
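Rule 4 composes mechanically: draw the context C from S1, then wrap the uniqueness schema in the imported operator att1. A rough sketch follows (our own illustration, not the authors' implementation; `attitude_context` and `rule4` are hypothetical names, and attitude operators are represented as plain strings):

```python
# Sketch of rule 4 (ours). Predicate symbols are strings; the context C is a
# function from a variable name to a formula string.

def attitude_context(att, P, F):
    # C = (λu(att((P u) & (F u)))), drawn from S1
    return lambda u: f"({att}(({P} {u}) & ({F} {u})))"

def rule4(att, P, F, att1, P1, F1):
    """Nonreferential reading of S2(the P1) in the attitude context of S1."""
    C = attitude_context(att, P, F)
    return (f"({att1}(∃x({P1} x) & {C('x')} & "
            f"(∀y(({P1} y) & {C('y')}) ⊃ (x = y)) & ({F1} x)))")

# Example 6, case 3: S1 = (13a), S2 = (13b), att = att1 = "imt J"
print(rule4("imt J", "u", "lp", "imt J", "u", "hpt"))
```

With these arguments the output reproduces the case 3 translation of (13b), the whole description wrapped under imt J.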

So stated, rule 4 can also account for the situations where S2 contains an imperfect construction imp1 in place of att1. The nonreferential reading of the following paragraph where the animal is co-referred with a unicorn illustrates this point.

John believes that a unicorn resembles Mary. But the animal must have a horn.

An important outcome of rule 4 in its original formulation is that it can explain a part of what Donnellan (1971) called the attributive use and Barwise and Perry (1983) called the value-free use of definite descriptions. Suppose someone says:

(14) The man drinking the martini is a fool.

in the sense that the definite description the man drinking the martini is used attributively (Donnellan 1971). For a speaker to utter (14) is to explore an implicit context-setting situation in which there is a man drinking the martini, if the definite description used in (14) is to be singular, not generic, as we assume here. Therefore, to say (14) is to make the reference in the following context.

(14a) I believe that there is a man drinking the martini.
(14b) The man drinking the martini is a fool.

The context-setting sentence (14a) is implicit here. A side effect of this assumption is that one cannot use a pronoun in place of a definite description when saying (14b). In general any attitude report verb att can be used in (14a) and then imported to (14b) according to rule 4. Three other examples illustrate this point further.

Here, again, att is a higher-order constant. See the note following rule 3.

(15a) I imagine he has a wife. (implicit)
(15b) His wife is the cook. (explicit)
(16a) I think she has a husband. (implicit)
(16b) Her husband is kind to her. (explicit)
(17a) I believe there is a book on my antique table. (implicit)
(17b) Take the book off my antique table! (explicit)

We observe that, in part, the attributive use of singular definite descriptions is just a special case of the nonreferential use in attitude report context. The latter does not account for all of Donnellan's attributive uses, in particular for what Barwise and Perry named the inner attributive use. These problems are discussed more fully in Strzalkowski (1986c). Among them, the problem of equivalence between a certain other class of attributive statements, such as The man who kills somebody is a murderer, and a conditional form, such as If a man kills somebody he is a murderer, is investigated.

8. Pronominal references
In a sense, a definite pronoun can be regarded as the most concise form of a definite anaphora. Although its referring capabilities are significantly narrowed as compared to a definite description, we can formulate a set of translation rules for definite pronouns in discourse that closely parallel rules 2, 3, and 4.

Example 7
Consider the following "story."

(18a) John wants1 to catch a fish.
(18b) He0 wants2 to eat it1.

(Subscripts with the verb want are for identification purposes only.) The translation of pronouns has been suggested by Montague (1974) as

hen → (λP(P xn))

where xn is a free nonlinguistic variable from the category e of names. We adopt this representation here. Observe that according to this solution the variable xn stands for a name (unknown to us) of some individual we refer to, consciously or not, by the use of a pronoun. We do not consider here cases where the pronoun it may be used in place of other sentence constituents than names or individual definite descriptions. Suppose further that we have

catch → c
eat → e
fish → f

Applying rule 1 we easily obtain the translation

(18b) → (w2 x0(e x0 x1))

which correctly represents the meaning of (18b) if we have no idea to whom or to what the pronouns he0 and it1 refer. We shall call such a representation context-less or literal. If, however, having (18a) as a context-setting sentence we decide that he0 has been used to refer to John, we quickly modify the translation of (18b) along the lines of the following derivation.

(18b) → [(λP(P J)), (λx(w2 x(e x x1)))] → ((λx(w2 x(e x x1))) J) → (w2 J(e J x1))
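This β-reduction can be mimicked directly with Python closures (our own illustration; formulas are built as strings, and ordinary function application plays the role of the bracketed [f, a] notation):

```python
# he0 resolved to John: apply the name term (λP(P J)) to the literal
# reading of (18b) abstracted over its subject, (λx(w2 x(e x x1))).

john = lambda P: P("J")                     # John → (λP(P J))
lit_18b = lambda x: f"(w2 {x}(e {x} x1))"   # (λx(w2 x(e x x1)))

result = john(lit_18b)
print(result)  # → (w2 J(e J x1))
```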

We feel this result is correct no matter what interpretation has been assigned to (18a), based on the assumption that names are rigid designators (Kripke 1972), that is, they persistently refer to the same objects independently of the situation in which they

170 COMPUT. INTELL. VOL. 2, 1986

are used. The same cannot be said, however, when we consider the reference situation between a fish in (18a) and it1 in (18b).

Suppose the speaker of (18a) used a fish referentially, thus creating a de re translation of his utterance. By using it in (18b) we may therefore refer to the fish John wants to catch, that is, in (18b) we claim that there exists a particular fish John wants to catch and this fish (not any other) is to be eaten by him. The reference for it1 will be therefore in the form:

(λQ(∃x(f x) & (w1 J(c J x)) & (∀y((f y) & (w1 J(c J y))) ⊃ (x = y)) & (Q x)))

which when applied to our last translation of (18b) (abstracted over x1) will yield

(18b) → (∃x(f x) & (w1 J(c J x)) & (∀y((f y) & (w1 J(c J y))) ⊃ (x = y)) & (w2 J(e J x)))

More generally, we can formulate the following rule, called the perfect pronominal context translation rule (rule 5).

Rule 5 (perfect pronominal context translation rule)
When a context-setting sentence S1 has a referential reading in the form

(i) S1(a P) → (∃x(P x) & (F x))

and the object x is pronominally referenced by hen in a sentence S2 with literal meaning represented by

(ii) S2(hen) → (F1 xn)

then the translation of S2 in context of S1 is derived as

(iii) S2(hen) → [C, (λx(F1 x))]

(iv) C = (λQ(∃x(P x) & (F x) & (∀y((P y) & (F y)) ⊃ (x = y)) & (Q x))) □

Observe that the rule applies also to the situations when hen refers to a name, although in such a case the context C will be set by the name itself.
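Rule 5 can be read as a recipe: build the λQ context of clause (iv) from S1's referential reading, then apply it to the abstracted literal meaning of S2. A sketch under our own naming (`rule5_context` is not from the paper; predicates are again functions from a variable name to a formula string):

```python
# Sketch of rule 5 (ours). P is a predicate symbol; F and Q are predicate
# functions from a variable name to a formula string.

def rule5_context(P, F):
    # (iv) C = (λQ(∃x(P x) & (F x) & (∀y((P y) & (F y)) ⊃ (x = y)) & (Q x)))
    return lambda Q: (f"(∃x({P} x) & {F('x')} & "
                      f"(∀y(({P} y) & {F('y')}) ⊃ (x = y)) & {Q('x')})")

# (18a) read referentially: P = f, F = (λx(w1 J(c J x)))
C = rule5_context("f", lambda v: f"(w1 J(c J {v}))")

# Apply C to the literal reading of (18b) abstracted over x1
s2 = C(lambda v: f"(w2 J(e J {v}))")
print(s2)
```

The printed result matches the contextual reading of (18b) derived above for the referential case.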

Suppose now that a fish in (18a) has been used nonreferentially. What it of (18b) may refer to now is the fish John catches (if any). That is, the catching will set the reference for it (Partee 1972). The context C would now be drawn as

(λQ(∃x(f x) & (c J x) & (∀y((f y) & (c J y)) ⊃ (x = y)) & (Q x)))

In this situation, the meaning of (18b) will be rather represented by

(18b) → (w2 J(∃x(f x) & (c J x) & (∀y((f y) & (c J y)) ⊃ (x = y)) & (e J x)))

Rule 6 summarizes this case formally.

Rule 6 (imperfect pronominal context translation rule)
If a context-setting sentence S1 with imperfect operator imp is used to utter its nonreferential reading, that is,

(i) S1(a P) → (imp(∃x(P x) & (F x)))

and the object x has been further referenced by a pronoun hen in a sentence S2 with literal meaning of

(ii) S2(hen) → (imp1(F1 xn))

then the translation of S2 in context of S1 is obtained as the result of the following derivation

(iii) S2(hen) → [(λΦ(imp1 Φ)), [C, (λx(F1 x))]]

where the context C is drawn from S1 as

(iv) C = (λQ(∃x(P x) & (F x) & (∀y((P y) & (F y)) ⊃ (x = y)) & (Q x))) □

It is possible to formulate further translation rules for other contextual situations in the spirit of rules 5 and 6. By simple analogy the following rule may be expected for pronominal attitude report contexts.

Rule 7 (attitude report pronominal context translation rule)
If a context-setting sentence S1 with an attitude report verb att is used to utter its nonreferential reading, that is,

(i) S1(a P) → (att(∃x(P x) & (F x)))

and the object x has been further referenced by a pronoun hen in a sentence S2 with the literal meaning of

(ii) S2(hen) → (att1(F1 xn))

then the translation of S2 in context of S1 is obtained as the result of the following derivation

(iii) S2(hen) → [(λΦ(att1 Φ)), [C, (λx(F1 x))]]

where the context C is drawn from S1 as

(iv) C = (λQ(∃x(att((P x) & (F x))) & (∀y(att((P y) & (F y))) ⊃ (x = y)) & (Q x)))

and the attitude report operator of S2 may be either imported or explicit. □

9. Referring to a name
Unlike the examples from Sects. 6, 7, and 8, the definite description is not used to focus our attention on a particular object we are talking about, but to extend the information about that object, which is unique because we already know its name. One aspect of this problem has already been discussed in Sect. 5 (recall example 2 from Sect. 5), and it influenced both the form of the formula THE3 and the rules of translation, especially rules 2, 3, and 4. Consider the following example.

Example 8
Suppose one hears the following "story."

(19a) Fatsy wants to catch a fish.
(19b) The cat wants to eat it0.

We shall concentrate here entirely on the possibility that the definite description the cat refers to the individual named Fatsy in context-setting sentence (19a). Translation rule 1 gives us the representation of (19a) as (concentrating on referential reading only)

(19a) → (∃x(f x) & (w F(c F x)))

The literal (de re) reading of (19b) is given below.

(19b) → (∃x(cat x) & (C x) & (∀y((cat y) & (C y)) ⊃ (x = y)) & (w x(e x x0)))

where it0 → (λP(P x0)), cat → cat, and Fatsy → (λP(P F)).

To obtain context C for the cat as referring to Fatsy we abstract the translation of (19a) over its subject, thus getting

C = (λs(∃x(f x) & (w s(c s x))))


We then supply C as an argument to the literal translation of (19b) above, obtaining an extended representation of (19b) as

(19b) → (∃x(cat x) & (∃u(f u) & (w x(c x u))) & (∀y((cat y) & (∃u(f u) & (w y(c y u)))) ⊃ (x = y)) & (w x(e x x0)))

Thus far we have that there exists a unique cat such that it wants to catch a fish and eat it0. We have merely applied rule 2 to one of the possible referents for the cat. (We could easily co-index the cat with a fish.) But we want to achieve more than merely saying that the actant of (19a) wants to eat it0. We actually learn that this individual is nothing other than Fatsy. The name, as a rigid designator, gives us an unambiguous context, independent of circumstances. We therefore obtain

(19b) → [(λP(P F)), (λx(cat x) & (∃u(f u) & (w x(c x u))) & (∀y((cat y) & (∃u(f u) & (w y(c y u)))) ⊃ (x = y)) & (w x(e x x0)))] →
(cat F) & (∃u(f u) & (w F(c F u))) & (∀y((cat y) & (∃u(f u) & (w y(c y u)))) ⊃ (y = F)) & (w F(e F x0))

The part under the universal quantifier reports that the only cat around that also wants to catch a fish is Fatsy. This is nontrivial information about the situation described which cannot be derived from the literal translation of (19b). The next significant extension of our state of knowledge is the information that Fatsy is a cat. Similar considerations may be given to pronominal references (as we have already mentioned in association with rule 5). The use of utterance attributes such as stress and intonation will often decide whether a name is just contingent knowledge we acquire about an object we refer to (we saw this situation in the last example), or whether a name is the ultimate referent, which situation will be discussed next. The approach we present allows for accommodating both cases. Let us consider a further example.

Example 9

(20a) Fatsy wants to catch a fish.
(20b) The cat belongs to John.

The literal translation of (20b) can be easily obtained with formula THE3 as

(20b) → (∃x(cat x) & (C x) & (∀y((cat y) & (C y)) ⊃ (x = y)) & (bt x J))

where belongs to → bt. Suppose that the cat in (20b) refers to Fatsy in (20a) as a particular individual known to the speaker of (20b) by name. (Let us call him B henceforth.) In this case we cannot take the situation described in (20a) as a context-setting situation. Rather, B refers to an individual whose uniqueness is beyond any doubt for himself. In other words, upon hearing the individual's name, B can fetch a unique identifying context from his knowledge base. Observe that B cannot use the cat should he be aware of more than one Fatsy at the instant his statement is uttered. B may, however, disambiguate his reference upon examining what the speaker of (20a) (call him A) has said. Assume that the use of the definite article the in (20b) is the reaction of B when he hears the name Fatsy which, at least for B, unambiguously refers to some individual B knows of. The definite description used by B is entirely drawn from the context-setting situation B refers to, and which is now hidden from us and perhaps the speaker A as well. This description conveys a piece of B's state of knowledge or belief about the individual in question, and may vary from scanty remarks such as

(20c) It belongs to John.

to much more informative remarks like

(20d) This awful animal belongs to John.

The fact that B exploits some external context-setting situation known to him becomes even clearer when B makes his reference mistakenly in the sense that A has had another individual in mind when uttering (20a). If B's utterance does not clash with A's knowledge base then the latter acquires some false information of FatsyA which is actually of FatsyB. What B means by his utterance of (20b) is that

The individual F I refer to by the cat is a cat and its name is Fatsy, and this information is sufficient for me to pick up a unique individual, that is, F. From my point of view the speaker A is talking of F too.

In other words the context-setting situation has the factual form of

C → (∀y((cat y) & (Fatsy y)) ⊃ (y = F))

The literal (Fatsy y), which may be read as being a Fatsy, creates the core of the context-setting situation the speaker B is referring to (Burge 1975; Barwise and Perry 1983). The other literal may or may not be relevant here depending on the way B reaches his unique referent (that is, whether B uses A's utterance for that or not). In general, therefore, the context C the speaker B uses in (20b) is expressed as

C = (λx(Fatsy x))

and the final translation of (20b) follows as

(20b) → (cat F) & (Fatsy F) & (∀y((cat y) & (Fatsy y)) ⊃ (y = F)) & (bt F J)

Note that the use of the literal (cat y) under the scope of the universal quantifier, although most probably significant for B, may be redundant, or even invalid if it turns out that F is actually an alligator. We have two possibilities in the latter case. Either the audience B is addressing becomes misinformed, which can happen when they exploit different context-setting situations, or the audience can still pick up F correctly if the context-setting situation they are looking at contains F and the clause

(∀y(Fatsy y) ⊃ (y = F))

This time B cannot mislead his audience on the basis of the logical truth of the formula

(φ ⊃ ψ) ⊃ ((φ & P) ⊃ ψ)

where P stands for B's incorrect belief.

The argument above accounts for the nature of the value-loaded use of definite descriptions (Barwise and Perry 1981) (or what Donnellan (1971) called referential) and can be applied in a similar fashion to other referential situations we have discussed.

We formalize this discussion with two new translation rules (rules 8 and 9). The rules account for the two distinct situations mentioned, which we have named contingent and ultimate references to a proper name.

Rule 8 (names as contingent referents)
An individual N, named N, mentioned in a context-setting sentence S1 of the form

(i) S1(N) → (F1 N)

is further referenced in a sentence S2 by a definite description the P referring to its action or description in S1 rather than to its name. If the literal translation of S2 is in the form

(ii) L = (∃x(P x) & (C x) & (∀y((P y) & (C y)) ⊃ (x = y)) & (F2 x))

then the translation of S2 in context of S1 is derived as

(iii) S2(the P) → [(λp(p N)), (λx[(λC L), (λs(F1 s))])] □

Rule 9 (names as ultimate referents)
An individual N, named N, mentioned by a sentence S1 of the form

(i) S1(N) → (F1 N)

is further referenced by the P in a sentence S2 on the basis of its name only. If S2 has the literal translation in the form

(ii) L = (∃x(P x) & (C x) & (∀y((P y) & (C y)) ⊃ (x = y)) & (F2 x))

then S2 translates in the context of S1 as

(iii) S2(the P) → [(λp(p N)), (λx[(λC L), (λs(N s))])]

where N is the predicative use of name N. □

10. Conditional contexts
In Sect. 7 on attitude report contexts we found that a part of the attributive use of some nominal expressions could be explained in terms of nonreferential attitude report readings. In these cases we assumed that the speaker was addressing an entity whose existence was relative to his attitude toward it (beliefs, imaginations, etc.). But this examined just one side of the coin. In the following example we list just a few sentences where an attributive reading cannot be explained by attitudes.

Example 10
Apparently, the following sentences can be interpreted nonreferentially without any reference to speaker attitudes.

(21) The cat that Mary buys is a Burmese. (22) The man who is drinking the martini is a fool. (23) A man who kills somebody is a murderer.

It should be relatively clear that sentences like (21), (22), and (23) above roughly fall under the following scheme.

(24) A/The a such that P(himk) F's

or, in other words,

(25) if a/the a P's then hek F's

Rewriting (25) more formally we obtain

(26) if (∃x(a x) & (P x)) then (F himk)

Clearly, (26) is just another way to express (24) if the latter is to be understood attributively. No such equivalence can be made when (24) is used referentially. The choice between the and a in (24) depends on speaker confidence as to the uniqueness of the entity established in the condition part of the sentence. Observe that in (26) the part between if and then constitutes our context-setting utterance S1, and the part past then is our S2. It is not necessary to assume a pronominal reference between S1 and S2, as it has been suggested in (26), although using a definite pronoun in S2 appears natural. In fact the general form of a conditional context utterance may be taken as

(27) if S1(a P) then S2(the P1)

Although sentences like those of example 10 translate into conditional context form, not all conditional context utterances can be expressed in terms similar to (21), (22), and (23) or the like. The importance of conditional structures in natural language has long been acknowledged (Webber 1979; Heim 1982). In Webber's (1979) thesis on discourse anaphora, she recognized both conditionals and attributives, but she did not see the equivalence between them.

Example 11
Consider the following pairs of sentences. If the first sentence in a pair is used nonreferentially, it has an equivalent conditional context reading expressed by the second sentence in the pair.

(28) The cat that Mary buys is a Burmese. If Mary buys a cat, it is a Burmese.

(29) The man who is drinking the martini is a fool. If a man is drinking the martini, he is a fool.

(30) The man who kills somebody is a murderer. If a man kills somebody, he is a murderer.

Thus the first sentence in (28) gets its nonreferential conditional context reading in the form:

(28a) (∃x(cat x) & (buys M x)) ⊃ (burmese xn)

Resolving pronominal reference of xn to x we obtain

(28b) (∃x(cat x) & (buys M x)) ⊃ (∃u(cat u) & (buys M u) & (∀y((cat y) & (buys M y)) ⊃ (y = u)) & (burmese u)) □

This latter translation requires some explanation as it is important not to confuse things at this point. The translation is not to represent the meaning of the conditional statement featured in the second sentence of (28) when taken alone. Obviously, the conditional statement does not contain the uniqueness implication that is present in (28b) (see, for example, Heim (1982)). This translation represents a singular, attributive reading of (28) in which the uniqueness implication is clear. What the first sentence in (28) says is that the definite noun phrase the cat that Mary buys has at most one antecedent, which may be either a particular cat or a concept of such a cat (which interpretation we spare here). If this sentence is used referentially then there is no problem with instantiating the referent, which must be exactly one. When the sentence is used attributively, however, we cannot claim that the antecedent of the definite description actually exists, and therefore we cannot instantiate our reference. We can do that if the reference proves successful, that is, under the condition that the antecedent can be instantiated. Thus the paraphrase of the attributive statement in (28) we are aiming at is If Mary buys a cat then there is only one cat she buys and this cat is a Burmese. Other conditional paraphrases in example 11 should be understood in this way, even if this may not appear the most natural reading, as in (30). Note that if the uniqueness implication could not be passed from an attributive statement such as (28) to its conditional paraphrase, the translation featured in (28b) would not be acceptable. This situation occurs if we interpret the first sentence in (28) as generic and the second sentence as addressing instances of a generic concept, here the cat that Mary buys. In such a case we can only go as far as (28a), and then seek some more elaborate translation rule to resolve the internal anaphora.

The last example has been chosen so that both sides of the conditional context reading for (28), (29), and (30), when taken alone, have a straightforward perfect referential reading. This


reading does not represent the general case, and other pronominal context translation rules may be useful for resolving pronominal references between the sides; consider only If John wants to marry a queen, she must be wealthy. As we will see, this reading cannot be obtained from the imperfect reading of The queen John wants to marry must be wealthy.

An interesting consequence of discovering the equivalence between purely attributive readings of some sentences and their conditional readings is that we can now explain translations of utterances previously considered generic (Hirst 1983). Consider, for example,

A tiger is more dangerous than a cat.

In a referential interpretation we talk of a particular tiger and some particular cat. In a conditional context reading with respect to a tiger, for example, we address some particular cat, but not a particular tiger. In such a case we utter that if there is a tiger, it is more dangerous than a (particular) cat. Still the conditional reading may be applied to both a tiger and a cat yielding something like if there are a tiger and a cat, the tiger is more dangerous than the cat. I think the reader should not have any problems with applying the principles suggested above to derive proper representations for each of these readings. However, the truly generic utterances cannot be verbalized this way. The following sentence

The president is elected every four years.

when used at some higher “naming” level (Strzalkowski 1986b,c) cannot be equated with

* If there is a president, he is elected every four years.

unless, of course, the latter is used in a generic sense too. We summarize these observations as a formal translation rule

(rule lo), using the notion of conditional contextpattern, which (roughly speaking) has the form of (ASI(AS2(SI 3 S,))) as expressed in A. Rule 10 (Conditional context translation rule)

translating to If a sentence S has a singular reading, referential over x ,

(i) S(the/a P)+ ( 3 x ( P x ) & ( U x ) & ( Q x ) )

and every referential singular reading equivalent to it assumes the same form, where V is an optional uniqueness clause possibly present when the is used in S, that is,

(ii) U = ( W C x ) & ( V y ( V y ) &C ( C Y ) ) 3 ( x = y ) ) ) where C is an external context that cannot be instantiated, then the conditional context nonreferential reading of S exists and is obtained as the result of the following derivation

(iii) S(the/a P) --+ [[L., Q], PI where L is the conditional context pattern in the form

L. = (ASl(hS2((3x(SI x ) ) 3 (Sz Xn) ) ) ) (iv)

and the pronominal reference of xn is resolved to the argument of SI by one of the pronominal context translation rules. 0
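The conditional context pattern L of rule 10 is itself just a curried abstraction, so it can be sketched directly (our own illustration; we write the unresolved pronoun placeholder as x_n where the text uses her_n):

```python
# (iv) L = (λS1(λS2((∃x(S1 x)) ⊃ (S2 xn)))) as a curried function (ours).
# S1 and S2 are functions from a variable name to a formula string.
L = lambda S1: lambda S2: f"((∃x{S1('x')}) ⊃ {S2('x_n')})"

q = lambda v: f"(q {v})"          # being a queen
wm = lambda v: f"(w J(m J {v}))"  # wants to marry
m = lambda v: f"(m J {v})"        # marries

ccper = L(q)(wm)                  # conditional context (perfect) reading
ccimp = f"(w J{L(q)(m)})"         # same pattern embedded under w J
print(ccper)  # → ((∃x(q x)) ⊃ (w J(m J x_n)))
print(ccimp)  # → (w J((∃x(q x)) ⊃ (m J x_n)))
```

The two printed formulas correspond to the (ccper) and (ccimp) readings of John wants to marry a queen discussed below.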

Condition (i) in the formulation of rule 10 seems to be quite restrictive regarding the form a sentence can take in order to qualify for a conditional context nonreferential reading. Observe that the literal P can be verbalized as

(λx(P1 x) such that (Q1 x))

as, for example, in a man who breaks the law. Yet the restrictive

wh-clause is not strictly necessary to maintain P in the requested form. If one utters

A tiger is dangerous

a nonreferential interpretation is still possible in conditional context with P = an entity such that it is a tiger, so that P translates as

P → (λx(E x) & (tiger x))

where E is the entity predicate which, as nonsignificant, may be omitted.

The question naturally arises: What kind of sentences do not qualify for conditional context readings? Obviously, an utterance that presupposes existence does not qualify. (But this very sentence does!) Compare

(31a) There is a unicorn with a pink tail which lives in the park.

(31b) A unicorn which has a pink tail lives in the park.

(32a) There is a man that killed somebody (and) who will be prosecuted.

(32b) A man who killed somebody will be prosecuted.

Both (31a) and (32a) have strictly referential readings, and no nonreferential reading is possible. Observe, however, that (31a) has an equivalent referential reading of the form

(31a) → (∃x(u-with-pink-tail-which-lp x))¹³

which violates the restriction that every equivalent referential reading must meet form (i) from rule 10. In contrast (31b) can be translated as

(31b) → (∃x(u x) & (hpt x) & (lp x))

with P = (λx(u x) & (hpt x)) and Q = (λx(lp x)). The next observation to make in conjunction with rule 10 is

that sentence S does not necessarily have to be a top-level clause. Again consider the sentence

(33) John wants to marry a queen.

The referential reading of sentence (33) assumes the form of

(33)(ref) → (∃x(q x) & (w J(m J x)))

Rule 10 provides us with a conditional context (perfect) reading as

(33)(ccper) → (∃x(q x)) ⊃ (w J(m J hern))

This translation reads: if there is a queen, John wants (to marry her). By rule 3 we can also derive the imperfect reading of (33).

(33)(imp) → (w J(∃x(q x) & (m J x)))

And applying rule 10 to the embedded clause under the scope of the existential quantification, we get another conditional context (imperfect) reading of (33).

(33)(ccimp) → (w J((∃x(q x)) ⊃ (m J hern)))

This latter reading may be verbalized as John wants (to marry the queen if there is a queen).

There is a subtle difference in meaning between (33)(ccper)

¹³One might overcome restriction (i) in rule 10 by inserting the nonsignificant literal "entity" in place of a missing P or Q component. That would lead to the conditional context reading roughly paraphrased as "if there is something, it Qs." This reading is, however, unlikely enough to be taken seriously.


and (33)(ccimp). The former states that unless a queen exists John's attitude toward her cannot be instantiated, that is, her existence causes John's desire to become in effect. In the latter interpretation, John's will or desire is independent of a queen's existence. Rather, her existence will cause John to marry her, relative of course to his will. More precisely, in (33)(ccimp) John wants to reach the state in which a queen's very existence will entail John's marrying her.

Similar considerations can be given to utterances with attitude report verbs. Thus for the sentence

(34) John believes that a unicorn lives in the park.

four nonequivalent translations may be derived:

(34)(ref) → (∃x (u x) & (b J (lp x)))
(34)(ccper) → (∃x (u x)) ⊃ (b J (lp x_1))
(34)(att) → (b J (∃x (u x) & (lp x)))
(34)(ccatt) → (b J ((∃x (u x)) ⊃ (lp x_1)))
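The four readings differ only in two independent choices: whether the existential quantifier stays outside or moves inside the belief operator, and whether the restrictor is attached by conjunction or by implication. The following sketch makes this explicit; the tuple encoding of terms and all function names are our own illustrative assumptions, not the paper's notation.

```python
# Enumerate the four nonequivalent readings of (34),
# "John believes that a unicorn lives in the park."
# Terms are nested tuples: ('exists', 'x', t) binds x in t,
# ('and', a, b) and ('implies', a, b) connect, and ('b', 'J', t)
# is the belief operator.  (Illustrative encoding only.)

def readings(restrictor, body, agent='J'):
    ex = ('exists', 'x', restrictor)                       # (∃x (u x))
    return {
        'ref':   ('exists', 'x', ('and', restrictor, ('b', agent, body))),
        'ccper': ('implies', ex, ('b', agent, body)),
        'att':   ('b', agent, ('exists', 'x', ('and', restrictor, body))),
        'ccatt': ('b', agent, ('implies', ex, body)),
    }

r = readings(('u', 'x'), ('lp', 'x'))   # restrictor (u x), body (lp x)
```

All four terms are pairwise distinct, which is the sense in which the translations are nonequivalent.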

11. Indirect references

By an indirect reference we mean the reference situation where the object referenced by a definite description does not explicitly occur in the context-setting sentence but its presence therein can be inferred. To be more precise we first introduce the concept of an individual knowledge base.

For every individual α and every moment of time t, there is a knowledge base KB^α_t such that it contains all facts about the universe α knows, believes, disagrees about, etc. at time t. Instead of the time instant t we may talk of some space-time location l, but its temporal expanse should be small enough to keep the knowledge base in a fairly static condition. Suppose then that at some location l, an individual α (the addressee in some discourse situation) is to interpret some utterance S2(y) with an anaphoric expression y to the preceding discourse. Let S1(x) be the current context-setting sentence, and let x be an object referred to by the description x in S1. Note that x does not have to correspond to any nominal construction in preceding discourse (noun phrase, relative clause, etc.) and may involve entire sentences or sets of sentences. Consider, for example, the following paragraph where the description x is spread over a set of consecutive sentences, resulting in the object x of six washed, cored, and cut-into-pieces apples.

(35) John bought six apples. He washed and cored them. Then he cut them into pieces.

Let further KB^α_t be α's knowledge base at the time S2(y) is uttered. If KB^α_t contains the fact x R y such that y is another object and R is a binary relation, and y' is some not necessarily explicit description of y such that it can be added to S1(x), thus creating S1'(x, y'), then y refers indirectly to x iff y refers directly to y'. Here is a typical example of indirect reference.

(36a) John bought a car. (x = the car John bought)
(36b) The engine works well.

[KB_John = {car(x) → ∃y engine(y) & has(x, y)};
y = x's engine; R = has;¹⁴
y' = the engine of the car John bought]
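The inference behind this example can be mimicked by a toy lookup that searches the context for an object whose sort the knowledge base relates to the definite description. The KB encoding, the sort labels, and the function name below are illustrative assumptions, not the paper's machinery.

```python
# Sketch of indirect reference resolution for fragment (36).
# KB records that an object of sort 'car' stands in the relation
# 'has' to an object of sort 'engine' (a crude stand-in for
# car(x) -> ∃y engine(y) & has(x, y)).

KB = {('car', 'engine'): 'has'}

def resolve_indirect(description, context_objects, kb):
    """Find x in the context such that kb relates x's sort
    to the sort named by the definite description."""
    for x_sort, x in context_objects:
        relation = kb.get((x_sort, description))
        if relation is not None:
            # y' = "the <description> of <x>": the inferred antecedent
            return f"the {description} of {x}", relation
    return None, None

antecedent, R = resolve_indirect(
    'engine', [('car', 'the car John bought')], KB)
# antecedent == 'the engine of the car John bought', R == 'has'
```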

For additional examples the reader is referred to Brown and Yule (1983). One must be careful when selecting the relation R which, as we saw, is all-important in indirect references. Certainly not every binary relation is appropriate. In the

¹⁴To be more precise one could verbalize R as "is powered by" or the like.

fragment (36), for example, the relation has-not would not create an indirect reference. In fact, we need R to be a function such that y = R(x). This function is determined by the context S1, the current discourse topic, and some other related phenomena. It should not be difficult to imagine that in discourse (36) the addressee may recall some other part of his KB, for example, that of

{∃y engine(y) & no-car-has(y)}

so that R = has-not would be appropriate. This attitude is relative to the development of discourse prior to the utterance (36a).

We do not fully investigate the problem of indirect references in this paper beyond the general statements given above. We observe that the problem of indirect references is a special case of the intersentential reference phenomenon that we have addressed in this paper. When we assume that the context-setting sentence S1 has already been evaluated against an individual's knowledge base before he attempts an interpretation of S2, then the general reference scheme of context-setting sentence / current sentence is maintained. The only difference between direct and indirect references lies in the way the context-setting sentence S1 is being produced.

12. Forward references

When we introduced our reference scheme earlier in this paper, we did not constrain in any way the mutual spatio-temporal relationship between the context-setting sentence S1 and the current utterance S2. The only requirement was a relative physical proximity of the two so that they could be considered as fragments of some larger discourse. Avoiding unnecessary complications, we yielded to the natural limitation of written language, which imposes an explicit left-to-right ordering of words. In this sense, the context S1 always preceded the sentence S2, and this was especially explicit in the examples we presented thus far. What we have considered as a notational convenience before now needs a careful classification.

In practical applications of our theory we cannot ignore the fact that language expressions can be produced and recognized one at a time, which imposes a linear temporal ordering on utterances. This observation leads to a quite natural division of all reference situations into backward references and forward references. Perhaps surprisingly, the actual temporal relationship of S1 and S2 does not have a decisive impact in this matter. Clearly, if the context-setting sentence S1 has been obtained from a discourse fragment that wholly preceded the current utterance S2, references made from S2 to S1 have the backward character. Most of the examples presented in this paper exhibit this characteristic, which is only partially a matter of convenience; backward references are far more common in discourse than forward connections. There are, however, situations where the context-setting sentence is not available at the time the utterance S2 is produced, often deliberately withheld by the speaker and supplied at some later time. If the addressee reacts accordingly, postponing resolution of references in S2 until S1 is provided, we have the situation where S2 precedes S1 (more precisely, the discourse fragment that S1 derives from). In such a case we speak of forward reference, of which the following example is characteristic:

(37) He ran quickly down the stairway. Inspector John Flynn had his reasons to be in such a hurry.

Unfortunately, the backward/forward distinction is not always so clear cut. Often S1 and S2 will overlap arbitrarily so that we


need some more definite method to decide whether a backward or a forward reference is taking place. Sometimes the distinction will not make much sense, however. Consider, for example, the situation in which the current utterance, which includes S2 in it, is itself a part of the context S1.¹⁵

Let S1(x) be a context-setting sentence, and x be a description which fully describes some unique object x to be referenced by an expression y of the "current" utterance S2(y).¹⁶ Taking into account the temporal ordering of language, we shall say that y refers backward to x iff the description x of the object x wholly temporally precedes the utterance of y. In such a case we speak of a backward (co-)reference between y and x. Conversely, if x is not fully defined prior to the utterance of y, we speak of forward (co-)reference between y and x. Initially, this definition may seem overly complicated. Consider, however, the following paragraph in which the context-setting sentence S1 is derived from the utterances (38a) and (38c), with the intervening "current" utterance S2 as (38b).

(38a) John bought some delicious coffee in the pub.
(38b) He drank it back in his office.
(38c) The cream he used was not fresh and the coffee tasted bad.

Clearly, the proper antecedent for it in (38b) is the delicious coffee John bought with the not-fresh cream he added, and not just the delicious coffee he bought, as would be the case if it referred backward into (38a). In fact the antecedent of it is not yet fully defined in (38a), and only after reading (38c) can one resolve the reference properly. In this sense, it refers forward, although neither (38a) nor (38c) contains an explicit antecedent. Observe also that the definite description the coffee in (38c) refers backward and has the same antecedent as it in (38b).

A word of warning is necessary here. After reading (38a) and (38b), the reader may come, quite rightfully, to the conclusion that the pronominal it refers anaphorically (hence backward) to the delicious coffee John bought in the pub. Later, upon reading (38c), he may change his earlier decision and relate both it from (38b) and the coffee from (38c) to the common antecedent previously described. This change in the reference of it involves some nonlocal processing that does not fit into our simple S1-S2 scheme, and therefore cannot be solved by the transformation F_{n-1}. Note that if the reader of (38) had employed the scheme independently to (38a) and (38b) first, and then to (38a,b) and (38c), he would have obtained distinct antecedents for it and the coffee, an obviously undesired result. For this reason, and some others to be explained later, the transformation F_{n-1} does not yet take us to the ultimate stratum in the stratified model. Another level is still necessary, as discussed in Strzalkowski (1986c).
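Stripped of the discourse machinery, the backward/forward test defined above reduces to a single temporal comparison. A minimal sketch, with abstract integer time points as an assumption:

```python
# y refers backward to x iff the description of x is wholly
# completed before y is uttered; otherwise the reference is forward.

def reference_direction(description_complete_at, anaphor_uttered_at):
    if description_complete_at < anaphor_uttered_at:
        return 'backward'
    return 'forward'

# In (38), "it" in (38b) is uttered at t=2, but its antecedent's
# description is completed only by (38c) at t=3: a forward reference.
d1 = reference_direction(1, 2)   # antecedent fully given in (38a)
d2 = reference_direction(3, 2)   # antecedent completed only in (38c)
```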

13. Summary of the transformation F_{n-1}

There are many other types of references which we do not discuss in this paper. Among these, the one-anaphora, clausal it, and elliptical constructions are perhaps the most prominent. See also Webber (1979) and Brown and Yule (1983) for more comprehensive lists. A considerable number of rules resembling those of 2-10 have to be added before the transformation

¹⁵For the sake of completeness, however, we attempt to formally define the notions of backward and forward references.

¹⁶By full description of an object x we mean here that x is ready for a direct reference from S2(y). Observe that the notion is relative to S2 and the speaker's intention. Note also that the description x may come from discourse as well as from an individual's knowledge base.

F_{n-1} could be regarded as nearing a complete form. We limited our discussion to an in-depth investigation of a small number of intersentential reference cases since we believe that further rules, while undoubtedly important, will not change the general image of the transformation F_{n-1} which we have presented thus far. We now summarize the characteristics of the transformation F_{n-1} in the stratified model.

The transformation F_{n-1} takes, from the level SLA_{n-2}, a representation of some actual discourse entered originally in the source language SL and produces the subsequent representation of the former at the level SLA_{n-1}. Let Δ = {δ_1, ..., δ_s} be such a discourse representation at SLA_{n-2}. Without sacrificing generality we may assume that Δ is finite and fully ordered, that is, δ_i < δ_{i+1} for 1 ≤ i < s, where < is a precedence relation. Each δ_i ∈ Δ is an image of some actual expression from SL in categorial grammar (and other early transformations), and in most cases will be classified into the syntactic category of sentences (t in CAT). Because CAT is known to introduce ambiguity, and also because of ambiguity introduced by earlier transformations, a δ_i may not be a single representation (a parse tree) but would rather be a set of these. That is, δ_i = {δ_i^1, ..., δ_i^k} such that δ_i^j ∈ F_{n-2}(σ_i), σ_i ∈ SLA_{n-3}. It would be more convenient, however, to regard the SLA_{n-2} representation of discourse not as a single set Δ, but as a set Δ* of sets such that every Δ ∈ Δ* is an alternative discourse representation, and if Δ = {δ_1, ..., δ_s} then every δ_i ∈ Δ is a single p-marker in CAT. In other words, let Δ* be the SLA_{n-2} representation of some actual discourse. We have

[4] Δ* = {Δ | Δ = {δ_1, ..., δ_s}, δ_i ∈ F_{n-2}(σ_i), σ_i ∈ SLA_{n-3}}

where σ_i is the discourse fragment corresponding to δ_i. For each Δ ∈ Δ* the transformation F_{n-1} produces the representation F_{n-1}(Δ) at the level SLA_{n-1} as a set of prototype discourse representations. Let D be such a discourse prototype produced by F_{n-1} at the level SLA_{n-1}. By a cross section D_c of D, D_c = (D_c^-, D_c^+), we shall mean the discourse representation D with a distinguished point c, and such that if D ∈ F_{n-1}({δ_1, ..., δ_s}) then D_c^- ∈ F_{n-1}({δ_1, ..., δ_c}) and D_c^+ ∈ F_{n-1}({δ_{c+1}, ..., δ_s}). Let S2 = δ_c be the current expression of Δ, in the sense that the set {δ_1, ..., δ_{c-1}} has already been processed into the level SLA_{n-1}. Then the sum D_{c-1}^- ∪ D_c^+ = S1 is the context-setting sentence for S2. This is not to say that D_c^+ and D_{c-1}^- are completely independent. In fact, D_c^+ will build not only on {δ_{c+1}, ..., δ_s}, but also on the context provided by D_{c-1}^-. Here D_{c-1}^- is a possible discourse representation accumulated by processing the discourse part from δ_1 to δ_{c-1} into SLA_{n-1}. The other constituent D_c^+ of S1 stands for a presupposed representation of the rest of the discourse following S2. In practical application the structure D_c^+ does not have to be complete, or may even be ignored, but its theoretical role in resolving forward references cannot be neglected.

The processing of the transformation F_{n-1} can now be formulated as the following recursive rule. Let (S1, S2) be the context-setting sentence/current utterance pair with S2 = δ_i. Then

[5] F_{n-1}((S1, S2)) = {D_i ∈ SLA_{n-1} | ∃d_i ∈ F_{n-1}(S2) s.t. D_i = S1 ∪ d_i}

For every such D_i, a new S1 and S2 are derived as follows:

[6] S2 ← δ_{i+1}, i < s
    S1 ← D_i ∪ D_{i+1}^+

This scheme introduces a considerable degree of ambiguity.


The unconstrained F_{n-1} has to consider and process every pair (S1, S2) that can emerge from using formulas [5] and [6], resulting in a combinatorial explosion of possibilities (highly undesirable in a practical implementation). The process is finite, however. There are only finitely many different S1's that can ever be produced, and the number of different S2's is exactly the same as the cardinality of Δ, which is finite by definition. The process starts with S2 = δ_1, the empty D_0, and a number of alternative D_1^+ which depend on S2. The process stops when, for every discourse prototype D_i, if D_i ∈ SLA_{n-1} then i = s.
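To see the combinatorial behavior of formulas [5] and [6] concretely, the scheme can be run with F_{n-1} on a single utterance abstracted as a black box returning alternative readings, and with the forward contexts D^+ ignored. All names in this sketch are illustrative assumptions.

```python
# Enumerate discourse prototypes per the recursive scheme [5]-[6]:
# process delta_1 ... delta_s left to right; each partial prototype
# (an S1) is extended with every alternative reading d_i of the
# current utterance S2, i.e., D_i = S1 ∪ {d_i}.

def prototypes(deltas, alternatives):
    states = [frozenset()]                  # start: empty D_0
    for delta in deltas:                    # S2 <- next delta  (scheme [6])
        states = [s1 | {d}                  # D_i = S1 ∪ d_i    (scheme [5])
                  for s1 in states
                  for d in alternatives(delta)]
    return states

# Two utterances with two alternative readings each: 2 * 2 = 4
# prototypes, illustrating the multiplicative growth noted above.
result = prototypes(['d1', 'd2'], lambda d: [d + 'a', d + 'b'])
```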

The complexity of this scheme can be reduced if we allow another transformation F_n to parallel F_{n-1}. As one may expect, unrestricted F_{n-1} can produce a great number of discourse prototypes of which only a handful will have actual importance. The rest can perhaps be excluded as inconsistent, senseless, etc., but the local character of the F_{n-1} processing does not give any guarantee that the proper prototypes will be selected. That is why we relegated this job entirely to the next transformation F_n, which is discussed in Strzalkowski (1986c). We have concentrated here on the performance of F_{n-1} on any particular pair (S1, S2).

14. Toward a computer realization of the stratified model

In our discussion we have stressed the computational perspective of our theory of meaning representation. The careful reader may have already assimilated some ideas on how to develop a computer implementation of various aspects of the stratified model. This section summarizes computational characteristics of the stratified model and points out other related problems not addressed directly in this paper. We do not propose an actual implementation for the stratified model. Instead, we highlight a number of issues that have to be investigated in detail before any computer realization of the theory can be attempted. We discuss the computational side of the stratified model in general and then concentrate on transformation F_{n-1}.

To implement the stratified model, one must consider translating the source language from a natural form available at level SL into the ultimate representation at level SLA_n, and appropriate mappings of the "original" universe U onto various models including the ultimate model at level UA_n. Finally, the mapping M specifying the formal semantics of the language at the level SLA_n should connect the left side of the model with the right side, thus completing the implementation. Although each stratum (SLA_i, UA_i), 0 ≤ i ≤ n, has a mapping M_i such that M_i: SLA_i → UA_i, which guarantees the meaningfulness of the model, only M_n = M must actually be made explicit in a computer implementation. This observation has a central significance in assessing the computational tractability of the stratified model. We have already pointed out that the problems with contemporary natural language understanding systems can often be identified with attempts to implement either an M_i such that i < n, or even to relate some two levels SLA_i (usually i < n) and UA_j (also j < n) that do not belong to the same stratum, that is, i ≠ j. In fact, implementation of the left and right sides should be coordinated as we suggested in Sect. 3. For every SLA_i to be created on the left side we should provide an appropriate UA_i on the right side so that the mapping M_i (however implicit) preserves the original semantics M_0 between SL and U. It is not necessary that the entire right side be explicitly constructed. With the exception of the ultimate level UA_n, and the universe U which is taken for granted, the remaining levels UA_i, 1 ≤ i < n, are mostly transparent for a system based on the stratified model. This becomes possible due to the implicit character of the appropriate mappings M_i. We must provide, however, a method for decoding the universe representation at UA_n back into U, and the transformations G_n, ..., G_1 on the right side must be given explicit instantiations.

Suppose that we have already developed the stratified model with all necessary strata (SLA_i, UA_i), 0 ≤ i ≤ n, and described all required transformations F_i and G_i as well as the mapping M. In Sect. 3 we indicated that the model is somehow possessed by an intelligent individual who, having also at his disposal an individual knowledge base KB, uses the model for on-line processing of information entering either by means of source language discourse, or by sensing the universe, or from the knowledge base. The flow of information may occur either in the direction from SL to U (understanding) or backwards (communicating), often affecting the knowledge base and altering its contents. Let us concentrate here, as in preceding sections, on the left side of the model in forward processing, that is, from SL to SLA_n. The requirement of on-line processing restricts implementation of the transformations F_i. We cannot afford a purely sequential scheme of the left side. It would be unrealistic to expect a transformation F_i to wait until a full representation of a discourse is gathered at the level SLA_{i-1} before the transformation onto the next stratum could be attempted. In fact, except for only some very early transformations (F_1, F_2, ...), beginning from some level SLA_{i-1} all further transformations should work in tandem, producing pieces of the ultimate discourse representation at the level SLA_n. This is not to say that every word, phrase, or even sentence entering SL should trigger a cascaded processing on the left side. The process could rather be compared to a counter mechanism where the move of the next digit ring occurs only after the preceding ring moves a number of times. For example, it would often be advisable to postpone the syntactic analysis of the input until a sufficient number of lexical items of a current phrase or sentence is available, thus reducing the degree of nondeterminism in parsing (Marcus 1980). Similarly, an extension to the discourse model by the transformation F_n may be considered only after the relationship of the current utterances to the preceding discourse has been determined by the transformation F_{n-1}. The process will not, however, be as regular as a counter. Some early transformations, including phonological, lexical, or even syntactic processing, which operate on fairly well-defined units (phonemes, lexemes, phrases), will best be realized sequentially, although this view may not be practical in general. We must always remember the presence of the individual knowledge base which, by generating various presuppositions, may occasionally activate further transformations that build a higher-level representation of some discourse fragment before that fragment is even fully sensed at SL. This happens when we recognize the beginning of some pattern language construction (idiom, commonly used combination of words, etc.), and the knowledge base generates an expected continuation. In this sense, the syntactic well-formedness of utterances, for example, is not crucial for understanding them. If the actual continuation of discourse fits our expectations, further processing reduces to a mere verification; otherwise a revision has to be performed, if possible. The problem of cooperating transformations is discussed in greater detail in Strzalkowski (1986c).
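The counter analogy can be sketched as a chain of buffering stages, each of which fires into the next only after accumulating enough units from its predecessor. The Stage class, the thresholds, and the stage functions below are all illustrative assumptions:

```python
# A cascade of transformations working "like a counter": each stage
# buffers input and passes a chunk downstream only when its
# threshold is reached (e.g., a parser waiting for enough lexemes).

class Stage:
    def __init__(self, threshold, transform):
        self.buffer, self.threshold = [], threshold
        self.transform, self.next = transform, None

    def feed(self, unit):
        self.buffer.append(unit)
        if len(self.buffer) >= self.threshold:   # the "digit ring" turns
            chunk, self.buffer = self.transform(self.buffer), []
            if self.next is not None:
                self.next.feed(chunk)

out = []
lexical = Stage(3, lambda words: ' '.join(words))      # group 3 words
syntactic = Stage(1, lambda phrases: f"[{phrases[0]}]")
sink = Stage(1, lambda chunks: out.append(chunks[0]))
lexical.next, syntactic.next = syntactic, sink

for w in "John bought six apples today now".split():
    lexical.feed(w)
# out == ['[John bought six]', '[apples today now]']
```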

We know very little about the actual form transformations F_1 to F_{n-2} may eventually take, except perhaps that transformation F_{n-2} is identified with the categorial grammar CAT. We assume


transformations F_1 to F_{n-3} are computationally tractable (at least in theory). Implementation of CAT should not be a great challenge either,¹⁷ but we do not exclude some other, and more powerful, grammatical system employed as F_{n-2}. Although we believe that the problems of language processing one may expect to be covered by the pre-F_{n-1} transformations are more or less worked out already, we shall investigate further their nature and role in the stratified model. Let us therefore concentrate our attention on the transformation F_{n-1}.

15. Computability of the transformation F_{n-1}

There are three aspects of the transformation F_{n-1} that merit special attention from a computational viewpoint. These aspects include implementation of the rules which create the core of the transformation (presently rules 1-12¹⁸), automating the process of deriving context-setting sentences according to the recursive scheme given in Sect. 13, and providing access to, and cooperation with, an individual knowledge base. To implement the reference rules one would generally need to solve the problem of λ-reductions, whose computational tractability is questionable. To the extent that we are concerned with the problem, implementation of λ-reductions has already been addressed in the computer literature, and concrete realizations have been suggested in standard programming languages including Pascal and LISP (Georgeff 1984) and Prolog.¹⁹ But even an effective algorithm for evaluating the rules will not have much significance until we provide a method for computing the context-setting sentences on which the rules operate. The recursive formula of Sect. 13 is the first step in this direction. We believe it will prove programmable, especially when close cooperation from the transformation F_n is assumed. What transformation F_n should provide is a considerable reduction in the combinatorial explosion of possibilities generated by the formula. One of the most important factors in this reduction comes from limiting the available context to that part which is actually referenced by the current utterance S2. An implementation of the forward context (the D_c^+ component of a discourse prototype cross section) may pose a serious problem here. This part of the context derives mainly from presuppositions made from an individual knowledge base.
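For the λ-reductions themselves, the core operation is a single β-step, ((λx.B) A) ⇒ B[x := A]. The toy term encoding below is our own assumption (Georgeff (1984) treats the real implementation problem); capture-avoiding renaming is omitted for brevity.

```python
# One-step outermost beta-reduction over tuple-encoded lambda terms:
# ('lam', var, body), ('app', fun, arg), ('var', name), or an atom.

def substitute(term, var, value):
    if term == ('var', var):
        return value
    if isinstance(term, tuple):
        if term[0] == 'lam' and term[1] == var:
            return term            # var rebound here: do not descend
        return tuple(substitute(t, var, value) if isinstance(t, tuple)
                     else t for t in term)
    return term

def beta_reduce(term):
    """((λx.B) A) => B[x := A]; otherwise return the term unchanged."""
    if (isinstance(term, tuple) and term[0] == 'app'
            and isinstance(term[1], tuple) and term[1][0] == 'lam'):
        _, (_tag, var, body), arg = term
        return substitute(body, var, arg)
    return term

# Applying P = (λx (u x)) to the constant J reduces to (u J).
P = ('lam', 'x', ('u', ('var', 'x')))
reduced = beta_reduce(('app', P, 'J'))
# reduced == ('u', 'J')
```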

Accessing and manipulating a knowledge base creates a separate programming problem. In addition to the well-understood organizational questions of maintaining a large data base, one must solve the general problem of inferences within the knowledge base. It would be impractical, and unrealistic in general, to expect the owner of a knowledge base to always be aware of (read: be able to access directly) any information that could be derived from what he knows or believes. However, the actual contents of the knowledge base, as well as its internal organization, should guarantee that, in a specific environment, appropriate facts will surface as the result of proper inferences. This requirement, in turn, raises the question of the quality and quantity of such inferences, that is, how fine and in-depth they have to be in a particular situation. Inferences that are too fine can be as harmful as inferences that are too shallow because they can undermine our ability to understand discourse at the level intended by the speaker. Take, for example, the Schank-style

¹⁷Categorial grammars create a subclass of context-free grammars. The reader may note, however, that Montague's syntax is not context-free (Partee 1976).

¹⁸Rules 11 and 12 are described in Strzalkowski (1986c).
¹⁹Dahl, V. 1984. Private communication.

(Schank 1972) conceptual information processing paradigm, and the difficulties faced by the natural language understanding systems based on this approach (Schank 1975). In short, the conceptual approach requires an advance specification of the entire world model on which language communicates, as well as the level of detail at which inferences within this model are performed. Although the framework can be quite successful in simulating some well-defined "toy worlds," it proves hopelessly impractical in real-world situations.

16. Conclusions

We have been concerned with developing a computational methodology for finding extrasentential references which takes advantage of the expressive power and formal interpretability provided by the stratified model of meaning representation. In our investigations, we have limited the scope of inquiry to the problem of selecting possible antecedents in two-sentence "stories" to avoid most of the problems inherent in where to look for the reference. We believe that this latter, ultimately important problem requires further consideration of various pragmatic issues which we were not prepared to address in this paper.

We began by introducing the stratified model and the different kinds of representations which this model employs. Within this framework we introduced the basic use of a λ-categorial language Λ for meaning representation. We have provided methods for translating (computing) extrasentential references into intended meaning representations. We have proposed a new translation for the determiner the which we believe provides a correct representation for definite anaphoric (and cataphoric) descriptions. Rule 1 (the basic translation rule) translates expressions from the categorial language L (utilized in earlier stages of the stratified model) into the λ-categorial language Λ for representing intersentential dependencies in discourse. We have provided rules for perfect and imperfect context translations (rules 2 and 3), a rule for attitude report context translation (rule 4), perfect and imperfect pronominal context translations (rules 5 and 6), attitude report pronominal context translation (rule 7), rules for translating references to proper names that are either contingent (rule 8) or ultimate (rule 9), and a rule for conditional context translation (rule 10).

Rules were explained by considering a series of two-sentence "stories" which we used to illustrate referential interdependencies between sentences. We explained the conditions under which such dependencies could arise and formalized the set of rules (rules 2 through 10) which specified how to compute the reference.

We restricted our considerations to situations in which a reference, if it can be computed at all, had a presupposed unique antecedent. We then summarized the transformation F_{n-1}, which encompasses rules 2-10 (among others), in more general terms and related it to the stratified model. We discussed the computational characteristics of the stratified model and presented our ideas for a computer realization of it. Although there is no implementation of the stratified model at present, we did discuss three aspects of the transformation F_{n-1} that merit special attention from a computational viewpoint.

Obviously much work remains to be done. The research reported in this paper is based on some of the Ph.D. thesis research of one of us (Strzalkowski) into a theory of stratified meaning representation. Although we are at a preliminary stage with this research, we believe the results we presented are


nonetheless promising and will prove to be very significant to computational linguistics.

Acknowledgements

We thank the reviewers whose many suggestions have helped greatly to make this a more readable and comprehensive article. This research was supported by the Natural Sciences and Engineering Research Council of Canada under Operating Grant A4309.

AJDUKIEWICZ, K. 1935. Syntactic connection. In Polish logic. Edited by S. McCall. Oxford University Press, New York, NY. pp. 207-231. English translation, 1967.

BARENDREGT, H. P. 1981. The lambda calculus, its syntax and semantics. In Studies in logic and the foundations of mathematics. Vol. 103. North-Holland Publishing Co., Amsterdam, The Netherlands.

BARWISE, J., and PERRY, J. 1983. Situations and attitudes. The MIT Press, Cambridge, MA.

BOLC, L., and STRZALKOWSKI, T. 1982. Transformation of natural language into logical formulas. Proceedings of the 9th International Conference on Computational Linguistics, Prague, pp. 29-35.

1984. Natural language interface to the question-answering system for physicians. In Computers and artificial intelligence, vol. 3, no. 1. VEDA, Bratislava, pp. 31-46.

BROWN, G., and YULE, G. 1983. Discourse analysis. In Cambridge textbooks in linguistics. Cambridge University Press, New York, NY.

BURGE, T. 1975. Reference and proper names. In The logic of grammar. Edited by D. Davidson and G. Harman. Dickenson Publishing Co., Encino, CA, pp. 200-209.

CARNAP, R. 1947. Meaning and necessity. University of Chicago Press, Chicago, IL.

CERCONE, N. 1980. The representation and use of knowledge in an associative network for the automatic comprehension of natural language. In Natural language based computer systems. Edited by L. Bolc. Carl Hanser Verlag, Munich, West Germany.

1981. Some problems in representing grammatical modifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 4, pp. 357-368.

(Editor). 1983. Computational linguistics. Pergamon Press, London, England.

CERCONE, N., and STRZALKOWSKI, T. 1986. From English to semantic networks. In Translating natural language into logical form. Edited by L. Bolc. Springer. In press.

CRESSWELL, M. J. 1970. Classical intensional logics. Theoria, 36, pp. 347-372.

1973. Logics and languages. Methuen & Co., London, England.

DONNELLAN, K. 1971. Reference and definite descriptions. In Semantics. Edited by D. D. Steinberg and L. A. Jakobovits. Cambridge University Press, New York, NY, pp. 100-114.

DOWTY, D. R. 1976. Montague grammar and lexical decomposition of causative verbs. In Montague grammar. Edited by B. H. Partee. Academic Press, New York, NY, pp. 201-245.

GEACH, P. T., and BLACK, M. 1960. On sense and reference. In Translations from the philosophical writings of Gottlob Frege. Blackwell Scientific Publications Ltd., Oxford, England, pp. 56-78.

GEORGEFF, M. 1984. Transformations and reduction strategies for typed lambda calculus. ACM Transactions on Programming Languages and Systems, 6, pp. 603-631.

GOODMAN, N. 1951. The structure of appearance. Bobbs-Merrill, New York, NY.

GROSZ, B. J., and SIDNER, C. L. 1985. Discourse structure and the proper treatment of interruptions. Proceedings of the Ninth International Joint Conference on Artificial Intelligence, Los Angeles, CA, pp. 832-839.

HAYES, P. 1974. Some issues and non-issues in representation theory. Proceedings of the AISB Conference, University of Sussex, Brighton, England, pp. 63-79.


HEIM, I. R. 1982. The semantics of definite and indefinite noun phrases. Doctoral dissertation, University of Massachusetts, Amherst, MA.

HENDRIX, G. G., SACERDOTI, E. D., SAGALOWICZ, D., and SLOCUM, J. 1978. Developing a natural language interface to complex data. ACM Transactions on Database Systems, 3(2), pp. 105-147.

HIRST, G. 1983. Semantic interpretation against ambiguity. Doctoral dissertation. Department of Computer Science, Brown University, Providence, RI, Technical Report CS-83-25.

KRIPKE, S. 1972. Naming and necessity. In Semantics of natural language. Edited by D. Davidson and G. Harman. D. Reidel Publishing Co., Dordrecht, The Netherlands, pp. 253-355.

LEWIS, D. 1976. General semantics. In Montague grammar. Edited by B. H. Partee. Academic Press, New York, NY, pp. 1-50.

MARCUS, M. P. 1980. Theory of syntactic recognition for natural language. MIT Press, Cambridge, MA.

MCCALLA, G., and CERCONE, N. Editors. 1983. Special issue on knowledge representation. IEEE Computer, 16.

MCCARTHY, J., and HAYES, P. 1969. Some philosophical problems from the standpoint of artificial intelligence. Machine Intelligence, 4, pp. 463-502.

MONTAGUE, R. 1974. Formal philosophy. Selected papers of Richard Montague. Edited by R. H. Thomason. Yale University Press, New Haven, CT.

MOORE, R. 1981. Problems in logical form. Proceedings of the 19th Annual Meeting of the Association for Computational Linguistics, Menlo Park, CA, pp. 117-124.

NASH-WEBBER, B., and REITER, R. 1977. Anaphora and logical form: on formal meaning representation for natural language. Proceedings of the Fifth International Joint Conference on Artificial Intelligence, MIT, Cambridge, MA, pp. 121-131.

PARTEE, B. H. 1972. Opacity, coreference, and pronouns. In Semantics of natural language. Edited by D. Davidson and G. Harman. D. Reidel Publishing Co., Dordrecht, The Netherlands, pp. 415-441.

1976. Some transformational extensions of Montague grammar. In Montague grammar. Edited by B. H. Partee. Academic Press, New York, NY, pp. 51-76.

QUINE, W. 1953. From a logical point of view. Harvard University Press, Cambridge, MA.

1960. Word and object. MIT Press, Cambridge, MA.

REICHENBACH, H. 1947. Elements of symbolic logic. Free Press, New York, NY.

RIESBECK, C. K. 1975. Conceptual analysis. In Conceptual information processing. Edited by R. C. Schank. North-Holland Publishing Co., Amsterdam, The Netherlands, pp. 83-156.

ROSSER, J., and TURQUETTE, A. 1952. Many valued logics. North- Holland Publishing Co., Amsterdam, The Netherlands.

RUSSELL, B. 1905. On denoting. Mind, 14, pp. 479-493.


1923. Vagueness. Australasian Journal of Philosophy, 1, pp. 84-92.

SCHANK, R. 1972. Conceptual dependency: a theory of natural language understanding. Cognitive Psychology, 3, pp. 552-631.

Editor. 1975. Conceptual information processing. North-Holland Publishing Co., Amsterdam, The Netherlands.

SCHUBERT, L. K. 1976. Extending the expressive power of semantic networks. Artificial Intelligence, 7, pp. 163-198.

SCHUBERT, L. K., and PELLETIER, F. J. 1982. From English to logic: context-free computation of 'conventional' logical translation. American Journal of Computational Linguistics, 8(1), pp. 27-44.

SMULLYAN, R. 1948. Modality and description. The Journal of Symbolic Logic, 13(1), pp. 31-37.

STALNAKER, R. C. 1972. Pragmatics. In Semantics of natural language. Edited by D. Davidson and G. Harman. D. Reidel Publishing Co., Dordrecht, The Netherlands.

STRZALKOWSKI, T. 1983. ENGRA: yet another parser for English. Department of Computing Science, Simon Fraser University, Burnaby, B.C., Technical Report TR 83-10.

1984. Toward a proper meaning representation for natural languages. Natural Language Group, Laboratory for Computer and Communications Research, Simon Fraser University, Working Paper #1.

1986a. Representing contextual dependencies in discourse. Proceedings of the Canadian Conference on Artificial Intelligence (CSCSI/SCEIO '86), Montreal, P.Q., pp. 57-61.

1986b. An approach to non-singular terms in discourse. Proceedings of the 11th International Conference on Computational Linguistics (COLING), Bonn, Germany.

1986c. A theory of stratified meaning representation. Doctoral dissertation. Simon Fraser University, Burnaby, B.C.

TARSKI, A. 1933. The concept of truth in the languages of the deductive sciences. In Logic, Semantics, Metamathematics. Edited by A. Tarski. English translation, Oxford 1956.

THOMASON, R. H. 1976. Some extensions of Montague grammar. In Montague grammar. Edited by B. H. Partee. Academic Press, New York, NY, pp. 77-118.

VENDLER, Z. 1971. Singular terms. In Semantics. Edited by D. D. Steinberg and L. A. Jakobovits. Cambridge University Press, New York, NY, pp. 115-133.

WALTZ, D. L., FININ, T. N., GREEN, F., CONRAD, E., GOODMAN, B., and HADDEN, G. 1975. The PLANES system: natural language access to a large data base. Coordinated Science Laboratory, University of Illinois, IL, Technical Report T-34.

WEBBER, B. L. 1979. A formal approach to discourse anaphora. Doctoral dissertation. Garland Publishing, New York, NY.

WOODS, W. A., KAPLAN, R. M., and NASH-WEBBER, B. 1972. The lunar science natural language information system: final report. Bolt Beranek and Newman, Inc., Cambridge, MA, BBN Report #2378.

