Post on 22-Apr-2015
1/19
From philosophy to computer science How to formalise the cognitive linguistics results? Adpositional grammars Conclusions
A Constructive Mathematics approach for Natural Language formal grammars
An Introduction to Adpositional Grammars (AdGrams)
Federico Gobbo and Marco Benini, {federico.gobbo,marco.benini}@uninsubria.it
University of Insubria, Varese, Italy. CC: some rights reserved.
ECAP09, UAB, Barcelona, July 2009
2/19
Philosophers and natural language formalization
How can natural languages (NLs) be cleansed of ambiguity?
Leibniz: characteristica universalis to catch the laws of human thought, and lingua generalis as a quasi-natural Latin to be used as a written medium for scholars.
Frege: Begriffsschrift and the definition of unsaturated expressions.
Husserl: the meaning categories as the formal constituents of a logical grammar.
3/19
Computer science and logic-based approaches to NL
Goal: formalize NL grammars through mathematical formulae proved through computation. Some formalisms in use today (the list is not complete!):
Based on Chomsky’s constituency and transformation notions: Minimalism, Tree-Adjoining Grammar (TAG), Head-Driven Phrase Structure Grammar (HPSG).
Based on categorial calculus: Type-Logical Grammar (TLG), Thinking Through Grammar (TTG), Combinatory Categorial Grammar (CCG), Pregroup Grammar.
Based on Tesnière’s dependency and valency: Extensible Dependency Grammar (XDG), Algebraic Syntax, Functional Generative Description (FGD), Mel’čuk’s Dependency Syntax.
How to choose the best one, i.e., the most expressive in linguistic terms?
4/19
Psychological interpretation of NL formalisms
In recent years scholars have become interested in strong psychological interpretations of their formalisms:
“If I succeed in giving a clear account of more psychological phenomena thanks to my formalism, this means that my formalism is better.”
Surprisingly, formal grammarians have not read the results of 20 years of cognitive linguistics, where linguistic phenomena are explained in psychological and cognitive terms.
5/19
Cognitive linguistics and formalisation
Cognitive linguists (e.g., Taylor, Cruse) are not interested in formalisation, as their primary interest is in metaphor interpretation: concepts are not formalised per se.
Nonetheless, Langacker borrowed the trajector/landmark dichotomy from the German school of Gestalt psychology (e.g., Kurt Koffka and Max Wertheimer) and brought it into syntax.
In other words, the cognitive account is built into the analytical framework, instead of being a serendipitous by-product.
6/19
Trajector vs. landmark and the Tesnièrian dependency
The trajector is the most salient participant, put in the focused position;
the landmark is the reference point of observation performed by the trajector.
Our hypothesis is that the trajector/landmark relation is conveyed in NLs either by prepositions (most Indo-European languages, like English or Catalan) or by postpositions (e.g., Turkish, Japanese), i.e., by adpositions; hence the term adpositional grammar.
7/19
Trajectors, landmarks and the Tesnièrian dependency
Adpositional grammars (AdGrams) retain the concept of valency from Tesnière, but reconfigure the concept of dependency thanks to the trajector/landmark dichotomy.
The structure of NLs (morphology + syntax) can be expressed with the following triple:
there is a Governor (G),
there is a Dependent (D),
D and G have their Relation (R).
Remark: R is often a phrasal or sentence adposition.
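The triple can be sketched as a small data structure. This is only an illustration in Python, not the authors' logical formalism: the class name `Adtree`, its field names, and the choice to allow nested subtrees are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical encoding of the (G, D, R) triple; every name is illustrative.
@dataclass(frozen=True)
class Adtree:
    relation: str                     # R: often a phrasal or sentence adposition
    dependent: Union["Adtree", str]   # D: a plain phrase or (assumed) a subtree
    governor: Union["Adtree", str]    # G: a plain phrase or (assumed) a subtree

# The three components are always present together.
example = Adtree(relation="R", dependent="D", governor="G")
print(example)
```

Making the class frozen keeps each triple immutable, so a tree can be shared or compared by value without surprises.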
8/19
Standing on the shoulders of which giants?
AdGrams distinguish two directions of dependency:
Dependency, when D is the trajector, and G is the landmark;
Government, when the trajector is G, and consequently the landmark is D.
The advantage is that no assumption on semantics is made, i.e., we follow a strict world-model agnosticism: e.g., in some possible science-fiction world, “the hotel killed the rabbit” is true.
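The two directions can be sketched as a function from a triple to its trajector and landmark. This is a minimal Python illustration under stated assumptions: the names `Direction`, `Adtree`, `trajector` and `landmark` are hypothetical, and the arrows mirror the notation used in the adtree figures. Note that nothing in the code inspects meaning, in line with the world-model agnosticism described above.

```python
from dataclasses import dataclass
from enum import Enum

class Direction(Enum):
    DEPENDENCY = "→"   # D is the trajector, G is the landmark
    GOVERNMENT = "←"   # G is the trajector, D is the landmark

# Hypothetical triple with an explicit direction; names are illustrative.
@dataclass(frozen=True)
class Adtree:
    direction: Direction
    relation: str
    dependent: str
    governor: str

def trajector(tree: Adtree) -> str:
    """Return the most salient participant according to the direction."""
    if tree.direction is Direction.DEPENDENCY:
        return tree.dependent
    return tree.governor

def landmark(tree: Adtree) -> str:
    """Return the reference point, i.e. the side that is not the trajector."""
    if tree.direction is Direction.DEPENDENCY:
        return tree.governor
    return tree.dependent

dep = Adtree(Direction.DEPENDENCY, "R", "D", "G")
gov = Adtree(Direction.GOVERNMENT, "R", "D", "G")
print(trajector(dep), landmark(dep))
print(trajector(gov), landmark(gov))
```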
9/19
This is the prototypical Dependency-based tree...
[Adtree diagram: root node → R; left branch D; right branch G]
Notation: D is always on the left branch, while G is always on the right.
10/19
...while this one is the prototypical Government-based tree.
[Adtree diagram: root node ← R; left branch D; right branch G]
Let’s see a couple of examples.
11/19
A Dependency-based phrase...
[Adtree diagram: root node → ε; left branch ∆ The torpedo; right branch ∆ sank...]
Figure: Adtree of The torpedo sank the ship.
The trajector is the torpedo, while the landmark is sank the ship.
Notation: delta (∆) means that some information is hidden.
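The ∆ notation can be sketched as a wrapper for a subtree whose internal structure is not shown. This Python sketch is illustrative only: `Collapsed`, `Adtree` and `render` are hypothetical names, the one-line bracket rendering is an invention of this example, and reading ε as an empty adposition is an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Collapsed:
    """A subtree whose internal structure is hidden (the slide's ∆)."""
    summary: str

@dataclass(frozen=True)
class Adtree:
    arrow: str      # "→" for Dependency, "←" for Government
    relation: str   # the adposition; "ε" is assumed to mean an empty one
    left: object    # D, always on the left branch
    right: object   # G, always on the right branch

def render(node) -> str:
    """Render a tree on one line, prefixing hidden subtrees with ∆."""
    if isinstance(node, Collapsed):
        return "∆" + node.summary
    if isinstance(node, Adtree):
        return f"[{node.arrow}{node.relation} {render(node.left)} | {render(node.right)}]"
    return str(node)

torpedo = Adtree("→", "ε", Collapsed("The torpedo"), Collapsed("sank..."))
print(render(torpedo))  # [→ε ∆The torpedo | ∆sank...]
```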
12/19
...and a Government-based phrase
[Adtree diagram: root node ← ε; left branch ∆ The ship; right branch ∆ sank]
Figure: Adtree of The ship sank.
The landmark is the ship, while the trajector is the act of sinking (in fact, the Agent and the Instrument are unexpressed here).
13/19
A Dependency-based sentence...
[Adtree diagram: root node → so; left branch ∆ she can...; right branch ∆ A. is...]
Figure: Adtree for Alice is rich so she can pay
The most salient information (trajector) is the fact that Alice can pay.
Here, the adposition is so, which gives the Dependency structure.
14/19
...and a Government-based sentence
[Adtree diagram: root node ← because; left branch ∆ she is...; right branch ∆ A. can...]
Figure: Adtree for Alice can pay because she is rich
Again, the most salient information is the fact that Alice can pay: in this case the trajector is the Governor.
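The two sentence examples can be compared side by side in a short Python sketch. The tuple type `Adtree` and the helper `trajector` are hypothetical names, and the branch texts are written out from the figure captions rather than taken from the abbreviated tree labels. The point illustrated is that the adposition changes the direction while the salient clause stays the same.

```python
from typing import NamedTuple

# Hypothetical flat encoding of a sentence-level adtree.
class Adtree(NamedTuple):
    arrow: str       # "→" Dependency, "←" Government
    relation: str    # the sentence adposition
    dependent: str   # D, left branch
    governor: str    # G, right branch

def trajector(tree: Adtree) -> str:
    """D is the trajector under Dependency, G under Government."""
    return tree.dependent if tree.arrow == "→" else tree.governor

# "Alice is rich so she can pay": Dependency, adposition "so".
so_tree = Adtree("→", "so", "she can pay", "Alice is rich")
# "Alice can pay because she is rich": Government, adposition "because".
because_tree = Adtree("←", "because", "she is rich", "Alice can pay")

for tree in (so_tree, because_tree):
    print(tree.relation, "->", trajector(tree))
```

In both trees the trajector is the clause about paying, once as the Dependent and once as the Governor.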
15/19
Why constructive mathematics for formalisation
Unlike most other Dependency formal grammar frameworks, AdGrams use techniques of constructive mathematics, since constructive logics are the natural framework to model computation, as argued by Troelstra and Barendregt.
Surprisingly, most if not all formal NL grammars based on combinatorial calculus take Chomsky’s constituency for granted, using the calculus “only” to build non-transformational grammars.
16/19
The novelties of adpositional grammars
Our approach is different: constructivism permits hiding information in a very natural and precise way, and it defines the grammar as a specification in the language of logic, while the semantics of the logic acts as a computational engine, so that we can parse, generate and manipulate sentences essentially for free.
Furthermore, we use a formal method for a non-constituency-based grammar, and this is the first attempt ever, at least as far as the authors know.
17/19
Provisional results
We have written an instance of AdGrams in the appropriate logical formalism, based on intuitionistic logic, together with specialized semantics.
This first test is fit for the quasi-natural language Esperanto, and it proved to cover approximately 95% of the available corpora of language-in-use, except for the well-known open issues of quoting and named-entity recognition.
18/19
Further directions
We are currently working on a more general model, i.e., an instance of adpositional grammars which is cross-linguistically valid from the beginning, where the engine captures the structure of every NL.
The intricacies are put in the lexicon, the part of semantics which is properly computable in terms of meaning components and pragmatic participants.
19/19
Thanks. Any questions?
Download these slides here:
http://www.slideshare.net/goberiko/
CC BY-NC-SA. © Federico Gobbo 2009. Published in Italy. Attribution, NonCommercial, ShareAlike 2.5.