2007 CLINT-LIN-FEATSTR 1
Computational Linguistics for Linguists
Feature Structures
Example PATR-II Grammar and Lexicon
Grammar (grammar.grm)
Rule
s -> np vp
Rule
np -> n
Rule
vp -> v
Lexicon (lexicon.lex)
\w uther
\c n
\w sleeps
\c v
\w sleep
\c v
Example PATR-II Grammar and Lexicon
% Grammar (grammar.grm)
Rule
s -> npsg vpsg
Rule
s -> nppl vppl
Rule
npsg -> nsg
Rule
nppl -> npl
Rule
vpsg -> vsg
Rule
vppl -> vpl
% Lexicon (lexicon.lex)
\w cows
\c npl
\w uther
\c nsg
\w sleeps
\c vsg
\w sleep
\c vpl
Grammar and Lexicon with Pronouns
% Grammar (grammar.grm)
Rule
s -> npsg vpsg
Rule
s -> nppl vppl
Rule
npsg -> nsg
Rule
nppl -> npl
Rule
vpsg -> vsg
Rule
vppl -> vpl
% Lexicon (lexicon.lex)
\w he
\c nsg
\w him
\c nsg
\w she
\c nsg
\w her
\c nsg
\w they
\c npl
\w them
\c npl
\w sleeps
\c vsg
\w sleep
\c vpl
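Problem with the Grammar
• The grammar allows:
he/him/she/her sleeps
they/them sleep
• Because it does not distinguish case, it accepts ungrammatical sentences such as "him sleeps" and "them sleep".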
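The overgeneration can be checked mechanically. The following is a minimal sketch in Python, not PC-PATR itself: the grammar and lexicon are transcribed from the pronoun slide, while the recognizer (`derives`, `accepts`) and the dictionary layout are illustrative assumptions.

```python
# Grammar and lexicon transcribed from the slide; the recognizer is a hypothetical helper.
GRAMMAR = {
    "s": [["npsg", "vpsg"], ["nppl", "vppl"]],
    "npsg": [["nsg"]],
    "nppl": [["npl"]],
    "vpsg": [["vsg"]],
    "vppl": [["vpl"]],
}
LEXICON = {
    "he": "nsg", "him": "nsg", "she": "nsg", "her": "nsg",
    "they": "npl", "them": "npl",
    "sleeps": "vsg", "sleep": "vpl",
}


def derives(symbol, words):
    """Return True if `symbol` can derive exactly the word list `words`."""
    if symbol not in GRAMMAR:
        # Preterminal: it must match the category of a single word.
        return len(words) == 1 and LEXICON.get(words[0]) == symbol
    for rhs in GRAMMAR[symbol]:
        if len(rhs) == 1 and derives(rhs[0], words):
            return True
        if len(rhs) == 2:
            # Try every split point between the two daughters.
            for i in range(1, len(words)):
                if derives(rhs[0], words[:i]) and derives(rhs[1], words[i:]):
                    return True
    return False


def accepts(sentence):
    return derives("s", sentence.split())


print(accepts("he sleeps"))   # grammatical, accepted
print(accepts("him sleeps"))  # ungrammatical, but still accepted
print(accepts("he sleep"))    # number mismatch, rejected
```

Number agreement is enforced because it is built into the category names, but nothing separates "he" from "him".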
Grammar and Lexicon with Pronouns
% Grammar (grammar.grm)
Rule
s -> npsgnom vpsg
Rule
s -> npplnom vppl
Rule
npsgnom -> nsgnom
Rule
npplnom -> nplnom
Rule
npsgacc -> nsgacc
Rule
npplacc -> nplacc
Rule
vpsg -> vsg
Rule
vppl -> vpl
% Lexicon (lexicon.lex)
\w he
\c nsgnom
\w him
\c nsgacc
\w she
\c nsgnom
\w her
\c nsgacc
\w they
\c nplnom
\w them
\c nplacc
\w sleeps
\c vsg
\w sleep
\c vpl
Remarks
• The only mechanism available to a CFG to prevent overgeneration is the creation of new categories.
• Whenever we add new categories, the grammar gets longer and less understandable.
• Is there another way?
Constraints and Information Structures
• PATR-II handles this problem by associating words with feature structures.
• Feature structures are commonly written as attribute-value matrices, e.g.
[ cat noun
  num sing ]
• Items on the left are attributes
• Items on the right are the corresponding values
Constraints and Information Structures
• Rules are then augmented with constraint equations between feature structures associated with constituents.
• These can be used to express constraints between constituents (e.g. subject/verb agreement),
• or to pass information from words up to higher constituents (e.g. np inherits information from n).
Example of PATR Rules with Constraints
Rule
s -> np vp
<np num> = <vp num>
Rule
np -> n
<np head> = <n head>
Feature Constraints
Feature constraints comprise three parts, in this order:
1. a feature path, the first element of which is one of the symbols from the phrase structure rule
2. an equal sign (=)
3. either a simple value, or another feature path that also starts with a symbol from the phrase structure rule
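The three-part shape of a constraint can be made concrete with a small parser. This is only a sketch in Python under stated assumptions: `parse_path` and `parse_constraint` are hypothetical helpers, not part of PC-PATR, and the tuple they return is an arbitrary choice of representation.

```python
def parse_path(text):
    """Return the feature names inside '<...>', or None if text is not a path."""
    text = text.strip()
    if text.startswith("<") and text.endswith(">"):
        return text[1:-1].split()
    return None


def parse_constraint(text, rule_symbols):
    """Split a constraint into (left_path, (kind, right_side)).

    kind is "path" or "value". Raises ValueError if the constraint does not
    have the three-part shape, or if a path does not start with one of the
    symbols of the phrase structure rule.
    """
    left, sep, right = text.partition("=")
    if sep != "=":
        raise ValueError("missing equal sign")
    left_path = parse_path(left)
    if not left_path or left_path[0] not in rule_symbols:
        raise ValueError("left side must be a path starting with a rule symbol")
    right_path = parse_path(right)
    if right_path is None:
        return left_path, ("value", right.strip())  # simple value
    if not right_path or right_path[0] not in rule_symbols:
        raise ValueError("right path must start with a rule symbol")
    return left_path, ("path", right_path)


symbols = {"s", "np", "vp"}  # symbols of the rule s -> np vp
print(parse_constraint("<np num> = <vp num>", symbols))
print(parse_constraint("<np num> = sg", symbols))
```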
Unification
• Unification is the basic operation applied to feature structures in PC-PATR
• It consists of the merging of the information from two feature structures.
• Two feature structures can unify if their common features have the same values, but do not unify if any feature values conflict.
Examples
[num sg] unified with [person first] gives [num sg person first]
[num sg] unified with [num sg] gives [num sg]
[num sg] unified with [num pl] gives NOTHING (unification fails)
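These three cases can be reproduced with a few lines of Python. This is a minimal sketch that assumes feature structures with simple (atomic) values only, represented as dictionaries; `unify_flat` is an illustrative name, and returning `None` to signal failure is a choice made for this sketch.

```python
def unify_flat(a, b):
    """Unify two feature structures whose values are all atomic.

    Returns the merged structure, or None if a shared feature has
    conflicting values. Neither input is modified.
    """
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None  # conflict: unification fails
        result[feature] = value
    return result


print(unify_flat({"num": "sg"}, {"person": "first"}))  # {'num': 'sg', 'person': 'first'}
print(unify_flat({"num": "sg"}, {"num": "sg"}))        # {'num': 'sg'}
print(unify_flat({"num": "sg"}, {"num": "pl"}))        # None
```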
Complex-Valued FS
• Feature structures can have either simple values or complex values, such as this:
[ cat np
  head [ agr [ num sg
               gen masc ]
         deftype indef ] ]
• Feature structures can be arbitrarily nested and used to build linguistic representations.
Building Up Structures
• Agreement features – 3rd person singular
[ num sing
  person 3 ]
• Noun phrase – 3rd person singular noun phrase
[ cat np
  agr [ num sing
        person 3 ] ]
• Sentence – with 3rd person singular subject
[ cat s
  subj [ cat np
         agr [ num sing
               person 3 ] ] ]
Simple Unification Examples
1. [ agreement: [ number: singular person: first ] ]
2. [ agreement: [ number: singular case: nominative ] ]
3. [ agreement: [ number: singular person: third ] ]
4. [ agreement: [ number: singular person: first case: nominative ] ]
5. [ agreement: [ number: singular person: third case: nominative ] ]
Checkpoint
Satisfy yourself that, using the previous examples:
• unify(1,2) = 4
• unify(2,3) = 5
• unify(1,3) = fail
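The checkpoint can also be verified in code. The sketch below is an assumption-laden illustration in Python rather than the PC-PATR implementation: feature structures are nested dictionaries, `unify` is a hypothetical non-destructive helper that returns `None` on failure, and `case` is read as a feature inside `agreement`, as it is written in structure 2.

```python
def unify(a, b):
    """Non-destructive unification of nested feature structures.

    Complex values are dicts and simple values are strings.
    Returns the unified structure, or None if any values conflict.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # conflict somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Simple values (or a simple/complex mismatch): they must be equal.
    return a if a == b else None


fs1 = {"agreement": {"number": "singular", "person": "first"}}
fs2 = {"agreement": {"number": "singular", "case": "nominative"}}
fs3 = {"agreement": {"number": "singular", "person": "third"}}
fs4 = {"agreement": {"number": "singular", "person": "first", "case": "nominative"}}
fs5 = {"agreement": {"number": "singular", "person": "third", "case": "nominative"}}

print(unify(fs1, fs2) == fs4)  # True
print(unify(fs2, fs3) == fs5)  # True
print(unify(fs1, fs3))         # None: person first conflicts with person third
```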
Paths
• Portions of a feature structure can be referred to using the path notation.
• A path is a sequence of one or more feature names enclosed in angle brackets (< >). For instance:
(1) <head>
(2) <head deftype>
(3) <head agr num>
• Paths are used to express feature constraints.
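Following a path amounts to walking down a nested structure one feature at a time. A minimal sketch in Python, assuming feature structures are nested dictionaries; `get_path` is a hypothetical helper, and the example structure is the complex-valued noun phrase shown earlier in these slides.

```python
def get_path(fs, path):
    """Follow a path such as "<head agr num>" through a nested dict.

    Returns the value at the end of the path, or None if the path
    does not exist in the structure.
    """
    node = fs
    for feature in path.strip("<>").split():
        if not isinstance(node, dict) or feature not in node:
            return None
        node = node[feature]
    return node


noun_phrase = {
    "cat": "np",
    "head": {"agr": {"num": "sg", "gen": "masc"}, "deftype": "indef"},
}

print(get_path(noun_phrase, "<head deftype>"))  # indef
print(get_path(noun_phrase, "<head agr num>"))  # sg
print(get_path(noun_phrase, "<head>"))          # the whole head substructure
```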
Examples of Constraints
• <head deftype> = indef
• <np head agr> = <vp head agr>
Constraint Equations
• The feature constraints associated with phrase structure rules in PC-PATR consist of a set of unification expressions.
• Each expression has three parts, in this order:
  • a feature path, the first element of which is one of the symbols from the phrase structure rule
  • an equal sign (=)
  • either a simple value, or another feature path that also starts with a symbol from the phrase structure rule
Execution of Equations
• Each equation is interpreted as an instruction to unify the left- and right-hand sides.
• First, each side is "evaluated" before any unification is attempted. If a path does not exist, it is created.
• After successful unification, the two structures are not merely equivalent but identical, so that any change to one also changes the other.
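The difference between "equivalent" and "identical" is easiest to see in code. The following Python sketch is an illustration under several assumptions, not a description of PC-PATR internals: structures are nested dictionaries, `resolve`, `merge_into` and `execute` are hypothetical helpers, an empty dictionary stands for a freshly created (unconstrained) path, and a failed unification may leave partial changes behind.

```python
def resolve(env, path):
    """Return (parent, last_feature) for a path, creating missing structures."""
    names = path.strip("<>").split()
    node = env
    for name in names[:-1]:
        node = node.setdefault(name, {})  # create the path if it is missing
    return node, names[-1]


def merge_into(target, source):
    """Destructively unify `source` into `target`; return False on conflict."""
    for feature, value in source.items():
        if feature not in target:
            target[feature] = value
        elif isinstance(target[feature], dict) and isinstance(value, dict):
            if not merge_into(target[feature], value):
                return False
        elif target[feature] != value:
            return False
    return True


def execute(env, left, right):
    """Execute the equation `left = right` over the structures in `env`."""
    lparent, lname = resolve(env, left)
    rparent, rname = resolve(env, right)
    lval = lparent.setdefault(lname, {})
    rval = rparent.setdefault(rname, {})
    if isinstance(lval, dict) and isinstance(rval, dict):
        if lval is not rval and not merge_into(lval, rval):
            return False
        shared = lval
    elif lval == {}:
        shared = rval
    elif rval == {}:
        shared = lval
    elif lval == rval:
        shared = lval
    else:
        return False
    # Both paths now lead to the very same object, not to two equal copies.
    lparent[lname] = shared
    rparent[rname] = shared
    return True


env = {"np": {"head": {"agr": {"num": "sg"}}}, "vp": {}}
ok = execute(env, "<np head agr>", "<vp head agr>")
env["vp"]["head"]["agr"]["person"] = "third"  # change made through the vp path
print(ok, env["np"]["head"]["agr"])  # True {'num': 'sg', 'person': 'third'}
```

The path `<vp head agr>` did not exist before the equation ran, so it was created; afterwards a change made through the vp path is visible through the np path because both refer to one structure.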
Lexical Entries
• Lexical entries define the basic properties of words.
• Each definition is divided into fields, each of which begins with a standard format marker at the beginning of a line:
  – \w the lexical form of the word
  – \c word category (part of speech)
  – \g word gloss
  – \f additional features of this word
Lexical Entry Examples
\w fox
\c N
\g canine
\f <number> = singular

\w foxes
\c N
\g canine+PL
\f <number> = plural
Corresponding Feature Structures
• When these entries are used by the grammar, they are represented by these feature structures:
[ cat: N
  gloss: canine
  lex: fox
  number: singular ]
[ cat: N
  gloss: canine+PL
  lex: foxes
  number: plural ]
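The mapping from lexical fields to features can be sketched as a small converter. This Python example is illustrative only: `entry_to_fs` and `FIELD_TO_FEATURE` are hypothetical names, the field-to-feature mapping (\w to lex, \c to cat, \g to gloss) is read off the structures above, and only \f features of the form `<path> = value` are handled.

```python
import re

# Mapping read off the feature structures shown above.
FIELD_TO_FEATURE = {"w": "lex", "c": "cat", "g": "gloss"}


def entry_to_fs(entry):
    """Convert one lexical entry (marker-delimited text) to a nested dict."""
    fs = {}
    # Each field is a backslash marker followed by text up to the next marker.
    for marker, text in re.findall(r"\\(\w)\s+([^\\]*)", entry):
        text = text.strip()
        if marker in FIELD_TO_FEATURE:
            fs[FIELD_TO_FEATURE[marker]] = text
        elif marker == "f":
            # Additional features written as "<path> = value".
            for path, value in re.findall(r"<([^>]+)>\s*=\s*(\S+)", text):
                names = path.split()
                node = fs
                for name in names[:-1]:
                    node = node.setdefault(name, {})
                node[names[-1]] = value
    return fs


print(entry_to_fs(r"\w fox \c N \g canine \f <number> = singular"))
print(entry_to_fs(r"\w foxes \c N \g canine+PL \f <number> = plural"))
```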