IN4080 Natural Language Processing · Can use various learners, e.g. Logistic regression (MaxEnt)...

Date post: 04-Aug-2020
Page 1: IN4080 Natural Language Processing · Can use various learners, e.g. Logistic regression (MaxEnt) ... Semi-supervised classifiers via distant supervision 5. Unsupervised. 3. Semisupervised,


Jan Tore Lønning



Lecture 8, 8 Oct

Information extraction, pipelines




Sentence structure:

Constituents and phrases


Information extraction, IE


Named entity recognition

Relation extraction, 5 different ways




Sentences have inner structure

So far:

Sentence: a sequence of words

Properties of words: morphology, tags, embeddings

Probabilities of sequences

But:

Sentences have inner structure

The structure determines whether the sentence is grammatical or not

The structure determines how to understand the sentence



Why syntax?

Some sequences of words are well-formed, meaningful sentences.

Others are not:

Are meaningful of some sentences sequences well-formed words

It makes a difference:

A dog bit the man.

The man bit a dog.

BOW-models don't capture this
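
A minimal sketch of why a bag-of-words model cannot tell the two sentences apart. The tokenizer (lowercasing, letters only) is an assumption made for this illustration.

```python
from collections import Counter
import re

def bag_of_words(sentence):
    """Count lowercased word tokens, ignoring word order and punctuation."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))

s1 = "A dog bit the man."
s2 = "The man bit a dog."

# Both sentences contain exactly the same words, so their bags are identical.
same_bag = bag_of_words(s1) == bag_of_words(s2)
print(same_bag)
```

The two bags are equal although the sentences describe different events, so any model that only sees the bag must treat them alike.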




Two ways to describe sentence structure

Phrase structure (focus of INF2820)

Dependency structure (focus of IN2110)


Constituents and phrases

Constituent: A group of words which functions as a unit in the sentence

See Wikipedia: Constituent for criteria of constituency

Phrase: A sequence of words which "belong together"

= constituent (for us)

In some theories a phrase is a constituent of more than one word

Examples of NPs:

The small, cute dog

The dog from Baskerville

the apple

the apple that Kim had stolen from the store





Phrases can be classified into categories:

Noun Phrases, Verb Phrases, Prepositional Phrases, etc.

Phrases of the same category have similar distribution,

e.g. NPs can replace names

(but there are restrictions on case, number, person, gender agreement, etc.)

Phrases of the same category have similar structure, simplified:

NP (roughly): (DET) ADJ* N PP* (+ some alternatives, e.g. pronoun)




Phrase structure

A sentence is hierarchically ordered into phrases

Various syntactic theories, models and NLP tools differ with respect to the actual trees:

Models based on X-bar theory prefer "deep trees": binary branching

Penn treebank prefers shallow trees




A Penn treebank tree



A collection of analyzed sentences/trees

Penn treebank is best known





Treebanks are corpora in which each sentence has been paired with a parse tree (presumably the right one).

These are generally created by first parsing the collection with an automatic parser and then having human annotators correct each parse as necessary.

This requires detailed annotation guidelines that provide a POS tagset, a grammar and instructions for how to deal with particular grammatical constructions.


Treebank Grammars

Treebanks implicitly define a grammar for the language covered in the treebank.

Such grammars tend to be very flat because they tend to avoid recursion, which eases the annotators' burden.

For example, the Penn Treebank has 4500 different rules for VPs. Among them...

Speech and Language Processing - Jurafsky and Martin
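
A minimal sketch of how a treebank implicitly defines a grammar: read bracketed trees and count the phrase-level rules they use. The two trees are invented toy examples, not Penn Treebank data, and the small reader below only handles simple bracketings.

```python
from collections import Counter
import re

def parse(tokens):
    """Parse one bracketed constituent from a token list into (label, children)."""
    tokens.pop(0)                      # consume "("
    label = tokens.pop(0)
    children = []
    while tokens[0] != ")":
        if tokens[0] == "(":
            children.append(parse(tokens))
        else:
            children.append(tokens.pop(0))  # a word (leaf)
    tokens.pop(0)                      # consume ")"
    return (label, children)

def count_rules(tree, counts):
    """Count phrase-level rules; lexical rules such as NN -> dog are skipped."""
    label, children = tree
    if all(isinstance(child, tuple) for child in children):
        counts[label + " -> " + " ".join(child[0] for child in children)] += 1
        for child in children:
            count_rules(child, counts)

# Toy "treebank" (assumption: invented for this illustration).
treebank = [
    "(S (NP (DT the) (NN dog)) (VP (VBD bit) (NP (DT the) (NN man))))",
    "(S (NP (DT a) (NN man)) (VP (VBD slept)))",
]

rule_counts = Counter()
for line in treebank:
    tokens = re.findall(r"\(|\)|[^\s()]+", line)
    count_rules(parse(tokens), rule_counts)

for rule, n in rule_counts.most_common():
    print(n, rule)
```

Every distinct right-hand side becomes its own rule, which is why flat annotation leads to a very large number of rules for a category such as VP.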



Different types of treebanks


Human annotators assign trees.

The trees define a grammar:

Many rules

Penn uses flat trees

Parse bank

Start with a grammar

And a parser

Parse the sentences

A human annotator selects the best analysis among the candidates

May be used for training a parse ranker

October 7, 2019




Free dependency treebanks are available for many languages

The place to start these days: http://universaldependencies.org/


One word per line, with a number of columns for various kinds of information

CoNLL-X, CoNLL-U: different POS tags


From Andrei's INF5830 slides
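
A minimal sketch of reading the one-word-per-line format, assuming the ten tab-separated CoNLL-U columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The sample sentence is hand-written for illustration, and the reader ignores special cases such as multiword token ranges.

```python
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]

# Hand-written sample (assumption: not taken from an actual treebank).
sample = (
    "# text = The dog barked.\n"
    "1\tThe\tthe\tDET\tDT\t_\t2\tdet\t_\t_\n"
    "2\tdog\tdog\tNOUN\tNN\t_\t3\tnsubj\t_\t_\n"
    "3\tbarked\tbark\tVERB\tVBD\t_\t0\troot\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t.\t_\t3\tpunct\t_\t_\n"
)

def read_conllu(text):
    """Return a list of sentences; each sentence is a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if line.startswith("#"):        # comment line, e.g. the raw text
            continue
        if not line.strip():            # blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(dict(zip(FIELDS, line.split("\t"))))
    if current:
        sentences.append(current)
    return sentences

sentences = read_conllu(sample)
# Each token points to its head by ID; the root has head 0.
heads = [(tok["form"], tok["deprel"], tok["head"]) for tok in sentences[0]]
print(heads)
```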


IE basics

Bottom-Up approach

Start with unrestricted texts, and do the best you can

The approach was in particular developed by the Message Understanding Conferences (MUC) in the 1990s

Select a particular domain and task


Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. (Wikipedia)



(Some approaches do these steps in a different order, or simultaneously.)

From NLTK


Some example systems

Stanford CoreNLP: http://corenlp.run/

SpaCy (Python): https://spacy.io/docs/api/

OpenNLP (Java): https://opennlp.apache.org/docs/

GATE (Java): https://gate.ac.uk/

UDPipe: http://ufal.mff.cuni.cz/udpipe

Online demo: http://lindat.mff.cuni.cz/services/udpipe/



Dependency parsing:



Treebanks and pipelines

Information extraction, IE


Named entity recognition

Relation extraction, 5 different ways



Next steps

Chunk words together into phrases




Exactly what is an NP-chunk?

It is an NP

But not all NPs are chunks

Flat structure: no NP-chunk is part of another NP chunk

Maximally large

Opposing restrictions


[ The/DT market/NN ] for/IN

[ system-management/NN software/NN ] for/IN

[ Digital/NNP ]

[ 's/POS hardware/NN ] is/VBZ fragmented/JJ enough/RB that/IN

[ a/DT giant/NN ] such/JJ as/IN

[ Computer/NNP Associates/NNPS ] should/MD do/VB well/RB there/RB ./.


Regular Expression Chunker

Input POS-tagged sentences

Use a regular expression over POS to identify NP-chunks

NLTK example:

It inserts parentheses


grammar = r"""
  NP: {<DT|PP\$>?<JJ>*<NN>}
      {<NNP>+}
"""



B-NP: First word in NP

I-NP: Part of NP, not first word

O: Not part of NP (phrase)


One tag per token


Does not insert anything in the text itself
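
A minimal sketch of the IOB representation, using the beginning of the bracketed example sentence above. The input format (lists for NP chunks, plain tuples for tokens outside chunks) is an assumption made for this illustration.

```python
def chunks_to_iob(chunked):
    """Turn a mix of NP chunks (lists) and single tokens (tuples) into IOB triples."""
    tagged = []
    for item in chunked:
        if isinstance(item, list):      # an NP chunk
            for i, (word, pos) in enumerate(item):
                tagged.append((word, pos, "B-NP" if i == 0 else "I-NP"))
        else:                           # a token outside any chunk
            word, pos = item
            tagged.append((word, pos, "O"))
    return tagged

chunked = [
    [("The", "DT"), ("market", "NN")],
    ("for", "IN"),
    [("system-management", "NN"), ("software", "NN")],
    ("for", "IN"),
    [("Digital", "NNP")],
    [("'s", "POS"), ("hardware", "NN")],
    ("is", "VBZ"),
    ("fragmented", "JJ"),
]

iob = chunks_to_iob(chunked)
for word, pos, tag in iob:
    print(word, pos, tag)
```

Note that "Digital" and "'s hardware" are adjacent chunks: the B-NP tag on "'s" is what keeps them apart, which brackets would otherwise have to do.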



Assigning IOB-tags

The process can be considered a form of tagging

POS-tagging: Word to POS-tag

IOB-tagging: POS-tag to IOB-tag

But one may also use additional features, e.g. words

Can use various types of classifiers

NLTK uses a MaxEnt Classifier (=LogReg, but the implementation is slow)

We can modify along the lines of mandatory assignment 2, using scikit-learn




J&M, 3. ed.


Evaluating (IOB-)chunkers

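import nltk

from nltk.corpus import conll2000

cp = nltk.RegexpParser("")  # empty grammar: finds no chunks

test_sents = conll2000.chunked_sents('test.txt', chunk_types=['NP'])

print(cp.evaluate(test_sents))  # newer NLTK versions: cp.accuracy(test_sents)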

IOB Accuracy: 43.4%

Precision: 0.0%

Recall: 0.0%

F-Measure: 0.0%

What do we evaluate?

IOB-tags? or

Whole chunks?

Yields different results

For IOB-tags:

Baseline: majority class O, yields > 33%

Whole chunks:

Which chunks did we find?


Lower numbers
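
A minimal sketch of the two ways of scoring, for a single sentence: accuracy over IOB-tags versus precision, recall and F-measure over whole chunks. The gold and predicted tag sequences are invented, and stray I-NP tags without a preceding B-NP are simply ignored here.

```python
def chunk_spans(tags):
    """Return the set of (start, end) spans of NP chunks in an IOB tag sequence."""
    spans, start = set(), None
    for i, tag in enumerate(tags + ["O"]):   # sentinel closes a final chunk
        if start is not None and tag != "I-NP":
            spans.add((start, i))
            start = None
        if tag == "B-NP":
            start = i
    return spans

def evaluate(gold, predicted):
    """Return (IOB accuracy, chunk precision, chunk recall, chunk F-measure)."""
    correct_tags = sum(g == p for g, p in zip(gold, predicted))
    gold_spans, pred_spans = chunk_spans(gold), chunk_spans(predicted)
    hits = len(gold_spans & pred_spans)      # a chunk counts only if both ends match
    precision = hits / len(pred_spans) if pred_spans else 0.0
    recall = hits / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return correct_tags / len(gold), precision, recall, f1

gold = ["B-NP", "I-NP", "O", "B-NP", "I-NP", "O", "O"]
all_o = ["O"] * len(gold)                                     # a chunker that finds nothing
partial = ["B-NP", "I-NP", "O", "B-NP", "O", "O", "O"]        # second chunk cut short

print(evaluate(gold, all_o))
print(evaluate(gold, partial))
```

The chunker that finds nothing still gets a non-zero tag accuracy but zero on all chunk-level measures, and the partly correct chunker scores much higher on tags than on chunks.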



>> cp = nltk.RegexpParser(

r"NP: {<[CDJNP].*>+}")

IOB Accuracy: 87.7%

Precision: 70.6%

Recall: 67.8%

F-Measure: 69.2%



Named entities

Named entity:

Anything you can refer to by a proper name

i.e. not all NP (chunks):

high fuel prices

May be a longer NP than just a chunk:

Bank of America

Find the phrases

Classify them

Citing high fuel prices, [ORG United Airlines] said [TIME Friday] it has increased fares by [MONEY $6] per round trip on flights to some cities also served by lower-cost carriers. [ORG American Airlines], a unit of [ORG AMR Corp.], immediately matched the move, spokesman [PER Tim Wagner] said. [ORG United], a unit of [ORG UAL Corp.], said the increase took effect [TIME Thursday] and applies to most routes where it competes against discount carriers, such as [LOC Chicago] to [LOC Dallas] and [LOC Denver] to [LOC San Francisco].


Types of NE

The set of types varies between different systems

Which classes are useful depends on the application




Useful: List of names,


Gazetteer: list of geographical names

But does not remove all


cf. example



Representation (IOB)


Feature-based NER

Similar to tagging and chunking

You will need features from several layers

Features may include

Words, POS-tags, chunk-tags, graphical properties

and more (See J&M, 3.ed)
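
A minimal sketch of feature extraction for one token, combining the word, its POS-tag, its neighbours and a simple word-shape feature. The feature names and the exact shape mapping are assumptions for illustration; a dictionary like this could be passed to, for example, scikit-learn's DictVectorizer before training a classifier.

```python
import re

def word_shape(word):
    """Map uppercase letters to X, lowercase letters to x and digits to d."""
    shape = re.sub(r"[A-Z]", "X", word)
    shape = re.sub(r"[a-z]", "x", shape)   # the inserted X is uppercase, so it is kept
    return re.sub(r"[0-9]", "d", shape)

def token_features(tokens, pos_tags, i):
    """Features for token i drawn from the word, tag and context layers."""
    word = tokens[i]
    return {
        "word": word.lower(),
        "pos": pos_tags[i],
        "shape": word_shape(word),
        "is_capitalized": word[0].isupper(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }

tokens = ["spokesman", "Tim", "Wagner", "said"]
pos_tags = ["NN", "NNP", "NNP", "VBD"]

features = token_features(tokens, pos_tags, 1)
print(features)
```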



Feature-based NER algorithms

Greedy decoding

"Word by word": decide for the first word, then for the second word, etc.

Can use various learners, e.g. Logistic regression (MaxEnt)

We can use our set-up for mandatory 2 with smaller adjustments

For shortcomings and better alternatives, cf. J&M, 3. ed., ch. 8:

Maximum Entropy Markov Models (MEMM)

Conditional random fields (CRF; the preferred approach until recently)


Neural NER

In recent years, neural architectures have shown the best results

J&M, 3. ed., ch. 17, sec. 17.1.3, not curriculum in IN4080





Have we found the correct named entities?

Evaluate precision and recall as for chunking

For the correctly identified named entities, have we labelled them correctly?




Extract the relations that exist between the (named) entities in the text


A fixed set of relations (normally)

Determined by application:


Preventing terrorist attacks

Detecting illness from medical records


• Born_in

• Date_of_birth

• Parent_of

• Author_of

• Winner_of

• Part_of

• Located_in

• Acquire

• Threaten

• Has_symptom

• Has_illness


Methods for relation extraction

1. Hand-written patterns

2. Machine Learning (Supervised classifiers)

3. Semi-supervised classifiers via bootstrapping

4. Semi-supervised classifiers via distant supervision

5. Unsupervised


1. Hand-written patterns

Example: acquisitions

[ORG] … (buy(s)|acquire(s|d)) … [ORG]

Hand-write patterns like this


High precision

Will only cover a small set of patterns


Low recall

Time-consuming

(Also in NLTK, sec 7.6)
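
A minimal sketch of the acquisition pattern as a regular expression, assuming the text has already been run through NER and the entities are marked inline as [ORG ...]. That inline markup and the function name are assumptions for illustration.

```python
import re

# [ORG X] ... buy(s) | acquire(s|d) ... [ORG Y], with no other entity in between.
PATTERN = re.compile(
    r"\[ORG (?P<buyer>[^\]]+)\]"
    r"[^\[\]]*?\b(?:buys?|acquire[sd])\b[^\[\]]*?"
    r"\[ORG (?P<target>[^\]]+)\]"
)

def extract_acquisitions(text):
    """Return (buyer, target) pairs matched by the hand-written pattern."""
    return [(m.group("buyer"), m.group("target")) for m in PATTERN.finditer(text)]

s1 = ("[ORG IBM] has acquired computing services provider "
      "[ORG AlchemyAPI] to broaden its portfolio.")
s2 = "[ORG IBM] is buying machine-learning systems maker [ORG AlchemyAPI]."

print(extract_acquisitions(s1))
print(extract_acquisitions(s2))
```

The second sentence expresses the same relation but uses "buying", which the pattern does not list, so nothing is returned: a small instance of high precision paired with low recall.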



2. Supervised classifiers

A corpus

A fixed set of entities and relations

The sentences in the corpus are hand-annotated:


Relations between them

Split the corpus into parts for training and testing

Train a classifier:

Choose learner: Naive Bayes, Logistic regression (MaxEnt), SVM, …

Select features


2. Supervised classifiers, contd.


Use pairs of entities within the same sentence with no relation between them as negative data


1. Find the named entities

2. For each pair of named entities, determine whether there is a relation between them

3. If there is, label the relation
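
A minimal sketch of how training examples could be generated from one annotated sentence: every pair of entities becomes a candidate, and pairs without an annotated relation serve as negative data. The gold annotation and the NO_RELATION label are assumptions for illustration, and only pairs in text order are considered.

```python
from itertools import combinations

def candidate_pairs(entities, gold_relations):
    """entities: list of (text, type); gold_relations: {(arg1, arg2): label}."""
    examples = []
    for (e1, t1), (e2, t2) in combinations(entities, 2):
        # Pairs that are not annotated with a relation become negative examples.
        label = gold_relations.get((e1, e2), "NO_RELATION")
        examples.append(((e1, t1), (e2, t2), label))
    return examples

# Entities from the example sentence; the relation annotation is hypothetical.
entities = [("American Airlines", "ORG"), ("AMR Corp.", "ORG"), ("Tim Wagner", "PER")]
gold = {("American Airlines", "AMR Corp."): "Part_of"}

examples = candidate_pairs(entities, gold)
for example in examples:
    print(example)
```

Each example would then be turned into features and passed to the chosen learner, first to decide whether there is a relation at all and then to label it.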


Examples of features

American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said



The bottleneck is the availability of training data

Hand-labelling data is time-consuming

Mostly applied to restricted domains

Does not generalize well to other domains


3. Semi-supervised, bootstrapping

If we know a pattern for a relation, we can determine whether a pair stands in the relation

Conversely: If we know that a pair stands in a relationship, we can find patterns that describe the relation



IBM – AlchemyAPI

Google – YouTube

Facebook - WhatsApp








Search for sentences containing IBM and AlchemyAPI

Results (web search, Google, among the first 10 results):

IBM's Watson makes intelligent acquisition of Denver-based AlchemyAPI (Denver Post)

IBM is buying machine-learning systems maker AlchemyAPI Inc. to bolster its Watson technology as competition heats up in the data analytics and artificial intelligence fields. (Bloomberg)

IBM has acquired computing services provider AlchemyAPI to broaden its portfolio of Watson-branded cognitive computing services. (ComputerWorld)



From the extracted sentences, we extract patterns

Use these patterns to extract more pairs of entities that stand in these patterns

These pairs may again be used for extracting more patterns, and so on


…makes intelligent acquisition …

… is buying …

… has acquired …
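
A minimal sketch of the bootstrapping loop on a toy corpus. The corpus sentences are invented (the first is shortened from the example above), a "pattern" is crudely taken to be the two words following the first entity, and no evaluation of new patterns is done, so this is only an illustration of the iteration itself.

```python
import re

def find_patterns(corpus, pairs):
    """Collect the two words following X in sentences that mention a known pair X ... Y."""
    patterns = set()
    for sentence in corpus:
        for x, y in pairs:
            m = re.search(re.escape(x) + r" (\w+ \w+) .*" + re.escape(y), sentence)
            if m:
                patterns.add(m.group(1))
    return patterns

def find_pairs(corpus, patterns):
    """Find (X, Y) in 'X <pattern> [lowercase words] Y', where Y is capitalized."""
    pairs = set()
    for sentence in corpus:
        for p in patterns:
            m = re.search(r"(\w+) " + re.escape(p) + r" (?:[a-z -]+ )?([A-Z]\w+)", sentence)
            if m:
                pairs.add((m.group(1), m.group(2)))
    return pairs

def bootstrap(corpus, seeds, rounds=3):
    """Alternate between learning patterns from pairs and pairs from patterns."""
    pairs, patterns = set(seeds), set()
    for _ in range(rounds):
        patterns |= find_patterns(corpus, pairs)
        new_pairs = find_pairs(corpus, patterns) - pairs
        if not new_pairs:
            break
        pairs |= new_pairs
    return pairs, patterns

corpus = [
    "IBM has acquired computing services provider AlchemyAPI.",
    "Google has acquired YouTube.",
    "Google is buying YouTube.",
    "Facebook is buying WhatsApp.",
    "Apple competes with Samsung.",
]

pairs, patterns = bootstrap(corpus, seeds={("IBM", "AlchemyAPI")})
print(sorted(pairs))
print(sorted(patterns))
```

Starting from one seed pair, the first round learns "has acquired" and finds Google and YouTube; that pair then yields "is buying", which in turn finds Facebook and WhatsApp.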



A little more

We could either extract pattern templates and search for these, or extract features for classification and build a classifier

If we use patterns we should generalize:

makes intelligent acquisition → (make(s)|made) JJ* acquisition

During the process we should evaluate before we extend:

Does the new pattern recognize other pairs we know stand in the relation?

Does the new pattern return pairs that are not in the relation? (Precision)


4. Distant supervision for RE


A large external knowledge base, e.g. Wikipedia, WordNet

Large amounts of unlabeled text

Extract tuples that stand in known relation from knowledge base:

Many tuples

Follow the bootstrapping technique on the text
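
A minimal sketch of the labelling step in distant supervision: every sentence that mentions both entities of a known tuple is treated as a training example for that relation. The tiny knowledge base and the sentences are invented for illustration, and entity matching is plain substring search.

```python
def label_sentences(sentences, knowledge_base):
    """Return (sentence, e1, e2, relation) for sentences mentioning a known pair."""
    labelled = []
    for sentence in sentences:
        for (e1, e2), relation in knowledge_base.items():
            if e1 in sentence and e2 in sentence:
                labelled.append((sentence, e1, e2, relation))
    return labelled

# Hypothetical knowledge base of tuples known to stand in the relation.
knowledge_base = {
    ("IBM", "AlchemyAPI"): "Acquire",
    ("Google", "YouTube"): "Acquire",
}

sentences = [
    "IBM has acquired computing services provider AlchemyAPI.",
    "Google bought YouTube in 2006.",
    "IBM and AlchemyAPI both exhibited at the conference.",   # noisy match
    "Facebook is buying WhatsApp.",                           # pair not in the knowledge base
]

labelled = label_sentences(sentences, knowledge_base)
for row in labelled:
    print(row)
```

The third sentence is labelled Acquire although it does not express the relation, which shows the kind of noise this labelling introduces; with many tuples and much text, a classifier trained on the result may still work well.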



4. Distant supervision for RE


Large data sets allow for

fine-grained features

combinations of features



Large knowledge-base



5. Unsupervised relation extraction

Open IE


1. Tag and chunk

2. Find all word sequences satisfying certain syntactic constraints, in particular containing a verb

These are taken to be the relations

3. For each such, find the immediate non-vacuous NP to the left and to the right

4. Assign a confidence score

United has a hub in Chicago, which is the headquarters of United Continental Holdings.

r1: <United, has a hub in, Chicago>

r2: <Chicago, is the headquarters of, United Continental Holdings>



Evaluating relation extraction

Supervised methods can be evaluated on each of the examples in a test set.

For the semi-supervised methods:

we don’t have a test set.

we can evaluate the precision of the returned examples manually

Beware the difference between

Determine for a sentence whether an entity pair in the sentence is in a particular relation

Recall and precision

Determine from a text:

We may use several occurrences of the pair in the text to draw a conclusion

We skip the confidence scoring


More fine-grained IE


Identifying the "actors"


Named-entity recognition

Co-reference resolution

Relation detection

Event detection

Co-reference resolution of events

Temporal extraction

Template filling


So far Possible refinements

