
FrameNet development for Latvian

Date post: 13-Apr-2017
Upload: normunds-gruzitis
FrameNet development for Latvian Normunds Grūzītis Guntis Bārzdiņš University of Latvia, Institute of Mathematics and Computer Science National information agency LETA 2nd International FrameNet Workshop, Juiz de Fora, Brazil, 8-9 October 2016
Transcript
Page 1: FrameNet development for Latvian

FrameNet development for Latvian

Normunds Grūzītis, Guntis Bārzdiņš

University of Latvia, Institute of Mathematics and Computer Science
National information agency LETA

2nd International FrameNet Workshop, Juiz de Fora, Brazil, 8-9 October 2016

Page 2: FrameNet development for Latvian

Latvian

• Member of the Baltic language group
• Official language of European Union
• Around 2M speakers
• Typically classified as an under-resourced language
  Situation is rapidly improving in several directions of NLP
    o Automatic speech recognition
    o Machine translation
    o Natural language understanding
    o Natural language generation

Page 3: FrameNet development for Latvian

Latvian FrameNet: a pilot

• Application/domain-specific (LETA)
  Facilitates the semi-automatic information extraction process for the media monitoring needs
    o For populating and updating profiles of public persons and organizations
• Covers 26 Berkeley FrameNet frames: Being_born, Being_employed, Change_of_leadership, Earnings_and_losses, Education_teaching, Hiring, Personal_relationship, Residence, Win_prize, etc.
• Nearly 5000 annotated sentences

Page 4: FrameNet development for Latvian

FrameNet ontology: LETA frames

Page 5: FrameNet development for Latvian

FrameNet annotations on top of dependency heads
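
The figure for this slide did not survive in the transcript. As a rough illustration of the idea in the title, the sketch below attaches the frame target and each frame element (FE) to a single dependency head token and recovers the full FE phrase by walking the dependency subtree. The class names, the English example sentence, and its parse are all hypothetical and are not taken from the Latvian FrameNet data or tools.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    idx: int   # 1-based position in the sentence
    form: str  # surface form
    head: int  # index of the syntactic head, 0 for the root


@dataclass
class FrameAnnotation:
    frame: str                                    # frame name, e.g. "Hiring"
    target: int                                   # index of the frame-evoking token
    elements: dict = field(default_factory=dict)  # FE name -> head token index


def subtree(tokens, head_idx):
    """Return the sorted indices of head_idx and all of its transitive dependents."""
    covered = {head_idx}
    changed = True
    while changed:
        changed = False
        for t in tokens:
            if t.head in covered and t.idx not in covered:
                covered.add(t.idx)
                changed = True
    return sorted(covered)


def element_text(tokens, annotation, fe_name):
    """Expand an FE from its head token to the full phrase it governs."""
    by_idx = {t.idx: t for t in tokens}
    span = subtree(tokens, annotation.elements[fe_name])
    return " ".join(by_idx[i].form for i in span)


# Hypothetical sentence and parse, for illustration only.
sentence = [
    Token(1, "The", 2),
    Token(2, "board", 3),
    Token(3, "hired", 0),
    Token(4, "Anna", 3),
    Token(5, "Kalnina", 4),
    Token(6, "as", 7),
    Token(7, "director", 3),
]
hiring = FrameAnnotation(
    frame="Hiring",
    target=3,
    elements={"Employer": 2, "Employee": 4, "Position": 7},
)

for fe in hiring.elements:
    print(fe, "->", element_text(sentence, hiring, fe))
```

Under this assumption the annotator only marks one token per FE, and the phrase boundaries follow from the dependency layer, so a correction to the parse automatically corrects the FE span.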

Page 6: FrameNet development for Latvian

Accuracy of automatic SRL

Parser / Year / Dataset      Frame identification         FE identification
                             Precision  Recall  F1        Precision  Recall  F1
C6.0 / 2014 / LETA           63.5       62.7    63.1      65.9       76.8    70.9
C6.0 / 2014 / BFN 1.3        77.1       53.7    63.3      47.3       47.0    47.1
SEMAFOR / 2014 / BFN 1.3     69.7       54.9    61.4      58.1       38.8    46.5
LTH / 2007 / BFN 1.3         68.9       53.6    60.3      51.6       35.4    42.0
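
The F1 columns above are consistent with the usual harmonic mean of precision and recall. A minimal check, assuming that standard definition (the slide itself does not state the formula), with the numbers copied from the table and helper names chosen for this sketch:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), copied from the table.
reported = {
    "C6.0 / LETA, frame identification": (63.5, 62.7, 63.1),
    "C6.0 / LETA, FE identification": (65.9, 76.8, 70.9),
    "C6.0 / BFN 1.3, frame identification": (77.1, 53.7, 63.3),
    "C6.0 / BFN 1.3, FE identification": (47.3, 47.0, 47.1),
    "SEMAFOR / BFN 1.3, frame identification": (69.7, 54.9, 61.4),
    "SEMAFOR / BFN 1.3, FE identification": (58.1, 38.8, 46.5),
    "LTH / BFN 1.3, frame identification": (68.9, 53.6, 60.3),
    "LTH / BFN 1.3, FE identification": (51.6, 35.4, 42.0),
}


def max_deviation(rows) -> float:
    """Largest absolute gap between recomputed and reported F1."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())


for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed {f1_score(p, r):.2f}, reported {f1}")
```

Every recomputed value agrees with the reported one to within rounding at one decimal place.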

http://c60.ailab.lv

Exhaustive search binary classifier

Used to parse the entire LETA news archive (12M articles)

Page 7: FrameNet development for Latvian

LETA IE and KB population system

Page 8: FrameNet development for Latvian

Scalable Understanding of Multilingual MediA

Discover trends, emerging events, crucial new stories

H2020 grant No. 688139

Page 9: FrameNet development for Latvian

Event-based summarization

Storyline highlights across a set of related articles

Page 10: FrameNet development for Latvian

Multilingual / Cross-lingual apps

Page 11: FrameNet development for Latvian

Full stack of language resources for NLU and NLG [in Latvian]

Page 12: FrameNet development for Latvian

Full stack of language resources for NLU and NLG [in Latvian]

Page 13: FrameNet development for Latvian

Full stack of language resources for NLU and NLG [in Latvian]

Page 14: FrameNet development for Latvian

GF for implementing multilingual frames and constructions

• FrameNet – semantic abstraction
  BFN frames reused across languages
  Representation of valence patterns varies a lot
  FNs as such are semi-formal/computational
• GF – syntactic abstraction
  Grammar formalism and resource grammar library
  Towards a computational implementation of FNs
    o In some aspects; for multilingual NLG
  Unified method to compare valence patterns across FNs

Page 15: FrameNet development for Latvian

Latvian FrameNet++

• Integrated: a part of a multi-layered corpus
• Balanced
  We anticipate that the corpus will represent at least 2000 common verbs, with at least 10 examples for each of the 1000 most common verbs
• Manually verified at all layers
  Instead of adding e.g. the syntactic layer afterwards with an error-prone probabilistic parser
• Computationally oriented
• Accessible (open data)

