Page 1: AutoEval and Missplel: Two Generic Tools for Automatic Evaluation Johnny Bigert, Linus Ericson, Anton Solis Nada, KTH, Stockholm, Sweden Contact: johnny@kth.se.

AutoEval and Missplel: Two Generic Tools for Automatic Evaluation

Johnny Bigert, Linus Ericson, Anton Solis

Nada, KTH, Stockholm, Sweden
Contact: johnny@kth.se

www.nada.kth.se/theory/humanlang/tools.html

Page 2:

Manual evaluation
- Time-consuming, tedious, error-prone
- Computers are good at repetitive tasks, humans are not
- Unavoidable in some situations

Page 3:

Automatic evaluation
- Cheap, fast, accurate, easily reproducible
- Incorporated in the development of most NLP systems

Page 4:

Automatic evaluation
- AutoEval: simplifies the construction of (NLP system) evaluation
- Missplel: introduces human-like errors into text

Page 5:

AutoEval
- "I write evaluation code myself in all our NLP projects"
- "Why would I need AutoEval?"

Page 6:

AutoEval

Our point exactly

Repetition of:
- Input and output file handling
- XML parsing and XML output
- Error handling, malformed input
- Data storage, management and processing

Page 7:

AutoEval

Features that avoid repetition:
- Handles input (XML/structured plain text) and generates output (XML)
- Handles data storage and processing

...and also:
- Generic and extendible script language
- Efficient

Page 8:

AutoEval

Script language:
- Simple C-like syntax
- Powerful
- Modules and macros in repository files
- Extendible, add your own functions

Page 9:

AutoEval

Example of configuration and script language:

<root>
  <files>
    <file format="plain" type="in" name="datafile">TnT.wt</file>
    <file format="xml" type="out" name="outfile">out.xml</file>
  </files>
  <process>
    field(file("datafile"), "\t", "\n", var("word"), var("tag"));
    inc(cnt("tot"));
    inc(cnt(lookup("tag")));
  </process>
  <processonce>
    outputintcon(out("outfile"), cntmap("global"), "global");
  </processonce>
</root>

Page 10:

AutoEval

The result:

<evaloutput date="Mon May 26 12:37:39 2003">
  <global>
    <var name="tot">14119</var>
    <var name="ab">714</var>
    <var name="ab.kom">44</var>
    <var name="ab.pos">149</var>
    <var name="ab.suv">24</var>
    ...
    <var name="vb.sup.akt">117</var>
    <var name="vb.sup.sfo">35</var>
  </global>
</evaloutput>
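For comparison, the same count (one total plus one counter per tag) can be sketched in plain Python. This is only an illustration of what the script on the previous slide appears to compute, not AutoEval itself; the function names `count_tags` and `to_xml` are made up for this sketch, and the input is assumed to be one tab-separated word/tag pair per line.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def count_tags(lines):
    """Count all tokens under "tot" and each PoS tag under its own name."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        tag = fields[1]
        counts["tot"] += 1
        counts[tag] += 1  # assumes no tag is literally named "tot"
    return counts


def to_xml(counts):
    """Serialise the counters in a shape similar to the slide's output."""
    root = ET.Element("evaloutput")
    global_node = ET.SubElement(root, "global")
    for name, value in counts.items():
        var = ET.SubElement(global_node, "var", name=name)
        var.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = ["en\tdt\n", "bil\tnn\n", "och\tkn\n", "en\tdt\n"]
    print(to_xml(count_tags(sample)))
```

Even this small version needs its own file handling, malformed-line check and XML output, which is the repetition the tool is meant to remove.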

Page 11:

Missplel
- Missplel is a highly configurable tool to introduce human-like spelling errors
- Language, PoS tag set, character set and keyboard layout independent
- All you need is a word/tag/lemma dictionary

Page 12:

Missplel

Performance errors (Damerau):
- Keyboard mistypes (Damerau, 1964): insertion, deletion, substitution, transposition of letters
- Examples: wellcvome, wellcme, wellcpme, wellcmoe
- Result: a new existing or non-existing word, with or without a word class (PoS tag) change
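The four Damerau operations can be sketched as follows. This is a minimal illustration, not Missplel's implementation: it enumerates every single-edit variant over a plain a-z alphabet, whereas the tool is described as keyboard-layout aware. The names `damerau_variants` and `classify` are hypothetical. The slide's four examples are each one edit away from the string "wellcome", which is used as input here.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def damerau_variants(word, alphabet=ALPHABET):
    """Map each operation name to the set of strings one edit away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    return {
        "ins": {a + c + b for a, b in splits for c in alphabet},
        "del": {a + b[1:] for a, b in splits if b},
        "subst": {a + c + b[1:] for a, b in splits if b
                  for c in alphabet if c != b[0]},
        "transp": {a + b[1] + b[0] + b[2:] for a, b in splits
                   if len(b) > 1 and b[0] != b[1]},
    }


def classify(variant, dictionary):
    """Report whether the misspelling happens to be an existing word."""
    return "wordexist" if variant in dictionary else "nowordexist"


if __name__ == "__main__":
    variants = damerau_variants("wellcome")
    for op, example in [("ins", "wellcvome"), ("del", "wellcme"),
                        ("subst", "wellcpme"), ("transp", "wellcmoe")]:
        print(op, example, example in variants[op])
```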

Page 13:

Missplel

Competence errors: split compounds
- May alter the semantics of a sentence
  - Kycklinglever: chicken liver
  - Kyckling lever: chicken is alive
- Settings of split compound elements: Minimum length? Allowed PoS tag? Found in dictionary? Word class change? etc.
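A compound splitter driven by two of the settings above (minimum element length, found in dictionary) might look like the sketch below. It is an assumption-laden illustration: `split_candidates` and its parameters are invented here, and the PoS tag and word class checks mentioned on the slide are left out.

```python
def split_candidates(word, dictionary, min_part_length=3):
    """Return every (first, second) split where both parts are long enough
    and both are found in the dictionary."""
    candidates = []
    for i in range(min_part_length, len(word) - min_part_length + 1):
        first, second = word[:i], word[i:]
        if first in dictionary and second in dictionary:
            candidates.append((first, second))
    return candidates


if __name__ == "__main__":
    words = {"kyckling", "lever"}
    print(split_candidates("kycklinglever", words))
```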

Page 14:

Missplel

Competence errors: sound errors
- Letter level, e.g. sound-alike errors
- Regular expression rules:

  (.+)ei(.+)    @1ie@2    receive -> recieve
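The rule above can be tried out with Python's `re` module. This sketch assumes that `@1` and `@2` refer to the first and second captured group and that the pattern must match the whole word; both are guesses from the slide, and `apply_sound_rule` is a hypothetical helper, not part of Missplel.

```python
import re


def apply_sound_rule(word, pattern, replacement, escape_char="@"):
    """Apply one rule to a word; return the new word or None if no match."""
    # Translate @1, @2, ... into Python's \g<1>, \g<2>, ... group references.
    template = re.sub(re.escape(escape_char) + r"(\d)", r"\\g<\1>", replacement)
    match = re.fullmatch(pattern, word)
    if match is None:
        return None
    return match.expand(template)


if __name__ == "__main__":
    print(apply_sound_rule("receive", r"(.+)ei(.+)", "@1ie@2"))
```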

Page 15:

Missplel

Competence errors: syntax errors
- Word/letter level
- Form new words from PoS tags, missing/doubled words etc.
- Regular expression rules:

<rule ex="slutat skrika - slutat skrikit">
  <match>vb\.sup\.akt(.*) vb\.inf.*</match>
  <to>vb.sup.akt@1 vb.sup.akt</to>
</rule>
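One way to read this rule: match the sequence of PoS tags, rewrite it, then look up a word form for each changed tag. The sketch below follows that reading and should be taken as a guess at the mechanism, not a description of Missplel. The `FORMS` table, the `vb.inf.akt` tag for "skrika", and the function `apply_tag_rule` are all assumptions made for the example.

```python
import re

# Hypothetical inflection table: (lemma, tag) -> word form.
FORMS = {
    ("skrika", "vb.inf.akt"): "skrika",
    ("skrika", "vb.sup.akt"): "skrikit",
}


def apply_tag_rule(tokens, match, to, forms, escape_char="@"):
    """tokens is a list of (word, tag, lemma); return the rewritten list,
    or None if the rule does not apply."""
    tag_string = " ".join(tag for _, tag, _ in tokens)
    m = re.fullmatch(match, tag_string)
    if m is None:
        return None
    # Translate @1 into \g<1> and build the new tag sequence.
    template = re.sub(re.escape(escape_char) + r"(\d)", r"\\g<\1>", to)
    new_tags = m.expand(template).split(" ")
    if len(new_tags) != len(tokens):
        return None
    result = []
    for (word, tag, lemma), new_tag in zip(tokens, new_tags):
        new_word = word if new_tag == tag else forms.get((lemma, new_tag))
        if new_word is None:
            return None  # no known form for this lemma and tag
        result.append((new_word, new_tag, lemma))
    return result


if __name__ == "__main__":
    sentence = [("slutat", "vb.sup.akt", "sluta"),
                ("skrika", "vb.inf.akt", "skrika")]
    print(apply_tag_rule(sentence, r"vb\.sup\.akt(.*) vb\.inf.*",
                         "vb.sup.akt@1 vb.sup.akt", FORMS))
```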

Page 16:

Missplel

Input (word, tag):

Letters    NN2
would      VM0
be         VBI
welcome    AJ0-NN1

Output (word, tag, error description):

Litters    NN2    damerau/wordexist-notagchange
would      VM0    ok
bee        NN1    sound/wordexist-tagchange
welcmoe    ERR    damerau/nowordexist-tagchange

Page 17:

Missplel

<input>
  <filename>TnT.wt</filename>
  <expression>([^\t]+)\t([^\t]+)([^\r\n]*).*</expression>
</input>

<output>
  <filename>output.wte</filename>
  <!-- %1% Word, %2% Tag, %3% Lemma, %4% Rest of line, %5% Error descr -->
  <format>%1% %2% %5%</format>
  <description>
    <noError>ok</noError>
    <existingWord>exist</existingWord>
    <nonExistingWord>noexist</nonExistingWord>
    <wordChange>-wordch</wordChange>
    <noWordChange>-nowordch</noWordChange>
    <tagChange>-tagch</tagChange>
    <noTagChange>-notagch</noTagChange>
  </description>
</output>
...
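To see what the input expression captures and how a format string such as `%1% %2% %5%` could be filled in, here is a small Python sketch. The helpers `parse_line` and `render` are hypothetical, and how Missplel itself maps the three regex groups to the five placeholders is not stated on the slide.

```python
import re

# The input expression from the configuration above.
INPUT_EXPRESSION = re.compile(r"([^\t]+)\t([^\t]+)([^\r\n]*).*")


def parse_line(line):
    """Return (word, tag, rest) for one input line, or None if it does not match."""
    # Strip the line ending first: [^\t]+ would otherwise swallow it into the tag.
    m = INPUT_EXPRESSION.match(line.rstrip("\r\n"))
    return m.groups() if m else None


def render(fmt, fields):
    """Replace each %N% placeholder in fmt by fields[N]."""
    return re.sub(r"%(\d)%", lambda m: fields[int(m.group(1))], fmt)


if __name__ == "__main__":
    word, tag, rest = parse_line("Letters\tNN2\n")
    fields = {1: "Litters", 2: tag, 5: "damerau/wordexist-notagchange"}
    print(render("%1% %2% %5%", fields))
```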

Page 18:

Missplel

...
<options>
  <unknownTag>unknown</unknownTag>
  <unknownLemma>unknownLemma</unknownLemma>
  <escapeChar>@</escapeChar>
  <spaceChar> </spaceChar>
  <wordChar>'</wordChar>
  <sentenceSeparatorTag>mad</sentenceSeparatorTag>
  <maxErrorsInSentence>30</maxErrorsInSentence>
  <configDir>felstava/conf/</configDir>
</options>

<wordlist>
  <create>
    <filename>Swedish.cwtl</filename>
    <expression>.+\t([^\t]+)\t([^\t]+)\t+([^\t]+)</expression>
  </create>
  <wordfile>outfile.gz</wordfile>
  <tagfile>tagfile</tagfile>
</wordlist>
...

Page 19:

Missplel

...
<damerau>
  <reportName>damerau</reportName>
  <active>yes</active>
  <probability>10.0</probability>
  <confusionMatrix>confusionfile</confusionMatrix>
  <subst>1</subst>
  <ins>1</ins>
  <del>1</del>
  <transp>1</transp>
  <allowExistingWords>no</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</damerau>
...
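A plausible reading of these settings is that `probability` is a per-word percentage and that `subst`, `ins`, `del` and `transp` are relative weights for choosing the operation. The slide does not confirm either point, so the sketch below is an assumption throughout; `pick_operation` is a made-up name.

```python
import random


def pick_operation(weights, probability, rng):
    """Return an operation name chosen by relative weight, or None when
    no error should be introduced for this word."""
    # Assumed semantics: probability is a percentage between 0 and 100.
    if rng.random() * 100.0 >= probability:
        return None
    operations = list(weights)
    return rng.choices(operations,
                       weights=[weights[op] for op in operations], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that runs are reproducible
    weights = {"subst": 1, "ins": 1, "del": 1, "transp": 1}
    print([pick_operation(weights, 10.0, rng) for _ in range(20)])
```

Seeding the generator keeps the introduced errors reproducible, which matters when the misspelled text is used as evaluation data.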

Page 20:

Missplel

...
<splitCompound>
  <reportName>split</reportName>
  <active>no</active>
  <probability>99.0</probability>
  <splitUnknownWords>yes</splitUnknownWords>
  <splitThreshold>50</splitThreshold>
  <minWordLength>6</minWordLength>
  <minSplitWordLength>3</minSplitWordLength>
  <factors>
    <wordLength>1</wordLength>
    <inDictionaryFirst>10</inDictionaryFirst>
    <inDictionarySecond>10</inDictionarySecond>
    <tagAllowed>10</tagAllowed>
    <tagMatchFirst>0</tagMatchFirst>
    <tagMatchSecond>15</tagMatchSecond>
  </factors>
</splitCompound>
...

Page 21:

Missplel

...
<soundError>
  <reportName>sound</reportName>
  <active>no</active>
  <filename>sound.test</filename>
  <probability>100.0</probability>
  <expression>(.+)\t(.+)\t(.+)</expression>
  <allowExistingWords>yes</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</soundError>
...

Page 22:

Missplel

...
<syntaxError>
  <reportName>introduced</reportName>
  <active>no</active>
  <filename>error.rules</filename>
  <probability>100.0</probability>
  <allowExistingWords>yes</allowExistingWords>
  <forceAllowWords>no</forceAllowWords>
  <allowTagChange>yes</allowTagChange>
  <forceAllowTag>no</forceAllowTag>
</syntaxError>

Page 23:

Applications
- AutoEval has been used to evaluate
  - Parsers
  - PoS taggers
  - PoS majority/ensemble tagging
- Missplel has been used to evaluate
  - Spell checkers
  - Grammar checkers
  - Robustness of parsers and taggers

Page 24:

Licence
- AutoEval and Missplel are open source under the GNU General Public Licence
- Source code available at www.nada.kth.se/theory/humanlang/tools.html

