10/20/14 Building the Multilingual Web of Data – ISWC tutorial
Integrating NLP with Linked Data and RDF: the NIF format (hands on)
Ciro Baron Neto, Ph.D. student at the University of Leipzig
Overview
• GitHub NLP2RDF web page overview and NIF online demos (Dashboard, Combinator...)
• Examples
  – Example 1: How to annotate a string
    • using the Snowball Stemmer and OpenNLP
  – Example 2:
    • Querying generated NIF data and querying the Brown Corpus
NLP2RDF GitHub Website
• https://github.com/NLP2RDF/
dashboard.nlp2rdf.aksw.org
nlp2rdf.aksw.org
Example 1: Snowball Stemmer Wrapper
Snowball Stemmer Wrapper
• A stemming algorithm removes suffixes from words, reducing related word forms to a common stem.
  – CONNECT
    • CONNECTED
    • CONNECTION
    • CONNECTING
    • CONNECTIONS
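The CONNECT example can be sketched in a few lines of Python. This is a toy suffix stripper for illustration only, not the Snowball algorithm: the suffix list and the minimum stem length are assumptions chosen to cover the five words on the slide.

```python
# Toy suffix stripper: NOT the Snowball algorithm, only an illustration.
# The suffix list is a hypothetical choice, ordered longest first so that
# "IONS" is tried before "ION".
SUFFIXES = ("IONS", "ION", "ING", "ED")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    upper = word.upper()
    for suffix in SUFFIXES:
        if upper.endswith(suffix) and len(upper) - len(suffix) >= 3:
            return upper[: -len(suffix)]
    return upper


words = ["CONNECT", "CONNECTED", "CONNECTION", "CONNECTING", "CONNECTIONS"]
stems = {w: toy_stem(w) for w in words}
for word, stem in stems.items():
    print(f"{word} -> {stem}")
```

All five words reduce to the same stem here. A real stemmer such as Snowball applies a much larger, language-specific rule set.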
Snowball Stemmer Wrapper
1. Open the USB stick folder
2. Go to the “NIF_tutorial_hands_on_jars” folder
3. Open the “instructions.txt” file in a text editor
4. Open a terminal
5. Go to the “jar” folder
Snowball Stemmer Wrapper
• Copy the second command from instructions.txt:
  java -jar snowball.jar -f text -i 'My favorite actress is Natalie Portman.'
  – -f defines the format
  – -i defines the input
• Paste it into the terminal
Snowball Stemmer Wrapper
[Screenshot; callouts: NIF Standard Annotations, NIF Offset]
Snowball Stemmer Wrapper
[Screenshot; callouts: NIF Standard Annotations, Snowball Stem, NIF Offset]
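The "NIF Offset" callout refers to how NIF identifies each annotated substring by its character offsets in the text. The sketch below shows the idea for the tutorial sentence, using the `#char=begin,end` fragment scheme and the `nif:beginIndex`, `nif:endIndex`, and `nif:anchorOf` properties. The base URI `http://example.org/doc1` and the regex tokenizer are assumptions for illustration, so the wrappers' real output may use a different prefix, tokenization, and additional triples.

```python
import re

NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
BASE = "http://example.org/doc1"  # hypothetical document URI, not from the tutorial
TEXT = "My favorite actress is Natalie Portman."


def token_annotations(text: str):
    """Yield (uri, begin, end, anchor) for each word-like token.

    The regex tokenizer is a stand-in for the wrappers' own tokenization.
    """
    for match in re.finditer(r"\w+", text):
        begin, end = match.start(), match.end()
        yield f"{BASE}#char={begin},{end}", begin, end, match.group()


def to_turtle(text: str) -> str:
    """Render the offset annotations as simple Turtle statements."""
    lines = [f"@prefix nif: <{NIF}> ."]
    for uri, begin, end, anchor in token_annotations(text):
        lines.append(
            f'<{uri}> a nif:Word ; nif:beginIndex "{begin}" ; '
            f'nif:endIndex "{end}" ; nif:anchorOf "{anchor}" .'
        )
    return "\n".join(lines)


annotations = list(token_annotations(TEXT))
print(to_turtle(TEXT))
```

The offsets are zero-based with an exclusive end index, so `TEXT[begin:end]` always reproduces the `nif:anchorOf` value.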
OpenNLP Wrapper
• Go back to the terminal and use the first command from instructions.txt:
  java -jar opennlp.jar -f text -i 'My favorite actress is Natalie Portman.' -modelFolder ../model/
• The -modelFolder parameter sets the folder that contains the trained OpenNLP models for tokenization and POS tagging.
• You can add the parameter “--outfile myAnnotatedFile.ttl” to store the triples in a file.
Example 2: Query Brown Corpus
Querying with Twinkle
• Open the “/twinkle/example” folder
• Open the NIF_query_example file in a text editor and copy the query
• Open the “/twinkle” folder and run the command:
  java -jar twinkle.jar
Querying Brown Corpus
Exercise 3: Querying your own NIF annotated string
Querying your own NIF annotated string
1. Annotate your string using one of the wrappers
2. Save your annotated sentence to a file (using “--outfile”)
3. Open Twinkle
4. Query your string using Twinkle
• Query your annotated string:
  – nif:Context
  – nif:Sentence
  – nif:anchorOf
  – nif:oliaCategory
  – nif:oliaLink
… or practice with the Brown Corpus!
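As a starting point for the exercise, the sketch below assembles a SPARQL query over some of the listed terms and prints it, so the text can be pasted into Twinkle. The helper name `build_query` and the variable names are hypothetical. `nif:oliaCategory` is wrapped in OPTIONAL on the assumption that not every wrapper emits it. Loading the data itself (for example the file written with “--outfile”) is left to Twinkle and is not shown.

```python
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"


def build_query(limit: int = 20) -> str:
    """Return a SPARQL query listing annotated strings and, if present,
    their OLiA category. Illustrative only; adapt to the actual data."""
    return f"""PREFIX nif: <{NIF}>
SELECT ?annotation ?anchor ?category
WHERE {{
  ?annotation nif:anchorOf ?anchor .
  OPTIONAL {{ ?annotation nif:oliaCategory ?category . }}
}}
LIMIT {limit}"""


query = build_query()
print(query)
```

Replacing the first triple pattern with `?annotation a nif:Sentence` or `?annotation a nif:Context` restricts the results to sentences or to the whole annotated text.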
Thank you!
http://site.nlp2rdf.org/
NLP2RDF Google+ Community