
Lecture 6 Hidden Markov Models

Date post: 12-Jan-2016
Transcript
Page 1: Lecture 6  Hidden Markov Models

Lecture 6 Hidden Markov Models

Topics
Smoothing again:

Readings: Chapters

January 16, 2013

CSCE 771 Natural Language Processing

Page 2: Lecture 6  Hidden Markov Models

– 2 – CSCE 771 Spring 2013

Overview
Last Time

NLTK book http://readwrite.com/2011/03/25/python-is-an-increasingly-popu

Page 3: Lecture 6  Hidden Markov Models

Chomsky on YouTube

http://www.youtube.com/watch?v=8mA4HYTO790

Page 4: Lecture 6  Hidden Markov Models

Python Text Processing with NLTK 2.0 Cookbook

1. Tokenizing Text and WordNet Basics
2. Replacing and Correcting Words
3. Creating Custom Corpora
4. Part-of-Speech Tagging
5. Extracting Chunks
6. Transforming Chunks and Trees
7. Text Classification
8. Distributed Processing and Handling Large Datasets
9. Parsing Specific Data

Page 5: Lecture 6  Hidden Markov Models

Chapter 1. Tokenizing Text and WordNet Basics

In this chapter, we will cover:
Tokenizing text into sentences
Tokenizing sentences into words
Tokenizing sentences using regular expressions
Filtering stopwords in a tokenized sentence
Looking up synsets for a word in WordNet
Looking up lemmas and synonyms in WordNet
Calculating WordNet synset similarity
Discovering word collocations
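
The first four topics above (sentence splitting, word tokenizing, regular-expression tokenizing, stopword filtering) can be sketched with the Python standard library alone. This is not the NLTK API and not the cookbook's code. The splitting rule, the token pattern, and the tiny STOPWORDS set are all assumptions made for illustration. NLTK ships its own tokenizers and a much larger stopword corpus.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "is", "of", "in", "to", "and"}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize_words(sentence):
    """Regex tokenizer: runs of word characters, keeping an internal apostrophe."""
    return re.findall(r"\w+(?:'\w+)?", sentence)

def filter_stopwords(tokens, stopwords=STOPWORDS):
    """Drop tokens whose lowercase form is in the stopword set."""
    return [t for t in tokens if t.lower() not in stopwords]

text = "The cat sat in the hat. Is this a test?"
sentences = split_sentences(text)
tokens = [tokenize_words(s) for s in sentences]
filtered = [filter_stopwords(t) for t in tokens]
```

The naive splitter breaks on abbreviations such as "Dr. Smith", which is one reason a trained sentence tokenizer is usually preferred.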

Page 6: Lecture 6  Hidden Markov Models

Chapter 2. Replacing and Correcting Words

In this chapter, we will cover:
Stemming words
Lemmatizing words with WordNet
Translating text with Babelfish
Replacing words matching regular expressions
Removing repeating characters
Spelling correction with Enchant
Replacing synonyms
Replacing negations with antonyms

Perkins, Jacob (2010-11-09). Python Text Processing with NLTK 2.0 Cookbook (p. 25). Packt Publishing. Kindle Edition.
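
Two of the topics above, replacing words that match regular expressions and removing repeating characters, can be sketched in plain Python. The contraction patterns and the small VOCAB set are hypothetical. VOCAB stands in for a real dictionary lookup, so this shows the idea rather than the cookbook's implementation.

```python
import re

# Hypothetical contraction patterns; a realistic list would be much longer.
REPLACEMENTS = [
    (r"won't", "will not"),
    (r"can't", "cannot"),
    (r"(\w+)n't", r"\1 not"),
    (r"(\w+)'re", r"\1 are"),
]

def expand_contractions(text, patterns=REPLACEMENTS):
    """Apply each (pattern, replacement) pair in order."""
    for pattern, repl in patterns:
        text = re.sub(pattern, repl, text)
    return text

# Hypothetical vocabulary standing in for a dictionary lookup.
VOCAB = {"love", "book", "good"}
REPEAT = re.compile(r"(\w*)(\w)\2(\w*)")

def remove_repeats(word, vocab=VOCAB):
    """Drop one character of a doubled pair per pass, stopping at a known word."""
    while word not in vocab:
        shorter = REPEAT.sub(r"\1\2\3", word)
        if shorter == word:  # no doubled characters left
            break
        word = shorter
    return word

expanded = expand_contractions("I can't go and they won't, but we're fine")
cleaned = remove_repeats("looooove")
```

The vocabulary check is what keeps "book" from being reduced to "bok". Without a lookup of known words, every doubled letter would be collapsed.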

Page 7: Lecture 6  Hidden Markov Models

Chapter 3. Creating Custom Corpora

In this chapter, we will cover:
Setting up a custom corpus
Creating a word list corpus
Creating a part-of-speech tagged word corpus
Creating a chunked phrase corpus
Creating a categorized text corpus
Creating a categorized chunk corpus reader
Lazy corpus loading
Creating a custom corpus view
Creating a MongoDB backed corpus reader
Corpus editing with file locking

Perkins, Jacob (2010-11-09). Python Text Processing with NLTK 2.0 Cookbook (p. 45). Packt Publishing. Kindle Edition.

Page 8: Lecture 6  Hidden Markov Models

Chapter 4. Part-of-Speech Tagging

1. Default tagging
2. Training a unigram part-of-speech tagger
3. Combining taggers with backoff tagging
4. Training and combining Ngram taggers
5. Creating a model of likely word tags
6. Tagging with regular expressions
7. Affix tagging
8. Training a Brill tagger
9. Training the TnT tagger
10. Using WordNet for tagging
11. Tagging proper names
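
The backoff idea in the list above can be sketched without NLTK: try a unigram lookup first, then a regular-expression guess, then a default tag. The training sentences, the suffix patterns, and the default tag "NN" are assumptions chosen for illustration. NLTK provides its own tagger classes that chain in a similar way.

```python
import re
from collections import Counter, defaultdict

def train_unigram(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

# Hypothetical suffix rules used as the second stage.
PATTERNS = [
    (re.compile(r".*ing$"), "VBG"),
    (re.compile(r".*ly$"), "RB"),
    (re.compile(r"^\d+$"), "CD"),
]

def regex_tag(word):
    """Return the tag of the first matching pattern, or None."""
    for pattern, tag in PATTERNS:
        if pattern.match(word):
            return tag
    return None

def tag_words(words, model, default_tag="NN"):
    """Unigram lookup, backing off to the regex rules, then to the default tag."""
    return [(w, model.get(w) or regex_tag(w) or default_tag) for w in words]

# Hypothetical toy training data.
train = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]
model = train_unigram(train)
tagged = tag_words(["the", "cat", "barks", "loudly", "zebra"], model)
```

Here "loudly" is unseen in training and falls through to the regex stage, while "zebra" matches no pattern and receives the default tag.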

Page 9: Lecture 6  Hidden Markov Models

Chapter 5. Extracting Chunks

In this chapter, we will cover:
Chunking and chinking with regular expressions
Merging and splitting chunks with regular expressions
Expanding and removing chunks with regular expressions
Partial parsing with regular expressions
Training a tagger-based chunker
Classification-based chunking
Extracting named entities
Extracting proper noun chunks
Extracting location chunks
Training a named entity chunker

Perkins, Jacob (2010-11-09). Python Text Processing with NLTK 2.0 Cookbook (p. 111). Packt Publishing. Kindle Edition.
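
Chunking with regular expressions means matching a pattern over the sequence of part-of-speech tags rather than over the words. A minimal sketch of that idea in plain Python follows. The noun-phrase pattern and the example sentence are assumptions, and this is not NLTK's chunk parser.

```python
import re

# Assumed noun-phrase pattern: optional determiner, any adjectives, one or more nouns.
NP_PATTERN = re.compile(r"(?:<DT>)?(?:<JJ>)*(?:<NN[A-Z]*>)+")

def chunk_noun_phrases(tagged):
    """Return the word lists of all spans whose tag sequence matches NP_PATTERN."""
    tag_string = ""   # the tag sequence encoded as one string, e.g. "<DT><JJ><NN>"
    start_index = {}  # string offset where a tag starts -> token index
    end_index = {}    # string offset where a tag ends   -> token index + 1
    for i, (_, tag) in enumerate(tagged):
        start_index[len(tag_string)] = i
        tag_string += "<" + tag + ">"
        end_index[len(tag_string)] = i + 1
    chunks = []
    for match in NP_PATTERN.finditer(tag_string):
        first, last = start_index[match.start()], end_index[match.end()]
        chunks.append([word for word, _ in tagged[first:last]])
    return chunks

# Hypothetical pre-tagged sentence.
sentence = [
    ("the", "DT"), ("quick", "JJ"), ("fox", "NN"), ("jumps", "VBZ"),
    ("over", "IN"), ("the", "DT"), ("lazy", "JJ"), ("dog", "NN"),
]
chunks = chunk_noun_phrases(sentence)
```

Because the pattern begins with "<" and ends with ">", every match starts and ends on a tag boundary, which is what makes the offset lookups safe.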

Page 10: Lecture 6  Hidden Markov Models

Chapter 6. Transforming Chunks and Trees

In this chapter, we will cover:
Filtering insignificant words
Correcting verb forms
Swapping verb phrases
Swapping noun cardinals
Swapping infinitive phrases
Singularizing plural nouns
Chaining chunk transformations
Converting a chunk tree to text
Flattening a deep tree
Creating a shallow tree
Converting tree nodes

Perkins, Jacob (2010-11-09). Python Text Processing with NLTK 2.0 Cookbook (p. 143). Packt Publishing. Kindle Edition.

Page 11: Lecture 6  Hidden Markov Models

Chapter 7. Text Classification

In this chapter, we will cover:
Bag of Words feature extraction
Training a naive Bayes classifier
Training a decision tree classifier
Training a maximum entropy classifier
Measuring precision and recall of a classifier
Calculating high information words
Combining classifiers with voting
Classifying with multiple binary classifiers

Perkins, Jacob (2010-11-09). Python Text Processing with NLTK 2.0 Cookbook (p. 167). Packt Publishing. Kindle Edition.
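
Bag-of-words feature extraction and the precision and recall measurements listed above can be illustrated in a few lines of plain Python. The labels and predictions below are made-up data, and the helper names are hypothetical. A real evaluation would use a trained classifier's output on held-out documents.

```python
def bag_of_words(words):
    """Presence features: every word that occurs maps to True."""
    return {word: True for word in words}

def precision_recall(reference, predicted, label):
    """Precision and recall for one label, given parallel lists of labels."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == label and p == label)
    fp = sum(1 for r, p in pairs if r != label and p == label)
    fn = sum(1 for r, p in pairs if r == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

features = bag_of_words(["the", "quick", "brown", "fox"])

# Made-up gold labels and classifier outputs.
reference = ["pos", "pos", "neg", "neg", "pos"]
predicted = ["pos", "neg", "neg", "pos", "pos"]
precision, recall = precision_recall(reference, predicted, "pos")
```

Precision is the fraction of items predicted as the label that truly carry it. Recall is the fraction of items truly carrying the label that were predicted as such.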

Page 12: Lecture 6  Hidden Markov Models

Chapter 8. Distributed Processing and Handling Large Datasets

In this chapter, we will cover:
Distributed tagging with execnet
Distributed chunking with execnet
Parallel list processing with execnet
Storing a frequency distribution in Redis
Storing a conditional frequency distribution in Redis
Storing an ordered dictionary in Redis
Distributed word scoring with Redis and execnet

Perkins, Jacob (2010-11-09). Python Text Processing with NLTK 2.0 Cookbook (p. 201). Packt Publishing. Kindle Edition.

Page 13: Lecture 6  Hidden Markov Models

Chapter 9. Parsing Specific Data

In this chapter, we will cover:
Parsing dates and times with Dateutil
Time zone lookup and conversion
Tagging temporal expressions with Timex
Extracting URLs from HTML with lxml
Cleaning and stripping HTML
Converting HTML entities with BeautifulSoup
Detecting and converting character encodings

Perkins, Jacob (2010-11-09). Python Text Processing with NLTK 2.0 Cookbook (p. 227). Packt Publishing. Kindle Edition.

