Deep Learning in Natural Language Processing
Tong Wang
Advisor: Prof. Ping Chen
Computer Science, University of Massachusetts Boston
Outline
• Natural Language Processing
• Deep Learning in NLP
• My Research Projects
• My Path in Computer Science
• My Experience Finding an Internship
What is Natural Language Processing?
• Natural Language Processing is related to the area of human-computer interaction.
• Natural language understanding
• Natural language generation
Natural Language Processing
https://d396qusza40orc.cloudfront.net/nlangp/lectures/intro.pdf http://www.cs.nyu.edu/~petrov/lecture1.pdf
Natural Language Processing
http://www.slideshare.net/BenjaminBengfort/introduction-to-machine-learning-with-scikitlearn
NLP Applications
• Information Extraction
• Named Entity Recognition
• Machine Translation
• Question Answering
• Topic Modeling
• Summarization
Named Entity Recognition
• Classify elements in text into categories such as location, time, person name, and organization.
• Jim worked in Google corp. in 2012
• (Jim)[person] worked in (Google corp.)[organization] in (2012)[time]
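The example above can be reproduced with a toy rule-based tagger. This is a minimal sketch: the `PERSONS` and `ORGS` word lists are hypothetical hand-made gazetteers, not a trained model (real NER systems learn these categories from annotated text).

```python
import re

# Toy gazetteers -- hypothetical hand-made lists, not a trained model.
PERSONS = {"Jim"}
ORGS = {"Google corp."}

def tag_entities(text):
    """Wrap known entities in (entity)[category] notation."""
    for org in ORGS:
        text = text.replace(org, f"({org})[organization]")
    for person in PERSONS:
        text = re.sub(rf"\b{person}\b", f"({person})[person]", text)
    # Treat bare four-digit numbers as time expressions (years).
    text = re.sub(r"\b(\d{4})\b", r"(\1)[time]", text)
    return text

print(tag_entities("Jim worked in Google corp. in 2012"))
# -> (Jim)[person] worked in (Google corp.)[organization] in (2012)[time]
```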
Machine Translation Difficulties
• Words together are more than the sum of their parts.
• Cannot be translated word by word
◦ E.g., fast food, light rain
• Needs a big dictionary with grammar rules in both languages; a large start-up cost
• Requires the computer to understand
Why NLP is hard
• Text is fundamentally not computer-friendly
• Many different ways to represent the same thing
• Order and context are extremely important
• Language is very high-dimensional and sparse, with tons of rare words
◦ B4 (before), IC (I see), cre8 (create)
• Ambiguity
Ambiguity
• “At last, a computer understands you like your mother”
◦ It understands you as well as your mother understands you
◦ It understands (that) you like your mother
◦ It understands you as well as it understands your mother
Deep Learning (Representation learning) in NLP
http://www.iro.umontreal.ca/~memisevr/dlss2015/DLSS2015-NLP-1.pdf
Deep Learning in NLP
• Word-level applications: word embeddings, word2vec
• Sentence/paragraph-level applications: Neural Machine Translation, doc2vec, etc.
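What a word embedding buys can be sketched with toy dense vectors. The 3-dimensional values below are made up for illustration (real word2vec embeddings have hundreds of dimensions learned from a corpus); the point is that cosine similarity between dense vectors reflects word relatedness:

```python
import math

# Toy 3-d embeddings -- made-up values for illustration only.
embeddings = {
    "condo":     [0.9, 0.1, 0.0],
    "apartment": [0.8, 0.2, 0.1],
    "banana":    [0.0, 0.9, 0.4],
}

def cosine(u, v):
    """Cosine similarity: dot product of u and v over their norms."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# Related words get a higher score than unrelated ones.
print(cosine(embeddings["condo"], embeddings["apartment"]))  # ~0.98
print(cosine(embeddings["condo"], embeddings["banana"]))     # ~0.10
```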
Word Representation
• The majority of rule-based and statistical NLP work regarded words as atomic symbols
• In vector space terms, this is a vector with a single 1 and many zeros; it is called a “one-hot” representation
◦ Condo: [0,0,0,0,1,0,0,…0]
◦ Apartment: [0,1,0,0,0,0,0,…0]
• These two vectors are orthogonal, so they express no similarity
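The orthogonality claim is easy to verify: the 1s of two distinct one-hot vectors never line up, so their dot product (the numerator of cosine similarity) is always zero, and “condo” and “apartment” look completely unrelated. A minimal check, using a toy 8-word vocabulary:

```python
# One-hot vectors in a toy 8-word vocabulary.
condo     = [0, 0, 0, 0, 1, 0, 0, 0]
apartment = [0, 1, 0, 0, 0, 0, 0, 0]

# Distinct words have their 1 in different positions,
# so the dot product is always 0.
dot = sum(a * b for a, b in zip(condo, apartment))
print(dot)  # 0
```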
Neural Machine Translation
https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-gpus-part-2/
Text Simplification
• Text simplification (TS) aims to reduce the lexical, grammatical, or structural complexity of text while retaining its semantic meaning
• It can help various groups of people, including children, non-native speakers, and people with cognitive disabilities
Lexical Simplification
• Substitute long and infrequent words with shorter and more frequent words
• Candidate selection
◦ Semantically similar
◦ Syntactically and grammatically correct
◦ The meaning of the sentence remains the same
• Disadvantage: operates only at the word level
LS System
• For each word w in the text:
◦ Check the part-of-speech (POS) tag of w
◦ Retrieve the top 20 most similar words from word2vec
◦ For each candidate c of the 20:
- If c has the same POS as w
- If c is not a different form of w (e.g., a past tense)
- If w is more difficult than c: put c in the sentence, then compute sentence similarity and an n-gram score
- Otherwise continue
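The loop above can be sketched as follows. All four helpers are passed in as assumptions: `most_similar` stands in for a word2vec lookup, `word_frequency` for a corpus frequency table used as the difficulty measure, and the final sentence-similarity / n-gram check is omitted for brevity.

```python
def lexical_simplify(words, most_similar, word_frequency, pos_tag, same_lemma):
    """Replace each word with its most frequent valid candidate, if any."""
    out = []
    for w in words:
        best = w
        for c, _score in most_similar(w, topn=20):
            if pos_tag(c) != pos_tag(w):   # candidate must keep the POS of w
                continue
            if same_lemma(c, w):           # skip inflected forms of w itself
                continue
            if word_frequency(c) > word_frequency(best):
                best = c                   # more frequent = assumed simpler
        out.append(best)
    return out
```

A full system would re-insert each candidate into the sentence and keep it only if sentence similarity and an n-gram language model confirm the sentence still reads well.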
TS using Neural Machine Translation
• Original English and simplified English can be thought of as two different languages.
• TS is then the process of translating English into simplified English.
Steps
• Collect training data
◦ Pairs of sentences: an original sentence and its simplified version
◦ From English Wikipedia and Simple English Wikipedia
• Build an RNN encoder-decoder model
• Evaluate
Use Sentence Similarity to Collect Training Data
From “Siamese Recurrent Architectures for Learning Sentence Similarity”
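In that architecture, the two sentence encodings h1 and h2 (produced by a shared LSTM) are compared with exp(-||h1 - h2||_1), which maps any pair of vectors to a similarity in (0, 1]. A sketch of just the scoring function, with the LSTM encoder omitted:

```python
import math

def manhattan_similarity(h1, h2):
    """exp(-L1 distance): 1.0 for identical encodings, decaying toward 0."""
    l1 = sum(abs(a - b) for a, b in zip(h1, h2))
    return math.exp(-l1)

print(manhattan_similarity([0.2, 0.5], [0.2, 0.5]))  # identical -> 1.0
print(manhattan_similarity([0.2, 0.5], [0.9, 0.1]))  # farther apart -> lower
```

Pairs scoring above a chosen threshold are kept as aligned original/simplified training sentences.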
Other Projects
• Extended topic model for word dependency
• Opinion mining for the chemical spill in West Virginia
◦ http://158.121.178.175/
• Compression and data mining
My Path in Computer Science
• Huazhong Agricultural University, Information and Computing Science, BS, China, 2006 - 2010
• Bioinformatics lab, Huazhong Agricultural University, 2010 - 2010
• Northeastern University, Computer Systems Engineering, MS, 2011 - 2013
• IoMosaic, Software Engineer, 2013 - 2013
• University of Massachusetts Boston, Computer Science, PhD, 2014 - present
Keys to Finding an Internship
• A good resume
• Do a lot of projects
• Networking (very important!)
◦ Go to conferences
◦ Ask for job referrals from professors, friends, alumni, strangers on LinkedIn…
Preparing for the Interview
• Know the company
• Behavioral questions
• Technical questions
◦ Start practicing programming in your favorite language at least one month before the interview. (LeetCode)