ML-Text Mining Examples in R
Text Mining Demo in R
Previous slides discussed some of the theory behind text mining and NLP.
What does this look like in practice?
In this deck, we’ll take a look at R code that:
• Creates a word cloud out of text data stored in a CSV file
• Extracts keywords from text data using an R NLP library
Word Clouds
• Great for generating a quick visual summary of large amounts of text
• Can be very evocative
• The nature of your text will have a strong impact on the exact effect
Example: Word cloud of restaurant reviews
[Figure: word cloud. Terms shown include systems, learning, analysis, data, cell, control, design, modeling, machine, imaging, energy, and theory.]
Word Cloud Code Demo
See supplemental text file for R code.
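As a rough illustration of the steps behind a word cloud (this is not the code from the supplemental file), the sketch below counts word frequencies in base R and draws the cloud only if the wordcloud package is installed. The sample reviews and the tiny stop-word list are made up for the example. With a real CSV file, the text column would be read with read.csv() instead.

```r
# Hypothetical stand-in for the text column of a CSV file. With a real file,
# something like read.csv("reviews.csv", stringsAsFactors = FALSE)$text
# would take its place (file and column names are assumptions).
docs <- c(
  "The pasta was great and the service was great",
  "Great pizza, slow service",
  "The pizza was cold but the staff were friendly"
)

# A deliberately tiny stop-word list, for illustration only.
stop_words <- c("the", "was", "and", "but", "were", "a", "of")

# Lower-case, replace punctuation and digits with spaces, split on whitespace.
clean  <- gsub("[^a-z ]", " ", tolower(docs))
tokens <- unlist(strsplit(clean, " +"))
tokens <- tokens[nzchar(tokens) & !(tokens %in% stop_words)]

# Count occurrences and sort from most to least frequent.
word_freq <- sort(table(tokens), decreasing = TRUE)
print(head(word_freq, 5))

# Draw the cloud only if the wordcloud package is available.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(
    words = names(word_freq),
    freq = as.integer(word_freq),
    min.freq = 1,          # keep even words that appear once
    random.order = FALSE   # place the most frequent words first
  )
}
```

The frequency table is what drives the picture: more frequent words are drawn larger. Changing the stop-word list or the cleaning step is therefore the main way to change what the cloud emphasises.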
Keyword Extraction
Keyword extraction is still quite a difficult problem. It often relies on NLP, which involves identifying the grammatical role of each word (its part of speech), rather than on text mining, which focuses on statistical patterns.
To carry out keyword extraction, you must first tag each word with its part of speech (e.g. noun, verb).
You can tune the algorithm used to extract keywords by, for example, placing more emphasis on nouns, verbs, or combinations of them.
Data exploration is useful here.
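To make the idea of weighting parts of speech concrete, here is a minimal sketch in base R. It assumes the words have already been tagged. The small hand-tagged table stands in for the output of a real tagger (the udpipe package, for instance, returns a upos column with labels such as NOUN and VERB). The weights and the score_keywords() helper are hypothetical, chosen only for illustration, and are not the method used in the supplemental file.

```r
# Hand-tagged tokens standing in for the output of a POS tagger.
tagged <- data.frame(
  token = c("model", "predict", "protein", "structure", "accurate",
            "model", "learn", "protein", "interaction", "model"),
  upos  = c("NOUN", "VERB", "NOUN", "NOUN", "ADJ",
            "NOUN", "VERB", "NOUN", "NOUN", "NOUN"),
  stringsAsFactors = FALSE
)

# Hypothetical weights: emphasise nouns, down-weight verbs and adjectives.
pos_weight <- c(NOUN = 1.0, VERB = 0.5, ADJ = 0.25)

# Hypothetical helper: score each token by summing the weight of its
# part of speech over every occurrence.
score_keywords <- function(tagged, pos_weight) {
  # Keep only tokens whose tag has a weight assigned.
  kept <- tagged[tagged$upos %in% names(pos_weight), ]
  kept$weight <- unname(pos_weight[kept$upos])
  # Sum the weights per token, giving a plain named numeric vector.
  scores <- vapply(split(kept$weight, kept$token), sum, numeric(1))
  sort(scores, decreasing = TRUE)
}

keywords <- score_keywords(tagged, pos_weight)
print(head(keywords, 3))

# Shifting the emphasis to verbs changes which terms rise to the top.
verb_heavy <- score_keywords(tagged, c(NOUN = 0.1, VERB = 1.0))
print(head(verb_heavy, 3))
```

Comparing the two rankings on your own data is one simple form of the data exploration mentioned above.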
Keyword Extraction Demo
See supplemental text file for R code.