Introduction to Text Analytics
October 2, 2013
Dr. Stuart Shulman
Phone No.: +1-413-345-8939
E-mail: [email protected]
The Value Proposition
Our solution helps users easily discover information to:
• streamline business processes
• increase ROI & create new business opportunities
• identify positive and negative trends
• discover unique, rare or unexpected information
How Do These Tools Help Analysts?
What This Means for Analysis
The Core Methods
Coding and Classifying Text Data
Iteration and Re-Use Are Critical Techniques
Measure Everything Starting With Human Agreement
The Core DiscoverText Approach
An Indispensable Role for Humans
Innovation Happens in Groups
“CoderRank” – A Lifetime Accuracy Measurement
Vision Critical Patent Pending – “Enhanced Machine Learning”
Five Essential Tools for Text Analytics
1. Search
2. Filtering on Metadata
3. Human Coding
4. Automated Clustering
5. Machine Classification
A Social Media Use Case
Sifting and Sorting Relevant Data
Great Researchers Demand Transparent Tools
The HMC is a Leading Edge Gnip Customer
Gnip Data Streams and Search Filters
Fair Warning
This part of the presentation contains strong and potentially quite offensive, inappropriate, disturbing, or just completely stupid language.
Studying Media Campaign Effects
Create Custom Machine Classifiers
Yes
No
No
Search is Fundamental for Purposive Sampling
Defined Search Speeds Up Discovery
Tumblr. – “The Wild West of the Internet”
Stupid Stuff People Do & Tweet
redacted
redacted
Are These Tweets Just Social Garbage?
redacted
redacted
Signs of Health Fear Engagement
redacted
redacted
An IdeaScreen Use Case
Concept Testing Data
Raw VoC Data: A Fortune 500 Tech Company
Near Duplicate Clusters Can Be Interesting
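The slides do not say how near-duplicate clusters are computed. As a rough sketch of the general idea, one common approach compares overlapping word sequences ("shingles") with Jaccard similarity and groups items that pass a threshold. The function names, the shingle size `k`, and the `threshold` value below are illustrative assumptions, not DiscoverText settings.

```python
def shingles(text, k=3):
    """Return the set of k-word sequences in a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_near_duplicates(texts, threshold=0.6, k=3):
    """Greedy clustering: each text joins the first cluster whose
    first member is at least `threshold` similar, else starts a new one.
    Returns a list of clusters, each a list of indices into `texts`."""
    sets = [shingles(t, k) for t in texts]
    clusters = []
    for i, s in enumerate(sets):
        for cluster in clusters:
            if jaccard(s, sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


texts = [
    "the new phone is great value for money",
    "the new phone is great value for the money",
    "shipping took far too long",
]
clusters = cluster_near_duplicates(texts)
```

Here the first two responses differ by one word and land in the same cluster, while the third stands alone. A greedy pass like this is order-dependent; it is meant only to show why lightly edited copies of the same text group together.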
Two Naturally Occurring Clusters of Free Text
Wherever Humans Go in Numbers, There Are Clusters
1st Wave of Human Coding Blazes a Trail
A ‘Simple’ Coding Scheme with No Coder Training
Filtering Based on Classifier Scores
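Filtering on classifier scores can be pictured as keeping only the items whose score for a given code clears a cutoff. The sketch below assumes a hypothetical item layout (a dict with a `scores` mapping from code name to a value between 0 and 1) and an arbitrary cutoff; neither comes from the slides.

```python
def filter_by_score(items, label, min_score=0.8):
    """Keep items whose classifier score for `label` is at least `min_score`.
    Items with no score for `label` are treated as scoring 0.0."""
    return [item for item in items if item["scores"].get(label, 0.0) >= min_score]


# Hypothetical scored items; the codes "concern" and "advocacy" are examples.
items = [
    {"id": 1, "text": "worried about battery life", "scores": {"concern": 0.93}},
    {"id": 2, "text": "love the new design", "scores": {"concern": 0.12}},
    {"id": 3, "text": "is it safe to use?", "scores": {"concern": 0.81, "advocacy": 0.05}},
    {"id": 4, "text": "buy it now", "scores": {"advocacy": 0.90}},
]
likely_concerns = filter_by_score(items, "concern")
kept_ids = [item["id"] for item in likely_concerns]
```

Lowering `min_score` trades precision for recall: more items pass the filter, but more of them need a human to check.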
Testing Coder Agreement on a Small Sample
Measuring Inter-Coder Agreement
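The slides do not name the agreement statistic used. Cohen's kappa is one widely used measure for two coders: it compares the observed agreement rate with the agreement expected by chance given each coder's label frequencies. The sketch below is a minimal illustration under that assumption; the function name and sample codes are made up.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labeled the same items in the same order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: sum over codes of the product of each coder's usage rate.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


coder_a = ["yes", "yes", "no", "no"]
coder_b = ["yes", "no", "no", "no"]
kappa = cohens_kappa(coder_a, coder_b)  # 0.5
```

In this sample the coders agree on 3 of 4 items (0.75), chance agreement is 0.5, and kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5.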
Validation of Coders & Codes
Text Analytics Is a Series of Buckets & Datasets
Breaking Down Concerns by Subtype
Breaking Down Advocacy by Pro and Con
A New Vision Critical Front End
The First Preview of the New Release
The New VC Front End for DiscoverText
Coding Items to Train a Classifier
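To make the link between human coding and machine classification concrete, here is a minimal sketch of a multinomial naive Bayes classifier trained on human-coded items. This is a generic textbook technique chosen for illustration; the slides do not describe the algorithm behind DiscoverText's classifiers, and the function names and sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(coded_items):
    """Build word counts per code from (text, code) pairs assigned by human coders."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in coded_items:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the code with the highest log-probability under naive Bayes
    with add-one (Laplace) smoothing. Words never seen in training are ignored."""
    label_counts, word_counts, vocab = model
    total_items = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_items)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical human-coded training items.
coded = [
    ("love this product", "positive"),
    ("great product love it", "positive"),
    ("hate this product", "negative"),
    ("terrible waste hate it", "negative"),
]
model = train(coded)
prediction = classify(model, "love it")
```

Each additional round of human coding adds rows to `coded`, which is one way to picture the iteration between coders and classifier that the deck emphasizes.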
Leverage Item Metadata While Coding or Filtering
Code Items in a List View