Date post: | 11-Feb-2017 |
Upload: | ryan-reede |
Collecting and Analyzing Social Media Data: Data Mining and Sentiment Analysis with Twitter’s Streaming API
Ryan Reede Oct. 2015
We’ve Created a Monster...
We generate unimaginable amounts of data.
In one minute on the web there are...
275,000 Tweets
417,000 Tinder Swipes
2,460,000 Facebook Posts
1,800,000 Likes on 216,000 new Instagram Posts
+ User behavior
What has made this possible?
Faster Computers
Cheaper Storage
Smarter Software Algorithms
(What tweets look like on the server)
How do we get from this...
BETris App: Insight through Social Media
Quick hits:
Our players get extremely frustrated on ‘insane’ mode.
The iAds are too distracting.
Over 200 players screenshotted and shared their high score on Instagram today.
Advanced analysis:
Players are most active on Twitter after 9pm, so let’s sponsor tweets then.
90% of first-time players log in with Twitter accounts, so let’s focus resources there rather than on Facebook integration.
...to info like this?
Data Mining is the aggregation and analysis of massive amounts of information by computers to extract meaning.
What’s so great about Social Media specific data?
Human interaction becomes observable, at massive scale.
It’s the only way to gain access to data on nearly half of the world.
Simpler → More advanced analysis
Identify influencers and communities online.
Measure an individual's influence online.
Recommend content to users.
Data Mining?
Proper analysis of Social Media data can’t be done alone:
Sociology
Computer Science/Machine Learning
Statistics
Neuroscience
Ethnography and more…
Inherent challenges exist:
Social data comes in many different forms
Proprietary/Hard to access
The goal of Data/Computer Science
Making everything measurable numerically
Multidisciplinary Approach
Application Programming Interfaces help developers bring services from one company’s software into their own and vice-versa.
Uber uses the Google Maps API to display maps in their app.
HootSuite uses Twitter’s Post API to allow users to tweet from their app.
APIs for Data Mining:
APIs can offer easy access to valuable data.
Using APIs does not always require a high degree of technical skill.
Facebook & Twitter have free public feed APIs
Gathering Data: APIs
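As a rough sketch of what working with API data can look like, the snippet below parses a small batch of tweets in newline-delimited JSON and counts how many were posted in each hour. It does not connect to any API: `SAMPLE_STREAM`, the two tweets in it, and the `parse_stream` helper are all made up for illustration, and the assumption is that each tweet arrives as one JSON object per line with `created_at`, `text`, and `user.screen_name` fields.

```python
import json
from collections import Counter
from datetime import datetime

# Two invented tweets in a newline-delimited JSON layout (an assumption
# about how the data arrives). The blank line stands in for a keep-alive.
SAMPLE_STREAM = "\n".join([
    json.dumps({"created_at": "Sat Oct 10 21:15:02 +0000 2015",
                "text": "Insane mode is impossible!",
                "user": {"screen_name": "player_one"}}),
    "",
    json.dumps({"created_at": "Sat Oct 10 21:48:40 +0000 2015",
                "text": "New high score!",
                "user": {"screen_name": "player_two"}}),
])

def parse_stream(raw):
    """Yield (screen_name, text, hour_posted) for each tweet line in raw."""
    for line in raw.splitlines():
        if not line.strip():
            continue  # skip blank keep-alive lines
        tweet = json.loads(line)
        posted = datetime.strptime(tweet["created_at"],
                                   "%a %b %d %H:%M:%S %z %Y")
        yield tweet["user"]["screen_name"], tweet["text"], posted.hour

tweets = list(parse_stream(SAMPLE_STREAM))
tweets_per_hour = Counter(hour for _, _, hour in tweets)
print(tweets_per_hour)
```

A per-hour count like this is the kind of raw material behind an insight such as "players are most active after 9pm"; a real pipeline would read from the live API and handle time zones.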
Social Network Analysis and Graph Theory:
Dijkstra’s Algorithm: shortest path between two nodes (individuals) in a graph (social network).
Similar algorithms can also measure:
Tightness of a community in a network
Centrality to a community
Closeness to another individual
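To make the graph ideas concrete, here is a minimal sketch of Dijkstra's algorithm on a tiny friendship graph, plus a closeness score built on top of it. The graph, the names, and the edge weights (lower means a stronger tie) are invented for illustration, and the closeness formula used here is one common definition rather than the only one.

```python
import heapq

# Hypothetical social graph: each edge weight is a "distance" between two
# people (lower = stronger tie). All names and weights are invented.
GRAPH = {
    "ann": {"bob": 1, "cat": 4},
    "bob": {"ann": 1, "cat": 2, "dan": 5},
    "cat": {"ann": 4, "bob": 2, "dan": 1},
    "dan": {"bob": 5, "cat": 1},
}

def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter path was already found
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

def closeness(graph, node):
    """Closeness: number of reachable others / total distance to them."""
    dist = dijkstra(graph, node)
    total = sum(d for other, d in dist.items() if other != node)
    return (len(dist) - 1) / total if total else 0.0

print(dijkstra(GRAPH, "ann"))
print({person: closeness(GRAPH, person) for person in GRAPH})
```

In this toy graph, the shortest route from ann to dan goes through bob and cat rather than over any single direct tie, and bob and cat score higher on closeness than ann or dan.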
Analyzing content can yield other insight
Lexical analysis
Recommendation algorithms
Step 2: Processing the Data
Business decision making will never be the same
Pre-demo Takeaways
Social Media generates petabytes of varying data for mining.
Acquiring this data can be done in various ways, but APIs can make it easy.
Most social networks offer APIs that share some of their data!
If you’re not a developer, tools built on top of these APIs can be valuable.
IBM’s Watson Analytics
Sentiment140 and Topsy
Awareness of basic analysis methods
Small Scale Data Mining: Computing Tweet Sentiment
Built in Python with the Twitter Streaming API
Lexical Analysis compares the words in a tweet against a 3,500-word dictionary to compute a total sentiment score.
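The scoring step described above might look roughly like the sketch below. The `LEXICON` here is a six-word stand-in with invented scores, not the 3,500-word dictionary from the demo, and `tweet_sentiment` is a hypothetical helper name.

```python
import re

# Tiny stand-in lexicon with invented scores; the demo used a dictionary
# of roughly 3,500 words.
LEXICON = {"love": 3, "great": 3, "fun": 2,
           "distracting": -2, "frustrated": -2, "hate": -3}

def tweet_sentiment(text, lexicon=LEXICON):
    """Sum the scores of every known word in the tweet; unknown words count 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)

print(tweet_sentiment("I love this game, so much fun!"))
print(tweet_sentiment("The iAds are too distracting and I hate them"))
```

A positive total suggests a positive tweet and a negative total a negative one. A word-by-word sum like this ignores negation and sarcasm ("not fun" still scores as positive), which is a known limit of simple lexical scoring.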