Post on 07-May-2015
Using Social Sentiment to Track Real Time Public Opinion
Rob Bailey, CEO, DataSift
rob.bailey@datasift.com
Twitter: @RMB
Most Powerful Platform For Real Time Social
1. Ingestion
We ingest petabytes of Social data from 50+ global sources in real time with fully redundant, push-based systems.
2. Cleaning & Normalization
Using proprietary technology, we scrub the data to eliminate noise/spam and normalize it.
3. Enrichment & Filtering
Each Social item gets tagged 40+ different ways (e.g. with Klout score) in <300 milliseconds.
4. Distribution
Enriched, filtered data is sent back out through an enterprise-grade API to 250+ customers.
We do all of this in about half a second.
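The four stages on this slide can be pictured as a chain of generators. The sketch below is only an illustration under assumptions: the Item type, the function names, the whitespace cleaning rule, and the word-count tag are all hypothetical stand-ins, not DataSift's actual code or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator, Tuple


@dataclass
class Item:
    """A single social item moving through the pipeline (hypothetical shape)."""
    source: str
    text: str
    tags: Dict[str, object] = field(default_factory=dict)


def ingest(raw: Iterable[Tuple[str, str]]) -> Iterator[Item]:
    """Stage 1: wrap raw (source, text) pairs as items."""
    for source, text in raw:
        yield Item(source=source, text=text)


def clean(items: Iterable[Item]) -> Iterator[Item]:
    """Stage 2: collapse whitespace and drop items that end up empty.
    This rule is an assumed placeholder for real noise/spam removal."""
    for item in items:
        text = " ".join(item.text.split())
        if not text:
            continue
        item.text = text
        yield item


def enrich(items: Iterable[Item]) -> Iterator[Item]:
    """Stage 3a: attach tags. A word count stands in for real enrichments."""
    for item in items:
        item.tags["word_count"] = len(item.text.split())
        yield item


def filter_items(items: Iterable[Item],
                 predicate: Callable[[Item], bool]) -> Iterator[Item]:
    """Stage 3b: keep only the items the customer's predicate accepts."""
    for item in items:
        if predicate(item):
            yield item


def distribute(items: Iterable[Item], sink: Callable[[Item], None]) -> int:
    """Stage 4: push each surviving item to a sink; return how many were sent."""
    sent = 0
    for item in items:
        sink(item)
        sent += 1
    return sent


def run_pipeline(raw: Iterable[Tuple[str, str]],
                 predicate: Callable[[Item], bool],
                 sink: Callable[[Item], None]) -> int:
    """Compose the four stages in the order the slide lists them."""
    return distribute(filter_items(enrich(clean(ingest(raw))), predicate), sink)
```

Because every stage is a generator, each item flows through all four stages as it arrives rather than waiting for a batch, which is the property a real-time platform needs.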
Sentiment in Social Is Like Bread
Fresh is better.
Freshest Sentiment Analysis
• We filter in real time, with latency < 300 ms
• Set up new filters in minutes
• Currently work with Lexalytics
• We cover English, German, Portuguese, Spanish, French
• 10 more languages coming
Benefits of Real Time Sentiment
• Capture events in real time
• Instantaneous drill-downs on drivers
• Beat the competition (News, Finance)
• Understand your audience faster
CASE STUDY: US Presidential Debates
• Both candidates' sentiment levels dropped when they attacked
• Obama scored big with the bayonet comment
• Delayed drop for Romney with Syrian & sea comment
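One way to spot the kind of sentiment drop described in this case study is to track a rolling mean of per-message sentiment scores and flag sharp falls. The sketch below assumes scores arrive as plain floats; the class name, window size, and threshold are hypothetical choices for illustration, not the method used in the study.

```python
from collections import deque
from typing import Deque, Optional


class RollingSentiment:
    """Rolling mean over the last `window` scores, with a simple drop detector."""

    def __init__(self, window: int, drop_threshold: float) -> None:
        self.scores: Deque[float] = deque(maxlen=window)  # oldest scores fall out
        self.drop_threshold = drop_threshold
        self.previous_mean: Optional[float] = None

    def add(self, score: float) -> bool:
        """Record a score. Return True if the window mean fell by at least
        `drop_threshold` compared with the mean before this score arrived."""
        self.scores.append(score)
        mean = sum(self.scores) / len(self.scores)
        dropped = (
            self.previous_mean is not None
            and self.previous_mean - mean >= self.drop_threshold
        )
        self.previous_mean = mean
        return dropped
```

In use, one tracker per candidate would be fed scores as messages arrive, and a True return marks the moment to drill into what was just said.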
CASE STUDY: Women Hate RIM
Women were better predictors than men about negative market reaction to the news.
CASE STUDY: Facebook IPO
Public sentiment on Twitter was a strong indicator of where the stock price would move next.
Sentiment in Social Is Like Bread
Not everyone can handle it.
Hard To Replicate Technical Infrastructure
We ingest data from over 50 Social sources & 850+ news sources in real time & use redundant systems to compensate for sometimes unreliable partner APIs
Pickle is our massively parallel and scalable cluster of virtual machines that interprets, compiles and evaluates customer-created filters (expressed in CSDL) against every single incoming message
Augmentations covers all of the data services which either perform a real-time data analysis or look up data from 3rd party data sets of pre-computed data.
We use a variety of queuing technologies including RabbitMQ to ensure constant flow of data
Our billing platform manages licenses & billing across hundreds of customers and 50+ content sources in real time, in 5-minute increments
Our data store is based upon a bespoke deployment of Hadoop/HBase and our own low level Virtual Machine.
Currently we have 1 petabyte of storage & are the only company worldwide to offer 2+ years of historical Social data
We deliver real-time data with very low latency (our platform adds under 200 ms) through our HTTP streaming engine called 'Meteory' that is built on Node.js
DataSift Infrastructure Highlights
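The idea of evaluating customer-created filters against every single incoming message can be sketched in a few lines. This is an assumption-heavy illustration: real filters are written in CSDL, whose syntax is not reproduced here, so Python closures stand in for compiled filters, and the helper names (contains, equals, all_of, route) and the message fields are hypothetical.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Filter = Callable[[Message], bool]


def contains(field_name: str, word: str) -> Filter:
    """Match when `word` appears, case-insensitively, in the given field."""
    needle = word.lower()

    def check(message: Message) -> bool:
        return needle in str(message.get(field_name, "")).lower()

    return check


def equals(field_name: str, value: object) -> Filter:
    """Match when the given field equals `value` exactly."""
    return lambda message: message.get(field_name) == value


def all_of(*filters: Filter) -> Filter:
    """Match only when every sub-filter matches (logical AND)."""
    return lambda message: all(f(message) for f in filters)


def route(message: Message, customer_filters: Dict[str, Filter]) -> List[str]:
    """Evaluate every customer's filter against one incoming message and
    return the ids of the customers whose filter matched."""
    return [cid for cid, f in customer_filters.items() if f(message)]
```

Building each filter once and reusing it for every message mirrors the compile-then-evaluate split the slide describes; a production system would spread the `route` step across many workers.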
Why Real Time Is So Important
• Catch problems before they explode
• Set up new searches quickly
• Competitors are watching
• Customers are increasingly impatient
Sentiment in Social Is Like Bread
GET STARTED: hello@datasift.com
GET INFORMED: @RMB