Date posted: 08-Jan-2017
Category: Data & Analytics
Uploaded by: margriet-groenendijk
@MargrietGr
Margriet Groenendijk, PhD
Developer Advocate for IBM Cloud Data Services
Open Data Science Conference UK
9 October 2016, London
How To Analyse Weather Data and Twitter Sentiment with Spark and Watson
Analyse Weather Data and Twitter Sentiment using Spark and Watson
People love to talk about the weather on Twitter
What insights can you find when combining the data?
What is the weather sentiment related to?
Bluemix
Where to find the data?
Insights for Twitter
Weather Company Data
API services from IBM Bluemix
https://console.ng.bluemix.net/
Bluemix
Where to store the data?
Available from IBM Bluemix
Cloudant NoSQL DB
Weather Company Data
Watson Tone Analyser
Tweets, Weather, Sentiment
Exploring the options
Bluemix
IBM Bluemix
▪ Free trial etc
▪ lots of services etc
Free 30-day trial
“Big Blue Box” containing all of IBM's services
https://console.ng.bluemix.net/
Add a service in Bluemix
Add a service
Search for weather
Account + spaces
Weather Company Data for IBM Bluemix
Weather Company Data for IBM Bluemix
1 Add service
2 Credentials
3 Ready to use the REST APIs
Your own weather forecast in a Python notebook
Weather Company Data API
Show json weather file
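As a rough illustration of what a REST call to the weather service involves, here is a minimal sketch that only builds the request URL. The host name, the path pattern and the `forecast/daily/10day` product are assumptions made for this example, and `forecast_url` is a hypothetical helper; check the service documentation and your own credentials page for the real values.

```python
from urllib.parse import urlencode

# Assumption: illustrative host name; use the one from your service credentials.
TWC_HOST = "twcservice.mybluemix.net"


def forecast_url(lat, lon, product="forecast/daily/10day", units="m", host=TWC_HOST):
    """Build a geocode-based request URL (credentials are supplied separately)."""
    # Assumption: the path layout below is a guess at the API's URL scheme.
    path = "/api/weather/v1/geocode/{:.2f}/{:.2f}/{}.json".format(lat, lon, product)
    return "https://{}{}?{}".format(host, path, urlencode({"units": units}))


# London, rounded to two decimals
print(forecast_url(51.51, -0.13))
```

The resulting URL could then be fetched with any HTTP client, passing the username and password from the service credentials as basic authentication.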
Your own weather forecast in a Python notebook
London
https://developer.ibm.com/clouddataservices/2016/10/06/your-own-weather-forecast-in-a-python-notebook/
Weather in UK on Friday evening 7 October
Store weather data in Cloudant
Python script run daily on a Bluemix VM service
https://python-cloudant.readthedocs.io
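A daily script of this kind might look roughly like the sketch below. It is not the script used in the talk: `weather_doc`, `store_all` and `connect` are hypothetical helpers, and the document layout (city plus date as `_id`) is an assumption. Only `connect` needs the python-cloudant package, so it imports it lazily.

```python
from datetime import date


def weather_doc(city, observation, day):
    """Wrap one city's observation in a document with a predictable _id."""
    doc_id = "{}:{}".format(city.lower().replace(" ", "-"), day.isoformat())
    return {"_id": doc_id, "city": city, "date": day.isoformat(),
            "observation": observation}


def store_all(db, docs):
    """Save each document; `db` is any object exposing create_document(dict)."""
    return [db.create_document(doc) for doc in docs]


def connect(user, password, url, db_name):
    """Open a database handle with python-cloudant (pip install cloudant)."""
    # Imported here so the helpers above can be used without the package.
    from cloudant.client import Cloudant
    client = Cloudant(user, password, url=url, connect=True)
    return client.create_database(db_name, throw_on_exists=False)


doc = weather_doc("London", {"temp": 14}, date(2016, 10, 7))
print(doc["_id"])
```

Using the city and date as the `_id` is one way to make a rerun on the same day collide with the existing document instead of silently creating a duplicate.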
Python script run daily on a Bluemix VM service
Add crontab job to run daily
Cloudant
▪ cloudant
▪ etc
▪ geospatial index
▪ show map with 100 cities :-)
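A geospatial index works on documents that carry a geometry. As a minimal sketch, assuming each city is stored as a GeoJSON Feature, a document could be built like this; `city_feature` is a hypothetical helper and the property names are illustrative.

```python
def city_feature(name, lat, lon):
    """Return a GeoJSON Feature for a city.

    GeoJSON orders coordinates as [longitude, latitude].
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name},
    }


print(city_feature("London", 51.51, -0.13))
```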
Weather Company Data
Watson Tone Analyser
Tweets, Weather, Sentiment
Insights for Twitter
Insights for Twitter
Insights for Twitter
Only a 100…
dashDB
Add the dashDB service in Bluemix
Add a service
Search for dashDB
Search for tweets
3 Use an existing service
4 Select table
posted:2016-08-01,2016-10-01 followers_count:3000 friends_count:3000 (weather OR sun OR sunny OR rain OR hail OR storm OR rainy OR drought OR flood OR hurricane OR tornado OR cold OR snow OR drizzle OR cloudy OR thunder OR lightning OR wind OR windy OR heatwave)
REST API docs: https://new-console.ng.bluemix.net/docs/services/Twitter/twitter_rest_apis.html#rest_apis
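The search query on this slide follows a simple pattern: a date range, two count filters and a group of keywords joined with OR. A minimal sketch for composing such a string is shown below; `tweet_query` is a hypothetical helper, and the syntax is copied from the slide rather than checked against the full query language.

```python
from urllib.parse import quote


def tweet_query(start, end, keywords, followers=None, friends=None):
    """Compose a query string in the style shown on the slide."""
    parts = ["posted:{},{}".format(start, end)]
    if followers is not None:
        parts.append("followers_count:{}".format(followers))
    if friends is not None:
        parts.append("friends_count:{}".format(friends))
    parts.append("(" + " OR ".join(keywords) + ")")
    return " ".join(parts)


q = tweet_query("2016-08-01", "2016-10-01", ["weather", "rain", "snow"], 3000, 3000)
print(q)
# URL-encode the query before placing it in a request URL
print(quote(q))
```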
Weather Company Data
Watson Tone Analyser
Tweets, Weather, Sentiment
Explore the data
IBM Data Science Experience
Nested data…
Load tweets from dashDB with Spark SQL
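Loading a dashDB table into Spark usually goes through JDBC. The sketch below is one possible way to do it, not the notebook code from the talk: `dashdb_jdbc_options` and `load_table` are hypothetical helpers, and the port, database name and driver class are assumptions to be checked against your own service credentials.

```python
def dashdb_jdbc_options(host, user, password, table, port=50000, database="BLUDB"):
    """Collect the options Spark's JDBC reader needs for a DB2-style database."""
    return {
        # Assumption: DB2 JDBC URL format with the default port and database name.
        "url": "jdbc:db2://{}:{}/{}".format(host, port, database),
        "driver": "com.ibm.db2.jcc.DB2Driver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def load_table(spark, options):
    """Read the table into a Spark DataFrame; `spark` is a SparkSession or SQLContext."""
    return spark.read.format("jdbc").options(**options).load()


# Placeholder values for illustration only
opts = dashdb_jdbc_options("example-host", "user", "secret", "TWEETS")
print(opts["url"])
```

In a notebook, the DataFrame returned by `load_table` could then be registered as a temporary view and queried with Spark SQL.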
Clean data, summarise and load into pandas DataFrame
Add weather to tweets
Weather data is nested, and pyspark.sql struggles with that
There is no location data for the tweets
Only 10% of all tweets are available in the free plan through the Decahose stream
The Weather API only has 24 hours of data available
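One way around the nesting problem is to flatten each weather record into a single-level dictionary before loading it into a DataFrame. The sketch below shows that idea in plain Python; `flatten` is a hypothetical helper and the field names in the example record are made up.

```python
def flatten(record, parent="", sep="_"):
    """Flatten nested dictionaries into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        name = parent + sep + key if parent else key
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the key prefix along
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical record; real field names depend on the API response
nested = {"city": "London", "metric": {"temp": 14, "wspd": 11}}
print(flatten(nested))
```

Lists are left untouched here, so a record containing arrays would still need its own handling.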
Weather Company Data
Watson Tone Analyser
Tweets, Weather, Sentiment X
Weather Company Data
crontab -e
0 23 * * * /path/to/file/do_something.sh
python do_something.py
Tweets, Weather, Sentiment
Watson Tone Analyser
Add sentiment - example
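Once a tone analysis response is available, adding sentiment to a tweet comes down to picking a score out of the returned JSON. The sketch below assumes a response with a `document_tone` object holding `tone_categories`, each with a list of scored `tones`; that shape is an assumption for illustration and `top_tone` is a hypothetical helper, so compare it with a real response before relying on it.

```python
def top_tone(response, category_id="emotion_tone"):
    """Return (tone_id, score) of the highest-scoring tone in one category."""
    categories = response.get("document_tone", {}).get("tone_categories", [])
    for category in categories:
        if category.get("category_id") == category_id:
            tones = category.get("tones", [])
            if tones:
                best = max(tones, key=lambda tone: tone["score"])
                return best["tone_id"], best["score"]
    # No matching category, or the category had no tones
    return None


# Made-up response in the assumed shape
sample = {"document_tone": {"tone_categories": [
    {"category_id": "emotion_tone", "tones": [
        {"tone_id": "joy", "score": 0.7},
        {"tone_id": "sadness", "score": 0.2},
    ]},
]}}
print(top_tone(sample))
```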
#Matthew
#matthew tweets
3 Use an existing service
4 Select table
posted:2016-08-26,2016-10-06 followers_count:1000 friends_count:1000 (matthew OR hurricane matthew OR hurricane)
REST API docs: https://new-console.ng.bluemix.net/docs/services/Twitter/twitter_rest_apis.html#rest_apis
Some lessons learned
APIs are great!
Can extend and build on this, as all data is in the Cloud
Weather data is only available for 24 hours: great for weather apps, but harder to combine with historical tweets, so a daily script is needed
Now ready to build a more efficient workflow that can easily handle millions of tweets
Start a more in-depth analysis in the Data Science Experience
▪ analyse data!
▪ pretty plots
https://github.com/ibm-cds-labs/pixiedust
Margriet Groenendijk, PhD
Developer Advocate for IBM Cloud Data Services
https://developer.ibm.com/clouddataservices/author/mgroenen/
Thank you!
Slides will be available on http://www.slideshare.net/MargrietGroenendijk