
Hack Kid Con - Learn to be a Data Scientist for $1

Date post: 15-Jan-2015
Upload: adrian-cockcroft
Description:
Attempt to inspire some kids to pay attention in Math and Science classes so they can get a good job and help fill the skills gap in the years to come.
LEARN TO BE A DATA SCIENTIST FOR $1 Hack Kid Conference - April 2014 by Adrian Cockcroft Battery Ventures
Transcript
Page 1: Hack Kid Con - Learn to be a Data Scientist for $1

LEARN TO BE A DATA SCIENTIST FOR $1

Hack Kid Conference - April 2014 by Adrian Cockcroft

Battery Ventures

Page 2: Hack Kid Con - Learn to be a Data Scientist for $1
Page 3: Hack Kid Con - Learn to be a Data Scientist for $1
Page 4: Hack Kid Con - Learn to be a Data Scientist for $1
Page 5: Hack Kid Con - Learn to be a Data Scientist for $1

A BIG new problem for a new generation

Page 6: Hack Kid Con - Learn to be a Data Scientist for $1

Now

A BIG new problem for a new generation

Page 7: Hack Kid Con - Learn to be a Data Scientist for $1

Now

A BIG new problem for a new generation

Your future job as a Data Scientist

Page 8: Hack Kid Con - Learn to be a Data Scientist for $1
Page 9: Hack Kid Con - Learn to be a Data Scientist for $1

WHAT DOES A DATA SCIENTIST DO?

Page 10: Hack Kid Con - Learn to be a Data Scientist for $1
Page 11: Hack Kid Con - Learn to be a Data Scientist for $1

The Hive Mind Map shows popular Twitter hashtags from the last 7 days and how they are connected

http://hivemindmap.com/?#

Page 12: Hack Kid Con - Learn to be a Data Scientist for $1

HIVE MIND MAP

A mind-map of what’s happening on Twitter

Thanks to Mark Harwood for these slides and the Hive Mind Map

http://www.infoq.com/presentations/elasticsearch-revealing-uncommonly-common

Page 13: Hack Kid Con - Learn to be a Data Scientist for $1

Connections

The thickness of a line between hashtags is based on the strength of connection

Tip: Strength of connection compares the number of tweets with both tags to the number with at least one of them - see “Jaccard similarity coefficient”
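A minimal sketch of the Jaccard similarity coefficient named in the tip, assuming each tweet is stored as the set of hashtags it contains. The sample tweets and the function name `jaccard` are invented for illustration and are not taken from the Hive Mind Map code.

```python
def jaccard(tweets, tag_a, tag_b):
    """Tweets containing both tags divided by tweets containing either tag."""
    both = sum(1 for tags in tweets if tag_a in tags and tag_b in tags)
    either = sum(1 for tags in tweets if tag_a in tags or tag_b in tags)
    return both / either if either else 0.0

# Hypothetical sample: each tweet is the set of hashtags it contains.
tweets = [
    {"#bigdata", "#cloud"},
    {"#bigdata", "#cloud"},
    {"#bigdata"},
    {"#cloud", "#snow"},
    {"#snow"},
]

# 2 tweets have both tags, 4 tweets have at least one of them.
strength = jaccard(tweets, "#bigdata", "#cloud")
print(strength)  # 0.5
```

A value near 1 would be drawn as a thick line, a value near 0 as a thin one.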

Page 14: Hack Kid Con - Learn to be a Data Scientist for $1

Top tweets

The most popular tweets for a tag are sorted based on the number of “retweets”
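Sorting by retweets can be sketched in a few lines. The tweet records and the field names `text` and `retweets` are made up for this example; the real Twitter feed uses its own format.

```python
# Hypothetical tweets for one hashtag, each with a retweet count.
tweets = [
    {"text": "Kids can learn data science", "retweets": 12},
    {"text": "Big data for $1", "retweets": 40},
    {"text": "Hello world", "retweets": 3},
]

# Highest retweet count first.
top = sorted(tweets, key=lambda tweet: tweet["retweets"], reverse=True)

for tweet in top:
    print(tweet["retweets"], tweet["text"])
```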

Page 15: Hack Kid Con - Learn to be a Data Scientist for $1

When?

The rise and fall of each hashtag’s popularity can be shown over time
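One way to get the rise and fall of a hashtag is to count its tweets per day. This is only a sketch: the (day, hashtag) pairs and the helper name `daily_counts` are invented for illustration.

```python
from collections import Counter

# Hypothetical sample: one (day, hashtag) pair per tweet.
tweets = [
    ("2014-04-01", "#hackkidcon"),
    ("2014-04-01", "#bigdata"),
    ("2014-04-02", "#hackkidcon"),
    ("2014-04-02", "#hackkidcon"),
    ("2014-04-03", "#hackkidcon"),
]

def daily_counts(tweets, tag):
    """Return (day, count) pairs for one hashtag, in date order."""
    counts = Counter(day for day, t in tweets if t == tag)
    return sorted(counts.items())

timeline = daily_counts(tweets, "#hackkidcon")
for day, count in timeline:
    print(day, "*" * count)  # a tiny text bar chart
```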

Page 16: Hack Kid Con - Learn to be a Data Scientist for $1

Calendar summary

Tags that “peak” together are grouped into events on a calendar

Tip: Peaks are detected using standard deviations. Only tags with a single peak are chosen as events
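A rough sketch of peak detection with standard deviations. It assumes a day counts as a peak when its tweet count is more than `k` standard deviations above the average; the threshold `k=2.0`, the function names, and the sample counts are assumptions, not the Hive Mind Map's actual settings.

```python
from statistics import mean, pstdev

def find_peaks(counts, k=2.0):
    """Indexes of days whose count is more than k standard deviations above the mean."""
    average = mean(counts)
    spread = pstdev(counts)
    if spread == 0:  # a perfectly flat series has no peaks
        return []
    return [i for i, c in enumerate(counts) if c > average + k * spread]

def has_single_peak(counts, k=2.0):
    """True when exactly one day stands out, so the tag could be an event."""
    return len(find_peaks(counts, k)) == 1

# Hypothetical daily tweet counts for one hashtag over a week.
daily = [5, 6, 4, 5, 60, 6, 5]
print(find_peaks(daily), has_single_peak(daily))
```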

Tip: Tags that rise and fall in popularity at the same time are detected using Pearson’s Correlation
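Pearson's correlation can be written out by hand as a sketch. The daily counts below are invented; a result close to 1 means the two tags rise and fall together, and a result close to -1 means one rises while the other falls.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:  # a flat series has no defined correlation
        return 0.0
    return covariance / sqrt(var_x * var_y)

# Hypothetical daily counts for two hashtags that peak on the same day.
tag_a = [2, 3, 50, 4, 2]
tag_b = [1, 2, 30, 3, 1]
print(round(pearson(tag_a, tag_b), 3))
```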

Page 17: Hack Kid Con - Learn to be a Data Scientist for $1

What makes this possible?

• Free software (Lucene, Java, Eclipse, Gephi, Tomcat, d3, Google Analytics…)

• Free data (millions of users’ tweets from Twitter’s 1% sample feed)

• “Cloud” computing (rented server)

• Smarter web browsers (visualizations using HTML5’s SVG/Canvas)

• All the friendly folks on the internet (e.g. http://stackoverflow.com/questions/14799842)

• Some imagination…

Page 18: Hack Kid Con - Learn to be a Data Scientist for $1

Opportunities in Data Science

• We are all generating volumes of data never seen before

• You can recycle the behaviors of billions of people into more intelligent systems

• Customer purchases can be used for product recommendations

• User searches can be used for spelling corrections

• Reader clicks can influence the trending news

• Spotify activity is used to make music recommendations

• The tools have never been cheaper

• It has never been easier to find help in developing systems

Page 19: Hack Kid Con - Learn to be a Data Scientist for $1

…one more thing…

I’m writing these slides for you while on my annual snowboarding trip to Canada. Data science pays well ;-)

Wish you were here…

Page 20: Hack Kid Con - Learn to be a Data Scientist for $1

HOW CAN A KID LEARN BIG DATA FOR $1?

Page 21: Hack Kid Con - Learn to be a Data Scientist for $1

BIG DATA IN THE CLOUD WITH AMAZON EMR

https://www.youtube.com/watch?v=S6Ja55n-o0M

Page 22: Hack Kid Con - Learn to be a Data Scientist for $1

LESS THAN $1

After running two of the EMR examples, creating 6 computers in the cloud to do the analysis for up to an hour each

Page 23: Hack Kid Con - Learn to be a Data Scientist for $1

GOOGLE BIGQUERY

https://demobigquery.appspot.com/

Page 24: Hack Kid Con - Learn to be a Data Scientist for $1

BAY AREA WEATHER

https://demobigquery.appspot.com/

Page 25: Hack Kid Con - Learn to be a Data Scientist for $1

WHY THE FLINTSTONES?

https://demobigquery.appspot.com/

Page 26: Hack Kid Con - Learn to be a Data Scientist for $1

MEASURING KIDS

How good are you at Math and Science? Is it getting better or worse?

Page 27: Hack Kid Con - Learn to be a Data Scientist for $1

SCHOOL DATA

https://www.data.gov/

http://eddataexpress.ed.gov/state-report.cfm/state/CA/

Page 28: Hack Kid Con - Learn to be a Data Scientist for $1

ACHIEVEMENT SCORES

Download results into Excel to analyze and draw graphs

Page 29: Hack Kid Con - Learn to be a Data Scientist for $1

DOWNLOADED DATA

Needed some clean-up. Made sure grade was consistent (4, 8, HS) for all results, and created a short Subject column
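The same kind of clean-up can be sketched in code. The raw labels below ("Grade 4", "High School", "Mathematics Proficiency") are guesses at what a messy download might contain, not the actual values in the ED Data Express file, and the function names are invented.

```python
import re

def clean_grade(raw):
    """Normalise a grade label to one of '4', '8' or 'HS' where possible."""
    text = raw.strip().lower()
    if "high" in text or text == "hs":
        return "HS"
    match = re.search(r"\d+", text)  # pull out the grade number, e.g. "4"
    return match.group() if match else raw.strip()

def short_subject(raw):
    """Create a short Subject value from a longer description."""
    text = raw.lower()
    if "math" in text:
        return "Math"
    if "sci" in text:
        return "Science"
    return raw.strip()

# Hypothetical rows: (grade label, subject description).
rows = [("Grade 4", "Mathematics Proficiency"),
        (" 8th grade ", "Science Proficiency"),
        ("High School", "Mathematics Proficiency")]

cleaned = [(clean_grade(g), short_subject(s)) for g, s in rows]
print(cleaned)
```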

Page 30: Hack Kid Con - Learn to be a Data Scientist for $1

SCORES 2004-2012

Elementary - 4th Grade, Middle School - 8th Grade, High School

Page 31: Hack Kid Con - Learn to be a Data Scientist for $1

SCORES 2004-2012

Elementary - 4th Grade, Middle School - 8th Grade, High School

About half of high school students in California are proficient at Math and Science

Page 32: Hack Kid Con - Learn to be a Data Scientist for $1

CALIFORNIA SCHOOLS

Science and Math Scores at Elementary, Middle and High School Level

Page 33: Hack Kid Con - Learn to be a Data Scientist for $1

CALIFORNIA SCHOOLS

Science and Math Scores at Elementary, Middle and High School Level

Scores have been getting better. Good!

Page 34: Hack Kid Con - Learn to be a Data Scientist for $1

CALIFORNIA SCHOOLS

Science and Math Scores at Elementary, Middle and High School Level

Scores have been getting better. Good!

Maybe the Math tests were harder for everyone that year?

Page 35: Hack Kid Con - Learn to be a Data Scientist for $1

CALIFORNIA SCHOOLS

Science and Math Scores at Elementary, Middle and High School Level

Scores have been getting better. Good!

4th Grade “cohort” in 2004 was 8th Grade in 2008

Maybe the Math tests were harder for everyone that year?
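The cohort idea is simple arithmetic: the same group of kids moves up one grade per year. A tiny sketch, with a made-up function name `cohort_year`:

```python
def cohort_year(start_year, start_grade, later_grade):
    """Year in which the class that was in start_grade in start_year reaches later_grade."""
    return start_year + (later_grade - start_grade)

# The 4th graders of 2004 are the 8th graders of 2008.
print(cohort_year(2004, 4, 8))  # 2008
```

To follow one cohort on the chart, compare the 4th Grade score for 2004 with the 8th Grade score for 2008 instead of reading straight across a single year.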

Page 36: Hack Kid Con - Learn to be a Data Scientist for $1

DATA SCIENCE WITH EXCEL

Pivot tables let you rearrange data, and trend lines measure the slope
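What the pivot table and trend line do can be sketched outside Excel too. The scores below are invented numbers, not the California results, and the slope uses the ordinary least-squares formula, which is the usual meaning of a linear trend line.

```python
from collections import defaultdict

# Hypothetical rows: (year, subject, percent proficient). Invented numbers.
rows = [
    (2004, "Math", 40.0), (2006, "Math", 44.0), (2008, "Math", 48.0),
    (2004, "Science", 30.0), (2006, "Science", 33.0), (2008, "Science", 36.0),
]

# "Pivot": one entry per subject, holding a year -> score mapping.
pivot = defaultdict(dict)
for year, subject, pct in rows:
    pivot[subject][year] = pct

def slope(points):
    """Least-squares slope of a {x: y} mapping (change in y per unit of x)."""
    xs = list(points)
    ys = [points[x] for x in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    top = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    bottom = sum((x - mean_x) ** 2 for x in xs)
    return top / bottom

for subject, scores in pivot.items():
    print(subject, slope(scores), "points per year")
```

A positive slope means scores are rising over the years; a negative slope means they are falling.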

Page 37: Hack Kid Con - Learn to be a Data Scientist for $1

LEARN TO BE A DATA SCIENTIST FOR $1

• Everything is being measured

• The latest data science tools are available to anyone for pennies

• There is lots of freely available data

• Pay attention in math and science class, play around with EMR and BigQuery, and get an interesting and well-paid job as a data scientist!

