Date posted: 18-Jul-2015
Category: Data & Analytics
Uploaded by: steve-nagoski
Big Data and Open Data Reuse
by Nonprofits for the Creation of
Sustainable Social Services
Nonprofit Technology Conference, Austin TX
Wed March 4, 2015 10:30 AM
Schedule: http://sched.co/1z1r
Eval: 15NTCSessionEval?c=1208
Hashtag: #15NTCReuseData
Who We Are – TechSoup Global
March 4 2015 Open Data Reuse by Nonprofits #15NTCreusedata
TechSoup Global is a nonprofit serving
the nonprofit community worldwide.
We have built nonprofit sector capacity through
technology for 25 years.
We are working toward a time when every social benefit
organization on the planet has the technology,
resources, and knowledge they need to operate at their
full potential.
Who We Are
• Steve Nagoski - Data Scientist
• Michael Enos - Director of Community and Platform
Who You Are & What You Care About
How do we Sustainably Connect our Information & Insights?
• Stories of Success – Collaboration Panel
• Questions About Open Data & Sustainability
Use #15NTCreusedata & Question Cards & Q&A
Data Reuse by Nonprofits
• Big Data & Open Data Trends
• Open Data Concerns
• Case Study: Balkans Data Academy
• Case Studies: Digital Humanitarians
• Data Science and Machine Learning
• Case Study: Hunger Index
• Sustainability of Open Information Initiatives
“The purpose of computing is insight,
not numbers.”
-Richard Hamming, 1961
Data Trends – Long Term
“What a computer is to me is it’s the most
remarkable tool that we’ve ever come up with,
and it’s the equivalent of a bicycle for our minds.”
- Steve Jobs, 1990
Big Data Trends - Global
• The number of orgs and governments operating “Data Driven” grows every year, instrumenting & collecting broader data to make smarter decisions
• Online connectivity:
─ 350B SMS Messages/mo
─ 1.5T App Messages/mo (Whatsapp)
─ 15B Tweets/mo
─ 30B unique Facebook shares/mo
─ 3B Internet Users worldwide (40%), growing 8% YoY
• Cloud Storage makes storing 100PB/org affordable
─ Facebook, Microsoft, Amazon, Twitter, Thousands more.
─ Millions in the next 2 years
• New Analysis Tools are Efficient at those sizes
Open Data Trends - Global
• 2013 : G8 signs Open Data Charter
• 2014 : G20 pledge:
─ advance open data as weapon against corruption
• 2014 : UN recognizes need for “Data Revolution”
Still a LONG way to go
• 8% of participating countries publish spending figures
• 6% publish open data on government contracts
• 3% publish open data on ownership of companies
• Many Open Data initiatives are not yet self-sustaining or growing
─ OpenDataBarometer.org, Jan 2015
Open Data Trends - US
• White House hires first Chief Data Scientist @dpatil
• Obama keynotes O’Reilly Strata conference Feb 2015
─ “Understanding and Innovating with Data has the potential to change the way we do almost anything for the better”
https://www.youtube.com/watch?v=vbb-AjiXyh0
• 135,000 open govt datasets available at Data.gov
─ Weather, Maps, Healthcare, Political Funding, Census
• Collaboration between NGOs (Why) & Data Scientists (How) & Analysts/Engineers (What) to deliver stronger insights
Open Data Concerns - US
• Privacy vs Accountability & Transparency
─ Most open data is anonymous for privacy
– Census
– Public Services Usage Info
– Driving Traffic Patterns
─ Some must be detailed for accountability
– Health Inspection Data for Restaurants
– Campaign Finance Data for Politicians
─ Some we have committed to record for accountability but have not put collection/access systems in place
– Police Shootings and/or Deaths Records
– Public Access to Police Event Video
Open Data Concerns
• Misuse of Open Data and Misinterpretation
• Correlation != Causation
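One way to see the "Correlation != Causation" point is to correlate two series that merely both grow over time. The sketch below is illustrative only: the two series and their names are made up for this example and are not from any dataset mentioned in the talk.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Synthetic, hypothetical yearly values: both simply trend upward.
open_datasets_published = [10, 22, 29, 41, 50]
smartphone_owners_pct = [20, 31, 39, 52, 60]

r = pearson(open_datasets_published, smartphone_owners_pct)
print(f"correlation = {r:.3f}")
```

The coefficient comes out very close to 1 even though neither made-up series drives the other; a shared trend (time) is enough to produce it.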
“The temptation to form
premature theories upon
insufficient data is the
bane of our profession.”
– Sherlock Holmes
“Torture the data, and it
will confess to anything.”
– Ronald Coase
Balkans Data Academy : Why / Why Not?
• 1-week hackathon in Sarajevo, Aug 2014
─ expose Bosnian election data to voters
• Project managed by TechSoup Foundation + local civic activists ZastoNe
https://www.youtube.com/watch?v=BcxgAOCFppY
• Team: 15 people from 7 different nonprofit orgs with different skills + 1 common goal
• Set up framework for future Data Academies, expand footprint, enable more local NGOs to expand project
Balkans Data Academy : What
• Outcomes – Success!
─ Database & API created; open source project on GitHub
─ Data now easy to reload and expand
─ Website Created
─ Introduction Video created
• Next Steps
─ Use for live data in October 2014 Election
─ Collaborate & Train to expand local nonprofit capabilities in
future Academies
Digital Humanitarians
Feb 2015, Dr. Patrick Meier
• The Rise of Digital Humanitarians
• The Rise of Big Crisis Data
• Crowd Computing Satellite & Aerial Imagery
• Artificial Intelligence applied to Disaster Response
• Verifying Big Crisis Data – Dealing with False Data
• Dictators vs Digital Humanitarians (Egypt, China, Iran)
http://iRevolution.net http://DigitalHumanitarians.com #DigitalJedis
Digital Humanitarians – Haiti Earthquake 2010
Digital Humanitarians – Philippines 2012
Resistance to AI / Machine Learning
• Oct 2010: Crowdsourcerer vs Muggles
“How Harry Potter Explains Humanitarian Crowd-Sourcing”
What is Machine Learning + AI Today
• Predictive Modeling + Threshold Automation
• Abuse prevention in Financial Svcs, Social Media
– Spam
– Personal/Community Abuse
– Fraud
– AML - Anti Money Laundering
– ATO - Account Take Over detection
• Detecting False Data
• Stitching Many sources to get the truest picture
• Constantly adjusting, measuring, improving: learning from false positives, false negatives, and the most valuable measures
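The "Predictive Modeling + Threshold Automation" idea, and the loop of learning from false positives and negatives, can be sketched roughly as follows. This is a minimal illustration, not a description of any production system: the scores, labels, and the function name `confusion_counts` are hypothetical.

```python
def confusion_counts(scores, labels, threshold):
    """Count outcomes when every score >= threshold is flagged automatically."""
    tp = fp = fn = tn = 0
    for score, is_abuse in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_abuse:
            tp += 1          # correctly flagged
        elif flagged and not is_abuse:
            fp += 1          # false positive: good item flagged
        elif not flagged and is_abuse:
            fn += 1          # false negative: abuse missed
        else:
            tn += 1          # correctly left alone
    return tp, fp, fn, tn

# Hypothetical model scores (0..1) and reviewer-confirmed labels.
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [True, True, False, True, False, False]

for threshold in (0.5, 0.7, 0.9):
    tp, fp, fn, tn = confusion_counts(scores, labels, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    print(f"threshold={threshold}: FP={fp} FN={fn} "
          f"precision={precision:.2f} recall={recall:.2f}")
```

Raising the threshold trades false positives for false negatives; reviewing both kinds of error is what drives the next adjustment.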
Applying Machine Learning to #OpenData
• Counting Tents in Refugee Camp Satellite Images
• Stitching together area images from UAV cameras
• Translation Services for Global Responses
• Identifying unreliable/false posts in Social Media
• Smart Geolocation with minimal input metadata
Hunger Index - What problems are we trying to solve?
• Are Food Assistance Providers achieving our goals?
• How do we forecast and communicate the need for food?
• How can food assistance programs make better decisions
about programs and investments?
What is the Hunger Index?
[Figure: Total Meals Required, split into Meals Purchased, Food Assistance, and Missing Meals]
• An aggregate measure of the need for food by the most
vulnerable members of a community.
• An index for comparing performance year-to-year and
region-to-region.
• A measure of how well we are serving those in need in
our community.
• Began in 2007 in Santa Clara and San Mateo Counties,
expanding to Alameda, Sonoma and Santa Cruz Counties
Hunger Index Methodology: Components
Scope – Community, Income and Time Range
TMR – Total Meals Required
MP – Meals Purchased
FAP – Food Assistance Provided
TNF – Total Need for Food Assistance
MM – Missing Meals
HI – Hunger Index
• Counties
Hunger Index Methodology: Vulnerable Population
Scope
Geography
Time range
Income Demographics
http://www.census.gov/acs/www/
Hunger Index Methodology: TMR
TMR: Total Meals Required
• Households with Incomes < $50K
• Average Household Size – Table B25010
– Santa Clara County 2010 = 2.94 persons/household
• Number of Meals per year = 1095/person/year
Hunger Index Example: TMR, Santa Clara County 2010
Annual Income         Households   Meals Required (millions)
0 thru $10,000            26,848    86.4
$10,000 to $20,000        38,863   125.1
$20,000 to $30,000        40,182   129.4
$30,000 to $40,000        38,351   123.5
$40,000 to $50,000        40,967   131.9
Total                    185,211   596.3
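The TMR figures in the table can be reproduced from the stated method: households, times average household size, times meals per person per year. The sketch below uses only numbers from the slides; the variable and function names are illustrative.

```python
MEALS_PER_PERSON_PER_YEAR = 1095   # as stated on the slide (3 * 365)
AVG_HOUSEHOLD_SIZE = 2.94          # Santa Clara County 2010, Table B25010

# Households per income bracket, from the Santa Clara County table.
households_by_income = {
    "0 thru $10,000": 26_848,
    "$10,000 to $20,000": 38_863,
    "$20,000 to $30,000": 40_182,
    "$30,000 to $40,000": 38_351,
    "$40,000 to $50,000": 40_967,
}

def meals_required(households,
                   household_size=AVG_HOUSEHOLD_SIZE,
                   meals_per_person=MEALS_PER_PERSON_PER_YEAR):
    """Total meals required per year for a number of households."""
    return households * household_size * meals_per_person

tmr_by_bracket = {b: meals_required(n) for b, n in households_by_income.items()}
total_households = sum(households_by_income.values())
tmr_total = meals_required(total_households)

for bracket, meals in tmr_by_bracket.items():
    print(f"{bracket:>20}: {meals / 1e6:6.1f} million meals")
print(f"{'Total':>20}: {tmr_total / 1e6:6.1f} million meals")
```

Computed directly from 185,211 households the total rounds to 596.2 million, the figure used in the final calculations; the table's 596.3 is the sum of the already-rounded rows.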
Methodology: Meals Purchased (MP)
• From Consumer Expenditure Survey
– http://www.bls.gov/cex/csxstnd.htm
• No. of Households * Average Annual Expenditure per household
• Important Correction: Subtract SNAP purchases
– http://www.cdss.ca.gov/research/PG352.htm
• Divide by Cost of a Meal to get Meals Purchased
– http://www.cnpp.usda.gov/usdafoodcost-home.htm
Example MP Data: Santa Clara County 2010
Annual Income ($000)   Households   Average Annual Expenditure on Food
0 thru $10                 26,848   $3,189
$10 to $20                 38,863   $3,413
$20 to $30                 40,182   $4,008
$30 to $40                 38,351   $4,883
$40 to $50                 40,967   $5,515
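The Meals Purchased steps (households times average food expenditure, minus SNAP purchases, divided by the cost of a meal) might be sketched as below. The table values come from the slides; the function names are illustrative, and the SNAP dollar total and per-meal cost are deliberately left as parameters because the slides do not state them.

```python
# (households, average annual food expenditure in dollars) per income bracket,
# copied from the Santa Clara County MP table.
mp_table = {
    "0 thru $10K": (26_848, 3_189),
    "$10K to $20K": (38_863, 3_413),
    "$20K to $30K": (40_182, 4_008),
    "$30K to $40K": (38_351, 4_883),
    "$40K to $50K": (40_967, 5_515),
}

def gross_food_expenditure(table):
    """Households * average annual food expenditure, summed over brackets."""
    return sum(households * spend for households, spend in table.values())

def meals_purchased(gross_dollars, snap_dollars, cost_per_meal):
    """Subtract SNAP purchases, then divide by the cost of a meal."""
    if cost_per_meal <= 0:
        raise ValueError("cost_per_meal must be positive")
    return (gross_dollars - snap_dollars) / cost_per_meal

gross = gross_food_expenditure(mp_table)
print(f"Gross food expenditure: ${gross:,}")
```

To arrive at an MP figure, `meals_purchased` would be called with the SNAP total and meal cost taken from the CDSS and USDA sources listed in the methodology.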
Methodology: Food Assistance Provided (FAP)
• Data in different formats normalized to meals
• Time range
• For Santa Clara (SC) and San Mateo (SM) Counties
– Food Banks, SNAP, WIC, Government School Meal Programs, Senior Nutrition, CACFP
Example FAP: Santa Clara County 2010
Source                     Meals (millions)
SNAP                        81.4
Second Harvest Food Bank    24.7
School meals                21.3
WIC                         14.1
CACFP                        4.7
Other                        1.6
Total (FAP)                147.8
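Once every source is normalized to meals, the FAP total and each source's share follow directly. The sketch below uses the table's figures; the dictionary and variable names are illustrative.

```python
# Food assistance provided, in millions of meals, from the FAP table.
fap_millions = {
    "SNAP": 81.4,
    "Second Harvest Food Bank": 24.7,
    "School meals": 21.3,
    "WIC": 14.1,
    "CACFP": 4.7,
    "Other": 1.6,
}

fap_total = sum(fap_millions.values())
shares_pct = {src: 100 * meals / fap_total for src, meals in fap_millions.items()}

print(f"Total FAP: {fap_total:.1f} million meals")
for src, pct in shares_pct.items():
    print(f"{src:>26}: {pct:4.1f}%")
```

The rounded shares (SNAP about 55%, food bank about 17%, school meals about 14%, WIC about 10%) line up with the percentages on the later food assistance chart, where SNAP appears under its California name, CalFresh.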
Final Calculations
TNF: Total Need for Food Assistance
TNF = TMR – MP
296.6M = 596.2M – 299.6M
MM: Missing Meals
MM = TNF – FAP
148.8M = 296.6M – 147.8M
HI: Hunger Index
HI = MM/TNF
0.502 or 50.2%
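The three formulas chain together as in the sketch below, which plugs in the Santa Clara County figures (in millions of meals). The function name `hunger_index` is an illustrative choice, not from the slides.

```python
def hunger_index(tmr, mp, fap):
    """Return (TNF, MM, HI) from Total Meals Required, Meals Purchased,
    and Food Assistance Provided, all in the same unit."""
    tnf = tmr - mp            # Total Need for Food Assistance
    if tnf <= 0:
        raise ValueError("no unmet need: TMR must exceed MP")
    mm = tnf - fap            # Missing Meals
    hi = mm / tnf             # Hunger Index: share of need left unmet
    return tnf, mm, hi

tnf, mm, hi = hunger_index(tmr=596.2, mp=299.6, fap=147.8)
print(f"TNF = {tnf:.1f}M, MM = {mm:.1f}M, HI = {hi:.3f} ({hi:.1%})")
```

An HI of 0 would mean food assistance covers all of the need; an HI of 1 would mean none of it is covered.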
Example Final Calc: Santa Clara County 2010 (millions of meals, except HI)
TMR: Total Meals Required         596.2
MP: Meals Purchased               299.6
FAP: Food Assistance Provided     147.8
TNF: Total Need for Food          296.6
MM: Missing Meals                 148.8
HI: Hunger Index                  0.502
Findings and Implications
Analysis
– Compare against previous year
– Look for major shifts in components
– Trends
Collateral benefits
– Understanding of need
• Who, where, when
– Understanding of food assistance
• Who, where, when
– Use of data in other contexts
– How the population, demographics, and economics are changing over time
Findings and Implications
How many households are vulnerable and
how much food do they need to be healthy?
Year     Households   Meals Needed
2010     173,000      564 million
2011     185,000      596 million
Growth   7%           5.7%
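The growth row is a plain year-over-year percentage change. A small sketch using the table's rounded figures (the helper name `pct_growth` is illustrative):

```python
def pct_growth(old, new):
    """Percentage change from old to new."""
    return 100 * (new - old) / old

households_growth = pct_growth(173_000, 185_000)
meals_growth = pct_growth(564, 596)   # millions of meals

print(f"Households: {households_growth:.1f}%")
print(f"Meals needed: {meals_growth:.1f}%")
```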
Findings and Implications
[Figure: Santa Clara County 2011, 596 million meals, 185,000 households. Purchased 300; Food Assistance 148.8; Missing Meals 147.8]
[Figure: Food Assistance in Santa Clara 2011, total food assistance 149 million meals. CalFresh 55%; Food Bank 17%; School meals 14%; WIC 10%; Other 4%]
Santa Clara County Hunger Index
[Figure: chart of meals by year, 2009 to 2011. Food Assistance Provided: 109.5, 136.6, 147.8. Missing Meals: 110.4, 137.1, 148.8]
Santa Clara County Hunger Index 2011
• Hunger Index indicates agencies still struggling to
catch up.
• Vulnerable households increased by more than
7% and need grew by over 8%
• Food Assistance grew by just over 8%.
• Most growth: CalFresh and WIC
• 149 million meals missing last year – enough to
feed 136,000 people for one year, more than the
population of Santa Clara.
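The "136,000 people for one year" comparison follows from dividing the missing meals by the 1095 meals per person per year used in the TMR method. A quick check, with illustrative variable names:

```python
MEALS_PER_PERSON_PER_YEAR = 1095   # same constant as in the TMR method

missing_meals = 149_000_000        # rounded figure from the slide
people_fed_for_a_year = missing_meals / MEALS_PER_PERSON_PER_YEAR

print(f"{people_fed_for_a_year:,.0f} people")
```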
What does the Hunger Index tell us?
• Households are spending less on food and
using more food assistance
• It will be a challenge for food assistance
programs to keep up
• We need to continue to work together to make
a difference