Date post: | 15-Jun-2015 |
Upload: | laura-manley |
Government 3.0
The Tools: Big Data and Open Data
Michael Holland
February 27, 2013
The CUSP Partnership
• The University Partners:
– NYU, NYU-Poly, Univ. of Toronto, Warwick University, CUNY, IIT-Bombay, Carnegie Mellon University
• The Industrial Partners:
– IBM, Cisco, Xerox, ConEdison, [Lutron,] National Grid, Siemens, ARUP, IDEO, AECOM
• City and State Agency Partners:
– NYC Agencies, MTA, Port Authority
• National Laboratories:
– [Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Sandia National Laboratories, Brookhaven National Laboratory]
A diverse set of other organizations have expressed interest in joining the partnership
Big data can be brought to bear on societal issues
• Sensing/transmission/storage/analysis capabilities growing rapidly
• How can you “instrument society”?
• What do you want to know?
• How can you find out?
• What could you do with the information?
– Descriptive, predictive
• Greenhouse Gas Treaty Verification methodology is an example of this
• Fuse surveys, direct measurements, proxies to independently verify GHG emissions
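The slide does not say how the surveys, measurements, and proxies are fused. As a minimal sketch, assuming each source yields an independent estimate with a known standard deviation, inverse-variance weighting is one common way to combine them and then check a reported figure against the result. The function names, the tolerance `k`, and all numbers below are hypothetical and chosen only for illustration.

```python
from math import sqrt

def fuse_estimates(estimates):
    """Combine independent (value, std_dev) estimates by inverse-variance weighting."""
    weights = [1.0 / (sd ** 2) for _, sd in estimates]
    total_weight = sum(weights)
    fused_value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total_weight
    fused_sd = sqrt(1.0 / total_weight)  # uncertainty of the combined estimate
    return fused_value, fused_sd

def is_consistent(reported, fused_value, fused_sd, k=2.0):
    """True if the reported figure lies within k standard deviations of the fused estimate."""
    return abs(reported - fused_value) <= k * fused_sd

# Hypothetical emission estimates (arbitrary units): survey, direct measurement, proxy
independent = [(105.0, 10.0), (98.0, 5.0), (110.0, 20.0)]
fused_value, fused_sd = fuse_estimates(independent)

print(round(fused_value, 1), round(fused_sd, 1))
print(is_consistent(100.0, fused_value, fused_sd))  # self-reported figure close to the fused value
print(is_consistent(80.0, fused_value, fused_sd))   # self-reported figure far below it
```

The most precise source dominates the fused value, which is the point of combining modalities: no single source has to be trusted on its own.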
Properly acquired, integrated, and analyzed, data can
• Take government beyond imperfect understanding
– Better (and more efficient) operations, better planning, better policy
• Improve governance and citizen engagement
• Enable the private sector to develop new services for governments, firms, citizens
• Enable a revolution in the social sciences
Environment
Meteorology, pollution, noise, flora, fauna
People
Relationships, location, economic/communications activities, health, nutrition, opinions, …
Infrastructure
Condition, operations
What does it mean to instrument a city?
• Organic data flows
– Administrative records (census, permits, …)
– Transactions (sales, communications, …)
– Operational (traffic, transit, utilities, health system, …)
• Sensors
– Personal (location, activity, physiological)
– Fixed in situ sensors
– Crowd sourcing (mobile phones, …)
– Choke points (people, vehicles)
• Opportunities for “novel” sensor technologies
– Visible, infrared and spectral imagery
– RADAR, LIDAR
– Gravity and magnetic
– Seismic, acoustic
– Ionizing radiation, biological, chemical
– …
Urban Data Sources
311 Noise Report Density
Source EUI, Multi-Family Buildings (histogram; x-axis: Current Weather Normalized Source Energy Intensity in kBtu/Sq. Ft., 0 to 500; y-axis: Percent, 0 to 10)
D. Hsu and C. Kontokosta, NYC Local Law 84 Benchmarking Report, 2012
Source EUI, Office Buildings
Building Energy Use
Some Sensor Stats: United States
• 300 million mobile phones; 494,151 cell towers
• Approximately 400,000 ATMs record video of all transactions
• 30 million commercial surveillance cameras
• 4,214 red-light cameras; 761 speed-trap cameras
• A third of large police forces equip patrol cars with automatic license plate-readers that can check 1,000 plates per minute
Source: Wall Street Journal (January 3, 2013) – “In Privacy Wars, It’s iSpy vs. gSpy”
Drop-off
Pick-up
Most drop-offs occur on the avenues, most pick-ups on the streets
Lauro Lins, Fernando Chirigati, Nivan Ferreira, Claudio Silva and Juliana Freire - NYU-Poly (Data obtained from TLC on June 6th, 2012)
Visualization of TLC GPS Data
May 1st – 7th 2011
3.6 Million Trips
Train Stations
Airports
Studying Taxi Patterns
Wang, P., Hunter, T., Bayen, A.M., Schechtner, K. & Gonzalez, M.C. Understanding Road Usage Patterns in Urban Areas. Nature, Sci. Rep. 2, 1001 (2012); DOI:10.1038/srep01001.
Cell Tower Records for Traffic Analysis
Urban Observatory
• Provisioned urban vantage point(s)
– MetroTech (1 MT and 388 Bridge St)
– 277 Park Ave (at 47th Street)
– Governor's Island
• Suite of bore-sighted instruments
– Photometric and colorimetric optical imaging
– Broad-band IR imaging (SWIR, MWIR, and thermal?)
– Hyperspectral imaging (trace gases)
– LIDAR (building motions, pollution)
– Radar (building/street vibrations, building motion, traffic flow)
• Correlative data on the urban scenes
– Meteorology (temperature, winds, visibility)
– Scene geometry (distances, directions, identities of features visible)
– Parcel and land use data, building characteristics and activities, building utility consumptions, and real estate valuation data
– In situ pollution data and location/nature of major sources
– In situ vehicle and pedestrian traffic for the streets visible
– Demographic and economic data
• Capability to archive, process, and analyze data acquired
– Image processing chains
– Data warehouse, GIS, Visualization tools
– Software and procedures to enhance privacy protection
• Personnel and funding to create and operate the above
Looking South from the Empire State Building
Manhattan in the Thermal IR
Photo by Tyrone Turner/National Geographic
Other synoptic modalities: Hyperspectral, RADAR, LIDAR, Gravity, Magnetic, …
199 Water Street
Built 1993 :: 998,000 sq ft
electricity, natural gas, steam
LEED Certified
Quantified Community
• Fully instrument a slice of the city
– 10-100k people within 20 blocks of MetroTech or a new development
– Create a well-characterized test bed for technologies/policies and behavioral interventions
• What constitutes “complete instrumentation”?
– In situ vs. choke points vs. synoptic?
– Acoustic/traffic/mobile phones/video/IR/magnetic/CBRN/…
– Economic data? Physiological data? Nutrition? …
• How to fully engage people who live/work in the community to provide data, participate in citizen science, create educational opportunities, …?
– Foster improved quality of life: “cleanest/greenest/healthiest/most livable/…”
– “I’ll show you the parking spaces …”
– ???
• What might we expect to learn?
What can cities do with the data?
• Optimize operations
– traffic flow, utility loads, services delivery, …
• Monitor infrastructure conditions
– bridges, potholes, leaks, …
• Infrastructure planning
– zoning, public transit, utilities
• Improve regulatory compliance (“nudges”, efficient enforcement)
• Public health
– Nutrition, epidemiology, environmental impacts
• Abnormal conditions
– Hazard detection, emergency management
• Data-driven formulation of data-driven policies and investments
– Road pricing and congestion charging, time-of-day power, …
• Better inform the citizenry
• Enhance economic performance and competitiveness
Among the projects we’re considering
• Normalization, interoperability of city data sets
• 3D Urban GIS capability
• Multi-data correlations to improve city resource allocation
• Noise / Temperature / Pollution
• Mobility
• Novel sensing of public health
• Building efficiency
• Living Lab definition
Privacy Issues
• Privacy issues are structural - you can’t study society without studying people at some level
• People will voluntarily give up their data if they can see a personal or societal benefit
– Social networks, voltstats.net, …
• Norms/expectations are changing with generations
• There are technical fixes for multi-level privacy/classification
• Privacy is eroding in any event and we should do our best to ensure it is done sensibly
• We don’t yet know what the optimal level of privacy is for studies of interest
An Ex-Oversight Staffer’s Opinions about “Data” in an Agency Context
Research Program(Competitive)
Agency (Corporate)
Political (Macro)
Society
Disciplines
Societal Demands
Defense, Energy, Economic Security, Health, Environment, Food/Water, Discovery
VALUE
Scientific Opportunities
AMO, bio, nano, NP, EPP, Astro
cosmology
MERIT
Context, Context, Context
One Systematic Evaluation Process: OMB/OSTP R&D Investment Criteria
Quality Relevance Performance
Prospective
[1] Mechanism of Award (e.g., 10 CFR 605)
[2] Justification of funding distribution among classes of performers
Planning & Prioritization:
Strategy
“Top N” Milestones
(5 < N < 10)
Retrospective
[1] Expert reviews of successes and failures
[2] Information on major awards
Evaluation of utility of R&D results to both field and broader “users”
Report on
“Top N” Milestones
GPRA-style Annual Metrics
Advisory Committees & NAS
Roles of “Data”
• Scientific Understanding: Data improves unbiased explanation of natural or social phenomena
• Administrative Action: Data ensures that Agencies transparently exercise their delegated authorities in a fashion that is not "arbitrary and capricious, an abuse of discretion, or otherwise not in accordance with the law."
• Legal or Political Action: Data as a tool for adjudicating disputes, i.e., winning contests and seeing one’s priorities implemented.
Is USG Robust Against “Big Data?”
[T]he median Congressional district is now about five points Republican-leaning relative to the country as a whole. Why this asymmetry? It’s partly because Republicans created boundaries efficiently in redistricting and partly because the most Democratic districts in the country, like those in urban portions of New York or Chicago, are even more Democratic than the reddest districts of the country are Republican, meaning there are fewer Democratic voters remaining to distribute to swing districts.
“As Swing Districts Dwindle, Can a Divided House Stand?” Nate Silver, NYT, Dec 27, 2012
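The arithmetic behind the quoted asymmetry can be seen in a toy example. The nine district vote shares below are invented for illustration and are not taken from the article, and equal-population districts are assumed. A few heavily Democratic ("packed") districts pull the national average up to 50% while the median district sits five points below it.

```python
from statistics import mean, median

# Hypothetical Democratic two-party vote share (%) in nine equal-population districts.
# Three packed urban districts, three near-swing districts, three Republican districts.
dem_share = [80, 75, 70, 46, 45, 44, 32, 30, 28]

national = mean(dem_share)          # national Democratic share under equal populations
median_district = median(dem_share)  # share in the middle district
lean = median_district - national    # negative means the median district leans Republican

dem_seats = sum(1 for share in dem_share if share > 50)

print(f"National Democratic share: {national}%")
print(f"Median district share: {median_district}%")
print(f"Median district lean: {lean} points")
print(f"Democratic-majority districts: {dem_seats} of {len(dem_share)}")
```

In this made-up map the most Democratic district (80%) is more lopsided than the most Republican one (72% Republican), which matches the mechanism the quote describes: the surplus votes in packed districts are unavailable to swing districts.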