Page 1: Towards Spatial Data Science · 2017-10-30 · Towards Spatial Data Science for Smart Agriculture Big Data Oct. 26-28, 2017 UMN/MBDH Workshop on Agricultural Data Integration Shashi

Towards Spatial Data Science for Smart Agriculture Big Data

Oct. 26-28, 2017
UMN/MBDH Workshop on Agricultural Data Integration

Shashi Shekhar
McKnight Distinguished University Professor

Computer Sc. & Eng., University of Minnesota
www.cs.umn.edu/~shekhar

Acknowledgements: NSF, USDOD, USDA, …

Page 2

Outline

• Agriculture Big Data (AgBD) Examples

- Precision Agriculture

- Global Agriculture Monitoring

• Data Management Tools

• Data Mining Tools

• Collaboration Opportunities

Page 3

Precision Agriculture

• Reduces fertilizer run-off and water use
• Improves yield
• Computing is critical
• Cyber-Physical Systems
• Data & Data Science Elements

Page 4

Deconstructing Precision Agriculture

Page 5

Crop Insurance, Prescriptive Farming

Page 6

Global Agricultural Monitoring

Page 7

NSF INFEWS Data Science Workshop (@ USDA NIFA, Oct. 5th-6th, 2015; Shekhar, Mulla, & Schmoldt; www.spatial.cs.umn.edu/few)

Goals:
• Design compelling visions, identify gaps
• Develop a research agenda

55 Participants (Data-driven FEW & Data Sciences)
Food: 14, Energy: 10, Water: 11, Data Sc.: 20
Gov.: 26, Aca.: 24, Industry: 5

Finding 1: Data & Data Science are crucial!
• Understand problems, connections, impacts
• Monitor FEW resources and trends to detect risks
• Support decision and policy making
• Communicate with public and stakeholders

Finding 2: However, there are show-stopper gaps.
1. Data Gaps: No global water & energy census, heterogeneous data formats & collection protocols
2. Data Science (DS) Gaps: Current DS methods are inadequate for spatio-temporal-network FEW data.

Potentially Transformative Research Agenda:
• National FEW Nexus Observatory & Dashboard for chokepoint monitoring, alerts, warnings
• Novel Physics-aware Data Science for mining nexus patterns in multi-scale spatio-temporal-network data despite non-stationarity, auto-correlation, uncertainty, etc.
• Scalable tools for consensus Geo-design via participative planning with nexus observations and policy projections
• An INFEWS data science community to address crucial gaps, and shape next-generation Data Science

[Figure labels: Global Temperature; Global Population; State; Nexus Dashboard; Locations; Alerts; Trends; Sea-Surface Temperature Anomaly; Aral Sea Shrinkage (1978-2014) Due to Cotton Farms]

Page 8

Details @ https://sites.google.com/site/2016dsfew/home

Thanks: NSF MBDH Travel Support for Early Career Researchers

Monday, August 14th, 2017.
http://ai4good.org/few17/

Page 9

Outline

• Agriculture Big Data (AgBD) Examples

• Data Management Tools

- Limitation of traditional tools

- Promising Spatial Tools

• Data Mining Tools

• Collaboration Opportunities

Page 10

Food Big Data & Collaboration Opportunities

• Current Big Data Tools are too generic
– Click stream mining
– False positive costs negligible

• One-size big data tools do not fit all Ag big data

Page 11

Big Data Tools

• Current Big Data Tools (e.g., Machine Learning, Hadoop)
– For click-stream mining to choose advertisements
– False positive cost negligible, sanity check via A/B experiment
– Google Flu Trends experience

• One-size big data tools do not fit all (Food) big data

• Farm to Table Food Data
– Physical Spaces: farms, precision agriculture, remote sensing, …
– Location-aware
– Spatio-temporal context, e.g., neighbors
– False positive costs may be high

Page 12

Limitations of Hadoop

• Hadoop uses Hash (i.e., Random) partitioning
– Related objects scattered, not grouped

• Alternative is Spatial partitioning

Source: Spatial coding-based approach for partitioning big spatial data in Hadoop, X. Yao et al., Computers & Geosciences, 106:60-67, September 2017, Elsevier.
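
The contrast between hash and spatial partitioning can be sketched in a few lines. This is a minimal illustration only: it is not Hadoop's partitioner and not the spatial-coding method of Yao et al. The function names (`hash_partition`, `grid_partition`, `max_extent`), the uniform grid, and the sample coordinates are all assumptions made for the example.

```python
from collections import defaultdict

def hash_partition(records, n_parts):
    """Location-blind partitioning: the key depends only on the record id."""
    parts = defaultdict(list)
    for rec_id, x, y in records:
        # hash() of a small non-negative int is the int itself in CPython,
        # so this is a simple stand-in for a real hash function.
        parts[hash(rec_id) % n_parts].append((rec_id, x, y))
    return dict(parts)

def grid_partition(records, cell_size):
    """Proximity-aware partitioning: the key is the grid cell holding the point."""
    parts = defaultdict(list)
    for rec_id, x, y in records:
        cell = (int(x // cell_size), int(y // cell_size))
        parts[cell].append((rec_id, x, y))
    return dict(parts)

def max_extent(parts):
    """Largest bounding-box side length over all partitions."""
    worst = 0.0
    for recs in parts.values():
        xs = [x for _, x, _ in recs]
        ys = [y for _, _, y in recs]
        worst = max(worst, max(xs) - min(xs), max(ys) - min(ys))
    return worst

# Two hypothetical fields, each sampled at four nearby locations.
samples = [
    (0, 1.0, 1.0), (1, 1.5, 1.2), (2, 0.8, 1.9), (3, 1.1, 0.7),
    (4, 9.0, 9.0), (5, 9.4, 8.7), (6, 8.8, 9.5), (7, 9.1, 9.2),
]

by_hash = hash_partition(samples, n_parts=2)
by_grid = grid_partition(samples, cell_size=5.0)

print("hash partitions, max extent:", max_extent(by_hash))
print("grid partitions, max extent:", max_extent(by_grid))
```

With these sample points, each hash partition mixes records from both fields, while each grid cell holds one field's records, so a neighborhood query only needs to touch one partition.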

Page 13

Food Big Data Curation

• Meta-data, Schema, DBMS (SQL, Hadoop)
• Challenge: One size does not fit all!

• Ex. Spatial Querying – Geo-tag, Check-in, Geo-fence

• Spatial Querying Software
– OGC Spatial Data Type & Operations
– Data-structures: B-tree => R-tree
– Algorithms: Sorting => Geometric
– Partitioning: random => proximity aware
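
A geo-fence query is one example of the geometric algorithms mentioned here. The sketch below uses the standard ray-casting point-in-polygon test with a bounding-box pre-filter, which is the same kind of pruning an R-tree applies at scale. The names (`in_bbox`, `point_in_polygon`, `geofence_hits`), the square field, and the check-in records are hypothetical, and behavior for points exactly on the fence boundary is left unspecified.

```python
def in_bbox(x, y, polygon):
    """Cheap pre-filter: is the point inside the polygon's bounding box?"""
    xs = [px for px, _ in polygon]
    ys = [py for _, py in polygon]
    return min(xs) <= x <= max(xs) and min(ys) <= y <= max(ys)

def point_in_polygon(x, y, polygon):
    """Ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def geofence_hits(checkins, fence):
    """Return ids of check-ins that fall inside the fence polygon."""
    return [
        cid for cid, x, y in checkins
        if in_bbox(x, y, fence) and point_in_polygon(x, y, fence)
    ]

# Hypothetical square field and three geo-tagged check-ins.
field = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
checkins = [("tractor-1", 5.0, 5.0), ("tractor-2", 15.0, 5.0), ("drone-7", 9.5, 0.5)]

print(geofence_hits(checkins, field))
```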

Page 14

Outline

• Agriculture Big Data (AgBD) Examples

• Data Management Tools

• Data Mining Tools

- Limitation of traditional tools

- Promising Spatial Tools

• Collaboration Opportunities

Page 15

Food Big Data Mining

• Current Big Data Mining Tools are generic
– Click stream mining
– False positive costs negligible

• One-size big data mining tools do not fit all sensor big data

• Food Big Data
– Are often in Physical Spaces
– High cost of false positives

Page 16

Food Big Data Analysis

• Simulation, Statistics, Data Mining, Machine Learning
• Challenge: One size does not fit all
– Prediction error vs. model bias, Cost of false positives, …
• Ex. Interaction patterns

Pearson’s Correlation    Ripley’s cross-K    Participation Index
-0.90                    0.33                0.5
1                        0.5                 1

Figure panels: (a) a map of 3 features, (b) Spatial Partitions, (c) Neighbor graph
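
Why a partition-based measure and a neighbor-based measure can disagree is easier to see on a toy case. The sketch below computes Pearson's correlation over per-partition counts and the participation index (the minimum, over both features, of the fraction of instances that have a neighbor of the other feature) on hypothetical points that sit close together but on opposite sides of partition boundaries. It does not reproduce the slide's numbers, Ripley's cross-K is omitted, and the coordinates, cell width, and radius are assumptions.

```python
import math

def counts_per_cell(points, cell_width, n_cells):
    """Count points in each 1-D partition of width cell_width along x."""
    counts = [0] * n_cells
    for x, _ in points:
        counts[int(x // cell_width)] += 1
    return counts

def pearson(xs, ys):
    """Pearson's correlation coefficient (assumes neither input is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def participation_index(a_pts, b_pts, radius):
    """Minimum participation ratio of the two features within the given radius."""
    def ratio(src, dst):
        hits = sum(1 for p in src if any(math.dist(p, q) <= radius for q in dst))
        return hits / len(src)
    return min(ratio(a_pts, b_pts), ratio(b_pts, a_pts))

# Each A instance has a B instance 0.2 units away, but a partition
# boundary (at x = 2 and x = 6) falls between them.
a_pts = [(1.9, 0.0), (5.9, 0.0)]
b_pts = [(2.1, 0.0), (6.1, 0.0)]

r = pearson(counts_per_cell(a_pts, 2.0, 4), counts_per_cell(b_pts, 2.0, 4))
pi_ab = participation_index(a_pts, b_pts, radius=0.5)

print("Pearson over partitions:", r)
print("Participation index:", pi_ab)
```

In this toy case the partition counts are perfectly anti-correlated even though every instance has a neighbor of the other feature, because the partition boundaries split each neighboring pair.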

Page 17

Limitation of Traditional Clustering

• Simulation, Statistics, Data Mining, Machine Learning
• Challenge: One size does not fit all
– Prediction error vs. model bias, Cost of false positives, …
• Ex. Clustering: Find groups of tuples

Traditional Clustering (K-means always finds clusters)

Spatial Clustering begs to differ!
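
The claim that K-means always finds clusters can be checked on data that has no cluster structure at all. The sketch below is a plain Lloyd's-algorithm K-means written for illustration (not a library implementation), run on uniformly random points; the function name `kmeans`, the seeds, and the sample size are assumptions.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k sampled data points
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels

# Uniformly random points in the unit square: no real clusters exist.
rng = random.Random(42)
noise = [(rng.random(), rng.random()) for _ in range(200)]

centroids, labels = kmeans(noise, k=3)
print("cluster sizes on pure noise:", [labels.count(c) for c in range(3)])
```

The algorithm still returns three centroids and a label for every point, with no indication of whether the grouping differs from chance.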

Page 18

Sensor Big Data Analysis: Spatial Methods

• Spatial Statistics, Spatial Data Mining
– Quantify uncertainty, confidence, …
– Is it (statistically) significant?
– Is it different from a chance event or rest of dataset?

• e.g., SaTScan finds circular hot-spots

• Auto-correlation, Heterogeneity, Edge-effect, …
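
The significance question can be illustrated with a Monte Carlo test. The sketch below uses a simplified statistic, the largest number of points inside any fixed-radius circle centered on a data point, as a stand-in for SaTScan's likelihood ratio, and compares it against simulations of uniformly random points in the unit square. Function names, the radius, the number of simulations, and the sample data are assumptions, and edge effects are ignored.

```python
import math
import random

def max_circle_count(points, radius):
    """Largest number of points within `radius` of any single data point."""
    return max(
        sum(1 for q in points if math.dist(p, q) <= radius)
        for p in points
    )

def monte_carlo_p_value(points, radius, n_sims=99, seed=0):
    """Rank the observed statistic among simulations of random points."""
    rng = random.Random(seed)
    observed = max_circle_count(points, radius)
    n = len(points)
    at_least = 0
    for _ in range(n_sims):
        # Null hypothesis: n points placed uniformly in the unit square.
        sim = [(rng.random(), rng.random()) for _ in range(n)]
        if max_circle_count(sim, radius) >= observed:
            at_least += 1
    # The observed dataset counts as one of the n_sims + 1 outcomes.
    return observed, (at_least + 1) / (n_sims + 1)

# Hypothetical data: 20 background points plus a tight group of 40.
rng = random.Random(1)
background = [(rng.random(), rng.random()) for _ in range(20)]
cluster = [(0.5 + 0.001 * i, 0.5) for i in range(40)]

observed, p = monte_carlo_p_value(background + cluster, radius=0.05)
print("observed max count:", observed, "p-value:", p)
```

A small p-value says the densest circle is unlikely under the random null model, which is the kind of check K-means output does not provide.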

Page 19

Page 20

Legionnaires’ Disease Outbreak in New York

Source: Ring-Shaped Hotspot Detection: A Summary of Results, IEEE ICDM 2014 (w/ E. Eftelioglu et al.)

Page 21

Necrotic Ring Spot Detection

Figure panels: Original Image, Grayscale, Bitmap (0,1)

Log Likelihood Ratio: 3129
p-value: 0.01

Number of Pixels Included: 4169
Inner Radius: 179 pixels
Outer Radius: 229 pixels

Page 22

Outline

• Agriculture Big Data (AgBD) Examples

• Data Management Tools

• Data Mining Tools

• Collaboration Opportunities

- USDA/NIFA FACT

- NSF: INFEWS, CPS/Ag, NRI/Ag, …

Page 23

NIFA Food & Ag Cyberinfo. Tools (FACT)

Page 24

NSF CPS/Ag

Excerpts from NSF 17-529: Cyber Physical Systems (CPS)

Page 25

NSF CPS/Ag

Page 26

References

1. Spatial Computing, Communications of the ACM, 59(1), Jan. 2016.

2. From GPS and Virtual Globes to Spatial Computing 2020, Computing Community Consortium Report, 2013. www.cra.org/ccc/visioning/visioning-activities/spatial-computing

3. Spatiotemporal Data Mining: A Computational Perspective, ISPRS International Journal of Geo-Information, 4(4):2306-2338, 2015 (DOI: 10.3390/ijgi4042306).

4. Identifying patterns in spatial information: a survey of methods, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(3):193-214, May/June 2011 (DOI: 10.1002/widm.25).

