BIOL2050 - Big data lab

Post on 25-Jan-2017

Big data lab BIOL2050

Challenges of Big Data

• Overwhelming

• Difficult to sort through to find something meaningful

• Hard to manage

Examples of Big data

www.google.com/trends/

- FIFA world cup
- Beyonce
- Potatoes
- VHS

Big Data: What is the Big deal?

Google grew from processing 100 TB of data a day in 2004 to 20 PB a day in 2008

We are producing more data than we are able to store or analyze

Economist, 2010

Big Data: What is the Big deal?

Far out software

Big Data: What is the Big deal?

“Focusing on one individual at a time, we can provide better reminders, search results, and advertisements by considering all the locations the person is likely to be close to in the future (e.g., “Need a haircut? In 4 days, you will be within 100 meters of a salon that will have a $5 special at that time.”)”

Big Data: What is the Big deal?

Enable scientific breakthroughs

- Large Hadron Collider
- Sloan Sky Survey
- Genomics
- Climate data

Hampton et al., 2013

Big data for ecology

• Ecologists produce large amounts of data, but it needs to be compiled

• Ecologists must treat data as products, just like publications

• Archive & share -> data repositories

Big data modeling exercise

Big Data for climate

Many different climate projects
- WorldClim
- CalClimate Commons
- NOAA
- European Climate Data
- Climate Data WMO

Climate data and rasters

Point < Line < Raster

Climate data and rasters

Weather station 1 Weather station 2

Climate data and rasters

Weather station 1 Weather station 2

Interpolated values
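
A minimal sketch of the idea on this slide: the raster cells between two weather stations get values estimated from the stations on either side. The function name, the station readings, and the choice of plain linear interpolation are all assumptions made for illustration; climate products generally use more elaborate interpolation methods than this.

```python
def interpolate_between_stations(value_a, value_b, n_cells):
    """Estimate values for n_cells raster cells lying between two stations.

    Assumes the cells are evenly spaced on a straight line between the
    stations and that the variable changes linearly along that line.
    """
    step = (value_b - value_a) / (n_cells + 1)
    return [value_a + step * (i + 1) for i in range(n_cells)]


# Made-up readings (degrees C) at weather station 1 and weather station 2.
station_1 = 10.0
station_2 = 20.0

# One raster row: station 1, four interpolated cells, station 2.
raster_row = [station_1] + interpolate_between_stations(station_1, station_2, 4) + [station_2]
print(raster_row)
```

Each interpolated cell sits between the two measured values, which is why a raster can cover areas where no station exists.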

Climate data and rasters

Big data & species distributions

Desert native: Chaenactis fremontii

Invasive thistle: Centaurea solstitialis

Climate & species distributions

Example

Consortium of California Herbaria – plant database: http://ucjeps.berkeley.edu/consortium/

CalAdapt – Climate commons: http://cal-adapt.org/data/tabular/

Plantago insularis

- Copy the records from the internet
- Paste special, “as text”
- Delete everything except GPS and ID
- Re-label the specimen column to “id”
- Re-label the GPS columns to “lat” and “lng”
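
The same clean-up can be sketched in code for anyone who prefers a script to a spreadsheet. This is only an illustration under stated assumptions: the pasted text, its column headers ("specimen", "collector", "latitude", "longitude"), and the specimen values are made up, and the real herbarium export will look different. Skipping records that have no coordinates is also an assumption, not a step from the slide.

```python
import csv
import io

# Made-up tab-separated text standing in for records pasted "as text".
pasted = (
    "specimen\tcollector\tlatitude\tlongitude\n"
    "SPECIMEN-1\tCollector A\t34.10\t-116.30\n"
    "SPECIMEN-2\tCollector B\t\t\n"
    "SPECIMEN-3\tCollector C\t33.25\t-115.90\n"
)


def clean_records(text, id_col="specimen", lat_col="latitude", lng_col="longitude"):
    """Keep only the ID and GPS columns, relabelled to id, lat, lng."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter="\t"):
        lat = rec[lat_col].strip()
        lng = rec[lng_col].strip()
        if not lat or not lng:
            continue  # assumption: drop specimens without GPS coordinates
        rows.append({"id": rec[id_col], "lat": float(lat), "lng": float(lng)})
    return rows


cleaned = clean_records(pasted)
for row in cleaned:
    print(row["id"], row["lat"], row["lng"])
```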

- Copy and paste
- Click away from data area
- Check settings to match below

Model climate change

• Pick one GPS point, remove all the others
• Set the time interval to daily and the model to CCSM3
• Download data
• Plot temperatures from 1950 – 2099
• Will your species go extinct?
• Try other points
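
Once the daily data are downloaded, one way to make the 1950 – 2099 pattern easier to read is to average the daily values by year and estimate a trend. The sketch below assumes the download has been reduced to (date, temperature) pairs with ISO-style dates; the function names and sample values are hypothetical, and the actual file layout may differ.

```python
from collections import defaultdict
from statistics import mean


def annual_means(daily):
    """Average daily temperatures by year.

    daily: iterable of (date_string, temperature) pairs, where the date
    string starts with a four-digit year (for example "1950-01-01").
    """
    by_year = defaultdict(list)
    for date, temp in daily:
        by_year[int(date[:4])].append(temp)
    return {year: mean(temps) for year, temps in sorted(by_year.items())}


def trend_per_decade(yearly):
    """Least-squares slope of annual mean against year, in degrees per decade."""
    years = list(yearly)
    temps = [yearly[y] for y in years]
    y_bar, t_bar = mean(years), mean(temps)
    numerator = sum((y - y_bar) * (t - t_bar) for y, t in zip(years, temps))
    denominator = sum((y - y_bar) ** 2 for y in years)
    return numerator / denominator * 10


# Made-up daily values for illustration only, not real projections.
sample = [
    ("1950-01-01", 10.0), ("1950-07-01", 20.0),
    ("1951-01-01", 11.0), ("1951-07-01", 21.0),
    ("1952-01-01", 12.0), ("1952-07-01", 22.0),
]
yearly = annual_means(sample)
print(yearly)
print(trend_per_decade(yearly))
```

The annual means can then be plotted against year in a spreadsheet or plotting library; repeating this for other GPS points shows how the projected change varies across the species' range.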