© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Earth Observation in the Cloud
Jed Sundwall, AWS Open Data Global Lead
10 November 2015
Demo Day
Agenda
• Welcome! 🌍🌎🌏
• Open data on AWS
• NEXRAD on AWS
• Landsat on AWS
Welcome! 🌍🌎🌏
Sponsor Sponsor Sponsor
Venue Sponsor
Thank you to our sponsors!
Our goals for this event
• Show off amazing work being done by our customers
• Provide opportunities for you to network
• Highlight the diversity of work made possible by Earth observation data
• Learn about your priorities and needs
New whitepaper
We have just published a new AWS Whitepaper on Minimizing Variable Costs for Shared Data.
Download it at:
http://bit.ly/s3-requester-pays-open-data
Open data on AWS
Why does AWS care about open data?
Open data is data that can be used by anyone for any purpose for free.
Many of our customers rely on quality open data as much as they rely on our computing, storage, and other web services.
Data on AWS
Amazon Web Services provides a comprehensive toolkit for gathering, storing, analyzing, and working with data at any scale.
Amazon Elastic MapReduce (Amazon EMR) provides the Apache Hadoop analytics framework as an easy-to-use managed service.
Amazon S3 lets you store and retrieve any amount of data, at any time, from anywhere on the web.
Amazon DynamoDB is a fully-managed NoSQL database service that makes it cost-effective to store and retrieve any amount of data.
1-click deployment to launch in multiple regions around the world
Pay-as-you-go pricing
Advanced Analytics, Data Integration, Analysis & Visualization
http://bit.ly/awsAnalytics
The power of open data in the cloud
Making data open on AWS enables more innovation by making data available for rapid access to our flexible and low-cost computing resources.
AWS Partners Focused on Public Sector
History of Innovation
AWS has been continually expanding its services to support virtually any cloud workload, now offering more than 40 services.
Amazon S3
Amazon SQS
Amazon EC2
Amazon SimpleDB
Amazon EBS
Amazon CloudFront
Elastic Load Balancing
Auto Scaling
Amazon VPC
Amazon RDS
Amazon SNS
AWS Identity and Access Management
Amazon Route 53
Amazon SES
AWS Elastic Beanstalk
AWS CloudFormation
Amazon ElastiCache
AWS Direct Connect
AWS GovCloud
AWS Storage Gateway
Amazon DynamoDB
Amazon CloudSearch
Amazon SWF
Amazon Glacier
Amazon Redshift
AWS Data Pipeline
Amazon Elastic Transcoder
AWS OpsWorks
AWS CloudHSM
Amazon AppStream
AWS CloudTrail
Amazon WorkSpaces
Amazon Kinesis
Amazon ECS
AWS Lambda
AWS Config
AWS CodeDeploy
Amazon RDS for Aurora
AWS KMS
Amazon Cognito
Amazon WorkDocs
AWS Directory Service
Amazon Mobile Analytics
2006 2007 2008 2009 2010 2011 2012 2013 2014 2015
Amazon EFS
Amazon WorkMail
Amazon Machine Learning
AWS has announced price reductions 49* times since our inception in 2006. Recent price drops included…
• March 26, 2014: Amazon ElastiCache reduces prices for cache nodes by an average of 34%
• March 26, 2014: Amazon S3 reduces prices for Standard and Reduced Redundancy Storage by an average of 51%
• July 31, 2014: Amazon Route 53 lowers prices for both standard queries and latency-based routing queries by 20%
* As of June 2015
Open data as a platform
An Amazonian approach to open data
Two ideas that inform how we approach public data sets:
• Work backwards from the customer
• Eliminate undifferentiated heavy lifting
Working Backwards
• Think of data sets as products
• Seek out valuable data by listening to customer needs
• Consider real-world use cases for the data
• Consider the size of the user community or market opportunity
Undifferentiated heavy lifting
“…data must be organized, well-documented, consistently formatted, and error free. Cleaning the data is often the most taxing part of data science, and is frequently 80% of the work.”— Data Driven by DJ Patil and Hilary Mason
We ask: How can we get rid of that 80%?
Public datasets on AWS
To enable more innovation, AWS hosts a selection of datasets that anyone can access for free. Data in our public datasets is available for rapid access to our flexible and low-cost computing resources.
Earth Science: Landsat on AWS
Life Sciences: 1000 Genomes Project
Internet Science: Common Crawl Corpus
NEXRAD on AWS
NEXRAD on AWS
The Next Generation Weather Radar (NEXRAD) is a network of 160 high-resolution Doppler radar sites that detect precipitation and atmospheric movement and disseminate data in 5-minute intervals from each site.
Acquiring, storing, and analyzing NEXRAD data has traditionally been time-consuming and expensive. Accessing the full historical archive has been impossible.
NEXRAD on AWS
[Diagram: NEXRAD sites → public Amazon S3 bucket (real-time data chunks) → Amazon EC2 (volume scan file assembly) → public Amazon S3 bucket (continuously updated archive)]
With NEXRAD on AWS, we provide an archive of individual volume scan files and real-time chunks as objects in Amazon S3.
This allows the data to be accessed programmatically via a RESTful interface and quickly deployed to any of our products for analysis and processing.
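As a rough illustration of that RESTful access, the sketch below builds an S3 list request for one radar site on one day and parses the returned XML for object keys. The bucket name `noaa-nexrad-level2`, the `YYYY/MM/DD/SITE/` key layout, and the site code `KVWX` are assumptions that do not appear in these slides, and the helper names are made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Assumption: bucket name and key layout are illustrative, not from the slides.
BUCKET = "noaa-nexrad-level2"
# XML namespace used by Amazon S3 list responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def scan_prefix(year, month, day, site):
    """Build the assumed key prefix for one site on one day."""
    return f"{year:04d}/{month:02d}/{day:02d}/{site}/"


def listing_url(bucket, prefix):
    """Build an S3 REST list request (list-type=2) for a key prefix."""
    query = urlencode({"list-type": "2", "prefix": prefix})
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def parse_keys(xml_text):
    """Extract object keys from an S3 list response body."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(f"{S3_NS}Key")]


def list_volume_scans(year, month, day, site):
    """Fetch the listing over HTTPS; requires network access."""
    url = listing_url(BUCKET, scan_prefix(year, month, day, site))
    with urlopen(url) as resp:
        return parse_keys(resp.read())
```

Calling `list_volume_scans(2015, 5, 15, "KVWX")` would return at most one page of keys; a fuller client would follow the continuation token in the response.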
NEXRAD on AWS
Our collaborators, including Unidata, The Weather Company, NOAA, Climate Corporation, and CartoDB, have provided early use cases and tutorials on how to use this data in the cloud.
A wide range of users are interested in using NEXRAD on AWS for longitudinal analysis, for studying and visualizing specific weather events, and for developing new products.
More info at http://aws.amazon.com/public-data-sets/nexrad
Landsat on AWS
Landsat on AWS
We have committed to making up to 1 petabyte of Landsat imagery readily available as objects on Amazon S3.
All Landsat 8 scenes from 2015 are available, along with a selection of cloud-free scenes from 2013 and 2014. All new Landsat 8 scenes are made available each day (~700 per day), often within hours of production.
The Traditional Approach
Data is most commonly accessed via a web interface and downloaded on premises before being loaded into a web server.
All bands are downloaded in a .tar archive, even if you only need a few bands.
Data acquisition is time-consuming and inherently redundant. Analysis is limited by the user’s access to bandwidth, storage, memory, and processing power.
Landsat on AWS
Landsat on AWS makes each band of each scene readily available as objects on Amazon S3. Data can be accessed programmatically via HTTP and quickly deployed to any of our products for analysis and processing.
Users do not need to worry about local storage and have access to virtually unlimited computing power on demand.
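To make the per-band access concrete, here is a minimal sketch that builds HTTP URLs for individual bands of one scene. The `L8/<path>/<row>/` prefix matches the URL shown elsewhere in this deck; the per-scene folder and the `<scene>_B<band>.TIF` file naming are assumptions, and the scene ID used in the usage note is a placeholder rather than a real scene.

```python
# Base URL taken from the deck; everything below the path/row level is assumed.
BASE_URL = "https://landsat-pds.s3.amazonaws.com/L8"


def band_url(path, row, scene_id, band):
    """Build the URL of a single band of a single scene.

    Assumes each scene has its own folder and one file per band
    named <scene_id>_B<band>.TIF.
    """
    return f"{BASE_URL}/{path:03d}/{row:03d}/{scene_id}/{scene_id}_B{band}.TIF"


def rgb_urls(path, row, scene_id):
    """Return URLs for the Landsat 8 red, green, and blue bands (4, 3, 2)."""
    return [band_url(path, row, scene_id, band) for band in (4, 3, 2)]
```

For example, `rgb_urls(72, 89, "SCENE_ID_PLACEHOLDER")` yields three URLs under `L8/072/089/`, so a natural-color composite needs only three objects instead of the full archive.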
[Diagram: USGS .tar archives → per-band .tiff objects in s3://landsat-pds → Amazon EC2]
Undifferentiated heavy lifting
We use GDAL to add “internal tiling” to each Landsat on AWS TIFF, which allows developers to use HTTP range GET requests to access specific portions of each scene.
This allows people to access only the data they need, when they need it.
[Diagram: standard TIFF object vs. internally tiled TIFF object]
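An HTTP range GET can be sketched with the Python standard library as follows. This only shows how a byte range is requested; locating the byte offsets of a particular tile inside a TIFF is left to a library such as GDAL, and the function names here are hypothetical.

```python
from urllib.request import Request, urlopen


def range_header(offset, length):
    """Format an HTTP Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Byte ranges are inclusive on both ends.
    return f"bytes={offset}-{offset + length - 1}"


def fetch_range(url, offset, length):
    """Download only part of an object; requires network access.

    A server that honors the range replies with 206 Partial Content.
    """
    req = Request(url, headers={"Range": range_header(offset, length)})
    with urlopen(req) as resp:
        return resp.read()
```

Reading the first few kilobytes of a file this way is typically enough to get the TIFF header and tile index, after which only the tiles covering the area of interest need to be fetched.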
RGB: visible light
Infrared: vegetation
Shortwave infrared: urban areas
Wellington, New Zealand (made on Snapsat.org)
https://landsat-pds.s3.amazonaws.com/L8/072/089/
SHARE DATA VIA URLS, NOT COPIES
Landsat on AWS
In the first 150 days (19 Mar to 16 Aug 2015)
• Over 200,000 scenes available
• Over 500 million hits globally
Image shows frequency of scene requests by path/row.
White: ~100 requests
Orange: >300k requests
Visualization by Drew Bollinger, Development Seed
Landsat on AWS as a platform
New SNS topic for Landsat on AWS
arn:aws:sns:us-west-2:274514004127:OpenObjectAddL8
You can now subscribe to a publicly available Amazon Simple Notification Service (Amazon SNS) topic to be notified whenever a new batch of Landsat scenes is available at s3://landsat-pds.
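One way to subscribe is to attach an Amazon SQS queue to the topic with boto3, as in the sketch below. The topic ARN is the one given above; the queue is assumed to exist already in your own account, its access policy is assumed to allow this topic to send messages to it, and the helper names are hypothetical.

```python
TOPIC_ARN = "arn:aws:sns:us-west-2:274514004127:OpenObjectAddL8"


def topic_region(arn):
    """Extract the region from an SNS topic ARN (arn:aws:sns:<region>:<account>:<name>)."""
    parts = arn.split(":")
    if len(parts) != 6 or parts[2] != "sns":
        raise ValueError(f"not an SNS topic ARN: {arn}")
    return parts[3]


def subscribe_queue(queue_arn, topic_arn=TOPIC_ARN):
    """Subscribe an existing SQS queue to the topic and return the subscription ARN.

    Requires boto3 and AWS credentials; the SNS client must be created
    in the same region as the topic.
    """
    import boto3  # third-party; imported here so the helpers above work without it

    sns = boto3.client("sns", region_name=topic_region(topic_arn))
    response = sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
    return response["SubscriptionArn"]
```

A queue is a reasonable choice here because notifications accumulate until a worker reads them; an email or HTTPS endpoint could be used instead by changing `Protocol` and `Endpoint`.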