
Javaedge 2010-cschalk

Date post: 21-Jan-2015
Upload: chris-schalk
Google Cloud Computing for Java Developers: Platform and Monetization Chris Schalk Google Developer Advocate TheEdge 2010 Tel Aviv, Israel Dec 16th, 2010
Transcript
Page 1: Javaedge 2010-cschalk

Google Cloud Computing for Java Developers: Platform and Monetization

Chris Schalk Google Developer Advocate 

TheEdge 2010 Tel Aviv, Israel Dec 16th, 2010 

Page 2: Javaedge 2010-cschalk

Google Cloud Platform Technologies at a Glance

Google BigQuery  Google Prediction API

Google Storage 

Google App Engine 

Google App Engine for Business (new) 

Existing

New! 

Page 3: Javaedge 2010-cschalk

Agenda

•  Part I - Intro to App Engine
   •  App Engine Details
   •  Development Tools
   •  App Engine for Business
   •  Apps Monetization – Apps Marketplace

•  Part II – Google’s new cloud technologies
   •  Google Storage
   •  Prediction API
   •  BigQuery

Page 4: Javaedge 2010-cschalk

Part I – Intro to App Engine

Topics covered

•  App Engine a PaaS •  App Engine usage/customers •  App Engine Technical Details

Page 5: Javaedge 2010-cschalk

Google App Engine Build your own applications in Google's cloud

Page 6: Javaedge 2010-cschalk

IaaS 

PaaS 

SaaS 

Source: Gartner AADI Summit Dec 2009 

Cloud Computing as Gartner Sees It

Page 7: Javaedge 2010-cschalk

• Easy to build

• Easy to maintain

• Easy to scale


Why Google App Engine?

Page 8: Javaedge 2010-cschalk

150,000+ active apps on a weekly basis

By the Numbers

Page 9: Javaedge 2010-cschalk

100,000+ developers use it every month

By the Numbers

Page 10: Javaedge 2010-cschalk

1B+ daily pageviews

By the Numbers

Page 11: Javaedge 2010-cschalk


Some App Engine Partners

Page 12: Javaedge 2010-cschalk

App Engine Details


Page 13: Javaedge 2010-cschalk

Cloud Development in a Box


•  Downloadable SDK

•  Application runtimes •  Java, Python

•  Local development tools •  Eclipse plugin, AppEngine Launcher

•  Specialized application services

•  Cloud based dashboard

•  Ready to scale

•  Built in fault tolerance, load balancing

Page 14: Javaedge 2010-cschalk

Specialized Services


Blobstore Images 

Mail  XMPP  Task Queue 

Memcache  Datastore  URL Fetch 

User Service 

Page 15: Javaedge 2010-cschalk

Language Runtimes


Duke, the Java mascot Copyright © Sun Microsystems Inc., all rights reserved. 

Page 16: Javaedge 2010-cschalk

Ensuring Portability


Page 17: Javaedge 2010-cschalk

Extended Language support through JVM

•  Java •  Scala •  JRuby (Ruby) •  Groovy •  Quercus (PHP) •  Rhino (JavaScript) •  Jython (Python)


Duke, the Java mascot Copyright © Sun Microsystems Inc., all rights reserved. 

Page 18: Javaedge 2010-cschalk

Always free to get started

•  ~5M pageviews/month •  6.5 CPU hrs/day •  1 GB storage •  650K URL Fetch calls/day •  2,000 recipients emailed •  1 GB/day bandwidth •  100,000 tasks enqueued •  650K XMPP messages/day


Page 19: Javaedge 2010-cschalk

Application Platform Management


Page 20: Javaedge 2010-cschalk

App Engine Dashboard


Page 21: Javaedge 2010-cschalk

Development Tools for App Engine


Page 22: Javaedge 2010-cschalk

Google Plugin for Eclipse


Page 23: Javaedge 2010-cschalk

SDK Console


Page 24: Javaedge 2010-cschalk

Two+ years in review

Apr 2008  Python launch
May 2008  Memcache, Images API
Jul 2008  Logs export
Aug 2008  Batch write/delete
Oct 2008  HTTPS support
Dec 2008  Status dashboard, quota details
Feb 2009  Billing, larger files
Apr 2009  Java launch, DB import, cron support, SDC
May 2009  Key-only queries
Jun 2009  Task queues
Aug 2009  Kindless queries
Sep 2009  XMPP
Oct 2009  Incoming email
Dec 2009  Blobstore
Feb 2010  Datastore cursors, Appstats
Mar 2010  Read policies, IPv6
May 2010  App Engine for Business
Jun 2010  Task queue increases, Python pre-compilation…
Jul 2010  Mapper API
Aug 2010  Multi-tenancy, high-performance image serving, custom error pages
Oct 2010  Instances Console, Delete Kind/App Data

Page 25: Javaedge 2010-cschalk

App Engine 1.4 Release New Features

1. Channel API
   –  Allows for Server Push (Comet) to browser
   –  http://code.google.com/appengine/docs/java/channel/

2. Always On

3. Warm Up Requests
   –  Enabled by default for Java apps
   –  Can turn off in appengine-web.xml via: <warmup-requests-enabled>false</warmup-requests-enabled>

Page 26: Javaedge 2010-cschalk

App Engine 1.4 Release New Features

4. Hard Limit Updates
   –  No more 30 second limit for background work -> up to 10 minutes
   –  Response size limits for URLFetch have been raised from 1MB to 32MB
   –  Memcache batch get/put can now also do up to 32MB requests
   –  Image API request and response size limits have been raised from 1MB to 32MB
   –  Mail API outgoing attachments have been increased from 1MB to 10MB

Page 27: Javaedge 2010-cschalk

Other Upcoming Features

1. Mapper API
   –  First component of App Engine’s MapReduce toolkit
      •  http://code.google.com/p/appengine-mapreduce/
   –  Large scale data manipulation
   –  Examples include:
      •  Report generation
      •  Computing statistics and metrics …
   –  Java Example:
      •  http://ikaisays.com/2010/07/09/using-the-java-mapper-framework-for-app-engine/
      •  Google “sqlreduce”
         –  http://code.google.com/p/fredsa/source/browse/trunk/?r=115#trunk%2Fsqlreduce

2. Matcher API
   –  Matcher allows an app to register a set of queries to match against a stream of documents. For every document presented, Matcher returns the ids of all the registered queries that match the document.
   –  Trusted tester program announced in App Engine forum
   –  Java support coming, but still Python only for now

…but you can try out early versions now! 
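The Matcher idea (register queries once, then get back the ids of every query that matches each incoming document) can be illustrated with a small in-memory sketch. This is not the App Engine Matcher API: the `ToyMatcher` class, its `subscribe` and `match` methods, and the "all keywords must appear" query form are assumptions made purely for illustration.

```python
class ToyMatcher:
    """Hypothetical in-memory stand-in for the matcher concept (not the real API)."""

    def __init__(self):
        # Maps a query id to the set of lowercase keywords it requires.
        self._queries = {}

    def subscribe(self, query_id, keywords):
        """Register a query that matches documents containing all keywords."""
        self._queries[query_id] = {k.lower() for k in keywords}

    def match(self, document):
        """Return the sorted ids of all registered queries matching the document."""
        words = set(document.lower().split())
        return sorted(qid for qid, kws in self._queries.items() if kws <= words)


matcher = ToyMatcher()
matcher.subscribe("q-cloud", ["cloud"])
matcher.subscribe("q-java-cloud", ["java", "cloud"])
matcher.subscribe("q-python", ["python"])

# Each document in the stream is checked against every registered query.
print(matcher.match("Java developers deploy to the cloud"))
```

The inversion is the point: queries are stored and documents stream past them, which is the opposite of a normal datastore query over stored documents.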

Page 28: Javaedge 2010-cschalk

Introducing App Engine for Business


Same scalable cloud platform, but designed for the Enterprise

App Engine for Business

Page 29: Javaedge 2010-cschalk

Google App Engine for Business Details

•  Enterprise application management
   –  Centralized domain console (preview available)

•  Enterprise reliability and support
   –  99.9% Service Level Agreement
   –  Direct support

•  Hosted SQL
   –  Relational SQL database in the cloud (preview available)

•  SSL on your domain

•  Extremely secure by default
   –  Integrated Single Sign On (SSO)

•  Pricing that makes sense
   –  Apps cost $8 per user, up to $1000 max per month

Google App Engine for Business
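The pricing rule on this slide ($8 per user, up to $1000 max per month) works out as a simple capped product. The sketch below assumes the cap applies per application per month, which the slide implies but does not spell out; the function name `monthly_app_cost` is hypothetical.

```python
def monthly_app_cost(users, per_user=8, cap=1000):
    """Monthly cost in dollars for one app: $8 per user, capped at $1000.

    Assumption: the $1000 cap is per application per month.
    """
    if users < 0:
        raise ValueError("users must be non-negative")
    return min(users * per_user, cap)


for users in (10, 125, 500):
    print(users, "users ->", monthly_app_cost(users), "dollars/month")
```

Under this reading the cap is reached at 125 users, so any larger deployment pays the same flat amount.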

Page 30: Javaedge 2010-cschalk

Enterprise App Development with Google


Build your own

Google App Engine for Business

Buy from others

Google Apps Marketplace

Enterprise Firewall 

Enterprise Data  Authentication  Enterprise Services  User Management

Buy from Google

Google Apps for Business

Enterprise Application Platform

Page 31: Javaedge 2010-cschalk


App Engine for Business Roadmap

Enterprise Administration Console: Preview (signups available)

Direct Support: Preview (signups available)

Hosted SQL: Preview (signups available)

Service Level Agreement: Available Q4 2010 (Draft published)

Enterprise billing: Available Q4 2010

Custom Domain SSL: Limited Release EOY 2010

Page 32: Javaedge 2010-cschalk

App Engine Resources

Get started with App Engine •  http://code.google.com/appengine

Read up on App Engine for Business and become a trusted tester •  http://code.google.com/appengine/business

•  bit.ly/gae4btt <- sign up!

Page 33: Javaedge 2010-cschalk

Enough technology... How do you monetize your apps?

Page 34: Javaedge 2010-cschalk

Apps Monetization – Apps Marketplace

http://google.com/enterprise/marketplace/

Page 35: Javaedge 2010-cschalk

Your Apps!

Add your Apps to the Marketplace! 

IaaS 

PaaS 

SaaS 

Page 36: Javaedge 2010-cschalk

Apps Monetization – Apps Marketplace for Developers

http://code.google.com/googleapps/marketplace/

Page 37: Javaedge 2010-cschalk

App Engine Demos

•  App Engine/Java •  Getting started

•  App Engine for Business •  Domain Console •  SQL •  Guestbook on SQL on GAE4B •  SQLReduce

Page 38: Javaedge 2010-cschalk

Part II - Google’s new Cloud Technologies

Topics covered •  Google Storage for Developers •  Prediction API (machine learning) •  BigQuery

Page 39: Javaedge 2010-cschalk

Google Storage for Developers Store your data in Google's cloud

Page 40: Javaedge 2010-cschalk

What Is Google Storage?

•  Store your data in Google's cloud o  any format, any amount, any time

•  You control access to your data o  private, shared, or public 

•   Access via Google APIs or 3rd party tools/libraries 

Page 41: Javaedge 2010-cschalk

Sample Use Cases

Static content hosting e.g. static html, images, music, video

Backup and recovery e.g. personal data, business records

Sharing e.g. share data with your customers

Data storage for applications e.g. used as storage backend for Android, AppEngine, Cloud based apps

Storage for Computation e.g. BigQuery, Prediction API

Page 42: Javaedge 2010-cschalk

Google Storage Benefits

High Performance and Scalability: Backed by Google infrastructure

Strong Security and Privacy: Control access to your data

Easy to Use: Get started fast with Google & 3rd party tools

Page 43: Javaedge 2010-cschalk

Google Storage Technical Details

•  RESTful API  o  Verbs: GET, PUT, POST, HEAD, DELETE  o  Resources: identified by URI o  Compatible with S3  

•  Buckets  o  Flat containers  

•  Objects  o  Any type o  Size: 100 GB / object 

•  Access Control for Google Accounts  o  For individuals and groups  

•  Two Ways to Authenticate Requests  o  Sign request using access keys  o  Web browser login

Page 44: Javaedge 2010-cschalk

Performance and Scalability

•  Objects of any type and 100 GB / Object •  Unlimited numbers of objects, 1000s of buckets

•  All data replicated to multiple US data centers •  Utilizes Google's worldwide network for data delivery

•  Only you can use bucket names with your domain names •  Read-your-writes data consistency •  Range Get
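"Range Get" most likely refers to the standard HTTP `Range` request header, which lets a client fetch part of a large object. Under that assumption, the sketch below only builds the header values needed to download an object in fixed-size pieces; it makes no network calls, and the helper name `byte_ranges` is hypothetical.

```python
def byte_ranges(total_size, chunk_size):
    """Yield HTTP Range header values that cover total_size bytes in order."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    for start in range(0, total_size, chunk_size):
        # Both ends of an HTTP byte range are inclusive.
        end = min(start + chunk_size, total_size) - 1
        yield "bytes=%d-%d" % (start, end)


# One request header per chunk of a 2500-byte object, 1000 bytes at a time.
headers = [{"Range": value} for value in byte_ranges(2500, 1000)]
for header in headers:
    print(header)
```

Chunked reads like this matter mostly for objects near the 100 GB limit, where a failed transfer can resume from the last completed range instead of starting over.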

Page 45: Javaedge 2010-cschalk

Some Early Google Storage Adopters 

Page 46: Javaedge 2010-cschalk

Google Storage - Availability

•  Preview in US currently
   o  100GB free storage and network from Google per account
   o  Sign up for waitlist at http://code.google.com/apis/storage/

•  Note: Non US preview available on case-by-case basis
•  http://bit.ly/dKm770 (for Storage, BigQuery, Prediction)

Page 47: Javaedge 2010-cschalk

Google Storage - Pricing

o  Storage
   $0.17/GB/Month

o  Network
   Upload: $0.10/GB
   Download: $0.15/GB Americas / EMEA, $0.30/GB APAC

o  Requests
   PUT, POST, LIST: $0.01 / 1000 Requests
   GET, HEAD: $0.01 / 10000 Requests
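As a worked example of the price list above, the sketch below estimates a monthly bill from the slide's rates. The function name, the region keys, and the sample workload are hypothetical, and the calculation deliberately ignores the free preview allowance mentioned on the previous slide.

```python
# Rates in US dollars, taken from the pricing slide.
STORAGE_PER_GB_MONTH = 0.17
UPLOAD_PER_GB = 0.10
DOWNLOAD_PER_GB = {"americas_emea": 0.15, "apac": 0.30}
WRITE_REQUESTS_PER_1000 = 0.01    # PUT, POST, LIST
READ_REQUESTS_PER_10000 = 0.01    # GET, HEAD


def estimate_monthly_cost(stored_gb, upload_gb, download_gb, region,
                          write_requests, read_requests):
    """Estimate one month's bill in dollars (free preview quota not applied)."""
    cost = stored_gb * STORAGE_PER_GB_MONTH
    cost += upload_gb * UPLOAD_PER_GB
    cost += download_gb * DOWNLOAD_PER_GB[region]
    cost += write_requests / 1000 * WRITE_REQUESTS_PER_1000
    cost += read_requests / 10000 * READ_REQUESTS_PER_10000
    return round(cost, 2)


# Hypothetical workload: 100 GB stored, 10 GB up, 50 GB down to the Americas,
# 20,000 write-type requests and 1,000,000 read-type requests.
print(estimate_monthly_cost(100, 10, 50, "americas_emea", 20000, 1000000))
```

For this workload the terms are 17.00 for storage, 1.00 for upload, 7.50 for download, 0.20 for writes and 1.00 for reads, so storage and download dominate while request charges stay small.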

Page 48: Javaedge 2010-cschalk

Demo

•  Tools: o  GS Manager o  GSUtil

•  Upload / Download

Page 49: Javaedge 2010-cschalk

Google Prediction API Google's prediction engine in the cloud

Page 50: Javaedge 2010-cschalk

Introducing the Google Prediction API

•  Google's sophisticated machine learning technology •  Available as an on-demand RESTful HTTP web service

Page 51: Javaedge 2010-cschalk

How does it work?

"english"  The quick brown fox jumped over the lazy dog.

"english"  To err is human, but to really foul things up you need a computer.

"spanish"  No hay mal que por bien no venga.

"spanish"  La tercera es la vencida.

?  To be or not to be, that is the question.

?  La fe mueve montañas.

The Prediction API finds relevant features in the sample data during training.

The Prediction API later searches for those features during prediction.
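The train-then-predict idea on this slide can be mimicked with a deliberately tiny word-overlap classifier. This is not the algorithm the Prediction API uses (the slides do not describe it); it is only a sketch of "find features during training, look for them during prediction", with whole words as the assumed features and hypothetical function names.

```python
from collections import Counter, defaultdict

# Labeled examples copied from the slide.
TRAINING = [
    ("english", "The quick brown fox jumped over the lazy dog."),
    ("english", "To err is human, but to really foul things up you need a computer."),
    ("spanish", "No hay mal que por bien no venga."),
    ("spanish", "La tercera es la vencida."),
]


def features(text):
    """Lowercased words with surrounding punctuation stripped."""
    return [w.strip(".,!?\"'").lower() for w in text.split()]


def train(examples):
    """Count how often each word feature appears under each label."""
    model = defaultdict(Counter)
    for label, text in examples:
        model[label].update(features(text))
    return model


def predict(model, text):
    """Pick the label whose training words overlap most with the text."""
    words = features(text)
    scores = {label: sum(counts[w] for w in words) for label, counts in model.items()}
    return max(scores, key=scores.get)


model = train(TRAINING)
print(predict(model, "To be or not to be, that is the question."))
print(predict(model, "La fe mueve montañas."))
```

With only four training sentences the toy model leans on a handful of shared words ("to", "the", "is", "la"), which is why a real service needs far more sample data.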

Page 52: Javaedge 2010-cschalk

A virtually endless number of applications...

Customer Sentiment
Transaction Risk
Species Identification
Message Routing
Legal Docket Classification
Suspicious Activity
Work Roster Assignment
Recommend Products
Political Bias
Uplift Marketing
Email Filtering
Diagnostics
Inappropriate Content
Career Counselling
Churn Prediction

... and many more ...

Page 53: Javaedge 2010-cschalk

Using the Prediction API

A simple three step process...

1. Upload: Upload your training data to Google Storage

2. Train: Build a model from your data

3. Predict: Make new predictions

Page 54: Javaedge 2010-cschalk

Step 1: Upload
Upload your training data to Google Storage

•  Training data: outputs and input features
•  Data format: comma separated value format (CSV)

"english","To err is human, but to really ..."
"spanish","No hay mal que por bien no venga."
...

Upload to Google Storage:

gsutil cp ${data} gs://yourbucket/${data}
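The training file format shown here (label first, then the text, every field quoted) can be produced with Python's standard `csv` module. This is a minimal sketch: the in-memory buffer stands in for a real file, and the two rows are the slide's own examples.

```python
import csv
import io

# (label, text) pairs: the output comes first, then the input feature.
rows = [
    ("english", "To err is human, but to really foul things up you need a computer."),
    ("spanish", "No hay mal que por bien no venga."),
]

# QUOTE_ALL wraps every field in double quotes, so commas inside the
# text cannot be mistaken for field separators.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(rows)

training_csv = buffer.getvalue()
print(training_csv)
```

Writing the same rows to a file on disk instead of a buffer gives the `${data}` file that the `gsutil cp` command on the slide uploads.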

Page 55: Javaedge 2010-cschalk

Step 2: Train Create a new model by training on data 

To train a model:

POST prediction/v1.1/training?data=mybucket%2Fmydata

Training runs asynchronously. To see if it has finished:

GET prediction/v1.1/training/mybucket%2Fmydata

{"data":{ "data":"mybucket/mydata", "modelinfo":"estimated accuracy: 0.xx"}}
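Both training calls identify the model by its bucket and object name with the slash percent-encoded as `%2F`. The sketch below shows one way to build those two relative paths with the standard library; the helper names are hypothetical and the path prefix is copied from the slide.

```python
from urllib.parse import quote

API_PREFIX = "prediction/v1.1"  # relative path as written on the slide


def model_id(bucket, obj):
    """Encode 'bucket/object' so the slash becomes %2F."""
    return quote(bucket + "/" + obj, safe="")


def training_start_path(bucket, obj):
    """Path for the POST that starts training."""
    return "%s/training?data=%s" % (API_PREFIX, model_id(bucket, obj))


def training_status_path(bucket, obj):
    """Path for the GET that checks whether training has finished."""
    return "%s/training/%s" % (API_PREFIX, model_id(bucket, obj))


print(training_start_path("mybucket", "mydata"))
print(training_status_path("mybucket", "mydata"))
```

Passing `safe=""` matters: by default `quote` leaves `/` unescaped, which would split the model name into two path segments.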

Page 56: Javaedge 2010-cschalk

Step 3: Predict
Apply the trained model to make predictions on new data

POST prediction/v1.1/query/mybucket%2Fmydata/predict

{ "data":{ "input": { "text" : [ "J'aime X! C'est le meilleur" ]}}}

Page 57: Javaedge 2010-cschalk

Step 3: Predict
Apply the trained model to make predictions on new data

POST prediction/v1.1/query/mybucket%2Fmydata/predict

{ "data":{ "input": { "text" : [ "J'aime X! C'est le meilleur" ]}}}

{ "data" : {
    "kind" : "prediction#output",
    "outputLabel" : "French",
    "outputMulti" : [
      {"label":"French", "score": x.xx},
      {"label":"English", "score": x.xx},
      {"label":"Spanish", "score": x.xx}]}}
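The request and reply shapes on this slide map directly onto Python dictionaries. The sketch below builds the request body and reads a reply of the same shape; the numeric scores in `sample_reply` are invented for illustration (the slide shows only `x.xx`), and both function names are hypothetical.

```python
import json


def build_predict_body(text):
    """Serialize one text input in the request shape shown on the slide."""
    return json.dumps({"data": {"input": {"text": [text]}}})


def ranked_labels(reply_text):
    """Return the winning label and all (label, score) pairs, best first."""
    reply = json.loads(reply_text)["data"]
    ranked = sorted(reply["outputMulti"], key=lambda item: item["score"], reverse=True)
    return reply["outputLabel"], [(item["label"], item["score"]) for item in ranked]


# Hypothetical reply: the structure follows the slide, the scores are made up.
sample_reply = json.dumps({"data": {
    "kind": "prediction#output",
    "outputLabel": "French",
    "outputMulti": [
        {"label": "English", "score": 0.06},
        {"label": "French", "score": 0.90},
        {"label": "Spanish", "score": 0.04},
    ],
}})

print(build_predict_body("J'aime X! C'est le meilleur"))
print(ranked_labels(sample_reply))
```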

Page 58: Javaedge 2010-cschalk

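Step 3: Predict
Apply the trained model to make predictions on new data

An example using Python

import http.client  # Python 3 name of the httplib module
import json

header = {"Content-Type" : "application/json"}

#...put new data in JSON format in params variable
params = json.dumps({"data": {"input": {"text": ["J'aime X! C'est le meilleur"]}}})

conn = http.client.HTTPConnection("www.googleapis.com")
conn.request("POST", "/prediction/v1.1/query/mybucket%2Fmydata/predict", params, header)

# Print the body of the HTTP response
print(conn.getresponse().read())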

Page 59: Javaedge 2010-cschalk

Prediction API Capabilities

Data •  Input Features: numeric or unstructured text •  Output: up to hundreds of discrete categories

Training •  Many machine learning techniques •  Automatically selected •  Performed asynchronously

Access from many platforms: •  Web app from Google App Engine •  Apps Script (e.g. from Google Spreadsheet) •  Desktop app

Page 60: Javaedge 2010-cschalk

Prediction API v1.1 - features

•  Updated Syntax

•  Multi-category prediction
   o  Tag entry with multiple labels

•  Continuous Output
   o  Finer grained prediction rankings based on multiple labels

•  Mixed Inputs
   o  Both numeric and text inputs are now supported

Can combine continuous output with mixed inputs

Page 61: Javaedge 2010-cschalk

Prediction API Demos

•  Creating training data – recipes.csv •  Simple REST access

•  Training the prediction engine •  Start predicting!

•  A Java Web example

Page 62: Javaedge 2010-cschalk

Google BigQuery Interactive analysis of large datasets in Google's cloud

Page 63: Javaedge 2010-cschalk

Introducing Google BigQuery

•  Google's large data ad hoc analysis technology
   o  Analyze massive amounts of data in seconds

•  Simple SQL-like query language

•  Flexible access
   o  REST APIs, JSON-RPC, Google Apps Script

Page 64: Javaedge 2010-cschalk

Why BigQuery?  

Working with large data is a challenge 

Page 65: Javaedge 2010-cschalk

Many Use Cases ... 

Spam Trends

Detection

Web Dashboards  Network Optimization

Interactive Tools

Page 66: Javaedge 2010-cschalk

Key Capabilities of BigQuery

•  Scalable: Billions of rows •  Fast: Response in seconds

•  Simple: Queries in SQL

•  Web Service o  REST o  JSON-RPC o  Google Apps Script

Page 67: Javaedge 2010-cschalk

Using BigQuery

Another simple three step process...

1. Upload: Upload your raw data to Google Storage

2. Import: Import raw data into BigQuery table

3. Query: Perform SQL queries on table

Page 68: Javaedge 2010-cschalk

Writing Queries

Compact subset of SQL o  SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY ... LIMIT ...;

Common functions o  Math, String, Time, ...

Statistical approximations o  TOP o  COUNT DISTINCT
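The query shape listed on this slide (SELECT, FROM, WHERE, GROUP BY, ORDER BY, LIMIT) can be tried out locally. The sketch below uses Python's built-in `sqlite3` as a stand-in, since BigQuery itself needs a preview account; the `revisions` table and its rows are invented for illustration, and SQLite returns exact counts where the slide says BigQuery's TOP and COUNT DISTINCT are statistical approximations.

```python
import sqlite3

# In-memory SQLite database standing in for a BigQuery table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revisions (title TEXT, contributor TEXT)")
conn.executemany(
    "INSERT INTO revisions VALUES (?, ?)",
    [
        ("Cloud computing", "alice"),
        ("Cloud computing", "bob"),
        ("Cloud computing", "alice"),
        ("Java", "carol"),
        ("Java", "carol"),
        ("Python", "dave"),
    ],
)

# One query exercising every clause named on the slide.
query = """
    SELECT title, COUNT(*) AS edits, COUNT(DISTINCT contributor) AS editors
    FROM revisions
    WHERE title != 'Python'
    GROUP BY title
    ORDER BY edits DESC
    LIMIT 2;
"""
result = conn.execute(query).fetchall()
print(result)
```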

Page 69: Javaedge 2010-cschalk

BigQuery via REST

GET /bigquery/v1/tables/{table name}

GET /bigquery/v1/query?q={query}

Sample JSON Reply:

{ "results": {
    "fields": { [ {"id":"COUNT(*)","type":"uint64"}, ... ] },
    "rows": [
      {"f":[{"v":"2949"}, ...]},
      {"f":[{"v":"5387"}, ...]},
      ... ] } }

Also supports JSON-RPC
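A reply of this shape pairs a list of field descriptors with rows of `f`/`v` cells, so turning it into ordinary Python records takes only a few lines. The sketch below assumes `fields` is a plain JSON list (as printed on the slide, with extra braces, it would not parse) and drops the elided `...` entries; the function name `rows_as_dicts` is hypothetical.

```python
import json

# Cleaned-up version of the slide's sample reply (assumed shape, elisions removed).
sample_reply = """
{"results": {
  "fields": [{"id": "COUNT(*)", "type": "uint64"}],
  "rows": [{"f": [{"v": "2949"}]}, {"f": [{"v": "5387"}]}]
}}
"""


def rows_as_dicts(reply_text):
    """Convert the fields/rows reply into a list of {field id: value} dicts."""
    results = json.loads(reply_text)["results"]
    names = [field["id"] for field in results["fields"]]
    types = [field["type"] for field in results["fields"]]
    table = []
    for row in results["rows"]:
        values = [cell["v"] for cell in row["f"]]
        # Values arrive as strings; convert the integer type seen on the slide.
        converted = [int(v) if t == "uint64" else v for v, t in zip(values, types)]
        table.append(dict(zip(names, converted)))
    return table


print(rows_as_dicts(sample_reply))
```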

Page 70: Javaedge 2010-cschalk

Security and Privacy

Standard Google Authentication •  Client Login •  OAuth •  AuthSub

HTTPS support •  protects your credentials •  protects your data

Relies on Google Storage to manage access

Page 71: Javaedge 2010-cschalk

Large Data Analysis Example

Wikimedia Revision history data from: http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z

Wikimedia Revision History 

Page 72: Javaedge 2010-cschalk

Using BigQuery Shell 

Python DB API 2.0 + B. Clapper's sqlcmd http://www.clapper.org/software/python/sqlcmd/

Page 73: Javaedge 2010-cschalk

BigQuery from a Spreadsheet

Page 74: Javaedge 2010-cschalk

BigQuery from a Spreadsheet

Page 75: Javaedge 2010-cschalk

Further info available at: 

•  Google Storage for Developers o  http://code.google.com/apis/storage

•  Prediction API o  http://code.google.com/apis/predict

•  BigQuery o  http://code.google.com/apis/bigquery

Page 76: Javaedge 2010-cschalk

Recap

•  Google App Engine o  Google’s PaaS cloud development platform

•  Google App Engine for Business o  New enterprise version of App Engine

•  Google Storage o  New high speed data storage on Google Cloud

•  Prediction API o  New machine learning technology able to predict outcomes based on sample data

•  BigQuery o  New service for interactive analysis of very large data sets using SQL

Page 77: Javaedge 2010-cschalk

Q&A

Page 78: Javaedge 2010-cschalk

Thank You!

Chris Schalk Google Developer Advocate

http://twitter.com/cschalk

