A Modern Framework for Amazon Elastic MapReduce (BDT309) | AWS re:Invent 2013

Post on 12-Jan-2015

description

If you've ever developed code for processing data, you know what a mess it can be, especially on Hadoop. You lack debugging tools, instant feedback, automated tests, and a sane deployment process. Mortar has developed a modern framework for data processing on Hadoop and Amazon Elastic MapReduce. It is a free, open framework providing instant, step-by-step execution visibility, automated testing, reusable components, and one-button deployment. See Mortar demonstrate this framework on Amazon EMR, using a sample data set to solve a big data problem.

transcript

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

Mortar Data: K Young (CEO), Jeremy Karn (Lead Engineer)

November 15, 2013

A Modern Framework for EMR

Friday, November 15, 13

K Young, Jeremy Karn

This talk is technical

Great: Hadoop
Greater: Amazon EMR

A Modern Framework

Goal: 10x productivity
• Collaboration
• Efficient, Free Development
• Testing / Debug
• Reproducibility
• And More...

For data science (not a database)

Iterate Locally

Deploy

Cloud Execution (EMR + Mortar)

Demo

Find the most popular projects on GitHub
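The slides do not include the demo code, so the following is only a minimal Python sketch of the kind of question the demo poses. It assumes, purely for illustration, that "popular" means the number of star events a repository receives and that each event is a dictionary with `type` and `repo` keys. The function name `most_popular_projects` and the sample records are hypothetical and are not taken from Mortar or the talk.

```python
from collections import Counter


def most_popular_projects(events, top_n=3):
    """Rank repositories by how many WatchEvent (star) records mention them.

    Assumption: popularity == star count. The talk may define it differently.
    """
    stars = Counter(
        event["repo"]
        for event in events
        if event.get("type") == "WatchEvent"
    )
    # most_common sorts by count, highest first
    return stars.most_common(top_n)


# Made-up sample records, small enough to iterate on locally
sample_events = [
    {"type": "WatchEvent", "repo": "alice/widgets"},
    {"type": "PushEvent", "repo": "alice/widgets"},
    {"type": "WatchEvent", "repo": "bob/gadgets"},
    {"type": "WatchEvent", "repo": "alice/widgets"},
]

print(most_popular_projects(sample_events))
# → [('alice/widgets', 2), ('bob/gadgets', 1)]
```

Running a function like this on a small sample first, then on the full data set in the cluster, mirrors the "Iterate Locally, Deploy, Cloud Execution" flow named earlier in the deck.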

Goal: Collaboration
• Easily run code from others
• Contribute changes back

Goal: Efficient, Free Development
• Rapid iteration
• No cost
• Resilient to errors

Goal: Testing & Debug
• Automated testing
• See what is happening at runtime

Goal: Reproducibility
• Know what was run
• 1-button deploy
• Rollback

Goal: Miscellaneous goals
• Easy scheduling
• Easily use other technologies: Python, Amazon DynamoDB, MongoDB
• Results easy to locate
• Managed cluster lifetime
• More granular API

A Modern Framework for Amazon EMR

What you saw
• Collaboration
• Efficient, Free Development
• Testing / Debug
• Reproducibility
• And More...

Next steps
• bit.ly/mortar-reinvent
• Documentation: help.mortardata.com
• Follow: @mortardata

BDT309 Thank You
