Date posted: 26-Jan-2015 | Category: Technology | Uploaded by: yahoo-developer-network
Oozie – Now and Beyond
§ PRESENTED BY Mona Chitnis | Hadoop User Group, Yahoo Sunnyvale, October 16, 2013
Team In Action
2 Yahoo Confidential & Proprietary
§ Alejandro Abdelnur
§ Mohammad Islam
§ Rohini Palaniswamy
§ Robert Kanter
§ Virag Kothari
§ Mona Chitnis
§ Ryota Egashira
§ Michelle Chiang
§ Bowen Zhang
OVERVIEW
Why Oozie?
The Problem
§ Doing something on the grid often required multiple steps
  § MapReduce job
  § Pig job
  § Streaming job
  § HDFS operation (mkdir, chmod, etc.)…
§ Multiple ad-hoc solutions existed
  § custom job control
  § shell scripts
  § cron…
§ Cost of building and running apps was high
  § development and applications engineering
  § support, operations, and hardware
The Need
§ Workflow scheduler with better support for grid jobs (native integration with Hadoop)
  § orchestrate dependency between jobs
  § execute at specific time or on data availability
  § retry jobs in the event of failures (reliable)
§ Common framework for communication and execution of production process
  § sync (clocked dataset) awareness
  § async (unspecified freq) data awareness
§ Horizontally scalable and extensible system
  § Open-source
  § Workflows to couple resources instead of having a monolithic code base
A server-based workflow scheduling system to manage Hadoop jobs
Oozie – A Workflow Engine
§ Oozie executes workflows defined as a DAG of jobs
§ Job types include MapReduce, Pig, Hive, shell script, custom Java code, etc.
§ Introduced in Oozie 1.x
§ Control-flow nodes (start, kill, end | fork, join, decision)
§ Action nodes (map reduce, pig, hive, distcp, java, fs, sub-workflow, shell, ssh, email)
[Diagram: example workflow DAG with nodes start, M/R job, decision (branches MORE / ENOUGH), fork, Pig job, M/R job, join, Java, FS job, end, and kill; transitions are labeled OK and ERROR]
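A workflow of the shape described above could be written roughly as follows. This is only a sketch: the workflow name, node names, script name, and every `${...}` property (`jobTracker`, `nameNode`, `inputDir`, `stagingDir`, `moreMarkerPath`) are hypothetical placeholders, and the decision rule (continue only if a marker file exists) is an assumption made for illustration.

```xml
<!-- Hypothetical workflow: names, paths and ${...} properties are placeholders -->
<workflow-app name="example-dag-wf" xmlns="uri:oozie:workflow:0.4">
  <start to="first-mr"/>

  <!-- Action node: on success go to the decision, on failure go to kill -->
  <action name="first-mr">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property><name>mapred.input.dir</name><value>${inputDir}</value></property>
        <property><name>mapred.output.dir</name><value>${stagingDir}</value></property>
      </configuration>
    </map-reduce>
    <ok to="more-or-enough"/>
    <error to="fail"/>
  </action>

  <!-- Decision node: MORE continues to the fork, ENOUGH ends the workflow -->
  <decision name="more-or-enough">
    <switch>
      <case to="fan-out">${fs:exists(moreMarkerPath)}</case>
      <default to="end"/>
    </switch>
  </decision>

  <!-- Fork node: both branches run in parallel and meet again at the join -->
  <fork name="fan-out">
    <path start="pig-branch"/>
    <path start="mr-branch"/>
  </fork>

  <action name="pig-branch">
    <pig>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>process.pig</script>
    </pig>
    <ok to="fan-in"/>
    <error to="fail"/>
  </action>

  <action name="mr-branch">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
    </map-reduce>
    <ok to="fan-in"/>
    <error to="fail"/>
  </action>

  <join name="fan-in" to="cleanup"/>

  <!-- FS action: an HDFS operation executed by Oozie itself -->
  <action name="cleanup">
    <fs>
      <delete path="${nameNode}${stagingDir}"/>
    </fs>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Failed at [${wf:lastErrorNode()}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Every action declares both an `ok` and an `error` transition, which is how the OK / ERROR edges of the diagram are expressed in the definition.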
Example M/R Action
[Figure: map-reduce action XML annotated with callouts: JT and NN, Mapper, Reducer, Queue Name, Input Directory, Output Directory]
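The slide's XML did not survive extraction, so the following is a reconstruction sketch rather than the original example. It shows where each of the six callouts would typically appear in a map-reduce action; the action name, the transition targets, the mapper and reducer class names, and all `${...}` properties are assumed placeholders.

```xml
<!-- Sketch of a map-reduce action; class names and ${...} properties are placeholders -->
<action name="mr-node">
  <map-reduce>
    <!-- JT and NN -->
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <!-- Mapper -->
      <property>
        <name>mapred.mapper.class</name>
        <value>org.apache.oozie.example.SampleMapper</value>
      </property>
      <!-- Reducer -->
      <property>
        <name>mapred.reducer.class</name>
        <value>org.apache.oozie.example.SampleReducer</value>
      </property>
      <!-- Queue Name -->
      <property>
        <name>mapred.job.queue.name</name>
        <value>${queueName}</value>
      </property>
      <!-- Input Directory -->
      <property>
        <name>mapred.input.dir</name>
        <value>${inputDir}</value>
      </property>
      <!-- Output Directory -->
      <property>
        <name>mapred.output.dir</name>
        <value>${outputDir}</value>
      </property>
    </configuration>
  </map-reduce>
  <ok to="end"/>
  <error to="fail"/>
</action>
```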
Workflow State Transitions
Source: Chicago HUG, Dec 2012
Oozie (Coordinator) – A Scheduler
§ Oozie executes workflows based on
  § time dependency (frequency)
  § data dependency
§ Introduced in 2.x
[Diagram components: Oozie Client, WS API, Oozie Server (Oozie Coordinator, Oozie Workflow), HDFS/HCat; labeled interaction: Check Data Availability]
Oozie (Bundle) – A Pipeline Framework
§ Users can define and execute a "bundle" of coordinator apps
  § large scale data processing (inter-related coordinators)
  § operability and manageability of pipelines
§ Users can start/stop/suspend/resume/rerun at the bundle level
§ Introduced in 3.x, bundles are optional
[Diagram components: Oozie Client, WS API, Oozie Server (Bundle, Oozie Coordinator, Oozie Workflow), HDFS/HCat; labeled interaction: Check Data Availability]
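As a rough illustration of a bundle grouping inter-related coordinators, a bundle definition might look like the sketch below. The bundle and coordinator names and the `${...}` properties (`kickOffTime`, `ingestCoordPath`, `reportCoordPath`, `region`) are hypothetical and would come from the job properties.

```xml
<!-- Sketch of a bundle with two coordinators; names and ${...} properties are placeholders -->
<bundle-app name="click-pipeline" xmlns="uri:oozie:bundle:0.2">
  <controls>
    <!-- Time at which the bundle starts its coordinators -->
    <kick-off-time>${kickOffTime}</kick-off-time>
  </controls>
  <coordinator name="ingest-coord">
    <app-path>${ingestCoordPath}</app-path>
  </coordinator>
  <coordinator name="report-coord">
    <app-path>${reportCoordPath}</app-path>
    <configuration>
      <property>
        <name>region</name>
        <value>${region}</value>
      </property>
    </configuration>
  </coordinator>
</bundle-app>
```

Operating on the bundle job (suspend, resume, kill, rerun) then applies to the coordinators it contains, which is the manageability benefit the slide refers to.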
Layers of Abstraction in Oozie
1. Bundle: a Bundle contains Coord Jobs
2. Coordinator: a Coord Job contains Coord Actions, each associated with a WF Job
3. Workflow: a WF Job contains Hadoop jobs (M/R Job, PIG Job)
Architectural Overview
[Architecture diagram: Oozie (Java Web-App) with the following components]
§ Security
§ Web Services (JSON/REST API): WS API, WS Callback
§ DAG Engine
§ Commands, executed asynchronously via Command Queue: submit, start, rerun, resume, kill, suspend, info, start action, end action, check action, callback, signal job, notification
§ Command Queue
§ Command Executor Thread Pool
§ Recovery Daemon Thread
§ Action Executors: M/R, fs, Pig, sub-wf (pluggable, to support additional action types)
§ Instrumentation
§ WF store, WF lib
§ Oracle DB
Oozie Security, Multi-tenancy and Scalability
[Diagram: End User, Oozie Server, and Hadoop Cluster (YARN RM, Launcher Mapper, Actual M/R Job), with numbered steps]
1. Auth.: End User (Kerberos, Y! specific)
2. Create Launcher Job (super-user)
3. Execute User Job (doAs)
4. Response
5. Async Callback
USE CASES
Use Case 1: Time Triggers
Execute your workflow every 15 minutes
00:15 00:30 00:45 01:00
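A purely time-triggered coordinator for this use case could be sketched as follows. The coordinator name, the start and end times, and the `workflowAppPath` property are assumed values chosen to match the timeline above.

```xml
<!-- Sketch: run a workflow every 15 minutes; times and ${workflowAppPath} are placeholders -->
<coordinator-app name="every-15-min" frequency="${coord:minutes(15)}"
                 start="2013-10-16T00:15Z" end="2013-10-17T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <action>
    <workflow>
      <app-path>${workflowAppPath}</app-path>
    </workflow>
  </action>
</coordinator-app>
```

With no datasets or input events declared, each materialized action starts its workflow as soon as its nominal time (00:15, 00:30, 00:45, ...) is reached.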
Use Cases and Common Patterns
Use Case 2: Time and Data Triggers
Materialize your workflow every hour, but only run it when the input data (which is loaded to the grid every hour) is ready
01:00 02:00 03:00 04:00
Hadoop Input Data
Exists?
Use Case 2: Time and Data Triggers
<coordinator-app name="coord1" frequency="${coord:hours(1)}" …>
  <!-- Dataset definition -->
  <datasets>
    <dataset name="logs" frequency="${coord:hours(1)}" initial-instance="2009-01-01T23:59Z">
      <uri-template>hdfs://bar:9000/app/logs/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
    </dataset>
  </datasets>
  <!-- Input events definition: instances are resolved against the time the coordinator action is materialized (created) -->
  <input-events>
    <data-in name="inputLogs" dataset="logs">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <!-- Action definition -->
  <action>
    <workflow>
      <app-path>hdfs://bar:9000/usr/abc/logsprocessor-wf</app-path>
      <configuration>
        <property>
          <name>inputData</name>
          <value>${coord:dataIn('inputLogs')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
Use Case 3: Rolling Window
00:15 00:30 00:45 01:00 → 01:00
01:15 01:30 01:45 02:00 → 02:00
Access 15 minute datasets and roll them up into hourly datasets
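One way to express this rolling window is an hourly coordinator over a 15-minute dataset, selecting the four most recent instances. This is a sketch under assumptions: the names, dates, the URI layout, and the `nameNode` and `rollupWorkflowPath` properties are all hypothetical.

```xml
<!-- Sketch: hourly roll-up of four 15-minute instances; names, dates and paths are placeholders -->
<coordinator-app name="rollup-hourly" frequency="${coord:hours(1)}"
                 start="2013-10-16T01:00Z" end="2013-10-17T01:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <dataset name="quarterHourly" frequency="${coord:minutes(15)}"
             initial-instance="2013-10-16T00:15Z" timezone="UTC">
      <uri-template>${nameNode}/data/events/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="window" dataset="quarterHourly">
      <!-- For the 01:00 action: instances 00:15, 00:30, 00:45 and 01:00 -->
      <start-instance>${coord:current(-3)}</start-instance>
      <end-instance>${coord:current(0)}</end-instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>${rollupWorkflowPath}</app-path>
      <configuration>
        <property>
          <name>inputData</name>
          <value>${coord:dataIn('window')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

Because the coordinator frequency (one hour) equals the window length (four 15-minute instances), consecutive actions consume disjoint sets of instances.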
Use Case 4: Sliding Window
Access the last 24 hours of data, and roll it up every hour
01:00 02:00 03:00 … 24:00 → 24:00
02:00 03:00 04:00 … +1 day 01:00 → +1 day 01:00
03:00 04:00 05:00 … +1 day 02:00 → +1 day 02:00
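A sliding window differs from the rolling window only in that the window (24 hourly instances) is longer than the coordinator frequency (one hour), so consecutive actions overlap. A minimal sketch, with hypothetical names, dates, URI layout, and `${...}` properties:

```xml
<!-- Sketch: every hour, process the last 24 hourly instances; names, dates and paths are placeholders -->
<coordinator-app name="last-24-hours" frequency="${coord:hours(1)}"
                 start="2013-10-17T00:00Z" end="2013-10-18T00:00Z" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <dataset name="hourly" frequency="${coord:hours(1)}"
             initial-instance="2013-10-16T01:00Z" timezone="UTC">
      <uri-template>${nameNode}/data/events/${YEAR}/${MONTH}/${DAY}/${HOUR}</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="window" dataset="hourly">
      <!-- 24 instances: the current hour and the 23 hours before it -->
      <start-instance>${coord:current(-23)}</start-instance>
      <end-instance>${coord:current(0)}</end-instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>${rollupWorkflowPath}</app-path>
      <configuration>
        <property>
          <name>inputData</name>
          <value>${coord:dataIn('window')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```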
Proven Scale and Multi-tenancy
§ 17 clusters
§ 13,000 jobs/server day
§ 2.8 M jobs/month
§ 16% of all Hadoop jobs
§ 75 products
§ 2,000+ projects
§ 255 monthly users
§ 5.4 M compute hrs/month
§ 770,000 workflows
§ Between 1-8 actions
§ Avg. 4 actions/workflow
§ 250 coordinator jobs/day
§ 67% of Oozie jobs kicked off through coordinator
Where are We Today
Mix Of Job Types For Workflows
[Bar chart: share of workflow jobs by type, values listed in legend order: Pig 39%, MapReduce 29%, Java 28%, Other 4%]
SAMPLE USE OF JOB TYPES
§ Pig: data processing/filtering, aggregation
§ MapReduce: publishing data (HDFS/HCat)
§ Java: legacy code and logic
§ Others: distcp and shell, data copy/transfer
FEATURE DEEP-DIVE
Existing Features (Oozie 3.x)
§ HBase access through Oozie, via credentials
§ HCatalog access through Oozie, via credentials
§ Email action
§ DistCp action (intra as well as inter-cluster copy)
§ Shell action (run any script e.g. perl, python, hadoop CLI)
§ Workflow dry-run & Fork-Join validation
§ Bulk monitoring (REST API)
§ Coordinator EL functions for parameterized workflows
§ Job DAG
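To make two of the features above concrete, the fragment below sketches a shell action followed by an email action, as they might appear inside a workflow definition. The node names, the script name `report.sh`, the transition targets, and the `${...}` properties (`jobTracker`, `nameNode`, `wfAppPath`, `reportDate`, `notifyAddress`) are assumptions for illustration.

```xml
<!-- Fragment of a workflow-app; names, script and ${...} properties are placeholders -->

<!-- Shell action: ships report.sh via the distributed cache and runs it on the cluster -->
<action name="run-script">
  <shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec>report.sh</exec>
    <argument>${reportDate}</argument>
    <file>${wfAppPath}/report.sh#report.sh</file>
  </shell>
  <ok to="notify"/>
  <error to="fail"/>
</action>

<!-- Email action: sent by the Oozie server once the script has succeeded -->
<action name="notify">
  <email xmlns="uri:oozie:email-action:0.1">
    <to>${notifyAddress}</to>
    <subject>Workflow ${wf:id()} finished</subject>
    <body>The report script completed.</body>
  </email>
  <ok to="end"/>
  <error to="fail"/>
</action>
```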
What’s New in Oozie
HBase Credentials
§ Add in workflow.xml
  § Add a "credentials" section. The type is "hbase".
  § Specify the java action to use the credentials (cred attribute on the action).
§ Put hbase-site.xml in the Oozie application path, and use <file> in workflow.xml to put hbase-site.xml in the distributed cache. A copy of hbase-site.xml can be found at gateway:/home/gs/conf/hbase/hbase-site.xml.
§ Put the jars guava-*.jar, zookeeper-*.jar, hbase-*.jar, protobuf-java-*.jar in the workflow "lib" dir
§ Make sure you are using Oozie XSD version 0.3 or above for the <credentials> tag.
<workflow-app name="foo-wf" xmlns="uri:oozie:workflow:0.3">
  <credentials>
    <credential name="hbase.cert" type="hbase">
      <!-- optional properties: zookeeper.znode.parent, hbase.zookeeper.quorum -->
    </credential>
  </credentials>
  <start to="map-reduce-action"/>
  <action name="map-reduce-action" cred="hbase.cert">
    <map-reduce>
      <configuration>
        <property>
          <name>mapred.mapper.class</name>
          <value>SampleMapperHBase</value>
        </property>
        <property>
          <name>mapred.reducer.class</name>
          <value>org.apache.oozie.example.DemoReducer</value>
        </property>
      </configuration>
      <file>hbase-site.xml#hbase-site.xml</file>
    </map-reduce>
    …
  </action>
  …
</workflow-app>
§ Refer to http://twiki.corp.yahoo.com/view/CCDI/UseHbaseCred
Oozie 4.0
1. HCatalog Integration
2. Job Notifications
3. SLA Monitoring
HCatalog Integration
§ Oozie now supports HCatalog datasets, in addition to HDFS
  § Query HCat server directly -OR-
  § Receive 'partition created' notifications
§ With HDFS datasets, poll NameNode to check data availability
  § Delay
  § Single source
[Diagram: Oozie repeatedly asks the NameNode "data exists?" for HDFS paths /data/click/2013/03/10, /data/click/2013/03/11, /data/click/2013/03/12, …]
› The HCat metastore has info about HDFS datasets, locations and file formats.
› Using the HCat loader and storer, datasets can be consumed uniformly by Pig, Hive and Map/Reduce in Oozie, using the "database, table, partition" abstraction.
› Oozie is notified of partition availability via JMS messages, to trigger workflows immediately
› Use JARs hcatalog-core.jar, webhcat-java-client.jar, hive-common.jar, hive-exec.jar, hive-metastore.jar, hive-serde.jar and libfb303.jar in workflow 'lib'
§ Docs - http://oozie.apache.org/docs/4.0.0/DG_HCatalogIntegration.html
<coordinator-app name="hcat-coord" … >
  <datasets>
    <dataset name="inp-logs" frequency="${coord:hours(1)}">
      <uri-template>${hcatNode}/${db}/${table}/ds=${YEAR}-${MONTH}-${DAY};region=${region}</uri-template>
      <done-flag></done-flag>
    </dataset>
    <dataset name="out-logs" frequency="${coord:days(1)}">
      <uri-template>${hcatNode}/${db}/${outputtable}/ds=${dataOut};region=${region}</uri-template>
      <done-flag></done-flag>
    </dataset>
    ...
    <property>
      <name>FILTER</name>
      <value>${coord:dataInPartitionFilter('input', 'pig')}</value>
    </property>
    ...
Pig action script:
A = load '$DB.$TABLE' using org.apache.hcatalog.pig.HCatLoader();
B = FILTER A BY $FILTER;
C = foreach B generate foo, bar;
store C into '$OUTPUT_DB.$OUTPUT_TABLE' USING org.apache.hcatalog.pig.HCatStorer('$OUTPUT_PARTITION');
Latest Oozie 4.0 Features HCatalog Integration
With HCatalog + Notifications: High-level Diagram
[Diagram components: Oozie, HCatalog, Message Bus (e.g., ActiveMQ), Data Producer, HDFS]
1. Query/Poll Partition
2. Register Topic
§ Data Producer: Produce data (distcp, pig, M/R..) into /data/click/2013/03/12 on HDFS
§ Data Producer: Update metadata (ALTER TABLE click ADD PARTITION(data='2013/03/12') location 'hdfs://data/click/2013/03/12')
3. Push notification <New Partition>
4. Notify New Partition
§ Oozie: Start workflow
§ Notification event sent on jobs’ status change
§ Messages sent on the configured JMS-compliant message broker
§ Users should write message listeners to listen on select topics (e.g. username)
§ To filter more, apply JMS selectors on messages.
§ E.g. user, jobid, app-type, status, msg-type (JOB or SLA).
§ Docs -
http://oozie.apache.org/docs/4.0.0/DG_JMSNotifications.html
Filter desired app-types for notification:
<property>
  <name>oozie.service.EventHandlerService.filter.app.types</name>
  <value>workflow_job, workflow_action, coordinator_job, coordinator_action</value>
</property>
Notification Msg Example: Coordinator Action Failure Event
› Header (Selectors)
  • AppType – Coordinator_Action
  • Status - FAILURE
  • User
  • App-Name
› Message Body (JSON)
  • ID (coord action id)
  • Parent ID (coord Job ID)
  • NominalTime
  • StartTime
  • EndTime
  • Status - FAILED, KILLED, SUSPENDED, TIMEDOUT
  • Error-Code, Error-Message (if KILLED or FAILED)
Latest Oozie 4.0 Features Job Notifications
§ Oozie can actively track SLAs on jobs'
  § Start-time, End-time, Duration
§ Event Status
  § START_MET, START_MISS
  § END_MET, END_MISS
  § DURATION_MET, DURATION_MISS
§ At any time, the SLA processing stage will reflect:
  § Not_Started <-- Job not yet begun
  § In_Process <-- Job started and is running, and SLAs are being tracked
  § Met <-- caused by an END_MET
  § Miss <-- caused by an END_MISS
§ Access/Filter SLA info via
  § Web-console dashboard
  § REST API
  § JMS Messages
  § Email alert
§ Docs - http://oozie.apache.org/docs/4.0.0/DG_SLAMonitoring.html
<workflow-app xmlns="uri:oozie:workflow:0.5" xmlns:sla="uri:oozie:sla:0.2" name="sla-wf">
  ...
  <end name="end"/>
  <sla:info>
    <sla:nominal-time>${nominalTime}</sla:nominal-time>
    <sla:should-start>${shouldStart}</sla:should-start>
    <sla:should-end>${shouldEnd}</sla:should-end>
    <sla:max-duration>${duration}</sla:max-duration>
    <sla:alert-events>start_miss,end_miss</sla:alert-events>
    <sla:alert-contact>joe@yahoo</sla:alert-contact>
  </sla:info>
</workflow-app>
Latest Oozie 4.0 Features SLA Monitoring
SLA Monitoring Dashboard
Checking Oozie Job
1. CLI (yoozie_client)
$ oozie job -oozie http://localhost:11000/oozie -info 14-20090525161321-oozie-joe
------------------------------------------------------------------------
Workflow Name : map-reduce-wf
App Path      : hdfs://localhost:8020/user/joe/workflows/map-reduce
Status        : SUCCEEDED
Run           : 0
User          : joe
Group         : users
Created       : 2009-05-26 05:01
Started       : 2009-05-26 05:01
Ended         : 2009-05-26 05:01
Actions
------------------------------------------------------------------------
Action Name  Type        Status  Transition  External Id            External Status  Error Code  Start             End
------------------------------------------------------------------------
hadoop1      map-reduce  OK      end         job_200904281535_0254  SUCCEEDED        -           2009-05-26 05:01  2009-05-26 05:01
------------------------------------------------------------------------
Demo
Checking / Debugging Oozie Jobs
2. Web-Console e.g. http://my-oozie-server:4080/oozie
Docs - https://cwiki.apache.org/confluence/display/OOZIE/Map+Reduce+Cookbook
Demo
What else is out there?
Oozie vs. Other Workflow Systems
Attribute | Oozie | LinkedIn's system | Spotify's system
Champion | Yahoo! (now ASF) | LinkedIn | Spotify
Apache Affiliation | TLP | License only | License only
Language | Java | Java | Python
Adoption | High, part of all standard Hadoop distributions | Low | Low
Code Complexity | High (>100K lines) | Medium (<50K lines) | Low (<10K lines)
Hadoop Job Support | Extensive built-in support | Limited job types | Limited job types
Docs & Support | Excellent | Limited | Limited
Auth. | Kerberos, custom | xml-based, custom | Linux-based
Reruns | Yes (recovery, retries at all levels) | Partial | After removing output, idempotent
UI | Average | Good | -
Oozie at ASF
The Next Release
§ Scalability and performance improvements to handle higher loads
§ More 1 and 5 min frequency jobs
§ High Availability with Load Balancing
§ Flexible Cron-Based Scheduling
§ Handling cluster Rolling upgrades for Hadoop 2.0
Roadmap
Q & A