Date post: | 20-Jan-2015 |
Apache Hadoop Grid Patterns and Anti-Patterns
Arun C Murthy, Yahoo! Grid Team, CCDI
Hello!
8/18/10 2
Who am I?
Yahoo!
› Grid Team (CCDI)
› Lead the Apache Hadoop Map-Reduce Development Team
Apache
› Developer on Apache Hadoop since April 2006
› Committer
› Member of Apache Hadoop PMC
Apache Hadoop
The Software
Hadoop Distributed File System
Hadoop Map-Reduce
Open source from Apache
Written in Java
Runs on
› Linux, Solaris, Mac OS X
› Commodity hardware
Storage
HDFS
Designed to store large files
Stores files as large blocks (64 to 128 MB)
Each block stored on multiple servers
Data is automatically re-replicated on need
Accessed from command line, Java API or C API
Data Processing
Hadoop Map-Reduce
Map-Reduce is a programming model for efficient distributed computing
Efficiency from
› Streaming through data, reducing seeks
› Pipelining
A good fit for a lot of applications
› Log processing
› Web index building
Hadoop in the Enterprise
Usage and Importance
A large number of corporations use Apache Hadoop at scale for several business-critical applications
› Large, shared, multi-tenant deployments to minimize fragmentation across organizations
Millions of dollars at stake!
› Yahoo
• Advertising, Search
• 40,000 machines and counting
http://wiki.apache.org/hadoop/PoweredBy
… however Hadoop isn’t a silver bullet (at least not yet!)
› Hadoop still depends on users to utilize it effectively
› Pig/Hive help, but one can still write badly suited queries
Need to adapt legacy applications to Hadoop, especially the Map-Reduce paradigm
Efficient usage of Hadoop clusters is critical to getting return on the investment
Hadoop Map-Reduce
Overview
It works like a Unix pipeline:
› cat input | grep | sort | uniq -c | cat > output
› Input | Map | Shuffle & Sort | Reduce | Output
Works on key/value pairs
› map <k1, v1> -> <k2, v2>
› reduce <k2, v2> -> <k3, v3>
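The pipeline analogy can be made concrete with a toy, single-process model. This is a sketch, not Hadoop code: `run_job`, `grep_map`, and `count_reduce` are hypothetical names, and the reduce function is handed all the values that share a key, since that is what the shuffle and sort step produces.

```python
from collections import defaultdict

def run_job(records, map_fn, reduce_fn):
    """Toy single-process model of Input | Map | Shuffle & Sort | Reduce | Output."""
    # Map: each input <k1, v1> may emit any number of <k2, v2> pairs.
    intermediate = []
    for k1, v1 in records:
        intermediate.extend(map_fn(k1, v1))

    # Shuffle & Sort: group values by key, then visit keys in sorted order.
    groups = defaultdict(list)
    for k2, v2 in intermediate:
        groups[k2].append(v2)

    # Reduce: each key and its grouped values yield <k3, v3> pairs.
    output = []
    for k2 in sorted(groups):
        output.extend(reduce_fn(k2, groups[k2]))
    return output

def grep_map(offset, line, pattern="ERROR"):
    # Plays the role of `grep`: emit matching lines with a count of 1.
    if pattern in line:
        yield line, 1

def count_reduce(line, counts):
    # Plays the role of `uniq -c`: one total per distinct line.
    yield line, sum(counts)

log = ["ERROR disk full", "INFO ok", "ERROR disk full", "ERROR timeout"]
result = run_job(enumerate(log), grep_map, count_reduce)
```

Here `result` holds one `(line, count)` pair per distinct matching line, in sorted key order, which mirrors what the Unix pipeline would write to `output`.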
Best Practices
Input to Applications
Optimized to process large data-sets
Pattern: Coalesce processing of multiple small input files into a smaller number of maps, and use larger HDFS block-sizes for processing very large data-sets.
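The coalescing idea can be sketched as a greedy packing of small files into fewer, larger units of work. The function name `coalesce_into_splits`, the file names, and the 512 MB target are assumptions made for illustration. In Hadoop itself this job is typically left to a combining input format such as `CombineFileInputFormat`, which also takes data locality into account; the sketch ignores locality.

```python
def coalesce_into_splits(file_sizes, target_split_bytes):
    """Greedily pack files, in the order given, into splits of about target_split_bytes."""
    splits, current, current_bytes = [], [], 0
    for name, size in file_sizes:
        # Close the current split once adding this file would overshoot the target.
        if current and current_bytes + size > target_split_bytes:
            splits.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        splits.append(current)
    return splits

MB = 1024 * 1024
# 1,000 hypothetical 4 MB files: one map per file would mean 1,000 maps.
files = [(f"part-{i:04d}", 4 * MB) for i in range(1000)]
splits = coalesce_into_splits(files, target_split_bytes=512 * MB)
```

With these assumed sizes, the 1,000 small files collapse into 8 splits, so the job would run 8 maps rather than 1,000.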
Map-Reduce – Mappers
Process multiple files per map for jobs with a very large number of small input files
Process large chunks of data per-map for large-scale data-processing
› PetaSort – 66,000 maps with 12.5G per map
The shuffle cross-bar (maps * reduces) is a key performance factor
Pattern: Applications should use as few maps as possible to process data in parallel, without making failure recovery unreasonably expensive.
› Unless the application's maps are heavily CPU-bound, there is almost no reason to ever require more than 60,000-70,000 maps for a single application
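A little arithmetic shows why the amount of data per map matters for the cross-bar. The data-set size, the reduce count, and the helper names below are assumptions chosen for illustration, not figures from the talk.

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

def maps_needed(input_bytes, bytes_per_map):
    """Number of maps if every map processes bytes_per_map of input (ceiling division)."""
    return -(-input_bytes // bytes_per_map)

def shuffle_crossbar(num_maps, num_reduces):
    """Every reduce fetches a piece of every map's output: maps * reduces transfers."""
    return num_maps * num_reduces

input_bytes = 100 * TB       # hypothetical data-set size
num_reduces = 5_000          # hypothetical reduce count

small = maps_needed(input_bytes, 128 * MB)   # one map per 128 MB of input
large = maps_needed(input_bytes, 2 * GB)     # one map per 2 GB of input

crossbar_small = shuffle_crossbar(small, num_reduces)
crossbar_large = shuffle_crossbar(large, num_reduces)
```

With these assumed numbers, 128 MB per map gives 819,200 maps and about 4.1 billion map-to-reduce transfers, while 2 GB per map gives 51,200 maps and 256 million transfers.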
Map-Reduce – Combiner and Shuffle
Combiner
› Map-side aggregation to help reduce network traffic for the shuffle
› Cost of using combiners
Shuffle
› Compression of intermediate output
Pattern: Use combiners judiciously, and ensure they really work, i.e. that they actually shrink the map output! Compress intermediate outputs.
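One way to check that a combiner "really works" is to measure how many map-output records it removes. The sketch below is a toy word-count model with hypothetical helper names. It suggests that a combiner pays off only when keys repeat within a map's output; otherwise the job pays the combiner's cost and gets nothing back.

```python
from collections import Counter

def map_words(line):
    # Emit <word, 1> for every word in the line.
    return [(word, 1) for word in line.split()]

def combine(pairs):
    # Map-side aggregation: sum the counts per key before the shuffle.
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

def combiner_reduction(map_output):
    """Fraction of map-output records the combiner removes (0.0 means it did nothing)."""
    combined = combine(map_output)
    return 1 - len(combined) / len(map_output)

skewed = map_words("the cat and the dog and the bird")   # repeated keys
unique = map_words("every word here is different")       # no repeated keys
```

Calling `combiner_reduction(skewed)` reports that 3 of the 8 records are removed, while `combiner_reduction(unique)` reports no reduction at all.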
Map-Reduce – Reducers
Efficiency depends on the shuffle and the cross-bar
Configure appropriate number of reduces
› Too few reduces hurt the nodes
› Too many hurt the cross-bar
Pattern: Applications should ensure that each reduce processes at least 1-2 GB of data, and at most 5-10 GB of data, in most scenarios.
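The per-reduce guideline translates into a range of acceptable reduce counts. This sketch assumes the reduce input is spread evenly across reduces, which skewed keys can easily violate. It uses 2 GB and 10 GB, the upper ends of the ranges in the pattern, as default bounds. The function name is hypothetical.

```python
GB = 1024 ** 3

def reduce_count_range(shuffle_bytes, min_per_reduce=2 * GB, max_per_reduce=10 * GB):
    """Range of reduce counts that keeps each reduce between the two per-reduce sizes."""
    fewest = max(1, -(-shuffle_bytes // max_per_reduce))   # ceiling division
    most = max(1, shuffle_bytes // min_per_reduce)         # floor division
    return fewest, max(fewest, most)

fewest, most = reduce_count_range(4_000 * GB)   # hypothetical 4,000 GB of reduce input
```

For the assumed 4,000 GB of reduce input this gives a range of 400 to 2,000 reduces. An input smaller than the lower bound yields a single reduce.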
Map-Reduce – Output
Number of output artifacts is linear w.r.t. number of configured reduces
Compress outputs
Use appropriate file-formats for the output
› E.g. compressed text files are not a great idea if you aren’t using a splittable codec
Think of the consumer of your data-set!
Consider using larger HDFS block-sizes.
Pattern: Application outputs should be a few large files, with each file spanning multiple HDFS blocks and appropriately compressed.
Map-Reduce – Distributed Cache
Efficient distribution of read-only files for applications
Designed for a small number of mid-sized files
Pattern: Applications should ensure that artifacts in the distributed-cache do not require more I/O than the actual input to the application tasks.
Map-Reduce – Counters
Global (across all tasks) counters, aggregated by the framework
Expensive!
Pattern: Applications should not use more than roughly 10 to 25 custom counters.
Map-Reduce – Total Order Outputs
Sampling Partitioner
› Do not use a single reducer!
› E.g. Terasort/Petasort benchmarks
Joining fully sorted data-sets
› Do not need same cardinality e.g. number of buckets for the data-sets being joined
Pattern: Use a sampling partitioner to produce totally ordered output; do not funnel all records through a single reducer.
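The sampling-partitioner approach named on this slide can be sketched in a few lines. The idea is to sample the keys, pick boundaries that split the sample evenly, and route each key to the reduce that owns its range. The function names and the sample size are assumptions. Hadoop provides `InputSampler` and `TotalOrderPartitioner` for this purpose, and the sketch does not reproduce their APIs.

```python
import bisect
import random

def pick_split_points(sample_keys, num_reduces):
    """Choose num_reduces - 1 boundaries that cut the sorted sample into equal parts."""
    ordered = sorted(sample_keys)
    step = len(ordered) / num_reduces
    return [ordered[int(step * i)] for i in range(1, num_reduces)]

def partition(key, split_points):
    # Keys below the first boundary go to reduce 0, and so on upward.
    return bisect.bisect_right(split_points, key)

rng = random.Random(0)
keys = [rng.randrange(1_000_000) for _ in range(10_000)]
split_points = pick_split_points(rng.sample(keys, 500), num_reduces=4)

# Each reduce sorts its own partition; concatenating partitions 0..3 is totally ordered.
buckets = [[] for _ in range(4)]
for key in keys:
    buckets[partition(key, split_points)].append(key)
total_order = [key for bucket in buckets for key in sorted(bucket)]
```

Because every key in partition i is smaller than every key in partition i + 1, four reduces can sort in parallel and their outputs, read in partition order, form one fully sorted data-set.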
HDFS – NameNode and JobTracker Operations
NameNode: Please don’t hurt me!
› Not yet a silver bullet…
› Do not perform metadata operations from map/reduce tasks at the backend
› Do not contact the JobTracker for cluster statistics etc. from the backend
Pattern: Applications should not perform any metadata operations on the file-system from the backend; these should be confined to the job-client during job-submission. Furthermore, applications should be careful not to contact the JobTracker from the backend.
Map-Reduce – Logs and Web-UI
Tasks’ stdout/stderr stored on TaskTrackers
› Limit amount of logs
JobTracker/NameNode Web-UI
› Do not screen-scrape!
Oozie – Workflows
Production pipelines are run via Oozie
Ensure workflows have a small number of medium-to-large sized Map-Reduce jobs
› Collapse smaller jobs
Pattern: A single Map-Reduce job in a workflow should process at least a few tens of GB of data.
Anti-Patterns
In a large enough cluster, you see any and all of these…
Applications not using a higher-level interface such as Pig/Hive
Processing thousands of small files (sized less than 1 HDFS block, typically 128MB) with one map processing a single small file.
Processing very large data-sets with a small HDFS block size, e.g. 128MB, resulting in tens of thousands of maps.
Applications with a large number (thousands) of maps with a very small runtime (e.g. 5s).
Straight-forward aggregations without the use of the Combiner.
Applications with greater than 60,000-70,000 maps.
Applications processing large data-sets with very few reduces (e.g. 1).
› Pig scripts processing large data-sets without using the PARALLEL keyword
› Applications using a single reduce for total order among the output records
Applications processing data with very large number of reduces, such that each reduce processes less than 1-2GB of data.
Applications writing out multiple, small, output files from each reduce.
Applications using the DistributedCache to distribute a large number of artifacts and/or very large artifacts (hundreds of MBs each).
Applications using tens or hundreds of counters per task.
Applications performing metadata operations (e.g. listStatus) on the file-system from the map/reduce tasks.
Applications doing screen scraping of JobTracker web-ui for status of queues/jobs or worse, job-history of completed jobs.
Workflows comprising hundreds or thousands of small jobs processing small amounts of data.
Work underway in yahoo-hadoop-0.20.200 to prevent anti-patterns
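Several of the anti-patterns above are numeric and can be checked mechanically from a job's summary statistics. The sketch below is only an illustration. `JobStats` is a hypothetical record, not a Hadoop API, and the exact cut-offs are one reading of the thresholds quoted in these slides.

```python
from dataclasses import dataclass

GB = 1024 ** 3

@dataclass
class JobStats:
    # Hypothetical summary of one job; not a Hadoop API.
    num_maps: int
    num_reduces: int
    avg_map_seconds: float
    reduce_input_bytes: int
    custom_counters: int

def find_anti_patterns(job):
    """Return the warnings from the anti-pattern lists that this job triggers."""
    warnings = []
    if job.num_maps > 70_000:
        warnings.append("more than 60,000-70,000 maps")
    if job.num_maps >= 1_000 and job.avg_map_seconds <= 5:
        warnings.append("thousands of very short maps")
    if job.num_reduces > 0:
        per_reduce = job.reduce_input_bytes / job.num_reduces
        if job.num_reduces > 1 and per_reduce < 1 * GB:
            warnings.append("each reduce processes less than 1-2 GB")
        if per_reduce > 10 * GB:
            warnings.append("too few reduces for the data size")
    if job.custom_counters > 25:
        warnings.append("too many custom counters")
    return warnings

suspect = JobStats(num_maps=120_000, num_reduces=1, avg_map_seconds=4.0,
                   reduce_input_bytes=500 * GB, custom_counters=40)
healthy = JobStats(num_maps=20_000, num_reduces=400, avg_map_seconds=300.0,
                   reduce_input_bytes=2_000 * GB, custom_counters=8)
```

`find_anti_patterns(suspect)` flags the map count, the short maps, the single overloaded reduce, and the counters, while `find_anti_patterns(healthy)` returns an empty list.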
Blog Post
http://developer.yahoo.net/blogs/hadoop/2010/08/apache_hadoop_best_practices_a.html
Thanks!