CA Workload Automation Advanced Integration for Hadoop: Automate, Accelerate, Integrate

The need to mine massive sets of information for unique insights into customer behaviors, competitive plays and market fluctuations has transformed big data initiatives into imperative, business-critical priorities.

Big Data. Big Deal.

But just how big is big data?

Big enough that the market for big data technologies and services will exceed $41 billion by 2018.1

And thanks to the significant emphasis organizations place on big data, as well as the budget they're dedicating to it, demand for IT professionals with big data expertise jumped by nearly 90 percent between December 2013 and December 2014.2

That said, a big data initiative is not without its challenges. In fact, most companies estimate they're analyzing a mere 12 percent of the data they have.3

So, what are businesses doing to get around these challenges and extract greater value from big data?

1 IDC, "Workload Automation Emerges as Business Innovation Engine in the Era of Cloud, Big Data, and DevOps," April 2015.
2 Forbes, "Where Big Data Jobs Will Be In 2015," December 29, 2014.
3 Forrester, "The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014," February 28, 2014.

Hadoop Holds the Key to Better Big Data Analysis

The open-source Apache™ Hadoop® platform has rapidly emerged as the dominant means by which businesses process, analyze and extract insights from their growing sets of data. It has become so popular, in fact, that the global Hadoop market is expected to reach $50.2 billion by 2020.4

So, what's behind Hadoop's rise in prominence?

For starters, it's far less expensive than other data storage methods: companies can build Hadoop infrastructures almost exclusively on cost-effective, scalable and resilient commodity hardware and software. As a result, it's easy for businesses to add more storage or processing power to their clusters without overtaxing IT budgets or diverting funds from other strategic initiatives. The largest Hadoop clusters can include upwards of 1,000 to 2,000 nodes.5

Plus, Hadoop allows organizations to deliver a more personalized experience that meets their specific needs around big data, helping them optimize customer-facing services, more effectively support changing business goals and use resources more efficiently.

Unfortunately, some challenges unique to the Hadoop environment can counteract the value the platform delivers.

4 Allied Market Research, "Global Hadoop Market - Industry Growth, Size, Share, Insights, Analysis, Research, Report, Opportunities, Trends and Forecasts Through 2020," March 2014.
5 Enterprise Tech, Systems Edition, "Hadoop Finds Its Place In The Enterprise," October 29, 2014.

Because Hadoop infrastructures run independently of the rest of the IT environment, operating one means workflows will inevitably incorporate both Hadoop and traditional jobs.

While Hadoop does include a basic scheduler that delivers some automation, it is focused primarily on jobs that run on Hadoop clusters, and doesn’t integrate well with other workload automation engines. This makes identifying the relevant data and assembling and analyzing it across multiple platforms and Hadoop clusters—as well as managing dependencies on external schedulers and data sources—extremely difficult.
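In practice, this gap gets papered over with hand-written glue scripts: submit one step, check its exit code, and only then trigger the next. The sketch below illustrates that pattern in Python; it is a generic illustration of the scripting burden, not CA or Hadoop tooling, and the step names and commands are placeholders (a real script would invoke, say, `hadoop jar` for the Hadoop step and a batch ETL command for the traditional one).

```python
# Sketch of the manual glue scripting a Hadoop user ends up maintaining:
# chain a Hadoop job to a downstream traditional job by hand.
# Commands below are placeholders, not real Hadoop invocations.
import subprocess

def run_step(cmd):
    """Run one workflow step; return True on success (exit code 0)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0

def run_chain(steps):
    """Run (name, cmd) steps in order, stopping at the first failure.
    Returns the names of the steps that completed successfully."""
    completed = []
    for name, cmd in steps:
        if not run_step(cmd):
            break  # a failed Hadoop step must block the traditional jobs
        completed.append(name)
    return completed

if __name__ == "__main__":
    # A real script might chain `hadoop jar wordcount.jar ...`
    # with a warehouse load; echo stands in for both here.
    steps = [
        ("hadoop_wordcount", ["echo", "map-reduce done"]),
        ("load_warehouse", ["echo", "etl done"]),
    ]
    print(run_chain(steps))
```

Every Hadoop job needs its own copy of this kind of script, and none of it is visible to the enterprise scheduler, which is exactly the maintenance and visibility problem described above.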

This limitation can introduce several critical challenges.

So, when it comes to managing a Hadoop environment alongside your existing infrastructure, are you struggling with:

Visualizing, monitoring and running multiple workflows across numerous, disparate IT environments?

Managing parallel, time-dependent jobs, as well as event-driven usage spikes?

Training users on multiple scheduling engines and writing and maintaining manual scripts for each Hadoop job?

Managing Hadoop and Traditional Infrastructures: Critical, but Not Always Easy

As you work to do more and more with big data, it’s only natural to expect your end-to-end business workflows to include an increasingly intricate blend of Hadoop and traditional jobs. Although this is certainly a normal result of incorporating big data into your broader workflows, it also means you’ll have to contend with greater complexity as you work to simultaneously orchestrate Hadoop and traditional jobs.

In most instances, Hadoop users will have to run their jobs separately from traditional ones. This greatly increases the time and effort required to deliver big data services to the business. Moreover, it limits your overarching visibility into all jobs—both traditional and Hadoop—currently executing.

When it does, confusion over the order in which jobs should be scheduled can lead to slower response times and missed business opportunities.

Is it Hard for You to …

Visualize, Monitor and Run Multiple Workflows?

But what if you could monitor end-to-end workflows from a single console?


In much the same way that they must run their jobs separately from traditional ones, the majority of Hadoop users also have no means of triggering traditional jobs that are dependent on a specific piece of the Hadoop workflow.

This makes it quite challenging to manage parallel jobs: to meet key corporate objectives, users must often toggle between the Hadoop and traditional engines just to manage the larger workflow.

Worse yet, because you must typically prioritize resources and activities according to shifting business needs or time-sensitive events, any inability to understand the dependencies between jobs—or trigger them in the proper sequence—can dramatically increase the time and cost associated with completing a workflow.
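Underneath, this is a dependency-ordering problem: a unified scheduler must derive a valid execution order from the dependencies that link Hadoop and traditional jobs. A minimal sketch using Python's standard-library `graphlib`; the job names and mix of job types are invented for illustration, not taken from the CA product.

```python
# Sketch: given jobs (Hadoop and traditional) and the jobs each one
# depends on, compute an execution order that respects every dependency.
from graphlib import TopologicalSorter

deps = {
    "ingest_to_hdfs": set(),                    # Hadoop: land raw data
    "mapreduce_aggregate": {"ingest_to_hdfs"},  # Hadoop: summarize it
    "load_warehouse": {"mapreduce_aggregate"},  # traditional ETL job
    "nightly_report": {"load_warehouse"},       # traditional batch job
}

# static_order() yields each job only after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

When the two halves of this chain live in separate engines with no shared view of `deps`, nothing can trigger `load_warehouse` the moment `mapreduce_aggregate` finishes, which is the gap described above.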

Is it Hard for You to …

Manage Time-Dependent Jobs or Usage Spikes?

But what if you could manage the dependencies between Hadoop and traditional jobs from a centralized location?


As if jumping between Hadoop and traditional workload automation engines wasn’t challenging enough, the fact that these schedulers employ dissimilar user interfaces introduces an extra set of considerations to overcome.

Because you have no recourse but to schedule workflows using both approaches, you’re left deciding between two unattractive options: invest time and money training users on both schedulers, or build completely separate teams for each job type. No matter which route you choose, you’ll be making tradeoffs.

The resources you invest in building and executing detailed training programs could have supported a strategic organizational initiative instead. Likewise, operating two unique teams adds unnecessary management burden and makes it difficult to orchestrate Hadoop and traditional jobs in a seamless manner. And, ultimately, either approach results in reduced productivity and a slower overall delivery of insights to the business.

Is it Hard for You to …

Dedicate Resources Toward Training Users on Multiple Schedulers?

But what if you could schedule both Hadoop and traditional jobs from a single, easy-to-use, enterprise-class tool?


But what if there were a way to eliminate the challenges associated with Hadoop and use the platform more effectively for its true purpose?

When you’re able to effectively use it to align technology with business operations, Hadoop can deliver value across a number of use cases—everything from supporting eCommerce initiatives during the holiday season to automatically delivering personalized promotions to identifying potential acts of fraud or non-compliance.

And when you implement CA Workload Automation Advanced Integration for Hadoop, it’s possible.

The solution allows you to add Hadoop jobs into existing CA Workload Automation workflows and monitor everything through a single console—including real-time and batch processes across multiple, disparate platforms—in a holistic, enterprise-wide manner.

Here’s how …

There Is a Way:

CA Workload Automation Advanced Integration for Hadoop


Hadoop holds the key to unlocking your organization’s potential.

6 Requires CA Workload Analytics.

CA Workload Automation Advanced Integration for Hadoop makes it possible to integrate Hadoop and traditional jobs by delivering:

How It Works

Seamless application integration: Execute Hadoop jobs in sync with others throughout the enterprise, helping you reduce Hadoop job complexity while increasing overall reliability and flexibility.

Critical path analysis and forecasting: Identify and understand the business impact of Hadoop jobs, so you can better attain and uphold critical service-level agreements.6

Resource optimization: Increase the efficiency with which you allocate resources by coordinating work based on what's currently available across physical, virtual and cloud environments.

Multi-platform scheduling: Visualize end-to-end business processes, spanning Hadoop and other platforms, from a central point of control.

A familiar interface for all jobs: Manage complete workflows without having to toggle between CA Workload Automation and Hadoop environments.

The visibility to know when a workload should run: Understand when a job is supposed to complete and quickly spot and triage any problems that may arise along the way.
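Of the capabilities above, critical path analysis is easy to picture concretely: given each job's dependencies and an estimated duration, the longest chain through the graph bounds when the whole workflow can finish, so those jobs are the ones that put service-level agreements at risk. The sketch below is a generic illustration of that idea, not the product's actual algorithm or API; the job names and durations are invented.

```python
# Sketch of critical path analysis over a mixed job dependency graph.
from graphlib import TopologicalSorter

durations = {"extract": 10, "mapreduce": 45, "score": 20, "report": 5}
deps = {
    "extract": set(),
    "mapreduce": {"extract"},      # Hadoop job, the long pole here
    "score": {"extract"},
    "report": {"mapreduce", "score"},
}

def critical_path(deps, durations):
    finish = {}  # earliest finish time per job
    via = {}     # predecessor on the longest chain into each job
    for job in TopologicalSorter(deps).static_order():
        start = 0
        for dep in deps[job]:
            if finish[dep] > start:
                start, via[job] = finish[dep], dep
        finish[job] = start + durations[job]
    # walk back from the job that finishes last
    job = max(finish, key=finish.get)
    path = [job]
    while job in via:
        job = via[job]
        path.append(job)
    return list(reversed(path)), max(finish.values())

path, total = critical_path(deps, durations)
print(path, total)
```

Here the chain through the Hadoop job (`extract` to `mapreduce` to `report`) sets the earliest possible completion time, so a delay in `mapreduce` delays the SLA one-for-one, while `score` has slack.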


Big Data Made Easy

With CA Workload Automation Advanced Integration for Hadoop, you'll be able to:

Lower costs by eliminating the complexity associated with disconnected monitoring tools.

Gain unified visibility into your entire Hadoop environment.

Easily integrate big data services into your existing technology landscape.

Support changing business goals and help the enterprise drive growth and competitive advantage.

Improve performance and uptime through proactive monitoring and alerts.

Achieve consistent service levels, stronger business integration and lower cost and risk for customers.

Reduce the time and effort required to deliver big data business services.

Accelerate big data projects and foster a more agile enterprise that is better positioned to deliver greater business value.


Copyright © 2015 CA. All rights reserved. Apache is a registered trademark and Hadoop is a trademark of the Apache Software Foundation in the United States and other countries. All other trademarks, trade names, service marks and logos referenced herein belong to their respective companies.

This document is for your informational purposes only. CA assumes no responsibility for the accuracy or completeness of the information. To the extent permitted by applicable law, CA provides this document “as is” without warranty of any kind, including, without limitation, any implied warranties of merchantability, fitness for a particular purpose, or noninfringement. In no event will CA be liable for any loss or damage, direct or indirect, from the use of this document, including, without limitation, lost profits, business interruption, goodwill or lost data, even if CA is expressly advised in advance of the possibility of such damages.

CA does not provide legal advice. Neither this document nor any CA software product referenced herein shall serve as a substitute for your compliance with any laws (including but not limited to any act, statute, regulation, rule, directive, policy, standard, guideline, measure, requirement, administrative order, executive order, etc. (collectively, “Laws”)) referenced in this document. You should consult with competent legal counsel regarding any Laws referenced herein.

CS200-130318

CA Technologies (NASDAQ: CA) creates software that fuels transformation for companies and enables them to seize the opportunities of the application economy. Software is at the heart of every business, in every industry. From planning to development to management and security, CA is working with companies worldwide to change the way we live, transact, and communicate—across mobile, private, and public cloud, distributed and mainframe environments. Learn more at ca.com.

To learn more about how CA Workload Automation Advanced Integration for Hadoop can help your organization derive greater value from its Hadoop investments, please visit ca.com/wla.

