
Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Date post: 15-Nov-2014
Upload: teamquest-corporation
Description:
Dave Wagner, TeamQuest Advocate, and Chris Lynn, of Safeway's Capacity and Performance Management team, cover the application of automatic, exception-oriented analytics to a wide variety of IT and business metrics in order to simultaneously optimize service performance and IT cost. Multiple conceptual approaches are presented, including pros and cons. Most of the presentation consists of real examples of how Safeway has integrated performance, capacity, business, and power data into an automated optimization process spanning thousands of servers and virtual servers and their applications.
Copyright © 2012 TeamQuest Corporation. All Rights Reserved. TeamQuest and the TeamQuest logo are registered trademarks in the US, EU and elsewhere. All other trademarks and service marks are the property of their respective owners. Automating IT Analytics to Optimize Service Delivery and Cost at Safeway David Wagner – TeamQuest Advocate Chris Lynn - Safeway Capacity Manager/Performance Analyst December 11, 2013
Transcript
Page 1: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Automating IT Analytics to Optimize Service Delivery and Cost at Safeway

David Wagner – TeamQuest Advocate
Chris Lynn - Safeway Capacity Manager/Performance Analyst

December 11, 2013

Copyright © 2012 TeamQuest Corporation. All Rights Reserved. TeamQuest and the TeamQuest logo are registered trademarks in the US, EU and elsewhere. All other trademarks and service marks are the property of their respective owners.

Page 2: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Agenda

• TeamQuest Perspectives
• Safeway Experiences

Page 3: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Desired State: Continuous Optimization

• Continuously financially-optimized IT environment
  – Always know where and when performance problems will affect the bottom line
  – Identify cost and performance inefficiencies in support of business processes and eliminate them
• Continuously optimized customer experience
  – Understand when, where and why customer experiences fail
  – Resolve, predict and prevent customer dissatisfaction issues

Page 4: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Continuous IT Optimization Results

• Significantly reduce initial CapEx and ongoing OpEx
  – Make, and keep making, more money!
• Optimize resources for systems of customer engagement
• Deploy and refresh new applications faster
  – e.g. Retailers need to capture their share of mobile commerce as it grows from $6 to $31B (2016)
• Respond faster to business spikes
• Prevent business-impacting outages and slowdowns

Page 5: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

How: Aligned Business and IT Analytics

Enterprise IT Optimization
• Correlate business & IT performance
• Insight into how business process changes impact IT
• Understand and optimize IT costs by business unit/process and technology
• Insight into business performance across technology stack

[Diagram: Aligned Business and IT Intelligence. Labels shown: Business Intelligence; Big Data Collection; Outage Management; Customer Operations; Distribution Automation; Asset Management; Services; Applications; Underlying IT Infrastructure; Server/OS; Storage; Network.]

Page 6: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

TeamQuest’s Approach: Federated IT Analytics

• Federates existing data/information into a purpose-designed optimization process
  – Technology data (e.g. server, network, storage, etc.)
  – Service data (catalog, metrics, tickets, etc.)
  – Financial data
  – Business data (analytics, KPIs, plans, TXNs, etc.)
• Automates IT analytics across all data sources
  – Flexible and adaptive to dynamic environments
  – Raw (commodity) data -> actionable information for IT
• Single-pane-of-glass IT Optimization

Page 7: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Result: Automated Application Financial Optimization

• Continuous Optimization
  – Pre-purchase validation
  – Re-purposing
  – Consolidation
• Fully automated, low cost
• Integrated with Risk and Service Management
• Changed the rate of new VMware clusters from one every 6 weeks to:
  – None for 18+ months…
  – Consolidated 1000’s of VMs (saving $M)

Page 8: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Result: IT Optimized... Future Assured

• Continuous IT Optimization
  – Peak IT Performance
  – Ideal Resource Capacity
  – Optimized Resource Costs
• Automated IT Analytics
  – Predictive
  – Federated
• Aligned IT and Business Management
  – Performance
  – Capacity
  – Financial

Page 9: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Automating IT Analytics to Optimize Service Delivery and Cost at Safeway

Chris Lynn - Safeway
December 11, 2013 2:30-3:00

Page 10: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Topics

• Background
• Server Storage Forecasting and Optimization
• Application Capacity Analysis – Dashboards to Details
• Business KPI Analytics
• VMware High Level Analytics

Page 11: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Background

• Manager of Safeway Capacity and Performance Team
  • [email protected]
  • http://www.linkedin.com/pub/chris-lynn/2/65/309/
• Environment Supported
  • ~4000 servers (~1700 physical)
  • ~200 significant applications
  • Unix, Windows, Mainframe, Teradata, Tandem, etc.
  • Thousands of internal IT customers, and millions of shoppers

Page 12: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Server Storage Forecasting and Optimization

• Optimizing Availability (reducing incidents)
• Optimizing Enterprise Capacity
• Reducing Risk
• Automated replacing Manual
• Embedded expertise

Page 13: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Storage Capacity Incident Avoidance: Old (on server) Manual Method

$ df
Filesystem                            size  used  avail  capacity  Mounted on
/dev/vx/dsk/rootvol                   3.9G  2.5G  1.4G   65%       /
/dev/vx/dsk/var                       1.9G  832M  1.0G   45%       /var
swap                                  9.5G  16K   9.5G   1%        /var/run
swap                                  1.0G  2.6M  1021M  1%        /tmp
/dev/vx/dsk/patrol                    1.9G  1.5G  227M   88%       /appl/patrol
/dev/vx/dsk/home                      486M  347M  91M    80%       /export/home
/dev/vx/dsk/openv                     1.4G  583M  717M   45%       /usr/openv
/dev/vx/dsk/performdg/usrlocal        1.9G  542M  1.3G   29%       /usr/local
/dev/vx/dsk/performdg/oracle          3.9G  1.1G  2.8G   28%       /appl/oracle
/dev/vx/dsk/performdg/apache          128M  27M   95M    22%       /appl/apache
/dev/vx/dsk/rootdg/opswarelv          241M  234K  216M   1%        /var/opt/opsware
/dev/vx/dsk/performdg/b1home          12G   432M  11G    4%        /appl/perform/best1home
/dev/vx/dsk/performdg/spool_apache    256M  145M  104M   59%       /appl/spool/apache
/dev/vx/dsk/performdg/manage          180G  122G  54G    70%       /appl/perform/manager
/dev/vx/dsk/performdg/workspace       90G   59G   29G    68%       /appl/perform/workspace
/dev/vx/dsk/performdg/collect         480G  442G  37G    93%       /appl/perform/collect

Page 14: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Automated Storage Forecasting: File System Exceptions

• Weekly automated prioritized scan
  • 4500 servers
  • 45000 filesystems
• Focused on meaningful exceptions
• A proactive shift from find to fix
  • Was: 50 minutes looking for potential problems, 10 minutes to fix
  • Now: 5 minutes looking for potential problems, 55 minutes fixing them
• Impossible to do manually

Page 15: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

New Automated (global exception) File System Forecast Analytic Details

• Complex multi-level thresholds
  1. Is file system utilization above 90% AND growing by >0.2% for the interval?
  2. Is file system utilization above 75% AND growing by >2% for the interval?
  3. Is file system utilization above 15% AND growing by >15% for the interval?
  4. Is /appl/patrol above 90% AND growing for the interval?
• Individual exclusions and special cases
• Physical and virtual in same report, but can be treated uniquely
• Sorted by date/time most likely to fill up
• Show all candidates for a single server together (sorted by highest one) to minimize the time for operations to respond
• Includes historical trend, compared to just a point in time (e.g. df)
• Forecast utilization trend into the future (multiple statistical options)
• More than 24 hours of data, to avoid temp FS
• Must have recent data, to avoid shutdown servers
• Final measured number not below threshold
• If final number >99.5%, catches the very full FS that might not be growing
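
The exception rules on this slide could be sketched roughly as follows. This is a minimal illustration, not Safeway's or TeamQuest's implementation. The `FsSample` record, the function names, the 24-hour staleness cutoff (`max_age_hours`), and the straight-line forecast are all assumptions made for the example; the slide mentions multiple statistical options without naming them. "Growing by >X% for the interval" is read here as a change in percentage points between the first and last sample, which is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class FsSample:
    """One utilization measurement (hypothetical record layout)."""
    hours: float     # hours since the start of the reporting interval
    used_pct: float  # file system utilization, 0-100


# Rules 1-3 from the slide: (utilization floor %, minimum growth over the interval).
# Growth is read here as percentage points, which is an assumption.
RULES: List[Tuple[float, float]] = [(90.0, 0.2), (75.0, 2.0), (15.0, 15.0)]


def is_exception(mount: str, samples: List[FsSample], now_hours: float,
                 max_age_hours: float = 24.0) -> bool:
    """Return True if the file system should appear in the exception report."""
    if len(samples) < 2:
        return False
    first, last = samples[0], samples[-1]
    if last.hours - first.hours <= 24.0:         # need >24 hours of data (skips temp FS)
        return False
    if now_hours - last.hours > max_age_hours:   # stale data: server may be shut down
        return False
    growth = last.used_pct - first.used_pct
    if last.used_pct > 99.5:                     # very full, even if not growing
        return True
    if mount == "/appl/patrol" and last.used_pct > 90.0 and growth > 0:  # rule 4
        return True
    # Rules test the final measured value, so it is never below the threshold.
    return any(last.used_pct > floor and growth > min_growth
               for floor, min_growth in RULES)


def hours_until_full(samples: List[FsSample]) -> Optional[float]:
    """Straight-line extrapolation to 100%; None when utilization is not growing."""
    first, last = samples[0], samples[-1]
    if last.used_pct > 99.5:
        return 0.0                               # assumption: treat as already full
    span = last.hours - first.hours
    growth = last.used_pct - first.used_pct
    if span <= 0 or growth <= 0:
        return None
    return (100.0 - last.used_pct) * span / growth


def exception_report(fleet: Dict[str, List[FsSample]],
                     now_hours: float) -> List[Tuple[str, Optional[float]]]:
    """Flagged file systems, soonest-to-fill first."""
    flagged = [(mount, hours_until_full(s)) for mount, s in fleet.items()
               if is_exception(mount, s, now_hours)]
    return sorted(flagged, key=lambda item: float("inf") if item[1] is None else item[1])
```

Keeping the rules in a small table makes the per-file-system exclusions and special cases easy to add without touching the scan loop. A real report would also key on server name so that all candidates for one server can be grouped together.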

Page 16: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Executive Capacity Dashboards

[Dashboard figure: Capacity Risk Indicators. Legend: Highly Stressed, Stressed, Well Used, Under Used.]

Page 17: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Application Capacity Dashboards

[Chart: Capacity Risk Indicators per application, 0% to 100%. Applications: FastForward-PR, Corpcomm-PR, DataStg, B2B, Crystal-PR, AM, Autonomy-PR, EXE-PR, WLSJ-PR, WLVJ-PR, peoplesoft-PR, JOE-PR, Workbrain-PR, Corema-PR. Legend: Highly Stressed, Stressed, Well Used, Under Used.]

Page 18: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Application Capacity Analysis

• Automated Application Triage
• All relevant metrics
• Embedded expertise
• Enterprise perspective of true capacity

[Chart: Capacity Risk Candidates (OS--# of systems), 0% to 100%. Categories: AIX--252, Solaris--240, Linux--2181, Windows--1445, ESX Host--252. Legend: Under Used, Well Used, Stressed, Highly Stressed.]

Page 19: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Integration With Business Metrics

System/Platform capacity data:
• Physical servers
• Virtual servers
• Tandem capacity systems
• Teradata capacity systems
• Datacenter facilities

Business perspective:
• Business transaction volumes
• Resource utilization

Page 20: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

VMware Aggregate Capacity

Aggregate shows the worst individual status

Page 21: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

VMware Aggregate Capacity

Aggregate shows the worst individual metric status
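
The roll-up rule stated on this slide, where the aggregate takes the worst individual metric status, might look like the sketch below. The `Status` enum, the function names, and the metric names are hypothetical. Ranking "Under Used" as the least severe status follows the order of the dashboard legend and is an assumption; the slides do not say how the four statuses are ordered.

```python
from enum import IntEnum
from typing import Dict, Tuple


class Status(IntEnum):
    """Capacity risk indicators, ordered least to most severe (assumed order)."""
    UNDER_USED = 0
    WELL_USED = 1
    STRESSED = 2
    HIGHLY_STRESSED = 3


def worst_metric(metric_status: Dict[str, Status]) -> Tuple[str, Status]:
    """Return the metric that drives the aggregate, with its status."""
    if not metric_status:
        raise ValueError("at least one metric status is required")
    name = max(metric_status, key=lambda m: metric_status[m])
    return name, metric_status[name]


def aggregate_status(metric_status: Dict[str, Status]) -> Status:
    """Aggregate status is the worst status among the individual metrics."""
    return worst_metric(metric_status)[1]


# Example: one stressed metric is enough to mark the whole cluster.
cluster = {
    "cpu": Status.WELL_USED,
    "memory": Status.HIGHLY_STRESSED,
    "disk_io": Status.UNDER_USED,
}
print(aggregate_status(cluster).name)  # HIGHLY_STRESSED
```

Taking the maximum rather than an average means a single constrained resource cannot be hidden by spare capacity elsewhere, which fits an exception-oriented dashboard.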

Page 22: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation
Page 23: Automating IT Analytics to Optimize Service Delivery and Cost at Safeway - A #GartnerDC Presentation

Lessons Learned / Value Gained

• Reduced service risk
• More proactive, less reactive
• Established a baseline to optimize capacity, and a mechanism to measure the progress
• Business and IT alignment
  • Performance and capacity to the business
  • Management and technical personnel
• Launch slowly in phases to not overwhelm the groups
• People really do care about formatting and color choice, not just content

