Post on 09-Jan-2017
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Rachel North, Patrick Keyser (Sogeti), Ben Cabanas (GE), Ben Wilson (GE)
October 2015
ISM209
Acceleration of AWS Enterprise Adoption in GE
Migrating at Scale
What to expect from the session
GE Oil and Gas defined an aggressive strategy to migrate 100% of their application portfolio
to the cloud and in just 18 months have migrated over 250 applications to AWS.
In partnership with Sogeti, this session will focus on key enablers and solutions that drove
rapid cloud transformation in a Fortune 10 company. Agile methodologies empowered a
team to eliminate roadblocks and produced innovative concepts such as cloud parties
where 27 applications went live in AWS in 1 week. A focus on automation and self-service
functionality reduced outages by 98%, enhanced user experience and delivered significant
ROI.
By effectively partnering with business leaders and application owners, they have reduced
their on-premises infrastructure by 50% while eliminating obsolescence and simplifying
operations, driving business outcomes with speed and agility.
Topics will include application portfolio discovery, refactoring and obstacles that were
overcome with an emphasis on rapid experimentation, creative use of native AWS services,
integration of enterprise services and processes and impact to operations in the cloud.
Results speak for themselves …
• 261 total O&G apps in the cloud: O&G has moved 25% of its entire app portfolio to the cloud
• 24 apps migrated in 1 week: O&G CTO, P&L teams & Corporate teamed to accelerate cloud migrations during a migration party in Florence
• 30 minutes or less: Time to spin up new compute infrastructure in the cloud
• 15 days to migrate: Wing to wing migration from ingestion through build and deployment
• 8 services taken to EAS team: Backup/Recovery and DNS delegation presented as a standard to be used by all businesses
• 52% TCO savings on average: Significantly reduced cost to deploy apps into AWS compared to competitive internal offering
Agility: Go Faster Flexibly
Self-service IaaS & PaaS in 30 minutes or less. Try it. If it fails, try something else.
Resilience: Self-Healing Apps
Driven by bots, proactive monitoring, & robust services
Cloud First: The only choice
Building solutions that enable 100% of our O&G portfolio to run in the cloud
Transform: Our teams
From siloed workstreams and swim lanes to comprehensive innovators!
Shift: to SOA & Microservices
Monolithic COTS slow us down. This is the era of APIs and stateless computing
Evolve: up the stack
Software-defined everything frees up strategic resources and self funds migrations
O&G cloud journey
[Chart: cumulative O&G apps in the cloud over time. Legend: Month with Migration Party. Annotation: Built Core Services & Automation. Axis marker: Q3 ‘15]
Q2 ‘14: 24
Q3 ‘14: 63
Oct ‘14: 82
Nov ‘14: 115
Dec ‘14: 128
Jan ‘15: 131
Feb ‘15: 163
Mar ‘15: 177
Apr ‘15: 206
May ‘15: 221
Jun ‘15: 229
Jul ‘15: 261
Q4 ‘15: 400
Automation: The Bot Army Rises
15 O&G Bots deployed …
• 179 instances terminated
• 49 TB storage released
• 95 instances downsized
• 72 instances upgraded
2 Self-(micro) Service Tools
• DB Script Execution
• Synthetic App Monitoring
Progress KPIs: 261 complete
• 100+ apps in 6 migration parties
• 80% cloud first adoption
• 98% reduction in P1/P0s
• 15 cloud services created
• Velocity = 50 apps/qtr.
On the road to 400 …
• Program Savings: 13M
• Apps Eliminated: 57
• Servers Eliminated: 791
• Apps in Cloud: 261
Simplification
Value-added services
Service • Benefits • Update Status
Self-Service DB • Portal for SQL execution • MVP 1 available 6/11
Wildcard DNS • Rapid provisioning, automated • Live in O&G – GE approved
Backup & Recovery • Seamless, no additional cost • Complete – submitted for Patent
Active Directory • Enabler for Ent. Manageability • O&G domain live US-EAST & EU-West
SSO/SAML • Rapid provisioning, automated • OpenID Connect deployed
Firewall Automation • Migration acceleration, automation • EAS team now moving aggressively
SCALR • Self-Service IaaS, PaaS • O&G transition from SM Complete
Monitoring • Perf & Avail, automated registration • Nagios XI deployment complete
Integrated Billing • Billing analytics via Cloudability • Leveraging API’s for CTO Billing
App Discovery • Full vertical discovery & mapping • Licensed for ServiceWatch
Auto Migration • One-Click lift and shift accelerator • Partnering with Racemi
s3Shuttle • Loosely Coupled Data Integration • Complete & Published
s3Tools • Utility copy for Amazon S3 data migration • Complete & Published
Synthetic Monitoring • Self Service open source GAPC alt. • Complete & Published
Key Management • Enterprise Authentication, Self Service • Complete – Windows & Linux OS Auth
Status legend: Completed, In progress, Not started
S3Shuttle
S3 Publish & Subscribe: Java + AWS Native Services
• Flexible bucket/folder config
• Leverages multipart upload
• STS support w/ AWS SDK
• SNS notification on upload
• SNS notifies SQS queue
• Loosely coupled with SQS
• Simple command line interface
• Optional delivery notification
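The notification chain in the bullets above (upload, SNS topic, SQS queue, loosely coupled subscriber) can be sketched from the subscriber's side. This is a minimal Python sketch, not s3Shuttle's own Java code. It assumes the SNS message carries a standard S3 event notification and that raw message delivery is off, so the SQS body is an SNS envelope whose `Message` field holds the S3 event as a JSON string. The function name `uploaded_objects` is hypothetical.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def uploaded_objects(sqs_body: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for objects created in one SQS message.

    Assumption: the body is an SNS envelope wrapping an S3 event notification.
    """
    envelope = json.loads(sqs_body)
    event = json.loads(envelope["Message"])  # SNS carries the payload as a JSON string
    found = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue  # skip deletes and other event types
        bucket = record["s3"]["bucket"]["name"]
        # S3 URL-encodes object keys in event notifications (spaces arrive as '+')
        key = unquote_plus(record["s3"]["object"]["key"])
        found.append((bucket, key))
    return found
```

A subscriber would call this on each message body it receives from the queue and then fetch the listed objects. Because the publisher only writes to S3, neither side needs to know the other's address.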
[Graphic: Service Status against Cloud Service CTQs: Migration Acceleration, Automation, Cycle Time Reduction, Simplification, Cost Benefit, Cross Business]
Automated rehosting
Cloud Migration Solution: Racemi
Benefits
• Deployed on source server.
• Secure and firewall friendly.
• Supports live capture, low overhead, fault tolerant.
• Simple command line interface
• Supports security delegation via AWS Identity and Access Management (IAM) and AWS STS services.
• Capture once, deploy many.
[Diagram labels: Physical Servers and Virtual Machines (Agent); GE Firewall; https (443) file transfer; Dynacenter Server (Common Subnet); Amazon EC2 Instances (OG Subnet)]
Synthetic monitoring
Self-Service Web Monitoring
Scalable Automation for the Cloud
• Scalable distributed monitoring
• Integrated ASG & ELB
• Integrated with Nagios XI
• Headless webkit (GhostDriver)
• GUI Recorder – Selenium Builder
• Jenkins/JUnit Capable integration
• Low cost framework
• Compatible with any webdriver client
• Pre-built Nagios alerting/metric capture
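The Nagios alerting and metric capture bullet can be illustrated with a small formatter that turns one synthetic scenario run into a Nagios plugin result: an exit code plus a status line with performance data. This is a hedged sketch, not the team's framework. The function name `nagios_result`, the check name, and the 5 s and 10 s thresholds are assumptions chosen for illustration. Only the exit codes and the `label=value;warn;crit` performance-data layout follow the standard Nagios plugin convention.

```python
from typing import Tuple

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def nagios_result(check_name: str, passed: bool, seconds: float,
                  warn: float = 5.0, crit: float = 10.0) -> Tuple[int, str]:
    """Map one scenario run to (exit code, status line with perfdata).

    warn and crit are response-time thresholds in seconds (assumed values).
    """
    if not passed or seconds >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif seconds >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    detail = "scenario passed" if passed else "scenario failed"
    # Perfdata after the pipe lets Nagios graph the response time.
    perfdata = f"time={seconds:.2f}s;{warn:.2f};{crit:.2f}"
    return code, f"{check_name} {label} - {detail} in {seconds:.2f}s | {perfdata}"
```

A wrapper script would run the recorded browser scenario, time it, print the returned line, and exit with the returned code so that Nagios XI can alert on it.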
This is what “No-Ops” looks like
Bot • Benefits • Update Status
EC2-Ogre • Kills Instances powered off > 30 days • Operational, has cleared 243 VMs
EBS-Ogre • Snaps, then deletes unmounted EBS • Live, has freed 34.2TB, saving $3.4k
Chef-Ogre • Removes orphaned Chef nodes • Live, keeps Chef environment clean
EC2 Tag Refinery • Auto tags AssetID, UAI from AppCI • Daily, distributed to Cloud Community
RDS Tag Refinery • Auto tags AssetID, UAI from AppCI • Runs daily, 100% accurate tagging
ADOrganizerBot • Automated access provisioning • Corp AD team asked for guidance
DevDownsizer • Downsizes underutilized EC2s • Downsized 96. Distributed to Community
GenUpgrader • Upgrades generation of EC2 (C1->C4) • New! Upgraded 73 last execution.
AppTag Enforcer • Powers off EC2s if tags are not correct • Based off CMDB, gives 2 days to fix
IDMLinkBot • Enables IDM self service for Cloud AD • In place today, waiting for bugfix
EC2ReportDroid • Single pane of glass Amazon EC2 overview • Heavily used by Admins
DiskCleaner • Self-healing from Nagios alarms • Under development
DiskExpander • Self-healing from Nagios alarms • Under development, Windows live
AutoTTO • Ensure infra ready to go when apps go live • Requirements gathering
RDSReportDroid • Single pane of glass Amazon RDS overview • Statistics tracking for Chef, Apps, etc
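Two of the rules in the table are simple enough to sketch as decision logic: EC2-Ogre (instances powered off for more than 30 days) and AppTag Enforcer (power off when tags are wrong, after 2 days to fix). The Python below is an illustrative sketch only. The real bots reportedly use the AWS PowerShell SDK and the CMDB, and the record layout (`id`, `state`, `stopped_at`), the function names, and the choice of `AssetID` and `UAI` as the required tags are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

STOPPED_LIMIT = timedelta(days=30)   # EC2-Ogre rule from the table
TAG_GRACE = timedelta(days=2)        # AppTag Enforcer grace period from the table
REQUIRED_TAGS = ("AssetID", "UAI")   # assumption: names borrowed from the Tag Refinery rows


def ogre_targets(instances: List[Dict], now: datetime) -> List[str]:
    """IDs of instances that have been powered off for more than 30 days."""
    return [
        inst["id"] for inst in instances
        if inst["state"] == "stopped" and now - inst["stopped_at"] > STOPPED_LIMIT
    ]


def tag_action(tags: Dict[str, str], first_flagged: Optional[datetime],
               now: datetime) -> str:
    """Return 'ok', 'warn' (inside the grace period) or 'power_off'."""
    if all(tags.get(name) for name in REQUIRED_TAGS):
        return "ok"
    if first_flagged is None or now - first_flagged <= TAG_GRACE:
        return "warn"
    return "power_off"
```

Keeping the decision separate from the AWS calls makes the rule easy to test and lets a bot run in a report-only mode before it is allowed to terminate or power off anything.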
GenUpgrader
Optimize Performance & Reduce Cost
• IDs instances that can be upgraded to the next EC2 generation
• Builds communication email
• Leverage AWS PowerShell SDK
• Scheduled monthly against QA and Dev
• New EC2 gens offer better performance at reduced cost
Upgrade EC2s to the next gen
• Upgrades C1 → C4, M1 → M3, M2 → R3
• Continuously ensures Cloud VMs operate at most efficient and effective levels
• Upgraded 73 QA + 29 DEV last month
[Diagram: Windows Automation Server moves instances from outdated generation to latest generation: M1 → M3, C1 → C4]
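The family mapping on this slide (C1 to C4, M1 to M3, M2 to R3) can be sketched as a lookup. The slide does not say how instance sizes are mapped, so this sketch assumes the size is kept and returns nothing when the target family has no instance of that size (for example, there is no `c4.medium`). The function name `next_generation` is hypothetical, and the actual bot is described as using the AWS PowerShell SDK rather than Python.

```python
from typing import Optional

# Family upgrades named on the slide.
FAMILY_UPGRADES = {"c1": "c4", "m1": "m3", "m2": "r3"}

# Sizes offered by each target family; a same-size move is only possible for these.
TARGET_SIZES = {
    "c4": {"large", "xlarge", "2xlarge", "4xlarge", "8xlarge"},
    "m3": {"medium", "large", "xlarge", "2xlarge"},
    "r3": {"large", "xlarge", "2xlarge", "4xlarge", "8xlarge"},
}


def next_generation(instance_type: str) -> Optional[str]:
    """Return the same-size type in the newer family, or None if there is none."""
    family, _, size = instance_type.partition(".")
    target = FAMILY_UPGRADES.get(family)
    if target is None or size not in TARGET_SIZES[target]:
        return None  # not an upgrade candidate under the same-size assumption
    return f"{target}.{size}"
```

Instances for which this returns `None` would need a manual sizing decision, which fits the slide's step of building a communication email before any change is made.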
DevDownsizer
Maximize efficiency
• Pulls “underutilized” EC2s from AWS Trusted Advisor report, monthly
• Leverage AWS PowerShell SDK
• What is underutilized?
  <10% CPU utilization for 14 days
  <5MB network I/O for 4 days or more
Downsize underutilized EC2s
• Scheduled against Development envs today
• 96 machines downsized
• Cost Savings
[Diagram: AWS Trusted Advisor feeds the Windows Automation Server, which turns an underutilized instance into a rightsized instance: 2xLarge → xLarge, xLarge → Large, Large → Medium, Medium → Small]
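The underutilization test and the one-step size reduction from this slide can be written down as two small functions. This is a sketch under assumptions: it reads the slide's criteria as "CPU below 10% on each of the last 14 days, and network I/O below 5 MB on at least 4 of them", which is one possible reading. The real bot takes the list from the AWS Trusted Advisor report rather than computing it. The function names and the input layout are hypothetical, and the sketch does not check that the smaller size actually exists in the instance family.

```python
from typing import List, Optional, Tuple

# One-step size reductions shown on the slide.
STEP_DOWN = {"2xlarge": "xlarge", "xlarge": "large", "large": "medium", "medium": "small"}


def is_underutilized(daily: List[Tuple[float, float]]) -> bool:
    """daily holds (average CPU %, network I/O in MB) per day, oldest first."""
    if len(daily) < 14:
        return False  # not enough history to judge
    window = daily[-14:]
    cpu_low = all(cpu < 10.0 for cpu, _ in window)
    quiet_days = sum(1 for _, net_mb in window if net_mb < 5.0)
    return cpu_low and quiet_days >= 4


def downsized_type(instance_type: str) -> Optional[str]:
    """Return the next smaller size in the same family, or None if not mapped."""
    family, _, size = instance_type.partition(".")
    smaller = STEP_DOWN.get(size)
    return f"{family}.{smaller}" if smaller else None
```

Stepping down one size at a time is the conservative choice: an instance that is still underutilized a month later simply gets picked up again on the next scheduled run.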
ADOrganizer
Hands free Active Directory
• Provides immediate access to new Windows and Linux builds
• Enables personal account use
• Leverage AWS PowerShell SDK, AWS SQS
• Runs every 15 minutes
• Processed 891 instances to date
• Streamlines operations
Automated Access Provisioning
• Delivers full Active Directory automation
• Sets patching windows based on environment
• Best practice enterprise automation
[Diagram labels: New instances; SQS; Windows Automation Server; Credential Management; Active Directory steps: Create OU for Application, Create HPA group for App, Add users to group, Create GPO to provide/enforce HPA access]
Apps Moved in 2014
[Chart: apps moved per month, Jan to Dec; y-axis 0 to 35]
WOW, what happened here?
• Better project plans? No
• Better issue management? No
• Ahhh… I got it… You implemented a new tool!! No
• It was mandated!!! So you fudged the numbers! No
Wait for it … Wait for it …
Here is the answer …
We tried the traditional way
A new approach
What the heck is a cloud party?
It's NOT:
A Telepresence meeting
A conference call
A set of workflows
A team of faceless drones
What did we change?
Backlog
Standups
Pivot
Progress
Step change in performance
It’s about clock speed: cycle time
It’s about working differently
[Diagram, traditional model: Project Manager, Functional Lead, Technical Lead; Functional Specs, Technical Specs, Development Team, Test; large work pieces; large distributed development teams that need oversight; sad users]
[Diagram, Agile Team (“Ninja Team”): backlog; small work pieces; simpler approach (monitoring, automation, automated testing); happy users]
Navigating the roadblocks
Apps to Pipeline
• SME Partnership: Leverage SME to build migration funnel and force prioritization
• Social engineering: Ask the right questions to stretch the scope of discussion
• Capability & Functionality: Incentivize movement through next generation cloud aware toolsets
Third Party Vendors
• Validate: Confirm that licensing supports cloud
• Educate: Demystify cloud for vendors
• Collaborate: Partner with vendor on application migration
Management
• Preparation: Pre-work and timeline coordination for future success
• Planning: Take into consideration application upgrades and team schedules
Team Composition
• Wing to Wing: Evolve siloed workers to be cross functional
Lessons Learned
• Automate, then Automate More: Everything we do is with automation in mind, from deployment to operations. This is the only way to survive at scale.
• Security at Every Layer: Fully utilizing the security provided in the public cloud allows us to have confidence in a multi-tenant world.
• Embrace Agile: From organization structure to project management, everything we do is with agile principles in mind.
• Bias toward action: Everyone has a reason not to move to cloud. Our mission is to find more reasons why we should.
• Work Instead of Workflow: Embracing automation has allowed our employees to concentrate on doing work, instead of filling out workflows.
• Encourage (calculated) Risks: Celebrate failure. Talk about pivots. Continuously examine new tools. This leads to rapid innovation resulting in progress.
Enablers
• Transformation: Rebuild technology skill sets, encourage diversity and embrace “hands-on”
• Pipeline: A pipeline of 50+ will ensure consistent velocity
• Collaboration: Embed Security & Risk teams, CIO + CTO + Corp partnership
• Cloud Aware: Rehosting is OK if it maximizes margin, agility, resilience & performance