@RealGeneKim, [email protected]
Gene Kim
SecureWorld Philadelphia
May 23, 2012
Security is Dead. Long Live Rugged DevOps: IT at Ludicrous Speed…
Visible Ops: Playbook of High Performers
The IT Process Institute has been studying high-performing organizations since 1999
  What is common to all the high performers?
  What is different between them and average and low performers?
  How did they become great?
Answers have been codified in the Visible Ops Methodology
www.ITPI.org
[Image slides; credits: John Allspaw, Theo Schlossnagle, John Jenkins (Amazon.com), James Wickett]
Since 1999, We’ve Benchmarked 1500+ IT Organizations
Source: IT Process Institute (2008)
Source: EMA (2009)
High Performing IT Organizations
High performers maintain a posture of compliance
  Fewest number of repeat audit findings
  One-third the amount of audit preparation effort
High performers find and fix security breaches faster
  5 times more likely to detect breaches by automated control
  5 times less likely to have breaches result in a loss event
When high performers implement changes…
  14 times more changes
  One-half the change failure rate
  One-quarter the first fix failure rate
  10x faster MTTR for Sev 1 outages
When high performers manage IT resources…
  One-third the amount of unplanned work
  8 times more projects and IT services
  6 times more applications
Source: IT Process Institute, 2008
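Two of the measures above, change failure rate and MTTR, are simple ratios that any team can compute from its own change and incident records. The following is a minimal sketch only: the record layout, field names and sample values are assumptions for illustration and are not taken from the IT Process Institute benchmark.

```python
from datetime import datetime, timedelta

# Hypothetical change and incident records; the field names are assumptions.
changes = [
    {"id": "chg-1", "failed": False},
    {"id": "chg-2", "failed": True},
    {"id": "chg-3", "failed": False},
    {"id": "chg-4", "failed": False},
]
incidents = [
    {"opened": datetime(2012, 5, 1, 9, 0), "restored": datetime(2012, 5, 1, 9, 30)},
    {"opened": datetime(2012, 5, 2, 14, 0), "restored": datetime(2012, 5, 2, 15, 30)},
]


def change_failure_rate(changes):
    """Fraction of changes that failed (0.0 when there are no changes)."""
    if not changes:
        return 0.0
    return sum(1 for change in changes if change["failed"]) / len(changes)


def mean_time_to_restore(incidents):
    """Average time between an incident opening and service being restored."""
    if not incidents:
        return timedelta(0)
    total = sum((i["restored"] - i["opened"] for i in incidents), timedelta(0))
    return total / len(incidents)


failure_rate = change_failure_rate(changes)
mttr = mean_time_to_restore(incidents)
print(f"change failure rate: {failure_rate:.0%}")
print(f"MTTR: {mttr}")
```

Tracking the same two numbers over time is one way to see whether a team is moving toward the high-performer profile.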
2007: Three Controls Predict 60% Of Performance
To what extent does an organization define, monitor and enforce the following?
  Standardized configuration strategy
  Process discipline
  Controlled access to production systems
Source: IT Process Institute, 2008
The Downward Spiral
Operations Sees…
  Too many fragile and insecure applications in production
  Too much time required to restore service
  Too much firefighting and unplanned work
  Planned project work cannot complete
  Frustrated customers leave
  Market share goes down
  Business misses Wall Street commitments
  Business makes even larger promises to Wall Street
Dev Sees…
  More urgent, date-driven projects put into the queue
  Even more fragile code (less secure) put into production
  More releases have increasingly “turbulent installs”
  Release cycles lengthen to amortize “cost of deployments”
  Bigger deployment failures
  More time spent on firefighting
  Ever increasing backlog of work that could help the business win
  Ever increasing amount of tension between IT Ops, Development, Design…
These aren’t ITSM or IT Operations problems…These are business problems!
My Mission: Figure Out How To Break The IT Core Chronic Conflict
Every IT organization is pressured to simultaneously:
  Respond more quickly to urgent business needs
  Provide stable, secure and predictable IT service
Source: The authors acknowledge Dr. Eliyahu Goldratt, creator of the Theory of Constraints and author of The Goal, who has written extensively on the theory and practice of identifying and resolving core, chronic conflicts.
Words often used to describe process improvement: “hysterical, irrelevant, bureaucratic, bottleneck, difficult to understand, not aligned with the business, immature, shrill, perpetually focused on irrelevant technical minutiae…”
DevOps: It’s A Real Movement
I would never do another startup that didn’t employ DevOps-like principles
It’s not just startups – it’s happening in the enterprise and in the public sector, too
I believe working in DevOps environments will be a necessary skillset 5 years from now
Agile helped Dev regain trust with the business; DevOps will help all of IT
IT becoming more automated relies on DevOps practices (especially PaaS)
If I Could Wave A Magic Wand, Everyone Will…
Become conversant with DevOps and recognize the practices when you see them
Be energized about how information practitioners can contribute in this organizational journey
Leave with some concrete steps to get some great outcomes
Become a part of a team that starts putting DevOps practices into place
The Prescriptive DevOps Cookbook
“DevOps Cookbook” Authors
Patrick DeBois, Mike Orzen, John Willis
Goals
Codify how to start and finish DevOps transformations
How do Development, IT Operations and Infosec become dependable partners?
Describe in detail how to replicate the transformations described in “When IT Fails: The Novel”
The First Way: Systems Thinking (Left To Right)
Don’t pass defects downstream
Don’t optimize locally
Always increase flow: elevate bottlenecks, reduce WIP, throttle release of work, reduce batch sizes
Understanding where reliance is placed
Phase 1: Extend the Agile CI/CR Processes
Create one-step Dev, Test and Production environment creation procedure in Sprint 0
Create the one-step automated code deployment procedure
Properly integrate release, configuration and change into the value stream (as well as QA and infosec)
Ensure developers don’t leave until production change is successful
Assign Ops person into Dev team
Definition: Kanban Board
Signaling tool to reduce WIP and increase flow
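One way to picture how a board signals is a work-in-progress limit per column: when a column is full, the board refuses new work until something is finished. The sketch below is illustrative only; the class names, column names and limits are assumptions and are not taken from the talk.

```python
class WipLimitExceeded(Exception):
    """Raised when a card would push a column past its WIP limit."""


class KanbanBoard:
    def __init__(self, wip_limits):
        # wip_limits maps column name -> maximum cards allowed (None = no limit).
        self.wip_limits = dict(wip_limits)
        self.columns = {name: [] for name in self.wip_limits}

    def _check_capacity(self, column):
        limit = self.wip_limits[column]
        if limit is not None and len(self.columns[column]) >= limit:
            raise WipLimitExceeded(f"'{column}' is at its WIP limit of {limit}")

    def add(self, column, card):
        self._check_capacity(column)
        self.columns[column].append(card)

    def move(self, card, src, dst):
        if card not in self.columns[src]:
            raise ValueError(f"{card!r} is not in '{src}'")
        # Check the destination first so a refused move leaves the board unchanged.
        self._check_capacity(dst)
        self.columns[src].remove(card)
        self.columns[dst].append(card)


board = KanbanBoard({"todo": None, "doing": 2, "done": None})
for card in ("deploy script", "env build", "security tests"):
    board.add("todo", card)
board.move("deploy script", "todo", "doing")
board.move("env build", "todo", "doing")
try:
    board.move("security tests", "todo", "doing")
except WipLimitExceeded as signal:
    # The refusal is the signal: finish something in 'doing' before pulling more.
    print("Signal:", signal)
```

The refused move is the point of the exercise: the limit makes the bottleneck visible instead of letting work pile up silently.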
The First Way: Systems Thinking: Infosec Insurgency
Have someone attend the daily Agile standups
  Gain awareness of what the team is working on
Find the automated infrastructure project team (e.g., puppet, chef)
  Release managers can provide hardening guidance
  Integrate and extend their production configuration monitoring
  Put ASSERTs to find misconfigurations, enforce https, etc.
Find where code packaging is performed
  Integrate security testing pre- and post-deployment
Integrate testing into continuous integration and release process
  Add security test scripts to automated test library
Define what changes/deploys cannot be made without triggering full retest
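The “Put ASSERTs to find misconfigurations” item could look something like the sketch below: a small check that runs in the build or deployment pipeline and fails loudly when a production setting drifts. The configuration keys (`base_url`, `debug`, `ssh_password_auth`) and the rules are hypothetical examples, not a hardening standard and not the API of puppet or chef.

```python
def find_misconfigurations(config):
    """Return a list of violations; an empty list means every check passed."""
    violations = []
    if not str(config.get("base_url", "")).startswith("https://"):
        violations.append("base_url must use https")
    if config.get("debug", False):
        violations.append("debug mode must be off in production")
    # A missing key is treated as unsafe so that omissions are caught too.
    if config.get("ssh_password_auth", True):
        violations.append("ssh_password_auth must be disabled")
    return violations


def assert_hardened(config):
    """Fail the pipeline step if any hardening rule is violated."""
    violations = find_misconfigurations(config)
    if violations:
        raise AssertionError("; ".join(violations))


hardened = {
    "base_url": "https://shop.example.com",
    "debug": False,
    "ssh_password_auth": False,
}
drifted = {
    "base_url": "http://shop.example.com",
    "debug": True,
    "ssh_password_auth": False,
}

assert_hardened(hardened)  # passes silently
print(find_misconfigurations(drifted))
```

Returning the full list of violations, rather than stopping at the first, gives the team one complete report per run.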
The First Way: Outcomes
Determinism in the release process
Creating single repository for code and environments
Consistent Dev, QA, Int, and Staging environments, all properly built before deployment begins
Decreased cycle time
Reduce deployment times from 6 hours to 45 minutes
Refactor deployment process that had 1300+ steps spanning 4 weeks
Faster release cadence
The Second Way: Amplify Feedback Loops (Right to Left)
Expose visual data so everyone can see how their decisions affect the entire system
Get Development closer to Operations and customers
Create a reliable system of work that improves itself
Phase 2: Extend Release Process And Create Right -> Left Feedback Loops
Embed Dev into Ops escalation process
Invite Dev to post-mortems/root cause analysis meeting
Have Dev cross-train IT Operations
Ensure application monitoring/metrics to aid in Ops and Infosec work (e.g., incident/problem management)
The Second Way: Amplify Feedback Loops: Infosec Insurgency
Give production feedback to developers: being attacked is a gift
  Capture all instances of “UNION ALL” in user input and graph it, show it to developers
  Show all instances of segfaults
Create reusable Infosec use and abuse stories that can be added to every project
  “Handle peak traffic of 4MM users and constant 4-6 Gb/sec Anonymous DDoS attacks”
Integrate Infosec and IR into the Ops/Dev escalation processes (e.g., RACI)
Help build blameless post-mortems
Pre-enable, shield and streamline successful audits
  Document separation of duty and compensating controls
  Don’t let them disrupt the work
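The “UNION ALL” idea above can be prototyped in a few lines: count the requests whose decoded text contains the phrase, bucket them by hour, and show the result to developers. This is a rough sketch under stated assumptions: the log format (an ISO timestamp, a space, then the request) and the sample lines are invented for illustration, and a real deployment would read from the team's own access logs and graphing tool.

```python
import re
from collections import Counter
from urllib.parse import unquote_plus

# Matches "UNION ALL" in any letter case, with any whitespace between the words.
UNION_ALL = re.compile(r"union\s+all", re.IGNORECASE)


def count_union_all_by_hour(log_lines):
    """Count requests per hour whose URL-decoded text contains UNION ALL."""
    counts = Counter()
    for line in log_lines:
        timestamp, _, request = line.partition(" ")
        if UNION_ALL.search(unquote_plus(request)):
            counts[timestamp[:13]] += 1  # keep 'YYYY-MM-DDTHH'
    return counts


def text_graph(counts):
    """Render the hourly counts as a simple text bar chart."""
    return "\n".join(
        f"{hour} {'#' * n} ({n})" for hour, n in sorted(counts.items())
    )


# Hypothetical access-log lines in the assumed format.
sample = [
    "2012-05-23T10:02:11 GET /items?id=1",
    "2012-05-23T10:05:40 GET /items?id=1+UNION+ALL+SELECT+name,pw+FROM+users",
    "2012-05-23T10:31:02 GET /items?id=1%20union%20all%20select%20null",
    "2012-05-23T11:15:09 GET /search?q=2+UnIoN+AlL+SELECT+1",
]
counts = count_union_all_by_hour(sample)
print(text_graph(counts))
```

Decoding the request before matching matters because attack strings usually arrive URL-encoded.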
The Second Way: Outcomes
Defects and security issues getting fixed faster than ever
Reusable Ops and Infosec user stories now part of the Agile process
All groups communicating and coordinating better
Everybody is getting more work done
The Third Way: Culture Of Continual Experimentation And Learning
Foster a culture that rewards:
  Experimentation (taking risks) and learning from failure
  Repetition is the prerequisite to mastery
Why?
  You need a culture that keeps pushing into the danger zone
  And have the habits that enable you to survive in the danger zone
Help IT Operations…
“The best way to avoid failure is to fail constantly”
Harden the production environment
Have scheduled drills to “crash the data center”
Create your “chaos monkeys” to introduce faults into the system (e.g., randomly kill processes, take out servers, etc.)
Rehearse and improve responding to unplanned work
  NetFlix: Hardened AWS service
  StackOverflow
  Amazon firedrills (Jesse Allspaw)
  The Monkey (Mac)
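The drill described above, inject faults at random and then rehearse the recovery, can be tried safely against a simulation before touching real systems. The sketch below is a toy model only: the `Service` class, the kill probability and the recovery step are assumptions for illustration and do not reproduce the Netflix Chaos Monkey tool or any real process management.

```python
import random


class Service:
    """In-memory stand-in for a process or server; nothing real is killed."""

    def __init__(self, name):
        self.name = name
        self.running = True

    def restart(self):
        self.running = True


def release_monkey(services, rng, kill_probability=0.3):
    """Randomly stop running services; return the names that were stopped."""
    killed = []
    for service in services:
        if service.running and rng.random() < kill_probability:
            service.running = False
            killed.append(service.name)
    return killed


def recover(services):
    """Stand-in for the rehearsed response: restart anything that is down."""
    restarted = []
    for service in services:
        if not service.running:
            service.restart()
            restarted.append(service.name)
    return restarted


def run_drill(services, rounds, seed=None):
    """Run several inject-then-recover rounds and log what happened in each."""
    rng = random.Random(seed)
    log = []
    for round_number in range(1, rounds + 1):
        killed = release_monkey(services, rng)
        restarted = recover(services)
        log.append((round_number, killed, restarted))
    return log


fleet = [Service(name) for name in ("web", "queue", "db", "cache")]
for round_number, killed, restarted in run_drill(fleet, rounds=5, seed=2012):
    print(f"round {round_number}: killed={killed} restarted={restarted}")
```

Passing a seed makes a drill repeatable, which helps when the team wants to replay a round in a post-mortem.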
Phase 3: Organize Dev and Ops To Achieve Organizational Goals
Allocate 20% of Dev cycles to non-functional requirements
Integrate fault injection and resilience into design, development and production (e.g., Chaos Monkey)
Help Product Management…
Lesson: Allocate 20% of Dev cycles to paying down technical debt
The Third Way: Culture Of Continual Experimentation And Learning: Infosec
Infosec remediation projects in the Agile backlog
  Make technical debt visible
  Help prioritize work against features and other non-functional requirements
Release your Chaos Monkey
  Evil/Fuzzy/Chaotic Monkey
  Eradicate SQLi and XSS defects in our lifetime
Rehearse cleaning up after the Chaos Monkey
Find processes that waste everyone’s time
  Eliminate needless complexity
The Third Way: Outcomes
Technical debt is being paid off
Exploitable attack surface area decreases
Continual reduction of unplanned work
More cycles for planned work
More resilient code and environments
Balancing nimbleness and practiced repetition
Enabling wider range of risk/reward balance
This Is An Important Problem
Operations Sees…
  Fragile applications are prone to failure
  Long time required to figure out “which bit got flipped”
  Detective control is a salesperson
  Too much time required to restore service
  Too much firefighting and unplanned work
  Urgent security rework and remediation
  Planned project work cannot complete
  Frustrated customers leave
  Market share goes down
  Business misses Wall Street commitments
  Business makes even larger promises to Wall Street
Dev Sees…
  More urgent, date-driven projects put into the queue
  Even more fragile code (less secure) put into production
  More releases have increasingly “turbulent installs”
  Release cycles lengthen to amortize “cost of deployments”
  Failing bigger deployments more difficult to diagnose
  Most senior and constrained IT ops resources have less time to fix underlying process problems
  Ever increasing backlog of work that could help the business win
  Ever increasing amount of tension between IT Ops, Development, Design…
If I Could Wave A Magic Wand, Everyone Will…
Become conversant with DevOps and recognize the practices when you see them
Be energized about how ITSM practitioners can contribute in this organizational journey
Leave with some concrete steps to get some great outcomes
Become a part of a team that starts putting DevOps practices into place
When IT Fails: The Novel and The DevOps Cookbook
Coming in July 2012
“In the tradition of the best MBA case studies, this book should be mandatory reading for business and IT graduates alike.”
  Paul Muller, VP Software Marketing, Hewlett-Packard
“The greatest IT management book of our generation.”
  Branden Williams, CTO Marketing, RSA
Gene Kim, Tripwire founder, Visible Ops co-author
When IT Fails: The Novel and The DevOps Cookbook
Our mission is to positively affect the lives of 1 million IT workers by 2017
If you would like the “Top 10 Things Infosec Needs To Know About DevOps,” sample chapters and updates on the book:
  Sign up at http://itrevolution.com
  Email [email protected]
  Hand me a business card