Smart Platform Infrastructure
How we are learning to let our team sleep at night
James Huston, DevOps Days Charlotte
February 2017
whoami
• James Huston - Director of Platform Engineering @ Red Ventures
• Over the last 20 years I have been on teams that:
• Tried a lot of things, some worked, some didn’t
• Learned a lot of do’s and don’ts
The Team
Thomas Hopkins, Ryan Ruscett, Alfonso Cabrera, Garrett Johnson, Mike Guthrie
So what do I have to share?
• Sleep
• Operations -vs- Platform Ops
• Infrastructure (AWS)
• Monitoring and Alerting
• Security
• Workflows
• Documentation
• Docker
Sleep
• Our jobs are 24/7/365
• Small teams
• Resource bound
• To be successful, we need sleep
Operations -vs- Platform Ops
Operations:
• Deeper knowledge
• Correct -vs- Fast
• Snowflakes?
Platform Ops:
• Wide breadth of knowledge
• Fast turnaround, or self-service
• Automate all the things
Platform Ops
Platform enables developers to safely and consistently perform their own operations and build resilient and secure applications.
Infrastructure
• Traditional Operations - Healthy Infrastructure
• Linux in your datacenter
• Apps on top of that
• Platform Ops - Healthy Applications
• AWS/Azure/Google
• Managed services
• Apps on top of that
Monitoring and Alerting
• You are likely underestimating its importance
• Integrate them from the beginning, don’t bolt them on.
• Make sure your alerts go to the correct people
• Don’t create alerts that you are going to ignore!
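The last two bullets can be sketched as a small routing function. This is only an illustration: the `Alert` fields, the `OWNERS` map, the rotation names, and the severity labels are all hypothetical and not taken from the talk. The idea is that every alert either reaches a named owner or is not sent at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    service: str
    severity: str  # assumed labels: "page", "ticket", or "info"
    message: str


# Hypothetical ownership map: service name -> on-call rotation.
OWNERS = {"checkout-api": "payments-oncall", "search": "search-oncall"}
DEFAULT_OWNER = "platform-oncall"


def route(alert):
    """Return (owner, channel), or None when nobody would act on the alert."""
    if alert.severity not in ("page", "ticket"):
        # Not actionable: drop it instead of training people to ignore alerts.
        return None
    owner = OWNERS.get(alert.service, DEFAULT_OWNER)
    channel = "pager" if alert.severity == "page" else "queue"
    return (owner, channel)
```

For example, `route(Alert("checkout-api", "page", "5xx spike"))` goes to the owning team's pager, while an `"info"` alert is dropped rather than delivered and ignored.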
Infrastructure Layout
[Diagram: Staging and Production environments]
Our Infrastructure
Infrastructure - Why is it important?
• Take advantage of Autoscaling for scale and auto healing
• Design to be secure from the start
• Design with monitoring and alerting built in
• Build your infrastructure in a standard, documented, reproducible way
Immutable Infrastructure
• First line of debugging: remove the machine and let it get replaced
• Avoid snowflakes/unicorns as much as possible
• Replace for security reasons
• Easy to implement (in the cloud anyhow)
• Salt/Chef/Puppet - use it for initial config, don’t push changes
Program and Automate
• Reproduce repeatable infrastructures
• Team review of changes before they are made
• Pull requests
• Easy Rollback
• Shareable and reusable modules
• https://github.com/segmentio/stack
Terraform
• Plays nice with most of the things
• Multiple cloud providers, VMware, OpenStack
• Grafana, DataDog, New Relic, PagerDuty, Logentries
• MySQL, PostgreSQL
• Program all the things - Except Snowflakes
Terraform -vs- CloudFormation
Terraform:
• State
• Fast
• Admin Access
CloudFormation:
• No State
• Not so fast
• AWS Service Catalog
Security - SSO
• Don’t underestimate the power of the dark side OR your need to use Single Sign On (SSO)
• Active Directory, LDAP, Okta for AWS/Apps
• JumpCloud or LDAP for EC2 instances
• Avoid tools that don’t support SSO (GitHub.com) in favor of tools that do (GitHub Enterprise)
Security
• Don’t share SSH keys among your team(s). Ever.
• 0.0.0.0/0 on a security group that is not a public ELB? That’s likely bad.
• eg. future VPN or DirectConnect
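The 0.0.0.0/0 rule of thumb above lends itself to an automated check. The sketch below is one possible way to do it, not the team's actual tooling. It assumes security groups are given as dictionaries shaped like the `SecurityGroups` entries returned by the EC2 DescribeSecurityGroups API (`GroupId`, `IpPermissions`, `IpRanges`, `Ipv6Ranges`). The function name, the set of public ELB group IDs, and the sample data are made up for illustration.

```python
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def find_open_ingress(security_groups, public_elb_group_ids):
    """List (group_id, from_port, to_port, cidr) for world-open ingress rules
    on any group that is not attached to a public ELB."""
    findings = []
    for group in security_groups:
        group_id = group["GroupId"]
        if group_id in public_elb_group_ids:
            continue  # public load balancers are expected to be open
        for perm in group.get("IpPermissions", []):
            cidrs = [r["CidrIp"] for r in perm.get("IpRanges", [])]
            cidrs += [r["CidrIpv6"] for r in perm.get("Ipv6Ranges", [])]
            for cidr in cidrs:
                if cidr in OPEN_CIDRS:
                    findings.append(
                        (group_id, perm.get("FromPort"), perm.get("ToPort"), cidr)
                    )
    return findings


# Made-up sample data: one public ELB group and one application group.
sample_groups = [
    {"GroupId": "sg-elb", "IpPermissions": [
        {"FromPort": 443, "ToPort": 443, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ]},
    {"GroupId": "sg-app", "IpPermissions": [
        {"FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"FromPort": 8080, "ToPort": 8080, "IpRanges": [{"CidrIp": "10.0.0.0/16"}]},
    ]},
]

findings = find_open_ingress(sample_groups, {"sg-elb"})
```

With the sample data, only the SSH rule on `sg-app` is reported. A check like this could run in CI or on a schedule and send its findings to the owning team.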
Developer Workflows
• Automation is key
• Use standard tooling (Makefile, shell scripts, etc)
• Bamboo -vs- Jenkins
• Centralization
• Provide guardrails and let teams with the expertise control their own destiny
• Documentation of workflows is critically important
Documentation
• README.MD - keep docs with your projects
• Centralize infrastructure, CI/CD, and other core docs
• Make it mandatory in governance
• Set a good example!
Docker
Security info à la Jérôme Petazzoni (https://jpetazzo.github.io/): http://bit.ly/1t1DG3Q
Docker
• Don’t run things as root
• Update often!
• For real security, run all filesystems read-only
• Use small (Alpine, Debian) base images
• Use only approved images
• Update them often
• Windows? All of the above.
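Two of the points above, not running as root and using only approved base images, can be checked mechanically. What follows is a deliberately simple sketch, not a full Dockerfile parser. The `APPROVED_BASES` set and the `lint_dockerfile` name are assumptions for illustration. It ignores multi-stage references to earlier stages, registry prefixes, and line continuations.

```python
# Assumption: the approved list would come from your own governance process.
APPROVED_BASES = {"alpine", "debian"}


def lint_dockerfile(text):
    """Return a list of human-readable problems found in a Dockerfile."""
    problems = []
    user = None
    for raw in text.splitlines():
        parts = raw.strip().split()
        if not parts or parts[0].startswith("#"):
            continue
        keyword = parts[0].upper()
        if keyword == "FROM":
            # Skip flags such as --platform=... before the image name.
            args = [p for p in parts[1:] if not p.startswith("--")]
            if args:
                name = args[0].split("@")[0].split(":")[0]
                if name not in APPROVED_BASES:
                    problems.append("base image not approved: " + name)
            user = None  # each build stage starts with the base image's user
        elif keyword == "USER" and len(parts) > 1:
            user = parts[1].split(":")[0]
    if user is None or user in ("root", "0"):
        # With no USER instruction the image keeps the base image's user,
        # which is root for most common base images.
        problems.append("final stage does not switch to a non-root USER")
    return problems
```

A Dockerfile that starts `FROM alpine:3.5` and later sets `USER app` passes cleanly, while one built on an unapproved image with no `USER` line yields two problems.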
Docker
• KISS - Keep It Simple, Stupid!
Drumroll Please
The “Cloud” makes Platform Ops a reality. We can now program and automate “all the things” and we have the tools to make our infrastructure and applications maintain and heal themselves …
And we get to sleep at night
411
James Huston
Director of Platform Engineering @ Red Ventures
@hustonjs