Vault in Production at Apptio
Lee Briggs, Snr Infrastructure Engineer
© 2016 Apptio, All rights reserved (v2.5)
$(whoami)
Based in London
Work for Apptio
Github: https://github.com/jaxxstorm
Twitter: https://twitter.com/briggsl
Blog: https://www.leebriggs.co.uk
Apptio Infrastructure
Some Apptio numbers
Almost 6000 unique “vms”
15 global “datacenters”
Physical and AWS VPCs
Hundreds of MySQL databases
Over 3.5 petabytes of raw storage
Over 178 TB of memory
Over 170,000 CPU cores
“The (initial) problem”
How do we provide audited access to lots of MySQL instances?
Vault
Vault provides:
Audit logging
MySQL Credential management
High availability
A secure way to store credentials
Vault
What we needed to figure out
How to deploy vault in 15 datacenters
Automated, easily configurable
How to connect several hundred databases to those vaults
High availability
Sane backups
Make it easier than passing around passwords or looking in app config files
The journey
Step 1: Deploy Vault
We already had consul in all DCs
Spread across racks in DC
Across AZs in AWS
Is connected using WAN federation
We use Puppet for configuration management
The puppet module takes care of download/install
Connect to consul – HA backend
This also provides us with TLS
We deployed vault onto all consul servers
Step 2: Initialise Vault
Automating this isn’t trivial
Plaintext keys are bad
By default, vault outputs plaintext unseal keys
Solution: Use the GPG support
We already used GPG to store encrypted files in git
Using puppet + eyaml
Also using git-crypt
This way, the keys are protected by each user’s GPG private key
We used the API to init vault in each DC
We provide 7 GPG keys, and need 3 users to unseal a vault
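As a rough sketch of that init call: Vault's sys/init endpoint accepts secret_shares, secret_threshold and pgp_keys, and returns each unseal key share encrypted to the matching public key. The Go below only builds the request body. The helper names (initRequest, newInitRequest) and the placeholder key strings are illustrative assumptions, not the code we ran.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// initRequest mirrors the JSON body accepted by Vault's sys/init endpoint.
type initRequest struct {
	SecretShares    int      `json:"secret_shares"`
	SecretThreshold int      `json:"secret_threshold"`
	PGPKeys         []string `json:"pgp_keys"`
}

// newInitRequest (hypothetical helper) creates one key share per PGP key,
// because Vault requires len(pgp_keys) to equal secret_shares.
func newInitRequest(pgpKeys []string, threshold int) (initRequest, error) {
	if threshold < 1 || threshold > len(pgpKeys) {
		return initRequest{}, fmt.Errorf("threshold %d is invalid for %d keys", threshold, len(pgpKeys))
	}
	return initRequest{
		SecretShares:    len(pgpKeys),
		SecretThreshold: threshold,
		PGPKeys:         pgpKeys,
	}, nil
}

func main() {
	// Placeholders only: real entries are base64-encoded GPG public keys.
	keys := []string{"pubkey-1", "pubkey-2", "pubkey-3", "pubkey-4", "pubkey-5", "pubkey-6", "pubkey-7"}

	req, err := newInitRequest(keys, 3)
	if err != nil {
		panic(err)
	}
	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	// This body would be sent with PUT to <vault address>/v1/sys/init.
	fmt.Println(string(body))
}
```

With 7 keys and a threshold of 3, any 3 of the 7 key holders can unseal a vault, and nobody ever sees a plaintext key share.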
Step 3: Unseal the Vault
At this stage, we have around 60 instances of vault to unseal.
Doing this “manually” is obviously not tenable
Automating this is dangerous.
Unseal
https://github.com/jaxxstorm/unseal
Add your vault servers to a config file
Add your encrypted unseal key
You can also put the plaintext key, but don’t!
Prompts for your GPG keyring password
If you’re running GPG agent, this is a security risk.
Unseals all vaults
Each unseal command runs in a goroutine
Can send unseal command to 75 vaults in around 15s!
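The goroutine-per-vault approach can be sketched as below. This is not the source of the unseal tool, just a minimal illustration under stated assumptions: the key share has already been decrypted with GPG, each server gets one PUT to Vault's sys/unseal endpoint, and fakeVault is a local stand-in server so the sketch runs without a real Vault. All function names here are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// unsealOne submits a single unseal key share to one Vault server.
func unsealOne(client *http.Client, addr, key string) error {
	body, err := json.Marshal(map[string]string{"key": key})
	if err != nil {
		return err
	}
	req, err := http.NewRequest(http.MethodPut, addr+"/v1/sys/unseal", bytes.NewReader(body))
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	return nil
}

// unsealAll sends the key share to every server concurrently, one goroutine
// per server, and returns the failures keyed by server address.
func unsealAll(client *http.Client, addrs []string, key string) map[string]error {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	failed := make(map[string]error)
	for _, addr := range addrs {
		wg.Add(1)
		go func(addr string) {
			defer wg.Done()
			if err := unsealOne(client, addr, key); err != nil {
				mu.Lock()
				failed[addr] = err
				mu.Unlock()
			}
		}(addr)
	}
	wg.Wait()
	return failed
}

// fakeVault is a local stand-in (an assumption for this sketch) that accepts
// unseal requests and reports partial progress.
func fakeVault() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPut || r.URL.Path != "/v1/sys/unseal" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprint(w, `{"sealed": true, "t": 3, "n": 7, "progress": 1}`)
	}))
}

func main() {
	a, b := fakeVault(), fakeVault()
	defer a.Close()
	defer b.Close()

	failed := unsealAll(a.Client(), []string{a.URL, b.URL}, "decrypted-key-share")
	fmt.Printf("%d of 2 unseal requests failed\n", len(failed))
}
```

Because every request runs in its own goroutine, total time is roughly that of the slowest vault rather than the sum of all of them.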
Unseal Demo
Step 4: Configure the vault
We need to now add some configuration for all DCs
Answer: https://github.com/UKHomeOffice/vaultctl
https://www.hashicorp.com/blog/codifying-vault-policies-and-configuration/
Allows you to define the vault config in yaml
Can then run vaultctl to configure your vault server as you require
Enable LDAP with config
Enable audit logging
Enable MySQL backend
We run this in a loop for all DCs
Only need to hit a single vault server in each DC
Step 5: Add MySQL configuration
We provision VMs using internal tool “selfserve”
When VM is provisioned for DB
Puppet runs, installs mysql
Puppet adds a “vault” user with grants
We then add roles to each DB config – readonly and full
Selfserve makes an API call to that region’s vault, adding it as a backend
Selfserve has its own token, which has write permissions to the mysql backend using policy
We mount all databases with path mysql/<hostname>
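For illustration, the calls involved in adding one database look roughly like this. The sketch only builds the requests for the (legacy) MySQL secrets backend: mount it at mysql/<hostname>, configure the connection, then define the readonly and full roles. Selfserve is internal, so mysqlBackendCalls, the port, the “vault” user's connection string and the GRANT statements are all assumptions for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiCall describes one request against the Vault HTTP API.
type apiCall struct {
	Method string
	Path   string
	Body   map[string]string
}

// mysqlBackendCalls (hypothetical helper) lists the requests needed to mount
// and configure the MySQL backend for a single database host.
func mysqlBackendCalls(hostname, vaultUserPassword string) []apiCall {
	mount := "mysql/" + hostname
	// {{name}} and {{password}} are filled in by Vault for each new lease.
	createUser := "CREATE USER '{{name}}'@'%' IDENTIFIED BY '{{password}}'; "
	return []apiCall{
		{
			Method: "POST",
			Path:   "/v1/sys/mounts/" + mount,
			Body:   map[string]string{"type": "mysql"},
		},
		{
			Method: "POST",
			Path:   "/v1/" + mount + "/config/connection",
			// Assumed DSN for the "vault" user that Puppet creates.
			Body: map[string]string{
				"connection_url": fmt.Sprintf("vault:%s@tcp(%s:3306)/", vaultUserPassword, hostname),
			},
		},
		{
			Method: "POST",
			Path:   "/v1/" + mount + "/roles/readonly",
			Body:   map[string]string{"sql": createUser + "GRANT SELECT ON *.* TO '{{name}}'@'%';"},
		},
		{
			Method: "POST",
			Path:   "/v1/" + mount + "/roles/full",
			Body:   map[string]string{"sql": createUser + "GRANT ALL PRIVILEGES ON *.* TO '{{name}}'@'%';"},
		},
	}
}

func main() {
	for _, c := range mysqlBackendCalls("db01.example.com", "example-password") {
		body, err := json.Marshal(c.Body)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s %s %s\n", c.Method, c.Path, body)
	}
}
```

Each request would carry selfserve's token in the X-Vault-Token header.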
Step 6: Make logins easy
Configure ldap auth with policies for customers mapped to LDAP groups
Some people can get write access, some only get read access
However, authing with ldap and then having to do vault read was difficult for users to remember
Have to vault auth
Then vault read <creds>
Having to look this up when on-call isn’t fun if you don’t do it regularly
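Under the hood, those two steps are two HTTP calls: log in through the LDAP auth backend to get a token, then read credentials from the database's mount. A minimal Go sketch follows, assuming the mysql/<hostname> mount layout described earlier and a role named readonly. fakeVault is a local stand-in so the example runs without a real Vault, and the function names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// dbCreds holds the short-lived MySQL login returned by the creds endpoint.
type dbCreds struct {
	Username string `json:"username"`
	Password string `json:"password"`
}

// ldapLogin exchanges an LDAP username and password for a Vault token.
func ldapLogin(client *http.Client, addr, user, password string) (string, error) {
	body, err := json.Marshal(map[string]string{"password": password})
	if err != nil {
		return "", err
	}
	resp, err := client.Post(addr+"/v1/auth/ldap/login/"+user, "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("ldap login failed: %s", resp.Status)
	}
	var out struct {
		Auth struct {
			ClientToken string `json:"client_token"`
		} `json:"auth"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	if out.Auth.ClientToken == "" {
		return "", fmt.Errorf("ldap login returned no token")
	}
	return out.Auth.ClientToken, nil
}

// readCreds asks the MySQL backend mounted at mysql/<dbHost> for credentials.
func readCreds(client *http.Client, addr, token, dbHost, role string) (dbCreds, error) {
	var out struct {
		Data dbCreds `json:"data"`
	}
	req, err := http.NewRequest(http.MethodGet, addr+"/v1/mysql/"+dbHost+"/creds/"+role, nil)
	if err != nil {
		return out.Data, err
	}
	req.Header.Set("X-Vault-Token", token)
	resp, err := client.Do(req)
	if err != nil {
		return out.Data, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return out.Data, fmt.Errorf("reading creds failed: %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&out)
	return out.Data, err
}

// fakeVault is a local stand-in (an assumption for this sketch).
func fakeVault() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/auth/ldap/login/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"auth": {"client_token": "test-token"}}`)
	})
	mux.HandleFunc("/v1/mysql/", func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-Vault-Token") != "test-token" {
			http.Error(w, "permission denied", http.StatusForbidden)
			return
		}
		fmt.Fprint(w, `{"data": {"username": "tmp-user", "password": "tmp-pass"}}`)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := fakeVault()
	defer srv.Close()

	token, err := ldapLogin(srv.Client(), srv.URL, "alice", "ad-password")
	if err != nil {
		panic(err)
	}
	creds, err := readCreds(srv.Client(), srv.URL, token, "db01.example.com", "readonly")
	if err != nil {
		panic(err)
	}
	fmt.Println("temporary MySQL user:", creds.Username)
}
```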
Breakglass
A simple golang command line tool to automate the login process
Prompts for your AD password, and you specify the mysql host you need
It finds the correct vault endpoint using DNS forwarding, and then automatically drops you into a mysql shell
Inspired by vault ssh
It’s not currently open source, but we hope to have that done by the end of Q3.
Breakglass Demo
More Considerations
ACLs
If you’re using consul as your backend, turn on ACLs!
You should also block access to port 8500/8501 where possible
Consul can be used extensively to pivot to RCE:
http://www.kernelpicnic.net/2017/05/29/Pivoting-from-blind-SSRF-to-RCE-with-Hashicorp-Consul.html
If you store your secrets in consul, don’t let someone delete them
By default, the consul web api allows access to delete and modify any key
This requires an investment in implementing tokens
You can use vault to manage these!
Backups
When we init vault, we use the key prefix “vault/$datacenter”
Our DCs are completely distinct; we never share secrets between DCs
We use consul snapshot to take backups
Take them once per hour
We copy them to another DC
We test restores weekly
Start vault on a different port
Connect it to the existing consul with the “vault/$datacenter” prefix
All done via ansible
Have users unseal – users run when they come online
Verify integrity
Shutdown
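The hourly backup step can also be driven through Consul's HTTP API: GET /v1/snapshot returns the same archive that consul snapshot save writes. The Go below is a sketch, not our backup job. fetchSnapshot, snapshotName and the file naming scheme are hypothetical, fakeConsul is a local stand-in, and a real job would stream the response to disk rather than hold it in memory.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchSnapshot downloads a snapshot from one Consul server. With ACLs
// enabled, the token must be allowed to take snapshots.
func fetchSnapshot(client *http.Client, consulAddr, token string) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, consulAddr+"/v1/snapshot", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-Consul-Token", token)
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("snapshot failed: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// snapshotName builds a per-datacenter, timestamped file name (assumed scheme).
func snapshotName(datacenter string, t time.Time) string {
	return fmt.Sprintf("consul-%s-%s.snap", datacenter, t.UTC().Format("20060102T150405Z"))
}

// fakeConsul is a local stand-in (an assumption for this sketch) that only
// serves the snapshot endpoint to one known token.
func fakeConsul() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/v1/snapshot" {
			http.NotFound(w, r)
			return
		}
		if r.Header.Get("X-Consul-Token") != "backup-token" {
			http.Error(w, "permission denied", http.StatusForbidden)
			return
		}
		fmt.Fprint(w, "fake-snapshot-bytes")
	}))
}

func main() {
	srv := fakeConsul()
	defer srv.Close()

	data, err := fetchSnapshot(srv.Client(), srv.URL, "backup-token")
	if err != nil {
		panic(err)
	}
	fmt.Printf("would write %d bytes to %s\n", len(data), snapshotName("dc1", time.Now()))
}
```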
Lessons Learned
Pick 1 thing and “vault it”
Trying to secure all your secrets in vault straight away can be overwhelming
We now store the majority of our secrets in vault after lessons learned from MySQL
Have a good story for configuration, backups and unsealing
Consul + Vault has a great HA story
As long as you use consul’s service discovery of course
“Automated” secret management has trade-offs
Be aware of them
Abstract away the user pain where possible
Golang is great for cmdline tools!
These packages use viper + cobra
https://github.com/spf13/cobra
THANK YOU