Date posted: 10-May-2015
Category: Technology
Uploaded by: puppet-labs
R.I.Pienaar
Puppet Camp Ghent
Managing Puppet using MCollective
R.I.Pienaar | [email protected] | http://devco.net | @ripienaar
Who am I?
• Puppet user since 0.22.x
• Architect of MCollective
• Author of Extlookup and Hiera
• Developer at Puppet Labs London
• Blog at http://devco.net
• Tweets at @ripienaar
• Volcane on IRC
The Problem?
• Puppet needs management just like other software
• Enabling, disabling, ad-hoc runs, custom environments etc
• The Puppet Master is a finite resource that needs protection
• Orchestrated deploys
Available on yum.puppetlabs.com and apt.puppetlabs.com
http://srt.ly/mcpuppet
package { ["mcollective-puppet-agent", "mcollective-puppet-client"]: ensure => present }
MCollective Puppet Agent
Obtaining Statuses
$ mco puppet status
* [ ============================================================> ] 11 / 11
node8.example.net: Currently stopped; last completed run 14 minutes 16 seconds ago ....
Summary of Applying:
false = 11
Summary of Daemon Running:
stopped = 11
Summary of Enabled:
enabled = 10
disabled = 1
Summary of Idling:
false = 11
Finished processing 11 / 11 hosts in 72.05 ms
Per node status
Estate wide summary
$ mco puppet count
Total Puppet nodes: 11
Nodes currently enabled: 10
Nodes currently disabled: 1
Nodes currently doing puppet runs: 5
Nodes currently stopped: 6
Nodes with daemons started: 10
Nodes without daemons started: 1
Daemons started but idling: 6
Obtaining Statuses
$ mco rpc puppet last_run_summary
* [ ============================================================> ] 28 / 28
. . .
Summary of Config Retrieval Time:
Average: 20.13
Summary of Total Resources:
Average: 435
Summary of Total Time:
Average: 39.33
Finished processing 28 / 28 hosts in 311.23 ms
Obtaining Statuses
$ mco puppet runonce
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2593.85 ms
$ mco puppet count
Total Puppet nodes: 11
Nodes currently enabled: 10
Nodes currently disabled: 1
Nodes currently doing puppet runs: 2
Nodes currently stopped: 9
Nodes with daemons started: 10
Nodes without daemons started: 1
Daemons started but idling: 8
Doing Basic Runs
Puppet 3 disable message
Run with default configured splay and splaylimit
Run with no splay, still subject to enable/disable
$ mco puppet runonce -f
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Doing Basic Runs
Force splay and set a custom splay limit
$ mco puppet runonce --splay --splaylimit 120
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Doing Basic Runs
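As a rough illustration of what splay is for: each node waits a random time of up to splaylimit seconds before starting, which spreads the load on the Puppet Master. The Python sketch below only models that idea. The function name `splay_delays`, the uniform distribution and the seed parameter are assumptions made for illustration, not the agent's code.

```python
import random


def splay_delays(nodes, splaylimit, seed=None):
    """Pick a random start delay in [0, splaylimit] seconds for every node."""
    rng = random.Random(seed)  # the seed only makes this illustration repeatable
    return {node: rng.uniform(0, splaylimit) for node in nodes}


# Hypothetical node names, mirroring the 11 nodes on the slides.
nodes = [f"node{i}.example.net" for i in range(11)]
delays = splay_delays(nodes, splaylimit=120, seed=1)
for node, delay in sorted(delays.items(), key=lambda item: item[1]):
    print(f"{node} starts after {delay:5.1f}s")
```

With a splaylimit of 120 the starts are spread over two minutes instead of all arriving at the master at once.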
Selects 2 tags in a specific Puppet Environment
$ mco puppet runonce --tag webserver --tag syslog --environment development
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Tags and Environment
Does a noop run; gathers reports and audit information
$ mco puppet runonce --noop
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Doing noop Runs
When puppet.conf has noop=true, do an actual run on demand
$ mco puppet runonce --tag webserver --no-noop
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Doing no-noop Runs
Does a single run against a different Puppet Master
$ mco puppet runonce --server secops.example.net:8134 --tag compliance
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Choosing a Master
The Big Red Button
Disables Puppet; nodes that are already disabled keep their existing disable reason
$ mco puppet disable "we f'd up, stop the train!"
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted Could not disable Puppet: Already disabled
Summary of Enabled:
disabled = 11
Finished processing 11 / 11 hosts in 90.06 ms
The Big Green Button
Enables all Puppet nodes that were disabled with the "stop the train" message
$ mco puppet enable -S 'puppet().disable_message=/stop the train/'
* [ ============================================================> ] 10 / 10
Summary of Enabled:
enabled = 10
Finished processing 10 / 10 hosts in 90.06 ms
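Why 10 hosts and not 11: the filter only discovers nodes whose disable message matches /stop the train/, so node9, which was disabled earlier with 'machine under maintenance', is left alone. A minimal Python sketch of that selection, assuming a plain regular expression search over each node's message. The dictionary of nodes and the function name are made up for illustration.

```python
import re


def nodes_matching_message(disable_messages, pattern):
    """Return the nodes whose disable message matches the regular expression."""
    regex = re.compile(pattern)
    return sorted(
        node
        for node, message in disable_messages.items()
        if message is not None and regex.search(message)
    )


# Hypothetical estate: ten nodes stopped by the big red button,
# one already disabled for maintenance before the button was pressed.
disable_messages = {f"node{i}.example.net": "we f'd up, stop the train!" for i in range(9)}
disable_messages["middleware.example.net"] = "we f'd up, stop the train!"
disable_messages["node9.example.net"] = "machine under maintenance"

to_enable = nodes_matching_message(disable_messages, r"stop the train")
print(len(to_enable), "nodes would be enabled")
```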
Selective Runs
Run using a filter: all web servers with fact cluster=a
$ mco puppet runonce -W "cluster=a roles::webserver"
* [ ============================================================> ] 5 / 5
Finished processing 5 / 5 hosts in 90.06 ms
Facter fact: cluster=a
Puppet class: roles::webserver
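The combined filter can be read as "every listed fact must match and every listed class must be present". The Python sketch below models only that reading, under simplifying assumptions: tokens containing "=" are treated as fact equalities and everything else as a class name, with none of the operators or regular expressions the real filter may accept. The inventory and function names are hypothetical.

```python
def parse_with_filter(text):
    """Split a combined filter into fact equalities and class names."""
    facts, classes = {}, []
    for token in text.split():
        if "=" in token:
            name, value = token.split("=", 1)
            facts[name] = value
        else:
            classes.append(token)
    return facts, classes


def node_matches(node, facts, classes):
    """A node matches only when every fact and every class matches."""
    facts_ok = all(node["facts"].get(name) == value for name, value in facts.items())
    classes_ok = all(name in node["classes"] for name in classes)
    return facts_ok and classes_ok


# Hypothetical inventory used only for this illustration.
inventory = {
    "web1": {"facts": {"cluster": "a"}, "classes": {"roles::webserver"}},
    "web2": {"facts": {"cluster": "b"}, "classes": {"roles::webserver"}},
    "db1": {"facts": {"cluster": "a"}, "classes": {"roles::database"}},
}

facts, classes = parse_with_filter("cluster=a roles::webserver")
selected = [name for name, node in inventory.items() if node_matches(node, facts, classes)]
print(selected)
```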
Selective Runs
Run using a filter: nodes where we manage /srv/www
$ mco puppet runonce -S "resource('File[/srv/www]').managed=true"
* [ ============================================================> ] 5 / 5
Finished processing 5 / 5 hosts in 90.06 ms
Any Puppet resource
Selective Runs
Run using a filter: nodes whose most recent run had config_version xyz
and more than 5 resource failures
$ mco puppet runonce -S "resource().failed_resources>5 and resource().config_version=xyz"
* [ ============================================================> ] 5 / 5
Finished processing 5 / 5 hosts in 90.06 ms
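The compound filter joins two comparisons on each node's last-run data with "and", so a node is selected only when both hold. A small Python sketch of that logic follows. The `last_runs` data and the function name are invented for illustration and say nothing about how discovery is implemented.

```python
def matches_filter(last_run, config_version, failure_threshold):
    """True when both comparisons of the compound filter hold for one node."""
    return (
        last_run["failed_resources"] > failure_threshold
        and last_run["config_version"] == config_version
    )


# Hypothetical last-run data for three nodes.
last_runs = {
    "node1": {"failed_resources": 8, "config_version": "xyz"},
    "node2": {"failed_resources": 8, "config_version": "abc"},
    "node3": {"failed_resources": 2, "config_version": "xyz"},
}

selected = [name for name, run in last_runs.items() if matches_filter(run, "xyz", 5)]
print(selected)
```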
Runs all nodes with a maximum concurrency
$ mco puppet runall 7
2013-01-19 20:58:59: Running all nodes with a concurrency of 7
2013-01-19 20:58:59: Discovering enabled Puppet nodes to manage
2013-01-19 20:59:02: Found 11 enabled nodes
2013-01-19 20:59:06: node3.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:07: node1.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:09: node4.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:10: node6.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:12: node0.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:13: node5.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:17: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:21: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:25: node9.example.net schedule status: Puppet is currently applying a catalog, cannot run now
2013-01-19 20:59:29: node8.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:33: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:38: node2.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:41: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:46: middleware.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:50: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:55: node7.example.net schedule status: Started a background Puppet run
Roll Out A Change Quickly
Does not attempt to manage disabled nodes
2013-01-19 20:58:59: Running all nodes with a concurrency of 7
2013-01-19 20:58:59: Discovering enabled Puppet nodes to manage
2013-01-19 20:59:02: Found 11 enabled nodes
Roll Out A Change Quickly
Starts the first 6 quickly, but takes into account that an administrator is doing 1 other run at the same time
2013-01-19 20:59:02: Found 11 enabled nodes
2013-01-19 20:59:06: node3.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:07: node1.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:09: node4.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:10: node6.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:12: node0.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:13: node5.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:17: Currently 7 nodes applying the catalog; waiting for less than 7
Roll Out A Change Quickly
node9 was already being run by an administrator or by its normal schedule, so it was skipped and the next node was tried
2013-01-19 20:59:17: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:21: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:25: node9.example.net schedule status: Puppet is currently applying a catalog, cannot run now
2013-01-19 20:59:29: node8.example.net schedule status: Started a background Puppet run
Roll Out A Change Quickly
Regularly checks the concurrency and starts more nodes as soon as possible.
Average node run time 34.39s, total time 55 seconds
2013-01-19 20:59:29: node8.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:33: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:38: node2.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:41: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:46: middleware.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:50: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:55: node7.example.net schedule status: Started a background Puppet run
Roll Out A Change Quickly
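The behaviour in the log (start nodes while fewer than the limit are applying a catalog, otherwise wait and check again) can be approximated with a small simulation. The Python sketch below is an illustrative model only, not the plugin's code: it assumes one-second ticks, at most one new run per tick, a fixed run time per node, and a constant number of runs started by someone else (`other_runs`). All names are made up.

```python
from collections import deque


def simulate_runall(nodes, concurrency, run_time, other_runs=0):
    """Simulate a rolling run in one-second ticks.

    Returns a list of (node, start_second) pairs in the order runs were started.
    """
    if concurrency <= other_runs:
        raise ValueError("concurrency must exceed the runs started by others")
    pending = deque(nodes)
    running = {}  # node -> second at which its run finishes
    started = []
    now = 0
    while pending:
        # Drop runs that have finished by this tick.
        running = {node: end for node, end in running.items() if end > now}
        applying = len(running) + other_runs
        if applying < concurrency:
            node = pending.popleft()
            running[node] = now + run_time
            started.append((node, now))
        # Otherwise wait: as many nodes as allowed are already applying a catalog.
        now += 1
    return started


nodes = [f"node{i}" for i in range(11)]
for node, second in simulate_runall(nodes, concurrency=7, run_time=34, other_runs=1):
    print(f"{second:3d}s: {node} started")
```

With a limit of 7 and one run owned by someone else, the model starts 6 nodes immediately and the rest only as earlier runs finish, which is the same shape as the log above.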
Does runonce in batches of 5, with a 5 minute sleep per batch. ^C after any batch to stop.
15 minute total run time.
$ mco puppet runonce --batch 5 --batch-sleep 300
* [ ============================================================> ] 11 / 11
Finished processing 11 / 11 hosts in 903686.29 ms
Roll Out A Change Slowly
Wait 5 minutes
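The 15 minutes follow from the batch arithmetic: 11 nodes in batches of 5 gives 3 batches, and 3 sleeps of 300 seconds add up to 900 seconds, close to the 903686.29 ms reported. The Python sketch below assumes a sleep follows every batch, including the last one. That assumption is inferred from the reported time, not taken from the tool's source, and the function name is hypothetical.

```python
import math


def batch_plan(node_count, batch_size, batch_sleep):
    """Return (number of batches, total seconds spent sleeping)."""
    batches = math.ceil(node_count / batch_size)
    return batches, batches * batch_sleep


batches, sleep_seconds = batch_plan(node_count=11, batch_size=5, batch_sleep=300)
print(f"{batches} batches, {sleep_seconds / 60:.0f} minutes of sleeping")
```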
Advanced Status And Performance Metrics
Distribution of various metrics.
$ mco puppet summary
Summary statistics for 28 nodes:
Total resources: ▂▇▂▁▁▃▁▂▂▂▄▁▂▁▁▁▁▁▂▁ min: 332.0 max: 695.0
Out Of Sync resources: ▇▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ min: 0.0 max: 2.0
Failed resources: ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ min: 0.0 max: 0.0
Changed resources: ▇▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ min: 0.0 max: 2.0
Config Retrieval time (seconds): ▆▇▅▄▁▃▃▁▁▁▃▁▁▄▂▁▁▁▁▁ min: 2.7 max: 57.1
Total run-time (seconds): ▇▃▄▄▄▃▂▂▂▂▃▂▁▁▁▁▁▂▁▁ min: 7.0 max: 125.1
Time since last run (seconds): ▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂ min: 10.0 max: 89.0k
Performance Analysis
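One way to read the bars: the values reported by the nodes are sorted into equal-width buckets between min and max, and the height of each bar reflects how many nodes fall into that bucket. The Python sketch below builds such a line under stated assumptions: 20 buckets (the number of bars on the slide), seven bar heights, and linear scaling against the fullest bucket. It approximates the idea and is not how the summary command is implemented.

```python
BARS = "▁▂▃▄▅▆▇"


def sparkline_histogram(values, buckets=20):
    """Render the distribution of values as one bar character per bucket."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / buckets or 1  # avoid a zero width when all values are equal
    counts = [0] * buckets
    for value in values:
        index = min(int((value - lo) / width), buckets - 1)
        counts[index] += 1
    peak = max(counts)
    return "".join(BARS[round(count / peak * (len(BARS) - 1))] for count in counts)


# Hypothetical data: ten nodes with no changed resources, one node with two.
changed_resources = [0.0] * 10 + [2.0]
print(sparkline_histogram(changed_resources),
      "min:", min(changed_resources), "max:", max(changed_resources))
```

A tall first bar followed by a flat line means almost every node sits at the minimum, which is what you hope to see for failed or out-of-sync resources.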
Distribution of various metrics.
Config Retrieval time (seconds): ▆▇▅▄▁▃▃▁▁▁▃▁▁▄▂▁▁▁▁▁ min: 2.7 max: 57.1
Total run-time (seconds): ▇▃▄▄▄▃▂▂▂▂▃▂▁▁▁▁▁▂▁▁ min: 7.0 max: 125.1
Performance Analysis
Distribution of config retrieval time.
$ mco plot resource config_retrieval_time
[Plot: Information about Puppet managed resources. Y axis: Nodes (0 to 8). X axis: Config Retrieval Time (0 to 60).]
Performance Analysis
Slow machines
Find machines with config_retrieval_time over 30 seconds - all the dev servers.
$ mco find -S "resource().config_retrieval_time > 30"
dev3.example.net
dev4.example.net
dev7.example.net
dev6.example.net
dev8.example.net
dev9.example.net
dev10.example.net
Performance Analysis
Maintenance Windows and Access Control
Only cert=manager can enable and disable the Puppet Agent, indicating maintenance periods
policy default deny
allow cert=manager enable disable * *
allow cert=sysadmin runonce status * *
allow cert=developer * environment=development *
Puppet State As ACL
Puppet State As ACL
policy default deny
allow cert=manager stop start * *
allow cert=noc stop start puppet().enabled=false
allow cert=developer * environment=development *
NOC can start and stop services only during a maintenance window.
Manager user can always override maintenance windows.
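The policy reads as "deny unless some allow line matches the caller, the action and the state of the node". A minimal Python sketch of that evaluation follows. The rule tuples, the node fields `puppet_enabled` and `environment`, and the function name are all invented for illustration; the real policy file format and its matching rules are not modelled here.

```python
def is_allowed(rules, caller, action, node):
    """Return True when some allow rule matches; otherwise fall back to deny."""
    for rule_caller, actions, condition in rules:
        caller_ok = rule_caller == caller
        action_ok = actions == "*" or action in actions
        if caller_ok and action_ok and condition(node):
            return True
    return False  # corresponds to "policy default deny"


# The three allow lines of the policy, with node conditions as plain functions.
rules = [
    ("cert=manager", {"stop", "start"}, lambda node: True),
    ("cert=noc", {"stop", "start"}, lambda node: not node["puppet_enabled"]),
    ("cert=developer", "*", lambda node: node["environment"] == "development"),
]

# Hypothetical node states: Puppet disabled marks a maintenance window.
in_maintenance = {"puppet_enabled": False, "environment": "production"}
in_service = {"puppet_enabled": True, "environment": "production"}

print("noc during maintenance:", is_allowed(rules, "cert=noc", "stop", in_maintenance))
print("noc outside maintenance:", is_allowed(rules, "cert=noc", "stop", in_service))
print("manager at any time:", is_allowed(rules, "cert=manager", "stop", in_service))
```

Tying the NOC rule to Puppet's disabled state means the same action that opens a maintenance window also grants the access needed during it.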
What is MCollective?
• Ruby framework for writing Orchestration systems
• Provides Authentication, Authorization and Auditing
• No direct communication between client and nodes
Questions?
twitter: @ripienaar
email: [email protected]
blog: www.devco.net
github: ripienaar
freenode: Volcane