How can I deploy rolling OS upgrades & reboots with Puppet or MCollective?
I'm looking for the best way to perform regular rolling upgrades for my infrastructure.
Typically, this involves doing this on each host, one at a time:
sudo yum update -y && sudo reboot
But I'm hitting the limits of how well that scales.
I want to only reboot one node at a time within each of my roles, so that, say, I don't take down all of my load balancers, or DB cluster members, at the same time.
Ideally, I'd wanna do something like:
for role in $(< roles_list.txt) ; do
mco package update_all_and_reboot \
--batch 1 --batch-sleep 90 \
-C $role -F environment=test
done
But, that doesn't quite seem to exist. I'm not sure if using the "shell" agent is the best approach, either?
mco shell run 'yum update -y && reboot' \
--batch 1 --batch-sleep 90
Am I just looking at the wrong sort of tool for this job, though? Is there something better for managing this sort of rolling reboot that I can link up with my Puppet-assigned roles? I want to be comfortable that I'm not taking down anything important all at once, while still doing some updates & reboots in parallel.
Configuration
Deploy
cd /usr/share/ruby/vendor_ruby/mcollective/application
wget https://raw.githubusercontent.com/arnobroekhof/mcollective-plugin-power/master/application/power.rb
and
cd /usr/libexec/mcollective/mcollective/agent
wget https://raw.githubusercontent.com/arnobroekhof/mcollective-plugin-power/master/agent/power.ddl
wget https://raw.githubusercontent.com/arnobroekhof/mcollective-plugin-power/master/agent/power.rb
on both hosts, i.e. test-server1 and test-server2.
Services
Restart mcollective on both hosts:
[vagrant@test-server1 ~]# sudo service mcollective restart
and
[vagrant@test-server2 ~]# sudo service mcollective restart
Commands
Run the following commands on the mcollective server node:
The host test-server2 is listening:
[vagrant@test-server1 ~]$ mco ping
test-server2 time=25.32 ms
test-server1 time=62.51 ms
---- ping statistics ----
2 replies max: 62.51 min: 25.32 avg: 43.91
Reboot test-server2:
[vagrant@test-server1 ~]$ mco power reboot -I test-server2
* [ ============================================================> ] 1 / 1
test-server2 Reboot initiated
Finished processing 1 / 1 hosts in 123.94 ms
test-server2 is rebooting:
[vagrant@test-server1 ~]$ mco ping
test-server1 time=13.87 ms
---- ping statistics ----
1 replies max: 13.87 min: 13.87 avg: 13.87
and it has been rebooted:
[vagrant@test-server1 ~]$ mco ping
test-server1 time=22.88 ms
test-server2 time=54.27 ms
---- ping statistics ----
2 replies max: 54.27 min: 22.88 avg: 38.57
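For a rolling run it can help to confirm that a host has come back before moving on to the next one. The following is a minimal sketch, assuming the `mco ping` output keeps the format shown above; `responding_hosts` and `host_back?` are hypothetical helper names, not part of MCollective.

```ruby
# Extract the names of hosts that replied from `mco ping` output.
# Assumes one "<host> time=<n> ms" line per reply, as in the output above.
def responding_hosts(ping_output)
  ping_output.each_line.map { |line| line[/^(\S+)\s+time=[\d.]+ ms/, 1] }.compact
end

# Hypothetical helper: true once `host` appears among the replies again.
def host_back?(ping_output, host)
  responding_hosts(ping_output).include?(host)
end

sample = <<~OUT
  test-server1 time=22.88 ms
  test-server2 time=54.27 ms
  ---- ping statistics ----
  2 replies max: 54.27 min: 22.88 avg: 38.57
OUT

p responding_hosts(sample)
p host_back?(sample, "test-server2")
```

In practice the string would come from running `mco ping` in a loop with a sleep in between, which this sketch deliberately leaves out.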
Note that it is possible to shut down a host as well:
[vagrant@test-server1 ~]$ mco power shutdown -I test-server2
* [ ============================================================> ] 1 / 1
test-server2 Shutdown initiated
Finished processing 1 / 1 hosts in 213.18 ms
Original code
/usr/libexec/mcollective/mcollective/agent/power.rb
module MCollective
  module Agent
    class Power<RPC::Agent
      action "shutdown" do
        out = ""
        run("/sbin/shutdown -h now", :stdout => out, :chomp => true )
        reply[:output] = "Shutdown initiated"
      end

      action "reboot" do
        out = ""
        run("/sbin/shutdown -r now", :stdout => out, :chomp => true )
        reply[:output] = "Reboot initiated"
      end
    end
  end
end
# vi:tabstop=2:expandtab:ai:filetype=ruby
/usr/libexec/mcollective/mcollective/agent/power.ddl
metadata :name => "power",
         :description => "An agent that can shutdown or reboot them system",
         :author => "A.Broekhof",
         :license => "Apache 2",
         :version => "2.1",
         :url => "http://github.com/arnobroekhof/mcollective-plugins/wiki",
         :timeout => 5

action "reboot", :description => "Reboots the system" do
  display :always
  output :output,
         :description => "Reboot the system",
         :display_as => "Power"
end

action "shutdown", :description => "Shutdown the system" do
  display :always
  output :output,
         :description => "Shutdown the system",
         :display_as => "Power"
end
/usr/share/ruby/vendor_ruby/mcollective/application/power.rb
class MCollective::Application::Power<MCollective::Application
  description "Linux Power broker"
  usage "power [reboot|shutdown]"

  def post_option_parser(configuration)
    if ARGV.size == 1
      configuration[:command] = ARGV.shift
    end
  end

  def validate_configuration(configuration)
    raise "Command should be one of reboot or shutdown" unless configuration[:command] =~ /^shutdown|reboot$/
  end

  def main
    mc = rpcclient("power")
    mc.discover :verbose => true
    mc.send(configuration[:command]).each do |node|
      case configuration[:command]
      when "reboot"
        printf("%-40s %s\n", node[:sender], node[:data][:output])
      when "shutdown"
        printf("%-40s %s\n", node[:sender], node[:data][:output])
      end
    end
    printrpcstats
    mc.disconnect
  end
end
# vi:tabstop=2:expandtab:ai
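One detail of the application above is worth a closer look. The validation pattern `/^shutdown|reboot$/` parses as "starts with shutdown" OR "ends with reboot", so it accepts more than the two documented commands. The sketch below compares it with a fully anchored alternative; `STRICT` is a hypothetical replacement, not something the plugin ships.

```ruby
LOOSE  = /^shutdown|reboot$/        # pattern as written in the application
STRICT = /\A(?:shutdown|reboot)\z/  # hypothetical stricter alternative

# Print which command names each pattern lets through.
%w[reboot shutdown shutdown-now xreboot update-and-reboot halt].each do |cmd|
  printf("%-18s loose=%-5s strict=%s\n", cmd, LOOSE.match?(cmd), STRICT.match?(cmd))
end
```

Because "update-and-reboot" ends in "reboot", the loose pattern lets it through, which is why a new action with that name can be called without touching the application. The `case` in `main` has no branch for it, though, so no per-host line is printed for that command. Switching to a strict pattern would mean adding any new action name to it explicitly.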
Modified code
/usr/libexec/mcollective/mcollective/agent/power.ddl
metadata :name => "power",
         :description => "An agent that can shutdown or reboot them system",
         :author => "A.Broekhof",
         :license => "Apache 2",
         :version => "2.1",
         :url => "http://github.com/arnobroekhof/mcollective-plugins/wiki",
         :timeout => 5

action "update-and-reboot", :description => "Reboots the system" do
  display :always
  output :output,
         :description => "Reboot the system",
         :display_as => "Power"
end
/usr/libexec/mcollective/mcollective/agent/power.rb
module MCollective
  module Agent
    class Power<RPC::Agent
      action "update-and-reboot" do
        out = ""
        run("yum update -y && /sbin/shutdown -r now", :stdout => out, :chomp => true )
        reply[:output] = "Reboot initiated"
      end
    end
  end
end
# vi:tabstop=2:expandtab:ai:filetype=ruby
Command
[vagrant@test-server1 ~]$ mco power update-and-reboot -I test-server2
* [ ============================================================> ] 1 / 1
Finished processing 1 / 1 hosts in 1001.22 ms