Chef recipe order of execution redux
Given the following recipe:
ruby_block "block1" do
  block do
    puts "in block1"
  end
  action :create
end

remote_file "/tmp/foo" do
  puts "in remote_file"
  source "https://yahoo.com"
end
I'd expect the ruby_block to run first (because it comes first) and then the remote_file.
I'd like to use the ruby_block to determine the url for the remote_file to download from, so the order is important.
If it wasn't for my puts() statements I'd assume that these are getting run in the expected order, because the log says:
==> default: [2014-06-12T17:49:19+00:00] INFO: ruby_block[block1] called
==> default: [2014-06-12T17:49:19+00:00] INFO: remote_file[/tmp/foo] created file /tmp/foo
==> default: [2014-06-12T17:49:20+00:00] INFO: remote_file[/tmp/foo] updated file contents /tmp/foo
But above that, my puts() statements come out as follows:
==> default: in remote_file
==> default: in block1
If you think that the resources are being run in the expected order, consider this recipe:
ruby_block "block1" do
  block do
    node.default['test'] = {}
    node.default['test']['foo'] = 'https://google.com'
    puts "in block1"
  end
  action :create
end

remote_file "/tmp/foo" do
  puts "in remote_file"
  source node.default['test']['foo']
end
This one fails as follows:
==> default: [2014-06-12T17:55:38+00:00] ERROR: {} is not a valid `source` parameter for remote_file. `source` must be an absolute URI or an array of URIs.
==> default: [2014-06-12T17:55:38+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)
The string "in block1" doesn't appear in the output, so the ruby_block was never run.
So the question is, how can I force the ruby_block to run, and run first?
Good question - both of your examples work the way that I would expect, but it isn't immediately obvious why.
As StephenKing wrote in his response, the first thing to understand is that recipes are compiled (to produce a set of resources), and then resources are converged (to effect changes to your system). These two phases are often interleaved - some of your resources might be converged before Chef has finished compiling all of your recipes. Erik Hollensbe covers this in some detail in his post "The Chef Resource Run Queue".
Here's your first example again:
ruby_block "block1" do
  block do
    puts "in block1"
  end
  action :create
end

remote_file "/tmp/foo" do
  puts "in remote_file"
  source "https://yahoo.com"
end
These are the steps that Chef will go through in processing that example.
- First, the ruby_block declaration is compiled, which results in a resource called ruby_block[block1] being added to the resource collection. The contents of the block (the first puts statement) don't run yet - they are saved to be run when this resource is converged.
- Next, the remote_file declaration is compiled. This results in a resource called remote_file[/tmp/foo] being added to the resource collection, with a source of "https://yahoo.com". In the process of compiling this declaration, the second puts statement is executed - this has the side effect of printing "in remote_file", but it doesn't affect the resource that is put into the resource collection.
- With nothing else to compile, Chef starts converging the resources in the resource collection. The first one is ruby_block[block1], and Chef runs the Ruby code in the block - printing "in block1". After it finishes running the block, it logs a message to say that the resource was called.
- Finally, Chef converges remote_file[/tmp/foo]. Again, it logs a message (or two) associated with that activity.
That should produce the following sequence of output:
- Nothing printed when the ruby_block is compiled.
- "in remote_file" will be printed while the remote_file is compiled.
- "in block1" will be printed while the ruby_block is converged.
- A Chef log message will be printed after the ruby_block is converged.
- Other Chef log messages will be printed during/after the remote_file is converged.
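If it helps to see that ordering outside of Chef, here is a toy model in plain Ruby. It is only a sketch of the compile-then-converge idea, not how Chef is actually implemented: ToyResource, declare, converge, OUTPUT and RESOURCE_COLLECTION are all made-up names for illustration.

```ruby
# Toy model of the two phases. NOT Chef's real implementation;
# every name here is hypothetical.
OUTPUT = []               # collects what would be printed or logged
RESOURCE_COLLECTION = []  # resources in the order they were declared

ToyResource = Struct.new(:name, :converge_body)

# "Compile" a declaration: the body of the declaration runs immediately,
# and the proc it returns is saved to be run at converge time.
def declare(name)
  converge_body = yield
  RESOURCE_COLLECTION << ToyResource.new(name, converge_body)
end

# "Converge" every resource in declaration order, logging after each one.
def converge
  RESOURCE_COLLECTION.each do |resource|
    resource.converge_body.call
    OUTPUT << "INFO: #{resource.name} called"
  end
end

# Compile phase
declare("ruby_block[block1]") do
  -> { OUTPUT << "in block1" }   # saved, not run yet
end

declare("remote_file[/tmp/foo]") do
  OUTPUT << "in remote_file"     # runs now, while "compiling"
  -> { }                         # stand-in for the actual download
end

# Converge phase
converge
puts OUTPUT
```

Running it prints "in remote_file" first, then "in block1", then the two INFO lines, which is the same shape as the output in your question.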
Onto your second example:
ruby_block "block1" do
  block do
    node.default['test'] = {}
    node.default['test']['foo'] = 'https://google.com'
    puts "in block1"
  end
  action :create
end

remote_file "/tmp/foo" do
  puts "in remote_file"
  source node.default['test']['foo']
end
As with the first example, we don't expect anything to be printed while the ruby_block is compiled - the whole "block" is saved, and its contents won't run until that resource is converged.
The first output we see is "in remote_file", as the puts statement is executed when Chef compiles the remote_file resource. On the next line, we set the source parameter to the value of node.default['test']['foo'], which at this point is still {} because the ruby_block that assigns it has not been converged yet. That's not a valid value for source, so the Chef run terminates at that point - before the code in the ruby_block ever runs.
Therefore, the expected output of this recipe is:
- No output while compiling the ruby_block
- "in remote_file" printed while compiling the remote_file
- An error due to the invalid source parameter
Hopefully that helps you to understand the behaviour you're seeing, but we still have a problem to solve.
Although you asked "how can I force the ruby_block to run first?", your comment to StephenKing suggests this isn't really what you want - if you really wanted that block to run first, you could put it directly into your recipe code. Alternatively, you could use the .run_action() method to force the resource to be converged as soon as it is compiled - but you say that there are still more resources that need to converge before the ruby_block can be useful.
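For completeness, forcing a resource to converge while the recipe is still being compiled usually looks something like the fragment below. It is a sketch that reuses the ruby_block from your question and only makes sense inside a Chef run; I'm assuming that action :nothing is what you want here, so that the block doesn't run a second time during the normal converge phase.

```ruby
# Recipe fragment (runs only inside Chef): converge this resource
# immediately, at compile time, instead of waiting for the converge phase.
ruby_block "block1" do
  block do
    node.default['test']['foo'] = 'https://google.com'
    puts "in block1"
  end
  action :nothing        # assumption: skip it during the normal converge
end.run_action(:run)     # force it to run now, as soon as it is compiled
```

With this in place "in block1" would be printed before "in remote_file", but anything the block depends on must already exist at compile time.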
As we've seen above, resources aren't "run", they're first "compiled" and then "converged". With that in mind, what you need is for the remote_file resource to use some data that is not known when it is compiled, but will be known when it is converged. In other words, something like the "block" parameter in the ruby_block - a piece of code that doesn't run until later. Something like this:
remote_file "/tmp/foo" do
  puts "in remote_file"
  # this syntax isn't valid...
  source do
    node.default['test']['foo']
  end
end
Fortunately, such a thing does exist - it's called Lazy Attribute Evaluation. Using that feature, your second example would look like this:
ruby_block "block1" do
  block do
    node.default['test'] = {}
    node.default['test']['foo'] = 'https://google.com'
    puts "in block1"
  end
  action :create
end

remote_file "/tmp/foo" do
  puts "in remote_file"
  source lazy { node['test']['foo'] }
end
And the expected output of this recipe?
- No output while compiling the ruby_block
- "in remote_file" printed while compiling the remote_file
- "in block1" printed while converging the ruby_block
- Chef Log message showing the ruby_block was converged
- Chef Log messages showing the remote_file was converged
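To see why delaying the lookup makes the difference, here is another small plain-Ruby sketch. It only illustrates the idea of storing a block at compile time and calling it at converge time; Lazy, ToyRemoteFile and NODE are hypothetical stand-ins, not Chef classes, and a plain Hash is standing in for the node attributes.

```ruby
# Toy model of delayed evaluation. All names are hypothetical.
NODE = {}  # stands in for the node's attributes

# Wraps a block so that its value is computed only when asked for.
class Lazy
  def initialize(&block)
    @block = block
  end

  def value
    @block.call
  end
end

class ToyRemoteFile
  def initialize(source)
    @source = source
  end

  # At converge time, unwrap a Lazy; plain values are used as they are.
  def resolved_source
    @source.is_a?(Lazy) ? @source.value : @source
  end
end

# Compile phase: the attribute has not been set yet.
eager   = ToyRemoteFile.new(NODE["foo"])               # reads the value right now
delayed = ToyRemoteFile.new(Lazy.new { NODE["foo"] })  # stores the block instead

# Converge phase: an earlier resource (the ruby_block) sets the attribute.
NODE["foo"] = "https://google.com"

p eager.resolved_source    # still the value read at compile time
p delayed.resolved_source  # the value set during converge
```

The eager resource is stuck with whatever was there when it was declared, while the delayed one picks up the URL that was set later, which is the behaviour you want from the remote_file.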