What's the difference between include_tasks and import_tasks?

There's quite a bit about this topic in the documentation:

  • Includes vs. Imports
  • Dynamic vs. Static

The main difference is:

All import* statements are pre-processed at the time playbooks are parsed.
All include* statements are processed as they are encountered during the execution of the playbook.

So import is static, include is dynamic.

From my experience, you should use import when you deal with logical "units". For example, to split a long list of tasks into subtask files:

main.yml:

- import_tasks: prepare_filesystem.yml
- import_tasks: install_prerequisites.yml
- import_tasks: install_application.yml

But you would use include to handle different workflows and make decisions based on dynamically gathered facts:

install_prerequisites.yml:

- include_tasks: prerequisites_{{ ansible_os_family | lower }}.yml

Imports are static, includes are dynamic. Imports happen at parsing time, includes at runtime.

Imports basically replace the import task with the tasks from the file, so there are no import tasks at runtime. As a result, attributes like tags and when (and most likely the rest) are copied to every imported task.

Includes, by contrast, are real tasks that do get executed. The tags and when of an include task apply only to that task itself.

When you filter a run with --tags: tagged tasks from an imported file get executed even if the import task is untagged. No tasks are executed from an included file if the include task is untagged.

All tasks from an imported file get executed if an import task is tagged. Only tagged tasks from an included file get executed if an include task is tagged.
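
The tag rules above can be illustrated with a minimal sketch. The file name deploy.yml and the tag name web are hypothetical, chosen only for illustration:

```yaml
# playbook.yml (deploy.yml and the `web` tag are hypothetical)
- hosts: all
  tasks:
    # Static: the `web` tag is copied onto every task in deploy.yml,
    # so running with `--tags web` executes all of them.
    - import_tasks: deploy.yml
      tags: web

    # Dynamic: the `web` tag applies only to the include task itself.
    # With `--tags web` the file is loaded, but only the tasks inside it
    # that carry a matching tag of their own are executed.
    - include_tasks: deploy.yml
      tags: web
```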

Limitations of imports:

  • can't be used with with_* or loop attributes
  • can't import a file whose name depends on a variable that is only known at runtime (facts, inventory variables)
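
As a sketch of the first limitation: a loop over a task file has to use include_tasks. The file create_user.yml, the loop values, and the loop variable name username are assumptions made up for this example:

```yaml
# Looping over a task file only works with a dynamic include.
# create_user.yml and `username` are hypothetical names.
- include_tasks: create_user.yml
  loop:
    - alice
    - bob
  loop_control:
    loop_var: username   # avoids clashing with `item` inside the included file
```

Replacing include_tasks with import_tasks here makes Ansible reject the playbook at parse time, because a static import cannot be looped.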

Limitations of includes:

  • --list-tags doesn't show tags from included files
  • --list-tasks doesn't show tasks from included files
  • you cannot use notify to trigger a handler name which comes from inside a dynamic include
  • you cannot use --start-at-task to begin execution at a task inside a dynamic include

More on it here and here.

For me that basically comes down to the fact that imports can't be used with the loop attribute.

imports would certainly fail in cases like this:

# playbook.yml
- import_tasks: set-x.yml
  when: x is not defined

# set-x.yml
- set_fact:
    x: foo
- debug:
    var: x

debug is not executed, since it inherits when from the import_tasks task, and by the time it runs x is already defined. So, don't import task files that change variables used in the import task's when attribute.
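
One way around this, assuming a dynamic include is acceptable at this spot, is to use include_tasks instead: its when is evaluated once, on the include task itself, and is not copied to the tasks in the file. The two files are shown together, separated by comments:

```yaml
# playbook.yml
- include_tasks: set-x.yml   # dynamic: `when` is checked once, here
  when: x is not defined

# set-x.yml
- set_fact:
    x: foo
- debug:
    var: x                   # no inherited `when`, so this task is not skipped
```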

I had a policy to start with imports, but once I needed an include I made sure nothing was imported by the included file or its children. But that's pretty damn hard to maintain. And it's still not clear if it'll protect me from trouble. I mean, mixing includes and imports is not recommended.

I can't use only imports, since I occasionally need loops. I could probably switch to only includes. But I decided to switch to imports everywhere except for the cases where I need loops. I decided to experience all those tricky edge cases first-hand. Maybe there won't be any in my playbooks. Or hopefully I'll find a way to make it work.

UPD A possibly useful trick to create a task file that can be imported many times, but executed only once:

- name: ...
  ...
  when: not _file_executed | default(False)

- name: ...
  ...
  when: not _file_executed | default(False)

...

- name: Set _file_executed
  set_fact:
    _file_executed: True

UPD One not really expected effect of mixing includes and imports is that an include task's vars override the ones of the imported tasks:

playbook.yml:

- hosts: all
  tasks:
    - import_tasks: 2.yml
      vars:
        v1: 1
    - include_tasks: 2.yml
      vars:
        v1: 1

2.yml:

- import_tasks: 3.yml
  vars:
    v1: 2

3.yml:

- debug:
    var: v1    # 2 then 1

This is probably because include_tasks first imports the files and then applies its vars directive.

Actually, it can also be reproduced like so:

playbook.yml:

- hosts: all
  tasks:
    - import_tasks: 2.yml
      vars:
        v1: 1
    - include_tasks: 2.yml
      vars:
        v1: 1

2.yml:

- debug:
    var: v1    # 2 then 1
  vars:
    v1: 2

UPD Another case of mixing includes and imports.

playbook.yml:

- hosts: all
  tasks:
    # say, you're bound to use include here (because you need a loop)
    - include_tasks: 2.yml
      vars:
        https: yes

2.yml:

- import_tasks: 3.yml
  when: https

3.yml:

- import_tasks: 4.yml
  vars:
    https: no  # here we're trying to temporarily override the https var
- import_tasks: 4.yml

4.yml:

- debug:
    var: https

We get true and true; see the previous case (include_tasks' vars take precedence over import_tasks' ones). To avoid that, we can switch to includes in 3.yml. But then the first include in 3.yml is skipped, because it inherits when: https from the parent task, so the first task basically reads:

- import_tasks: 4.yml
  vars:
    https: no  # here we're trying to temporarily override the https var
  when: https

The solution is to switch to includes in 2.yml as well. That prevents propagation of when: https to the child tasks.
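
A sketch of that layout, following the description above; playbook.yml and 4.yml stay as they are, and the two changed files are shown together, separated by comments:

```yaml
# 2.yml
- include_tasks: 3.yml
  when: https          # evaluated once for this include, not inherited by child tasks

# 3.yml
- include_tasks: 4.yml
  vars:
    https: no          # the intended temporary override
- include_tasks: 4.yml
```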