Ansible is an extremely powerful and flexible open source tool for server configuration, orchestration, and automation.

It executes tasks idempotently: it will not take action or make changes if no changes are needed. Every task runs in a repeatable fashion and produces the same outcome, ensuring consistent, predictable results every time it's run. It's agentless and uses SSH to run tasks remotely.

After reading this distilled-down primer about Ansible you should be able to start using it successfully.

Prerequisites

Minimum Requirements to use Ansible:

  • access to a Linux, Unix, or macOS operating system
  • basic command line knowledge
  • a common text editor such as Atom, Visual Studio Code, or Sublime Text
  • basic knowledge of YAML syntax (or an installed YAML syntax linter)
  • the latest version of Ansible installed locally

Ansible Tools

ansible

ansible "Define and run a single task 'playbook' against a set of hosts". That's as exciting as it gets right here. This tool isn't used too much. It's pefect for running ad-hoc commands using a single module. One such example is running it with the setup module to fetch target system detailed information.

This ad-hoc command uses the setup module (-m setup) to gather as much information as possible from the target system. This is very useful for looking up built-in Ansible facts when building templates for more complex playbooks:

ansible hostname -i inventory -u user --ask-pass -m setup

Other examples of ad-hoc commands using the ansible tool

This command will ping the host and show a reply:

ansible hostname -i inventory -u user --ask-pass -m ping

Using the shell module, this command will fetch and display the system uptime:

ansible hostname -i inventory -u user --ask-pass -m shell -a "uptime | cut -f6 -d\",\""

Here's a real-life example of how to use this powerful tool for an otherwise time-consuming task: you need a report showing the version of the Apache web server on all 300 production and 600 non-production servers.

Assuming you already have an inventory file broken down into production, non-production, and different host groups, this shouldn't take longer than 5 minutes to finish.

Let's say there are 2 groups in the inventory file hosts:

[production]
host-prod01
...
host-prod300

[non-production]
host-nonprod01
...
host-nonprod600

To create a list for production systems you'd run this:

ansible production -i hosts -u username --ask-pass -m shell -a "httpd -v"

To create a list for non-production systems you'd run this:

ansible non-production -i hosts -u username --ask-pass -m shell -a "httpd -v"

ansible-config

ansible-config is used to view, edit, and manage the Ansible configuration. It's a rarely used command, but it's very useful when initially setting up a custom Ansible configuration or when troubleshooting configuration issues.
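
For example, these subcommands give a quick look at the effective configuration:

ansible-config view                  # show the contents of the active ansible.cfg
ansible-config dump --only-changed   # show only the settings changed from defaults
ansible-config list                  # list all available configuration options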

ansible-doc

ansible-doc is the plugin documentation tool. It's the command line equivalent of https://docs.ansible.com and very useful for a quick reference while offline.
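
For example, to list all available modules or read the documentation for a specific one right from the terminal:

ansible-doc -l        # list all available modules
ansible-doc setup     # full documentation for the setup module
ansible-doc -s shell  # short playbook snippet for the shell module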

ansible-galaxy

ansible-galaxy manages Ansible roles in shared repositories, the default being Ansible Galaxy at https://galaxy.ansible.com. Ansible Galaxy is a great way of storing and sharing individual specialized roles.
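
Typical usage looks like this (the role names below are placeholders, not roles this article depends on):

ansible-galaxy init example-role          # scaffold a new role skeleton locally
ansible-galaxy install author.role_name   # install a shared role from galaxy.ansible.com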

ansible-lint

ansible-lint finds syntax and formatting problems before you run a deploy. In reality, Ansible will check the syntax and fail before it runs, but only after you have already typed in the SSH (and optionally vault) password. So, to save a few seconds here and there, it can be run to validate the code and avoid failures later.
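
For example, point it at a playbook or a role directory before a deploy (the file and role names here match examples used later in this article):

ansible-lint lamp.yml
ansible-lint roles/example-role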

It's also very useful when debugging formatting and spacing issues. And these are extremely fun to troubleshoot (they're not).

ansible-playbook

ansible-playbook is the money maker. It runs playbooks, from simple single-task ones all the way to complex multi-role setups used for building sophisticated multi-service systems.

The minimal set of arguments ansible-playbook needs:

hostname or groupname is where the playbook should deploy to. With ansible-playbook this comes from the hosts: line inside the playbook (and can be narrowed down with --limit); it can be a single host or a group of many servers grouped according to your deployment needs.

-i hosts is the inventory file Ansible will use to look up all hosts it should deploy to. It doesn't have to be named hosts and it can live somewhere else, as long as the correct path is given to ansible-playbook with the -i argument.

playbook.yml is the filename of the playbook to run.

-u user is the remote user to connect as.

--ask-pass is needed if SSH password authentication is used.

--become escalates privileges (sudo), in case the system doesn't want to make you a sandwich.
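
Putting those together, a minimal invocation looks like this (hostnames and filenames are placeholders):

ansible-playbook -i hosts playbook.yml -u user --ask-pass --become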

ansible-vault

There is a separate in-depth how-to on ansible-vault.

ansible-vault is, after ansible-playbook, the second most used Ansible tool. It's an "encryption/decryption utility for Ansible data files". If any Ansible code has to be stored and shared via GitHub, even in private repositories, it's recommended practice to encrypt any secrets that public eyes shouldn't see.

As long as the content of the encrypted files doesn't change, encrypting sensitive files is a one-time deal. Ansible will decrypt them on the fly at playbook runtime.

To encrypt a file with ansible-vault:

ansible-vault encrypt secrets.txt

It'll ask you for the encryption password twice, standard procedure. After that, the file will be encrypted in place and ready for future deploys.

To decrypt a vault file:

ansible-vault decrypt secrets.txt

Type in the password used during encryption and the vaulted file will be decrypted back to clear text.

To use the vaulted (encrypted) files with ansible-playbook without having to decrypt them manually before each deploy, just supply this argument:

--ask-vault-pass

You will be asked to type in the encryption password before the playbook is executed. After that, Ansible will decrypt any vaulted files automatically.
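
For example, assuming a playbook that uses vaulted variable files, either prompt for the vault password or point Ansible at a password file (the file path here is just an illustration):

ansible-playbook -i hosts playbook.yml -u user --ask-pass --ask-vault-pass
ansible-playbook -i hosts playbook.yml -u user --ask-pass --vault-password-file ~/.vault_pass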

Ansible Directory Structure

There are two scenarios I'd like to explain:

  1. Single role structure
  2. Multi-role structure

Single Role Directory and Files Structure

This is based on the output of ansible-galaxy init, which by default creates the complete set of directories needed for a single role. Not all of these directories have to be present for Ansible to run, only those which contain valid content. However, the unused ones don't necessarily have to be removed.

This layout applies to a single or standalone role. The purpose of a single role is to be used for simple, single-purpose deploys and to be portable and shareable (via Ansible Galaxy, for example).

This type of role can be run by itself or included in a larger set of roles in a more complex deployment scenario. A good practice is to use these individual roles to split large, complex deploys into smaller, more manageable parts. One example is a LAMP stack. Instead of grouping all tasks into one large role, it can be split into the following more manageable and more portable individual roles:

  • apache
  • php
  • mysql
  • os-config

Each one of these will have its own separate isolated directory structure with its own set of tasks and variables following the directory tree shown below.

A real-life example of a single-role deploy is building a standalone web server, either physical or cloud. After the initial OS installation, a single role can be run to install and configure a LAMP stack, without the need for any other tasks, since the only purpose of the server will be to serve PHP/MySQL based applications.

Directory Tree of a Single Standalone Role

.
├── README.md
├── defaults
│   └── main.yml
├── files
├── handlers
│   └── main.yml
├── meta
│   └── main.yml
├── tasks
│   └── main.yml
├── templates
├── tests
│   ├── inventory
│   └── test.yml
└── vars
    └── main.yml

defaults - contains default variables which aren't defined anywhere else. This is where user-editable variables should be kept. This location is at the bottom of the variable precedence hierarchy

files - any files that will be copied over to the target nodes should be kept here

handlers - contains handler tasks that are triggered by notify statements in other tasks, typically to restart or reload services (see the sketch after this list)

meta - contains information about the role itself

tasks - this one is required. This is where all tasks are defined

templates - very similar in purpose to files but contains Jinja2 templates instead

tests - this one can be safely removed

vars - contains all variables which shouldn't be edited by users
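
As a minimal sketch of how tasks and handlers tie together (the package and service names are assumptions for an Apache-style role, not part of any specific role above):

# tasks/main.yml
- name: Install Apache
  package:
    name: httpd
    state: present
  notify: restart httpd

# handlers/main.yml
- name: restart httpd
  service:
    name: httpd
    state: restarted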

Multi Role Directory and Files Structure

This is where things get more interesting. A more complex environment may require this approach, where multiple diverse roles are needed but they also have to share the same set of global variables and inventory. This approach also makes it very easy to share each role between multiple deployment environments. This is where Ansible shows its muscles, and it's very easy to get done. It's a perfect tool for larger, multi-server, high availability environment deploys.

The multi-role directory structure requires a few additional directories which weren't needed in the single-role approach:

  • group_vars
  • host_vars
  • roles
  • inventory (this one is optional but nice to have to make things look neater)

Directory Tree of a Complex Multi-role Deployment

.
├── ansible.cfg
├── group_vars
│   ├── database
│   │   └── main.yml
│   └── www
│       └── main.yml
├── host_vars
│   ├── db1
│   │   └── main.yml
│   ├── db2
│   ├── www1.yml
│   └── www2
├── inventory
│   ├── prod
│   ├── test
│   └── dev
├── large-deploy.yml
└── roles
    ├── role1
    │   ├── README.md
    │   ├── defaults
    │   │   └── main.yml
    │   ├── files
    │   ├── handlers
    │   │   └── main.yml
    │   ├── meta
    │   │   └── main.yml
    │   ├── tasks
    │   │   └── main.yml
    │   ├── templates
    │   ├── tests
    │   │   ├── inventory
    │   │   └── test.yml
    │   └── vars
    │       └── main.yml
    └── role2
        ├── README.md
        ├── defaults
        │   └── main.yml
        ├── files
        ├── handlers
        │   └── main.yml
        ├── meta
        │   └── main.yml
        ├── tasks
        │   └── main.yml
        ├── templates
        ├── tests
        │   ├── inventory
        │   └── test.yml
        └── vars
            └── main.yml

group_vars - this is where variables for a group of servers are stored. It depends on how servers are grouped in the inventory files. For example, multiple web servers defined in the inventory:

[web]
web1
web2
web3

[database]
db1
db2

In this case, the group_vars subfolder used to store variables for the web group of servers will be called web. Every time Ansible runs on any of the servers in that particular web host group it will look for variables in this folder.

Files in this directory can be named however you want, as long as they have a .yml extension. Any Ansible vaulted files stored in this location will be decrypted automatically.
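
For example, a variables file for the web group from the inventory above could live at group_vars/web/main.yml (the variable names are purely illustrative):

# group_vars/web/main.yml
http_port: 80
max_clients: 200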

host_vars - files or subfolders with files containing variables for individual hosts.

For example:

.
├── host_vars           # main hosts vars directory
│   ├── db1             # subdirectory for a host db1
│   │   └── main.yml    # variables file for host db1
│   ├── db2             # variables file for host db2
│   ├── www1.yml        # variables file for host www1

These are some common naming examples, but it's always a good rule of thumb to pick one and stick with it. I prefer to break things down into smaller variable files within subdirectories.
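
For example, a small per-host variables file (the value is illustrative and ties into the template example later in this article):

# host_vars/www1.yml
server_name: www1.example.com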

roles - this directory contains all individual roles.

inventory - this is optional. It's a personal preference where the host inventory file is stored. However, since we're already following a certain folder hierarchy, it seems logical to do the same with the inventory. In addition, the inventory can be further split into smaller logical files:

inventory/production - contains production hosts only
inventory/testing - contains testing hosts only
inventory/development - contains development hosts only

This sort of hosts inventory layout is optional and completely up to the user's preference, but it does make managing large, multi-environment deployments much easier.

Ad-Hoc Plays

There are cases when Ansible can be used in an ad-hoc fashion to collect a single piece of information or a fact from a large group of servers. A good example would be collecting the kernel version on all running servers in order to figure out whether any require a bug fix or security patching. Such a simple task doesn't require a full role or a playbook and can be accomplished with a single command line:

ansible -i hosts servers -m shell -a "uname -a"

This command will connect to each host in the servers host group and run the shell module, which executes the uname -a command and returns its output on the command line. This is a very easy and quick way to get detailed information from a large number of hosts without having to access each of them manually via SSH and without any sophisticated CMDB software.

Depending on the security and complexity of your setup, additional arguments may be needed with this command, such as:

--user USER or --become-user USER
--ask-pass
--become
--ask-become-pass
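
A fully spelled-out version of the uname example with those arguments would look like this (the username is a placeholder):

ansible -i hosts servers -m shell -a "uname -a" --user deploy --ask-pass --become --ask-become-pass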

Playbooks

An Ansible playbook is a collection of plays made up of a series of tasks which run in order once executed. A single playbook may contain hostnames, variables, vault secrets, multiple plays, and numerous tasks; basically, everything a role contains but in a single file. I personally never use playbooks that way. In complex environments things can get very messy very quickly, so it's better to rely on roles.

Everyone has their own favorite coding style, so it's probably best to pick the one you're most comfortable with and stick with it throughout the whole project. I've always found individual roles to be the easiest and cleanest approach.

Let's look at an example:

├── ansible.cfg
├── group_vars
├── host_vars
├── inventory
├── lamp.yml
└── roles
    ├── apache
    ├── php
    └── mysql

The lamp.yml playbook includes the roles apache, php, and mysql. The good thing about having these separated into individual roles is that they can still be deployed individually as standalone services, and each one can have its own set of unique variables.

Example content of lamp.yml

---
- hosts: web

  roles:
    - { role: apache }
    - { role: php }
    - { role: mysql }

Once the above playbook is run it will automatically include all tasks from all three roles.

So with just one command we can build a fully functional LAMP stack server:

ansible-playbook -i inventory/production lamp.yml --user joebagofdonuts --ask-pass --become --ask-become-pass

Templates

Templates are the backbone of Ansible playbooks. They are what makes it so easy to deploy complex configuration files to many hosts based on custom variables.

Templates for roles are stored in role/templates/ directory. Template files use .j2 extension.

They use the simple Jinja2 templating language. A template can be any configuration file that has to be deployed to a number of hosts where certain values must differ based on the host configuration.

A good example is the ServerName directive in Apache's httpd.conf file. It differs for each host and can easily be set per host by using host_vars:

host_vars:

server_name: www1.example.com

httpd.conf.j2:

ServerName {{ server_name }}
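
To actually deploy that template, a task using the template module might look like this (the destination path assumes a typical Red Hat style Apache layout, and the notify assumes a matching handler exists):

- name: Deploy httpd.conf
  template:
    src: httpd.conf.j2
    dest: /etc/httpd/conf/httpd.conf
    owner: root
    group: root
    mode: '0644'
  notify: restart httpd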

Inventory

Static File Inventory

Ansible inventory can be based on static files for environments where servers don't rotate frequently or where there's no easy way of getting an up-to-date hosts inventory dynamically, as can be done with AWS or VMware clusters. All hosts must be added and maintained manually in those files, or Ansible won't be aware of them and will throw an error.

Inventory files can be in either INI or YAML format; it's entirely based on user preference. My personal choice is INI, but that's because that's what I learned Ansible with. Some users prefer to have the same format in the inventory files as in their playbooks. Ansible doesn't care.

The simplest form of an inventory file contains two parts: a host group and hosts.

Example hosts file:

[all]
server1
server2

This file can then be used with ansible or ansible-playbook with the -i hosts argument.

The naming scheme in the inventory is fully up to the user's preferences and imagination, as long as it makes sense. Hosts can be referenced either by their DNS name or by IP address.

Some users like to add variables in the inventory files. Apart from very rare exceptions (a FreeBSD bootstrap scenario, for example), I never use inventory files for variables. Either host_vars or group_vars is the proper place for variables.

Multi Environment Inventories

For scenarios where a single role will be deployed to multiple environments, it's very easy to make it work seamlessly with multiple inventory files.

Testing environment inventory file test:

[test:children]
web-test

[web-test]
web-test1
web-test2

Production environment inventory file prod:

[prod:children]
web-prod

[web-prod]
web-prod1
web-prod2

To re-use the same playbook or role across these two environments, simply specify the environment-specific inventory file: -i test or -i prod. Keeping them separate in large environments makes maintenance much easier, especially when complex inheritance rules come into play, where some servers share multiple groups and groups have sub-groups.

Things get much easier down the road: combining inventory groups with group_vars lets you use variable-based conditionals to apply different configurations based on environment requirements, as sketched below.
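
As a sketch, an environment flag set in group_vars (the variable name and values are assumptions):

# group_vars/web-test/main.yml
app_env: test

# group_vars/web-prod/main.yml
app_env: prod

can then drive a conditional task (the file path and setting are illustrative):

- name: Enable debug logging on test systems only
  lineinfile:
    path: /etc/myapp/app.conf
    line: "log_level = debug"
  when: app_env == "test"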

Dynamic Inventories

A dynamic inventory does not rely on the content of static inventory files. Instead, it relies on a dedicated script that connects to a remote resource and, via API calls, builds a dynamic, current, up-to-date inventory of all objects reported by the remote resource.

AWS's ec2.py is one such script. It has to be properly configured before being used.

More detailed information on how to configure Ansible to use Boto and ec2.py script to generate dynamic inventories in an AWS environment: Ansible Dynamic Inventory for AWS

With the ec2.py inventory script configured for use with Ansible, replace the static inventory file with the script path:

ansible-playbook -i ec2.py playbook.yml --key-file=aws-ssh-key -e "variable_host=tag_Name_aws_instance_name"

What happens in the above command is that ansible-playbook gets a dynamic list of all instances in the AWS account and, based on the variable_host extra variable (typically referenced in the playbook's hosts: line), it only deploys to instances whose Name tag matches aws_instance_name.

Variable Precedence

Not all variables are treated equally in the Ansible hierarchy, and that can be very frustrating. That's why it's good practice to pick one central place for variables and stick to that location throughout all playbooks.

This is the list of Ansible variable precedence, starting with the least important at the top and ending with the most important at the bottom. Any variable defined in more than one of these locations will be overridden by the value from the location that appears lower in this list.

role defaults                           # in role/defaults/main.yml
inventory file or script group vars     # group variables defined directly in the inventory file
inventory group_vars/all                # a catch all group_vars file
playbook group_vars/all                 #
inventory group_vars/*                  # specific group_vars file, example: www
playbook group_vars/*                   #
inventory file or script host vars      #
inventory host_vars/*                   # specific host_vars file, example: db1
playbook host_vars/*                    #
host facts                              # what Ansible gathers prior to running the play
play vars                               # any vars included in the playbook.yml file
play vars_prompt                        # plays can ask for variable via prompts
play vars_files                         # plays can have their own variable files included via include_vars
role vars (defined in role/vars/main.yml)
block vars (only for tasks in block)
task vars (only for the task)
role (and include_role) params          # these get passed in via include_role parameters
include params
include_vars
set_facts / registered vars
extra vars (always win precedence)      # vars added on the command line

This list can be a very useful reference when troubleshooting templates. However, sticking to one place for variables, such as the defaults/vars folders for role variables and group_vars/host_vars for host-specific variables, ensures trouble-free, easy to maintain plays.
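
As a quick illustration of the list above, a role default (the variable name is just an example):

# roles/example-role/defaults/main.yml
http_port: 80

is easily overridden from the command line, since extra vars always win:

ansible-playbook -i hosts playbook.yml -e "http_port=8080"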

Example Workflow

This is a simplified summary of the workflow I use with Ansible. In this example I create a new role in an already existing Ansible file structure.

Create a New Role

If you have an existing Ansible project and want to add a new role:

  • go to ansible_project/roles
  • create a new role with ansible-galaxy init example-role
  • or optionally create a minimum directory structure manually

Optional: Initialize a new git branch

Initialize a new git repository if one doesn't exist yet; it can live either in the root of the Ansible project or in the newly created role directory

cd ansible_project/roles/example-role and git init .

Create a new git branch

git branch example-role and git checkout example-role

Create playbooks, tasks, add variables, vault secrets, files, templates, etc

(Required) Add tasks in ansible_project/roles/example-role/tasks/main.yml

(Required) Add the host(s) to the inventory file

Add variables in:

  • constant variables, shouldn't be edited in the future: ansible_project/roles/example-role/vars/main.yml
  • variables that change based on user preferences: ansible_project/roles/example-role/defaults/main.yml

Add files in ansible_project/roles/example-role/files

Add templates in ansible_project/roles/example-role/templates

Test the New Playbook

It's always a good idea to test playbooks on a temporary non-production system. Obviously, it has to be identical to the production one: not necessarily spec-wise, but the OS and software versions must match.

The easiest way to set up a simple dev environment is to use VirtualBox locally on the workstation.

There are several tricks to make the testing easier and less painful. None of them have to be used but these are the tools I've found to be very useful while working on testing and debugging playbooks.

List Target Hosts

If you add --list-hosts to the ansible-playbook command it will show which hosts the current playbook will be deployed to.

In a complex environment, it's good practice to check this first before running a live deploy, to make sure we're targeting all of the right hosts.
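
For example, with the LAMP playbook from earlier:

ansible-playbook -i inventory/production lamp.yml --list-hosts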

Limit Hosts

If you need to test on just a single server from a large group of hosts, use the --limit=hostname option.
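
For example, assuming web-prod1 is defined in the production inventory:

ansible-playbook -i inventory/production lamp.yml --limit=web-prod1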

Check, Dry Run

Adding --check to ansible-playbook will run a simulation of the deploy. It runs just like the real thing and the target host will show all the activity in its logs.

Sometimes check mode fails where a regular live deploy wouldn't. Tasks which require or check the presence of files, or which install system packages from URLs or local paths, will fail too. But it can still be a useful testing tool for simple playbooks or selected tasks.
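
A typical dry run, with --diff added to show what would change in any files Ansible manages:

ansible-playbook -i inventory/production lamp.yml --check --diff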

Limit the Scope with Tags

When testing large playbooks, running the whole thing takes too much time just to debug a single task. Using --tags=tag_name limits the run to the specific tasks tagged with that name, as shown below.
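
For example, tag a task in the role (the tag name is only an example):

- name: Deploy Apache configuration
  template:
    src: httpd.conf.j2
    dest: /etc/httpd/conf/httpd.conf
  tags:
    - apache-config

Then limit the run to that tag:

ansible-playbook -i inventory/production lamp.yml --tags=apache-config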

Debug Mode

Add a strategy: debug line to your playbook to run it in interactive debug mode. This allows you to debug and modify broken tasks during a live deploy without restarting the whole run. It's very useful for large and complex playbooks that take a long time to finish.

---
- hosts: web
  strategy: debug

Commands for the debug console:

p task/host/result/vars - print the value

task.args[key] = value - change the module arguments

vars[key] = value - change the value of a variable

r - run the task again

c - continue

q - exit the debugger

Optional: Merge the Branch

After all the testing is done and the results are good, it helps (and it's good practice) to merge the development branch. There may have been too many changes to keep track of, and GitHub (or Bitbucket) can help with reviewing them all using the visual diff feature. This way it's very easy to confirm all needed changes have been added and to catch last minute bugs that might have slipped through.

To add all the changes to version control with git:

git status - check what files were modified
git add file1 file2 - add modified files to the current commit
git commit -m "commit comments" - commit the changes and add comments
git push origin branch_name - finally push the changes up to GitHub

Head to github.com and create a new pull request. Doing so will allow you to review all the code in full. If all is well, the pull request can be merged into the master branch. The master branch is what we want to use for deploying to production; it's good practice not to use any development branches for production deployments.

Production Deploy

A production deploy can range from a single task on a single server to deploying complete clusters from scratch. The concept of a production deploy is that no manual changes should ever be made on any of the servers that were deployed to with the final playbooks. The final production systems, no matter how small or large, should never be touched manually.

From the Ansible point of view, the production deploy isn't and shouldn't be any different from previous deployments to development servers. The only difference is the target hosts. Variables, templates, vault secrets, files, and everything else should work together beautifully to make every playbook applicable across all environments.

