Developer Ansible Guide¶
Warning
The documentation is not up to date.
Todo
TODO [nku] update documentation for developers -> can delete a lot and refer to the documentation within the role.
TODO [nku] mention the logic of the ec2 dynamic inventory with the prf_id and suite filter => can use all, exp_name, check_status_yes, check_status_no, is_controller_yes, is_controller_no
TODO [nku] remove mention of host_type all
This document describes some internals of this Ansible project, for example how to update or extend certain parts.
Adding A Variable to the Experiment State¶
To add a variable to the experiment state:
1. Add it to the experiment-state template.
2. Add it to exp_facts in the role called “Set experiment variables (facts) based on loaded state” in the experiment-state tasks.
3. To integrate it more generally into the project, add it to the example designs and the expdesign script (the latter translates experiment table specifications).
Important Data Structures¶
exp_facts
: This data structure is a dictionary with the experiment name as key and information about the status of the experiment as value.
Example:
    exp_facts: {
        'experiment_1': {
            'exp_id': '5',
            'exp_runs_ext': [...],
            'exp_job_ids': ['5_0_0', ...],
            'exp_job_ids_unfinished': ['5_0_0', ...],
            'exp_job_ids_pending': ['5_0_0', ...],
            'exp_job_ids_running': [],
            'n_repetitions': '2',
            'common_roles': ['setup-common']
        },
        'experiment_2': {...}
    }
Naming:
exp_facts[exp_name] -> exp_fact
host_types
: This data structure is a dictionary with the host type as key. For each host type, it stores a dictionary with the experiment name as key and the host configuration (for this host type and experiment) as value.
all is a special entry that collects the total number of instances for each host_type, summed over all experiments (attention: its second key is the host type and not the experiment name!). The ... in the example below are only ellipses that keep the example concise; they do not represent actual values.
Example:
    host_types: {
        'client': {
            'experiment_1': { 'init_roles': 'setup-client', 'n': 1, 'check_status': false },
            'experiment_2': { 'init_roles': 'setup-client', 'n': 1, 'check_status': false }
        },
        'server': {
            'experiment_1': {...},
            'experiment_2': {...}
        },
        'all': {
            'client': { 'n': 2 },
            'server': { 'n': ... }
        }
    }
Naming:
host_types[group] -> host_type
host_types[group][exp_name] -> host_facts
host_type_names
: A list of all host types, without the special key all (i.e., not equal to host_types.keys()). This is convenient for looping through host types.
exp_hosts
: Variable set on “controller” hosts that contains a list of the hosts involved in the current experiment.
host_group_name and host_group_name_long
: The first is the “short” group name used in the experiment suite config file, e.g. server. The second is the group name used in the Ansible inventory, defined by the EC2 plugin. It is of the form tag_Name_<prj_id>_<host_group_name>_SEP_<exp_name>. _SEP_ is a separator defined in group_vars/all in the variable separator. It is used to split host_group_name and exp_name when we recover those two variables from the group name.
suite_all
: Group of EC2 hosts belonging to this project ID.
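The relationships between these data structures can be sketched in Python. This is only an illustration of the descriptions above, not the project's actual code: the function names add_all_entry, host_type_names, and split_group_name are hypothetical, and the project ID used in any call is made up.

```python
from typing import Dict, List, Tuple

# Value of the `separator` variable from group_vars/all, as described above.
SEPARATOR = "_SEP_"


def add_all_entry(host_types: Dict[str, dict]) -> Dict[str, dict]:
    """Return a copy of host_types with the special 'all' entry recomputed.

    Hypothetical helper: sums 'n' over all experiments for each host type.
    """
    per_type = {name: exps for name, exps in host_types.items() if name != "all"}
    totals = {
        name: {"n": sum(host_facts["n"] for host_facts in exps.values())}
        for name, exps in per_type.items()
    }
    return {**per_type, "all": totals}


def host_type_names(host_types: Dict[str, dict]) -> List[str]:
    """All host types without the special key 'all'."""
    return [name for name in host_types if name != "all"]


def split_group_name(group_name_long: str, prj_id: str) -> Tuple[str, str]:
    """Recover (host_group_name, exp_name) from the long inventory group name."""
    prefix = f"tag_Name_{prj_id}_"
    if not group_name_long.startswith(prefix):
        raise ValueError(f"{group_name_long!r} does not belong to project {prj_id!r}")
    # Split at the first separator only, so exp_name may contain underscores.
    host_group_name, sep, exp_name = group_name_long[len(prefix):].partition(SEPARATOR)
    if not sep:
        raise ValueError(f"{group_name_long!r} does not contain {SEPARATOR!r}")
    return host_group_name, exp_name
```

Splitting at the separator rather than at a plain underscore is what allows both the host group name and the experiment name to contain underscores themselves.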
Roles¶
An (incomplete) list of roles and their purpose:
experiment-aws
: General role to handle creating AWS EC2 instances.
experiment-aws-ec2-create
: Create EC2 instances for a specific host type and experiment.
experiment-aws-ec2-manage
: This is an optimization. experiment-aws first launches all instances using experiment-aws-ec2 and only then waits for SSH to come up. Thus, instances can boot up concurrently, and the next one does not have to wait until the current one is completely initialized.
experiment-vpc*
: Roles to create or remove a VPC (Virtual Private Cloud) for this project. We use a single VPC for the entire project.
experiment-clear
: Role to terminate all running EC2 instances from the current project. It prints the instance IDs to remove and gives the operator 10 seconds to double-check those IDs. Pressing CTRL+C stops the countdown. Pressing A next aborts and does not delete those instances. Pressing C instead continues the play.
experiment-job*
: Roles to manage the initial setup, start, scheduling, and status checking of jobs. There is a job for every run and repetition.
experiment-load
: This role initializes the jobs by loading them from the experiment config.
experiment-parse-config
: Parse the experiment suite, check assertions, and set default values. The dict_default filter is in this folder.
experiment-set-vars
: This role sets variables for a host. This is somewhat of a hack: it derives the experiment name and host group from the inventory group name. This workaround was necessary because, as far as we know, the EC2 plugin does not allow us to set host variables when launching new EC2 instances.
experiment-state
: This role handles the storing and loading of the experiment state. There is one JSON file for every experiment in the current suite.
setup-*
: Example roles that show how (individual or all) hosts can be set up.
Custom Jinja2 Filters¶
dict_default
: Set a default value for a dictionary at the specified query. The query supports the wildcard character ’*’ and expects dot notation (i.e., d.plants and not d[‘plants’]). The wildcard tolerates lists.
Example usage:
Data:
    {
        "animals": { "cats": 10, "dogs": 1 },
        "plants": { "bushes": 2, "pot plants": 3 }
    }
Examples:
Set a default for the plants “cacti”:
    {{ data | dict_default("plants", "cacti", 0) }}
Results in the new dictionary:
    {
        "animals": { "cats": 10, "dogs": 1 },
        "plants": { "bushes": 2, "pot plants": 3, "cacti": 0 }
    }
Set a default category “other” for all entries if it is not present:
    {{ data | dict_default("*", "other", 0) }}
Results in the new dictionary:
    {
        "animals": { "cats": 10, "dogs": 1, "other": 0 },
        "plants": { "bushes": 2, "pot plants": 3, "other": 0 }
    }
Remarks:
Note that this filter can only add key/value pairs to existing dictionaries:
WRONG:
    {{ data | dict_default("plants.house", "cacti", 0) }}
because data["plants"] does not contain a dictionary for key house.
CAREFUL:
    {{ data | dict_default("plants", "house.cacti", 0) }}
adds the entry house.cacti: 0. It does not add a dictionary under key house with the entry cacti: 0.
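The behavior described above can be approximated with a short Python function. This is a minimal sketch written from the description and examples in this section, not the filter's actual source. In particular, it assumes that “the wildcard tolerates lists” means the wildcard iterates over list items as well as dictionary values, and that an unresolvable query raises an error.

```python
import copy
from typing import Any, List


def _select(node: Any, part: str) -> List[Any]:
    """Return the children of `node` selected by one dot-separated query part."""
    if part == "*":
        # Assumption: the wildcard matches every dict value and every list item.
        if isinstance(node, dict):
            return list(node.values())
        if isinstance(node, list):
            return list(node)
        raise ValueError("wildcard applied to a value that is neither dict nor list")
    if not isinstance(node, dict) or part not in node:
        raise ValueError(f"query part {part!r} does not match an existing key")
    return [node[part]]


def dict_default(data: dict, query: str, key: str, default: Any) -> dict:
    """Return a copy of `data` where every dict matched by `query` has `key` set.

    Existing values for `key` are kept; only missing keys receive `default`.
    """
    result = copy.deepcopy(data)
    targets = [result]
    for part in query.split("."):
        targets = [child for node in targets for child in _select(node, part)]
    for target in targets:
        if not isinstance(target, dict):
            raise ValueError("query must resolve to existing dictionaries")
        # `key` is used literally, so "house.cacti" becomes a single key.
        target.setdefault(key, default)
    return result
```

Registered as a Jinja2 filter, a function with this signature would be called exactly as in the examples above, with the piped value arriving as the first argument.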
Job Scheduling¶
There is a group of hosts for every experiment. Those are created dynamically by adding the tag tag:Exp with the experiment name as value to every EC2 instance.
The first host in the group is the “controller”, which is used to monitor the job status of the experiment and to advance to the next job when the current one is done. At every job switch, the controller pulls all results and stores them on the localhost. It also updates the state YAML files and distributes the updated exp_facts variable to all hosts involved in this experiment (exp_hosts and localhost).
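The controller logic can be illustrated with a small Python sketch. It is not the implementation used by the experiment-job* roles. The helpers pick_controller and advance_job are hypothetical, and the sketch assumes that exp_job_ids_unfinished is the union of the pending and running job IDs and that jobs run one at a time in list order.

```python
from typing import List, Optional


def pick_controller(exp_hosts: List[str]) -> str:
    """The first host in an experiment's group acts as the controller."""
    if not exp_hosts:
        raise ValueError("experiment has no hosts")
    return exp_hosts[0]


def advance_job(exp_fact: dict, finished_job_id: str) -> Optional[str]:
    """Mark one job as finished and start the next pending job, if any.

    Mutates exp_fact in place and returns the ID of the newly started job,
    or None when no pending job is left.
    """
    exp_fact["exp_job_ids_running"].remove(finished_job_id)
    exp_fact["exp_job_ids_unfinished"].remove(finished_job_id)
    if not exp_fact["exp_job_ids_pending"]:
        return None
    next_job_id = exp_fact["exp_job_ids_pending"].pop(0)
    exp_fact["exp_job_ids_running"].append(next_job_id)
    return next_job_id


# Illustrative state; the second job ID is made up for this example.
exp_fact = {
    "exp_job_ids_unfinished": ["5_0_0", "5_0_1"],
    "exp_job_ids_pending": ["5_0_1"],
    "exp_job_ids_running": ["5_0_0"],
}
started = advance_job(exp_fact, "5_0_0")
```

After each such step, the real controller would additionally pull the results, write the state files, and distribute the updated exp_facts, as described above.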
Inventory Host Groups¶
Important: do not use all. The EC2 plugin adds other EC2 instances to the inventory even if they do not belong to the current project. When we use all, Ansible tries to connect to them, which can cause SSH connection errors (if no public key for those hosts is defined) or even unintended configuration overwriting on other instances!
Use suite_all instead to specify a play for all EC2 instances belonging to the current project.
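To see why the distinction matters, the following Python sketch filters a toy inventory down to the hosts of one project. It is purely illustrative and does not show how suite_all is actually built. The function suite_hosts, the host names, and the project IDs are hypothetical, and the filter assumes that project membership can be read from the tag_Name_<prj_id>_ prefix of the group name.

```python
from typing import Dict, List


def suite_hosts(inventory_groups: Dict[str, List[str]], prj_id: str) -> List[str]:
    """Collect the hosts of all groups that belong to the given project ID."""
    prefix = f"tag_Name_{prj_id}_"
    hosts: List[str] = []
    for group_name, members in inventory_groups.items():
        if not group_name.startswith(prefix):
            continue  # skips `all` and the groups of other projects
        for host in members:
            if host not in hosts:
                hosts.append(host)
    return hosts


# Toy inventory: `all` also contains a host of an unrelated project.
inventory = {
    "all": ["host-a", "host-b", "host-x"],
    "tag_Name_myprj_server_SEP_experiment_1": ["host-a"],
    "tag_Name_myprj_client_SEP_experiment_1": ["host-b"],
    "tag_Name_other_server_SEP_experiment_1": ["host-x"],
}
project_hosts = suite_hosts(inventory, "myprj")
```

In this toy inventory, a play targeting all would also reach host-x, which belongs to a different project.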