Experiment Configuration¶

Warning

The documentation is not up to date.

Todo

TODO [nku] update documentation -> try to build the documentation by including the top comments of the ansible roles

The experiment configuration defines the experiments to run. For each experiment, it specifies the host types to use, how they are setup, and how the experiments are started.

General Syntax¶

There are two possibilities to define experiments: - Experiment design: this is the general approach, where one specifies all factors (for different experiment runs) manually. - Experiment table: this is a convenient shorthand to define experiments that use the cross-product of all specified factors. It provides the option to concisely specify the experiment and can then be translated to an experiment design using the expdesign script.

Experiment Design¶

The general layout is as follows:

<< experiment_1 >>:
  n_repetitions: << nr >>
  common_roles:
    - << ansible-role-name >>
  host_types:
    << host_type_1 >>:
      n: << nr >>
      check_status: << boolean (optional) >>
      init_roles: << ansible-role-name >>
  base_experiment:
    << global_variable_1 >>: << nr, str, or $FACTOR$ >>
    host_vars:
      << host_type_1 >>:
        << host_arg_1 >>: << nr, str, or $FACTOR$ >>
    $CMD$:
      << host_type_1 >>: << str >>
  factor_levels:
    - << global_variable_1 >>: << nr, str >>
      host_vars:
        << host_type_1 >>:
          << host_arg_1 >>: << nr, str >>

Terms marked with << >> are placeholders that can be replaced by user-chosen values and the rest are keywords. Placeholders with the suffix _1 signal that there could be arbitrarily more entries like them.

Examples are in the designs folder.

Keywords¶

A suite design is a dict (YAML object) of experiments. The keys of the dict are the (unique) experiment names of the suite. In addition, there are special keywords that start with a $ and are used to configure the suite.

Keyword	Required	T y p e	Short Description
`<< experiment_name >>`	yes	d i c t	Definition of an expe riment of the suite (suite can have multiple experiments).
$ETL$	no (default: {})	d i c t	Definition of ETL pipeline for processing results (e.g, generate plot).
$SUITE_VARS$	no (default: {})	d i c t	Definition of default variables that are available in all experiments of the suite.

Experiment Keywords¶

Keyword	Required	Ty pe	Short Description
`` n_repetitions``	yes	i nt	Number of repetitions of each run (i.e. each level config)
` common_roles`	no (default: [])	s tr or li st	One or more Ansible role(s) that are run for all hosts during the initial setup. A single role can be specified as string, multiple roles need to use list notation.
`host_types`	yes	di ct	Dictionary of hosts used for the given experiment. The keys are the name of the host type, the value is another dictionary with configurations (see values).
`ba se_experiment`	yes	di ct	Dictionary of variables defined for this experiment
`` factor_levels``	yes if `base_experiment` contains at least one factor, no otherwise	li st	List of dictionaries for each run. Each dictionary specifies the values for variables marked with $FACTOR$ in `base_experiment`.

Host Type Keywords¶

Keyword	Required	T ype	Short Description
`n`	yes	int	Number of EC2 instances
$CMD$	yes	d ict	Dictionary of hosts with their run starting commands.
` check_status`	no (default: True)	b ool	Boolean set to true when the status of this host type should be checked when evaluating whether a job finished
`init_roles`	no (default: [])	str or l ist	One or more Ansible role(s) that are run for hosts of this type during the initial setup. A single role can be specified as string, multiple roles need to use list notation.

Base Experiment Keywords¶

base_experiment contains variable definitions and the commands to start the experiment run. By convention, global variables for all host types are stored directly as key/value pairs.

Keyword	Required	Type	Short Description
$ INCLUDE_VARS$	no	str or list	Load default variables from a file in `doe-suite-config/d esigns/design_vars`.
`host_vars`	no	dict	Defines variables for the different host types here.

Note, it is only a convention to group variables by host type. In practice, e.g., also a host of type “client” can use variables from “server”. The $CMD$ property can also be defined as a factor: $FACTOR$ and then $CMD$ needs to be defined in factor_levels.

Factor Levels¶

factor_levels is a list of dictionaries. Each dictionary must have an entry for every variable that is marked with the value $FACTOR$ in base_experiment (also $CMD$ is possible).

The number of dictionaries defines the number of runs for the experiment. Each dictionary should therefore contain a unique variable assignment (otherwise, there are duplicate runs).

Defining Commands¶

The $CMD$ property in base_experiment contains for each host type the starting command of the software artifact (e.g., command to start the benchmark software).

Within a command, there are two different types of variables available: - {{ }}: these are global variables from group_vars/all - [% %]: these are variables that correspond to factors or other of the current run. In most cases, they have the form [% my_run.* %]. If there is a factor a (i.e., base_experiment: a: $FACTOR$), then the variable [% my_run.a %] refers to the level of this factor in the respective run.

There are two options on how to pass configurations to the artefact with a command: - Pass factor levels as command line arguments (e.g., use the factor a as an argument to echo: echo [% my_run.a %]). - Pass factor levels via the config.json file. For convenience, there is a config.json file in the working directory that contains the run config.

Moreover, for multi-instance experiments there is a variable exp_host_lst that contains information on all involved hosts of the experiment. The format of the list is as follows:

[{"host_type": x, "exp_host_type_idx": x, "exp_host_type_n": x, "is_controller": x, "public_dns_name": x, "private_ip_address": x}, ... ]

Unfortunately, at the moment it is not supported to use this variable within the experiment design (e.g., within a command). TODO [nku] build functionality to provide access to host list of experiment: should be able to use [% exp_host_lst %] in $CMD$

DoE-Suite

Navigation

Related Topics