Quickstart¶
Follow Installation to set up a project and install the DoE-Suite first.
The DoE-Suite provides a demo_project in the root of the repository that shows the structure required to integrate DoES into an existing project.
After completing the Installation section, you should be able to run the example suite designs of the demo project, located under demo_project/doe-suite-config/designs.
Afterward, you can change the environment variable DOES_PROJECT_DIR to point to your own project (instead of the demo project) and continue from there as described in the Tutorial.
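For example, switching from the demo project to your own project only requires re-pointing this environment variable before invoking the make targets. The following is a minimal sketch for a POSIX shell; the path is a placeholder for your own project directory:

# point the DoE-Suite at your own project instead of the demo project
# (placeholder path; the project is expected to contain a doe-suite-config folder, see the Tutorial)
export DOES_PROJECT_DIR=/path/to/your-project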
A Minimal Example¶
We start with the minimal example of a suite design:
---
# The suite `example01-minimal` contains a single experiment called `minimal`.
# We run this experiment on a single instance `n=1` of host type `small` and we only use a single repetition.
# The experiment consists of four runs, i.e., configurations:
# - echo "hello world."
# - echo "hello world!"
# - echo "hello universe."
# - echo "hello universe!"
#
# For the experiment configuration, we use the `cross` format:
# The different levels for each factor are listed in `base_experiment` and
# we create the runs by taking a cross product of all factor levels.
# (e.g., [world, universe] x [".", "!"] results in 4 runs)
minimal:                  # experiment name
  n_repetitions: 1
  host_types:
    small:                # one instance of type `small`
      n: 1
      $CMD$: "echo \"[% my_run.arg1 %] [% my_run.arg2 %][% my_run.arg3 %] \""  # command to start experiment run
  base_experiment:
    arg1: hello           # fix parameter between runs (constant)
    arg2:
      $FACTOR$: [world, universe]   # varied parameter between runs (factor)
    arg3:
      $FACTOR$: [".", "!"]          # varied parameter between runs (factor)

$ETL$:  # ensures that stderr.log is empty everywhere and that no files are generated except stdout.log
  check_error:
    experiments: "*"
    extractors: {ErrorExtractor: {}, IgnoreExtractor: {} }
What does this design do?

- It runs on a single instance of type small.
- The design contains two $FACTOR$s: arg2 and arg3, with two levels each. In total, there are 2 x 2 = 4 runs, i.e., configurations:
  - echo "hello world."
  - echo "hello world!"
  - echo "hello universe."
  - echo "hello universe!"
You can also list the run commands defined by the suite design with:

$ make design suite=example01-minimal
Save the design as example01-minimal.yml (or a similar name) under doe-suite-config/designs.
Afterwards, you can run the experiment suite with:
make run suite=example01-minimal id=new cloud=aws
This will start the experiment suite on AWS. First, it creates a VPC and an EC2 instance corresponding to the host_type small.
The doe-suite-config/group_vars/small/main.yml file contains the configuration for the instance:
---
# AWS EC2
instance_type: t2.medium
ec2_volume_size: 16
ec2_image_id: ami-08481eff064f39a84
ec2_volume_snapshot: snap-0b8d7894c93b6df7a
# ETH Euler
euler_job_minutes: 10
euler_cpu_cores: 1
euler_cpu_mem_per_core_mb: 3072
euler_gpu_number: 0
euler_gpu_min_mem_per_gpu_mb: 0
euler_gpu_model: ~
euler_env: "gcc/8.2.0 python/3.9.9"
euler_scratch_dir: "/cluster/scratch/{{ euler_user }}"
# Docker
docker_image_id: "doe-ubuntu20"
docker_image_tag: "latest"
After creating the instance, the DoE-Suite runs the four shell commands sequentially on the instance. Whenever a command finishes, the resulting stdout and stderr, together with any result files, are fetched and saved under doe-suite-results on your local machine.
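For a quick look at the fetched outputs, you can search for the stdout.log files below doe-suite-results; this sketch only assumes that such files exist somewhere in that folder, the exact directory layout depends on the suite and run id:

find doe-suite-results -name "stdout.log"                        # list all fetched stdout logs
cat "$(find doe-suite-results -name 'stdout.log' | head -n 1)"   # print the first one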
Quick Command Reference¶
These are the most important commands to get started with the DoE-Suite.
make run suite=example01-minimal id=new
make run suite=example01-minimal id=last
make clean
To get an overview of the functionality, use make or make help:
Running Experiments
make run suite=<SUITE> id=new - run the experiments in the suite
make run suite=<SUITE> id=<ID> - continue with the experiments in the suite with <ID> (often id=last)
make run suite=<SUITE> id=<ID> cloud=<CLOUD> - run suite on non-default cloud ([aws], euler)
make run suite=<SUITE> id=<ID> expfilter=<REGEX> - run only subset of experiments in suite where name matches the <REGEX> (suite must be valid)
make run-keep suite=<SUITE> id=new - does not terminate instances at the end, otherwise works the same as run target
Clean
make clean - terminate running cloud instances belonging to the project and local cleanup
make clean-result - delete all incomplete results in doe-suite-results
Running ETL Locally
make etl suite=<SUITE> id=<ID> - run the etl pipeline of the suite (locally) to process results (often id=last)
make etl-design suite=<SUITE> id=<ID> - same as `make etl ...` but uses the pipeline from the suite design instead of results
make etl-all - run etl pipelines of all results
make etl-super config=<CONFIG> out=<PATH> - run the super etl to combine results of multiple suites (for <CONFIG> e.g., demo_plots)
make etl-super ... pipelines="<P1> <P2>" - run only a subset of pipelines in the super etl
Clean ETL
make etl-clean suite=<SUITE> id=<ID> - delete etl results from specific suite (can be regenerated with make etl ...)
make etl-clean-all - delete etl results from all suites (can be regenerated with make etl-all)
Gather Information
make info - list available suite designs
make status suite=<SUITE> id=<ID> - show the status of a specific suite run (often id=last)
Design of Experiment Suites
make design suite=<SUITE> - list all the run commands defined by the suite
make design-validate suite=<SUITE> - validate suite design and show with default values
Setting up a Suite
make new - initialize doe-suite-config from a template
Running Tests
make test - running all suites (seq) and comparing results to expected (on aws)
make euler-test cloud=euler - running all single instance suites on euler and compare results to expected
make etl-test-all - re-run all etl pipelines and compare results to current state (useful after update of etl step)