Commands
Warning
The DoE-Suite can easily start many instances in a remote cloud. If there is an error in the execution, or the suite finishes before all jobs are complete, then these remote resources are not terminated and can generate high costs. Always check that resources are terminated. We also provide the following command to ensure that the previously started instances are terminated:
make clean
The interface of the DoE-Suite is defined in a Makefile.
In the following, we focus on the most frequently used commands.
make help
$ make help
Running Experiments
make run suite=<SUITE> id=new - run the experiments in the suite
make run suite=<SUITE> id=<ID> - continue with the experiments in the suite with <ID> (often id=last)
make run suite=<SUITE> id=<ID> cloud=<CLOUD> - run suite on non-default cloud ([aws], euler)
make run suite=<SUITE> id=<ID> expfilter=<REGEX> - run only subset of experiments in suite where name matches the <REGEX> (suite must be valid)
make run-keep suite=<SUITE> id=new - does not terminate instances at the end, otherwise works the same as run target
Clean
make clean - terminate running cloud instances belonging to the project and local cleanup
make clean-result - delete all incomplete results in doe-suite-results
Running ETL Locally
make etl suite=<SUITE> id=<ID> - run the etl pipeline of the suite (locally) to process results (often id=last)
make etl-design suite=<SUITE> id=<ID> - same as `make etl ...` but uses the pipeline from the suite design instead of results
make etl-all - run etl pipelines of all results
make etl-super config=<CONFIG> out=<PATH> - run the super etl to combine results of multiple suites (for <CONFIG> e.g., demo_plots)
make etl-super ... pipelines="<P1> <P2>" - run only a subset of pipelines in the super etl
Clean ETL
make etl-clean suite=<SUITE> id=<ID> - delete etl results from specific suite (can be regenerated with make etl ...)
make etl-clean-all - delete etl results from all suites (can be regenerated with make etl-all)
Gather Information
make info - list available suite designs
make status suite=<SUITE> id=<ID> - show the status of a specific suite run (often id=last)
Design of Experiment Suites
make design suite=<SUITE> - list all the run commands defined by the suite
make design-validate suite=<SUITE> - validate suite design and show with default values
Setting up a Suite
make new - initialize doe-suite-config from a template
Running Tests
make test - running all suites (seq) and comparing results to expected (on aws)
make euler-test cloud=euler - running all single instance suites on euler and compare results to expected
make etl-test-all - re-run all etl pipelines and compare results to current state (useful after update of etl step)
Running an Experiment Suite
Here we focus on the commands used to start and continue an experiment suite. For more information on the experiment suite design, see Suite Design; for details on the execution, see Running Experiments.
make run suite=example01-minimal id=new
make run suite=example01-minimal id=last
make run suite=example01-minimal id=<ID>
make run suite=example01-minimal id=new cloud=euler
make run-keep suite=example01-minimal id=new
Warning
If you use run-keep, be sure to check that instances are terminated when you are done.
Cleaning up Cloud
By default, after an experiment suite is complete, all experiment resources created on the cloud are terminated.
However, if something goes wrong (for example, an error occurs, the suite times out, or the suite is stopped manually), the resources created on the cloud keep running.
Further, creating resources on a cloud and setting up the environment takes a considerable amount of time.
So, for debugging and short experiments, it can make sense not to terminate the instances.
If you run experiments with run-keep, be sure to check that instances are terminated when you are done.
make clean
Tip
Double-check on the cloud that all resources are terminated, and set up budget alerts.
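The failure case described above can be guarded against in a small wrapper script. The following is a minimal sketch, not part of the DoE-Suite: the function name `run_suite_or_clean` is hypothetical, and it assumes that `make run` exits with a non-zero status when the suite fails. It only uses the `make run` and `make clean` targets listed in the help output.

```shell
# Hypothetical helper (not part of the DoE-Suite).
# Assumption: `make run` returns a non-zero exit status when the suite fails.
run_suite_or_clean() {
  local suite="$1"
  local id="${2:-new}"   # default to a fresh run; pass `last` to continue

  if ! make run suite="$suite" id="$id"; then
    echo "make run failed for suite '$suite'; running make clean" >&2
    # Terminates the project's running cloud instances and cleans up locally.
    make clean
    return 1
  fi
}
```

It could be called as `run_suite_or_clean example01-minimal new`. Note that `make clean` terminates all running instances that belong to the project, so this sketch trades the option of continuing with `id=last` for a lower risk of forgotten resources. It does not replace checking on the cloud that everything is terminated.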
ETL Results Processing
The ETL pipeline is used to process the results of an experiment suite. The results processing runs on your local machine and is triggered automatically when new results are available locally, i.e., when an experiment job is complete.
However, often it is also useful to trigger a run of the ETL pipeline manually, e.g., for styling a plot.
# can replace `id=last` with actual id, e.g., `id=1655831553`
make etl suite=example01-minimal id=last
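When only a few suites need reprocessing, for example after restyling plots, the manual trigger can be looped over suite names. This is a minimal sketch under the assumption that each named suite already has local results; the function name `etl_suites` is hypothetical and not part of the DoE-Suite.

```shell
# Hypothetical helper (not part of the DoE-Suite).
# Re-runs the ETL pipeline on the latest results of each given suite.
etl_suites() {
  local suite
  for suite in "$@"; do
    # Stop at the first suite whose ETL run fails.
    make etl suite="$suite" id=last || return 1
  done
}
```

For example, `etl_suites example01-minimal` behaves like the single command above. To reprocess everything, `make etl-all` from the help output is the simpler choice.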
Super ETL pipelines can be used to process the results of multiple experiment suites together.
# can set `out` for example to a figures folder of a paper
make etl-super config=demo_plots out=.
Status and Info
make info
# w/o suite filter (all suites)
make status id=last
# w/ suite filter
make status suite=example01-minimal id=last
Developing Suite Designs
Tip
Ensure that the environment variable DOES_PROJECT_DIR points to the project directory.
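A quick way to verify this before running the design commands is a small shell check. This is a minimal sketch; the function name `check_project_dir` is hypothetical, and it only tests that the variable is set and refers to an existing directory, not that the directory contains a valid doe-suite-config.

```shell
# Hypothetical helper (not part of the DoE-Suite).
# Succeeds only if DOES_PROJECT_DIR is set and names an existing directory.
check_project_dir() {
  if [ -z "${DOES_PROJECT_DIR:-}" ]; then
    echo "DOES_PROJECT_DIR is not set" >&2
    return 1
  fi
  if [ ! -d "$DOES_PROJECT_DIR" ]; then
    echo "DOES_PROJECT_DIR is not a directory: $DOES_PROJECT_DIR" >&2
    return 1
  fi
}
```

A typical use would be `check_project_dir && make design suite=example01-minimal`.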
make new
make design suite=example01-minimal
make design-validate suite=example01-minimal
# can replace `id=last` with actual id, e.g., `id=1655831553`
make etl-design suite=example01-minimal id=last
# The same as: `make etl suite=example01-minimal id=last`
# but uses the etl pipeline defined in `doe-suite-config/designs`
# instead of the etl pipeline in `doe-suite-results/example01-minimal_<ID>/suite_design.yml`