# Checkpoints, Cost Saving, and Monitoring
The IntellAgent system is optimized for cost efficiency and provides multiple levels of checkpoints to reduce expenses.
This document outlines the different types of checkpoints and their locations within the pipeline.
## Policies Graph Checkpoint
When initializing the `SimulatorExecutor`, the system searches for the `DescriptionGenerator` checkpoint at:

```
<args.output_path>/policies_graph/descriptions_generator.pickle
```

If the checkpoint exists, the `descriptions_generator` (which contains the policies graph) is loaded. If it does not exist, the system generates the graph. During execution, the system automatically saves the `descriptions_generator` to this path, so it will be loaded by default in subsequent runs unless the `output_path` is changed.
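For illustration, this load-or-generate pattern behaves roughly like the following sketch. The helper name and the `generate_fn` parameter are hypothetical, not part of the IntellAgent API:

```python
import os
import pickle

def load_or_generate_descriptions_generator(output_path, generate_fn):
    """Load the policies-graph checkpoint if it exists; otherwise build and save it."""
    checkpoint = os.path.join(output_path, "policies_graph", "descriptions_generator.pickle")
    if os.path.exists(checkpoint):
        with open(checkpoint, "rb") as f:
            return pickle.load(f)  # reuse the previously built policies graph
    descriptions_generator = generate_fn()  # expensive step: builds the policies graph
    os.makedirs(os.path.dirname(checkpoint), exist_ok=True)
    with open(checkpoint, "wb") as f:
        pickle.dump(descriptions_generator, f)  # checkpoint for subsequent runs
    return descriptions_generator
```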
## Dataset Events Checkpoints
Generating dataset events is a computationally expensive process, so it is recommended to generate the dataset once and reuse it across all experiments to ensure consistency.
You can specify the dataset name using the `--dataset` argument. The dataset will be saved at the following location:

```
<args.output_path>/datasets/<dataset_name>.pickle
```
After every mini-batch of size `mini_batch_size` (as defined in the configuration file), all events generated up to that point are saved as a checkpoint. This allows you to resume dataset generation from the last mini-batch if the run is interrupted.
By default, the `--dataset` argument is set to `latest`, which automatically loads the most recently generated dataset.
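The checkpoint-and-resume loop can be sketched as follows; `generate_batch` and the surrounding names are illustrative assumptions, not the actual IntellAgent implementation:

```python
import os
import pickle

def generate_events_with_checkpoints(generate_batch, num_batches, dataset_path, mini_batch_size):
    """Generate events mini-batch by mini-batch, checkpointing after each one."""
    # Resume from the last checkpoint if a partial dataset already exists.
    if os.path.exists(dataset_path):
        with open(dataset_path, "rb") as f:
            events = pickle.load(f)
    else:
        events = []
    start = len(events) // mini_batch_size  # skip mini-batches that are already done
    for _ in range(start, num_batches):
        events.extend(generate_batch(mini_batch_size))  # expensive generation step
        with open(dataset_path, "wb") as f:
            pickle.dump(events, f)  # save all events generated up to this point
    return events
```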
Additionally, you can set a `cost_limit` (in dollars) by defining the `cost_limit` variable in the configuration file. Note that this feature may not be supported by all models.
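A limit of this kind is typically enforced by accumulating per-call API spend and aborting once the threshold is crossed, which is why it depends on the model provider reporting call costs. A rough sketch, with illustrative names:

```python
class CostLimitExceeded(Exception):
    """Raised when accumulated LLM spend crosses the configured limit."""

class CostTracker:
    def __init__(self, cost_limit_dollars):
        self.cost_limit = cost_limit_dollars
        self.total_cost = 0.0

    def add(self, call_cost_dollars):
        # The per-call cost must be reported by the model provider;
        # models that do not expose it cannot support a cost limit.
        self.total_cost += call_cost_dollars
        if self.total_cost > self.cost_limit:
            raise CostLimitExceeded(
                f"Spent ${self.total_cost:.2f}, exceeding the ${self.cost_limit:.2f} limit"
            )
```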
## Experiment Checkpoint
Running the simulator on all events can be both costly and time-consuming.
Note that, unlike the previous checkpoints, a new experiment is created by default on each run.
The experiment name can be specified using the `--experiment` argument. If not provided, it defaults to `exp_{i}`, where `i` is a sequential number.
All experiment dumps will be saved in the following folder:

```
<args.output_path>/experiments/<dataset_name>__<experiment_name>
```
The simulator results are saved after every mini-batch of size `mini_batch_size` (as defined in the configuration file).
If the run is interrupted and you want to resume it, set the `--experiment` argument to the existing experiment name.
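To make the folder and naming conventions concrete, here is an illustrative sketch of how the experiment directory could be resolved, including the default `exp_{i}` numbering; this is an assumption for illustration, not the actual IntellAgent code:

```python
import os

def resolve_experiment_dir(output_path, dataset_name, experiment_name=None):
    """Return <output_path>/experiments/<dataset_name>__<experiment_name>,
    picking the next sequential exp_{i} name when none is given."""
    experiments_root = os.path.join(output_path, "experiments")
    os.makedirs(experiments_root, exist_ok=True)
    if experiment_name is None:
        i = 0
        while os.path.exists(os.path.join(experiments_root, f"{dataset_name}__exp_{i}")):
            i += 1  # the first unused number starts a fresh experiment
        experiment_name = f"exp_{i}"
    # Passing an existing experiment name resumes from its saved results.
    return os.path.join(experiments_root, f"{dataset_name}__{experiment_name}")
```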
As with dataset generation, you can define a `cost_limit` (in dollars) for the experiment by setting the `cost_limit` variable in the configuration file. Note that this feature may not be supported by all models.