
Step-by-Step Guide to Running the Airline Agent

The airline chatbot is an advanced example of a chatbot that operates with tools and a database. For a simpler example, refer to the education example.

The environment (prompt, tools, and DB schema) was adapted from tau-benchmark and modified to integrate with the IntellAgent framework.

Step 1 - Understand the input

The examples/airline/input folder contains the following structure:

airline/
└── input/
    ├── wiki.md                       # Prompt for the airline agent
    ├── tools/                        # Directory containing all the agent tools
    │   ├── agent_tools.py            # Python script collecting all the agent tools that are part of the environment
    │   ├── book_reservation_tool.py  # Python script containing the book reservation tool
    │   ├── ...                       # All the remaining tools
    │   └── util.py                   # Utility functions used by the tools
    ├── data/                         # Directory containing the data schema (CSV files, each with one example row)
    │   ├── flights.csv               # Flights data schema and example
    │   ├── reservation.csv           # Reservation data schema and example
    │   └── users.csv                 # Users data schema and example
    └── validators/                   # Directory containing the data validators
         └── data_validators.py       # The data validators

The folders contain all the essential inputs for the environment, including the prompt, database schema, tools, and data validators. For more details about the environment inputs, refer to customization.
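For orientation, a data validator is typically a small function that checks cross-table consistency. The sketch below is hypothetical (the function name, signature, and column names are illustrative, not the framework's API); see data_validators.py for the real contract:

import pandas as pd

def validate_reservations(reservations: pd.DataFrame, users: pd.DataFrame) -> None:
    # Hypothetical rule: every reservation must reference an existing user.
    known_users = set(users["user_id"])  # "user_id" is an assumed column name
    missing = reservations[~reservations["user_id"].isin(known_users)]
    if not missing.empty:
        raise ValueError(f"Reservations reference unknown users: {missing['user_id'].tolist()}")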

Step 2 - Configure the Simulator Run Parameters

The parameters in the config/config_airline.yml configuration file have already been set as follows:

environment:
    prompt_path: 'examples/airline/input/wiki.md'  # Path to your agent's wiki/documentation
    tools_file: 'examples/airline/input/tools/agent_tools.py'  # Path to your agent's tools
    database_folder: 'examples/airline/input/data'  # Path to your data schema
    database_validators: 'examples/airline/input/validators/data_validators.py'  # Optional! Path to the file with the validator functions

If you also want to modify the LLMs used by either the simulator or the chatbot, refer to customization.
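Before launching a long run, it can help to verify that the configured paths exist. A minimal sketch, assuming the YAML layout shown above:

from pathlib import Path
import yaml  # pip install pyyaml

with open("config/config_airline.yml") as f:
    config = yaml.safe_load(f)

env = config["environment"]
for key in ("prompt_path", "tools_file", "database_folder", "database_validators"):
    path = env.get(key)  # database_validators is optional
    if path and not Path(path).exists():
        print(f"Warning: {key} points to a missing path: {path}")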

Step 3 - Run the Simulator and Understand the Output

Running the Simulation

Execute the simulator using:

python run.py --config ./config/config_airline.yml --output_path results/airline
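If you want to script multiple runs (for example, sweeping several configs), the same command can be launched from Python; a minimal sketch:

import subprocess

# Equivalent to the shell command above; extend the list of pairs to batch runs.
runs = [("./config/config_airline.yml", "results/airline")]
for config_path, output_path in runs:
    subprocess.run(
        ["python", "run.py", "--config", config_path, "--output_path", output_path],
        check=True,  # stop if a run fails
    )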

Understanding the Descriptor Generator Output

The simulator processes your input in several stages:

3.0. Task Description Generation

  • Automatically inferred from the prompt (can be manually specified in config_airline.yml)
  • Defines the chatbot's role as an airline agent handling reservations within policy constraints

3.1. Flow Extraction

The system identifies four main flows:

  • Book flight
  • Modify flight
  • Cancel flight
  • Refund

3.2. Policy Extraction

Each flow has associated policies. Examples:

Flow          | Policy Example                                                         | Category             | Challenge Score
------------- | ---------------------------------------------------------------------- | -------------------- | ---------------
Book Flight   | Agent must obtain user ID, trip type, origin, and destination           | Knowledge extraction | 2
Modify Flight | All reservations can change cabin without changing flights              | Company policy       | 2
Cancel Flight | Cancellation allowed within 24h of booking or airline cancellation      | Logical Reasoning    | 4
Refund        | Compensation available for eligible members based on status/insurance   | Company policy       | 3
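Conceptually, each extracted policy can be pictured as a small record like the following (a hypothetical sketch for orientation; the framework's internal representation may differ):

from dataclasses import dataclass

@dataclass
class Policy:
    flow: str             # e.g. "Cancel Flight"
    description: str      # the policy text extracted from the prompt
    category: str         # e.g. "Company policy" or "Logical Reasoning"
    challenge_score: int  # higher values are harder to comply with

example = Policy(
    flow="Cancel Flight",
    description="Cancellation allowed within 24h of booking or airline cancellation",
    category="Logical Reasoning",
    challenge_score=4,
)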

3.3. Relations Graph Generation

  • Creates a network of policy relationships
  • Nodes: Individual policies
  • Edges: Policy relationships
  • Weights: Combined challenge scores

All descriptor data is saved to: output_path/policies_graph/descriptions_generator.pickle
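To make the nodes/edges/weights description concrete, a relations graph of this shape could be built with networkx as follows (an illustrative sketch, not the framework's actual code; the pickle above holds the real descriptor object):

import networkx as nx  # pip install networkx

graph = nx.Graph()
# Nodes are individual policies; the edge weight combines the two challenge scores.
graph.add_node("cancel_within_24h", challenge=4)
graph.add_node("refund_eligible_members", challenge=3)
graph.add_edge("cancel_within_24h", "refund_eligible_members", weight=4 + 3)
print(graph.edges(data=True))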

3.4. Events Generation

The event generation process occurs in three stages:

3.4.1. Symbolic Representation Generation

  • Converts policies into symbolic format
  • Processes in parallel using multiple workers (configured in config)

3.4.2. Symbolic Constraints Generation

  • Creates constraints based on the symbolic representation
  • Uses the same worker and timeout configuration as symbolic representation generation
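In generic form, the worker-and-timeout pattern used by these two stages looks something like the sketch below (illustrative only; to_symbolic is a placeholder, not a framework function, and the real values come from the config):

from concurrent.futures import ThreadPoolExecutor, TimeoutError

def to_symbolic(policy: str) -> str:
    # Placeholder for the real LLM-backed conversion step.
    return f"symbolic({policy})"

policies = ["cancel within 24h of booking", "refund for eligible members"]
results = []
with ThreadPoolExecutor(max_workers=4) as pool:  # worker count comes from the config
    futures = [pool.submit(to_symbolic, p) for p in policies]
    for fut in futures:
        try:
            results.append(fut.result(timeout=30))  # per-task timeout, also configurable
        except TimeoutError:
            results.append(None)  # drop tasks that exceed the timeout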

3.4.3. Event Graph Generation

  • Final event creation (most time-intensive phase)
  • Includes restriction filtering, validation, and result compilation
  • Controlled by configurable difficulty levels
  • Generates samples in batches according to dataset configuration

All generated events are saved to: output_path/datasets/dataset__[timestamp].pickle

Note: Event generation is cost-controlled via the config settings.
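Once generation finishes, you can spot-check the events by loading the dataset pickle directly (a minimal sketch; run it from the repository root so any framework classes in the pickle can be imported):

import glob
import pickle

# Pick the most recent dataset snapshot under the output path.
latest = sorted(glob.glob("results/airline/datasets/dataset__*.pickle"))[-1]
with open(latest, "rb") as f:
    dataset = pickle.load(f)
print(type(dataset))  # inspect the top-level object before digging further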

Step 4 - Analyze Simulator Results

After the simulation completes, you can find the results in the specified output path directory (results/airline). The structure will look like this:

results/airline/
├── experiments/
│   └── dataset__[timestamp]__exp_[n]/    # Experiment run folder
│       ├── experiment.log                # Detailed experiment execution logs
│       ├── config.yaml                   # Configuration used for this run
│       ├── prompt.txt                    # Prompt template used
│       ├── memory.db                     # Dialog memory database
│       └── results.csv                   # Evaluation results and metrics
├── datasets/
│   ├── dataset__[timestamp].pickle       # Generated dataset snapshot
│   └── dataset.log                       # Dataset generation logs
└── policies_graph/
    ├── graph.log                         # Policy graph generation logs
    └── descriptions_generator.pickle     # Generated descriptions and policies

Output Files Overview

  • experiment.log: Contains detailed logs of the experiment execution, including timestamps and any errors encountered during the run.
  • config.yaml: This file holds the configuration settings that were used for this specific simulation run, allowing for easy replication of results.
  • prompt.txt: The prompt template that was utilized during the simulation, which can be useful for understanding the context of the agent's responses.
  • memory.db: A database file that stores the dialog memory, which can be analyzed to understand how the agent retained and utilized information throughout the simulation.
  • results.csv: This file includes the evaluation results and metrics from the simulation, providing insights into the performance of the agent.

In addition to the experiment folder, you will find:

  • dataset__[timestamp].pickle: A snapshot of the generated dataset at the time of the simulation, which can be used for further analysis.
  • dataset.log: Logs related to the dataset generation process, detailing any issues or important events that occurred during this phase.
  • graph.log: Logs related to the generation of the policy graph, which can help in understanding the extracted policies and the relations that drive the scenario generation process.
  • descriptions_generator.pickle: A file containing the generated descriptions and policies, useful for reviewing what the descriptor generator extracted from the prompt.
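For quick offline analysis, results.csv loads directly with pandas, and memory.db is a standard SQLite file (a minimal sketch; no particular column or table names are assumed):

import sqlite3
import pandas as pd

run_dir = "results/airline/experiments/dataset__[timestamp]__exp_[n]"  # substitute a real run folder

results = pd.read_csv(f"{run_dir}/results.csv")
print(results.columns.tolist())  # discover which metric columns are present
print(results.head())

con = sqlite3.connect(f"{run_dir}/memory.db")
tables = con.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
print(tables)  # list the framework-defined tables before querying them
con.close()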

Step 5 - Run the Simulator Visualization

To visualize the simulation results using Streamlit, run:

cd simulator/visualization 
streamlit run Simulator_Visualizer.py

This will launch a Streamlit dashboard showing detailed analytics and visualizations of your simulation results. In the visualization you can:

  • Load simulator memory and experiments by providing their full path
  • View conversation flows and policy compliance
  • Analyze agent performance and failure points

Note: Make sure you have Streamlit installed (pip install streamlit) before running the visualization.