Step-by-Step Guide to Running the Airline Agent
The airline chatbot is an advanced example of a chatbot that operates with tools and a database. For a simpler example, refer to the education example.
The environment (prompt, tools, and DB schema) was adapted from tau-benchmark and modified to integrate with the IntellAgent framework.
Step 1 - Understand the Input
The examples/airline/input folder contains the following structure:
airline/
└── input/
    ├── wiki.md                      # Prompt for the airline agent
    ├── tools/                       # Directory containing all the agent tools
    │   ├── agent_tools.py           # Python script aggregating all the agent tools that are part of the environment
    │   ├── book_reservation_tool.py # Python script containing the book reservation tool
    │   ├── ...                      # All the remaining tools
    │   └── util.py                  # Utility functions used by the tools
    ├── data/                        # Directory containing the data schema (one CSV per table, each with one example row)
    │   ├── flights.csv              # Flights data schema and example
    │   ├── reservation.csv          # Reservations data schema and example
    │   └── users.csv                # Users data schema and example
    └── validators/                  # Directory containing the data validators
        └── data_validators.py       # The data validators
The folders contain all the essential inputs for the environment, including the prompt, database schema, tools, and data validators. For more details about the environment inputs, refer to customization.
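To make the tool format concrete, here is a minimal, hypothetical sketch of what a booking tool could look like. The function name, signature, and in-memory database layout are illustrative assumptions only; the actual tools live in examples/airline/input/tools/ and may look quite different:

```python
# Hypothetical sketch only -- the real tool implementations live in
# examples/airline/input/tools/ and may differ in structure and naming.
from typing import Any, Dict


def book_reservation(database: Dict[str, Any], user_id: str,
                     flight_id: str, cabin: str) -> str:
    """Illustrative tool: book a cabin seat on a flight for a user."""
    if user_id not in database["users"]:
        return f"Error: unknown user {user_id}"
    if flight_id not in database["flights"]:
        return f"Error: unknown flight {flight_id}"
    reservation_id = f"RES{len(database['reservations']) + 1:04d}"
    database["reservations"][reservation_id] = {
        "user_id": user_id,
        "flight_id": flight_id,
        "cabin": cabin,
    }
    return f"Booked {reservation_id} on flight {flight_id} ({cabin})"
```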
Step 2 - Configure the Simulator Run Parameters
The parameters in the config/config_airline.yml file have already been set as follows:
environment:
    prompt_path: 'examples/airline/input/wiki.md' # Path to your agent's wiki/documentation
    tools_file: 'examples/airline/input/tools/agent_tools.py' # Path to your agent's tools
    database_folder: 'examples/airline/input/data' # Path to your data schema
    database_validators: 'examples/airline/input/validators/data_validators.py' # Optional! Path to the file with the validator functions
If you also want to modify the LLMs used by either the simulator or the chatbot, refer to customization.
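Before launching a run, it can help to sanity-check that the configured paths exist. A minimal sketch, assuming PyYAML is installed and the environment keys shown above:

```python
import os

import yaml  # pip install pyyaml

with open("config/config_airline.yml") as f:
    config = yaml.safe_load(f)

# Verify every configured environment path before starting a run.
env = config["environment"]
for key in ("prompt_path", "tools_file", "database_folder", "database_validators"):
    path = env.get(key)
    if path and not os.path.exists(path):
        print(f"Missing {key}: {path}")
```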
Step 3 - Run the Simulator and Understand the Output
Running the Simulation
Execute the simulator using:
python run.py --config ./config/config_airline.yml --output_path results/airline
Understanding the Descriptor Generator Output
The simulator processes your input in several stages:
3.0. Task Description Generation
- Automatically inferred from the prompt (it can also be specified manually in config_airline.yml; see the sketch below)
- Defines the chatbot's role as an airline agent handling reservations within policy constraints
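If you prefer to pin the task description manually, the override might look like the snippet below. The task_description key name is an assumption made for illustration; check the customization docs for the exact field name:

```yaml
environment:
  # Hypothetical key -- verify the exact name in the customization docs.
  task_description: >
    An airline customer-service agent that books, modifies, cancels,
    and refunds flight reservations, strictly within company policy.
```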
3.1. Flow Extraction
The system identifies four main flows:
- Book flight
- Modify flight
- Cancel flight
- Refund
3.2. Policy Extraction
Each flow has associated policies. Examples:

| Flow | Policy Example | Category | Challenge Score |
|---|---|---|---|
| Book Flight | Agent must obtain user ID, trip type, origin, and destination | Knowledge extraction | 2 |
| Modify Flight | All reservations can change cabin without changing flights | Company policy | 2 |
| Cancel Flight | Cancellation allowed within 24h of booking or after an airline cancellation | Logical reasoning | 4 |
| Refund | Compensation available for eligible members based on status/insurance | Company policy | 3 |
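One way to picture the extracted policy records is as simple structured objects. The dataclass below is a hypothetical representation for illustration, not the framework's actual internal type:

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Hypothetical view of a single extracted policy record."""
    flow: str             # e.g. "Cancel Flight"
    description: str      # the policy text itself
    category: str         # e.g. "Logical reasoning"
    challenge_score: int  # higher = harder for the agent to comply with


cancel_policy = Policy(
    flow="Cancel Flight",
    description="Cancellation allowed within 24h of booking or after an airline cancellation",
    category="Logical reasoning",
    challenge_score=4,
)
```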
3.3. Relations Graph Generation
- Creates a network of policy relationships
- Nodes: Individual policies
- Edges: Policy relationships
- Weights: Combined challenge scores
All descriptor data is saved to: output_path/policies_graph/descriptions_generator.pickle
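Since the descriptor output is a standard pickle, you can open it for inspection. A minimal sketch; the object's internal structure is not documented here, so treat the exploration as a starting point:

```python
import pickle

with open("results/airline/policies_graph/descriptions_generator.pickle", "rb") as f:
    descriptions = pickle.load(f)

# The structure is framework-internal; start by seeing what is there.
print(type(descriptions))
print([attr for attr in dir(descriptions) if not attr.startswith("_")])
```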
3.4. Events Generation
The event generation process occurs in three stages:
3.4.1. Symbolic Representation Generation
- Converts policies into symbolic format
- Processes in parallel using multiple workers (configured in config)
3.4.2. Symbolic Constraints Generation
- Creates constraints based on the symbolic representation
- Uses the same worker and timeout configuration as the symbolic representation generation
3.4.3. Event Graph Generation
- Final event creation (most time-intensive phase)
- Includes restriction filtering, validation, and result compilation
- Controlled by configurable difficulty levels
- Generates samples in batches according to dataset configuration
All generated events are saved to: output_path/datasets/dataset__[timestamp].pickle
Note: Event generation is cost-controlled via the config settings.
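The generated dataset is likewise a plain pickle, so it can be loaded the same way. A short exploratory sketch; the dataset's layout is an assumption to probe, not a documented schema:

```python
import glob
import pickle

# Pick the most recent dataset file; the [timestamp] part varies per run.
dataset_files = sorted(glob.glob("results/airline/datasets/dataset__*.pickle"))
with open(dataset_files[-1], "rb") as f:
    dataset = pickle.load(f)

print(type(dataset))  # inspect the structure before relying on it
```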
Step 4 - Analyze Simulator Results
After the simulation completes, you can find the results in the specified output path directory (results/airline). The structure will look like this:
results/airline/
├── experiments/
│   └── dataset__[timestamp]__exp_[n]/   # Experiment run folder
│       ├── experiment.log               # Detailed experiment execution logs
│       ├── config.yaml                  # Configuration used for this run
│       ├── prompt.txt                   # Prompt template used
│       ├── memory.db                    # Dialog memory database
│       └── results.csv                  # Evaluation results and metrics
├── datasets/
│   ├── dataset__[timestamp].pickle      # Generated dataset snapshot
│   └── dataset.log                      # Dataset generation logs
└── policies_graph/
    ├── graph.log                        # Policy graph generation logs
    └── descriptions_generator.pickle    # Generated descriptions and policies
Output Files Overview
- experiment.log: Contains detailed logs of the experiment execution, including timestamps and any errors encountered during the run.
- config.yaml: This file holds the configuration settings that were used for this specific simulation run, allowing for easy replication of results.
- prompt.txt: The prompt template that was utilized during the simulation, which can be useful for understanding the context of the agent's responses.
- memory.db: A database file that stores the dialog memory, which can be analyzed to understand how the agent retained and utilized information throughout the simulation.
- results.csv: This file includes the evaluation results and metrics from the simulation, providing insights into the performance of the agent.
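For a quick look at the metrics, results.csv can be loaded with pandas. A minimal sketch, assuming pandas is installed; the column names are not specified here, so inspect them first rather than assuming any particular schema:

```python
import pandas as pd  # pip install pandas

# Replace [timestamp] and [n] with your run's actual folder name.
results = pd.read_csv(
    "results/airline/experiments/dataset__[timestamp]__exp_[n]/results.csv"
)
print(results.columns.tolist())  # discover the actual metric columns first
print(results.head())
```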
In addition to the experiment folder, you will find:
- dataset__[timestamp].pickle: A snapshot of the generated dataset at the time of the simulation, which can be used for further analysis.
- dataset.log: Logs related to the dataset generation process, detailing any issues or important events that occurred during this phase.
- graph.log: Logs related to the generation of the policy graph, which can help in understanding the generated policies and the relations that drive the scenario generation process.
- descriptions_generator.pickle: A file containing the generated descriptions and policies, useful for reviewing the agent's learned behaviors and strategies.
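Because memory.db is a SQLite file, it can be inspected with Python's standard library. A minimal sketch that only lists the tables, since the schema itself is framework-internal:

```python
import sqlite3

# Replace [timestamp] and [n] with your run's actual folder name.
conn = sqlite3.connect(
    "results/airline/experiments/dataset__[timestamp]__exp_[n]/memory.db"
)
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"
).fetchall()
print(tables)  # explore the dialog-memory schema from here
conn.close()
```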
Step 5 - Run the Simulator Visualization
To visualize the simulation results with Streamlit, run:
cd simulator/visualization
streamlit run Simulator_Visualizer.py
This will launch a Streamlit dashboard showing detailed analytics and visualizations of your simulation results. In the visualization you can:
- Load simulator memory and experiments by providing their full path
- View conversation flows and policy compliance
- Analyze agent performance and failure points
Note: Make sure you have Streamlit installed (pip install streamlit) before running the visualization.