feat: Enabling automation of experiments running v2.0 #469

Open
wants to merge 24 commits into
base: main
Changes from 9 commits
Commits
8726ab8
Revising to enable automation of experiments running v1.0
xisen-w Nov 4, 2024
b44bef5
Any new updates
xisen-w Nov 15, 2024
c100876
Revising to enable automation of experiments running v1.0
xisen-w Nov 4, 2024
18370d4
Any new updates
xisen-w Nov 15, 2024
21a99d2
Add template
you-n-g Nov 15, 2024
86ae0b2
Stoping tracking additional env
xisen-w Nov 20, 2024
f94dbff
Merge branch 'automated-evaluation' of https://github.com/microsoft/R…
xisen-w Nov 20, 2024
66ffd6d
Uploading relevant envs
xisen-w Nov 20, 2024
0ef80a5
Adding tests
xisen-w Nov 20, 2024
907d980
Updating
xisen-w Nov 20, 2024
51388d1
Updated collect.py to extract result from trace
xisen-w Nov 23, 2024
af6220e
Update .gitignore to remove the unecessary ones
xisen-w Nov 23, 2024
54c3c6d
"Remove unnecessary files"
xisen-w Nov 23, 2024
78708e4
Merge branch 'automated-evaluation' of https://github.com/microsoft/R…
xisen-w Nov 25, 2024
3f131f3
Merge branch 'main' into automated-evaluation
xisen-w Nov 25, 2024
38bb9e6
Updated to enable automatic collection of experiment result information
xisen-w Nov 25, 2024
10b0053
Updating the env files & Upading test_system file
xisen-w Nov 25, 2024
238f492
Updated relevant env for better testing
xisen-w Nov 25, 2024
68ca63a
Updated README.md
xisen-w Nov 25, 2024
8b18fad
reverting gitignore back
xisen-w Nov 25, 2024
2395dc5
Updates
xisen-w Dec 3, 2024
b7cc98e
README update
xisen-w Dec 3, 2024
0b5a09d
Updates on env README
xisen-w Dec 3, 2024
24cd0c2
Updating collect.py
xisen-w Dec 3, 2024
12 changes: 10 additions & 2 deletions .gitignore
@@ -117,6 +117,7 @@ venv/
ENV/
env.bak/
venv.bak/
.huaxia_env
Contributor

what is this for? is it necessary?

Collaborator Author

Thanks for the feedback! Was about to hide this.


# Spyder project settings
.spyderproject
@@ -151,7 +152,7 @@ reports/
# git_ignore_folder
git_ignore_folder/

-#cache
+# cache
*cache*/
*cache.json

@@ -169,4 +170,11 @@ mlruns/

# shell script
*.out
*.sh

# Logs
*.log
logs/
log/

# Ignore results directory
RD-Agent/rdagent/scenarios/kaggle/automated_evaluation/results/
136 changes: 136 additions & 0 deletions rdagent/scenarios/kaggle/automated_evaluation/eval.sh
@@ -0,0 +1,136 @@
#!/bin/bash

Contributor

What is this file for?

# Comments
cat << "EOF" > /dev/null
Experiment Setup Types:
1. DS-Agent Mini-Case
2. RD-Agent Basic
3. RD-Agent Pro
4. RD-Agent Max

Each setup has specific configurations for:
- base_model (4o|mini|4o)
- rag_param (No|Simple|Advanced)
- if_MAB (True|False)
- if_feature_selection (True|False)
- if_hypothesis_proposal (True|False)
EOF

# Get current time and script directory
SCRIPT_PATH="$(realpath "$0")"
SCRIPT_DIR="$(dirname "$SCRIPT_PATH")"
current_time=$(date +"%Y%m%d_%H%M%S")
export SCRIPT_DIR
export current_time

# Parse command line arguments
PARALLEL=1
CONF_PATH=./
COMPETITION=""
SETUP_TYPE=""

while getopts ":sc:k:t:" opt; do
    case $opt in
        s)
            echo "Disable parallel running (run experiments serially)" >&2
            PARALLEL=0
            ;;
        c)
            echo "Setting conf path $OPTARG" >&2
            CONF_PATH=$OPTARG
            ;;
        k)
            echo "Setting Kaggle competition $OPTARG" >&2
            COMPETITION=$OPTARG
            ;;
        t)
            echo "Setting setup type $OPTARG" >&2
            SETUP_TYPE=$OPTARG
            ;;
        \?)
            echo "Invalid option: -$OPTARG" >&2
            exit 1
            ;;
        :)
            # The leading ':' in the optstring routes a missing argument here
            echo "Option -$OPTARG requires an argument" >&2
            exit 1
            ;;
    esac
done

# Validate required parameters
if [ -z "$COMPETITION" ] || [ -z "$SETUP_TYPE" ]; then
    echo "Error: Competition (-k) and setup type (-t) are required" >&2
    exit 1
fi

# Create necessary directories
mkdir -p "${SCRIPT_DIR}/results/${current_time}"
mkdir -p "${SCRIPT_DIR}/logs/${current_time}"

# Configure experiment based on setup type
configure_experiment() {
    local setup=$1
    case $setup in
        "mini-case")
            echo "if_using_vector_rag=True" > "${SCRIPT_DIR}/override.env"
            echo "if_using_graph_rag=False" >> "${SCRIPT_DIR}/override.env"
            echo "if_action_choosing_based_on_UCB=True" >> "${SCRIPT_DIR}/override.env"
            echo "model_feature_selection_coder=True" >> "${SCRIPT_DIR}/override.env"
            echo "hypothesis_gen=False" >> "${SCRIPT_DIR}/override.env"
            ;;
        "basic")
            echo "if_using_vector_rag=False" > "${SCRIPT_DIR}/override.env"
            echo "if_using_graph_rag=False" >> "${SCRIPT_DIR}/override.env"
            echo "if_action_choosing_based_on_UCB=False" >> "${SCRIPT_DIR}/override.env"
            echo "model_feature_selection_coder=True" >> "${SCRIPT_DIR}/override.env"
            echo "hypothesis_gen=True" >> "${SCRIPT_DIR}/override.env"
            ;;
        "pro")
            echo "if_using_vector_rag=True" > "${SCRIPT_DIR}/override.env"
            echo "if_using_graph_rag=False" >> "${SCRIPT_DIR}/override.env"
            echo "if_action_choosing_based_on_UCB=True" >> "${SCRIPT_DIR}/override.env"
            echo "model_feature_selection_coder=True" >> "${SCRIPT_DIR}/override.env"
            echo "hypothesis_gen=True" >> "${SCRIPT_DIR}/override.env"
            ;;
        "max")
            echo "if_using_vector_rag=True" > "${SCRIPT_DIR}/override.env"
            echo "if_using_graph_rag=True" >> "${SCRIPT_DIR}/override.env"
            echo "if_action_choosing_based_on_UCB=True" >> "${SCRIPT_DIR}/override.env"
            echo "model_feature_selection_coder=True" >> "${SCRIPT_DIR}/override.env"
            echo "hypothesis_gen=True" >> "${SCRIPT_DIR}/override.env"
            ;;
        *)
            # Fail early instead of running with a missing or stale override.env
            echo "Error: unknown setup type '$setup' (expected mini-case, basic, pro, or max)" >&2
            exit 1
            ;;
    esac
}

# Execute experiment
run_experiment() {
    local setup_type=$1
    local competition=$2
    local result_file="${SCRIPT_DIR}/results/${current_time}/result.json"

    configure_experiment "$setup_type"

    # Run the main experiment loop
    python -m rdagent.app.kaggle.loop \
        --competition "$competition" \
        --setup "$setup_type" \
        --result_path "$result_file" \
        >> "${SCRIPT_DIR}/logs/${current_time}/experiment.log" 2>&1

    # Fall back to null so experiment_info.json stays valid JSON when no result was written
    local results="null"
    if [ -s "$result_file" ]; then
        results=$(cat "$result_file")
    fi

    # Store experiment setup and results
    cat > "${SCRIPT_DIR}/results/${current_time}/experiment_info.json" << EOF
{
    "setup": {
        "competition": "$competition",
        "setup_type": "$setup_type",
        "timestamp": "$current_time"
    },
    "results": $results
}
EOF
}

# Cleanup: register the trap before the run so override.env is removed even on failure
trap 'rm -f "${SCRIPT_DIR}/override.env"' EXIT

# Run the experiment
run_experiment "$SETUP_TYPE" "$COMPETITION"

echo "Experiment completed. Results are stored in ${SCRIPT_DIR}/results/${current_time}"
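
The four `case` branches in `configure_experiment` differ only in a few flags, so the presets could also be kept as data. A minimal Python sketch of that idea, assuming the same five keys and values that `eval.sh` writes; `SETUPS` and `render_override_env` are hypothetical names and not part of this PR:

```python
# Hypothetical sketch: presets copied from the values eval.sh writes to override.env.
SETUPS = {
    "mini-case": {
        "if_using_vector_rag": True,
        "if_using_graph_rag": False,
        "if_action_choosing_based_on_UCB": True,
        "model_feature_selection_coder": True,
        "hypothesis_gen": False,
    },
    "basic": {
        "if_using_vector_rag": False,
        "if_using_graph_rag": False,
        "if_action_choosing_based_on_UCB": False,
        "model_feature_selection_coder": True,
        "hypothesis_gen": True,
    },
    "pro": {
        "if_using_vector_rag": True,
        "if_using_graph_rag": False,
        "if_action_choosing_based_on_UCB": True,
        "model_feature_selection_coder": True,
        "hypothesis_gen": True,
    },
    "max": {
        "if_using_vector_rag": True,
        "if_using_graph_rag": True,
        "if_action_choosing_based_on_UCB": True,
        "model_feature_selection_coder": True,
        "hypothesis_gen": True,
    },
}


def render_override_env(setup: str) -> str:
    """Return the override.env text for a setup type; reject unknown names."""
    if setup not in SETUPS:
        raise ValueError(f"unknown setup type: {setup!r}")
    return "".join(f"{key}={value}\n" for key, value in SETUPS[setup].items())


print(render_override_env("basic"), end="")
```

Keeping the presets in one table makes it easier to see that, in this PR, `pro` and `max` differ only in `if_using_graph_rag`.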

@@ -0,0 +1,8 @@
{
Contributor

Why do you submit this file?

Collaborator Author

I thought since this was not to be merged I could show how the json is organised for now.

I will hide it in the next commit.

"setup": {
"competition": "sf-crime",
"setup_type": "mini-case",
"timestamp": "20241107_051618"
},
"results":
}
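
The sample above ends in an empty `"results":` value, which is not valid JSON. That is consistent with the heredoc in `eval.sh` substituting the output of `cat` on a `result.json` that was never written. A minimal sketch of assembling the same structure with Python's `json` module, so that a missing result becomes `null`; `build_experiment_info` is a hypothetical helper, not code from this PR:

```python
import json
from pathlib import Path


def build_experiment_info(competition, setup_type, timestamp, result_path):
    """Combine setup metadata with the result file.

    "results" is None (serialized as null) when the file is missing or empty.
    """
    path = Path(result_path)
    results = None
    if path.is_file() and path.stat().st_size > 0:
        results = json.loads(path.read_text())
    return {
        "setup": {
            "competition": competition,
            "setup_type": setup_type,
            "timestamp": timestamp,
        },
        "results": results,
    }


# The path below is a placeholder that is assumed not to exist.
info = build_experiment_info(
    "sf-crime", "mini-case", "20241107_051618", "missing_dir/result.json"
)
print(json.dumps(info, indent=4))
```

Serializing through `json.dumps` also escapes any quotes in the competition name, which string interpolation in a heredoc does not.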
9 changes: 9 additions & 0 deletions scripts/exp/ablation/README.md
@@ -0,0 +1,9 @@
# Introduction

| name | .env | desc |
| -- | -- | -- |
| full | full.env | enable all features |
| minicase | minicase.env | enable minicase |

Contributor

this file is not complete



5 changes: 5 additions & 0 deletions scripts/exp/ablation/env/basic.env
@@ -0,0 +1,5 @@
if_using_vector_rag=False
Contributor

The environment variable names should be capitalized.

if_using_graph_rag=False
if_action_choosing_based_on_UCB=False
model_feature_selection_coder=True
hypothesis_gen=True
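
One reading of the review comment is that the keys should be upper-case, as is conventional for environment variables. A minimal sketch of that conversion, assuming plain `key=value` lines without quoting; `parse_env_text` and its `prefix` parameter are hypothetical, and the resulting names are not confirmed by the project:

```python
def parse_env_text(text: str, prefix: str = "") -> dict[str, str]:
    """Parse key=value lines, skipping blanks and # comments; keys are upper-cased."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[prefix + key.strip().upper()] = value.strip()
    return env


# Contents of basic.env as shown in this diff.
basic_env = """\
if_using_vector_rag=False
if_using_graph_rag=False
if_action_choosing_based_on_UCB=False
model_feature_selection_coder=True
hypothesis_gen=True
"""

for name, value in parse_env_text(basic_env).items():
    print(f"{name}={value}")
```

Whether a prefix is needed depends on how the application reads its settings, which this diff does not show; the `prefix` argument is only there to illustrate the option.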
1 change: 1 addition & 0 deletions scripts/exp/ablation/env/full.env
@@ -0,0 +1 @@

5 changes: 5 additions & 0 deletions scripts/exp/ablation/env/max.env
Collaborator

The path to the knowledge base needs to be added here.

Collaborator

KG_IF_USING_VECTOR_RAG here should be set to false.

@@ -0,0 +1,5 @@
if_using_vector_rag=True
if_using_graph_rag=True
if_action_choosing_based_on_UCB=True
model_feature_selection_coder=True
hypothesis_gen=True
5 changes: 5 additions & 0 deletions scripts/exp/ablation/env/mini-case.env
@@ -0,0 +1,5 @@
if_using_vector_rag=True
if_using_graph_rag=False
if_action_choosing_based_on_UCB=True
model_feature_selection_coder=True
hypothesis_gen=False
5 changes: 5 additions & 0 deletions scripts/exp/ablation/env/pro.env
Collaborator

The path to the knowledge base also needs to be added here.

@@ -0,0 +1,5 @@
if_using_vector_rag=True
if_using_graph_rag=False
if_action_choosing_based_on_UCB=True
model_feature_selection_coder=True
hypothesis_gen=True
3 changes: 3 additions & 0 deletions scripts/exp/tools/README.md
@@ -0,0 +1,3 @@
The tools in this directory provide the following general features:
- collect env files and run each one
- collect results and generate a summary
28 changes: 28 additions & 0 deletions scripts/exp/tools/collect.py
@@ -0,0 +1,28 @@
import os
Contributor

Will the env name (e.g. basic, max, pro) be displayed in the collected results?

import json


def collect_results(dir_path) -> list[dict]:
    """Load every JSON file under dir_path and return the parsed contents."""
    summary = []
    for root, _, files in os.walk(dir_path):
        for file in files:
            # Skip a summary written by a previous run so it is not collected again
            if file.endswith(".json") and file != "summary.json":
                with open(os.path.join(root, file), "r") as f:
                    data = json.load(f)
                summary.append(data)
    return summary


def generate_summary(results, output_path):
    # First analyze the results and generate a summary
    # For each experiment, we find the best result, the metric, and result trajectory
    # TODO: Implement this

    # Then write the summary to the output path
    with open(output_path, "w") as f:
        json.dump(results, f, indent=4)


if __name__ == "__main__":
    exp_dir = os.getenv("EXP_DIR")
    if exp_dir is None:
        raise SystemExit("EXP_DIR is not set")
    result_dir = os.path.join(exp_dir, "results")
    results = collect_results(result_dir)
    summary_path = os.path.join(result_dir, "summary.json")
    generate_summary(results, summary_path)
    print("Summary generated successfully at", summary_path)
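
On the reviewer's question about env names: `collect_results` keeps only the parsed JSON, so the origin of each result is not recorded. A sketch of one way to record it, assuming (hypothetically) that each env writes into a subdirectory named after it, such as `results/basic/result.json`; this layout and the function name are illustrative and not established by the PR:

```python
import json
import os
import tempfile


def collect_results_with_env(dir_path: str) -> list[dict]:
    """Walk dir_path and tag each loaded JSON with its top-level subdirectory name."""
    summary = []
    for root, _, files in os.walk(dir_path):
        for name in sorted(files):
            if not name.endswith(".json") or name == "summary.json":
                continue
            rel = os.path.relpath(root, dir_path)
            # Files directly under dir_path have no env directory to name them.
            env_name = None if rel == "." else rel.split(os.sep)[0]
            with open(os.path.join(root, name), "r") as f:
                data = json.load(f)
            summary.append({"env": env_name, "file": name, "data": data})
    return summary


# Usage with a throwaway directory tree; the JSON payload is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    for env in ("basic", "pro"):
        os.makedirs(os.path.join(tmp, env))
        with open(os.path.join(tmp, env, "result.json"), "w") as f:
            json.dump({"placeholder": True}, f)
    for row in collect_results_with_env(tmp):
        print(row["env"], row["file"])
```

Wrapping the payload instead of merging the env name into it avoids overwriting a key that a result file might already use.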

51 changes: 51 additions & 0 deletions scripts/exp/tools/run_envs.sh
@@ -0,0 +1,51 @@
#!/bin/sh
cat << "EOF" > /dev/null
Given a directory with *.env files. Run each one.

usage for example:

1) directly run command without extra shared envs
./run_envs.sh -d <dir_to_*.envfiles> -j <number of parallel process> -- <command>

2) load shared envs `.env` before running command with different envs.
dotenv run -- ./run_envs.sh -d <dir_to_*.envfiles> -j <number of parallel process> -- <command>

EOF

# Function to display usage
usage() {
    echo "Usage: $0 -d <dir_to_*.envfiles> -j <number of parallel process> -- <command>"
    exit 1
}

# Parse command line arguments
while getopts "d:j:" opt; do
    case $opt in
        d) DIR=$OPTARG ;;
        j) JOBS=$OPTARG ;;
        *) usage ;;
    esac
done

# Shift to get the command
shift $((OPTIND - 1))

# Check if directory and jobs are set
if [ -z "$DIR" ] || [ -z "$JOBS" ] || [ $# -eq 0 ]; then
    usage
fi

# Join the remaining arguments into a single command string
COMMAND="$*"

# Before running commands
echo "Running experiments with following env files:"
find "$DIR" -name "*.env" -exec echo "{}" \;

# Export and run each .env file in parallel.
# -I {} already runs one command per input line, so -n 1 is not combined with it.
find "$DIR" -name "*.env" | xargs -P "$JOBS" -I {} sh -c "
    set -a
    . {}
    set +a
    $COMMAND
"

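The same fan-out can be sketched in Python: load each `*.env` file on top of the current environment and run the command with at most `jobs` workers. This is an illustration of the technique under stated assumptions (plain `key=value` files, hypothetical function names `load_env_file` and `run_with_envs`), not a replacement proposed in this PR:

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Read simple key=value lines; blank lines and # comments are ignored."""
    env = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            env[key.strip()] = value.strip()
    return env


def run_with_envs(env_dir: str, jobs: int, command: list[str]) -> dict[str, int]:
    """Run command once per *.env file and return {env file name: exit code}."""
    env_files = sorted(Path(env_dir).rglob("*.env"))

    def run_one(path: Path) -> int:
        # Values from the env file override the inherited environment.
        merged = {**os.environ, **load_env_file(path)}
        return subprocess.run(command, env=merged).returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        codes = list(pool.map(run_one, env_files))
    return {path.name: code for path, code in zip(env_files, codes)}


if __name__ == "__main__":
    # Print one variable per env file; the directory is the one used in this PR.
    child = "import os; print(os.environ.get('hypothesis_gen'))"
    print(run_with_envs("scripts/exp/ablation/env", 4, [sys.executable, "-c", child]))
```

Passing the command as an argument list avoids the extra round of shell quoting that the `sh -c` string in `run_envs.sh` requires.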
21 changes: 21 additions & 0 deletions scripts/exp/tools/test_system.sh
@@ -0,0 +1,21 @@
#!/bin/bash

# Test directory setup
TEST_DIR="test_run"
mkdir -p "$TEST_DIR/results"

# Test 1: Environment loading
echo "Testing environment loading..."
./scripts/exp/tools/run_envs.sh -d scripts/exp/ablation/env -j 1 -- env | grep "if_using"

# Test 2: Parallel execution
echo "Testing parallel execution..."
./scripts/exp/tools/run_envs.sh -d scripts/exp/ablation/env -j 4 -- \
    echo 'Processing env with RAG setting: $if_using_vector_rag'

# Test 3: Result collection
echo "Testing result collection..."
EXP_DIR="$TEST_DIR" python scripts/exp/tools/collect.py

# Cleanup
rm -rf "$TEST_DIR"