This is the official repository for the paper "Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving" (ICML 2025 CFAgentic Workshop Best Paper Runner-Up Award).
This work was done by Xiangru Tang*, Tianrui Qin*, Tianhao Peng*, Ziyang Zhou, Daniel Shao, Tingting Du, Xinming Wei, Peng Xia, Fang Wu, He Zhu, Ge Zhang, Jiaheng Liu, Xingyao Wang, Sirui Hong, Chenglin Wu, Hao Cheng, Chi Wang, and Wangchunshu Zhou.
- Hierarchical Memory Structure: Combines working memory, episodic memory, and semantic knowledge base.
- Agentic Reasoning: Supports autonomous decision-making and planning using LLMs.
- Cross-Domain Adaptability: Designed for generalization across different task domains (e.g., QA, coding, planning).
- Modular Design: Easy to integrate with various benchmarks and environments.
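As an illustration of the hierarchical memory idea above, here is a minimal sketch; the class and method names are hypothetical and not the repository's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class HierarchicalMemory:
    """Toy three-tier memory: working (current task), episodic (past
    traces), and semantic (distilled knowledge). Illustrative only."""
    working: list = field(default_factory=list)    # current task context
    episodic: list = field(default_factory=list)   # past task traces
    semantic: dict = field(default_factory=dict)   # distilled knowledge

    def remember(self, observation: str) -> None:
        """Append an observation to working memory."""
        self.working.append(observation)

    def consolidate(self, task: str, outcome: str) -> None:
        """Archive the finished task's trace into episodic memory."""
        self.episodic.append(
            {"task": task, "trace": list(self.working), "outcome": outcome}
        )
        self.working.clear()
```

In this sketch, consolidation empties working memory into an episodic record, which a retrieval step could later surface as cross-domain experience.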
To start, follow the steps below:
cd ./Agent-KB-GAIA/examples/open_deep_research

Run the following commands to install the required dependencies from the requirements.txt file:

pip install -r requirements.txt
pip install -e ../../.[dev]

The agent uses the SearchTool for web search, which requires an environment variable with the corresponding API key, based on the selected provider:
SERP_API_KEY for SerpApi: sign up on the SerpApi website to get a key
Depending on the model you want to use, you may need to set environment variables. For OpenAI-compatible models, set the OPENAI_BASE_URL and OPENAI_API_KEY environment variables.
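For example, with an OpenAI-compatible endpoint and SerpApi (the URL and key values below are placeholders, not real credentials):

```shell
# Placeholders -- substitute your own endpoint and keys.
export OPENAI_BASE_URL="https://api.openai.com/v1"
export OPENAI_API_KEY="sk-your-openai-key"
export SERP_API_KEY="your-serpapi-key"
```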
Download the GAIA dataset and place it under the following directory:
./data/gaia
The expected directory structure is as follows:
├── data
│   └── gaia
│       ├── test
│       └── validation
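Before launching a run, you can sanity-check that the dataset is in place. This is a hypothetical helper, not part of the repository:

```python
from pathlib import Path

def missing_gaia_dirs(root: str = ".") -> list:
    """Return the expected GAIA split directories missing under root."""
    expected = [Path(root, "data", "gaia", split)
                for split in ("test", "validation")]
    return [str(p) for p in expected if not p.is_dir()]
```

An empty return value means both the test and validation splits were found.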
You're now all set to run on GAIA! Simply execute the run_gaia.py script like so:

python run_gaia.py --model-id openai:gpt-4.1 --model-id-search openai:gpt-4.1 --run-name gpt-4.1-gaia

If you'd like to use different questions or datasets, refer to the run_gaia.py script for guidance and make the necessary adjustments.
Now, let's start configuring Agent KB.
Format your knowledge base samples properly and save them in the following file:
./agent_kb/agent_kb_database.json
Each sample in the JSON file should follow this structure:
{
"question": "",
"agent_planning": "",
"search_agent_planning": "",
"agent_experience": "",
"search_agent_experience": ""
}
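Before saving the database, it can help to check that every entry carries exactly these fields. A minimal sketch, using a hypothetical validation helper:

```python
# Fields taken from the sample structure documented above.
REQUIRED_KEYS = {
    "question",
    "agent_planning",
    "search_agent_planning",
    "agent_experience",
    "search_agent_experience",
}

def is_valid_sample(entry: dict) -> bool:
    """True if the entry has exactly the required string-valued fields."""
    return set(entry) == REQUIRED_KEYS and all(
        isinstance(v, str) for v in entry.values()
    )
```

Entries that pass this check can then be collected into a list and serialized to ./agent_kb/agent_kb_database.json with json.dump.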
Launch the Agent KB service by running the script below:
python ./agent_kb/agent_kb_service.py

Once everything is configured, run the main script using a command similar to the following example:

python run_gaia.py --model-id openai:gpt-4.1 --model-id-search openai:gpt-4.1 --run-name gpt-4.1-gaia --agent_kb --concurrency 1

To run on SWE-bench, change to the scripts directory:

cd ./Agent-KB-SWE-bench/scripts

Please note that all the files mentioned below are in the "scripts" directory.
bash run_swe_bench_hints_agentless_repo.sh
bash run_swe_bench_agentless.sh
bash run_swebench_eval_hints_agentless_repo.sh

Other scripts support:
- Hint variants: repo_2nd, all_bench_hints
- Baselines: plain, plain_qwen, plain_claude
| Script | Description |
|---|---|
| build_env.sh | Build and set up the Docker-based environment. |
| build.sh | Local build script (e.g., hint databases). |
| run_infer_template.sh | Base template for LLM prompt runs. |
| Script | Hint Source | Description |
|---|---|---|
| run_swe_bench_hints_agentless.sh | Location hints | Location hints for the baseline. |
| run_swe_bench_hints_agentless_repo.sh | RepoClassBench 1st round | Main Agent KB reasoning setup. |
| run_swe_bench_hints_agentless_repo_2nd.sh | RepoClassBench refined | Deeper/longer hints. |
| run_swe_bench_hints_agentless_all_bench_hints.sh | Combined | Merged hints for high coverage. |
| Script | Model | Description |
|---|---|---|
| run_swe_bench_plain.sh | GPT-4 | Plain run on the full benchmark. |
| run_swe_bench_plain_claude.sh | Claude | Claude 3.7 baseline run. |
| run_swe_bench_plain_qwen.sh | Qwen | Qwen 3 baseline run. |
| Script | Matches |
|---|---|
| run_swebench_eval_plain.sh | Baseline on full set |
| run_swebench_eval_agentless.sh | Location hints |
| run_swebench_eval_hints_agentless_repo.sh | Location hints, RepoClassBench hints |
| run_swebench_eval_agentless_repo_2nd.sh | Location hints, RepoClassBench refined subset |
| run_swebench_eval_agentless_all_bench_hints.sh | Location hints, all hints subset |
| run_swebench_eval_plain_claude.sh | Claude baseline eval |
| run_swebench_eval_plain_qwen.sh | Qwen baseline eval |
This work builds upon and adapts code from the official implementations of two prominent open-source frameworks:
- smolagents — a lightweight, flexible, and powerful library designed to enable the rapid development and deployment of AI agents with minimal effort.
- OpenHands — an advanced platform enabling AI-powered software development agents capable of interacting with code, terminals, and APIs just like human developers.
These frameworks were instrumental in validating the effectiveness and performance of our proposed methods. We extend our gratitude to the contributors and maintainers of both projects for their foundational work in advancing agent-based systems.
@article{tang2025agent,
title={Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving},
author={Tang, Xiangru and Qin, Tianrui and Peng, Tianhao and Zhou, Ziyang and Shao, Daniel and Du, Tingting and Wei, Xinming and Xia, Peng and Wu, Fang and Zhu, He and others},
journal={arXiv preprint arXiv:2507.06229},
year={2025}
}
