OpenLens AI: Fully Autonomous Multimodal Agent for Health Informatics Research

📄 Paper: Read our research paper on arXiv

🌐 Project Page: Explore detailed documentation and examples

🚀 Try Now: Use our cloud application directly in your browser

OpenLens AI is a fully autonomous multimodal agent designed for medical research. Provide it with your dataset and a single-line research idea, and it will independently conduct a literature review, design experiments, analyze data, and generate a comprehensive research report with no manual intervention. It also supports domains other than healthcare.

🔥 New: General domains are now supported (e.g., software, machine learning).
🔥 New: Chinese language support for figures and papers.

🔍 Key Features

No installation required! Visit our project page to learn more about OpenLens AI or try our cloud application to experience the fully autonomous research agent without any setup.

✅ Automated Literature Review: Search and summarize medical papers based on your research question
✅ Data Analysis: Analyze medical datasets and generate comprehensive reports
✅ Experiment Design: Suggest and validate experimental approaches
✅ Code Generation and Execution: Generate and execute code for data analysis and experiments with OpenHands
✅ Multi-Agent Collaboration: Coordinate multiple specialized agents to handle complex research tasks
✅ LaTeX Paper Generation: Automated creation and management of research papers and reports in LaTeX format
✅ Interactive UI: Streamlit-based interface for monitoring and interacting with the research process
✅ Context Management: Automated management of contextual information for agents via vector search
✅ Vision-Language Feedback: Integrate with VLM for visualization and feedback
✅ Chinese Language Support: Full support for Chinese paper writing
⬜ PowerPoint-Based Figures: Automated generation of PowerPoint-based figures with better visual quality for demonstrations (to replace the current Graphviz-based figures)
⬜ Context Manager via Long-Context Model: Integrate a long-context model for context management (in addition to the current vector-search-based approach)

🚀 Quick Start

Prerequisites

Python 3.9 or higher
Docker (for OpenHands runtime environment)
API keys for:
- LLM service (e.g., DeepSeek, OpenAI, Qwen, etc.)
- Tavily search API (for literature search)

Installation

Clone the repository:

git clone git@github.com:jarrycyx/openlens-ai.git --recurse-submodules
cd openlens-ai

Ensure Docker is installed:

docker --version

Pull the runtime image directly (recommended):

# Pull the prebuilt runtime image
ALIYUN_REMOTE_DOCKER_NAME=crpi-hbt8nkulkjqjqkie.cn-hangzhou.personal.cr.aliyuncs.com/cyx-docker/openlens-ai:runtime-latest
docker pull $ALIYUN_REMOTE_DOCKER_NAME
docker tag $ALIYUN_REMOTE_DOCKER_NAME openlens-ai:runtime-latest

Or build the image from scratch. To support Chinese paper writing, first download windows-fonts.tar.gz or collect the fonts from C:\Windows\Fonts\:

# Optional: unpack the Chinese fonts next to the build scripts (without leaving the repository root)
tar -xzvf openlens_ai/tools/openhands_configs/windows-fonts.tar.gz -C openlens_ai/tools/openhands_configs/

# Build the base image (TeX Live, etc.)
bash openlens_ai/tools/openhands_configs/build_docker_base.sh
# Build the runtime image that meets the OpenHands requirements
bash openlens_ai/tools/openhands_configs/build_docker_runtime.sh
# Look up the ID of the newly built image
docker images
# Tag that image as openlens-ai:runtime-latest
docker tag <IMAGE_ID> openlens-ai:runtime-latest
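
Whichever route you take, the tag openlens-ai:runtime-latest should exist locally before the first run. The following is a minimal Python sketch of that check, assuming the docker CLI is on your PATH. The helper name `image_available` is hypothetical and is not part of OpenLens AI.

```python
import shutil
import subprocess


def image_available(image: str, run=subprocess.run) -> bool:
    """Return True if `docker image inspect` finds the image locally.

    `run` is injectable so the check can be exercised without Docker.
    """
    result = run(
        ["docker", "image", "inspect", image],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


if __name__ == "__main__":
    if shutil.which("docker") is None:
        print("docker CLI not found on PATH")
    elif image_available("openlens-ai:runtime-latest"):
        print("runtime image is tagged and ready")
    else:
        print("runtime image missing: pull or build it, then tag it again")
```

A non-zero exit code from `docker image inspect` is treated as "not available", which also covers the case where the Docker daemon is not running.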

Install dependencies. First, install OpenHands from the bundled submodule:

cd modules/OpenHands

Then install OpenHands by following its installation instructions.

Return to the repository root and install the Python dependencies:

# To visualize the workflow, install Graphviz first:
#   sudo apt-get install graphviz graphviz-dev
#   pip install pygraphviz

conda create -n py312 python=3.12 # Or with uv / venv
conda activate py312
pip install --upgrade pip
pip install -e .

Configure environment variables:

cp .env.example .env
# Edit .env with your API keys and model settings

Configuration

In your config.toml file, configure the following:

[llm]
language = "chs"  # Language setting: "chs" for Chinese, "eng" for English

[llm.chat] # Main language model used for general tasks and coding
model = "glm-4.5-air"  # The main language model used for general tasks
base_url = "https://cloud.infini-ai.com/maas/v1/"  # Base URL for the model API service
api_key = "<YOUR API KEY>"  # API key for accessing the language models

[llm.vision]
model = "glm-4.1v-9b-thinking"  # The vision model used for image analysis tasks
base_url = "https://open.bigmodel.cn/api/paas/v4/"  # Base URL for the vision model API service
api_key = "<YOUR API KEY>"  # API key for accessing the vision model

[rerank]
rerank_model = "bge-reranker-v2-m3"  # The reranking model used to improve search result relevance
rerank_api_key = "<YOUR API KEY>" # API key for accessing the reranking model (infiniai service)
rerank_base_url = "https://cloud.infini-ai.com/maas/v1/"  # Base URL for the reranking model API service

[tools]
tavily_api_key = "<YOUR API KEY>"  # API key for Tavily search service used for web search

[docker]
docker_name = "openlens-ai:runtime-latest"  # Name of the Docker image used for the agent environment

See config.full-example.toml for more detailed configuration options.

Running the Application

Option 1: Command Line Interface

Example:

python -m openlens_ai.main \
  --question "What are the temporal patterns of vital sign deterioration preceding cardiac arrest events in critical care settings?" \
  --dataset-path "datasets/eicu-demo" \
  --thread-id "pred_aki_trend_eicu_demo" \
  --notify-email "dzdzzd@126.com" \
  --interrupt-after-subgraph "none" \
  --language "chs" \
  --domain "medical" # Domain setting: "medical" for healthcare, "general" for other domains
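
To queue several questions against the same dataset, the command above can be assembled programmatically. This is a minimal sketch that uses only the flags shown in the example. `build_command` is a hypothetical helper, the default values in its signature are illustrative choices rather than documented defaults, and treating --notify-email as optional is an assumption.

```python
import shlex
import sys


def build_command(question, dataset_path, thread_id,
                  language="eng", domain="medical",
                  interrupt_after_subgraph="none", notify_email=None):
    """Assemble the argv list for one `python -m openlens_ai.main` run."""
    argv = [
        sys.executable, "-m", "openlens_ai.main",
        "--question", question,
        "--dataset-path", dataset_path,
        "--thread-id", thread_id,
        "--interrupt-after-subgraph", interrupt_after_subgraph,
        "--language", language,
        "--domain", domain,
    ]
    # Assumption: the notification flag can be left out entirely.
    if notify_email:
        argv += ["--notify-email", notify_email]
    return argv


if __name__ == "__main__":
    argv = build_command(
        question="Which vital signs change first before cardiac arrest?",
        dataset_path="datasets/eicu-demo",
        thread_id="vitals_demo_run",
    )
    # Print a copy-pasteable shell line instead of launching the run.
    print(shlex.join(argv))
```

Passing the returned list to `subprocess.run(argv, check=True)` starts the run without going through a shell, so the question text needs no extra quoting.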

Option 2: Interactive Web Interface

streamlit run start_app.py

Then open your browser to http://localhost:8501 to access the interactive interface.

🧠 Architecture

OpenLens AI uses a multi-module architecture powered by LangGraph:

Literature Reviewer: Searches and analyzes relevant medical literature
Data Analyzer: Processes and analyzes medical datasets
Supervisor: Coordinates the research process and makes high-level decisions
Coder: Generates code and technical solutions for data processing
LaTeX Writer: Generates LaTeX documents for research papers and reports

Agents communicate through a shared state and can call various tools including:

Web search (Tavily)
Code execution (OpenHands)
File operations
Vector search for context management
Literature Search Tools:
- ✅ arXiv Search and Paper Reading
- ✅ medRxiv Search and Paper Reading
- ✅ Google Scholar Search
- ✅ Tavily Search
- ⬜ IACR ePrint Search
- ⬜ Semantic Scholar Search and Paper Reading
- ⬜ PubMed Search
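
The shared-state pattern described above can be illustrated without any framework. The sketch below is plain Python, not the project's LangGraph graph: the `ResearchState` fields, the node functions, and the fixed routing order are all hypothetical stand-ins, chosen to show how a supervisor can route between agents that each return a partial state update.

```python
from typing import Callable, Dict, List, TypedDict


class ResearchState(TypedDict, total=False):
    """Shared state that every agent reads from and writes to."""
    question: str
    literature_notes: List[str]
    analysis_summary: str
    report: str


def literature_reviewer(state: ResearchState) -> ResearchState:
    # Stand-in for a real search: returns only the keys it wants to update.
    return {"literature_notes": [f"notes on: {state['question']}"]}


def data_analyzer(state: ResearchState) -> ResearchState:
    count = len(state["literature_notes"])
    return {"analysis_summary": f"analysis using {count} note(s)"}


def latex_writer(state: ResearchState) -> ResearchState:
    return {"report": f"Report: {state['analysis_summary']}"}


AGENTS: Dict[str, Callable[[ResearchState], ResearchState]] = {
    "literature_reviewer": literature_reviewer,
    "data_analyzer": data_analyzer,
    "latex_writer": latex_writer,
}


def supervisor(state: ResearchState) -> str:
    """Pick the next agent based on what the shared state still lacks."""
    if "literature_notes" not in state:
        return "literature_reviewer"
    if "analysis_summary" not in state:
        return "data_analyzer"
    if "report" not in state:
        return "latex_writer"
    return "done"


def run(question: str) -> ResearchState:
    state: ResearchState = {"question": question}
    # Loop until the supervisor decides nothing is missing.
    while (next_agent := supervisor(state)) != "done":
        state.update(AGENTS[next_agent](state))
    return state
```

Each agent returns only the keys it changed and the loop merges them into the state, so an agent never needs to know which agents ran before it.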

📁 Project Structure

openlens_ai/
├── agents/              # Agent implementations
│   ├── coder.py
│   ├── data_analyzer.py
│   ├── latex_writer.py   # LaTeX document generation agent
│   ├── literature_reviewer.py
│   └── supervisor.py
├── prompts/             # LLM prompt templates
├── tools/               # Custom tools and utilities
├── utils/               # Helper functions
├── build_graph.py       # Main graph construction
├── chatbot.py           # Chatbot interface
├── frontend.py          # Streamlit frontend
└── state.py             # State management

🛠️ Customization

Adding New Agents

Create a new agent in openlens_ai/agents/
Follow the pattern in existing agents like coder.py
Register the agent in build_graph.py

Adding New Tools

Add tool implementation in openlens_ai/tools/
Register the tool in the appropriate agent
Update prompts if needed
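
The first two steps can be sketched in a framework-free way. Everything in this example is hypothetical: `TOOL_REGISTRY`, `register_tool`, `AGENT_TOOLS`, and the sample tool `count_csv_rows` are illustrative names and do not describe how openlens_ai/tools/ is actually organized, so follow the existing tools in the repository for the real conventions.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping tool names to their implementations.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {}


def register_tool(name: str):
    """Decorator that records a tool under `name` and rejects duplicates."""
    def decorator(func):
        if name in TOOL_REGISTRY:
            raise ValueError(f"tool already registered: {name}")
        TOOL_REGISTRY[name] = func
        return func
    return decorator


# Step 1: implement the tool as a plain function with a clear docstring.
@register_tool("count_csv_rows")
def count_csv_rows(csv_text: str) -> str:
    """Report how many data rows a CSV string has, excluding the header."""
    rows = [line for line in csv_text.splitlines() if line.strip()]
    return f"{max(len(rows) - 1, 0)} data rows"


# Step 2: expose the tool to the agent that should be able to call it.
AGENT_TOOLS: Dict[str, List[str]] = {"data_analyzer": ["count_csv_rows"]}


def call_tool(agent: str, tool_name: str, *args) -> str:
    """Run a tool on behalf of an agent, if that agent is allowed to use it."""
    if tool_name not in AGENT_TOOLS.get(agent, []):
        raise PermissionError(f"{agent} has no tool named {tool_name}")
    return TOOL_REGISTRY[tool_name](*args)
```

Keeping the per-agent allow-list separate from the registry means a tool can be implemented once and granted only to the agents whose prompts describe it.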

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for details on how to contribute to this project.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Uses OpenHands for code execution sandbox
Powered by LangGraph for workflow orchestration
Uses Streamlit for the web interface
Inspired by recent advances in AI for medical research
