🔍 TRUE Framework v1.0

Transparent · Reproducible · Understandable · Executable

Evaluate the openness and reproducibility of open LLMs 📊

Start New Evaluation

💡 Tip: Enter a URL or click on the input field to see popular model suggestions. Click "Auto-Analyze & Start Evaluation" to begin.
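Under the hood, auto-analysis presumably begins by normalizing whatever you type, whether a full HuggingFace URL or a bare org/model ID, into a canonical model identifier. The TypeScript sketch below illustrates that step only; the function name and matching rules are assumptions, not the tool's actual code.

```typescript
// Hypothetical sketch: normalize user input (a HuggingFace URL or a
// bare "org/model" ID) into a canonical model identifier.
function normalizeModelId(input: string): string | null {
  const trimmed = input.trim();
  // Full URL, e.g. https://huggingface.co/meta-llama/Llama-3.2-3B
  const urlMatch = trimmed.match(
    /^https?:\/\/huggingface\.co\/([\w.-]+\/[\w.-]+)/
  );
  if (urlMatch) return urlMatch[1];
  // Bare "org/model" ID typed directly into the field
  if (/^[\w.-]+\/[\w.-]+$/.test(trimmed)) return trimmed;
  return null; // unrecognized input; the UI would prompt for a valid URL
}

console.log(normalizeModelId("https://huggingface.co/google/gemma-2-9b"));
// "google/gemma-2-9b"
```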

Popular Models (click to select):
🔥 Trending on HuggingFace
Meta Llama 3.2 (3B)
Google Gemma 2 (9B)
Qwen 2.5 (72B)
Microsoft Phi-3.5 Mini
DeepSeek V2.5
💻 Code Models
Qwen 2.5 Coder (32B)
DeepSeek Coder V2
Code Llama (70B)
Code Llama Python
WizardCoder Python
🌟 Community Favorites
Nous Hermes 3
Mixtral 8x7B
Yi 1.5 (34B)
Aquila 2 (34B)
InternLM 2.5 (20B)
🔬 Research Models
Stanford Alpaca
Databricks Dolly
Open Assistant
RedPajama
OpenLLaMA
🏢 Enterprise Models
Salesforce XGen
Amazon MistralLite
NVIDIA Nemotron
Alibaba GTE-Qwen2
GLM-4 (9B)

Evaluation Leaderboard

Columns: Rank, Model, Evals, Score, Tier, T, R, U, E, Date (T, R, U, and E are the per-dimension scores: Transparent, Reproducible, Understandable, Executable).

Data Persistence

Export/Import

Back up and restore your evaluations.
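As a rough illustration of what a backup round-trip could look like, here is a minimal TypeScript sketch. The EvaluationRecord fields and the versioned JSON envelope are assumptions for illustration; the tool's actual export schema is not documented here.

```typescript
// Illustrative backup/restore sketch; field names are hypothetical.
interface EvaluationRecord {
  model: string; // e.g. a HuggingFace "org/model" ID
  score: number; // total TRUE score, 0-30
  tier: string;  // "Platinum" | "Gold" | "Silver" | "Bronze"
  date: string;  // ISO 8601 evaluation date
}

// Serialize evaluations to a pretty-printed JSON backup string.
function exportEvaluations(records: EvaluationRecord[]): string {
  return JSON.stringify({ version: 1, records }, null, 2);
}

// Parse a backup string back into evaluation records, with a basic
// shape check so a corrupted file fails loudly instead of silently.
function importEvaluations(json: string): EvaluationRecord[] {
  const parsed = JSON.parse(json);
  if (!parsed || !Array.isArray(parsed.records)) {
    throw new Error("Invalid backup: missing records array");
  }
  return parsed.records as EvaluationRecord[];
}
```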

🏆 TRUE Framework Tier Classification

Models are classified into four tiers based on their total TRUE Framework score (maximum 30 points); a code sketch of the mapping follows the list:

Platinum (28-30 points): Fully open and reproducible

Gold (21-27 points): Strong openness, minor gaps

Silver (11-20 points): Some transparency, low reproducibility

Bronze (0-10 points): Minimal openness
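The mapping from total score to tier is a simple threshold check. Below is a minimal TypeScript sketch, assuming the total is an integer in [0, 30]; the function and type names are illustrative, not taken from the tool's source.

```typescript
type Tier = "Platinum" | "Gold" | "Silver" | "Bronze";

// Map a total TRUE score (0-30) to its tier, per the table above.
function classifyTier(totalScore: number): Tier {
  if (totalScore < 0 || totalScore > 30) {
    throw new RangeError("TRUE score must be in [0, 30]");
  }
  if (totalScore >= 28) return "Platinum"; // fully open and reproducible
  if (totalScore >= 21) return "Gold";     // strong openness, minor gaps
  if (totalScore >= 11) return "Silver";   // some transparency, low reproducibility
  return "Bronze";                         // minimal openness
}

console.log(classifyTier(29)); // "Platinum"
console.log(classifyTier(15)); // "Silver"
```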

⚠️ Important Notice

Accuracy Disclaimer: The TRUE Framework scores are based on publicly available information and may not reflect the complete picture of a model's openness. Scores generated by this tool should be considered preliminary assessments.

Verification Required: Please independently verify all evidence links and information before making decisions based on these evaluations. Model documentation and availability may change over time.

Use Case: This tool is intended for educational and research purposes to promote transparency in AI development. For critical decisions, conduct thorough independent research.