🔍 TRUE Framework v1.0

Transparent · Reproducible · Understandable · Executable

Evaluate the openness and reproducibility of open LLMs 📊

Start New Evaluation

💡 Tip: Enter a URL or click on the input field to see popular model suggestions. Click "Auto-Analyze & Start Evaluation" to begin.
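Under the hood, auto-analysis presumably begins by normalizing whatever you type, whether a full HuggingFace URL or a bare org/model ID, into a canonical model identifier. The TypeScript sketch below illustrates that step only; the function name and matching rules are assumptions, not the tool's actual code.

```typescript
// Hypothetical sketch: normalize user input (a HuggingFace URL or a
// bare "org/model" ID) into a canonical model identifier.
function normalizeModelId(input: string): string | null {
  const trimmed = input.trim();
  // Full URL, e.g. https://huggingface.co/meta-llama/Llama-3.2-3B
  const urlMatch = trimmed.match(
    /^https?:\/\/huggingface\.co\/([\w.-]+\/[\w.-]+)/
  );
  if (urlMatch) return urlMatch[1];
  // Bare "org/model" ID typed directly into the field
  if (/^[\w.-]+\/[\w.-]+$/.test(trimmed)) return trimmed;
  return null; // unrecognized input; the UI would prompt for a valid URL
}

console.log(normalizeModelId("https://huggingface.co/google/gemma-2-9b"));
// "google/gemma-2-9b"
```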

Popular Models (click to select):
🔥 Trending on HuggingFace
Meta Llama 3.2 (3B)
Google Gemma 2 (9B)
Qwen 2.5 (72B)
Microsoft Phi-3.5 Mini
DeepSeek V2.5
💻 Code Models
Qwen 2.5 Coder (32B)
DeepSeek Coder V2
Code Llama (70B)
Code Llama Python
WizardCoder Python
🌟 Community Favorites
Nous Hermes 3
Mixtral 8x7B
Yi 1.5 (34B)
Aquila 2 (34B)
InternLM 2.5 (20B)
🔬 Research Models
Stanford Alpaca
Databricks Dolly
Open Assistant
RedPajama
OpenLLaMA
🏢 Enterprise Models
Salesforce XGen
Amazon MistralLite
NVIDIA Nemotron
Alibaba GTE-Qwen2
GLM-4 (9B)

Evaluation Leaderboard

Columns: Rank, Model, Evals, Score, Tier, T, R, U, E, Date (T, R, U, and E are the per-dimension scores: Transparent, Reproducible, Understandable, Executable).

Data Persistence

Export/Import

Back up and restore your evaluations.
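As a rough illustration of what a backup round-trip could look like, here is a minimal TypeScript sketch. The EvaluationRecord fields and the versioned JSON envelope are assumptions for illustration; the tool's actual export schema is not documented here.

```typescript
// Illustrative backup/restore sketch; field names are hypothetical.
interface EvaluationRecord {
  model: string; // e.g. a HuggingFace "org/model" ID
  score: number; // total TRUE score, 0-30
  tier: string;  // "Platinum" | "Gold" | "Silver" | "Bronze"
  date: string;  // ISO 8601 evaluation date
}

// Serialize evaluations to a pretty-printed JSON backup string.
function exportEvaluations(records: EvaluationRecord[]): string {
  return JSON.stringify({ version: 1, records }, null, 2);
}

// Parse a backup string back into evaluation records, with a basic
// shape check so a corrupted file fails loudly instead of silently.
function importEvaluations(json: string): EvaluationRecord[] {
  const parsed = JSON.parse(json);
  if (!parsed || !Array.isArray(parsed.records)) {
    throw new Error("Invalid backup: missing records array");
  }
  return parsed.records as EvaluationRecord[];
}
```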

🏆 TRUE Framework Tier Classification

Models are classified into four tiers based on their total TRUE Framework score (maximum 30 points); a code sketch of the mapping follows the list:

Platinum (28-30 points): Fully open and reproducible

Gold (21-27 points): Strong openness, minor gaps

Silver (11-20 points): Some transparency, low reproducibility

Bronze (0-10 points): Minimal openness
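The mapping from total score to tier is a simple threshold check. Below is a minimal TypeScript sketch, assuming the total is an integer in [0, 30]; the function and type names are illustrative, not taken from the tool's source.

```typescript
type Tier = "Platinum" | "Gold" | "Silver" | "Bronze";

// Map a total TRUE score (0-30) to its tier, per the table above.
function classifyTier(totalScore: number): Tier {
  if (totalScore < 0 || totalScore > 30) {
    throw new RangeError("TRUE score must be in [0, 30]");
  }
  if (totalScore >= 28) return "Platinum"; // fully open and reproducible
  if (totalScore >= 21) return "Gold";     // strong openness, minor gaps
  if (totalScore >= 11) return "Silver";   // some transparency, low reproducibility
  return "Bronze";                         // minimal openness
}

console.log(classifyTier(29)); // "Platinum"
console.log(classifyTier(15)); // "Silver"
```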

⚠️ Important Notice

Accuracy Disclaimer: The TRUE Framework scores are based on publicly available information and may not reflect the complete picture of a model's openness. Scores generated by this tool should be considered preliminary assessments.

Verification Required: Please independently verify all evidence links and information before making decisions based on these evaluations. Model documentation and availability may change over time.

Use Case: This tool is intended for educational and research purposes to promote transparency in AI development. For critical decisions, conduct thorough independent research.