A web-based tool for evaluating the openness and reproducibility of open Large Language Models (LLMs) using the TRUE (Transparent, Reproducible, Understandable, Executable) framework.
Visit: https://csheargm.github.io/true_framework/
The TRUE Framework provides a systematic scorecard approach to evaluate how "open" a model truly is—beyond just licensing. It scores models across four key dimensions with a maximum total score of 30 points.
- Transparent (Max 10 pts) - Critical model components are openly disclosed
- Reproducible (Max 10 pts) - The model can feasibly be retrained from the released artifacts
- Understandable (Max 6 pts) - Documentation is sufficient to understand how the model was built
- Executable (Max 4 pts) - The model can be run or fine-tuned locally
- Platinum (28–30): Fully open and reproducible
- Gold (21–27): Strong openness, minor gaps
- Silver (11–20): Some transparency, low reproducibility
- Bronze (0–10): Minimal openness
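To make the rubric concrete, here is a minimal sketch of how a total score might be tallied and mapped to a badge tier. Only the point caps and badge thresholds come from the framework above; the function and field names are illustrative, not the tool's actual code.

```javascript
// Illustrative scorecard tally; only the caps and thresholds come from the framework.
const MAX_POINTS = { transparent: 10, reproducible: 10, understandable: 6, executable: 4 };

function badgeFor(total) {
  if (total >= 28) return "Platinum"; // 28–30: fully open and reproducible
  if (total >= 21) return "Gold";     // 21–27: strong openness, minor gaps
  if (total >= 11) return "Silver";   // 11–20: some transparency, low reproducibility
  return "Bronze";                    // 0–10: minimal openness
}

function scoreEvaluation(scores) {
  // Clamp each dimension to its maximum before summing.
  const total = Object.entries(MAX_POINTS)
    .reduce((sum, [dim, max]) => sum + Math.min(scores[dim] ?? 0, max), 0);
  return { total, badge: badgeFor(total) };
}

// Example: 9 + 7 + 5 + 3 = 24 points, which falls in the Gold band.
console.log(scoreEvaluation({ transparent: 9, reproducible: 7, understandable: 5, executable: 3 }));
```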
- Predefined Model Templates: Quick evaluation of popular models (Mistral, LLaMA, Falcon, etc.)
- Custom Model URL Input: Evaluate any model by providing its repository URL
- Auto-Analysis: Attempts to automatically detect common openness indicators
- Interactive Scoring: Click-based evaluation with evidence tracking
- Leaderboard: Ranked list of evaluated models with filtering
- Modification Tracking: Edit previous evaluations with history tracking
- Multiple Persistence Options:
  - Local browser storage (default)
  - Google Forms integration (optional)
  - JSON export/import
- Open the tool in your browser
- Choose an evaluation method:
  - Select a predefined model from the dropdown, or
  - Enter a custom GitHub/Hugging Face repository URL
- Click "Start Evaluation"
- Check the criteria that the model meets
- Add evidence URLs for validation
- Save your evaluation
For custom URLs, click "Auto-Analyze Repository" to attempt automatic detection of:
- License files
- Model weights
- Training/inference code
- Documentation
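As an illustration of what such detection could look like for GitHub URLs, the sketch below lists a repository's top-level files through GitHub's public contents API and matches them against common filename patterns. The patterns and function name are assumptions, not the tool's actual logic.

```javascript
// Hypothetical openness heuristic: list a GitHub repo's top-level files and
// check for common indicators by filename. Patterns are illustrative only.
async function autoAnalyzeGitHubRepo(owner, repo) {
  const res = await fetch(`https://api.github.com/repos/${owner}/${repo}/contents/`);
  if (!res.ok) throw new Error(`GitHub API error: ${res.status}`);
  const files = (await res.json()).map((f) => f.name.toLowerCase());

  const hasAny = (patterns) => files.some((name) => patterns.some((p) => name.includes(p)));
  return {
    license:       hasAny(["license", "copying"]),
    modelWeights:  hasAny([".safetensors", ".gguf", ".pt", ".bin"]),
    trainingCode:  hasAny(["train", "finetune"]),
    inferenceCode: hasAny(["inference", "generate"]),
    documentation: hasAny(["readme", "model_card", "docs"]),
  };
}

// Usage: autoAnalyzeGitHubRepo("<owner>", "<repo>").then(console.log);
```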
- Evaluations are saved automatically in your browser's local storage
- Data persists across sessions
- Data stays private to your device
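Conceptually, the default persistence boils down to something like the sketch below; the storage key and record fields are assumptions, not the tool's real schema.

```javascript
// Illustrative local-storage persistence; key name and record shape are assumed.
const STORAGE_KEY = "true-framework-evaluations";

function loadEvaluations() {
  return JSON.parse(localStorage.getItem(STORAGE_KEY) ?? "[]");
}

function saveEvaluation(evaluation) {
  const all = loadEvaluations();
  all.push({ ...evaluation, savedAt: new Date().toISOString() });
  localStorage.setItem(STORAGE_KEY, JSON.stringify(all));
}

// Example: saveEvaluation({ model: "Mistral-7B", total: 24, badge: "Gold", evidence: [] });
```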
- Create a Google Form with appropriate fields
- Click "Setup" in persistence options
- Enter your form's submission URL
- Evaluations will be sent to your form
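For reference, submitting to a Google Form from the browser generally looks like the sketch below; the form ID and entry IDs are placeholders taken from your own form, not values used by the tool.

```javascript
// Hypothetical Google Form submission; <FORM_ID> and the entry IDs are placeholders.
async function sendToGoogleForm(evaluation) {
  const formUrl = "https://docs.google.com/forms/d/e/<FORM_ID>/formResponse";
  const body = new URLSearchParams({
    "entry.1111111111": evaluation.model,
    "entry.2222222222": String(evaluation.total),
    "entry.3333333333": evaluation.badge,
  });
  // Google Forms does not return CORS headers, so the request is fired
  // "no-cors" and the response cannot be inspected.
  await fetch(formUrl, { method: "POST", mode: "no-cors", body });
}
```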
- Export all evaluations as JSON
- Import evaluations from JSON files
- Share evaluations across devices
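A minimal sketch of browser-side export and import, assuming the helper names and file name shown here (they are not the tool's actual identifiers):

```javascript
// Illustrative JSON export/import helpers.
function exportEvaluations(evaluations) {
  const blob = new Blob([JSON.stringify(evaluations, null, 2)], { type: "application/json" });
  const link = document.createElement("a");
  link.href = URL.createObjectURL(blob);
  link.download = "true-evaluations.json";
  link.click();
  URL.revokeObjectURL(link.href);
}

async function importEvaluations(file) {
  // `file` comes from an <input type="file"> change event.
  return JSON.parse(await file.text());
}
```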
- Fork this repository
- Go to Settings → Pages
- Set source to main branch, root folder
- Your site will be available at:
https://[username].github.io/true_framework/
Simply open index.html in a web browser. No build process required!
- Add a CNAME file with your domain
- Configure DNS settings
- Enable HTTPS in GitHub Pages settings
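For example, the CNAME file contains a single line with your domain (e.g. `true.example.com`, a placeholder here).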
Advantages:
- No coding required
- Free with Google account
- Automatic spreadsheet integration
- Built-in timestamp and validation
Setup:
- Create a Google Form with fields matching the evaluation data
- Get the form's prefilled URL
- Extract the entry IDs (see the sketch after this list)
- Configure them in the app
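The entry IDs appear as `entry.<number>` parameters in the prefilled URL, so a small helper like the one below can pull them out; the URL in the usage comment is a made-up example.

```javascript
// Extract Google Forms entry IDs from a prefilled URL.
function extractEntryIds(prefilledUrl) {
  const params = new URL(prefilledUrl).searchParams;
  return [...params.keys()].filter((key) => key.startsWith("entry."));
}

// extractEntryIds("https://docs.google.com/forms/d/e/<FORM_ID>/viewform?usp=pp_url&entry.1111111111=Mistral-7B")
//   => ["entry.1111111111"]
```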
Create automated persistence using GitHub Actions:
- Store evaluations as JSON in repository
- Use GitHub API for updates
- Maintain version history
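One way to realize the "GitHub API for updates" step is to write the evaluations JSON through the repository contents endpoint, roughly as sketched below; the owner, repo, path, and token handling are placeholders, not part of the tool.

```javascript
// Hypothetical sketch: store evaluations as a JSON file in a repo via the
// GitHub contents API. Requires a token with write access to the repository.
async function saveEvaluationsToRepo(owner, repo, path, evaluations, token) {
  const api = `https://api.github.com/repos/${owner}/${repo}/contents/${path}`;
  const headers = { Authorization: `Bearer ${token}`, Accept: "application/vnd.github+json" };

  // Fetch the current file (if it exists) to obtain the SHA required for updates.
  const current = await fetch(api, { headers });
  const sha = current.ok ? (await current.json()).sha : undefined;

  const res = await fetch(api, {
    method: "PUT",
    headers,
    body: JSON.stringify({
      message: "Update TRUE evaluations",
      content: btoa(JSON.stringify(evaluations, null, 2)), // btoa expects ASCII-safe content
      sha, // undefined is dropped by JSON.stringify when creating a new file
    }),
  });
  if (!res.ok) throw new Error(`GitHub API error: ${res.status}`);
}
```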
For a more robust backend, a managed backend service can provide:
- Real-time synchronization
- User authentication
- Advanced querying
- Scalable storage
Alternatively, deploy a simple API using serverless functions:
- Vercel/Netlify Functions
- AWS Lambda
- Google Cloud Functions
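As one possible shape, a Vercel-style Node function that accepts posted evaluations might look like the sketch below; the endpoint path, validation rules, and stubbed storage call are all assumptions.

```javascript
// Hypothetical serverless endpoint (e.g. api/evaluations.js on Vercel) that
// accepts a posted evaluation; the storage call is a stub for your own backend.
export default async function handler(req, res) {
  if (req.method !== "POST") {
    return res.status(405).json({ error: "Only POST is supported" });
  }
  const evaluation = req.body; // JSON bodies are parsed automatically
  if (!evaluation?.model || typeof evaluation.total !== "number") {
    return res.status(400).json({ error: "model and total are required" });
  }
  // await saveToYourStore(evaluation); // stub: plug in your storage of choice
  return res.status(200).json({ ok: true });
}
```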
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
- GitHub API integration for automatic repository analysis
- Batch evaluation mode
- Comparison view for multiple models
- Export to standardized report format
- Community voting on evaluations
- Historical score tracking
- API endpoint for programmatic access
- Integration with model registries
MIT License - See LICENSE file for details
Based on the TRUE Framework specification for evaluating open LLM reproducibility.
For issues or questions, please open an issue on GitHub.