Lists (3)
- Agents - RAGs - MCP servers
- LLM Inference, Alignment, DPO plus Preference Tuning + Reward Optimization, High-Throughput, RL, HPO, KV Cache compression, Reasoning methods, Jailbreaking, TTT and Computation
- LLM integrations - LAM - GenAI + Action LMs, ecosystems, voice features, TTS, Audio

Stars
✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models
RewardBench: the first evaluation tool for reward models.
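The gist of a RewardBench-style check is pairwise accuracy: a reward model should score the chosen response above the rejected one. A minimal sketch of that measurement (the checkpoint name and toy data are placeholders, not RewardBench's actual pipeline):

```python
# RewardBench-style evaluation sketch: a reward model is judged by how often
# it rates the chosen response above the rejected one on preference pairs.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "your-org/your-reward-model"  # hypothetical checkpoint
tok = AutoTokenizer.from_pretrained(MODEL)
rm = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=1).eval()

def score(prompt: str, response: str) -> float:
    # Many reward models score a prompt/response pair with a scalar head.
    inputs = tok(prompt, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return rm(**inputs).logits[0, 0].item()

pairs = [  # toy preference pairs (chosen vs. rejected)
    {"prompt": "What is 2+2?", "chosen": "4.", "rejected": "5."},
]
acc = sum(score(p["prompt"], p["chosen"]) > score(p["prompt"], p["rejected"])
          for p in pairs) / len(pairs)
print(f"pairwise accuracy: {acc:.2f}")
```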
A resource repository for representation engineering in large language models
A resource repository for machine unlearning in large language models
Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
[NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"
[ CVPR 2024 ] Implementation for "GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation"
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
[ACL2025] Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models
[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Source code for the paper "Eraser: Jailbreaking defense in large language models via unlearning harmful knowledge".
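A common primitive behind unlearning-based defenses like Eraser is a gradient-ascent step on a "forget" set paired with ordinary descent on a "retain" set. A generic sketch under that assumption (not Eraser's actual objective; the HF-style `.loss` interface and `alpha` weight are placeholders):

```python
# Generic unlearning primitive: ascend on the forget set (raise the model's
# loss on harmful completions) while descending on a retain set to preserve
# general ability. Sketch only; Eraser's real objective lives in the repo.
import torch

def unlearning_step(model, forget_batch, retain_batch, optimizer, alpha=0.5):
    forget_loss = model(**forget_batch).loss    # loss on content to remove
    retain_loss = model(**retain_batch).loss    # loss on content to keep
    loss = -alpha * forget_loss + retain_loss   # ascend on forget, descend on retain
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return forget_loss.item(), retain_loss.item()
```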
The fastai book, published as Jupyter Notebooks
Convert TensorFlow, Keras, TensorFlow.js and TFLite models to ONNX
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
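Tree of Thoughts searches over intermediate reasoning steps instead of sampling a single chain. A rough breadth-first sketch of that idea, where `propose` and `score` stand in for LLM calls (placeholders, not the paper's code):

```python
# Breadth-first Tree-of-Thoughts sketch: expand each partial "thought" into
# candidates, keep the top-scoring beam, repeat to a fixed depth.
from typing import Callable, List

def tree_of_thoughts(
    problem: str,
    propose: Callable[[str, str], List[str]],  # (problem, partial) -> next steps
    score: Callable[[str, str], float],        # (problem, partial) -> value estimate
    beam_width: int = 3,
    depth: int = 3,
) -> str:
    frontier = [""]  # start from an empty chain of thought
    for _ in range(depth):
        candidates = [
            partial + step
            for partial in frontier
            for step in propose(problem, partial)
        ]
        # Keep only the most promising partial solutions (the "beam").
        frontier = sorted(candidates, key=lambda c: score(problem, c),
                          reverse=True)[:beam_width]
    return frontier[0]
```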
PyTorch implementations of deep reinforcement learning algorithms and environments
First token cutoff sampling inference example
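The exact sampler lives in that repo; one common cutoff-style scheme (often called min-p) drops tokens whose probability falls below a fraction of the top token's. A sketch of that variant, offered only as an illustration:

```python
# Cutoff-style sampler (min-p variant): mask tokens whose probability is below
# `cutoff` times the most likely token's, then sample from the renormalized
# remainder. Illustrative only; the repo above may implement a different scheme.
import torch

def cutoff_sample(logits: torch.Tensor, cutoff: float = 0.1) -> int:
    probs = torch.softmax(logits, dim=-1)
    keep = probs >= cutoff * probs.max()          # threshold relative to top token
    filtered = torch.where(keep, probs, torch.zeros_like(probs))
    filtered = filtered / filtered.sum()          # renormalize surviving mass
    return int(torch.multinomial(filtered, 1).item())

logits = torch.randn(32_000)  # toy vocabulary
print(cutoff_sample(logits))
```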
Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).
A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.
Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
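The PPO entries above all revolve around the clipped surrogate objective, L(θ) = -E[min(r_t A_t, clip(r_t, 1-ε, 1+ε) A_t)] with ratio r_t = π_θ(a|s) / π_old(a|s). A minimal PyTorch sketch of that loss (not taken from any of those repos):

```python
# Clipped PPO surrogate loss: the new/old policy probability ratio is clipped
# to [1 - eps, 1 + eps] so one update cannot move the policy too far from the
# data-collecting policy. Sketch only; see the repos above for full loops.
import torch

def ppo_clip_loss(
    new_logp: torch.Tensor,    # log pi_new(a|s) for the sampled actions
    old_logp: torch.Tensor,    # log pi_old(a|s), detached from the graph
    advantages: torch.Tensor,  # advantage estimates (e.g., from GAE)
    eps: float = 0.2,
) -> torch.Tensor:
    ratio = torch.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Maximizing the surrogate == minimizing its negative.
    return -torch.min(unclipped, clipped).mean()
```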
An implementation of Everything of Thoughts (XoT).
[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
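Quest's key idea is query-aware page selection: each KV-cache page keeps elementwise min/max over its keys, and max(q·min, q·max) summed over dimensions upper-bounds any attention logit in that page, so only the top-k pages need to be read. A toy sketch of the selection step (assumed shapes, not the paper's kernels):

```python
# Quest-style page selection sketch: rank KV-cache pages by an upper bound on
# their attention logits for the current query, and keep only the top-k.
import torch

def select_pages(q: torch.Tensor, keys: torch.Tensor, page: int = 16, k: int = 4):
    n_pages = keys.shape[0] // page
    paged = keys[: n_pages * page].view(n_pages, page, -1)
    kmin, kmax = paged.min(dim=1).values, paged.max(dim=1).values  # per-page bounds
    bound = torch.maximum(q * kmin, q * kmax).sum(dim=-1)          # logit upper bound
    return bound.topk(min(k, n_pages)).indices                     # pages worth reading

q = torch.randn(64)          # one query vector (head_dim = 64)
keys = torch.randn(256, 64)  # cached keys for 256 tokens
print(select_pages(q, keys))
```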
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
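A semantic cache generalizes exact-match caching by embedding queries and reusing a stored answer when a new query is similar enough to a cached one. A minimal sketch with a placeholder embedder and threshold:

```python
# Semantic cache sketch: embed each query; if a previous query's embedding is
# close enough (cosine similarity), return its cached answer instead of
# calling the LLM. `embed` and `threshold` are placeholders.
import numpy as np

class SemanticCache:
    def __init__(self, embed, threshold: float = 0.9):
        self.embed, self.threshold = embed, threshold
        self.entries = []  # list of (embedding, answer) pairs

    def get(self, query: str):
        q = self.embed(query)
        for e, answer in self.entries:
            sim = float(q @ e / (np.linalg.norm(q) * np.linalg.norm(e)))
            if sim >= self.threshold:  # close enough: reuse the cached answer
                return answer
        return None

    def put(self, query: str, answer: str) -> None:
        self.entries.append((self.embed(query), answer))

# Usage with a toy embedder (swap in a real sentence encoder in practice):
cache = SemanticCache(embed=lambda s: np.ones(8) * len(s))
cache.put("what is DPO?", "Direct Preference Optimization ...")
print(cache.get("what is DPO?"))
```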


