Pull requests: huggingface/trl
[GRPO] Pad per minibatch instead of per generation batch
#3495
opened May 26, 2025 by
edbeeching
•
Draft
3 tasks
Allow a user to train from a local dataset
#3470
opened May 19, 2025 by
gogo2464
1 of 5 tasks
Add async tool-enabled vLLM server for GRPO training via OpenAI-compatible interface
#3469
opened May 19, 2025 by
BjarniHaukur
5 tasks
Fix GRPO/DAPO/Dr.GRPO documentation: formula corrections and KL divergence clarification
#3395
opened Apr 30, 2025 by
JenWei0312
1 of 5 tasks
Reintroduce generate method for PPOTrainer
#3374
opened Apr 27, 2025 by
CloseChoice
4 tasks done
[Feat] Support SGLang as rollout engine of GRPO trainer
#3370
opened Apr 27, 2025 by
ryang-max
2 of 8 tasks
Allow for saving the PPOTrainer value model (critic model)
#3308
opened Apr 16, 2025 by
AMindToThink
Support for Models With Pre-Finetuned LoRA Adapters in GRPO: Add use_peft_as_reference Flag
#3196
opened Mar 31, 2025 by
LoganVegnaSHOP
5 tasks done
GRPO: Scalable training with one LLM/node
#3186
opened Mar 31, 2025 by
jglaser
3 of 5 tasks
🚀 Enhance GRPO VLLM server from sync to async and accelerate training
#3182
opened Mar 30, 2025 by
binary-husky
Fixing GRPO reward_func being a model with DeepSpeed ZeRO-3
#2984
opened Feb 28, 2025 by
jamesbraza
Add the metrics completion_length_max and completion_length_min
#2930
opened Feb 22, 2025 by
dignfei
4 tasks