Pull requests: huggingface/trl

Pull requests list

Fix: undefined current_gradient_accumulation_steps
#4014 opened Sep 5, 2025 by ysjprojects
2 of 5 tasks
[DRAFT] Refactor DPO
#3906 opened Aug 15, 2025 by qgallouedec Draft
6 tasks
Test in distributed setting
#3902 opened Aug 15, 2025 by qgallouedec
5 tasks
Implement DPOP
#3864 opened Aug 7, 2025 by 1485840691
dynamic temperature
#3844 opened Aug 4, 2025 by shirinyamani Draft
5 tasks
[GSPO]: Refactor _compute_loss
#3835 opened Aug 1, 2025 by pramodith
2 of 5 tasks
Dynamic sampling option in GRPO trainer based on DAPO paper
#3758 opened Jul 23, 2025 by almeidava93
2 of 5 tasks
Support dLLM in GRPO reference model creation
#3743 opened Jul 18, 2025 by xijia-tao
Add basic support for FSDP/Lora when using TRL/VLLM
#3735 opened Jul 14, 2025 by ojh31
5 tasks
feat: Initial implementation of RePO trainer and components
#3655 opened Jun 26, 2025 by celsowm
5 tasks
[WIP] vllm-server-spec-dec-support
#3643 opened Jun 24, 2025 by shirinyamani
5 tasks
GRPO: Pack Responses within the same group.
#3642 opened Jun 24, 2025 by pramodith Draft
4 of 5 tasks
Add Entropy Control to GRPOTrainer
#3628 opened Jun 22, 2025 by 1485840691
Feature: Add SGLang support for GRPO Trainer
#3627 opened Jun 21, 2025 by PrinsYin Draft
5 tasks
ClearML logging of visualization in RewardTrainer evaluation
#3602 opened Jun 16, 2025 by ioverho
2 of 5 tasks
Fix: corrected fsdp in GRPO trainer
#3582 opened Jun 13, 2025 by tryumanshow
2 of 5 tasks
Chisquare regularized DPO
#3573 opened Jun 12, 2025 by asparius
Add Bidirectional Knowledge Distillation Option to GKDTrainer
#3508 opened May 29, 2025 by shaischaudhry
3 of 5 tasks