Pull requests: huggingface/trl

Pull requests list

Add the metrics completion_length_max and completion_length_min
#2930 opened Feb 22, 2025 by dignfei
4 tasks
Add a raw generate API to the vLLM server
#3227 opened Apr 3, 2025 by wilrop
5 tasks
[WIP] Iterative training scripts for SPIN and SPPO
#3011 opened Mar 5, 2025 by jkx19 Draft
3 of 5 tasks
[WIP] vllm-server-spec-dec-support
#3643 opened Jun 24, 2025 by shirinyamani
5 tasks
dynamic temperature
#3844 opened Aug 4, 2025 by shirinyamani Draft
5 tasks
ClearML logging of visualization in RewardTrainer evaluation
#3602 opened Jun 16, 2025 by ioverho
2 of 5 tasks
Add basic support for FSDP/Lora when using TRL/VLLM
#3735 opened Jul 14, 2025 by ojh31
5 tasks
Implement DPOP
#3864 opened Aug 7, 2025 by 1485840691
Support dLLM in GRPO reference model creation
#3743 opened Jul 18, 2025 by xijia-tao
Add config_init_kwargs option in GRPOConfig
#4069 opened Sep 12, 2025 by hokuyama0106
2 of 5 tasks
Reintroduce generate method for PPOTrainer
#3374 opened Apr 27, 2025 by CloseChoice
4 tasks done
Allow an user to train from a local dataset
#3470 opened May 19, 2025 by gogo2464
1 of 5 tasks
Test in distributed setting
#3902 opened Aug 15, 2025 by qgallouedec
5 tasks
Online-dpo-ben
#4252 opened Oct 10, 2025 by burtenshaw Draft
5 tasks
Update max_length explanation for VLM trainers
#4220 opened Oct 7, 2025 by sergiopaniego
5 tasks
Reintroducing step method in ppo_trainer
#3410 opened May 3, 2025 by jskaf34
2 of 5 tasks
add vllm support for token ids as input
#3280 opened Apr 11, 2025 by wybryan
GRPO: Scalable training with one LLM/node
#3186 opened Mar 31, 2025 by jglaser
3 of 5 tasks