generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
Add On-Policy Distillation from thinking labs to paper index.
#4410
opened Oct 30, 2025 by
pramodith
4 of 5 tasks
Add tip for logging evaluation metrics during regular evaluations
#4367
opened Oct 29, 2025 by
cam1llynha
[OpenENV] Openenv rollout_func signature proposal
#4344
opened Oct 27, 2025 by
kashif
5 tasks
Use explicit tiny-Qwen2ForCausalLM-2.5 model_id param in CI tests
#4331
opened Oct 23, 2025 by
albertvillanova
refactor: simplify parameter freezing in modeling_base.py
#4305
opened Oct 20, 2025 by
Ki-Seki
2 of 5 tasks
[SFT] Log mean token accuracy from Liger kernel
#4302
opened Oct 18, 2025 by
kashif
5 tasks
feat: Add Multi-Token Prediction (MTP) support to SFTTrainer
#4290
opened Oct 15, 2025 by
KLGR123
Remove FSDP1 support: use FSDP2 exclusively
#4260
opened Oct 11, 2025 by
behroozazarkhalili
Fix DPO Trainer Bug For Qwen2-VL (Issue 2660)
#4257
opened Oct 11, 2025 by
FabianSchuetze
1 of 3 tasks
update guided decoding param to structured outputs
#4117
opened Sep 22, 2025 by
jiqing-feng
feat:add support for 'image_grid_thw'(QwenVL) in DPOTrainer
#4091
opened Sep 15, 2025 by
ycma8
2 of 5 tasks
Add config_init_kwargs option in GRPOConfig
#4069
opened Sep 12, 2025 by
hokuyama0106
2 of 5 tasks
[Draft] Add configurable dataset column logging to GRPOTrainer W&B tables
#4045
opened Sep 9, 2025 by
davanstrien
Draft
Fix #3982: Fix DPO Trainer support for Gemma 3 vision models
#4022
opened Sep 6, 2025 by
akshay-babbar
Fix: undefined current_gradient_accumulation_steps
#4014
opened Sep 5, 2025 by
ysjprojects
2 of 5 tasks
Enable saving and loading precomputed reference log probabilities in …
#3986
opened Sep 1, 2025 by
ginkyenglee
3 tasks