huggingface / trl Public

generated from fastai/nbdev_template

Fork 2.3k
Star 16.1k

Code
Issues 510
Pull requests 75
Discussions
Actions
Projects
Security
Insights

Pull requests: huggingface/trl

Labels 34 Milestones 0

68 Open 508 Closed

Pull requests list

[ALST/Ulysses] Added ALST/Ulysses documentation

#4420 opened Nov 1, 2025 by kashif • Draft

5 tasks

Updated OpenEnv docs

#4418 opened Oct 31, 2025 by sergiopaniego • Draft

8 tasks

Add On-Policy Distillation from thinking labs to paper index.

#4410 opened Oct 30, 2025 by pramodith

4 of 5 tasks

Gold refactor

#4373 opened Oct 29, 2025 by qgallouedec • Draft

5 tasks

Add tip for logging evaluation metrics during regular evaluations

#4367 opened Oct 29, 2025 by cam1llynha

[OpenENV] Openenv rollout_func signature proposal

#4344 opened Oct 27, 2025 by kashif

5 tasks

Use explicit tiny-Qwen2ForCausalLM-2.5 model_id param in CI tests

#4331 opened Oct 23, 2025 by albertvillanova

wip - env

#4320 opened Oct 22, 2025 by qgallouedec

5 tasks

refactor: simplify parameter freezing in modeling_base.py

#4305 opened Oct 20, 2025 by Ki-Seki

2 of 5 tasks

[SFT] Log mean token accuracy from Liger kernel

#4302 opened Oct 18, 2025 by kashif

5 tasks

Tool call

#4300 opened Oct 18, 2025 by qgallouedec • Draft

5 tasks

feat: Add Multi-Token Prediction (MTP) support to SFTTrainer

#4290 opened Oct 15, 2025 by KLGR123

Remove FSDP1 support: use FSDP2 exclusively

#4260 opened Oct 11, 2025 by behroozazarkhalili

Fix DPO Trainer Bug For Qwen2-VL (Issue 2660)

#4257 opened Oct 11, 2025 by FabianSchuetze

1 of 3 tasks

Online-dpo-ben

#4252 opened Oct 10, 2025 by burtenshaw • Draft

5 tasks

Add support for Python 3.14

#4225 opened Oct 8, 2025 by albertvillanova

Add trust_remote_code to GRPOConfig

#4186 opened Oct 1, 2025 by muupan

3 of 4 tasks

update guided decoding param to structured outputs

#4117 opened Sep 22, 2025 by jiqing-feng

feat:add support for 'image_grid_thw'(QwenVL) in DPOTrainer

#4091 opened Sep 15, 2025 by ycma8

2 of 5 tasks

Add config_init_kwargs option in GRPOConfig

#4069 opened Sep 12, 2025 by hokuyama0106

2 of 5 tasks

[Draft] Add configurable dataset column logging to GRPOTrainer W&B tables

#4045 opened Sep 9, 2025 by davanstrien • Draft

Fix #3982: Fix DPO Trainer support for Gemma 3 vision models

#4022 opened Sep 6, 2025 by akshay-babbar

Fix: undefined current_gradient_accumulation_steps

#4014 opened Sep 5, 2025 by ysjprojects

2 of 5 tasks

Remove attention mask when position ids is returned

#3997 opened Sep 2, 2025 by qgallouedec • Draft

Enable saving and loading precomputed reference log probabilities in …

#3986 opened Sep 1, 2025 by ginkyenglee

3 tasks
