generated from fastai/nbdev_template
Open
16 of 18 issues completed
Labels
🏋 DPO, 🏋 DPPO, 🏋 GKD, 🏋 GRPO, 🏋 Iterative SFT, 🏋 KTO, 🏋 ORPO, 🏋 Online DPO, 🏋 PPO, 🏋 PRM, 🏋 RLOO, 🏋 Reward, 🏋 SFT, 🏋 XPO, 🐛 bug
Description
Caused by huggingface/transformers#35651, which adds a new condition for scaling the loss.
Spotted in huggingface/transformers#35856.
For each trainer, check whether it has the same issue and, if so, fix it.
ahans30
Sub-issues