Commit 6bb00bc
fix(trainer): Correct loss scaling for incomplete gradient accumulation steps (huggingface#39659)
* Fix issue huggingface#38837: wrong loss scaling in last step of epoch
* chore: trigger CI
* Update src/transformers/trainer.py
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Update src/transformers/modeling_flash_attention_utils.py
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
---------
Co-authored-by: taihang <taihang@U-2RHYVWX7-2207.local>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
1 parent 3cb9c61 commit 6bb00bc
1 file changed: +5 -1 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 2530 | 2530 | |
| 2531 | 2531 | |
| 2532 | 2532 | |
| | 2533 | + |
| | 2534 | + |
| | 2535 | + |
| 2533 | 2536 | |
| 2534 | 2537 | |
| 2535 | 2538 | |
| | | |
| 3830 | 3833 | |
| 3831 | 3834 | |
| 3832 | 3835 | |
| 3833 | | - |
| | 3836 | + |
| | 3837 | + |
| 3834 | 3838 | |
| 3835 | 3839 | |
| 3836 | 3840 | |
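The changed lines themselves are not reproduced here, so the following is only an illustrative sketch of the problem named in the commit message, not the actual patch. It assumes the per-micro-batch loss is divided by the configured number of accumulation steps, and that the last optimizer step of an epoch may see fewer micro-batches than that. The function `accumulate` and its parameters are hypothetical names chosen for this example and are not part of `transformers`.

```python
def accumulate(losses, grad_accum_steps, fix_last_window):
    """Return the accumulated (scaled) loss for each optimizer step.

    losses: per-micro-batch mean losses for one epoch (hypothetical data).
    grad_accum_steps: configured number of micro-batches per optimizer step.
    fix_last_window: if True, divide by the actual number of micro-batches
        in the window instead of the configured value.
    """
    step_losses = []
    for start in range(0, len(losses), grad_accum_steps):
        window = losses[start:start + grad_accum_steps]
        # The last window of the epoch may be shorter than grad_accum_steps.
        divisor = len(window) if fix_last_window else grad_accum_steps
        step_losses.append(sum(loss / divisor for loss in window))
    return step_losses


# 10 micro-batches with 4 per optimizer step give windows of 4, 4 and 2.
losses = [2.0] * 10
naive = accumulate(losses, 4, fix_last_window=False)
fixed = accumulate(losses, 4, fix_last_window=True)
print(naive)  # [2.0, 2.0, 1.0]  last step is under-scaled by half
print(fixed)  # [2.0, 2.0, 2.0]
```

With a constant micro-batch loss of 2.0, dividing by the configured step count halves the reported loss for the incomplete final window, while dividing by the actual window size keeps every optimizer step on the same scale.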