Skip to content

Conversation

@leizhenyuan
Copy link
Contributor

change torch.cuda.get_current_device depends on DEVICE_TYPE.

@leizhenyuan
Copy link
Contributor Author

@danielhanchen Pls help review, thanks.

Copy link
Contributor

@danielhanchen danielhanchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I moved the get_current_device as a global func

@danielhanchen danielhanchen merged commit a2d41ea into unslothai:main Jul 22, 2025
Datta0 pushed a commit to Datta0/unsloth that referenced this pull request Jul 24, 2025
* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>
danielhanchen added a commit that referenced this pull request Sep 16, 2025
* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Sep 22, 2025
* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Sep 26, 2025
* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Sep 30, 2025
* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Oct 1, 2025
* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Oct 5, 2025
* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Oct 16, 2025
* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

* Update llama.py

* Versioning

* Update _utils.py

* Update llama.py

* Update _utils.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Oct 16, 2025
* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

* Update llama.py

* Versioning

* Update _utils.py

* Update llama.py

* Update _utils.py

* Update llama.py

* Fix AMD

* Update _utils.py

* Update llama.py

* Update vision.py

* DEVICE_TYPE_TORCH

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Oct 17, 2025
* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

* Update llama.py

* Versioning

* Update _utils.py

* Update llama.py

* Update _utils.py

* Update llama.py

* Fix AMD

* Update _utils.py

* Update llama.py

* Update vision.py

* DEVICE_TYPE_TORCH

* Update __init__.py

* Update __init__.py

* Update _utils.py

* Move DEVICE_TYPE

* Update rl_replacements.py

* Update loader.py

* AMD install script

* Move AMD

* Update _amd_install.sh

* Update pyproject.toml

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Oct 17, 2025
* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

* Update llama.py

* Versioning

* Update _utils.py

* Update llama.py

* Update _utils.py

* Update llama.py

* Fix AMD

* Update _utils.py

* Update llama.py

* Update vision.py

* DEVICE_TYPE_TORCH

* Update __init__.py

* Update __init__.py

* Update _utils.py

* Move DEVICE_TYPE

* Update rl_replacements.py

* Update loader.py

* AMD install script

* Move AMD

* Update _amd_install.sh

* Update pyproject.toml

* Update pyproject.toml

* Delete _amd_install.sh

* Update device_type.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Oct 20, 2025
* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

* Update llama.py

* Versioning

* Update _utils.py

* Update llama.py

* Update _utils.py

* Update llama.py

* Fix AMD

* Update _utils.py

* Update llama.py

* Update vision.py

* DEVICE_TYPE_TORCH

* Update __init__.py

* Update __init__.py

* Update _utils.py

* Move DEVICE_TYPE

* Update rl_replacements.py

* Update loader.py

* AMD install script

* Move AMD

* Update _amd_install.sh

* Update pyproject.toml

* Update pyproject.toml

* Delete _amd_install.sh

* Update device_type.py

* Update loader.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update tokenizer_utils.py

* Versioning

* Update pyproject.toml

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Oct 20, 2025
* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

* Update llama.py

* Versioning

* Update _utils.py

* Update llama.py

* Update _utils.py

* Update llama.py

* Fix AMD

* Update _utils.py

* Update llama.py

* Update vision.py

* DEVICE_TYPE_TORCH

* Update __init__.py

* Update __init__.py

* Update _utils.py

* Move DEVICE_TYPE

* Update rl_replacements.py

* Update loader.py

* AMD install script

* Move AMD

* Update _amd_install.sh

* Update pyproject.toml

* Update pyproject.toml

* Delete _amd_install.sh

* Update device_type.py

* Update loader.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update tokenizer_utils.py

* Versioning

* Update pyproject.toml

* Update loader.py

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update _utils.py

* Update loader.py

* Update _utils.py

* Update _utils.py

* local_files_only

* Cut Cross Entropy

* Update llama.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Oct 30, 2025
* Update loader.py

* Update vision.py

* Update vision.py

* custom_datatype

* recheck

* Float16

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

* Update llama.py

* Versioning

* Update _utils.py

* Update llama.py

* Update _utils.py

* Update llama.py

* Fix AMD

* Update _utils.py

* Update llama.py

* Update vision.py

* DEVICE_TYPE_TORCH

* Update __init__.py

* Update __init__.py

* Update _utils.py

* Move DEVICE_TYPE

* Update rl_replacements.py

* Update loader.py

* AMD install script

* Move AMD

* Update _amd_install.sh

* Update pyproject.toml

* Update pyproject.toml

* Delete _amd_install.sh

* Update device_type.py

* Update loader.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update tokenizer_utils.py

* Versioning

* Update pyproject.toml

* Update loader.py

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update _utils.py

* Update loader.py

* Update _utils.py

* Update _utils.py

* local_files_only

* Cut Cross Entropy

* Update llama.py

* Update vision.py

* Update vision.py

* Update vision.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Nov 3, 2025
* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Bug fix

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* torch_dtype

* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

* Update llama.py

* Versioning

* Update _utils.py

* Update llama.py

* Update _utils.py

* Update llama.py

* Fix AMD

* Update _utils.py

* Update llama.py

* Update vision.py

* DEVICE_TYPE_TORCH

* Update __init__.py

* Update __init__.py

* Update _utils.py

* Move DEVICE_TYPE

* Update rl_replacements.py

* Update loader.py

* AMD install script

* Move AMD

* Update _amd_install.sh

* Update pyproject.toml

* Update pyproject.toml

* Delete _amd_install.sh

* Update device_type.py

* Update loader.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update tokenizer_utils.py

* Versioning

* Update pyproject.toml

* Update loader.py

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update _utils.py

* Update loader.py

* Update _utils.py

* Update _utils.py

* local_files_only

* Cut Cross Entropy

* Update llama.py

* Update vision.py

* Update vision.py

* Update vision.py

* Qwen 3 VL vLLM (#3489)

* Update __init__.py

* patch_torchao

* torchao_logger

* Update rl_replacements.py

* Fix

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Versioning

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
danielhanchen added a commit that referenced this pull request Nov 7, 2025
* Update rl.py

* Fix CE Loss

* Versioning

* Update loader.py

* Update loader.py

* extract_model_type_from_config

* Model types

* Update loader.py

* get_transformers_model_type

* Update loader.py

* Update loader.py

* Update loader.py

* Update rl.py

* Update pyproject.toml

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update vision.py

* Update vision.py

* Fix DataParallel

* Update _utils.py

* Update rl.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update mapper.py

* Versioning

* Update loader.py

* Update loader.py

* Update rl.py

* Versioning

* Update _utils.py

* Fix auto_mapping

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Message

* Update vision.py

* Update loader.py

* Update vision.py

* cache_implementation

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Save max_seq_length

* Update _utils.py

* Update rl.py

* Update vision.py

* Update llama.py

* Mistral3 vllm (#3349)

* [WIP] use vLLM for vision language models

* Update README.md

Editing icon sizes

* Update README.md

Updating icon sizes

* Update README.md (#2885)

* MoE kernels AGPLv3

* versioning

* Many bug fixes (#2908)

* add deepseek v3

* add deepseek r1 base

* add deepseek r1 zero

* add deepseek distill llama

* add deepseek distill models

* remove redundant code when constructing model names

* add mistral small to registry

* rename model registration methods

* rename deepseek registration methods

* refactor naming for mistral and phi

* add global register models

* refactor model registration tests for new registry apis

* add model search method

* remove deprecated registration api

* add quant type test

* add registry readme

* make llama registration more specific

* clear registry when executing individual model registration file

* more registry readme updates

* Update _auto_install.py

* Llama4

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Synthetic data

* Update mapper.py

* Xet and Synthetic

* Update synthetic.py

* Update loader.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>

* silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912)

* Dynamically adjust get_per_token_logps function and patch as well (#2911)

* add intel gpu with vllm support (#2903)

* [bugs] fix for casual mask (#2868)

* fix for casual mask

* use un_casual in sdpa

* add missing mask

* fix for type

* Explicitly check if xformers exists for attention (#2889)

* Update __init__.py

* Update llama.py

* if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913)

* Move inputs to right devices. (#2919)

* Move tensors to right devices

* fix multi gpu for non mistral models

* multi GPU RoPE for gemma2

* Finish up multi GPU inference

* Make multiGPU rope a list

* Remove unnecessary transfer to CPU

* Remove unnecessary move to CPU

* Donot move inputs to device yet

will be handled separately in another PR

* Move inputs to appropriate decoder device

* Make device count global variable

* Cleanup RoPE device code

* Fixup num_gpu to device count

* Cleanup device counts

* Use device index for RoPE get_cache

* Donot typecast

* Use tuple instead of list for tensors. Use device index directly

* fixup move to device logic

* WIP VLM vLLM

* Make vLLM patch a function

* Add save and load lora functions

* Make fast_inference setup depend on the flag

* Improve fast inference patching mechanism

* Make vision setting depend on checks in fastbasemodel

* Check LoRA and vLLM intercompatibility for vision models

* Comment pointing to vLLM LoRA check

* Improve lora validation on vLLM

* Error out on no vLLM and increase max lora rank

* Bug fixes (#3017)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* fix for casual mask (#3011)

* [intel] add for intel path for llama.py (#3012)

* fix for intel path

* remove unuse code

* Update unsloth/models/llama.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update llama.py

* Fix Gemma 2 (#3024)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* falcon force float32 on sm<75 machines (#3026)

* Fix torch compile issues (#3028)

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update pyproject.toml

* Delete .gitignore

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update _utils.py

* Update pyproject.toml

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update chat_templates.py

* Seasame force float16 / float32

* Fix Seasame

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* is_multimodal

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update vision.py

* UNSLOTH_DISABLE_STATIC_GENERATION

* Update vision.py

* Auto vision detection

* Sesame

* Whisper

* Update loader.py

* Update loader.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

* Update _utils.py

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* logging

* Update pyproject.toml

* Update rl.py

* versioning

* Update rl.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* logits / temperature

* Update rl_replacements.py

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Debugging only

* Update llama.py

* Update llama.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Generic efficient GRPO

* Update rl_replacements.py

* Update rl_replacements.py

* Remove debugging

* Update rl_replacements.py

* Update rl_replacements.py

* Update vision.py

* Update llama.py

* Update rl_replacements.py

* versioning

* Update _utils.py

* Update vision.py

* Update mapper.py

* Update loader.py

* Update mapper.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update loader.py

* Update _utils.py

* Update vision.py

* gradient checkpointing

* Gemma 3N fixes

* Update loader.py

* Versioning

* Gemma 3N fixes

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Fix setup.py

* setup.py

* Prints

* Update setup.py

* Update setup.py

* Update setup.py

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update pyproject.toml

* Update vision.py

* Update vision.py

* Update pyproject.toml

* Update vision.py

* Update _utils.py

* Update __init__.py

* Update __init__.py

* Small fixes

* Update vision.py

* Update vision.py

* versioning

* Update __init__.py

* Update llama.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update vision.py

* Update vision.py

* compiler stance

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Update rl_replacements.py

* Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990)

This reverts commit 204fc46.

* skip_guard_eval_unsafe fix

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update synthetic.py

* Update llama.py

* Update llama.py

* Fix `quantization_method`

* versioning

* Update _utils.py

* Update _utils.py

* Update _utils.py

* check stride

* Cleanup

* Update rope_embedding.py

* Update gemma2.py

* Fix `set_stance`

* Update pyproject.toml

* Update _utils.py

* Fixup patch vllm

* Disable mllama

* Use variables to decide VLM support

* Better attn_impl handling

* Patch TF protobuf incompatability

* Torch 2.8 (#3186)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update _auto_install.py

* Update pyproject.toml

* Update rl.py

* Protobuf issue

* Update pyproject.toml

* Fix extras transformers typo in pyproject.toml

* Update _utils.py

* Bug fixes (#3195)

* Fix mamba

* Update loader.py

* Update vision.py

* Update loader.py

* Filter vLLM standby logs (#3131)

* filter vLLM standby logs

* safeguard standby logger patch

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

* Update unsloth/models/_utils.py

---------

Co-authored-by: Daniel Han <danielhanchen@gmail.com>

* Update loader.py

* Add scaler

* Update llama.py

* Update _utils.py

* Versioning

* GPT OSS fix

* GPT OSS fix

* Update loader.py

* Update vision.py

* Update vision.py

* Update loader.py

* Update vision.py

* Update vision.py

* Update llama.py

* Update llama.py

* Update llama.py

* Versioning

* Update mapper.py

* Update vision.py

* Update vision.py

* Update vision.py

* Upcast norms

* Update loader.py

* Update vision.py

* Upcast layernorms

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update llama.py

* Update save.py

* Update rl.py

* Update pyproject.toml

* Update rl.py

* Update rl_replacements.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Update __init__.py

* Torch 2.8

* Update rl_replacements.py

* Update loader.py

* UNSLOTH_ENABLE_CCE

* Fix

* Update loader.py

* Update loader.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Update __init__.py

* Import fixes

* Update loader.py

* Fix aimv2 issue

* Update loader.py

* Update import_fixes.py

* Update import_fixes.py

* Update loader.py

* Update loader.py

* Update loader.py

* Upgrade

* Update loader.py

* Update loader.py

* Update loader.py

* Update loader.py

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* adallow float32 dtype in FastLanguageModel (#3204)

* Update loader.py

* Update vision.py

* Suppress message and use unsloth sampling params

* Use trl sampling params for now

* Improve error message

* fixup quantized fast inference model name

* Add mistral 3 support

---------

Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>

* Set padding to 0

* Fix patch

* fixup patch (#3359)

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>

* Update vision.py

* Versioning

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* MXFP4 dequant

* Update loader.py

* Update vision.py

* load_in_16bit

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl.py

* Update vision.py

* offload_embedding

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update rl_replacements.py

* Update loader.py

* Fix padding issue

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* Update vision.py

* New models

* Update llama.py

* Versioning

* Update _utils.py

* Update llama.py

* Update _utils.py

* Update llama.py

* Fix AMD

* Update _utils.py

* Update llama.py

* Update vision.py

* DEVICE_TYPE_TORCH

* Update __init__.py

* Update __init__.py

* Update _utils.py

* Move DEVICE_TYPE

* Update rl_replacements.py

* Update loader.py

* AMD install script

* Move AMD

* Update _amd_install.sh

* Update pyproject.toml

* Update pyproject.toml

* Delete _amd_install.sh

* Update device_type.py

* Update loader.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update _utils.py

* Update tokenizer_utils.py

* Versioning

* Update pyproject.toml

* Update loader.py

* Update _utils.py

* Update pyproject.toml

* Update pyproject.toml

* Update _utils.py

* Update pyproject.toml

* Update _utils.py

* Update _utils.py

* Update loader.py

* Update _utils.py

* Update _utils.py

* local_files_only

* Cut Cross Entropy

* Update llama.py

* Update vision.py

* Update vision.py

* Update vision.py

* Qwen 3 VL vLLM (#3489)

* Update __init__.py

* patch_torchao

* torchao_logger

* Update rl_replacements.py

* Fix

* Update rl.py

* Update rl.py

* Update rl.py

* Update rl.py

* Update _utils.py

* Versioning

* fbgemm fp8 block quant support (>=1.4.0) (#3531)

* fbgemm fp8 block quant support (>=1.4.0)

* Verify for fp8 support before proceeding

* Use unsloth zoo's Version and improve comments

* spacessss

* Update vision.py

* Update vision.py

* Update rl.py

* vllm_sampling_params

* Update rl.py

* Update rl.py

* Update rl.py

* Add `ruff` pre-commit hook and apply it (#3424)

* Add Ruff pre-commit config and workflow

* Add kwarg spacing enforcement helper

* Apply Ruff formatting

* Update fp8.py

* Revert ruff on some files

* Update

* force-exclude = true

* Datasets issue

* Ruff

* Remove mapper

* Update mapper.py

* Update pyproject.toml

---------

Co-authored-by: Datta Nimmaturi <venkatadattasainimmaturi@gmail.com>
Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com>
Co-authored-by: jeromeku <jerome.ku@gmail.com>
Co-authored-by: DoubleMathew <mmathew23@gmail.com>
Co-authored-by: Lei Zhenyuan <zhenyuan.lei@intel.com>
Co-authored-by: parth2510 <parthguptapg7326@gmail.com>
Co-authored-by: Dan Saunders <danjsaund@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants