Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 15 additions & 10 deletions unsloth_zoo/saving_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -2068,22 +2068,27 @@ def determine_base_model_source(model_name, token=None):
if local_path:
local_is_quantized, local_quant_type = check_model_quantization_status(local_path)

# Priority 1: Local unquantized OR Local mxfp4
if local_path and (not local_is_quantized or local_quant_type == "mxfp4"):
if not local_is_quantized:
return (local_path, True, "local_unquantized", False, None)
else: # local_quant_type == "mxfp4"
return (local_path, True, "local_mxfp4", True, "mxfp4")

# Priority 2: HF unquantized
# Priority 1: Local unquantized
if local_path and not local_is_quantized:
return (local_path, True, "local_unquantized", False, None)

# Priority 2: Local mxfp4
if local_path and local_is_quantized and local_quant_type == "mxfp4": # local_quant_type == "mxfp4"
return (local_path, True, "local_mxfp4", True, "mxfp4")
Comment on lines +2071 to +2077
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This refactoring to split the checks is a good step for clarity. To further improve it, you could group the high-priority local model checks to avoid repeating if local_path and make the logic more compact.

Suggested change
# Priority 1: Local unquantized
if local_path and not local_is_quantized:
return (local_path, True, "local_unquantized", False, None)
# Priority 2: Local mxfp4
if local_path and local_is_quantized and local_quant_type == "mxfp4": # local_quant_type == "mxfp4"
return (local_path, True, "local_mxfp4", True, "mxfp4")
# Priority 1 & 2: Local models
if local_path:
if not local_is_quantized:
return (local_path, True, "local_unquantized", False, None)
elif local_quant_type == "mxfp4":
return (local_path, True, "local_mxfp4", True, "mxfp4")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two blocks are actually functionally equivalent.


# Priority 3: HF unquantized
if hf_exists and not hf_is_quantized:
return (model_name, False, "HF_unquantized", False, None)

# Priority 3: HF quantized (covers both "both quantized" and "just HF quantized")
# Priority 4: HF quantized (covers both "both quantized" and "just HF quantized")
if hf_exists and hf_is_quantized:
return (model_name, False, f"HF_{hf_quant_type}", True, hf_quant_type)

# Priority 4: Nothing suitable found
# Priority 5: Local other quantization
if local_path and local_is_quantized:
return (local_path, True, f"local_{local_quant_type}", True, local_quant_type)

# Priority 6: Nothing suitable found
Copy link
Collaborator

@rolandtannous rolandtannous Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we've intentionally restricted using a 4bit bnb base model when the merge method is 16bit , because the upcasting from 4bit-> 32 bit to merge will result in a high precision loss, so we don't want the case for Priotrity 5 when we are merging in 16bits.
in fact, we explicitely prohibit this and issue a warning. a 4 bit model should be mostly used for inference , and a 16bit base model should be available if we wish to finetune.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand the reason behind restricting merging of 4bit models, this is not the place where it should happen.
The code calling this has to check the quantization anyway, since a remote model is always returned if found, even if that is 4bit.
Filtering here is just reduntant and makes it more difficult to find out why the merging isn't working, since you get a model not found error instead of a model not compatible error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with a better error message if that's the intended purpose of this change, cause the priority logic seems almost the same (end point is no merging happens) , and i wasn't sure whether there was a functional bug motivating the change, hence had to ask. LGTM if that's the case.

return (None, False, "", False, None)
pass

Expand Down