Skip to content

Conversation

@stephantul
Copy link
Contributor

This PR adds quantization. Quantization can be applied during distillation, or during loading. Both are equivalent, except that distill-time quantization leads to smaller embedding sizes.

@stephantul stephantul requested a review from Pringled April 20, 2025 17:22
@codecov
Copy link

codecov bot commented Apr 20, 2025

Codecov Report

Attention: Patch coverage is 98.07692% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
model2vec/quantization.py 94.73% 1 Missing ⚠️
Files with missing lines Coverage Δ
model2vec/distill/distillation.py 94.85% <100.00%> (+0.11%) ⬆️
model2vec/model.py 94.70% <100.00%> (+0.14%) ⬆️
tests/test_model.py 97.91% <100.00%> (+0.20%) ⬆️
tests/test_quantization.py 100.00% <100.00%> (ø)
model2vec/quantization.py 94.73% <94.73%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Copy link
Member

@Pringled Pringled left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, two questions

return embeddings.astype(np.float64)
elif quantize_to == DType.Int8:
# Normalize to [-127, 127] range for int8
scale = np.max(np.abs(embeddings)) / 127.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this ever be 0 (zero division issues?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if all embeddings are 0

elif quantize_to == DType.Float64:
return embeddings.astype(np.float64)
elif quantize_to == DType.Int8:
# Normalize to [-127, 127] range for int8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this not be [-128, 127] (the range of an 8-bit signed integer)? Not sure if it's relevant for the code though since it doesn't change the division.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the symmetry is more important than making sure the 1 extra value is used. I updated the comment.

@stephantul stephantul merged commit 6731674 into main Apr 21, 2025
5 checks passed
@stephantul stephantul deleted the quantization branch April 21, 2025 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants