feat: add quantization #217

stephantul · 2025-04-20T17:22:17Z

This PR adds quantization. Quantization can be applied either during distillation or during loading. The two are equivalent, except that distill-time quantization leads to smaller embedding sizes.
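To make the size difference concrete, here is a minimal sketch of how the storage footprint of an embedding matrix changes with its dtype. The matrix shape and the `sizes` dictionary are hypothetical and chosen only for illustration; this does not call the model2vec API.

```python
import numpy as np

# Hypothetical embedding matrix: 1000 tokens x 256 dimensions, stored as float32.
embeddings = np.ones((1000, 256), dtype=np.float32)

# Bytes needed to hold the same matrix at different precisions.
sizes = {
    "float32": embeddings.nbytes,
    "float16": embeddings.astype(np.float16).nbytes,
    "int8": embeddings.astype(np.int8).nbytes,
}

for name, n_bytes in sizes.items():
    print(f"{name}: {n_bytes} bytes")
```

Halving the bit width halves the bytes per value, so a matrix quantized before it is saved takes correspondingly less space than the float32 original.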

codecov · 2025-04-20T17:23:57Z

Codecov Report

Attention: Patch coverage is 98.07692% with 1 line in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
model2vec/quantization.py	94.73%	1 Missing ⚠️

Files with missing lines	Coverage Δ
model2vec/distill/distillation.py	`94.85% <100.00%> (+0.11%)`	⬆️
model2vec/model.py	`94.70% <100.00%> (+0.14%)`	⬆️
tests/test_model.py	`97.91% <100.00%> (+0.20%)`	⬆️
tests/test_quantization.py	`100.00% <100.00%> (ø)`
model2vec/quantization.py	`94.73% <94.73%> (ø)`

Pringled

LGTM, two questions

Pringled · 2025-04-21T16:15:36Z

model2vec/quantization.py

+        return embeddings.astype(np.float64)
+    elif quantize_to == DType.Int8:
+        # Normalize to [-127, 127] range for int8
+        scale = np.max(np.abs(embeddings)) / 127.0


Can this ever be 0 (zero-division issues)?

Only if all embeddings are 0
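For reference, a minimal sketch of how the all-zero case could be guarded. The helper name `quantize_int8`, the rounding step, and the early return are assumptions for illustration; only the `scale` line comes from the diff above, and this is not the code merged in this PR.

```python
import numpy as np


def quantize_int8(embeddings: np.ndarray) -> np.ndarray:
    """Hypothetical helper: symmetric int8 quantization with a zero guard."""
    max_abs = float(np.max(np.abs(embeddings)))
    if max_abs == 0.0:
        # All embeddings are zero, so the scale would be 0 and the
        # division below would compute 0 / 0. Return zeros directly.
        return np.zeros(embeddings.shape, dtype=np.int8)
    scale = max_abs / 127.0
    # Rounding before the cast is an assumption; the diff does not show this step.
    return np.round(embeddings / scale).astype(np.int8)
```

With this guard an all-zero matrix maps to an all-zero int8 matrix, and any other input still has its largest absolute value mapped to 127 or -127.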

Pringled · 2025-04-21T16:18:18Z

model2vec/quantization.py

+    elif quantize_to == DType.Float64:
+        return embeddings.astype(np.float64)
+    elif quantize_to == DType.Int8:
+        # Normalize to [-127, 127] range for int8


Should this not be [-128, 127] (the range of an 8-bit signed integer)? Not sure if it's relevant for the code though since it doesn't change the division.

I think the symmetry is more important than making sure the 1 extra value is used. I updated the comment.
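A small sketch of the symmetry argument, under the assumption that values are rounded after scaling (the helper `quantize_symmetric` is hypothetical). With a single scale derived from the maximum absolute value, quantizing `-x` gives exactly the negation of quantizing `x`, and -128 is never produced. Staying within [-127, 127] also avoids the int8 edge case where negating -128 wraps back to -128.

```python
import numpy as np


def quantize_symmetric(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Hypothetical helper: map the largest |value| to 127 or -127."""
    scale = float(np.max(np.abs(x))) / 127.0
    return np.round(x / scale).astype(np.int8), scale


x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale = quantize_symmetric(x)
q_neg, _ = quantize_symmetric(-x)

# Dequantize to see the approximation error introduced by int8.
x_hat = q.astype(np.float32) * scale
print(q, np.max(np.abs(x - x_hat)))
```

Here zero maps to zero and both signs share one scale, which is the property the symmetric range preserves at the cost of leaving the value -128 unused.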

feat: add quantization

836f7ac

stephantul requested a review from Pringled April 20, 2025 17:22

stephantul mentioned this pull request Apr 20, 2025

Size of output model is half of original #213

Closed

Pringled approved these changes Apr 21, 2025

stephantul added 2 commits April 21, 2025 18:40

add comment

cb7feb2

merge

eb21ede

stephantul merged commit 6731674 into main Apr 21, 2025
5 checks passed

stephantul deleted the quantization branch April 21, 2025 16:47

3 participants
