Conversation

@stephantul
Contributor

This PR speeds up loading by about 20 ms by not recreating the StaticModel. We now check whether quantization is needed and, if it isn't, skip the quantize function entirely. In addition, we optimize the memory usage and flow of the quantize function: previously, we would cast the embeddings even when the requested dtype was already the original embedding dtype.
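
For context, a minimal sketch of the early-exit check described above (the function and parameter names here are illustrative assumptions, not the actual model2vec code):

```python
import numpy as np

# Sketch of the "skip if nothing changes" idea, assuming a NumPy embedding matrix.
# `maybe_cast_embeddings` and `target_dtype` are hypothetical names for illustration.
def maybe_cast_embeddings(embeddings: np.ndarray, target_dtype: np.dtype | None) -> np.ndarray:
    if target_dtype is None or embeddings.dtype == target_dtype:
        # Nothing to do: return the original array and avoid an unnecessary copy.
        return embeddings
    # Only allocate a new array when the dtype actually changes.
    return embeddings.astype(target_dtype)
```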

@stephantul stephantul requested a review from Pringled September 11, 2025 08:46
@codecov

codecov bot commented Sep 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

| Files with missing lines | Coverage Δ |
|---|---|
| model2vec/model.py | 95.29% <100.00%> (+0.05%) ⬆️ |
| model2vec/quantization.py | 97.14% <100.00%> (+0.36%) ⬆️ |
Member

@Pringled Pringled left a comment

Nice

@stephantul stephantul merged commit 5a8578d into main Sep 11, 2025
6 checks passed
@stephantul stephantul deleted the quantize-loading-fix branch September 11, 2025 09:44
