
feat(ADR-261 M2): multi-bit + large-N ANN scaling study — measured, no crossover (refutes M1 prediction) #1066

Merged
ruvnet merged 2 commits into main from feat/v2-beyond-sota-sweep on Jun 14, 2026
ruvnet commented Jun 14, 2026

Summary

ADR-261 Milestone 2 completes the SymphonyQG investigation by building the two levers M1 predicted would produce a crossover (multi-bit codes and large N) and measuring them. The data refute the prediction: this is a published negative result, with the mechanism explained.

Built

  • Multi-bit quantized traversal (hnsw_quantized.rs): generalized from 1-bit to b-bit-per-dimension (b ∈ {1,2,4}, 16/32/64 bytes/node) over the Pass-2 rotated coordinates — same quantizer as ADR-156 §10's measure_multibit.
  • Deterministic scaling harness (ann_measure.rs): run_scaling_study/best_float_op/best_quant_op + a scaling_report (#[ignore], --release) and a CI-safe scaling_study_small_is_consistent.
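
The 16/32/64 bytes/node figures follow from packing 128 coordinates at b bits each. A minimal sketch of that arithmetic and of one possible packing layout is below; `bytes_per_node` here and `pack_codes` are illustrative helpers written for this note, and the little-endian bit order is an assumption, not necessarily the layout used in hnsw_quantized.rs.

```rust
/// Bytes needed to store one node's code: `dim` coordinates at `bits` bits each,
/// rounded up to a whole byte.
pub fn bytes_per_node(dim: usize, bits: usize) -> usize {
    (dim * bits + 7) / 8
}

/// Pack per-dimension codes (each < 2^bits) into bytes, lowest bits first.
/// Assumption for this sketch: `bits` divides 8, so no code straddles a byte.
pub fn pack_codes(codes: &[u8], bits: usize) -> Vec<u8> {
    assert!(matches!(bits, 1 | 2 | 4 | 8), "bits must divide 8");
    let per_byte = 8 / bits;
    let mask = ((1u16 << bits) - 1) as u8;
    let mut out = vec![0u8; bytes_per_node(codes.len(), bits)];
    for (i, &c) in codes.iter().enumerate() {
        // Position of code i inside its byte.
        let shift = (i % per_byte) * bits;
        out[i / per_byte] |= (c & mask) << shift;
    }
    out
}
```

For dim=128 this gives 16, 32 and 64 bytes at b=1, 2 and 4, matching the sizes quoted above.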

MEASURED (dim=128, K=10, L2, M=16, ef_construction=200, seeded, --release, target recall ≥ 0.90)

| N | bits | quant best recall | quant/float QPS @ equal recall |
|---|---|---|---|
| 10k | 1 / 2 / 4 | 1.000 / 1.000 / 1.000 | 0.19× / 0.46× / 0.48× |
| 100k | 1 / 2 / 4 | 0.207 / 0.346 / 0.788 | none (never ≥ 0.90) |
| 250k | 1 / 2 / 4 | 0.108 / 0.210 / 0.624 | none |
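
One way to read the last column: for each index, take the fastest sweep point that still meets the recall target, then divide quant QPS by float QPS. The sketch below assumes that interpretation. `OpPoint`, `best_op` and `qps_ratio` are hypothetical names for illustration, not the signatures of best_float_op/best_quant_op in ann_measure.rs.

```rust
/// One point of an ef_search sweep (hypothetical shape, not the repo's type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct OpPoint {
    pub ef_search: usize,
    pub recall: f64,
    pub qps: f64,
}

/// Fastest sweep point whose recall meets `target`, or None if none does.
pub fn best_op(sweep: &[OpPoint], target: f64) -> Option<OpPoint> {
    sweep
        .iter()
        .copied()
        .filter(|p| p.recall >= target)
        .max_by(|a, b| a.qps.total_cmp(&b.qps))
}

/// quant/float QPS ratio at the shared recall target.
/// None corresponds to the "none" rows: one side never reaches the bar.
pub fn qps_ratio(float: &[OpPoint], quant: &[OpPoint], target: f64) -> Option<f64> {
    let f = best_op(float, target)?;
    let q = best_op(quant, target)?;
    Some(q.qps / f.qps)
}
```

Returning `Option` keeps the "never ≥ 0.90" case explicit instead of reporting a ratio at unequal recall.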

Verdict — NO crossover, prediction refuted

  • Multi-bit helps at small N but not enough: at N=10k more bits lift the ratio 0.19×→0.48× and let b≥2 reach the 0.90 bar 1-bit missed — but quant stays slower than float HNSW at equal recall.
  • The predicted large-N crossover moved the wrong way: as N grows, quant recall collapses (b=4: 1.000→0.788→0.624, never reaching 0.90) while float HNSW holds ≥0.92. A denser graph packs near-neighbours whose low-bit codes can't be separated, so the approximate score steers the beam off-path faster than the float-distance saving repays.
  • Caveat: this is our HNSW with our per-node multi-bit code, not SymphonyQG's RaBitQ-fused graph. It refutes the direction at N ≤ 250k, not their published million-scale numbers. The true 1:1 million-scale reproduction remains the documented deferred build.

Validation

  • cargo test --workspace --no-default-features: exit 0, 0 failed. ruvector lib 151→156 (+5 multi-bit/scaling tests).
  • python archive/v1/data/proof/verify.py: VERDICT: PASS (off the proof path).
  • The §11 table is produced by the committed scaling_report test (ANN_BENCH_N-tunable) — reproducible.

🤖 Generated with claude-flow

ruvnet and others added 2 commits June 14, 2026 10:07
…ng harness

Generalize the SymphonyQG-style quantized-traversal HNSW from 1-bit Hamming to a
b-bit-per-dimension code (b ∈ {1,2,4}), mirroring ADR-156 §10's multi-bit RaBitQ
scheme (rotate via FHT Pass-2, uniform mid-rise scalar quantizer over [-3,3],
ranked by per-dim L1). b=1 is byte-for-byte the original construction (codes in
{0,1} ⇒ L1 == Hamming), pinned by one_bit_build_bits_matches_legacy_build.
Bytes/node scales linearly: 128-d → 16/32/64 B for b=1/2/4.
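
A minimal sketch of the quantizer and ranking described above, assuming a cell width of 6/2^b with out-of-range values clamped to the end cells. The FHT rotation is omitted, and `quantize`, `encode`, `code_l1` and `hamming` are illustrative names rather than the functions in hnsw_quantized.rs.

```rust
/// Uniform mid-rise scalar quantizer over [-3, 3] with 2^bits cells.
/// Values outside the range clamp to the first or last cell.
pub fn quantize(x: f32, bits: u32) -> u8 {
    let levels = 1u32 << bits;
    let step = 6.0 / levels as f32;
    let cell = ((x + 3.0) / step).floor();
    cell.clamp(0.0, (levels - 1) as f32) as u8
}

/// Encode an (already rotated) vector, one code per dimension.
pub fn encode(v: &[f32], bits: u32) -> Vec<u8> {
    v.iter().map(|&x| quantize(x, bits)).collect()
}

/// Per-dimension L1 distance between two code vectors.
pub fn code_l1(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(&x, &y)| x.abs_diff(y) as u32).sum()
}

/// Number of positions where the codes differ.
pub fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).filter(|(x, y)| x != y).count() as u32
}
```

With b=1 every code is 0 or 1, so each per-dimension difference is 0 or 1 and `code_l1` equals `hamming`. Under this sketch, coordinates 0.2 and 0.5 fall in the same cell at b=2 and in adjacent cells at b=4, a small instance of nearby values that a low-bit code cannot separate.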

- hnsw_quantized.rs: QuantizedHnswIndex::build_bits(...,bits,...), bits()/
  bytes_per_node() accessors, code-L1 greedy+beam traversal. build(...) kept as
  the b=1 backward-compatible entry point. +4 tests (multi-bit recall regression,
  bits clamp, bytes/node, legacy parity).
- ann_measure.rs: build_indices_bits / build_quant_bits / run_scaling_study +
  best_float_op / best_quant_op; scaling_report (#[ignore], --release) and a
  CI-safe scaling_study_small_is_consistent.
- ann_bench.rs: 2-bit and 4-bit quant criterion benches over the shared graph.

ruvector lib 151 → 156 passed, 0 failed, 1 ignored (scaling_report).

Co-Authored-By: claude-flow <ruv@ruv.net>
…over (refutes M1 prediction)

Multi-bit (b∈{1,2,4}) quantized HNSW traversal + N∈{10k,100k,250k} scaling study,
measured on this box. No crossover at any (N,b): at 10k more bits help (ratio
0.19→0.48×, b≥2 reaches 0.90 recall) but quant stays slower than float HNSW at
equal recall; at 100k/250k quant recall collapses (b=4: 1.0→0.788→0.624, never
≥0.90) while float holds ≥0.92. The predicted large-N crossover moved the wrong
way. Published negative with the mechanism explained. ADR-261 §11.

Co-Authored-By: claude-flow <ruv@ruv.net>
ruvnet merged commit 1f05456 into main Jun 14, 2026
37 checks passed