Skip to content

refactor(shadow-account): single-source price-feature names + keep bound precision#316

Merged
warren618 merged 1 commit into
HKUDS:mainfrom
Robin1987China:shadow-price-feature-contract
Jun 27, 2026
Merged

refactor(shadow-account): single-source price-feature names + keep bound precision#316
warren618 merged 1 commit into
HKUDS:mainfrom
Robin1987China:shadow-price-feature-contract

Conversation

@Robin1987China

Copy link
Copy Markdown
Contributor

Summary

Two small follow-ups from the #314 review (the non-blocking items I said I would take separately): remove a duplicated feature-name list, and stop losing precision on extracted return bounds.

Why

  • The price-feature names lived in two places — _PRICE_FEATURES in extractor.py and a literal ("entry_rsi14", "prior_5d_return") tuple in codegen.py. Add a feature in one place and forget the other, and codegen silently stops flattening its bounds. They are the contract between extraction and codegen, so they should have one source.
  • _cluster_to_rule rounded bounds with round(., 2). RSI tolerates that, but prior_5d_return sits around ±0.0x, so a 0.0123 bound collapses to 0.01 — the band the generated engine and scanner check loses resolution.

Changes

  • models.py: add PRICE_FEATURES on the data-contract boundary (where the frozen dataclasses already live).
  • extractor.py: alias the local name to models.PRICE_FEATURES; round price-feature bounds to 4 decimals.
  • codegen.py: iterate PRICE_FEATURES instead of the literal tuple.
  • tests: a drift guard (codegen reads exactly the names the extractor writes) and a precision regression for small return bounds.

I left out the third review nit — computing RSI/return only when a rule has price bounds. That path runs once per code, not in a hot loop, and gating it would add template branching for negligible gain.

Test Plan

  • ruff check (changed source files) — clean
  • pytest tests/test_shadow_account.py tests/test_shadow_scanner.py — 74 passed
  • pytest (full, excl e2e) — 4473 passed, 5 skipped

Checklist

  • No changes to protected areas without prior discussion (all inside shadow_account/)
  • No hardcoded values
  • Code follows CONTRIBUTING.md guidelines
  • Documentation updated (if user-facing change) — n/a, internal refactor
…und precision

Follow-up to the HKUDS#314 review (non-blocking items I said I'd take separately):

 - models: add PRICE_FEATURES on the data-contract boundary; extractor and
   codegen now import it instead of each repeating the literal tuple, so the
   names can't drift between extraction and codegen
 - extractor: round price-feature bounds to 4 decimals — round(.,2) collapsed
   return-type bounds (prior_5d_return ~ ±0.0x, e.g. 0.0123 -> 0.01)
 - tests: drift guard (codegen reads the same names the extractor writes) +
   precision regression

Deliberately left out the third review nit (compute RSI/return only when a
rule has price bounds): that path runs once per code, not in a hot loop, and
gating it would add template branching for negligible gain.

Signed-off-by: Robin1987China <41602358+Robin1987China@users.noreply.github.com>
@shadowinlife

Copy link
Copy Markdown
Contributor

Code Review — 5-Agent Parallel Review

Reviewed by 5 parallel agents: Goal Verification, Code Quality, Security, QA Execution, and Context Mining.

Overall Verdict: ✅ PASSED

# Review Area Agent Type Verdict Confidence
1 Goal & Constraint Verification Oracle ✅ PASS HIGH
2 QA Execution unspecified-high ✅ PASS HIGH
3 Code Quality Oracle ✅ PASS HIGH
4 Security (supplementary) Oracle ✅ PASS NONE
5 Context Mining unspecified-high ✅ PASS HIGH

Goal Achievement

Goal Status Evidence
Unify PRICE_FEATURES to single source ✅ ACHIEVED models.py defines → extractor.py aliases (is identity check) → codegen.py imports
Fix precision loss round(.,2)round(.,4) ✅ ACHIEVED Updated in _cluster_to_rule, regression test verifies 0.0123 does not collapse to 0.01
3rd item (conditional RSI computation) explicitly deferred ✅ Correct decision Author rationale: "negligible gain, adds template branching"

QA Execution Results

Check Result
ruff check (3 source files) All checks passed!
test_shadow_account.py 65 passed in 8.93s
2 new regression tests Both PASS
Full suite (excl e2e) 4472 passed, 6 skipped, 0 failures

Non-blocking Findings

1. MINOR — PRICE_FEATURES docstring inaccuracy about scanner

The docstring states "the scanner evaluates them in real time", but scanner.py does not compute entry_rsi14 or prior_5d_return. The scanner has its own independent feature pipeline (momentum, price_above_ma, volume_ratio), and entry_rsi14 as a condition key falls through _feature_for_condition_key() returning None — silently skipped. This is a pre-existing functional gap, not introduced by this PR, but the docstring description is inaccurate.

Suggested fix — amend the docstring to something like:

#: ...codegen flattens their ``{min,max}`` bounds into the template.
#: The scanner currently uses its own feature pipeline and does not
#: directly consume these names.

2. MINOR — Precision test assertion is clever but opaque

test_price_bounds_keep_four_decimal_precision uses min_support=10 with 4 rows, which correctly exercises the _heuristic_single_rule fallback path (which also calls the updated _cluster_to_rule). The test passes, but the assertion:

assert bounds["min"] != round(bounds["min"], 2) or bounds["max"] != round(bounds["max"], 2)

requires a mental pause. A direct assertion like assert bounds["min"] == pytest.approx(expected_p10, abs=1e-4) would be more self-documenting. Non-blocking.

3. NITPICK — Function-local imports in new tests

Both new tests use function-local imports (from src.shadow_account import codegen, extractor), which diverges slightly from the top-level import pattern used elsewhere in the file. No functional impact.


Follow-up Suggestion

Consider a follow-up issue for the scanner gap: _feature_for_condition_key("entry_rsi14") returns None in scanner.py, causing RSI bounds to be silently skipped during real-time signal scanning. This appears to be a residual from PR #314 review Blocking Issue #1 (claimed fixed but possibly incomplete for the scanner path).


Summary: Clean, focused refactor that correctly achieves both stated goals with appropriate test coverage and zero regressions across the full test suite. Ready to merge.

@warren618

Copy link
Copy Markdown
Collaborator

Thanks for the PR — we’ve seen it and will review the current diff as soon as we can.

I also noticed the community review above; we’ll take that into account when checking whether this is ready to merge.

@warren618 warren618 merged commit 2acc3b8 into HKUDS:main Jun 27, 2026
1 check passed
@warren618

Copy link
Copy Markdown
Collaborator

Merged via squash in 2acc3b8. Thanks @Robin1987China — this gives Shadow Account extraction/codegen a shared PRICE_FEATURES contract and keeps the four-decimal return-bound regression coverage.

I’ll track the scanner/docstring follow-up separately rather than blocking this PR.

@Robin1987China

Copy link
Copy Markdown
Contributor Author

Quick note on the docstring finding before the follow-up — I think the scanner part is already covered, so the docstring is accurate as written.

scanner.py does compute entry_rsi14: _compute_rsi at scanner.py:203, called in _compute_features (:197-199), and _feature_for_condition_key maps the rsi key to it (:238). That path went in with #314 (18b6247), so the "returns None → silently skipped" behavior was the pre-#314 state.

One nuance worth being precise about: prior_5d_return is range-checked through the momentum feature (same 5-day return), not a feature literally named prior_5d_return — so "the scanner evaluates them" is true in effect, just via that mapping for the return side. Happy to tighten the wording to say that explicitly if you think it reads as overclaiming.

The review was likely run against a tree before #314 landed — easy to miss given the timing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants