feat(dataflows): AKShare vendor for Chinese A-share stocks (.SS/.SZ)#1067

Open
ydhawesome wants to merge 4 commits into TauricResearch:main from ydhawesome:feat/akshare-a-share-vendor

Conversation

@ydhawesome ydhawesome commented Jun 22, 2026

Summary

  • New vendor akshare that routes .SS (Shanghai) and .SZ (Shenzhen) tickers to Chinese domestic data sources (东方财富 / 同花顺) instead of yfinance/Alpha Vantage.
  • Zero-config auto-routing: route_to_vendor() detects A-share tickers by suffix and prepends akshare to the vendor chain automatically — no user configuration required; all other tickers are completely unaffected.
  • Sentiment analyst branches on is_a_share() to substitute 东方财富 heat-rank for StockTwits and a clear placeholder for Reddit (not applicable for A-shares).
  • Optional install group: pip install tradingagents[china] pulls in akshare>=1.14.0 and curl_cffi>=0.7.0.
  • 9 unit tests covering is_a_share() detection and automatic vendor pre-routing.
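
The suffix-based detection described in the summary can be sketched roughly as follows. Only the name `is_a_share()` comes from the PR; the suffix tuple and the case-insensitive, whitespace-tolerant matching are assumptions made for illustration, and the real implementation may be stricter.

```python
# Suffixes named in the PR summary: .SS (Shanghai) and .SZ (Shenzhen).
A_SHARE_SUFFIXES = (".SS", ".SZ")


def is_a_share(ticker: str) -> bool:
    """Return True when the ticker carries a Shanghai or Shenzhen suffix.

    Assumption: matching ignores case and surrounding whitespace.
    """
    return ticker.strip().upper().endswith(A_SHARE_SUFFIXES)


print(is_a_share("600519.SS"))  # True
print(is_a_share("000001.SZ"))  # True
print(is_a_share("AAPL"))       # False
```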

Data sources used (all public, no auth)

| Data type | Source |
| --- | --- |
| OHLCV price | ak.stock_zh_a_hist (东方财富, front-adjusted) |
| Technical indicators | stockstats (same pipeline as yfinance path) |
| News | 东方财富 JSONP search API (direct HTTP, avoids broken ak.stock_news_em) |
| Fundamentals / financials | ak.stock_financial_abstract_ths (同花顺) |
| Announcements | 东方财富 announcement REST API |
| Social sentiment | 东方财富 hot-rank (ak.stock_hot_rank_em) |

Note: ak.stock_news_em and several stock_*_sheet_by_report_em functions fail with PyArrow/pandas 2.x due to regex incompatibility and changed API response shapes. This PR works around both issues with direct HTTP calls to the underlying endpoints.
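
For the OHLCV row in the table above, a call to `ak.stock_zh_a_hist` with front-adjusted prices might look like the sketch below. The helper names `a_hist_kwargs` and `fetch_a_share_ohlcv` are hypothetical, and the suffix-stripping step is an assumption about how a ticker such as `600519.SS` maps onto the bare code that AKShare expects; the PR's own wrapper is not shown in this thread.

```python
from datetime import date


def a_hist_kwargs(ticker: str, start: date, end: date) -> dict:
    """Build keyword arguments for ak.stock_zh_a_hist from a suffixed ticker."""
    code = ticker.split(".")[0]  # "600519.SS" -> "600519"
    return {
        "symbol": code,
        "period": "daily",
        "start_date": start.strftime("%Y%m%d"),  # AKShare takes YYYYMMDD strings
        "end_date": end.strftime("%Y%m%d"),
        "adjust": "qfq",  # "qfq" requests front-adjusted prices
    }


def fetch_a_share_ohlcv(ticker: str, start: date, end: date):
    # Imported lazily: akshare is only installed with the [china] extra.
    import akshare as ak

    return ak.stock_zh_a_hist(**a_hist_kwargs(ticker, start, end))
```

Keeping the argument construction separate from the network call makes the mapping testable without akshare installed.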

Test plan

  • pytest tests/test_akshare_routing.py -v — 9/9 pass
  • End-to-end analysis of 600519.SS (贵州茅台) completes successfully with MiniMax-M3, all data from domestic APIs, final decision returned in Chinese
  • AAPL analysis route is unchanged (verified no akshare code is invoked)

Notes

  • curl_cffi is used only for the 东方财富 news JSONP endpoint; the rest of the module uses requests (already in core deps).
  • The [china] extra is intentionally separate so users in non-China markets don't pull in the additional deps.
A-share tickers (Shanghai .SS / Shenzhen .SZ suffixes) are automatically
routed to a new `akshare` vendor that fetches real Chinese market data from
东方财富 and 同花顺, with no configuration required. All other tickers
continue to use the existing yfinance / alpha_vantage path unchanged.

New vendor functions:
- OHLCV price data via ak.stock_zh_a_hist (front-adjusted)
- Technical indicators via stockstats (same pipeline as yfinance path)
- News via 东方财富 JSONP search API (bypasses broken ak.stock_news_em)
- Fundamentals / balance sheet / cash flow via ak.stock_financial_abstract_ths
- Announcements (insider proxy) via 东方财富 announcement API
- Social sentiment via 东方财富 hot-rank (public, no auth required)
- Sentiment analyst branches on is_a_share() to substitute domestic sources

Optional dependency group added: pip install tradingagents[china]
installs akshare>=1.14.0 and curl_cffi>=0.7.0.

9 unit tests added (tests/test_akshare_routing.py) covering:
- is_a_share() detection for .SS / .SZ / non-Chinese tickers
- route_to_vendor() auto-prepend of akshare for A-shares
- Non-A-share tickers are NOT routed through akshare

End-to-end verified: 600519.SS (贵州茅台) full analysis completes with
MiniMax-M3, all data sourced from Chinese domestic APIs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for Chinese A-share market data via the akshare vendor, automatically routing Shanghai and Shenzhen tickers to domestic data sources. The review feedback identifies critical issues where swallowing exceptions in the new utility functions bypasses the fallback routing mechanism. Additionally, the feedback suggests ensuring akshare is always prioritized first regardless of user configuration, simplifying a redundant prefix check, and removing a no-op assignment in the test suite.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +91 to +93
    except Exception as e:
        logger.warning("AKShare price fetch failed for %s: %s", ticker, e)
        return f"<A-share price data unavailable for {ticker}: {type(e).__name__}>"

high

Swallowing all exceptions (including ImportError when akshare is not installed) and returning a placeholder string prevents the fallback routing mechanism in route_to_vendor from working.

If akshare is not installed or fails due to a network issue, get_stock_data_akshare will return a string starting with <A-share price data.... Because it returns a string instead of raising an exception, route_to_vendor treats the call as a success and immediately returns this placeholder string to the agent, completely bypassing the fallback to yfinance (which could have successfully fetched the data).

To fix this, let the exception propagate so that route_to_vendor can catch it and proceed to the next vendor in the fallback chain.

Suggested change
-   except Exception as e:
-       logger.warning("AKShare price fetch failed for %s: %s", ticker, e)
-       return f"<A-share price data unavailable for {ticker}: {type(e).__name__}>"
+   except Exception as e:
+       logger.warning("AKShare price fetch failed for %s: %s", ticker, e)
+       raise
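
To illustrate why a returned placeholder defeats the fallback, here is a minimal stand-in for a vendor router. `route_with_fallback` and the three stub vendors are hypothetical and only mimic the behavior described in this comment; they are not the project's `route_to_vendor`.

```python
import logging

logger = logging.getLogger(__name__)


def route_with_fallback(vendors, *args):
    """Try each (name, func) pair in order; return the first successful result."""
    last_error = None
    for name, func in vendors:
        try:
            return func(*args)
        except Exception as exc:  # any failure means "try the next vendor"
            logger.warning("vendor %s failed: %s", name, exc)
            last_error = exc
    raise RuntimeError("all vendors failed") from last_error


def akshare_swallowing(ticker):
    # Anti-pattern: the error is turned into an ordinary return value.
    return f"<A-share price data unavailable for {ticker}: ImportError>"


def akshare_raising(ticker):
    # Suggested behavior: the failure reaches the router.
    raise ImportError("akshare is not installed")


def yfinance_stub(ticker):
    return f"price data for {ticker}"


swallowed = route_with_fallback(
    [("akshare", akshare_swallowing), ("yfinance", yfinance_stub)], "600519.SS"
)
recovered = route_with_fallback(
    [("akshare", akshare_raising), ("yfinance", yfinance_stub)], "600519.SS"
)
print(swallowed)  # the placeholder string; yfinance was never tried
print(recovered)  # price data for 600519.SS
```
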
Comment on lines +176 to +178
    except Exception as e:
        logger.warning("AKShare indicators failed for %s/%s: %s", ticker, indicator, e)
        return f"<A-share indicator '{indicator}' unavailable for {ticker}: {type(e).__name__}>"

high

Similar to the price fetch function, catching all exceptions here and returning a placeholder string prevents route_to_vendor from falling back to yfinance for technical indicators when akshare is not installed or fails. Let the exception propagate so fallback can occur.

Suggested change
-   except Exception as e:
-       logger.warning("AKShare indicators failed for %s/%s: %s", ticker, indicator, e)
-       return f"<A-share indicator '{indicator}' unavailable for {ticker}: {type(e).__name__}>"
+   except Exception as e:
+       logger.warning("AKShare indicators failed for %s/%s: %s", ticker, indicator, e)
+       raise
Comment on lines +219 to +221
    except Exception as exc:
        logger.warning("东方财富 news fetch failed (query=%r): %s", query, exc)
        return []

high

If _fetch_em_news fails (e.g., due to ImportError when curl_cffi is missing, or a network error), returning an empty list [] causes get_news_akshare to return a "no news found" placeholder instead of propagating the error. This prevents the router from falling back to yfinance news. Raising the exception here allows the fallback chain to work correctly.

Suggested change
-   except Exception as exc:
-       logger.warning("东方财富 news fetch failed (query=%r): %s", query, exc)
-       return []
+   except Exception as exc:
+       logger.warning("东方财富 news fetch failed (query=%r): %s", query, exc)
+       raise
Comment on lines +37 to +39
    def _em_prefix(code: str) -> str:
        """Return the 东方财富 market prefix used by hot-rank: 'SH600519'."""
        return ("SH" if code.startswith("6") or code.startswith("688") else "SZ") + code

medium

The condition code.startswith("688") is redundant because any string starting with "688" already starts with "6". You can simplify this to just check code.startswith("6").

Suggested change
-   def _em_prefix(code: str) -> str:
-       """Return the 东方财富 market prefix used by hot-rank: 'SH600519'."""
-       return ("SH" if code.startswith("6") or code.startswith("688") else "SZ") + code
+   def _em_prefix(code: str) -> str:
+       """Return the 东方财富 market prefix used by hot-rank: 'SH600519'."""
+       return ("SH" if code.startswith("6") else "SZ") + code
Comment thread tradingagents/dataflows/interface.py Outdated
Comment on lines +170 to +172
    if isinstance(ticker_arg, str) and is_a_share(ticker_arg):
        if "akshare" not in primary_vendors:
            primary_vendors = ["akshare"] + primary_vendors

medium

If "akshare" is already present in primary_vendors but is not the first element (e.g., if the user configured yfinance,akshare), the current check if "akshare" not in primary_vendors: will evaluate to False and do nothing. This means yfinance would still be tried before akshare for A-shares, which contradicts the goal of always routing to akshare first.

To ensure akshare is always the primary vendor for A-shares regardless of the user's configuration order, remove "akshare" if it exists in primary_vendors before prepending it.

Suggested change
-   if isinstance(ticker_arg, str) and is_a_share(ticker_arg):
-       if "akshare" not in primary_vendors:
-           primary_vendors = ["akshare"] + primary_vendors
+   if isinstance(ticker_arg, str) and is_a_share(ticker_arg):
+       if "akshare" in primary_vendors:
+           primary_vendors.remove("akshare")
+       primary_vendors = ["akshare"] + primary_vendors
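
One caveat with the in-place `remove()`: if `primary_vendors` happens to be the same list object as the configured vendor order (an assumption; the thread does not show where the list comes from), the configuration itself would be modified. A non-mutating variant of the same move-to-front idea could look like this sketch, where `akshare_first` is a hypothetical helper name.

```python
def akshare_first(primary_vendors: list[str]) -> list[str]:
    """Return a new list with "akshare" first; the input list is not modified."""
    return ["akshare"] + [v for v in primary_vendors if v != "akshare"]


configured = ["yfinance", "akshare"]
print(akshare_first(configured))  # ['akshare', 'yfinance']
print(configured)                 # ['yfinance', 'akshare'] (unchanged)
```
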
Comment thread tests/test_akshare_routing.py Outdated
            {"get_stock_data": {"akshare": fake_akshare, "yfinance": fake_yfinance}},
        ):
            # Default config lists yfinance first for US equities
            interface.set_config = lambda cfg: None  # no-op; default vendor order is used

medium

The assignment interface.set_config = lambda cfg: None is a no-op because set_config is neither defined nor imported in tradingagents/dataflows/interface.py.

Furthermore, reassigning module-level attributes directly in tests without restoring them is a bad practice that can lead to flaky tests or side effects across the test suite. Since interface.py does not use set_config, this line can be safely removed.
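
When a test genuinely needs to override a module attribute, a scoped patch restores the original on exit. The sketch below uses `unittest.mock.patch.object` against a stand-in namespace; the `interface` object and its `get_vendor` attribute are hypothetical and exist only to make the example self-contained.

```python
import types
from unittest import mock

# Stand-in for a module such as tradingagents.dataflows.interface (hypothetical).
interface = types.SimpleNamespace(get_vendor=lambda: "yfinance")


def test_vendor_override_is_restored():
    with mock.patch.object(interface, "get_vendor", return_value="akshare"):
        assert interface.get_vendor() == "akshare"  # patched inside the block
    assert interface.get_vendor() == "yfinance"     # original restored on exit


test_vendor_override_is_restored()
```

With pytest, the built-in `monkeypatch` fixture gives the same restore-on-teardown guarantee.
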

- Re-raise exceptions in get_stock_data_akshare, get_indicators_akshare,
  and _fetch_em_news so route_to_vendor fallback chain works correctly
  when akshare is not installed or a network error occurs
- Always move akshare to front of vendor list for A-shares (not just
  prepend when absent), so user config like 'yfinance,akshare' still
  routes domestic tickers to akshare first
- Simplify _em_prefix: remove redundant startswith('688') check since
  '688...' already satisfies startswith('6')
- Remove no-op interface.set_config assignment from test
A-share data hosts (东方财富, 同花顺, etc.) must not be routed through a
VPN/proxy. When a foreign proxy node (e.g. Clash on 127.0.0.1:7897) sits in
the path, the TLS stream to these domestic servers is corrupted, surfacing as
`SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC`, or the connection is dropped by
the server's anti-crawler when accessed from an overseas IP.

- Extend NO_PROXY with the domestic data hosts at import time. requests/urllib3
  honor NO_PROXY by host-suffix, and akshare itself uses bare requests.get
  internally, so this transparently covers both akshare's calls and ours.
- Pass an explicit no-proxy override to the curl_cffi news call and build a
  direct (ProxyHandler({})) opener for the urllib announcements call, for
  libraries that do not consult NO_PROXY.
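
For the urllib half of the second point, a direct opener can be built with the standard library as sketched below. The variable name and the placeholder URL are illustrative; the curl_cffi override is not shown because it depends on that library's own request options.

```python
import urllib.request

# An empty mapping tells ProxyHandler to use no proxies at all, instead of
# reading http_proxy / https_proxy from the environment.
direct_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

# Usage (placeholder URL, not one of the PR's endpoints):
# with direct_opener.open("https://example.com/", timeout=10) as resp:
#     body = resp.read()
```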

This resolves SSLError when fetching A-share prices/indicators/news on machines
running a system (env-var) proxy. Note it cannot defeat a system-wide TUN /
transparent proxy that intercepts below the application layer — that case must
be fixed in the VPN config (route CN traffic DIRECT).

3 unit tests added for the NO_PROXY merge behavior (additive, idempotent,
preserves existing entries).
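
The three properties tested (additive, idempotent, preserves existing entries) can be illustrated with a small merge helper. This is a sketch under assumptions: the function names are hypothetical, and the two host suffixes are examples standing in for the PR's actual host list, which is not shown in this thread.

```python
import os

# Illustrative host suffixes; the PR's real list may differ.
DOMESTIC_HOSTS = ("eastmoney.com", "10jqka.com.cn")


def merge_no_proxy(existing: str, hosts=DOMESTIC_HOSTS) -> str:
    """Append hosts missing from a comma-separated NO_PROXY value."""
    entries = [e.strip() for e in existing.split(",") if e.strip()]
    for host in hosts:
        if host not in entries:
            entries.append(host)
    return ",".join(entries)


def apply_no_proxy(env=os.environ) -> None:
    # Both spellings are written because tools differ in which one they read.
    merged = merge_no_proxy(env.get("NO_PROXY") or env.get("no_proxy") or "")
    env["NO_PROXY"] = merged
    env["no_proxy"] = merged


once = merge_no_proxy("localhost,127.0.0.1")
print(once)                          # localhost,127.0.0.1,eastmoney.com,10jqka.com.cn
print(merge_no_proxy(once) == once)  # True
```
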
@ydhawesome

Author

Thanks for the review! Addressed all of the Gemini feedback and pushed two follow-up commits.

1. Review feedback addressed

  • get_stock_data_akshare, get_indicators_akshare, and _fetch_em_news now re-raise on failure instead of returning a placeholder string, so route_to_vendor's fallback chain works correctly.
  • A-share tickers always move akshare to the front of the vendor list (even when a user config like yfinance,akshare already contains it).
  • Removed the redundant startswith("688") check in _em_prefix and the no-op interface.set_config assignment in the test.

2. Proxy-bypass fix for domestic data hosts

  • Found while testing in China: when a VPN/proxy (e.g. Clash) is active, the TLS stream to 东方财富/同花顺 gets corrupted (SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC) or dropped by anti-crawler when reached from an overseas IP.
  • Fix extends NO_PROXY with the domestic data hosts at import (covers akshare's internal bare requests.get too), plus explicit no-proxy overrides for the curl_cffi news call and the urllib announcements call. Documented limitation: this handles the common env-var proxy case but cannot defeat a system-wide TUN/transparent proxy (that must be fixed in the VPN config).

Testing

  • pytest tests/test_akshare_routing.py — 12/12 pass (routing + NO_PROXY merge behavior).
  • End-to-end: full analysis of 600519.SS (贵州茅台) completes with all data from domestic APIs; non-A-share tickers (e.g. AAPL) never invoke akshare.

I will also rebase this branch onto the latest main shortly to resolve drift with the recent interface.py refactor. Happy to adjust anything — e.g. make the NO_PROXY mutation opt-in via config rather than applied at import.

…gressions

The earlier commits replaced interface.py and sentiment_analyst.py wholesale,
which silently reverted unrelated v0.3.0 work (the errors.py vendor-routing
refactor, the FRED/Polymarket vendors and OPTIONAL_CATEGORIES, and the
current-date-at-top-of-analyst-prompt fix).

Rebuild both files from main and re-apply only the akshare additions:

- interface.py: now purely additive (+34 lines, 0 deletions). Adds the akshare
  imports, "akshare" in VENDOR_LIST, akshare entries in the nine stock/
  fundamental/news VENDOR_METHODS, and A-share auto-routing in the new
  vendor_chain logic. akshare is A-share-only: stripped from the chain for
  non-Chinese tickers and prepended first for .SS/.SZ, regardless of configured
  vendor order. Failures propagate so the existing fallback chain (and the typed
  VendorRateLimitError / NoMarketDataError handling) still works.
- sentiment_analyst.py: keep main's prompt wording; only branch on is_a_share()
  to substitute the 东方财富 heat-rank proxy for StockTwits/Reddit.

Also fixed ruff findings in akshare_utils.py (import order, unused pandas,
Optional to | None, stray quoted annotation). Updated the Shenzhen routing test
to mock yfinance alongside akshare, matching the real method registry.

Tests: test_akshare_routing + test_vendor_routing + test_structured_agents all
pass (44); ruff clean on the changed files.

@gyx09212214-prog gyx09212214-prog left a comment

Nice direction. The A-share routing is a real gap, and keeping the extra dependencies behind [china] is a good boundary.

One edge case I'd suggest checking: get_global_news appears to use curr_date as its first argument rather than a ticker. Since route_to_vendor() infers A-share status from args[0], get_global_news("2026-06-25", ...) will not be detected as A-share and may never route to the new akshare implementation. If global market news is intended to use domestic sources for China workflows, this may need method-specific routing rather than ticker-based routing.
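
One possible shape for method-specific routing is a small table recording where each method's ticker argument sits. Everything in this sketch is hypothetical (the table, `wants_akshare`, and the `china_workflow` flag); it only illustrates the idea and does not reflect the project's code.

```python
# Hypothetical registry: position of the ticker argument per method,
# or None when the method takes no ticker (e.g. global news by date).
TICKER_ARG_INDEX = {
    "get_stock_data": 0,
    "get_news": 0,
    "get_global_news": None,
}


def is_a_share(ticker: str) -> bool:
    return ticker.upper().endswith((".SS", ".SZ"))


def wants_akshare(method: str, args: tuple, china_workflow: bool = False) -> bool:
    """Decide whether akshare should lead the vendor chain for this call."""
    index = TICKER_ARG_INDEX.get(method, 0)
    if index is None:
        # No ticker to inspect, so rely on an explicit workflow flag instead.
        return china_workflow
    return len(args) > index and isinstance(args[index], str) and is_a_share(args[index])


print(wants_akshare("get_stock_data", ("600519.SS",)))                         # True
print(wants_akshare("get_global_news", ("2026-06-25",)))                       # False
print(wants_akshare("get_global_news", ("2026-06-25",), china_workflow=True))  # True
```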

Also, _ensure_domestic_no_proxy() mutates NO_PROXY at import time. That may surprise users in non-China flows that only import the module. A lazy call inside the akshare vendor path might keep the side effect narrower.

Labels

None yet

2 participants