Unified AI proxy/router — route any LLM through one endpoint. Multi-provider support with 236 provider entries (OpenAI, Anthropic, Gemini, DeepSeek, Groq, xAI, Mistral, Fireworks, Cohere, NVIDIA, Cerebras, Pollinations, Puter, Cloudflare AI, HuggingFace, DeepInfra, SambaNova, Meta Llama API, Moonshot AI, AI21 Labs, Databricks, Snowflake, and many more) with MCP Server (94 tools), A2A v0.3 Protocol, and Electron desktop app.
Live counts (v3.8.40): providers 236 · MCP tools 94 · MCP scopes 30 · A2A skills 6 · open-sse services 298 · routing strategies 17 · auto-combo scoring factors 12 · DB modules 94 · DB migrations 106 · base tables 17 · search providers 11 · i18n locales 42. Refresh with
`npm run check:docs-all`.
If `grep -rn "name" src/ open-sse/ bin/` returns nothing, the name does not exist. Do not document it.
The recurring failure mode in AI-generated docs is plausible-but-unverified specifics.
Every claim in a .md file under docs/ should be verifiable against the source.
Rules (enforced by `npm run check:fabricated-docs`):
- Never state an API name, endpoint, path, CLI command, or env var without grepping for it first:
  `grep -rn "theName" src/ open-sse/ bin/` (0 hits → do not document).
- Never write a line count, file size, migration count, provider count, or strategy count from memory:
  `wc -l <file>` gives the exact line count, `ls <dir>/*.ts | wc -l` the file count.
- Every code example should be copy-pasted from real usage or actually run, not synthesized. Link to a real call site (`path:line`) instead of inventing a signature.
- Prefer citing real source (`file.ts:line`) over paraphrasing behavior: a citation is verifiable and self-correcting.
- A shorter doc that is 100% accurate beats a comprehensive one with fabrications. Wrong docs cost more than missing docs, because people trust and act on them.
The script scripts/check/check-fabricated-docs.mjs extracts every route path, env var, hook
name, function name, and file reference from docs/**/*.md and verifies each one against the
codebase. Run it locally before pushing docs; it runs in CI via npm run check:docs-all.
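As a rough illustration of the idea behind such a check (not the actual script), the sketch below collects backticked spans from a Markdown string and reports those that never appear in a given source text. The function names `extractCodeSpans` and `findUnverified` are hypothetical, and the real script works on files and on more reference kinds than this.

```typescript
// Hypothetical sketch only: the real check lives in
// scripts/check/check-fabricated-docs.mjs and is more thorough.

// Collect the unique contents of inline code spans in a Markdown string.
function extractCodeSpans(markdown: string): string[] {
  const spans: string[] = [];
  const pattern = /`([^`\n]+)`/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const span = match[1].trim();
    if (spans.indexOf(span) === -1) spans.push(span);
  }
  return spans;
}

// Return the spans that do not occur anywhere in the source text.
function findUnverified(markdown: string, sourceText: string): string[] {
  return extractCodeSpans(markdown).filter((span) => sourceText.indexOf(span) === -1);
}

const sampleDoc = "Call `getDbInstance()` and then `makeUpAnApi()`.";
const sampleSource = "export function getDbInstance() { /* ... */ }";
console.log(findUnverified(sampleDoc, sampleSource));
```

A plain substring match is deliberately crude; it stands in for the `grep -rn` step the rules describe.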
- Runtime: Next.js 16 (App Router), Node.js `>=22.0.0 <23 || >=24.0.0 <27`, ES Modules (`"type": "module"`)
- Language: TypeScript 6.0 (`src/`) + JavaScript (`open-sse/`, `electron/`)
- Database: better-sqlite3 (SQLite); `DATA_DIR` configurable, default `~/.omniroute/`
- Streaming: SSE via the `open-sse` internal workspace package
- Styling: Tailwind CSS v4
- i18n: next-intl with 42 locales (`src/i18n/messages/`); refresh with `ls src/i18n/messages/*.json | wc -l`
- Desktop: Electron (cross-platform: Windows, macOS, Linux)
- Schemas: Zod v4 for all API / MCP input validation
| Command | Description |
|---|---|
| `npm run dev` | Start Next.js dev server |
| `npm run build` | Production build: `next build` → `.build/next/` + assemble `dist/` |
| `npm run build:release` | Clean rebuild + HEAD sentinel (`dist/BUILD_SHA`); use for deploy |
| `npm run start` | Run production build |
| `npm run build:cli` | Build CLI package |
| `npm run lint` | ESLint on all source files |
| `npm run typecheck:core` | TypeScript core type checking |
| `npm run typecheck:noimplicit:core` | Strict checking (no implicit any) |
| `npm run check` | Run lint + test |
| `npm run check:cycles` | Check for circular dependencies |
| `npm run electron:dev` | Run Electron app in dev mode |
| `npm run electron:build` | Build Electron app for current OS |
Build output layout:
| Directory | Purpose | Gitignored |
|---|---|---|
| `src/` | Application source (TypeScript / TSX) | No |
| `.build/` | Build intermediates (`distDir = .build/next`) | Yes |
| `dist/` | Shippable bundle assembled by `assembleStandalone` | Yes |
The pipeline is a single next build pass — intermediates land in .build/next/, the
assembled bundle in dist/. VPS deploys rsync dist/ into the remote
/usr/lib/node_modules/omniroute/app/ directory (VPS image path is unchanged).
```bash
# All tests (unit + vitest + ecosystem + e2e)
npm run test:all
# Single test file (Node.js native test runner; most tests use this)
node --import tsx/esm --test tests/unit/your-file.test.ts
node --import tsx/esm --test tests/unit/plan3-p0.test.ts
node --import tsx/esm --test tests/unit/fixes-p1.test.ts
node --import tsx/esm --test tests/unit/security-fase01.test.ts
# Integration tests
node --import tsx/esm --test tests/integration/*.test.ts
# Vitest (MCP server, autoCombo)
npm run test:vitest
# E2E with Playwright
npm run test:e2e
# Protocol clients E2E (MCP transports, A2A)
npm run test:protocols:e2e
# Ecosystem compatibility tests
npm run test:ecosystem
# Coverage (see CONTRIBUTING.md)
npm run test:coverage
```
For authoritative coverage requirements, test execution, and PR gates, see CONTRIBUTING.md.
2 spaces · semicolons required · double quotes (") · 100 char width · es5 trailing commas.
Always run prettier --write on changed files.
- Target: ES2022 · Module: `esnext` · Resolution: `bundler`
- `strict: false`: prefer explicit types, don't rely on inference
- Path aliases: `@/*` → `src/`, `@omniroute/open-sse` → `open-sse/`, `@omniroute/open-sse/*` → `open-sse/*`
- Security (error, everywhere): `no-eval`, `no-implied-eval`, `no-new-func`
- Relaxed in `open-sse/` and `tests/`: `@typescript-eslint/no-explicit-any` = warn
- React hooks rules and `@next/next/no-assign-module-variable` disabled in `open-sse/` and `tests/`
| Element | Convention | Example |
|---|---|---|
| Files | camelCase / kebab-case | chatCore.ts, tokenHealthCheck.ts |
| React components | PascalCase | Dashboard.tsx, ProviderCard.tsx |
| Functions/variables | camelCase | getHealth(), switchCombo() |
| Constants | UPPER_SNAKE | MAX_RETRIES, DEFAULT_TIMEOUT |
| Interfaces | PascalCase (I prefix optional) | ProviderConfig |
| Enums | PascalCase (members too) | LogLevel.Error |
- Order: external → internal (`@/`, `@omniroute/open-sse`) → relative (`./`, `../`)
- No barrel imports from `localDb.ts`; import from the specific `db/` module instead
- try/catch with specific error types; always log with context (pino logger)
- Never silently swallow errors in SSE streams — use abort signals for cleanup
- Return proper HTTP status codes (4xx client, 5xx server)
- NEVER commit API keys, secrets, or credentials
- Validate all user inputs with Zod schemas
- Auth middleware required on all API routes
- Never log SQLite encryption keys
- Sanitize user content (dompurify for HTML)
- Public upstream OAuth identifiers (Gemini / Antigravity / Windsurf-style client_id/secret + Firebase Web keys extracted from public CLIs): use `resolvePublicCred()` from `open-sse/utils/publicCreds.ts`, never string literals. Full pattern in `docs/security/PUBLIC_CREDS.md`.
- Error responses (HTTP / SSE / executor / MCP): use `buildErrorBody()` or `sanitizeErrorMessage()` from `open-sse/utils/error.ts`; never put raw `err.stack` / `err.message` in a Response body. Full pattern in `docs/security/ERROR_SANITIZATION.md`.
- `exec()` / `spawn()` with runtime values: pass them via the `env` option; never string-interpolate paths or values into the script body. Reference: `src/mitm/cert/install.ts::updateNssDatabases`.
- Prefer secure-by-default libraries when available; see tldrsec/awesome-secure-defaults for the curated list (Helmet.js, DOMPurify, ssrf-req-filter, safe-regex, Google Tink, etc.).
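To make the `exec()` / `spawn()` rule concrete, here is a minimal sketch that assumes a POSIX `sh` is available. The helper name `echoPathSafely` and the variable `TARGET_PATH` are illustrative only and are not taken from `install.ts`. The script body is a constant string, and the runtime value reaches the shell only through the environment.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: prints a runtime value from a shell script without
// ever interpolating that value into the script text.
function echoPathSafely(targetPath: string): string {
  const result = spawnSync("sh", ["-c", 'printf "%s" "$TARGET_PATH"'], {
    // The value travels via env, so shell metacharacters stay inert.
    env: { ...process.env, TARGET_PATH: targetPath },
    encoding: "utf8",
  });
  return result.stdout;
}

// The metacharacters are printed verbatim instead of being executed.
console.log(echoPathSafely("/tmp/x; echo injected"));
```

Had the value been concatenated into the script string, the `;` would have started a second command.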
All persistence uses SQLite through 83 domain-specific modules in src/lib/db/. Top modules:
- Core: `core.ts`, `migrationRunner.ts`, `encryption.ts`, `stateReset.ts`
- Providers / catalog: `providers.ts`, `models.ts`, `providerLimits.ts`, `compressionAnalytics.ts`
- Routing: `combos.ts`, `modelComboMappings.ts`, `domainState.ts`, `commandCodeAuth.ts`
- Auth: `apiKeys.ts`, `secrets.ts`, `registeredKeys.ts`, `sessionAccountAffinity.ts`
- Usage / billing: `quotaSnapshots.ts`, `creditBalance.ts`, `usage*.ts`, `compressionCacheStats.ts`
- Storage: `backup.ts`, `cleanup.ts`, `jsonMigration.ts`, `healthCheck.ts`, `databaseSettings.ts`
- Extension modules: `evals.ts`, `webhooks.ts`, `reasoningCache.ts`, `readCache.ts`, `tierConfig.ts`, `compressionCombos.ts`, `compressionScheduler.ts`, `batches.ts`, `files.ts`, `syncTokens.ts`, `proxies.ts`, `oneproxy.ts`, `upstreamProxy.ts`, `versionManager.ts`, `cliToolState.ts`, `prompts.ts`, `detailedLogs.ts`, `contextHandoffs.ts`, `compression.ts`, `stats.ts`
Live count: ls src/lib/db/*.ts | wc -l (currently 83). Drift detection: npm run check:docs-counts.
Schema migrations live in db/migrations/ (97 files as of v3.8.24) and run via migrationRunner.ts.
src/lib/localDb.ts is a re-export layer only — never add logic there.
- `core.ts`: `getDbInstance()` returns a singleton `better-sqlite3` instance with WAL journaling. `SCHEMA_SQL` defines 17 base tables (verify with `grep -c "CREATE TABLE" src/lib/db/core.ts`, minus 1 for the bookkeeping `_omniroute_migrations` table). Helpers: `rowToCamel`, `encryptConnectionFields`.
- `migrationRunner.ts`: Applies versioned SQL files from `db/migrations/` inside transactions. Tracks applied migrations in the `_omniroute_migrations` table.
- Migrations: 97 files (`001_initial_schema.sql` → `099_*.sql`). Each migration is idempotent and runs in a transaction. Live count: `ls src/lib/db/migrations/*.sql | wc -l`.
- Domain modules import `getDbInstance()` from `core.ts` for all CRUD operations. Each module owns a specific table or set of tables (e.g., `providers.ts` → `provider_connections`, `combos.ts` → `combos`). Encryption helpers protect sensitive fields at rest.
- `localDb.ts` re-exports all domain modules; consumers import from here for convenience.
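The `rowToCamel` helper is only named above, not shown. A plausible shape for such a helper, written here as an independent sketch under the assumption that it maps snake_case column names to camelCase keys, could look like this. The names `snakeToCamel` and `rowToCamelSketch` are hypothetical; check `core.ts` for the real behavior.

```typescript
// Hypothetical sketch, not the helper from src/lib/db/core.ts.

// Convert one snake_case identifier to camelCase.
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_whole: string, ch: string) => ch.toUpperCase());
}

// Return a shallow copy of a row with camelCase keys; values are untouched.
function rowToCamelSketch(row: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(row)) {
    out[snakeToCamel(key)] = row[key];
  }
  return out;
}

console.log(rowToCamelSketch({ provider_id: "openai", is_active: 1 }));
```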
Next.js App Router routes — each follows a consistent pattern:
```
Route → CORS preflight → Body validation (Zod) → Optional auth (extractApiKey/isValidApiKey)
      → API key policy enforcement (enforceApiKeyPolicy) → Handler delegation (open-sse)
```
| Route | Handler | Notes |
|---|---|---|
| `chat/completions/route.ts` | `handleChat()` | + prompt injection guard (clones request) |
| `responses/route.ts` | `handleChat()` (unified) | Responses API format |
| `embeddings/route.ts` | `handleEmbedding()` | Model listing + creation |
| `images/generations/route.ts` | `handleImageGeneration()` | Model listing + creation |
| `audio/transcriptions/route.ts` | audio handler | Multipart form data |
| `audio/speech/route.ts` | TTS handler | Binary audio response |
| `videos/generations/route.ts` | video handler | ComfyUI/SD WebUI |
| `music/generations/route.ts` | music handler | ComfyUI workflows |
| `moderations/route.ts` | moderation handler | Content safety |
| `rerank/route.ts` | rerank handler | Document relevance |
| `search/route.ts` | search handler | Web search (12 providers per `open-sse/handlers/search.ts:6`) |
No global Next.js middleware file — interception is route-specific. Auth is optional
(controlled by REQUIRE_API_KEY env). Prompt injection guard is unique to chat completions.
The open-sse/ workspace is the core streaming engine. Full request flow:
```
Client Request
→ src/app/api/v1/.../route.ts (Next.js route)
→ open-sse/handlers/chatCore.ts::handleChatCore()
→ Semantic/signature cache check
→ Rate limit check (rateLimitManager)
→ Combo routing? → open-sse/services/combo.ts::handleComboChat()
→ resolveComboTargets() → ordered ResolvedComboTarget[]
→ For each target: handleSingleModel() (wraps chatCore)
→ translateRequest() (open-sse/translator/)
→ Convert source format (e.g., OpenAI) → target format (e.g., Claude)
→ getExecutor() → provider-specific executor instance
→ executor.execute() (BaseExecutor → DefaultExecutor or provider-specific)
→ buildUrl() + buildHeaders() + transformRequest()
→ fetch() to upstream provider
→ Retry logic with exponential backoff
→ Response translation back to client format
→ If Responses API: responsesTransformer.ts TransformStream
→ SSE stream or JSON response to client
```
Handlers (open-sse/handlers/): chatCore.ts, responsesHandler.ts, embeddings.ts,
imageGeneration.ts, videoGeneration.ts, musicGeneration.ts, audioSpeech.ts,
audioTranscription.ts, moderations.ts, rerank.ts, search.ts.
Upstream headers: merged after default auth; same header name replaces executor value.
T5 intra-family fallback recomputes headers using only the fallback model id.
Forbidden header names: src/shared/constants/upstreamHeaders.ts — keep sanitize,
Zod schemas, and unit tests aligned when editing.
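The merge rule for upstream headers can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the forbidden names in `FORBIDDEN_SKETCH` are placeholders (the real list is in `src/shared/constants/upstreamHeaders.ts`), and matching header names case-insensitively is an assumption.

```typescript
type HeaderMap = Record<string, string>;

// Placeholder list for the sketch; see upstreamHeaders.ts for the real one.
const FORBIDDEN_SKETCH = ["host", "content-length"];

// Executor headers (including default auth) go in first; user-supplied
// upstream headers are merged afterwards and replace same-named values.
function mergeUpstreamHeaders(executorHeaders: HeaderMap, custom: HeaderMap): HeaderMap {
  const merged: HeaderMap = {};
  for (const key of Object.keys(executorHeaders)) {
    merged[key.toLowerCase()] = executorHeaders[key];
  }
  for (const key of Object.keys(custom)) {
    const lower = key.toLowerCase();
    if (FORBIDDEN_SKETCH.indexOf(lower) !== -1) continue; // drop forbidden names
    merged[lower] = custom[key];
  }
  return merged;
}

console.log(
  mergeUpstreamHeaders(
    { Authorization: "Bearer abc", "X-Trace": "1" },
    { "x-trace": "2", Host: "example.invalid" },
  ),
);
```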
- Free (3): Qoder AI, Qwen Code, Kiro AI
- OAuth (14): Claude Code, Antigravity, Codex, GitHub Copilot, Cursor, Kimi Coding, Kilo Code, Cline, Qwen (⚠️ free tier discontinued 2026-04-15), Kiro, Qoder, Gemini, Windsurf (v3.8), GitLab Duo (v3.8)
- API Key (120+): OpenAI, Anthropic, Gemini, DeepSeek, Groq, xAI, Mistral, Perplexity, Together, Fireworks, Cerebras, Cohere, NVIDIA, Nebius, SiliconFlow, Hyperbolic, HuggingFace, OpenRouter, Vertex AI, Cloudflare AI, Scaleway, AI/ML API, Pollinations, Puter, Longcat, Alibaba, Kimi, Minimax, Blackbox, Synthetic, Kilo Gateway, Z.AI, GLM, Deepgram, AssemblyAI, ElevenLabs, Cartesia, PlayHT, Inworld, NanoBanana, SD WebUI, ComfyUI, Ollama Cloud, Perplexity Search, Serper, Brave, Exa, Tavily, OpenCode Zen/Go, Bailian Coding Plan, DeepInfra, Vercel AI Gateway, Lambda AI, SambaNova, nScale, OVHcloud AI, Baseten, PublicAI, Moonshot AI, Meta Llama API, v0 (Vercel), Morph, Featherless AI, FriendliAI, LlamaGate, Galadriel, Weights & Biases Inference, Volcengine, AI21 Labs, Venice.ai, Codestral, Upstage, Maritalk, Xiaomi MiMo, Inference.net, NanoGPT, Predibase, Bytez, Heroku AI, Databricks, Snowflake Cortex, GigaChat (Sber), CrofAI, AgentRouter, ChatGPT Web, Baidu Qianfan, AWS Polly, RunwayML, GitLab Duo, Amazon Q, Empower, Poe, and many more.
- Self-Hosted (8+): LM Studio, vLLM, Lemonade, Llamafile, Triton, Docker Model Runner, Xinference, Oobabooga
- Custom: OpenAI-compatible (`openai-compatible-*`) and Anthropic-compatible (`anthropic-compatible-*`) prefixes
Providers are registered in src/shared/constants/providers.ts with Zod validation at module load.
Provider-specific request executors: base.ts, default.ts, cursor.ts, codex.ts,
antigravity.ts, github.ts, kiro.ts, qoder.ts, vertex.ts,
cloudflare-ai.ts, opencode.ts, pollinations.ts, puter.ts.
- `base.ts` (`BaseExecutor`): Abstract base with `buildUrl()`, `buildHeaders()`, `transformRequest()`, retry logic (exponential backoff), and `execute()`. Subclasses override URL/header/transform methods for provider-specific behavior.
- `default.ts` (`DefaultExecutor extends BaseExecutor`): Handles most OpenAI-compatible providers. Reads provider config from `providerRegistry.ts` to resolve base URL, auth header format, and request transformations.
- `getExecutor()` (`executors/index.ts`): Factory that returns the correct executor instance based on provider ID. Provider-specific executors (Cursor, Codex, Vertex, etc.) override only what differs from the default.
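The override pattern and the backoff schedule can be sketched as below. Everything here is illustrative: the class names, the 250 ms base delay, the 8 s cap, and the `Bearer` header format are assumptions, and the real `BaseExecutor` in `base.ts` may differ on all of them.

```typescript
// Hypothetical defaults; the real retry policy lives in base.ts.
function backoffDelayMs(attempt: number, baseMs: number = 250, maxMs: number = 8000): number {
  // Doubles per attempt (0-based) and is capped at maxMs.
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

abstract class SketchExecutor {
  // Subclasses must say where requests go.
  abstract buildUrl(model: string): string;

  // Default auth header format; subclasses may override.
  buildHeaders(apiKey: string): Record<string, string> {
    return { authorization: `Bearer ${apiKey}`, "content-type": "application/json" };
  }

  // Identity transform by default.
  transformRequest(body: unknown): unknown {
    return body;
  }
}

class SketchOpenAICompatible extends SketchExecutor {
  private readonly baseUrl: string;

  constructor(baseUrl: string) {
    super();
    this.baseUrl = baseUrl;
  }

  buildUrl(_model: string): string {
    return `${this.baseUrl}/chat/completions`;
  }
}

const sketchExecutor = new SketchOpenAICompatible("https://example.invalid/v1");
console.log(sketchExecutor.buildUrl("some-model"), backoffDelayMs(3));
```

A provider-specific executor would extend the same base and override only the method that differs, which is the design the list above describes.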
Translates between API formats (OpenAI-format ↔ Anthropic, Gemini, etc.). Includes request/response translators with helpers for image handling.
- `translator/index.ts`: Exports `translateRequest()` and format constants. Called by `chatCore.ts` before executor dispatch.
- Flow: `translateRequest(body, sourceFormat, targetFormat)` → detects source format (OpenAI, Anthropic, Gemini) → applies the matching translator module → returns transformed body ready for the target provider.
- Response translation runs in reverse after upstream response, converting back to the client's expected format.
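One direction of request translation, OpenAI-style chat messages to an Anthropic-style body, might look like the sketch below. It is a simplified illustration and not the translator module: it handles plain string content only, the function name is hypothetical, and the fallback `max_tokens` of 1024 is an arbitrary assumption. It relies on two facts about the Anthropic Messages format: system text is a top-level `system` field, and `max_tokens` is required.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface OpenAIStyleBody {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
}

interface AnthropicStyleBody {
  model: string;
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
  max_tokens: number;
}

// Hypothetical translator: hoists system messages into the top-level field.
function openAIToAnthropicSketch(body: OpenAIStyleBody): AnthropicStyleBody {
  const systemParts: string[] = [];
  const messages: AnthropicStyleBody["messages"] = [];
  for (const m of body.messages) {
    if (m.role === "system") {
      systemParts.push(m.content);
    } else {
      messages.push({ role: m.role, content: m.content });
    }
  }
  const out: AnthropicStyleBody = {
    model: body.model,
    messages,
    max_tokens: body.max_tokens ?? 1024, // assumed default for the sketch
  };
  if (systemParts.length > 0) out.system = systemParts.join("\n\n");
  return out;
}

console.log(
  openAIToAnthropicSketch({
    model: "some-model",
    messages: [
      { role: "system", content: "Be brief." },
      { role: "user", content: "Hello" },
    ],
  }),
);
```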
responsesTransformer.ts — transforms Responses API format to/from Chat Completions format.
- `createResponsesApiTransformStream()`: Returns a `TransformStream` that converts Chat Completions SSE chunks (`data: {"choices":[...]}`) into Responses API SSE events (`response.output_item.added`, `response.output_text.delta`, etc.).
- Used when the client sends a Responses API request: the request is internally converted to Chat Completions format, dispatched normally, and the response is piped through this transform stream before reaching the client.
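A heavily reduced sketch of the chunk conversion is shown below. It assumes one complete `data:` line per chunk and emits only text deltas; the real transformer also emits lifecycle events such as `response.output_item.added` and carries more fields per event. The function names are hypothetical.

```typescript
// Convert one Chat Completions SSE line into a Responses-style delta event,
// or return null when the line carries no text delta.
function chatChunkToResponsesEvent(line: string): string | null {
  if (line.indexOf("data: ") !== 0) return null;
  const payload = line.slice(6).trim();
  if (payload === "[DONE]") return null;
  const chunk = JSON.parse(payload) as {
    choices?: { delta?: { content?: string } }[];
  };
  const text = chunk.choices?.[0]?.delta?.content;
  if (!text) return null;
  const body = JSON.stringify({ type: "response.output_text.delta", delta: text });
  return `event: response.output_text.delta\ndata: ${body}\n\n`;
}

// Wrap the pure function in a TransformStream (global in modern Node.js).
function createSketchTransform(): TransformStream<string, string> {
  return new TransformStream<string, string>({
    transform(line, controller) {
      const converted = chatChunkToResponsesEvent(line);
      if (converted !== null) controller.enqueue(converted);
    },
  });
}

console.log(chatChunkToResponsesEvent('data: {"choices":[{"delta":{"content":"Hi"}}]}'));
```

Keeping the conversion in a pure function makes it easy to unit test without driving a stream.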
115 service modules in open-sse/services/ (top-level only; 184 including sub-dirs like autoCombo/ and compression/). Refresh: ls open-sse/services/*.ts | wc -l. Key modules:
combo.ts (routing engine), usage.ts, tokenRefresh.ts,
rateLimitManager.ts, accountFallback.ts, sessionManager.ts, wildcardRouter.ts,
autoCombo/, intentClassifier.ts, taskAwareRouter.ts, thinkingBudget.ts,
contextManager.ts, modelDeprecation.ts, modelFamilyFallback.ts,
emergencyFallback.ts, workflowFSM.ts, backgroundTaskDetector.ts, ipFilter.ts,
signatureCache.ts, volumeDetector.ts, contextHandoff.ts, compression/ (prompt
compression pipeline), and more.
Modular prompt compression that runs proactively before the existing reactive context manager.
- `strategySelector.ts`: Selects compression mode based on config, compression combo assignments, combo overrides, auto-trigger thresholds, and defaults. Priority: assigned compression combo > combo override > auto-trigger > default mode > off.
- `lite.ts`: 5 lite-mode techniques: `collapseWhitespace`, `dedupSystemPrompt`, `compressToolResults`, `removeRedundantContent`, `replaceImageUrls`. Target: 10-15% savings at <1ms latency.
- `caveman.ts` / `cavemanRules.ts`: Caveman-style semantic condensation backed by built-in rules plus file-loaded language packs under `compression/rules/`.
- `engines/rtk/`: Rule-based terminal/tool-output compression inspired by RTK patterns. Detects command output classes, applies JSON filter packs, deduplicates repeated lines, strips ANSI/code noise, and preserves errors/actionable context. The RTK JSON DSL supports replace, match-output short-circuit, strip/keep, per-line truncation, head/tail/max-line truncation, inline tests, trust-gated project/global custom filters, and optional redacted raw-output retention for authenticated recovery.
- `engines/registry.ts`: Registers engines (`caveman`, `rtk`) and powers stacked pipelines.
- `stats.ts`: Per-request compression stats tracking (original tokens, compressed tokens, savings %, techniques used, engine breakdown, compression combo id).
- `types.ts`: `CompressionMode` (`off` / `lite` / `standard` / `aggressive` / `ultra` / `rtk` / `stacked`), `CompressionConfig`, `CompressionStats`, `CompressionResult`.
- DB settings in `src/lib/db/compression.ts`, compression combos in `src/lib/db/compressionCombos.ts`, API routes under `src/app/api/settings/compression/`, `src/app/api/context/*`, and preview/language-pack routes under `src/app/api/compression/*`.
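The selection priority and one lite-mode technique can be sketched like this. The input shape, the token-threshold trigger, and both function names are assumptions made for the example; only the priority order and the mode names come from the list above.

```typescript
type SketchMode = "off" | "lite" | "standard" | "aggressive" | "ultra" | "rtk" | "stacked";

interface SelectionInput {
  assignedComboMode?: SketchMode; // from an assigned compression combo
  comboOverrideMode?: SketchMode; // per-combo override
  autoTriggerTokens?: number; // hypothetical threshold
  autoTriggerMode?: SketchMode;
  defaultMode?: SketchMode;
  promptTokens: number;
}

// Priority: assigned combo > combo override > auto-trigger > default > off.
function selectModeSketch(input: SelectionInput): SketchMode {
  if (input.assignedComboMode) return input.assignedComboMode;
  if (input.comboOverrideMode) return input.comboOverrideMode;
  if (
    input.autoTriggerTokens !== undefined &&
    input.autoTriggerMode !== undefined &&
    input.promptTokens >= input.autoTriggerTokens
  ) {
    return input.autoTriggerMode;
  }
  return input.defaultMode ?? "off";
}

// Collapse runs of spaces/tabs and squeeze 3+ newlines down to one blank line.
function collapseWhitespaceSketch(text: string): string {
  return text.replace(/[ \t]+/g, " ").replace(/\n{3,}/g, "\n\n").trim();
}

console.log(selectModeSketch({ promptTokens: 5000, autoTriggerTokens: 4000, autoTriggerMode: "standard" }));
console.log(JSON.stringify(collapseWhitespaceSketch("a   b\n\n\n\nc  ")));
```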
- `handleComboChat()`: Entry point for combo-routed requests. Receives the combo config and iterates through targets in order until one succeeds or all fail.
- `resolveComboTargets()`: Expands a combo configuration into an ordered array of `ResolvedComboTarget[]`, each specifying provider + model + account + credentials.
- Strategies (15): priority, weighted, fill-first, round-robin, P2C, random, least-used, reset-aware (v3.8), reset-window, cost-optimized, strict-random, auto, lkgp, context-optimized, context-relay. Source: `ROUTING_STRATEGY_VALUES` in `src/shared/constants/routingStrategies.ts`.
- Each target calls `handleSingleModel()`, which wraps `handleChatCore()` with per-target error handling and circuit breaker checks.
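The "order the targets, then try each until one succeeds" loop can be sketched as below for two of the listed strategies. The types and function names are hypothetical, and treating a lower `priority` number as "try first" is an assumption; the real resolver also attaches accounts and credentials and checks circuit breakers.

```typescript
interface SketchTarget {
  provider: string;
  model: string;
  priority: number; // assumption: lower number is tried first
}

// Produce the attempt order for two simple strategies.
function orderTargets(
  targets: SketchTarget[],
  strategy: "priority" | "round-robin",
  counter: number = 0,
): SketchTarget[] {
  if (strategy === "priority") {
    return targets.slice().sort((a, b) => a.priority - b.priority);
  }
  if (targets.length === 0) return [];
  const start = counter % targets.length;
  return targets.slice(start).concat(targets.slice(0, start));
}

// Try each target in order; return the first success or throw once all fail.
async function runWithFallback<T>(
  targets: SketchTarget[],
  call: (target: SketchTarget) => Promise<T>,
): Promise<T> {
  const errors: string[] = [];
  for (const target of targets) {
    try {
      return await call(target);
    } catch (err) {
      errors.push(`${target.provider}/${target.model}: ${String(err)}`);
    }
  }
  throw new Error(`all targets failed: ${errors.join("; ")}`);
}

const sketchTargets: SketchTarget[] = [
  { provider: "a", model: "m1", priority: 2 },
  { provider: "b", model: "m2", priority: 1 },
  { provider: "c", model: "m3", priority: 3 },
];
console.log(orderTargets(sketchTargets, "priority").map((t) => t.provider));
console.log(orderTargets(sketchTargets, "round-robin", 1).map((t) => t.provider));
```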
Policy engine modules: policyEngine.ts, comboResolver.ts, costRules.ts,
degradation.ts, fallbackPolicy.ts, lockoutPolicy.ts, modelAvailability.ts,
providerExpiration.ts, quotaCache.ts, responses.ts, configAudit.ts.
94 tools total (TOTAL_MCP_TOOL_COUNT, open-sse/mcp-server/server.ts): a 34-entry base registry (MCP_TOOLS in schemas/tools.ts, bundling the core / cache / compression / 1proxy / advanced tools) plus standalone module sets — memory (3), skill (4), agentSkill (3), pool (6), gamification (8), plugin (8), notion (6), obsidian (22). 3 transports (stdio / SSE / Streamable HTTP). Scoped auth (30 scopes — see OMNIROUTE_MCP_SCOPES), Zod schemas. See docs/frameworks/MCP-SERVER.md.
Core tools (20): get_health, list_combos, get_combo_metrics, switch_combo, check_quota, route_request, cost_report, list_models_catalog, web_search, simulate_route, set_budget_guard, set_routing_strategy, set_resilience_profile, test_combo, get_provider_metrics, best_combo_for_task, explain_route, get_session_snapshot, db_health_check, sync_pricing.
Cache tools (2): cache_stats, cache_flush.
Compression tools (5): compression_status, compression_configure, set_compression_engine, list_compression_combos, compression_combo_stats.
1proxy tools (3): oneproxy_fetch, oneproxy_rotate, oneproxy_stats.
Memory tools (3): memory_search, memory_add, memory_clear.
Skill tools (4): skills_list, skills_enable, skills_execute, skills_executions.
Agent-skill tools (3): A2A skill discovery / invocation bridges.
Gamification tools (8): levels, badges, leaderboard, and community-federation queries.
Plugin tools (8): plugin marketplace listing, install/enable/disable, and runtime inspection.
Notion tools (6) + Obsidian tools (22): knowledge-base read/write integrations (the largest tool family — vault search, note CRUD, WebDAV-backed file ops).
- Tool registration: Each tool is an object with `{ name, description, inputSchema: ZodSchema, handler: async (args) => {...} }`. Zod validates inputs before the handler fires.
- `createMcpServer()` and `startMcpStdio()` are exported from `mcp-server/index.ts`. `createMcpServer()` wires all tool sets; `startMcpStdio()` launches the stdio transport.
- Transports: stdio (CLI `omniroute --mcp`), SSE (`/api/mcp/sse`), Streamable HTTP (`/api/mcp/stream`). All share the same tool/scope engine.
- Scopes (30): Control which tool categories an API key can access. Enforcement happens before handler dispatch.
- Audit: Every tool invocation is logged to SQLite (`mcp_audit` table) with tool name, args, success/failure, API key attribution, and timestamp.
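The ordering "scope check, then input validation, then handler" can be illustrated with a dependency-free sketch. To stay self-contained it replaces the Zod schema with a plain `parse` function, and the `requiredScope` field and the scope name `read:health` are invented for the example; consult `OMNIROUTE_MCP_SCOPES` for real scope names.

```typescript
interface SketchTool<A> {
  name: string;
  description: string;
  requiredScope: string; // hypothetical field for the sketch
  parse: (raw: unknown) => A; // stands in for a Zod schema's parse()
  handler: (args: A) => Promise<unknown>;
}

// Enforce the scope first, then validate input; the caller runs the handler.
function authorizeAndParse<A>(tool: SketchTool<A>, grantedScopes: string[], raw: unknown): A {
  if (grantedScopes.indexOf(tool.requiredScope) === -1) {
    throw new Error(`scope "${tool.requiredScope}" required for ${tool.name}`);
  }
  return tool.parse(raw);
}

const healthToolSketch: SketchTool<{ verbose: boolean }> = {
  name: "get_health",
  description: "Report service health",
  requiredScope: "read:health", // invented scope name
  parse: (raw) => {
    const verbose = (raw as { verbose?: unknown } | null)?.verbose;
    if (typeof verbose !== "boolean") throw new Error("verbose must be a boolean");
    return { verbose };
  },
  handler: async (args) => ({ ok: true, verbose: args.verbose }),
};

const parsedArgs = authorizeAndParse(healthToolSketch, ["read:health"], { verbose: true });
healthToolSketch.handler(parsedArgs).then((result) => console.log(result));
```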
JSON-RPC 2.0, SSE streaming, Task Manager with TTL cleanup.
Agent Card at /.well-known/agent.json.
Skills (6): smartRouting.ts, quotaManagement.ts, providerDiscovery.ts, costAnalysis.ts, healthReport.ts, listCapabilities.ts.
- `taskManager.ts`: State machine lifecycle for tasks: `submitted → working → completed | failed | canceled`. Tasks have a TTL and are cleaned up automatically.
- JSON-RPC methods: `message/send` (sync), `message/stream` (SSE), `tasks/get`, `tasks/cancel`. Dispatched via `POST /a2a`.
- Skills: Registered in a DB-backed registry. Each skill receives task context (messages, metadata) and returns structured results. `quotaManagement.ts` summarizes quota; `smartRouting.ts` recommends routing decisions.
- Agent Card: `/.well-known/agent.json` exposes capabilities, skills, and metadata for client auto-discovery.
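The task lifecycle above maps naturally onto a small transition table. The sketch below is an illustration rather than the `taskManager.ts` implementation: allowing `submitted → canceled` directly is an assumption, and the TTL helper takes explicit timestamps so it stays deterministic.

```typescript
type TaskState = "submitted" | "working" | "completed" | "failed" | "canceled";

const ALLOWED_TRANSITIONS: Record<TaskState, TaskState[]> = {
  submitted: ["working", "canceled"], // cancel-before-start is an assumption
  working: ["completed", "failed", "canceled"],
  completed: [], // terminal
  failed: [], // terminal
  canceled: [], // terminal
};

// Return the new state, or throw when the move is not in the table.
function transition(from: TaskState, to: TaskState): TaskState {
  if (ALLOWED_TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`illegal transition: ${from} -> ${to}`);
  }
  return to;
}

// A task is expired once strictly more than ttlMs has elapsed.
function isExpired(createdAtMs: number, ttlMs: number, nowMs: number): boolean {
  return nowMs - createdAtMs > ttlMs;
}

let sketchState: TaskState = "submitted";
sketchState = transition(sketchState, "working");
sketchState = transition(sketchState, "completed");
console.log(sketchState, isExpired(0, 1000, 1001));
```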
Agent Communication Protocol registry and manager.
Extraction, injection, retrieval, summarization, and store modules for persistent conversational memory across sessions.
Extensible skill framework: registry, executor, sandbox, built-in skills, custom skill support, interception, and injection.
- `registry.ts`: DB-backed skill registration and discovery. Skills have metadata (name, description, version, enabled status) stored in SQLite.
- `executor.ts`: Execution engine with configurable timeout and retry logic. Receives skill name + input, looks up the skill, runs it in the sandbox.
- `sandbox.ts`: Isolation layer for custom (user-provided) skills. Limits resource access and execution time.
- Built-in skills: Ship with OmniRoute (e.g., quota management, routing). Located alongside the registry.
- Interception/Injection: Skills can intercept requests in the pipeline (pre/post processing) or inject context into prompts.
Policy index for compliance enforcement.
MITM proxy capability with certificate management, DNS handling, and target routing.
Request middleware including promptInjectionGuard.ts.
Hot-reloadable guardrails framework (3 built-in: pii-masker, prompt-injection, vision-bridge). Fail-open. The pii-masker guardrail is registered and runs on every request, but its data-mutating logic is opt-in and OFF by default — it only redacts when PII_REDACTION_ENABLED (request) / PII_RESPONSE_SANITIZATION (response + streaming) are enabled (both defaultValue: "false"); with them off, payloads pass through untouched. A request can additionally opt OUT of any guardrail via header (x-omniroute-disabled-guardrails). Never make PII default-on (Hard Rule #20). See docs/security/GUARDRAILS.md.
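The opt-in gating described above can be sketched as follows. The flag name and header name come from the text; everything else is an assumption for the example: that the flag is compared against the string `"true"`, that the header holds a comma-separated list of guardrail ids, and that emails are the only pattern masked. The function names are hypothetical.

```typescript
interface GuardrailContext {
  env: Record<string, string | undefined>;
  headers: Record<string, string | undefined>;
}

// Redaction runs only when the flag is on AND the request did not opt out.
function isPiiRedactionActive(ctx: GuardrailContext): boolean {
  if ((ctx.env["PII_REDACTION_ENABLED"] ?? "false") !== "true") return false;
  const disabled = (ctx.headers["x-omniroute-disabled-guardrails"] ?? "")
    .split(",")
    .map((id) => id.trim().toLowerCase());
  return disabled.indexOf("pii-masker") === -1;
}

// With the flag off (the default), the payload passes through untouched.
function maskEmailsSketch(text: string, ctx: GuardrailContext): string {
  if (!isPiiRedactionActive(ctx)) return text;
  return text.replace(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[EMAIL]");
}

console.log(maskEmailsSketch("contact a@b.io now", { env: {}, headers: {} }));
console.log(
  maskEmailsSketch("contact a@b.io now", { env: { PII_REDACTION_ENABLED: "true" }, headers: {} }),
);
```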
CloudAgentBase abstract class + 3 agents (codex-cloud, devin, jules). Tasks persisted in cloud_agent_tasks; management auth required. See docs/frameworks/CLOUD_AGENT.md.
Generic eval framework: evalRunner.ts, runtime.ts. Targets: combo / model / suite-default. See docs/frameworks/EVALS.md.
HMAC-signed delivery, exponential backoff, auto-disable after 10 failures. 7 event types. See docs/frameworks/WEBHOOKS.md.
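A minimal sketch of signed delivery with backoff and auto-disable, using Node's built-in `crypto` module. The hash algorithm (SHA-256), hex encoding, the 1 s base delay, and all function names are assumptions; only the threshold of 10 failures comes from the line above. See `docs/frameworks/WEBHOOKS.md` for the actual scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const MAX_CONSECUTIVE_FAILURES = 10; // from the text above

// Sign the raw request body; algorithm and encoding are assumed.
function signPayload(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// Constant-time comparison on the receiver side.
function verifySignature(secret: string, body: string, signature: string): boolean {
  const expected = Buffer.from(signPayload(secret, body), "hex");
  const received = Buffer.from(signature, "hex");
  return expected.length === received.length && timingSafeEqual(expected, received);
}

// Delay before the next retry after `failures` consecutive failures (>= 1).
function nextRetryDelayMs(failures: number, baseMs: number = 1000): number {
  return baseMs * Math.pow(2, failures - 1);
}

function shouldDisable(consecutiveFailures: number): boolean {
  return consecutiveFailures >= MAX_CONSECUTIVE_FAILURES;
}

const sketchBody = JSON.stringify({ event: "example.event", id: 1 });
const sketchSignature = signPayload("shared-secret", sketchBody);
console.log(verifySignature("shared-secret", sketchBody, sketchSignature));
```

The length check before `timingSafeEqual` matters because that function throws when the two buffers differ in length.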
classify → policies → enforce. 3 route classes (PUBLIC / CLIENT_API / MANAGEMENT). See docs/architecture/AUTHZ_GUIDE.md.
Hybrid in-memory + SQLite cache for reasoning_content. Re-injects on multi-turn for strict providers (DeepSeek V4, Kimi K2, Qwen-Thinking, GLM, xiaomi-mimo). See docs/routing/REASONING_REPLAY.md.
Cloudflare Quick/Named, ngrok, Tailscale Funnel. See docs/ops/TUNNELS_GUIDE.md.
- Register in `src/shared/constants/providers.ts`
- Add executor in `open-sse/executors/` (if custom logic needed)
- Add translator in `open-sse/translator/` (if non-OpenAI format)
- Add OAuth config in `src/lib/oauth/constants/oauth.ts` (if OAuth-based)
- Add models in `open-sse/config/providerRegistry.ts`
For any non-trivial change, read the matching deep-dive first:
- `src/lib/db/AGENTS.md`: SQLite persistence, domain modules, migrations
- `open-sse/services/AGENTS.md`: Routing engine, combo resolution, strategy selection
This repository is a fork of diegosouzapw/OmniRoute. Keep fork-only operational
changes (for example GHCR image publishing, personal deployment workflows, or local
automation) out of upstream contribution PRs.
When preparing a PR for upstream, always start the work branch from upstream/main,
not from this fork's main:
```bash
git fetch upstream
git switch -c <branch-name> upstream/main
```
Only cherry-pick or reapply the changes intended for the upstream PR.
- DB ops go through `src/lib/db/` modules, never raw SQL in routes
- Provider requests flow through `open-sse/handlers/`
- MCP/A2A pages are tabs inside `/dashboard/endpoint`, not standalone routes
- No memory leaks in SSE streams (abort signals, cleanup)
- Rate limit headers must be parsed correctly
- All API inputs validated with Zod schemas
- Provider constants validated at module load via Zod (`src/shared/validation/providerSchema.ts`)
- Pricing data syncs from LiteLLM via `src/lib/pricingSync.ts`
- Memory/Skills are cross-cutting: affect MCP tools, request pipeline, and A2A skills
- ⛔ NEVER close a contributor's PR after using their code; always merge via GitHub so they get credit. See `.agents/workflows/review-prs.md` for full policy.