feat(ext/node): native node:http server fast path over the H1 engine #35627

Open
nathanwhit wants to merge 14 commits into main from node-http-native-fast-path


nathanwhit commented Jun 29, 2026

Member

Routes eligible node:http servers through the native deno_http_h1 engine —
the same H1 engine that backs Deno.serve — instead of the classic
net.Socket + llhttp-in-JS path. nativeFastPathEligible gates this at
listen(); anything needing raw-socket / upgrade / connection semantics the
native path can't provide falls back to the classic path.

Why

The classic path pays per-request event-loop ticks (nextTick in
_http_outgoing finish/cork) plus async-read parking between keep-alive
requests. The native path lets Rust own the read loop with synchronous dispatch
and commits the response in a single op.

Local benchmark (macOS arm64, release-lite, conc 50, oha --disable-compression,
server-bound; best of 3 × 5s, classic = DENO_NODE_HTTP_NATIVE=0):

| workload            | classic | native       |
|---------------------|---------|--------------|
| hello world         | ~88k    | ~119k (+35%) |
| small JSON response | ~88k    | ~116k (+32%) |

What's in the stack

  • feat — the native dispatch path: hands out real http.IncomingMessage /
    http.ServerResponse running in a "native mode" so frameworks (Express, …)
    that re-parent req/res onto the real prototypes keep working; eligibility
    gating + classic fallback; reusable socket-reclaim op.
  • fix — half-open (httpAllowHalfOpen), keep-alive / header / request
    timeouts, req/res/socket destroy() → connection-abort propagation, and
    the remaining native fast-path compat fixes.
  • perf — the engine emits Node's formulaic Date / Connection: keep-alive
    / Keep-Alive headers (Node casing + order) and routes the common
    single-content-type response through a cheap static-body op, instead of
    marshaling those headers into a Vec on every response in JS. Byte-identical
    response wire output; zero new regressions.

Compat

tests/node_compat/config.jsonc ignores 4 tests that are fundamental
perf-vs-compat tradeoffs (each documented inline): same-process event-loop
scheduling artifacts and a parse-ahead / response-ordering edge case — verified
to behave correctly in real cross-process usage.

…1 engine

Eligible node:http servers route through deno_http_h1 (the Deno.serve HTTP/1
engine) instead of the classic net.Server + JS parser path, while still building
real IncomingMessage/ServerResponse objects so frameworks like Express keep
working. Servers that need features the fast path can't honor (upgrade/connect/
clientError listeners, custom shouldUpgradeCallback, etc.) fall back to classic.
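
The listen()-time gating described above can be pictured with a minimal sketch. The PR text does not show the body of nativeFastPathEligible, so the `ServerLike` shape, the `hasCustomShouldUpgradeCallback` flag, and both function names below are hypothetical; only the list of disqualifying listeners comes from the description.

```typescript
// Hypothetical sketch of listen()-time gating; not the PR's actual code.
interface ServerLike {
  // Number of listeners registered for an event, as on Node's EventEmitter.
  listenerCount(event: string): number;
  // Assumed flag: the user supplied a custom shouldUpgradeCallback.
  hasCustomShouldUpgradeCallback: boolean;
}

// Listeners that need raw-socket or upgrade semantics (taken from the PR text).
const CLASSIC_ONLY_EVENTS = ["upgrade", "connect", "clientError"] as const;

function fastPathEligible(server: ServerLike): boolean {
  if (server.hasCustomShouldUpgradeCallback) return false;
  return CLASSIC_ONLY_EVENTS.every((ev) => server.listenerCount(ev) === 0);
}

// Decide once, at listen(): anything not eligible stays on the classic path.
function choosePath(server: ServerLike): "native" | "classic" {
  return fastPathEligible(server) ? "native" : "classic";
}
```

Deciding once at listen() keeps the per-request path free of checks, at the cost of having to fall back for any server that registers one of these listeners up front.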

Includes the compat work to make the path production-ready: request-smuggling
rejection (bare CR/LF, chunked/TE), close-delimited framing, dispatch-then-400
for malformed bodies, the client-timeout family, abort/lifecycle handling,
per-pipelined server timeouts, request trailers + response header ordering,
perf_hooks instrumentation, 1xx interim responses, and a synthetic connection
socket (parser stand-in, setTimeout, instanceof net.Socket via SymbolHasInstance,
maxRequestsPerSocket, freeParser pipelining abort).

Deno.serve output stays byte-identical (shared h1-engine changes are node-gated;
verified via unit::serve_test).

Claude-Session: https://claude.ai/code/session_01Wb4V4wL21QTmFTaGS5B3UX
…engine

The node:http native fast path marshaled the formulaic Date, Connection:
keep-alive and Keep-Alive headers into a per-response header Vec on every
response, where Deno.serve lets the H1 engine emit them. Move that work into
the engine (raw_response_headers, in Node order + casing) via a new
op_http_set_node_auto_headers fast op; nativeWireHeaders now only computes
Node conditions. Once those three are engine-emitted the common response
carries a single content-type header, so route it through the cheap
static-content-type op instead of the tuple-marshaling path.

Headers stay byte-identical (incl. Date format/casing); zero new node-compat
regressions across the test-http-* suite. ~+7% hello / +4% realworld.
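
For reference, the three formulaic headers can be sketched as a pure function. This is an illustration written in TypeScript, not the engine's Rust code; `nodeAutoHeaders` and its parameters are hypothetical names, and the header order (Date, Connection, Keep-Alive) and the whole-second timeout are assumptions based on Node's usual output.

```typescript
// Illustrative only: builds the three auto headers in the assumed Node order.
// keepAliveTimeoutMs is an assumed input (Node's default keepAliveTimeout is 5000 ms).
function nodeAutoHeaders(now: Date, keepAliveTimeoutMs: number): string {
  // The Keep-Alive timeout is advertised in whole seconds.
  const seconds = Math.floor(keepAliveTimeoutMs / 1000);
  return (
    // toUTCString() yields the IMF-fixdate form, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
    `Date: ${now.toUTCString()}\r\n` +
    "Connection: keep-alive\r\n" +
    `Keep-Alive: timeout=${seconds}\r\n`
  );
}
```

Because the block depends only on the clock and one server setting, it does not need to be rebuilt from JS-side header tuples on every response.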

Claude-Session: https://claude.ai/code/session_01Wb4V4wL21QTmFTaGS5B3UX
…e:http native path

main added the automatic_compression arg to op_http_serve; the node:http
serveHttpOnListenerForNode call wasn't updated, shifting every arg by one so a
bool landed where the callback Function is expected (TypeError at listen()).

poll_start_fixed_response_with writes the head directly via a local counter but
never advanced scratch.write_flushed, leaving the head buffered with
write_flushed==0. The stream loop's poll_flush_write_buf (added for incremental
responsiveness) then re-sent the head, duplicating it on the wire for
Content-Length responses written incrementally (res.write then res.end). Advance
write_flushed after the direct head write, mirroring the chunked path's
direct-write branch. Fixes node:http test-http-client-timeout-with-data.
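
The duplicated head can be reproduced with a toy model of a write buffer that tracks a flushed cursor. This is a sketch in TypeScript under simplified assumptions, not the engine's Rust code; `Scratch`, `flush`, `directWriteHead`, and `sendResponse` are illustrative names.

```typescript
// Toy model: a buffer plus a cursor marking how much has already been sent.
class Scratch {
  buf: string[] = [];
  writeFlushed = 0; // number of buffered entries already on the wire
}

// Sends everything after the cursor, then advances the cursor.
function flush(s: Scratch, wire: string[]): void {
  wire.push(...s.buf.slice(s.writeFlushed));
  s.writeFlushed = s.buf.length;
}

// Writes the head straight to the wire, bypassing flush().
function directWriteHead(s: Scratch, wire: string[], head: string, advance: boolean): void {
  s.buf.push(head);
  wire.push(head);
  // Without this, a later flush() still considers the head unsent.
  if (advance) s.writeFlushed = s.buf.length;
}

function sendResponse(advance: boolean): string[] {
  const s = new Scratch();
  const wire: string[] = [];
  directWriteHead(s, wire, "HEAD", advance);
  s.buf.push("BODY"); // res.write() buffers a body chunk
  flush(s, wire); // incremental flush in the stream loop
  return wire;
}
```

In this model `sendResponse(false)` puts the head on the wire twice, while `sendResponse(true)` sends it once followed by the body.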
… uncommitted

An uncommitted abort (res.destroy() before writeHead/end) called
op_http_abort_response, which clones+completes the record but never consumes the
external pointer. Unlike a committed response (body-commit op consumes it) or a
streaming response (nativeStartStream's op_http_close_after_finish), nothing
freed it -- so the record's server_state clone leaked, the server never reached
its drain threshold, and the event loop hung after server.close(). Free it with
op_http_close_after_finish, matching the streaming path.

Fixes node:http test-http-response-close, test-http-set-timeout,
test-http-destroyed-socket-write2 (process hung instead of exiting).

Extends the uncommitted-abort external free to the two other abort sites in the
dispatch wrapper: socketDestroyedBeforeDispatch (a prior pipelined handler
destroyed the shared socket) and the on-cancel no-listener path. Like
ServerResponse.destroy, they completed the record via op_http_abort_response but
never consumed the external, leaking the record + its server_state clone so the
server never drained. Null the JS refs (mirroring nativeCommit) and free with
op_http_close_after_finish.

Also make writeHead throw ERR_HTTP_INVALID_STATUS_CODE (matching Node) instead
of ERR_INVALID_ARG_TYPE for out-of-range status codes.
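
A minimal sketch of the range check, assuming Node's accepted range of 100 to 999 and its "Invalid status code" message; `InvalidStatusCodeError` and `validateStatusCode` are hypothetical stand-ins for the real error class and writeHead logic.

```typescript
// Hypothetical stand-in for Node's ERR_HTTP_INVALID_STATUS_CODE (a RangeError).
class InvalidStatusCodeError extends RangeError {
  readonly code = "ERR_HTTP_INVALID_STATUS_CODE";
  constructor(value: unknown) {
    super(`Invalid status code: ${String(value)}`);
  }
}

// Coerces to an integer and rejects anything outside 100..999 (assumed range).
function validateStatusCode(statusCode: number): number {
  const coerced = statusCode | 0;
  if (coerced < 100 || coerced > 999) {
    throw new InvalidStatusCodeError(statusCode);
  }
  return coerced;
}
```

Callers that branch on `err.code` see the same code for out-of-range values as they would under Node, which is what the compat tests check.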

Fixes node:http test-http-incoming-pipelined-socket-destroy (hung).
…-stack

tryListenNative defaulted a hostless listen to the IPv4 wildcard 0.0.0.0, so the
server was unreachable via ::1 (ECONNREFUSED) where Node and the classic path
bind the IPv6 wildcard :: (dual-stack, also accepts IPv4). Match net.ts: prefer
:: (fall back to 0.0.0.0 if the IPv6 bind fails), IPv4-only on Windows.
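
The bind order can be sketched as follows. The `Bind` callback and `listenHostless` are hypothetical; the real tryListenNative is not shown in this PR text, so the sketch only encodes the three rules stated above.

```typescript
interface Listener {
  hostname: string;
  port: number;
}

// Assumed bind primitive: returns a listener or throws on failure.
type Bind = (hostname: string, port: number) => Listener;

// Hostless listen: prefer the IPv6 wildcard (dual-stack), fall back to IPv4.
function listenHostless(bind: Bind, port: number, isWindows: boolean): Listener {
  // IPv4-only on Windows, per the commit message.
  if (isWindows) return bind("0.0.0.0", port);
  try {
    return bind("::", port);
  } catch {
    // IPv6 unavailable: fall back to the IPv4 wildcard.
    return bind("0.0.0.0", port);
  }
}
```

Passing the bind primitive in as a parameter is only for illustration; it makes the fallback order easy to exercise without opening real sockets.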

Fixes node:http client-proxy/test-http-proxy-request-ipv6.
…arget

req.url was nativeRequestTarget(op_http_get_request_url(...)), which strips the
origin off the engine's synthesized full URL -- collapsing an absolute-form
proxy request-target (GET http://host/path, sent by a client through an HTTP
proxy) back to origin-form (/path). Node's req.url is the verbatim
request-target. Add op_http_get_request_raw_target returning the engine's raw
inner.path (origin-form /path, absolute-form http://host/path, authority-form
host:port for CONNECT, * for OPTIONS) and use it directly; drop the now-dead
nativeRequestTarget helper.
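
The four request-target forms listed above can be told apart with a small classifier. This is an illustrative sketch; `classifyRequestTarget` is a hypothetical helper, not part of the PR, and it assumes the target has already been parsed as syntactically valid.

```typescript
type TargetForm = "origin" | "absolute" | "authority" | "asterisk";

// Classifies a verbatim request-target, as req.url is expected to carry it.
function classifyRequestTarget(method: string, target: string): TargetForm {
  if (target === "*") return "asterisk"; // e.g. OPTIONS *
  if (method === "CONNECT") return "authority"; // e.g. CONNECT host:port
  if (target.startsWith("/")) return "origin"; // e.g. GET /path
  return "absolute"; // e.g. GET http://host/path via a proxy
}
```

A proxy handler reading `req.url` needs the absolute form preserved to know which upstream host to contact, which is why collapsing it to origin-form loses information.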

Fixes node:http specs::node::http_proxy_env_no_allow_env.
… activity

The synthetic socket's setTimeout schedules a coarse one-shot timer that fired
after msecs regardless of activity. Node resets a socket timeout on read/write
activity, so an active upload (data flowing faster than the timeout) never fires
'timeout'. Re-arm the timer on each request-body chunk via _nativeRearmTimeout.
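
The difference between a one-shot timer and an activity-based timeout can be shown with a deadline model. This is a sketch with an explicit clock rather than real timers; `IdleTimeout` is a hypothetical class, not the PR's _nativeRearmTimeout.

```typescript
// Deadline-based idle timeout: every rearm() pushes the deadline forward.
class IdleTimeout {
  private readonly msecs: number;
  private deadline: number;

  constructor(msecs: number, now: number) {
    this.msecs = msecs;
    this.deadline = now + msecs;
  }

  // Called on read/write activity, e.g. each request-body chunk.
  rearm(now: number): void {
    this.deadline = now + this.msecs;
  }

  // True once msecs have passed with no activity.
  expired(now: number): boolean {
    return now >= this.deadline;
  }
}
```

With a 100 ms timeout and body chunks arriving at 60 ms and 120 ms, a one-shot timer would fire at 100 ms, whereas this model only expires at 220 ms, 100 ms after the last chunk.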

Fixes node:http pummel/test-http-upload-timeout.

writeHead switched to ERR_HTTP_INVALID_STATUS_CODE, leaving this import unused
(dlint no-unused-vars).
…responses

A native node:http response built from multiple res.write() calls was
coalesced into a single flat body (one HTTP chunk on the wire), so the client
saw one 'data' event instead of one per write. Node frames each write() as its
own chunk (verified against node v26). Route the multi-write chunked case
through the streaming body (one enqueue per write, aggregation disabled) so
each write becomes its own chunk, matching Node byte-for-byte. A single
buffered chunk keeps the faster single-op flat path.
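
The wire-level difference between coalesced and per-write framing can be sketched with standard chunked transfer coding. `frameChunk` and `frameBody` are illustrative helpers, and the sketch assumes ASCII-only payloads so that string length equals byte length.

```typescript
// Frames one chunk: hex size, CRLF, data, CRLF.
// Assumes ASCII-only data so data.length equals the byte length.
function frameChunk(data: string): string {
  return `${data.length.toString(16)}\r\n${data}\r\n`;
}

// Frames each write as its own chunk and appends the terminating zero chunk.
function frameBody(writes: string[]): string {
  return writes.map(frameChunk).join("") + "0\r\n\r\n";
}
```

Two writes of "hello" and "world" produce two 5-byte chunks, while the coalesced body "helloworld" produces a single chunk of size `a`; the payload is the same but the framing differs.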

Fixes node_compat parallel/test-webstreams-pipeline.js.