-sNODERAWSOCKETS DNS resolution, with general readiness #27182
Open
guybedford wants to merge 3 commits into
Adds epoll_create1/epoll_ctl/epoll_wait/epoll_pwait and a non-blocking JS-callback variant, emscripten_epoll_set_callback, on a single fd readiness model shared with poll().

Readiness is source-based: producers (sockets, pipes) post edges to a wait-queue on the FS node, which dup'd fds share. An epoll instance is a real FS fd whose stream holds an interest map (fd -> registration) and a ready list. epoll_ctl ADD arms a persistent listener on the watched node - the registration's edge in the interest graph; on an edge the listener appends the registration to the epoll's ready list (Linux's rdllist) and wakes any waiter. Because a source-based model only learns readiness from edges, epoll_ctl ADD/MOD also samples the current level once, so an fd already ready when watched is reported with no further event needed.

A wait consumes the ready list (Linux's ep_send_events): each listed registration is re-derived against its current mask; level-triggered ones still ready are re-listed at the tail, edge-triggered ones leave until the next edge, and a no-longer-ready (spurious) edge is dropped. A fired EPOLLONESHOT drops its watched-node listener until EPOLL_CTL_MOD re-arms it, so a dead edge carries no traffic. The ready list is an intrusive doubly-linked list, so draining is O(ready) rather than O(registered), and the remainder past maxevents is rotated to the front for round-robin fairness.

emscripten_epoll_set_callback registers a persistent consumer on that same ready list: the runtime delivers the ready set to the callback on each progress, with no blocking and no ASYNCIFY/JSPI. It is armed once (not per spin), re-fires on the next tick while the set stays ready (so level and overflow drain as a blocking epoll_wait loop would), and there is at most one callback per epoll (a second call replaces it; a NULL callback unregisters). Per-fd EPOLLET/EPOLLONESHOT apply unchanged, so a single callback can mix level/edge/oneshot fds.

A blocking epoll_wait (under PROXY_TO_PTHREAD, ASYNCIFY, or JSPI) consumes the same ready list, so a wait and a callback on one epoll take disjoint slices rather than each seeing a private copy. The callback is delivered on the main thread's event loop (under PROXY_TO_PTHREAD use a blocking epoll_wait instead), and keeps the runtime alive only while the set can still fire: once every watched fd is closed the set is terminal and the keepalive is dropped, so no explicit disposal is required (closing the epoll or passing a NULL callback also dispose).

Registrations key on the open file description (the dup-shared stream state), matching Linux: closing a watched fd and reusing its number for a different open does not resurrect the registration onto the new fd. A close (socket, pipe, or a nested epoll) notifies its node, so the watching epoll promptly re-derives and drops the registration - the analog of Linux's eventpoll_release_file walking the watched file's epitem list.

Only sockets and pipes derive real readiness; every other stream type (regular files across MEMFS/NODEFS/NODERAWFS, devices, ttys) has no poll handler and is treated as always readable+writable, so epoll_ctl rejects it with EPERM. This also fixes poll() crashing on a NODERAWFS regular file, whose stream carries no stream_ops at all.

EPOLLEXCLUSIVE distributes its single wakeup across multiple epolls watching one fd (round-robin), which suppresses the thundering herd for that case; suppressing it across multiple waiters on a single epoll is out of scope (one instance, and they already share the ready list).

Known limitations: WASMFS epoll is out of scope (link error); ttys are not pollable (no poll handler), unlike Linux; and eviction of a closed watched fd is keyed on the fd number, so (unlike Linux) a dup that keeps the underlying description alive does not preserve the registration.
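The Linux behaviour this commit message is modelled on can be observed with the standard epoll API. The following is a minimal sketch for native Linux, not code from this PR: `count_reports` is a hypothetical helper that makes a pipe readable before registering it, then counts how many zero-timeout waits report it. It illustrates two points made above: an fd that is already ready when added is reported without a further event, and level-triggered registrations are re-reported while edge-triggered and one-shot ones are not.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register the read end of a pipe that already
 * holds one unread byte, then count how many of `rounds` consecutive
 * non-blocking epoll_wait calls report it. Returns -1 on setup failure. */
int count_reports(unsigned int events, int rounds) {
  assert(rounds >= 0);
  int fds[2];
  if (pipe(fds) != 0) return -1;
  int ep = epoll_create1(0);
  int hits = -1;
  /* Make the read end readable BEFORE it is registered. */
  if (ep >= 0 && write(fds[1], "x", 1) == 1) {
    struct epoll_event ev = {0};
    ev.events = events;
    ev.data.fd = fds[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0) {
      hits = 0;
      for (int i = 0; i < rounds; i++) {
        struct epoll_event out;
        /* Timeout 0: report what is ready now, never block.
         * The byte is never read, so the level stays "readable". */
        int n = epoll_wait(ep, &out, 1, 0);
        if (n == 1 && out.data.fd == fds[0] && (out.events & EPOLLIN)) hits++;
      }
    }
  }
  if (ep >= 0) close(ep);
  close(fds[0]);
  close(fds[1]);
  return hits;
}
```

On Linux, `count_reports(EPOLLIN, 3)` returns 3 (level-triggered, re-listed each time), while `count_reports(EPOLLIN | EPOLLET, 3)` and `count_reports(EPOLLIN | EPOLLONESHOT, 3)` each return 1 (reported once from the level sampled at ADD, then silent until a new edge or a re-arm).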
Under -sNODERAWSOCKETS, getaddrinfo() previously fabricated fake addresses via DNS.lookup_name. This adds real resolution backed by node:dns, plus an async getaddrinfo so names can be resolved without blocking.

getaddrinfo() now resolves numeric addresses and /etc/hosts entries (read fresh through emscripten's FS) synchronously, and returns a full addrinfo list (one node per address). For a real hostname:
- without JSPI it returns EAI_AGAIN (no synchronous DNS); resolve it via the async API below.
- under JSPI it suspends the wasm stack on the node:dns lookup and returns the addresses directly (gated on ASYNCIFY == 2).

The async API (available in all builds):
- emscripten_dns_lookup_async(node, service, hint) takes the same inputs as getaddrinfo() and returns a pollable fd that becomes readable when resolution completes.
- emscripten_dns_lookup_result(fd, struct addrinfo **res) reads the outcome: 0 on success (writing the addrinfo list to *res, freed with freeaddrinfo), or an EAI_* code on failure.

The completion fd is a plain pollable descriptor: wait on it with epoll/poll/select.

freeaddrinfo now frees the whole ai_next chain; EAI_AGAIN is added to the generated struct info.

Tested with test_dns_async, test_dns_callback (completion via emscripten_poll_with_callback), test_dns_async_net, test_dns_async_default, and test_dns_jspi, including PROXY_TO_PTHREAD variants.
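The synchronous path in this commit message (a numeric address resolved immediately, a list with one node per address, the whole chain released by a single freeaddrinfo) uses only standard POSIX calls. A minimal sketch follows, assuming a plain POSIX environment rather than an Emscripten build; `count_numeric_addrs` is a hypothetical helper, and AI_NUMERICHOST is set here only to keep the example off the network.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: resolve `host` strictly as a numeric address and
 * return the number of nodes in the resulting addrinfo list.
 * On failure returns -1 and stores the EAI_* code in *err. */
int count_numeric_addrs(const char *host, int *err) {
  assert(host != NULL && err != NULL);
  struct addrinfo hints;
  memset(&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  /* Numeric-only: never triggers a real DNS query. */
  hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;

  struct addrinfo *res = NULL;
  *err = getaddrinfo(host, "80", &hints, &res);
  if (*err != 0) return -1; /* e.g. EAI_NONAME, or EAI_AGAIN as described above */

  int n = 0;
  /* Walk the ai_next chain: one node per resolved address. */
  for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next) n++;

  /* A single call releases every node in the chain. */
  freeaddrinfo(res);
  return n;
}
```

A caller checks the returned count and, on -1, inspects `*err`. Under the behaviour described above, a non-JSPI build would see EAI_AGAIN for a real hostname looked up without the numeric-only flag and would switch to the async API.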
Async DNS, built on top of #27207 for async readiness. Implementation diff is the last commit - b5af97b.
This adds real DNS resolution under `-sNODERAWSOCKETS` via `node:dns`, along with an async syscall variant of `getaddrinfo` for non-JSPI environments.

`getaddrinfo()` resolves numeric addresses and `/etc/hosts` entries (read fresh through emscripten's FS) synchronously, and returns a full `addrinfo` list. For a real hostname: without JSPI it returns `EAI_AGAIN` (resolve via the async API); under JSPI it suspends the wasm stack on the `node:dns` lookup and returns the addresses directly.

Async API (available in all builds):

- `emscripten_dns_lookup_async(node, service, hint)`: same inputs as `getaddrinfo()`, returns a pollable fd that becomes readable when resolution completes.
- `emscripten_dns_lookup_result(fd, struct addrinfo **res)`: reads the outcome: `0` on success (writing the `addrinfo` list to `*res`, freed with `freeaddrinfo`), or an `EAI_*` code.

The original PR in #27162 allowed the async callback for DNS lookup to be registered via `emscripten_set_socket_message_callback`. Instead of "hijacking" that mechanism, we can now directly use an epoll'able file descriptor to represent the dns lookup operation. On readiness, the actual `emscripten_dns_lookup_result()` just gets called again.

`freeaddrinfo` now frees the whole `ai_next` chain; `EAI_AGAIN` is added to the generated struct info.

Tested with `test_dns_async`, `test_dns_callback` (completion via `emscripten_poll_with_callback`), `test_dns_async_net`, `test_dns_async_default`, and `test_dns_jspi`, including `PROXY_TO_PTHREAD` variants.
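The caller-side shape of the completion-fd pattern (start a lookup, wait for the fd to become readable, then collect the result) can be sketched natively. This sketch does not call the new Emscripten functions, whose exact prototypes are not shown here. `mock_lookup_async` and `mock_lookup_result` are hypothetical stand-ins built on a pipe and a numeric-only `getaddrinfo()`, and the choice to release the fd inside the result call is an assumption of this mock, not a statement about the real API.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <netdb.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Mock state for a single in-flight lookup (hypothetical, not from the PR). */
static struct { int rfd, wfd; const char *node, *service; } pending = { -1, -1, NULL, NULL };

/* Stand-in for the async start call: returns an fd that becomes readable
 * on completion. The mock "completes" at once by writing one byte. */
int mock_lookup_async(const char *node, const char *service) {
  int fds[2];
  if (pipe(fds) != 0) return -1;
  if (write(fds[1], "d", 1) != 1) { close(fds[0]); close(fds[1]); return -1; }
  pending.rfd = fds[0]; pending.wfd = fds[1];
  pending.node = node; pending.service = service;
  return fds[0];
}

/* Stand-in for the result call: 0 and a list in *res on success, else an
 * EAI_* code. Assumption of this mock only: it also releases the fd. */
int mock_lookup_result(int fd, struct addrinfo **res) {
  if (fd < 0 || fd != pending.rfd) return EAI_FAIL;
  struct addrinfo hints;
  memset(&hints, 0, sizeof hints);
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV; /* stay off the network */
  int rc = getaddrinfo(pending.node, pending.service, &hints, res);
  close(pending.rfd); close(pending.wfd);
  pending.rfd = pending.wfd = -1;
  return rc;
}

/* Caller-side pattern: start, wait for readability, collect, free.
 * Returns the number of addresses, or -1 on timeout or failure. */
int resolve_via_completion_fd(const char *node, const char *service, int timeout_ms) {
  assert(node != NULL && service != NULL);
  int fd = mock_lookup_async(node, service);
  if (fd < 0) return -1;
  struct pollfd p = { .fd = fd, .events = POLLIN };
  int ready = poll(&p, 1, timeout_ms) == 1 && (p.revents & POLLIN);
  struct addrinfo *res = NULL;
  /* The mock result call is made either way so the pipe is always released. */
  int rc = mock_lookup_result(fd, &res);
  if (rc != 0) return -1;
  int n = 0;
  if (ready)
    for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next) n++;
  freeaddrinfo(res);
  return ready ? n : -1;
}
```

With the mock, `resolve_via_completion_fd("127.0.0.1", "80", 1000)` yields one address. In a real build the same loop could wait with epoll or select instead of poll, since the description states the completion fd is an ordinary pollable descriptor.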