fix(kernel): server-sourced async state + cursor-state/metadata parity for use_kernel#838
Open
vikrantpuppala wants to merge 1 commit into
Open
fix(kernel): server-sourced async state + cursor-state/metadata parity for use_kernel#838vikrantpuppala wants to merge 1 commit into
vikrantpuppala wants to merge 1 commit into
Conversation
Brings the use_kernel=True path closer to Thrift parity on the
cursor-state-tracking and metadata clusters from the API+CUJ gap
audit, surfaces DML rowcount, and makes async command state
server-sourced.
Async command state — server is the source of truth (no local state):
- get_query_state / get_execution_result now re-attach to the
statement by id (Session.attach_async_statement, kernel side) and
read state / await results from the server, instead of consulting a
connector-side handle map. SEA keys GetStatementStatus/Result purely
on the id, so this works whether or not the connector still holds the
submitting handle — and get_execution_result is inherently re-callable
(each call re-attaches), matching the Thrift backend's re-fetchable
operation handle.
- Deleted _closed_commands (and its bounded-FIFO machinery
_record_closed / _CLOSED_COMMANDS_MAX): the connector no longer
invents a CLOSED-vs-SUCCEEDED answer. The server reports CLOSED
natively (verified: GET after close returns 200 state=CLOSED until
TTL, then 404 NOT_FOUND, which maps to a terminal state). This is the
local-state inconsistency removed.
- _async_handles / _async_statements are retained but DEMOTED to a
keep-alive-only registry: the submitting ExecutedAsyncStatement's Drop
fires close_statement, which would kill the live async query, so we
hold the handle until close_command / close_session. They are no
longer consulted for state or results.
Cursor-state tracking (T7):
- Set cursor.active_command_id in the _make_result_set chokepoint so
sync execute, async fetch, AND metadata all leave the cursor pointing
at the command that produced the current result set (matches Thrift's
unconditional set in _handle_execute_response).
- On a failed sync execute, publish the server-issued statement id
(read from the canceller's inflight slot) onto active_command_id
before re-raising. Best-effort; never masks the original failure.
Metadata:
- Normalize wildcard/blank catalog ('%' / '*' / '' / None) to None
(all-catalogs) across get_schemas/get_tables/get_columns
(_catalog_or_none), making the three symmetric (fixes columns).
- Normalize empty/whitespace-only pattern args to None (match-all) via
_none_if_blank. % / * stay as real LIKE wildcards on patterns.
- Drop the connector-side table_types drain + client-side refilter in
get_tables; the kernel filters table_types itself (case-insensitively
as of the batch-3 kernel change). Removed the dead
_drain_kernel_handle / _StaticArrowHandle helpers + unused imports.
- Fix the Cursor.columns() docstring: catalog_name=None is accepted on
all backends.
DML rowcount:
- Surface the kernel's num_modified_rows as cursor.rowcount for DML
instead of the hardcoded -1. None (SELECT / warehouses that don't
report it) leaves rowcount at -1, matching Thrift; getattr-guarded.
Bumps KERNEL_REV to the batch-3 kernel commit (case-insensitive
table_types, HTTP error-envelope parsing, num_modified_rows getter,
Session::attach_async_statement).
Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
6a21a91 to
78748fa
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Brings the
use_kernel=Truepath closer to Thrift parity on the cursor-state-tracking (T7) and metadata clusters from the API+CUJ gap audit, surfaces DML rowcount, and makes async command state server-sourced (no connector-side state to drift). Batch 3 of the use_kernel parity work (follows #824 / #830).Async command state — server is the source of truth
This addresses the core concern that connector-side async bookkeeping drifts from the server.
get_query_state/get_execution_resultre-attach by id. They callSession.attach_async_statement(command_id)and read state / await results from the server, instead of consulting a connector-side handle map. SEA keysGetStatementStatus/Resultpurely on the id, so this works whether or not the connector still holds the submitting handle — andget_execution_resultis inherently re-callable (each call re-attaches), matching Thrift's re-fetchable operation handle. No moreunknown command_idon a second fetch._closed_commands(and its bounded-FIFO machinery_record_closed/_CLOSED_COMMANDS_MAX). This was the invented state — the connector guessing CLOSED-vs-SUCCEEDED locally, with the bounded FIFO making the answer depend on how many other commands had closed since. The server reportsCLOSEDnatively (verified live:GETafter close →200 state=CLOSEDuntil the result TTL, then404 NOT_FOUND→ mapped to a terminal state via_is_not_found)._async_handles/_async_statementsdemoted to keep-alive-only. They're retained for one reason: the submittingExecutedAsyncStatement'sDropfiresclose_statement, which would kill the live async query — so the connector holds the handle untilclose_command/close_session. They are no longer consulted for state or results.Cursor-state tracking (T7)
active_command_idstale. Set in the_make_result_setchokepoint so sync execute, async fetch, AND metadata all leave the cursor pointing at the producing command (matches Thrift's_handle_execute_response).Metadata
'%'/'*'/''/None→None) across schemas/tables/columns (_catalog_or_none), making them symmetric (fixescolumns)._none_if_blank);%/*stay as LIKE wildcards.table_typesconnector drain — the kernel filters it (case-insensitively, Bump version to 2.6.0 #139). Removed dead_drain_kernel_handle/_StaticArrowHandle+ unused imports.Cursor.columns()docstring —catalog_name=Noneis accepted on all backends.DML rowcount (enhancement)
num_modified_rowsascursor.rowcount;Noneleaves-1(matches Thrift).getattr-guarded.Testing
tests/unit/test_kernel_client.py, 116 pass):get_query_state/get_execution_resultrewritten to mockattach_async_statement(server-state, NotFound→terminal, re-callable); close paths assert close+drop (no_closed_commands); metadata normalization;num_modified_rows→rowcount.test_client.py128 pass.execute_async→poll→fetch,get_async_execution_resultre-callable, result resumable from a fresh cursor (proves state comes from the server, not per-cursor tracking); empty-string filter; case-insensitivetable_types(self-contained table+view);active_command_idafter metadata; DML rowcount. All green.black(pinned 22.3.0) clean.