Skip to content

fix(vector): reject non-positive similar_to neighbor count#9767

Open
eharris128 wants to merge 1 commit into
dgraph-io:mainfrom
eharris128:fix-similar-to-nonpositive-k
Open

fix(vector): reject non-positive similar_to neighbor count#9767
eharris128 wants to merge 1 commit into
dgraph-io:mainfrom
eharris128:fix-similar-to-nonpositive-k

Conversation

@eharris128

Copy link
Copy Markdown

Description

A similar_to() vector query takes a client-supplied number of neighbors. That argument is parsed in worker/task.go without a lower-bound check, so a zero or negative value flows into the HNSW search. On the default search path (no ef / distance_threshold option), SearchWithPath passes it through unclamped, and addPathNode (tok/hnsw/search_layer.go) computes effectiveMaxLen = maxResults + filtered and then evaluates slr.neighbors[:effectiveMaxLen]. A negative count panics with a negative slice bound ([:-1]); a count of zero empties the slice and then dereferences slr.neighbors[0]. Either way a single query with a neighbor count <= 0 panics the query goroutine and crashes the Alpha. The SearchWithOptions path already guards against this with an if maxResults < 0 { maxResults = 0 } clamp; only the default path was left unprotected.

This change:

  • Rejects a non-positive neighbor count at the request boundary (worker/task.go) with a clear error, so an invalid similar_to count returns a normal query error instead of crashing the server.
  • Mirrors the existing SearchWithOptions clamp into SearchWithPath as defense in depth.
  • Clamps effectiveMaxLen and guards the slr.neighbors[0] dereference in addPathNode, so the bottom-layer search cannot panic on a non-positive limit regardless of caller.
  • Adds a regression test (TestAddPathNodeNonPositiveMaxResults) that reproduces the panic on the old code and passes with the fix.

Valid queries are unaffected: a normal positive neighbor count truncates exactly as before.

Checklist

  • The PR title follows the Conventional Commits syntax, leading with fix:, feat:, chore:, ci:, etc.
  • Code compiles correctly and linting (via trunk) passes locally
  • Tests added for new functionality, or regression tests for bug fixes added as applicable

@eharris128 eharris128 requested a review from a team as a code owner July 2, 2026 15:18
The number-of-neighbors argument of a similar_to() vector query is parsed from
the request with no lower-bound check (worker/task.go). A zero or negative value
reaches the HNSW bottom-layer search on the default (no ef / distance_threshold)
path, where addPathNode computes effectiveMaxLen = maxResults + filtered and then
evaluates slr.neighbors[:effectiveMaxLen]. A negative value panics with a negative
slice bound, and zero empties the slice and then dereferences slr.neighbors[0], so
a single query with a neighbor count <= 0 panics the query goroutine and takes down
the Alpha.

Reject a non-positive neighbor count at the request boundary with a clear error,
and harden the HNSW library as defense in depth: mirror the maxResults < 0 clamp
that SearchWithOptions already applies into SearchWithPath, and clamp
effectiveMaxLen plus guard the neighbors[0] dereference in addPathNode. Add a
regression test for addPathNode with non-positive maxResults.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant