Updating Fern and fixing a couple small links #2210

Merged
rapids-bot[bot] merged 5 commits into
NVIDIA:main from
cjnolet:codex/fresh-rapids-main
Jun 4, 2026

Conversation

@cjnolet cjnolet commented Jun 2, 2026

Contributor

No description provided.

@cjnolet cjnolet self-assigned this Jun 2, 2026
@cjnolet cjnolet requested a review from a team as a code owner June 2, 2026 19:59
@cjnolet cjnolet added the doc (Improvements or additions to documentation) and non-breaking (Introduces a non-breaking change) labels Jun 2, 2026

copy-pr-bot Bot commented Jun 2, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cjnolet cjnolet moved this to In Progress in Unstructured Data Processing Jun 2, 2026

coderabbitai Bot commented Jun 2, 2026


Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Summary by CodeRabbit

  • Documentation
    • Added a Field Guide (with Compatibility, Integration Patterns, JIT Compilation, UDF Usage) and new Advanced Topics grouping; updated navigation, redirects, and many guide pages.
    • Improved API/reference docs formatting and consistency (cleaner parameter tables, normalized C/C++ API signature display, updated examples).
  • Chores
    • Documentation build now uses the configured Fern version (pinned to 5.44.3) and adds validation to ensure generated API markdown quality.

Walkthrough

This PR pins the Fern CLI version for docs builds and reorganizes documentation: "Advanced Topics" is repurposed as a "Field Guide", page slugs and navigation are updated, redirects are added, many internal links are migrated, multiple API doc signatures were reformatted (removing CUVS_EXPORT), and the API generator was refactored with new validation.
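The version pinning summarized here can be sketched roughly as follows. This is an illustrative Python sketch, not the PR's actual shell code in `fern/build_docs.sh`: the helper names `read_fern_version` and `fern_command` are hypothetical, and the fallback to an unpinned `fern-api` when the version is `"*"` is an assumption, since the summary only describes the pinned case.

```python
import json
from pathlib import Path


def read_fern_version(config_path):
    """Return the `version` string from a Fern config file ("*" if absent)."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return str(config.get("version", "*"))


def fern_command(version):
    """Build the npx invocation prefix for a given version string."""
    if version == "*":
        # Assumption: an unpinned config lets npx resolve the package itself.
        return ["npx", "--yes", "fern-api"]
    # Pinned case, mirroring `npx --yes fern-api@${FERN_VERSION}`.
    return ["npx", "--yes", f"fern-api@{version}"]
```

Keeping the version lookup separate from the command construction makes the pinned and unpinned branches easy to check without invoking `npx`.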

Changes

Documentation Structure Reorganization and Version Pinning

| Layer | File(s) | Summary |
| --- | --- | --- |
| Build infrastructure version pinning | `fern/build_docs.sh`, `fern/fern.config.json` | `fern_config_version()` reads version from `fern/fern.config.json`; when not `"*"`, the build uses `npx --yes fern-api@${FERN_VERSION}`. `fern.config.json` version changed from `"*"` to `"5.44.3"`. |
| Site navigation and redirect configuration | `fern/docs.yml` | Added redirects for legacy advanced-topics and related URLs and reorganized navigation to introduce a User Guide "Field Guide" and Developer Guide "Advanced Topics" sections. |
| Page slug and landing content changes | `fern/pages/advanced_topics.md`, `fern/pages/field_guide.md`, `fern/pages/jit_compilation.md`, `fern/pages/user_guide.md`, `fern/pages/user_guide/integration_patterns.md` | `advanced_topics.md` repurposed as Field Guide landing (`user-guide/field-guide`); new `field_guide.md` added; JIT compilation and integration patterns slugs moved under Field Guide; user guide navigation simplified to include Field Guide links. |
| Cross-document link and minor content edits | `fern/pages/api_guide.md`, `fern/pages/c_guidelines.md`, `fern/pages/cpp_guidelines.md`, `fern/pages/developer_guide.md`, `fern/pages/java_guidelines.md`, `fern/pages/python_guidelines.md`, `fern/pages/udf_usage.md`, `fern/pages/what_is_vector_search.md` | Updated internal links to new routes (e.g., Link-time Optimization → `/developer-guide/advanced-topics/link-time-optimization`, Compatibility → `/user-guide/field-guide/compatibility`); API Guide adds C++→C ABI sentence. |
| Documentation formatting and signature normalization (C API pages) | `fern/pages/c_api/*` | Removed `CUVS_EXPORT` from displayed C API prototypes across many C API reference pages and normalized “Returns” sections to link `cuvsError_t`; adjusted several field descriptions/line-wrapping across neighbor/index/preprocessing pages. |
| Documentation formatting (C++/Python API pages) | `fern/pages/cpp_api/*`, `fern/pages/python_api/*` | Multiple parameter table reflows and anchor/link fixes, IVF-SQ Python save signature updated (removed `include_dataset`), and other presentation-only edits across C++/Python API docs. |
| API generator refactor and validation | `fern/scripts/generate_api_reference.py` | Major refactor to preserve Doxygen/Javadoc line semantics, ignore `CUVS_EXPORT` when parsing, add renderers (`render_doxygen_summary`, description-break logic), change how defaults are rendered, and add `validate_generated_api_markdown()` to detect squashed lists and leaked `CUVS_EXPORT` decorators. New helper functions added. |
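
The validation step described in the last row could look something like the sketch below. It is a guess at the idea, not the code in `fern/scripts/generate_api_reference.py`: the function name `find_api_markdown_issues` is hypothetical, and the definition of a "squashed list" (two or more `- item` markers collapsed onto one prose line) is an assumption made for illustration.

```python
import re

# Assumption: a "squashed list" is prose where two or more "- item" markers
# ended up on a single line instead of on separate lines.
SQUASHED_LIST = re.compile(r"\S - \S.* - \S")


def find_api_markdown_issues(markdown):
    """Return (line_number, message) pairs for suspicious generated lines."""
    issues = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        # The export decorator should never survive into rendered signatures.
        if "CUVS_EXPORT" in line:
            issues.append((lineno, "leaked CUVS_EXPORT"))
        # Skip real list items and table rows; flag list-like prose only.
        is_list_or_table = line.lstrip().startswith(("-", "|"))
        if not is_list_or_table and SQUASHED_LIST.search(line):
            issues.append((lineno, "possible squashed list"))
    return issues
```

The list check is a heuristic and can misfire on prose that uses spaced hyphens for other reasons, so a real validator would likely need a narrower rule.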

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • rapidsai/cuvs#2030: Related documentation routing changes for advanced topics and UDF usage.
  • rapidsai/cuvs#1910: Overlaps IVF-SQ API/docs work referenced in this PR.
  • rapidsai/cuvs#2140: Introduced fern/build_docs.sh; this PR modifies its Fern invocation/version selection.

Suggested reviewers

  • divyegala

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
fern/build_docs.sh (1)

65-65: ⚡ Quick win

Consider validating Python3 availability.

The script now depends on Python3 to read the Fern version from config, but unlike Node.js (which has require_node_22), there's no explicit check for Python3 availability. While set -euo pipefail will cause the script to fail if Python3 is missing, the error message may not clearly indicate the requirement.

🐍 Suggested improvement to add Python3 validation

Add a validation function before line 55:

+require_python3() {
+  if ! command -v python3 >/dev/null 2>&1; then
+    echo "Fern docs build requires Python 3, but python3 was not found on PATH." >&2
+    echo "Install Python 3 before running fern/build_docs.sh." >&2
+    exit 1
+  fi
+}
+
+require_node_22
+require_python3
+
-require_node_22
-
 fern_config_version() {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/build_docs.sh` at line 65, The script now calls fern_config_version
(used where FERN_VERSION="$(fern_config_version)") which relies on Python3 but
there's no explicit check; add a validation helper (similar to require_node_22)
that checks for a working python3 (or python3 -V) and prints a clear error and
exits if not found, invoke that helper before calling fern_config_version
(before the FERN_VERSION assignment) so the script fails with a clear message
when Python3 is missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@fern/docs.yml`:
- Around line 252-254: The "ABI Stability" navigation entry uses a relative path
"./developer_guide/abi_stability.md" that is inconsistent with other entries;
update the path for the page "ABI Stability" (the YAML mapping with key page:
"ABI Stability") to match the repository's nav convention (e.g.
"./pages/developer_guide/abi_stability.md") so sidebar resolution and link
checks succeed.

---

Nitpick comments:
In `@fern/build_docs.sh`:
- Line 65: The script now calls fern_config_version (used where
FERN_VERSION="$(fern_config_version)") which relies on Python3 but there's no
explicit check; add a validation helper (similar to require_node_22) that checks
for a working python3 (or python3 -V) and prints a clear error and exits if not
found, invoke that helper before calling fern_config_version (before the
FERN_VERSION assignment) so the script fails with a clear message when Python3
is missing.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: df543cce-e9d5-411f-9666-bb48832e4d92

📥 Commits

Reviewing files that changed from the base of the PR and between 0c3d007 and 14a4b27.

📒 Files selected for processing (14)
  • fern/build_docs.sh
  • fern/docs.yml
  • fern/fern.config.json
  • fern/pages/advanced_topics.md
  • fern/pages/api_guide.md
  • fern/pages/c_guidelines.md
  • fern/pages/cpp_guidelines.md
  • fern/pages/developer_guide.md
  • fern/pages/java_guidelines.md
  • fern/pages/jit_compilation.md
  • fern/pages/python_guidelines.md
  • fern/pages/udf_usage.md
  • fern/pages/user_guide.md
  • fern/pages/user_guide/integration_patterns.md

Comment thread fern/docs.yml
Comment on lines +252 to +254
- page: "ABI Stability"
  path: "./developer_guide/abi_stability.md"
- page: "Link-time Optimization"

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix broken navigation path for ABI Stability.

Line 253 uses ./developer_guide/abi_stability.md, but navigation paths in this file consistently resolve from ./pages/.... This likely breaks the Developer Guide sidebar entry and link checks.

Suggested fix
       - section: "Advanced Topics"
         contents:
           - page: "ABI Stability"
-            path: "./developer_guide/abi_stability.md"
+            path: "./pages/developer_guide/abi_stability.md"
           - page: "Link-time Optimization"
             path: "./pages/jit_lto_guide.md"
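
A quick way to catch this class of problem is to scan `docs.yml` for `path:` entries that fall outside `./pages/`. The sketch below is an assumption-laden illustration: `paths_outside_pages` is a hypothetical helper, it uses a regular expression rather than a YAML parser, and it takes the review comment's claim that navigation paths start with `./pages/` at face value, which may not hold for every entry.

```python
import re

# Matches lines such as:   path: "./pages/jit_lto_guide.md"
PATH_LINE = re.compile(r'^\s*path:\s*["\']?([^"\'\s]+)["\']?\s*$')


def paths_outside_pages(docs_yml_text, prefix="./pages/"):
    """Return (line_number, path) pairs whose path does not start with prefix."""
    offenders = []
    for lineno, line in enumerate(docs_yml_text.splitlines(), start=1):
        match = PATH_LINE.match(line)
        if match and not match.group(1).startswith(prefix):
            offenders.append((lineno, match.group(1)))
    return offenders
```

Run against the snippet in the suggested fix, this would report only the `./developer_guide/abi_stability.md` entry.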
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/docs.yml` around lines 252 - 254, The "ABI Stability" navigation entry
uses a relative path "./developer_guide/abi_stability.md" that is inconsistent
with other entries; update the path for the page "ABI Stability" (the YAML
mapping with key page: "ABI Stability") to match the repository's nav convention
(e.g. "./pages/developer_guide/abi_stability.md") so sidebar resolution and link
checks succeed.

@divyegala divyegala left a comment

Contributor

Few small comments, pre-approving

Comment thread fern/pages/advanced_topics.md Outdated
Comment thread fern/pages/jit_compilation.md Outdated

For implementation details on building JIT LTO kernel fragments and linking them at runtime, see [Link-time Optimization](jit_lto_guide.md).
- [cuvs::neighbors::cagra::build()](/api-reference/cpp-api-neighbors-cagra) when graph construction uses `graph_build_params::ivf_pq_params` or `graph_build_params::iterative_search_params`
- [cuvs::neighbors::cagra::extend()](/api-reference/cpp-api-neighbors-cagra) when adding nodes, because the extension path searches the existing CAGRA graph

Contributor

Is the "when" necessary here? We should only add it if needed.

Comment thread fern/pages/field_guide.md

Contributor

We should rename this file to field_guide.md and update all references

cjnolet and others added 2 commits June 2, 2026 18:11

cjnolet commented Jun 2, 2026

Contributor Author

/ok to test 729e9d2

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@fern/pages/c_api/c-api-neighbors-cagra.md`:
- Line 455: The note for the persistent_device_usage/kDeviceUsage parameter
contains the awkward phrase "alongside with the persistent kernel"; edit the
paragraph to use "alongside the persistent kernel" (or "with the persistent
kernel") instead, updating any other occurrences of "alongside with" in that
description so the sentence reads cleanly (e.g., "running any other work on GPU
alongside the persistent kernel makes the setup fragile").

In `@fern/pages/c_api/c-api-preprocessing-quantize-scalar.md`:
- Line 145: The page uses both “quantisation” and “quantization”; choose a
single variant (use “quantization”) and replace every occurrence on this page
(including the sentence "Applies quantization transform to given dataset" and
the other occurrence around line 170) so the terminology is consistent; search
for "quantis*" and "quantiz*" within
fern/pages/c_api/c-api-preprocessing-quantize-scalar.md and update all instances
to the chosen spelling.

In `@fern/pages/cpp_api/cpp-api-neighbors-brute-force.md`:
- Line 628: The parameter description for `include_dataset` currently uses
file-specific wording ("write out the dataset to the file") but these overloads
serialize to std::ostream&, so update both occurrences (the entries around the
`include_dataset` parameter in this document) to stream-neutral wording such as
"Whether or not to include the dataset in the serialized output." Preserve the
parameter type (`bool`) and default (`true`) and keep phrasing consistent with
other overload descriptions.

In `@fern/pages/cpp_api/cpp-api-neighbors-vamana.md`:
- Around line 535-536: Update the boolean parameter descriptions to use concise
phrasing: change "whether or not to serialize the dataset" to "whether to
serialize the dataset" for the include_dataset parameter and change "whether
output file should be aligned to disk sectors of 4096 bytes" to "whether the
output file should be aligned to 4096‑byte disk sectors" for sector_aligned;
apply these edits for every occurrence of include_dataset and sector_aligned in
cpp-api-neighbors-vamana.md so the wording matches the concise style used
elsewhere.

In `@fern/pages/python_api/python-api-neighbors-cagra.md`:
- Line 268: The markdown for the `metric` parameter contains an unescaped
asterisk in the cosine distance formula which triggers MD037; update the cosine
formula in the `metric` description (the line referencing "cosine distance is
defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 *
\|\|b\|\|_2)") so the multiplication asterisk is escaped or wrap the entire
formula in inline code/backticks; ensure the description text for `metric`
(mentioning "sqeuclidean", "inner_product", "cosine") uses the escaped asterisk
or code formatting to prevent markdownlint failures.

In `@fern/pages/python_api/python-api-neighbors-hnsw.md`:
- Line 92: The table row for parameter `M` contains unescaped arithmetic
expressions "m * 2" and "m * 3" which can trip markdownlint MD037; update the
text for `M` (the HNSW parameter) to wrap those expressions in code formatting
(e.g., `` `m * 2` `` and `` `m * 3` ``) or escape the asterisks so `graph_degree
= m * 2` and `intermediate_graph_degree = m * 3` are rendered as code; ensure
you update the same cell that references `graph_degree` and
`intermediate_graph_degree`.

In `@fern/pages/python_api/python-api-neighbors-ivf-flat.md`:
- Line 71: The table cell for the `metric` parameter contains unescaped
asterisks in the math phrases which triggers markdownlint MD037; update the
prose for the `metric` row (the `metric` parameter description) to either escape
each asterisk (e.g. \*) or wrap the mathematical expressions in inline
code/backticks so the multiplications and norm symbols are not parsed as
emphasis—ensure occurrences like a_i * b_i and ||a||_2 are escaped or
code-formatted consistently for `sqeuclidean`, `inner_product`, and `cosine`
formula descriptions.

In `@fern/pages/python_api/python-api-neighbors-ivf-pq.md`:
- Line 201: The markdown table cell for the `metric` parameter contains
unescaped asterisks in the cosine distance formula which breaks MD parsing;
update the cosine formula in the `metric` description (the line mentioning
"cosine distance is defined as...") to either escape the multiplication
asterisks or wrap the entire formula in inline code/backticks so the "*"
characters are not treated as emphasis (keep the rest of the text and valid
metric list unchanged).
- Line 349: Fix the typo in the `lut_dtype` parameter description: change
"dimansionality" to "dimensionality" in the sentence that reads "so fast shared
memory kernels can be used even for datasets with large dimansionality." Update
the documentation string for `lut_dtype` accordingly so it reads
"dimensionality" and keep the rest of the phrasing unchanged.

In `@fern/pages/python_api/python-api-neighbors-ivf-sq.md`:
- Line 74: The table cell describing `metric` contains raw asterisks in the
inline formulas (e.g., "a_i * b_i" and "\|\|a\|\|_2 * \|\|b\|\|_2") which
triggers MD037; update the markdown in the `metric` description to escape
multiplication operators by replacing * with \* in those expressions (for
example "a_i \* b_i" and "\|\|a\|\|_2 \* \|\|b\|\|_2") so the table renders
correctly and avoids markdown lint errors.

In `@fern/pages/python_api/python-api-neighbors-tiered-index.md`:
- Line 41: The metric description contains unescaped asterisks in the formulas
(e.g., in "inner product" and "cosine" definitions) which triggers markdown lint
MD037; update the text for the `metric` parameter to either wrap mathematical
operators/expressions like a_i * b_i and norms (||a||_2) in inline code spans or
escape the `*` operators so they are not treated as Markdown emphasis —
specifically edit the `metric` description line that mentions "inner product
distance" and "cosine distance" (and the sqeuclidean formula if needed) to use
inline code for expressions such as a_i * b_i and `||a||_2` or escape the `*`
characters.

In `@fern/pages/user_guide.md`:
- Line 11: Update the broken markdown link "[API Guide](/user-guide/api-guides)"
in fern/pages/user_guide.md to point to the correct configured page name
(singular) — replace the URL segment "/user-guide/api-guides" with
"/user-guide/api-guide" (matching the configured file name api_guide.md) so the
link resolves correctly.
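
Several of the comments above ask for the same mechanical fix: escaping spaced multiplication asterisks so markdownlint MD037 does not read them as emphasis markers. A minimal sketch of that transformation follows. The helper name `escape_multiplication` is hypothetical, and the rule it applies (escape only a `*` that has whitespace on both sides and sits outside an inline code span) is an assumption chosen to leave `**bold**` and already-escaped text untouched.

```python
import re

CODE_SPAN = re.compile(r"(`[^`]*`)")       # inline code spans, kept verbatim
SPACED_STAR = re.compile(r"(?<=\s)\*(?=\s)")  # "*" with whitespace on both sides


def escape_multiplication(text):
    """Escape spaced '*' operators outside inline code spans."""
    parts = CODE_SPAN.split(text)
    for i, part in enumerate(parts):
        # With one capturing group, odd indices hold the code spans.
        if i % 2 == 0:
            parts[i] = SPACED_STAR.sub(r"\\*", part)
    return "".join(parts)
```

Because an escaped `\*` is preceded by a backslash rather than whitespace, running the function twice yields the same result as running it once.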
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e5d5f3cf-5953-434e-8e92-3e28af7f73f0

📥 Commits

Reviewing files that changed from the base of the PR and between 6ff921e and 729e9d2.

📒 Files selected for processing (58)
  • fern/docs.yml
  • fern/pages/c_api/c-api-cluster-kmeans.md
  • fern/pages/c_api/c-api-core-c-api.md
  • fern/pages/c_api/c-api-distance-pairwise-distance.md
  • fern/pages/c_api/c-api-neighbors-all-neighbors.md
  • fern/pages/c_api/c-api-neighbors-brute-force.md
  • fern/pages/c_api/c-api-neighbors-cagra.md
  • fern/pages/c_api/c-api-neighbors-hnsw.md
  • fern/pages/c_api/c-api-neighbors-ivf-flat.md
  • fern/pages/c_api/c-api-neighbors-ivf-pq.md
  • fern/pages/c_api/c-api-neighbors-ivf-sq.md
  • fern/pages/c_api/c-api-neighbors-mg-cagra.md
  • fern/pages/c_api/c-api-neighbors-mg-ivf-flat.md
  • fern/pages/c_api/c-api-neighbors-mg-ivf-pq.md
  • fern/pages/c_api/c-api-neighbors-nn-descent.md
  • fern/pages/c_api/c-api-neighbors-refine.md
  • fern/pages/c_api/c-api-neighbors-tiered-index.md
  • fern/pages/c_api/c-api-neighbors-vamana.md
  • fern/pages/c_api/c-api-preprocessing-pca.md
  • fern/pages/c_api/c-api-preprocessing-quantize-binary.md
  • fern/pages/c_api/c-api-preprocessing-quantize-pq.md
  • fern/pages/c_api/c-api-preprocessing-quantize-scalar.md
  • fern/pages/cpp_api/cpp-api-cluster-agglomerative.md
  • fern/pages/cpp_api/cpp-api-cluster-kmeans.md
  • fern/pages/cpp_api/cpp-api-distance-distance.md
  • fern/pages/cpp_api/cpp-api-neighbors-all-neighbors.md
  • fern/pages/cpp_api/cpp-api-neighbors-brute-force.md
  • fern/pages/cpp_api/cpp-api-neighbors-cagra.md
  • fern/pages/cpp_api/cpp-api-neighbors-dynamic-batching.md
  • fern/pages/cpp_api/cpp-api-neighbors-epsilon-neighborhood.md
  • fern/pages/cpp_api/cpp-api-neighbors-hnsw.md
  • fern/pages/cpp_api/cpp-api-neighbors-ivf-pq.md
  • fern/pages/cpp_api/cpp-api-neighbors-ivf-sq.md
  • fern/pages/cpp_api/cpp-api-neighbors-nn-descent.md
  • fern/pages/cpp_api/cpp-api-neighbors-refine.md
  • fern/pages/cpp_api/cpp-api-neighbors-vamana.md
  • fern/pages/cpp_api/cpp-api-preprocessing-pca.md
  • fern/pages/cpp_api/cpp-api-preprocessing-quantize-pq.md
  • fern/pages/cpp_api/cpp-api-selection-select-k.md
  • fern/pages/cpp_api/cpp-api-stats-silhouette-score.md
  • fern/pages/cpp_api/cpp-api-stats-trustworthiness-score.md
  • fern/pages/field_guide.md
  • fern/pages/jit_compilation.md
  • fern/pages/python_api/python-api-cluster-kmeans.md
  • fern/pages/python_api/python-api-common.md
  • fern/pages/python_api/python-api-neighbors-all-neighbors.md
  • fern/pages/python_api/python-api-neighbors-cagra.md
  • fern/pages/python_api/python-api-neighbors-hnsw.md
  • fern/pages/python_api/python-api-neighbors-ivf-flat.md
  • fern/pages/python_api/python-api-neighbors-ivf-pq.md
  • fern/pages/python_api/python-api-neighbors-ivf-sq.md
  • fern/pages/python_api/python-api-neighbors-mg-cagra.md
  • fern/pages/python_api/python-api-neighbors-mg-ivf-flat.md
  • fern/pages/python_api/python-api-neighbors-mg-ivf-pq.md
  • fern/pages/python_api/python-api-neighbors-tiered-index.md
  • fern/pages/python_api/python-api-preprocessing-quantize-pq.md
  • fern/pages/user_guide.md
  • fern/scripts/generate_api_reference.py
💤 Files with no reviewable changes (2)
  • fern/pages/field_guide.md
  • fern/pages/jit_compilation.md
✅ Files skipped from review due to trivial changes (24)
  • fern/pages/cpp_api/cpp-api-neighbors-epsilon-neighborhood.md
  • fern/pages/cpp_api/cpp-api-neighbors-all-neighbors.md
  • fern/pages/python_api/python-api-neighbors-mg-ivf-pq.md
  • fern/pages/python_api/python-api-cluster-kmeans.md
  • fern/pages/python_api/python-api-neighbors-all-neighbors.md
  • fern/pages/cpp_api/cpp-api-cluster-agglomerative.md
  • fern/pages/python_api/python-api-neighbors-mg-cagra.md
  • fern/pages/cpp_api/cpp-api-neighbors-ivf-pq.md
  • fern/pages/python_api/python-api-common.md
  • fern/pages/cpp_api/cpp-api-neighbors-dynamic-batching.md
  • fern/pages/cpp_api/cpp-api-preprocessing-quantize-pq.md
  • fern/pages/cpp_api/cpp-api-selection-select-k.md
  • fern/pages/cpp_api/cpp-api-stats-silhouette-score.md
  • fern/pages/python_api/python-api-preprocessing-quantize-pq.md
  • fern/pages/python_api/python-api-neighbors-mg-ivf-flat.md
  • fern/pages/cpp_api/cpp-api-neighbors-refine.md
  • fern/pages/cpp_api/cpp-api-stats-trustworthiness-score.md
  • fern/pages/c_api/c-api-neighbors-refine.md
  • fern/pages/cpp_api/cpp-api-neighbors-nn-descent.md
  • fern/pages/cpp_api/cpp-api-distance-distance.md
  • fern/pages/c_api/c-api-neighbors-vamana.md
  • fern/pages/cpp_api/cpp-api-cluster-kmeans.md
  • fern/pages/cpp_api/cpp-api-neighbors-cagra.md
  • fern/pages/c_api/c-api-neighbors-mg-cagra.md

| `persistent` | `bool` | Whether to use the persistent version of the kernel (only SINGLE_CTA is supported a.t.m.) |
| `persistent_lifetime` | `float` | Persistent kernel: time in seconds before the kernel stops if no requests received. |
| `persistent_device_usage` | `float` | Set the fraction of maximum grid size used by persistent kernel. Value 1.0 means the kernel grid size is maximum possible for the selected device. The value must be greater than 0.0 and not greater than 1.0.<br /><br />One may need to run other kernels alongside this persistent kernel. This parameter can be used to reduce the grid size of the persistent kernel to leave a few SMs idle. Note: running any other work on GPU alongside with the persistent kernel makes the setup fragile.<br />- Running another kernel in another thread usually works, but no progress guaranteed<br />- Any CUDA allocations block the context (this issue may be obscured by using pools)<br />- Memory copies to not-pinned host memory may block the context<br /><br />Even when we know there are no other kernels working at the same time, setting kDeviceUsage to 1.0 surprisingly sometimes hurts performance. Proceed with care. If you suspect this is an issue, you can reduce this number to ~0.9 without a significant impact on the throughput. |
| `persistent_device_usage` | `float` | Set the fraction of maximum grid size used by persistent kernel. Value 1.0 means the kernel grid size is maximum possible for the selected device. The value must be greater than 0.0 and not greater than 1.0.<br /><br />One may need to run other kernels alongside this persistent kernel. This parameter can be used to reduce the grid size of the persistent kernel to leave a few SMs idle.<br />Note: running any other work on GPU alongside with the persistent kernel makes the setup fragile.<br />- Running another kernel in another thread usually works, but no progress guaranteed<br />- Any CUDA allocations block the context (this issue may be obscured by using pools)<br />- Memory copies to not-pinned host memory may block the context<br /><br />Even when we know there are no other kernels working at the same time, setting kDeviceUsage to 1.0 surprisingly sometimes hurts performance. Proceed with care. If you suspect this is an issue, you can reduce this number to ~0.9 without a significant impact on the throughput. |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix awkward phrasing in the persistent-kernel note.

“alongside with” is redundant; use either “alongside” or “with” for cleaner wording.

✏️ Suggested edit
-| `persistent_device_usage` | `float` | Set the fraction of maximum grid size used by persistent kernel. Value 1.0 means the kernel grid size is maximum possible for the selected device. The value must be greater than 0.0 and not greater than 1.0.<br /><br />One may need to run other kernels alongside this persistent kernel. This parameter can be used to reduce the grid size of the persistent kernel to leave a few SMs idle.<br />Note: running any other work on GPU alongside with the persistent kernel makes the setup fragile.<br />- Running another kernel in another thread usually works, but no progress guaranteed<br />- Any CUDA allocations block the context (this issue may be obscured by using pools)<br />- Memory copies to not-pinned host memory may block the context<br /><br />Even when we know there are no other kernels working at the same time, setting kDeviceUsage to 1.0 surprisingly sometimes hurts performance. Proceed with care. If you suspect this is an issue, you can reduce this number to ~0.9 without a significant impact on the throughput. |
+| `persistent_device_usage` | `float` | Set the fraction of maximum grid size used by persistent kernel. Value 1.0 means the kernel grid size is maximum possible for the selected device. The value must be greater than 0.0 and not greater than 1.0.<br /><br />One may need to run other kernels alongside this persistent kernel. This parameter can be used to reduce the grid size of the persistent kernel to leave a few SMs idle.<br />Note: running any other work on GPU alongside the persistent kernel makes the setup fragile.<br />- Running another kernel in another thread usually works, but no progress guaranteed<br />- Any CUDA allocations block the context (this issue may be obscured by using pools)<br />- Memory copies to not-pinned host memory may block the context<br /><br />Even when we know there are no other kernels working at the same time, setting kDeviceUsage to 1.0 surprisingly sometimes hurts performance. Proceed with care. If you suspect this is an issue, you can reduce this number to ~0.9 without a significant impact on the throughput. |
🧰 Tools
🪛 LanguageTool

[style] ~455-~455: This phrase is redundant. Consider writing “alongside” or “with”.
Context: ...r />Note: running any other work on GPU alongside with the persistent kernel makes the setup f...

(ALONGSIDE_WITH)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/c_api/c-api-neighbors-cagra.md` at line 455, The note for the
persistent_device_usage/kDeviceUsage parameter contains the awkward phrase
"alongside with the persistent kernel"; edit the paragraph to use "alongside the
persistent kernel" (or "with the persistent kernel") instead, updating any other
occurrences of "alongside with" in that description so the sentence reads
cleanly (e.g., "running any other work on GPU alongside the persistent kernel
makes the setup fragile").
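
For readers unsure about the range stated in the parameter description ("greater than 0.0 and not greater than 1.0"), the check can be sketched as below. This is an illustration only: `validate_device_usage` is a hypothetical helper, not part of the cuVS API, and Python is used purely for brevity.

```python
def validate_device_usage(value: float) -> float:
    """Hypothetical helper: accept only values in the half-open range (0.0, 1.0]."""
    # The comparison is False for NaN as well, so NaN is rejected too.
    if not (0.0 < value <= 1.0):
        raise ValueError(
            f"persistent_device_usage must be in (0.0, 1.0], got {value!r}"
        )
    return value


# 0.9 is the reduced setting the description suggests trying.
print(validate_device_usage(0.9))
```

Note that 1.0 is accepted while 0.0 is rejected, matching the wording of the description.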

<a id="cuvsscalarquantizertransform"></a>
### cuvsScalarQuantizerTransform

Applies quantization transform to given dataset

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use a single spelling variant for “quantization” across this page.

This page mixes “quantisation” and “quantization”; please standardize to one variant for consistency.

Also applies to: 170-170

🧰 Tools
🪛 LanguageTool

[uncategorized] ~145-~145: Do not mix variants of the same word (‘quantization’ and ‘quantisation’) within a single text.
Context: ...# cuvsScalarQuantizerTransform Applies quantization transform to given dataset ```c cuvsEr...

(EN_WORD_COHERENCY)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/c_api/c-api-preprocessing-quantize-scalar.md` at line 145, The
page uses both “quantisation” and “quantization”; choose a single variant (use
“quantization”) and replace every occurrence on this page (including the
sentence "Applies quantization transform to given dataset" and the other
occurrence around line 170) so the terminology is consistent; search for
"quantis*" and "quantiz*" within
fern/pages/c_api/c-api-preprocessing-quantize-scalar.md and update all instances
to the chosen spelling.
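
The "search for `quantis*` and `quantiz*`" step can be sketched as a small script. This is a minimal sketch under assumptions: `count_spelling_variants` is a hypothetical helper, and the sample string is made up for illustration rather than taken from the page.

```python
import re
from collections import Counter


def count_spelling_variants(text: str) -> Counter:
    """Count words starting with 'quantis' versus 'quantiz', case-insensitively."""
    counts = Counter()
    # \b keeps the match at a word start, so camelCase identifiers such as
    # cuvsScalarQuantizerTransform are not counted.
    for match in re.finditer(r"\bquanti([sz])\w*", text, flags=re.IGNORECASE):
        counts["quanti" + match.group(1).lower() + "*"] += 1
    return counts


sample = (
    "Applies quantization transform to given dataset. "
    "Performs scalar quantisation. cuvsScalarQuantizerTransform"
)
print(count_spelling_variants(sample))
```

If both keys show a nonzero count, the page still mixes variants.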

| `os` | in | `std::ostream&` | output stream |
| `index` | in | [`const cuvs::neighbors::brute_force::index<half, float>&`](/api-reference/cpp-api-neighbors-brute-force#neighbors-brute-force-index) | brute force index |
| `include_dataset` | in | `bool` | Whether or not to write out the dataset to the file. Default: `true`. |
| `include_dataset` | in | `bool` | Whether or not to write out the dataset to the file.<br />Default: `true`. |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix stream-overload description to avoid file-specific wording.

These overloads serialize to std::ostream&, but the description says “write out the dataset to the file.” Please update to stream-neutral wording (e.g., “serialized output”) and keep phrasing consistent with the other overloads.

Also applies to: 654-654

🧰 Tools
🪛 LanguageTool

[style] ~628-~628: Consider shortening this phrase to just ‘whether’, unless you mean ‘regardless of whether’.
Context: ...x | | include_dataset | in | bool | Whether or not to write out the dataset to the file.<b...

(WHETHER)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/cpp_api/cpp-api-neighbors-brute-force.md` at line 628, The
parameter description for `include_dataset` currently uses file-specific wording
("write out the dataset to the file") but these overloads serialize to
std::ostream&, so update both occurrences (the entries around the
`include_dataset` parameter in this document) to stream-neutral wording such as
"Whether or not to include the dataset in the serialized output." Preserve the
parameter type (`bool`) and default (`true`) and keep phrasing consistent with
other overload descriptions.

Comment on lines +535 to +536
| `include_dataset` | in | `bool` | whether or not to serialize the dataset<br />Default: `true`. |
| `sector_aligned` | in | `bool` | whether output file should be aligned to disk sectors of 4096 bytes<br />Default: `false`. |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use simpler phrasing for boolean parameter descriptions.

Consider changing “whether or not …” to “whether …” for include_dataset (and optionally sector_aligned) to match concise style used elsewhere in the API docs.

Also applies to: 563-564, 591-592

🧰 Tools
🪛 LanguageTool

[style] ~535-~535: Consider shortening this phrase to just ‘whether’, unless you mean ‘regardless of whether’.
Context: ...x | | include_dataset | in | bool | whether or not to serialize the dataset<br />Default: ...

(WHETHER)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/cpp_api/cpp-api-neighbors-vamana.md` around lines 535 - 536,
Update the boolean parameter descriptions to use concise phrasing: change
"whether or not to serialize the dataset" to "whether to serialize the dataset"
for the include_dataset parameter and change "whether output file should be
aligned to disk sectors of 4096 bytes" to "whether the output file should be
aligned to 4096‑byte disk sectors" for sector_aligned; apply these edits for
every occurrence of include_dataset and sector_aligned in
cpp-api-neighbors-vamana.md so the wording matches the concise style used
elsewhere.

| Name | Type | Description |
| --- | --- | --- |
| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type, valid values for metric are ["sqeuclidean", "inner_product", "cosine"], where:<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2<br />- inner_product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |
| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type,<br />valid values for metric are ["sqeuclidean", "inner_product", "cosine"], where:<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2<br />- inner_product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Escape the multiplication asterisk in the cosine formula to avoid MD037 lint failures.

The * in ... / ( \|\|a\|\|_2 * \|\|b\|\|_2) is being parsed as emphasis markup in this markdown context. Escaping it (or wrapping the formula in backticks) will prevent markdownlint failures.

🔧 Proposed fix
-| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type,<br />valid values for metric are ["sqeuclidean", "inner_product", "cosine"], where:<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2<br />- inner_product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |
+| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type,<br />valid values for metric are ["sqeuclidean", "inner_product", "cosine"], where:<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2<br />- inner_product distance is defined as distance(a, b) = \\sum_i a_i \* b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i \* b_i / ( \|\|a\|\|_2 \* \|\|b\|\|_2). |
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type,<br />valid values for metric are ["sqeuclidean", "inner_product", "cosine"], where:<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2<br />- inner_product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |
| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type,<br />valid values for metric are ["sqeuclidean", "inner_product", "cosine"], where:<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2<br />- inner_product distance is defined as distance(a, b) = \\sum_i a_i \* b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i \* b_i / ( \|\|a\|\|_2 \* \|\|b\|\|_2). |
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 268-268: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)


[warning] 268-268: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/python_api/python-api-neighbors-cagra.md` at line 268, The
markdown for the `metric` parameter contains an unescaped asterisk in the cosine
distance formula which triggers MD037; update the cosine formula in the `metric`
description (the line referencing "cosine distance is defined as distance(a, b)
= 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2)") so the multiplication
asterisk is escaped or wrap the entire formula in inline code/backticks; ensure
the description text for `metric` (mentioning "sqeuclidean", "inner_product",
"cosine") uses the escaped asterisk or code formatting to prevent markdownlint
failures.
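
One way to apply the escaping option mechanically is sketched below. It assumes every unescaped `*` outside inline code in the cell is a multiplication sign and not intentional emphasis; `escape_asterisks` is a hypothetical helper, not part of markdownlint or Fern.

```python
import re


def escape_asterisks(cell: str) -> str:
    """Backslash-escape bare '*' characters that sit outside inline code spans."""
    # Splitting on backticks puts text outside code spans at even indices.
    parts = cell.split("`")
    for i in range(0, len(parts), 2):
        # The lookbehind skips asterisks that are already escaped.
        parts[i] = re.sub(r"(?<!\\)\*", r"\\*", parts[i])
    return "`".join(parts)


row = r"| `metric` | distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |"
print(escape_asterisks(row))
```

Running the function twice gives the same result as running it once, so it is safe to re-apply after regenerating the page.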

| --- | --- | --- |
| `n_lists` | `int, default = 1024` | The number of clusters used in the coarse quantizer. |
| `metric` | `str, default="sqeuclidean"` | String denoting the metric type. Valid values for metric: ["sqeuclidean", "inner_product", "euclidean", "cosine"], where:<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2,<br />- euclidean is the euclidean distance<br />- inner product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |
| `metric` | `str, default="sqeuclidean"` | String denoting the metric type.<br />Valid values for metric: ["sqeuclidean", "inner_product", "euclidean", "cosine"], where:<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2,<br />- euclidean is the euclidean distance<br />- inner product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix markdown emphasis parsing in the cosine formula.

Unescaped * in this table cell can trip MD037. Escaping the multiplication symbols (or wrapping formula text in code) will avoid lint failures.

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 201-201: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)


[warning] 201-201: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/python_api/python-api-neighbors-ivf-pq.md` at line 201, The
markdown table cell for the `metric` parameter contains unescaped asterisks in
the cosine distance formula which breaks MD parsing; update the cosine formula
in the `metric` description (the line mentioning "cosine distance is defined
as...") to either escape the multiplication asterisks or wrap the entire formula
in inline code/backticks so the "*" characters are not treated as emphasis (keep
the rest of the text and valid metric list unchanged).

| `lut_dtype` | `default = np.float32` | Data type of look up table to be created dynamically at search time. The use of low-precision types reduces the amount of shared memory required at search time, so fast shared memory kernels can be used even for datasets with large dimansionality. Note that the recall is slightly degraded when low-precision type is selected. Possible values [np.float32, np.float16, np.uint8] |
| `internal_distance_dtype` | `default = np.float32` | Storage data type for distance/similarity computation. Possible values [np.float32, np.float16] |
| `coarse_search_dtype` | `default = np.float32` | [Experimental] The data type to use as the GEMM element type when searching the clusters to probe. Possible values: [np.float32, np.float16, np.int8].<br />- Legacy default: np.float32<br />- Recommended for performance: np.float16 (half)<br />- Experimental/low-precision: np.int8 |
| `lut_dtype` | `default = np.float32` | Data type of look up table to be created dynamically at search time. The use of low-precision types reduces the amount of shared memory required at search time, so fast shared memory kernels can be used even for datasets with large dimansionality. Note that the recall is slightly degraded when low-precision type is selected.<br />Possible values [np.float32, np.float16, np.uint8] |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Correct typo in parameter description (dimansionality → dimensionality).

Small wording fix, but it improves docs quality and avoids grammar/lint noise.

✏️ Proposed fix
-| `lut_dtype` | `default = np.float32` | Data type of look up table to be created dynamically at search time. The use of low-precision types reduces the amount of shared memory required at search time, so fast shared memory kernels can be used even for datasets with large dimansionality. Note that the recall is slightly degraded when low-precision type is selected.<br />Possible values [np.float32, np.float16, np.uint8] |
+| `lut_dtype` | `default = np.float32` | Data type of look up table to be created dynamically at search time. The use of low-precision types reduces the amount of shared memory required at search time, so fast shared memory kernels can be used even for datasets with large dimensionality. Note that the recall is slightly degraded when low-precision type is selected.<br />Possible values [np.float32, np.float16, np.uint8] |
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| `lut_dtype` | `default = np.float32` | Data type of look up table to be created dynamically at search time. The use of low-precision types reduces the amount of shared memory required at search time, so fast shared memory kernels can be used even for datasets with large dimansionality. Note that the recall is slightly degraded when low-precision type is selected.<br />Possible values [np.float32, np.float16, np.uint8] |
| `lut_dtype` | `default = np.float32` | Data type of look up table to be created dynamically at search time. The use of low-precision types reduces the amount of shared memory required at search time, so fast shared memory kernels can be used even for datasets with large dimensionality. Note that the recall is slightly degraded when low-precision type is selected.<br />Possible values [np.float32, np.float16, np.uint8] |
🧰 Tools
🪛 LanguageTool

[grammar] ~349-~349: Ensure spelling is correct
Context: ...an be used even for datasets with large dimansionality. Note that the recall is slightly degra...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/python_api/python-api-neighbors-ivf-pq.md` at line 349, Fix the
typo in the `lut_dtype` parameter description: change "dimansionality" to
"dimensionality" in the sentence that reads "so fast shared memory kernels can
be used even for datasets with large dimansionality." Update the documentation
string for `lut_dtype` accordingly so it reads "dimensionality" and keep the
rest of the phrasing unchanged.

| --- | --- | --- |
| `n_lists` | `int, default = 1024` | The number of clusters used in the coarse quantizer. |
| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type. Valid values for metric: ["sqeuclidean", "inner_product", "euclidean", "cosine"], where<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2,<br />- euclidean is the euclidean distance<br />- inner product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |
| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type.<br />Valid values for metric: ["sqeuclidean", "inner_product", "euclidean", "cosine"], where<br /><br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2,<br />- euclidean is the euclidean distance<br />- inner product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Escape formula multiplication operators to prevent MD037 in table markdown.

This metric description still uses raw * characters in prose math; markdownlint may treat them as emphasis markers.

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 74-74: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)


[warning] 74-74: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/python_api/python-api-neighbors-ivf-sq.md` at line 74, The table
cell describing `metric` contains raw asterisks in the inline formulas (e.g.,
"a_i * b_i" and "\|\|a\|\|_2 * \|\|b\|\|_2") which triggers MD037; update the
markdown in the `metric` description to escape multiplication operators by
replacing * with \* in those expressions (for example "a_i \* b_i" and
"\|\|a\|\|_2 \* \|\|b\|\|_2") so the table renders correctly and avoids markdown
lint errors.
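
For reference while reviewing the formulas, the four metric definitions quoted in the table cell can be written out as plain Python. This is a sketch of the formulas as stated in the description, not the cuVS implementation; the function names are chosen here for illustration.

```python
import math


def sqeuclidean(a, b):
    """Euclidean distance without the square root: sum_i (a_i - b_i)^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean distance: square root of the squared distance."""
    return math.sqrt(sqeuclidean(a, b))


def inner_product(a, b):
    """Inner product: sum_i a_i * b_i."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine distance: 1 - sum_i a_i * b_i / (||a||_2 * ||b||_2)."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


a, b = [1.0, 0.0], [0.0, 1.0]
print(sqeuclidean(a, b), euclidean(a, b), inner_product(a, b), cosine(a, b))
```

The `*` operators inside the code are unaffected by MD037 because they sit in a fenced block; only the prose formulas in the table cell need escaping.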

| Name | Type | Description |
| --- | --- | --- |
| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type. Valid values for metric: ["sqeuclidean", "inner_product", "euclidean", "cosine"], where<br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2,<br />- euclidean is the euclidean distance<br />- inner product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |
| `metric` | `str, default = "sqeuclidean"` | String denoting the metric type.<br />Valid values for metric: ["sqeuclidean", "inner_product", "euclidean", "cosine"], where<br />- sqeuclidean is the euclidean distance without the square root operation, i.e.: distance(a,b) = \\sum_i (a_i - b_i)^2,<br />- euclidean is the euclidean distance<br />- inner product distance is defined as distance(a, b) = \\sum_i a_i * b_i.<br />- cosine distance is defined as distance(a, b) = 1 - \\sum_i a_i * b_i / ( \|\|a\|\|_2 * \|\|b\|\|_2). |

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid markdown emphasis conflicts in formula text.

Unescaped * in this metric formula can trigger MD037. Escaping the operator or using inline code will keep lint clean.

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 41-41: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)


[warning] 41-41: Spaces inside emphasis markers

(MD037, no-space-in-emphasis)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/python_api/python-api-neighbors-tiered-index.md` at line 41, The
metric description contains unescaped asterisks in the formulas (e.g., in "inner
product" and "cosine" definitions) which triggers markdown lint MD037; update
the text for the `metric` parameter to either wrap mathematical
operators/expressions like a_i * b_i and norms (||a||_2) in inline code spans or
escape the `*` operators so they are not treated as Markdown emphasis —
specifically edit the `metric` description line that mentions "inner product
distance" and "cosine distance" (and the sqeuclidean formula if needed) to use
inline code for expressions such as a_i * b_i and `||a||_2` or escape the `*`
characters.

Comment thread fern/pages/user_guide.md
## References

- [References](references.md): cite the research papers behind cuVS vector search, preprocessing, clustering, and GPU primitives.
- [API Guide](/user-guide/api-guides): find task-focused NVIDIA cuVS API examples for clustering, vector indexing, preprocessing, common types, and supporting routines.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix likely broken API Guide link target.

Line 11 uses /user-guide/api-guides, but the configured page is api_guide.md (singular “api-guide”), so this URL likely 404s.

Suggested fix
-- [API Guide](/user-guide/api-guides): find task-focused NVIDIA cuVS API examples for clustering, vector indexing, preprocessing, common types, and supporting routines.
+- [API Guide](/user-guide/api-guide): find task-focused NVIDIA cuVS API examples for clustering, vector indexing, preprocessing, common types, and supporting routines.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- [API Guide](/user-guide/api-guides): find task-focused NVIDIA cuVS API examples for clustering, vector indexing, preprocessing, common types, and supporting routines.
- [API Guide](/user-guide/api-guide): find task-focused NVIDIA cuVS API examples for clustering, vector indexing, preprocessing, common types, and supporting routines.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@fern/pages/user_guide.md` at line 11, Update the broken markdown link "[API
Guide](/user-guide/api-guides)" in fern/pages/user_guide.md to point to the
correct configured page name (singular) — replace the URL segment
"/user-guide/api-guides" with "/user-guide/api-guide" (matching the configured
file name api_guide.md) so the link resolves correctly.
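
A check for this class of mistake could look like the sketch below. It rests on an assumption taken from the comment above, namely that a page file such as `api_guide.md` is served under the slug `api-guide`; the real slug comes from the site configuration, which is not shown here. `page_slug` and `broken_links` are hypothetical helpers.

```python
import re


def page_slug(filename: str) -> str:
    """Assumed mapping: drop the directory and '.md', turn '_' into '-'."""
    stem = filename.rsplit("/", 1)[-1]
    if stem.endswith(".md"):
        stem = stem[: -len(".md")]
    return stem.replace("_", "-")


def broken_links(markdown: str, known_slugs: set, prefix: str = "/user-guide/") -> list:
    """Return absolute link targets under `prefix` whose slug is not known."""
    targets = re.findall(r"\]\((/[^)\s]+)\)", markdown)
    return [
        t for t in targets
        if t.startswith(prefix) and t[len(prefix):] not in known_slugs
    ]


line = "- [API Guide](/user-guide/api-guides): find task-focused examples."
print(broken_links(line, {page_slug("api_guide.md")}))
```

Relative links such as `references.md` are ignored by this sketch because the pattern only matches targets that start with `/`.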

@cjnolet

cjnolet commented Jun 4, 2026

Contributor Author

/merge

@rapids-bot rapids-bot Bot merged commit 3035cd0 into NVIDIA:main Jun 4, 2026
56 checks passed
@github-project-automation github-project-automation Bot moved this from In Progress to Done in Unstructured Data Processing Jun 4, 2026