expose include_orig_elements param in api #1
Merged
corbanha
approved these changes
Jun 17, 2026
FouL06
approved these changes
Jun 18, 2026

TL;DR
Adds `include_orig_elements` as a new form parameter to control whether original elements are included in chunk metadata.

What changed?
A new `include_orig_elements` boolean parameter (defaulting to `True`) has been added to the API. When `True`, the elements used to form each chunk are attached to that chunk's `.metadata.orig_elements` as a gzipped+base64 blob. When set to `False`, these blobs are omitted from the response. The parameter is wired through `GeneralFormParams`, `pipeline_api`, and all relevant chunking call sites.

How to test?
Submit a document partition request with `include_orig_elements=false` in the form body and verify that the response chunks do not contain `orig_elements` in their metadata. Submit the same request with `include_orig_elements=true` (or omit the parameter entirely) and confirm that `orig_elements` is present in the chunk metadata as expected.

Why make this change?
For large documents, particularly those with large tables, the `orig_elements` blob gets duplicated into every chunk, which can dramatically inflate the response payload size. Letting callers opt out of this behavior gives them significantly smaller responses when the original element data is not needed.