Add opt-in monolithic shared library for ODR-safe plugin co-loading #170

Open
blasscoc wants to merge 1 commit into main from feat/shared-lib-build

@blasscoc (Collaborator)

MDIO is consumed header-only, so every plugin or executable that uses it statically links its own copy of tensorstore and its vendored Abseil. That is harmless for a single executable, but a process that dlopen()s more than one such plugin ends up with several copies of Abseil's global state and aborts at runtime with the infamous "ODR violation in Cord".

This adds an opt-in MDIO_BUILD_MONOLITHIC_SHARED (default OFF, so static consumers are untouched) that emits one libmdio_monolith.so, exposed via the mdio::monolith alias. Plugins link that single object instead of the tensorstore::* static deps, so the dynamic linker maps tensorstore and Abseil exactly once per process and the globals are singletons again.
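For orientation, a consumer might opt in roughly as follows. This is a minimal sketch, not taken from the PR: the project name, the `third_party/mdio` checkout path, the `plugin.cc` source, and the minimum CMake version are assumptions; only `MDIO_BUILD_MONOLITHIC_SHARED` and `mdio::monolith` come from the description above.

```cmake
# Top-level CMakeLists.txt of a hypothetical plugin project.
# Assumption: 3.27 is required because $<COMPILE_ONLY:...> was added in that release.
cmake_minimum_required(VERSION 3.27)
project(plugin_demo LANGUAGES CXX)

# Opt in before MDIO's own CMakeLists.txt is processed; the option defaults to OFF.
set(MDIO_BUILD_MONOLITHIC_SHARED ON CACHE BOOL "Build libmdio_monolith.so" FORCE)
add_subdirectory(third_party/mdio)  # assumed checkout location

# MODULE: a plugin that is loaded with dlopen() rather than linked against.
add_library(my_plugin MODULE plugin.cc)

# Link the single shared object instead of the tensorstore::* static deps.
target_link_libraries(my_plugin PRIVATE mdio::monolith)
```

Every plugin built this way resolves tensorstore and Abseil symbols from the same libmdio_monolith.so, which is what keeps the globals unique per process.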

Getting a self-contained shared object right took two non-obvious steps:

  • WHOLE_ARCHIVE on the tensorstore::* targets is a no-op because they are INTERFACE aggregators. A small recursive collector walks the link closure down to the concrete STATIC_LIBRARY targets (unwrapping the $<LINK_LIBRARY:WHOLE_ARCHIVE,...> genexes tensorstore uses for its alwayslink driver libs) and whole-archives those, so the explicit template instantiations (Spec / Zarr metadata JSON binders) and the driver self-registration objects are all force-included.

  • The interface deps are re-exposed to consumers as $<COMPILE_ONLY:...> usage requirements, propagating the transitive tensorstore/Abseil/nlohmann include dirs and defines without dragging the static archives back into the consumer (which would re-duplicate Abseil and defeat the whole exercise).
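
The two steps above can be sketched as follows. This is an illustrative approximation, not the PR's actual code: the helper name `mdio_collect_static_libs`, the `monolith.cc` source file, and the `tensorstore::tensorstore` root target are assumptions. The sketch only unwraps `$<LINK_ONLY:...>` and single-library `$<LINK_LIBRARY:WHOLE_ARCHIVE,...>` wrappers, and it leaves out the final whole-archive link of the collected targets.

```cmake
cmake_minimum_required(VERSION 3.27)  # assumption: needed for $<COMPILE_ONLY:...>

# Hypothetical helper: walk INTERFACE_LINK_LIBRARIES from the given roots and
# return every concrete STATIC_LIBRARY target that is reachable.
function(mdio_collect_static_libs out_var)
  set(result "")
  set(visited "")
  set(queue ${ARGN})
  while(NOT "${queue}" STREQUAL "")
    list(POP_FRONT queue item)

    # Unwrap $<LINK_ONLY:x> and $<LINK_LIBRARY:WHOLE_ARCHIVE,x> to their payload
    # and revisit it. Other generator expressions are skipped below.
    if(item MATCHES "^\\$<(LINK_ONLY:|LINK_LIBRARY:WHOLE_ARCHIVE,)(.+)>$")
      list(APPEND queue "${CMAKE_MATCH_2}")
      continue()
    endif()

    # Ignore plain library names, linker flags, and unhandled expressions.
    if(NOT TARGET "${item}")
      continue()
    endif()

    # Resolve aliases so a target is not visited under two names.
    get_target_property(real "${item}" ALIASED_TARGET)
    if(real)
      set(item "${real}")
    endif()
    if(item IN_LIST visited)
      continue()
    endif()
    list(APPEND visited "${item}")

    get_target_property(type "${item}" TYPE)
    if(type STREQUAL "STATIC_LIBRARY")
      list(APPEND result "${item}")
    endif()

    # INTERFACE aggregators contribute nothing themselves; follow their deps.
    get_target_property(deps "${item}" INTERFACE_LINK_LIBRARIES)
    if(deps)
      list(APPEND queue ${deps})
    endif()
  endwhile()
  set(${out_var} "${result}" PARENT_SCOPE)
endfunction()

# Placeholder source file and root target name.
add_library(mdio_monolith SHARED monolith.cc)
add_library(mdio::monolith ALIAS mdio_monolith)

mdio_collect_static_libs(MDIO_STATIC_LIBS tensorstore::tensorstore)
message(STATUS "Archives to whole-archive: ${MDIO_STATIC_LIBS}")
# Not shown: linking each entry PRIVATE through $<LINK_LIBRARY:WHOLE_ARCHIVE,...>.

# Consumers inherit include dirs and defines, but the archives are not linked again.
target_link_libraries(mdio_monolith INTERFACE
  "$<COMPILE_ONLY:tensorstore::tensorstore>")
```

The `visited` list matters here because the same target can be reached through several aggregators, and a repeated entry would be processed more than once.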

Co-authored-by: Cursor <cursoragent@cursor.com>