feat(closes OPEN-10341): add native async runner for testset batches by gustavocidornelas · Pull Request #648 · openlayer-ai/openlayer-python

gustavocidornelas · 2026-05-19T14:00:56Z

Pull Request

Summary

Adds native async support to OpenlayerModel.run_batch_from_df so that customers whose per-row work hits slow APIs
(~5s/row) can run testset batches concurrently instead of strictly sequentially.

Users opt in by defining run as async def run(...); the framework then drives rows through asyncio.gather gated by a semaphore. Sync run keeps today's behavior byte-for-byte.

Changes

run may now be defined as async def run(...). When it is, run_batch_from_df dispatches rows concurrently
via asyncio.gather + asyncio.Semaphore(max_workers).
New max_workers kwarg on run_batch_from_df and batch, plus --max-workers on the CLI. Default resolves to
4 for async run, 1 for sync run — writing async def is the opt-in signal that interleaving is safe.
max_workers > 1 with a sync run raises ValueError rather than silently ignoring it.
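Per-row exceptions still propagate and abort the batch (fail-fast, same as today). For the async path,
asyncio.gather re-raises the first exception immediately; it does not cancel in-flight siblings itself, so they are only cancelled when the event loop is torn down (e.g. by asyncio.run).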
Extracted _run_rows_async, _apply_row_result, and _build_config helpers so the row bookkeeping is shared
between the sync and async paths.
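
A minimal sketch of the dispatch pattern described above, not the code from this PR: a semaphore gates how many rows are inside run at once, and asyncio.gather collects results in input order. The names run_rows_async, fake_run, and stats are hypothetical and exist only for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple

Row = Tuple[int, Dict[str, Any]]  # (row index, kwargs passed to run)


async def run_rows_async(
    run: Callable[..., Awaitable[Any]],
    rows: List[Row],
    max_workers: int,
) -> List[Tuple[int, Any]]:
    """Call run(**kwargs) for every row, at most max_workers at a time."""
    semaphore = asyncio.Semaphore(max_workers)

    async def _one(index: int, kwargs: Dict[str, Any]) -> Tuple[int, Any]:
        # Only max_workers coroutines can hold the semaphore simultaneously.
        async with semaphore:
            return index, await run(**kwargs)

    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(_one(i, k) for i, k in rows))


# Hypothetical stand-in for a slow per-row call; tracks peak concurrency.
stats = {"in_flight": 0, "peak": 0}


async def fake_run(question: str) -> str:
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    try:
        await asyncio.sleep(0.01)
        return question.upper()
    finally:
        stats["in_flight"] -= 1


rows = [(i, {"question": f"q{i}"}) for i in range(8)]
results = asyncio.run(run_rows_async(fake_run, rows, max_workers=4))
```

Here eight rows are processed with no more than four inside fake_run at any moment, and the results come back in row order.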

Context

OPEN-10341: Add native async runner for testset batches

Testing

Manual testing

Monitoring

No expected impact

Notes

Backwards compatible: existing sync run implementations behave identically. Same sequential code path, same
fail-fast semantics, no executor or asyncio overhead.
Customer-facing usage: if a customer's openlayer_run.py already defines async def run(...), they get 4-way
concurrency automatically in the next release. To override, append --max-workers N to the batchCommand in
openlayer.json.

viniciusdsmello

Three inline notes (risk & suggestions). The core async path and per-row trace isolation look correct; these are about scale, the default, and one leaky error message.

viniciusdsmello · 2026-06-16T14:22:38Z

+                output = await self.run(**kwargs)
+                return index, output, tracer.get_current_trace()
+
+        return await asyncio.gather(*(_one(i, k) for i, k in rows))


Eager task creation won't scale to large testsets. asyncio.gather(*(_one(...) for ... in rows)) instantiates one coroutine/Task per row up front, and rows (line 150) materializes every filtered-kwargs dict at once. The semaphore bounds concurrency, not task count, so a 100k-row testset creates 100k pending tasks plus 100k kwargs dicts in memory regardless of max_workers.

For typical testsets this is fine, but since the motivating use case is large batches against slow APIs, consider a bounded worker pool (N workers pulling from an asyncio.Queue, or chunked gather) so memory scales with max_workers rather than row count. At minimum, let's document the limitation.
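
One way the bounded worker pool could look, as a sketch only. It uses a shared iterator rather than an asyncio.Queue: each of max_workers tasks pulls the next row when it is free, so only max_workers rows are materialized at a time. The names run_rows_bounded, lazy_rows, and slow_double are hypothetical.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, Dict[str, Any]]  # (row index, kwargs passed to run)


async def run_rows_bounded(
    run: Callable[..., Awaitable[Any]],
    rows: Iterable[Row],
    max_workers: int,
) -> List[Tuple[int, Any]]:
    """Process rows with a fixed pool of workers pulling from one iterator."""
    iterator = iter(rows)
    results: List[Tuple[int, Any]] = []

    async def _worker() -> None:
        # next() on the shared iterator is synchronous, so workers running
        # on the same event loop never advance it at the same time.
        for index, kwargs in iterator:
            results.append((index, await run(**kwargs)))

    workers = [asyncio.create_task(_worker()) for _ in range(max_workers)]
    try:
        await asyncio.gather(*workers)
    except BaseException:
        # Fail-fast: stop the remaining workers, wait for them, then re-raise.
        for task in workers:
            task.cancel()
        await asyncio.gather(*workers, return_exceptions=True)
        raise
    # Workers finish out of order, so restore row order before returning.
    return sorted(results, key=lambda pair: pair[0])


def lazy_rows(n: int) -> Iterator[Row]:
    """Yield rows one at a time instead of building the full list."""
    for i in range(n):
        yield i, {"value": i}


async def slow_double(value: int) -> int:
    await asyncio.sleep(0.001)
    return value * 2


results = asyncio.run(run_rows_bounded(slow_double, lazy_rows(100), max_workers=4))
```

The task count here is max_workers regardless of testset size. The trade-off is that results must be re-sorted, and the explicit cancel step is needed because gather does not cancel the other workers when one fails.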

viniciusdsmello · 2026-06-16T14:22:38Z

+        is_async = inspect.iscoroutinefunction(self.run)
+
+        if max_workers is None:
+            max_workers = 4 if is_async else 1


Defaulting async to 4 is opinionated and silent. The "writing async def means interleaving is safe" contract is reasonable, but this jumps an async run from sequential to 4 concurrent invocations with no explicit opt-in at the call site. That can surprise a run that hits a rate-limited API or holds non-reentrant state.

Two options: (a) default async to 1 and require --max-workers N to scale, or (b) keep 4 but call it out prominently in the changelog/user docs. Either is fine. Flagging so it's a deliberate choice rather than an accident.
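
To make the two options concrete, here is a sketch of the default resolution with the async default pulled out as a parameter. resolve_max_workers and async_default are hypothetical names, not part of the PR; option (a) corresponds to async_default=1 and option (b) to async_default=4.

```python
import inspect
from typing import Any, Callable, Optional


def resolve_max_workers(
    run: Callable[..., Any],
    max_workers: Optional[int] = None,
    async_default: int = 4,  # hypothetical knob: 4 = option (b), 1 = option (a)
) -> int:
    """Pick the effective concurrency for a sync or async run callable."""
    is_async = inspect.iscoroutinefunction(run)
    if max_workers is None:
        return async_default if is_async else 1
    if max_workers > 1 and not is_async:
        # Mirrors the PR description: fail loudly instead of ignoring the value.
        raise ValueError("max_workers > 1 requires `run` to be defined with async def")
    return max_workers


async def async_run(**kwargs: Any) -> None:
    return None


def sync_run(**kwargs: Any) -> None:
    return None
```

With these definitions, resolve_max_workers(async_run) yields 4 under option (b) and resolve_max_workers(async_run, async_default=1) yields 1 under option (a); an explicit max_workers always wins.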

viniciusdsmello · 2026-06-16T14:22:38Z

+            else:
+                raise RuntimeError(
+                    "run_batch_from_df was called from inside a running event "
+                    "loop. Call `await self._run_rows_async(...)` directly "


Error message points at an internal. The guidance to "Call await self._run_rows_async(...)" references a private method that takes pre-built (index, kwargs) tuples, which isn't something a user can reasonably call. If invoking from inside a running loop is a real use case, expose a public async def run_batch_from_df_async(df, max_workers=...) and point the message at it. Otherwise, soften the message so it doesn't direct users at internals.
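
A rough sketch of the public-entry-point idea, assuming a simplified runner that takes a list of kwargs dicts instead of a DataFrame. Runner, run_batch, and run_batch_async are hypothetical stand-ins, not the actual OpenlayerModel API.

```python
import asyncio
from typing import Any, Dict, List


class Runner:
    """Hypothetical stand-in for a model class with an async run."""

    async def run(self, text: str) -> str:
        await asyncio.sleep(0)
        return text[::-1]

    async def run_batch_async(
        self, rows: List[Dict[str, Any]], max_workers: int = 4
    ) -> List[Any]:
        """Public async entry point, safe to await from a running loop."""
        semaphore = asyncio.Semaphore(max_workers)

        async def _one(kwargs: Dict[str, Any]) -> Any:
            async with semaphore:
                return await self.run(**kwargs)

        return await asyncio.gather(*(_one(k) for k in rows))

    def run_batch(self, rows: List[Dict[str, Any]], max_workers: int = 4) -> List[Any]:
        """Sync entry point; refuses to nest inside a running event loop."""
        try:
            asyncio.get_running_loop()
            inside_loop = True
        except RuntimeError:  # raised when no loop is running in this thread
            inside_loop = False
        if inside_loop:
            # The message names only the public coroutine, never a private helper.
            raise RuntimeError(
                "run_batch was called from inside a running event loop. "
                "Use `await run_batch_async(...)` instead."
            )
        return asyncio.run(self.run_batch_async(rows, max_workers))


runner = Runner()
rows = [{"text": "ab"}, {"text": "cd"}]
sync_results = runner.run_batch(rows)
```

Checking for a running loop before creating the coroutine avoids leaving an un-awaited coroutine behind when the error path is taken.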

feat(closes OPEN-10341): add native async runner for testset batches

394081e

gustavocidornelas requested a review from whoseoyster May 19, 2026 14:01

viniciusdsmello reviewed Jun 16, 2026

gustavocidornelas wants to merge 1 commit into main from gustavo/open-10341-add-native-async-runner-for-testset-batches
