perf: scale throughput by effective batch size, not requested --batch-size by xieofxie · Pull Request #930 · microsoft/winml-cli

xieofxie · 2026-06-22T08:15:37Z

Summary

Fixes incorrect samples_per_sec in wmk perf when the requested --batch-size cannot be applied to the model.

The requested --batch-size only lands on inputs whose leading dimension is dynamic (see _resolve_shape). For a model with a statically-fixed batch dim (e.g. [1, 3, 224, 224]), input generation silently keeps the static value, so the session runs batch=1 — but throughput was still computed as config.batch_size / latency, inflating samples_per_sec by the requested batch factor (e.g. reporting 800 sps when the model actually processed 100). Latency stats were always correct; only the derived per-sample throughput was wrong.
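The mismatch described above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the project's code: `resolve_shape` is a hypothetical stand-in for `_resolve_shape`, the symbolic dimension name `"N"` and the 10 ms latency are assumed values chosen to match the 800 vs. 100 sps example.

```python
def resolve_shape(declared_shape, requested_batch):
    """Hypothetical stand-in for _resolve_shape.

    Only a dynamic (non-integer) leading dimension takes the requested
    batch size; a static leading dimension is kept as declared.
    """
    if not declared_shape:
        return []  # scalar input: no batch dimension at all
    lead, *rest = declared_shape
    if not isinstance(lead, int):
        lead = requested_batch
    # Assumption for this sketch: other dynamic dims default to 1.
    rest = [d if isinstance(d, int) else 1 for d in rest]
    return [lead, *rest]


requested_batch = 8
mean_latency_ms = 10.0  # assumed per-call latency

static_shape = resolve_shape([1, 3, 224, 224], requested_batch)
dynamic_shape = resolve_shape(["N", 3, 224, 224], requested_batch)

# Old behaviour: throughput scaled by the requested batch size.
inflated_sps = requested_batch * 1000.0 / mean_latency_ms
# Fixed behaviour: throughput scaled by the batch the session actually ran.
actual_sps = static_shape[0] * 1000.0 / mean_latency_ms

print(static_shape, dynamic_shape, inflated_sps, actual_sps)
```

With the static shape the leading dimension stays 1, so the per-sample throughput differs from the old figure by exactly the requested batch factor.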

Changes

- New helper effective_batch_size(): reads the actual batch back from the first batched (rank ≥ 1) generated input, matching the module's existing "first dim is batch" convention. Falls back to the requested value when all inputs are scalar.
- _collect_results() now scales samples_per_sec by the effective batch the session ran, not the requested config.batch_size. (batches_per_sec was already correct, since it is per-call.)
- _generate_inputs() logs a warning when the requested --batch-size could not be applied (static batch dim).
- BenchmarkResult gains an effective_batch_size field, surfaced in the JSON benchmark_info alongside the requested batch_size, and shown in the console throughput line with a Note: when the two differ.
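The changes above could look roughly like the following. This is an illustrative sketch only: the signatures, the `FakeTensor` type, and the `samples_per_sec` and `benchmark_info` helper functions are assumptions made for the example and may not match the code in perf.py. Only the names `effective_batch_size`, `batch_size`, and `samples_per_sec` come from the PR description.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a generated input; only .shape matters."""
    shape: Tuple[int, ...]


def effective_batch_size(inputs: Mapping[str, FakeTensor], requested: int) -> int:
    """Return the leading dim of the first batched (rank >= 1) input.

    Falls back to the requested value when every input is scalar.
    """
    for tensor in inputs.values():
        if len(tensor.shape) >= 1:
            return tensor.shape[0]
    return requested


def samples_per_sec(inputs, requested: int, mean_latency_ms: float) -> float:
    """Scale per-call throughput by the batch the session actually ran."""
    return effective_batch_size(inputs, requested) * 1000.0 / mean_latency_ms


def benchmark_info(inputs, requested: int) -> dict:
    """Report the requested and the effective batch size side by side."""
    return {
        "batch_size": requested,
        "effective_batch_size": effective_batch_size(inputs, requested),
    }


static_inputs = {"image": FakeTensor((1, 3, 224, 224))}
dynamic_inputs = {"image": FakeTensor((8, 3, 224, 224))}
scalar_inputs = {"threshold": FakeTensor(())}

print(samples_per_sec(static_inputs, 8, 10.0))
print(benchmark_info(static_inputs, 8))
```

Reading the batch back from the generated inputs, rather than re-deriving it from the model's declared shapes, keeps the reported figure tied to what was actually fed to the session.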

Tests

New TestEffectiveBatchSize in tests/unit/commands/test_perf_cli.py covers the helper (dynamic / static / scalar / fallback), throughput scaling for both static-batch (regression guard: 100 not 800 sps) and dynamic-batch cases, the warning on mismatch, and the JSON field. Full test_perf_cli.py, test_perf_module.py, and test_perf_composite.py suites pass.

Scope note

This addresses the batch-size correctness aspect of the epic. The broader items (batch-size sweep --batch-sizes 1,4,8,16 and --ep all cross-EP comparison) are not part of this PR.

Closes #155

xieofxie · 2026-06-22T08:18:35Z

For issue #155, the only remaining item (--ep all) is tracked via #449, so this PR closes it.

DingmaomaoBJTU

Overall this is a clean, well-motivated fix. The core bug (samples_per_sec inflated by requested batch on static-batch models) is correctly diagnosed and fixed. The effective_batch_size helper is clear and its fallback behavior is well-documented. Tests cover the important scenarios including the regression guard. A few minor observations below.

perf: add effective_batch_size for batch-size

f637aeb

xieofxie requested a review from a team as a code owner June 22, 2026 08:15

use warning

804bdfa

DingmaomaoBJTU reviewed Jun 23, 2026

Comment thread src/winml/modelkit/commands/perf.py Outdated

Comment thread src/winml/modelkit/commands/perf.py

perf: address PR review on effective batch size

3776265

- Simplify static-batch warning guard to fire consistently with the console Note (drop redundant batch_size != 1 check)
- Document single-input assumption in effective_batch_size docstring
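One way to keep the log warning and the console Note consistent is to drive both from a single condition. The sketch below is an assumption about the shape of that guard, not the actual perf.py code: the function names, logger name, and message wording are all hypothetical.

```python
import logging

logger = logging.getLogger("perf_sketch")  # hypothetical logger name


def batch_mismatch(requested: int, effective: int) -> bool:
    """Single condition shared by the log warning and the console Note."""
    return effective != requested


def warn_if_batch_not_applied(requested: int, effective: int) -> bool:
    """Log a warning when the requested batch size was not applied."""
    if batch_mismatch(requested, effective):
        logger.warning(
            "Requested --batch-size %d could not be applied; "
            "the session runs with batch size %d.",
            requested,
            effective,
        )
        return True
    return False


def throughput_line(sps: float, requested: int, effective: int) -> str:
    """Format the console throughput line, adding a Note on mismatch."""
    line = f"Throughput: {sps:.1f} samples/sec"
    if batch_mismatch(requested, effective):
        line += f" (Note: effective batch size {effective}, requested {requested})"
    return line


print(throughput_line(100.0, 8, 1))
```

An extra `batch_size != 1` check would suppress the warning when the requested batch is 1 but the model's static batch is larger, while the Note would still appear; a shared predicate avoids that divergence.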

xieofxie wants to merge 3 commits into main from hualxie/perf_batch_test
