perf: O(1) PlanDataInjector lookup by op kind#4535
Conversation
PlanDataInjector.injectPlanData walked every operator in the tree against every registered injector (`for (injector <- injectors if injector.canInject(op))`) -- N operators x M injectors canInject calls -- even though most operators in any tree are non-scan and match no injector. Add `opStructCase` to the PlanDataInjector trait and key a Map[OpStructCase, PlanDataInjector]. Look up by op.getOpStructCase (O(1)) then a single canInject confirm; non-scan operators skip the iteration entirely. Pure performance change -- no behavior difference. Closes apache#4530 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
check-suites.py requires every *Suite.scala to appear in both pr_build_linux.yml and pr_build_macos.yml. Add the new PlanDataInjectorSuite alongside its sibling org.apache.spark.sql.comet suites. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
I'm not familiar with this code so will defer to @mbutrovich for review |
|
I will review this tomorrow. |
|
@mbutrovich gentle ping on this one when you get a chance. No rush, just keeping it on your radar. Thanks! |
Taking a look today for realsies. Thanks for the patience, @schenksj! |
mbutrovich
left a comment
There was a problem hiding this comment.
Thanks for splitting this out from the Delta branch. The approach is sound and I am comfortable with it.
Since op_struct is a protobuf oneof, the op kind is mutually exclusive, and each old canInject already began with the matching hasIcebergScan / hasNativeScan check. That means at most one injector could ever match a given operator under the old loop too, so keying by OpStructCase is behavior-preserving rather than just an approximation. Good call adding the distinct-key guard in PlanDataInjectorSuite to catch two injectors sharing a kind.
I checked the serde-performance angle since that is the reason we moved to the RDD model. This change is clean there. The new opStructCase vals and the injectorsByKind map live on object singletons initialized once per executor JVM, nothing new enters the serialized Operator protobuf, and injectPlanData still returns the same protobuf shape. It runs inside CometExecRDD.compute on the executor, not in a shipped closure, so there is no transitive capture into the RDD.
|
|
||
| val result = PlanDataInjector.injectPlanData(root, Map.empty, Map.empty) | ||
|
|
||
| assert(result == root, "non-scan operator tree should be returned unchanged") |
There was a problem hiding this comment.
The injectPlanData leaves a non-scan operator tree unchanged test asserts result == root, which is a good pass-through check. Worth being clear that the result still goes through op.toBuilder and build(), so this verifies protobuf equality after a rebuild rather than asserting the rebuild was skipped. Fine as a behavior test, just not an allocation-avoidance assertion.
Which issue does this PR close?
Closes #4530.
Rationale for this change
PlanDataInjector.injectPlanDatawalked every operator in the tree against every registered injector:That is N operators × M injectors
canInjectcalls, even though most operators in any tree are non-scan and match no injector.This is an independent micro-optimization, extracted from the Delta Lake contrib integration branch (which registers a third injector) purely to keep that PR focused; there is no functional dependency.
What changes are included in this PR?
def opStructCase: Operator.OpStructCaseto thePlanDataInjectortrait, implemented byIcebergPlanDataInjector(ICEBERG_SCAN) andNativeScanPlanDataInjector(NATIVE_SCAN).Map[Operator.OpStructCase, PlanDataInjector]and look up byop.getOpStructCase(O(1)), then run a singlecanInjectconfirm (which still inspects detail fields likehasCommon/!hasFilePartition). Non-scan operators skip the iteration entirely.Pure performance change — no behavior difference.
How are these changes tested?
PlanDataInjectorSuite: asserts a non-scan operator tree passes throughinjectPlanDataunchanged (exercises the O(1) miss path), and that each registered injector resolves back to itself via its declaredopStructCase(the kinds are distinct and the map is complete, so no injector is silently shadowed).CometExecSuite(126/0), which exercises native-scan plan-data injection end-to-end on every native query, passes unchanged — confirming the refactor preserves behavior.