perf(android): Hit-test gestures without getLocationOnScreen#5595
perf(android): Hit-test gestures without getLocationOnScreen#5595runningcode wants to merge 5 commits into
Conversation
📲 Install BuildsAndroid
|
Performance metrics 🚀
|
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 18c0bc2 | 306.73 ms | 349.77 ms | 43.03 ms |
| 0eaac1e | 316.82 ms | 357.34 ms | 40.52 ms |
| d15471f | 303.49 ms | 439.08 ms | 135.59 ms |
| fc5ccaf | 276.52 ms | 370.46 ms | 93.93 ms |
| e2dce0b | 308.96 ms | 360.10 ms | 51.14 ms |
| 5b1a06b | 352.27 ms | 413.70 ms | 61.43 ms |
| 37ec571 | 366.04 ms | 424.28 ms | 58.23 ms |
| 9fbb112 | 361.43 ms | 427.57 ms | 66.14 ms |
| bbc35bb | 324.88 ms | 425.73 ms | 100.85 ms |
| ff8eea4 | 313.42 ms | 337.08 ms | 23.66 ms |
App size
| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 18c0bc2 | 1.58 MiB | 2.13 MiB | 557.33 KiB |
| 0eaac1e | 1.58 MiB | 2.19 MiB | 619.17 KiB |
| d15471f | 1.58 MiB | 2.13 MiB | 559.54 KiB |
| fc5ccaf | 1.58 MiB | 2.13 MiB | 557.54 KiB |
| e2dce0b | 0 B | 0 B | 0 B |
| 5b1a06b | 0 B | 0 B | 0 B |
| 37ec571 | 0 B | 0 B | 0 B |
| 9fbb112 | 1.58 MiB | 2.11 MiB | 539.18 KiB |
| bbc35bb | 1.58 MiB | 2.12 MiB | 553.01 KiB |
| ff8eea4 | 1.58 MiB | 2.28 MiB | 718.64 KiB |
0xadam-brown
left a comment
There was a problem hiding this comment.
Great work! (And love the helper doc.)
One optional comment; otherwise lgtm
| * Maps a touch point expressed in the parent's local coordinate space into the child's local | ||
| * coordinate space. This mirrors how {@link ViewGroup} dispatches touch events to its children | ||
| * and lets us hit-test the whole tree with a single downward traversal, instead of calling {@link | ||
| * View#getLocationOnScreen(int[])} (which walks up to the root) for every view. |
| // (90, 220) maps to (-10, 20) in the child's space -> outside, despite being inside the decor. | ||
| assertNull(ViewUtils.findTarget(options, decorView, 90f, 220f, UiElement.Type.CLICKABLE)) | ||
| } | ||
|
|
There was a problem hiding this comment.
l: Nice! Beyond covering the top/left path, might be worth adding quick cases for parent scrollX/scrollY and a non-identity child matrix. The gesture listener mocks won't catch regressions there and they'd cover the other two mapToChild branches.
There was a problem hiding this comment.
added that test below! good point
| childY = point[1]; | ||
| } | ||
| } | ||
| return new ViewWithLocation(child, childX, childY); |
There was a problem hiding this comment.
i guess the next step to improve this could be to have a pool of reusable ViewWithLocations to reduce the number of allocations? but that could be done separately (if it makes sense at all)
There was a problem hiding this comment.
i always dreamed of the day I would need to use a pool to fix a real performance problem. it's a good point. can add a ticket for a follow up investigation to see if this makes sense.
There was a problem hiding this comment.
Claude says:
❯ Would it make sense to add a ticket for a follow up to add a pool for the ViewWithLocation objects?
⏺ Honestly, I'd lean against it — and I wouldn't file the ticket speculatively.
Reasoning:
- These are textbook ephemeral allocations. Each ViewWithLocation is ~24 bytes (one ref + two floats) and dies immediately after the
traversal. ART's young-gen allocates these with a bump pointer (TLAB) and collects them almost for free. Pooling trades that for a managed
pool with reset/lifecycle logic, thread-safety questions, and live objects the GC now has to keep scanning — on Android, hand-rolled pools for
small short-lived objects often net-lose. - The count is bounded and small — one per view on the hit-test path, and the traversal prunes whole subtrees as soon as a view doesn't
contain the touch, so it's nowhere near the full tree. This fires at interaction frequency (taps/scrolls), not per-frame render. - The real win in this PR was removing the per-view getLocationOnScreen walk (O(depth) up to the root, each call). The wrapper allocation is
noise next to that.
romtsn
left a comment
There was a problem hiding this comment.
dope! I wish my Algebraic geometry teacher had told me what would I need matrices for 😅
ViewUtils.findTarget called View.getLocationOnScreen for every visited view, and that walks from the view up to the root each time, making the traversal O(N*depth) per tap and scroll start. Instead, map the touch point down into each child's local coordinate space as we descend the tree — the same way ViewGroup dispatches touch events — so each view costs O(1) and the whole traversal is O(N). The locators still receive the original decor-view-relative coordinates, since the Compose locator hit-tests against window coordinates. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g (JAVA-534) The existing test only exercised the left/top offset path of mapToChild. Add cases for a scrolled parent and a non-identity child matrix so the other two coordinate-mapping branches are covered, and switch the class to Robolectric so the real Matrix math runs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
570f8d3 to
0eb4cd6
Compare
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📜 Description
ViewUtils.findTarget(run on every tap and at every scroll start) calledView.getLocationOnScreenfor every visited view to hit-test it.getLocationOnScreenwalks from the view up to the root each call, so hit-testing the whole tree was O(N·depth) per gesture, repeatedly re-walking the same ancestor chains.This replaces that with the technique
ViewGroupitself uses to dispatch touch events: map the touch point down into each child's local coordinate space as we descend (offset by the parent's scroll and the child'sleft/top, then apply the child's inverse matrix for transformed views). Each view is then a cheapO(1)bounds check against its own[0,0,width,height], making the whole traversal O(N).Notes:
x,y, becauseComposeGestureTargetLocatorhit-tests against window coordinates.The
sentry-composelocator gets the sameLinkedList→ArrayDequecleanup (it already hit-tests in window space, so no coordinate change there).💡 Motivation and Context
JAVA-534 / #5481. Follow-up to #5594 — that PR removed the queue-node allocations; this one removes the per-view
getLocationOnScreencall, replacing anO(N·depth)traversal withO(N).Tradeoff: carrying the per-view local coordinates through the BFS needs a small holder object per node, so this re-introduces a per-node allocation that #5594 removed. That is an intentional trade — avoiding the repeated
O(depth)ancestor walk per view in exchange for a short-lived gen-0 object.A micro-benchmark (see testing notes) measures a ~4–5× speedup of the traversal on large view trees, scaling down to ~1.7× on trivial ones — the win is proportional to how many views sit under the touch.
💚 How did you test it?
ViewHelpers) to mock local geometry; all existingSentryGestureListenerClickTest/SentryGestureListenerScrollTestcases pass unchanged.ViewUtilsTestcoverage that exercises the actual transform — a child offset within its parent is hit when the point maps inside it and missed when it maps outside, even though the point is inside the decor view.Scroll target found: scrolling_container) and a click on a button near the bottom of the layout resolves correctly (view.id: scrolling_crash), confirming the transform is correct for a real non-trivial offset.Timing — micro-benchmark. Profiler-based measurement didn't work: on the test device (Pixel 3, Android 12) instrumented method tracing produces empty traces, fine-grained sampling overflows the buffer, and at the only working sampling rate (1 ms)
findTarget(~0.3 ms) is below the resolution — run-to-run variance dwarfs the difference. So I wrote a JVM micro-benchmark instead (not committed — it was a one-off).What it does: builds one real, attached, laid-out
Viewtree and times both hit-testing strategies over it — old (getLocationOnScreenper view) vs new (point-transform-during-descent) — 800 iterations × 6 rounds each, alternating order with warmup, on a release/R8 build. Only the per-view bounds computation differs between the two paths (the part this PR changes); the locator loop is excluded since it's identical for both. The tree is built so every node contains the touch point, forcing a full N-node traversal, and is swept across four sizes. Median of two runs:getLocationOnScreenSo
getLocationOnScreencosts ~1 µs/view, the new transform ~0.2 µs/view → a steady ~4–5× on the traversal, with absolute savings scaling from a few µs on a trivial screen to ~250 µs on a dense 300-view one. This is a per-tap / per-scroll-start main-thread cost (well under one frame even at the high end), not a frame-rate or startup change. Speedup ratios reproduced within ~8% across runs; absolute numbers drift with device thermals.📝 Checklist
sendDefaultPIIis enabled.🔮 Next steps
None.