Skip to content

perf(android): Hit-test gestures without getLocationOnScreen#5595

Open
runningcode wants to merge 5 commits into
mainfrom
no/java-534-findtarget-point-transform
Open

perf(android): Hit-test gestures without getLocationOnScreen#5595
runningcode wants to merge 5 commits into
mainfrom
no/java-534-findtarget-point-transform

Conversation

@runningcode

@runningcode runningcode commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

📜 Description

ViewUtils.findTarget (run on every tap and at every scroll start) called View.getLocationOnScreen for every visited view to hit-test it. getLocationOnScreen walks from the view up to the root each call, so hit-testing the whole tree was O(N·depth) per gesture, repeatedly re-walking the same ancestor chains.

This replaces that with the technique ViewGroup itself uses to dispatch touch events: map the touch point down into each child's local coordinate space as we descend (offset by the parent's scroll and the child's left/top, then apply the child's inverse matrix for transformed views). Each view is then a cheap O(1) bounds check against its own [0,0,width,height], making the whole traversal O(N).

Notes:

  • The locators still receive the original decor-view-relative x,y, because ComposeGestureTargetLocator hit-tests against window coordinates.
  • Because the traversal now stays in window/local space instead of mixing in screen space, hit-testing is also slightly more correct when the window isn't at the screen origin (status bar offset, split-screen).
  • Keeps the BFS order, so the selected target for overlapping clickable views is unchanged.

The sentry-compose locator gets the same LinkedListArrayDeque cleanup (it already hit-tests in window space, so no coordinate change there).

💡 Motivation and Context

JAVA-534 / #5481. Follow-up to #5594 — that PR removed the queue-node allocations; this one removes the per-view getLocationOnScreen call, replacing an O(N·depth) traversal with O(N).

Tradeoff: carrying the per-view local coordinates through the BFS needs a small holder object per node, so this re-introduces a per-node allocation that #5594 removed. That is an intentional trade — avoiding the repeated O(depth) ancestor walk per view in exchange for a short-lived gen-0 object.

A micro-benchmark (see testing notes) measures a ~4–5× speedup of the traversal on large view trees, scaling down to ~1.7× on trivial ones — the win is proportional to how many views sit under the touch.

💚 How did you test it?

  • Rewrote the gesture test harness (ViewHelpers) to mock local geometry; all existing SentryGestureListenerClickTest / SentryGestureListenerScrollTest cases pass unchanged.
  • Added ViewUtilsTest coverage that exercises the actual transform — a child offset within its parent is hit when the point maps inside it and missed when it maps outside, even though the point is inside the decor view.
  • On-device (Pixel 3, Android 12): scroll target still resolves (Scroll target found: scrolling_container) and a click on a button near the bottom of the layout resolves correctly (view.id: scrolling_crash), confirming the transform is correct for a real non-trivial offset.

Timing — micro-benchmark. Profiler-based measurement didn't work: on the test device (Pixel 3, Android 12) instrumented method tracing produces empty traces, fine-grained sampling overflows the buffer, and at the only working sampling rate (1 ms) findTarget (~0.3 ms) is below the resolution — run-to-run variance dwarfs the difference. So I wrote a JVM micro-benchmark instead (not committed — it was a one-off).

What it does: builds one real, attached, laid-out View tree and times both hit-testing strategies over it — old (getLocationOnScreen per view) vs new (point-transform-during-descent) — 800 iterations × 6 rounds each, alternating order with warmup, on a release/R8 build. Only the per-view bounds computation differs between the two paths (the part this PR changes); the locator loop is excluded since it's identical for both. The tree is built so every node contains the touch point, forcing a full N-node traversal, and is swept across four sizes. Median of two runs:

Tree (views under touch) old: getLocationOnScreen new: point-transform speedup
~9 (a simple screen) ~9–12 µs ~5–7 µs ~1.7×
~61 ~45–54 µs ~12–15 µs ~3.7×
~161 ~143–175 µs ~35–44 µs ~4.0×
~301 ~298–376 µs ~61–84 µs ~4.5–4.9×

So getLocationOnScreen costs ~1 µs/view, the new transform ~0.2 µs/view → a steady ~4–5× on the traversal, with absolute savings scaling from a few µs on a trivial screen to ~250 µs on a dense 300-view one. This is a per-tap / per-scroll-start main-thread cost (well under one frame even at the high end), not a frame-rate or startup change. Speedup ratios reproduced within ~8% across runs; absolute numbers drift with device thermals.

📝 Checklist

  • I added GH Issue ID & Linear ID
  • I added tests to verify the changes.
  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
  • I updated the docs if needed.
  • I updated the wizard if needed.
  • Review from the native team if needed.
  • No breaking change or entry added to the changelog.
  • No breaking change for hybrid SDKs or communicated to hybrid SDKs.

🔮 Next steps

None.

@linear-code

linear-code Bot commented Jun 22, 2026

Copy link
Copy Markdown

JAVA-534

@sentry

sentry Bot commented Jun 22, 2026

Copy link
Copy Markdown

📲 Install Builds

Android

🔗 App Name App ID Version Configuration
SDK Size io.sentry.tests.size 8.45.0 (1) release

⚙️ sentry-android Build Distribution Settings

@runningcode runningcode marked this pull request as ready for review June 24, 2026 14:06
@github-actions

github-actions Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Performance metrics 🚀

  Plain With Sentry Diff
Startup time 319.17 ms 363.53 ms 44.36 ms
Size 0 B 0 B 0 B

Baseline results on branch: main

Startup times

Revision Plain With Sentry Diff
18c0bc2 306.73 ms 349.77 ms 43.03 ms
0eaac1e 316.82 ms 357.34 ms 40.52 ms
d15471f 303.49 ms 439.08 ms 135.59 ms
fc5ccaf 276.52 ms 370.46 ms 93.93 ms
e2dce0b 308.96 ms 360.10 ms 51.14 ms
5b1a06b 352.27 ms 413.70 ms 61.43 ms
37ec571 366.04 ms 424.28 ms 58.23 ms
9fbb112 361.43 ms 427.57 ms 66.14 ms
bbc35bb 324.88 ms 425.73 ms 100.85 ms
ff8eea4 313.42 ms 337.08 ms 23.66 ms

App size

Revision Plain With Sentry Diff
18c0bc2 1.58 MiB 2.13 MiB 557.33 KiB
0eaac1e 1.58 MiB 2.19 MiB 619.17 KiB
d15471f 1.58 MiB 2.13 MiB 559.54 KiB
fc5ccaf 1.58 MiB 2.13 MiB 557.54 KiB
e2dce0b 0 B 0 B 0 B
5b1a06b 0 B 0 B 0 B
37ec571 0 B 0 B 0 B
9fbb112 1.58 MiB 2.11 MiB 539.18 KiB
bbc35bb 1.58 MiB 2.12 MiB 553.01 KiB
ff8eea4 1.58 MiB 2.28 MiB 718.64 KiB

Previous results on branch: no/java-534-findtarget-point-transform

Startup times

Revision Plain With Sentry Diff
259b9cc 313.27 ms 370.61 ms 57.34 ms

App size

Revision Plain With Sentry Diff
259b9cc 0 B 0 B 0 B

@0xadam-brown 0xadam-brown left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! (And love the helper doc.)

One optional comment; otherwise lgtm

* Maps a touch point expressed in the parent's local coordinate space into the child's local
* coordinate space. This mirrors how {@link ViewGroup} dispatches touch events to its children
* and lets us hit-test the whole tree with a single downward traversal, instead of calling {@link
* View#getLocationOnScreen(int[])} (which walks up to the root) for every view.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇

// (90, 220) maps to (-10, 20) in the child's space -> outside, despite being inside the decor.
assertNull(ViewUtils.findTarget(options, decorView, 90f, 220f, UiElement.Type.CLICKABLE))
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l: Nice! Beyond covering the top/left path, might be worth adding quick cases for parent scrollX/scrollY and a non-identity child matrix. The gesture listener mocks won't catch regressions there and they'd cover the other two mapToChild branches.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added that test below! good point

childY = point[1];
}
}
return new ViewWithLocation(child, childX, childY);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess the next step to improve this could be to have a pool of reusable ViewWithLocations to reduce the number of allocations? but that could be done separately (if it makes sense at all)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i always dreamed of the day I would need to use a pool to fix a real performance problem. it's a good point. can add a ticket for a follow up investigation to see if this makes sense.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude says:

❯ Would it make sense to add a ticket for a follow up to add a pool for the ViewWithLocation objects?

⏺ Honestly, I'd lean against it — and I wouldn't file the ticket speculatively.

Reasoning:

  • These are textbook ephemeral allocations. Each ViewWithLocation is ~24 bytes (one ref + two floats) and dies immediately after the
    traversal. ART's young-gen allocates these with a bump pointer (TLAB) and collects them almost for free. Pooling trades that for a managed
    pool with reset/lifecycle logic, thread-safety questions, and live objects the GC now has to keep scanning — on Android, hand-rolled pools for
    small short-lived objects often net-lose.
  • The count is bounded and small — one per view on the hit-test path, and the traversal prunes whole subtrees as soon as a view doesn't
    contain the touch, so it's nowhere near the full tree. This fires at interaction frequency (taps/scrolls), not per-frame render.
  • The real win in this PR was removing the per-view getLocationOnScreen walk (O(depth) up to the root, each call). The wrapper allocation is
    noise next to that.

@romtsn romtsn left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dope! I wish my Algebraic geometry teacher had told me what would I need matrices for 😅

runningcode and others added 4 commits June 25, 2026 15:23
ViewUtils.findTarget called View.getLocationOnScreen for every visited
view, and that walks from the view up to the root each time, making the
traversal O(N*depth) per tap and scroll start.

Instead, map the touch point down into each child's local coordinate
space as we descend the tree — the same way ViewGroup dispatches touch
events — so each view costs O(1) and the whole traversal is O(N). The
locators still receive the original decor-view-relative coordinates,
since the Compose locator hit-tests against window coordinates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…g (JAVA-534)

The existing test only exercised the left/top offset path of mapToChild.
Add cases for a scrolled parent and a non-identity child matrix so the
other two coordinate-mapping branches are covered, and switch the class
to Robolectric so the real Matrix math runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@runningcode runningcode force-pushed the no/java-534-findtarget-point-transform branch from 570f8d3 to 0eb4cd6 Compare June 25, 2026 13:23
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@runningcode runningcode enabled auto-merge (squash) June 25, 2026 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants