fix(zkernel): mark ISR-reachable list helpers K_ISR_SAFE (IRAM) — closes #53#54
Open
swoisz wants to merge 3 commits into
Open
fix(zkernel): mark ISR-reachable list helpers K_ISR_SAFE (IRAM) — closes #53#54swoisz wants to merge 3 commits into
swoisz wants to merge 3 commits into
Conversation
k_sem_give() is K_ISR_SAFE (IRAM_ATTR) so it can run from an ESP-IDF IRAM ISR while the flash cache is disabled, but on its hot path it called z_sem_pop_waiter(), a flash-resident static helper. When the give came from an IRAM ISR during a concurrent flash op (e.g. an IRAM GPIO ISR submitting work -> k_work_submit -> k_sem_give), the fetch of z_sem_pop_waiter faulted with a cache-access error and the chip panicked (issue #53, found and fixed locally by a downstream). Mark z_sem_pop_waiter K_ISR_SAFE so it relocates into IRAM. It is pure list-walking over the caller-owned waiter list with no FreeRTOS calls, so it is safe in IRAM. Its other caller, k_sem_reset(), is flash-resident, but flash code calling an IRAM helper is fine. Sweeping the rest of the K_ISR_SAFE call graph for the same class (an IRAM function calling a flash-resident static helper) turned up one sibling: z_event_match() in k_event.c, called on both ISR-safe paths -- the post family (z_event_post_internal) and the wait family fast path (z_event_wait_internal). Mark it K_ISR_SAFE too; it is pure arithmetic with no FreeRTOS calls. No behavioral change; host (linux target) suite unaffected (224/224), clang-format clean. The fault is a target-only memory-placement issue the host build cannot observe. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The issue-#53 fault is invisible to the host (linux) suite: it is a target-only memory-placement property. This guard runs against a real target ELF and fails if any K_ISR_SAFE helper landed in a flash-mapped text section instead of .iram0.text -- catching the exact regression where an IRAM_ATTR caller reaches a flash-resident static helper. Section-name based (.iram0.text vs .flash.text), so it is ISA-agnostic across Xtensa and RISC-V targets. Validated on esp32s3: z_sem_pop_waiter and z_event_match resolve into .iram0.text, while non-ISR helpers (k_sem_reset, k_timer_start) remain in .flash.text. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Wire tools/check_iram_symbols.sh into the existing build-esp32s3 job so every PR verifies that K_ISR_SAFE helpers are IRAM-resident on a real target ELF -- the issue-#53 regression the host (linux) suite cannot observe. Reuses the target build already produced by that job, so no extra CI cost. Also make the script derive its symbol set from source (every K_ISR_SAFE definition in components/*/src) instead of a hand-maintained list, so a newly-added ISR-safe helper is covered automatically. Inlined/gc'd symbols report "skip" (safe); only an out-of-line, flash-resident symbol fails. Validated on esp32s3: 27 symbols derived, 25 confirmed in .iram0.text (incl. z_sem_pop_waiter and z_event_match), 2 skipped. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #53.
k_sem_give()isK_ISR_SAFE(IRAM_ATTR) so it can run from an ESP-IDF IRAM ISR while the flash cache is disabled, but on its hot path it calledz_sem_pop_waiter()— a flash-residentstatichelper. When the give comes from an IRAM ISR during a concurrent flash op (e.g. an IRAM GPIO ISR submitting work →k_work_submit→k_sem_give), fetching the flash-mapped helper faults with a cache-access error and the chip panics. Found and fixed locally by a downstream project (verified on esp32c5).Changes
z_sem_pop_waiter→K_ISR_SAFE(k_sem.c). Pure list-walk over the caller-owned waiter list, no FreeRTOS calls — safe in IRAM. Its other callerk_sem_reset()is flash-resident, but flash → IRAM calls are fine.z_event_match→K_ISR_SAFE(k_event.c). A sweep of the wholeK_ISR_SAFEcall graph for the same bug class (an IRAM function calling a flash-residentstatichelper) turned up exactly one sibling:z_event_match, called on both ISR-safe paths — the post family (z_event_post_internal) and the wait fast path (z_event_wait_internal). Pure arithmetic, no FreeRTOS calls.tools/check_iram_symbols.sh+ a step in the existingbuild-esp32s3job). Derives theK_ISR_SAFEsymbol set from source and asserts each is in.iram0.texton the target ELF — the regression the host (linux) suite structurally cannot observe. Inlined/gc'd symbols reportskip; only an out-of-line flash-resident symbol fails.Call-graph sweep
Audited every
K_ISR_SAFE/IRAM_ATTRentry point (k_sem, k_event, k_work, k_msgq, k_thread, k_timer, ring_buf, watchdog, gpio_dt). The rest were already clean: shared helpers (z_kernel_lock/unlock, allsys_dlist_*,ring_buf_*_claim/finish,k_timeout_*) are headerstatic inline/ALWAYS_INLINE(inline into the IRAM caller, no symbol);k_panic()is#define … __builtin_trap()(no symbol); FreeRTOS*FromISRpaths are IDF's IRAM responsibility.Two items outside this bug class, flagged for follow-up (not changed here):
esp_timer_start_periodicreached fromk_timer_esp_callbackunderCONFIG_K_TIMER_DISPATCH_ISR— an IDF API on the ISR-dispatch path.memcpyinring_buf.c— documented ROM-resident on ESP32-S3; confirm the same on RISC-V parts (C5/C6).Verification
z_sem_pop_waiterandz_event_matchresolve into.iram0.text(0x4037xxxx); non-ISR helpers likek_sem_reset/k_timer_startcorrectly stay in.flash.text(0x4201xxxx), confirming the check discriminates rather than rubber-stamps. 27 K_ISR_SAFE symbols derived, 25 confirmed IRAM, 2 skipped (inlined/not linked), exit 0.Follow-up offered (not in this PR)
A RISC-V (esp32c5) CI job would extend the guard to the exact ISA the downstream hit, and would cover the
memcpy-residency concern above. Held back because esp32c5 is preview in IDF v5.4 and could destabilize the matrix — happy to add it if wanted.