Cray CCE: drop `simd` from the GPU offload directive by sbryngelson · Pull Request #1606 · MFlowCode/MFC

sbryngelson · 2026-06-16T15:34:42Z

Summary

On the Cray CCE path the GPU-offload loop directive carries a simd clause. This drops it (and the
matching end):

- !$omp target teams distribute parallel do simd defaultmap(firstprivate:scalar)
+ !$omp target teams distribute parallel do defaultmap(firstprivate:scalar)

CCE-only — gated by MFC_COMPILER == CCE_COMPILER_ID in src/common/include/omp_macros.fpp.
The NVIDIA branch (target teams loop) and the AMD branch (target teams distribute parallel do,
already without simd) are untouched. This brings the CCE directive in line with AMD's.

Rationale: a cleanup, not a perf change

simd is a CPU-vectorization construct; on a SIMT GPU it provides no lane-vectorization and is
effectively vestigial. Removing it is:

Correctness-neutral. The 22 GPU Riemann-kernel regression tests (HLL / hypoelastic / MHD /
RMHD / IBM / examples) all pass, identical to with simd.

Performance-neutral. 7-case benchmark sweep on Frontier (CCE 19.0.0, MI250X), grind in
ns/gp/eq/rhs (mean of 2 reps), comparing two builds that differ only by this clause:

case	with simd	no simd	ratio
5eq_rk3_weno3_hll	5.118	5.099	1.00
5eq_rk3_weno3_hllc	4.860	4.783	0.98
5eq_rk3_weno3_lf	4.802	4.930	1.03
hypo_hll	3.090	3.040	0.98
ibm	12.159	12.160	1.00
igr	3.439	3.357	0.98
viscous_weno5_sgb_acoustic	7.947	7.914	1.00

The result is mixed and within measurement noise — rep-to-rep scatter was ~4–5%, larger than
every per-case delta (e.g. lf is nominally slower without simd). Net is ≈ neutral
(geometric mean ~0.6% toward no-simd, not significant). No performance claim is made — the
CI bench lane provides the authoritative numbers. This change is justified solely as removing a
clause that does nothing useful on a SIMT target.

Risk

Minimal: two-line, CCE-gated string change; no behavior change on any other compiler; correctness
verified on the GPU regression suite.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates the Cray CCE-specific OpenMP GPU offload loop directive to drop the simd clause, aligning it with the existing AMD directive and avoiding potentially over-constraining GPU codegen on CCE.

Changes:

Removed simd from the CCE omp_start_directive GPU offload loop directive string.
Removed simd from the matching CCE omp_end_directive string.

… and performance-neutral)

codecov · 2026-06-16T20:14:41Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.51%. Comparing base (b4be438) to head (d4b9320).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1606   +/-   ##
=======================================
  Coverage   60.51%   60.51%           
=======================================
  Files          83       83           
  Lines       19905    19905           
  Branches     2950     2950           
=======================================
  Hits        12046    12046           
  Misses       5866     5866           
  Partials     1993     1993

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot AI review requested due to automatic review settings June 16, 2026 15:34

Copilot AI reviewed Jun 16, 2026

View reviewed changes

Copilot started reviewing on behalf of sbryngelson June 16, 2026 15:55 View session

Cray CCE: drop simd from GPU offload directive (cleanup; correctness-…

d4b9320

… and performance-neutral)

sbryngelson force-pushed the cce-drop-simd branch from 9ab1089 to d4b9320 Compare June 16, 2026 16:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cray CCE: drop `simd` from the GPU offload directive#1606

Cray CCE: drop `simd` from the GPU offload directive#1606
sbryngelson wants to merge 1 commit into
MFlowCode:masterfrom
sbryngelson:cce-drop-simd

sbryngelson commented Jun 16, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

codecov Bot commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

Conversation

sbryngelson commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Rationale: a cleanup, not a perf change

Risk

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

codecov Bot commented Jun 16, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

2 participants

sbryngelson commented Jun 16, 2026 •

edited

Loading