Allow overriding popcorn id with POPCORN_SUBMITTER_ID env arg#57
Open
Jack-Khuu wants to merge 122 commits into
Super important ascii art
feat/pre launch update
* Create symlink for popcorn-cli after installation
  I found that after installation, the default binary name is actually `popcorn-cli`. If we want to use `popcorn` as the binary name in the subsequent steps, I believe we need to create a symlink.
* Add `popcorn` alias to install.sh and remove manual symlink step from docs
  Move the symlink creation into install.sh so users get the `popcorn` command automatically. Uses a symlink on Linux/macOS and a copy on Windows.
* docs: add ACF (booster pack) usage guide to helion-hackathon.md
  Documents how to use PTXAS Advanced Controls Files from /opt/booster_pack/ during autotuning (autotune_search_acf) and in hardcoded submissions (advanced_controls_file). Includes the important caveat that ACF search only works when the autotuner actually runs.
* docs: remove "How ACFs work" subsection
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add TileIR backend usage guide to helion-hackathon.md
  Documents ENABLE_TILE=0 vs ENABLE_TILE=1 and the TileIR compilation pipeline available via nvtriton on B200 instances. Covers how to enable TileIR with Helion (ENABLE_TILE=1 + HELION_BACKEND=tileir), the different tunables (num_ctas/occupancy vs num_warps/maxnreg), and how to hardcode TileIR configs in submissions.
* docs: restructure ACF + TileIR as optional performance knobs
  Group both sections under a single "Optional: Extra Performance Knobs" heading to emphasize neither is required. Streamline both into step 1 (autotune) / step 2 (hardcode) format. Add a "Which combination" section showing all 4 options to try.
* docs: remove "(Booster Pack)" from ACF heading
* docs: consolidate TileIR env var instructions
  Remove duplicate bash export block: the Python os.environ in the code example is sufficient for both local autotuning and submissions.
* docs: clarify TileIR tunables come from autotuner output
* docs: shorten "Which should I use?" section
* docs: be explicit about ENABLE_TILE=0 vs ENABLE_TILE=1
* docs: simplify TileIR comparison table to just backend names
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: add TileIR backend usage guide to helion-hackathon.md
  Documents ENABLE_TILE=0 vs ENABLE_TILE=1 and the TileIR compilation pipeline available via nvtriton on B200 instances. Covers how to enable TileIR with Helion (ENABLE_TILE=1 + HELION_BACKEND=tileir), the different tunables (num_ctas/occupancy vs num_warps/maxnreg), and how to hardcode TileIR configs in submissions.
* docs: restructure ACF + TileIR as optional performance knobs
  Group both sections under a single "Optional: Extra Performance Knobs" heading to emphasize neither is required. Streamline both into step 1 (autotune) / step 2 (hardcode) format. Add a "Which combination" section showing all 4 options to try.
* docs: remove "(Booster Pack)" from ACF heading
* docs: consolidate TileIR env var instructions
  Remove duplicate bash export block: the Python os.environ in the code example is sufficient for both local autotuning and submissions.
* docs: clarify TileIR tunables come from autotuner output
* docs: shorten "Which should I use?" section
* docs: be explicit about ENABLE_TILE=0 vs ENABLE_TILE=1
* docs: simplify TileIR comparison table to just backend names
* docs: add scoring system, rules, and open-ended contribution track
  Add point allocation table, scoring formula (correctness + performance ranking), rules & requirements, and the separate open-ended contribution track for non-kernel Helion contributions.
* docs: allow unlimited submissions, best one counts
* docs: clarify rules to match actual submission format
  Each submission uses one static helion.Config for all shapes, not per-shape configs. Simplified rules to reflect this.
* Revert "docs: clarify rules to match actual submission format"
  This reverts commit 4fac3a8.
* Add per-shape config dispatch pattern to all submissions
  Use a factory function (_make_kernel) to create kernel variants with different helion.Config objects, and dispatch in custom_kernel() based on input tensor shapes. This lets participants optimize each benchmark shape independently.
* docs: update example to show all shapes, remove DEFAULT_CONFIG
* docs: use Config(...) placeholders with distinct TODO comments for test vs benchmark shapes
  Test shapes: TODO to replace with default config or any config that passes correctness. Benchmark shapes: TODO to replace with autotuned config. Also add instructions on getting default config via autotune_effort="none".
* docs: remove references to single-config-for-all-shapes pattern
  Per-shape configs are the recommended approach. Remove mentions of using a single config across all shapes.
* docs: remove references to default config in rules section
  Configs are always participant-provided.
* docs: add tips for version control, tmux, and machine reboots
* docs: move GPU machine tips to standalone section
* docs: fix performance metric description to match actual eval method
  The previous description incorrectly stated geometric mean of 100 runs. The actual helion eval uses CUDA graphs with L2 cache clearing, 10 measurements, and arithmetic mean.
* Replace hard 30% LOC limit with judges' discretion for inline triton/asm
  The LOC-based rule was gameable (denominator inflation with padding code), so switch to a qualitative rule: inline triton/asm is allowed as escape hatches, but predominantly inline submissions may be disqualified.
* Add spawn mode tip for autotuning in GPU machine section
  Spawn mode isolates each autotuner trial in a subprocess with timeout protection, preventing hangs or crashes from killing the entire run.
* Clarify that spawn mode is slower than fork mode
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: create dedicated kernel folder on `popcorn setup`
  Instead of writing files directly into the current directory (which overwrites existing files), `popcorn setup` now creates a subfolder named after the problem directory (e.g. `softmax/`). If a folder with that name already exists, a `-N` suffix is appended (`softmax-1/`, `softmax-2/`, etc.) to avoid collisions.
* docs: update setup docs to reflect new project folder behavior
* style: fix rustfmt formatting in setup.rs
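For readers skimming the timeline, the `-N` suffix rule described in the commit above can be sketched roughly as follows. This is an illustration only, not the code in setup.rs: the function name `unique_folder_name` and the `exists` predicate are hypothetical, and a real implementation would presumably check the filesystem rather than take a closure.

```rust
/// Returns `base` if it is free, otherwise the first `base-N` (N = 1, 2, ...)
/// that is free. `exists` is a hypothetical stand-in for a filesystem check.
fn unique_folder_name(base: &str, exists: impl Fn(&str) -> bool) -> String {
    if !exists(base) {
        return base.to_string();
    }
    let mut n: u32 = 1;
    loop {
        // Try softmax-1, softmax-2, ... until one is not taken.
        let candidate = format!("{base}-{n}");
        if !exists(&candidate) {
            return candidate;
        }
        n += 1;
    }
}
```

Passing the existence check as a closure keeps the naming rule testable without touching the disk; in a CLI one might call it with something like `|name| std::path::Path::new(name).exists()`.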
Replace the rank-based correctness/performance formula with a simpler top-3 system: 5 pts (1st), 3 pts (2nd), 1 pt (3rd) per scored problem. Mark fp8_quant as an unscored warm-up. Ties decided by kernel quality.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reorder so the unscored warm-up problem (fp8_quant) is listed first, matching the scoring note that says "Problem 1 is not scored".
* fix: writing multiple profile zip
* test: cover profile trace helpers
* ci: fix PR validation workflows
Co-authored-by: Mark Saroufim <marksaroufim@gmail.com>
* document submission inspection and deletion flow
* reframe README note as reward hack section
* add to cli docstring
* details in AGENTS.md
* Apply suggestion from @burtenshaw
  Co-authored-by: burtenshaw <ben.burtenshaw@gmail.com>
Co-authored-by: Mark Saroufim <marksaroufim@gmail.com>
* Add aarch64 Linux support (DGX Spark / GB10)
  - Add aarch64-unknown-linux-gnu build target to the release workflow
  - Add .cargo/config.toml to configure the cross-linker for aarch64
  - Update install.sh to detect arm64/aarch64 and download the correct binary (popcorn-cli-linux-aarch64.tar.gz) instead of the x86-64 build
* validate arm64 installer in CI
* move arm64 validation into test workflow
* fold arm64 into test matrix
* build arm64 release on native runner
Co-authored-by: brandonin <brandonin@users.noreply.github.com>
Co-authored-by: Mark Saroufim <marksaroufim@gmail.com>
dirs::home_dir() on Windows uses the shell API (SHGetKnownFolderPath), not the HOME env var, so we cannot redirect config lookup in tests. Gate config-fallback tests with #[cfg(not(windows))]. The env var override path is still tested cross-platform. Also recover from poisoned mutex so one test failure doesn't cascade.
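The poisoned-mutex recovery mentioned in this commit is a standard-library pattern; a minimal sketch is below. The names `ENV_LOCK` and `lock_env` are hypothetical and not taken from the PR; the assumption is that tests which mutate process-wide environment variables serialize themselves on a shared lock.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical lock used to serialize tests that touch process-wide env vars.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Acquire the lock even if an earlier holder panicked and poisoned it.
/// `PoisonError::into_inner` hands back the guard, so one failing test
/// does not make every later test fail on `lock().unwrap()`.
fn lock_env() -> MutexGuard<'static, ()> {
    ENV_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}
```

Ignoring the poison flag is reasonable here because the mutex guards no data of its own (`()`); it only orders access to the environment. Tests that depend on redirecting the home directory would additionally carry `#[cfg(not(windows))]`, as the commit message describes.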
Author
@msaroufim @S1ro1 lmk what you think? We'd use it on the HLH server side to submit on behalf of users
When submitting via a proxy, or in a setting with multiple accounts, it is difficult to switch submitter identities (the ID is currently always read from .popcorn.yaml).
This PR adds POPCORN_SUBMITTER_ID as an environment variable override.
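A rough sketch of the precedence this implies: the environment variable wins, and the value from .popcorn.yaml is the fallback. This is not the PR's actual diff. The function names are hypothetical, the config value is passed in rather than parsed from the file, and treating an empty or whitespace-only variable as unset is an assumption of this sketch.

```rust
use std::env;

/// Name of the override variable, as given in the PR description.
const SUBMITTER_ID_ENV: &str = "POPCORN_SUBMITTER_ID";

/// Pick the submitter id: a non-empty override wins, otherwise fall back to
/// the id loaded from the config file (passed in here as `config_id`).
/// Assumption: a blank override is treated the same as an unset one.
fn resolve_submitter_id(
    env_override: Option<String>,
    config_id: Option<String>,
) -> Option<String> {
    env_override
        .map(|value| value.trim().to_string())
        .filter(|value| !value.is_empty())
        .or(config_id)
}

/// Convenience wrapper that reads the real process environment.
fn submitter_id_from_env(config_id: Option<String>) -> Option<String> {
    resolve_submitter_id(env::var(SUBMITTER_ID_ENV).ok(), config_id)
}
```

Keeping the decision in a pure function (`resolve_submitter_id`) means the override path can be tested on every platform without mutating the process environment or redirecting the home directory.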
Build
Normal run
Override