recipe(pix2text-mfr): pin VED template as KNOWN-BROKEN regression coverage #946
Draft
ssss141414 wants to merge 1 commit into
…erage (upstream HF repo layout)
PR: breezedeus/pix2text-mfr — VED image-to-text recipe pair (KNOWN-BROKEN regression-pin)
Iter: 6 (validated-negative shipped iter-5 as `vision-encoder-decoder-003`; this PR pins the recipes with `_status` markers per `_meta-014` convention)
Producer: main agent (2026-06-23)
Claimed tier: (Effort = L0★, Goal = L0-FAIL-UPSTREAM, Outcome = L0)

Summary
This PR ships the `breezedeus/pix2text-mfr` image-to-text recipe pair as a KNOWN-BROKEN regression-pin per the `_meta-014` `_status` marker convention. The recipes are structurally correct VED templates (they would build any standard HF `VisionEncoderDecoderModel` checkpoint with a covered inner decoder); they are blocked at the upstream HF repo-layout layer, not at the winml export layer. Each recipe carries a top-level `_status: "BROKEN — DO NOT USE. ..."` field documenting the exact failure and the recipe's regression-coverage purpose.

This PR is shipped as a contribution because:
- The `_status` marker pattern is itself the regression test for `WinMLBuildConfig.from_dict` correctly ignoring unknown top-level keys (positive control verified in iter-5 on the `opus-mt-fr-en` encoder build with the `_status` field present; build SUCCEEDED).
- … `.onnx` files, no PyTorch weights for `AutoModel.from_pretrained`), so dropping the recipe entirely loses the diagnostic value.

No source-code changes.
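The first reason above depends on the config loader dropping top-level keys it does not declare. As a minimal sketch of that behaviour only: `BuildConfigSketch` and its `model_id` / `task` fields are hypothetical stand-ins, not the real `WinMLBuildConfig` schema, and the `_status` text is a placeholder.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class BuildConfigSketch:
    """Hypothetical stand-in for a recipe config; NOT the real WinMLBuildConfig."""

    model_id: str  # assumed field name, for illustration only
    task: str      # assumed field name, for illustration only

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "BuildConfigSketch":
        # Keep only the keys the dataclass declares; anything else
        # (such as a "_status" marker) is silently ignored.
        known = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in raw.items() if key in known})


recipe = {
    "_status": "BROKEN (placeholder text for this sketch)",
    "model_id": "breezedeus/pix2text-mfr",
    "task": "image-to-text",
}
config = BuildConfigSketch.from_dict(recipe)
```

If the loader instead rejected unknown keys, every recipe carrying a `_status` marker would fail at parse time, which is what the positive control on `opus-mt-fr-en` rules out.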
1. Recipe files
Both recipes carry:

"_status": "BROKEN — DO NOT USE. winml build fails at fetch with `breezedeus/pix2text-mfr does not appear to have a file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt or flax_model.msgpack.`. The HF repo stores weights in a non-standard layout. Recipe is structurally correct and would build any standard VED checkpoint; pinned here as regression coverage. See research/adding-model-support/model_knowledge/vision_encoder_decoder.json finding vision-encoder-decoder-003."

2. README index row
`examples/recipes/README.md`: row to add for `breezedeus/pix2text-mfr | image-to-text | composite (encoder + decoder) | recipe pair | **BROKEN (upstream HF repo layout)**`.

3. Build output directory + artifact inventory
`temp/ved_build/` (gitignored): EMPTY for this checkpoint. The encoder build attempt aborts at the fetch stage (the error text is quoted in the `_status` field in §1).

No `model.onnx`, no `model.onnx.data`, no `analyze_result.json`, etc.; the failure precedes the export pipeline.

External-data layout check (`_meta-023`): N/A (no artifacts produced).

4. Build log
The recipe `_status` field IS the build-log substitute; see §1. Stderr is captured in `vision-encoder-decoder-003` `mechanism_notes`.

5. Appended findings
Per-model: `model_knowledge/vision_encoder_decoder.json`

… `nlpconnect/vit-gpt2-image-captioning` (the recommended canonical L0★ VED reference instead of the broken one here).

Skill-meta
No new `_meta-NNN` findings in this PR.

6. Optimum-coverage probe verdict
Verdict: VENDOR-COVERED on `image-to-text` (the composite splits into an `image-feature-extraction` encoder + `text2text-generation` decoder via winml overrides per `vision-encoder-decoder-002`). Effort L0★ is the correct CLASSIFICATION; in PRACTICE this specific checkpoint is upstream-blocked.

7. Claimed (Effort, Goal, Outcome) tier
… `_status` field together encode "we tried, here's why it can't proceed")

8. Goal-ladder verdict table (per `_meta-018`, per-half per `_meta-020`)

`winml build` aborts at fetch: HF repo `breezedeus/pix2text-mfr` ships only `encoder_model.onnx` + `decoder_model.onnx` (pre-exported ONNX) + `config.json` / `tokenizer*` / `preprocessor_config.json`. NO `pytorch_model.bin` / `model.safetensors` / `tf_model.h5` / `model.ckpt` / `flax_model.msgpack`. `AutoModel.from_pretrained` requires one of these to materialize the PyTorch graph that `winml build` then traces.

Per `_meta-018`, FAIL halts the march. Lower tiers are unreachable until L0 is unblocked.

Short-circuit honored: L0 FAIL-UPSTREAM is the FIRST FAIL verdict in the ladder and halts the march. The `_status` marker on the recipe is the artifact-of-record for this halted state.

Diligence ladder (`_meta-037`), invoked and recorded:
- `vision_encoder_decoder.json`: `vision-encoder-decoder-003` fully documents the failure mode; no workaround is documented because the gate is upstream.
- `winml config`: already succeeded; `winml build` is where it fails. `winml config` produced the recipe pair correctly from the HF config alone (it doesn't need weights).
- `--ep-options`: N/A (failure is at fetch, not at EP).
- `value_range` / shape pinning: N/A (failure is at fetch, not at export).
- `HfApi.list_repo_tree('breezedeus/pix2text-mfr', recursive=True)` returns: `.gitattributes`, `README.md`, `config.json`, `decoder_model.onnx`, `encoder_model.onnx`, `generation_config.json`, `preprocessor_config.json`, `special_tokens_map.json`, `tokenizer.json`, `tokenizer_config.json`. NO PyTorch weights, only pre-exported ONNX. The failure mode is unchanged from iter-5 (`vision-encoder-decoder-003`).

Feature gap from step 7:
`winml build` could conceivably support a "fetch pre-exported ONNX from HF instead of PT→ONNX export" path for repos like this. Captured under `vision-encoder-decoder-003` `feature_gaps_filed[]`. Until then, this checkpoint stays BROKEN.

9. Methodology-evolution declaration (per `_meta-031`)

No NEW methodology friction in this PR. The `_status` marker convention (`_meta-014`) was the iter-5 finding that enabled shipping known-broken recipes; this PR uses it as designed. Triggers: `FAIL-UPSTREAM` is already covered by the `FAIL` slot in the `_meta-018` vocabulary (the `-UPSTREAM` suffix is descriptive, not a new verdict).

Reviewer should confirm "no methodology friction observed" per the `_meta-031` anti-trigger. The diligence ladder application IS the methodology working as designed: the recipe ships with a documented FAIL-UPSTREAM verdict + feature_gap entry, not a silent failure.

Reviewer hand-off package: Step 6 9-item self-check
- … `_status` marker)
- … `_status` field; exact stderr captured in `vision-encoder-decoder-003` `mechanism_notes`)
- … `_meta-018`)
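
The repo-layout probe recorded in the diligence ladder (§8) can be reproduced roughly as follows. This is a sketch, not part of winml: the helper names `find_standard_weights` and `list_repo_paths` are hypothetical, the weight-file list is copied from the fetch error quoted in §1, and sharded checkpoints (for example `model-00001-of-00002.safetensors`) are deliberately not matched.

```python
from typing import Iterable

# File names taken from the fetch error quoted in the recipe's _status field.
STANDARD_WEIGHT_FILES = (
    "pytorch_model.bin",
    "model.safetensors",
    "tf_model.h5",
    "model.ckpt",
    "flax_model.msgpack",
)


def find_standard_weights(paths: Iterable[str]) -> list[str]:
    """Return the repo paths whose file name is one of the standard weight files."""
    return [p for p in paths if p.rsplit("/", 1)[-1] in STANDARD_WEIGHT_FILES]


def list_repo_paths(repo_id: str) -> list[str]:
    """List every path in a Hub repo; needs network access and huggingface_hub."""
    # Imported lazily so find_standard_weights stays usable offline.
    from huggingface_hub import HfApi

    return [entry.path for entry in HfApi().list_repo_tree(repo_id, recursive=True)]


# Listing recorded in the diligence ladder for breezedeus/pix2text-mfr.
recorded_listing = [
    ".gitattributes",
    "README.md",
    "config.json",
    "decoder_model.onnx",
    "encoder_model.onnx",
    "generation_config.json",
    "preprocessor_config.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]
missing_weights = not find_standard_weights(recorded_listing)
```

Calling `find_standard_weights(list_repo_paths("breezedeus/pix2text-mfr"))` would re-run the probe against the live repo; an empty result corresponds to the FAIL-UPSTREAM state pinned by this PR.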