clear residual op in vpto emit #855
Code Review
This pull request introduces folding of pto.get_validshape intrinsics into materialized tile handles, erases unused pto.set_validshape operations, and performs dead code elimination (DCE) on leftover tile-handle view and allocation operations. The review feedback suggests replacing the iterative DCE implementation with a worklist-based approach to improve its complexity, and checking that UnrealizedConversionCastOp has exactly one result before accessing it.
bool tileDceChanged = true;
while (tileDceChanged) {
  tileDceChanged = false;
  SmallVector<Operation *, 8> deadTileOps;
  func.walk([&](Operation *op) {
    if (!op->use_empty())
      return;
    if (isa<pto::TReshapeOp, pto::MaterializeTileOp, pto::AllocTileOp>(op))
      deadTileOps.push_back(op);
    else if (auto castOp = dyn_cast<UnrealizedConversionCastOp>(op)) {
      if (castOp.getNumOperands() == 1 &&
          isa<pto::TileBufType>(castOp.getResult(0).getType()))
        deadTileOps.push_back(op);
    }
  });
  for (auto *op : llvm::reverse(deadTileOps)) {
    op->erase();
    tileDceChanged = true;
  }
}
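The current iterative DCE implementation performs a full function walk (func.walk) inside a while loop, once for every level of dead tile operations. For a chain of dependent dead tile operations, the whole function is therefore walked again for each link in the chain.
We can optimize this with a worklist-based approach that walks the function once and then follows the operands of each erased op. Additionally, the code should check that UnrealizedConversionCastOp has exactly 1 result before accessing getResult(0), to prevent potential out-of-bounds assertions.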
SmallVector<Operation *, 8> worklist;
func.walk([&](Operation *op) {
  if (!op->use_empty())
    return;
  if (isa<pto::TReshapeOp, pto::MaterializeTileOp, pto::AllocTileOp>(op)) {
    worklist.push_back(op);
  } else if (auto castOp = dyn_cast<UnrealizedConversionCastOp>(op)) {
    if (castOp.getNumOperands() == 1 && castOp.getNumResults() == 1 &&
        isa<pto::TileBufType>(castOp.getResult(0).getType()))
      worklist.push_back(op);
  }
});
while (!worklist.empty()) {
  Operation *op = worklist.pop_back_val();
  SmallVector<Value, 4> operands(op->getOperands());
  op->erase();
  for (Value operand : operands) {
    if (auto *defOp = operand.getDefiningOp()) {
      if (!defOp->use_empty())
        continue;
      if (isa<pto::TReshapeOp, pto::MaterializeTileOp, pto::AllocTileOp>(defOp)) {
        worklist.push_back(defOp);
      } else if (auto castOp = dyn_cast<UnrealizedConversionCastOp>(defOp)) {
        if (castOp.getNumOperands() == 1 && castOp.getNumResults() == 1 &&
            isa<pto::TileBufType>(castOp.getResult(0).getType()))
          worklist.push_back(defOp);
      }
    }
  }
}
6c34f51 to eb6e8d3
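For readers who want to try the worklist idea outside of MLIR, here is a minimal, self-contained sketch in plain C++. It is a toy model, not the PR's code and not the MLIR API: `ToyOp`, its fields, and `eraseDeadOps` are hypothetical names introduced only for illustration, and explicit use counts stand in for `use_empty()`. It scans the ops once to seed the worklist, then only revisits the producers of ops it erases.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an operation (hypothetical; not mlir::Operation).
struct ToyOp {
  std::vector<std::size_t> operands; // indices of the ops whose results this op uses
  bool removable = false;            // true for "tile" ops that may be erased when unused
  bool erased = false;               // set once the op has been erased
  std::size_t numUses = 0;           // recomputed by eraseDeadOps
};

// Erases removable ops without users, then follows operand edges so that
// producers which just lost their last user are erased as well.
// Returns the number of ops erased.
std::size_t eraseDeadOps(std::vector<ToyOp> &ops) {
  // Count uses once; every operand slot of a live op counts as one use.
  for (ToyOp &op : ops)
    op.numUses = 0;
  for (const ToyOp &op : ops)
    if (!op.erased)
      for (std::size_t def : op.operands)
        ++ops[def].numUses;

  // Seed the worklist with a single scan over all ops.
  std::vector<std::size_t> worklist;
  for (std::size_t i = 0; i < ops.size(); ++i)
    if (!ops[i].erased && ops[i].removable && ops[i].numUses == 0)
      worklist.push_back(i);

  std::size_t erasedCount = 0;
  while (!worklist.empty()) {
    std::size_t idx = worklist.back();
    worklist.pop_back();
    ToyOp &op = ops[idx];
    assert(!op.erased && "each op is expected to be queued at most once");
    op.erased = true;
    ++erasedCount;
    for (std::size_t def : op.operands) {
      ToyOp &producer = ops[def];
      // Queue the producer only when its last remaining use disappears.
      if (--producer.numUses == 0 && producer.removable && !producer.erased)
        worklist.push_back(def);
    }
  }
  return erasedCount;
}
```

In this sketch a producer is queued at the moment its use count reaches zero, so an op that references the same producer in two operand slots still queues that producer only once. Whether the same situation can arise for the tile ops in this PR is not something the conversation states.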
Codex Review
This comment is updated automatically by the review bot.
Summary
The PR performs a premature deletion in the two-stage FoldTileBufIntrinsics pipeline.
Findings
VPTO pipeline runs this pass twice:
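Problem summary
After the VPTO lowering pipeline completes, tile metadata ops left behind in the module block LLVM export. These ops fall into two categories:
Dead ops whose semantics are already complete but which were never cleaned up:
pto.set_valid_shape (which never has users), and pto.get_valid_shape, pto.treshape, pto.alloc_tile, and UnrealizedConversionCastOp when they have no users.
Foldable cast chains:
The pto.get_valid_shape → UnrealizedConversionCastOp → LLVM i64 cast chain, whose equivalent LLVM i64 value is already present in the valid row/col attributes of the original pto.alloc_tile but is not reused.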