-
Notifications
You must be signed in to change notification settings - Fork 11.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VPlan] First step towards VPlan cost modeling. #92555
Conversation
This adds a new computeCost interface to VPReicpeBase and implements it for VPWidenRecipe and VPWidenIntOrFpInductionRecipe. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. For recipes that do not yet implement computeCost, the legacy cost for the underlying instruction is used. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree.
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks right to me, thanks! Adding various comments.
Worth noting in the commit message that the cost of VPWidenRecipe is computed directly, other recipes to follow.
InstructionCost ScalarCost = computeCost( | ||
getBestPlanFor(ElementCount::getFixed(1)), ElementCount::getFixed(1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InstructionCost ScalarCost = computeCost( | |
getBestPlanFor(ElementCount::getFixed(1)), ElementCount::getFixed(1)); | |
InstructionCost ScalarCost = computeCost(getBestPlanFor(ScalarVF), ScalarVF); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
/// Return the cost of instructions in an inloop reduction pattern, if I is | ||
/// part of that pattern. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Return the cost of instructions in an inloop reduction pattern, if I is | |
/// part of that pattern. | |
/// Return the cost of instructions in an inloop reduction pattern, if \p I | |
/// is part of that pattern. |
(unrelated to this patch).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will adjust separately.
/// Return the cost of instructions in an inloop reduction pattern, if I is | ||
/// part of that pattern. | ||
std::optional<InstructionCost> | ||
getReductionPatternCost(Instruction *I, ElementCount VF, Type *VectorTy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better called getInLoopReductionPatternCost()?
(unrelated to this patch).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will adjust separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will adjust separately.
Very well. Another suggestion is to use Invalid cost for "no cost" instead of optional.
/// Returns the execution time cost of an instruction for a given vector | ||
/// width. Vector width of one means scalar. | ||
VectorizationCostTy getInstructionCost(Instruction *I, ElementCount VF); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: hoist this to the public methods area above, along with getReductionCost(), rather than below to the public fields area here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved, thanks
return CM.getInstructionCost(UI, VF).first; | ||
} | ||
|
||
bool VPCostContext::skipForCostComputation(Instruction *UI) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool VPCostContext::skipForCostComputation(Instruction *UI) const { | |
bool VPCostContext::skipCostComputation(Instruction *UI) const { |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
VPSingleDefRecipe *VPC; | ||
if (auto *UV = R.getOperand(0)->getUnderlyingValue()) | ||
VPC = new VPWidenCastRecipe(Instruction::CastOps(ExtOpcode), A, | ||
TruncTy, *cast<CastInst>(UV)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This uses an underlying (Trunc) UV whose result type size may differ from the size of the inferred TruncTy of (S/ZExt) VPC, for legacy cost computation, hence dropping the assert in the constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, UV is the underlying Z/SExt rather than the underlying Trunc, so its type clearly differs from that ot VPC. A more informative name would help, e.g., UnderlyingExt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the name, thanks!
@@ -738,6 +760,10 @@ class VPRecipeBase : public ilist_node_with_parent<VPRecipeBase, VPBasicBlock>, | |||
/// this VPRecipe, thereby "executing" the VPlan. | |||
virtual void execute(VPTransformState &State) = 0; | |||
|
|||
/// Compute the cost for the recipe. Returns an invalid cost if the recipe | |||
/// does not yet implement computing the cost. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where/Is invalid returned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to reflect the fact that the default implementation now returns the cost via the legacy CM, thanks!
@@ -730,6 +731,81 @@ void VPRegionBlock::execute(VPTransformState *State) { | |||
State->Instance.reset(); | |||
} | |||
|
|||
static InstructionCost computeCostForRecipe(VPRecipeBase *R, ElementCount VF, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be folded into Recipe::computeCost()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, as VPRecipeBase::computeCost has the generic implementation to fall back on the legacy CM. Subclasses implementing it would all need to invoke computeCostForRecipe
, unless there's an elegant alternative?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A public non-virtual VPRecipeBase::cost()
or getCost() method can take care of skipping, forcing, or otherwise computing the cost of a recipe. The latter case can invoke a protected virtual VPRecipeBase::computeCost()
which actually implements cost computations, with CM of underlying Instructions as default. Another naming option: computeCost() and computeCostImpl(). Otherwise the distinction between computeCostForRecipe() and Recipe::computeCost() needs to be explained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved as suggested, thanks!
Instruction *UI = nullptr; | ||
if (auto *S = dyn_cast<VPSingleDefRecipe>(R)) | ||
UI = dyn_cast_or_null<Instruction>(S->getUnderlyingValue()); | ||
if (UI && Ctx.skipForCostComputation(UI)) | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instruction *UI = nullptr; | |
if (auto *S = dyn_cast<VPSingleDefRecipe>(R)) | |
UI = dyn_cast_or_null<Instruction>(S->getUnderlyingValue()); | |
if (UI && Ctx.skipForCostComputation(UI)) | |
return 0; | |
if (auto *S = dyn_cast<VPSingleDefRecipe>(R)) { | |
auto *UI = dyn_cast_or_null<Instruction>(S->getUnderlyingValue()); | |
if (UI && Ctx.skipForCostComputation(UI)) | |
return 0; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks!
if (!match(R, m_Binary<Instruction::ICmp>(m_VPValue(), m_VPValue()))) | ||
return false; | ||
return all_of(R->operands(), [](VPValue *Op) { | ||
return vputils::isUniformAfterVectorization(Op); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: better to return match() && all_of()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed latest comments and also removed the VPWidenRecipe::computeCost in order to try to separate & split things up a bit more
/// Returns the execution time cost of an instruction for a given vector | ||
/// width. Vector width of one means scalar. | ||
VectorizationCostTy getInstructionCost(Instruction *I, ElementCount VF); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved, thanks
return CM.getInstructionCost(UI, VF).first; | ||
} | ||
|
||
bool VPCostContext::skipForCostComputation(Instruction *UI) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
LLVMContext &Ctx = OrigLoop->getHeader()->getContext(); | ||
VPCostContext CostCtx(CM.TTI, Legal->getWidestInductionType(), Ctx, CM); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
LLVMContext &Ctx = OrigLoop->getHeader()->getContext(); | ||
VPCostContext CostCtx(CM.TTI, Legal->getWidestInductionType(), Ctx, CM); | ||
|
||
// Cost modeling for inductions is inaccurate in the legacy cost model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added as below, thanks!
// Cost modeling for inductions is inaccurate in the legacy cost model | ||
// compared to the recipes that are generated. To match here initially during | ||
// VPlan cost model bring up directly use the induction costs from the legacy | ||
// cost model and skip induction recipes. Note that we do this as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, thanks!
if (auto *S = dyn_cast<VPSingleDefRecipe>(this)) | ||
if (auto *UI = dyn_cast_or_null<Instruction>(S->getUnderlyingValue())) | ||
return Ctx.getLegacyCost(UI, VF); | ||
|
||
if (auto *IG = dyn_cast<VPInterleaveRecipe>(this)) | ||
return Ctx.getLegacyCost(IG->getInsertPos(), VF); | ||
if (auto *WidenMem = dyn_cast<VPWidenMemoryRecipe>(this)) | ||
return Ctx.getLegacyCost(&WidenMem->getIngredient(), VF); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, thanks!
@@ -253,6 +253,18 @@ void VPRecipeBase::moveBefore(VPBasicBlock &BB, | |||
insertBefore(BB, I); | |||
} | |||
|
|||
InstructionCost VPRecipeBase::computeCost(ElementCount VF, VPCostContext &Ctx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment, thanks!
VPSingleDefRecipe *VPC; | ||
if (auto *UV = R.getOperand(0)->getUnderlyingValue()) | ||
VPC = new VPWidenCastRecipe(Instruction::CastOps(ExtOpcode), A, | ||
TruncTy, *cast<CastInst>(UV)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the name, thanks!
/// Return the cost of instructions in an inloop reduction pattern, if I is | ||
/// part of that pattern. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will adjust separately.
/// Return the cost of instructions in an inloop reduction pattern, if I is | ||
/// part of that pattern. | ||
std::optional<InstructionCost> | ||
getReductionPatternCost(Instruction *I, ElementCount VF, Type *VectorTy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will adjust separately.
@@ -730,6 +731,81 @@ void VPRegionBlock::execute(VPTransformState *State) { | |||
State->Instance.reset(); | |||
} | |||
|
|||
static InstructionCost computeCostForRecipe(VPRecipeBase *R, ElementCount VF, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A public non-virtual VPRecipeBase::cost()
or getCost() method can take care of skipping, forcing, or otherwise computing the cost of a recipe. The latter case can invoke a protected virtual VPRecipeBase::computeCost()
which actually implements cost computations, with CM of underlying Instructions as default. Another naming option: computeCost() and computeCostImpl(). Otherwise the distinction between computeCostForRecipe() and Recipe::computeCost() needs to be explained.
@@ -365,6 +374,9 @@ class LoopVectorizationPlanner { | |||
/// Return the best VPlan for \p VF. | |||
VPlan &getBestPlanFor(ElementCount VF) const; | |||
|
|||
/// Return the most profitable plan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this also fixes the best VF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, thanks!
// Add the cost for the backedge. | ||
Cost += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can and should be taken care of by (loop) region::computeCost()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved, thanks!
/// The current implementation requires access to the legacy cost model which | ||
/// is why it is kept separate from the VPlan-only cost infrastructure. | ||
/// | ||
/// TODO: Move to VPlan::computeCost once the use of the legacy cost model | ||
/// has been retired. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has more to do with Legal::inductions and reductions, and their CM cost; the former are kept separate from VPlan and its cost implementation, rather than the latter, atm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, thanks!
@@ -7396,6 +7397,122 @@ LoopVectorizationPlanner::plan(ElementCount UserVF, unsigned UserIC) { | |||
return VF; | |||
} | |||
|
|||
InstructionCost VPCostContext::getLegacyCost(Instruction *UI, ElementCount VF) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks!
// Compute the cost of a replicate region. Replicating isn't supported for | ||
// scalable vectors, return an invalid cost for them. | ||
if (VF.isScalable()) | ||
return InstructionCost::getInvalid(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it isn't supported, should it be (prevented and) asserted, instead of built and cost invalidated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but at the moment this is done via the cost. Might be worth to adjust (e.g. bail out during VPlan construction), but best done separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth leaving behind a TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, thanks!
// blockNeedsPredication from Legal is used so as to not include all blocks in | ||
// tail folded loops. | ||
if (VF.isScalar()) | ||
return Cost / 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that getReciprocalPredBlockProb() should be used, which currently returns 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, required moving it to a header.
|
||
// For the scalar case, we may not always execute the original predicated | ||
// block, Thus, scale the block's cost by the probability of executing it. | ||
// blockNeedsPredication from Legal is used so as to not include all blocks in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blockNeedsPredication is no longer used here, which only checks if VF is scalar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks
Cond = Cond->getDefiningRecipe()->getOperand(0); | ||
auto *R = Cond->getDefiningRecipe(); | ||
if (!R) | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a TODO to match additional patterns preserving uniformity of booleans, e.g., AND/OR/etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
} | ||
for (Instruction *I : ReductionOperations) { | ||
auto ReductionCost = CM.getReductionPatternCost( | ||
I, VF, ToVectorTy(I->getType(), VF), TTI::TCK_RecipThroughput); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth a comment that we precompute the cost of I
only if it is associated with a reduction pattern, i.e., has ReductionCost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, thanks.
/// LoopVectorizationLegality to handle inductions and reductions, which is | ||
/// why it is kept separate from the VPlan-only cost infrastructure. | ||
/// | ||
/// TODO: Move to VPlan::computeCost once the use of the legacy cost model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// TODO: Move to VPlan::computeCost once the use of the legacy cost model | |
/// TODO: Move to VPlan::computeCost once the use of Legal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks!
"More than a single plan/VF w/o any plan having scalar VF"); | ||
|
||
InstructionCost ScalarCost = | ||
computeCost(getBestPlanFor(ElementCount::getFixed(1)), ScalarVF); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
computeCost(getBestPlanFor(ElementCount::getFixed(1)), ScalarVF); | |
computeCost(getBestPlanFor(ScalarVF), ScalarVF); |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
}); | ||
return RecipeCost; | ||
} | ||
|
||
InstructionCost VPBasicBlock::computeCost(ElementCount VF, VPCostContext &Ctx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should VPlan and VPBlockBase have cost()
instead of computeCost()
to align with VPRecipeBase, complementing their mutual execute()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
protected: | ||
/// Compute the cost of this recipe using the legacy cost model and the | ||
/// underlying instructions. | ||
InstructionCost computeCost(ElementCount VF, VPCostContext &Ctx) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Currently non-virtual, as only the base default implementation is provided, to be made virtual later.)
@@ -75,7 +75,7 @@ class VPValue { | |||
public: | |||
/// Return the underlying Value attached to this VPValue. | |||
Value *getUnderlyingValue() { return UnderlyingVal; } | |||
const Value *getUnderlyingValue() const { return UnderlyingVal; } | |||
Value *getUnderlyingValue() const { return UnderlyingVal; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loses const'ness? Makes the non-const version redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, removed, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
/// Return the cost of instructions in an inloop reduction pattern, if I is | ||
/// part of that pattern. | ||
std::optional<InstructionCost> | ||
getReductionPatternCost(Instruction *I, ElementCount VF, Type *VectorTy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will adjust separately.
Very well. Another suggestion is to use Invalid cost for "no cost" instead of optional.
@@ -2948,6 +2994,9 @@ class VPBasicBlock : public VPBlockBase { | |||
return NewBlock; | |||
} | |||
|
|||
/// Compute the cost of this VPBasicBlock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Compute the cost of this VPBasicBlock | |
/// Compute the cost of this VPBasicBlock. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks!
@@ -908,8 +908,13 @@ static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) { | |||
unsigned ExtOpcode = match(R.getOperand(0), m_SExt(m_VPValue())) | |||
? Instruction::SExt | |||
: Instruction::ZExt; | |||
auto *VPC = | |||
new VPWidenCastRecipe(Instruction::CastOps(ExtOpcode), A, TruncTy); | |||
VPSingleDefRecipe *VPC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth a comment that UnderlyingExt
, having distinct return type, is recorded only to support legacy cost computation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be clearer in the updated version, thanks!
// VPlan cost model bring up directly use the induction costs from the legacy | ||
// cost model and skip induction recipes. Note that we do this as | ||
// pre-processing; the VPlan may not have any recipes associated with the | ||
// original induction increment instruction. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, worth clarifying in the comment?
If original induction increment instructions do have recipes, is this pre-processing needed, in this preliminary version where recipe costs default to the CM cost of their underlying instructions? Perhaps to retain debug dumps.
Instructions associated with in-loop reductions do need to be pre-processed in order to take their getReductionPatternCost() rather than their getInstructionCost().
@@ -664,6 +676,9 @@ class VPBlockBase { | |||
/// the cloned recipes, including all blocks in the single-entry single-exit | |||
/// region for VPRegionBlocks. | |||
virtual VPBlockBase *clone() = 0; | |||
|
|||
/// Compute the cost of the block. | |||
virtual InstructionCost cost(ElementCount VF, VPCostContext &Ctx) = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: better place all cost()
methods next to execute()
, as done in VPRecipeBase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
And another test case:
|
@AlexHF is it possible the test case is over-reduced? I can reproduce the crash (with assertions, it triggers an assertion), but it is due to a dead load. Would it be possible to get a reproducer also without dead instructions? |
The test case can easily be over-reduced. The interestingness test I used is that a known-good version of |
Luckily, the code involved is open-source (a part of https://chromium.googlesource.com/angle/angle). Here's a couple of the non-reduced IR dumps I got using |
Thank you! I'll check the original failure as well. |
Looks like still crashes:
development/bin/opt -S --passes=loop-vectorize ./reduced.ll |
@alexey-bataev would it be possible to share the unreduced input without an unused load in the loop? |
Try this one |
The code from #92555 (comment) compiles without errors now. I checked an assertions-enabled Clang build as well. |
Thanks, should be fixed now! |
Given the assertion whack-a-mole, should this change be reverted until after LLVM 19 has branched early next week? (Or maybe just disable the consistency assertion?) |
Which commit is the fix for this? 2bb65660ae8b9b2e1896b07b881505a4ffc0393b? |
This patch adds a new temporary option to still use the legacy cost model after llvm#92555. It defaults to false and the only intended use is to adjust the default to true in the soon-to-be-cut release branch.
How about an option to enable the legacy behavior, off in |
Yep |
As discussed in llvm#92555 flip the default for the option added in llvm#99536 to true. This restores the original behavior for the release branch to give the VPlan-based cost model more time to mature on main.
This patch adds a new temporary option to still use the legacy cost model after llvm#92555. It defaults to false and the only intended use is to adjust the default to true in the soon-to-be-cut release branch. PR: llvm#99536
Summary: This patch adds a new temporary option to still use the legacy cost model after #92555. It defaults to false and the only intended use is to adjust the default to true in the soon-to-be-cut release branch. PR: #99536 Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60251192
This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's llvm#67647 and llvm#67934 which is an earlier version of the current PR. PR: llvm#92555
This reverts commit 0079835. Causes crashes, see comments on llvm#92555.
Extra test cases that caused revert of llvm#92555
This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks
and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches
will gradually migrate recipes to compute their own costs step-by-step.
It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.
The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.
Additional discussions and context can be found in @arcbbb's
#67647 and
#67934 which is an earlier version
of the current PR.