[CIR][CIRGen] Improve switch support for unrecheable code #528
This reverts commit 55c03f8
…m#357) This PR fixes lowering of the following code: ``` void foo(int x, int y) { switch (x) { case 0: if (y) break; break; } } ``` i.e. when some sub-statement contains `break` as well. Previously, we did this trick for `loop`: process nested `break`/`continue` statements during `LoopOp` lowering if they don't belong to another `LoopOp` or `SwitchOp`. This is why there is some refactoring here as well, but the idea is still the same: we need to process nested operations and emit branches to the proper blocks. This is quite a frequent bug in `llvm-test-suite`.
This is how both libc++ and libstdc++ implement iterator in std::array, stick to those use cases for now. We could add other variations in the future if there are others around.
- Check whether container is part of std, add a fixed list of available containers (for now only std::array) - Add a getRawDecl method to ASTRecordDeclInterface - Testcases
This was a bit half-baked; give it some love.
Inspired by similar work in libc++, pointed to me by Louis Dionne and Nikolas Klauser. This is initial, very conservative and not generalized yet: works for `char`s within a specific version of `std::find`.
…ents. Before this fix conversion of flat offset to GlobalView indices could crash or compute invalid result.
This reverts commit bbaa147.
`ScopeOp` may end with `ReturnOp` instead of `YieldOp`, which is currently not expected. This PR fixes this. The reduced example is: ``` int foo() { { return 0; } } ``` This is quite a frequent bug in `llvm-test-suite`.
One more step towards variable length array support. This PR adds one more helper for the `alloca` instruction and reuses the existing ones. The reason is the following: right now there are two possible ways to insert an alloca: either into the function entry block or into the given block after all the existing alloca instructions. But for VLA support we need to insert an alloca anywhere, right after an array's size becomes known. Thus, we add one more parameter with a default value: the insertion point. Also, we don't want to copy-paste the code, so we reuse the existing helpers, though it may be a little confusing to read.
This PR adds `cir.ternary` lowering. There are two approaches to lower `cir.ternary` imo: 1. Use `scf.if` op. 2. Use `cf.cond_br` op. I chose `scf.if` because `scf.if` + canonicalization produces `arith.select` whereas `cf.cond_br` requires scf lifting. In many ways `scf.if` is more high-level and closer to `cir.ternary`. A separate `cir.yield` lowering is required since we cannot directly replace `cir.yield` in the ternary op lowering -- the yield operands may still be illegal and doing so produces `builtin.unrealized_cast` ops. I couldn't figure out a way to solve this issue without adding a separate lowering pattern. Please let me know if you know a way to solve this issue.
This PR fixes the following case ``` typedef struct { } A; A create() { A a; return a; } void foo() { A a; a = create(); } ``` i.e. when a struct is assigned the result of a function call.
…vmgh-352) (llvm#363) The error manifested in code like ``` int a[16]; int *const p = a; void foo() { p[0]; } ``` It's one of the most frequent errors in the current llvm-test-suite. I've added the test to globals.cir, which is currently XFAILed; I think @gitoleg will fix it soon. Co-authored-by: Bruno Cardoso Lopes <[email protected]>
This PR addresses llvm#248. Currently string literals are always lowered to a `cir.const_array` attribute even if the string literal only contains null bytes. This patch makes CodeGen emit `cir.zero` for these string literals.
Currently, codegen of lvalue comma expression would crash: ```cpp int &foo1(); int &foo2(); void c1() { int &x = (foo1(), foo2()); // CRASH } ``` This simple patch fixes this issue.
This PR addresses llvm#90. It introduces a new type constraint `CIR_AnyType` which allows CIR types and MLIR floating-point types. Present `AnyType` constraints are replaced with the new `CIR_AnyType` constraint.
Arrays can be first declared without a known bound, and then defined with a known bound. For example: ```cpp extern int data[]; int test() { return data[1]; } int data[3] {1, 2, 3}; ``` Currently `clangir` crashes on generating CIR for this case. This is due to the type of the `data` definition being different from its declaration. This patch adds support for such a case.
Breaks the pass into smaller more manageable rewrites.
…IdiomRecognizer. (llvm#389) Some tests started failing under `-DLLVM_USE_SANITIZER=Address` due to trivial use-after-free errors.
Like SCF's `scf.condition`, the `cir.condition` simplifies codegen of loop conditions by removing the need for a conditional branch. It takes a single boolean operand which, if true, executes the body region, otherwise exits the loop. This also simplifies lowering and the dialect itself. A new constraint is now enforced on `cir.loops`: the condition region must terminate with a `cir.condition` operation. A few tests were removed as they became redundant, and others were simplified. The merge-cleanups pass no longer simplifies compile-time constant conditions, as the condition body is no longer allowed to be terminated with a `cir.yield`. To circumvent this, a proper folder should be implemented to fold constant conditions, but this was left as future work. Co-authored-by: Bruno Cardoso Lopes <[email protected]>
Once the LexicalScope goes out of scope, its cleanup process will also check if a return was set to be yielded, and, if so, generate the yield with the respective value. ghstack-source-id: 9305d2ba5631840937721755358a774dc9e08b90 Pull Request resolved: llvm#312
Instead of returning a boolean indicating whether the statement was handled, returns the ReturnExpr of the statement if there is one. It also adds some extra bookkeeping to ensure that the result is returned when needed. This allows for better support of GCC's `ExprStmt` extension. The logical result was not used: it was handled but it would never fail. Any errors within builders should likely be handled with asserts and unreachables since they imply a programmer's error in the code. ghstack-source-id: 2319cf3f12e56374a52aaafa4304e74de3ee6453 Pull Request resolved: llvm#313
Adds support for GCC statement expressions return values as well as StmtExpr LValue emissions. To simplify the lowering process, the scope return value is not used. Instead, a temporary allocation is created on the parent scope where the return value is stored. For classes, a second scope is created around this temporary allocation to ensure any destructors are called. This does not implement the full semantics of statement expressions. ghstack-source-id: 64e03fc3df45975590ddbcab44959c2b49601101 Pull Request resolved: llvm#314
5223c3c
to
9ae5d1f
Compare
clang/lib/CIR/CodeGen/CIRGenStmt.cpp
Outdated
// TODO: Rewrite the logic to handle ReturnStmt inside SwitchStmt, then
// clean up the code below.
if (currLexScope->IsInsideCaseNoneStmt)
  return mlir::success();
I found many code samples that fail due to an incorrect terminator in a block, e.g.

    switch(a) {
    case 0:
      break;
      int x = 1;
    }

    switch(a) {
    case 0:
      return 0;
      return 1;
      int x = 1;
    }

    for (;;) {
      break;
      int x = 1;
    }

This looks like another large piece of work, so I just skip `ReturnStmt` here.
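For reference, the three fragments above can be wrapped into a small standalone C++ file, for example as a reduced test input. This is only a sketch: the function names are made up for illustration, and the return values are added so the behavior can be checked. In each function, every statement after the first `break` or `return` in the body is unreachable.

```cpp
// Hypothetical wrappers around the fragments above; names are illustrative.

// A declaration placed after `break` in a case: never executed.
int breakThenDecl(int a) {
  switch (a) {
  case 0:
    break;
    int x = 1; // unreachable
  }
  return a;
}

// Two returns in a row: only the first one can ever run.
int returnThenReturn(int a) {
  switch (a) {
  case 0:
    return 0;
    return 1;  // unreachable
    int x = 1; // unreachable
  }
  return -1; // reached when no case matches
}

// The same pattern inside a loop body.
int breakInLoop() {
  int iterations = 0;
  for (;;) {
    ++iterations;
    break;
    int x = 1; // unreachable
  }
  return iterations;
}
```

All three functions are valid C++; a compiler may warn about unused variables or unreachable code, which is the point of the reduced cases.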
Right, can you file a new issue and list these?
I'm opposed to `return mlir::success();` because it silently skips something we don't know how to handle. I'd rather these things fail or crash, so that it's clear that they aren't implemented. What happens when you remove this return?
clang/lib/CIR/CodeGen/CIRGenStmt.cpp
Outdated
@@ -328,6 +328,14 @@ mlir::LogicalResult CIRGenFunction::buildLabelStmt(const clang::LabelStmt &S) {
  // IsEHa: not implemented.
  assert(!(getContext().getLangOpts().EHAsynch && S.isSideEntry()));

  // TODO: After support case stmt crossing scopes, we should build LabelStmt
Any `TODO` in CIRGen should be `TODO(cir)`.
@@ -2027,6 +2031,8 @@ class CIRGenFunction : public CIRGenTypeCache {
  // Scope entry block tracking
  mlir::Block *getEntryBlock() { return EntryBlock; }

  bool IsInsideCaseNoneStmt = false;
I don't think we need this, reasons below.
// and clean LexicalScope::IsInsideCaseNoneStmt.
for (auto *lexScope = currLexScope; lexScope;
     lexScope = lexScope->getParentScope()) {
  assert(!lexScope->IsInsideCaseNoneStmt &&
What happens if you remove this code? Also, why doesn't it work to just walk the scope up until you find a switch?
Firstly, we won't need this assert anymore if we can keep the case-none stmt somehow, as you suggested.

> What happens if you remove this code?

Removing this code won't cause incorrect behavior currently (as we don't support goto in that case yet), but I think it may produce a strange error message in the future.

    switch (x) {
    foo:
      x = 1;
      break;
    case 2:
      goto foo;
    }

We need to avoid erasing the `CaseNoneStmt` containing label `foo`.

> why doesn't it work to just walk the scope up until you find a switch?

Refer to the code below: we need to guarantee that the removed `Stmt` won't contain any `LabelStmt`, whether the `LabelStmt` is inside another nested switch or not.

    switch (x) {
      switch (x) {
      case 1:
      foo:
        break;
      }
      break;
    case 1:
      goto foo;
    }
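To make the two `goto` cases above concrete, here is a compilable sketch of both. The function names and the `hits` counter are hypothetical additions for illustration. They only show that statements placed before the first `case` label are unreachable through the switch condition, yet still reachable through a label, so they cannot simply be dropped.

```cpp
// Hypothetical, self-contained versions of the two snippets above.

// The code before `case 2:` is only reachable through `goto foo`.
int labelBeforeFirstCase(int x) {
  switch (x) {
  foo:
    x = 1;
    break;
  case 2:
    goto foo;
  }
  return x;
}

// The label sits inside a nested switch that is itself never entered
// through its condition; the outer `case 1:` still jumps to it.
int labelInNestedSwitch(int x) {
  int hits = 0; // counts how often the labeled statement runs
  switch (x) {
    switch (x) {
    case 1: // belongs to the inner switch
    foo:
      ++hits;
      break; // leaves the inner switch
    }
    break; // leaves the outer switch
  case 1: // belongs to the outer switch
    goto foo;
  }
  return hits;
}
```

With `x == 2` the first function returns 1 because control reaches `foo` via the `goto`; with any other value the switch body is skipped entirely.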
@@ -704,6 +717,22 @@ CIRGenFunction::buildSwitchCase(const SwitchCase &S, mlir::Type condType,
  llvm_unreachable("expect case or default stmt");
}

mlir::LogicalResult CIRGenFunction::buildCaseNoneStmt(const Stmt *S) {
  // Create orphan region to skip over the case none stmts.
Because you are creating an orphan region, this means that anything emitted inside `buildCaseNoneStmt` will never execute, right? The problem with an orphan region is that it won't get attached to anything, so it really adds no value (not even for unreachable-code analysis). If so, it's better to just split the current basic block A into two: B and C. A should jump to C and you emit the code in B.
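The A/B/C split can be pictured with a toy control-flow model. The sketch below is not the MLIR API: `Block`, `Region`, `splitForDeadCode`, and `reachableFrom` are hypothetical stand-ins, used only to show the difference between a block that is attached to the region but unreachable (B) and code that lives in no region at all.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for a region and its blocks; NOT the MLIR API.
struct Block {
  std::string name;
  std::vector<std::string> ops;
  Block *succ = nullptr; // single unconditional successor, if any
};

struct Region {
  std::vector<Block *> blocks; // blocks attached to this region
};

// Split the flow after A: A jumps to C, while B stays attached to the
// region without a predecessor, so code emitted into B is unreachable.
void splitForDeadCode(Region &r, Block &a, Block &b, Block &c) {
  r.blocks.push_back(&b);
  r.blocks.push_back(&c);
  a.succ = &c;
  b.succ = &c; // B would fall through to C if it were ever entered
}

// Collect the names of all blocks reachable from `entry`.
std::set<std::string> reachableFrom(const Block *entry) {
  std::set<std::string> seen;
  for (const Block *b = entry; b && seen.insert(b->name).second; b = b->succ) {
  }
  return seen;
}
```

In this model B remains visible to anything that walks the region's block list, which is what an unreachable-code analysis would need, while a traversal from A never visits it.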
I didn't find a good place to hold the block of the `CaseNoneStmt`. For example:

    void f(int x) {
      switch(x) {
        break;
      }
    }

There is no region inside the `SwitchOp`, so we have to put the `break` block outside the `SwitchOp`, which causes a verification failure: `'cir.break' op must be within a loop or switch`.
Did I misunderstand something? Looking forward to your suggestions~
I understand your point, but if you go for the current approach you might as well skip this codegen entirely, because what you are emitting won't ever be attached to anything. I think it's safer to mimic the original codegen here; what is Clang currently doing for OG codegen?
Perhaps we should create a SwitchOp with at least one default region and delete that at the end if it ends up unused?
f726860
to
7a61b3c
Compare
// TODO(cir): Rewrite the logic to handle ReturnStmt inside SwitchStmt, then
// clean up the code below.
if (currLexScope->IsInsideCaseNoneStmt)
  return mlir::success();
I'm opposed to `return mlir::success();` because it silently skips something we don't know how to handle. I'd rather these things fail or crash, so that it's clear that they aren't implemented. What happens when you remove this return?
`buildReturnStmt()` assumes there is exactly one return block in a region and one region in a lexical scope; the only exception is the switch scope, which has multiple regions. The related code is

    mlir::Block *getOrCreateRetBlock(CIRGenFunction &CGF, mlir::Location loc) {
      unsigned int regionIdx = 0;
      if (isSwitch())
        regionIdx = SwitchRegions.size() - 1;
      if (regionIdx >= RetBlocks.size())
        return createRetBlock(CGF, loc);
      return &*RetBlocks.back();
    }

So if we remove the return here, the following code will cause a crash: `regionIdx` will be -1, and we'll call `RetBlocks.back()` with an empty `RetBlocks`.

    int f(int x) {
      switch(x) {
        return 0;
      }
      return 1;
    }

By the way, I believe the current implementation of `getOrCreateRetBlock()` for switch is incorrect and should also be fixed after changing the definition of `SwitchOp`.
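A side note on the index computation in `getOrCreateRetBlock()` quoted above: `regionIdx` is declared `unsigned int`, so with an empty `SwitchRegions` the subtraction does not yield a usable -1 but wraps around to the largest unsigned value. The standalone sketch below reproduces just that arithmetic, with a plain `std::vector` standing in for `SwitchRegions`. The helper names are hypothetical, and the guarded variant (which assumes C++17 for `std::optional`) is only one possible way to make the empty case explicit, not the fix adopted in CIRGen.

```cpp
#include <climits>
#include <cstddef>
#include <optional>
#include <vector>

// Mirrors `regionIdx = SwitchRegions.size() - 1` using a plain vector.
unsigned int lastIndexUnchecked(const std::vector<int> &regions) {
  unsigned int regionIdx = regions.size() - 1; // wraps around when empty
  return regionIdx;
}

// One possible guard: report "no region yet" instead of wrapping.
std::optional<std::size_t> lastIndexChecked(const std::vector<int> &regions) {
  if (regions.empty())
    return std::nullopt;
  return regions.size() - 1;
}
```

For an empty vector the unchecked helper returns `UINT_MAX`, while the checked one returns an empty optional that the caller has to handle explicitly.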
> By the way, I believe the current implementation of `getOrCreateRetBlock()` about switch is incorrect and also should be solved after changing definition of `SwitchOp`.

Right, we should fix the logic, not take shortcuts like returning `mlir::success()`. Can you elaborate on what you mean by changing the definition of `SwitchOp`?
I posted my thought in #528 to discuss it, thanks~
I'm going to resume reviewing this, sorry for the delay!
Make logic cleaner and more extensible. Separate collecting `SwitchStmt` information and building op logic into different functions. Add more UT to cover nested switch, which also worked before this pr. This pr is split from llvm#528.
This was superseded by #1006; thank you for laying the groundwork for it!
Support non-block `case` and statements that don't belong to any `case` region; fixes #520 and #521.