[CIR][CIRGen] Add const attribute to alloca operations #892

Merged (1 commit) on Oct 11, 2024


Lancern
Member

@Lancern Lancern commented Sep 27, 2024

This PR tries to give a simple initial implementation for eliminating redundant loads of constant objects, an idea originally posted by OfekShilon.

Specifically, this PR adds a new unit attribute const to the cir.alloca operation. Presence of this attribute indicates that the alloca-ed object is declared const in the input source program. CIRGen is updated accordingly to start emitting this new attribute.
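As a rough illustration of which locals the attribute is meant to mark, consider the sketch below. The function names are hypothetical, and the CIR shown in the comments only approximates the printed form; treat it as an assumption rather than verbatim compiler output.

```cpp
// Hypothetical input program. The CIR in the comments approximates the
// printed form and is an assumption, not verbatim compiler output.
int produce_int() { return 21; }

int with_const_local() {
  // Declared const: the alloca for `x` is expected to carry `const`, e.g.
  //   %x = cir.alloca !s32i, !cir.ptr<!s32i>, ["x", init, const]
  const int x = produce_int();
  return x + x;
}

int with_mutable_local() {
  // Not declared const: the alloca for `y` carries no `const`, e.g.
  //   %y = cir.alloca !s32i, !cir.ptr<!s32i>, ["y", init]
  int y = produce_int();
  y += 1;
  return y;
}
```

The attribute only records what the source declared; it does not by itself change the generated code.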

@Lancern Lancern force-pushed the alloca-const branch 3 times, most recently from 0b45aea to e58e514 on September 27, 2024 17:41
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

How does this patch deal with constant reference and constexpr?

Comment on lines 34 to 37
// - If there is a load operation that properly dominates it, replace the
// load with that dominator load. This process is "recursive": if load A
// dominates load B and load B dominates load C, we should eventually
// replace load C with load A.
Member

Why don't we record directly that A dominates C?
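One way to read this question is as a request for path compression over the replacement chain. The sketch below is not taken from the patch: `LoadId`, `ReplacementMap` and `resolve` are hypothetical names, and loads are modeled as plain integers. It shows how a chain C → B → A could be collapsed so that C maps to A directly.

```cpp
#include <unordered_map>

// Hypothetical stand-in for a load operation: just an integer id.
using LoadId = int;

// replacement[B] == A means "load B is to be replaced with load A".
using ReplacementMap = std::unordered_map<LoadId, LoadId>;

// Follow the chain to its root and compress the path, so that every load
// visited on the way maps directly to the root afterwards.
LoadId resolve(ReplacementMap &replacement, LoadId load) {
  auto it = replacement.find(load);
  if (it == replacement.end())
    return load;  // no dominating load recorded: this is a root
  LoadId parent = it->second;
  LoadId root = resolve(replacement, parent);
  replacement[load] = root;  // record "root replaces load" directly
  return root;
}
```

After one call to `resolve` on C, later lookups for B or C reach A in a single step.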

@Lancern
Member Author

Lancern commented Sep 30, 2024

How does this patch deal with constant reference [...]?

Constant references are not yet handled in this patch; I'll add that later!

[...] and constexpr?

Well, since constexpr variables are implicitly const, I believe they are covered by this patch. Note that during CodeGen (and CIRGen) quite a lot of references to constexpr variables have already been evaluated to their values, so I believe there's not much we have to take care of here.

@ChuanqiXu9
Member

Well, since constexpr variables are implicitly const, I believe they are covered by this patch.

In Decl, we have isInlineSpecified and isInline for the different cases, so I hesitate when I see isConstQualified used here. I haven't actually checked it, though. It may be helpful to add some tests here.
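The language rule behind the claim can be checked at the source level. In the sketch below (hypothetical function names), the declared type of a `constexpr` local is `const int`, which is presumably what a const-qualification check on the variable's type would observe; how Clang's AST reports it is not verified here.

```cpp
#include <type_traits>

constexpr int answer() { return 42; }

int read_constexpr_local() {
  constexpr int x = answer();
  // `constexpr` on an object declaration implies `const`, so the declared
  // type of `x` is `const int`.
  static_assert(std::is_const<decltype(x)>::value, "constexpr implies const");
  static_assert(std::is_same<decltype(x), const int>::value,
                "declared type is const int");
  return x;
}
```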

@Lancern
Member Author

Lancern commented Sep 30, 2024

It may be helpful to add some tests here.

Sounds good to me, I'll add a test along with the update for references.

@Lancern
Member Author

Lancern commented Sep 30, 2024

Two updates:

  • Added a test case for local constexpr variable;
  • Added const attribute for allocas for local reference variables.
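A possible reason references need separate handling, offered here as a reading of the update rather than something stated in the thread: a reference type is never const-qualified itself, so a check based only on const qualification of the declared type would not catch it, even though a reference is bound once and cannot be reseated. The function name below is hypothetical.

```cpp
#include <type_traits>

int read_through_reference() {
  int value = 7;
  int &ref = value;  // a reference binds once and cannot be reseated

  // A reference type is never const-qualified, even `const int &`.
  static_assert(!std::is_const<decltype(ref)>::value, "");
  static_assert(!std::is_const<const int &>::value, "");

  value = 9;   // the referent may change...
  return ref;  // ...but `ref` still names the same object
}
```

Note that the immutable part is the binding, not the referent: the value read through `ref` changes when `value` changes.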

Member

@bcardosolopes bcardosolopes left a comment

Thanks for working on cool ideas @Lancern!

I think we can split this patch into two: (1) introduce the const + CIRGen + tests and (2) optimization on top of new alloca attribute.

For (2), I wonder if you tried to explore the path of teaching more traditional optimizations about these new introduced properties (const in the example here). For example, a combo of:

https://mlir.llvm.org/docs/Passes/#-sccp

I believe that in principle we should implement these hooks and try to get these optimizations for free from MLIR before we try to develop custom CIR ones. I wonder if you found any limitations while exploring those.

@Lancern
Member Author

Lancern commented Oct 1, 2024

For (2), I wonder if you tried to explore the path of teaching more traditional optimizations about these new introduced properties (const in the example here).

Not quite, I have only tried the combination with mem2reg. The optimization in this PR is quite orthogonal to mem2reg although they both optimize some simple cases like:

int produce_int();
int test() {
  const int x = produce_int();
  int a = x;
  int b = x;
  return a + b;
}

In this simple case, since x is only written once and its address does not escape, mem2reg could effectively eliminate all memory allocations and transform it into code similar to:

int produce_int();
int test() {
  const int x = produce_int();
  return x + x;
}

However, once the address of x escapes, for example in this case:

int produce_int();
void consume(const int &);
int test() {
  const int x = produce_int();
  int a = x;
  consume(x);
  int b = x;
  return a + b;
}

Since the allocation for x must be retained, mem2reg now becomes helpless. Neither the load for a nor the load for b can be eliminated by mem2reg. sccp does not help either, since it does not reason about values in memory. Only the optimization in this PR can eliminate these loads. Moreover, the constness information is critical for safely eliminating the load for b: without it, an optimizer cannot assume that consume leaves x unchanged, so it cannot eliminate that load.
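The effect on the escaping case can be sketched by hand at the source level. The definitions of `produce_int` and `consume` below are hypothetical placeholders so the example links, and `test_single_load` is a hand-written approximation of the result, not output of the pass.

```cpp
// Hypothetical definitions so the example links; in the discussion above
// these functions are only declared.
int produce_int() { return 5; }
void consume(const int &) {}

// Original shape: two loads of `x`, one on each side of the call.
int test_original() {
  const int x = produce_int();
  int a = x;
  consume(x);
  int b = x;  // second load of x
  return a + b;
}

// Hand-written equivalent of what eliminating the second load would give.
// This is only valid because `x` is declared const: modifying a const
// object is undefined behavior, so `consume` cannot legally change it.
int test_single_load() {
  const int x = produce_int();
  int a = x;
  consume(x);
  return a + a;  // reuse the value of the first load
}
```

If `x` were not declared const, `consume` could cast away the constness of its parameter and legally write to `x`, so the second version would no longer be equivalent.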

I think we can split this patch into two: (1) introduce the const + CIRGen + tests and (2) optimization on top of new alloca attribute.

OK I'll split it later. I may draft a more detailed RFC along with PR (2) so we could all get a feel about the range and impact of this cool optimization.

@bcardosolopes
Member

Not quite, I have only tried the combination with mem2reg. The optimization in this PR is quite orthogonal to mem2reg...

It's orthogonal, but my point is that compiler optimizations usually work as a combination of multiple passes, not as ad hoc passes that do all the work and analysis at once.

In this simple case, since x is only written once and its address does not escape, mem2reg could effectively eliminate all memory allocations and transform it into code...

Are you saying mem2reg can generate transformations that allow this to be optimized without any of the changes from this PR?

Since the allocation for x must be retained, mem2reg now becomes helpless.

I understand where you are coming from and what you want to achieve, but I'm a bit worried about making assumptions about memory in an ad hoc fashion without, for example, a proper alias analysis to feed in this information.

In general, what I'm trying to convey is that we should first start implementing the hooks for the existing passes MLIR provides and slowly enable them in our pipeline. In the context of this PR, I'd like to see how const can help in general, with small and sound pieces introduced in bites. I like the overall direction, but C++ is quite tricky, and I don't see any report of this optimization being applied to a significantly bigger piece of a code base, nor of its build time footprint or correctness guarantees, so it feels a bit too-optimistic-too-early to me.

@Lancern
Member Author

Lancern commented Oct 2, 2024

Are you saying mem2reg can generate transformations that allow this to be optimized without any of the changes from this PR?

For the simple case I showed in the previous comment, yes. But for more complex examples, we have to come up with a way to teach mem2reg (or any other existing optimizations) about the constness added in this PR.

In general, what I'm trying to convey is that we should first start implementing the hooks for the existing passes MLIR provides and slowly enable them in our pipeline.

I get your idea. You're conveying that after landing the constness attribute, a more practical way to make it useful is to first try teaching existing MLIR optimizations about the constness and see what they can already do. Do I understand it correctly?

@Lancern
Member Author

Lancern commented Oct 2, 2024

Updated, removed the transformation pass from this PR.

@Lancern Lancern changed the title from "[CIR][Transform] Add constant load elimination pass" to "[CIR][CIRGen] Add const attribute to alloca operations" on Oct 2, 2024
@bcardosolopes
Member

bcardosolopes commented Oct 2, 2024

For the simple case I showed in the previous comment, yes. But for more complex examples, we have to come up with a way to teach mem2reg (or any other existing optimizations) about the constness added in this PR.

Neat, might be worth adding that testcase to current mem2reg tests.

I get your idea. You're conveying that after landing the constness attribute, a more practical way to make it useful is to first try teaching existing MLIR optimizations about the constness and see what they can already do. Do I understand it correctly?

This would be one interesting path to go along, yes. There are many possible paths though:

  • Mentioned above: add more existing passes to our pipeline and see what type of goodness you can get out of it (const would be a good example, but anything in general).
  • Some folks mention that LLVM cannot take advantage of source-level constness. One line of work here is to find out what already exists but is not being propagated to LLVM to make that happen: is it because some information doesn't get propagated from the frontend, given that without CIR there's no way to propagate high-level info early in the pipeline? It's possible all we need is a simple analysis pass on top of CIR that propagates const info such that LLVM lowering can emit even more metadata and help LLVM optimizations kick in better. If I were working on this, it's probably where I'd start: give more info to LLVM so that existing LLVM optimizations can just do more work.
  • If you are really passionate about the pointer escaping aspect, you could find a way to integrate an escape analysis / alias analysis into CIR pipeline - a good start would be to check with the MLIR community what's out there or if there's anything we could reuse / collaborate on.

One concern I have with the existing PR approach is that dominance checks can get expensive; you might need more caching or more conservative assumptions. Looking into how LLVM eliminates redundant loads may give you a few more insights into how these opts usually operate efficiently. Another caveat is that ClangIR's support for building bigger codebases / benchmarks is currently WIP; it's probably going to get easier to get measurements / evaluate optimizations once we have a baseline for correctness and compile time.
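One common way to avoid pairwise dominance queries, similar in spirit to the scoped-table walk used by LLVM's EarlyCSE pass, is a depth-first traversal of the dominator tree that keeps a table of loads available on the current path. The sketch below is not from this PR: `Block` and `eliminate` are hypothetical, the IR is reduced to integer ids, and it assumes every address refers to an already-initialized const object, so no store can invalidate an entry.

```cpp
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical miniature IR: each block holds loads given as
// {address id, load id} and lists its children in the dominator tree.
struct Block {
  std::vector<std::pair<int, int>> loads;
  std::vector<int> domChildren;
};

// Walk the dominator tree depth-first. `available` maps an address to the
// first load of it seen on the current root-to-node path, so any hit is a
// dominating load by construction and no dominance query is needed.
void eliminate(const std::vector<Block> &blocks, int current,
               std::unordered_map<int, int> &available,
               std::unordered_map<int, int> &replacedBy) {
  std::vector<int> added;  // addresses made available by this block
  for (const auto &entry : blocks[current].loads) {
    auto hit = available.find(entry.first);
    if (hit != available.end()) {
      replacedBy[entry.second] = hit->second;  // redundant load
    } else {
      available[entry.first] = entry.second;
      added.push_back(entry.first);
    }
  }
  for (int child : blocks[current].domChildren)
    eliminate(blocks, child, available, replacedBy);
  for (int address : added)
    available.erase(address);  // leave scope: siblings must not see these
}
```

Each load is visited once and each table entry is inserted and erased once, so the cost is linear in the number of loads plus dominator-tree edges, in expectation for the hash tables.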

This patch adds a new attribute `const` to the alloca operation to indicate that
the corresponding local variable declaration is `const`-qualified. Future
optimizations may find this new attribute useful.
@Lancern
Member Author

Lancern commented Oct 10, 2024

Rebased onto the latest main.

@bcardosolopes bcardosolopes merged commit 959f03e into llvm:main Oct 11, 2024
6 checks passed
@Lancern Lancern deleted the alloca-const branch October 11, 2024 00:50
keryell pushed a commit to keryell/clangir that referenced this pull request Oct 19, 2024
lanza pushed a commit that referenced this pull request Nov 5, 2024