From: Chandler Carruth <chandlerc <at> gmail.com>
Subject: RFC: Missing canonicalization in LLVM
Newsgroups: gmane.comp.compilers.llvm.devel
Date: Wednesday 21st January 2015 22:16:41 UTC
So, we've run into some test cases which are pretty alarming.

When inlining code along various different paths we can end up with this IR:

define void @f(float* %value, i8* %b) {
entry:
  %0 = load float* %value, align 4
  %1 = bitcast i8* %b to float*
  store float %0, float* %1, align 1
  ret void
}

define void @g(float* %value, i8* %b) {
entry:
  %0 = bitcast float* %value to i32*
  %1 = load i32* %0, align 4
  %2 = bitcast i8* %b to i32*
  store i32 %1, i32* %2, align 1
  ret void
}

Now, I don't really care one way or the other about these two IR inputs,
but it's pretty concerning that we get these two equivalent bits of code
and nothing canonicalizes to one or the other.

So, the naive, first-blush approach here would be to canonicalize on the
first -- it has fewer instructions after all -- but I don't think that's
the right approach for two reasons:

1) It will be a *very* narrow canonicalization that only works with overly
specific sets of cast pointers.
2) It doesn't effectively move us toward the optimizer treating IR that
differs only in the pointee types of its pointers as indistinguishable.
I continue to think that some day we should get rid of the pointee types
entirely.

To see why #1 and #2 are problematic, assume another round of inlining took
place and we suddenly had the following IR:


AFAICT, this is the same and we still don't have a good canonicalization
story.

The obvious and important canonicalization that seems to be missing is
this: when we have a loaded value that is *only* used by storing it back
into memory, we don't canonicalize the type of that *value* (ignoring the
pointer types) to a single value type.

So, the only really suitable type for this kind of stuff is 'iN' where N
matches the number of bits loaded or stored.
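
To make the rule concrete, here is a rough sketch of it over a toy
instruction list in Python. This is not the actual patch and not LLVM's API:
the Load, Store and Other classes, the BIT_WIDTH table and the function names
are all hypothetical, and pointer types and bitcasts are deliberately not
modeled, since the rule only looks at the value type.

```python
from dataclasses import dataclass, replace

# Toy bit-width table (an assumption for this sketch; a real pass would
# ask the target's data layout instead).
BIT_WIDTH = {"i8": 8, "i16": 16, "i32": 32, "i64": 64, "float": 32, "double": 64}


@dataclass(frozen=True)
class Load:
    result: str   # SSA name of the loaded value, e.g. "%0"
    type: str     # value type being loaded
    pointer: str  # SSA name of the address operand


@dataclass(frozen=True)
class Store:
    value: str    # SSA name of the value being stored
    type: str     # value type being stored
    pointer: str  # SSA name of the address operand


@dataclass(frozen=True)
class Other:
    opcode: str
    operands: tuple  # SSA names used by any non-load, non-store instruction


def canonical_int_type(ty):
    """Return 'iN', where N is the number of bits in the value type."""
    return f"i{BIT_WIDTH[ty]}"


def canonicalize_load_store(instructions):
    """Retype loads whose value is only ever stored, plus those stores, to 'iN'."""
    stored = set()      # values that appear as the stored operand of a store
    other_uses = set()  # values used in any other position
    for inst in instructions:
        if isinstance(inst, Store):
            stored.add(inst.value)
            other_uses.add(inst.pointer)
        elif isinstance(inst, Load):
            other_uses.add(inst.pointer)
        else:
            other_uses.update(inst.operands)

    # A load qualifies if its result is stored somewhere and used nowhere else.
    retype = {
        inst.result
        for inst in instructions
        if isinstance(inst, Load)
        and inst.result in stored
        and inst.result not in other_uses
    }

    out = []
    for inst in instructions:
        if isinstance(inst, Load) and inst.result in retype:
            out.append(replace(inst, type=canonical_int_type(inst.type)))
        elif isinstance(inst, Store) and inst.value in retype:
            out.append(replace(inst, type=canonical_int_type(inst.type)))
        else:
            out.append(inst)
    return out
```

In this model, the float load/store pair from @f is retyped to i32, which is
the shape of @g, while a load that also feeds, say, an fadd keeps its float
type.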

I have this change implemented. It is trivial and unsurprising. However,
the effects of this are impossible to predict so I wanted to make sure it
made sense to others. Essentially, I expect random and hard to track down
performance fluctuations across the board. Some things may get better,
others may get worse, and they will probably all be bugs elsewhere in the
stack.

So, thoughts?
-Chandler
 