From: Chandler Carruth <chandlerc <at> google.com>
Subject: Re: Moving towards a singular pointer type
Newsgroups: gmane.comp.compilers.llvm.devel
Date: Saturday 7th February 2015 02:40:40 UTC
On Fri, Feb 6, 2015 at 6:09 PM, Reid Kleckner wrote:

> I think we should keep GEP essentially the same, but disassociate the
> type being GEP'd over from the type of the operands. So, assuming the new
> ptr type is spelled "ptr", we could use this syntax:
> %inner.ptr = getelementptr ptr, ptr %x, i32 1
> Or if I was adding 1 to a "struct A*" value in C:
> %next_elt = getelementptr %struct.A, ptr %x, i32 1
> Ditto for all other instructions that care about pointee types, like load
> and store:
> %v = load i32, ptr %p ; loads already know (and store!) their loaded type
> internally
> store i32 %v, ptr %p ; no need to duplicate what %p points to, the type is
> already on %v

Emphatically agree. No instruction should really change semantics here.
GEPs should keep working the exact same way; the type involved should just
be separate from the pointer's type.

> I don't think this can be incremental, I think it all goes at once.

I have some ideas of how to make it incremental:

> I think you might need to add a new GEP bitcode opcode, since that
> instruction grows a new type operand that doesn't come from an operand
> or result type.

Yep. And you can add this first, defining the semantics to be as-if the
pointer operand was bitcast to a pointer to the new type. Then we could
(in theory, not in practice!) even use and test it with the current IR,
passing an i8* or any other pointer type operand.
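
To make the "as-if bitcast" rule concrete, here is a small C++ model of it.
This is only a sketch of the semantics described above, not LLVM code: the
struct layout and the names gep_old and gep_new are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for %struct.A; the fields are invented.
struct A {
    int id;
    double weight;
};

// Models the old form, where the element type comes from the pointer type:
//   %r = getelementptr %struct.A* %x, i32 n
inline A* gep_old(A* x, std::ptrdiff_t n) {
    return x + n;
}

// Models the proposed form, where the element type T is a separate operand
// and the pointer operand carries no pointee type of its own:
//   %r = getelementptr T, ptr %x, i32 n
// It is defined as-if %x were first bitcast to T*.
template <typename T>
void* gep_new(void* x, std::ptrdiff_t n) {
    return static_cast<T*>(x) + n;
}
```

With an array of A, gep_new<A>(arr, 1) and gep_old(arr, 1) yield the same
address, which is the sense in which no semantics change.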

Same goes for the load instruction. We could support the new syntax first.
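
The same idea for memory access can be sketched as a C++ model. Again this is
an illustration under assumptions, not LLVM's implementation: load and store
here are hypothetical helpers, and std::int32_t stands in for i32.

```cpp
#include <cstdint>
#include <cstring>

// Models "%v = load T, ptr %p": the loaded type T is named by the
// instruction itself rather than recovered from the pointer's type.
template <typename T>
T load(const void* p) {
    T v;
    std::memcpy(&v, p, sizeof(T));
    return v;
}

// Models "store T %v, ptr %p": the stored type comes from the value operand,
// so the pointer needs no pointee type.
template <typename T>
void store(T v, void* p) {
    std::memcpy(p, &v, sizeof(T));
}
```

For example, store<std::int32_t>(42, &slot) followed by
load<std::int32_t>(&slot) reads back the stored value through a pointer that
carries no pointee type.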

Next, I think you might be able to introduce a generic pointer type,
spelled as 'ptr', which would fail the verifier if used with the old load or
gep instructions. You might have to synthesize an unnameable pointee type to use
for the in-memory representation, but that seems not beyond reason. Then
you can wire up all the asm parsing and printing and bitcode stuff
incrementally without any disruption.

The remaining parts are more interesting and maybe harder to do
incrementally, but still seem at least somewhat decomposable:

- switching all of the LLVM optimizer and all of Clang to produce the new
forms of GEP and load rather than the old forms
- switching all of the optimizer and Clang to use the new boring pointer
type now that they never form old gep and load instructions
- switching all the auto-upgrade functionality on
- removing the in-memory support for the old forms
- simplifying a ton of the in-memory support and the optimizer now that the
old forms can't show up


> It also wouldn't be too hard to accept the old .ll syntax, since the
> upgrade path mostly discards information.

I agree here. If only because of the regression test suite, and the
*incredible* tediousness of updating the pointers. The auto-upgrade for
this kind of thing is essentially perfect.
