From: Philip Reames <listmail <at> philipreames.com>
Subject: RFC: attribute for a pointer which is dereferenceable xor null
Newsgroups: gmane.comp.compilers.llvm.devel
Date: Thursday 12th February 2015 17:59:17 UTC
I'd like to propose that we add an attribute which expresses the notion 
that the specified value is /either/ null or dereferenceable up to a 
fixed size. (Note the xor.)  Our current dereferenceable(N) attribute 
doesn't quite fit the bill, since it implies that the pointer is non-null.  
Similarly, our nonnull attribute says nothing about dereferenceability.

There are two syntax proposals below, but let's start with the motivation.

These semantics arise in a number of common cases:
- In C, malloc is defined to return either null or a dereferenceable 
region of the size requested.
- In Java, any reference is either null or dereferenceable to the size 
of the static type.
- I suspect this will also be useful for Julia, Go, Rust, and others for 
similar reasons.

With such an attribute available, we can increase the effectiveness of 
LICM.  We can't move a load outside a loop if it might introduce a 
fault.  Knowing that a pointer is dereferenceable(N) at a location (i.e. 
the loop preheader) allows us to satisfy this constraint.  In the near 
term, we can simply add a case in the dereferenceability analysis that 
combines the new attribute and isKnownNonNull.  This won't be too 
effective out of the box, but will enable testing with llvm.assumes and 
might catch some cases.  I will probably also add a case to look at the 
controlling branch to the loop preheader since in practice that tends to 
be where an unswitched null check would live.
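
To make the LICM concern concrete, here is a hand-written C sketch of the
transformation (not compiler output; the function names are made up for
illustration).  The hoisted version executes the load even when the loop
runs zero times, so it is only valid if p is known to be dereferenceable
for sizeof(int) bytes whenever it is non-null, which is the guarantee the
proposed attribute would supply.

```c
#include <stddef.h>

/* Original form: the load of *p is loop-invariant, but it only executes
   when the null check passed AND the loop body runs at least once. */
long sum_original(const int *p, size_t n) {
    long total = 0;
    if (p != NULL) {
        for (size_t i = 0; i < n; ++i)
            total += *p;            /* invariant load, repeated n times */
    }
    return total;
}

/* Hoisted form: the load moves to the loop preheader.  When n == 0 this
   load is speculative, so hoisting is safe only under the ASSUMPTION that
   p is "null or dereferenceable for sizeof(int) bytes"; the null check
   that controls the preheader then rules out the null case. */
long sum_hoisted(const int *p, size_t n) {
    long total = 0;
    if (p != NULL) {
        int v = *p;                 /* single load in the preheader */
        for (size_t i = 0; i < n; ++i)
            total += v;
    }
    return total;
}
```

Without the attribute, a non-null p could still point at unmapped memory
(or one past the end of an object), and the hoisted load could fault where
the original would not.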

Longer term, I plan on introducing a mechanism to have isKnownNonNull 
consider trivially dominating conditions.  This will make the proposed 
attribute more powerful, but is explicitly not part of this proposal.  
That's a lot more work and will need a fair amount of discussion on its 
own.

Now, on to possible syntax.

*Option 1*
We could simply redefine our current notion of dereferenceable(N) to 
allow the pointer to be null.  Since we already have the nonnull 
attribute, this wouldn't lose any expressibility.  Frontends would need 
to be modified to emit both dereferenceable(N) and nonnull if they want 
to preserve the same semantics.  Most of the existing utility functions 
for dereferenceability in LLVM would be modified to just check both.  
There'd need to be a forward migration added to the bitcode parser to 
enable upgrade from the old semantics to the new.
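
As a rough illustration of what option 1 implies, the sketch below models
the upgrade step and the "check both" query in plain C.  The struct and
function names are hypothetical and deliberately simplified; this is not
LLVM's attribute representation or its upgrade code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two attributes on one pointer value. */
struct ptr_attrs {
    uint64_t dereferenceable_bytes; /* 0 means the attribute is absent */
    bool nonnull;
};

/* Forward migration: under the OLD semantics dereferenceable(N) implied
   non-null, so a file written with the old meaning gains an explicit
   nonnull to preserve that meaning under the new one. */
struct ptr_attrs upgrade_old_attrs(struct ptr_attrs old) {
    struct ptr_attrs upgraded = old;
    if (old.dereferenceable_bytes > 0)
        upgraded.nonnull = true;
    return upgraded;
}

/* Query under the NEW (option 1) semantics: a load of `size` bytes
   (assumed nonzero) may be speculated only if the pointer is covered by
   dereferenceable(N) AND is non-null, either by attribute or because some
   other analysis (modeled here as known_nonnull) proved it. */
bool can_speculate_load_option1(struct ptr_attrs a, uint64_t size,
                                bool known_nonnull) {
    return (a.nonnull || known_nonnull) && a.dereferenceable_bytes >= size;
}
```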

This is my preferred option, but in offline conversation, Hal objected 
to this change.  I'll let him describe his objection since I was never 
quite clear on it.

*Option 2*
We introduce a new attribute with the desired semantics.  This results 
in a collection of confusing overlapping attributes, but is otherwise 
straightforward.

My proposed strawman syntax would be: dereferenceable_or_null(N). 
(Bikeshedding welcomed.)  This would be a legal parameter and return 
attribute on both function declarations and call sites (i.e. calls and 
invokes).  As above, we'd extend all the places that currently 
consider 'dereferenceable' to consider the new attribute in combination 
with isKnownNonNull.
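
For comparison, a minimal C model of the option 2 query might look like
the following.  Again, the struct and function names are hypothetical, and
the known_nonnull flag simply stands in for whatever isKnownNonNull would
report at the point of the query.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: both attributes coexist on one pointer value. */
struct ptr_attrs_opt2 {
    uint64_t dereferenceable_bytes;         /* 0 means absent */
    uint64_t dereferenceable_or_null_bytes; /* 0 means absent */
};

/* A load of `size` bytes (assumed nonzero) is safe to speculate if either
   the existing attribute covers it outright, or the new attribute covers
   it and the pointer is separately known to be non-null. */
bool is_dereferenceable_opt2(struct ptr_attrs_opt2 a, uint64_t size,
                             bool known_nonnull) {
    if (a.dereferenceable_bytes >= size)
        return true;
    return known_nonnull && a.dereferenceable_or_null_bytes >= size;
}
```

The existing dereferenceable(N) path is untouched here, which is the main
practical difference from option 1: no upgrade of old files is required,
at the cost of a second, overlapping attribute.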
