From: Philip Reames <listmail <at> philipreames.com>
Subject: RFC: liveoncall parameter attribute
Newsgroups: gmane.comp.compilers.llvm.devel
Date: Monday 1st June 2015 23:28:09 UTC
TLDR - I have a runtime which expects to be able to inspect certain 
arguments to a function even if those arguments aren't used within the 
callee itself.  DeadArgumentElimination doesn't respect this today. I 
want to add an attribute that marks an argument to a call as live even 
if the value is known to be unused in the callee.

My use case

What my runtime is doing is trying to resolve a symbolic reference to a 
function from a call site which has been devirtualized by the compiler.

Rather than saving what the devirtualized callee actually was, all the 
(LLVM based) in-memory compiler does is save a bit indicating that it 
proved the given call site was monomorphic.  In LLVM, the call is 
represented as a patchable callsite using statepoints (could also be a 
patchpoint).  Before actually running the code in question, we patch 
over the generated code with a call to a helper routine which knows how 
to resolve the actual callee and patch the direct call target back into 
the patchable code section.

What's supposed to happen the first time this code is actually executed 
is that the running application thread calls into the helper routine, 
does a dynamic lookup of the callee (using the normal dynamic dispatch 
logic, including all corner cases), patches the actual callee's entry 
address back into the source of the call, and then tail calls into the 
actual callee.  However, there's a complication with the dynamic 
dispatch step.  If the actual callee was visible to LLVM when the 
caller was compiled, we might have proven that one of the arguments 
(say, the 'this' receiver pointer) was not used in the callee and 
replaced it with undef at the callsite.  This breaks the dynamic lookup.
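
To make the failure mode concrete, here is a toy Python model of the 
scheme described above.  It is only a sketch, not LLVM or runtime code: 
CallSite, Widget and UNDEF are names invented for the illustration, and 
a Python attribute lookup stands in for the real dynamic dispatch logic.

```python
UNDEF = object()  # hypothetical stand-in for an LLVM undef value


class CallSite:
    """Toy patchable call site; `target` plays the role of the patched call."""

    def __init__(self, method_name):
        self.method_name = method_name
        # Before first execution the site points at the resolver helper.
        self.target = self._resolve_and_patch

    def _resolve_and_patch(self, receiver, *args):
        # The helper must inspect the receiver to do the dynamic lookup.
        if receiver is UNDEF:
            raise LookupError("receiver was replaced by undef; cannot dispatch")
        callee = getattr(type(receiver), self.method_name)
        self.target = callee            # patch the direct target back in
        return callee(receiver, *args)  # "tail call" the actual callee

    def call(self, receiver, *args):
        return self.target(receiver, *args)


class Widget:
    def size(self):
        # The body never reads `self`, so an optimizer that can see it
        # could conclude the receiver argument is dead.
        return 42
```

In this model, once the site has been patched, passing UNDEF as the 
receiver is harmless because Widget.size never reads it.  On a fresh 
site the same call raises LookupError, which is the situation the 
proposed attribute is meant to rule out.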

(I really don't want to get into a discussion of whether this is the 
"right" way to implement such a thing.  This approach has various 
advantages, but more importantly, it's a _reasonable_ runtime design.  
In my view, LLVM should be able to support any reasonable design, 
regardless of whether it's the best one or not.)

The proposal

We add a new parameter attribute which can be placed either on a call 
site (call, invoke) or on a function declaration.  The exact semantics 
are that the parameter so tagged must be considered live up until the 
prolog of the callee actually starts executing.  It is illegal to make 
any assumptions in the caller about whether the callee uses this value 
or not.  This attribute does not inhibit inlining.  The semantics only 
apply if a call must be emitted (including tail or sibling calls).

My tentative name is liveoncall, but I'm open to better names.  Feel 
free to make suggestions.

Today, the actual implementation would be quite simple.  It will 
basically consist of a single special case in DeadArgumentElimination.  
In the long run, we might have to extend this to other inter-procedural 
analysis and optimization passes, but I suspect the diff will remain small.
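
As a rough illustration of what that special case amounts to, here is a 
toy Python model of a dead-argument pass.  It is a sketch under 
simplifying assumptions and does not reflect the structure of the real 
DeadArgumentElimination pass: Function, eliminate_dead_args and the 
string "undef" are invented for the example, and liveness is reduced to 
a set of parameter names the body reads.

```python
from dataclasses import dataclass, field

UNDEF = "undef"  # hypothetical marker for an argument rewritten to undef


@dataclass
class Function:
    name: str
    params: list
    used: set  # parameters the body actually reads
    # Parameters tagged with the proposed attribute on the declaration.
    liveoncall: set = field(default_factory=set)


def eliminate_dead_args(fn, call_args, site_liveoncall=frozenset()):
    """Return the call's arguments after a simplified dead-argument pass.

    `site_liveoncall` models the attribute placed on the call site itself.
    """
    result = []
    for param, arg in zip(fn.params, call_args):
        dead_in_callee = param not in fn.used
        pinned = param in fn.liveoncall or param in site_liveoncall
        # The special case: a pinned argument is kept even when dead.
        if dead_in_callee and not pinned:
            result.append(UNDEF)
        else:
            result.append(arg)
    return result


plain = Function("callee", ["this", "x"], used={"x"})
tagged = Function("callee", ["this", "x"], used={"x"}, liveoncall={"this"})

print(eliminate_dead_args(plain, ["%obj", "%x"]))
print(eliminate_dead_args(tagged, ["%obj", "%x"]))
```

The first call rewrites the unused receiver to undef, while the second 
keeps it because the parameter carries the attribute.  Passing 
site_liveoncall={"this"} with the untagged function has the same effect, 
modelling the attribute on a call site rather than on the declaration.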

Comparables & Alternatives

Today, the "meta arguments" to the patchpoint have a semantic which is 
similar to that proposed here.  They have the "liveoncall" property, but 
they *also* may be placed wherever the register allocator chooses.  My 
proposed attribute does not allow this degree of freedom.

Similarly, statepoints support "deopt arguments", "transition 
arguments", and "gc arguments".  All of them have the liveoncall 
property, but they also have additional restrictions on liveness (such 
as "live-during-call" or "live-on-return") and placement.

In DeadArgumentElimination, we already have support for interposable 
functions.  The restrictions are similar, but apply to all arguments to 
a function rather than a subset.  You could view my proposed attribute 
as allowing interposition of the callee, but with restricted semantics 
on the interposed implementation.

An alternate approach would be to insert a dummy use into the callee, 
lower it to a noop late in the backend, and teach the inliner to remove 
it after inlining.  I suspect this would be both harder to implement and 
harder to optimize around.
