From: Quentin Colombet <qcolombet <at> apple.com>
Subject: Heads-Up: Shrink-wrapping and TargetFrameLowering API changes
Newsgroups: gmane.comp.compilers.llvm.devel
Date: Tuesday 5th May 2015 17:42:35 UTC
Hi all,

The shrink-wrapping pass landed in r236507. This email gives a brief overview
of how you can use it and, for out-of-tree targets, how you have to update
TargetFrameLowering to be able to compile.

To learn more about the goal of the shrink-wrapping pass, look at the
description in http://reviews.llvm.org/D9210.
The short story is that shrink-wrapping provides a cheaper placement, in
terms of execution frequency, of the prologue and epilogue code sequences.


** Out-of-Tree Target: TargetFrameLowering API Changes **

This section helps the maintainers of out-of-tree targets update their
code to the latest APIs. It also describes why those changes are necessary
and thus may be useful for anyone who wants to use the shrink-wrapping
pass. The shrink-wrapping pass is disabled by default (see "How Can I Use
Shrink-Wrapping?" for more details).

With the introduction of shrink-wrapping, some APIs in TargetFrameLowering
needed to be updated to accommodate the fact that the prologue block is
not necessarily the entry block of the function.

The impacted functions are:
virtual void emitPrologue(MachineFunction &MF,
                          MachineBasicBlock &MBB) const = 0;
virtual void adjustForSegmentedStacks(MachineFunction &MF,
                                      MachineBasicBlock &PrologueMBB) const {}
virtual void adjustForHiPEPrologue(MachineFunction &MF,
                                   MachineBasicBlock &PrologueMBB) const {}
virtual void
adjustForFrameAllocatePrologue(MachineFunction &MF,
                               MachineBasicBlock &PrologueMBB) const {}

The additional MachineBasicBlock parameter represents the basic block in
which the prologue will be inserted. When shrink-wrapping is disabled (the
default), you can assert that this block is the same as the entry block of
the MachineFunction.
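
As a rough sketch of that assertion: the MachineFunction and
MachineBasicBlock types below are simplified stand-ins written for this
example, not the real LLVM classes, and isEntryBlock and
MyTargetFrameLowering are hypothetical names. Only the shape of the check
is meant to carry over to a real target.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins for the LLVM classes (assumption: the real ones
// have a much richer interface; only front() is mimicked here).
struct MachineBasicBlock {
  int Number;
};

struct MachineFunction {
  std::vector<MachineBasicBlock> Blocks;
  // The entry block is the first block of the function.
  MachineBasicBlock &front() { return Blocks.front(); }
};

// Hypothetical helper: true when MBB is the entry block of MF.
bool isEntryBlock(MachineFunction &MF, MachineBasicBlock &MBB) {
  return &MF.front() == &MBB;
}

// Mirrors the shape of the updated hook for a target that does not
// support shrink-wrapping yet.
struct MyTargetFrameLowering {
  void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const {
    // With shrink-wrapping disabled, the prologue block is the entry block.
    assert(isEntryBlock(MF, MBB) && "Shrink-wrapping not yet supported");
    (void)MF;
    (void)MBB;
    // ... emit the prologue at the beginning of MBB as before ...
  }
};
```

A target that keeps this assertion continues to behave as before as long as
shrink-wrapping stays disabled.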


** How Can I Use Shrink-Wrapping? **

* Fix emitPrologue and emitEpilogue *

To be able to use shrink-wrapping, you need to make sure emitPrologue
(respectively emitEpilogue) works on basic blocks that are not necessarily
the entry block (resp. exit blocks) of the function. In particular, the
block used for the epilogue may be empty.
You can have a look at the AArch64 changes in r236507 to see what it
takes to implement such support.
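
To illustrate the empty-block case, here is a minimal sketch of choosing an
insertion point without assuming the block ends with a return. The types,
the opcode strings, and the helper epilogueInsertIndex are all made up for
this example; this is not how the AArch64 code is written, only one way to
avoid dereferencing an instruction that may not exist.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for the LLVM classes (assumption, not the real API).
struct MachineInstr {
  std::string Opcode;
  bool IsTerminator;
};

struct MachineBasicBlock {
  std::vector<MachineInstr> Instrs;
};

// Hypothetical helper: index of the first terminator, or the end of the
// block when there is none. An empty block yields 0.
std::size_t epilogueInsertIndex(const MachineBasicBlock &MBB) {
  for (std::size_t I = 0; I < MBB.Instrs.size(); ++I)
    if (MBB.Instrs[I].IsTerminator)
      return I;
  return MBB.Instrs.size();
}

// Inserts a placeholder epilogue instruction before the terminators.
// "RESTORE_CSR" is an invented opcode name used only for illustration.
void emitEpilogue(MachineBasicBlock &MBB) {
  std::size_t Pos = epilogueInsertIndex(MBB);
  MBB.Instrs.insert(MBB.Instrs.begin() + Pos,
                    MachineInstr{"RESTORE_CSR", false});
}
```

The point of the sketch is that nothing reads the last instruction of the
block before checking that one exists.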

* Play with Shrink-Wrapping *

The shrink-wrapping pass can be enabled via the command-line switch (-mllvm)
-enable-shrink-wrap.
You can also see how it applies by using the "(-mllvm) -stats" switch
and looking for the shrink-wrap output.
Finally, you can look at what it does by using "(-mllvm)
-debug-only=shrink-wrap".

Please file any bug that you might encounter.

* How Do I Turn This On By Default? *

You can turn the shrink-wrapping pass on by default for your target by
setting the field EnableShrinkWrap to true in your derived class of
TargetPassConfig.

Note: The (-mllvm) -enable-shrink-wrap switch overrides the default setting
for the current run.
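
The interaction between the per-target default and the switch can be
pictured with the sketch below. The TargetPassConfig class here is a
simplified stand-in, and the Override enum, shouldRunShrinkWrap, and
MyTargetPassConfig are names invented for this example; only the
EnableShrinkWrap field comes from the actual change.

```cpp
#include <cassert>

// Hypothetical tri-state for the command-line switch: not given on the
// command line, explicitly on, or explicitly off.
enum class Override { Unset, True, False };

// Simplified stand-in for the real TargetPassConfig (assumption).
class TargetPassConfig {
protected:
  // Per-target default; off unless a derived class turns it on.
  bool EnableShrinkWrap = false;

public:
  virtual ~TargetPassConfig() = default;

  // The switch, when present, wins over the target default.
  bool shouldRunShrinkWrap(Override CmdLine) const {
    if (CmdLine == Override::Unset)
      return EnableShrinkWrap;
    return CmdLine == Override::True;
  }
};

// A target that opts in by default.
class MyTargetPassConfig : public TargetPassConfig {
public:
  MyTargetPassConfig() { EnableShrinkWrap = true; }
};
```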


** What Is Next? **

Here are a few items for the future direction of the shrink-wrapping pass:
- Enable it by default for AArch64.
- Implement ARM/X86 support.
- Enable it by default for ARM/X86.
- Refine the shrink-wrapping pass to support multiple insertion points for
the prologue and epilogue.

The last item should be driven by motivating examples. I do not want to
complicate the implementation if it is not proven beneficial.


** AArch64 Developers: We Need Your Help **

As of r236507, shrink-wrapping support is implemented in AArch64. I would
like AArch64 developers to give it a try and report any bug/regression they
might see. The idea would be to enable the shrink-wrapping pass by default
based on that feedback.

Thanks,
-Quentin
 