
From: Daniel Sanders <Daniel.Sanders <at> imgtec.com>
Subject: The Trouble with Triples
Newsgroups: gmane.comp.compilers.llvm.devel
Date: Wednesday 8th July 2015 14:31:44 UTC

In http://reviews.llvm.org/D10969,
Eric asked me to explain on llvmdev the wider context of the TargetTuple
object that was replacing Triple, so here it is.

Before I start, I'm sure I don't know the full extent of GNU triple
ambiguity and lack of canonicity. Additional examples are welcome.

The Problem

As you know, LLVM uses a GNU Triple as a target description that can be
relied upon to make decisions. It's used for various decisions such as the
default cpu, the alignment of types, the object format, the names for
libcalls, and a wide variety of others.
In using it like this, LLVM assumes that triples are unambiguous and have a
specific defined meaning. Unfortunately, this assumption fails for a number
of reasons.

The first reason is that compiler options can overrule the triple but leave
it unchanged. For example, in GCC mips-linux-gnu-gcc normally produces
32-bit MIPS-I output using the O32 ABI, but 'mips-linux-gnu-gcc -mips64'
normally produces 64-bit MIPS-III output using the N32 ABI. Like GCC,
compiler options to mips-linux-gnu-clang should (and mostly do but MIPS has
a few crashing cases caused by triple misuse) overrule the triple. However,
we don't mutate the triple to reflect this so any decisions based on the
overridable state cannot rely on the triple to accurately reflect the
desired behaviour.
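
To make the failure mode concrete, here is a minimal sketch. It is not LLVM code: EffectiveTarget, applyOptions and tripleSays64Bit are hypothetical names, and the only behaviour modelled is the GCC example above (O32 by default, N32 with -mips64).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the state a driver ends up with; not LLVM's API.
struct EffectiveTarget {
  std::string Triple; // what the rest of the compiler is handed
  std::string ABI;    // what the options actually asked for
  bool Is64Bit;
};

// Mimics the GCC behaviour described above for mips-linux-gnu:
// O32 by default, N32 when -mips64 is given. The triple is never updated.
EffectiveTarget applyOptions(const std::string &Triple, bool HasMips64) {
  EffectiveTarget T{Triple, "o32", false};
  if (HasMips64) {
    T.ABI = "n32";
    T.Is64Bit = true;
  }
  return T;
}

// A decision made from the triple string alone.
bool tripleSays64Bit(const std::string &Triple) {
  return Triple.rfind("mips64", 0) == 0; // true if it starts with "mips64"
}

// Usage: the option wins, but the triple-based check still answers "32-bit".
void exampleOverride() {
  EffectiveTarget T = applyOptions("mips-linux-gnu", /*HasMips64=*/true);
  assert(T.Is64Bit && !tripleSays64Bit(T.Triple));
}
```

Any code that consults only the triple string sees a 32-bit target even though the options selected a 64-bit one.
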
It's worth mentioning here that some targets have hacks to partially mutate
the triple in clang to work around issues they would otherwise have in the
backend but this is done on an ad-hoc basis for specific details (e.g. mips
<-> mipsel for -EL and -EB).

The second reason is that there is no canonical meaning for a given GNU
Triple; it varies between vendors and over time. There is also no
requirement for vendors to have a unique GNU Triple for their toolchain.
For GCC, it's fairly common for distributors to change the meanings of
triples using options like --with-arch, --with-cpu, --with-abi, etc. There
are also some target-specific options such as --with-mode to select
ARM/Thumb by default and --with-nan for MIPS NAN encoding selection.
Different vendors use different configure options and may change them at
will. When they do change them, the vendors often desire to keep the same
triple to be able to drop in the new version without causing wider impact
on their environment. For example, assuming I'm reading debian/rules2 for
Debian's gcc-4.9 package correctly, i386-linux-gnu means i486 on
Debian Etch and Lenny but means i586 on more recent versions. On a similar
note, on Debian, mips-linux-gnu targets MIPS-II (optimised for typical
MIPS32 implementations) rather than the usual MIPS-I. The last example of
this ambiguity I'd like to reference is that mentioned by https://wiki.debian.org/Multiarch/Tuples#Why_not_use_GNU_triplets.3F.
In that example, hard-float and soft-float on ARM both used
arm-linux-gnueabi but were mutually incompatible. The Multiarch tuples
described on that page are an attempt to resolve the ambiguity but I'm told
that they aren't likely to be universally adopted.

The third reason is that different triples can mean the same thing. Jim
Grosbach has mentioned that the prefixes of the GNU Triple are different
between Linux and Darwin for ARM despite sharing the same meaning
(presumably subject to the issues above). As a result, decisions based on
the string have to take care of multiple possible values. MIPS has a
similar issue too, since a host triple (and therefore default target triple)
of mips64-linux-gnu needs to behave like mips-linux-gnu on a 32-bit MIPS
port of Debian.
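
As a rough illustration of "multiple possible values", the sketch below assumes the prefixes in question are "aarch64" (Linux) and "arm64" (Darwin); that pairing is my assumption, not something stated above. Arch, archPrefix, naiveIsAArch64 and classify are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <string>

enum class Arch { AArch64, Unknown };

// Text before the first '-' of a triple string (the whole string if none).
std::string archPrefix(const std::string &Triple) {
  return Triple.substr(0, Triple.find('-'));
}

// A check that knows only one spelling silently misses the other.
bool naiveIsAArch64(const std::string &Triple) {
  return archPrefix(Triple) == "aarch64";
}

// A check that enumerates every spelling known to mean the same thing.
Arch classify(const std::string &Triple) {
  const std::string P = archPrefix(Triple);
  if (P == "aarch64" || P == "arm64")
    return Arch::AArch64;
  return Arch::Unknown;
}

// Usage: both spellings classify the same, but the naive check disagrees.
void exampleSpellings() {
  assert(classify("aarch64-linux-gnu") == classify("arm64-apple-ios"));
  assert(!naiveIsAArch64("arm64-apple-ios"));
}
```

Every string-based decision has to repeat this enumeration, which is easy to get wrong in one place and not another.
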

Although not included in the description of the assumption above, one
additional flaw in the use of GNU Triples is that they are sometimes
inadequate as a description of the target. One example affecting MIPS in
particular is that the ABI is not represented in the GNU Triple, so we
require significant API changes to get this information where we need it. It would
be helpful to be able to pass such information through the existing

The Planned Solution

The plan is to split the GNU Triple represented by the llvm::Triple object
into two pieces. The first piece is the existing llvm::Triple and is
responsible for parsing the GNU triple and canonicalizing it. The second
piece is a mutable target description named llvm::TargetTuple. TargetTuple
is responsible for interpreting the triple according to the vendor's rules,
providing an interface to allow mutation by tools, and authoritatively
defining the target being targeted without the ambiguity of GNU Triples. As
an example, 'mips-linux-gnu-clang -EL ...' would:
// Parse the GNU Triple
llvm::Triple GnuTriple("mips-linux-gnu");
// Convert it to a TargetTuple according to the (possibly customized)
// meanings in use by the vendor.
llvm::TargetTuple TT(GnuTriple);
// Then mutate the TargetTuple according to the compiler options (or
// equivalent depending on the tool, for example disassemblers would mutate
// it according to the object headers).
if (hasOption("-EL"))
At this point, TT would be
"+mipsel-unknown-linux-gnu-elf32-some-other-stuff" (exact serialization is
t.b.d and may end up target dependent) which we can then rely on in the
rest of LLVM. This split resolves the issue of llvm::Triple objects not
being reliable when used as a target description since TargetTuple will
reflect the result of interpreting the triple as well as applying
appropriate options. It also provides a suitable place for vendors to
define the meanings of their GNU Triples.
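
The parse / interpret / mutate split can be mocked up in a few self-contained lines. This is only a sketch under stated assumptions: MockTriple and MockTargetTuple stand in for llvm::Triple and the proposed llvm::TargetTuple, setLittleEndian() and str() are invented for illustration, and the output format is not the serialization discussed above (which is still t.b.d.).

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in for llvm::Triple: here it only holds the GNU triple string.
struct MockTriple {
  std::string Str;
  explicit MockTriple(std::string S) : Str(std::move(S)) {}
};

// Stand-in for the proposed TargetTuple. Every member is hypothetical.
class MockTargetTuple {
  std::string Arch;
  std::string Rest; // everything after the arch component, kept verbatim

public:
  // Interpretation step: where vendor-specific meanings would be applied.
  explicit MockTargetTuple(const MockTriple &T) {
    const std::string::size_type Dash = T.Str.find('-');
    Arch = T.Str.substr(0, Dash);
    Rest = Dash == std::string::npos ? "" : T.Str.substr(Dash);
  }

  // Mutation step: called by a tool once options or object headers are known.
  void setLittleEndian() {
    if (Arch == "mips")
      Arch = "mipsel";
    else if (Arch == "mips64")
      Arch = "mips64el";
  }

  std::string str() const { return Arch + Rest; }
};

// Usage: the tuple reflects -EL while the parsed triple stays as written.
void exampleMutation() {
  MockTriple GnuTriple("mips-linux-gnu");
  MockTargetTuple TT(GnuTriple);
  TT.setLittleEndian();
  assert(TT.str() == "mipsel-linux-gnu");
  assert(GnuTriple.Str == "mips-linux-gnu");
}
```

The point of the split is visible in the last two assertions: the user-facing triple is untouched, and everything downstream reads the mutated tuple instead.
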

One significant detail is the way vendors customize the meaning of their
Triples. Currently, the plan is to nominate a constructor
(TargetTuple::TargetTuple(const Triple &)) a vendor can patch to redefine
their triples with the default implementation being the 'usual' meaning
(the meaning that should be used in the absence of customization). One nice
benefit of this configure-by-source-patch approach is that vendors can
customize multiple triples as easily as their native triple or intended
target triple. To use Debian as an example again, they would be able to
customize all their supported triples such that 'clang -target
arm-linux-gnueabihf' on the amd64 port targets their armhf port using the
same customization that makes 'clang' on the armhf port do the right thing
natively. Android and toolchains for heterogeneous platforms would likely
benefit from this too. This configure-by-source-patch approach seems to
make some people uncomfortable so we may have to find another way to
configure the triples (tablegen?).
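
For illustration only, here is one shape the "usual meaning plus vendor override" idea could take. It uses a lookup table rather than a patched constructor, purely to keep the sketch short; TargetDefaults, usualMeaning and interpret are hypothetical, and the CPU strings merely echo the Debian MIPS-II versus MIPS-I example from earlier.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical bundle of defaults that a triple is interpreted into.
struct TargetDefaults {
  std::string CPU;
};

using OverrideTable = std::map<std::string, TargetDefaults>;

// The 'usual' meaning, used in the absence of customization.
TargetDefaults usualMeaning(const std::string &Triple) {
  if (Triple == "mips-linux-gnu")
    return TargetDefaults{"mips1"};
  return TargetDefaults{"generic"};
}

// Vendor overrides win; anything not listed keeps its usual meaning.
TargetDefaults interpret(const std::string &Triple,
                         const OverrideTable &VendorOverrides) {
  const auto It = VendorOverrides.find(Triple);
  return It != VendorOverrides.end() ? It->second : usualMeaning(Triple);
}

// Usage: the same triple resolves differently with and without overrides.
void exampleOverrides() {
  OverrideTable Vendor;
  Vendor["mips-linux-gnu"] = TargetDefaults{"mips2"};
  assert(interpret("mips-linux-gnu", Vendor).CPU == "mips2");
  assert(interpret("mips-linux-gnu", OverrideTable()).CPU == "mips1");
}
```

Because the table is keyed by triple, a vendor can customize cross-compilation triples in exactly the same way as the native one.
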

To reach this result the plan is to do the following:

1. Replace any remaining std::string's and StringRef's containing GNU
   triples with Triple objects.

2. Split the llvm::Triple class into llvm::Triple and llvm::TargetTuple
   classes. Both are identical in implementation and almost identical in
   interface at this stage.

3. Gradually replace Triples with TargetTuples until the C APIs and the
   LLVM-IR are the only place inside LLVM where Triples are still used.

4. Change the implementation of TargetTuple to whatever is convenient for
   LLVM's internals and decide on a serialization.

5. Replace serialized Triples with serialized TargetTuples in

   a. Maintain backwards compatibility with IR using triples, at least
      for a while.

6. Add TargetTuple support to the C API. Exact API is t.b.d.

7. Have the API users mutate the TargetTuple appropriately.

Renato: This has been revised slightly from the last one we discussed due
to public C++ API's being used internally as well as externally.

Where we are now

I've just started posting patches for steps 2 and 3 of the plan. My working
copy is nearly at step 4.

What's next

Upstream steps 2 and 3 and then begin replacing the TargetTuple
implementation as per step 4.

Previous Discussions

I should mention that I've since been made aware that the original topic of
private label prefixes could be solved in a much simpler way than
previously thought. The triple related discussion is still relevant though.
I understand from Renato that there are more threads over the last few
years but I haven't looked for them.

Daniel Sanders
Leading Software Design Engineer, MIPS Processor IP
Imagination Technologies Limited