From: Frédéric Riss <friss <at> apple.com>
Subject: Reimplementing Darwin's dsymutil as an lld helper
Newsgroups: gmane.comp.compilers.llvm.devel
Date: Friday 7th November 2014 16:09:51 UTC

[ I Cc'd lld people and debug info people. Apologies if I omitted some
stakeholder. ]

As stated in the subject, I’d like to start working on an in-tree
reimplementation of Darwin’s dsymutil utility. This is an initial step on
the path to having lld handle the debug information itself.

For those who are not familiar with the debug flow on MacOS, dsymutil is a
DWARF linker. Darwin’s linker (ld64) doesn’t link the DWARF debug info
found in the object files; instead, it writes a “debug-map” in the
linked binary. This debug-map describes what objects were linked together
and what atoms of each object file are present in the binary along with
their addresses. The debug-map has two uses:
1) During the build->debug cycle, lldb reads the debug-map and uses it to
find the .o files and extract the relevant DWARF debug info.
2) For Release builds, dsymutil reads the debug-map, then loads, merges, and
optimizes all the DWARF debug info and writes it out as a .dSYM bundle.
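
To make the debug-map idea a bit more concrete, here is a minimal Python
sketch of the information it carries. This is only an illustration of the
concept, not the format ld64 actually writes: the class names, field names,
paths and addresses are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class AtomMapping:
    """Where one atom (e.g. a function) of an object file ended up."""
    name: str
    object_address: int   # address of the atom inside the .o file
    binary_address: int   # address of the atom in the linked binary
    size: int


@dataclass
class DebugMapObject:
    """One linked object file and the atoms the linker kept from it."""
    path: str
    atoms: Dict[str, AtomMapping] = field(default_factory=dict)

    def add_atom(self, name: str, object_address: int,
                 binary_address: int, size: int) -> None:
        self.atoms[name] = AtomMapping(name, object_address,
                                       binary_address, size)


@dataclass
class DebugMap:
    """All object files that contributed to one linked binary."""
    objects: List[DebugMapObject] = field(default_factory=list)

    def add_object(self, path: str) -> DebugMapObject:
        obj = DebugMapObject(path)
        self.objects.append(obj)
        return obj

    def find_atom(self, name: str) -> Optional[Tuple[str, AtomMapping]]:
        """Return (object path, mapping), or None if the atom was dropped."""
        for obj in self.objects:
            if name in obj.atoms:
                return obj.path, obj.atoms[name]
        return None


# Hypothetical example: two object files linked into one binary.
debug_map = DebugMap()
main_o = debug_map.add_object("/tmp/build/main.o")
main_o.add_atom("_main", 0x0, 0x100000F00, 0x40)
util_o = debug_map.add_object("/tmp/build/util.o")
util_o.add_atom("_helper", 0x20, 0x100000F40, 0x18)
```

With such a structure, a consumer (a debugger or a DWARF linker) can go from
a symbol in the binary back to the .o file holding its debug info, and it
knows that an atom missing from the map was not linked in.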

The long-term goal is for DWARF linking functionality to be available as a
library for LLVM tools. Eventually, we’d like lld to be able to make use
of the DWARF linking library and not need a standalone dsymutil tool. The
first step is to use the DWARF linking library in a standalone dsymutil
replacement tool. We want this tool to be bit-for-bit compatible with the
existing Darwin dsymutil.

The main reason we want to take the first step of a separate tool is
testability. The code committed to the LLVM repository will feature unit
tests, but they won’t offer the coverage that real-world usage would. I
plan to run the new tool through big internal validation campaigns during
which the output of the LLVM-powered dsymutil would be compared to that of
the system’s dsymutil. This is also the reason we aim for bit-for-bit
compatibility.
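
As a rough idea of what a bit-for-bit check could look like, here is a small
Python sketch. It is not the actual validation setup; the function name is
made up for the example, and a real campaign would read the two outputs from
disk before comparing them.

```python
from typing import Optional


def first_difference(expected: bytes, actual: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(expected, actual)):
        if x != y:
            return offset
    if len(expected) != len(actual):
        # One output is a strict prefix of the other.
        return min(len(expected), len(actual))
    return None
```

Reporting the first differing offset (rather than a plain yes/no) makes it
easier to map a mismatch back to the debug section it falls in.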

The current plan is to host the code in the LLVM repository. dsymutil will
make heavy use of libDebugInfo and won’t share anything with the lld
codebase (the underlying concepts are just too different). It’s also not
clear yet where most of the implementation logic will end up. I expect most
of the core logic to be in tools/dsymutil, but some of it might be better
folded directly into libDebugInfo.

So how does it work? dsymutil doesn’t simply paste the debug sections
together while applying relocations to them. That wouldn’t work with ld64,
because ld64 (like lld) can split the sections apart and discard or reorder
their contents. Thus dsymutil needs some semantic knowledge of the DWARF
contents to be able to “patch” the relocatable debug info with accurate
values. It is also able to remove parts of the DIE tree that aren’t
needed or to unique types across the compilation unit boundaries. In
libDebugInfo, we have the needed tooling to read the debug info, but we
currently lack the ability to write it back to disk. Maybe what’s in
lib/CodeGen/AsmPrinter to emit the debug info would fit the bill, but I
won’t be sure until I try to write the code. I’ll see along the way if
libDebugInfo should grow its own DWARF streaming capabilities. Opinions
welcome.
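
To illustrate why a per-section relocation is not enough, here is a small
Python sketch of the patching idea. It is a toy model under stated
assumptions, not dsymutil's algorithm: DIEs are plain dicts with only a name
and a low_pc, and the Atom tuple and both function names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Atom(NamedTuple):
    """One piece of an object file that the linker kept."""
    object_address: int   # where the atom starts in the .o file
    size: int
    binary_address: int   # where the atom starts in the linked binary


def translate(atoms: List[Atom], object_address: int) -> Optional[int]:
    """Map a .o address to its linked address, or None if it was discarded."""
    for atom in atoms:
        offset = object_address - atom.object_address
        if 0 <= offset < atom.size:
            return atom.binary_address + offset
    return None


def link_functions(dies: List[dict], atoms: List[Atom]) -> List[dict]:
    """Patch low_pc of each function DIE and drop DIEs for discarded code."""
    linked = []
    for die in dies:
        new_low_pc = translate(atoms, die["low_pc"])
        if new_low_pc is None:
            continue  # the linker discarded this code; drop its DIE too
        linked.append({**die, "low_pc": new_low_pc})
    return linked


# The linker kept two atoms, in the opposite order from the .o file,
# and discarded the code at 0x10.
atoms = [Atom(0x40, 0x20, 0x1000), Atom(0x00, 0x10, 0x1020)]
dies = [
    {"name": "first", "low_pc": 0x00},
    {"name": "dead", "low_pc": 0x10},
    {"name": "last", "low_pc": 0x40},
]
linked = link_functions(dies, atoms)
```

Each address is translated through the atom that contains it, so reordered
atoms get different adjustments and discarded ones have no translation at
all. A single base offset for the whole section cannot express either case.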

Although the implementation of the dsymutil command line tool will be
fairly Darwin specific (it accepts Mach-O files as input and emits a dSYM
bundle), most of the implementation will be format agnostic. I’ll make an
effort to split the Mach-O specific parts into their own files so that this
code can be reused in a generic way. Would there be interest in that kind
of code on other platforms as well? What’s the story of lld DWARF support
for ELF?

I plan on sending the initial code (which basically only parses the
debug map of Mach-O files) out for review in the coming days if there are
no objections to the general principle.

LLVM Developers mailing list
[email protected]         http://llvm.cs.uiuc.edu