From: Michael Zolotukhin <mzolotukhin <at> apple.com>
Subject: Proposal: add intrinsics for safe division
Newsgroups: gmane.comp.compilers.llvm.devel
Date: Thursday 24th April 2014 05:52:20 UTC
Hi,

I’d like to propose extending the set of LLVM IR intrinsics with new ones
for safe division. LLVM already has intrinsics for detecting overflow,
such as sadd.with.overflow, and the intrinsics I’m proposing would augment
that set.

The new intrinsics return a structure with two elements, the result and an
error flag, according to the following rules (min is the smallest signed
value of the type):
safe.[us]div(x,0) = safe.[us]rem(x,0) = {0, 1}
safe.sdiv(min, -1) = safe.srem(min, -1) = {min, 1}
In other cases: safe.op(x,y) = {x op y, 0}, where op is sdiv, udiv, srem,
or urem
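
To make the rules concrete, here is a minimal C sketch of the i32 variants.
It is only a model of the semantics stated above, not the patch itself: the
struct safe_i32 / safe_u32 and the function names are hypothetical, and the
bool field stands in for the i1 element of the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C stand-ins for the {i32, i1} result structure. */
typedef struct { int32_t  value; bool error; } safe_i32;
typedef struct { uint32_t value; bool error; } safe_u32;

/* safe.sdiv: {0, 1} for x/0, {min, 1} for min/-1, else {x / y, 0}. */
safe_i32 safe_sdiv_i32(int32_t x, int32_t y) {
    if (y == 0)                    return (safe_i32){0, true};
    if (x == INT32_MIN && y == -1) return (safe_i32){INT32_MIN, true};
    return (safe_i32){x / y, false};
}

/* safe.srem follows the same two special cases as safe.sdiv. */
safe_i32 safe_srem_i32(int32_t x, int32_t y) {
    if (y == 0)                    return (safe_i32){0, true};
    if (x == INT32_MIN && y == -1) return (safe_i32){INT32_MIN, true};
    return (safe_i32){x % y, false};
}

/* The unsigned forms only have the divide-by-zero case. */
safe_u32 safe_udiv_i32(uint32_t x, uint32_t y) {
    if (y == 0) return (safe_u32){0, true};
    return (safe_u32){x / y, false};
}

safe_u32 safe_urem_i32(uint32_t x, uint32_t y) {
    if (y == 0) return (safe_u32){0, true};
    return (safe_u32){x % y, false};
}

/* A few spot checks of the rules. */
void safe_div_examples(void) {
    assert(safe_sdiv_i32(7, 2).value == 3 && !safe_sdiv_i32(7, 2).error);
    assert(safe_sdiv_i32(7, 0).value == 0 && safe_sdiv_i32(7, 0).error);
    assert(safe_srem_i32(INT32_MIN, -1).value == INT32_MIN);
    assert(safe_udiv_i32(9u, 0u).error);
}
```

The explicit checks before each / and % are what the model needs in C, where
both operations are undefined for these inputs.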

These intrinsics would be used in much the same way as the
arith.with.overflow intrinsics. For instance:
      %res = call {i32, i1} @llvm.safe.sdiv.i32(i32 %a, i32 %b)
      %div = extractvalue {i32, i1} %res, 0
      %bit = extractvalue {i32, i1} %res, 1
      br i1 %bit, label %trap, label %normal

Now a few words about the implementation in LLVM. Though the new
intrinsics look quite similar to the overflow ones, there are significant
differences. One is that lowering the new intrinsics requires creating
control flow, while for the existing ones it is sufficient to compute the
overflow flag. The control flow is needed to guard the division, which
could otherwise cause undefined behaviour.

The existing intrinsics are lowered in the back-end, during the
legalization steps. Doing the same for the new ones would require a more
complicated implementation because of the need to create new control flow,
and it would have to be done in every backend.

The alternative is to lower the new intrinsics in the CodeGenPrepare pass.
That approach looks more convenient to me because it gives us a single
implementation for all targets in one place, and it’s easier to introduce
control flow at that point.

The patch below implements the second alternative. Along with a
straightforward lowering (which is valid and could be used as a baseline on
all platforms), the lowering performs some simple optimizations (which I
think are also easier to implement in CodeGenPrepare than on DAGs):
  - We don’t generate code for unused parts of the result structure.
  - If the div instruction on the given platform behaves exactly as the
    intrinsic requires (as is the case on ARM64), we don’t guard the div
    instruction. As a result, we can avoid branches altogether if the
    second element of the result structure is not used.
  - The most likely users of the result structure are extractvalue
    instructions. With that in mind, we try to propagate the results, which
    in most cases lets us get rid of all the corresponding extractvalues.
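
The unguarded case can be sketched in C as follows. This is an illustration
under assumptions rather than what the patch emits: target_sdiv is a
hypothetical stand-in for a divide instruction that never traps and already
returns 0 for x/0 and min for min/-1, and sdiv_flag is a hypothetical helper
for the second element. Since the flag depends only on the operands, a
caller that ignores it needs nothing beyond the divide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a non-trapping hardware divide. C's own '/' is
   undefined for these inputs, so the model has to special-case them; on
   the real target this whole function would be one instruction. */
int32_t target_sdiv(int32_t x, int32_t y) {
    if (y == 0)                    return 0;
    if (x == INT32_MIN && y == -1) return INT32_MIN;
    return x / y;
}

/* Second element of the result, computed from the operands alone with
   comparisons and bitwise ops, so it needs no branch around the divide. */
bool sdiv_flag(int32_t x, int32_t y) {
    return (y == 0) | ((x == INT32_MIN) & (y == -1));
}

/* Caller that uses only element 0: no flag and no guard are needed. */
int32_t quotient_only(int32_t x, int32_t y) {
    return target_sdiv(x, y);
}
```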

Attached are two patches: the first one with the described implementation
and tests, and the second one with the corresponding documentation changes.

The first patch happens to have landed on trunk already, but the topic is
still open, and any suggestions are welcome.



Best regards,
Michael
 