Home
Reading
Searching
Subscribe
Sponsors
Statistics
Posting
Contact
Spam
Lists
Links
About
Hosting
Filtering
Features Download
Marketing
Archives
FAQ
Blog
 
Gmane
From: Tim Northover <t.p.northover <at> gmail.com>
Subject: [RFC] Add second "failure" AtomicOrdering to cmpxchg instruction
Newsgroups: gmane.comp.compilers.llvm.devel
Date: Friday 7th March 2014 13:47:12 UTC (over 3 years ago)
Hi all,

The C++11 (& C11) compare_exchange functions with explicit memory
order allow you to specify two sets of semantics, one for when the
exchange actually happens and one for when it fails. Unfortunately, at
the moment the LLVM IR "cmpxchg" instruction only has one ordering,
which means we get sub-optimal codegen.

This probably affects all architectures which use
load-linked/store-conditional instructions for atomic operations and
don't have versions with built-in acquire/release semantics (and so
need barriers). For example on ARMv7, an "acquire, relaxed" cmpxchg
currently generates:

        mov r9, #52
        mov r3, #42
LBB0_1:                                 @ =>This Inner Loop Header: Depth=1
        ldrexb r1, [r0]
        cmp r1, r3
        bne LBB0_3
@ BB#2:                                 @   in Loop: Header=BB0_1 Depth=1
        strexb r2, r9, [r0]
        cmp r2, #0
        bne LBB0_1
LBB0_3:
        dmb ish
        [...]

That second barrier could be skipped by the first "bne" if we knew
that the user didn't need any ordering in case of failure.

The other atomic operations are not affected, so I'd like to add a
second ordering operand to "cmpxchg" to support the full range of
possible requests. The suggested syntax would be

    cmpxchg [volatile] * ,  ,  
[singlethread]  

That is, syntactically I propose:
  + No comma between the two
  + Both operands are required, even if "failure" is the natural pair
of "success"

Semantically, I would like to enforce the constraints in the standards:
  + failure <= success
  + failure != Release
  + failure != AcquireRelease.

Before the DAG, I suggest completely removing getOrdering from
CmpXchgInst, in favour of getSuccessOrdering and getFailureOrdering to
avoid errors. In the DAG, getOrdering would still exist and return the
success ordering, which I believe would make existing targets
conservatively correct. AtomicSDNode would add getFailureOrdering,
used only by ATOMIC_CMP_SWAP nodes.

Existing bitcode files would obviously be supported, and retain
current semantics by BitcodeReader dropping the "release" part of any
single ordering, and using the result as the "failure ordering" (as
currently documented in the LanguageRef).

I have attached initial patches for reference, but I'm not asking for
thorough review at this stage.

What do people think?

Cheers.

Tim.
 
CD: 4ms