memory_order

Defined in header `<stdatomic.h>`
enum memory_order { memory_order_relaxed, memory_order_consume, memory_order_acquire, memory_order_release, memory_order_acq_rel, memory_order_seq_cst };		(since C11)

memory_order specifies how non-atomic memory accesses are to be ordered around an atomic operation. The rationale is that when several threads simultaneously read and write several variables on a multi-core system, one thread may see the values change in an order different from the order in which another thread wrote them. The apparent order of changes may also differ between reader threads. Ensuring that all memory accesses to atomic variables are sequential may hurt performance in some cases. memory_order allows the programmer to specify the exact constraints that the compiler must enforce.

A custom memory order can be specified for each atomic operation in the library through an additional parameter, which is accepted by the `_explicit` variants of the atomic functions. The default is memory_order_seq_cst.
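
The difference between the default and an explicitly chosen order can be sketched as follows. This is only a minimal illustration: the variable `counter` and the function `default_vs_explicit` are hypothetical names that do not come from the library.

```c
#include <stdatomic.h>

// Hypothetical shared counter used only for this illustration.
static atomic_int counter;

int default_vs_explicit(void)
{
    // No order argument: behaves as if memory_order_seq_cst had been passed.
    atomic_store(&counter, 1);

    // _explicit variant: the extra parameter selects the memory order.
    atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);

    // Loads take the order as their last argument as well.
    return atomic_load_explicit(&counter, memory_order_acquire);
}
```

Called from a single thread, the function returns 2; the chosen orders only matter once other threads access `counter` concurrently.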

Defined in header `<stdatomic.h>`
Value	Explanation
`memory_order_relaxed`	Relaxed ordering: there are no constraints on reordering of memory accesses around the atomic variable.
`memory_order_consume`	Consume operation: no reads in the current thread dependent on the value currently loaded can be reordered before this load. This ensures that writes to dependent variables in other threads that release the same atomic variable are visible in the current thread. On most platforms, this affects compiler optimization only.
`memory_order_acquire`	Acquire operation: no reads in the current thread can be reordered before this load. This ensures that all writes in other threads that release the same atomic variable are visible in the current thread.
`memory_order_release`	Release operation: no writes in the current thread can be reordered after this store. This ensures that all writes in the current thread are visible in other threads that acquire the same atomic variable.
`memory_order_acq_rel`	Acquire-release operation: no reads in the current thread can be reordered before this load, and no writes in the current thread can be reordered after this store. The operation is a read-modify-write operation. All writes in other threads that release the same atomic variable are visible before the modification, and the modification is visible in other threads that acquire the same atomic variable.
`memory_order_seq_cst`	Sequential ordering. The operation has the same semantics as an acquire-release operation, and additionally has sequentially-consistent operation ordering.

Relaxed ordering

Atomic operations tagged memory_order_relaxed exhibit the following properties:

No ordering of other memory accesses is ensured whatsoever. This means that it is not possible to synchronize several threads using the atomic variable.
Reads and writes to the atomic variable itself are ordered. Once a thread reads a value, a subsequent read by the same thread from the same object cannot yield an earlier value.

For example, with x and y initially zero,

// Assumed declarations: atomic_int x = 0, y = 0; r1 and r2 are plain ints.
// Thread 1:
r1 = atomic_load_explicit(&y, memory_order_relaxed);
atomic_store_explicit(&x, r1, memory_order_relaxed);
// Thread 2:
r2 = atomic_load_explicit(&x, memory_order_relaxed);
atomic_store_explicit(&y, 42, memory_order_relaxed);

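is allowed to produce r1 == 42 and r2 == 42: the store to y in thread 2 may become visible before its load from x, so thread 1 can read 42 from y and store it to x before thread 2 reads x.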
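
Relaxed operations remain useful when only atomicity matters and no other data is published through the variable. The following is a minimal sketch of a shared event counter, assuming the implementation provides `<threads.h>`; the names `hits`, `worker`, and `count_with_relaxed` and the sizes are hypothetical, and error handling is omitted for brevity.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

enum { NUM_THREADS = 4, INCREMENTS = 10000 };  // hypothetical sizes

static atomic_long hits;

// Each worker bumps the shared counter; only atomicity is needed here.
static int worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; ++i)
        atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
    return 0;
}

// Runs the workers and returns the final count.
long count_with_relaxed(void)
{
    thrd_t t[NUM_THREADS];
    atomic_store_explicit(&hits, 0, memory_order_relaxed);
    for (int i = 0; i < NUM_THREADS; ++i)
        thrd_create(&t[i], worker, NULL);
    for (int i = 0; i < NUM_THREADS; ++i)
        thrd_join(t[i], NULL);  // joining synchronizes with thread completion
    return atomic_load_explicit(&hits, memory_order_relaxed);
}
```

No increment is lost because each `atomic_fetch_add_explicit` is an indivisible read-modify-write, so the function returns `NUM_THREADS * INCREMENTS`. The final value is read reliably only because `thrd_join` provides the synchronization that the relaxed operations do not.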

Release-Consume ordering

If an atomic store is tagged memory_order_release and an atomic load from the same variable is tagged memory_order_consume, the operations exhibit the following properties:

No writes in the writer thread can be reordered after the atomic store.
No reads or writes dependent on the value received from the atomic load can be reordered before the atomic load. "Dependent on" means that the address or value is computed from the value of the atomic variable. This form of synchronization between threads is known as "dependency ordering".
The synchronization is established only between the threads releasing and consuming the same atomic variable. Other threads can see a different order of memory accesses than either or both of the synchronized threads.
The synchronization is transitive. That is, if we have the following situation:

Thread A releases atomic variable a.
Thread B consumes atomic variable a.
Atomic variable b is dependent on a.
Thread B releases atomic variable b.
Thread C consumes or acquires atomic variable b.

Then not only are A and B synchronized and B and C synchronized, but A and C are synchronized as well. That is, all writes by thread A that were made before the release of a are guaranteed to be complete once thread C observes the store to b.

On all mainstream CPUs other than DEC Alpha, dependency ordering is automatic: no additional CPU instructions are issued for this synchronization mode, and only certain compiler optimizations are affected (e.g. the compiler is prohibited from performing speculative loads on the objects that are involved in the dependency chain).
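
A typical use of release-consume ordering is publishing a pointer. The sketch below assumes `<threads.h>` is available; `struct message`, `slot`, `published`, and the function names are hypothetical, and error handling is omitted. Compilers may implement memory_order_consume as the stronger memory_order_acquire, which does not change the result shown here.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

struct message { int payload; };            // hypothetical payload type

static struct message slot;
static struct message *_Atomic published;   // NULL until the producer publishes

static int producer(void *arg)
{
    (void)arg;
    slot.payload = 42;                       // plain write before the release
    atomic_store_explicit(&published, &slot, memory_order_release);
    return 0;
}

static int consumer(void *arg)
{
    struct message *m;
    // Spin until the pointer has been published.
    while ((m = atomic_load_explicit(&published, memory_order_consume)) == NULL)
        thrd_yield();
    // m->payload is computed from the loaded pointer, so it carries a
    // dependency and the producer's write to payload is visible here.
    *(int *)arg = m->payload;
    return 0;
}

int consume_demo(void)
{
    int seen = 0;
    thrd_t p, c;
    atomic_store_explicit(&published, NULL, memory_order_relaxed);
    thrd_create(&c, consumer, &seen);
    thrd_create(&p, producer, NULL);
    thrd_join(p, NULL);
    thrd_join(c, NULL);
    return seen;
}
```

Only accesses that depend on the loaded pointer are ordered. A read of an unrelated plain variable in `consumer` would not be covered by this form of synchronization.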

Release sequence

If some atomic is store-released and several other threads perform read-modify-write operations on that atomic, a "release sequence" is formed: all threads that perform read-modify-write operations on the same atomic synchronize with the first thread and with each other, even if those operations do not have memory_order_release semantics. This makes single-producer, multiple-consumer situations possible without imposing unnecessary synchronization between individual consumer threads.
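
One way to picture a release sequence is the three-thread sketch below, which assumes `<threads.h>` is available. The names `data`, `stage`, `writer`, `bumper`, and `reader` are hypothetical and error handling is omitted. The relaxed read-modify-write in `bumper` continues the sequence headed by the store-release in `writer`, so an acquire load that reads the bumped value still synchronizes with `writer`.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

static int data;          // plain, non-atomic payload
static atomic_int stage;  // 0 = nothing yet, 1 = data written, 2 = bumped

static int writer(void *arg)
{
    (void)arg;
    data = 42;
    // Head of the release sequence.
    atomic_store_explicit(&stage, 1, memory_order_release);
    return 0;
}

static int bumper(void *arg)
{
    (void)arg;
    while (atomic_load_explicit(&stage, memory_order_relaxed) != 1)
        thrd_yield();
    // Relaxed read-modify-write: continues the writer's release sequence.
    atomic_fetch_add_explicit(&stage, 1, memory_order_relaxed);
    return 0;
}

static int reader(void *arg)
{
    // Acquire load that eventually reads the value written by the bumper.
    while (atomic_load_explicit(&stage, memory_order_acquire) != 2)
        thrd_yield();
    *(int *)arg = data;   // the writer's store to data is visible here
    return 0;
}

int release_sequence_demo(void)
{
    int seen = 0;
    thrd_t w, b, r;
    data = 0;
    atomic_store_explicit(&stage, 0, memory_order_relaxed);
    thrd_create(&r, reader, &seen);
    thrd_create(&b, bumper, NULL);
    thrd_create(&w, writer, NULL);
    thrd_join(w, NULL);
    thrd_join(b, NULL);
    thrd_join(r, NULL);
    return seen;
}
```

If `bumper` used a plain relaxed store instead of a read-modify-write, the sequence would be broken and `reader` would no longer be guaranteed to see the write to `data`.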

Release-Acquire ordering

If an atomic store is tagged memory_order_release and an atomic load from the same variable is tagged memory_order_acquire, the operations exhibit the following properties:

No writes in the writer thread can be reordered after the atomic store.
No reads in the reader thread can be reordered before the atomic load.
The synchronization is established only between the threads releasing and acquiring the same atomic variable. Other threads can see a different order of memory accesses than either or both of the synchronized threads.
The synchronization is transitive. That is, if we have the following situation:

Thread A releases atomic variable a.
Thread B acquires atomic variable a.
Thread B releases atomic variable b.
Thread C consumes or acquires atomic variable b.

Then not only are A and B synchronized and B and C synchronized, but A and C are synchronized as well. That is, all writes by thread A that were made before the release of a are guaranteed to be complete once thread C observes the store to b.

On strongly-ordered systems (x86, SPARC, IBM mainframe), release-acquire ordering is automatic. No additional CPU instructions are issued for this synchronization mode, and only certain compiler optimizations are affected (e.g. the compiler is prohibited from moving non-atomic stores past the atomic store-release or performing non-atomic loads earlier than the atomic load-acquire).
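
The usual illustration of release-acquire ordering is message passing through a flag. The following is a minimal sketch under the assumption that `<threads.h>` is available; `payload`, `ready`, and the function names are hypothetical, and error handling is omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <threads.h>

static int payload;        // plain, non-atomic data
static atomic_bool ready;  // flag that publishes the data

static int sender(void *arg)
{
    (void)arg;
    payload = 42;  // must not be reordered after the release store
    atomic_store_explicit(&ready, true, memory_order_release);
    return 0;
}

static int receiver(void *arg)
{
    // Wait until the flag is set; the acquire load pairs with the release store.
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        thrd_yield();
    *(int *)arg = payload;  // must not be reordered before the acquire load
    return 0;
}

int acquire_release_demo(void)
{
    int seen = 0;
    thrd_t s, r;
    payload = 0;
    atomic_store_explicit(&ready, false, memory_order_relaxed);
    thrd_create(&r, receiver, &seen);
    thrd_create(&s, sender, NULL);
    thrd_join(s, NULL);
    thrd_join(r, NULL);
    return seen;
}
```

Unlike the consume case, the read in `receiver` does not need to depend on the loaded value: every write made by `sender` before the release store is visible after the acquire load observes `true`.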

Sequentially-consistent ordering

If an atomic store is tagged memory_order_seq_cst and an atomic load from the same variable is tagged memory_order_seq_cst, then the operations exhibit the following properties:

No writes in the writer thread can be reordered after the atomic store.
No reads in the reader thread can be reordered before the atomic load.
The synchronization is established between all atomic operations tagged memory_order_seq_cst. All threads using such atomic operations see the same order of memory accesses.

Sequential ordering is necessary for many multiple-producer, multiple-consumer situations where all consumers must observe the actions of all producers occurring in the same order.

Total sequential ordering requires a full memory fence CPU instruction on all multi-core systems. This may become a performance bottleneck since it forces all memory accesses to propagate to every thread.
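
The effect of the single total order can be sketched with two threads that each store to one variable and then load the other. With memory_order_seq_cst on all four operations, at least one thread must observe the other's store, so both loads cannot return 0; with weaker orders that outcome would be permitted. The names below are hypothetical, `<threads.h>` is assumed to be available, and error handling is omitted.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

static atomic_int x, y;
static int r1, r2;  // each written by one thread, read only after joining

static int store_x_load_y(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_seq_cst);
    r1 = atomic_load_explicit(&y, memory_order_seq_cst);
    return 0;
}

static int store_y_load_x(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_seq_cst);
    r2 = atomic_load_explicit(&x, memory_order_seq_cst);
    return 0;
}

// Returns how many of the given rounds ended with r1 == 0 and r2 == 0.
int count_both_zero(int rounds)
{
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        thrd_t a, b;
        atomic_store_explicit(&x, 0, memory_order_seq_cst);
        atomic_store_explicit(&y, 0, memory_order_seq_cst);
        thrd_create(&a, store_x_load_y, NULL);
        thrd_create(&b, store_y_load_x, NULL);
        thrd_join(a, NULL);
        thrd_join(b, NULL);
        if (r1 == 0 && r2 == 0)
            ++both_zero;
    }
    return both_zero;
}
```

Whichever store comes first in the total order, the other thread's load follows its own store in that order and therefore sees the value 1, so the function returns 0 for any number of rounds.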

Relationship with volatile

Within a thread of execution, accesses (reads and writes) to all volatile objects are guaranteed not to be reordered relative to each other, but this order is not guaranteed to be observed by another thread, since volatile access does not establish inter-thread synchronization. In addition, volatile accesses are not atomic (a concurrent read and write is a data race) and do not order memory (non-volatile memory accesses may be freely reordered around the volatile access). One notable exception is Visual Studio, where every volatile write has release semantics and every volatile read has acquire semantics (MSDN), so volatiles may be used for inter-thread synchronization there. Standard volatile semantics are not applicable to multithreaded programming, although they are sufficient for e.g. communication with a signal handler (see also atomic_signal_fence).
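
The signal-handler case can be sketched with a `volatile sig_atomic_t` flag. This is only an illustration: the names `got_signal`, `on_signal`, and `signal_flag_demo` are hypothetical, and SIGINT is chosen arbitrarily.

```c
#include <signal.h>

// Flag shared between the handler and the code it interrupts.
static volatile sig_atomic_t got_signal;

static void on_signal(int sig)
{
    (void)sig;
    got_signal = 1;  // the only thing the handler does
}

int signal_flag_demo(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);            // the handler runs before raise returns
    signal(SIGINT, SIG_DFL);  // restore the default disposition
    return got_signal;
}
```

The handler and the interrupted code run in the same thread, which is why volatile is enough here; the same flag shared between two threads would need an atomic type instead.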

Examples
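
As one possible example, a spinlock can be built from atomic_flag by acquiring on lock and releasing on unlock. This is a sketch, not a production lock: `spin_lock`, `spin_unlock`, `shared_total`, and the sizes are hypothetical names, `<threads.h>` is assumed to be available, and error handling is omitted.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

enum { LOCK_THREADS = 4, LOCK_ROUNDS = 10000 };  // hypothetical sizes

static atomic_flag lock = ATOMIC_FLAG_INIT;
static long shared_total;  // plain variable protected by the lock

static void spin_lock(void)
{
    // Acquire: accesses in the critical section cannot move before this.
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
        thrd_yield();
}

static void spin_unlock(void)
{
    // Release: accesses in the critical section cannot move after this.
    atomic_flag_clear_explicit(&lock, memory_order_release);
}

static int locked_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < LOCK_ROUNDS; ++i) {
        spin_lock();
        ++shared_total;  // non-atomic update, safe while the lock is held
        spin_unlock();
    }
    return 0;
}

long spinlock_demo(void)
{
    thrd_t t[LOCK_THREADS];
    shared_total = 0;
    for (int i = 0; i < LOCK_THREADS; ++i)
        thrd_create(&t[i], locked_worker, NULL);
    for (int i = 0; i < LOCK_THREADS; ++i)
        thrd_join(t[i], NULL);
    return shared_total;
}
```

Each unlock releases the flag and the next successful lock acquires it, so every update to `shared_total` is visible to the next holder and the function returns `LOCK_THREADS * LOCK_ROUNDS`.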

See also

C++ documentation for memory order
