memory barrier

From personal wiki
Revision as of 09:17, 12 November 2019 (Tue) by free6d1823


Memory barrier is the general term applied to an instruction, or sequence of instructions, that forces synchronization events by a PE with respect to retiring Load/Store instructions.

The memory barriers defined by the Armv8 architecture provide a range of functionality, including:
• Ordering of Load/Store instructions.
• Completion of Load/Store instructions.
• Context synchronization.


arch/arm64/include/asm

#define dsb(opt)	asm volatile("dsb " #opt : : : "memory")
#define dmb(opt)	asm volatile("dmb " #opt : : : "memory")
 
#define mb()		dsb(sy)
#define rmb()		dsb(ld)
#define wmb()		dsb(st)
 
#define dma_rmb()	dmb(oshld)
#define dma_wmb()	dmb(oshst)
 
#define smp_mb()	dmb(ish)
#define smp_rmb()	dmb(ishld)
#define smp_wmb()	dmb(ishst)
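  • dmb Data Memory Barrier: orders memory accesses, so that accesses before the dmb are observed before accesses after it. It does not wait for those accesses to complete, and it does not affect the ordering of other instructions.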
DMB <option>|#<imm>

option:

ISH Inner Shareable is the required shareability domain, reads and writes are the required access types, both before and after the barrier instruction.
ISHLD Inner Shareable is the required shareability domain, reads are the required access type before the barrier instruction, and reads and writes are the required access types after the barrier instruction.
ISHST Inner Shareable is the required shareability domain, writes are the required access type, both before and after the barrier instruction.
OSHLD Outer Shareable is the required shareability domain, reads are the required access type before the barrier instruction, and reads and writes are the required access types after the barrier instruction.
OSHST Outer Shareable is the required shareability domain, writes are the required access type, both before and after the barrier instruction.
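
The ordering role of smp_wmb() (dmb ishst) and smp_rmb() (dmb ishld) can be sketched with a message-passing pattern. This is a minimal portable sketch that uses C11 fences as the closest standard analogue. A release or acquire fence is not an exact equivalent of the Arm options listed above, and the names payload, ready, publish and try_consume are made up for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;                /* plain data, published through the flag */
static atomic_bool ready = false;  /* flag that the consumer polls */

/* Producer: write the data, then order it before the flag store.
 * This is the role smp_wmb() / "dmb ishst" plays in kernel code. */
static void publish(int value)
{
	payload = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer: read the flag, then order it before the data load.
 * This is the role smp_rmb() / "dmb ishld" plays in kernel code. */
static bool try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return false;          /* nothing published yet */
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return true;
}
```

Without the two fences, another PE could observe ready == true and still read a stale payload, because the two accesses on each side are otherwise unordered.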
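  • dsb Data Synchronization Barrier: completes only when all memory accesses before it (of the access types and shareability domain selected by the option) have completed. No instruction after the dsb executes until the dsb completes.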
DSB <option>|#<imm>

option:

SY Full system is the required shareability domain, reads and writes are the required access types, both before and after the barrier instruction. This option is referred to as the full system barrier.
ST Full system is the required shareability domain, writes are the required access type, both before and after the barrier instruction. 
LD Full system is the required shareability domain, reads are the required access type before the barrier instruction, and reads and writes are the required access types after the barrier instruction. 
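
A common use of a full barrier such as mb() (dsb sy) is to make sure a descriptor is written before a device is told to read it. The sketch below is only an illustration under stated assumptions: my_mb() is a hypothetical wrapper (named to avoid clashing with the kernel macro), struct fake_device is an ordinary struct in normal memory rather than real device registers, and the non-Arm fallback only orders accesses without modelling the completion guarantee of dsb.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's mb(). */
static inline void my_mb(void)
{
#if defined(__aarch64__)
	__asm__ volatile("dsb sy" : : : "memory");
#else
	/* Fallback for other targets: ordering only, no completion guarantee. */
	atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Hypothetical device layout, used only for illustration. */
struct fake_device {
	volatile uint32_t descriptor;
	volatile uint32_t doorbell;
};

/* Write the descriptor, wait for it to complete, then ring the doorbell. */
static void ring_doorbell(struct fake_device *dev, uint32_t desc)
{
	dev->descriptor = desc;
	my_mb();
	dev->doorbell = 1;
}
```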


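  • isb Instruction Synchronization Barrier: flushes the pipeline, so that all instructions after the isb are fetched from cache or memory only after the isb completes. This makes the effects of earlier context-changing operations (for example, system register writes) visible to the instructions that follow.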
  • Memory type

Inner Shareable, Outer Shareable, and Non-shareable Normal memory

The Arm architecture abstracts the system as a series of Inner and Outer Shareability domains.
Each Inner Shareability domain contains a set of observers that are data coherent for each member of that set for data accesses with the Inner Shareable attribute made by any member of that set.
Each Outer Shareability domain contains a set of observers that are data coherent for each member of that set for data accesses with the Outer Shareable attribute made by any member of that set.
All observers in an Inner Shareability domain are always members of the same Outer Shareability domain.


Non-shareable Normal memory: accessed only by a single PE.

==================================================
  • Atomic operation

arch/arm64/include/asm

#define atomic_inc(v)			atomic_add(1, (v))
#define atomic_dec(v)			atomic_sub(1, (v))
 
 
static inline void atomic_add(int i, atomic_t *v)
{
	register int w0 asm ("w0") = i;
	register atomic_t *x1 asm ("x1") = v;
 
	asm volatile(ARM64_LSE_ATOMIC_INSN(__LL_SC_ATOMIC(add),
	"	stadd	%w[i], %[v]\n")  //STADD atomic add, without return
	: [i] "+r" (w0), [v] "+Q" (v->counter)
	: "r" (x1)
	: __LL_SC_CLOBBERS);
}
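
For comparison, the same no-return atomic add can be sketched in portable C11. This is not the kernel implementation: my_atomic_t and the my_atomic_* helpers are hypothetical names, and the relaxed memory order is chosen on the assumption that a no-return add such as the one above carries no ordering guarantee. The instruction the compiler emits depends on the target and its flags.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's atomic_t. */
typedef struct {
	atomic_int counter;
} my_atomic_t;

/* Atomic add that discards the old value, like a no-return add. */
static inline void my_atomic_add(int i, my_atomic_t *v)
{
	(void)atomic_fetch_add_explicit(&v->counter, i, memory_order_relaxed);
}

static inline void my_atomic_sub(int i, my_atomic_t *v)
{
	(void)atomic_fetch_sub_explicit(&v->counter, i, memory_order_relaxed);
}

/* Same layering as the atomic_inc/atomic_dec macros above. */
static inline void my_atomic_inc(my_atomic_t *v) { my_atomic_add(1, v); }
static inline void my_atomic_dec(my_atomic_t *v) { my_atomic_sub(1, v); }

static inline int my_atomic_read(my_atomic_t *v)
{
	return atomic_load_explicit(&v->counter, memory_order_relaxed);
}
```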
  • Exclusive Access

For memory locations for which the shareability attribute is Non-shareable, the exclusive access instructions rely on a local Exclusives monitor, or local monitor, that marks any address from which the PE executes a Load-Exclusive instruction.

Exclusive monitor:

Exclusive Access state
Open access state
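
The two monitor states can be illustrated with a toy software model. This is a simplified sketch, not how hardware implements the monitor: all names are hypothetical, the marked address is matched exactly instead of by granule, and a store-exclusive to an unmarked address is simply treated as a failure (the architecture leaves that case IMPLEMENTATION DEFINED).

```c
#include <stdbool.h>
#include <stdint.h>

enum monitor_state { MON_OPEN, MON_EXCLUSIVE };

struct local_monitor {
	enum monitor_state state;
	uintptr_t marked;   /* address marked by the last load-exclusive */
};

/* Models a Load-Exclusive: read the value and enter Exclusive Access state. */
static uint32_t load_exclusive(struct local_monitor *m, const uint32_t *addr)
{
	m->state = MON_EXCLUSIVE;
	m->marked = (uintptr_t)addr;
	return *addr;
}

/* Models a Store-Exclusive: returns 0 on success and 1 on failure.
 * The monitor goes back to Open Access state in both cases. */
static int store_exclusive(struct local_monitor *m, uint32_t *addr, uint32_t val)
{
	bool ok = (m->state == MON_EXCLUSIVE && m->marked == (uintptr_t)addr);

	m->state = MON_OPEN;
	if (!ok)
		return 1;
	*addr = val;
	return 0;
}

/* Models CLREX: drop any marking. */
static void clear_exclusive(struct local_monitor *m)
{
	m->state = MON_OPEN;
}

/* Typical retry loop built on the pair: repeat until the store succeeds. */
static void exclusive_increment(struct local_monitor *m, uint32_t *addr)
{
	uint32_t old;

	do {
		old = load_exclusive(m, addr);
	} while (store_exclusive(m, addr, old + 1) != 0);
}
```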


A Load-Exclusive instruction marks a small block of memory for exclusive access. The size of the marked block is IMPLEMENTATION DEFINED: block size = 2^a bytes, where a is between 4 and 11 (16 to 2048 bytes, that is, 4 to 512 words).

For example, with a=4 (16-byte granule), a successful LDXRB of address 0x341B4 marks the block 0x341B0~0x341BF.
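
The arithmetic behind this example is a simple alignment: clear the low a bits of the address to get the first byte, then add 2^a - 1 to get the last. A small sketch, where marked_block_for and struct marked_block are names made up for illustration:

```c
#include <stdint.h>

struct marked_block {
	uint64_t first;   /* first byte of the marked block */
	uint64_t last;    /* last byte of the marked block */
};

/* a is log2 of the granule size in bytes (4..11 per the text above). */
static struct marked_block marked_block_for(uint64_t addr, unsigned a)
{
	uint64_t size = (uint64_t)1 << a;
	struct marked_block b;

	b.first = addr & ~(size - 1);   /* align down to the granule */
	b.last = b.first + size - 1;
	return b;
}
```

With addr = 0x341B4 and a = 4, this gives 0x341B0 to 0x341BF, matching the example above.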

Reference: ARM Architecture Reference Manual ARMv8 for ARMv8-A architecture profile