linux.git/arch/arm/include/asm/atomic.h, branch v3.17

arch,arm: Convert smp_mb__*()

2014-04-18T09:40:32+00:00

ARM implements all regular atomic ops with ll/sc primitives that do not
imply barriers; therefore smp_mb__{before,after} need to be full barriers.

Since ARM doesn't use asm-generic/barrier.h, include the required
definitions in its asm/barrier.h.

Signed-off-by: Peter Zijlstra 
Acked-by: Paul E. McKenney 
Link: http://lkml.kernel.org/n/tip-yijo7sglsl7uusbp13upcuvo@git.kernel.org
Cc: Albin Tonnerre 
Cc: Catalin Marinas 
Cc: Chen Gang 
Cc: Linus Torvalds 
Cc: Nicolas Pitre 
Cc: Russell King 
Cc: Victor Kamensky 
Cc: Will Deacon 
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar
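
As a rough userspace illustration of the rule above (not the kernel code),
the C11 sketch below brackets a relaxed read-modify-write with full fences,
which is the role smp_mb__{before,after} play on ARM. The names fake_smp_mb
and counter_inc_ordered are hypothetical, and
atomic_thread_fence(memory_order_seq_cst) is only assumed here as a stand-in
for a full barrier.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a full memory barrier. */
static inline void fake_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * The relaxed increment implies no ordering by itself, much like an
 * ll/sc atomic op, so full barriers are added on both sides.
 * Returns the value held before the increment.
 */
static inline int counter_inc_ordered(atomic_int *v)
{
	int old;

	fake_smp_mb();		/* plays the role of smp_mb__before */
	old = atomic_fetch_add_explicit(v, 1, memory_order_relaxed);
	fake_smp_mb();		/* plays the role of smp_mb__after */
	return old;
}
```

If the barrier helpers were no-ops, the increment would still be atomic but
would no longer order the surrounding memory accesses.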

ARM: 7983/1: atomics: implement a better __atomic_add_unless for v6+

2014-02-25T11:35:08+00:00

Looking at perf profiles of multi-threaded hackbench runs, a significant
performance hit appears to manifest from the cmpxchg loop used to
implement the 32-bit atomic_add_unless function. This can be mitigated
by writing a direct implementation of __atomic_add_unless which doesn't
require iteration outside of the atomic operation.

Signed-off-by: Will Deacon 
Signed-off-by: Russell King
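
For reference, the cmpxchg-loop shape that the message above describes can
be sketched in portable C11 as follows. This is only an illustration of the
semantics ("add a to v unless v equals u, return the old value"); the name
add_unless_cmpxchg is hypothetical, and the direct ll/sc implementation the
commit introduces cannot be expressed in portable C.

```c
#include <stdatomic.h>

/*
 * Add 'a' to '*v' unless '*v' equals 'u'; return the value observed
 * before the (possible) update. Every failed compare-and-swap costs
 * another trip around the outer loop, which is the overhead a direct
 * implementation avoids.
 */
static inline int add_unless_cmpxchg(atomic_int *v, int a, int u)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	while (old != u) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			break;
	}
	return old;
}
```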

ARM: 7984/1: prefetch: add prefetchw invocations for barriered atomics

2014-02-25T11:30:20+00:00

After a bunch of benchmarking on the interaction between dmb and pldw,
it turns out that issuing the pldw *after* the dmb instruction can
give modest performance gains (~3% atomic_add_return improvement on a
dual A15).

This patch adds prefetchw invocations to our barriered atomic operations
including cmpxchg, test_and_xxx and futexes.

Signed-off-by: Will Deacon 
Signed-off-by: Russell King
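
The ordering described above (barrier first, write prefetch second, then the
atomic update) can be sketched in userspace C as below. This assumes a GCC
or Clang compiler for __builtin_prefetch, uses a C11 fence as a stand-in for
dmb, and the name add_return_barriered is hypothetical; whether the prefetch
helps at all depends on the CPU.

```c
#include <stdatomic.h>

/* Barriered add-and-return: fence, write prefetch, update, fence. */
static inline int add_return_barriered(atomic_int *v, int i)
{
	int old;

	atomic_thread_fence(memory_order_seq_cst);	/* leading barrier */
	__builtin_prefetch((const void *)v, 1);		/* prefetch for write, after the barrier */
	old = atomic_fetch_add_explicit(v, i, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* trailing barrier */
	return old + i;
}
```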

Merge branch 'devel-stable' into for-next

2013-11-12T10:58:59+00:00

Conflicts:
	arch/arm/include/asm/atomic.h
	arch/arm/include/asm/hardirq.h
	arch/arm/kernel/smp.c

ARM: 7868/1: arm/arm64: remove atomic_clear_mask() in "include/asm/atomic.h"

2013-11-09T00:00:13+00:00

In the current kernel-wide source code, leaving aside code belonging to
other architectures, only the s390 scsi drivers use atomic_clear_mask(),
and arm/arm64 need not support s390 drivers.

So remove atomic_clear_mask() from "arm[64]/include/asm/atomic.h".

Signed-off-by: Chen Gang 
Signed-off-by: Will Deacon 
Signed-off-by: Russell King

ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval' in atomic_cmpxchg().

2013-11-09T00:00:12+00:00

For atomic_cmpxchg(), the type of 'oldval' needs to be 'int' to match the
type of "*ptr" (used by the 'ldrex' instruction) and 'old' (used by the
'teq' instruction).

Reviewed-by: Will Deacon 
Signed-off-by: Chen Gang 
Signed-off-by: Will Deacon 
Signed-off-by: Russell King

ARM: 7866/1: include: asm: use 'long long' instead of 'u64' within atomic.h

2013-11-09T00:00:10+00:00

atomic* values are signed, and atomic* functions also need to process
signed values (parameters and return values), so 32-bit arm needs to
use 'long long' instead of 'u64'.

After the replacement, this also fixes a bug in atomic64_add_negative():
"u64 is never less than 0".

The modifications are:

  in vim, use the "1,% s/\<u64\>/long long/g" command.
  remove '__aligned(8)', which is useless for 64-bit.
  keep within the 80-column limit after the replacement.

Acked-by: Will Deacon 
Signed-off-by: Chen Gang 
Signed-off-by: Will Deacon 
Signed-off-by: Russell King
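
The "u64 is never less than 0" bug can be reproduced with a small
standalone sketch. The function names below are hypothetical and the bodies
are plain arithmetic rather than the kernel's atomic64 code; the point is
only the signedness of the result type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Buggy shape: the sum is unsigned, so it can never compare below zero. */
static inline bool add_negative_u64(uint64_t v, uint64_t i)
{
	uint64_t zero = 0;	/* a variable, so the tautology compiles quietly */
	uint64_t sum = v + i;

	return sum < zero;	/* always false for an unsigned type */
}

/* Fixed shape: a signed 64-bit sum can be tested for a negative result. */
static inline bool add_negative_ll(long long v, long long i)
{
	long long sum = v + i;

	return sum < 0;
}
```

Adding -2 to 1 reports a negative result only with the signed version.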

Merge branch 'baserock/bjdooks/312-rc4/be/core-v3' of git://git.baserock.org/delta/linux into devel-stable

2013-10-30T22:20:26+00:00

Conflicts:
	arch/arm/kernel/head.S

This series has been well tested and it would be great to get this
merged now.

Signed-off-by: Russell King

ARM: atomic64: fix endian-ness in atomic.h

2013-10-19T19:46:36+00:00

Fix the inline asm for the atomic64_xxx functions in arm atomic.h. Instead
of %H operand specifiers, the code should use %Q for the least significant
part of the value and %R for the most significant part. %H always returns
the higher of the two register numbers, and is therefore not endian
neutral. %H should be used with ldrexd and strexd instructions.

Signed-off-by: Victor Kamensky 
Acked-by: Will Deacon 
Signed-off-by: Ben Dooks
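
The idea of picking halves by significance rather than by position can be
sketched in plain C. This is an analogy only, not the inline asm from the
patch: low_word and high_word are hypothetical helpers standing in for what
%Q and %R select, and add64_by_halves mimics a 64-bit add done as a low-word
add followed by a high-word add with carry.

```c
#include <stdint.h>

/* Analogue of %Q: the least significant 32 bits, chosen by value. */
static inline uint32_t low_word(uint64_t v)
{
	return (uint32_t)v;
}

/* Analogue of %R: the most significant 32 bits, chosen by value. */
static inline uint32_t high_word(uint64_t v)
{
	return (uint32_t)(v >> 32);
}

/* 64-bit add built from 32-bit halves with an explicit carry. */
static inline uint64_t add64_by_halves(uint64_t a, uint64_t b)
{
	uint32_t lo = low_word(a) + low_word(b);
	uint32_t carry = lo < low_word(a);	/* 1 if the low add wrapped */
	uint32_t hi = high_word(a) + high_word(b) + carry;

	return ((uint64_t)hi << 32) | lo;
}
```

Because the halves are selected by shifting the value, the result is the
same on little-endian and big-endian machines. Selecting them by storage
position (as %H effectively does by register number) would not be.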

ARM: atomics: prefetch the destination word for write prior to strex

2013-09-30T15:42:56+00:00

The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.

This patch prefixes our atomic access implementations with pldw
instructions (on CPUs which support them) to try and grab the line in
exclusive state from the start. Only the barrier-less functions are
updated, since memory barriers can limit the usefulness of prefetching
data.

Acked-by: Nicolas Pitre 
Signed-off-by: Will Deacon
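
A userspace approximation of the barrier-less pattern is sketched below,
assuming a GCC or Clang compiler for __builtin_prefetch. The name
add_llsc_style is hypothetical, the relaxed load and weak compare-and-swap
only loosely mirror an exclusive load/store pair, and the prefetch is a hint
whose benefit depends on the hardware.

```c
#include <stdatomic.h>

/* Relaxed atomic add written as a prefetch followed by a retry loop. */
static inline void add_llsc_style(atomic_int *v, int i)
{
	int old;

	/* Ask for the cache line in a writable state before the update. */
	__builtin_prefetch((const void *)v, 1);

	old = atomic_load_explicit(v, memory_order_relaxed);
	/* Retry until the store succeeds; 'old' is refreshed on failure. */
	while (!atomic_compare_exchange_weak_explicit(v, &old, old + i,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;
}
```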