linux.git/arch/arm/include/asm/processor.h, branch v5.1

ARM: avoid Cortex-A9 livelock on tight dmb loops

2019-02-01T22:05:50+00:00

machine_crash_nonpanic_core() does this:

	while (1)
		cpu_relax();

because the kernel has crashed, and we have no known safe way to deal
with the CPU.  So, we place the CPU into an infinite loop which we
expect it to never exit - at least not until the system as a whole is
reset by some method.

In the absence of erratum 754327, this code assembles to:

	b	.

In other words, an infinite loop.  When erratum 754327 is enabled,
this becomes:

1:	dmb
	b	1b

It has been observed that on some systems (eg, OMAP4) where, if a
crash is triggered, the system tries to kexec into the panic kernel,
but fails after taking the secondary CPU down - placing it into one
of these loops.  This causes the system to livelock, and the most
noticable effect is the system stops after issuing:

	Loading crashdump kernel...

to the system console.

The tested as working solution I came up with was to add wfe() to
these infinite loops thusly:

	while (1) {
		cpu_relax();
		wfe();
	}

which, without 754327 builds to:

1:	wfe
	b	1b

or with 754327 is enabled:

1:	dmb
	wfe
	b	1b

Adding "wfe" does two things depending on the environment we're running
under:
- where we're running on bare metal, and the processor implements
  "wfe", it stops us spinning endlessly in a loop where we're never
  going to do any useful work.
- if we're running in a VM, it allows the CPU to be given back to the
  hypervisor and rescheduled for other purposes (maybe a different VM)
  rather than wasting CPU cycles inside a crashed VM.

However, in light of erratum 794072, Will Deacon wanted to see 10 nops
as well - which is reasonable to cover the case where we have erratum
754327 enabled _and_ we have a processor that doesn't implement the
wfe hint.

So, we now end up with:

1:      wfe
        b       1b

when erratum 754327 is disabled, or:

1:      dmb
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        wfe
        b       1b

when erratum 754327 is enabled.  We also get the dmb + 10 nop
sequence elsewhere in the kernel, in terminating loops.

This is reasonable - it means we get the workaround for erratum
794072 when erratum 754327 is enabled, but still relinquish the dead
processor - either by placing it in a lower power mode when wfe is
implemented as such or by returning it to the hypervisior, or in the
case where wfe is a no-op, we use the workaround specified in erratum
794072 to avoid the problem.

These as two entirely orthogonal problems - the 10 nops addresses
erratum 794072, and the wfe is an optimisation that makes the system
more efficient when crashed either in terms of power consumption or
by allowing the host/other VMs to make use of the CPU.

I don't see any reason not to use kexec() inside a VM - it has the
potential to provide automated recovery from a failure of the VMs
kernel with the opportunity for saving a crashdump of the failure.
A panic() with a reboot timeout won't do that, and reading the
libvirt documentation, setting on_reboot to "preserve" won't either
(the documentation states "The preserve action for an on_reboot event
is treated as a destroy".)  Surely it has to be a good thing to
avoiding having CPUs spinning inside a VM that is doing no useful
work.

Acked-by: Will Deacon 
Signed-off-by: Russell King

treewide: remove current_text_addr

2018-10-31T15:54:12+00:00

Prefer _THIS_IP_ defined in linux/kernel.h.

Most definitions of current_text_addr were the same as _THIS_IP_, but
a few archs had inline assembly instead.

This patch removes the final call site of current_text_addr, making all
of the definitions dead code.

[akpm@linux-foundation.org: fix arch/csky/include/asm/processor.h]
Link: http://lkml.kernel.org/r/20180911182413.180715-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers 
Cc: Peter Zijlstra 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

arm: Implement thread_struct whitelist for hardened usercopy

2018-01-15T20:08:06+00:00

While ARM32 carries FPU state in the thread structure that is saved and
restored during signal handling, it doesn't need to declare a usercopy
whitelist, since existing accessors are all either using a bounce buffer
(for which whitelisting isn't checking the slab), are statically sized
(which will bypass the hardened usercopy check), or both.

Cc: Russell King 
Cc: Ingo Molnar 
Cc: Christian Borntraeger 
Cc: "Peter Zijlstra (Intel)" 
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Kees Cook

ARM: start_thread(): don't always clear all regs

2017-09-10T23:31:46+00:00

The elf_fdpic binary format driver has to initialize extra registers
beside the stack and program counter as required by the corresponding
ABI. So reinstate them after the regs structure has been cleared.

While at it let's get rid of start_thread_nommu().

Signed-off-by: Nicolas Pitre 
Acked-by: Mickael GUENE 
Tested-by: Vincent Abriou 
Tested-by: Andras Szemzo

locking/core: Provide common cpu_relax_yield() definition

2016-11-17T07:17:36+00:00

No need to duplicate the same define everywhere. Since
the only user is stop-machine and the only provider is
s390, we can use a default implementation of cpu_relax_yield()
in sched.h.

Suggested-by: Russell King 
Signed-off-by: Christian Borntraeger 
Reviewed-by: David Hildenbrand 
Acked-by: Russell King 
Cc: Andrew Morton 
Cc: Catalin Marinas 
Cc: Heiko Carstens 
Cc: Linus Torvalds 
Cc: Martin Schwidefsky 
Cc: Nicholas Piggin 
Cc: Noam Camus 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-s390 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar

locking/core, arch: Remove cpu_relax_lowlatency()

2016-11-16T09:15:11+00:00

As there are no users left, we can remove cpu_relax_lowlatency()
implementations from every architecture.

Signed-off-by: Christian Borntraeger 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Catalin Marinas 
Cc: Heiko Carstens 
Cc: Linus Torvalds 
Cc: Martin Schwidefsky 
Cc: Nicholas Piggin 
Cc: Noam Camus 
Cc: Peter Zijlstra 
Cc: Russell King 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Cc: 
Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar

locking/core: Introduce cpu_relax_yield()

2016-11-16T09:15:09+00:00

For spinning loops people do often use barrier() or cpu_relax().
For most architectures cpu_relax and barrier are the same, but on
some architectures cpu_relax can add some latency.
For example on power,sparc64 and arc, cpu_relax can shift the CPU
towards other hardware threads in an SMT environment.
On s390 cpu_relax does even more, it uses an hypercall to the
hypervisor to give up the timeslice.
In contrast to the SMT yielding this can result in larger latencies.
In some places this latency is unwanted, so another variant
"cpu_relax_lowlatency" was introduced. Before this is used in more
and more places, lets revert the logic and provide a cpu_relax_yield
that can be called in places where yielding is more important than
latency. By default this is the same as cpu_relax on all architectures.

Signed-off-by: Christian Borntraeger 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: Catalin Marinas 
Cc: Heiko Carstens 
Cc: Linus Torvalds 
Cc: Martin Schwidefsky 
Cc: Nicholas Piggin 
Cc: Noam Camus 
Cc: Peter Zijlstra 
Cc: Russell King 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar

arch, locking: Ciao arch_mutex_cpu_relax()

2014-07-17T10:32:47+00:00

The arch_mutex_cpu_relax() function, introduced by 34b133f, is
hacky and ugly. It was added a few years ago to address the fact
that common cpu_relax() calls include yielding on s390, and thus
impact the optimistic spinning functionality of mutexes. Nowadays
we use this function well beyond mutexes: rwsem, qrwlock, mcs and
lockref. Since the macro that defines the call is in the mutex header,
any users must include mutex.h and the naming is misleading as well.

This patch (i) renames the call to cpu_relax_lowlatency  ("relax, but
only if you can do it with very low latency") and (ii) defines it in
each arch's asm/processor.h local header, just like for regular cpu_relax
functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax,
and thus we can take it out of mutex.h. While this can seem redundant,
I believe it is a good choice as it allows us to move out arch specific
logic from generic locking primitives and enables future(?) archs to
transparently define it, similarly to System Z.

Signed-off-by: Davidlohr Bueso 
Signed-off-by: Peter Zijlstra 
Cc: Andrew Morton 
Cc: Anton Blanchard 
Cc: Aurelien Jacquiot 
Cc: Benjamin Herrenschmidt 
Cc: Bharat Bhushan 
Cc: Catalin Marinas 
Cc: Chen Liqin 
Cc: Chris Metcalf 
Cc: Christian Borntraeger 
Cc: Chris Zankel 
Cc: David Howells 
Cc: David S. Miller 
Cc: Deepthi Dharwar 
Cc: Dominik Dingel 
Cc: Fenghua Yu 
Cc: Geert Uytterhoeven 
Cc: Guan Xuetao 
Cc: Haavard Skinnemoen 
Cc: Hans-Christian Egtvedt 
Cc: Heiko Carstens 
Cc: Helge Deller 
Cc: Hirokazu Takata 
Cc: Ivan Kokshaysky 
Cc: James E.J. Bottomley 
Cc: James Hogan 
Cc: Jason Wang 
Cc: Jesper Nilsson 
Cc: Joe Perches 
Cc: Jonas Bonn 
Cc: Joseph Myers 
Cc: Kees Cook 
Cc: Koichi Yasutake 
Cc: Lennox Wu 
Cc: Linus Torvalds 
Cc: Mark Salter 
Cc: Martin Schwidefsky 
Cc: Matt Turner 
Cc: Max Filippov 
Cc: Michael Neuling 
Cc: Michal Simek 
Cc: Mikael Starvik 
Cc: Nicolas Pitre 
Cc: Paolo Bonzini 
Cc: Paul Burton 
Cc: Paul E. McKenney 
Cc: Paul Gortmaker 
Cc: Paul Mackerras 
Cc: Qais Yousef 
Cc: Qiaowei Ren 
Cc: Rafael Wysocki 
Cc: Ralf Baechle 
Cc: Richard Henderson 
Cc: Richard Kuo 
Cc: Russell King 
Cc: Steven Miao 
Cc: Steven Rostedt 
Cc: Stratos Karafotis 
Cc: Tim Chen 
Cc: Tony Luck 
Cc: Vasily Kulikov 
Cc: Vineet Gupta 
Cc: Vineet Gupta 
Cc: Waiman Long 
Cc: Will Deacon 
Cc: Wolfram Sang 
Cc: adi-buildroot-devel@lists.sourceforge.net
Cc: linux390@de.ibm.com
Cc: linux-alpha@vger.kernel.org
Cc: linux-am33-list@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-cris-kernel@axis.com
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux@lists.openrisc.net
Cc: linux-m32r-ja@ml.linux-m32r.org
Cc: linux-m32r@ml.linux-m32r.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-metag@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: sparclinux@vger.kernel.org
Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net
Signed-off-by: Ingo Molnar

ARM: prefetch: add support for prefetchw using pldw on SMP ARMv7+ CPUs

2013-09-30T15:42:55+00:00

SMP ARMv7 CPUs implement the pldw instruction, which allows them to
prefetch data cachelines in an exclusive state.

This patch defines the prefetchw macro using pldw for CPUs that support
it.

Acked-by: Nicolas Pitre 
Signed-off-by: Will Deacon

ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.h

2013-09-30T15:42:55+00:00

Patching UP/SMP alternatives inside inline assembly blocks is useful
outside of the spinlock implementation, where it is used for sev and wfe.

This patch lifts the macro into processor.h and gives it a scarier name
to (a) avoid conflicts in the global namespace and (b) to try and deter
its usage unless you "know what you're doing". The W macro for generating
wide instructions when targetting Thumb-2 is also made available under
the name WASM, to reduce the potential for conflicts with other headers.

Acked-by: Nicolas Pitre 
Signed-off-by: Will Deacon