<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/powerpc/kernel, branch v4.18</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'pci-v4.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci</title>
<updated>2018-08-02T17:59:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-08-02T17:59:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ef46808b79ebbf48587c556436e5281ec51d09b5'/>
<id>ef46808b79ebbf48587c556436e5281ec51d09b5</id>
<content type='text'>
Pull PCI fixes from Bjorn Helgaas:

 - Fix integer overflow in new mobiveil driver (Dan Carpenter)

 - Fix race during NVMe removal/rescan (Hari Vyas)

* tag 'pci-v4.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Fix is_added/is_busmaster race condition
  PCI: mobiveil: Avoid integer overflow in IB_WIN_SIZE
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull PCI fixes from Bjorn Helgaas:

 - Fix integer overflow in new mobiveil driver (Dan Carpenter)

 - Fix race during NVMe removal/rescan (Hari Vyas)

* tag 'pci-v4.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Fix is_added/is_busmaster race condition
  PCI: mobiveil: Avoid integer overflow in IB_WIN_SIZE
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI: Fix is_added/is_busmaster race condition</title>
<updated>2018-07-31T16:27:54+00:00</updated>
<author>
<name>Hari Vyas</name>
<email>hari.vyas@broadcom.com</email>
</author>
<published>2018-07-03T09:05:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=44bda4b7d26e9fffed6d7152d98a2e9edaeb2a76'/>
<id>44bda4b7d26e9fffed6d7152d98a2e9edaeb2a76</id>
<content type='text'>
When a PCI device is detected, pdev-&gt;is_added is set to 1 and proc and
sysfs entries are created.

When the device is removed, pdev-&gt;is_added is checked for one and then
device is detached with clearing of proc and sys entries and at end,
pdev-&gt;is_added is set to 0.

is_added and is_busmaster are bit fields in pci_dev structure sharing same
memory location.

A strange issue was observed with multiple removal and rescan of a PCIe
NVMe device using sysfs commands where is_added flag was observed as zero
instead of one while removing device and proc,sys entries are not cleared.
This causes issue in later device addition with warning message
"proc_dir_entry" already registered.

Debugging revealed a race condition between the PCI core setting the
is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue
setting the is_busmaster bit in pci_set_master().  As these fields are not
handled atomically, that clears the is_added bit.

Move the is_added bit to a separate private flag variable and use atomic
functions to set and retrieve the device addition state.  This avoids the
race because is_added no longer shares a memory location with is_busmaster.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283
Signed-off-by: Hari Vyas &lt;hari.vyas@broadcom.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a PCI device is detected, pdev-&gt;is_added is set to 1 and proc and
sysfs entries are created.

When the device is removed, pdev-&gt;is_added is checked for one and then
device is detached with clearing of proc and sys entries and at end,
pdev-&gt;is_added is set to 0.

is_added and is_busmaster are bit fields in pci_dev structure sharing same
memory location.

A strange issue was observed with multiple removal and rescan of a PCIe
NVMe device using sysfs commands where is_added flag was observed as zero
instead of one while removing device and proc,sys entries are not cleared.
This causes issue in later device addition with warning message
"proc_dir_entry" already registered.

Debugging revealed a race condition between the PCI core setting the
is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue
setting the is_busmaster bit in pci_set_master().  As these fields are not
handled atomically, that clears the is_added bit.

Move the is_added bit to a separate private flag variable and use atomic
functions to set and retrieve the device addition state.  This avoids the
race because is_added no longer shares a memory location with is_busmaster.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283
Signed-off-by: Hari Vyas &lt;hari.vyas@broadcom.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/powernv: Fix save/restore of SPRG3 on entry/exit from stop (idle)</title>
<updated>2018-07-18T10:40:17+00:00</updated>
<author>
<name>Gautham R. Shenoy</name>
<email>ego@linux.vnet.ibm.com</email>
</author>
<published>2018-07-18T08:33:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b03897cf318dfc47de33a7ecbc7655584266f034'/>
<id>b03897cf318dfc47de33a7ecbc7655584266f034</id>
<content type='text'>
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.

SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored.  As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 (Power9 only) and above.

Thus in this situation, on POWER9, any call from sched_getcpu() always
returns zero, as on powerpc, we call __kernel_getcpu() which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.

Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.

Fixes: e1c1cfed5432 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Florian Weimer &lt;fweimer@redhat.com&gt;
Signed-off-by: Gautham R. Shenoy &lt;ego@linux.vnet.ibm.com&gt;
Reviewed-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.

SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored.  As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 (Power9 only) and above.

Thus in this situation, on POWER9, any call from sched_getcpu() always
returns zero, as on powerpc, we call __kernel_getcpu() which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.

Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.

Fixes: e1c1cfed5432 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Florian Weimer &lt;fweimer@redhat.com&gt;
Signed-off-by: Gautham R. Shenoy &lt;ego@linux.vnet.ibm.com&gt;
Reviewed-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'kbuild-fixes-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild</title>
<updated>2018-06-30T20:05:30+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-06-30T20:05:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=22d3e0c36e10296da91e3bb8826592bacfbf4b9c'/>
<id>22d3e0c36e10296da91e3bb8826592bacfbf4b9c</id>
<content type='text'>
Pull Kbuild fixes from Masahiro Yamada:

 - introduce __diag_* macros and suppress -Wattribute-alias warnings
   from GCC 8

 - fix stack protector test script for x86_64

 - fix line number handling in Kconfig

 - document that '#' starts a comment in Kconfig

 - handle P_SYMBOL property in dump debugging of Kconfig

 - correct help message of LD_DEAD_CODE_DATA_ELIMINATION

 - fix occasional segmentation faults in Kconfig

* tag 'kbuild-fixes-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: loop boundary condition fix
  kbuild: reword help of LD_DEAD_CODE_DATA_ELIMINATION
  kconfig: handle P_SYMBOL in print_symbol()
  kconfig: document Kconfig source file comments
  kconfig: fix line numbers for if-entries in menu tree
  stack-protector: Fix test with 32-bit userland and CONFIG_64BIT=y
  powerpc: Remove -Wattribute-alias pragmas
  disable -Wattribute-alias warning for SYSCALL_DEFINEx()
  kbuild: add macro for controlling warnings to linux/compiler.h
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull Kbuild fixes from Masahiro Yamada:

 - introduce __diag_* macros and suppress -Wattribute-alias warnings
   from GCC 8

 - fix stack protector test script for x86_64

 - fix line number handling in Kconfig

 - document that '#' starts a comment in Kconfig

 - handle P_SYMBOL property in dump debugging of Kconfig

 - correct help message of LD_DEAD_CODE_DATA_ELIMINATION

 - fix occasional segmentation faults in Kconfig

* tag 'kbuild-fixes-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: loop boundary condition fix
  kbuild: reword help of LD_DEAD_CODE_DATA_ELIMINATION
  kconfig: handle P_SYMBOL in print_symbol()
  kconfig: document Kconfig source file comments
  kconfig: fix line numbers for if-entries in menu tree
  stack-protector: Fix test with 32-bit userland and CONFIG_64BIT=y
  powerpc: Remove -Wattribute-alias pragmas
  disable -Wattribute-alias warning for SYSCALL_DEFINEx()
  kbuild: add macro for controlling warnings to linux/compiler.h
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Remove -Wattribute-alias pragmas</title>
<updated>2018-06-25T14:21:13+00:00</updated>
<author>
<name>Paul Burton</name>
<email>paul.burton@mips.com</email>
</author>
<published>2018-06-19T20:14:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ac8517440344dbe598f7c1c23e686c800b563061'/>
<id>ac8517440344dbe598f7c1c23e686c800b563061</id>
<content type='text'>
With SYSCALL_DEFINEx() disabling -Wattribute-alias generically, there's
no need to duplicate that for PowerPC syscalls.

This reverts commit 415520373975 ("powerpc: fix build failure by
disabling attribute-alias warning in pci_32") and commit 2479bfc9bc60
("powerpc: Fix build by disabling attribute-alias warning for
SYSCALL_DEFINEx").

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Acked-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Masahiro Yamada &lt;yamada.masahiro@socionext.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With SYSCALL_DEFINEx() disabling -Wattribute-alias generically, there's
no need to duplicate that for PowerPC syscalls.

This reverts commit 415520373975 ("powerpc: fix build failure by
disabling attribute-alias warning in pci_32") and commit 2479bfc9bc60
("powerpc: Fix build by disabling attribute-alias warning for
SYSCALL_DEFINEx").

Signed-off-by: Paul Burton &lt;paul.burton@mips.com&gt;
Acked-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Signed-off-by: Masahiro Yamada &lt;yamada.masahiro@socionext.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2018-06-24T12:18:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-06-24T12:18:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2ce413ec1694eca3a4fa738b6d9007c728a0d40a'/>
<id>2ce413ec1694eca3a4fa738b6d9007c728a0d40a</id>
<content type='text'>
Pull rseq fixes from Thomas Gleixer:
 "A pile of rseq related fixups:

   - Prevent infinite recursion when delivering SIGSEGV

   - Remove the abort of rseq critical section on fork() as syscalls
     inside rseq critical sections are explicitely forbidden. So no
     point in doing the abort on the child.

   - Align the rseq structure on 32 bytes in the ARM selftest code.

   - Fix file permissions of the test script"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq: Avoid infinite recursion when delivering SIGSEGV
  rseq/cleanup: Do not abort rseq c.s. in child on fork()
  rseq/selftests/arm: Align 'struct rseq_cs' on 32 bytes
  rseq/selftests: Make run_param_test.sh executable
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull rseq fixes from Thomas Gleixer:
 "A pile of rseq related fixups:

   - Prevent infinite recursion when delivering SIGSEGV

   - Remove the abort of rseq critical section on fork() as syscalls
     inside rseq critical sections are explicitely forbidden. So no
     point in doing the abort on the child.

   - Align the rseq structure on 32 bytes in the ARM selftest code.

   - Fix file permissions of the test script"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq: Avoid infinite recursion when delivering SIGSEGV
  rseq/cleanup: Do not abort rseq c.s. in child on fork()
  rseq/selftests/arm: Align 'struct rseq_cs' on 32 bytes
  rseq/selftests: Make run_param_test.sh executable
</pre>
</div>
</content>
</entry>
<entry>
<title>rseq: Avoid infinite recursion when delivering SIGSEGV</title>
<updated>2018-06-22T17:04:22+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2018-06-22T10:45:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=784e0300fe9fe4aa81bd7df9d59e138f56bb605b'/>
<id>784e0300fe9fe4aa81bd7df9d59e138f56bb605b</id>
<content type='text'>
When delivering a signal to a task that is using rseq, we call into
__rseq_handle_notify_resume() so that the registers pushed in the
sigframe are updated to reflect the state of the restartable sequence
(for example, ensuring that the signal returns to the abort handler if
necessary).

However, if the rseq management fails due to an unrecoverable fault when
accessing userspace or certain combinations of RSEQ_CS_* flags, then we
will attempt to deliver a SIGSEGV. This has the potential for infinite
recursion if the rseq code continuously fails on signal delivery.

Avoid this problem by using force_sigsegv() instead of force_sig(), which
is explicitly designed to reset the SEGV handler to SIG_DFL in the case
of a recursive fault. In doing so, remove rseq_signal_deliver() from the
internal rseq API and have an optional struct ksignal * parameter to
rseq_handle_notify_resume() instead.

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: peterz@infradead.org
Cc: paulmck@linux.vnet.ibm.com
Cc: boqun.feng@gmail.com
Link: https://lkml.kernel.org/r/1529664307-983-1-git-send-email-will.deacon@arm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When delivering a signal to a task that is using rseq, we call into
__rseq_handle_notify_resume() so that the registers pushed in the
sigframe are updated to reflect the state of the restartable sequence
(for example, ensuring that the signal returns to the abort handler if
necessary).

However, if the rseq management fails due to an unrecoverable fault when
accessing userspace or certain combinations of RSEQ_CS_* flags, then we
will attempt to deliver a SIGSEGV. This has the potential for infinite
recursion if the rseq code continuously fails on signal delivery.

Avoid this problem by using force_sigsegv() instead of force_sig(), which
is explicitly designed to reset the SEGV handler to SIG_DFL in the case
of a recursive fault. In doing so, remove rseq_signal_deliver() from the
internal rseq API and have an optional struct ksignal * parameter to
rseq_handle_notify_resume() instead.

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: peterz@infradead.org
Cc: paulmck@linux.vnet.ibm.com
Cc: boqun.feng@gmail.com
Link: https://lkml.kernel.org/r/1529664307-983-1-git-send-email-will.deacon@arm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64s: Fix build failures with CONFIG_NMI_IPI=n</title>
<updated>2018-06-19T13:03:50+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2018-06-19T11:51:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e08ecba17b72aeb01859601bc242a5bc48620109'/>
<id>e08ecba17b72aeb01859601bc242a5bc48620109</id>
<content type='text'>
I broke the build when CONFIG_NMI_IPI=n with my recent commit to add
arch_trigger_cpumask_backtrace(), eg:

  stacktrace.c:(.text+0x1b0): undefined reference to `.smp_send_safe_nmi_ipi'

We should rework the CONFIG symbols here in future to avoid these
double barrelled ifdefs but for now they fix the build.

Fixes: 5cc05910f26e ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")
Reported-by: Christophe LEROY &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I broke the build when CONFIG_NMI_IPI=n with my recent commit to add
arch_trigger_cpumask_backtrace(), eg:

  stacktrace.c:(.text+0x1b0): undefined reference to `.smp_send_safe_nmi_ipi'

We should rework the CONFIG symbols here in future to avoid these
double barrelled ifdefs but for now they fix the build.

Fixes: 5cc05910f26e ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")
Reported-by: Christophe LEROY &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64: hard disable irqs on the panic()ing CPU</title>
<updated>2018-06-19T11:28:23+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2018-05-19T04:35:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=855b6232dda2b6941ecd22979893e8a1d25642db'/>
<id>855b6232dda2b6941ecd22979893e8a1d25642db</id>
<content type='text'>
Similar to previous patches, hard disable interrupts when a CPU is
in panic. This reduces the chance the watchdog has to interfere with
the panic, and avoids any other type of masked interrupt being
executed when crashing which minimises the length of the crash path.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similar to previous patches, hard disable interrupts when a CPU is
in panic. This reduces the chance the watchdog has to interfere with
the panic, and avoids any other type of masked interrupt being
executed when crashing which minimises the length of the crash path.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: smp_send_stop do not offline stopped CPUs</title>
<updated>2018-06-19T11:28:23+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2018-05-19T04:35:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=de6e5d38417e6cdb005843db420a2974993d36ff'/>
<id>de6e5d38417e6cdb005843db420a2974993d36ff</id>
<content type='text'>
Marking CPUs stopped by smp_send_stop as offline can cause warnings
due to cross-CPU wakeups. This trace was noticed on a busy system
running a sysrq+c crash test, after the injected crash:

WARNING: CPU: 51 PID: 1546 at kernel/sched/core.c:1179 set_task_cpu+0x22c/0x240
CPU: 51 PID: 1546 Comm: kworker/u352:1 Tainted: G      D
Workqueue: mlx5e mlx5e_update_stats_work [mlx5_core]
[...]
NIP [c00000000017c21c] set_task_cpu+0x22c/0x240
LR [c00000000017d580] try_to_wake_up+0x230/0x720
Call Trace:
[c000000001017700] runqueues+0x0/0xb00 (unreliable)
[c00000000017d580] try_to_wake_up+0x230/0x720
[c00000000015a214] insert_work+0x104/0x140
[c00000000015adb0] __queue_work+0x230/0x690
[c000003fc5007910] [c00000000015b26c] queue_work_on+0x5c/0x90
[c0080000135fc8f8] mlx5_cmd_exec+0x538/0xcb0 [mlx5_core]
[c008000013608fd0] mlx5_core_access_reg+0x140/0x1d0 [mlx5_core]
[c00800001362777c] mlx5e_update_pport_counters.constprop.59+0x6c/0x90 [mlx5_core]
[c008000013628868] mlx5e_update_ndo_stats+0x28/0x90 [mlx5_core]
[c008000013625558] mlx5e_update_stats_work+0x68/0xb0 [mlx5_core]
[c00000000015bcec] process_one_work+0x1bc/0x5f0
[c00000000015ecac] worker_thread+0xac/0x6b0
[c000000000168338] kthread+0x168/0x1b0
[c00000000000b628] ret_from_kernel_thread+0x5c/0xb4

This happens because firstly the CPU is not really offline in the
usual sense, processes and interrupts have not been migrated away.
Secondly smp_send_stop does not happen atomically on all CPUs, so
one CPU can have marked itself offline, while another CPU is still
running processes or interrupts which can affect the first CPU.

Fix this by just not marking the CPU as offline. It's more like
frozen in time, so offline does not really reflect its state properly
anyway. There should be nothing in the crash/panic path that walks
online CPUs and synchronously waits for them, so this change should
not introduce new hangs.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Marking CPUs stopped by smp_send_stop as offline can cause warnings
due to cross-CPU wakeups. This trace was noticed on a busy system
running a sysrq+c crash test, after the injected crash:

WARNING: CPU: 51 PID: 1546 at kernel/sched/core.c:1179 set_task_cpu+0x22c/0x240
CPU: 51 PID: 1546 Comm: kworker/u352:1 Tainted: G      D
Workqueue: mlx5e mlx5e_update_stats_work [mlx5_core]
[...]
NIP [c00000000017c21c] set_task_cpu+0x22c/0x240
LR [c00000000017d580] try_to_wake_up+0x230/0x720
Call Trace:
[c000000001017700] runqueues+0x0/0xb00 (unreliable)
[c00000000017d580] try_to_wake_up+0x230/0x720
[c00000000015a214] insert_work+0x104/0x140
[c00000000015adb0] __queue_work+0x230/0x690
[c000003fc5007910] [c00000000015b26c] queue_work_on+0x5c/0x90
[c0080000135fc8f8] mlx5_cmd_exec+0x538/0xcb0 [mlx5_core]
[c008000013608fd0] mlx5_core_access_reg+0x140/0x1d0 [mlx5_core]
[c00800001362777c] mlx5e_update_pport_counters.constprop.59+0x6c/0x90 [mlx5_core]
[c008000013628868] mlx5e_update_ndo_stats+0x28/0x90 [mlx5_core]
[c008000013625558] mlx5e_update_stats_work+0x68/0xb0 [mlx5_core]
[c00000000015bcec] process_one_work+0x1bc/0x5f0
[c00000000015ecac] worker_thread+0xac/0x6b0
[c000000000168338] kthread+0x168/0x1b0
[c00000000000b628] ret_from_kernel_thread+0x5c/0xb4

This happens because firstly the CPU is not really offline in the
usual sense, processes and interrupts have not been migrated away.
Secondly smp_send_stop does not happen atomically on all CPUs, so
one CPU can have marked itself offline, while another CPU is still
running processes or interrupts which can affect the first CPU.

Fix this by just not marking the CPU as offline. It's more like
frozen in time, so offline does not really reflect its state properly
anyway. There should be nothing in the crash/panic path that walks
online CPUs and synchronously waits for them, so this change should
not introduce new hangs.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
</feed>
