<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/kernel, branch v4.4.261</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>futex: fix spin_lock() / spin_unlock_irq() imbalance</title>
<updated>2021-03-11T12:46:35+00:00</updated>
<author>
<name>Thomas Schoebel-Theuer</name>
<email>tst@1und1.de</email>
</author>
<published>2021-03-07T07:23:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ad83307d1e625704cbf8e88de2c66dc8b175899e'/>
<id>ad83307d1e625704cbf8e88de2c66dc8b175899e</id>
<content type='text'>
This patch and problem analysis is specific for 4.4 LTS, due to incomplete
backporting of other fixes. Later LTS series have different backports.

The following is obviously incorrect:

static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this,
             struct futex_hash_bucket *hb)
{
[...]
	raw_spin_lock(&amp;pi_state-&gt;pi_mutex.wait_lock);
[...]
	raw_spin_unlock_irq(&amp;pi_state-&gt;pi_mutex.wait_lock);
[...]
}

The 4.4-specific fix should probably go in the direction of
b4abf91047c,
making everything irq-safe.

Probably, backporting of b4abf91047c
to 4.4 LTS could thus be another good idea.

However, this might involve some more 4.4-specific work and
require thorough testing:

&gt; git log --oneline v4.4..b4abf91047c -- kernel/futex.c kernel/locking/rtmutex.c | wc -l
10

So this patch is just an obvious quickfix for now.

Hint: the lock order is documented in 4.9.y and later. A similar
documenting is missing in 4.4.y. Please somebody either backport also,
or write a new description, if there would be some differences I cannot
easily see at the moment. Without reliable docs,
inspection of the locking correctness may become a pain.
 
Signed-off-by: Thomas Schoebel-Theuer &lt;tst@1und1.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Lee Jones &lt;lee.jones@linaro.org&gt;
Fixes: 394fc4981426 ("futex: Rework inconsistent rt_mutex/futex_q state")
Fixes: 6510e4a2d04f ("futex,rt_mutex: Provide futex specific rt_mutex API")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch and problem analysis is specific for 4.4 LTS, due to incomplete
backporting of other fixes. Later LTS series have different backports.

The following is obviously incorrect:

static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this,
             struct futex_hash_bucket *hb)
{
[...]
	raw_spin_lock(&amp;pi_state-&gt;pi_mutex.wait_lock);
[...]
	raw_spin_unlock_irq(&amp;pi_state-&gt;pi_mutex.wait_lock);
[...]
}

The 4.4-specific fix should probably go in the direction of
b4abf91047c,
making everything irq-safe.

Probably, backporting of b4abf91047c
to 4.4 LTS could thus be another good idea.

However, this might involve some more 4.4-specific work and
require thorough testing:

&gt; git log --oneline v4.4..b4abf91047c -- kernel/futex.c kernel/locking/rtmutex.c | wc -l
10

So this patch is just an obvious quickfix for now.

Hint: the lock order is documented in 4.9.y and later. A similar
documenting is missing in 4.4.y. Please somebody either backport also,
or write a new description, if there would be some differences I cannot
easily see at the moment. Without reliable docs,
inspection of the locking correctness may become a pain.
 
Signed-off-by: Thomas Schoebel-Theuer &lt;tst@1und1.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Lee Jones &lt;lee.jones@linaro.org&gt;
Fixes: 394fc4981426 ("futex: Rework inconsistent rt_mutex/futex_q state")
Fixes: 6510e4a2d04f ("futex,rt_mutex: Provide futex specific rt_mutex API")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>futex: fix irq self-deadlock and satisfy assertion</title>
<updated>2021-03-11T12:46:35+00:00</updated>
<author>
<name>Thomas Schoebel-Theuer</name>
<email>tst@1und1.de</email>
</author>
<published>2021-03-07T07:26:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d68eefc0f6050e64dc57aefc0638eac7bb441050'/>
<id>d68eefc0f6050e64dc57aefc0638eac7bb441050</id>
<content type='text'>
This patch and problem analysis is specific for 4.4 LTS, due to incomplete
backporting of other fixes. Later LTS series have different backports.

Since v4.4.257 when CONFIG_PROVE_LOCKING=y
the following triggers right after reboot of our pre-life systems
which equal our production setup:

Mar 03 11:27:33 icpu-test-bap10 kernel: =================================
Mar 03 11:27:33 icpu-test-bap10 kernel: [ INFO: inconsistent lock state ]
Mar 03 11:27:33 icpu-test-bap10 kernel: 4.4.259-rc1-grsec+ #730 Not tainted
Mar 03 11:27:33 icpu-test-bap10 kernel: ---------------------------------
Mar 03 11:27:33 icpu-test-bap10 kernel: inconsistent {IN-HARDIRQ-W} -&gt; {HARDIRQ-ON-W} usage.
Mar 03 11:27:33 icpu-test-bap10 kernel: apache2-ssl/9310 [HC0[0]:SC0[0]:HE1:SE1] takes:
Mar 03 11:27:33 icpu-test-bap10 kernel:  (&amp;p-&gt;pi_lock){?.-.-.}, at: [&lt;ffffffff810abb68&gt;] pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel: {IN-HARDIRQ-W} state was registered at:
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81088c4a&gt;] __lock_acquire+0x3a7/0xe4a
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81089b01&gt;] lock_acquire+0x18d/0x1bc
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff8170151c&gt;] _raw_spin_lock_irqsave+0x3e/0x50
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff810719a5&gt;] try_to_wake_up+0x2c/0x210
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81071bf3&gt;] default_wake_function+0xd/0xf
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81083588&gt;] autoremove_wake_function+0x11/0x35
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff810830b2&gt;] __wake_up_common+0x48/0x7c
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff8108311a&gt;] __wake_up+0x34/0x46
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814c2a23&gt;] megasas_complete_int_cmd+0x31/0x33
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814c60a0&gt;] megasas_complete_cmd+0x570/0x57b
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814d05bc&gt;] complete_cmd_fusion+0x23e/0x33d
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814d0768&gt;] megasas_isr_fusion+0x67/0x74
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81091ae5&gt;] handle_irq_event_percpu+0x134/0x311
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81091cf5&gt;] handle_irq_event+0x33/0x51
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff810948b9&gt;] handle_edge_irq+0xa3/0xc2
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81005f7b&gt;] handle_irq+0xf9/0x101
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81005700&gt;] do_IRQ+0x80/0xf5
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81702228&gt;] ret_from_intr+0x0/0x20
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff8100cab0&gt;] arch_cpu_idle+0xa/0xc
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81083a5a&gt;] default_idle_call+0x1e/0x20
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81083b9d&gt;] cpu_startup_entry+0x141/0x22f
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff816fb853&gt;] rest_init+0x135/0x13b
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81d5ce99&gt;] start_kernel+0x3fa/0x40a
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81d5c2af&gt;] x86_64_start_reservations+0x2a/0x2c
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81d5c3d0&gt;] x86_64_start_kernel+0x11f/0x12c
Mar 03 11:27:33 icpu-test-bap10 kernel: irq event stamp: 1457
Mar 03 11:27:33 icpu-test-bap10 kernel: hardirqs last  enabled at (1457): [&lt;ffffffff81042a69&gt;] get_user_pages_fast+0xeb/0x14f
Mar 03 11:27:33 icpu-test-bap10 kernel: hardirqs last disabled at (1456): [&lt;ffffffff810429dd&gt;] get_user_pages_fast+0x5f/0x14f
Mar 03 11:27:33 icpu-test-bap10 kernel: softirqs last  enabled at (1446): [&lt;ffffffff815e127d&gt;] release_sock+0x142/0x14d
Mar 03 11:27:33 icpu-test-bap10 kernel: softirqs last disabled at (1444): [&lt;ffffffff815e116f&gt;] release_sock+0x34/0x14d
Mar 03 11:27:33 icpu-test-bap10 kernel:
                                        other info that might help us debug this:
Mar 03 11:27:33 icpu-test-bap10 kernel:  Possible unsafe locking scenario:
Mar 03 11:27:33 icpu-test-bap10 kernel:        CPU0
Mar 03 11:27:33 icpu-test-bap10 kernel:        ----
Mar 03 11:27:33 icpu-test-bap10 kernel:   lock(&amp;p-&gt;pi_lock);
Mar 03 11:27:33 icpu-test-bap10 kernel:   &lt;Interrupt&gt;
Mar 03 11:27:33 icpu-test-bap10 kernel:     lock(&amp;p-&gt;pi_lock);
Mar 03 11:27:33 icpu-test-bap10 kernel:
                                         *** DEADLOCK ***
Mar 03 11:27:33 icpu-test-bap10 kernel: 2 locks held by apache2-ssl/9310:
Mar 03 11:27:33 icpu-test-bap10 kernel:  #0:  (&amp;(&amp;(__futex_data.queues)[i].lock)-&gt;rlock){+.+...}, at: [&lt;ffffffff810ae4e6&gt;] do
Mar 03 11:27:33 icpu-test-bap10 kernel:  #1:  (&amp;lock-&gt;wait_lock){+.+...}, at: [&lt;ffffffff810ae53a&gt;] do_futex+0x639/0x809
Mar 03 11:27:33 icpu-test-bap10 kernel:
                                        stack backtrace:
Mar 03 11:27:33 icpu-test-bap10 kernel: CPU: 13 PID: 9310 UID: 99 Comm: apache2-ssl Not tainted 4.4.259-rc1-grsec+ #730
Mar 03 11:27:33 icpu-test-bap10 kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.11.0 11/02/2019
Mar 03 11:27:33 icpu-test-bap10 kernel:  0000000000000000 ffff883fb79bfc00 ffffffff816f8fc2 ffff883ffa66d300
Mar 03 11:27:33 icpu-test-bap10 kernel:  ffffffff8eaa71f0 ffff883fb79bfc50 ffffffff81088484 0000000000000000
Mar 03 11:27:33 icpu-test-bap10 kernel:  0000000000000001 0000000000000001 0000000000000002 ffff883ffa66db58
Mar 03 11:27:33 icpu-test-bap10 kernel: Call Trace:
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff816f8fc2&gt;] dump_stack+0x94/0xca
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81088484&gt;] print_usage_bug+0x1bc/0x1d1
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81087d76&gt;] ? check_usage_forwards+0x98/0x98
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810885a5&gt;] mark_lock+0x10c/0x203
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81088cb9&gt;] __lock_acquire+0x416/0xe4a
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] ? pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81089b01&gt;] lock_acquire+0x18d/0x1bc
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81089b01&gt;] ? lock_acquire+0x18d/0x1bc
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] ? pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81700d12&gt;] _raw_spin_lock+0x2a/0x39
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] ? pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810ae5af&gt;] do_futex+0x6ae/0x809
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810ae83d&gt;] SyS_futex+0x133/0x143
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff8100158a&gt;] ? syscall_trace_enter_phase2+0x1a2/0x1bb
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81701848&gt;] tracesys_phase2+0x90/0x95

Bisecting detects 47e452fcf2f
in the above specific scenario using apache-ssl,
but apparently the missing *_irq() was introduced in
34c8e1c2c02.

However, just reverting the old _irq() variants to a similar status
than before 34c8e1c2c02,
or using _irqsave() / _irqrestore() as some other backports are doing
in various places, would not really help.

The fundamental problem is the following violation of the assertion
lockdep_assert_held(&amp;pi_state-&gt;pi_mutex.wait_lock) in pi_state_update_owner():

Mar 03 12:50:03 icpu-test-bap10 kernel: ------------[ cut here ]------------
Mar 03 12:50:03 icpu-test-bap10 kernel: WARNING: CPU: 37 PID: 8488 at kernel/futex.c:844 pi_state_update_owner+0x3d/0xd7()
Mar 03 12:50:03 icpu-test-bap10 kernel: Modules linked in: xt_time xt_connlimit xt_connmark xt_NFLOG xt_limit xt_hashlimit veth ip_set_bitmap_port xt_DSCP xt_multiport ip_set_hash_ip xt_owner xt_set ip_set_hash_net xt_state xt_conntrack nf_conntrack_ftp mars lz4_decompress lz4_compress ipmi_devintf x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul hed ipmi_si ipmi_msghandler processor crc32c_intel ehci_pci ehci_hcd usbcore i40e usb_common
Mar 03 12:50:03 icpu-test-bap10 kernel: CPU: 37 PID: 8488 UID: 99 Comm: apache2-ssl Not tainted 4.4.259-rc1-grsec+ #737
Mar 03 12:50:03 icpu-test-bap10 kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.11.0 11/02/2019
Mar 03 12:50:03 icpu-test-bap10 kernel:  0000000000000000 ffff883f863f7c70 ffffffff816f9002 0000000000000000
Mar 03 12:50:03 icpu-test-bap10 kernel:  0000000000000009 ffff883f863f7ca8 ffffffff8104cda2 ffffffff810abac7
Mar 03 12:50:03 icpu-test-bap10 kernel:  ffff883ffbfe5e80 0000000000000000 ffff883f82ed4bc0 00007fc01c9bf000
Mar 03 12:50:03 icpu-test-bap10 kernel: Call Trace:
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff816f9002&gt;] dump_stack+0x94/0xca
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff8104cda2&gt;] warn_slowpath_common+0x94/0xad
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abac7&gt;] ? pi_state_update_owner+0x3d/0xd7
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff8104ce5f&gt;] warn_slowpath_null+0x15/0x17
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abac7&gt;] pi_state_update_owner+0x3d/0xd7
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abea8&gt;] free_pi_state+0x2d/0x73
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abf0b&gt;] unqueue_me_pi+0x1d/0x31
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810ad735&gt;] futex_lock_pi+0x27a/0x2e8
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff81088bca&gt;] ? __lock_acquire+0x327/0xe4a
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810ae6a9&gt;] do_futex+0x784/0x809
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810cfa9a&gt;] ? seccomp_phase1+0xde/0x1e7
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810a4503&gt;] ? current_kernel_time64+0xb/0x31
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810d23c3&gt;] ? current_kernel_time+0xb/0xf
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810ae861&gt;] SyS_futex+0x133/0x143
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff8100158a&gt;] ? syscall_trace_enter_phase2+0x1a2/0x1bb
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff81701888&gt;] tracesys_phase2+0x90/0x95
Mar 03 12:50:03 icpu-test-bap10 kernel: ---[ end trace 968f95a458dea951 ]---

In order to both (1) prevent the self-deadlock, and (2) to satisfy the assertion
at pi_state_update_owner(), some locking with irq disable is needed,
at least in the specific call stack.

Interestingly, there existed a suchalike locking just before
f08a4af5ccb.

This is just a quick hotfix, resurrecting some previous
locks at the old places, but now using -&gt;wait_lock in place
of the previous -&gt;pi_lock (which was in place before
f08a4af5ccb).

The -&gt;pi_lock is now also taken, by the new code
which had been introduced in
34c8e1c2c02.

When this patch is applied, both the above splats are
no longer triggering at my prelife machines.

Without this patch, I cannot ensure stable production at
1&amp;1 Ionos.

Hint for further work: I have not yet tested other call paths,
since I am under time pressure for security reasons.

Hint for further hardening of 4.4.y and probably some more LTS series:
Probably some more systematic testing with CONFIG_PROVE_LOCKING
(and probably some more options) should be invested
in order to make the 4.4 LTS series really "stable" again.

Signed-off-by: Thomas Schoebel-Theuer &lt;tst@1und1.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Lee Jones &lt;lee.jones@linaro.org&gt;
Fixes: f08a4af5ccb2 ("futex: Use pi_state_update_owner() in put_pi_state()")
Fixes: 34c8e1c2c025 ("futex: Provide and use pi_state_update_owner()")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch and problem analysis is specific for 4.4 LTS, due to incomplete
backporting of other fixes. Later LTS series have different backports.

Since v4.4.257 when CONFIG_PROVE_LOCKING=y
the following triggers right after reboot of our pre-life systems
which equal our production setup:

Mar 03 11:27:33 icpu-test-bap10 kernel: =================================
Mar 03 11:27:33 icpu-test-bap10 kernel: [ INFO: inconsistent lock state ]
Mar 03 11:27:33 icpu-test-bap10 kernel: 4.4.259-rc1-grsec+ #730 Not tainted
Mar 03 11:27:33 icpu-test-bap10 kernel: ---------------------------------
Mar 03 11:27:33 icpu-test-bap10 kernel: inconsistent {IN-HARDIRQ-W} -&gt; {HARDIRQ-ON-W} usage.
Mar 03 11:27:33 icpu-test-bap10 kernel: apache2-ssl/9310 [HC0[0]:SC0[0]:HE1:SE1] takes:
Mar 03 11:27:33 icpu-test-bap10 kernel:  (&amp;p-&gt;pi_lock){?.-.-.}, at: [&lt;ffffffff810abb68&gt;] pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel: {IN-HARDIRQ-W} state was registered at:
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81088c4a&gt;] __lock_acquire+0x3a7/0xe4a
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81089b01&gt;] lock_acquire+0x18d/0x1bc
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff8170151c&gt;] _raw_spin_lock_irqsave+0x3e/0x50
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff810719a5&gt;] try_to_wake_up+0x2c/0x210
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81071bf3&gt;] default_wake_function+0xd/0xf
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81083588&gt;] autoremove_wake_function+0x11/0x35
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff810830b2&gt;] __wake_up_common+0x48/0x7c
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff8108311a&gt;] __wake_up+0x34/0x46
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814c2a23&gt;] megasas_complete_int_cmd+0x31/0x33
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814c60a0&gt;] megasas_complete_cmd+0x570/0x57b
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814d05bc&gt;] complete_cmd_fusion+0x23e/0x33d
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814d0768&gt;] megasas_isr_fusion+0x67/0x74
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81091ae5&gt;] handle_irq_event_percpu+0x134/0x311
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81091cf5&gt;] handle_irq_event+0x33/0x51
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff810948b9&gt;] handle_edge_irq+0xa3/0xc2
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81005f7b&gt;] handle_irq+0xf9/0x101
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81005700&gt;] do_IRQ+0x80/0xf5
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81702228&gt;] ret_from_intr+0x0/0x20
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff8100cab0&gt;] arch_cpu_idle+0xa/0xc
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81083a5a&gt;] default_idle_call+0x1e/0x20
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81083b9d&gt;] cpu_startup_entry+0x141/0x22f
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff816fb853&gt;] rest_init+0x135/0x13b
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81d5ce99&gt;] start_kernel+0x3fa/0x40a
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81d5c2af&gt;] x86_64_start_reservations+0x2a/0x2c
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81d5c3d0&gt;] x86_64_start_kernel+0x11f/0x12c
Mar 03 11:27:33 icpu-test-bap10 kernel: irq event stamp: 1457
Mar 03 11:27:33 icpu-test-bap10 kernel: hardirqs last  enabled at (1457): [&lt;ffffffff81042a69&gt;] get_user_pages_fast+0xeb/0x14f
Mar 03 11:27:33 icpu-test-bap10 kernel: hardirqs last disabled at (1456): [&lt;ffffffff810429dd&gt;] get_user_pages_fast+0x5f/0x14f
Mar 03 11:27:33 icpu-test-bap10 kernel: softirqs last  enabled at (1446): [&lt;ffffffff815e127d&gt;] release_sock+0x142/0x14d
Mar 03 11:27:33 icpu-test-bap10 kernel: softirqs last disabled at (1444): [&lt;ffffffff815e116f&gt;] release_sock+0x34/0x14d
Mar 03 11:27:33 icpu-test-bap10 kernel:
                                        other info that might help us debug this:
Mar 03 11:27:33 icpu-test-bap10 kernel:  Possible unsafe locking scenario:
Mar 03 11:27:33 icpu-test-bap10 kernel:        CPU0
Mar 03 11:27:33 icpu-test-bap10 kernel:        ----
Mar 03 11:27:33 icpu-test-bap10 kernel:   lock(&amp;p-&gt;pi_lock);
Mar 03 11:27:33 icpu-test-bap10 kernel:   &lt;Interrupt&gt;
Mar 03 11:27:33 icpu-test-bap10 kernel:     lock(&amp;p-&gt;pi_lock);
Mar 03 11:27:33 icpu-test-bap10 kernel:
                                         *** DEADLOCK ***
Mar 03 11:27:33 icpu-test-bap10 kernel: 2 locks held by apache2-ssl/9310:
Mar 03 11:27:33 icpu-test-bap10 kernel:  #0:  (&amp;(&amp;(__futex_data.queues)[i].lock)-&gt;rlock){+.+...}, at: [&lt;ffffffff810ae4e6&gt;] do
Mar 03 11:27:33 icpu-test-bap10 kernel:  #1:  (&amp;lock-&gt;wait_lock){+.+...}, at: [&lt;ffffffff810ae53a&gt;] do_futex+0x639/0x809
Mar 03 11:27:33 icpu-test-bap10 kernel:
                                        stack backtrace:
Mar 03 11:27:33 icpu-test-bap10 kernel: CPU: 13 PID: 9310 UID: 99 Comm: apache2-ssl Not tainted 4.4.259-rc1-grsec+ #730
Mar 03 11:27:33 icpu-test-bap10 kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.11.0 11/02/2019
Mar 03 11:27:33 icpu-test-bap10 kernel:  0000000000000000 ffff883fb79bfc00 ffffffff816f8fc2 ffff883ffa66d300
Mar 03 11:27:33 icpu-test-bap10 kernel:  ffffffff8eaa71f0 ffff883fb79bfc50 ffffffff81088484 0000000000000000
Mar 03 11:27:33 icpu-test-bap10 kernel:  0000000000000001 0000000000000001 0000000000000002 ffff883ffa66db58
Mar 03 11:27:33 icpu-test-bap10 kernel: Call Trace:
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff816f8fc2&gt;] dump_stack+0x94/0xca
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81088484&gt;] print_usage_bug+0x1bc/0x1d1
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81087d76&gt;] ? check_usage_forwards+0x98/0x98
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810885a5&gt;] mark_lock+0x10c/0x203
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81088cb9&gt;] __lock_acquire+0x416/0xe4a
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] ? pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81089b01&gt;] lock_acquire+0x18d/0x1bc
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81089b01&gt;] ? lock_acquire+0x18d/0x1bc
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] ? pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81700d12&gt;] _raw_spin_lock+0x2a/0x39
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] ? pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810ae5af&gt;] do_futex+0x6ae/0x809
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810ae83d&gt;] SyS_futex+0x133/0x143
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff8100158a&gt;] ? syscall_trace_enter_phase2+0x1a2/0x1bb
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81701848&gt;] tracesys_phase2+0x90/0x95

Bisecting detects 47e452fcf2f
in the above specific scenario using apache-ssl,
but apparently the missing *_irq() was introduced in
34c8e1c2c02.

However, just reverting the old _irq() variants to a similar status
than before 34c8e1c2c02,
or using _irqsave() / _irqrestore() as some other backports are doing
in various places, would not really help.

The fundamental problem is the following violation of the assertion
lockdep_assert_held(&amp;pi_state-&gt;pi_mutex.wait_lock) in pi_state_update_owner():

Mar 03 12:50:03 icpu-test-bap10 kernel: ------------[ cut here ]------------
Mar 03 12:50:03 icpu-test-bap10 kernel: WARNING: CPU: 37 PID: 8488 at kernel/futex.c:844 pi_state_update_owner+0x3d/0xd7()
Mar 03 12:50:03 icpu-test-bap10 kernel: Modules linked in: xt_time xt_connlimit xt_connmark xt_NFLOG xt_limit xt_hashlimit veth ip_set_bitmap_port xt_DSCP xt_multiport ip_set_hash_ip xt_owner xt_set ip_set_hash_net xt_state xt_conntrack nf_conntrack_ftp mars lz4_decompress lz4_compress ipmi_devintf x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul hed ipmi_si ipmi_msghandler processor crc32c_intel ehci_pci ehci_hcd usbcore i40e usb_common
Mar 03 12:50:03 icpu-test-bap10 kernel: CPU: 37 PID: 8488 UID: 99 Comm: apache2-ssl Not tainted 4.4.259-rc1-grsec+ #737
Mar 03 12:50:03 icpu-test-bap10 kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.11.0 11/02/2019
Mar 03 12:50:03 icpu-test-bap10 kernel:  0000000000000000 ffff883f863f7c70 ffffffff816f9002 0000000000000000
Mar 03 12:50:03 icpu-test-bap10 kernel:  0000000000000009 ffff883f863f7ca8 ffffffff8104cda2 ffffffff810abac7
Mar 03 12:50:03 icpu-test-bap10 kernel:  ffff883ffbfe5e80 0000000000000000 ffff883f82ed4bc0 00007fc01c9bf000
Mar 03 12:50:03 icpu-test-bap10 kernel: Call Trace:
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff816f9002&gt;] dump_stack+0x94/0xca
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff8104cda2&gt;] warn_slowpath_common+0x94/0xad
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abac7&gt;] ? pi_state_update_owner+0x3d/0xd7
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff8104ce5f&gt;] warn_slowpath_null+0x15/0x17
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abac7&gt;] pi_state_update_owner+0x3d/0xd7
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abea8&gt;] free_pi_state+0x2d/0x73
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abf0b&gt;] unqueue_me_pi+0x1d/0x31
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810ad735&gt;] futex_lock_pi+0x27a/0x2e8
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff81088bca&gt;] ? __lock_acquire+0x327/0xe4a
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810ae6a9&gt;] do_futex+0x784/0x809
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810cfa9a&gt;] ? seccomp_phase1+0xde/0x1e7
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810a4503&gt;] ? current_kernel_time64+0xb/0x31
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810d23c3&gt;] ? current_kernel_time+0xb/0xf
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810ae861&gt;] SyS_futex+0x133/0x143
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff8100158a&gt;] ? syscall_trace_enter_phase2+0x1a2/0x1bb
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff81701888&gt;] tracesys_phase2+0x90/0x95
Mar 03 12:50:03 icpu-test-bap10 kernel: ---[ end trace 968f95a458dea951 ]---

In order to both (1) prevent the self-deadlock, and (2) to satisfy the assertion
at pi_state_update_owner(), some locking with irq disable is needed,
at least in the specific call stack.

Interestingly, there existed a suchalike locking just before
f08a4af5ccb.

This is just a quick hotfix, resurrecting some previous
locks at the old places, but now using -&gt;wait_lock in place
of the previous -&gt;pi_lock (which was in place before
f08a4af5ccb).

The -&gt;pi_lock is now also taken, by the new code
which had been introduced in
34c8e1c2c02.

When this patch is applied, both the above splats are
no longer triggering at my prelife machines.

Without this patch, I cannot ensure stable production at
1&amp;1 Ionos.

Hint for further work: I have not yet tested other call paths,
since I am under time pressure for security reasons.

Hint for further hardening of 4.4.y and probably some more LTS series:
Probably some more systematic testing with CONFIG_PROVE_LOCKING
(and probably some more options) should be invested
in order to make the 4.4 LTS series really "stable" again.

Signed-off-by: Thomas Schoebel-Theuer &lt;tst@1und1.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Lee Jones &lt;lee.jones@linaro.org&gt;
Fixes: f08a4af5ccb2 ("futex: Use pi_state_update_owner() in put_pi_state()")
Fixes: 34c8e1c2c025 ("futex: Provide and use pi_state_update_owner()")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>futex: Ensure the correct return value from futex_lock_pi()</title>
<updated>2021-03-07T10:24:19+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-01-20T15:00:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=074e7d5157830ebd69e4abceba938367c6933ce9'/>
<id>074e7d5157830ebd69e4abceba938367c6933ce9</id>
<content type='text'>
commit 12bb3f7f1b03d5913b3f9d4236a488aa7774dfe9 upstream.

In case that futex_lock_pi() was aborted by a signal or a timeout and the
task returned without acquiring the rtmutex, but is the designated owner of
the futex due to a concurrent futex_unlock_pi() fixup_owner() is invoked to
establish consistent state. In that case it invokes fixup_pi_state_owner()
which in turn tries to acquire the rtmutex again. If that succeeds then it
does not propagate this success to fixup_owner() and futex_lock_pi()
returns -EINTR or -ETIMEOUT despite having the futex locked.

Return success from fixup_pi_state_owner() in all cases where the current
task owns the rtmutex and therefore the futex and propagate it correctly
through fixup_owner(). Fixup the other callsite which does not expect a
positive return value.

Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
[Sharan: Backported patch for kernel 4.4.y. Also folded in is a part
 of the cleanup patch d7c5ed73b19c("futex: Remove needless goto's")]
Signed-off-by: Sharan Turlapati &lt;sturlapati@vmware.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 12bb3f7f1b03d5913b3f9d4236a488aa7774dfe9 upstream.

In case that futex_lock_pi() was aborted by a signal or a timeout and the
task returned without acquiring the rtmutex, but is the designated owner of
the futex due to a concurrent futex_unlock_pi() fixup_owner() is invoked to
establish consistent state. In that case it invokes fixup_pi_state_owner()
which in turn tries to acquire the rtmutex again. If that succeeds then it
does not propagate this success to fixup_owner() and futex_lock_pi()
returns -EINTR or -ETIMEOUT despite having the futex locked.

Return success from fixup_pi_state_owner() in all cases where the current
task owns the rtmutex and therefore the futex and propagate it correctly
through fixup_owner(). Fixup the other callsite which does not expect a
positive return value.

Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
[Sharan: Backported patch for kernel 4.4.y. Also folded in is a part
 of the cleanup patch d7c5ed73b19c("futex: Remove needless goto's")]
Signed-off-by: Sharan Turlapati &lt;sturlapati@vmware.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>futex: Fix OWNER_DEAD fixup</title>
<updated>2021-03-03T15:44:24+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-01-22T10:39:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e68489bc827dbb9ae28f3e082b147d303599151b'/>
<id>e68489bc827dbb9ae28f3e082b147d303599151b</id>
<content type='text'>
commit a97cb0e7b3f4c6297fd857055ae8e895f402f501 upstream.

Both Geert and DaveJ reported that the recent futex commit:

  c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")

introduced a problem with setting OWNER_DEAD. We set the bit on an
uninitialized variable and then entirely optimize it away as a
dead-store.

Move the setting of the bit to where it is more useful.

Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Reported-by: Dave Jones &lt;davej@codemonkey.org.uk&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@us.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Link: http://lkml.kernel.org/r/20180122103947.GD2228@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Reviewed-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit a97cb0e7b3f4c6297fd857055ae8e895f402f501 upstream.

Both Geert and DaveJ reported that the recent futex commit:

  c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")

introduced a problem with setting OWNER_DEAD. We set the bit on an
uninitialized variable and then entirely optimize it away as a
dead-store.

Move the setting of the bit to where it is more useful.

Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Reported-by: Dave Jones &lt;davej@codemonkey.org.uk&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@us.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Link: http://lkml.kernel.org/r/20180122103947.GD2228@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Reviewed-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols</title>
<updated>2021-03-03T15:44:23+00:00</updated>
<author>
<name>Fangrui Song</name>
<email>maskray@google.com</email>
</author>
<published>2021-01-15T19:52:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=250704fae6c57834326f5579c0545d0651fa2673'/>
<id>250704fae6c57834326f5579c0545d0651fa2673</id>
<content type='text'>
commit ebfac7b778fac8b0e8e92ec91d0b055f046b4604 upstream.

clang-12 -fno-pic (since
https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de0083333232da3f1d6)
can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail`
on x86.  The two forms should have identical behaviors on x86-64 but the
former causes GNU as&lt;2.37 to produce an unreferenced undefined symbol
_GLOBAL_OFFSET_TABLE_.

(On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the
linker behavior is identical as far as Linux kernel is concerned.)

Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what
scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the
problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for
external function calls on x86.

Note: ld -z defs and dynamic loaders do not error for unreferenced
undefined symbols so the module loader is reading too much.  If we ever
need to ignore more symbols, the code should be refactored to ignore
unreferenced symbols.

Cc: &lt;stable@vger.kernel.org&gt;
Link: https://github.com/ClangBuiltLinux/linux/issues/1250
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=27178
Reported-by: Marco Elver &lt;elver@google.com&gt;
Reviewed-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Reviewed-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Tested-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Fangrui Song &lt;maskray@google.com&gt;
Signed-off-by: Jessica Yu &lt;jeyu@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ebfac7b778fac8b0e8e92ec91d0b055f046b4604 upstream.

clang-12 -fno-pic (since
https://github.com/llvm/llvm-project/commit/a084c0388e2a59b9556f2de0083333232da3f1d6)
can emit `call __stack_chk_fail@PLT` instead of `call __stack_chk_fail`
on x86.  The two forms should have identical behaviors on x86-64 but the
former causes GNU as&lt;2.37 to produce an unreferenced undefined symbol
_GLOBAL_OFFSET_TABLE_.

(On x86-32, there is an R_386_PC32 vs R_386_PLT32 difference but the
linker behavior is identical as far as Linux kernel is concerned.)

Simply ignore _GLOBAL_OFFSET_TABLE_ for now, like what
scripts/mod/modpost.c:ignore_undef_symbol does. This also fixes the
problem for gcc/clang -fpie and -fpic, which may emit `call foo@PLT` for
external function calls on x86.

Note: ld -z defs and dynamic loaders do not error for unreferenced
undefined symbols so the module loader is reading too much.  If we ever
need to ignore more symbols, the code should be refactored to ignore
unreferenced symbols.

Cc: &lt;stable@vger.kernel.org&gt;
Link: https://github.com/ClangBuiltLinux/linux/issues/1250
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=27178
Reported-by: Marco Elver &lt;elver@google.com&gt;
Reviewed-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Reviewed-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Tested-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Fangrui Song &lt;maskray@google.com&gt;
Signed-off-by: Jessica Yu &lt;jeyu@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracepoint: Do not fail unregistering a probe due to memory failure</title>
<updated>2021-03-03T15:44:19+00:00</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2020-11-18T14:34:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7a77bf015ede8aa4ef303638cf24fc9399221e3c'/>
<id>7a77bf015ede8aa4ef303638cf24fc9399221e3c</id>
<content type='text'>
[ Upstream commit befe6d946551d65cddbd32b9cb0170b0249fd5ed ]

The list of tracepoint callbacks is managed by an array that is protected
by RCU. To update this array, a new array is allocated, the updates are
copied over to the new array, and then the list of functions for the
tracepoint is switched over to the new array. After a completion of an RCU
grace period, the old array is freed.

This process happens for both adding a callback as well as removing one.
But on removing a callback, if the new array fails to be allocated, the
callback is not removed, and may be used after it is freed by the clients
of the tracepoint.

There's really no reason to fail if the allocation for a new array fails
when removing a function. Instead, the function can simply be replaced by a
stub function that could be cleaned up on the next modification of the
array. That is, instead of calling the function registered to the
tracepoint, it would call a stub function in its place.

Link: https://lore.kernel.org/r/20201115055256.65625-1-mmullins@mmlx.us
Link: https://lore.kernel.org/r/20201116175107.02db396d@gandalf.local.home
Link: https://lore.kernel.org/r/20201117211836.54acaef2@oasis.local.home
Link: https://lkml.kernel.org/r/20201118093405.7a6d2290@gandalf.local.home

[ Note, this version does use undefined compiler behavior (assuming that
  a stub function with no parameters or return, can be called by a location
  that thinks it has parameters but still no return value. Static calls
  do the same thing, so this trick is not without precedent.

  There's another solution that uses RCU tricks and is more complex, but
  can be an alternative if this solution becomes an issue.

  Link: https://lore.kernel.org/lkml/20210127170721.58bce7cc@gandalf.local.home/
]

Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Andrii Nakryiko &lt;andriin@fb.com&gt;
Cc: John Fastabend &lt;john.fastabend@gmail.com&gt;
Cc: KP Singh &lt;kpsingh@chromium.org&gt;
Cc: netdev &lt;netdev@vger.kernel.org&gt;
Cc: bpf &lt;bpf@vger.kernel.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Florian Weimer &lt;fw@deneb.enyo.de&gt;
Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
Reported-by: syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com
Reported-by: syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com
Reported-by: Matt Mullins &lt;mmullins@mmlx.us&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Tested-by: Matt Mullins &lt;mmullins@mmlx.us&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit befe6d946551d65cddbd32b9cb0170b0249fd5ed ]

The list of tracepoint callbacks is managed by an array that is protected
by RCU. To update this array, a new array is allocated, the updates are
copied over to the new array, and then the list of functions for the
tracepoint is switched over to the new array. After a completion of an RCU
grace period, the old array is freed.

This process happens for both adding a callback as well as removing one.
But on removing a callback, if the new array fails to be allocated, the
callback is not removed, and may be used after it is freed by the clients
of the tracepoint.

There's really no reason to fail if the allocation for a new array fails
when removing a function. Instead, the function can simply be replaced by a
stub function that could be cleaned up on the next modification of the
array. That is, instead of calling the function registered to the
tracepoint, it would call a stub function in its place.

Link: https://lore.kernel.org/r/20201115055256.65625-1-mmullins@mmlx.us
Link: https://lore.kernel.org/r/20201116175107.02db396d@gandalf.local.home
Link: https://lore.kernel.org/r/20201117211836.54acaef2@oasis.local.home
Link: https://lkml.kernel.org/r/20201118093405.7a6d2290@gandalf.local.home

[ Note, this version does use undefined compiler behavior (assuming that
  a stub function with no parameters or return, can be called by a location
  that thinks it has parameters but still no return value. Static calls
  do the same thing, so this trick is not without precedent.

  There's another solution that uses RCU tricks and is more complex, but
  can be an alternative if this solution becomes an issue.

  Link: https://lore.kernel.org/lkml/20210127170721.58bce7cc@gandalf.local.home/
]

Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Song Liu &lt;songliubraving@fb.com&gt;
Cc: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Andrii Nakryiko &lt;andriin@fb.com&gt;
Cc: John Fastabend &lt;john.fastabend@gmail.com&gt;
Cc: KP Singh &lt;kpsingh@chromium.org&gt;
Cc: netdev &lt;netdev@vger.kernel.org&gt;
Cc: bpf &lt;bpf@vger.kernel.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Florian Weimer &lt;fw@deneb.enyo.de&gt;
Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
Reported-by: syzbot+83aa762ef23b6f0d1991@syzkaller.appspotmail.com
Reported-by: syzbot+d29e58bb557324e55e5e@syzkaller.appspotmail.com
Reported-by: Matt Mullins &lt;mmullins@mmlx.us&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Tested-by: Matt Mullins &lt;mmullins@mmlx.us&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kdb: Make memory allocations more robust</title>
<updated>2021-03-03T15:44:15+00:00</updated>
<author>
<name>Sumit Garg</name>
<email>sumit.garg@linaro.org</email>
</author>
<published>2021-01-22T11:05:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f1c9225ad3a07991a0dc82744147d4d65ca264b1'/>
<id>f1c9225ad3a07991a0dc82744147d4d65ca264b1</id>
<content type='text'>
commit 93f7a6d818deef69d0ba652d46bae6fbabbf365c upstream.

Currently kdb uses in_interrupt() to determine whether its library
code has been called from the kgdb trap handler or from a saner calling
context such as driver init. This approach is broken because
in_interrupt() alone isn't able to determine kgdb trap handler entry from
normal task context. This can happen during normal use of basic features
such as breakpoints and can also be trivially reproduced using:
echo g &gt; /proc/sysrq-trigger

We can improve this by adding check for in_dbg_master() instead which
explicitly determines if we are running in debugger context.

Cc: stable@vger.kernel.org
Signed-off-by: Sumit Garg &lt;sumit.garg@linaro.org&gt;
Link: https://lore.kernel.org/r/1611313556-4004-1-git-send-email-sumit.garg@linaro.org
Signed-off-by: Daniel Thompson &lt;daniel.thompson@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 93f7a6d818deef69d0ba652d46bae6fbabbf365c upstream.

Currently kdb uses in_interrupt() to determine whether its library
code has been called from the kgdb trap handler or from a saner calling
context such as driver init. This approach is broken because
in_interrupt() alone isn't able to determine kgdb trap handler entry from
normal task context. This can happen during normal use of basic features
such as breakpoints and can also be trivially reproduced using:
echo g &gt; /proc/sysrq-trigger

We can improve this by adding check for in_dbg_master() instead which
explicitly determines if we are running in debugger context.

Cc: stable@vger.kernel.org
Signed-off-by: Sumit Garg &lt;sumit.garg@linaro.org&gt;
Link: https://lore.kernel.org/r/1611313556-4004-1-git-send-email-sumit.garg@linaro.org
Signed-off-by: Daniel Thompson &lt;daniel.thompson@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fgraph: Initialize tracing_graph_pause at task creation</title>
<updated>2021-02-23T12:58:12+00:00</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-01-29T15:13:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=67776ca342399b60c8abef63e3962c1138bc284d'/>
<id>67776ca342399b60c8abef63e3962c1138bc284d</id>
<content type='text'>
commit 7e0a9220467dbcfdc5bc62825724f3e52e50ab31 upstream.

On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there's some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.

The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.

   CPU 1				CPU 2
   -----				-----
  do_idle()
    cpu_suspend()
      pause_graph_tracing()
          task_struct-&gt;tracing_graph_pause++ (0 -&gt; 1)

				start_graph_tracing()
				  for_each_online_cpu(cpu) {
				    ftrace_graph_init_idle_task(cpu)
				      task-struct-&gt;tracing_graph_pause = 0 (1 -&gt; 0)

      unpause_graph_tracing()
          task_struct-&gt;tracing_graph_pause-- (0 -&gt; -1)

The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.

There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.

Cc: stable@vger.kernel.org
Fixes: 380c4b1411ccd ("tracing/function-graph-tracer: append the tracing_graph_flag")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
Reported-by: pierre.gondois@arm.com
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7e0a9220467dbcfdc5bc62825724f3e52e50ab31 upstream.

On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there's some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.

The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.

   CPU 1				CPU 2
   -----				-----
  do_idle()
    cpu_suspend()
      pause_graph_tracing()
          task_struct-&gt;tracing_graph_pause++ (0 -&gt; 1)

				start_graph_tracing()
				  for_each_online_cpu(cpu) {
				    ftrace_graph_init_idle_task(cpu)
				      task-struct-&gt;tracing_graph_pause = 0 (1 -&gt; 0)

      unpause_graph_tracing()
          task_struct-&gt;tracing_graph_pause-- (0 -&gt; -1)

The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.

There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.

Cc: stable@vger.kernel.org
Fixes: 380c4b1411ccd ("tracing/function-graph-tracer: append the tracing_graph_flag")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
Reported-by: pierre.gondois@arm.com
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Do not count ftrace events in top level enable output</title>
<updated>2021-02-23T12:58:12+00:00</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-02-05T20:40:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fb4ec1e12b5935dfc2dbd46920cd2888e91310be'/>
<id>fb4ec1e12b5935dfc2dbd46920cd2888e91310be</id>
<content type='text'>
commit 256cfdd6fdf70c6fcf0f7c8ddb0ebd73ce8f3bc9 upstream.

The file /sys/kernel/tracing/events/enable is used to enable all events by
echoing in "1", or disabling all events when echoing in "0". To know if all
events are enabled, disabled, or some are enabled but not all of them,
cating the file should show either "1" (all enabled), "0" (all disabled), or
"X" (some enabled but not all of them). This works the same as the "enable"
files in the individule system directories (like tracing/events/sched/enable).

But when all events are enabled, the top level "enable" file shows "X". The
reason is that its checking the "ftrace" events, which are special events
that only exist for their format files. These include the format for the
function tracer events, that are enabled when the function tracer is
enabled, but not by the "enable" file. The check includes these events,
which will always be disabled, and even though all true events are enabled,
the top level "enable" file will show "X" instead of "1".

To fix this, have the check test the event's flags to see if it has the
"IGNORE_ENABLE" flag set, and if so, not test it.

Cc: stable@vger.kernel.org
Fixes: 553552ce1796c ("tracing: Combine event filter_active and enable into single flags field")
Reported-by: "Yordan Karadzhov (VMware)" &lt;y.karadz@gmail.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 256cfdd6fdf70c6fcf0f7c8ddb0ebd73ce8f3bc9 upstream.

The file /sys/kernel/tracing/events/enable is used to enable all events by
echoing in "1", or disabling all events when echoing in "0". To know if all
events are enabled, disabled, or some are enabled but not all of them,
cating the file should show either "1" (all enabled), "0" (all disabled), or
"X" (some enabled but not all of them). This works the same as the "enable"
files in the individule system directories (like tracing/events/sched/enable).

But when all events are enabled, the top level "enable" file shows "X". The
reason is that its checking the "ftrace" events, which are special events
that only exist for their format files. These include the format for the
function tracer events, that are enabled when the function tracer is
enabled, but not by the "enable" file. The check includes these events,
which will always be disabled, and even though all true events are enabled,
the top level "enable" file will show "X" instead of "1".

To fix this, have the check test the event's flags to see if it has the
"IGNORE_ENABLE" flag set, and if so, not test it.

Cc: stable@vger.kernel.org
Fixes: 553552ce1796c ("tracing: Combine event filter_active and enable into single flags field")
Reported-by: "Yordan Karadzhov (VMware)" &lt;y.karadz@gmail.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kretprobe: Avoid re-registration of the same kretprobe earlier</title>
<updated>2021-02-10T08:07:27+00:00</updated>
<author>
<name>Wang ShaoBo</name>
<email>bobo.shaobowang@huawei.com</email>
</author>
<published>2021-01-28T12:44:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=05b0762589a7c8ea30999cad6f77ec8ac0a3e109'/>
<id>05b0762589a7c8ea30999cad6f77ec8ac0a3e109</id>
<content type='text'>
commit 0188b87899ffc4a1d36a0badbe77d56c92fd91dc upstream.

Our system encountered a re-init error when re-registering same kretprobe,
where the kretprobe_instance in rp-&gt;free_instances is illegally accessed
after re-init.

Implementation to avoid re-registration has been introduced for kprobe
before, but lags for register_kretprobe(). We must check if kprobe has
been re-registered before re-initializing kretprobe, otherwise it will
destroy the data struct of kretprobe registered, which can lead to memory
leak, system crash, also some unexpected behaviors.

We use check_kprobe_rereg() to check if kprobe has been re-registered
before running register_kretprobe()'s body, for giving a warning message
and terminate registration process.

Link: https://lkml.kernel.org/r/20210128124427.2031088-1-bobo.shaobowang@huawei.com

Cc: stable@vger.kernel.org
Fixes: 1f0ab40976460 ("kprobes: Prevent re-registration of the same kprobe")
[ The above commit should have been done for kretprobes too ]
Acked-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Acked-by: Ananth N Mavinakayanahalli &lt;ananth@linux.ibm.com&gt;
Acked-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Signed-off-by: Wang ShaoBo &lt;bobo.shaobowang@huawei.com&gt;
Signed-off-by: Cheng Jian &lt;cj.chengjian@huawei.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0188b87899ffc4a1d36a0badbe77d56c92fd91dc upstream.

Our system encountered a re-init error when re-registering same kretprobe,
where the kretprobe_instance in rp-&gt;free_instances is illegally accessed
after re-init.

Implementation to avoid re-registration has been introduced for kprobe
before, but lags for register_kretprobe(). We must check if kprobe has
been re-registered before re-initializing kretprobe, otherwise it will
destroy the data struct of kretprobe registered, which can lead to memory
leak, system crash, also some unexpected behaviors.

We use check_kprobe_rereg() to check if kprobe has been re-registered
before running register_kretprobe()'s body, for giving a warning message
and terminate registration process.

Link: https://lkml.kernel.org/r/20210128124427.2031088-1-bobo.shaobowang@huawei.com

Cc: stable@vger.kernel.org
Fixes: 1f0ab40976460 ("kprobes: Prevent re-registration of the same kprobe")
[ The above commit should have been done for kretprobes too ]
Acked-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Acked-by: Ananth N Mavinakayanahalli &lt;ananth@linux.ibm.com&gt;
Acked-by: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Signed-off-by: Wang ShaoBo &lt;bobo.shaobowang@huawei.com&gt;
Signed-off-by: Cheng Jian &lt;cj.chengjian@huawei.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
