<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md, branch linux-6.8.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>dm-delay: fix max_delay calculations</title>
<updated>2024-05-30T07:49:24+00:00</updated>
<author>
<name>Benjamin Marzinski</name>
<email>bmarzins@redhat.com</email>
</author>
<published>2024-05-06T21:55:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=32ea54b00be722e45617aa59090b845cdc02033c'/>
<id>32ea54b00be722e45617aa59090b845cdc02033c</id>
<content type='text'>
[ Upstream commit 64eb88d6caee2c8eb806a68dab3f184f14f818a4 ]

delay_ctr() pointlessly compared max_delay in cases where multiple delay
classes were initialized identically. Also, when write delays were
configured different than read delays, delay_ctr() never compared their
value against max_delay. Fix these issues.

Fixes: 70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers and wq")
Signed-off-by: Benjamin Marzinski &lt;bmarzins@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 64eb88d6caee2c8eb806a68dab3f184f14f818a4 ]

delay_ctr() pointlessly compared max_delay in cases where multiple delay
classes were initialized identically. Also, when write delays were
configured different than read delays, delay_ctr() never compared their
value against max_delay. Fix these issues.

Fixes: 70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers and wq")
Signed-off-by: Benjamin Marzinski &lt;bmarzins@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm-delay: fix hung task introduced by kthread mode</title>
<updated>2024-05-30T07:49:24+00:00</updated>
<author>
<name>Joel Colledge</name>
<email>joel.colledge@linbit.com</email>
</author>
<published>2024-05-06T07:25:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cffa552c75ba213493a398875787b5e95c2cdd35'/>
<id>cffa552c75ba213493a398875787b5e95c2cdd35</id>
<content type='text'>
[ Upstream commit d14646f23300a5fc85be867bafdc0702c2002789 ]

If the worker thread is not woken due to a bio, then it is not woken at
all. This causes the hung task check to trigger. This occurs, for
instance, when no bios are submitted. Also when a delay of 0 is
configured, delay_bio() returns without waking the worker.

Prevent the hung task check from triggering by creating the thread with
kthread_run() instead of using kthread_create() directly.

Fixes: 70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers and wq")
Signed-off-by: Joel Colledge &lt;joel.colledge@linbit.com&gt;
Reviewed-by: Benjamin Marzinski &lt;bmarzins@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit d14646f23300a5fc85be867bafdc0702c2002789 ]

If the worker thread is not woken due to a bio, then it is not woken at
all. This causes the hung task check to trigger. This occurs, for
instance, when no bios are submitted. Also when a delay of 0 is
configured, delay_bio() returns without waking the worker.

Prevent the hung task check from triggering by creating the thread with
kthread_run() instead of using kthread_create() directly.

Fixes: 70bbeb29fab0 ("dm delay: for short delays, use kthread instead of timers and wq")
Signed-off-by: Joel Colledge &lt;joel.colledge@linbit.com&gt;
Reviewed-by: Benjamin Marzinski &lt;bmarzins@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm-delay: fix workqueue delay_timer race</title>
<updated>2024-05-30T07:49:23+00:00</updated>
<author>
<name>Benjamin Marzinski</name>
<email>bmarzins@redhat.com</email>
</author>
<published>2024-05-07T21:16:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9c0229293585cb8968c649487d9502ab8b60ffc6'/>
<id>9c0229293585cb8968c649487d9502ab8b60ffc6</id>
<content type='text'>
[ Upstream commit 8d24790ed08ab4e619ce58ed4a1b353ab77ffdc5 ]

delay_timer could be pending when delay_dtr() is called. It needs to be
shut down before kdelayd_wq is destroyed, so it won't try queueing more
work to kdelayd_wq while that's getting destroyed.

Also the del_timer_sync() call in delay_presuspend() doesn't protect
against the timer getting immediately rearmed by the queued call to
flush_delayed_bios(), but there's no real harm if that does happen.
timer_delete() is less work, and is basically just as likely to stop a
pointless call to flush_delayed_bios().

Fixes: 26b9f228703f ("dm: delay target")
Signed-off-by: Benjamin Marzinski &lt;bmarzins@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 8d24790ed08ab4e619ce58ed4a1b353ab77ffdc5 ]

delay_timer could be pending when delay_dtr() is called. It needs to be
shut down before kdelayd_wq is destroyed, so it won't try queueing more
work to kdelayd_wq while that's getting destroyed.

Also the del_timer_sync() call in delay_presuspend() doesn't protect
against the timer getting immediately rearmed by the queued call to
flush_delayed_bios(), but there's no real harm if that does happen.
timer_delete() is less work, and is basically just as likely to stop a
pointless call to flush_delayed_bios().

Fixes: 26b9f228703f ("dm: delay target")
Signed-off-by: Benjamin Marzinski &lt;bmarzins@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: fix resync softlockup when bitmap size is less than array size</title>
<updated>2024-05-30T07:49:03+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-04-22T06:58:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5817f43ae1a118855676f57ef7ab50e37eac7482'/>
<id>5817f43ae1a118855676f57ef7ab50e37eac7482</id>
<content type='text'>
[ Upstream commit f0e729af2eb6bee9eb58c4df1087f14ebaefe26b ]

Is is reported that for dm-raid10, lvextend + lvchange --syncaction will
trigger following softlockup:

kernel:watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [mdX_resync:6976]
CPU: 7 PID: 3588 Comm: mdX_resync Kdump: loaded Not tainted 6.9.0-rc4-next-20240419 #1
RIP: 0010:_raw_spin_unlock_irq+0x13/0x30
Call Trace:
 &lt;TASK&gt;
 md_bitmap_start_sync+0x6b/0xf0
 raid10_sync_request+0x25c/0x1b40 [raid10]
 md_do_sync+0x64b/0x1020
 md_thread+0xa7/0x170
 kthread+0xcf/0x100
 ret_from_fork+0x30/0x50
 ret_from_fork_asm+0x1a/0x30

And the detailed process is as follows:

md_do_sync
 j = mddev-&gt;resync_min
 while (j &lt; max_sectors)
  sectors = raid10_sync_request(mddev, j, &amp;skipped)
   if (!md_bitmap_start_sync(..., &amp;sync_blocks))
    // md_bitmap_start_sync set sync_blocks to 0
    return sync_blocks + sectors_skippe;
  // sectors = 0;
  j += sectors;
  // j never change

Root cause is that commit 301867b1c168 ("md/raid10: check
slab-out-of-bounds in md_bitmap_get_counter") return early from
md_bitmap_get_counter(), without setting returned blocks.

Fix this problem by always set returned blocks from
md_bitmap_get_counter"(), as it used to be.

Noted that this patch just fix the softlockup problem in kernel, the
case that bitmap size doesn't match array size still need to be fixed.

Fixes: 301867b1c168 ("md/raid10: check slab-out-of-bounds in md_bitmap_get_counter")
Reported-and-tested-by: Nigel Croxon &lt;ncroxon@redhat.com&gt;
Closes: https://lore.kernel.org/all/71ba5272-ab07-43ba-8232-d2da642acb4e@redhat.com/
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20240422065824.2516-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit f0e729af2eb6bee9eb58c4df1087f14ebaefe26b ]

Is is reported that for dm-raid10, lvextend + lvchange --syncaction will
trigger following softlockup:

kernel:watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [mdX_resync:6976]
CPU: 7 PID: 3588 Comm: mdX_resync Kdump: loaded Not tainted 6.9.0-rc4-next-20240419 #1
RIP: 0010:_raw_spin_unlock_irq+0x13/0x30
Call Trace:
 &lt;TASK&gt;
 md_bitmap_start_sync+0x6b/0xf0
 raid10_sync_request+0x25c/0x1b40 [raid10]
 md_do_sync+0x64b/0x1020
 md_thread+0xa7/0x170
 kthread+0xcf/0x100
 ret_from_fork+0x30/0x50
 ret_from_fork_asm+0x1a/0x30

And the detailed process is as follows:

md_do_sync
 j = mddev-&gt;resync_min
 while (j &lt; max_sectors)
  sectors = raid10_sync_request(mddev, j, &amp;skipped)
   if (!md_bitmap_start_sync(..., &amp;sync_blocks))
    // md_bitmap_start_sync set sync_blocks to 0
    return sync_blocks + sectors_skippe;
  // sectors = 0;
  j += sectors;
  // j never change

Root cause is that commit 301867b1c168 ("md/raid10: check
slab-out-of-bounds in md_bitmap_get_counter") return early from
md_bitmap_get_counter(), without setting returned blocks.

Fix this problem by always set returned blocks from
md_bitmap_get_counter"(), as it used to be.

Noted that this patch just fix the softlockup problem in kernel, the
case that bitmap size doesn't match array size still need to be fixed.

Fixes: 301867b1c168 ("md/raid10: check slab-out-of-bounds in md_bitmap_get_counter")
Reported-and-tested-by: Nigel Croxon &lt;ncroxon@redhat.com&gt;
Closes: https://lore.kernel.org/all/71ba5272-ab07-43ba-8232-d2da642acb4e@redhat.com/
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20240422065824.2516-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>raid1: fix use-after-free for original bio in raid1_write_request()</title>
<updated>2024-04-17T09:23:24+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-03-08T09:37:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f423f41b7679c09abb26d2bd54be5cbef23c9446'/>
<id>f423f41b7679c09abb26d2bd54be5cbef23c9446</id>
<content type='text'>
commit fcf3f7e2fc8a53a6140beee46ec782a4c88e4744 upstream.

r1_bio-&gt;bios[] is used to record new bios that will be issued to
underlying disks, however, in raid1_write_request(), r1_bio-&gt;bios[]
will set to the original bio temporarily. Meanwhile, if blocked rdev
is set, free_r1bio() will be called causing that all r1_bio-&gt;bios[]
to be freed:

raid1_write_request()
 r1_bio = alloc_r1bio(mddev, bio); -&gt; r1_bio-&gt;bios[] is NULL
 for (i = 0;  i &lt; disks; i++) -&gt; for each rdev in conf
  // first rdev is normal
  r1_bio-&gt;bios[0] = bio; -&gt; set to original bio
  // second rdev is blocked
  if (test_bit(Blocked, &amp;rdev-&gt;flags))
   break

 if (blocked_rdev)
  free_r1bio()
   put_all_bios()
    bio_put(r1_bio-&gt;bios[0]) -&gt; original bio is freed

Test scripts:

mdadm -CR /dev/md0 -l1 -n4 /dev/sd[abcd] --assume-clean
fio -filename=/dev/md0 -ioengine=libaio -rw=write -bs=4k -numjobs=1 \
    -iodepth=128 -name=test -direct=1
echo blocked &gt; /sys/block/md0/md/rd2/state

Test result:

BUG bio-264 (Not tainted): Object already free
-----------------------------------------------------------------------------

Allocated in mempool_alloc_slab+0x24/0x50 age=1 cpu=1 pid=869
 kmem_cache_alloc+0x324/0x480
 mempool_alloc_slab+0x24/0x50
 mempool_alloc+0x6e/0x220
 bio_alloc_bioset+0x1af/0x4d0
 blkdev_direct_IO+0x164/0x8a0
 blkdev_write_iter+0x309/0x440
 aio_write+0x139/0x2f0
 io_submit_one+0x5ca/0xb70
 __do_sys_io_submit+0x86/0x270
 __x64_sys_io_submit+0x22/0x30
 do_syscall_64+0xb1/0x210
 entry_SYSCALL_64_after_hwframe+0x6c/0x74
Freed in mempool_free_slab+0x1f/0x30 age=1 cpu=1 pid=869
 kmem_cache_free+0x28c/0x550
 mempool_free_slab+0x1f/0x30
 mempool_free+0x40/0x100
 bio_free+0x59/0x80
 bio_put+0xf0/0x220
 free_r1bio+0x74/0xb0
 raid1_make_request+0xadf/0x1150
 md_handle_request+0xc7/0x3b0
 md_submit_bio+0x76/0x130
 __submit_bio+0xd8/0x1d0
 submit_bio_noacct_nocheck+0x1eb/0x5c0
 submit_bio_noacct+0x169/0xd40
 submit_bio+0xee/0x1d0
 blkdev_direct_IO+0x322/0x8a0
 blkdev_write_iter+0x309/0x440
 aio_write+0x139/0x2f0

Since that bios for underlying disks are not allocated yet, fix this
problem by using mempool_free() directly to free the r1_bio.

Fixes: 992db13a4aee ("md/raid1: free the r1bio before waiting for blocked rdev")
Cc: stable@vger.kernel.org # v6.6+
Reported-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Tested-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240308093726.1047420-1-yukuai1@huaweicloud.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit fcf3f7e2fc8a53a6140beee46ec782a4c88e4744 upstream.

r1_bio-&gt;bios[] is used to record new bios that will be issued to
underlying disks, however, in raid1_write_request(), r1_bio-&gt;bios[]
will set to the original bio temporarily. Meanwhile, if blocked rdev
is set, free_r1bio() will be called causing that all r1_bio-&gt;bios[]
to be freed:

raid1_write_request()
 r1_bio = alloc_r1bio(mddev, bio); -&gt; r1_bio-&gt;bios[] is NULL
 for (i = 0;  i &lt; disks; i++) -&gt; for each rdev in conf
  // first rdev is normal
  r1_bio-&gt;bios[0] = bio; -&gt; set to original bio
  // second rdev is blocked
  if (test_bit(Blocked, &amp;rdev-&gt;flags))
   break

 if (blocked_rdev)
  free_r1bio()
   put_all_bios()
    bio_put(r1_bio-&gt;bios[0]) -&gt; original bio is freed

Test scripts:

mdadm -CR /dev/md0 -l1 -n4 /dev/sd[abcd] --assume-clean
fio -filename=/dev/md0 -ioengine=libaio -rw=write -bs=4k -numjobs=1 \
    -iodepth=128 -name=test -direct=1
echo blocked &gt; /sys/block/md0/md/rd2/state

Test result:

BUG bio-264 (Not tainted): Object already free
-----------------------------------------------------------------------------

Allocated in mempool_alloc_slab+0x24/0x50 age=1 cpu=1 pid=869
 kmem_cache_alloc+0x324/0x480
 mempool_alloc_slab+0x24/0x50
 mempool_alloc+0x6e/0x220
 bio_alloc_bioset+0x1af/0x4d0
 blkdev_direct_IO+0x164/0x8a0
 blkdev_write_iter+0x309/0x440
 aio_write+0x139/0x2f0
 io_submit_one+0x5ca/0xb70
 __do_sys_io_submit+0x86/0x270
 __x64_sys_io_submit+0x22/0x30
 do_syscall_64+0xb1/0x210
 entry_SYSCALL_64_after_hwframe+0x6c/0x74
Freed in mempool_free_slab+0x1f/0x30 age=1 cpu=1 pid=869
 kmem_cache_free+0x28c/0x550
 mempool_free_slab+0x1f/0x30
 mempool_free+0x40/0x100
 bio_free+0x59/0x80
 bio_put+0xf0/0x220
 free_r1bio+0x74/0xb0
 raid1_make_request+0xadf/0x1150
 md_handle_request+0xc7/0x3b0
 md_submit_bio+0x76/0x130
 __submit_bio+0xd8/0x1d0
 submit_bio_noacct_nocheck+0x1eb/0x5c0
 submit_bio_noacct+0x169/0xd40
 submit_bio+0xee/0x1d0
 blkdev_direct_IO+0x322/0x8a0
 blkdev_write_iter+0x309/0x440
 aio_write+0x139/0x2f0

Since that bios for underlying disks are not allocated yet, fix this
problem by using mempool_free() directly to free the r1_bio.

Fixes: 992db13a4aee ("md/raid1: free the r1bio before waiting for blocked rdev")
Cc: stable@vger.kernel.org # v6.6+
Reported-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Tested-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240308093726.1047420-1-yukuai1@huaweicloud.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm integrity: fix out-of-range warning</title>
<updated>2024-04-10T14:38:00+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2024-03-28T14:30:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bdd000c46edf2681f1654b7c43133125805653b9'/>
<id>bdd000c46edf2681f1654b7c43133125805653b9</id>
<content type='text'>
[ Upstream commit 8e91c2342351e0f5ef6c0a704384a7f6fc70c3b2 ]

Depending on the value of CONFIG_HZ, clang complains about a pointless
comparison:

drivers/md/dm-integrity.c:4085:12: error: result of comparison of
                        constant 42949672950 with expression of type
                        'unsigned int' is always false
                        [-Werror,-Wtautological-constant-out-of-range-compare]
                        if (val &gt;= (uint64_t)UINT_MAX * 1000 / HZ) {

As the check remains useful for other configurations, shut up the
warning by adding a second type cast to uint64_t.

Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode")
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Reviewed-by: Justin Stitt &lt;justinstitt@google.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 8e91c2342351e0f5ef6c0a704384a7f6fc70c3b2 ]

Depending on the value of CONFIG_HZ, clang complains about a pointless
comparison:

drivers/md/dm-integrity.c:4085:12: error: result of comparison of
                        constant 42949672950 with expression of type
                        'unsigned int' is always false
                        [-Werror,-Wtautological-constant-out-of-range-compare]
                        if (val &gt;= (uint64_t)UINT_MAX * 1000 / HZ) {

As the check remains useful for other configurations, shut up the
warning by adding a second type cast to uint64_t.

Fixes: 468dfca38b1a ("dm integrity: add a bitmap mode")
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Reviewed-by: Justin Stitt &lt;justinstitt@google.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm snapshot: fix lockup in dm_exception_table_exit</title>
<updated>2024-04-03T13:32:28+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2024-03-20T17:43:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5f4ad4d0b0943296287313db60b3f84df4aad683'/>
<id>5f4ad4d0b0943296287313db60b3f84df4aad683</id>
<content type='text'>
[ Upstream commit 6e7132ed3c07bd8a6ce3db4bb307ef2852b322dc ]

There was reported lockup when we exit a snapshot with many exceptions.
Fix this by adding "cond_resched" to the loop that frees the exceptions.

Reported-by: John Pittman &lt;jpittman@redhat.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 6e7132ed3c07bd8a6ce3db4bb307ef2852b322dc ]

There was reported lockup when we exit a snapshot with many exceptions.
Fix this by adding "cond_resched" to the loop that frees the exceptions.

Reported-by: John Pittman &lt;jpittman@redhat.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm-raid: fix lockdep waring in "pers-&gt;hot_add_disk"</title>
<updated>2024-04-03T13:32:14+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-03-05T07:23:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=32ecc8b70a258eecf1c2da39e0ec28bfb761555d'/>
<id>32ecc8b70a258eecf1c2da39e0ec28bfb761555d</id>
<content type='text'>
[ Upstream commit 95009ae904b1e9dca8db6f649f2d7c18a6e42c75 ]

The lockdep assert is added by commit a448af25becf ("md/raid10: remove
rcu protection to access rdev from conf") in print_conf(). And I didn't
notice that dm-raid is calling "pers-&gt;hot_add_disk" without holding
'reconfig_mutex'.

"pers-&gt;hot_add_disk" read and write many fields that is protected by
'reconfig_mutex', and raid_resume() already grab the lock in other
contex. Hence fix this problem by protecting "pers-&gt;host_add_disk"
with the lock.

Fixes: 9092c02d9435 ("DM RAID: Add ability to restore transiently failed devices on resume")
Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from conf")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Xiao Ni &lt;xni@redhat.com&gt;
Acked-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 95009ae904b1e9dca8db6f649f2d7c18a6e42c75 ]

The lockdep assert is added by commit a448af25becf ("md/raid10: remove
rcu protection to access rdev from conf") in print_conf(). And I didn't
notice that dm-raid is calling "pers-&gt;hot_add_disk" without holding
'reconfig_mutex'.

"pers-&gt;hot_add_disk" read and write many fields that is protected by
'reconfig_mutex', and raid_resume() already grab the lock in other
contex. Hence fix this problem by protecting "pers-&gt;host_add_disk"
with the lock.

Fixes: 9092c02d9435 ("DM RAID: Add ability to restore transiently failed devices on resume")
Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from conf")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Xiao Ni &lt;xni@redhat.com&gt;
Acked-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240305072306.2562024-10-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape</title>
<updated>2024-04-03T13:32:14+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-03-05T07:23:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a8d249d770cb357d16a2097b548d2e4c1c137304'/>
<id>a8d249d770cb357d16a2097b548d2e4c1c137304</id>
<content type='text'>
[ Upstream commit 41425f96d7aa59bc865f60f5dda3d7697b555677 ]

For raid456, if reshape is still in progress, then IO across reshape
position will wait for reshape to make progress. However, for dm-raid,
in following cases reshape will never make progress hence IO will hang:

1) the array is read-only;
2) MD_RECOVERY_WAIT is set;
3) MD_RECOVERY_FROZEN is set;

After commit c467e97f079f ("md/raid6: use valid sector values to determine
if an I/O should wait on the reshape") fix the problem that IO across
reshape position doesn't wait for reshape, the dm-raid test
shell/lvconvert-raid-reshape.sh start to hang:

[root@fedora ~]# cat /proc/979/stack
[&lt;0&gt;] wait_woken+0x7d/0x90
[&lt;0&gt;] raid5_make_request+0x929/0x1d70 [raid456]
[&lt;0&gt;] md_handle_request+0xc2/0x3b0 [md_mod]
[&lt;0&gt;] raid_map+0x2c/0x50 [dm_raid]
[&lt;0&gt;] __map_bio+0x251/0x380 [dm_mod]
[&lt;0&gt;] dm_submit_bio+0x1f0/0x760 [dm_mod]
[&lt;0&gt;] __submit_bio+0xc2/0x1c0
[&lt;0&gt;] submit_bio_noacct_nocheck+0x17f/0x450
[&lt;0&gt;] submit_bio_noacct+0x2bc/0x780
[&lt;0&gt;] submit_bio+0x70/0xc0
[&lt;0&gt;] mpage_readahead+0x169/0x1f0
[&lt;0&gt;] blkdev_readahead+0x18/0x30
[&lt;0&gt;] read_pages+0x7c/0x3b0
[&lt;0&gt;] page_cache_ra_unbounded+0x1ab/0x280
[&lt;0&gt;] force_page_cache_ra+0x9e/0x130
[&lt;0&gt;] page_cache_sync_ra+0x3b/0x110
[&lt;0&gt;] filemap_get_pages+0x143/0xa30
[&lt;0&gt;] filemap_read+0xdc/0x4b0
[&lt;0&gt;] blkdev_read_iter+0x75/0x200
[&lt;0&gt;] vfs_read+0x272/0x460
[&lt;0&gt;] ksys_read+0x7a/0x170
[&lt;0&gt;] __x64_sys_read+0x1c/0x30
[&lt;0&gt;] do_syscall_64+0xc6/0x230
[&lt;0&gt;] entry_SYSCALL_64_after_hwframe+0x6c/0x74

This is because reshape can't make progress.

For md/raid, the problem doesn't exist because register new sync_thread
doesn't rely on the IO to be done any more:

1) If array is read-only, it can switch to read-write by ioctl/sysfs;
2) md/raid never set MD_RECOVERY_WAIT;
3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
   'reconfig_mutex', hence it can be cleared and reshape can continue by
   sysfs api 'sync_action'.

However, I'm not sure yet how to avoid the problem in dm-raid yet. This
patch on the one hand make sure raid_message() can't change
sync_thread() through raid_message() after presuspend(), on the other
hand detect the above 3 cases before wait for IO do be done in
dm_suspend(), and let dm-raid requeue those IO.

Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Xiao Ni &lt;xni@redhat.com&gt;
Acked-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240305072306.2562024-9-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 41425f96d7aa59bc865f60f5dda3d7697b555677 ]

For raid456, if reshape is still in progress, then IO across reshape
position will wait for reshape to make progress. However, for dm-raid,
in following cases reshape will never make progress hence IO will hang:

1) the array is read-only;
2) MD_RECOVERY_WAIT is set;
3) MD_RECOVERY_FROZEN is set;

After commit c467e97f079f ("md/raid6: use valid sector values to determine
if an I/O should wait on the reshape") fix the problem that IO across
reshape position doesn't wait for reshape, the dm-raid test
shell/lvconvert-raid-reshape.sh start to hang:

[root@fedora ~]# cat /proc/979/stack
[&lt;0&gt;] wait_woken+0x7d/0x90
[&lt;0&gt;] raid5_make_request+0x929/0x1d70 [raid456]
[&lt;0&gt;] md_handle_request+0xc2/0x3b0 [md_mod]
[&lt;0&gt;] raid_map+0x2c/0x50 [dm_raid]
[&lt;0&gt;] __map_bio+0x251/0x380 [dm_mod]
[&lt;0&gt;] dm_submit_bio+0x1f0/0x760 [dm_mod]
[&lt;0&gt;] __submit_bio+0xc2/0x1c0
[&lt;0&gt;] submit_bio_noacct_nocheck+0x17f/0x450
[&lt;0&gt;] submit_bio_noacct+0x2bc/0x780
[&lt;0&gt;] submit_bio+0x70/0xc0
[&lt;0&gt;] mpage_readahead+0x169/0x1f0
[&lt;0&gt;] blkdev_readahead+0x18/0x30
[&lt;0&gt;] read_pages+0x7c/0x3b0
[&lt;0&gt;] page_cache_ra_unbounded+0x1ab/0x280
[&lt;0&gt;] force_page_cache_ra+0x9e/0x130
[&lt;0&gt;] page_cache_sync_ra+0x3b/0x110
[&lt;0&gt;] filemap_get_pages+0x143/0xa30
[&lt;0&gt;] filemap_read+0xdc/0x4b0
[&lt;0&gt;] blkdev_read_iter+0x75/0x200
[&lt;0&gt;] vfs_read+0x272/0x460
[&lt;0&gt;] ksys_read+0x7a/0x170
[&lt;0&gt;] __x64_sys_read+0x1c/0x30
[&lt;0&gt;] do_syscall_64+0xc6/0x230
[&lt;0&gt;] entry_SYSCALL_64_after_hwframe+0x6c/0x74

This is because reshape can't make progress.

For md/raid, the problem doesn't exist because register new sync_thread
doesn't rely on the IO to be done any more:

1) If array is read-only, it can switch to read-write by ioctl/sysfs;
2) md/raid never set MD_RECOVERY_WAIT;
3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold
   'reconfig_mutex', hence it can be cleared and reshape can continue by
   sysfs api 'sync_action'.

However, I'm not sure yet how to avoid the problem in dm-raid yet. This
patch on the one hand make sure raid_message() can't change
sync_thread() through raid_message() after presuspend(), on the other
hand detect the above 3 cases before wait for IO do be done in
dm_suspend(), and let dm-raid requeue those IO.

Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Xiao Ni &lt;xni@redhat.com&gt;
Acked-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240305072306.2562024-9-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm-raid: add a new helper prepare_suspend() in md_personality</title>
<updated>2024-04-03T13:32:14+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-03-05T07:23:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=dd54c8a26649a636a86913aecc05337ae39996cd'/>
<id>dd54c8a26649a636a86913aecc05337ae39996cd</id>
<content type='text'>
[ Upstream commit 5625ff8b72b0e5c13b0fc1fc1f198155af45f729 ]

There are no functional changes for now, prepare to fix a deadlock for
dm-raid456.

Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Xiao Ni &lt;xni@redhat.com&gt;
Acked-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240305072306.2562024-8-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5625ff8b72b0e5c13b0fc1fc1f198155af45f729 ]

There are no functional changes for now, prepare to fix a deadlock for
dm-raid456.

Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Xiao Ni &lt;xni@redhat.com&gt;
Acked-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240305072306.2562024-8-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
