<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/raid1.c, branch v6.8.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>md/raid1: fix choose next idle in read_balance()</title>
<updated>2024-03-26T22:16:29+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-02-29T09:57:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=10b6e8412dc03d691bd950cd1681e42c18b48eea'/>
<id>10b6e8412dc03d691bd950cd1681e42c18b48eea</id>
<content type='text'>
[ Upstream commit 257ac239ffcfd097a9a0732bf5095fb00164f334 ]

Commit 12cee5a8a29e ("md/raid1: prevent merging too large request") add
the case choose next idle in read_balance():

read_balance:
 for_each_rdev
  if(next_seq_sect == this_sector || dist == 0)
  -&gt; sequential reads
   best_disk = disk;
   if (...)
    choose_next_idle = 1
    continue;

 for_each_rdev
 -&gt; iterate next rdev
  if (pending == 0)
   best_disk = disk;
   -&gt; choose the next idle disk
   break;

  if (choose_next_idle)
   -&gt; keep using this rdev if there are no other idle disk
   contine

However, commit 2e52d449bcec ("md/raid1: add failfast handling for reads.")
remove the code:

-               /* If device is idle, use it */
-               if (pending == 0) {
-                       best_disk = disk;
-                       break;
-               }

Hence choose next idle will never work now, fix this problem by
following:

1) don't set best_disk in this case, read_balance() will choose the best
   disk after iterating all the disks;
2) add 'pending' so that other idle disk will be chosen;
3) add a new local variable 'sequential_disk' to record the disk, and if
   there is no other idle disk, 'sequential_disk' will be chosen;

Fixes: 2e52d449bcec ("md/raid1: add failfast handling for reads.")
Co-developed-by: Paul Luse &lt;paul.e.luse@linux.intel.com&gt;
Signed-off-by: Paul Luse &lt;paul.e.luse@linux.intel.com&gt;
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Reviewed-by: Xiao Ni &lt;xni@redhat.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240229095714.926789-5-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 257ac239ffcfd097a9a0732bf5095fb00164f334 ]

Commit 12cee5a8a29e ("md/raid1: prevent merging too large request") add
the case choose next idle in read_balance():

read_balance:
 for_each_rdev
  if(next_seq_sect == this_sector || dist == 0)
  -&gt; sequential reads
   best_disk = disk;
   if (...)
    choose_next_idle = 1
    continue;

 for_each_rdev
 -&gt; iterate next rdev
  if (pending == 0)
   best_disk = disk;
   -&gt; choose the next idle disk
   break;

  if (choose_next_idle)
   -&gt; keep using this rdev if there are no other idle disk
   contine

However, commit 2e52d449bcec ("md/raid1: add failfast handling for reads.")
remove the code:

-               /* If device is idle, use it */
-               if (pending == 0) {
-                       best_disk = disk;
-                       break;
-               }

Hence choose next idle will never work now, fix this problem by
following:

1) don't set best_disk in this case, read_balance() will choose the best
   disk after iterating all the disks;
2) add 'pending' so that other idle disk will be chosen;
3) add a new local variable 'sequential_disk' to record the disk, and if
   there is no other idle disk, 'sequential_disk' will be chosen;

Fixes: 2e52d449bcec ("md/raid1: add failfast handling for reads.")
Co-developed-by: Paul Luse &lt;paul.e.luse@linux.intel.com&gt;
Signed-off-by: Paul Luse &lt;paul.e.luse@linux.intel.com&gt;
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Reviewed-by: Xiao Ni &lt;xni@redhat.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240229095714.926789-5-yukuai1@huaweicloud.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: record nonrot rdevs while adding/removing rdevs to conf</title>
<updated>2024-03-26T22:16:28+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-02-29T09:57:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cb38ae9ed8e3b8b983cbb27465b4d50471230b56'/>
<id>cb38ae9ed8e3b8b983cbb27465b4d50471230b56</id>
<content type='text'>
[ Upstream commit 2c27d09d3a76b33629d2e681bf8b774f776ade7f ]

For raid1, each read will iterate all the rdevs from conf and check if
any rdev is non-rotational, then choose rdev with minimal IO inflight
if so, or rdev with closest distance otherwise.

Disk nonrot info can be changed through sysfs entry:

/sys/block/[disk_name]/queue/rotational

However, consider that this should only be used for testing, and user
really shouldn't do this in real life. Record the number of non-rotational
disks in conf, to avoid checking each rdev in IO fast path and simplify
read_balance() a little bit.

Co-developed-by: Paul Luse &lt;paul.e.luse@linux.intel.com&gt;
Signed-off-by: Paul Luse &lt;paul.e.luse@linux.intel.com&gt;
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240229095714.926789-4-yukuai1@huaweicloud.com
Stable-dep-of: 257ac239ffcf ("md/raid1: fix choose next idle in read_balance()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2c27d09d3a76b33629d2e681bf8b774f776ade7f ]

For raid1, each read will iterate all the rdevs from conf and check if
any rdev is non-rotational, then choose rdev with minimal IO inflight
if so, or rdev with closest distance otherwise.

Disk nonrot info can be changed through sysfs entry:

/sys/block/[disk_name]/queue/rotational

However, consider that this should only be used for testing, and user
really shouldn't do this in real life. Record the number of non-rotational
disks in conf, to avoid checking each rdev in IO fast path and simplify
read_balance() a little bit.

Co-developed-by: Paul Luse &lt;paul.e.luse@linux.intel.com&gt;
Signed-off-by: Paul Luse &lt;paul.e.luse@linux.intel.com&gt;
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240229095714.926789-4-yukuai1@huaweicloud.com
Stable-dep-of: 257ac239ffcf ("md/raid1: fix choose next idle in read_balance()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: factor out helpers to add rdev to conf</title>
<updated>2024-03-26T22:16:28+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-02-29T09:57:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9732ce695eb0bcf902e6e49f25590924ce5df2e5'/>
<id>9732ce695eb0bcf902e6e49f25590924ce5df2e5</id>
<content type='text'>
[ Upstream commit 969d6589abcb369d53d84ec7c9c37f4b23ec1ad9 ]

There are no functional changes, just make code cleaner and prepare to
record disk non-rotational information while adding and removing rdev to
conf

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240229095714.926789-3-yukuai1@huaweicloud.com
Stable-dep-of: 257ac239ffcf ("md/raid1: fix choose next idle in read_balance()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 969d6589abcb369d53d84ec7c9c37f4b23ec1ad9 ]

There are no functional changes, just make code cleaner and prepare to
record disk non-rotational information while adding and removing rdev to
conf

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240229095714.926789-3-yukuai1@huaweicloud.com
Stable-dep-of: 257ac239ffcf ("md/raid1: fix choose next idle in read_balance()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: fix a suspicious RCU usage warning</title>
<updated>2024-01-25T06:58:00+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2024-01-17T18:22:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9f3fe29d77ef4e7f7cb5c4c8c59f6dc373e57e78'/>
<id>9f3fe29d77ef4e7f7cb5c4c8c59f6dc373e57e78</id>
<content type='text'>
RCU protection was removed in the commit 2d32777d60de ("raid1: remove rcu
protection to access rdev from conf").

However, the code in fix_read_error does rcu_dereference outside
rcu_read_lock - this triggers the following warning. The warning is
triggered by a LVM2 test shell/integrity-caching.sh.

This commit removes rcu_dereference.

=============================
WARNING: suspicious RCU usage
6.7.0 #2 Not tainted
-----------------------------
drivers/md/raid1.c:2265 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
no locks held by mdX_raid1/1859.

stack backtrace:
CPU: 2 PID: 1859 Comm: mdX_raid1 Not tainted 6.7.0 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
 &lt;TASK&gt;
 dump_stack_lvl+0x60/0x70
 lockdep_rcu_suspicious+0x153/0x1b0
 raid1d+0x1732/0x1750 [raid1]
 ? lock_acquire+0x9f/0x270
 ? finish_wait+0x3d/0x80
 ? md_thread+0xf7/0x130 [md_mod]
 ? lock_release+0xaa/0x230
 ? md_register_thread+0xd0/0xd0 [md_mod]
 md_thread+0xa0/0x130 [md_mod]
 ? housekeeping_test_cpu+0x30/0x30
 kthread+0xdc/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x28/0x40
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork_asm+0x11/0x20
 &lt;/TASK&gt;

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Fixes: ca294b34aaf3 ("md/raid1: support read error check")
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/51539879-e1ca-fde3-b8b4-8934ddedcbc@redhat.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
RCU protection was removed in the commit 2d32777d60de ("raid1: remove rcu
protection to access rdev from conf").

However, the code in fix_read_error does rcu_dereference outside
rcu_read_lock - this triggers the following warning. The warning is
triggered by a LVM2 test shell/integrity-caching.sh.

This commit removes rcu_dereference.

=============================
WARNING: suspicious RCU usage
6.7.0 #2 Not tainted
-----------------------------
drivers/md/raid1.c:2265 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
no locks held by mdX_raid1/1859.

stack backtrace:
CPU: 2 PID: 1859 Comm: mdX_raid1 Not tainted 6.7.0 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Call Trace:
 &lt;TASK&gt;
 dump_stack_lvl+0x60/0x70
 lockdep_rcu_suspicious+0x153/0x1b0
 raid1d+0x1732/0x1750 [raid1]
 ? lock_acquire+0x9f/0x270
 ? finish_wait+0x3d/0x80
 ? md_thread+0xf7/0x130 [md_mod]
 ? lock_release+0xaa/0x230
 ? md_register_thread+0xd0/0xd0 [md_mod]
 md_thread+0xa0/0x130 [md_mod]
 ? housekeeping_test_cpu+0x30/0x30
 kthread+0xdc/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x28/0x40
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork_asm+0x11/0x20
 &lt;/TASK&gt;

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Fixes: ca294b34aaf3 ("md/raid1: support read error check")
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/51539879-e1ca-fde3-b8b4-8934ddedcbc@redhat.com
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: Use blk_opf_t for read and write operations</title>
<updated>2024-01-09T23:14:01+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2024-01-08T00:12:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7dab24554dedd4e6f408af8eb2d25c89997a6a1f'/>
<id>7dab24554dedd4e6f408af8eb2d25c89997a6a1f</id>
<content type='text'>
Use the type blk_opf_t for read and write operations instead of int. This
patch does not affect the generated code but fixes the following sparse
warning:

drivers/md/raid1.c:1993:60: sparse: sparse: incorrect type in argument 5 (different base types)
     expected restricted blk_opf_t [usertype] opf
     got int rw

Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Fixes: 3c5e514db58f ("md/raid1: Use the new blk_opf_t type")
Cc: stable@vger.kernel.org # v6.0+
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Closes: https://lore.kernel.org/oe-kbuild-all/202401080657.UjFnvQgX-lkp@intel.com/
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240108001223.23835-1-bvanassche@acm.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the type blk_opf_t for read and write operations instead of int. This
patch does not affect the generated code but fixes the following sparse
warning:

drivers/md/raid1.c:1993:60: sparse: sparse: incorrect type in argument 5 (different base types)
     expected restricted blk_opf_t [usertype] opf
     got int rw

Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Fixes: 3c5e514db58f ("md/raid1: Use the new blk_opf_t type")
Cc: stable@vger.kernel.org # v6.0+
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Closes: https://lore.kernel.org/oe-kbuild-all/202401080657.UjFnvQgX-lkp@intel.com/
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20240108001223.23835-1-bvanassche@acm.org
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: support read error check</title>
<updated>2023-12-15T23:22:15+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2023-12-15T02:38:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ca294b34aaf3a417fe9069b174e52508ac918ec8'/>
<id>ca294b34aaf3a417fe9069b174e52508ac918ec8</id>
<content type='text'>
After commit 1e50915fe0bb ("raid: improve MD/raid10 handling of correctable
read errors."), rdev will be set to faulty if it reads data error to many
times in raid10. Add this mechanism to raid1 now.

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231215023852.3478228-3-linan666@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After commit 1e50915fe0bb ("raid: improve MD/raid10 handling of correctable
read errors."), rdev will be set to faulty if it reads data error to many
times in raid10. Add this mechanism to raid1 now.

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231215023852.3478228-3-linan666@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>md: factor out a helper exceed_read_errors() to check read_errors</title>
<updated>2023-12-15T23:22:15+00:00</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2023-12-15T02:38:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1979dbbe328ca4e1d0f061c94381cfa03388088d'/>
<id>1979dbbe328ca4e1d0f061c94381cfa03388088d</id>
<content type='text'>
Move check_decay_read_errors() to raid1-10.c and factor out a helper
exceed_read_errors() to check if read_errors exceeds the limit, so that
raid1 can also use it. There are no functional changes.

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231215023852.3478228-2-linan666@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move check_decay_read_errors() to raid1-10.c and factor out a helper
exceed_read_errors() to check if read_errors exceeds the limit, so that
raid1 can also use it. There are no functional changes.

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231215023852.3478228-2-linan666@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: remove unnecessary null checking</title>
<updated>2023-12-15T08:45:37+00:00</updated>
<author>
<name>Gou Hao</name>
<email>gouhao@uniontech.com</email>
</author>
<published>2023-12-14T15:14:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=af140f806ae2679f9dba48ea0f5811da83854eb6'/>
<id>af140f806ae2679f9dba48ea0f5811da83854eb6</id>
<content type='text'>
If %__GFP_DIRECT_RECLAIM is set then bio_alloc_bioset will always
be able to allocate a bio. See comment of bio_alloc_bioset.

Signed-off-by: Gou Hao &lt;gouhao@uniontech.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231214151458.28970-1-gouhao@uniontech.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If %__GFP_DIRECT_RECLAIM is set then bio_alloc_bioset will always
be able to allocate a bio. See comment of bio_alloc_bioset.

Signed-off-by: Gou Hao &lt;gouhao@uniontech.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231214151458.28970-1-gouhao@uniontech.com
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid1: remove rcu protection to access rdev from conf</title>
<updated>2023-11-27T23:49:05+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-11-25T08:16:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2d32777d60de81aa020a2431567020af26564c71'/>
<id>2d32777d60de81aa020a2431567020af26564c71</id>
<content type='text'>
Because it's safe to accees rdev from conf:
 - If any spinlock is held, because synchronize_rcu() from
   md_kick_rdev_from_array() will prevent 'rdev' to be freed until
   spinlock is released;
 - If 'reconfig_lock' is held, because rdev can't be added or removed from
   array;
 - If there is normal IO inflight, because mddev_suspend() will prevent
   rdev to be added or removed from array;
 - If there is sync IO inflight, because 'MD_RECOVERY_RUNNING' is
   checked in remove_and_add_spares().

And these will cover all the scenarios in raid1.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231125081604.3939938-4-yukuai1@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Because it's safe to accees rdev from conf:
 - If any spinlock is held, because synchronize_rcu() from
   md_kick_rdev_from_array() will prevent 'rdev' to be freed until
   spinlock is released;
 - If 'reconfig_lock' is held, because rdev can't be added or removed from
   array;
 - If there is normal IO inflight, because mddev_suspend() will prevent
   rdev to be added or removed from array;
 - If there is sync IO inflight, because 'MD_RECOVERY_RUNNING' is
   checked in remove_and_add_spares().

And these will cover all the scenarios in raid1.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231125081604.3939938-4-yukuai1@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>md: remove flag RemoveSynchronized</title>
<updated>2023-11-27T23:49:04+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-11-25T08:16:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c891f1fd90e66e584bb1353e1859cef7c9eb36f8'/>
<id>c891f1fd90e66e584bb1353e1859cef7c9eb36f8</id>
<content type='text'>
rcu is not used correctly here, because synchronize_rcu() is called
before replacing old value, for example:

remove_and_add_spares   // other path
 synchronize_rcu
 // called before replacing old value
 set_bit(RemoveSynchronized)
                        rcu_read_lock()
                        rdev = conf-&gt;mirros[].rdev
 pers-&gt;hot_remove_disk
  conf-&gt;mirros[].rdev = NULL;
  if (!test_bit(RemoveSynchronized))
   synchronize_rcu
   /*
    * won't be called, and won't wait
    * for concurrent readers to be done.
    */
                        // access rdev after remove_and_add_spares()
                        rcu_read_unlock()

Fortunately, there is a separate rcu protection to prevent such rdev
to be freed:

md_kick_rdev_from_array		//other path
				rcu_read_lock()
				rdev = conf-&gt;mirros[].rdev
list_del_rcu(&amp;rdev-&gt;same_set)

				rcu_read_unlock()
				/*
				 * rdev can be removed from conf, but
				 * rdev won't be freed.
				 */
synchronize_rcu()
free rdev

Hence remove this useless flag and prepare to remove rcu protection to
access rdev from 'conf'.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231125081604.3939938-2-yukuai1@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
rcu is not used correctly here, because synchronize_rcu() is called
before replacing old value, for example:

remove_and_add_spares   // other path
 synchronize_rcu
 // called before replacing old value
 set_bit(RemoveSynchronized)
                        rcu_read_lock()
                        rdev = conf-&gt;mirros[].rdev
 pers-&gt;hot_remove_disk
  conf-&gt;mirros[].rdev = NULL;
  if (!test_bit(RemoveSynchronized))
   synchronize_rcu
   /*
    * won't be called, and won't wait
    * for concurrent readers to be done.
    */
                        // access rdev after remove_and_add_spares()
                        rcu_read_unlock()

Fortunately, there is a separate rcu protection to prevent such rdev
to be freed:

md_kick_rdev_from_array		//other path
				rcu_read_lock()
				rdev = conf-&gt;mirros[].rdev
list_del_rcu(&amp;rdev-&gt;same_set)

				rcu_read_unlock()
				/*
				 * rdev can be removed from conf, but
				 * rdev won't be freed.
				 */
synchronize_rcu()
free rdev

Hence remove this useless flag and prepare to remove rcu protection to
access rdev from 'conf'.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231125081604.3939938-2-yukuai1@huaweicloud.com
</pre>
</div>
</content>
</entry>
</feed>
