<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/raid5-cache.c, branch v6.12.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>md/raid5: use wait_on_bit() for R5_Overlap</title>
<updated>2024-08-29T16:37:10+00:00</updated>
<author>
<name>Artur Paszkiewicz</name>
<email>artur.paszkiewicz@intel.com</email>
</author>
<published>2024-08-27T15:35:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e6a03207b925aebe10d6aacd8e040ccdaa731bff'/>
<id>e6a03207b925aebe10d6aacd8e040ccdaa731bff</id>
<content type='text'>
Convert uses of wait_for_overlap wait queue with R5_Overlap bit to
wait_on_bit() / wake_up_bit().

Signed-off-by: Artur Paszkiewicz &lt;artur.paszkiewicz@intel.com&gt;
Link: https://lore.kernel.org/r/20240827153536.6743-2-artur.paszkiewicz@intel.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert uses of wait_for_overlap wait queue with R5_Overlap bit to
wait_on_bit() / wake_up_bit().

Signed-off-by: Artur Paszkiewicz &lt;artur.paszkiewicz@intel.com&gt;
Link: https://lore.kernel.org/r/20240827153536.6743-2-artur.paszkiewicz@intel.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/md-bitmap: merge md_bitmap_endwrite() into bitmap_operations</title>
<updated>2024-08-27T17:14:17+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-08-26T07:44:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3486015facc030f30d694b92dc18e58073c6c9e0'/>
<id>3486015facc030f30d694b92dc18e58073c6c9e0</id>
<content type='text'>
So that the implementation won't be exposed, and it'll be possible
to invent a new bitmap by replacing bitmap_operations.

Also change the parameter from bitmap to mddev, to avoid access
bitmap outside md-bitmap.c as much as possible. And change the type
of 'success' and 'behind' from int to bool.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20240826074452.1490072-25-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So that the implementation won't be exposed, and it'll be possible
to invent a new bitmap by replacing bitmap_operations.

Also change the parameter from bitmap to mddev, to avoid access
bitmap outside md-bitmap.c as much as possible. And change the type
of 'success' and 'behind' from int to bool.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20240826074452.1490072-25-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5: remove rcu protection to access rdev from conf</title>
<updated>2023-11-27T23:49:05+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-11-25T08:16:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ad8606702f268903b26795e6b93605646fd1a6a8'/>
<id>ad8606702f268903b26795e6b93605646fd1a6a8</id>
<content type='text'>
Because it's safe to accees rdev from conf:
 - If any spinlock is held, because synchronize_rcu() from
   md_kick_rdev_from_array() will prevent 'rdev' to be freed until
   spinlock is released;
 - If 'reconfig_lock' is held, because rdev can't be added or removed from
   array;
 - If there is normal IO inflight, because mddev_suspend() will prevent
   rdev to be added or removed from array;
 - If there is sync IO inflight, because 'MD_RECOVERY_RUNNING' is
   checked in remove_and_add_spares().

And these will cover all the scenarios in raid456.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231125081604.3939938-5-yukuai1@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Because it's safe to accees rdev from conf:
 - If any spinlock is held, because synchronize_rcu() from
   md_kick_rdev_from_array() will prevent 'rdev' to be freed until
   spinlock is released;
 - If 'reconfig_lock' is held, because rdev can't be added or removed from
   array;
 - If there is normal IO inflight, because mddev_suspend() will prevent
   rdev to be added or removed from array;
 - If there is sync IO inflight, because 'MD_RECOVERY_RUNNING' is
   checked in remove_and_add_spares().

And these will cover all the scenarios in raid456.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231125081604.3939938-5-yukuai1@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>md: rename __mddev_suspend/resume() back to mddev_suspend/resume()</title>
<updated>2023-10-11T01:49:51+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-10-10T15:19:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2b16a52549d51937a98d82b07b4d83dce6c43683'/>
<id>2b16a52549d51937a98d82b07b4d83dce6c43683</id>
<content type='text'>
Now that the old apis are removed, __mddev_suspend/resume() can be
renamed to their original names.

This is done by:

sed -i "s/__mddev_suspend/mddev_suspend/g" *.[ch]
sed -i "s/__mddev_resume/mddev_resume/g" *.[ch]

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231010151958.145896-20-yukuai1@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that the old apis are removed, __mddev_suspend/resume() can be
renamed to their original names.

This is done by:

sed -i "s/__mddev_suspend/mddev_suspend/g" *.[ch]
sed -i "s/__mddev_resume/mddev_resume/g" *.[ch]

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231010151958.145896-20-yukuai1@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5-cache: use new apis to suspend array</title>
<updated>2023-10-11T01:49:50+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-10-10T15:19:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1b172e0b11c00e89c5df72c2761b3d4d279fbb4d'/>
<id>1b172e0b11c00e89c5df72c2761b3d4d279fbb4d</id>
<content type='text'>
Convert to use new apis, the old apis will be removed eventually.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231010151958.145896-9-yukuai1@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert to use new apis, the old apis will be removed eventually.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231010151958.145896-9-yukuai1@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf-&gt;log'</title>
<updated>2023-10-11T01:49:49+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-10-10T15:19:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=06a4d0d8c642b5ea654e832b74dca12965356da0'/>
<id>06a4d0d8c642b5ea654e832b74dca12965356da0</id>
<content type='text'>
'conf-&gt;log' is set with 'reconfig_mutex' grabbed, however, readers are
not procted, hence protect it with READ_ONCE/WRITE_ONCE to prevent
reading abnormal values.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231010151958.145896-3-yukuai1@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
'conf-&gt;log' is set with 'reconfig_mutex' grabbed, however, readers are
not procted, hence protect it with READ_ONCE/WRITE_ONCE to prevent
reading abnormal values.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20231010151958.145896-3-yukuai1@huaweicloud.com
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5-cache: fix null-ptr-deref for r5l_flush_stripe_to_raid()</title>
<updated>2023-08-15T16:40:27+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-08-08T10:49:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0d0bd28c500173bfca78aa840f8f36d261ef1765'/>
<id>0d0bd28c500173bfca78aa840f8f36d261ef1765</id>
<content type='text'>
r5l_flush_stripe_to_raid() will check if the list 'flushing_ios' is
empty, and then submit 'flush_bio', however, r5l_log_flush_endio()
is clearing the list first and then clear the bio, which will cause
null-ptr-deref:

T1: submit flush io
raid5d
 handle_active_stripes
  r5l_flush_stripe_to_raid
   // list is empty
   // add 'io_end_ios' to the list
   bio_init
   submit_bio
   // io1

T2: io1 is done
r5l_log_flush_endio
 list_splice_tail_init
 // clear the list
			T3: submit new flush io
			...
			r5l_flush_stripe_to_raid
			 // list is empty
			 // add 'io_end_ios' to the list
			 bio_init
 bio_uninit
 // clear bio-&gt;bi_blkg
			 submit_bio
			 // null-ptr-deref

Fix this problem by clearing bio before clearing the list in
r5l_log_flush_endio().

Fixes: 0dd00cba99c3 ("raid5-cache: fully initialize flush_bio when needed")
Reported-and-tested-by: Corey Hickey &lt;bugfood-ml@fatooh.org&gt;
Closes: https://lore.kernel.org/all/cddd7213-3dfd-4ab7-a3ac-edd54d74a626@fatooh.org/
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
r5l_flush_stripe_to_raid() will check if the list 'flushing_ios' is
empty, and then submit 'flush_bio', however, r5l_log_flush_endio()
is clearing the list first and then clear the bio, which will cause
null-ptr-deref:

T1: submit flush io
raid5d
 handle_active_stripes
  r5l_flush_stripe_to_raid
   // list is empty
   // add 'io_end_ios' to the list
   bio_init
   submit_bio
   // io1

T2: io1 is done
r5l_log_flush_endio
 list_splice_tail_init
 // clear the list
			T3: submit new flush io
			...
			r5l_flush_stripe_to_raid
			 // list is empty
			 // add 'io_end_ios' to the list
			 bio_init
 bio_uninit
 // clear bio-&gt;bi_blkg
			 submit_bio
			 // null-ptr-deref

Fix this problem by clearing bio before clearing the list in
r5l_log_flush_endio().

Fixes: 0dd00cba99c3 ("raid5-cache: fully initialize flush_bio when needed")
Reported-and-tested-by: Corey Hickey &lt;bugfood-ml@fatooh.org&gt;
Closes: https://lore.kernel.org/all/cddd7213-3dfd-4ab7-a3ac-edd54d74a626@fatooh.org/
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Hold mddev-&gt;reconfig_mutex when trying to get mddev-&gt;sync_thread</title>
<updated>2023-08-15T16:40:26+00:00</updated>
<author>
<name>Li Lingfeng</name>
<email>lilingfeng3@huawei.com</email>
</author>
<published>2023-08-03T07:17:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7eb8ff02c1df279bf7f7f29b866beb655a9eebe9'/>
<id>7eb8ff02c1df279bf7f7f29b866beb655a9eebe9</id>
<content type='text'>
Commit ba9d9f1a707f ("Revert "md: unlock mddev before reap sync_thread in
action_store"") removed the scenario of calling md_unregister_thread()
without holding mddev-&gt;reconfig_mutex, so add a lock holding check before
acquiring mddev-&gt;sync_thread by passing mdev to md_unregister_thread().

Signed-off-by: Li Lingfeng &lt;lilingfeng3@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20230803071711.2546560-1-lilingfeng@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit ba9d9f1a707f ("Revert "md: unlock mddev before reap sync_thread in
action_store"") removed the scenario of calling md_unregister_thread()
without holding mddev-&gt;reconfig_mutex, so add a lock holding check before
acquiring mddev-&gt;sync_thread by passing mdev to md_unregister_thread().

Signed-off-by: Li Lingfeng &lt;lilingfeng3@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20230803071711.2546560-1-lilingfeng@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5-cache: fix a deadlock in r5l_exit_log()</title>
<updated>2023-08-15T16:37:27+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-07-08T09:17:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a705b11b358dee677aad80630e7608b2d5f56691'/>
<id>a705b11b358dee677aad80630e7608b2d5f56691</id>
<content type='text'>
Commit b13015af94cf ("md/raid5-cache: Clear conf-&gt;log after finishing
work") introduce a new problem:

// caller hold reconfig_mutex
r5l_exit_log
 flush_work(&amp;log-&gt;disable_writeback_work)
			r5c_disable_writeback_async
			 wait_event
			  /*
			   * conf-&gt;log is not NULL, and mddev_trylock()
			   * will fail, wait_event() can never pass.
			   */
 conf-&gt;log = NULL

Fix this problem by setting 'config-&gt;log' to NULL before wake_up() as it
used to be, so that wait_event() from r5c_disable_writeback_async() can
exist. In the meantime, move forward md_unregister_thread() so that
null-ptr-deref this commit fixed can still be fixed.

Fixes: b13015af94cf ("md/raid5-cache: Clear conf-&gt;log after finishing work")
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20230708091727.1417894-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit b13015af94cf ("md/raid5-cache: Clear conf-&gt;log after finishing
work") introduce a new problem:

// caller hold reconfig_mutex
r5l_exit_log
 flush_work(&amp;log-&gt;disable_writeback_work)
			r5c_disable_writeback_async
			 wait_event
			  /*
			   * conf-&gt;log is not NULL, and mddev_trylock()
			   * will fail, wait_event() can never pass.
			   */
 conf-&gt;log = NULL

Fix this problem by setting 'config-&gt;log' to NULL before wake_up() as it
used to be, so that wait_event() from r5c_disable_writeback_async() can
exist. In the meantime, move forward md_unregister_thread() so that
null-ptr-deref this commit fixed can still be fixed.

Fixes: b13015af94cf ("md/raid5-cache: Clear conf-&gt;log after finishing work")
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20230708091727.1417894-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: protect md_thread with rcu</title>
<updated>2023-06-13T22:25:39+00:00</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2023-05-23T02:10:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4469315439827290923fce4f3f672599cabeb366'/>
<id>4469315439827290923fce4f3f672599cabeb366</id>
<content type='text'>
Currently, there are many places that md_thread can be accessed without
protection, following are known scenarios that can cause
null-ptr-dereference or uaf:

1) sync_thread that is allocated and started from md_start_sync()
2) mddev-&gt;thread can be accessed directly from timeout_store() and
   md_bitmap_daemon_work()
3) md_unregister_thread() from action_store().

Currently, a global spinlock 'pers_lock' is borrowed to protect
'mddev-&gt;thread' in some places, this problem can be fixed likewise,
however, use a global lock for all the cases is not good.

Fix this problem by protecting all md_thread with rcu.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230523021017.3048783-6-yukuai1@huaweicloud.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, there are many places that md_thread can be accessed without
protection, following are known scenarios that can cause
null-ptr-dereference or uaf:

1) sync_thread that is allocated and started from md_start_sync()
2) mddev-&gt;thread can be accessed directly from timeout_store() and
   md_bitmap_daemon_work()
3) md_unregister_thread() from action_store().

Currently, a global spinlock 'pers_lock' is borrowed to protect
'mddev-&gt;thread' in some places, this problem can be fixed likewise,
however, use a global lock for all the cases is not good.

Fix this problem by protecting all md_thread with rcu.

Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230523021017.3048783-6-yukuai1@huaweicloud.com
</pre>
</div>
</content>
</entry>
</feed>
