<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/md/md.c, branch v5.19</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Revert "md: don't unregister sync_thread with reconfig_mutex held"</title>
<updated>2022-06-15T17:30:14+00:00</updated>
<author>
<name>Guoqing Jiang</name>
<email>guoqing.jiang@linux.dev</email>
</author>
<published>2022-06-07T02:03:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d0a180341fe00cd0bd1cc259d196dc255c13f229'/>
<id>d0a180341fe00cd0bd1cc259d196dc255c13f229</id>
<content type='text'>
The 07reshape5intr test is broke because of below path.

    md_reap_sync_thread
            -&gt; mddev_unlock
            -&gt; md_unregister_thread(&amp;mddev-&gt;sync_thread)

And md_check_recovery is triggered by,

mddev_unlock -&gt; md_wakeup_thread(mddev-&gt;thread)

then mddev-&gt;reshape_position is set to MaxSector in raid5_finish_reshape
since MD_RECOVERY_INTR is cleared in md_check_recovery, which means
feature_map is not set with MD_FEATURE_RESHAPE_ACTIVE and superblock's
reshape_position can't be updated accordingly.

Fixes: 8b48ec23cc51a ("md: don't unregister sync_thread with reconfig_mutex held")
Reported-by: Logan Gunthorpe &lt;logang@deltatee.com&gt;
Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The 07reshape5intr test is broke because of below path.

    md_reap_sync_thread
            -&gt; mddev_unlock
            -&gt; md_unregister_thread(&amp;mddev-&gt;sync_thread)

And md_check_recovery is triggered by,

mddev_unlock -&gt; md_wakeup_thread(mddev-&gt;thread)

then mddev-&gt;reshape_position is set to MaxSector in raid5_finish_reshape
since MD_RECOVERY_INTR is cleared in md_check_recovery, which means
feature_map is not set with MD_FEATURE_RESHAPE_ACTIVE and superblock's
reshape_position can't be updated accordingly.

Fixes: 8b48ec23cc51a ("md: don't unregister sync_thread with reconfig_mutex held")
Reported-by: Logan Gunthorpe &lt;logang@deltatee.com&gt;
Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: fix double free of io_acct_set bioset</title>
<updated>2022-05-23T06:07:22+00:00</updated>
<author>
<name>Xiao Ni</name>
<email>xni@redhat.com</email>
</author>
<published>2022-05-12T09:21:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=42b805af102471f53e3c7867b8c2b502ea4eef7e'/>
<id>42b805af102471f53e3c7867b8c2b502ea4eef7e</id>
<content type='text'>
Now io_acct_set is alloc and free in personality. Remove the codes that
free io_acct_set in md_free and md_stop.

Fixes: 0c031fd37f69 (md: Move alloc/free acct bioset in to personality)
Signed-off-by: Xiao Ni &lt;xni@redhat.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now io_acct_set is alloc and free in personality. Remove the codes that
free io_acct_set in md_free and md_stop.

Fixes: 0c031fd37f69 (md: Move alloc/free acct bioset in to personality)
Signed-off-by: Xiao Ni &lt;xni@redhat.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: remove most calls to bdevname</title>
<updated>2022-05-23T06:07:21+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2022-05-12T06:19:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=913cce5a1e588e3470ea064fe4ea336037d3a454'/>
<id>913cce5a1e588e3470ea064fe4ea336037d3a454</id>
<content type='text'>
Use the %pg format specifier to save on stack consumption and code size.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the %pg format specifier to save on stack consumption and code size.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: protect md_unregister_thread from reentrancy</title>
<updated>2022-05-23T06:07:21+00:00</updated>
<author>
<name>Guoqing Jiang</name>
<email>guoqing.jiang@cloud.ionos.com</email>
</author>
<published>2022-04-29T08:49:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1e267742283a4b5a8ca65755c44166be27e9aa0f'/>
<id>1e267742283a4b5a8ca65755c44166be27e9aa0f</id>
<content type='text'>
Generally, the md_unregister_thread is called with reconfig_mutex, but
raid_message in dm-raid doesn't hold reconfig_mutex to unregister thread,
so md_unregister_thread can be called simulitaneously from two call sites
in theory.

Then after previous commit which remove the protection of reconfig_mutex
for md_unregister_thread completely, the potential issue could be worse
than before.

Let's take pers_lock at the beginning of function to ensure reentrancy.

Reported-by: Donald Buczek &lt;buczek@molgen.mpg.de&gt;
Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Generally, the md_unregister_thread is called with reconfig_mutex, but
raid_message in dm-raid doesn't hold reconfig_mutex to unregister thread,
so md_unregister_thread can be called simulitaneously from two call sites
in theory.

Then after previous commit which remove the protection of reconfig_mutex
for md_unregister_thread completely, the potential issue could be worse
than before.

Let's take pers_lock at the beginning of function to ensure reentrancy.

Reported-by: Donald Buczek &lt;buczek@molgen.mpg.de&gt;
Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: don't unregister sync_thread with reconfig_mutex held</title>
<updated>2022-05-23T06:07:21+00:00</updated>
<author>
<name>Guoqing Jiang</name>
<email>guoqing.jiang@cloud.ionos.com</email>
</author>
<published>2021-02-13T00:49:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8b48ec23cc51a4e7c8dbaef5f34ebe67e1a80934'/>
<id>8b48ec23cc51a4e7c8dbaef5f34ebe67e1a80934</id>
<content type='text'>
Unregister sync_thread doesn't need to hold reconfig_mutex since it
doesn't reconfigure array.

And it could cause deadlock problem for raid5 as follows:

1. process A tried to reap sync thread with reconfig_mutex held after echo
   idle to sync_action.
2. raid5 sync thread was blocked if there were too many active stripes.
3. SB_CHANGE_PENDING was set (because of write IO comes from upper layer)
   which causes the number of active stripes can't be decreased.
4. SB_CHANGE_PENDING can't be cleared since md_check_recovery was not able
   to hold reconfig_mutex.

More details in the link:
https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t

And add one parameter to md_reap_sync_thread since it could be called by
dm-raid which doesn't hold reconfig_mutex.

Reported-and-tested-by: Donald Buczek &lt;buczek@molgen.mpg.de&gt;
Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@cloud.ionos.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Unregister sync_thread doesn't need to hold reconfig_mutex since it
doesn't reconfigure array.

And it could cause deadlock problem for raid5 as follows:

1. process A tried to reap sync thread with reconfig_mutex held after echo
   idle to sync_action.
2. raid5 sync thread was blocked if there were too many active stripes.
3. SB_CHANGE_PENDING was set (because of write IO comes from upper layer)
   which causes the number of active stripes can't be decreased.
4. SB_CHANGE_PENDING can't be cleared since md_check_recovery was not able
   to hold reconfig_mutex.

More details in the link:
https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t

And add one parameter to md_reap_sync_thread since it could be called by
dm-raid which doesn't hold reconfig_mutex.

Reported-and-tested-by: Donald Buczek &lt;buczek@molgen.mpg.de&gt;
Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@cloud.ionos.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Replace role magic numbers with defined constants</title>
<updated>2022-04-25T21:22:31+00:00</updated>
<author>
<name>David Sloan</name>
<email>david.sloan@eideticom.com</email>
</author>
<published>2022-04-21T19:45:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9151ad5d8676a89cf1b6a4051037ab3ca077d938'/>
<id>9151ad5d8676a89cf1b6a4051037ab3ca077d938</id>
<content type='text'>
There are several instances where magic numbers are used in md.c instead
of the defined constants in md_p.h. This patch set improves code
readability by replacing all occurrences of 0xffff, 0xfffe, and 0xfffd when
relating to md roles with their equivalent defined constant.

Signed-off-by: David Sloan &lt;david.sloan@eideticom.com&gt;
Reviewed-by: Logan Gunthorpe &lt;logang@deltatee.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are several instances where magic numbers are used in md.c instead
of the defined constants in md_p.h. This patch set improves code
readability by replacing all occurrences of 0xffff, 0xfffe, and 0xfffd when
relating to md roles with their equivalent defined constant.

Signed-off-by: David Sloan &lt;david.sloan@eideticom.com&gt;
Reviewed-by: Logan Gunthorpe &lt;logang@deltatee.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: replace deprecated strlcpy &amp; remove duplicated line</title>
<updated>2022-04-25T21:00:36+00:00</updated>
<author>
<name>Heming Zhao</name>
<email>heming.zhao@suse.com</email>
</author>
<published>2022-04-01T02:13:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=92d9aac92b7cc92c770e736c70c3acae7b803278'/>
<id>92d9aac92b7cc92c770e736c70c3acae7b803278</id>
<content type='text'>
This commit includes two topics:

1&gt; replace deprecated strlcpy

change strlcpy to strscpy for strlcpy is marked as deprecated in
Documentation/process/deprecated.rst

2&gt; remove duplicated strlcpy line

in md_bitmap_read_sb@md-bitmap.c there are two duplicated strlcpy(), the
history:

- commit cf921cc19cf7 ("Add node recovery callbacks") introduced the first
  usage of strlcpy().

- commit b97e92574c0b ("Use separate bitmaps for each nodes in the cluster")
  introduced the second strlcpy(). this time, the two strlcpy() are same,
   we can remove anyone safely.

- commit d3b178adb3a3 ("md: Skip cluster setup for dm-raid") added dm-raid
  special handling. And the "nodes" value is the key of this patch. but
  from this patch, strlcpy() which was introduced by b97e92574c0bf
  become necessary.

- commit 3c462c880b52 ("md: Increment version for clustered bitmaps") used
  clustered major version to only handle in clustered env. this patch
  could look a polishment for clustered code logic.

So cf921cc19cf7 became useless after d3b178adb3a3a, we could remove it
safely.

Signed-off-by: Heming Zhao &lt;heming.zhao@suse.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit includes two topics:

1&gt; replace deprecated strlcpy

change strlcpy to strscpy for strlcpy is marked as deprecated in
Documentation/process/deprecated.rst

2&gt; remove duplicated strlcpy line

in md_bitmap_read_sb@md-bitmap.c there are two duplicated strlcpy(), the
history:

- commit cf921cc19cf7 ("Add node recovery callbacks") introduced the first
  usage of strlcpy().

- commit b97e92574c0b ("Use separate bitmaps for each nodes in the cluster")
  introduced the second strlcpy(). this time, the two strlcpy() are same,
   we can remove anyone safely.

- commit d3b178adb3a3 ("md: Skip cluster setup for dm-raid") added dm-raid
  special handling. And the "nodes" value is the key of this patch. but
  from this patch, strlcpy() which was introduced by b97e92574c0bf
  become necessary.

- commit 3c462c880b52 ("md: Increment version for clustered bitmaps") used
  clustered major version to only handle in clustered env. this patch
  could look a polishment for clustered code logic.

So cf921cc19cf7 became useless after d3b178adb3a3a, we could remove it
safely.

Signed-off-by: Heming Zhao &lt;heming.zhao@suse.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: fix an incorrect NULL check in md_reload_sb</title>
<updated>2022-04-25T21:00:35+00:00</updated>
<author>
<name>Xiaomeng Tong</name>
<email>xiam0nd.tong@gmail.com</email>
</author>
<published>2022-04-08T08:47:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=64c54d9244a4efe9bc6e9c98e13c4bbb8bb39083'/>
<id>64c54d9244a4efe9bc6e9c98e13c4bbb8bb39083</id>
<content type='text'>
The bug is here:
	if (!rdev || rdev-&gt;desc_nr != nr) {

The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each_rcu(), so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element
found (In fact, it will be a bogus pointer to an invalid struct
object containing the HEAD). Otherwise it will bypass the check
and lead to invalid memory access passing the check.

To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'pdev' as a dedicated pointer to
point to the found element.

Cc: stable@vger.kernel.org
Fixes: 70bcecdb1534 ("md-cluster: Improve md_reload_sb to be less error prone")
Signed-off-by: Xiaomeng Tong &lt;xiam0nd.tong@gmail.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The bug is here:
	if (!rdev || rdev-&gt;desc_nr != nr) {

The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each_rcu(), so it is incorrect to assume that the
iterator value will be NULL if the list is empty or no element
found (In fact, it will be a bogus pointer to an invalid struct
object containing the HEAD). Otherwise it will bypass the check
and lead to invalid memory access passing the check.

To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'pdev' as a dedicated pointer to
point to the found element.

Cc: stable@vger.kernel.org
Fixes: 70bcecdb1534 ("md-cluster: Improve md_reload_sb to be less error prone")
Signed-off-by: Xiaomeng Tong &lt;xiam0nd.tong@gmail.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: fix an incorrect NULL check in does_sb_need_changing</title>
<updated>2022-04-25T21:00:35+00:00</updated>
<author>
<name>Xiaomeng Tong</name>
<email>xiam0nd.tong@gmail.com</email>
</author>
<published>2022-04-08T08:37:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fc8738343eefc4ea8afb6122826dea48eacde514'/>
<id>fc8738343eefc4ea8afb6122826dea48eacde514</id>
<content type='text'>
The bug is here:
	if (!rdev)

The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element found.
Otherwise it will bypass the NULL check and lead to invalid memory
access passing the check.

To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'rdev' as a dedicated pointer to
point to the found element.

Cc: stable@vger.kernel.org
Fixes: 2aa82191ac36 ("md-cluster: Perform a lazy update")
Acked-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Signed-off-by: Xiaomeng Tong &lt;xiam0nd.tong@gmail.com&gt;
Acked-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The bug is here:
	if (!rdev)

The list iterator value 'rdev' will *always* be set and non-NULL
by rdev_for_each(), so it is incorrect to assume that the iterator
value will be NULL if the list is empty or no element found.
Otherwise it will bypass the NULL check and lead to invalid memory
access passing the check.

To fix the bug, use a new variable 'iter' as the list iterator,
while using the original variable 'rdev' as a dedicated pointer to
point to the found element.

Cc: stable@vger.kernel.org
Fixes: 2aa82191ac36 ("md-cluster: Perform a lazy update")
Acked-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Signed-off-by: Xiaomeng Tong &lt;xiam0nd.tong@gmail.com&gt;
Acked-by: Goldwyn Rodrigues &lt;rgoldwyn@suse.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Set MD_BROKEN for RAID1 and RAID10</title>
<updated>2022-04-25T21:00:34+00:00</updated>
<author>
<name>Mariusz Tkaczyk</name>
<email>mariusz.tkaczyk@linux.intel.com</email>
</author>
<published>2022-03-22T15:23:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9631abdbf406c764f2a5d8305eac063bc3396a0a'/>
<id>9631abdbf406c764f2a5d8305eac063bc3396a0a</id>
<content type='text'>
There is no direct mechanism to determine raid failure outside
personality. It is done by checking rdev-&gt;flags after executing
md_error(). If "faulty" flag is not set then -EBUSY is returned to
userspace. -EBUSY means that array will be failed after drive removal.

Mdadm has special routine to handle the array failure and it is executed
if -EBUSY is returned by md.

There are at least two known reasons to not consider this mechanism
as correct:
1. drive can be removed even if array will be failed[1].
2. -EBUSY seems to be wrong status. Array is not busy, but removal
   process cannot proceed safe.

-EBUSY expectation cannot be removed without breaking compatibility
with userspace. In this patch first issue is resolved by adding support
for MD_BROKEN flag for RAID1 and RAID10. Support for RAID456 is added in
next commit.

The idea is to set the MD_BROKEN if we are sure that raid is in failed
state now. This is done in each error_handler(). In md_error() MD_BROKEN
flag is checked. If is set, then -EBUSY is returned to userspace.

As in previous commit, it causes that #mdadm --set-faulty is able to
fail array. Previously proposed workaround is valid if optional
functionality[1] is disabled.

[1] commit 9a567843f7ce("md: allow last device to be forcibly removed from
    RAID1/RAID10.")

Reviewd-by: Xiao Ni &lt;xni@redhat.com&gt;
Signed-off-by: Mariusz Tkaczyk &lt;mariusz.tkaczyk@linux.intel.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no direct mechanism to determine raid failure outside
personality. It is done by checking rdev-&gt;flags after executing
md_error(). If "faulty" flag is not set then -EBUSY is returned to
userspace. -EBUSY means that array will be failed after drive removal.

Mdadm has special routine to handle the array failure and it is executed
if -EBUSY is returned by md.

There are at least two known reasons to not consider this mechanism
as correct:
1. drive can be removed even if array will be failed[1].
2. -EBUSY seems to be wrong status. Array is not busy, but removal
   process cannot proceed safe.

-EBUSY expectation cannot be removed without breaking compatibility
with userspace. In this patch first issue is resolved by adding support
for MD_BROKEN flag for RAID1 and RAID10. Support for RAID456 is added in
next commit.

The idea is to set the MD_BROKEN if we are sure that raid is in failed
state now. This is done in each error_handler(). In md_error() MD_BROKEN
flag is checked. If is set, then -EBUSY is returned to userspace.

As in previous commit, it causes that #mdadm --set-faulty is able to
fail array. Previously proposed workaround is valid if optional
functionality[1] is disabled.

[1] commit 9a567843f7ce("md: allow last device to be forcibly removed from
    RAID1/RAID10.")

Reviewd-by: Xiao Ni &lt;xni@redhat.com&gt;
Signed-off-by: Mariusz Tkaczyk &lt;mariusz.tkaczyk@linux.intel.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
