<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/md-bitmap.c, branch v6.4</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>md: Fix bitmap offset type in sb writer</title>
<updated>2023-04-28T16:21:06+00:00</updated>
<author>
<name>Jonathan Derrick</name>
<email>jonathan.derrick@linux.dev</email>
</author>
<published>2023-04-25T01:14:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b1211978ecf19bceb63a04f53fea4b5d73832a4a'/>
<id>b1211978ecf19bceb63a04f53fea4b5d73832a4a</id>
<content type='text'>
Bitmap offset is allowed to be negative, indicating that bitmap precedes
metadata. Change the type back from sector_t to loff_t to satisfy
conditionals and calculations.

Reported-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Link: https://lore.kernel.org/linux-raid/CAPhsuW6HuaUJ5WcyPajVgUfkQFYp2D_cy1g6qxN4CU_gP2=z7g@mail.gmail.com/
Fixes: 10172f200b67 ("md: Fix types in sb writer")
Signed-off-by: Jonathan Derrick &lt;jonathan.derrick@linux.dev&gt;
Suggested-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230425011438.71046-1-jonathan.derrick@linux.dev
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Bitmap offset is allowed to be negative, indicating that bitmap precedes
metadata. Change the type back from sector_t to loff_t to satisfy
conditionals and calculations.

Reported-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Link: https://lore.kernel.org/linux-raid/CAPhsuW6HuaUJ5WcyPajVgUfkQFYp2D_cy1g6qxN4CU_gP2=z7g@mail.gmail.com/
Fixes: 10172f200b67 ("md: Fix types in sb writer")
Signed-off-by: Jonathan Derrick &lt;jonathan.derrick@linux.dev&gt;
Suggested-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230425011438.71046-1-jonathan.derrick@linux.dev
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Use optimal I/O size for last bitmap page</title>
<updated>2023-04-14T05:20:24+00:00</updated>
<author>
<name>Jon Derrick</name>
<email>jonathan.derrick@linux.dev</email>
</author>
<published>2023-02-24T18:33:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8745faa95611f6331331284925012c0310ae3e2c'/>
<id>8745faa95611f6331331284925012c0310ae3e2c</id>
<content type='text'>
If the bitmap space has enough room, size the I/O for the last bitmap
page write to the optimal I/O size for the storage device. The expanded
write is checked that it won't overrun the data or metadata.

The drive this was tested against has higher latencies when there are
sub-4k writes due to device-side read-mod-writes of its atomic 4k write
unit. This change helps increase performance by sizing the last bitmap
page I/O for the device's preferred write unit, if it is given.

Example Intel/Solidigm P5520
Raid10, Chunk-size 64M, bitmap-size 57228 bits

$ mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/nvme{0,1,2,3}n1
        --assume-clean --bitmap=internal --bitmap-chunk=64M
$ fio --name=test --direct=1 --filename=/dev/md0 --rw=randwrite --bs=4k --runtime=60

Without patch:
  write: IOPS=1676, BW=6708KiB/s (6869kB/s)(393MiB/60001msec); 0 zone resets

With patch:
  write: IOPS=15.7k, BW=61.4MiB/s (64.4MB/s)(3683MiB/60001msec); 0 zone resets

Biosnoop:
Without patch:
Time        Process        PID     Device      LBA        Size      Lat
1.410377    md0_raid10     6900    nvme0n1   W 16         4096      0.02
1.410387    md0_raid10     6900    nvme2n1   W 16         4096      0.02
1.410374    md0_raid10     6900    nvme3n1   W 16         4096      0.01
1.410381    md0_raid10     6900    nvme1n1   W 16         4096      0.02
1.410411    md0_raid10     6900    nvme1n1   W 115346512  4096      0.01
1.410418    md0_raid10     6900    nvme0n1   W 115346512  4096      0.02
1.410915    md0_raid10     6900    nvme2n1   W 24         3584      0.43 &lt;--
1.410935    md0_raid10     6900    nvme3n1   W 24         3584      0.45 &lt;--
1.411124    md0_raid10     6900    nvme1n1   W 24         3584      0.64 &lt;--
1.411147    md0_raid10     6900    nvme0n1   W 24         3584      0.66 &lt;--
1.411176    md0_raid10     6900    nvme3n1   W 2019022184 4096      0.01
1.411189    md0_raid10     6900    nvme2n1   W 2019022184 4096      0.02

With patch:
Time        Process        PID     Device      LBA        Size      Lat
5.747193    md0_raid10     727     nvme0n1   W 16         4096      0.01
5.747192    md0_raid10     727     nvme1n1   W 16         4096      0.02
5.747195    md0_raid10     727     nvme3n1   W 16         4096      0.01
5.747202    md0_raid10     727     nvme2n1   W 16         4096      0.02
5.747229    md0_raid10     727     nvme3n1   W 1196223704 4096      0.02
5.747224    md0_raid10     727     nvme0n1   W 1196223704 4096      0.01
5.747279    md0_raid10     727     nvme0n1   W 24         4096      0.01 &lt;--
5.747279    md0_raid10     727     nvme1n1   W 24         4096      0.02 &lt;--
5.747284    md0_raid10     727     nvme3n1   W 24         4096      0.02 &lt;--
5.747291    md0_raid10     727     nvme2n1   W 24         4096      0.02 &lt;--
5.747314    md0_raid10     727     nvme2n1   W 2234636712 4096      0.01
5.747317    md0_raid10     727     nvme1n1   W 2234636712 4096      0.02

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jon Derrick &lt;jonathan.derrick@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230224183323.638-4-jonathan.derrick@linux.dev
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the bitmap space has enough room, size the I/O for the last bitmap
page write to the optimal I/O size for the storage device. The expanded
write is checked that it won't overrun the data or metadata.

The drive this was tested against has higher latencies when there are
sub-4k writes due to device-side read-mod-writes of its atomic 4k write
unit. This change helps increase performance by sizing the last bitmap
page I/O for the device's preferred write unit, if it is given.

Example Intel/Solidigm P5520
Raid10, Chunk-size 64M, bitmap-size 57228 bits

$ mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/nvme{0,1,2,3}n1
        --assume-clean --bitmap=internal --bitmap-chunk=64M
$ fio --name=test --direct=1 --filename=/dev/md0 --rw=randwrite --bs=4k --runtime=60

Without patch:
  write: IOPS=1676, BW=6708KiB/s (6869kB/s)(393MiB/60001msec); 0 zone resets

With patch:
  write: IOPS=15.7k, BW=61.4MiB/s (64.4MB/s)(3683MiB/60001msec); 0 zone resets

Biosnoop:
Without patch:
Time        Process        PID     Device      LBA        Size      Lat
1.410377    md0_raid10     6900    nvme0n1   W 16         4096      0.02
1.410387    md0_raid10     6900    nvme2n1   W 16         4096      0.02
1.410374    md0_raid10     6900    nvme3n1   W 16         4096      0.01
1.410381    md0_raid10     6900    nvme1n1   W 16         4096      0.02
1.410411    md0_raid10     6900    nvme1n1   W 115346512  4096      0.01
1.410418    md0_raid10     6900    nvme0n1   W 115346512  4096      0.02
1.410915    md0_raid10     6900    nvme2n1   W 24         3584      0.43 &lt;--
1.410935    md0_raid10     6900    nvme3n1   W 24         3584      0.45 &lt;--
1.411124    md0_raid10     6900    nvme1n1   W 24         3584      0.64 &lt;--
1.411147    md0_raid10     6900    nvme0n1   W 24         3584      0.66 &lt;--
1.411176    md0_raid10     6900    nvme3n1   W 2019022184 4096      0.01
1.411189    md0_raid10     6900    nvme2n1   W 2019022184 4096      0.02

With patch:
Time        Process        PID     Device      LBA        Size      Lat
5.747193    md0_raid10     727     nvme0n1   W 16         4096      0.01
5.747192    md0_raid10     727     nvme1n1   W 16         4096      0.02
5.747195    md0_raid10     727     nvme3n1   W 16         4096      0.01
5.747202    md0_raid10     727     nvme2n1   W 16         4096      0.02
5.747229    md0_raid10     727     nvme3n1   W 1196223704 4096      0.02
5.747224    md0_raid10     727     nvme0n1   W 1196223704 4096      0.01
5.747279    md0_raid10     727     nvme0n1   W 24         4096      0.01 &lt;--
5.747279    md0_raid10     727     nvme1n1   W 24         4096      0.02 &lt;--
5.747284    md0_raid10     727     nvme3n1   W 24         4096      0.02 &lt;--
5.747291    md0_raid10     727     nvme2n1   W 24         4096      0.02 &lt;--
5.747314    md0_raid10     727     nvme2n1   W 2234636712 4096      0.01
5.747317    md0_raid10     727     nvme1n1   W 2234636712 4096      0.02

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jon Derrick &lt;jonathan.derrick@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230224183323.638-4-jonathan.derrick@linux.dev
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Fix types in sb writer</title>
<updated>2023-04-14T05:20:24+00:00</updated>
<author>
<name>Jon Derrick</name>
<email>jonathan.derrick@linux.dev</email>
</author>
<published>2023-02-24T18:33:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=10172f200b67aafcb9b8ce7d0575d713c1aafbb7'/>
<id>10172f200b67aafcb9b8ce7d0575d713c1aafbb7</id>
<content type='text'>
Page-&gt;index is a pgoff_t and multiplying could cause overflows on a
32-bit architecture. In the sb writer, this is used to calculate and
verify the sector being used, and is multiplied by a sector value. Using
sector_t will cast it to a u64 type and is the more appropriate type for
the unit. Additionally, the integer size unit is converted to a sector
unit in later calculations, and is now corrected to be an unsigned type.

Finally, clean up the calculations using variable aliases to improve
readabiliy.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jon Derrick &lt;jonathan.derrick@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230224183323.638-3-jonathan.derrick@linux.dev
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Page-&gt;index is a pgoff_t and multiplying could cause overflows on a
32-bit architecture. In the sb writer, this is used to calculate and
verify the sector being used, and is multiplied by a sector value. Using
sector_t will cast it to a u64 type and is the more appropriate type for
the unit. Additionally, the integer size unit is converted to a sector
unit in later calculations, and is now corrected to be an unsigned type.

Finally, clean up the calculations using variable aliases to improve
readabiliy.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jon Derrick &lt;jonathan.derrick@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230224183323.638-3-jonathan.derrick@linux.dev
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Move sb writer loop to its own function</title>
<updated>2023-04-14T05:20:23+00:00</updated>
<author>
<name>Jon Derrick</name>
<email>jonathan.derrick@linux.dev</email>
</author>
<published>2023-02-24T18:33:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=328e17d8d9428b390ab48bdedcaf5106e4b511fe'/>
<id>328e17d8d9428b390ab48bdedcaf5106e4b511fe</id>
<content type='text'>
Preparatory patch for optimal I/O size calculation. Move the sb writer
loop routine into its own function for clarity.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jon Derrick &lt;jonathan.derrick@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230224183323.638-2-jonathan.derrick@linux.dev
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Preparatory patch for optimal I/O size calculation. Move the sb writer
loop routine into its own function for clarity.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jon Derrick &lt;jonathan.derrick@linux.dev&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20230224183323.638-2-jonathan.derrick@linux.dev
</pre>
</div>
</content>
</entry>
<entry>
<title>md/bitmap: Fix bitmap chunk size overflow issues</title>
<updated>2022-11-14T17:35:50+00:00</updated>
<author>
<name>Florian-Ewald Mueller</name>
<email>florian-ewald.mueller@ionos.com</email>
</author>
<published>2022-10-25T07:37:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4555211190798b6b6fa2c37667d175bf67945c78'/>
<id>4555211190798b6b6fa2c37667d175bf67945c78</id>
<content type='text'>
- limit bitmap chunk size internal u64 variable to values not overflowing
  the u32 bitmap superblock structure variable stored on persistent media
- assign bitmap chunk size internal u64 variable from unsigned values to
  avoid possible sign extension artifacts when assigning from a s32 value

The bug has been there since at least kernel 4.0.
Steps to reproduce it:
1: mdadm -C /dev/mdx -l 1 --bitmap=internal --bitmap-chunk=256M -e 1.2
-n2 /dev/rnbd1 /dev/rnbd2
2 resize member device rnbd1 and rnbd2 to 8 TB
3 mdadm --grow /dev/mdx --size=max

The bitmap_chunksize will overflow without patch.

Cc: stable@vger.kernel.org

Signed-off-by: Florian-Ewald Mueller &lt;florian-ewald.mueller@ionos.com&gt;
Signed-off-by: Jack Wang &lt;jinpu.wang@ionos.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- limit bitmap chunk size internal u64 variable to values not overflowing
  the u32 bitmap superblock structure variable stored on persistent media
- assign bitmap chunk size internal u64 variable from unsigned values to
  avoid possible sign extension artifacts when assigning from a s32 value

The bug has been there since at least kernel 4.0.
Steps to reproduce it:
1: mdadm -C /dev/mdx -l 1 --bitmap=internal --bitmap-chunk=256M -e 1.2
-n2 /dev/rnbd1 /dev/rnbd2
2 resize member device rnbd1 and rnbd2 to 8 TB
3 mdadm --grow /dev/mdx --size=max

The bitmap_chunksize will overflow without patch.

Cc: stable@vger.kernel.org

Signed-off-by: Florian-Ewald Mueller &lt;florian-ewald.mueller@ionos.com&gt;
Signed-off-by: Jack Wang &lt;jinpu.wang@ionos.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drivers/md/md-bitmap: check the return value of md_bitmap_get_counter()</title>
<updated>2022-11-14T17:35:49+00:00</updated>
<author>
<name>Li Zhong</name>
<email>floridsleeves@gmail.com</email>
</author>
<published>2022-09-16T23:33:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3bd548e5b819b8c0f2c9085de775c5c7bff9052f'/>
<id>3bd548e5b819b8c0f2c9085de775c5c7bff9052f</id>
<content type='text'>
Check the return value of md_bitmap_get_counter() in case it returns
NULL pointer, which will result in a null pointer dereference.

v2: update the check to include other dereference

Signed-off-by: Li Zhong &lt;floridsleeves@gmail.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Check the return value of md_bitmap_get_counter() in case it returns
NULL pointer, which will result in a null pointer dereference.

v2: update the check to include other dereference

Signed-off-by: Li Zhong &lt;floridsleeves@gmail.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs/buffer: Combine two submit_bh() and ll_rw_block() arguments</title>
<updated>2022-07-14T18:14:32+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2022-07-14T18:07:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1420c4a549bf28ffddbed827d61fb3d4d2132ddb'/>
<id>1420c4a549bf28ffddbed827d61fb3d4d2132ddb</id>
<content type='text'>
Both submit_bh() and ll_rw_block() accept a request operation type and
request flags as their first two arguments. Micro-optimize these two
functions by combining these first two arguments into a single argument.
This patch does not change the behavior of any of the modified code.

Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Acked-by: Song Liu &lt;song@kernel.org&gt; (for the md changes)
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Link: https://lore.kernel.org/r/20220714180729.1065367-48-bvanassche@acm.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Both submit_bh() and ll_rw_block() accept a request operation type and
request flags as their first two arguments. Micro-optimize these two
functions by combining these first two arguments into a single argument.
This patch does not change the behavior of any of the modified code.

Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Acked-by: Song Liu &lt;song@kernel.org&gt; (for the md changes)
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Link: https://lore.kernel.org/r/20220714180729.1065367-48-bvanassche@acm.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/core: Combine two sync_page_io() arguments</title>
<updated>2022-07-14T18:14:31+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2022-07-14T18:06:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4ce4c73f662bdb0ae5bfb058bc7ec6f6829ca078'/>
<id>4ce4c73f662bdb0ae5bfb058bc7ec6f6829ca078</id>
<content type='text'>
Improve uniformity in the kernel of handling of request operation and
flags by passing these as a single argument.

Cc: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Link: https://lore.kernel.org/r/20220714180729.1065367-32-bvanassche@acm.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Improve uniformity in the kernel of handling of request operation and
flags by passing these as a single argument.

Cc: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Link: https://lore.kernel.org/r/20220714180729.1065367-32-bvanassche@acm.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: replace deprecated strlcpy &amp; remove duplicated line</title>
<updated>2022-04-25T21:00:36+00:00</updated>
<author>
<name>Heming Zhao</name>
<email>heming.zhao@suse.com</email>
</author>
<published>2022-04-01T02:13:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=92d9aac92b7cc92c770e736c70c3acae7b803278'/>
<id>92d9aac92b7cc92c770e736c70c3acae7b803278</id>
<content type='text'>
This commit includes two topics:

1&gt; replace deprecated strlcpy

change strlcpy to strscpy for strlcpy is marked as deprecated in
Documentation/process/deprecated.rst

2&gt; remove duplicated strlcpy line

in md_bitmap_read_sb@md-bitmap.c there are two duplicated strlcpy(), the
history:

- commit cf921cc19cf7 ("Add node recovery callbacks") introduced the first
  usage of strlcpy().

- commit b97e92574c0b ("Use separate bitmaps for each nodes in the cluster")
  introduced the second strlcpy(). this time, the two strlcpy() are same,
   we can remove anyone safely.

- commit d3b178adb3a3 ("md: Skip cluster setup for dm-raid") added dm-raid
  special handling. And the "nodes" value is the key of this patch. but
  from this patch, strlcpy() which was introduced by b97e92574c0bf
  become necessary.

- commit 3c462c880b52 ("md: Increment version for clustered bitmaps") used
  clustered major version to only handle in clustered env. this patch
  could look a polishment for clustered code logic.

So cf921cc19cf7 became useless after d3b178adb3a3a, we could remove it
safely.

Signed-off-by: Heming Zhao &lt;heming.zhao@suse.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit includes two topics:

1&gt; replace deprecated strlcpy

change strlcpy to strscpy for strlcpy is marked as deprecated in
Documentation/process/deprecated.rst

2&gt; remove duplicated strlcpy line

in md_bitmap_read_sb@md-bitmap.c there are two duplicated strlcpy(), the
history:

- commit cf921cc19cf7 ("Add node recovery callbacks") introduced the first
  usage of strlcpy().

- commit b97e92574c0b ("Use separate bitmaps for each nodes in the cluster")
  introduced the second strlcpy(). this time, the two strlcpy() are same,
   we can remove anyone safely.

- commit d3b178adb3a3 ("md: Skip cluster setup for dm-raid") added dm-raid
  special handling. And the "nodes" value is the key of this patch. but
  from this patch, strlcpy() which was introduced by b97e92574c0bf
  become necessary.

- commit 3c462c880b52 ("md: Increment version for clustered bitmaps") used
  clustered major version to only handle in clustered env. this patch
  could look a polishment for clustered code logic.

So cf921cc19cf7 became useless after d3b178adb3a3a, we could remove it
safely.

Signed-off-by: Heming Zhao &lt;heming.zhao@suse.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md/bitmap: don't set sb values if can't pass sanity check</title>
<updated>2022-04-25T21:00:35+00:00</updated>
<author>
<name>Heming Zhao</name>
<email>heming.zhao@suse.com</email>
</author>
<published>2022-04-01T02:13:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e68cb83a57a458b01c9739e2ad9cb70b04d1e6d2'/>
<id>e68cb83a57a458b01c9739e2ad9cb70b04d1e6d2</id>
<content type='text'>
If bitmap area contains invalid data, kernel will crash then mdadm
triggers "Segmentation fault".
This is cluster-md speical bug. In non-clustered env, mdadm will
handle broken metadata case. In clustered array, only kernel space
handles bitmap slot info. But even this bug only happened in clustered
env, current sanity check is wrong, the code should be changed.

How to trigger: (faulty injection)

dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sda
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdb
mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb
mdadm -Ss
echo aaa &gt; magic.txt
 == below modifying slot 2 bitmap data ==
dd if=magic.txt of=/dev/sda seek=16384 bs=1 count=3 &lt;== destroy magic
dd if=/dev/zero of=/dev/sda seek=16436 bs=1 count=4 &lt;== ZERO chunksize
mdadm -A /dev/md0 /dev/sda /dev/sdb
 == kernel crashes. mdadm outputs "Segmentation fault" ==

Reason of kernel crash:

In md_bitmap_read_sb (called by md_bitmap_create), bad bitmap magic didn't
block chunksize assignment, and zero value made DIV_ROUND_UP_SECTOR_T()
trigger "divide error".

Crash log:

kernel: md: md0 stopped.
kernel: md/raid1:md0: not clean -- starting background reconstruction
kernel: md/raid1:md0: active with 2 out of 2 mirrors
kernel: dlm: ... ...
kernel: md-cluster: Joined cluster 44810aba-38bb-e6b8-daca-bc97a0b254aa slot 1
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 2
kernel: md-cluster: Could not gather bitmaps from slot 2
kernel: divide error: 0000 [#1] SMP NOPTI
kernel: CPU: 0 PID: 1603 Comm: mdadm Not tainted 5.14.6-1-default
kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]
kernel: RSP: 0018:ffffc22ac0843ba0 EFLAGS: 00010246
kernel: ... ...
kernel: Call Trace:
kernel:  ? dlm_lock_sync+0xd0/0xd0 [md_cluster 77fe..7a0]
kernel:  md_bitmap_copy_from_slot+0x2c/0x290 [md_mod 24ea..d3a]
kernel:  load_bitmaps+0xec/0x210 [md_cluster 77fe..7a0]
kernel:  md_bitmap_load+0x81/0x1e0 [md_mod 24ea..d3a]
kernel:  do_md_run+0x30/0x100 [md_mod 24ea..d3a]
kernel:  md_ioctl+0x1290/0x15a0 [md_mod 24ea....d3a]
kernel:  ? mddev_unlock+0xaa/0x130 [md_mod 24ea..d3a]
kernel:  ? blkdev_ioctl+0xb1/0x2b0
kernel:  block_ioctl+0x3b/0x40
kernel:  __x64_sys_ioctl+0x7f/0xb0
kernel:  do_syscall_64+0x59/0x80
kernel:  ? exit_to_user_mode_prepare+0x1ab/0x230
kernel:  ? syscall_exit_to_user_mode+0x18/0x40
kernel:  ? do_syscall_64+0x69/0x80
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f4a15fa722b
kernel: ... ...
kernel: ---[ end trace 8afa7612f559c868 ]---
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Acked-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Signed-off-by: Heming Zhao &lt;heming.zhao@suse.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If bitmap area contains invalid data, kernel will crash then mdadm
triggers "Segmentation fault".
This is cluster-md speical bug. In non-clustered env, mdadm will
handle broken metadata case. In clustered array, only kernel space
handles bitmap slot info. But even this bug only happened in clustered
env, current sanity check is wrong, the code should be changed.

How to trigger: (faulty injection)

dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sda
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdb
mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb
mdadm -Ss
echo aaa &gt; magic.txt
 == below modifying slot 2 bitmap data ==
dd if=magic.txt of=/dev/sda seek=16384 bs=1 count=3 &lt;== destroy magic
dd if=/dev/zero of=/dev/sda seek=16436 bs=1 count=4 &lt;== ZERO chunksize
mdadm -A /dev/md0 /dev/sda /dev/sdb
 == kernel crashes. mdadm outputs "Segmentation fault" ==

Reason of kernel crash:

In md_bitmap_read_sb (called by md_bitmap_create), bad bitmap magic didn't
block chunksize assignment, and zero value made DIV_ROUND_UP_SECTOR_T()
trigger "divide error".

Crash log:

kernel: md: md0 stopped.
kernel: md/raid1:md0: not clean -- starting background reconstruction
kernel: md/raid1:md0: active with 2 out of 2 mirrors
kernel: dlm: ... ...
kernel: md-cluster: Joined cluster 44810aba-38bb-e6b8-daca-bc97a0b254aa slot 1
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 2
kernel: md-cluster: Could not gather bitmaps from slot 2
kernel: divide error: 0000 [#1] SMP NOPTI
kernel: CPU: 0 PID: 1603 Comm: mdadm Not tainted 5.14.6-1-default
kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]
kernel: RSP: 0018:ffffc22ac0843ba0 EFLAGS: 00010246
kernel: ... ...
kernel: Call Trace:
kernel:  ? dlm_lock_sync+0xd0/0xd0 [md_cluster 77fe..7a0]
kernel:  md_bitmap_copy_from_slot+0x2c/0x290 [md_mod 24ea..d3a]
kernel:  load_bitmaps+0xec/0x210 [md_cluster 77fe..7a0]
kernel:  md_bitmap_load+0x81/0x1e0 [md_mod 24ea..d3a]
kernel:  do_md_run+0x30/0x100 [md_mod 24ea..d3a]
kernel:  md_ioctl+0x1290/0x15a0 [md_mod 24ea....d3a]
kernel:  ? mddev_unlock+0xaa/0x130 [md_mod 24ea..d3a]
kernel:  ? blkdev_ioctl+0xb1/0x2b0
kernel:  block_ioctl+0x3b/0x40
kernel:  __x64_sys_ioctl+0x7f/0xb0
kernel:  do_syscall_64+0x59/0x80
kernel:  ? exit_to_user_mode_prepare+0x1ab/0x230
kernel:  ? syscall_exit_to_user_mode+0x18/0x40
kernel:  ? do_syscall_64+0x69/0x80
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f4a15fa722b
kernel: ... ...
kernel: ---[ end trace 8afa7612f559c868 ]---
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Acked-by: Guoqing Jiang &lt;guoqing.jiang@linux.dev&gt;
Signed-off-by: Heming Zhao &lt;heming.zhao@suse.com&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
