<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/block/loop.c, branch v4.16.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>loop: Fix lost writes caused by missing flag</title>
<updated>2018-03-09T15:36:36+00:00</updated>
<author>
<name>Ross Zwisler</name>
<email>ross.zwisler@linux.intel.com</email>
</author>
<published>2018-03-09T15:36:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1d037577c323e5090ce281e96bc313ab2eee5be2'/>
<id>1d037577c323e5090ce281e96bc313ab2eee5be2</id>
<content type='text'>
The following commit:

commit aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC")

replaced __do_lo_send_write(), which used ITER_KVEC iterators, with
lo_write_bvec() which uses ITER_BVEC iterators.  In this change, though,
the WRITE flag was lost:

-       iov_iter_kvec(&amp;from, ITER_KVEC | WRITE, &amp;kvec, 1, len);
+       iov_iter_bvec(&amp;i, ITER_BVEC, bvec, 1, bvec-&gt;bv_len);

This flag is necessary for the DAX case because we make decisions based on
whether or not the iterator is a READ or a WRITE in dax_iomap_actor() and
in dax_iomap_rw().

We end up going through this path in configurations where we combine a PMEM
device with 4k sectors, a loopback device and DAX.  The consequence of this
missed flag is that what we intend as a write actually turns into a read in
the DAX code, so no data is ever written.

The very simplest test case is to create a loopback device and try and
write a small string to it, then hexdump a few bytes of the device to see
if the write took.  Without this patch you read back all zeros, with this
you read back the string you wrote.

For XFS this causes us to fail or panic during the following xfstests:

	xfs/074 xfs/078 xfs/216 xfs/217 xfs/250

For ext4 we have a similar issue where writes never happen, but we don't
currently have any xfstests that use loopback and show this issue.

Fix this by restoring the WRITE flag argument to iov_iter_bvec().  This
causes the xfstests to all pass.

Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: stable@vger.kernel.org
Fixes: commit aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC")
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The following commit:

commit aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC")

replaced __do_lo_send_write(), which used ITER_KVEC iterators, with
lo_write_bvec() which uses ITER_BVEC iterators.  In this change, though,
the WRITE flag was lost:

-       iov_iter_kvec(&amp;from, ITER_KVEC | WRITE, &amp;kvec, 1, len);
+       iov_iter_bvec(&amp;i, ITER_BVEC, bvec, 1, bvec-&gt;bv_len);

This flag is necessary for the DAX case because we make decisions based on
whether or not the iterator is a READ or a WRITE in dax_iomap_actor() and
in dax_iomap_rw().

We end up going through this path in configurations where we combine a PMEM
device with 4k sectors, a loopback device and DAX.  The consequence of this
missed flag is that what we intend as a write actually turns into a read in
the DAX code, so no data is ever written.

The very simplest test case is to create a loopback device and try and
write a small string to it, then hexdump a few bytes of the device to see
if the write took.  Without this patch you read back all zeros, with this
you read back the string you wrote.

For XFS this causes us to fail or panic during the following xfstests:

	xfs/074 xfs/078 xfs/216 xfs/217 xfs/250

For ext4 we have a similar issue where writes never happen, but we don't
currently have any xfstests that use loopback and show this issue.

Fix this by restoring the WRITE flag argument to iov_iter_bvec().  This
causes the xfstests to all pass.

Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: stable@vger.kernel.org
Fixes: commit aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC")
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>genhd: Rename get_disk() to get_disk_and_module()</title>
<updated>2018-02-26T16:48:42+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2018-02-26T12:01:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3079c22ea815775837a4f389ce2f7e1e7b202e09'/>
<id>3079c22ea815775837a4f389ce2f7e1e7b202e09</id>
<content type='text'>
Rename get_disk() to get_disk_and_module() to make sure what the
function does. It's not a great name but at least it is now clear that
put_disk() is not it's counterpart.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rename get_disk() to get_disk_and_module() to make sure what the
function does. It's not a great name but at least it is now clear that
put_disk() is not it's counterpart.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>loop: fix concurrent lo_open/lo_release</title>
<updated>2018-01-06T16:32:07+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-01-06T00:26:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ae6650163c66a7eff1acd6eb8b0f752dcfa8eba5'/>
<id>ae6650163c66a7eff1acd6eb8b0f752dcfa8eba5</id>
<content type='text'>
范龙飞 reports that KASAN can report a use-after-free in __lock_acquire.
The reason is due to insufficient serialization in lo_release(), which
will continue to use the loop device even after it has decremented the
lo_refcnt to zero.

In the meantime, another process can come in, open the loop device
again as it is being shut down. Confusion ensues.

Reported-by: 范龙飞 &lt;long7573@126.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
范龙飞 reports that KASAN can report a use-after-free in __lock_acquire.
The reason is due to insufficient serialization in lo_release(), which
will continue to use the loop device even after it has decremented the
lo_refcnt to zero.

In the meantime, another process can come in, open the loop device
again as it is being shut down. Confusion ensues.

Reported-by: 范龙飞 &lt;long7573@126.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: fix a build error</title>
<updated>2017-09-26T18:07:24+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-26T18:02:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0b508bc926bdced678febee2a2b8cdba0a19e481'/>
<id>0b508bc926bdced678febee2a2b8cdba0a19e481</id>
<content type='text'>
The code is only for blkcg not for all cgroups

Fixes: d4478e92d618 ("block/loop: make loop cgroup aware")
Reported-by: kbuild test robot &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The code is only for blkcg not for all cgroups

Fixes: d4478e92d618 ("block/loop: make loop cgroup aware")
Reported-by: kbuild test robot &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block/loop: make loop cgroup aware</title>
<updated>2017-09-26T13:41:22+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-25T19:07:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d4478e92d6186ce37947a36994de407c27446266'/>
<id>d4478e92d6186ce37947a36994de407c27446266</id>
<content type='text'>
loop block device handles IO in a separate thread. The actual IO
dispatched isn't cloned from the IO loop device received, so the
dispatched IO loses the cgroup context.

I'm ignoring buffer IO case now, which is quite complicated.  Making the
loop thread aware cgroup context doesn't really help. The loop device
only writes to a single file. In current writeback cgroup
implementation, the file can only belong to one cgroup.

For direct IO case, we could workaround the issue in theory. For
example, say we assign cgroup1 5M/s BW for loop device and cgroup2
10M/s. We can create a special cgroup for loop thread and assign at
least 15M/s for the underlayer disk. In this way, we correctly throttle
the two cgroups. But this is tricky to setup.

This patch tries to address the issue. We record bio's css in loop
command. When loop thread is handling the command, we then use the API
provided in patch 1 to set the css for current task. The bio layer will
use the css for new IO (from patch 3).

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
loop block device handles IO in a separate thread. The actual IO
dispatched isn't cloned from the IO loop device received, so the
dispatched IO loses the cgroup context.

I'm ignoring buffer IO case now, which is quite complicated.  Making the
loop thread aware cgroup context doesn't really help. The loop device
only writes to a single file. In current writeback cgroup
implementation, the file can only belong to one cgroup.

For direct IO case, we could workaround the issue in theory. For
example, say we assign cgroup1 5M/s BW for loop device and cgroup2
10M/s. We can create a special cgroup for loop thread and assign at
least 15M/s for the underlayer disk. In this way, we correctly throttle
the two cgroups. But this is tricky to setup.

This patch tries to address the issue. We record bio's css in loop
command. When loop thread is handling the command, we then use the API
provided in patch 1 to set the css for current task. The bio layer will
use the css for new IO (from patch 3).

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>loop: set physical block size to logical block size</title>
<updated>2017-09-06T17:08:49+00:00</updated>
<author>
<name>Omar Sandoval</name>
<email>osandov@fb.com</email>
</author>
<published>2017-09-05T21:24:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bf09375337077b692d21d062c30697c86f2872d3'/>
<id>bf09375337077b692d21d062c30697c86f2872d3</id>
<content type='text'>
Commit 6c6b6f28b333 ("loop: set physical block size to PAGE_SIZE")
caused mkfs.xfs to barf on ppc64 [1]. Always using PAGE_SIZE as the
physical block size still makes the most sense semantically, but let's
just lie and always set it to the same value as the logical block size
(same goes for io_min). In the future we might want to at least bump up
io_min to PAGE_SIZE but I'm sick of these stupid changes so let's play
it safe.

1: https://marc.info/?l=linux-xfs&amp;m=150459024723753&amp;w=2

Tested-by: Chandan Rajendra &lt;chandan@linux.vnet.ibm.com&gt;
Signed-off-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 6c6b6f28b333 ("loop: set physical block size to PAGE_SIZE")
caused mkfs.xfs to barf on ppc64 [1]. Always using PAGE_SIZE as the
physical block size still makes the most sense semantically, but let's
just lie and always set it to the same value as the logical block size
(same goes for io_min). In the future we might want to at least bump up
io_min to PAGE_SIZE but I'm sick of these stupid changes so let's play
it safe.

1: https://marc.info/?l=linux-xfs&amp;m=150459024723753&amp;w=2

Tested-by: Chandan Rajendra &lt;chandan@linux.vnet.ibm.com&gt;
Signed-off-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block/loop: fix use after free</title>
<updated>2017-09-01T19:57:33+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-01T18:15:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=92d773324b7edbd36bf0c28c1e0157763aeccc92'/>
<id>92d773324b7edbd36bf0c28c1e0157763aeccc92</id>
<content type='text'>
lo_rw_aio-&gt;call_read_iter-&gt;
1       aops-&gt;direct_IO
2       iov_iter_revert
lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could
be freed before 2, which accesses bvec.

Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
lo_rw_aio-&gt;call_read_iter-&gt;
1       aops-&gt;direct_IO
2       iov_iter_revert
lo_rw_aio_complete could happen between 1 and 2, the bio and bvec could
be freed before 2, which accesses bvec.

Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block/loop: allow request merge for directio mode</title>
<updated>2017-09-01T14:44:34+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-01T05:09:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=40326d8a33d5b70039849d233975b63c733d94a2'/>
<id>40326d8a33d5b70039849d233975b63c733d94a2</id>
<content type='text'>
Currently loop disables merge. While it makes sense for buffer IO mode,
directio mode can benefit from request merge. Without merge, loop could
send small size IO to underlayer disk and harm performance.

Reviewed-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently loop disables merge. While it makes sense for buffer IO mode,
directio mode can benefit from request merge. Without merge, loop could
send small size IO to underlayer disk and harm performance.

Reviewed-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block/loop: set hw_sectors</title>
<updated>2017-09-01T14:44:32+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-09-01T05:09:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=54bb0ade6627a183c211345761ec46e4bf0048fe'/>
<id>54bb0ade6627a183c211345761ec46e4bf0048fe</id>
<content type='text'>
Loop can handle any size of request. Limiting it to 255 sectors just
burns the CPU for bio split and request merge for underlayer disk and
also cause bad fs block allocation in directio mode.

Reviewed-by: Omar Sandoval &lt;osandov@fb.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Loop can handle any size of request. Limiting it to 255 sectors just
burns the CPU for bio split and request merge for underlayer disk and
also cause bad fs block allocation in directio mode.

Reviewed-by: Omar Sandoval &lt;osandov@fb.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>loop: fold loop_switch() into callers</title>
<updated>2017-08-31T19:51:16+00:00</updated>
<author>
<name>Omar Sandoval</name>
<email>osandov@fb.com</email>
</author>
<published>2017-08-24T07:03:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=43cade803ebeb002403d4b704e041ce800e5b0e1'/>
<id>43cade803ebeb002403d4b704e041ce800e5b0e1</id>
<content type='text'>
The comments here are really outdated, and blk-mq made flushing much
simpler, so just fold the two cases into the callers.

Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The comments here are really outdated, and blk-mq made flushing much
simpler, so just fold the two cases into the callers.

Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
