<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/bcache/request.c, branch v5.8</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Merge tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-block</title>
<updated>2020-06-02T22:37:03+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-06-02T22:37:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bce159d734091fe31340976081577333f52a85e4'/>
<id>bce159d734091fe31340976081577333f52a85e4</id>
<content type='text'>
Pull block driver updates from Jens Axboe:
 "On top of the core changes, here are the block driver changes for this
  merge window:

   - NVMe changes:
        - NVMe over Fibre Channel protocol updates, which also reach
          over to drivers/scsi/lpfc (James Smart)
        - namespace revalidation support on the target (Anthony
          Iliopoulos)
        - gcc zero length array fix (Arnd Bergmann)
        - nvmet cleanups (Chaitanya Kulkarni)
        - misc cleanups and fixes (me, Keith Busch, Sagi Grimberg)
        - use a SRQ per completion vector (Max Gurtovoy)
        - fix handling of runtime changes to the queue count (Weiping
          Zhang)
        - t10 protection information support for nvme-rdma and
          nvmet-rdma (Israel Rukshin and Max Gurtovoy)
        - target side AEN improvements (Chaitanya Kulkarni)
        - various fixes and minor improvements all over, icluding the
          nvme part of the lpfc driver"

   - Floppy code cleanup series (Willy, Denis)

   - Floppy contention fix (Jiri)

   - Loop CONFIGURE support (Martijn)

   - bcache fixes/improvements (Coly, Joe, Colin)

   - q-&gt;queuedata cleanups (Christoph)

   - Get rid of ioctl_by_bdev (Christoph, Stefan)

   - md/raid5 allocation fixes (Coly)

   - zero length array fixes (Gustavo)

   - swim3 task state fix (Xu)"

* tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-block: (166 commits)
  bcache: configure the asynchronous registertion to be experimental
  bcache: asynchronous devices registration
  bcache: fix refcount underflow in bcache_device_free()
  bcache: Convert pr_&lt;level&gt; uses to a more typical style
  bcache: remove redundant variables i and n
  lpfc: Fix return value in __lpfc_nvme_ls_abort
  lpfc: fix axchg pointer reference after free and double frees
  lpfc: Fix pointer checks and comments in LS receive refactoring
  nvme: set dma alignment to qword
  nvmet: cleanups the loop in nvmet_async_events_process
  nvmet: fix memory leak when removing namespaces and controllers concurrently
  nvmet-rdma: add metadata/T10-PI support
  nvmet: add metadata support for block devices
  nvmet: add metadata/T10-PI support
  nvme: add Metadata Capabilities enumerations
  nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len
  nvmet: rename nvmet_rw_len to nvmet_rw_data_len
  nvmet: add metadata characteristics for a namespace
  nvme-rdma: add metadata/T10-PI support
  nvme-rdma: introduce nvme_rdma_sgl structure
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull block driver updates from Jens Axboe:
 "On top of the core changes, here are the block driver changes for this
  merge window:

   - NVMe changes:
        - NVMe over Fibre Channel protocol updates, which also reach
          over to drivers/scsi/lpfc (James Smart)
        - namespace revalidation support on the target (Anthony
          Iliopoulos)
        - gcc zero length array fix (Arnd Bergmann)
        - nvmet cleanups (Chaitanya Kulkarni)
        - misc cleanups and fixes (me, Keith Busch, Sagi Grimberg)
        - use a SRQ per completion vector (Max Gurtovoy)
        - fix handling of runtime changes to the queue count (Weiping
          Zhang)
        - t10 protection information support for nvme-rdma and
          nvmet-rdma (Israel Rukshin and Max Gurtovoy)
        - target side AEN improvements (Chaitanya Kulkarni)
        - various fixes and minor improvements all over, icluding the
          nvme part of the lpfc driver"

   - Floppy code cleanup series (Willy, Denis)

   - Floppy contention fix (Jiri)

   - Loop CONFIGURE support (Martijn)

   - bcache fixes/improvements (Coly, Joe, Colin)

   - q-&gt;queuedata cleanups (Christoph)

   - Get rid of ioctl_by_bdev (Christoph, Stefan)

   - md/raid5 allocation fixes (Coly)

   - zero length array fixes (Gustavo)

   - swim3 task state fix (Xu)"

* tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-block: (166 commits)
  bcache: configure the asynchronous registertion to be experimental
  bcache: asynchronous devices registration
  bcache: fix refcount underflow in bcache_device_free()
  bcache: Convert pr_&lt;level&gt; uses to a more typical style
  bcache: remove redundant variables i and n
  lpfc: Fix return value in __lpfc_nvme_ls_abort
  lpfc: fix axchg pointer reference after free and double frees
  lpfc: Fix pointer checks and comments in LS receive refactoring
  nvme: set dma alignment to qword
  nvmet: cleanups the loop in nvmet_async_events_process
  nvmet: fix memory leak when removing namespaces and controllers concurrently
  nvmet-rdma: add metadata/T10-PI support
  nvmet: add metadata support for block devices
  nvmet: add metadata/T10-PI support
  nvme: add Metadata Capabilities enumerations
  nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len
  nvmet: rename nvmet_rw_len to nvmet_rw_data_len
  nvmet: add metadata characteristics for a namespace
  nvme-rdma: add metadata/T10-PI support
  nvme-rdma: introduce nvme_rdma_sgl structure
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: use bio_{start,end}_io_acct</title>
<updated>2020-05-27T11:21:23+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-05-27T05:24:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=85750aeb748fd17a393d1f510f7b33e3336c395e'/>
<id>85750aeb748fd17a393d1f510f7b33e3336c395e</id>
<content type='text'>
Switch bcache to use the nicer bio accounting helpers, and call the
routines where we also sample the start time to give coherent accounting
results.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Switch bcache to use the nicer bio accounting helpers, and call the
routines where we also sample the start time to give coherent accounting
results.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Convert pr_&lt;level&gt; uses to a more typical style</title>
<updated>2020-05-27T11:19:36+00:00</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2020-05-27T04:01:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=46f5aa8806e34f2e48de852cc7db2c74c3a5cd8d'/>
<id>46f5aa8806e34f2e48de852cc7db2c74c3a5cd8d</id>
<content type='text'>
Remove the trailing newline from the define of pr_fmt and add newlines
to the uses.

Miscellanea:

o Convert bch_bkey_dump from multiple uses of pr_err to pr_cont
  as the earlier conversion was inappropriate done causing multiple
  lines to be emitted where only a single output line was desired
o Use vsprintf extension %pV in bch_cache_set_error to avoid multiple
  line output where only a single line output was desired
o Coalesce formats

Fixes: 6ae63e3501c4 ("bcache: replace printk() by pr_*() routines")

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the trailing newline from the define of pr_fmt and add newlines
to the uses.

Miscellanea:

o Convert bch_bkey_dump from multiple uses of pr_err to pr_cont
  as the earlier conversion was inappropriate done causing multiple
  lines to be emitted where only a single output line was desired
o Use vsprintf extension %pV in bch_cache_set_error to avoid multiple
  line output where only a single line output was desired
o Coalesce formats

Fixes: 6ae63e3501c4 ("bcache: replace printk() by pr_*() routines")

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: remove a duplicate -&gt;make_request_fn assignment</title>
<updated>2020-04-25T15:45:43+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-04-25T07:53:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a91b2014fc31dc6eaa02ca33aa3b4d1b6e4a0207'/>
<id>a91b2014fc31dc6eaa02ca33aa3b4d1b6e4a0207</id>
<content type='text'>
The make_request_fn pointer should only be assigned by blk_alloc_queue.
Fix a left over manual initialization.

Fixes: ff27668ce809 ("bcache: pass the make_request methods to blk_queue_make_request")
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The make_request_fn pointer should only be assigned by blk_alloc_queue.
Fix a left over manual initialization.

Fixes: ff27668ce809 ("bcache: pass the make_request methods to blk_queue_make_request")
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: pass the make_request methods to blk_queue_make_request</title>
<updated>2020-03-27T16:23:43+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-03-27T08:30:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ff27668ce8092c74965f21b5c02ebc6b6764db95'/>
<id>ff27668ce8092c74965f21b5c02ebc6b6764db95</id>
<content type='text'>
bcache is the only driver not actually passing its make_request
methods to blk_queue_make_request, but instead just sets them up
manually a little later.  Make bcache follow the common way of
setting up make_request based queues.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bcache is the only driver not actually passing its make_request
methods to blk_queue_make_request, but instead just sets them up
manually a little later.  Make bcache follow the common way of
setting up make_request based queues.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: add readahead cache policy options via sysfs interface</title>
<updated>2020-02-01T14:55:39+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2020-02-01T14:42:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=038ba8cc1bffc51250add4a9b9249d4331576d8f'/>
<id>038ba8cc1bffc51250add4a9b9249d4331576d8f</id>
<content type='text'>
In year 2007 high performance SSD was still expensive, in order to
save more space for real workload or meta data, the readahead I/Os
for non-meta data was bypassed and not cached on SSD.

In now days, SSD price drops a lot and people can find larger size
SSD with more comfortable price. It is unncessary to alway bypass
normal readahead I/Os to save SSD space for now.

This patch adds options for readahead data cache policies via sysfs
file /sys/block/bcache&lt;N&gt;/readahead_cache_policy, the options are,
- "all": cache all readahead data I/Os.
- "meta-only": only cache meta data, and bypass other regular I/Os.

If users want to make bcache continue to only cache readahead request
for metadata and bypass regular data readahead, please set "meta-only"
to this sysfs file. By default, bcache will back to cache all read-
ahead requests now.

Cc: stable@vger.kernel.org
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Acked-by: Eric Wheeler &lt;bcache@linux.ewheeler.net&gt;
Cc: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In year 2007 high performance SSD was still expensive, in order to
save more space for real workload or meta data, the readahead I/Os
for non-meta data was bypassed and not cached on SSD.

In now days, SSD price drops a lot and people can find larger size
SSD with more comfortable price. It is unncessary to alway bypass
normal readahead I/Os to save SSD space for now.

This patch adds options for readahead data cache policies via sysfs
file /sys/block/bcache&lt;N&gt;/readahead_cache_policy, the options are,
- "all": cache all readahead data I/Os.
- "meta-only": only cache meta data, and bypass other regular I/Os.

If users want to make bcache continue to only cache readahead request
for metadata and bypass regular data readahead, please set "meta-only"
to this sysfs file. By default, bcache will back to cache all read-
ahead requests now.

Cc: stable@vger.kernel.org
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Acked-by: Eric Wheeler &lt;bcache@linux.ewheeler.net&gt;
Cc: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: deleted code comments for dead code in bch_data_insert_keys()</title>
<updated>2019-11-13T22:42:50+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-11-13T08:03:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=41fa4deef90ba1cd048b740317f50b9decae9fc8'/>
<id>41fa4deef90ba1cd048b740317f50b9decae9fc8</id>
<content type='text'>
In request.c:bch_data_insert_keys(), there is code comment for a piece
of dead code. This patch deletes the dead code and its code comment
since they are useless in practice.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In request.c:bch_data_insert_keys(), there is code comment for a piece
of dead code. This patch deletes the dead code and its code comment
since they are useless in practice.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Clean up bch_get_congested()</title>
<updated>2019-04-24T16:56:27+00:00</updated>
<author>
<name>George Spelvin</name>
<email>lkml@sdf.org</email>
</author>
<published>2019-04-24T16:48:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3a3947271cd6ce05c5b30c1250fa99de57410500'/>
<id>3a3947271cd6ce05c5b30c1250fa99de57410500</id>
<content type='text'>
There are a few nits in this function.  They could in theory all
be separate patches, but that's probably taking small commits
too far.

1) I added a brief comment saying what it does.

2) I like to declare pointer parameters "const" where possible
   for documentation reasons.

3) It uses bitmap_weight(&amp;rand, BITS_PER_LONG) to compute the Hamming
weight of a 32-bit random number (giving a random integer with
mean 16 and variance 8).  Passing by reference in a 64-bit variable
is silly; just use hweight32().

4) Its helper function fract_exp_two is unnecessarily tangled.
Gcc can optimize the multiply by (1 &lt;&lt; x) to a shift, but it can
be written in a much more straightforward way at the cost of one
more bit of internal precision.  Some analysis reveals that this
bit is always available.

This shrinks the object code for fract_exp_two(x, 6) from 23 bytes:

0000000000000000 &lt;foo1&gt;:
   0:   89 f9                   mov    %edi,%ecx
   2:   c1 e9 06                shr    $0x6,%ecx
   5:   b8 01 00 00 00          mov    $0x1,%eax
   a:   d3 e0                   shl    %cl,%eax
   c:   83 e7 3f                and    $0x3f,%edi
   f:   d3 e7                   shl    %cl,%edi
  11:   c1 ef 06                shr    $0x6,%edi
  14:   01 f8                   add    %edi,%eax
  16:   c3                      retq

To 19:

0000000000000017 &lt;foo2&gt;:
  17:   89 f8                   mov    %edi,%eax
  19:   83 e0 3f                and    $0x3f,%eax
  1c:   83 c0 40                add    $0x40,%eax
  1f:   89 f9                   mov    %edi,%ecx
  21:   c1 e9 06                shr    $0x6,%ecx
  24:   d3 e0                   shl    %cl,%eax
  26:   c1 e8 06                shr    $0x6,%eax
  29:   c3                      retq

(Verified with 0 &lt;= frac_bits &lt;= 8, 0 &lt;= x &lt; 16&lt;&lt;frac_bits;
both versions produce the same output.)

5) And finally, the call to bch_get_congested() in check_should_bypass()
is separated from the use of the value by multiple tests which
could moot the need to compute it.  Move the computation down to
where it's needed.  This also saves a local register to hold the
computed value.

Signed-off-by: George Spelvin &lt;lkml@sdf.org&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are a few nits in this function.  They could in theory all
be separate patches, but that's probably taking small commits
too far.

1) I added a brief comment saying what it does.

2) I like to declare pointer parameters "const" where possible
   for documentation reasons.

3) It uses bitmap_weight(&amp;rand, BITS_PER_LONG) to compute the Hamming
weight of a 32-bit random number (giving a random integer with
mean 16 and variance 8).  Passing by reference in a 64-bit variable
is silly; just use hweight32().

4) Its helper function fract_exp_two is unnecessarily tangled.
Gcc can optimize the multiply by (1 &lt;&lt; x) to a shift, but it can
be written in a much more straightforward way at the cost of one
more bit of internal precision.  Some analysis reveals that this
bit is always available.

This shrinks the object code for fract_exp_two(x, 6) from 23 bytes:

0000000000000000 &lt;foo1&gt;:
   0:   89 f9                   mov    %edi,%ecx
   2:   c1 e9 06                shr    $0x6,%ecx
   5:   b8 01 00 00 00          mov    $0x1,%eax
   a:   d3 e0                   shl    %cl,%eax
   c:   83 e7 3f                and    $0x3f,%edi
   f:   d3 e7                   shl    %cl,%edi
  11:   c1 ef 06                shr    $0x6,%edi
  14:   01 f8                   add    %edi,%eax
  16:   c3                      retq

To 19:

0000000000000017 &lt;foo2&gt;:
  17:   89 f8                   mov    %edi,%eax
  19:   83 e0 3f                and    $0x3f,%eax
  1c:   83 c0 40                add    $0x40,%eax
  1f:   89 f9                   mov    %edi,%ecx
  21:   c1 e9 06                shr    $0x6,%ecx
  24:   d3 e0                   shl    %cl,%eax
  26:   c1 e8 06                shr    $0x6,%eax
  29:   c3                      retq

(Verified with 0 &lt;= frac_bits &lt;= 8, 0 &lt;= x &lt; 16&lt;&lt;frac_bits;
both versions produce the same output.)

5) And finally, the call to bch_get_congested() in check_should_bypass()
is separated from the use of the value by multiple tests which
could moot the need to compute it.  Move the computation down to
where it's needed.  This also saves a local register to hold the
computed value.

Signed-off-by: George Spelvin &lt;lkml@sdf.org&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: fix crashes stopping bcache device before read miss done</title>
<updated>2019-04-24T16:56:27+00:00</updated>
<author>
<name>Guoju Fang</name>
<email>fangguoju@gmail.com</email>
</author>
<published>2019-04-24T16:48:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1568ee7e3c6305d9fbb2414bfd4b56e52853d42d'/>
<id>1568ee7e3c6305d9fbb2414bfd4b56e52853d42d</id>
<content type='text'>
The bio from upper layer is considered completed when bio_complete()
returns. In most scenarios bio_complete() is called in search_free(),
but when read miss happens, the bio_compete() is called when backing
device reading completed, while the struct search is still in use until
cache inserting finished.

If someone stops the bcache device just then, the device may be closed
and released, but after cache inserting finished the struct search will
access a freed struct cached_dev.

This patch add the reference of bcache device before bio_complete() when
read miss happens, and put it after the search is not used.

Signed-off-by: Guoju Fang &lt;fangguoju@gmail.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The bio from upper layer is considered completed when bio_complete()
returns. In most scenarios bio_complete() is called in search_free(),
but when read miss happens, the bio_compete() is called when backing
device reading completed, while the struct search is still in use until
cache inserting finished.

If someone stops the bcache device just then, the device may be closed
and released, but after cache inserting finished the struct search will
access a freed struct cached_dev.

This patch add the reference of bcache device before bio_complete() when
read miss happens, and put it after the search is not used.

Signed-off-by: Guoju Fang &lt;fangguoju@gmail.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: use (REQ_META|REQ_PRIO) to indicate bio for metadata</title>
<updated>2019-02-09T14:18:33+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-02-09T04:53:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=dc7292a5bcb4c878b076fca2ac3fc22f81b8f8df'/>
<id>dc7292a5bcb4c878b076fca2ac3fc22f81b8f8df</id>
<content type='text'>
In 'commit 752f66a75aba ("bcache: use REQ_PRIO to indicate bio for
metadata")' REQ_META is replaced by REQ_PRIO to indicate metadata bio.
This assumption is not always correct, e.g. XFS uses REQ_META to mark
metadata bio other than REQ_PRIO. This is why Nix noticed that bcache
does not cache metadata for XFS after the above commit.

Thanks to Dave Chinner, he explains the difference between REQ_META and
REQ_PRIO from view of file system developer. Here I quote part of his
explanation from mailing list,
   REQ_META is used for metadata. REQ_PRIO is used to communicate to
   the lower layers that the submitter considers this IO to be more
   important that non REQ_PRIO IO and so dispatch should be expedited.

   IOWs, if the filesystem considers metadata IO to be more important
   that user data IO, then it will use REQ_PRIO | REQ_META rather than
   just REQ_META.

Then it seems bios with REQ_META or REQ_PRIO should both be cached for
performance optimation, because they are all probably low I/O latency
demand by upper layer (e.g. file system).

So in this patch, when we want to decide whether to bypass the cache,
REQ_META and REQ_PRIO are both checked. Then both metadata and
high priority I/O requests will be handled properly.

Reported-by: Nix &lt;nix@esperi.org.uk&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Andre Noll &lt;maan@tuebingen.mpg.de&gt;
Tested-by: Nix &lt;nix@esperi.org.uk&gt;
Cc: stable@vger.kernel.org
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In 'commit 752f66a75aba ("bcache: use REQ_PRIO to indicate bio for
metadata")' REQ_META is replaced by REQ_PRIO to indicate metadata bio.
This assumption is not always correct, e.g. XFS uses REQ_META to mark
metadata bio other than REQ_PRIO. This is why Nix noticed that bcache
does not cache metadata for XFS after the above commit.

Thanks to Dave Chinner, he explains the difference between REQ_META and
REQ_PRIO from view of file system developer. Here I quote part of his
explanation from mailing list,
   REQ_META is used for metadata. REQ_PRIO is used to communicate to
   the lower layers that the submitter considers this IO to be more
   important that non REQ_PRIO IO and so dispatch should be expedited.

   IOWs, if the filesystem considers metadata IO to be more important
   that user data IO, then it will use REQ_PRIO | REQ_META rather than
   just REQ_META.

Then it seems bios with REQ_META or REQ_PRIO should both be cached for
performance optimation, because they are all probably low I/O latency
demand by upper layer (e.g. file system).

So in this patch, when we want to decide whether to bypass the cache,
REQ_META and REQ_PRIO are both checked. Then both metadata and
high priority I/O requests will be handled properly.

Reported-by: Nix &lt;nix@esperi.org.uk&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Andre Noll &lt;maan@tuebingen.mpg.de&gt;
Tested-by: Nix &lt;nix@esperi.org.uk&gt;
Cc: stable@vger.kernel.org
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
