<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/netfs/misc.c, branch v6.11</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>netfs: Fix trimming of streaming-write folios in netfs_inval_folio()</title>
<updated>2024-08-24T14:09:16+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2024-08-23T20:08:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cce6bfa6ca0e30af9927b0074c97fe6a92f28092'/>
<id>cce6bfa6ca0e30af9927b0074c97fe6a92f28092</id>
<content type='text'>
When netfslib writes to a folio that it doesn't have data for, but that
data exists on the server, it will make a 'streaming write' whereby it
stores data in a folio that is marked dirty, but not uptodate.  When it
does this, it attaches a record to folio-&gt;private to track the dirty
region.

When truncate() or fallocate() wants to invalidate part of such a folio, it
will call into -&gt;invalidate_folio(), specifying the part of the folio that
is to be invalidated.  netfs_invalidate_folio(), on behalf of the
filesystem, must then determine how to trim the streaming write record.  In
a couple of cases, however, it does this incorrectly (the reduce-length and
move-start cases are switched over and don't, in any case, calculate the
value correctly).

Fix this by making the logic tree more obvious and fixing the cases.

Fixes: 9ebff83e6481 ("netfs: Prep to use folio-&gt;private for write grouping and streaming write")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/20240823200819.532106-5-dhowells@redhat.com
cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
cc: Pankaj Raghav &lt;p.raghav@samsung.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: linux-afs@lists.infradead.org
cc: netfs@lists.linux.dev
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When netfslib writes to a folio that it doesn't have data for, but that
data exists on the server, it will make a 'streaming write' whereby it
stores data in a folio that is marked dirty, but not uptodate.  When it
does this, it attaches a record to folio-&gt;private to track the dirty
region.

When truncate() or fallocate() wants to invalidate part of such a folio, it
will call into -&gt;invalidate_folio(), specifying the part of the folio that
is to be invalidated.  netfs_invalidate_folio(), on behalf of the
filesystem, must then determine how to trim the streaming write record.  In
a couple of cases, however, it does this incorrectly (the reduce-length and
move-start cases are switched over and don't, in any case, calculate the
value correctly).

Fix this by making the logic tree more obvious and fixing the cases.

Fixes: 9ebff83e6481 ("netfs: Prep to use folio-&gt;private for write grouping and streaming write")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/20240823200819.532106-5-dhowells@redhat.com
cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
cc: Pankaj Raghav &lt;p.raghav@samsung.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: linux-afs@lists.infradead.org
cc: netfs@lists.linux.dev
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfs: Fix netfs_release_folio() to say no if folio dirty</title>
<updated>2024-08-24T14:09:16+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2024-08-23T20:08:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7dfc8f0c6144c290dbeb01835a67e81b34dda8cd'/>
<id>7dfc8f0c6144c290dbeb01835a67e81b34dda8cd</id>
<content type='text'>
Fix netfs_release_folio() to say no (ie. return false) if the folio is
dirty (analogous with iomap's behaviour).  Without this, it will say yes to
the release of a dirty page by split_huge_page_to_list_to_order(), which
will result in the loss of untruncated data in the folio.

Without this, the generic/075 and generic/112 xfstests (both fsx-based
tests) fail with minimum folio size patches applied[1].

Fixes: c1ec4d7c2e13 ("netfs: Provide invalidate_folio and release_folio calls")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/20240815090849.972355-1-kernel@pankajraghav.com/ [1]
Link: https://lore.kernel.org/r/20240823200819.532106-4-dhowells@redhat.com
cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
cc: Pankaj Raghav &lt;p.raghav@samsung.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: linux-afs@lists.infradead.org
cc: netfs@lists.linux.dev
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix netfs_release_folio() to say no (ie. return false) if the folio is
dirty (analogous with iomap's behaviour).  Without this, it will say yes to
the release of a dirty page by split_huge_page_to_list_to_order(), which
will result in the loss of untruncated data in the folio.

Without this, the generic/075 and generic/112 xfstests (both fsx-based
tests) fail with minimum folio size patches applied[1].

Fixes: c1ec4d7c2e13 ("netfs: Provide invalidate_folio and release_folio calls")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/20240815090849.972355-1-kernel@pankajraghav.com/ [1]
Link: https://lore.kernel.org/r/20240823200819.532106-4-dhowells@redhat.com
cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
cc: Pankaj Raghav &lt;p.raghav@samsung.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: linux-afs@lists.infradead.org
cc: netfs@lists.linux.dev
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfs, ceph: Partially revert "netfs: Replace PG_fscache by setting folio-&gt;private and marking dirty"</title>
<updated>2024-08-21T20:32:58+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2024-08-14T20:38:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=92764e8822d4e7f8efb5ad959fac195a7f8ea0c6'/>
<id>92764e8822d4e7f8efb5ad959fac195a7f8ea0c6</id>
<content type='text'>
This partially reverts commit 2ff1e97587f4d398686f52c07afde3faf3da4e5c.

In addition to reverting the removal of PG_private_2 wrangling from the
buffered read code[1][2], the removal of the waits for PG_private_2 from
netfs_release_folio() and netfs_invalidate_folio() need reverting too.

It also adds a wait into ceph_evict_inode() to wait for netfs read and
copy-to-cache ops to complete.

Fixes: 2ff1e97587f4 ("netfs: Replace PG_fscache by setting folio-&gt;private and marking dirty")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/3575457.1722355300@warthog.procyon.org.uk [1]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8e5ced7804cb9184c4a23f8054551240562a8eda [2]
Link: https://lore.kernel.org/r/20240814203850.2240469-2-dhowells@redhat.com
cc: Max Kellermann &lt;max.kellermann@ionos.com&gt;
cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
cc: Xiubo Li &lt;xiubli@redhat.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: Matthew Wilcox &lt;willy@infradead.org&gt;
cc: ceph-devel@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This partially reverts commit 2ff1e97587f4d398686f52c07afde3faf3da4e5c.

In addition to reverting the removal of PG_private_2 wrangling from the
buffered read code[1][2], the removal of the waits for PG_private_2 from
netfs_release_folio() and netfs_invalidate_folio() need reverting too.

It also adds a wait into ceph_evict_inode() to wait for netfs read and
copy-to-cache ops to complete.

Fixes: 2ff1e97587f4 ("netfs: Replace PG_fscache by setting folio-&gt;private and marking dirty")
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/3575457.1722355300@warthog.procyon.org.uk [1]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8e5ced7804cb9184c4a23f8054551240562a8eda [2]
Link: https://lore.kernel.org/r/20240814203850.2240469-2-dhowells@redhat.com
cc: Max Kellermann &lt;max.kellermann@ionos.com&gt;
cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
cc: Xiubo Li &lt;xiubli@redhat.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: Matthew Wilcox &lt;willy@infradead.org&gt;
cc: ceph-devel@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfs: Revert "netfs: Switch debug logging to pr_debug()"</title>
<updated>2024-07-24T08:15:37+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2024-07-18T20:07:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a9d47a50cf257ff1019a4e30d573777882fd785c'/>
<id>a9d47a50cf257ff1019a4e30d573777882fd785c</id>
<content type='text'>
Revert commit 163eae0fb0d4c610c59a8de38040f8e12f89fd43 to get back the
original operation of the debugging macros.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/20240608151352.22860-2-ukleinek@kernel.org
Link: https://lore.kernel.org/r/1410685.1721333252@warthog.procyon.org.uk
cc: Uwe Kleine-König &lt;ukleinek@kernel.org&gt;
cc: Christian Brauner &lt;brauner@kernel.org&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Revert commit 163eae0fb0d4c610c59a8de38040f8e12f89fd43 to get back the
original operation of the debugging macros.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Link: https://lore.kernel.org/r/20240608151352.22860-2-ukleinek@kernel.org
Link: https://lore.kernel.org/r/1410685.1721333252@warthog.procyon.org.uk
cc: Uwe Kleine-König &lt;ukleinek@kernel.org&gt;
cc: Christian Brauner &lt;brauner@kernel.org&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.10-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2024-07-11T16:03:28+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-07-11T16:03:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=83ab4b461eb7bdf90984eb56d4954dbe11e926d4'/>
<id>83ab4b461eb7bdf90984eb56d4954dbe11e926d4</id>
<content type='text'>
Pull vfs fixes from Christian Brauner:
 "cachefiles:

   - Export an existing and add a new cachefile helper to be used in
     filesystems to fix reference count bugs

   - Use the newly added fscache_ty_get_volume() helper to get a
     reference count on an fscache_volume to handle volumes that are
     about to be removed cleanly

   - After withdrawing a fscache_cache via FSCACHE_CACHE_IS_WITHDRAWN
     wait for all ongoing cookie lookups to complete and for the object
     count to reach zero

   - Propagate errors from vfs_getxattr() to avoid an infinite loop in
     cachefiles_check_volume_xattr() because it keeps seeing ESTALE

   - Don't send new requests when an object is dropped by raising
     CACHEFILES_ONDEMAND_OJBSTATE_DROPPING

   - Cancel all requests for an object that is about to be dropped

   - Wait for the ondemand_boject_worker to finish before dropping a
     cachefiles object to prevent use-after-free

   - Use cyclic allocation for message ids to better handle id recycling

   - Add missing lock protection when iterating through the xarray when
     polling

  netfs:

   - Use standard logging helpers for debug logging

  VFS:

   - Fix potential use-after-free in file locks during
     trace_posix_lock_inode(). The tracepoint could fire while another
     task raced it and freed the lock that was requested to be traced

   - Only increment the nr_dentry_negative counter for dentries that are
     present on the superblock LRU. Currently, DCACHE_LRU_LIST list is
     used to detect this case. However, the flag is also raised in
     combination with DCACHE_SHRINK_LIST to indicate that dentry-&gt;d_lru
     is used. So checking only DCACHE_LRU_LIST will lead to wrong
     nr_dentry_negative count. Fix the check to not count dentries that
     are on a shrink related list

  Misc:

   - hfsplus: fix an uninitialized value issue in copy_name

   - minix: fix minixfs_rename with HIGHMEM. It still uses kunmap() even
     though we switched it to kmap_local_page() a while ago"

* tag 'vfs-6.10-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  minixfs: Fix minixfs_rename with HIGHMEM
  hfsplus: fix uninit-value in copy_name
  vfs: don't mod negative dentry count when on shrinker list
  filelock: fix potential use-after-free in posix_lock_inode
  cachefiles: add missing lock protection when polling
  cachefiles: cyclic allocation of msg_id to avoid reuse
  cachefiles: wait for ondemand_object_worker to finish when dropping object
  cachefiles: cancel all requests for the object that is being dropped
  cachefiles: stop sending new request when dropping object
  cachefiles: propagate errors from vfs_getxattr() to avoid infinite loop
  cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie()
  cachefiles: fix slab-use-after-free in fscache_withdraw_volume()
  netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume()
  netfs: Switch debug logging to pr_debug()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull vfs fixes from Christian Brauner:
 "cachefiles:

   - Export an existing and add a new cachefile helper to be used in
     filesystems to fix reference count bugs

   - Use the newly added fscache_ty_get_volume() helper to get a
     reference count on an fscache_volume to handle volumes that are
     about to be removed cleanly

   - After withdrawing a fscache_cache via FSCACHE_CACHE_IS_WITHDRAWN
     wait for all ongoing cookie lookups to complete and for the object
     count to reach zero

   - Propagate errors from vfs_getxattr() to avoid an infinite loop in
     cachefiles_check_volume_xattr() because it keeps seeing ESTALE

   - Don't send new requests when an object is dropped by raising
     CACHEFILES_ONDEMAND_OJBSTATE_DROPPING

   - Cancel all requests for an object that is about to be dropped

   - Wait for the ondemand_boject_worker to finish before dropping a
     cachefiles object to prevent use-after-free

   - Use cyclic allocation for message ids to better handle id recycling

   - Add missing lock protection when iterating through the xarray when
     polling

  netfs:

   - Use standard logging helpers for debug logging

  VFS:

   - Fix potential use-after-free in file locks during
     trace_posix_lock_inode(). The tracepoint could fire while another
     task raced it and freed the lock that was requested to be traced

   - Only increment the nr_dentry_negative counter for dentries that are
     present on the superblock LRU. Currently, DCACHE_LRU_LIST list is
     used to detect this case. However, the flag is also raised in
     combination with DCACHE_SHRINK_LIST to indicate that dentry-&gt;d_lru
     is used. So checking only DCACHE_LRU_LIST will lead to wrong
     nr_dentry_negative count. Fix the check to not count dentries that
     are on a shrink related list

  Misc:

   - hfsplus: fix an uninitialized value issue in copy_name

   - minix: fix minixfs_rename with HIGHMEM. It still uses kunmap() even
     though we switched it to kmap_local_page() a while ago"

* tag 'vfs-6.10-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  minixfs: Fix minixfs_rename with HIGHMEM
  hfsplus: fix uninit-value in copy_name
  vfs: don't mod negative dentry count when on shrinker list
  filelock: fix potential use-after-free in posix_lock_inode
  cachefiles: add missing lock protection when polling
  cachefiles: cyclic allocation of msg_id to avoid reuse
  cachefiles: wait for ondemand_object_worker to finish when dropping object
  cachefiles: cancel all requests for the object that is being dropped
  cachefiles: stop sending new request when dropping object
  cachefiles: propagate errors from vfs_getxattr() to avoid infinite loop
  cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie()
  cachefiles: fix slab-use-after-free in fscache_withdraw_volume()
  netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume()
  netfs: Switch debug logging to pr_debug()
</pre>
</div>
</content>
</entry>
<entry>
<title>netfs: Delete some xarray-wangling functions that aren't used</title>
<updated>2024-06-26T12:16:49+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2024-06-06T10:10:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=84dfbc9cad7d86984f2b5814bf36e61ff492f306'/>
<id>84dfbc9cad7d86984f2b5814bf36e61ff492f306</id>
<content type='text'>
Delete some xarray-based buffer wangling functions that are intended for
use with bounce buffering, but aren't used because bounce-buffering got
deferred to a later patch series.  Now, however, the intention is to use
something other than an xarray to do this.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240620173137.610345-9-dhowells@redhat.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Delete some xarray-based buffer wangling functions that are intended for
use with bounce buffering, but aren't used because bounce-buffering got
deferred to a later patch series.  Now, however, the intention is to use
something other than an xarray to do this.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20240620173137.610345-9-dhowells@redhat.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfs: Switch debug logging to pr_debug()</title>
<updated>2024-06-12T12:25:41+00:00</updated>
<author>
<name>Uwe Kleine-König</name>
<email>ukleinek@kernel.org</email>
</author>
<published>2024-06-08T15:13:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=163eae0fb0d4c610c59a8de38040f8e12f89fd43'/>
<id>163eae0fb0d4c610c59a8de38040f8e12f89fd43</id>
<content type='text'>
Instead of inventing a custom way to conditionally enable debugging,
just make use of pr_debug(), which also has dynamic debugging facilities
and is more likely known to someone who hunts a problem in the netfs
code. Also drop the module parameter netfs_debug which didn't have any
effect without further source changes. (The variable netfs_debug was
only used in #ifdef blocks for cpp vars that don't exist; Note that
CONFIG_NETFS_DEBUG isn't settable via kconfig, a variable with that name
never existed in the mainline and is probably just taken over (and
renamed) from similar custom debug logging implementations.)

Signed-off-by: Uwe Kleine-König &lt;ukleinek@kernel.org&gt;
Link: https://lore.kernel.org/r/20240608151352.22860-2-ukleinek@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of inventing a custom way to conditionally enable debugging,
just make use of pr_debug(), which also has dynamic debugging facilities
and is more likely known to someone who hunts a problem in the netfs
code. Also drop the module parameter netfs_debug which didn't have any
effect without further source changes. (The variable netfs_debug was
only used in #ifdef blocks for cpp vars that don't exist; Note that
CONFIG_NETFS_DEBUG isn't settable via kconfig, a variable with that name
never existed in the mainline and is probably just taken over (and
renamed) from similar custom debug logging implementations.)

Signed-off-by: Uwe Kleine-König &lt;ukleinek@kernel.org&gt;
Link: https://lore.kernel.org/r/20240608151352.22860-2-ukleinek@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfs: Replace PG_fscache by setting folio-&gt;private and marking dirty</title>
<updated>2024-04-29T14:01:42+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2024-03-19T10:00:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2ff1e97587f4d398686f52c07afde3faf3da4e5c'/>
<id>2ff1e97587f4d398686f52c07afde3faf3da4e5c</id>
<content type='text'>
When dirty data is being written to the cache, setting/waiting on/clearing
the fscache flag is always done in tandem with setting/waiting on/clearing
the writeback flag.  The netfslib buffered write routines wait on and set
both flags and the write request cleanup clears both flags, so the fscache
flag is almost superfluous.

The reason it isn't superfluous is because the fscache flag is also used to
indicate that data just read from the server is being written to the cache.
The flag is used to prevent a race involving overlapping direct-I/O writes
to the cache.

Change this to indicate that a page is in need of being copied to the cache
by placing a magic value in folio-&gt;private and marking the folios dirty.
Then when the writeback code sees a folio marked in this way, it only
writes it to the cache and not to the server.

If a folio that has this magic value set is modified, the value is just
replaced and the folio will then be uplodaded too.

With this, PG_fscache is no longer required by the netfslib core, 9p and
afs.

Ceph and nfs, however, still need to use the old PG_fscache-based tracking.
To deal with this, a flag, NETFS_ICTX_USE_PGPRIV2, now has to be set on the
flags in the netfs_inode struct for those filesystems.  This reenables the
use of PG_fscache in that inode.  9p and afs use the netfslib write helpers
so get switched over; cifs, for the moment, does page-by-page manual access
to the cache, so doesn't use PG_fscache and is unaffected.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
cc: Eric Van Hensbergen &lt;ericvh@kernel.org&gt;
cc: Latchesar Ionkov &lt;lucho@ionkov.net&gt;
cc: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
cc: Christian Schoenebeck &lt;linux_oss@crudebyte.com&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
cc: Xiubo Li &lt;xiubli@redhat.com&gt;
cc: Steve French &lt;sfrench@samba.org&gt;
cc: Paulo Alcantara &lt;pc@manguebit.com&gt;
cc: Ronnie Sahlberg &lt;ronniesahlberg@gmail.com&gt;
cc: Shyam Prasad N &lt;sprasad@microsoft.com&gt;
cc: Tom Talpey &lt;tom@talpey.com&gt;
cc: Bharath SM &lt;bharathsm@microsoft.com&gt;
cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
cc: Anna Schumaker &lt;anna@kernel.org&gt;
cc: netfs@lists.linux.dev
cc: v9fs@lists.linux.dev
cc: linux-afs@lists.infradead.org
cc: ceph-devel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: linux-nfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When dirty data is being written to the cache, setting/waiting on/clearing
the fscache flag is always done in tandem with setting/waiting on/clearing
the writeback flag.  The netfslib buffered write routines wait on and set
both flags and the write request cleanup clears both flags, so the fscache
flag is almost superfluous.

The reason it isn't superfluous is because the fscache flag is also used to
indicate that data just read from the server is being written to the cache.
The flag is used to prevent a race involving overlapping direct-I/O writes
to the cache.

Change this to indicate that a page is in need of being copied to the cache
by placing a magic value in folio-&gt;private and marking the folios dirty.
Then when the writeback code sees a folio marked in this way, it only
writes it to the cache and not to the server.

If a folio that has this magic value set is modified, the value is just
replaced and the folio will then be uplodaded too.

With this, PG_fscache is no longer required by the netfslib core, 9p and
afs.

Ceph and nfs, however, still need to use the old PG_fscache-based tracking.
To deal with this, a flag, NETFS_ICTX_USE_PGPRIV2, now has to be set on the
flags in the netfs_inode struct for those filesystems.  This reenables the
use of PG_fscache in that inode.  9p and afs use the netfslib write helpers
so get switched over; cifs, for the moment, does page-by-page manual access
to the cache, so doesn't use PG_fscache and is unaffected.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
cc: Eric Van Hensbergen &lt;ericvh@kernel.org&gt;
cc: Latchesar Ionkov &lt;lucho@ionkov.net&gt;
cc: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
cc: Christian Schoenebeck &lt;linux_oss@crudebyte.com&gt;
cc: Marc Dionne &lt;marc.dionne@auristor.com&gt;
cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
cc: Xiubo Li &lt;xiubli@redhat.com&gt;
cc: Steve French &lt;sfrench@samba.org&gt;
cc: Paulo Alcantara &lt;pc@manguebit.com&gt;
cc: Ronnie Sahlberg &lt;ronniesahlberg@gmail.com&gt;
cc: Shyam Prasad N &lt;sprasad@microsoft.com&gt;
cc: Tom Talpey &lt;tom@talpey.com&gt;
cc: Bharath SM &lt;bharathsm@microsoft.com&gt;
cc: Trond Myklebust &lt;trond.myklebust@hammerspace.com&gt;
cc: Anna Schumaker &lt;anna@kernel.org&gt;
cc: netfs@lists.linux.dev
cc: v9fs@lists.linux.dev
cc: linux-afs@lists.infradead.org
cc: ceph-devel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: linux-nfs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
</pre>
</div>
</content>
</entry>
<entry>
<title>netfs: Don't use certain unnecessary folio_*() functions</title>
<updated>2024-01-22T21:56:11+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2024-01-09T17:17:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=202bc57b675601bc07b5942369ecc16af64d1b95'/>
<id>202bc57b675601bc07b5942369ecc16af64d1b95</id>
<content type='text'>
Filesystems should use folio-&gt;index and folio-&gt;mapping, instead of
folio_index(folio), folio_mapping() and folio_file_mapping() since
they know that it's in the pagecache.

Change this automagically with:

perl -p -i -e 's/folio_mapping[(]([^)]*)[)]/\1-&gt;mapping/g' fs/netfs/*.c
perl -p -i -e 's/folio_file_mapping[(]([^)]*)[)]/\1-&gt;mapping/g' fs/netfs/*.c
perl -p -i -e 's/folio_index[(]([^)]*)[)]/\1-&gt;index/g' fs/netfs/*.c

Reported-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: linux-afs@lists.infradead.org
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-erofs@lists.ozlabs.org
cc: linux-fsdevel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Filesystems should use folio-&gt;index and folio-&gt;mapping, instead of
folio_index(folio), folio_mapping() and folio_file_mapping() since
they know that it's in the pagecache.

Change this automagically with:

perl -p -i -e 's/folio_mapping[(]([^)]*)[)]/\1-&gt;mapping/g' fs/netfs/*.c
perl -p -i -e 's/folio_file_mapping[(]([^)]*)[)]/\1-&gt;mapping/g' fs/netfs/*.c
perl -p -i -e 's/folio_index[(]([^)]*)[)]/\1-&gt;index/g' fs/netfs/*.c

Reported-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: linux-afs@lists.infradead.org
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-erofs@lists.ozlabs.org
cc: linux-fsdevel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>netfs: Optimise away reads above the point at which there can be no data</title>
<updated>2023-12-28T09:45:27+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2023-11-24T13:39:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=100ccd18bb41ea7abb4fbb419202c06079559501'/>
<id>100ccd18bb41ea7abb4fbb419202c06079559501</id>
<content type='text'>
Track the file position above which the server is not expected to have any
data (the "zero point") and preemptively assume that we can satisfy
requests by filling them with zeroes locally rather than attempting to
download them if they're over that line - even if we've written data back
to the server.  Assume that any data that was written back above that
position is held in the local cache.  Note that we have to split requests
that straddle the line.

Make use of this to optimise away some reads from the server.  We need to
set the zero point in the following circumstances:

 (1) When we see an extant remote inode and have no cache for it, we set
     the zero_point to i_size.

 (2) On local inode creation, we set zero_point to 0.

 (3) On local truncation down, we reduce zero_point to the new i_size if
     the new i_size is lower.

 (4) On local truncation up, we don't change zero_point.

 (5) On local modification, we don't change zero_point.

 (6) On remote invalidation, we set zero_point to the new i_size.

 (7) If stored data is discarded from the pagecache or culled from fscache,
     we must set zero_point above that if the data also got written to the
     server.

 (8) If dirty data is written back to the server, but not fscache, we must
     set zero_point above that.

 (9) If a direct I/O write is made, set zero_point above that.

Assuming the above, any read from the server at or above the zero_point
position will return all zeroes.

The zero_point value can be stored in the cache, provided the above rules
are applied to it by any code that culls part of the local cache.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Track the file position above which the server is not expected to have any
data (the "zero point") and preemptively assume that we can satisfy
requests by filling them with zeroes locally rather than attempting to
download them if they're over that line - even if we've written data back
to the server.  Assume that any data that was written back above that
position is held in the local cache.  Note that we have to split requests
that straddle the line.

Make use of this to optimise away some reads from the server.  We need to
set the zero point in the following circumstances:

 (1) When we see an extant remote inode and have no cache for it, we set
     the zero_point to i_size.

 (2) On local inode creation, we set zero_point to 0.

 (3) On local truncation down, we reduce zero_point to the new i_size if
     the new i_size is lower.

 (4) On local truncation up, we don't change zero_point.

 (5) On local modification, we don't change zero_point.

 (6) On remote invalidation, we set zero_point to the new i_size.

 (7) If stored data is discarded from the pagecache or culled from fscache,
     we must set zero_point above that if the data also got written to the
     server.

 (8) If dirty data is written back to the server, but not fscache, we must
     set zero_point above that.

 (9) If a direct I/O write is made, set zero_point above that.

Assuming the above, any read from the server at or above the zero_point
position will return all zeroes.

The zero_point value can be stored in the cache, provided the above rules
are applied to it by any code that culls part of the local cache.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Jeff Layton &lt;jlayton@kernel.org&gt;
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
</pre>
</div>
</content>
</entry>
</feed>
