<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/fuse/readdir.c, branch for-next</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>fuse: check attributes staleness on fuse_iget()</title>
<updated>2024-11-18T11:24:13+00:00</updated>
<author>
<name>Zhang Tianci</name>
<email>zhangtianci.1997@bytedance.com</email>
</author>
<published>2024-11-18T10:16:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=69eb56f69efb866c791cc87fd7bf62adf2ffcbb3'/>
<id>69eb56f69efb866c791cc87fd7bf62adf2ffcbb3</id>
<content type='text'>
Function fuse_direntplus_link() might call fuse_iget() to initialize a new
fuse_inode and change its attributes. If fi-&gt;attr_version is always
initialized to 0, then even if the attributes returned by the FUSE_READDIR
request are stale, fuse_change_attributes() will still apply them to the
inode, because the new fi-&gt;attr_version is 0. This wrong behaviour may
cause file size inconsistency even when there are no changes on the
server side.

To reproduce the issue, consider the following two programs (A and B)
running concurrently:

        A                                               B
----------------------------------      --------------------------------
{ /fusemnt/dir/f is a file path in a fuse mount, the size of f is 0. }

readdir(/fusemnt/dir) start
// Daemon sets size 0 in the direntry for f
                                        fallocate(f, 1024)
                                        stat(f) // B sees size 1024
                                        echo 2 &gt; /proc/sys/vm/drop_caches
readdir(/fusemnt/dir) reply to kernel
Kernel sets size 0 on the I_NEW inode

                                        stat(f) // B sees size 0

In the above case, only program B modifies the file size, yet B observes
the file size changing between the two read-only stat() calls. To fix
this issue, we should make sure readdirplus still follows the rule of
attr_version staleness checking even if fi-&gt;attr_version is lost due to
inode eviction.

To identify this situation, the new fc-&gt;evict_ctr is used to record whether
any inode eviction occurs while a readdirplus request is being processed.
If it does, the result of readdirplus may be inaccurate; otherwise, the
result of readdirplus can be trusted. Although this may still lead to
incorrect invalidation, it should be acceptable given the relatively low
frequency of evictions.

Link: https://lore.kernel.org/lkml/20230711043405.66256-2-zhangjiachen.jaycee@bytedance.com/
Link: https://lore.kernel.org/lkml/20241114070905.48901-1-zhangtianci.1997@bytedance.com/

Reported-by: Jiachen Zhang &lt;zhangjiachen.jaycee@bytedance.com&gt;
Suggested-by: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Signed-off-by: Zhang Tianci &lt;zhangtianci.1997@bytedance.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Function fuse_direntplus_link() might call fuse_iget() to initialize a new
fuse_inode and change its attributes. If fi-&gt;attr_version is always
initialized to 0, then even if the attributes returned by the FUSE_READDIR
request are stale, fuse_change_attributes() will still apply them to the
inode, because the new fi-&gt;attr_version is 0. This wrong behaviour may
cause file size inconsistency even when there are no changes on the
server side.

To reproduce the issue, consider the following two programs (A and B)
running concurrently:

        A                                               B
----------------------------------      --------------------------------
{ /fusemnt/dir/f is a file path in a fuse mount, the size of f is 0. }

readdir(/fusemnt/dir) start
// Daemon sets size 0 in the direntry for f
                                        fallocate(f, 1024)
                                        stat(f) // B sees size 1024
                                        echo 2 &gt; /proc/sys/vm/drop_caches
readdir(/fusemnt/dir) reply to kernel
Kernel sets size 0 on the I_NEW inode

                                        stat(f) // B sees size 0

In the above case, only program B modifies the file size, yet B observes
the file size changing between the two read-only stat() calls. To fix
this issue, we should make sure readdirplus still follows the rule of
attr_version staleness checking even if fi-&gt;attr_version is lost due to
inode eviction.

To identify this situation, the new fc-&gt;evict_ctr is used to record whether
any inode eviction occurs while a readdirplus request is being processed.
If it does, the result of readdirplus may be inaccurate; otherwise, the
result of readdirplus can be trusted. Although this may still lead to
incorrect invalidation, it should be acceptable given the relatively low
frequency of evictions.

Link: https://lore.kernel.org/lkml/20230711043405.66256-2-zhangjiachen.jaycee@bytedance.com/
Link: https://lore.kernel.org/lkml/20241114070905.48901-1-zhangtianci.1997@bytedance.com/

Reported-by: Jiachen Zhang &lt;zhangjiachen.jaycee@bytedance.com&gt;
Suggested-by: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Signed-off-by: Zhang Tianci &lt;zhangtianci.1997@bytedance.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: remove pages for requests and exclusively use folios</title>
<updated>2024-11-05T13:08:35+00:00</updated>
<author>
<name>Joanne Koong</name>
<email>joannelkoong@gmail.com</email>
</author>
<published>2024-10-24T17:18:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=68bfb7eb7f7de355d5b3812c25a2a36e9eead97b'/>
<id>68bfb7eb7f7de355d5b3812c25a2a36e9eead97b</id>
<content type='text'>
All fuse requests use folios instead of pages for transferring data.
Remove pages from the requests and exclusively use folios.

No functional changes.

[SzM: rename back folio_descs -&gt; descs, etc.]

Signed-off-by: Joanne Koong &lt;joannelkoong@gmail.com&gt;
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All fuse requests use folios instead of pages for transferring data.
Remove pages from the requests and exclusively use folios.

No functional changes.

[SzM: rename back folio_descs -&gt; descs, etc.]

Signed-off-by: Joanne Koong &lt;joannelkoong@gmail.com&gt;
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: convert readdir to use folios</title>
<updated>2024-11-05T10:14:32+00:00</updated>
<author>
<name>Joanne Koong</name>
<email>joannelkoong@gmail.com</email>
</author>
<published>2024-10-24T17:18:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=02b78c7a7a0c72aee6f600a167e6adee9417ac0e'/>
<id>02b78c7a7a0c72aee6f600a167e6adee9417ac0e</id>
<content type='text'>
Convert readdir requests to use a folio instead of a page.

No functional changes.

Signed-off-by: Joanne Koong &lt;joannelkoong@gmail.com&gt;
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert readdir requests to use a folio instead of a page.

No functional changes.

Signed-off-by: Joanne Koong &lt;joannelkoong@gmail.com&gt;
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: get rid of ff-&gt;readdir.lock</title>
<updated>2024-03-06T15:20:58+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2024-03-06T15:20:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cdf6ac2a03d253f05d3e798f60f23dea1b176b92'/>
<id>cdf6ac2a03d253f05d3e798f60f23dea1b176b92</id>
<content type='text'>
The same protection is provided by file-&gt;f_pos_lock.

Note, this relies on the fact that file-&gt;f_mode has FMODE_ATOMIC_POS.
This flag is cleared by stream_open(), which would prevent locking of
f_pos_lock.

Prior to commit 7de64d521bf9 ("fuse: break up fuse_open_common()")
FOPEN_STREAM on a directory would cause stream_open() to be called.
After this commit this is not done anymore, so f_pos_lock will always
be locked.

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The same protection is provided by file-&gt;f_pos_lock.

Note, this relies on the fact that file-&gt;f_mode has FMODE_ATOMIC_POS.
This flag is cleared by stream_open(), which would prevent locking of
f_pos_lock.

Prior to commit 7de64d521bf9 ("fuse: break up fuse_open_common()")
FOPEN_STREAM on a directory would cause stream_open() to be called.
After this commit this is not done anymore, so f_pos_lock will always
be locked.

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: convert to new timestamp accessors</title>
<updated>2023-10-18T12:08:21+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@kernel.org</email>
</author>
<published>2023-10-04T18:52:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3c0d5df2d03d797e66f6eef08859930a01d5d6ad'/>
<id>3c0d5df2d03d797e66f6eef08859930a01d5d6ad</id>
<content type='text'>
Convert to using the new inode timestamp accessor functions.

Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Link: https://lore.kernel.org/r/20231004185347.80880-37-jlayton@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert to using the new inode timestamp accessor functions.

Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Link: https://lore.kernel.org/r/20231004185347.80880-37-jlayton@kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: cache btime</title>
<updated>2023-08-21T10:14:59+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2023-08-10T10:45:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=972f4c46d0a1bb7fde3ce0bd15775855b2d02c68'/>
<id>972f4c46d0a1bb7fde3ce0bd15775855b2d02c68</id>
<content type='text'>
Not all inode attributes are supported by all filesystems, but for the
basic stats (which are returned by stat(2) and friends) all of them will
have some value, even if that doesn't reflect a real attribute of the file.

Btime is different, in that filesystems are free to report or not report a
value in statx.  If the value is available, then STATX_BTIME bit is set in
stx_mask.

When caching the value of btime, remember the availability of the attribute
as well as the value (if available).  This is done by using the
FUSE_I_BTIME bit in fuse_inode-&gt;state to indicate availability, while using
fuse_inode-&gt;inval_mask &amp; STATX_BTIME to indicate the state of the cache
itself (i.e. set if cache is invalid, and cleared if cache is valid).

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Not all inode attributes are supported by all filesystems, but for the
basic stats (which are returned by stat(2) and friends) all of them will
have some value, even if that doesn't reflect a real attribute of the file.

Btime is different, in that filesystems are free to report or not report a
value in statx.  If the value is available, then STATX_BTIME bit is set in
stx_mask.

When caching the value of btime, remember the availability of the attribute
as well as the value (if available).  This is done by using the
FUSE_I_BTIME bit in fuse_inode-&gt;state to indicate availability, while using
fuse_inode-&gt;inval_mask &amp; STATX_BTIME to indicate the state of the cache
itself (i.e. set if cache is invalid, and cleared if cache is valid).

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: add ATTR_TIMEOUT macro</title>
<updated>2023-08-16T10:39:41+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2023-08-10T10:45:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9dc10a54abe50b733a5b561d5f8be718e79c3590'/>
<id>9dc10a54abe50b733a5b561d5f8be718e79c3590</id>
<content type='text'>
The next patch will introduce yet another type of attribute reply.  Add a macro
that can handle attribute timeouts for all of the structs.

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The next patch will introduce yet another type of attribute reply.  Add a macro
that can handle attribute timeouts for all of the structs.

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: nlookup missing decrement in fuse_direntplus_link</title>
<updated>2023-08-16T07:40:48+00:00</updated>
<author>
<name>ruanmeisi</name>
<email>ruan.meisi@zte.com.cn</email>
</author>
<published>2023-04-25T11:13:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b8bd342d50cbf606666488488f9fea374aceb2d5'/>
<id>b8bd342d50cbf606666488488f9fea374aceb2d5</id>
<content type='text'>
During our debugging of glusterfs, we found an Assertion failed error:
inode_lookup &gt;= nlookup, which was caused by the nlookup value in the
kernel being greater than that in the FUSE file system.

The issue was introduced by fuse_direntplus_link(): fuse_iget() increments
nlookup, and if d_splice_alias() returns failure, fuse_direntplus_link()
returns failure without decrementing nlookup. See
https://github.com/gluster/glusterfs/pull/4081

Signed-off-by: ruanmeisi &lt;ruan.meisi@zte.com.cn&gt;
Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support")
Cc: &lt;stable@vger.kernel.org&gt; # v3.9
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During our debugging of glusterfs, we found an Assertion failed error:
inode_lookup &gt;= nlookup, which was caused by the nlookup value in the
kernel being greater than that in the FUSE file system.

The issue was introduced by fuse_direntplus_link(): fuse_iget() increments
nlookup, and if d_splice_alias() returns failure, fuse_direntplus_link()
returns failure without decrementing nlookup. See
https://github.com/gluster/glusterfs/pull/4081

Signed-off-by: ruanmeisi &lt;ruan.meisi@zte.com.cn&gt;
Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support")
Cc: &lt;stable@vger.kernel.org&gt; # v3.9
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs/fuse: Replace kmap() with kmap_local_page()</title>
<updated>2022-11-23T08:10:49+00:00</updated>
<author>
<name>Fabio M. De Francesco</name>
<email>fmdefrancesco@gmail.com</email>
</author>
<published>2022-10-12T11:23:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a1db2f7edef095a385d477ab81e780694d63eebd'/>
<id>a1db2f7edef095a385d477ab81e780694d63eebd</id>
<content type='text'>
The use of kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
the mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Therefore, replace kmap() with kmap_local_page() in fuse_readdir_cached(),
which is currently the only remaining call site of kmap() in fs/fuse.

Cc: "Venkataramanan, Anirudh" &lt;anirudh.venkataramanan@intel.com&gt;
Suggested-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Fabio M. De Francesco &lt;fmdefrancesco@gmail.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The use of kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
the mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Therefore, replace kmap() with kmap_local_page() in fuse_readdir_cached(),
which is currently the only remaining call site of kmap() in fs/fuse.

Cc: "Venkataramanan, Anirudh" &lt;anirudh.venkataramanan@intel.com&gt;
Suggested-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Fabio M. De Francesco &lt;fmdefrancesco@gmail.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: fix readdir cache race</title>
<updated>2022-10-20T15:18:58+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2022-10-20T15:18:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9fa248c65bdbf5af0a2f74dd38575acfc8dfd2bf'/>
<id>9fa248c65bdbf5af0a2f74dd38575acfc8dfd2bf</id>
<content type='text'>
There's a race in fuse's readdir cache that can result in an uninitialized
page being read.  The page lock is supposed to prevent this from happening
but in the following case it doesn't:

Two fuse_add_dirent_to_cache() instances start out and get the same
parameters (size=0,offset=0).  One of them wins the race to create and lock
the page, after which it fills in data, sets rdc.size and unlocks the page.

In the meantime the page gets evicted from the cache before the other
instance gets to run.  That one also creates the page, but finds the
size to be mismatched, bails out and leaves the uninitialized page in the
cache.

Fix by marking a filled page uptodate and ignoring non-uptodate pages.

Reported-by: Frank Sorenson &lt;fsorenso@redhat.com&gt;
Fixes: 5d7bc7e8680c ("fuse: allow using readdir cache")
Cc: &lt;stable@vger.kernel.org&gt; # v4.20
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's a race in fuse's readdir cache that can result in an uninitialized
page being read.  The page lock is supposed to prevent this from happening
but in the following case it doesn't:

Two fuse_add_dirent_to_cache() instances start out and get the same
parameters (size=0,offset=0).  One of them wins the race to create and lock
the page, after which it fills in data, sets rdc.size and unlocks the page.

In the meantime the page gets evicted from the cache before the other
instance gets to run.  That one also creates the page, but finds the
size to be mismatched, bails out and leaves the uninitialized page in the
cache.

Fix by marking a filled page uptodate and ignoring non-uptodate pages.

Reported-by: Frank Sorenson &lt;fsorenso@redhat.com&gt;
Fixes: 5d7bc7e8680c ("fuse: allow using readdir cache")
Cc: &lt;stable@vger.kernel.org&gt; # v4.20
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
