<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/include/trace/events/ext4.h, branch v5.9</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>ext4: limit the length of per-inode prealloc list</title>
<updated>2020-08-19T16:04:36+00:00</updated>
<author>
<name>brookxu</name>
<email>brookxu.cn@gmail.com</email>
</author>
<published>2020-08-17T07:36:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=27bc446e2def38db3244a6eb4bb1d6312936610a'/>
<id>27bc446e2def38db3244a6eb4bb1d6312936610a</id>
<content type='text'>
In the scenario of writing sparse files, the per-inode prealloc list may
be very long, resulting in high overhead for ext4_mb_use_preallocated().
To circumvent this problem, we limit the maximum length of per-inode
prealloc list to 512 and allow users to modify it.

After patching, we observed that the sys ratio of cpu has dropped, and
the system throughput has increased significantly. We created a process
to write the sparse file, and the running time of the process on the
fixed kernel was significantly reduced, as follows:

Running time on unfixed kernel：
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real    0m2.051s
user    0m0.008s
sys     0m2.026s

Running time on fixed kernel：
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real    0m0.471s
user    0m0.004s
sys     0m0.395s

Signed-off-by: Chunguang Xu &lt;brookxu@tencent.com&gt;
Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the scenario of writing sparse files, the per-inode prealloc list may
be very long, resulting in high overhead for ext4_mb_use_preallocated().
To circumvent this problem, we limit the maximum length of per-inode
prealloc list to 512 and allow users to modify it.

After patching, we observed that the sys ratio of cpu has dropped, and
the system throughput has increased significantly. We created a process
to write the sparse file, and the running time of the process on the
fixed kernel was significantly reduced, as follows:

Running time on unfixed kernel：
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real    0m2.051s
user    0m0.008s
sys     0m2.026s

Running time on fixed kernel：
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real    0m0.471s
user    0m0.004s
sys     0m0.395s

Signed-off-by: Chunguang Xu &lt;brookxu@tencent.com&gt;
Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: add prefetch_block_bitmaps mount option</title>
<updated>2020-08-07T18:12:35+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2020-07-17T04:14:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3d392b2676bf3199863a1e5efb2c087ad9d442a4'/>
<id>3d392b2676bf3199863a1e5efb2c087ad9d442a4</id>
<content type='text'>
For file systems where we can afford to keep the buddy bitmaps cached,
we can speed up initial writes to large file systems by starting to
load the block allocation bitmaps as soon as the file system is
mounted.  This won't work well for _super_ large file systems, or
memory constrained systems, so we only enable this when it is
requested via a mount option.

Addresses-Google-Bug: 159488342
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For file systems where we can afford to keep the buddy bitmaps cached,
we can speed up initial writes to large file systems by starting to
load the block allocation bitmaps as soon as the file system is
mounted.  This won't work well for _super_ large file systems, or
memory constrained systems, so we only enable this when it is
requested via a mount option.

Addresses-Google-Bug: 159488342
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: indicate via a block bitmap read is prefetched via a tracepoint</title>
<updated>2020-08-07T18:12:35+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2020-07-15T15:48:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ab74c7b23f3770935016e3eb3ecdf1e42b73efaa'/>
<id>ab74c7b23f3770935016e3eb3ecdf1e42b73efaa</id>
<content type='text'>
Modify the ext4_read_block_bitmap_load tracepoint so that it tells us
whether a block bitmap is being prefetched.

Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Artem Blagodarenko &lt;artem.blagodarenko@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Modify the ext4_read_block_bitmap_load tracepoint so that it tells us
whether a block bitmap is being prefetched.

Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Artem Blagodarenko &lt;artem.blagodarenko@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: mballoc: use lock for checking free blocks while retrying</title>
<updated>2020-06-04T03:16:53+00:00</updated>
<author>
<name>Ritesh Harjani</name>
<email>riteshh@linux.ibm.com</email>
</author>
<published>2020-05-20T06:40:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=993778306e7901a7286322f25c7c681dd47bede6'/>
<id>993778306e7901a7286322f25c7c681dd47bede6</id>
<content type='text'>
Currently while doing block allocation grp-&gt;bb_free may be getting
modified if discard is happening in parallel.
For e.g. consider a case where there are lot of threads who have
preallocated lot of blocks and there is a thread which is trying
to discard all of this group's PA. Now it could happen that
we see all of those group's bb_free is zero and fail the allocation
while there is sufficient space if we free up all the PA.

So this patch adds another flag "EXT4_MB_STRICT_CHECK" which will be set
if we are unable to allocate any blocks in the first try (since we may
not have considered blocks about to be discarded from PA lists).
So during retry attempt to allocate blocks we will use ext4_lock_group()
for checking if the group is good or not.

Signed-off-by: Ritesh Harjani &lt;riteshh@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/9cb740a117c958c36596f167b12af1beae9a68b7.1589955723.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently while doing block allocation grp-&gt;bb_free may be getting
modified if discard is happening in parallel.
For e.g. consider a case where there are lot of threads who have
preallocated lot of blocks and there is a thread which is trying
to discard all of this group's PA. Now it could happen that
we see all of those group's bb_free is zero and fail the allocation
while there is sufficient space if we free up all the PA.

So this patch adds another flag "EXT4_MB_STRICT_CHECK" which will be set
if we are unable to allocate any blocks in the first try (since we may
not have considered blocks about to be discarded from PA lists).
So during retry attempt to allocate blocks we will use ext4_lock_group()
for checking if the group is good or not.

Signed-off-by: Ritesh Harjani &lt;riteshh@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/9cb740a117c958c36596f167b12af1beae9a68b7.1589955723.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: translate a few more map flags to strings in tracepoints</title>
<updated>2020-06-04T03:16:49+00:00</updated>
<author>
<name>Eric Whitney</name>
<email>enwlinux@gmail.com</email>
</author>
<published>2020-04-15T20:31:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=493e83aafa02316bc79ec90041c378d7902194fa'/>
<id>493e83aafa02316bc79ec90041c378d7902194fa</id>
<content type='text'>
As new ext4_map_blocks() flags have been added, not all have gotten flag
bit to string translations to make tracepoint output more readable.
Fix that, and go one step further by adding a translation for the
EXT4_EX_NOCACHE flag as well.  The EXT4_EX_FORCE_CACHE flag can never
be set in a tracepoint in the current code, so there's no need to
bother with a translation for it right now.

Signed-off-by: Eric Whitney &lt;enwlinux@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20200415203140.30349-3-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As new ext4_map_blocks() flags have been added, not all have gotten flag
bit to string translations to make tracepoint output more readable.
Fix that, and go one step further by adding a translation for the
EXT4_EX_NOCACHE flag as well.  The EXT4_EX_FORCE_CACHE flag can never
be set in a tracepoint in the current code, so there's no need to
bother with a translation for it right now.

Signed-off-by: Eric Whitney &lt;enwlinux@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20200415203140.30349-3-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: remove EXT4_GET_BLOCKS_KEEP_SIZE flag</title>
<updated>2020-06-04T03:16:48+00:00</updated>
<author>
<name>Eric Whitney</name>
<email>enwlinux@gmail.com</email>
</author>
<published>2020-04-15T20:31:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9e52484c713321e84e8834803a44ca0a001376d2'/>
<id>9e52484c713321e84e8834803a44ca0a001376d2</id>
<content type='text'>
The eofblocks code was removed in the 5.7 release by "ext4: remove
EOFBLOCKS_FL and associated code" (4337ecd1fe99).  The ext4_map_blocks()
flag used to trigger it can now be removed as well.

Signed-off-by: Eric Whitney &lt;enwlinux@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20200415203140.30349-2-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The eofblocks code was removed in the 5.7 release by "ext4: remove
EOFBLOCKS_FL and associated code" (4337ecd1fe99).  The ext4_map_blocks()
flag used to trigger it can now be removed as well.

Signed-off-by: Eric Whitney &lt;enwlinux@gmail.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20200415203140.30349-2-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix extent_status trace points</title>
<updated>2020-01-25T07:03:03+00:00</updated>
<author>
<name>Dmitry Monakhov</name>
<email>dmonakhov@gmail.com</email>
</author>
<published>2019-11-14T20:01:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=52144d893d76294db9ed79a909397ea81bc25a02'/>
<id>52144d893d76294db9ed79a909397ea81bc25a02</id>
<content type='text'>
Show pblock only if it has meaningful value.

# before
   ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [1/4294967294) 576460752303423487 H
   ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [2/4294967293) 576460752303423487 HR
# after
   ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [1/4294967294) 0 H
   ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [2/4294967293) 0 HR

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@gmail.com&gt;
Link: https://lore.kernel.org/r/20191114200147.1073-2-dmonakhov@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Show pblock only if it has meaningful value.

# before
   ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [1/4294967294) 576460752303423487 H
   ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [2/4294967293) 576460752303423487 HR
# after
   ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [1/4294967294) 0 H
   ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [2/4294967293) 0 HR

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@gmail.com&gt;
Link: https://lore.kernel.org/r/20191114200147.1073-2-dmonakhov@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix symbolic enum printing in trace output</title>
<updated>2020-01-25T07:03:02+00:00</updated>
<author>
<name>Dmitry Monakhov</name>
<email>dmonakhov@gmail.com</email>
</author>
<published>2019-11-14T20:01:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=459c80742e6a4a5375d9765a326e6047103460f5'/>
<id>459c80742e6a4a5375d9765a326e6047103460f5</id>
<content type='text'>
Trace's macro __print_flags() produce raw event's decraration w/o knowing actual
flags value

cat /sys/kernel/debug/tracing/events/ext4/ext4_ext_map_blocks_exit/format
..
__print_flags(REC-&gt;mflags, "", { (1 &lt;&lt; BH_New),

For that reason we have to explicitly define it via special macro TRACE_DEFINE_ENUM()
Also add missed EXTENT_STATUS_REFERENCED flag.

#Before patch
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 2 flags  lblk 0 pblk 4177 len 1 mflags 0x20 ret 1
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 12 flags CREATE lblk 0 pblk 34304 len 1 mflags 0x60 ret 1

#With patch
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 2 flags  lblk 0 pblk 4177 len 1 mflags M ret 1
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 12 flags CREATE lblk 0 pblk 34816 len 1 mflags NM ret 1

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@gmail.com&gt;
Link: https://lore.kernel.org/r/20191114200147.1073-1-dmonakhov@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Trace's macro __print_flags() produce raw event's decraration w/o knowing actual
flags value

cat /sys/kernel/debug/tracing/events/ext4/ext4_ext_map_blocks_exit/format
..
__print_flags(REC-&gt;mflags, "", { (1 &lt;&lt; BH_New),

For that reason we have to explicitly define it via special macro TRACE_DEFINE_ENUM()
Also add missed EXTENT_STATUS_REFERENCED flag.

#Before patch
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 2 flags  lblk 0 pblk 4177 len 1 mflags 0x20 ret 1
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 12 flags CREATE lblk 0 pblk 34304 len 1 mflags 0x60 ret 1

#With patch
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 2 flags  lblk 0 pblk 4177 len 1 mflags M ret 1
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 12 flags CREATE lblk 0 pblk 34816 len 1 mflags NM ret 1

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@gmail.com&gt;
Link: https://lore.kernel.org/r/20191114200147.1073-1-dmonakhov@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: Reserve revoke credits for freed blocks</title>
<updated>2019-11-05T21:00:49+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2019-11-05T16:44:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=83448bdfb59731c2f54784ed3f4a93ff95be6e7e'/>
<id>83448bdfb59731c2f54784ed3f4a93ff95be6e7e</id>
<content type='text'>
So far we have reserved only relatively high fixed amount of revoke
credits for each transaction. We over-reserved by large amount for most
cases but when freeing large directories or files with data journalling,
the fixed amount is not enough. In fact the worst case estimate is
inconveniently large (maximum extent size) for freeing of one extent.

We fix this by doing proper estimate of the amount of blocks that need
to be revoked when removing blocks from the inode due to truncate or
hole punching and otherwise reserve just a small amount of revoke
credits for each transaction to accommodate freeing of xattrs block or
so.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20191105164437.32602-23-jack@suse.cz
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So far we have reserved only relatively high fixed amount of revoke
credits for each transaction. We over-reserved by large amount for most
cases but when freeing large directories or files with data journalling,
the fixed amount is not enough. In fact the worst case estimate is
inconveniently large (maximum extent size) for freeing of one extent.

We fix this by doing proper estimate of the amount of blocks that need
to be revoked when removing blocks from the inode due to truncate or
hole punching and otherwise reserve just a small amount of revoke
credits for each transaction to accommodate freeing of xattrs block or
so.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20191105164437.32602-23-jack@suse.cz
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: force inode writes when nfsd calls commit_metadata()</title>
<updated>2018-12-19T19:07:58+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2018-12-19T19:07:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fde872682e175743e0c3ef939c89e3c6008a1529'/>
<id>fde872682e175743e0c3ef939c89e3c6008a1529</id>
<content type='text'>
Some time back, nfsd switched from calling vfs_fsync() to using a new
commit_metadata() hook in export_operations().  If the file system did
not provide a commit_metadata() hook, it fell back to using
sync_inode_metadata().  Unfortunately doesn't work on all file
systems.  In particular, it doesn't work on ext4 due to how the inode
gets journalled --- the VFS writeback code will not always call
ext4_write_inode().

So we need to provide our own ext4_nfs_commit_metdata() method which
calls ext4_write_inode() directly.

Google-Bug-Id: 121195940
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some time back, nfsd switched from calling vfs_fsync() to using a new
commit_metadata() hook in export_operations().  If the file system did
not provide a commit_metadata() hook, it fell back to using
sync_inode_metadata().  Unfortunately doesn't work on all file
systems.  In particular, it doesn't work on ext4 due to how the inode
gets journalled --- the VFS writeback code will not always call
ext4_write_inode().

So we need to provide our own ext4_nfs_commit_metdata() method which
calls ext4_write_inode() directly.

Google-Bug-Id: 121195940
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</pre>
</div>
</content>
</entry>
</feed>
