<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/ext4/inline.c, branch v5.18.3</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>ext4: fix bug_on in ext4_writepages</title>
<updated>2022-06-09T08:30:43+00:00</updated>
<author>
<name>Ye Bin</name>
<email>yebin10@huawei.com</email>
</author>
<published>2022-05-16T12:26:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=013f12bdedb96816aaa27ee04349f4433d361f52'/>
<id>013f12bdedb96816aaa27ee04349f4433d361f52</id>
<content type='text'>
commit ef09ed5d37b84d18562b30cf7253e57062d0db05 upstream.

we got issue as follows:
EXT4-fs error (device loop0): ext4_mb_generate_buddy:1141: group 0, block bitmap and bg descriptor inconsistent: 25 vs 31513 free cls
------------[ cut here ]------------
kernel BUG at fs/ext4/inode.c:2708!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 2 PID: 2147 Comm: rep Not tainted 5.18.0-rc2-next-20220413+ #155
RIP: 0010:ext4_writepages+0x1977/0x1c10
RSP: 0018:ffff88811d3e7880 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88811c098000
RDX: 0000000000000000 RSI: ffff88811c098000 RDI: 0000000000000002
RBP: ffff888128140f50 R08: ffffffffb1ff6387 R09: 0000000000000000
R10: 0000000000000007 R11: ffffed10250281ea R12: 0000000000000001
R13: 00000000000000a4 R14: ffff88811d3e7bb8 R15: ffff888128141028
FS:  00007f443aed9740(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020007200 CR3: 000000011c2a4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 &lt;TASK&gt;
 do_writepages+0x130/0x3a0
 filemap_fdatawrite_wbc+0x83/0xa0
 filemap_flush+0xab/0xe0
 ext4_alloc_da_blocks+0x51/0x120
 __ext4_ioctl+0x1534/0x3210
 __x64_sys_ioctl+0x12c/0x170
 do_syscall_64+0x3b/0x90

It may happen as follows:
1. write inline_data inode
vfs_write
  new_sync_write
    ext4_file_write_iter
      ext4_buffered_write_iter
        generic_perform_write
          ext4_da_write_begin
            ext4_da_write_inline_data_begin -&gt; If inline data size too
            small will allocate block to write, then mapping will has
            dirty page
                ext4_da_convert_inline_data_to_extent -&gt;clear EXT4_STATE_MAY_INLINE_DATA
2. fallocate
do_vfs_ioctl
  ioctl_preallocate
    vfs_fallocate
      ext4_fallocate
        ext4_convert_inline_data
          ext4_convert_inline_data_nolock
            ext4_map_blocks -&gt; fail will goto restore data
            ext4_restore_inline_data
              ext4_create_inline_data
              ext4_write_inline_data
              ext4_set_inode_state -&gt; set inode EXT4_STATE_MAY_INLINE_DATA
3. writepages
__ext4_ioctl
  ext4_alloc_da_blocks
    filemap_flush
      filemap_fdatawrite_wbc
        do_writepages
          ext4_writepages
            if (ext4_has_inline_data(inode))
              BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA))

The root cause of this issue is we destory inline data until call
ext4_writepages under delay allocation mode.  But there maybe already
convert from inline to extent.  To solve this issue, we call
filemap_flush first..

Cc: stable@kernel.org
Signed-off-by: Ye Bin &lt;yebin10@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20220516122634.1690462-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ef09ed5d37b84d18562b30cf7253e57062d0db05 upstream.

we got issue as follows:
EXT4-fs error (device loop0): ext4_mb_generate_buddy:1141: group 0, block bitmap and bg descriptor inconsistent: 25 vs 31513 free cls
------------[ cut here ]------------
kernel BUG at fs/ext4/inode.c:2708!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 2 PID: 2147 Comm: rep Not tainted 5.18.0-rc2-next-20220413+ #155
RIP: 0010:ext4_writepages+0x1977/0x1c10
RSP: 0018:ffff88811d3e7880 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88811c098000
RDX: 0000000000000000 RSI: ffff88811c098000 RDI: 0000000000000002
RBP: ffff888128140f50 R08: ffffffffb1ff6387 R09: 0000000000000000
R10: 0000000000000007 R11: ffffed10250281ea R12: 0000000000000001
R13: 00000000000000a4 R14: ffff88811d3e7bb8 R15: ffff888128141028
FS:  00007f443aed9740(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020007200 CR3: 000000011c2a4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 &lt;TASK&gt;
 do_writepages+0x130/0x3a0
 filemap_fdatawrite_wbc+0x83/0xa0
 filemap_flush+0xab/0xe0
 ext4_alloc_da_blocks+0x51/0x120
 __ext4_ioctl+0x1534/0x3210
 __x64_sys_ioctl+0x12c/0x170
 do_syscall_64+0x3b/0x90

It may happen as follows:
1. write inline_data inode
vfs_write
  new_sync_write
    ext4_file_write_iter
      ext4_buffered_write_iter
        generic_perform_write
          ext4_da_write_begin
            ext4_da_write_inline_data_begin -&gt; If inline data size too
            small will allocate block to write, then mapping will has
            dirty page
                ext4_da_convert_inline_data_to_extent -&gt;clear EXT4_STATE_MAY_INLINE_DATA
2. fallocate
do_vfs_ioctl
  ioctl_preallocate
    vfs_fallocate
      ext4_fallocate
        ext4_convert_inline_data
          ext4_convert_inline_data_nolock
            ext4_map_blocks -&gt; fail will goto restore data
            ext4_restore_inline_data
              ext4_create_inline_data
              ext4_write_inline_data
              ext4_set_inode_state -&gt; set inode EXT4_STATE_MAY_INLINE_DATA
3. writepages
__ext4_ioctl
  ext4_alloc_da_blocks
    filemap_flush
      filemap_fdatawrite_wbc
        do_writepages
          ext4_writepages
            if (ext4_has_inline_data(inode))
              BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA))

The root cause of this issue is we destory inline data until call
ext4_writepages under delay allocation mode.  But there maybe already
convert from inline to extent.  To solve this issue, we call
filemap_flush first..

Cc: stable@kernel.org
Signed-off-by: Ye Bin &lt;yebin10@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/20220516122634.1690462-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix fs corruption when tring to remove a non-empty directory with IO error</title>
<updated>2022-03-03T04:50:02+00:00</updated>
<author>
<name>Ye Bin</name>
<email>yebin10@huawei.com</email>
</author>
<published>2022-02-28T02:48:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7aab5c84a0f6ec2290e2ba4a6b245178b1bf949a'/>
<id>7aab5c84a0f6ec2290e2ba4a6b245178b1bf949a</id>
<content type='text'>
We inject IO error when rmdir non empty direcory, then got issue as follows:
step1: mkfs.ext4 -F /dev/sda
step2: mount /dev/sda  test
step3: cd test
step4: mkdir -p 1/2
step5: rmdir 1
	[  110.920551] ext4_empty_dir: inject fault
	[  110.921926] EXT4-fs warning (device sda): ext4_rmdir:3113: inode #12:
	comm rmdir: empty directory '1' has too many links (3)
step6: cd ..
step7: umount test
step8: fsck.ext4 -f /dev/sda
	e2fsck 1.42.9 (28-Dec-2013)
	Pass 1: Checking inodes, blocks, and sizes
	Pass 2: Checking directory structure
	Entry '..' in .../??? (13) has deleted/unused inode 12.  Clear&lt;y&gt;? yes
	Pass 3: Checking directory connectivity
	Unconnected directory inode 13 (...)
	Connect to /lost+found&lt;y&gt;? yes
	Pass 4: Checking reference counts
	Inode 13 ref count is 3, should be 2.  Fix&lt;y&gt;? yes
	Pass 5: Checking group summary information

	/dev/sda: ***** FILE SYSTEM WAS MODIFIED *****
	/dev/sda: 12/131072 files (0.0% non-contiguous), 26157/524288 blocks

ext4_rmdir
	if (!ext4_empty_dir(inode))
		goto end_rmdir;
ext4_empty_dir
	bh = ext4_read_dirblock(inode, 0, DIRENT_HTREE);
	if (IS_ERR(bh))
		return true;
Now if read directory block failed, 'ext4_empty_dir' will return true, assume
directory is empty. Obviously, it will lead to above issue.
To solve this issue, if read directory block failed 'ext4_empty_dir' just
return false. To avoid making things worse when file system is already
corrupted, 'ext4_empty_dir' also return false.

Signed-off-by: Ye Bin &lt;yebin10@huawei.com&gt;
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20220228024815.3952506-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We inject IO error when rmdir non empty direcory, then got issue as follows:
step1: mkfs.ext4 -F /dev/sda
step2: mount /dev/sda  test
step3: cd test
step4: mkdir -p 1/2
step5: rmdir 1
	[  110.920551] ext4_empty_dir: inject fault
	[  110.921926] EXT4-fs warning (device sda): ext4_rmdir:3113: inode #12:
	comm rmdir: empty directory '1' has too many links (3)
step6: cd ..
step7: umount test
step8: fsck.ext4 -f /dev/sda
	e2fsck 1.42.9 (28-Dec-2013)
	Pass 1: Checking inodes, blocks, and sizes
	Pass 2: Checking directory structure
	Entry '..' in .../??? (13) has deleted/unused inode 12.  Clear&lt;y&gt;? yes
	Pass 3: Checking directory connectivity
	Unconnected directory inode 13 (...)
	Connect to /lost+found&lt;y&gt;? yes
	Pass 4: Checking reference counts
	Inode 13 ref count is 3, should be 2.  Fix&lt;y&gt;? yes
	Pass 5: Checking group summary information

	/dev/sda: ***** FILE SYSTEM WAS MODIFIED *****
	/dev/sda: 12/131072 files (0.0% non-contiguous), 26157/524288 blocks

ext4_rmdir
	if (!ext4_empty_dir(inode))
		goto end_rmdir;
ext4_empty_dir
	bh = ext4_read_dirblock(inode, 0, DIRENT_HTREE);
	if (IS_ERR(bh))
		return true;
Now if read directory block failed, 'ext4_empty_dir' will return true, assume
directory is empty. Obviously, it will lead to above issue.
To solve this issue, if read directory block failed 'ext4_empty_dir' just
return false. To avoid making things worse when file system is already
corrupted, 'ext4_empty_dir' also return false.

Signed-off-by: Ye Bin &lt;yebin10@huawei.com&gt;
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20220228024815.3952506-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4</title>
<updated>2022-02-06T18:34:45+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-02-06T18:34:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d8ad2ce873abab1cfd38779c626b79cef6307aac'/>
<id>d8ad2ce873abab1cfd38779c626b79cef6307aac</id>
<content type='text'>
Pull ext4 fixes from Ted Ts'o:
 "Various bug fixes for ext4 fast commit and inline data handling.

  Also fix regression introduced as part of moving to the new mount API"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  fs/ext4: fix comments mentioning i_mutex
  ext4: fix incorrect type issue during replay_del_range
  jbd2: fix kernel-doc descriptions for jbd2_journal_shrink_{scan,count}()
  ext4: fix potential NULL pointer dereference in ext4_fill_super()
  jbd2: refactor wait logic for transaction updates into a common function
  jbd2: cleanup unused functions declarations from jbd2.h
  ext4: fix error handling in ext4_fc_record_modified_inode()
  ext4: remove redundant max inline_size check in ext4_da_write_inline_data_begin()
  ext4: fix error handling in ext4_restore_inline_data()
  ext4: fast commit may miss file actions
  ext4: fast commit may not fallback for ineligible commit
  ext4: modify the logic of ext4_mb_new_blocks_simple
  ext4: prevent used blocks from being allocated during fast commit replay
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull ext4 fixes from Ted Ts'o:
 "Various bug fixes for ext4 fast commit and inline data handling.

  Also fix regression introduced as part of moving to the new mount API"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  fs/ext4: fix comments mentioning i_mutex
  ext4: fix incorrect type issue during replay_del_range
  jbd2: fix kernel-doc descriptions for jbd2_journal_shrink_{scan,count}()
  ext4: fix potential NULL pointer dereference in ext4_fill_super()
  jbd2: refactor wait logic for transaction updates into a common function
  jbd2: cleanup unused functions declarations from jbd2.h
  ext4: fix error handling in ext4_fc_record_modified_inode()
  ext4: remove redundant max inline_size check in ext4_da_write_inline_data_begin()
  ext4: fix error handling in ext4_restore_inline_data()
  ext4: fast commit may miss file actions
  ext4: fast commit may not fallback for ineligible commit
  ext4: modify the logic of ext4_mb_new_blocks_simple
  ext4: prevent used blocks from being allocated during fast commit replay
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: remove redundant max inline_size check in ext4_da_write_inline_data_begin()</title>
<updated>2022-02-03T15:57:30+00:00</updated>
<author>
<name>Ritesh Harjani</name>
<email>riteshh@linux.ibm.com</email>
</author>
<published>2022-01-17T12:11:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=09355d9d038a1590ee055831a4ad3a79952cfa8b'/>
<id>09355d9d038a1590ee055831a4ad3a79952cfa8b</id>
<content type='text'>
ext4_prepare_inline_data() already checks for ext4_get_max_inline_size()
and returns -ENOSPC. So there is no need to check it twice within
ext4_da_write_inline_data_begin(). This patch removes the extra check.

It also makes it more clean.

No functionality change in this patch.

Signed-off-by: Ritesh Harjani &lt;riteshh@linux.ibm.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/cdd1654128d5105550c65fd13ca5da53b2162cc4.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ext4_prepare_inline_data() already checks for ext4_get_max_inline_size()
and returns -ENOSPC. So there is no need to check it twice within
ext4_da_write_inline_data_begin(). This patch removes the extra check.

It also makes it more clean.

No functionality change in this patch.

Signed-off-by: Ritesh Harjani &lt;riteshh@linux.ibm.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/cdd1654128d5105550c65fd13ca5da53b2162cc4.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix error handling in ext4_restore_inline_data()</title>
<updated>2022-02-03T15:57:20+00:00</updated>
<author>
<name>Ritesh Harjani</name>
<email>riteshh@linux.ibm.com</email>
</author>
<published>2022-01-17T12:11:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=897026aaa73eb2517dfea8d147f20ddb0b813044'/>
<id>897026aaa73eb2517dfea8d147f20ddb0b813044</id>
<content type='text'>
While running "./check -I 200 generic/475" it sometimes gives below
kernel BUG(). Ideally we should not call ext4_write_inline_data() if
ext4_create_inline_data() has failed.

&lt;log snip&gt;
[73131.453234] kernel BUG at fs/ext4/inline.c:223!

&lt;code snip&gt;
 212 static void ext4_write_inline_data(struct inode *inode, struct ext4_iloc *iloc,
 213                                    void *buffer, loff_t pos, unsigned int len)
 214 {
&lt;...&gt;
 223         BUG_ON(!EXT4_I(inode)-&gt;i_inline_off);
 224         BUG_ON(pos + len &gt; EXT4_I(inode)-&gt;i_inline_size);

This patch handles the error and prints out a emergency msg saying potential
data loss for the given inode (since we couldn't restore the original
inline_data due to some previous error).

[ 9571.070313] EXT4-fs (dm-0): error restoring inline_data for inode -- potential data loss! (inode 1703982, error -30)

Reported-by: Eric Whitney &lt;enwlinux@gmail.com&gt;
Signed-off-by: Ritesh Harjani &lt;riteshh@linux.ibm.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/9f4cd7dfd54fa58ff27270881823d94ddf78dd07.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While running "./check -I 200 generic/475" it sometimes gives below
kernel BUG(). Ideally we should not call ext4_write_inline_data() if
ext4_create_inline_data() has failed.

&lt;log snip&gt;
[73131.453234] kernel BUG at fs/ext4/inline.c:223!

&lt;code snip&gt;
 212 static void ext4_write_inline_data(struct inode *inode, struct ext4_iloc *iloc,
 213                                    void *buffer, loff_t pos, unsigned int len)
 214 {
&lt;...&gt;
 223         BUG_ON(!EXT4_I(inode)-&gt;i_inline_off);
 224         BUG_ON(pos + len &gt; EXT4_I(inode)-&gt;i_inline_size);

This patch handles the error and prints out a emergency msg saying potential
data loss for the given inode (since we couldn't restore the original
inline_data due to some previous error).

[ 9571.070313] EXT4-fs (dm-0): error restoring inline_data for inode -- potential data loss! (inode 1703982, error -30)

Reported-by: Eric Whitney &lt;enwlinux@gmail.com&gt;
Signed-off-by: Ritesh Harjani &lt;riteshh@linux.ibm.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Link: https://lore.kernel.org/r/9f4cd7dfd54fa58ff27270881823d94ddf78dd07.1642416995.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: stable@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: introduce memalloc_retry_wait()</title>
<updated>2022-01-15T14:30:29+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2022-01-14T22:07:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4034247a0d6ab281ba3293798ce67af494d86129'/>
<id>4034247a0d6ab281ba3293798ce67af494d86129</id>
<content type='text'>
Various places in the kernel - largely in filesystems - respond to a
memory allocation failure by looping around and re-trying.  Some of
these cannot conveniently use __GFP_NOFAIL, for reasons such as:

 - a GFP_ATOMIC allocation, which __GFP_NOFAIL doesn't work on
 - a need to check for the process being signalled between failures
 - the possibility that other recovery actions could be performed
 - the allocation is quite deep in support code, and passing down an
   extra flag to say if __GFP_NOFAIL is wanted would be clumsy.

Many of these currently use congestion_wait() which (in almost all
cases) simply waits the given timeout - congestion isn't tracked for
most devices.

It isn't clear what the best delay is for loops, but it is clear that
the various filesystems shouldn't be responsible for choosing a timeout.

This patch introduces memalloc_retry_wait() with takes on that
responsibility.  Code that wants to retry a memory allocation can call
this function passing the GFP flags that were used.  It will wait
however is appropriate.

For now, it only considers __GFP_NORETRY and whatever
gfpflags_allow_blocking() tests.  If blocking is allowed without
__GFP_NORETRY, then alloc_page either made some reclaim progress, or
waited for a while, before failing.  So there is no need for much
further waiting.  memalloc_retry_wait() will wait until the current
jiffie ends.  If this condition is not met, then alloc_page() won't have
waited much if at all.  In that case memalloc_retry_wait() waits about
200ms.  This is the delay that most current loops uses.

linux/sched/mm.h needs to be included in some files now,
but linux/backing-dev.h does not.

Link: https://lkml.kernel.org/r/163754371968.13692.1277530886009912421@noble.neil.brown.name
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Darrick J. Wong &lt;djwong@kernel.org&gt;
Cc: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Various places in the kernel - largely in filesystems - respond to a
memory allocation failure by looping around and re-trying.  Some of
these cannot conveniently use __GFP_NOFAIL, for reasons such as:

 - a GFP_ATOMIC allocation, which __GFP_NOFAIL doesn't work on
 - a need to check for the process being signalled between failures
 - the possibility that other recovery actions could be performed
 - the allocation is quite deep in support code, and passing down an
   extra flag to say if __GFP_NOFAIL is wanted would be clumsy.

Many of these currently use congestion_wait() which (in almost all
cases) simply waits the given timeout - congestion isn't tracked for
most devices.

It isn't clear what the best delay is for loops, but it is clear that
the various filesystems shouldn't be responsible for choosing a timeout.

This patch introduces memalloc_retry_wait() with takes on that
responsibility.  Code that wants to retry a memory allocation can call
this function passing the GFP flags that were used.  It will wait
however is appropriate.

For now, it only considers __GFP_NORETRY and whatever
gfpflags_allow_blocking() tests.  If blocking is allowed without
__GFP_NORETRY, then alloc_page either made some reclaim progress, or
waited for a while, before failing.  So there is no need for much
further waiting.  memalloc_retry_wait() will wait until the current
jiffie ends.  If this condition is not met, then alloc_page() won't have
waited much if at all.  In that case memalloc_retry_wait() waits about
200ms.  This is the delay that most current loops uses.

linux/sched/mm.h needs to be included in some files now,
but linux/backing-dev.h does not.

Link: https://lkml.kernel.org/r/163754371968.13692.1277530886009912421@noble.neil.brown.name
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Cc: Chao Yu &lt;chao@kernel.org&gt;
Cc: Darrick J. Wong &lt;djwong@kernel.org&gt;
Cc: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: remove extent cache entries when truncating inline data</title>
<updated>2021-09-09T14:52:05+00:00</updated>
<author>
<name>Eric Whitney</name>
<email>enwlinux@gmail.com</email>
</author>
<published>2021-08-19T14:49:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0add491df4e5e2c8cc6eeeaa6dbcca50f932090c'/>
<id>0add491df4e5e2c8cc6eeeaa6dbcca50f932090c</id>
<content type='text'>
Conditionally remove all cached extents belonging to an inode
when truncating its inline data.  It's only necessary to attempt to
remove cached extents when a conversion from inline to extent storage
has been initiated (!EXT4_STATE_MAY_INLINE_DATA).  This avoids
unnecessary es lock overhead in the more common inline case.

Signed-off-by: Eric Whitney &lt;enwlinux@gmail.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Link: https://lore.kernel.org/r/20210819144927.25163-2-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conditionally remove all cached extents belonging to an inode
when truncating its inline data.  It's only necessary to attempt to
remove cached extents when a conversion from inline to extent storage
has been initiated (!EXT4_STATE_MAY_INLINE_DATA).  This avoids
unnecessary es lock overhead in the more common inline case.

Signed-off-by: Eric Whitney &lt;enwlinux@gmail.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Link: https://lore.kernel.org/r/20210819144927.25163-2-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'delalloc-buffer-write' into dev</title>
<updated>2021-09-09T14:47:06+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2021-09-05T03:46:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=11ef08c9eb52a808b8903004cba0733df6902a43'/>
<id>11ef08c9eb52a808b8903004cba0733df6902a43</id>
<content type='text'>
Fix a bug in how we update i_disksize, and the error path in
inline_data_end.  Finally, drop an unnecessary creation of a journal
handle which was only needed for inline data, which can give us a
large performance gain in delayed allocation writes.

Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix a bug in how we update i_disksize, and the error path in
inline_data_end.  Finally, drop an unnecessary creation of a journal
handle which was only needed for inline data, which can give us a
large performance gain in delayed allocation writes.

Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: factor out write end code of inline file</title>
<updated>2021-09-05T03:38:18+00:00</updated>
<author>
<name>Zhang Yi</name>
<email>yi.zhang@huawei.com</email>
</author>
<published>2021-07-16T12:20:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6984aef59814fb5c47b0e30c56e101186b5ebf8c'/>
<id>6984aef59814fb5c47b0e30c56e101186b5ebf8c</id>
<content type='text'>
Now that the inline_data file write end procedure are falled into the
common write end functions, it is not clear. Factor them out and do
some cleanup. This patch also drop ext4_da_write_inline_data_end()
and switch to use ext4_write_inline_data_end() instead because we also
need to do the same error processing if we failed to write data into
inline entry.

Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Link: https://lore.kernel.org/r/20210716122024.1105856-4-yi.zhang@huawei.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that the inline_data file write end procedure are falled into the
common write end functions, it is not clear. Factor them out and do
some cleanup. This patch also drop ext4_da_write_inline_data_end()
and switch to use ext4_write_inline_data_end() instead because we also
need to do the same error processing if we failed to write data into
inline entry.

Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Link: https://lore.kernel.org/r/20210716122024.1105856-4-yi.zhang@huawei.com
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: correct the error path of ext4_write_inline_data_end()</title>
<updated>2021-09-05T03:38:18+00:00</updated>
<author>
<name>Zhang Yi</name>
<email>yi.zhang@huawei.com</email>
</author>
<published>2021-07-16T12:20:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=55ce2f649b9e88111270333a8127e23f4f8f42d7'/>
<id>55ce2f649b9e88111270333a8127e23f4f8f42d7</id>
<content type='text'>
Current error path of ext4_write_inline_data_end() is not correct.

Firstly, it should pass out the error value if ext4_get_inode_loc()
return fail, or else it could trigger infinite loop if we inject error
here. And then it's better to add inode to orphan list if it return fail
in ext4_journal_stop(), otherwise we could not restore inline xattr
entry after power failure. Finally, we need to reset the 'ret' value if
ext4_write_inline_data_end() return success in ext4_write_end() and
ext4_journalled_write_end(), otherwise we could not get the error return
value of ext4_journal_stop().

Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Link: https://lore.kernel.org/r/20210716122024.1105856-3-yi.zhang@huawei.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current error path of ext4_write_inline_data_end() is not correct.

Firstly, it should pass out the error value if ext4_get_inode_loc()
return fail, or else it could trigger infinite loop if we inject error
here. And then it's better to add inode to orphan list if it return fail
in ext4_journal_stop(), otherwise we could not restore inline xattr
entry after power failure. Finally, we need to reset the 'ret' value if
ext4_write_inline_data_end() return success in ext4_write_end() and
ext4_journalled_write_end(), otherwise we could not get the error return
value of ext4_journal_stop().

Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Link: https://lore.kernel.org/r/20210716122024.1105856-3-yi.zhang@huawei.com
</pre>
</div>
</content>
</entry>
</feed>
