| Age | Commit message (Collapse) | Author |
|
The --send_omit_free flag is needed for TCP zero copy tests, to ensure
that packetdrill doesn't free the send() buffer after the send() call.
Fixes: 1e42f73fd3c2 ("selftests/net: packetdrill: import tcp/zerocopy")
Closes: https://lore.kernel.org/netdev/20251124071831.4cbbf412@kernel.org/
Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20251125234029.1320984-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Change do_withdraw() to clear the SDF_JOURNAL_LIVE flag under the log
flush lock. In addition, change __gfs2_trans_begin() to check if the
filesystem is already known to be withdrawn using gfs2_withdrawn().
Then, once we are holding the log flush lock, check if the
SDF_JOURNAL_LIVE flag is still set. This second check ensures that the
filesystem will remain live until the transaction is submitted.
With these changes, it is no longer useful to clear SDF_JOURNAL_LIVE in
gfs2_end_log_write() after calling gfs2_withdraw().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Previously, when a withdraw occurred, we would wait for another node to
recover our journal. This also meant that frozen filesystem needed to
be thawed because otherwise, other nodes wouldn't be able to recover the
filesystem. With the reversal of commit 601ef0d52e96 ("gfs2: Force
withdraw to replay journals and wait for it to finish"), we are no
longer waiting for journal recovery during a withdraw, so we no longer
need to thaw frozen filesystems, either. This also fixes a potential
deadlock reported by lockdep when running xfstest generic/108.
In addition, there is nothing left in do_withdraw() that would require
taking sd_freeze_mutex, so don't bother taking that lock there anymore.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
We can now withdraw while the log is locked.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Currently, when a gfs2 filesystem is withdrawn, an "offline" uevent is
triggered that invokes gfs2-util's gfs2_withdraw_helper script. The
purpose of this script is to deactivate the filesystem's block device so
that it can be withdrawn immediately, even before all the filesystem's
caches have been discarded. The script provided by gfs2-utils never did
anything useful, and there was no way for it to report back its status
to the kernel.
To fix that, extend the gfs2_withdraw_helper mechanism so that the
script can report one of the following results by writing the
corresponding value into "/sys$DEVPATH/lock_module/withdraw":
0 - The shared block device has been marked inactive. Future write
operations will fail.
1 - The shared block device may still be active and carry out
write operations.
If the "offline" uevent isn't reacted upon within the timeout configured
in /sys$DEVPATH/tune/withdraw_helper_timeout (default 5 seconds), the
event handler is assumed to have failed.
In addition, add an additional "errors=deactivate" mount option.
With these changes, if fatal errors are detected on a gfs2 filesystem
and the filesystem is mounted with the "errors=panic" option, the kernel
will panic immediately. Otherwise, an attempt will be made to
deactivate the underlying block device. If successful, the kernel will
release all cluster-wide locks immediately so that the rest of the
cluster can continue. If unsuccessful, the kernel will either panic
("errors=deactivate"), or it will purge all filesystem I/O before
releasing all cluster-wide locks ("errors=withdraw").
Note that the gfs2_withdraw_helper script still needs to be fixed to
take advantage of these improvements. It could be changed to use a
mechanism like LVM Persistent Reservations. "dmsetup suspend" is not a
suitable mechanism as it infinitely postpones I/O operations, which may
prevent withdraw from completing.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
During a withdraw, we don't want to write out any more data than we have
to, so in do_xmote(), skip the ->go_sync() glock operation. We still
want to keep calling ->go_inval() to discard any cached data or
metadata, whether clean or dirty.
We do still allow glocks to transition into state LM_ST_UNLOCKED. This
has the desired side effect of calling ->go_inval() and invalidating the
glock caches.
Function gfs2_withdraw_glocks() is already used for dequeuing any
left-over waiters. We still want that to happen, but additionally, we
want all glocks to be unlocked.
Finally, we change function do_promote() to refuse any further
promotions.
This commit cleans up the leftovers of commit 86934198eefa ("gfs2: Clear
flags when withdraw prevents xmote").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Rename function gfs2_gl_dq_holders() to gfs2_withdraw_glocks(). This
function will soon be used for more than just dequeuing holders.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts commit 33dbd1e41a1d ("gfs2: fix infinite loop when checking ail
item count before go_inval").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts commit a72d2401f54b ("gfs2: Allow some glocks to be used during
withdraw").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts the rest of d93ae386ef3d ("gfs2: Check for log write errors
before telling dlm to unlock").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts commit 865cc3e9cc0b ("gfs2: fix a deadlock on
withdraw-during-mount").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts parts of commit 601ef0d52e96 ("gfs2: Force withdraw to replay
journals and wait for it to finish").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts parts of commit 601ef0d52e96 ("gfs2: Force withdraw to replay
journals and wait for it to finish").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts parts of commit 601ef0d52e96 ("gfs2: Force withdraw to replay
journals and wait for it to finish").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts parts of commit 601ef0d52e96 ("gfs2: Force withdraw to replay
journals and wait for it to finish").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts parts of commit 601ef0d52e96 ("gfs2: Force withdraw to replay
journals and wait for it to finish").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
Reverts parts of commit 601ef0d52e96 ("gfs2: Force withdraw to replay
journals and wait for it to finish").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
The current withdraw code duplicates the journal recovery code gfs2
already has for dealing with node failures, and it does so poorly. That
code was added because when releasing a lockspace, we didn't have a way
to indicate that the lockspace needs recovery. We now do have this
feature, so the current withdraw code can be removed almost entirely.
This is one of several steps towards that.
The withdrawing node has no role in recovering from the withdraw
anymore, so it also no longer needs to read metadata blocks after a
withdraw.
We now only need to set a single bit in gfs2_withdraw(), so switch from
try_cmpxchg() to test_and_set_bit().
Reverts commit 8cc67f704f4b ("gfs2: don't stop reads while withdraw in
progress").
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
GFS sets the LM_FLAG_NOEXP flag on locking requests it makes during
journal recovery, so rename the flag to LM_FLAG_RECOVER for improved
code readability.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
All callers of gfs2_io_error_bh() call gfs2_withdraw() as well, so
change gfs2_io_error_bh() to call gfs2_withdraw() directly. This also
brings it in line with other similar error reporting functions.
With that, gfs2_io_error_bh() is the same as gfs2_io_error_bh_wd(), so
remove the latter.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Now that gfs2_withdraw() is asynchronous, immediately withdraw when
a log write error is detected.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
With delayed withdraws and the SDF_WITHDRAWING flag gone, we can now
rename gfs2_withdrawing_or_withdrawn() back to gfs2_withdrawn().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Now that gfs2_withdraw() is asynchronous, is can be called in any
context and there is no more need for gfs2_withdraw_delayed() or for
turning delayed withdraws into actual withdraws. Remove the
now-obsolete code.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
So far, withdraws are carried out in the context of the calling task.
When another task tries to withdraw while a withdraw is already
underway, that task blocks as well. Change that to carry out withdraws
asynchronously in workqueue context and don't block the task triggering
the withdraw anymore.
Fixes: syzbot+6b156e132970e550194c@syzkaller.appspotmail.com
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Add a 'clean' argument to ->lm_unmount() that indicates whether the
filesystem is clean or needs recovery. Set clean to true for normal
unmounts, and to false for withdraws.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Instead of tracking the remaining time, track the deadline of each of
the timeouts.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Commit e4a8b5481c59a ("gfs2: Switch to wait_event in gfs2_quotad") broke
cyclic statfs syncing, so the numbers reported by "df" could easily get
completely out of sync with reality. Fix this by reverting part of
commit e4a8b5481c59a for now.
A follow-up commit will clean this code up later.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
Rename gfs2_try_evict() to gfs2_try_to_evict(). The GIF_DEFER_DELETE
flag has been superceded by the GLF_DEFER_DELETE flag, so fix a
left-over comment. Add a clarifying comment to delete_work_func().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
When a node tries to delete an inode, it first requests exclusive access
to the iopen glock. This triggers demote requests on all remote nodes
currently holding the iopen glock. To satisfy those requests, the
remote nodes evict the inode in question, or they poke the corresponding
inode glock to signal that the inode is still in active use.
This behavior doesn't depend on whether or not a filesystem is
read-only, so remove the incorrect read-only check.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
|
|
char variable in 'so_txtime.c' & 'txtimestamp.c' were left uninitilized
when switch default case taken. which raises following warning.
txtimestamp.c:240:2: warning: variable 'tsname' is used uninitialized
whenever switch default is taken [-Wsometimes-uninitialized]
so_txtime.c:210:3: warning: variable 'reason' is used uninitialized
whenever switch default is taken [-Wsometimes-uninitialized]
initializing these variables to NULL to fix this.
Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com>
Link: https://patch.msgid.link/20251125165302.20079-1-ankitkhushwaha.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Document the RSPI controller on the Renesas RZ/V2N SoC. The block is
compatible with the RSPI implementation found on the RZ/V2H(P) family.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20251126131619.136605-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Merge series from Johan Hovold <johan@kernel.org>:
This series fixes device and OF node reference leaks during probe and
a clock prepare imbalance on probe failures.
Included is a related cleanup of an error path.
|
|
When discarding descriptors with IN_ORDER, we should rewind
next_avail_head otherwise it would run out of sync with
last_avail_idx. This would cause driver to report
"id X is not a head".
Fixing this by returning the number of descriptors that is used for
each buffer via vhost_get_vq_desc_n() so caller can use the value
while discarding descriptors.
Fixes: 67a873df0c41 ("vhost: basic in order support")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20251120022950.10117-1-jasowang@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With the previous commit revamping the timeout handling, started isn't
used anymore. It could be taken into account by adjusting the initial
value of the timeout, but there is little point as both callers capture
the timestamp shortly before calling __ceph_open_session() -- the only
thing of note that happens in the interim is taking client->mount_mutex
and that isn't expected to take multiple seconds.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
|
|
The wait loop in __ceph_open_session() can race with the client
receiving a new monmap or osdmap shortly after the initial map is
received. Both ceph_monc_handle_map() and handle_one_map() install
a new map immediately after freeing the old one
kfree(monc->monmap);
monc->monmap = monmap;
ceph_osdmap_destroy(osdc->osdmap);
osdc->osdmap = newmap;
under client->monc.mutex and client->osdc.lock respectively, but
because neither is taken in have_mon_and_osd_map() it's possible for
client->monc.monmap->epoch and client->osdc.osdmap->epoch arms in
client->monc.monmap && client->monc.monmap->epoch &&
client->osdc.osdmap && client->osdc.osdmap->epoch;
condition to dereference an already freed map. This happens to be
reproducible with generic/395 and generic/397 with KASAN enabled:
BUG: KASAN: slab-use-after-free in have_mon_and_osd_map+0x56/0x70
Read of size 4 at addr ffff88811012d810 by task mount.ceph/13305
CPU: 2 UID: 0 PID: 13305 Comm: mount.ceph Not tainted 6.14.0-rc2-build2+ #1266
...
Call Trace:
<TASK>
have_mon_and_osd_map+0x56/0x70
ceph_open_session+0x182/0x290
ceph_get_tree+0x333/0x680
vfs_get_tree+0x49/0x180
do_new_mount+0x1a3/0x2d0
path_mount+0x6dd/0x730
do_mount+0x99/0xe0
__do_sys_mount+0x141/0x180
do_syscall_64+0x9f/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
Allocated by task 13305:
ceph_osdmap_alloc+0x16/0x130
ceph_osdc_init+0x27a/0x4c0
ceph_create_client+0x153/0x190
create_fs_client+0x50/0x2a0
ceph_get_tree+0xff/0x680
vfs_get_tree+0x49/0x180
do_new_mount+0x1a3/0x2d0
path_mount+0x6dd/0x730
do_mount+0x99/0xe0
__do_sys_mount+0x141/0x180
do_syscall_64+0x9f/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 9475:
kfree+0x212/0x290
handle_one_map+0x23c/0x3b0
ceph_osdc_handle_map+0x3c9/0x590
mon_dispatch+0x655/0x6f0
ceph_con_process_message+0xc3/0xe0
ceph_con_v1_try_read+0x614/0x760
ceph_con_workfn+0x2de/0x650
process_one_work+0x486/0x7c0
process_scheduled_works+0x73/0x90
worker_thread+0x1c8/0x2a0
kthread+0x2ec/0x300
ret_from_fork+0x24/0x40
ret_from_fork_asm+0x1a/0x30
Rewrite the wait loop to check the above condition directly with
client->monc.mutex and client->osdc.lock taken as appropriate. While
at it, improve the timeout handling (previously mount_timeout could be
exceeded in case wait_event_interruptible_timeout() slept more than
once) and access client->auth_err under client->monc.mutex to match
how it's set in finish_auth().
monmap_show() and osdmap_show() now take the respective lock before
accessing the map as well.
Cc: stable@vger.kernel.org
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
|
|
Kernel commit 0a6ce20c1564 ("ext4: verify orphan file size is not too big")
limits the maximum supported orphan file size to 8 << 20.
However, in e2fsprogs, the orphan file size is set to 32–512 filesystem
blocks when creating a filesystem.
With 64k block size, formatting an ext4 fs >32G gives an orphan file bigger
than the kernel allows, so mount prints an error and fails:
EXT4-fs (vdb): orphan file too big: 8650752
EXT4-fs (vdb): mount failed
To prevent this issue and allow previously created 64KB filesystems to
mount, we updates the maximum allowed orphan file size in the kernel to
512 filesystem blocks.
Fixes: 0a6ce20c1564 ("ext4: verify orphan file size is not too big")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20251120134233.2994147-1-libaokun@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
|
|
Based on ext4(5) and fs/ext4/ext4.h.
For INCOMPAT_ENCRYPT, it's possible to create a new filesystem with that
flag without creating any encrypted inodes. ext4(5) says it adds
"support" but doesn't say whether anything's actually present like
COMPAT_RESIZE_INODE does.
Signed-off-by: Daniel Tang <danielzgtg.opensource@gmail.com>
Message-ID: <4506189.9SDvczpPoe@daniel-desktop3>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Correct 'metdata' -> 'metadata' in comment.
Signed-off-by: Haodong Tian <tianhd25@mails.tsinghua.edu.cn>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Message-ID: <20251112155916.3007639-1-tianhd25@mails.tsinghua.edu.cn>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Move the comments just before we set EXT4_EXT_MAY_ZEROOUT in
ext4_split_convert_extents.
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
Message-ID: <20251112084538.1658232-4-yangerkun@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
Retval from ext4_map_create_blocks means we really create some blocks,
cannot happened with m_flags without EXT4_MAP_UNWRITTEN and
EXT4_MAP_MAPPED.
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
Message-ID: <20251112084538.1658232-3-yangerkun@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
This flag has been generalized to split an unwritten extent when we do
dio or dioread_nolock writeback, or to avoid merge new extents which was
created by extents split. Update some related comments too.
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
Message-ID: <20251112084538.1658232-2-yangerkun@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
validation
When the MB_CHECK_ASSERT macro is enabled, we found that the
current validation logic in __mb_check_buddy has a gap in
detecting certain invalid buddy states, particularly related
to order-0 (bitmap) bits.
The original logic consists of three steps:
1. Validates higher-order buddies: if a higher-order bit is
set, at most one of the two corresponding lower-order bits
may be free; if a higher-order bit is clear, both lower-order
bits must be allocated (and their bitmap bits must be 0).
2. For any set bit in order-0, ensures all corresponding
higher-order bits are not free.
3. Verifies that all preallocated blocks (pa) in the group
have pa_pstart within bounds and their bitmap bits marked as
allocated.
However, this approach fails to properly validate cases where
order-0 bits are incorrectly cleared (0), allowing some invalid
configurations to pass:
corrupt integral
order 3 1 1
order 2 1 1 1 1
order 1 1 1 1 1 1 1 1 1
order 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Here we get two adjacent free blocks at order-0 with inconsistent
higher-order state, and the right one shows the correct scenario.
The root cause is insufficient validation of order-0 zero bits.
To fix this and improve completeness without significant performance
cost, we refine the logic:
1. Maintain the top-down higher-order validation, but we no longer
check the cases where the higher-order bit is 0, as this case will
be covered in step 2.
2. Enhance order-0 checking by examining pairs of bits:
- If either bit in a pair is set (1), all corresponding
higher-order bits must not be free.
- If both bits are clear (0), then exactly one of the
corresponding higher-order bits must be free
3. Keep the preallocation (pa) validation unchanged.
This change closes the validation gap, ensuring illegal buddy states
involving order-0 are correctly detected, while removing redundant
checks and maintaining efficiency.
Fixes: c9de560ded61f ("ext4: Add multi block allocator for ext4")
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20251106060614.631382-3-sunyongjian@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
|
When the MB_CHECK_ASSERT macro is enabled, an assertion failure can
occur in __mb_check_buddy when checking preallocated blocks (pa) in
a block group:
Assertion failure in mb_free_blocks() : "groupnr == e4b->bd_group"
This happens when a pa at the very end of a block group (e.g.,
pa_pstart=32765, pa_len=3 in a group of 32768 blocks) becomes
exhausted - its pa_pstart is advanced by pa_len to 32768, which
lies in the next block group. If this exhausted pa (with pa_len == 0)
is still in the bb_prealloc_list during the buddy check, the assertion
incorrectly flags it as belonging to the wrong group. A possible
sequence is as follows:
ext4_mb_new_blocks
ext4_mb_release_context
pa->pa_pstart += EXT4_C2B(sbi, ac->ac_b_ex.fe_len)
pa->pa_len -= ac->ac_b_ex.fe_len
__mb_check_buddy
for each pa in group
ext4_get_group_no_and_offset
MB_CHECK_ASSERT(groupnr == e4b->bd_group)
To fix this, we modify the check to skip block group validation for
exhausted preallocations (where pa_len == 0). Such entries are in a
transitional state and will be removed from the list soon, so they
should not trigger an assertion. This change prevents the false
positive while maintaining the integrity of the checks for active
allocations.
Fixes: c9de560ded61f ("ext4: Add multi block allocator for ext4")
Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20251106060614.631382-2-sunyongjian@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
|
|
This patchset enables support for the Gamma 2.2.
With this patch the following IGT subtests pass:
kms_colorop --run plane-XR30-XR30-gamma_2_2
kms_colorop --run plane-XR30-XR30-gamma_2_2_inv-gamma_2_2
kms_colorop --run plane-XR30-XR30-gamma_2_2_inv-gamma_2_2-gamma_2_2_inv
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patch.msgid.link/20251115000237.3561250-52-alex.hung@amd.com
|
|
Add "DRM_COLOROP_1D_CURVE_GAMMA22" and DRM_COLOROP_1D_CURVE_GAMMA22_INV
subtypes to drm_colorop of DRM_COLOROP_1D_CURVE.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patch.msgid.link/20251115000237.3561250-51-alex.hung@amd.com
|
|
The degamma is to be handled by Color pipeline API.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patch.msgid.link/20251115000237.3561250-50-alex.hung@amd.com
|
|
Check dpp.hw_3d_lut before creating shaper tf/lut and 3dlut colorops in
colorpipeline and handling these colorops.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patch.msgid.link/20251115000237.3561250-49-alex.hung@amd.com
|
|
Add kernel doc for AMD color pipeline.
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patch.msgid.link/20251115000237.3561250-48-alex.hung@amd.com
|
|
This adds support for a 3D LUT.
The color pipeline now consists of the following colorops:
1. 1D curve colorop
2. Multiplier
3. 3x4 CTM
4. 1D curve colorop
5. 1D LUT
6. 3D LUT
7. 1D curve colorop
8. 1D LUT
Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patch.msgid.link/20251115000237.3561250-47-alex.hung@amd.com
|
|
It is to be used to enable HDR by allowing userpace to create and pass
3D LUTs to kernel and hardware.
new drm_colorop_type: DRM_COLOROP_3D_LUT.
Reviewed-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Sebastian Wick <sebastian.wick@redhat.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patch.msgid.link/20251115000237.3561250-46-alex.hung@amd.com
|