summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2026-04-18kho: kexec-metadata: track previous kernel chainBreno Leitao
Use Kexec Handover (KHO) to pass the previous kernel's version string and the number of kexec reboots since the last cold boot to the next kernel, and print it at boot time. Example output: [ 0.000000] KHO: exec from: 6.19.0-rc4-next-20260107 (count 1) Motivation ========== Bugs that only reproduce when kexecing from specific kernel versions are difficult to diagnose. These issues occur when a buggy kernel kexecs into a new kernel, with the bug manifesting only in the second kernel. Recent examples include the following commits: * commit eb2266312507 ("x86/boot: Fix page table access in 5-level to 4-level paging transition") * commit 77d48d39e991 ("efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption") * commit 64b45dd46e15 ("x86/efi: skip memattr table on kexec boot") As kexec-based reboots become more common, these version-dependent bugs are appearing more frequently. At scale, correlating crashes to the previous kernel version is challenging, especially when issues only occur in specific transition scenarios. Implementation ============== The kexec metadata is stored as a plain C struct (struct kho_kexec_metadata) rather than FDT format, for simplicity and direct field access. It is registered via kho_add_subtree() as a separate subtree, keeping it independent from the core KHO ABI. This design choice: - Keeps the core KHO ABI minimal and stable - Allows the metadata format to evolve independently - Avoids requiring version bumps for all KHO consumers (LUO, etc.) when the metadata format changes The struct kho_kexec_metadata contains two fields: - previous_release: The kernel version that initiated the kexec - kexec_count: Number of kexec boots since last cold boot On cold boot, kexec_count starts at 0 and increments with each kexec. The count helps identify issues that only manifest after multiple consecutive kexec reboots. [leitao@debian.org: call kho_kexec_metadata_init() for both boot paths] Link: https://lore.kernel.org/all/20260309-kho-v8-5-c3abcf4ac750@debian.org/ [1] Link: https://lore.kernel.org/20260409-kho_fix_merge_issue-v1-1-710c84ceaa85@debian.org Link: https://lore.kernel.org/20260316-kho-v9-5-ed6dcd951988@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: SeongJae Park <sj@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18kho: persist blob size in KHO FDTBreno Leitao
kho_add_subtree() accepts a size parameter but only forwards it to debugfs. The size is not persisted in the KHO FDT, so it is lost across kexec. This makes it impossible for the incoming kernel to determine the blob size without understanding the blob format. Store the blob size as a "blob-size" property in the KHO FDT alongside the "preserved-data" physical address. This allows the receiving kernel to recover the size for any blob regardless of format. Also extend kho_retrieve_subtree() with an optional size output parameter so callers can learn the blob size without needing to understand the blob format. Update all callers to pass NULL for the new parameter. Link: https://lore.kernel.org/20260316-kho-v9-3-ed6dcd951988@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18kho: rename fdt parameter to blob in kho_add/remove_subtree()Breno Leitao
Since kho_add_subtree() now accepts arbitrary data blobs (not just FDTs), rename the parameter from 'fdt' to 'blob' to better reflect its purpose. Apply the same rename to kho_remove_subtree() for consistency. Also rename kho_debugfs_fdt_add() and kho_debugfs_fdt_remove() to kho_debugfs_blob_add() and kho_debugfs_blob_remove() respectively, with the same parameter rename from 'fdt' to 'blob'. Link: https://lore.kernel.org/20260316-kho-v9-2-ed6dcd951988@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18kho: add size parameter to kho_add_subtree()Breno Leitao
Patch series "kho: history: track previous kernel version and kexec boot count", v9. Use Kexec Handover (KHO) to pass the previous kernel's version string and the number of kexec reboots since the last cold boot to the next kernel, and print it at boot time. Example ======= [ 0.000000] Linux version 6.19.0-rc3-upstream-00047-ge5d992347849 ... [ 0.000000] KHO: exec from: 6.19.0-rc4-next-20260107upstream-00004-g3071b0dc4498 (count 1) Motivation ========== Bugs that only reproduce when kexecing from specific kernel versions are difficult to diagnose. These issues occur when a buggy kernel kexecs into a new kernel, with the bug manifesting only in the second kernel. Recent examples include: * eb2266312507 ("x86/boot: Fix page table access in 5-level to 4-level paging transition") * 77d48d39e991 ("efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption") * 64b45dd46e15 ("x86/efi: skip memattr table on kexec boot") As kexec-based reboots become more common, these version-dependent bugs are appearing more frequently. At scale, correlating crashes to the previous kernel version is challenging, especially when issues only occur in specific transition scenarios. Some bugs manifest only after multiple consecutive kexec reboots. Tracking the kexec count helps identify these cases (this metric is already used by live update sub-system). KHO provides a reliable mechanism to pass information between kernels. By carrying the previous kernel's release string and kexec count forward, we can print this context at boot time to aid debugging. The goal of this feature is to have this information being printed in early boot, so, users can trace back kernel releases in kexec. Systemd is not helpful because we cannot assume that the previous kernel has systemd or even write access to the disk (common when using Linux as bootloaders) This patch (of 6): kho_add_subtree() assumes the fdt argument is always an FDT and calls fdt_totalsize() on it in the debugfs code path. This assumption will break if a caller passes arbitrary data instead of an FDT. When CONFIG_KEXEC_HANDOVER_DEBUGFS is enabled, kho_debugfs_fdt_add() calls __kho_debugfs_fdt_add(), which executes: f->wrapper.size = fdt_totalsize(fdt); Fix this by adding an explicit size parameter to kho_add_subtree() so callers specify the blob size. This allows subtrees to contain arbitrary data formats, not just FDTs. Update all callers: - memblock.c: use fdt_totalsize(fdt) - luo_core.c: use fdt_totalsize(fdt_out) - test_kho.c: use fdt_totalsize() - kexec_handover.c (root fdt): use fdt_totalsize(kho_out.fdt) Also update __kho_debugfs_fdt_add() to receive the size explicitly instead of computing it internally via fdt_totalsize(). In kho_in_debugfs_init(), pass fdt_totalsize() for the root FDT and sub-blobs since all current users are FDTs. A subsequent patch will persist the size in the KHO FDT so the incoming side can handle non-FDT blobs correctly. Link: https://lore.kernel.org/20260323110747.193569-1-duanchenghao@kylinos.cn Link: https://lore.kernel.org/20260316-kho-v9-1-ed6dcd951988@debian.org Signed-off-by: Breno Leitao <leitao@debian.org> Suggested-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: memcontrol: correct the nr_pages parameter type of ↵Qi Zheng
mem_cgroup_update_lru_size() The nr_pages parameter of mem_cgroup_update_lru_size() represents a page count. During the reparenting of LRU folios, the value passed to it can potentially exceed the maximum value of a 32-bit integer. It should be declared as long instead of int to match the types used in lruvec size accounting and to prevent possible overflow. Update the parameter type to long to ensure correctness. Link: https://lore.kernel.org/fd4140de44fa0a3978e4e2426731187fe8625f0b.1774604356.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: memcontrol: change val type to long in __mod_memcg_{lruvec_}state()Qi Zheng
The __mod_memcg_state() and __mod_memcg_lruvec_state() functions are also used to reparent non-hierarchical stats. In this scenario, the values passed to them are accumulated statistics that might be extremely large and exceed the upper limit of a 32-bit integer. Change the val parameter type from int to long in these functions and their corresponding tracepoints (memcg_rstat_stats) to prevent potential overflow issues. After that, in memcg_state_val_in_pages(), if the passed val is negative, the expression val * unit / PAGE_SIZE could be implicitly converted to a massive positive number when compared with 1UL in the max() macro. This leads to returning an incorrect massive positive value. Fix this by using abs(val) to calculate the magnitude first, and then restoring the sign of the value before returning the result. Additionally, use mult_frac() to prevent potential overflow during the multiplication of val and unit. Link: https://lore.kernel.org/70a9440e49c464b4dca88bcabc6b491bd335c9f0.1774604356.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reported-by: Harry Yoo (Oracle) <harry@kernel.org> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: lru: add VM_WARN_ON_ONCE_FOLIO to lru maintenance helpersMuchun Song
We must ensure the folio is deleted from or added to the correct lruvec list. So, add VM_WARN_ON_ONCE_FOLIO() to catch invalid users. The VM_BUG_ON_PAGE() in move_pages_to_lru() can be removed as add_page_to_lru_list() will perform the necessary check. Link: https://lore.kernel.org/2c90fc006d9d730331a3caeef96f7e5dabe2036d.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: memcontrol: eliminate the problem of dying memory cgroup for LRU foliosMuchun Song
Now that everything is set up, switch folio->memcg_data pointers to objcgs, update the accessors, and execute reparenting on cgroup death. Finally, folio->memcg_data of LRU folios and kmem folios will always point to an object cgroup pointer. The folio->memcg_data of slab folios will point to an vector of object cgroups. Link: https://lore.kernel.org/80cb7af198dc6f2173fe616d1207a4c315ece141.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: memcontrol: convert objcg to be per-memcg per-node typeQi Zheng
Convert objcg to be per-memcg per-node type, so that when reparent LRU folios later, we can hold the lru lock at the node level, thus avoiding holding too many lru locks at once. [zhengqi.arch@bytedance.com: reset pn->orig_objcg to NULL] Link: https://lore.kernel.org/20260309112939.31937-1-qi.zheng@linux.dev [akpm@linux-foundation.org: fix comment typo, per Usama. Reflow comment to 80 cols] [devnexen@gmail.com: fix obj_cgroup leak in mem_cgroup_css_online() error path] Link: https://lore.kernel.org/20260322193631.45457-1-devnexen@gmail.com [devnexen@gmail.com: add newline, per Qi Zheng] Link: https://lore.kernel.org/20260323063007.7783-1-devnexen@gmail.com Link: https://lore.kernel.org/56c04b1c5d54f75ccdc12896df6c1ca35403ecc3.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Signed-off-by: David Carlier <devnexen@gmail.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Usama Arif <usama.arif@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: workingset: use lruvec_lru_size() to get the number of lru pagesQi Zheng
For cgroup v2, count_shadow_nodes() is the only place to read non-hierarchical stats (lruvec_stats->state_local). To avoid the need to consider cgroup v2 during subsequent non-hierarchical stats reparenting, use lruvec_lru_size() instead of lruvec_page_state_local() to get the number of lru pages. For NR_SLAB_RECLAIMABLE_B and NR_SLAB_UNRECLAIMABLE_B cases, it appears that the statistics here have already been problematic for a while since slab pages have been reparented. So just ignore it for now. Link: https://lore.kernel.org/b1d448c667a8fb377c3390d9aba43bdb7e4d5739.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Muchun Song <muchun.song@linux.dev> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: vmscan: prepare for reparenting MGLRU foliosQi Zheng
Similar to traditional LRU folios, in order to solve the dying memcg problem, we also need to reparenting MGLRU folios to the parent memcg when memcg offline. However, there are the following challenges: 1. Each lruvec has between MIN_NR_GENS and MAX_NR_GENS generations, the number of generations of the parent and child memcg may be different, so we cannot simply transfer MGLRU folios in the child memcg to the parent memcg as we did for traditional LRU folios. 2. The generation information is stored in folio->flags, but we cannot traverse these folios while holding the lru lock, otherwise it may cause softlockup. 3. In walk_update_folio(), the gen of folio and corresponding lru size may be updated, but the folio is not immediately moved to the corresponding lru list. Therefore, there may be folios of different generations on an LRU list. 4. In lru_gen_del_folio(), the generation to which the folio belongs is found based on the generation information in folio->flags, and the corresponding LRU size will be updated. Therefore, we need to update the lru size correctly during reparenting, otherwise the lru size may be updated incorrectly in lru_gen_del_folio(). Finally, this patch chose a compromise method, which is to splice the lru list in the child memcg to the lru list of the same generation in the parent memcg during reparenting. And in order to ensure that the parent memcg has the same generation, we need to increase the generations in the parent memcg to the MAX_NR_GENS before reparenting. Of course, the same generation has different meanings in the parent and child memcg, this will cause confusion in the hot and cold information of folios. But other than that, this method is simple enough, the lru size is correct, and there is no need to consider some concurrency issues (such as lru_gen_del_folio()). To prepare for the above work, this commit implements the specific functions, which will be used during reparenting. [zhengqi.arch@bytedance.com: use list_splice_tail_init() to reparent child folios] Link: https://lore.kernel.org/20260324114937.28569-1-qi.zheng@linux.dev Link: https://lore.kernel.org/e75050354cdbc42221a04f7cf133292b61105548.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Suggested-by: Harry Yoo <harry.yoo@oracle.com> Suggested-by: Imran Khan <imran.f.khan@oracle.com> Acked-by: Harry Yoo <harry.yoo@oracle.com> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: vmscan: prepare for reparenting traditional LRU foliosQi Zheng
To resolve the dying memcg issue, we need to reparent LRU folios of child memcg to its parent memcg. For traditional LRU list, each lruvec of every memcg comprises four LRU lists. Due to the symmetry of the LRU lists, it is feasible to transfer the LRU lists from a memcg to its parent memcg during the reparenting process. This commit implements the specific function, which will be used during the reparenting process. Link: https://lore.kernel.org/a92d217a9fc82bd0c401210204a095caaf615b1c.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Muchun Song <muchun.song@linux.dev> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: memcontrol: prepare for reparenting LRU pages for lruvec lockMuchun Song
The following diagram illustrates how to ensure the safety of the folio lruvec lock when LRU folios undergo reparenting. In the folio_lruvec_lock(folio) function: rcu_read_lock(); retry: lruvec = folio_lruvec(folio); /* There is a possibility of folio reparenting at this point. */ spin_lock(&lruvec->lru_lock); if (unlikely(lruvec_memcg(lruvec) != folio_memcg(folio))) { /* * The wrong lruvec lock was acquired, and a retry is required. * This is because the folio resides on the parent memcg lruvec * list. */ spin_unlock(&lruvec->lru_lock); goto retry; } /* Reaching here indicates that folio_memcg() is stable. */ In the memcg_reparent_objcgs(memcg) function: spin_lock(&lruvec->lru_lock); spin_lock(&lruvec_parent->lru_lock); /* Transfer folios from the lruvec list to the parent's. */ spin_unlock(&lruvec_parent->lru_lock); spin_unlock(&lruvec->lru_lock); After acquiring the lruvec lock, it is necessary to verify whether the folio has been reparented. If reparenting has occurred, the new lruvec lock must be reacquired. During the LRU folio reparenting process, the lruvec lock will also be acquired (this will be implemented in a subsequent patch). Therefore, folio_memcg() remains unchanged while the lruvec lock is held. Given that lruvec_memcg(lruvec) is always equal to folio_memcg(folio) after the lruvec lock is acquired, the lruvec_memcg_debug() check is redundant. Hence, it is removed. This patch serves as a preparation for the reparenting of LRU folios. Link: https://lore.kernel.org/23f22cbb1419f277a3483018b32158ae2b86c666.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: do not open-code lruvec lockQi Zheng
Now we have lruvec_unlock(), lruvec_unlock_irq() and lruvec_unlock_irqrestore(), but no the paired lruvec_lock(), lruvec_lock_irq() and lruvec_lock_irqsave(). There is currently no use case for lruvec_lock_irqsave(), so only introduce lruvec_lock_irq(), and change all open-code places to use this helper function. This looks cleaner and prepares for reparenting LRU pages, preventing user from missing RCU lock calls due to open-code lruvec lock. Link: https://lore.kernel.org/2d0bafe7564e17ece46dfd58197af22ce57017dc.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Muchun Song <muchun.song@linux.dev> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: memcontrol: prevent memory cgroup release in count_memcg_folio_events()Muchun Song
In the near future, a folio will no longer pin its corresponding memory cgroup. To ensure safety, it will only be appropriate to hold the rcu read lock or acquire a reference to the memory cgroup returned by folio_memcg(), thereby preventing it from being released. In the current patch, the rcu read lock is employed to safeguard against the release of the memory cgroup in count_memcg_folio_events(). This serves as a preparatory measure for the reparenting of the LRU pages. Link: https://lore.kernel.org/dea6aa0389367f7fd6b715c8837a2cf7506bd889.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18writeback: prevent memory cgroup release in writeback moduleMuchun Song
In the near future, a folio will no longer pin its corresponding memory cgroup. To ensure safety, it will only be appropriate to hold the rcu read lock or acquire a reference to the memory cgroup returned by folio_memcg(), thereby preventing it from being released. In the current patch, the function get_mem_cgroup_css_from_folio() and the rcu read lock are employed to safeguard against the release of the memory cgroup. This serves as a preparatory measure for the reparenting of the LRU pages. Link: https://lore.kernel.org/645f99bc344575417f67def3744f975596df2793.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: memcontrol: return root object cgroup for root memory cgroupMuchun Song
Memory cgroup functions such as get_mem_cgroup_from_folio() and get_mem_cgroup_from_mm() return a valid memory cgroup pointer, even for the root memory cgroup. In contrast, the situation for object cgroups has been different. Previously, the root object cgroup couldn't be returned because it didn't exist. Now that a valid root object cgroup exists, for the sake of consistency, it's necessary to align the behavior of object-cgroup-related operations with that of memory cgroup APIs. Link: https://lore.kernel.org/e9c3f40ba7681d9753372d4ee2ac7a0216848b95.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chen Ridong <chenridong@huawei.com> Cc: David Hildenbrand <david@kernel.org> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm: rename unlock_page_lruvec_irq and its variantsMuchun Song
It is inappropriate to use folio_lruvec_lock() variants in conjunction with unlock_page_lruvec() variants, as this involves the inconsistent operation of locking a folio while unlocking a page. To rectify this, the functions unlock_page_lruvec{_irq, _irqrestore} are renamed to lruvec_unlock{_irq,_irqrestore}. Link: https://lore.kernel.org/4e5e05271a250df4d1812e1832be65636a78c957.1772711148.git.zhengqi.arch@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Reviewed-by: Chen Ridong <chenridong@huawei.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Allen Pais <apais@linux.microsoft.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baoquan He <bhe@redhat.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Cc: Hugh Dickins <hughd@google.com> Cc: Imran Khan <imran.f.khan@oracle.com> Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Usama Arif <usamaarif642@gmail.com> Cc: Vlastimil Babka <vbabka@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yosry Ahmed <yosry@kernel.org> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-18mm/vma: remove __vma_check_mmap_hook()Lorenzo Stoakes
Commit c50ca15dd496 ("mm: add vm_ops->mapped hook") introduced __vma_check_mmap_hook() in order to assert that a driver doesn't incorrectly implement both an f_op->mmap() and a vm_ops->mapped hook, the latter of which would not ultimately get invoked. However, this did not correctly account for stacked drivers (or drivers that otherwise use the compatibility layer) which might recursively call an mmap_prepare hook via the compatibility layer. Thus the nested mmap_prepare() invocation might result in a VMA which has vm_ops->mapped set with an overlaying mmap() hook, causing the __vma_check_mmap_hook() to fail in vfs_mmap(), wrongly failing the operation. This patch resolves this by simply removing the check, as we can't be certain that an mmap() hook doesn't at some point invoke the compatibility layer, and it's not worth trying to track it. Link: https://lore.kernel.org/20260413105713.92625-1-ljs@kernel.org Fixes: c50ca15dd496 ("mm: add vm_ops->mapped hook") Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Closes: https://lore.kernel.org/all/adx2ws5z0NMIe5Yj@shinmob/ Signed-off-by: Lorenzo Stoakes <ljs@kernel.org> Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-17Merge tag 'mtd/for-7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD updates from Miquel Raynal: "MTD changes: - mtdconcat finally makes it in, after several years of being merged and reverted - Baikal SoC support is being removed, so MTD bits are being removed as well - misc cleanups NAND changes: - SunXi driver support for new versions of the Allwinner NAND controller. - DT-binding improvements and cleanups. - A few fixes (Realtek ECC and Winbond SPI NAND), aside with the usual load of misc changes. SPI NOR fixes: - Enable die erase on MT35XU02GCBA. We knew this flash needed this fixup since 7f77c561e227 ("mtd: spi-nor: micron-st: add TODO for fixing mt35xu02gcba") but did not add it due to lack of hardware to test on. - Fix locking on some Winbond w25q series flashes. - Fix Auto Address Increment (AAI) writes on SST that flashes that start on odd address. The write enable latch needs to be set again after the single byte program" * tag 'mtd/for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (44 commits) mtd: spinand: winbond: Declare the QE bit on W25NxxJW mtd: spi-nor: micron-st: Enable die erase support for MT35XU02GCBA mtd: spi-nor: winbond: Fix locking support for w25q256jw mtd: spi-nor: sst: Fix write enable before AAI sequence mtd: spi-nor: winbond: Fix locking support for w25q64jvm mtd: spi-nor: winbond: Fix locking support for w25q256jwm dt-bindings: mtd: mxc-nand: add missing compatible string and ref to nand-controller-legacy.yaml dt-bindings: mtd: gpmi-nand: ref to nand-controller-legacy.yaml dt-bindings: mtd: refactor NAND bindings and add nand-controller-legacy.yaml mtd: spinand: winbond: Clarify when to enable the HS bit mtd: rawnand: sunxi: introduce maximize variable user data length mtd: rawnand: sunxi: fix typos in comments mtd: rawnand: sunxi: change error prone variable name mtd: rawnand: sunxi: remove dead code mtd: rawnand: sunxi: make the code more self-explanatory mtd: rawnand: sunxi: replace hard coded value by a define - take2 mtd: rawnand: sunxi: do not count BBM bytes twice mtd: rawnand: sunxi: fix sunxi_nfc_hw_ecc_read_extra_oob mtd: rawnand: sunxi: sunxi_nand_ooblayout_free code clarification mtd: cmdlinepart: use a flexible array member ...
2026-04-17Merge tag 'ext4_for_linux-7.0-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: - Refactor code paths involved with partial block zero-out in prearation for converting ext4 to use iomap for buffered writes - Remove use of d_alloc() from ext4 in preparation for the deprecation of this interface - Replace some J_ASSERTS with a journal abort so we can avoid a kernel panic for a localized file system error - Simplify various code paths in mballoc, move_extent, and fast commit - Fix rare deadlock in jbd2_journal_cancel_revoke() that can be triggered by generic/013 when blocksize < pagesize - Fix memory leak when releasing an extended attribute when its value is stored in an ea_inode - Fix various potential kunit test bugs in fs/ext4/extents.c - Fix potential out-of-bounds access in check_xattr() with a corrupted file system - Make the jbd2_inode dirty range tracking safe for lockless reads - Avoid a WARN_ON when writeback files due to a corrupted file system; we already print an ext4 warning indicatign that data will be lost, so the WARN_ON is not necessary and doesn't add any new information * tag 'ext4_for_linux-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (37 commits) jbd2: fix deadlock in jbd2_journal_cancel_revoke() ext4: fix missing brelse() in ext4_xattr_inode_dec_ref_all() ext4: fix possible null-ptr-deref in mbt_kunit_exit() ext4: fix possible null-ptr-deref in extents_kunit_exit() ext4: fix the error handling process in extents_kunit_init). ext4: call deactivate_super() in extents_kunit_exit() ext4: fix miss unlock 'sb->s_umount' in extents_kunit_init() ext4: fix bounds check in check_xattrs() to prevent out-of-bounds access ext4: zero post-EOF partial block before appending write ext4: move pagecache_isize_extended() out of active handle ext4: remove ctime/mtime update from ext4_alloc_file_blocks() ext4: unify SYNC mode checks in fallocate paths ext4: ensure zeroed partial blocks are persisted in SYNC mode ext4: move zero partial block range functions out of active handle ext4: pass allocate range as loff_t to ext4_alloc_file_blocks() ext4: remove handle parameters from zero partial block functions ext4: move ordered data handling out of ext4_block_do_zero_range() ext4: rename ext4_block_zero_page_range() to ext4_block_zero_range() ext4: factor out journalled block zeroing range ext4: rename and extend ext4_block_truncate_page() ...
2026-04-17Merge tag 'ntfs-for-7.1-rc1-v2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs Pull ntfs resurrection from Namjae Jeon: "Ever since Kari Argillander’s 2022 report [1] regarding the state of the ntfs3 driver, I have spent the last 4 years working to provide full write support and current trends (iomap, no buffer head, folio), enhanced performance, stable maintenance, utility support including fsck for NTFS in Linux. This new implementation is built upon the clean foundation of the original read-only NTFS driver, adding: - Write support: Implemented full write support based on the classic read-only NTFS driver. Added delayed allocation to improve write performance through multi-cluster allocation and reduced fragmentation of the cluster bitmap. - iomap conversion: Switched buffered IO (reads/writes), direct IO, file extent mapping, readpages, and writepages to use iomap. - Remove buffer_head: Completely removed buffer_head usage by converting to folios. As a result, the dependency on CONFIG_BUFFER_HEAD has been removed from Kconfig. - Stability improvements: The new ntfs driver passes 326 xfstests, compared to 273 for ntfs3. All tests passed by ntfs3 are a complete subset of the tests passed by this implementation. Added support for fallocate, idmapped mounts, permissions, and more. xfstests Results report: Total tests run: 787 Passed : 326 Failed : 38 Skipped : 423 Failed tests breakdown: - 34 tests require metadata journaling - 4 other tests: 094: No unwritten extent concept in NTFS on-disk format 563: cgroup v2 aware writeback accounting not supported 631: RENAME_WHITEOUT support required 787: NFS delegation test" Link: https://lore.kernel.org/all/da20d32b-5185-f40b-48b8-2986922d8b25@stargateuniverse.net/ [1] [ Let's see if this undead filesystem ends up being of the "Easter miracle" kind, or the "Nosferatu of filesystems" kind... ] * tag 'ntfs-for-7.1-rc1-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs: (46 commits) ntfs: remove redundant out-of-bound checks ntfs: add bound checking to ntfs_external_attr_find ntfs: add bound checking to ntfs_attr_find ntfs: fix ignoring unreachable code warnings ntfs: fix inconsistent indenting warnings ntfs: fix variable dereferenced before check warnings ntfs: prefer IS_ERR_OR_NULL() over manual NULL check ntfs: harden ntfs_listxattr against EA entries ntfs: harden ntfs_ea_lookup against malformed EA entries ntfs: check $EA query-length in ntfs_ea_get ntfs: validate WSL EA payload sizes ntfs: fix WSL ea restore condition ntfs: add missing newlines to pr_err() messages ntfs: fix pointer/integer casting warnings ntfs: use ->mft_no instead of ->i_ino in prints ntfs: change mft_no type to u64 ntfs: select FS_IOMAP in Kconfig ntfs: add MODULE_ALIAS_FS ntfs: reduce stack usage in ntfs_write_mft_block() ntfs: fix sysctl table registration and path ...
2026-04-17Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Alexei Starovoitov: "Most of the diff stat comes from Xu Kuohai's fix to emit ENDBR/BTI, since all JITs had to be touched to move constant blinding out and pass bpf_verifier_env in. - Fix use-after-free in arena_vm_close on fork (Alexei Starovoitov) - Dissociate struct_ops program with map if map_update fails (Amery Hung) - Fix out-of-range and off-by-one bugs in arm64 JIT (Daniel Borkmann) - Fix precedence bug in convert_bpf_ld_abs alignment check (Daniel Borkmann) - Fix arg tracking for imprecise/multi-offset in BPF_ST/STX insns (Eduard Zingerman) - Copy token from main to subprogs to fix missing kallsyms (Eduard Zingerman) - Prevent double close and leak of btf objects in libbpf (Jiri Olsa) - Fix af_unix null-ptr-deref in sockmap (Michal Luczaj) - Fix NULL deref in map_kptr_match_type for scalar regs (Mykyta Yatsenko) - Avoid unnecessary IPIs. Remove redundant bpf_flush_icache() in arm64 and riscv JITs (Puranjay Mohan) - Fix out of bounds access. Validate node_id in arena_alloc_pages() (Puranjay Mohan) - Reject BPF-to-BPF calls and callbacks in arm32 JIT (Puranjay Mohan) - Refactor all JITs to pass bpf_verifier_env to emit ENDBR/BTI for indirect jump targets on x86-64, arm64 JITs (Xu Kuohai) - Allow UTF-8 literals in bpf_bprintf_prepare() (Yihan Ding)" * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (32 commits) bpf, arm32: Reject BPF-to-BPF calls and callbacks in the JIT bpf: Dissociate struct_ops program with map if map_update fails bpf: Validate node_id in arena_alloc_pages() libbpf: Prevent double close and leak of btf objects selftests/bpf: cover UTF-8 trace_printk output bpf: allow UTF-8 literals in bpf_bprintf_prepare() selftests/bpf: Reject scalar store into kptr slot bpf: Fix NULL deref in map_kptr_match_type for scalar regs bpf: Fix precedence bug in convert_bpf_ld_abs alignment check bpf, arm64: Emit BTI for indirect jump target bpf, x86: Emit ENDBR for indirect jump targets bpf: Add helper to detect indirect jump targets bpf: Pass bpf_verifier_env to JIT bpf: Move constants blinding out of arch-specific JITs bpf, sockmap: Take state lock for af_unix iter bpf, sockmap: Fix af_unix null-ptr-deref in proto update selftests/bpf: Extend bpf_iter_unix to attempt deadlocking bpf, sockmap: Fix af_unix iter deadlock bpf, sockmap: Annotate af_unix sock:: Sk_state data-races selftests/bpf: verify kallsyms entries for token-loaded subprograms ...
2026-04-17Merge tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds
Pull CXL (Compute Express Link) updates from Dave Jiang: "The significant change of interest is the handling of soft reserved memory conflict between CXL and HMEM. In essence CXL will be the first to claim the soft reserved memory ranges that belongs to CXL and attempt to enumerate them with best effort. If CXL is not able to enumerate the ranges it will punt them to HMEM. There are also MAINTAINERS email changes from Dan Williams and Jonathan Cameron" * tag 'cxl-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (37 commits) MAINTAINERS: Update Jonathan Cameron's email address cxl/hdm: Add support for 32 switch decoders MAINTAINERS: Update address for Dan Williams tools/testing/cxl: Enable replay of user regions as auto regions cxl/region: Add a region sysfs interface for region lock status tools/testing/cxl: Test dax_hmem takeover of CXL regions tools/testing/cxl: Simulate auto-assembly failure dax/hmem: Parent dax_hmem devices dax/hmem: Fix singleton confusion between dax_hmem_work and hmem devices dax/hmem: Reduce visibility of dax_cxl coordination symbols cxl/region: Constify cxl_region_resource_contains() cxl/region: Limit visibility of cxl_region_contains_resource() dax/cxl: Fix HMEM dependencies cxl/region: Fix use-after-free from auto assembly failure cxl/core: Check existence of cxl_memdev_state in poison test cxl/core: use cleanup.h for devm_cxl_add_dax_region cxl/core/region: move dax region device logic into region_dax.c cxl/core/region: move pmem region driver logic into region_pmem.c dax/hmem, cxl: Defer and resolve Soft Reserved ownership cxl/region: Add helper to check Soft Reserved containment by CXL regions ...
2026-04-17Merge tag 'stop-machine.2026.04.16a' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull stop-machine update from Paul McKenney: - kernel-doc updates for stop_machine() and stop_machine_cpuslocked() functions * tag 'stop-machine.2026.04.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: stop_machine: Fix the documentation for a NULL cpus argument
2026-04-17Merge tag 'integrity-v7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "There are two main changes, one feature removal, some code cleanup, and a number of bug fixes. Main changes: - Detecting secure boot mode was limited to IMA. Make detecting secure boot mode accessible to EVM and other LSMs - IMA sigv3 support was limited to fsverity. Add IMA sigv3 support for IMA regular file hashes and EVM portable signatures Remove: - Remove IMA support for asychronous hash calculation originally added for hardware acceleration Cleanup: - Remove unnecessary Kconfig CONFIG_MODULE_SIG and CONFIG_KEXEC_SIG tests - Add descriptions of the IMA atomic flags Bug fixes: - Like IMA, properly limit EVM "fix" mode - Define and call evm_fix_hmac() to update security.evm - Fallback to using i_version to detect file change for filesystems that do not support STATX_CHANGE_COOKIE - Address missing kernel support for configured (new) TPM hash algorithms - Add missing crypto_shash_final() return value" * tag 'integrity-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: evm: Enforce signatures version 3 with new EVM policy 'bit 3' integrity: Allow sigv3 verification on EVM_XATTR_PORTABLE_DIGSIG ima: add support to require IMA sigv3 signatures ima: add regular file data hash signature version 3 support ima: Define asymmetric_verify_v3() to verify IMA sigv3 signatures ima: remove buggy support for asynchronous hashes integrity: Eliminate weak definition of arch_get_secureboot() ima: Add code comments to explain IMA iint cache atomic_flags ima_fs: Correctly create securityfs files for unsupported hash algos ima: check return value of crypto_shash_final() in boot aggregate ima: Define and use a digest_size field in the ima_algo_desc structure powerpc/ima: Drop unnecessary check for CONFIG_MODULE_SIG ima: efi: Drop unnecessary check for CONFIG_MODULE_SIG/CONFIG_KEXEC_SIG ima: fallback to using i_version to detect file change evm: fix security.evm for a file with IMA signature s390: Drop unnecessary CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT evm: Don't enable fix mode when secure boot is enabled integrity: Make arch_ima_get_secureboot integrity-wide
2026-04-17Merge tag 'hwlock-v7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull hwspinlock updates from Bjorn Andersson: "Remove the unused u8500 hardware spinlock driver, and clean out the hwspinlock_pdata struct as this was the last user of the struct" * tag 'hwlock-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: hwspinlock: remove now unused pdata from header file hwspinlock: u8500: delete driver
2026-04-17Merge tag 'rpmsg-v7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull rpmsg updates from Bjorn Andersson: "Mark 'data' argument in rpmsg_send() const, and perculate to related drivers. Replace deprecated class_destroy() with class_unregister()" * tag 'rpmsg-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: media: platform: mtk-mdp3: Constify buffer passed to mdp_vpu_sendmsg() ASoC: qcom: Constify GPR packet being send over GPR interface rpmsg: Constify buffer passed to send API remoteproc: mtk_scp: Constify buffer passed to scp_send_ipi() remoteproc: mtk_scp_ipi: Constify buffer passed to scp_ipi_send() drivers: rpmsg: class_destroy() is deprecated
2026-04-17Merge tag 'devicetree-for-7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: "DT core: - Cleanup of the reserved memory code to keep CMA specifics in CMA code - Add and convert several users to new of_machine_get_match() helper - Validate nul termination in string properties - Update dtc to upstream v1.7.2-69-g53373d135579 - Limit matching reserved memory devices to /reserved-memory nodes - Fix some UAF in unittests - Remove Baikal SoC bus driver - Fix false DT_SPLIT_BINDING_PATCH checkpatch warning - Allow fw_devlink device-tree on x86 - Fix kerneldoc return description for of_property_count_elems_of_size() DT bindings: - Add fsl,imx25-aips, fsl,imx25-tcq, qcom,eliza-pdc, qcom,eliza-spmi-pmic-arb, qcom,hawi-imem, qcom,milos-imem, qcom,hawi-pdc, and lg,sw49410 bindings - Convert arm,vexpress-scc to DT schema - Deprecate Qualcomm generic CPU compatibles. Add Apple M3 CPU cores. - Move some dual-link display panels to the dual-link schema - Drop mux controller node name constraints - Remove Baikal SoC bus bindings - Fix a false warning in the thermal trip node binding" * tag 'devicetree-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (39 commits) dt-bindings: display: panel: panel-simple: Add lg,sw49410 compatible dt-bindings: display: ti, am65x-dss: Fix AM62L DSS reg and clock constraints dt-bindings: display: simple: Move Innolux G156HCE-L01 panel to dual-link dt-bindings: display: simple: Move AUO 21.5" FHD to dual-link dt-bindings: thermal: Fix false warning with 'phandle' in trips nodes of: unittest: fix use-after-free in testdrv_probe() of: unittest: fix use-after-free in of_unittest_changeset() dt-bindings: qcom,pdc: document the Hawi Power Domain Controller dt-bindings: ARM: arm,vexpress-scc: convert to DT schema drivers/of: fdt: validate flat DT string properties before string use drivers/of: fdt: validate stdout-path properties before parsing them dt-bindings: sram: Document qcom,hawi-imem compatible dt-bindings: sram: Allow multiple-word prefixes to sram subnode dt-bindings: sram: Document qcom,milos-imem scripts/dtc: Update to upstream version v1.7.2-69-g53373d135579 of: property: Allow fw_devlink device-tree on x86 dt-bindings: arm: cpus: Add Apple M3 CPU core compatibles dt-bindings: display: lt8912b: Drop redundant endpoint properties dt-bindings: opp-v2: Fix example 3 CPU reg value dt-bindings: connector: add pd-disable dependency ...
2026-04-17t10-pi: reduce ref tag code duplicationCaleb Sander Mateos
t10_pi_ref_tag() and ext_pi_ref_tag() are identical except for the final truncation of the ref tag to 32 or 48 bits. Factor out a helper full_pi_ref_tag() to return the untruncated ref tag and use it in t10_pi_ref_tag() and ext_pi_ref_tag(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://patch.msgid.link/20260415210847.1730016-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-04-17Merge tag 'nand/for-7.1' into mtd/nextMiquel Raynal
The main changes happened in the SunXi driver in order to support new versions of the Allwinner NAND controller. There are also some DT-binding improvements and cleanups. Finally a couple of actual fixes (Realtek ECC and Winbond SPI NAND), aside with the usual load of misc changes.
2026-04-17Merge tag 'for-v7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply and reset updates from Sebastian Reichel: "Power-supply drivers: - S2MU005: new battery fuel gauge driver - macsmc-power: new driver for Apple Silicon - qcom_battmgr: Add support for Glymur and Kaanapali - max17042: add support for max77759 - qcom_smbx: allow disabling charging - bd71828: add input current limit support - multiple drivers: use new device managed workqueue allocation function - misc small cleanups and fixes Reset core: - Expose sysfs for registered reboot_modes Reset drivers - misc small cleanups and fixes" * tag 'for-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (36 commits) power: supply: qcom_smbx: allow disabling charging power: reset: drop unneeded dependencies on OF_GPIO power: supply: bd71828: add input current limit property dt-bindings: power: reset: cortina,gemini-power-controller: convert to DT schema power: supply: add support for S2MU005 battery fuel gauge device dt-bindings: power: supply: document Samsung S2MU005 battery fuel gauge power: reset: reboot-mode: fix -Wformat-security warning power: supply: ipaq_micro: Simplify with devm power: supply: mt6370: Simplify with devm_alloc_ordered_workqueue() power: supply: max77705: Free allocated workqueue and fix removal order power: supply: max77705: Drop duplicated IRQ error message power: supply: cw2015: Free allocated workqueue power: reset: keystone: Use register_sys_off_handler(SYS_OFF_MODE_RESTART) power: supply: twl4030_madc: Drop unused header includes power: supply: bq24190: Avoid rescheduling after cancelling work power: supply: axp288_charger: Simplify returns of dev_err_probe() power: supply: axp288_charger: Do not cancel work before initializing it power: supply: cpcap-battery: pass static battery cell data from device tree dt-bindings: power: supply: cpcap-battery: document monitored-battery property power: supply: qcom_battmgr: Add support for Glymur and Kaanapali ...
2026-04-17Merge tag 'hsi-for-7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi Pull HSI updates from Sebastian Reichel: - use flexible array member for hsi_port in hsi_controller - misc small fixes * tag 'hsi-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi: HSI: omap_ssi_port: remove depends on ARM HSI: omap_ssi_port: remove set but unused variables HSI: cmt_speech: fix wrong printf format HSI: omap_ssi_port: remove null check from FAM hsi: hsi_core: use kzalloc_flex
2026-04-17Merge tag 'hid-for-linus-2026041601' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID updates from Jiri Kosina: "Core: - fixed handling of 0-sized reports (Dmitry Torokhov) - convert core code to __free() (Dmitry Torokhov) - support for multiple batteries per HID device (Lucas Zampieri) Drivers: - support for rumble effects in winwing driver (Ivan Gorinov) - new support for a variety of Sony Rock Band and Sony DJ Hero Turntable devices (Rosalie Wanders) - new driver for Lenovo Legion Go / S devices (Derek J. Clark) - power management improvements to intel-thc-hid driver (Even Xu) ... other assorted cleanups, fixes and device-specific quirks" * tag 'hid-for-linus-2026041601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (73 commits) HID: core: clamp report_size in s32ton() to avoid undefined shift HID: logitech-dj: fix wrong detection of bad DJ_SHORT output report HID: logitech-hidpp: fix race condition when accessing stale stack pointer HID: winwing: Enable rumble effects HID: core: do not allow parsing 0-sized reports HID: usbhid: refactor endpoint lookup HID: huawei: fix CD30 keyboard report descriptor issue HID: playstation: validate num_touch_reports in DualShock 4 reports HID: drop 'default !EXPERT' from tristate symbols HID: usbhid: fix deadlock in hid_post_reset() HID: apple: ensure the keyboard backlight is off if suspending HID: quirks: Set ALWAYS_POLL for LOGITECH_BOLT_RECEIVER HID: alps: fix NULL pointer dereference in alps_raw_event() HID: logitech-dj: Prevent REPORT_ID_DJ_SHORT related user initiated OOB write HID: logitech-dj: Standardise hid_report_enum variable nomenclature HID: sony: update module description HID: logitech-hidpp: Check bounds when deleting force-feedback effects HID: sony: add battery status support for Rock Band 4 PS5 guitars HID: sony: fix style issues HID: quirks: update hid-sony supported devices ...
2026-04-17Merge tag 'dma-mapping-7.1-2026-04-16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping updates from Marek Szyprowski: - added support for batched cache sync, what improves performance of dma_map/unmap_sg() operations on ARM64 architecture (Barry Song) - introduced DMA_ATTR_CC_SHARED attribute for explicitly shared memory used in confidential computing (Jiri Pirko) - refactored spaghetti-like code in drivers/of/of_reserved_mem.c and its clients (Marek Szyprowski, shared branch with device-tree updates to avoid merge conflicts) - prepared Contiguous Memory Allocator related code for making dma-buf drivers modularized (Maxime Ripard) - added support for benchmarking dma_map_sg() calls to tools/dma utility (Qinxin Xia) * tag 'dma-mapping-7.1-2026-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: (24 commits) dma-buf: heaps: system: document system_cc_shared heap dma-buf: heaps: system: add system_cc_shared heap for explicitly shared memory dma-mapping: introduce DMA_ATTR_CC_SHARED for shared memory mm: cma: Export cma_alloc(), cma_release() and cma_get_name() dma: contiguous: Export dev_get_cma_area() dma: contiguous: Make dma_contiguous_default_area static dma: contiguous: Make dev_get_cma_area() a proper function dma: contiguous: Turn heap registration logic around of: reserved_mem: rework fdt_init_reserved_mem_node() of: reserved_mem: clarify fdt_scan_reserved_mem*() functions of: reserved_mem: rearrange code a bit of: reserved_mem: replace CMA quirks by generic methods of: reserved_mem: switch to ops based OF_DECLARE() of: reserved_mem: use -ENODEV instead of -ENOENT of: reserved_mem: remove fdt node from the structure dma-mapping: fix false kernel-doc comment marker dma-mapping: Support batch mode for dma_direct_{map,unmap}_sg dma-mapping: Separate DMA sync issuing and completion waiting arm64: Provide dcache_inval_poc_nosync helper arm64: Provide dcache_clean_poc_nosync helper ...
2026-04-17Merge tag 'dmaengine-7.1-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "Core: - New devm_of_dma_controller_register() API New Support: - Support for RZ/G3L SoC - Loongson Multi-Channel DMA controller support - Conversion of Xilinx AXI DMA binding - DW AXI CV1800B DMA support - Switchtec DMA engine driver Updates: - AMD MDB Endpoint and non-LL mode support - DW edma virtual IRQ for interrupt-emulation, cyclic transfers support" * tag 'dmaengine-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (65 commits) dmaengine: dw-edma: Add non-LL mode dmaengine: dw-edma: Add AMD MDB Endpoint Support dt-bindings: dmaengine: Fix spelling mistake "Looongson" -> "Looogson" dmaengine: loongson: Fix spelling mistake "Looongson" -> "Looogson" dmaengine: loongson: New driver for the Loongson Multi-Channel DMA controller dt-bindings: dmaengine: Add Loongson Multi-Channel DMA controller dmaengine: loongson: loongson2-apb: Simplify locking with guard() and scoped_guard() dmaengine: loongson: loongson2-apb: Convert to devm_clk_get_enabled() dmaengine: loongson: loongson2-apb: Convert to dmaenginem_async_device_register() dmaengine: loongson: New directory for Loongson DMA controllers drivers dt-bindings: dma: xlnx,axi-dma: Convert to DT schema dt-bindings: dma: rz-dmac: Add conditional schema for RZ/G3L dmaengine: sh: rz-dmac: Add device_{pause,resume}() callbacks dmaengine: sh: rz-dmac: Add device_tx_status() callback dmaengine: sh: rz-dmac: Use rz_lmdesc_setup() to invalidate descriptors dmaengine: sh: rz-dmac: Drop unnecessary local_irq_save() call dmaengine: sh: rz-dmac: Drop goto instruction and label dmaengine: sh: rz-dmac: Drop read of CHCTRL register dmaengine: sh: rz_dmac: add RZ/{T2H,N2H} support dt-bindings: dma: renesas,rz-dmac: document RZ/{T2H,N2H} ...
2026-04-17Merge tag 'soundwire-7.1-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire updates from Vinod Koul: - Core: DP prepare polling for avoiding interrupt deadlock - AMD clock init and bandwidth refactoring - Intel more codecs to wake list, clear message on before signaling waiting thread * tag 'soundwire-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: intel_auxdevice: Add cs42l49 to wake_capable_list soundwire: cadence: Clear message complete before signaling waiting thread soundwire: Intel: test bus.bpt_stream before assigning it soundwire: bus: demote UNATTACHED state warnings to dev_dbg() soundwire: stream: Poll for DP prepare to avoid interrupt deadlock soundwire: amd: refactor bandwidth calculation logic soundwire: amd: add clock init control function soundwire: intel_auxdevice: Add CS47L47 to wake_capable_list soundwire: slave: Don't register devices that are disabled in ACPI soundwire: sdw.h: repair names and format of kernel-doc comments
2026-04-17Merge tag 'trace-v7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Fix printf format warning for bprintf sunrpc uses a trace_printk() that triggers a printf warning during the compile. Move the __printf() attribute around for when debugging is not enabled the warning will go away - Remove redundant check for EVENT_FILE_FL_FREED in event_filter_write() The FREED flag is checked in the call to event_file_file() and then checked again right afterward, which is unneeded - Clean up event_file_file() and event_file_data() helpers These helper functions played a different role in the past, but now with eventfs, the READ_ONCE() isn't needed. Simplify the code a bit and also add a warning to event_file_data() if the file or its data is not present - Remove updating file->private_data in tracing open All access to the file private data is handled by the helper functions, which do not use file->private_data. Stop updating it on open - Show ENUM names in function arguments via BTF in function tracing When showing the function arguments when func-args option is set for function tracing, if one of the arguments is found to be an enum, show the name of the enum instead of its number - Add new trace_call__##name() API for tracepoints Tracepoints are enabled via static_branch() blocks, where when not enabled, there's only a nop that is in the code where the execution will just skip over it. When tracing is enabled, the nop is converted to a direct jump to the tracepoint code. Sometimes more calculations are required to be performed to update the parameters of the tracepoint. In this case, trace_##name##_enabled() is called which is a static_branch() that gets enabled only when the tracepoint is enabled. This allows the extra calculations to also be skipped by the nop: if (trace_foo_enabled()) { x = bar(); trace_foo(x); } Where the x=bar() is only performed when foo is enabled. The problem with this approach is that there's now two static_branch() calls. One for checking if the tracepoint is enabled, and then again to know if the tracepoint should be called. The second one is redundant Introduce trace_call__foo() that will call the foo() tracepoint directly without doing a static_branch(): if (trace_foo_enabled()) { x = bar(); trace_call__foo(); } - Update various locations to use the new trace_call__##name() API - Move snapshot code out of trace.c Cleaning up trace.c to not be a "dump all", move the snapshot code out of it and into a new trace_snapshot.c file - Clean up some "%*.s" to "%*s" - Allow boot kernel command line options to be called multiple times Have options like: ftrace_filter=foo ftrace_filter=bar ftrace_filter=zoo Equal to: ftrace_filter=foo,bar,zoo - Fix ipi_raise event CPU field to be a CPU field The ipi_raise target_cpus field is defined as a __bitmask(). There is now a __cpumask() field definition. Update the field to use that - Have hist_field_name() use a snprintf() and not a series of strcat() It's safer to use snprintf() that a series of strcat() - Fix tracepoint regfunc balancing A tracepoint can define a "reg" and "unreg" function that gets called before the tracepoint is enabled, and after it is disabled respectively. But on error, after the "reg" func is called and the tracepoint is not enabled, the "unreg" function is not called to tear down what the "reg" function performed - Fix output that shows what histograms are enabled Event variables are displayed incorrectly in the histogram output Instead of "sched.sched_wakeup.$var", it is showing "$sched.sched_wakeup.var" where the '$' is in the incorrect location - Some other simple cleanups * tag 'trace-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (24 commits) selftests/ftrace: Add test case for fully-qualified variable references tracing: Fix fully-qualified variable reference printing in histograms tracepoint: balance regfunc() on func_add() failure in tracepoint_add_func() tracing: Rebuild full_name on each hist_field_name() call tracing: Report ipi_raise target CPUs as cpumask tracing: Remove duplicate latency_fsnotify() stub tracing: Preserve repeated trace_trigger boot parameters tracing: Append repeated boot-time tracing parameters tracing: Remove spurious default precision from show_event_trigger/filter formats cpufreq: Use trace_call__##name() at guarded tracepoint call sites tracing: Remove tracing_alloc_snapshot() when snapshot isn't defined tracing: Move snapshot code out of trace.c and into trace_snapshot.c mm: damon: Use trace_call__##name() at guarded tracepoint call sites btrfs: Use trace_call__##name() at guarded tracepoint call sites spi: Use trace_call__##name() at guarded tracepoint call sites i2c: Use trace_call__##name() at guarded tracepoint call sites kernel: Use trace_call__##name() at guarded tracepoint call sites tracepoint: Add trace_call__##name() API tracing: trace_mmap.h: fix a kernel-doc warning tracing: Pretty-print enum parameters in function arguments ...
2026-04-17Merge tag 'bootconfig-v7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull bootconfig updates from Masami Hiramatsu: "Minor fixes for handling errors: - fix off-by-one in xbc_verify_tree() next node check - increment xbc_node_num after node init succeeds - validate child node index in xbc_verify_tree() Code cleanups (mainly type/attribute changes): - clean up comment typos and bracing - drop redundant memset of xbc_nodes - replace linux/kernel.h with specific includes - narrow flag parameter type from uint32_t to uint16_t - constify xbc_calc_checksum() data parameter - fix signed comparison in xbc_node_get_data() - use size_t for strlen result in xbc_node_match_prefix() - use signed type for offset in xbc_init_node() - use size_t for key length tracking in xbc_verify_tree() - change xbc_node_index() return type to uint16_t" * tag 'bootconfig-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: lib/bootconfig: change xbc_node_index() return type to uint16_t lib/bootconfig: use size_t for key length tracking in xbc_verify_tree() lib/bootconfig: use signed type for offset in xbc_init_node() lib/bootconfig: use size_t for strlen result in xbc_node_match_prefix() lib/bootconfig: fix signed comparison in xbc_node_get_data() lib/bootconfig: validate child node index in xbc_verify_tree() lib/bootconfig: replace linux/kernel.h with specific includes bootconfig: constify xbc_calc_checksum() data parameter lib/bootconfig: drop redundant memset of xbc_nodes lib/bootconfig: increment xbc_node_num after node init succeeds lib/bootconfig: fix off-by-one in xbc_verify_tree() next node check lib/bootconfig: narrow flag parameter type from uint32_t to uint16_t lib/bootconfig: clean up comment typos and bracing
2026-04-17Merge tag 'mips_7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds
Pull MIPS updates from Thomas Bogendoerfer: - Support for Mobileye EyeQ6Lplus - Cleanups and fixes * tag 'mips_7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (30 commits) MIPS/mtd: Handle READY GPIO in generic NAND platform data MIPS/input: Move RB532 button to GPIO descriptors MIPS: validate DT bootargs before appending them MIPS: Alchemy: Remove unused forward declaration MAINTAINERS: Mobileye: Add EyeQ6Lplus files MIPS: config: add eyeq6lplus_defconfig MIPS: Add Mobileye EyeQ6Lplus evaluation board dts MIPS: Add Mobileye EyeQ6Lplus SoC dtsi clk: eyeq: Add Mobileye EyeQ6Lplus OLB clk: eyeq: Adjust PLL accuracy computation clk: eyeq: Skip post-divisor when computing PLL frequency pinctrl: eyeq5: Add Mobileye EyeQ6Lplus OLB pinctrl: eyeq5: Use match data reset: eyeq: Add Mobileye EyeQ6Lplus OLB MIPS: Add Mobileye EyeQ6Lplus support dt-bindings: soc: mobileye: Add EyeQ6Lplus OLB dt-bindings: mips: Add Mobileye EyeQ6Lplus SoC MIPS: dts: loongson64g-package: Switch to Loongson UART driver mips: pci-mt7620: rework initialization procedure mips: pci-mt7620: add more register init values ...
2026-04-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "Arm: - Add support for tracing in the standalone EL2 hypervisor code, which should help both debugging and performance analysis. This uses the new infrastructure for 'remote' trace buffers that can be exposed by non-kernel entities such as firmware, and which came through the tracing tree - Add support for GICv5 Per Processor Interrupts (PPIs), as the starting point for supporting the new GIC architecture in KVM - Finally add support for pKVM protected guests, where pages are unmapped from the host as they are faulted into the guest and can be shared back from the guest using pKVM hypercalls. Protected guests are created using a new machine type identifier. As the elusive guestmem has not yet delivered on its promises, anonymous memory is also supported This is only a first step towards full isolation from the host; for example, the CPU register state and DMA accesses are not yet isolated. Because this does not really yet bring fully what it promises, it is hidden behind CONFIG_ARM_PKVM_GUEST + 'kvm-arm.mode=protected', and also triggers TAINT_USER when a VM is created. Caveat emptor - Rework the dreaded user_mem_abort() function to make it more maintainable, reducing the amount of state being exposed to the various helpers and rendering a substantial amount of state immutable - Expand the Stage-2 page table dumper to support NV shadow page tables on a per-VM basis - Tidy up the pKVM PSCI proxy code to be slightly less hard to follow - Fix both SPE and TRBE in non-VHE configurations so that they do not generate spurious, out of context table walks that ultimately lead to very bad HW lockups - A small set of patches fixing the Stage-2 MMU freeing in error cases - Tighten-up accepted SMC immediate value to be only #0 for host SMCCC calls - The usual cleanups and other selftest churn LoongArch: - Use CSR_CRMD_PLV for kvm_arch_vcpu_in_kernel() - Add DMSINTC irqchip in kernel support RISC-V: - Fix steal time shared memory alignment checks - Fix vector context allocation leak - Fix array out-of-bounds in pmu_ctr_read() and pmu_fw_ctr_read_hi() - Fix double-free of sdata in kvm_pmu_clear_snapshot_area() - Fix integer overflow in kvm_pmu_validate_counter_mask() - Fix shift-out-of-bounds in make_xfence_request() - Fix lost write protection on huge pages during dirty logging - Split huge pages during fault handling for dirty logging - Skip CSR restore if VCPU is reloaded on the same core - Implement kvm_arch_has_default_irqchip() for KVM selftests - Factored-out ISA checks into separate sources - Added hideleg to struct kvm_vcpu_config - Factored-out VCPU config into separate sources - Support configuration of per-VM HGATP mode from KVM user space s390: - Support for ESA (31-bit) guests inside nested hypervisors - Remove restriction on memslot alignment, which is not needed anymore with the new gmap code - Fix LPSW/E to update the bear (which of course is the breaking event address register) x86: - Shut up various UBSAN warnings on reading module parameter before they were initialized - Don't zero-allocate page tables that are used for splitting hugepages in the TDP MMU, as KVM is guaranteed to set all SPTEs in the page table and thus write all bytes - As an optimization, bail early when trying to unsync 4KiB mappings if the target gfn can just be mapped with a 2MiB hugepage x86 generic: - Copy single-chunk MMIO write values into struct kvm_vcpu (more precisely struct kvm_mmio_fragment) to fix use-after-free stack bugs where KVM would dereference stack pointer after an exit to userspace - Clean up and comment the emulated MMIO code to try to make it easier to maintain (not necessarily "easy", but "easier") - Move VMXON+VMXOFF and EFER.SVME toggling out of KVM (not *all* of VMX and SVM enabling) as it is needed for trusted I/O - Advertise support for AVX512 Bit Matrix Multiply (BMM) instructions - Immediately fail the build if a required #define is missing in one of KVM's headers that is included multiple times - Reject SET_GUEST_DEBUG with -EBUSY if there's an already injected exception, mostly to prevent syzkaller from abusing the uAPI to trigger WARNs, but also because it can help prevent userspace from unintentionally crashing the VM - Exempt SMM from CPUID faulting on Intel, as per the spec - Misc hardening and cleanup changes x86 (AMD): - Fix and optimize IRQ window inhibit handling for AVIC; make it per-vCPU so that KVM doesn't prematurely re-enable AVIC if multiple vCPUs have to-be-injected IRQs - Clean up and optimize the OSVW handling, avoiding a bug in which KVM would overwrite state when enabling virtualization on multiple CPUs in parallel. This should not be a problem because OSVW should usually be the same for all CPUs - Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a "too large" size based purely on user input - Clean up and harden the pinning code for KVM_MEMORY_ENCRYPT_REG_REGION - Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as doing so for an SNP guest will crash the host due to an RMP violation page fault - Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are required to hold kvm->lock, and enforce it by lockdep. Fix various bugs where sev_guest() was not ensured to be stable for the whole duration of a function or ioctl - Convert a pile of kvm->lock SEV code to guard() - Play nicer with userspace that does not enable KVM_CAP_EXCEPTION_PAYLOAD, for which KVM needs to set CR2 and DR6 as a response to ioctls such as KVM_GET_VCPU_EVENTS (even if the payload would end up in EXITINFO2 rather than CR2, for example). Only set CR2 and DR6 when consumption of the payload is imminent, but on the other hand force delivery of the payload in all paths where userspace retrieves CR2 or DR6 - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT instead of vmcb02->save.cr2. The value is out of sync after a save/restore or after a #PF is injected into L2 - Fix a class of nSVM bugs where some fields written by the CPU are not synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not up-to-date when saved by KVM_GET_NESTED_STATE - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after save+restore - Add a variety of missing nSVM consistency checks - Fix several bugs where KVM failed to correctly update VMCB fields on nested #VMEXIT - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for SVM-related instructions - Add support for save+restore of virtualized LBRs (on SVM) - Refactor various helpers and macros to improve clarity and (hopefully) make the code easier to maintain - Aggressively sanitize fields when copying from vmcb12, to guard against unintentionally allowing L1 to utilize yet-to-be-defined features - Fix several bugs where KVM botched rAX legality checks when emulating SVM instructions. There are remaining issues in that KVM doesn't handle size prefix overrides for 64-bit guests - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of somewhat arbitrarily synthesizing #GP (i.e. don't double down on AMD's architectural but sketchy behavior of generating #GP for "unsupported" addresses) - Cache all used vmcb12 fields to further harden against TOCTOU bugs x86 (Intel): - Drop obsolete branch hint prefixes from the VMX instruction macros - Use ASM_INPUT_RM() in __vmcs_writel() to coerce clang into using a register input when appropriate - Code cleanups guest_memfd: - Don't mark guest_memfd folios as accessed, as guest_memfd doesn't support reclaim, the memory is unevictable, and there is no storage to write back to LoongArch selftests: - Add KVM PMU test cases s390 selftests: - Enable more memory selftests x86 selftests: - Add support for Hygon CPUs in KVM selftests - Fix a bug in the MSR test where it would get false failures on AMD/Hygon CPUs with exactly one of RDPID or RDTSCP - Add an MADV_COLLAPSE testcase for guest_memfd as a regression test for a bug where the kernel would attempt to collapse guest_memfd folios against KVM's will" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (373 commits) KVM: x86: use inlines instead of macros for is_sev_*guest x86/virt: Treat SVM as unsupported when running as an SEV+ guest KVM: SEV: Goto an existing error label if charging misc_cg for an ASID fails KVM: SVM: Move lock-protected allocation of SEV ASID into a separate helper KVM: SEV: use mutex guard in snp_handle_guest_req() KVM: SEV: use mutex guard in sev_mem_enc_unregister_region() KVM: SEV: use mutex guard in sev_mem_enc_ioctl() KVM: SEV: use mutex guard in snp_launch_update() KVM: SEV: Assert that kvm->lock is held when querying SEV+ support KVM: SEV: Document that checking for SEV+ guests when reclaiming memory is "safe" KVM: SEV: Hide "struct kvm_sev_info" behind CONFIG_KVM_AMD_SEV=y KVM: SEV: WARN on unhandled VM type when initializing VM KVM: LoongArch: selftests: Add PMU overflow interrupt test KVM: LoongArch: selftests: Add basic PMU event counting test KVM: LoongArch: selftests: Add cpucfg read/write helpers LoongArch: KVM: Add DMSINTC inject msi to vCPU LoongArch: KVM: Add DMSINTC device support LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function LoongArch: KVM: Move host CSR_GSTAT save and restore in context switch LoongArch: KVM: Move host CSR_EENTRY save and restore in context switch ...
2026-04-16Merge tag 'for-linus-fwctl' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/fwctl/fwctl Pull fwctl updates from Jason Gunthorpe: - New fwctl driver for Broadcom RDMA NICs - Bug fix for non-modular builds * tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/fwctl/fwctl: fwctl: Fix class init ordering to avoid NULL pointer dereference on device removal fwctl/bnxt_fwctl: Add documentation entries fwctl/bnxt_fwctl: Add bnxt fwctl device fwctl/bnxt_en: Create an aux device for fwctl fwctl/bnxt_en: Refactor aux bus functions to be more generic fwctl/bnxt_en: Move common definitions to include/linux/bnxt/
2026-04-16Merge tag 'soc-arm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds
Pull SoC ARM code updates from Arnd Bergmann: "These are again very minimal updates: - A workaround for firmware on Google Nexus 10 - A fix for early debugging on OMAP1 - A rework for Microchip SoC configuration - Cleanups on OMAP2 an R-Car-Gen2" * tag 'soc-arm-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: omap2: dead code cleanup in kconfig for ARCH_OMAP4 ARM: OMAP1: Fix DEBUG_LL and earlyprintk on OMAP16XX arm64: Kconfig: provide a top-level switch for Microchip platforms ARM: shmobile: rcar-gen2: Use of_phandle_args_equal() helper ARM: omap: fix all kernel-doc warnings ARM: omap2: Replace scnprintf with strscpy in omap3_cpuinfo ARM: samsung: exynos5250: Allow CPU1 to boot
2026-04-16Merge tag 'soc-drivers-7.1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC driver updates from Arnd Bergmann: "The driver updates again are all over the place with many minor fixes going into platform specific code. The most notable changes are: - Support for Microchip pic64gx system controllers - Work on cleaning up devicetree bindings for SoC drivers, and converting them into the new format - Lots of smaller changes for Qualcomm SoC drivers, including support for a number of newly supported chips - reset controller API cleanups and a new driver for Cix Sky1 - Reworks of the Tegra PMC and CBB drivers, along with a change to how individual Tegra SoCs get selected in Kconfig and BPMP firmware driver updates including a refresh of the ABI header to match the version used by firmware - STM32 updates to the firewall bus driver and support for the debug bus through OP-TEE - SCMI firmware driver improvements for reliability, in particular for dealing with broken firmware interrupts - Memory driver updates for Tegra, and a patch to remove the unused Baikal T1 driver" * tag 'soc-drivers-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (193 commits) firmware: arm_ffa: Use the correct buffer size during RXTX_MAP firmware: qcom: scm: Allow QSEECOM on Lenovo IdeaCentre Mini X clk: spear: fix resource leak in clk_register_vco_pll() reset: rzv2h-usb2phy: Add support for VBUS mux controller registration reset: rzv2h-usb2phy: Convert to regmap API dt-bindings: reset: renesas,rzv2h-usb2phy: Document RZ/G3E USB2PHY reset dt-bindings: reset: renesas,rzv2h-usb2phy: Add '#mux-state-cells' property soc: microchip: add mpfs gpio interrupt mux driver dt-bindings: soc: microchip: document PolarFire SoC's gpio interrupt mux gpio: mpfs: Add interrupt support soc: qcom: ubwc: add helpers to get programmable values soc: qcom: ubwc: add helper to get min_acc length firmware: qcom: scm: Register gunyah watchdog device soc: qcom: socinfo: Add SoC ID for SA8650P dt-bindings: arm: qcom,ids: Add SoC ID for SA8650P firmware: qcom: scm: Allow QSEECOM on Mahua CRD soc: qcom: wcnss: simplify allocation of req soc: qcom: pd-mapper: Add support for Eliza soc: qcom: aoss: compare against normalized cooling state soc: qcom: llcc: fix v1 SB syndrome register offset ...
2026-04-16Merge tag 'soc-dt-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds
Pull SoC devicetree updates from Arnd Bergmann: "A number of SoC platforms are adding modernized variants of their already supported chips time, with a total of 12 new SoCs, and two older SoC getting removed: - Qualcomm Glymur is a compute SoC using 18 Oryon-2 CPU cores - Qualcomm Mahua is a variant of Glymur with only 12 CPU cores, but largely identical. - Qualcomm Eliza is an embeded platform for mobile phone (SM7750) and IOT (QC7790S/M) workloads - Qualcomm IPQ5210 is a wireless networking SoC using Cortex-A53 cores - Qualcomm apq8084 and ipq806x had only rudimentary support but no actual products using them, so they are now gone. - Axis ARTPEC-9 is a follow-up to the ARTPEC-8 embedded SoC, using the Samsung SoC platform but now with Cortex-A55 cores - ARM Zena is a virtual platform in FVP using Cortex-A720AE cores, with additional versions planned to be merged in the future. - ARM corstone-1000-a320 is a reference platform for IOT, using low-end Cortex-A320 cores - Microchip LAN9691 is an updated 64-bit variant of the arm32 lan966x series of networking SoCs - Microchip PIC64GX is an embedded RISC-V chip using SIFIVE U54 CPU cores - Rockchip RV1103B is the low-end 32-bit single-core vision processor - Renesas RZ/G3L (r9a08g046) is an industrial embedded chip using Cortex-A55 cores, similar to the G3E and G3S variants we already supported. - NXP S32N79 is an automotive SoC using Cortex-A78AE cores, a significant upgrade from the older S32V and S32G series These all come with at least one reference board or an initial product using these, in total there are 67 newly added boards. The ones for already supported SoCs are: - Two more Aspeed BMC based boards - Three older tablets based on 32-bit OMAP4 and Exynos5 SoCs - One Set-top-box based on Allwinner H6 - 22 additional industrial/embedded boards using 64-bit NXP i.MX8M or i.MX9 SoCs - 20 Qualcomm SoC based machines across all possible markets: workstation, gaming, laptop, phone, networking, reference, ... - Three more Rockchips rk35xx based boards - Four variants of the Toradex Verdin using TI AM62 Other notable bits are: - A cleanup for the 32-bit Tegra paz00 board moved the last board specific code on Tegra into equivalent dts syntax. - There continues to be a significant number of fixes for static checking of dtc syntax, but it feels like this is slowing down, hopefully getting into a state where most known issues are addressed - Additional hardware support for many existing boards across SoC families, notably Qualcomm, Broadcom, i.MX2, i.MX6, Rockchips, STM32, Mediatek, Tegra, TI and Microchip" * tag 'soc-dt-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (841 commits) arm64: dts: ti: k3: Use memory-region-names for r5f ARM: dts: imx: Add DT overlays for DH i.MX6 DHCOM SoM and boards ARM: dts: imx6sx: remove fallback compatible string fsl,imx28-lcdif ARM: dts: imx25: rename node name tcq to touchscreen ARM: dts: imx: b850v3: Disable unused usdhc4 ARM: dts: imx: b850v3: Define GPIO line names ARM: dts: imx: b850v3: Use alphabetical sorting ARM: dts: imx: bx50v3: Configure phy-mode to eliminate a warning ARM: dts: imx: bx50v3: Configure switch PHY max-speed to 100Mbps ARM: dts: imx7ulp: Add CPU clock and OPP table support ARM: dts: imx7-mba7: Deassert BOOT_EN after boot ARM: dts: tqma7: add boot phase properties ARM: dts: imx7s: add boot phase properties ARM: dts: tqma6ul[l]: correct spelling of TQ-Systems ARM: dts: mba6ulx: add boot phase properties ARM: dts: imx6ul[l]-tqma6ul[l]: add boot phase properties ARM: dts: imx6ul/imx6ull: add boot phase properties ARM: dts: imx6qdl-mba6: add boot phase properties ARM: dts: imx6qdl-tqma6: add boot phase properties ARM: dts: imx6qdl: add boot phase properties ...
2026-04-16Merge tag 'mm-nonmm-stable-2026-04-15-04-20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "pid: make sub-init creation retryable" (Oleg Nesterov) Make creation of init in a new namespace more robust by clearing away some historical cruft which is no longer needed. Also some documentation fixups - "selftests/fchmodat2: Error handling and general" (Mark Brown) Fix and a cleanup for the fchmodat2() syscall selftest - "lib: polynomial: Move to math/ and clean up" (Andy Shevchenko) - "hung_task: Provide runtime reset interface for hung task detector" (Aaron Tomlin) Give administrators the ability to zero out /proc/sys/kernel/hung_task_detect_count - "tools/getdelays: use the static UAPI headers from tools/include/uapi" (Thomas Weißschuh) Teach getdelays to use the in-kernel UAPI headers rather than the system-provided ones - "watchdog/hardlockup: Improvements to hardlockup" (Mayank Rungta) Several cleanups and fixups to the hardlockup detector code and its documentation - "lib/bch: fix undefined behavior from signed left-shifts" (Josh Law) A couple of small/theoretical fixes in the bch code - "ocfs2/dlm: fix two bugs in dlm_match_regions()" (Junrui Luo) - "cleanup the RAID5 XOR library" (Christoph Hellwig) A quite far-reaching cleanup to this code. I can't do better than to quote Christoph: "The XOR library used for the RAID5 parity is a bit of a mess right now. The main file sits in crypto/ despite not being cryptography and not using the crypto API, with the generic implementations sitting in include/asm-generic and the arch implementations sitting in an asm/ header in theory. The latter doesn't work for many cases, so architectures often build the code directly into the core kernel, or create another module for the architecture code. Change this to a single module in lib/ that also contains the architecture optimizations, similar to the library work Eric Biggers has done for the CRC and crypto libraries later. After that it changes to better calling conventions that allow for smarter architecture implementations (although none is contained here yet), and uses static_call to avoid indirection function call overhead" - "lib/list_sort: Clean up list_sort() scheduling workarounds" (Kuan-Wei Chiu) Clean up this library code by removing a hacky thing which was added for UBIFS, which UBIFS doesn't actually need - "Fix bugs in extract_iter_to_sg()" (Christian Ehrhardt) Fix a few bugs in the scatterlist code, add in-kernel tests for the now-fixed bugs and fix a leak in the test itself - "kdump: Enable LUKS-encrypted dump target support in ARM64 and PowerPC" (Coiby Xu) Enable support of the LUKS-encrypted device dump target on arm64 and powerpc - "ocfs2: consolidate extent list validation into block read callbacks" (Joseph Qi) Cleanup, simplify, and make more robust ocfs2's validation of extent list fields (Kernel test robot loves mounting corrupted fs images!) * tag 'mm-nonmm-stable-2026-04-15-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (127 commits) ocfs2: validate group add input before caching ocfs2: validate bg_bits during freefrag scan ocfs2: fix listxattr handling when the buffer is full doc: watchdog: fix typos etc update Sean's email address ocfs2: use get_random_u32() where appropriate ocfs2: split transactions in dio completion to avoid credit exhaustion ocfs2: remove redundant l_next_free_rec check in __ocfs2_find_path() ocfs2: validate extent block list fields during block read ocfs2: remove empty extent list check in ocfs2_dx_dir_lookup_rec() ocfs2: validate dx_root extent list fields during block read ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY ocfs2: handle invalid dinode in ocfs2_group_extend .get_maintainer.ignore: add Askar ocfs2: validate bg_list extent bounds in discontig groups checkpatch: exclude forward declarations of const structs tools/accounting: handle truncated taskstats netlink messages taskstats: set version in TGID exit notifications ocfs2/heartbeat: fix slot mapping rollback leaks on error paths arm64,ppc64le/kdump: pass dm-crypt keys to kdump kernel ...
2026-04-16net: enetc: fix NTMP DMA use-after-free issueWei Fang
The AI-generated review reported a potential DMA use-after-free issue [1]. If netc_xmit_ntmp_cmd() times out and returns an error, the pending command is not explicitly aborted, while ntmp_free_data_mem() unconditionally frees the DMA buffer. If the buffer has already been reallocated elsewhere, this may lead to silent memory corruption. Because the hardware eventually processes the pending command and perform a DMA write of the response to the physical address of the freed buffer. To resolve this issue, this patch does the following modifications: 1. Convert cbdr->ring_lock from a spinlock to a mutex The lock was originally a spinlock in case NTMP operations might be invoked from atomic context. After downstream support for all NTMP tables, no such usage has materialized. A mutex lock is now required because the driver now needs to reclaim used BDs and release associated DMA memory within the lock's context, while dma_free_coherent() might sleep. 2. Introduce software command BD (struct netc_swcbd) The hardware write-back overwrites the addr and len fields of the BD, so the driver cannot rely on the hardware BD to free the associated DMA memory. The driver now maintains a software shadow BD storing the DMA buffer pointer, DMA address, and size. And netc_xmit_ntmp_cmd() only reclaims older BDs when the number of used BDs reaches NETC_CBDR_CLEAN_WORK (16). The software BD enables correct DMA memory release. With this, struct ntmp_dma_buf and ntmp_free_data_mem() are no longer needed and are removed. 3. Require callers to hold ring_lock across netc_xmit_ntmp_cmd() netc_xmit_ntmp_cmd() releases the ring_lock before the caller finishes consuming the response. At this point, if a concurrent thread submits a new command, it may trigger ntmp_clean_cbdr() and free the DMA buffer while it is still in use. Move ring_lock ownership to the caller to ensure the response buffer cannot be reclaimed prematurely. So the helpers ntmp_select_and_lock_cbdr() and ntmp_unlock_cbdr() are added. These changes eliminate the DMA use-after-free condition and ensure safe and consistent BD reclamation and DMA buffer lifecycle management. Fixes: 4701073c3deb ("net: enetc: add initial netc-lib driver to support NTMP") Link: https://lore.kernel.org/netdev/20260403011729.1795413-1-kuba@kernel.org/ # [1] Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260415060833.2303846-3-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-16Merge tag 'v7.1-rc1-part2-smb3-client-fixes' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull smb client updates from Steve French: - Fix integer underflow in encrypted read - Four debug patches, adding a few tracepoints - Minor update to MAINTAINERS file (preferred server URL for cifs) - Remove the BUG_ON() calls in d_mark_tmpfile_name * tag 'v7.1-rc1-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: MAINTAINERS: change git.samba.org to https smb: client: fix integer underflow in receive_encrypted_read() smb: client: add tracepoints for deferred handle caching smb: client: add oplock level to smb3_open_done tracepoint smb: client: add tracepoint for local lock conflicts smb: client: add tracepoints for lock operations vfs: get rid of BUG_ON() in d_mark_tmpfile_name()
2026-04-16entry: Kill ARCH_SYSCALL_WORK_{ENTER,EXIT}Oleg Nesterov
Nowadays nothing redefines these flags. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/advfWWKgOQkFkwp9@redhat.com
2026-04-16Merge branch 'for-7.1/lenovo-v2' into for-linusJiri Kosina
- new driver for Lenovo Legion Go / S devices (Derek J. Clark)