summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2026-01-27irqchip/gic-v5: Add ACPI IRS probingLorenzo Pieralisi
On ARM64 ACPI systems GICv5 IRSes are described in MADT sub-entries. Add the required plumbing to parse MADT IRS firmware table entries and probe the IRS components in ACPI. Augment the irqdomain_ops.translate() for PPI and SPI IRQs in order to provide support for their ACPI based firmware translation. Implement an irqchip ACPI based callback to initialize the global GSI domain upon an MADT IRS detection. The IRQCHIP_ACPI_DECLARE() entry in the top level GICv5 driver is only used to trigger the IRS probing (ie the global GSI domain is initialized once on the first call on multi-IRS systems); IRS probing takes place by calling acpi_table_parse_madt() in the IRS sub-driver, that probes all IRSes in sequence. Add a new ACPI interrupt model so that it can be detected at runtime and distinguished from previous GIC architecture models. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260115-gicv5-host-acpi-v3-4-c13a9a150388@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-27PCI/MSI: Make the pci_msi_map_rid_ctlr_node() interface firmware agnosticLorenzo Pieralisi
To support booting with OF and ACPI seamlessly, GIC ITS parent code requires the PCI/MSI irqdomain layer to implement a function to retrieve an MSI controller fwnode and map an RID in a firmware agnostic way (ie pci_msi_map_rid_ctlr_node()). Convert pci_msi_map_rid_ctlr_node() to an OF agnostic interface (fwnode_handle based) and update the GIC ITS MSI parent code to reflect the pci_msi_map_rid_ctlr_node() change. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260115-gicv5-host-acpi-v3-2-c13a9a150388@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-27irqdomain: Add parent field to struct irqchip_fwidLorenzo Pieralisi
The GICv5 driver IRQ domain hierarchy requires adding a parent field to struct irqchip_fwid so that core code can reference a fwnode_handle parent for a given fwnode. Add a parent field to struct irqchip_fwid and update the related kernel API functions to initialize and handle it. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Acked-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260115-gicv5-host-acpi-v3-1-c13a9a150388@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-01-27Merge ACPICA material for 6.20 to satisfy dependenciesRafael J. Wysocki
2026-01-27RDMA/mana_ib: Add device‑memory supportKonstantin Taranov
Introduce a basic DM implementation that enables creating and registering device memory, and using the associated memory keys for networking operations. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://patch.msgid.link/20260127082649.429018-1-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-01-27Merge tag 'cpufreq-arm-updates-7.0-rc1' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull CPUFreq Arm updates for 7.0 from Viresh Kumar: "- Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling, Dhruva Gole, and Konrad Dybcio). - Minor improvements to the cpufreq / cpumask rust implementation (Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen). - Add support for AM62L3 SoC to ti-cpufreq driver (Dhruva Gole). - Update FIE arch_freq_scale in ticks for non-PCC regs (Jie Zhan). - Other minor cleanups / improvements (Felix Gu, Juan Martinez, Luca Weiss, and Sergey Shtylyov)." * tag 'cpufreq-arm-updates-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id() cpufreq: ti-cpufreq: add support for AM62L3 SoC cpufreq: dt-platdev: Add ti,am62l3 to blocklist cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy cpufreq: scmi: correct SCMI explanation cpufreq: dt-platdev: Block the driver from probing on more QC platforms rust: cpumask: rename methods of Cpumask for clarity and consistency cpufreq: CPPC: Update FIE arch_freq_scale in ticks for non-PCC regs cpufreq: CPPC: Factor out cppc_fie_kworker_init() ACPI: CPPC: Factor out and export per-cpu cppc_perf_ctrs_in_pcc_cpu() rust: cpufreq: replace `kernel::c_str!` with C-Strings cpufreq: Add Tegra186 and Tegra194 to cpufreq-dt-platdev blocklist dt-bindings: cpufreq: qcom-hw: document Milos CPUFREQ Hardware rust: cpufreq: add __rust_helper to helpers rust: cpufreq: always inline functions using build_assert with arguments
2026-01-27Revert "tty: tty_port: add workqueue to flip TTY buffer"Greg Kroah-Hartman
This reverts commit d000422a46aad32217cf1be747eb61d641baae2f. It is reported by many to cause boot failures, so must be reverted. Cc: Xin Zhao <jackzxcui1989@163.com> Cc: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/r/d1942304-ee30-478d-90fb-279519f3ae81@samsung.com Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Reported-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-27regmap: Add reg_default_cb callback for flat cache defaultsSheetal
Commit e062bdfdd6ad ("regmap: warn users about uninitialized flat cache") warns when REGCACHE_FLAT is used without full defaults. This causes false positives on hardware where many registers reset to zero but are not listed in reg_defaults, forcing drivers to maintain large tables just to silence the warning. Add a reg_default_cb() hook so drivers can supply defaults for registers not present in reg_defaults when populating REGCACHE_FLAT. This keeps the warning quiet for known zero-reset registers without bloating tables. Provide a generic regmap_default_zero_cb() helper for drivers that need zero defaults. The hook is only used for REGCACHE_FLAT; the core does not check readable/writeable access, so drivers must provide readable_reg/ writeable_reg callbacks and handle holes in the register map. Signed-off-by: Sheetal <sheetal@nvidia.com> Link: https://patch.msgid.link/20260123095346.1258556-3-sheetal@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-01-27ASoC: codec: Remove ak4641Peng Fan
Since commit d6df7df7ae5a0 ("ARM: pxa: remove unused board files"), there has been no in-tree user of the AK4641 codec driver. The last user (HP iPAQ hx4700) was a non-DT PXA board file that instantiated the device via I2C board data; that code was removed as part of the PXA board-file purge. The AK4641 driver was introduced ~2011 and still probes only via the I2C device-ID table ('.id_table'), without an 'of_match_table', so there are no upstream Devicetree users to retain. With no in-tree users left, remove the driver. Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Acked-by: Andy Shevchenko <andriy.shevchenko@intel.com> Link: https://patch.msgid.link/20260122-sound-cleanup-v1-1-0a91901609b8@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-01-27wifi: cfg80211: treat deprecated INDOOR_SP_AP_OLD control value as LPI modePagadala Yesu Anjaneyulu
Although value 4 (INDOOR_SP_AP_OLD) is deprecated in IEEE standards, existing APs may still use this control value. Since this value is based on the old specification, we cannot trust such APs implement proper power controls. Therefore, move IEEE80211_6GHZ_CTRL_REG_INDOOR_SP_AP_OLD case from SP_AP to LPI_AP power type handling to prevent potential power limit violations. Signed-off-by: Pagadala Yesu Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260111163601.6b5a36d3601e.I1704ee575fd25edb0d56f48a0a3169b44ef72ad0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27sdio: Provide a bustype shutdown functionUwe Kleine-König
To prepare sdio drivers to migrate away from struct device_driver::shutdown (and then eventually remove that callback) create a serdev driver shutdown callback and migration code to keep the existing behaviour. Note this introduces a warning for each driver that isn't converted yet to that callback at register time. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://patch.msgid.link/397f45c2818f6632151f92b70e547262f373c3b6.1768232321.git.u.kleine-koenig@baylibre.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27wifi: nl80211/cfg80211: support operating as RSTA in PMSR FTM requestAvraham Stern
Add an option to operate as the RSTA in an FTM measurement request. When requested, the device will dwell on the requested channel until the peer starts the FTM negotiation. This option is only valid for trigger-based/non trigger-based measurement with LMR feedback which will allow the RSTA to receive the results of the measurement. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260111190221.1f95fc0afab4.Iae2d32783b8e7c4a29089fec0f4c6bce94d303cc@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27wifi: nl80211/cfg80211: add negotiated burst period to FTM resultAvraham Stern
The FTM result includes some of the periodic measurement negotiated parameters (like the burst duration and number of bursts), but it doesn't include the burst period. Add it to the FTM result notification. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260111190221.e0778f86edef.I3c98c1933eb639963bc3ffdef81a8788b59f2188@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27wifi: nl80211/cfg80211: clarify periodic FTM parameters for non-EDCA based ↵Avraham Stern
ranging Periodic FTM request attributes are defined based on the periodic parameters used in EDCA-based ranging negotiation. However, non-EDCA based ranging (trigger-based/non-trigger-based) does not include periodic parameters in the negotiation protocol, even though upper layers may still request periodic measurements. Clarify the semantics of periodic ranging attributes when used with non-EDCA based ranging. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260111190221.b89cb3f68e1a.I7a9d8c6d1c66c77f1b43120a841101c96c3f19ad@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27wifi: nl80211/cfg80211: add new FTM capabilitiesAvraham Stern
Add new capabilities to the PMSR FTM capabilities list. The new capabilities include 6 GHz support, supported number of spatial streams and supported number of LTF repetitions. Signed-off-by: Avraham Stern <avraham.stern@intel.com> Tested-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260111190221.bf43785c18f6.Ic98cf9790ddee84bf88e5720b93c46c23af3c96c@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-27vsock: add netns support to virtio transportsBobby Eshleman
Add netns support to loopback and vhost. Keep netns disabled for virtio-vsock, but add necessary changes to comply with common API updates. This is the patch in the series when vhost-vsock namespaces actually come online. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-3-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27vsock: add netns to vsock coreBobby Eshleman
Add netns logic to vsock core. Additionally, modify transport hook prototypes to be used by later transport-specific patches (e.g., *_seqpacket_allow()). Namespaces are supported primarily by changing socket lookup functions (e.g., vsock_find_connected_socket()) to take into account the socket namespace and the namespace mode before considering a candidate socket a "match". This patch also introduces the sysctl /proc/sys/net/vsock/ns_mode to report the mode and /proc/sys/net/vsock/child_ns_mode to set the mode for new namespaces. Add netns functionality (initialization, passing to transports, procfs, etc...) to the af_vsock socket layer. Later patches that add netns support to transports depend on this patch. This patch changes the allocation of random ports for connectible vsocks in order to avoid leaking the random port range starting point to other namespaces. dgram_allow(), stream_allow(), and seqpacket_allow() callbacks are modified to take a vsk in order to perform logic on namespace modes. In future patches, the net will also be used for socket lookups in these functions. Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com> Link: https://patch.msgid.link/20260121-vsock-vmtest-v16-1-2859a7512097@meta.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-27dma-buf: Rename dma_buf_move_notify() to dma_buf_invalidate_mappings()Leon Romanovsky
Along with renaming the .move_notify() callback, rename the corresponding dma-buf core function. This makes the expected behavior clear to exporters calling this function. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20260124-dmabuf-revoke-v5-2-f98fca917e96@nvidia.com Signed-off-by: Christian König <christian.koenig@amd.com>
2026-01-27dma-buf: Rename .move_notify() callback to a clearer identifierLeon Romanovsky
Rename the .move_notify() callback to .invalidate_mappings() to make its purpose explicit and highlight that it is responsible for invalidating existing mappings. Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20260124-dmabuf-revoke-v5-1-f98fca917e96@nvidia.com Signed-off-by: Christian König <christian.koenig@amd.com>
2026-01-27gpiolib: introduce devm_fwnode_gpiod_get_optional() wrapperStefan Kerkmann
The helper makes it easier to handle optional GPIOs and simplifies the error handling code. Signed-off-by: Stefan Kerkmann <s.kerkmann@pengutronix.de> Signed-off-by: Michael Tretter <m.tretter@pengutronix.de> Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://lore.kernel.org/r/20260126-gpio-devm_fwnode_gpiod_get_optional-v2-1-ec34f8e35077@pengutronix.de Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2026-01-27ACPI: CPPC: Factor out and export per-cpu cppc_perf_ctrs_in_pcc_cpu()Jie Zhan
Factor out cppc_perf_ctrs_in_pcc_cpu() for checking whether per-cpu CPC regs are defined in PCC channels, and export it out for further use. Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com> Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2026-01-26block: remove bio_last_bvec_allKeith Busch
There are no more callers of this function after commit f6b2d8b134b2413 ("btrfs: track the next file offset in struct btrfs_bio_ctrl"), so remove the function. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-26mm/damon: hide kdamond and kdamond_lock of damon_ctxSeongJae Park
There is no DAMON API caller that directly access 'kdamond' and 'kdamond_lock' fields of 'struct damon_ctx'. Keeping those exposed could only encourage creative but error-prone usages. Hide them from DAMON API callers by marking those as private fields. Link: https://lkml.kernel.org/r/20260115152047.68415-6-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/damon/core: implement damon_kdamond_pid()SeongJae Park
Patch series "mm/damon: hide kdamond and kdamond_lock from API callers". 'kdamond' and 'kdamond_lock' fields initially exposed to DAMON API callers for flexible synchronization and use cases. As DAMON API became somewhat complicated compared to the early days, Keeping those exposed could only encourage the API callers to invent more creative but complicated and difficult-to-debug use cases. Fortunately DAMON API callers didn't invent that many creative use cases. There exist only two use cases of 'kdamond' and 'kdamond_lock'. Finding whether the kdamond is actively running, and getting the pid of the kdamond. For the first use case, a dedicated API function, namely 'damon_is_running()' is provided, and all DAMON API callers are using the function for the use case. Hence only the second use case is where the fields are directly being used by DAMON API callers. To prevent future invention of complicated and erroneous use cases of the fields, hide the fields from the API callers. For that, provide new dedicated DAMON API functions for the remaining use case, namely damon_kdamond_pid(), migrate DAMON API callers to use the new function, and mark the fields as private fields. This patch (of 5): 'kdamond' and 'kdamond_lock' are directly being used by DAMON API callers for getting the pid of the corresponding kdamond. To discourage invention of creative but complicated and erroneous new usages of the fields that require careful synchronization, implement a new API function that can simply be used without the manual synchronizations. Link: https://lkml.kernel.org/r/20260115152047.68415-1-sj@kernel.org Link: https://lkml.kernel.org/r/20260115152047.68415-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26nodemask: propagate boolean for nodes_and{,not}Yury Norov
Patch series "nodemask: align nodes_and{,not} with underlying bitmap ops". nodes_and{,not} are void despite that underlying bitmap_and(,not) return boolean, true if the result bitmap is non-empty. Align nodemask API, and simplify client code. This patch (of 3): Bitmap functions bitmap_and{,not} return boolean depending on emptiness of the result bitmap. The corresponding nodemask helpers ignore the returned value. Propagate the underlying bitmaps result to nodemasks users, as it simplifies user code. Link: https://lkml.kernel.org/r/20260114172217.861204-1-ynorov@nvidia.com Link: https://lkml.kernel.org/r/20260114172217.861204-2-ynorov@nvidia.com Signed-off-by: Yury Norov <ynorov@nvidia.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com> Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Byungchul Park <byungchul@sk.com> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Waiman Long <longman@redhat.com> Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pte_clear() This reverts commit aa232204c468 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pte_clear"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. [ajd@linux.ibm.com: rebase, fix additional occurrence and loop handling] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-8-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pmd_clear() This reverts commit 1831414cd729 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pmd_clear"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. [ajd@linux.ibm.com: rebase on arm64 changes] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-7-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pud_clear() This reverts commit 931c38e16499 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pud_clear"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. [ajd@linux.ibm.com: rebase on arm64 changes] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-6-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: provide addr parameter to page_table_check_ptes_set()Rohan McLure
To provide support for powerpc platforms, provide an addr parameter to the __page_table_check_ptes_set() and page_table_check_ptes_set() routines. This parameter is needed on some powerpc platforms which do not encode whether a mapping is for user or kernel in the pte. On such platforms, this can be inferred from the addr parameter. [ajd@linux.ibm.com: rebase on arm64 + riscv changes, update commit message] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-5-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pmd[s]_set() This reverts commit a3b837130b58 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pmd_set"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. Apply this to __page_table_check_pmds_set(), page_table_check_pmd_set(), and the page_table_check_pmd_set() wrapper macro. [ajd@linux.ibm.com: rebase on arm64 + riscv changes, update commit message] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-4-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_table_check: reinstate address parameter in ↵Rohan McLure
[__]page_table_check_pud[s]_set() This reverts commit 6d144436d954 ("mm/page_table_check: remove unused parameter in [__]page_table_check_pud_set"). Reinstate previously unused parameters for the purpose of supporting powerpc platforms, as many do not encode user/kernel ownership of the page in the pte, but instead in the address of the access. Apply this to __page_table_check_puds_set(), page_table_check_puds_set() and the page_table_check_pud_set() wrapper macro. [ajd@linux.ibm.com: rebase on riscv + arm64 changes, update commit message] Link: https://lkml.kernel.org/r/20251219-pgtable_check_v18rebase-v18-3-755bc151a50b@linux.ibm.com Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: Ingo Molnar <mingo@kernel.org> # x86 Acked-by: Alexandre Ghiti <alexghiti@rivosinc.com> # riscv Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Miehlbradt <nicholas@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Thomas Huth <thuth@redhat.com> Cc: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26migrate: replace RMP_ flags with TTU_ flagsMatthew Wilcox (Oracle)
Instead of translating between RMP_ and TTU_ flags, remove the RMP_ flags and just use the TTU_ flag space; there's plenty available. Possibly we should rename these to RMAP_ flags, and maybe even pass them in through rmap_walk_arg, but that can be done later. Link: https://lkml.kernel.org/r/20260109041345.3863089-3-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Gregory Price <gourry@gourry.net> Cc: Jann Horn <jannh@google.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Ying Huang <ying.huang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/mempolicy: fix mpol_rebind_nodemask() for MPOL_F_NUMA_BALANCINGJinjiang Tu
commit bda420b98505 ("numa balancing: migrate on fault among multiple bound nodes") adds new flag MPOL_F_NUMA_BALANCING to enable NUMA balancing for MPOL_BIND memory policy. When the cpuset of tasks changes, the mempolicy of the task is rebound by mpol_rebind_nodemask(). When MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES are both not set, the behaviour of rebinding should be same whenever MPOL_F_NUMA_BALANCING is set or not. So, when an application calls set_mempolicy() with MPOL_F_NUMA_BALANCING set but both MPOL_F_STATIC_NODES and MPOL_F_RELATIVE_NODES cleared, mempolicy.w.cpuset_mems_allowed should be set to cpuset_current_mems_allowed nodemask. However, in current implementation, mpol_store_user_nodemask() wrongly returns true, causing mempolicy->w.user_nodemask to be incorrectly set to the user-specified nodemask. Later, when the cpuset of the application changes, mpol_rebind_nodemask() ends up rebinding based on the user-specified nodemask rather than the cpuset_mems_allowed nodemask as intended. I can reproduce with the following steps in qemu with 4 NUMA nodes: 1. echo '+cpuset' > /sys/fs/cgroup/cgroup.subtree_control 2. mkdir /sys/fs/cgroup/test 3. ./reproducer & 4. cat /proc/$pid/numa_maps, the task is bound to NUMA 1 5. echo $pid > /sys/fs/cgroup/test/cgroup.procs 6. cat /proc/$pid/numa_maps, the task is bound to NUMA 0 now. The reproducer code: int main() { struct bitmask *bmp; int ret; bmp = numa_parse_nodestring("1"); ret = set_mempolicy(MPOL_BIND | MPOL_F_NUMA_BALANCING, bmp->maskp, bmp->size + 1); if (ret < 0) { perror("Failed to call set_mempolicy"); exit(-1); } while (1); return 0; } If I call set_mempolicy() without MPOL_F_NUMA_BALANCING in the reproducer code. After step 5, the task is still bound to NUMA 1. To fix this, only set mempolicy->w.user_nodemask to the user-specified nodemask if MPOL_F_STATIC_NODES or MPOL_F_RELATIVE_NODES is present. Link: https://lkml.kernel.org/r/20260120011018.1256654-1-tujinjiang@huawei.com Link: https://lkml.kernel.org/r/20251223110523.1161421-1-tujinjiang@huawei.com Fixes: bda420b98505 ("numa balancing: migrate on fault among multiple bound nodes") Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Alistair Popple <apopple@nvidia.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26zsmalloc: introduce SG-list based object read APISergey Senozhatsky
Currently, zsmalloc performs address linearization on read (which sometimes requires memcpy() to a local buffer). Not all zsmalloc users need a linear address. For example, Crypto API supports SG-list, performing linearization under the hood, if needed. In addition, some compressors can have native SG-list support, completely avoiding the linearization step. Provide an SG-list based zsmalloc read API: - zs_obj_read_sg_begin() - zs_obj_read_sg_end() This API allows callers to obtain an SG representation of the object (one entry for objects that are contained in a single page and two entries for spanning objects), avoiding the need for a bounce buffer and memcpy. [senozhatsky@chromium.org: make zs_obj_read_sg_begin() return void, per Yosry] Link: https://lkml.kernel.org/r/20260117024900.792237-1-senozhatsky@chromium.org Link: https://lkml.kernel.org/r/20260113034645.2729998-1-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Brian Geffon <bgeffon@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nhat Pham <nphamcs@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/damon/core: introduce [in]active memory ratio damos quota goal metricSeongJae Park
Patch series "mm/damon: advance DAMOS-based LRU sorting". DAMOS_LRU_[DE]PRIO actions were added to DAMOS for more access-aware LRU lists sorting. For simple usage, a specialized kernel module, namely DAMON_LRU_SORT, has also been introduced. After the introduction of the module, DAMON got a few important new features, including the aim-based quota auto-tuning, age tracking, young page filter, and monitoring intervals auto-tuning. Meanwhile, DAMOS-based LRU sorting had no direct updates. Now we show some rooms to advance for DAMOS-based LRU sorting. Firstly, the aim-oriented quota auto-tuning can simplify the LRU sorting parameters tuning. But there is no good auto-tuning target metric for LRU sorting use case. Secondly, the behavior of DAMOS_LRU_[DE]PRIO are not very symmetric. DAMOS_LRU_DEPRIO directly moves the pages to inactive LRU list, while DAMOS_LRU_PRIO only marks the page as accessed, so that the page can not directly but only eventually moved to the active LRU list. Finally, DAMON_LRU_SORT users cannot utilize the modern features that can be useful for them, too. Improve the situation with the following changes. First, introduce a new DAMOS quota auto-tuning target metric for active:inactive memory size ratio. Since LRU sorting is a kind of balancing of active and inactive pages, the active:inactive memory size ratio can be intuitively set. Second, update DAMOS_LRU_[DE]PRIO behaviors to be more intuitive and symmetric, by letting them directly move the pages to [in]active LRU list. Third, update the DAMON_LRU_SORT module user interface to be able to fully utilize the modern features including the [in]active memory size ratio-based quota auto-tuning, young page filter, and monitoring intervals auto-tuning. With these changes, for example, users can now ask DAMON to "find hot/cold memory regions with auto-tuned monitoring intervals, do one more page level access check for found hot/cold memory, and move pages of those to active or inactive LRU lists accordingly, aiming X:Y active to inactive memory ratio." For example, if they know 30% of the memory is better to be protected from reclamation, 30:70 can be set as the target ratio. Test Results ------------ I ran DAMON_LRU_SORT with the features introduced by this series, on a real world server workload. For the active:inactive ratio goal, I set 50:50. I confirmed it achieves the target active:inactive ratio, without manual tuning of the monitoring intervals and the hot/coldness thresholds. The baseline system that was not running the DAMON_LRU_SORT was keeping active:inactive ratio of about 1:10. Note that the test didn't show a clear performance difference, though. I believe that was mainly because the workload was not very memory intensive. Also, whether the 50:50 target ratio was optimum is unclear. Nonetheless, the positive performance impact of the basic LRU sorting idea is already confirmed with the initial DAMON_LRU_SORT introduction patch series. The goal of this patch series is simplifying the parameters tuning of DAMOS-based LRU sorting, and the test confirmed the aimed goals are achieved. Patches Sequence ---------------- First three patches extend DAMOS quota auto-tuning to support [in]active memory ratio target metric type. Those (patches 1-3) introduce new metrics, implement DAMON sysfs support, and update the documentation, respectively. Following patch (patch 4) makes DAMOS_LRU_PRIO action to directly move target pages to active LRU list, instead of only marking them accessed. Following seven patches (patches 5-11) updates DAMON_LRU_SORT to support modern DAMON features. Patch 5 makes it uses not only access frequency but also age at under-quota regions prioritization. Patches 6-11 add the support for young page filtering, active:inactive memory ratio based quota auto-tuning, and monitoring intervals auto-tuning, with appropriate document updates. This patch (of 11): DAMOS_LRU_[DE]PRIO are DAMOS actions for making balance of active and inactive memory size. There is no appropriate DAMOS quota auto-tuning target metric for the use case. Add two new DAMOS quota goal metrics for the purpose, namely DAMOS_QUOTA_[IN]ACTIVE_MEM_BP. Those will represent the ratio of [in]active memory to total (inactive + active) memory. Hence, users will be able to ask DAMON to, for example, "find hot and cold memory, and move pages of those to active and inactive LRU lists, adjusting the hot/cold thresholds aiming 50:50 active:inactive memory ratio." Link: https://lkml.kernel.org/r/20260113152717.70459-1-sj@kernel.org Link: https://lkml.kernel.org/r/20260113152717.70459-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm: cma: add cma_alloc_frozen{_compound}()Kefeng Wang
Introduce cma_alloc_frozen{_compound}() helper to alloc pages without incrementing their refcount, then convert hugetlb cma to use the cma_alloc_frozen_compound() and cma_release_frozen() and remove the unused cma_{alloc,free}_folio(), also move the cma_validate_zones() into mm/internal.h since no outside user. The set_pages_refcounted() is only called to set non-compound pages after above changes, so remove the processing about PageHead. Link: https://lkml.kernel.org/r/20260109093136.1491549-6-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm: page_alloc: add alloc_contig_frozen_{range,pages}()Kefeng Wang
In order to allocate given range of pages or allocate compound pages without incrementing their refcount, adding two new helper alloc_contig_frozen_{range,pages}() which may be beneficial to some users (eg hugetlb). The new alloc_contig_{range,pages} only take !__GFP_COMP gfp now, and the free_contig_range() is refactored to only free non-compound pages, the only caller to free compound pages in cma_free_folio() is changed accordingly, and the free_contig_frozen_range() is provided to match the alloc_contig_frozen_range(), which is used to free frozen pages. Link: https://lkml.kernel.org/r/20260109093136.1491549-5-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm: cma: kill cma_pages_valid()Kefeng Wang
Kill cma_pages_valid() which only used in cma_release(), also cleanup code duplication between cma pages valid checking and cma memrange finding. Link: https://lkml.kernel.org/r/20260109093136.1491549-4-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Muchun Song <muchun.song@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm: page_alloc: add __split_page()Kefeng Wang
Factor out the splitting of non-compound page from make_alloc_exact() and split_page() into a new helper function __split_page(). While at it, convert the VM_BUG_ON_PAGE() into a VM_WARN_ON_PAGE(). Link: https://lkml.kernel.org/r/20260109093136.1491549-3-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Muchun Song <muchun.song@linux.dev> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm: debug_vm_pgtable: add debug_vm_pgtable_free_huge_page()Kefeng Wang
Patch series "mm: hugetlb: allocate frozen gigantic folio", v6. Introduce alloc_contig_frozen_pages() and cma_alloc_frozen_compound() which avoid atomic operation about page refcount, and then convert to allocate frozen gigantic folio by the new helpers in hugetlb to cleanup the alloc_gigantic_folio(). This patch (of 6): Add a new helper to free huge page to be consistency to debug_vm_pgtable_alloc_huge_page(), and use HPAGE_PUD_ORDER instead of open-code. Also move the free_contig_range() under CONFIG_ALLOC_CONTIG since all caller are built with CONFIG_ALLOC_CONTIG. Link: https://lkml.kernel.org/r/20260109093136.1491549-2-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Muchun Song <muchun.song@linux.dev> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26zsmalloc: use actual object size to detect spansSergey Senozhatsky
Using class->size to detect spanning objects is not entirely correct, because some size classes can hold a range of object sizes of up to class->size bytes in length, due to size-classes merge. Such classes use padding for cases when actually written objects are smaller than class->size. zs_obj_read_begin() can incorrectly hit the slow path and perform memcpy of such objects, basically copying padding bytes. Instead of class->size zs_obj_read_begin() should use the actual compressed object length (both zram and zswap know it) so that it can correctly handle situations when a written object is small enough to fit into the first physical page. Link: https://lkml.kernel.org/r/20260107052145.3586917-1-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> [zsmalloc & zswap] Reviewed-by: Nhat Pham <nphamcs@gmail.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26memcg: rename mem_cgroup_ino() to mem_cgroup_id()Shakeel Butt
Rename mem_cgroup_ino() to mem_cgroup_id() and mem_cgroup_get_from_ino() to mem_cgroup_get_from_id(). These functions now use cgroup IDs (from cgroup_id()) rather than inode numbers, so the names should reflect that. [shakeel.butt@linux.dev: replace ino with id, per SeongJae] Link: https://lkml.kernel.org/r/flkqanhyettp5uq22bjwg37rtmnpeg3mghznsylxcxxgaafpl4@nov2x7tagma7 [akpm@linux-foundation.org: build fix] Link: https://lkml.kernel.org/r/20251225232116.294540-9-shakeel.butt@linux.dev Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dave Chinner <david@fromorbit.com> Cc: David Hildenbrand <david@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26memcg: remove unused mem_cgroup_id() and mem_cgroup_from_id()Shakeel Butt
Now that all callers have been converted to use either: - The private ID APIs (mem_cgroup_private_id/mem_cgroup_from_private_id) for internal kernel objects that outlive their cgroup - The public cgroup ID APIs (mem_cgroup_ino/mem_cgroup_get_from_ino) for external interfaces Remove the unused wrapper functions mem_cgroup_id() and mem_cgroup_from_id() along with their !CONFIG_MEMCG stubs. Link: https://lkml.kernel.org/r/20251225232116.294540-8-shakeel.butt@linux.dev Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dave Chinner <david@fromorbit.com> Cc: David Hildenbrand <david@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: SeongJae Park <sj@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/damon: use cgroup ID instead of private memcg IDShakeel Butt
DAMON was using the internal private memcg ID which is meant for tracking kernel objects that outlive their cgroup. Switch to using the public cgroup ID instead. Link: https://lkml.kernel.org/r/20251225232116.294540-6-shakeel.butt@linux.dev Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: SeongJae Park <sj@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dave Chinner <david@fromorbit.com> Cc: David Hildenbrand <david@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26memcg: use cgroup_id() instead of cgroup_ino() for memcg IDShakeel Butt
Switch mem_cgroup_ino() from using cgroup_ino() to cgroup_id(). The cgroup_ino() returns the kernfs inode number while cgroup_id() returns the kernfs node ID. For 64-bit systems, they are the same. Also cgroup_get_from_id() expects 64-bit node ID which is called by mem_cgroup_get_from_ino(). Change the type from unsigned long to u64 to match cgroup_id()'s return type, and update the format specifiers accordingly. Note that the names mem_cgroup_ino() and mem_cgroup_get_from_ino() are now misnomers since they deal with cgroup IDs rather than inode numbers. A follow-up patch will rename them. Link: https://lkml.kernel.org/r/20251225232116.294540-5-shakeel.butt@linux.dev Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dave Chinner <david@fromorbit.com> Cc: David Hildenbrand <david@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: SeongJae Park <sj@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26memcg: expose mem_cgroup_ino() and mem_cgroup_get_from_ino() unconditionallyShakeel Butt
Remove the CONFIG_SHRINKER_DEBUG guards around mem_cgroup_ino() and mem_cgroup_get_from_ino(). These APIs provide a way to get a memcg's cgroup inode number and to look up a memcg from an inode number respectively. Making these functions unconditionally available allows other in-kernel users to leverage them without requiring CONFIG_SHRINKER_DEBUG to be enabled. No functional change for existing users. Link: https://lkml.kernel.org/r/20251225232116.294540-3-shakeel.butt@linux.dev Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dave Chinner <david@fromorbit.com> Cc: David Hildenbrand <david@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: SeongJae Park <sj@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26memcg: introduce private id API for in-kernel usersShakeel Butt
Patch series "memcg: separate private and public ID namespaces". The memory cgroup subsystem maintains a private ID infrastructure that is decoupled from the cgroup IDs. This private ID system exists because some kernel objects (like swap entries and shadow entries in the workingset code) can outlive the cgroup they were associated with. The motivation is best described in commit 73f576c04b941 ("mm: memcontrol: fix cgroup creation failure after many small jobs"). Unfortunately, some in-kernel users (DAMON, LRU gen debugfs interface, shrinker debugfs) started exposing these private IDs to userspace. This is problematic because: 1. The private IDs are internal implementation details that could change 2. Userspace already has access to cgroup IDs through the cgroup filesystem 3. Using different ID namespaces in different interfaces is confusing This series cleans up the memcg ID infrastructure by: 1. Explicitly marking the private ID APIs with "private" in their names to make it clear they are for internal use only (swap/workingset) 2. Making the public cgroup ID APIs (mem_cgroup_id/mem_cgroup_get_from_id) unconditionally available 3. Converting DAMON, LRU gen, and shrinker debugfs interfaces to use the public cgroup IDs instead of the private IDs 4. Removing the now-unused wrapper functions and renaming the public APIs for clarity After this series: - mem_cgroup_private_id() / mem_cgroup_from_private_id() are used for internal kernel objects that outlive their cgroup (swap, workingset) - mem_cgroup_id() / mem_cgroup_get_from_id() return the public cgroup ID (from cgroup_id()) for use in userspace-facing interfaces This patch (of 8): The memory cgroup maintains a private ID infrastructure decoupled from the cgroup IDs for swapout records and shadow entries. The main motivation of this private ID infra is best described in the commit 73f576c04b941 ("mm: memcontrol: fix cgroup creation failure after many small jobs"). Unfortunately some users have started exposing these private IDs to the userspace where they should have used the cgroup IDs which are already exposed to the userspace. Let's rename the memcg ID APIs to explicitly mark them private. No functional change is intended. Link: https://lkml.kernel.org/r/20251225232116.294540-1-shakeel.butt@linux.dev Link: https://lkml.kernel.org/r/20251225232116.294540-2-shakeel.butt@linux.dev Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Dave Chinner <david@fromorbit.com> Cc: David Hildenbrand <david@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: SeongJae Park <sj@kernel.org> Cc: Wei Xu <weixugc@google.com> Cc: Yuanchu Xie <yuanchu@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/page_alloc: refactor the initial compaction handlingVlastimil Babka
The initial direct compaction done in some cases in __alloc_pages_slowpath() stands out from the main retry loop of reclaim + compaction. We can simplify this by instead skipping the initial reclaim attempt via a new local variable compact_first, and handle the compact_prority as necessary to match the original behavior. No functional change intended. Link: https://lkml.kernel.org/r/20260106-thp-thisnode-tweak-v3-2-f5d67c21a193@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/mmap_lock: add vma_is_attached() helperLorenzo Stoakes
This makes it easy to explicitly check for VMA detachment, which is useful for things like asserts. Note that we intentionally do not allow this function to be available should CONFIG_PER_VMA_LOCK be set - this is because vma_assert_attached() and vma_assert_detached() are no-ops if !CONFIG_PER_VMA_LOCK, so there is no correct state for vma_is_attached() to be in if this configuration option is not specified. Therefore users elsewhere must invoke this function only after checking for CONFIG_PER_VMA_LOCK. We rework the assert functions to utilise this. Link: https://lkml.kernel.org/r/0172d3bf527ca54ba27d8bce8f8476095b241ac7.1768746221.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Barry Song <v-songbaohua@oppo.com> Cc: Chris Li <chriscli@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26mm/rmap: make anon_vma functions internalLorenzo Stoakes
The bulk of the anon_vma operations are only used by mm, so formalise this by putting the function prototypes and inlines in mm/internal.h. This allows us to make changes without having to worry about the rest of the kernel. Link: https://lkml.kernel.org/r/79ec933c3a9c8bf1f64dab253bbfdae8a01cb921.1768746221.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Barry Song <v-songbaohua@oppo.com> Cc: Chris Li <chriscli@google.com> Cc: David Hildenbrand <david@kernel.org> Cc: Harry Yoo <harry.yoo@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Rik van Riel <riel@surriel.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>