path: root/include
Age | Commit message | Author
2026-03-31 | module: add kflagstab section to vmlinux and modules | Siddharth Nayyar
This patch introduces a __kflagstab section to store symbol flags in a dedicated data structure, similar to how CRCs are handled in the __kcrctab.

The flags for a given symbol in __kflagstab will be located at the same index as the symbol's entry in __ksymtab and its CRC in __kcrctab. This design decouples the flags from the symbol table itself, allowing us to maintain a single, sorted __ksymtab. As a result, the symbol search remains an efficient, single lookup, regardless of the number of flags we add in the future.

The motivation for this change comes from the Android kernel, which uses an additional symbol flag to restrict the use of certain exported symbols by unsigned modules, thereby enhancing kernel security. This __kflagstab can be implemented as a bitmap to efficiently manage which symbols are available for general use versus those restricted to signed modules only.

This section will contain read-only data for values of kernel symbol flags in the form of an 8-bit bitset for each kernel symbol. Each bit in the bitset represents a flag value defined by the ksym_flags enumeration.

Petr Pavlu ran a small test to get a better understanding of the different section sizes resulting from this patch series. He used v6.17-rc6 together with the openSUSE x86_64 config [1], which is fairly large. The resulting vmlinux.bin (no debuginfo) had an on-disk size of 58 MiB, and included 5937 + 6589 (GPL-only) exported symbols.

The following table summarizes his measurements and calculations regarding the sizes of all sections related to exported symbols:

                     |  HAVE_ARCH_PREL32_RELOCATIONS  | !HAVE_ARCH_PREL32_RELOCATIONS
Section              | Base [B] | Ext. [B] | Sep. [B] | Base [B] | Ext. [B] | Sep. [B]
--------------------------------------------------------------------------------------
__ksymtab            |    71244 |   200416 |   150312 |   142488 |   400832 |   300624
__ksymtab_gpl        |    79068 |       NA |       NA |   158136 |       NA |       NA
__kcrctab            |    23748 |    50104 |    50104 |    23748 |    50104 |    50104
__kcrctab_gpl        |    26356 |       NA |       NA |    26356 |       NA |       NA
__ksymtab_strings    |   253628 |   253628 |   253628 |   253628 |   253628 |   253628
__kflagstab          |       NA |       NA |    12526 |       NA |       NA |    12526
--------------------------------------------------------------------------------------
Total                |   454044 |   504148 |   466570 |   604356 |   704564 |   616882
Increase to base [%] |       NA |     11.0 |      2.8 |       NA |     16.6 |      2.1

The column "HAVE_ARCH_PREL32_RELOCATIONS -> Base" contains the measured numbers. The rest of the values are calculated. The "Ext." column represents an alternative approach of extending __ksymtab to include a bitset of symbol flags, and the "Sep." column represents the approach of having a separate __kflagstab. With HAVE_ARCH_PREL32_RELOCATIONS, each kernel_symbol is 12 B in size and is extended to 16 B. With !HAVE_ARCH_PREL32_RELOCATIONS, it is 24 B, extended to 32 B. Note that this does not include the metadata needed to relocate __ksymtab*, which is freed after the initial processing.

Adding __kflagstab as a separate section has a negligible impact, as expected. When extending __ksymtab (kernel_symbol) instead, the worst case with !HAVE_ARCH_PREL32_RELOCATIONS increases the export data size by 16.6%. Note that the larger increase in size for the latter approach is due to the 4-byte alignment of the kernel_symbol data structure, instead of the 1-byte alignment of the flags bitset in __kflagstab in the former approach.

Based on the above, it was concluded that introducing __kflagstab makes sense, as the added complexity is minimal over extending kernel_symbol, and there is an overall simplification of the symbol finding logic in the module loader.

Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
[Sami: Updated commit message to include details from the cover letter.]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2026-03-31 | module: define ksym_flags enumeration to represent kernel symbol flags | Siddharth Nayyar
The core architectural issue with kernel symbol flags is our reliance on splitting the main symbol table, ksymtab. To handle a single boolean property, such as GPL-only, all exported symbols are split across two separate tables: __ksymtab and __ksymtab_gpl.

This design forces the module loader to perform a separate search on each of these tables for every symbol it needs, for vmlinux and for all previously loaded modules.

This approach is fundamentally not scalable. If we were to introduce a second flag, we would need four distinct symbol tables. For n boolean flags, this model requires an exponential growth to 2^n tables, dramatically increasing complexity.

Another consequence of this fragmentation is degraded performance. For example, a binary search on the symbol table of vmlinux, which would take only 14 comparison steps (assuming ~2^14 or 16K symbols) in a unified table, can require up to 26 steps when spread across two tables (assuming both tables have ~2^13 symbols). This performance penalty worsens as more flags are added.

To address this, this patch defines ksym_flags, an enumeration used to represent flags as a bitset, for example a flag that tells whether a symbol is GPL-only. The bitset itself is introduced in subsequent patches and will contain the values of kernel symbol flags. It will then be used to infer flag values, eliminating the need to fragment the ksymtab to separate symbols with different flag values.

Link: https://lore.kernel.org/r/20260326-kflagstab-v5-0-fa0796fe88d9@google.com
Signed-off-by: Siddharth Nayyar <sidnayyar@google.com>
Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
[Sami: Updated the commit message to explain the use case for the series.]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2026-03-31 | bpf: Fix grace period wait for tracepoint bpf_link | Kumar Kartikeya Dwivedi
Recently, tracepoints were switched from using disabled preemption (which acts as an RCU read section) to SRCU-fast when they are not faultable. This means that to do a proper grace period wait for programs running in such tracepoints, we must use SRCU's grace period wait. This is only for non-faultable tracepoints; faultable ones continue using RCU Tasks Trace.

However, bpf_link_free() currently does call_rcu() for all cases when the link is non-sleepable (hence, for tracepoints, non-faultable). Fix this by doing a call_srcu() grace period wait.

As far as RCU Tasks Trace gp -> RCU gp chaining is concerned, it is deemed unnecessary for tracepoint programs. The link and program are now accessed either under RCU Tasks Trace protection or under SRCU-fast protection.

The earlier logic of chaining both RCU Tasks Trace and RCU gp waits was meant to generalize the logic, even if it conceded an extra RCU gp wait; however, that was unnecessary for tracepoints even before this change. In practice no cost was paid, since rcu_trace_implies_rcu_gp() was always true.

Hence we need not chain any RCU gp after the SRCU gp.

For instance, in the non-faultable raw tracepoint, the RCU read section of the program in __bpf_trace_run() is enclosed in the SRCU gp; likewise, for the faultable raw tracepoint, the program is under RCU Tasks Trace protection. Hence, the outermost scope can be waited upon to ensure correctness.

Also, sleepable programs cannot be attached to non-faultable tracepoints, so whenever the program or link is sleepable, only RCU Tasks Trace protection is being used for the link and prog.

Fixes: a46023d5616e ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast")
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20260331211021.1632902-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-31 | bpf: Clarify BPF_RB_NO_WAKEUP behavior for bpf_ringbuf_discard() | Eyal Birger
Clarify bpf_ringbuf_discard() documentation for BPF_RB_NO_WAKEUP.

Discarded ring buffer records are still left in the ring buffer and are only skipped when user space consumes them. This can matter when BPF_RB_NO_WAKEUP is used: a later submit relying on adaptive wakeup might not wake the consumer, because the discarded record still needs to be consumed first.

Scenario:

  epoll_wait(rb_fd); // blocks
  rec = bpf_ringbuf_reserve(&rb, ...);
  bpf_ringbuf_discard(rec, BPF_RB_NO_WAKEUP);
  rec = bpf_ringbuf_reserve(&rb, ...);
  bpf_ringbuf_submit(rec, 0); // valid record, but no wakeup

Document this in bpf_ringbuf_discard() to make the interaction between discarded records, user-space consumption, and adaptive wakeups explicit.

Reported-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260331130612.3762433-1-eyal.birger@gmail.com
----
v2: adapt wording per feedback from Andrii.
2026-03-31 | refcount: Remove unused __signed_wrap function annotations | Kees Cook
With CONFIG_UBSAN_INTEGER_WRAP being replaced by Overflow Behavior Types, remove the __signed_wrap function annotation as it is already unused, and any future work here will use OBT annotations instead. Link: https://patch.msgid.link/20260331163725.2765789-1-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2026-03-31 | Merge tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup | Linus Torvalds
Pull cgroup fixes from Tejun Heo:

 - Fix cgroup rmdir racing with dying tasks. Deferred task cgroup unlink introduced a window where cgroup.procs is empty but the cgroup is still populated, causing rmdir to fail with -EBUSY and selftest failures. Make rmdir wait for dying tasks to fully leave and fix selftests to not depend on synchronous populated updates.

 - Fix cpuset v1 task migration failure from empty cpusets under strict security policies. When CPU hotplug removes the last CPU from a v1 cpuset, tasks must be migrated to an ancestor without a security_task_setscheduler() check that would block the migration.

* tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: Skip security check for hotplug induced v1 task migration
  cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()
  cgroup: Fix cgroup_drain_dying() testing the wrong condition
  selftests/cgroup: Don't require synchronous populated update on task exit
  cgroup: Wait for dying tasks to leave on rmdir
2026-03-31 | drm/msm/adreno: Expose a PARAM to check AQE support | Akhil P Oommen
AQE (Application Qrisc Engine) is required to support VK ray-pipeline. Two conditions should be met to use this HW:
  1. AQE firmware should be loaded and programmed
  2. Preemption support

Expose a new MSM_PARAM to allow userspace to query its support.

Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714685/
Message-ID: <20260327-a8xx-gpu-batch2-v2-17-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2026-03-31 | Merge tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs | Linus Torvalds
Pull udf fix from Jan Kara:
 "Fix for a race in UDF that can lead to memory corruption"

* tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix race between file type conversion and writeback
  mpage: Provide variant of mpage_writepages() with own optional folio handler
2026-03-31 | EDAC/mc: Use kzalloc_flex() | Rosen Penev
Convert struct mem_ctl_info to use flex array and use the new flex array helpers to enable runtime bounds checking, including annotating the array length member with __counted_by() for extra runtime analysis when requested. Move memcpy() after the counter assignment so that it is initialized before the first reference to the flex array, as the new attribute requires. [ bp: Heavily massage commit message. ] Signed-off-by: Rosen Penev <rosenp@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://patch.msgid.link/20260327024828.7377-1-rosenp@gmail.com
2026-03-31 | fwctl/bnxt_fwctl: Add bnxt fwctl device | Pavan Chebbi
Create bnxt_fwctl device. This will bind to bnxt's aux device. On the upper edge, it will register with the fwctl subsystem. It will make use of bnxt's ULP functions to send FW commands. Link: https://patch.msgid.link/r/20260314151605.932749-5-pavan.chebbi@broadcom.com Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2026-03-31 | sched/deadline: Move some utility functions to deadline.h | Gabriele Monaco
Some utility functions on sched_dl_entity can be useful outside of deadline.c, for instance for modelling, without relying on raw structure fields. Move functions like dl_task_of and dl_is_implicit to deadline.h to make them available outside. Acked-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lore.kernel.org/r/20260330111010.153663-12-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-03-31 | sched: Add deadline tracepoints | Gabriele Monaco
Add the following tracepoints:

* sched_dl_throttle(dl_se, cpu, type):
    Called when a deadline entity is throttled
* sched_dl_replenish(dl_se, cpu, type):
    Called when a deadline entity's runtime is replenished
* sched_dl_update(dl_se, cpu, type):
    Called when a deadline entity updates without throttle or replenish
* sched_dl_server_start(dl_se, cpu, type):
    Called when a deadline server is started
* sched_dl_server_stop(dl_se, cpu, type):
    Called when a deadline server is stopped

Those tracepoints can be useful to validate the deadline scheduler with RV and are not exported to tracefs.

Reviewed-by: Phil Auld <pauld@redhat.com>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/r/20260330111010.153663-11-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-03-31 | rv: Add support for per-object monitors in DA/HA | Gabriele Monaco
RV deterministic and hybrid automata currently only support global, per-cpu and per-task monitors. It isn't possible to write a model that would follow some different type of object, like a deadline entity or a lock. Define the generic per-object monitor implementation which shares part of the implementation with the per-task monitors. The user needs to provide an id for the object (e.g. pid for tasks) and define the data type for the monitor_target (e.g. struct task_struct * for tasks). Both are supplied to the event handlers, as the id may not be easily available in the target. The monitor storage (e.g. the rv monitor, pointer to the target, etc.) is stored in a hash table indexed by id. Monitor storage objects are automatically allocated unless specified otherwise (e.g. if the creation context is unsafe for allocation). Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260330111010.153663-9-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-03-31 | rv: Add Hybrid Automata monitor type | Gabriele Monaco
Deterministic automata define which events are allowed in every state, but cannot define more sophisticated constraints that take into account the system's environment (e.g. time or other states not producing events). Add the Hybrid Automata monitor type as an extension of Deterministic automata where each state transition validates a constraint on a finite number of environment variables. Hybrid automata can be used to implement timed automata, where the environment variables are clocks. Also implement the necessary functionality to handle clock constraints (ns or jiffy granularity) on states and events. Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260330111010.153663-3-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-03-31 | rv: Unify DA event handling functions across monitor types | Gabriele Monaco
The DA event handling functions are mostly duplicated because the per-task monitors need to propagate the task struct while others do not. Unify the functions, handle the difference by always passing an identifier which is the task's pid for per-task monitors but is ignored for the other types. Only keep the actual tracepoint calling separated. Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20260330111010.153663-2-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
2026-03-31 | Merge tag 'fixes' into 'for-next' | Ilpo Järvinen
Allows uniwill-laptop feature work that depends on changes in the fixes branch to proceed.
2026-03-31 | sed-opal: Add STACK_RESET command | Milan Broz
The TCG Opal device could enter a state where no new session can be created, blocking even Discovery or PSID reset. While a power cycle or waiting for the timeout should work, there is another possibility for recovery: using the Stack Reset command. The Stack Reset command is defined in the TCG Storage Architecture Core Specification and is mandatory for all Opal devices (see Section 3.3.6 of the Opal SSC specification). This patch implements the Stack Reset command. Sending it should clear all active sessions immediately, allowing subsequent commands to run successfully. While it is a TCG transport layer command, the Linux kernel implements only Opal ioctls, so it makes sense to use the IOC_OPAL ioctl interface. The Stack Reset takes no arguments; the response can be success or pending. If the command reports a pending state, userspace can try to repeat it; in this case, the code returns -EBUSY. Signed-off-by: Milan Broz <gmazyland@gmail.com> Reviewed-by: Ondrej Kozina <okozina@redhat.com> Link: https://patch.msgid.link/20260310095349.411287-1-gmazyland@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-03-31 | Merge branch 'dma-contig-for-7.1-modules-prep-v4' into dma-mapping-for-next | Marek Szyprowski
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2026-03-31 | dma: contiguous: Make dma_contiguous_default_area static | Maxime Ripard
Now that dev_get_cma_area() is no longer inline, we don't have any user of dma_contiguous_default_area outside of contiguous.c, so we can make it static. Signed-off-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260331-dma-buf-heaps-as-modules-v4-3-e18fda504419@kernel.org
2026-03-31 | dma: contiguous: Make dev_get_cma_area() a proper function | Maxime Ripard
As we try to enable dma-buf heaps, and the CMA one in particular, to compile as modules, we need to export dev_get_cma_area(). It's currently implemented as an inline function that returns either the content of device->cma_area or dma_contiguous_default_area. Thus, it means we need to export dma_contiguous_default_area, which isn't really something we want any module to have access to. Instead, let's make dev_get_cma_area() a proper function we will be able to export so we can avoid exporting dma_contiguous_default_area. Signed-off-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260331-dma-buf-heaps-as-modules-v4-2-e18fda504419@kernel.org
2026-03-31 | dma: contiguous: Turn heap registration logic around | Maxime Ripard
The CMA heap instantiation was initially developed by having the contiguous DMA code call into the CMA heap to create a new instance every time a reserved memory area is probed. Turning the CMA heap into a module would create a dependency of the kernel on a module, which doesn't work. Let's turn the logic around and do the opposite: store all the reserved memory CMA regions into the contiguous DMA code, and provide an iterator for the heap to use when it probes. Signed-off-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20260331-dma-buf-heaps-as-modules-v4-1-e18fda504419@kernel.org
2026-03-31 | fs: hide file and bfile caches behind runtime const machinery | Mateusz Guzik
s/cachep/cache/ for consistency with namei and dentry caches. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20260328173728.3388070-1-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-03-31 | crypto: algif_aead - Revert to operating out-of-place | Herbert Xu
This mostly reverts commit 72548b093ee3 except for the copying of the associated data. There is no benefit in operating in-place in algif_aead since the source and destination come from different mappings. Get rid of all the complexity added for in-place operation and just copy the AD directly. Fixes: 72548b093ee3 ("crypto: algif_aead - copy AAD from src to dst") Reported-by: Taeyang Lee <0wn@theori.io> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2026-03-31 | serdev: Add an API to find the serdev controller associated with the devicetree node | Manivannan Sadhasivam
Add of_find_serdev_controller_by_node() API to find the serdev controller device associated with the devicetree node.

Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen6 (arm64)
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260326-pci-m2-e-v7-2-43324a7866e6@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2026-03-31 | serdev: Convert to_serdev_*() helpers to macros and use container_of_const() | Manivannan Sadhasivam
If these helpers receive a 'const struct device' pointer, the const qualifier gets dropped, leading to the warning below:

  warning: passing argument 1 of ‘to_serdev_device_driver’ discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]

This is not an issue as of now, but with future commits adding serdev device based driver matching, this warning will get triggered. Hence, convert these helpers to macros so that the qualifier gets preserved, and also use container_of_const() as container_of() is deprecated.

Tested-by: Hans de Goede <johannes.goede@oss.qualcomm.com> # ThinkPad T14s gen6 (arm64)
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260326-pci-m2-e-v7-1-43324a7866e6@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
2026-03-31 | Merge tag 'drm-intel-next-2026-03-30' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next | Dave Airlie
drm/i915 feature pull #2 for v7.1:

Refactoring and cleanups:
- Refactor LT PHY PLL handling to use the DPLL framework (Mika)
- Implement display register polling and waits in display code (Ville)
- Move PCH clock gating in display PCH file (Luca)
- Add shared stepping info header for i915 and display (Jani)
- Clean up GVT I2C command decoding (Jonathan)
- NV12 plane unlinking cleanups (Ville)
- Clean up NV12 DDB/watermark handling for pre-ICL platforms (Ville)

Fixes:
- An assortment of DSI fixes (Ville)
- Handle PORT_NONE in assert_port_valid() (Jonathan)
- Fix link failure without FBDEV emulation (Arnd Bergmann)
- Quirk disable panel replay on certain Dell XPS models (Jouni)
- Check if VESA DPCD AUX backlight is possible (Suraj)

Other:
- Mailmap update for Christoph (Christoph)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
#	drivers/gpu/drm/i915/display/intel_plane.c

From: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/ac9dfdb745d5a67c519ea150a6f36f8f74b8760e@intel.com
2026-03-30 | hwmon: (ina2xx) drop unused platform data | Bartosz Golaszewski
Nobody defines struct ina2xx_platform_data. Remove platform data support from the drivers which still have it (it's effectively dead code) and remove the header. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/r/20260326-drop-ina2xx-pdata-v1-1-c159437bb2df@oss.qualcomm.com [groeck: Fixed continuation line alignment] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-03-30 | vfio: unhide vdev->debug_root | Arnd Bergmann
When debugfs is disabled, the hisilicon driver now fails to build:

drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c: In function 'hisi_acc_vfio_debug_init':
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c:1671:62: error: 'struct vfio_device' has no member named 'debug_root'
 1671 |         vfio_dev_migration = debugfs_lookup("migration", vdev->debug_root);
      |                                                              ^~

The driver otherwise relies on dead-code elimination, but this reference fails. The single struct member is not going to make much of a difference for memory consumption, so just keep this visible unconditionally.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: b398f91779b8 ("hisi_acc_vfio_pci: register debugfs for hisilicon migration driver")
Link: https://lore.kernel.org/r/20260327165521.3779707-1-arnd@kernel.org
Signed-off-by: Alex Williamson <alex@shazbot.org>
2026-03-30 | x86: rename and clean up __copy_from_user_inatomic_nocache() | Linus Torvalds
Similarly to the previous commit, this renames the somewhat confusingly named function. But in this case, it was at least less confusing: the __copy_from_user_inatomic_nocache is indeed copying from user memory, and it is indeed ok to be used in an atomic context, so it will not warn about it.

But the previous commit also removed the NTB mis-use of the __copy_from_user_inatomic_nocache() function, and as a result every call-site is now _actually_ doing a real user copy. That means that we can now do the proper user pointer verification too.

End result: add proper address checking, remove the double underscores, and change the "nocache" to "nontemporal" to more accurately describe what this x86-only function actually does. It might be worth noting that only the target is non-temporal: the actual user accesses are normal memory accesses.

Also worth noting is that on non-x86 targets (and on older 32-bit x86 CPU's before XMM2 in the Pentium III) we end up just falling back on a regular user copy, so nothing can actually depend on the non-temporal semantics, but that has always been true.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-03-31 | BackMerge tag 'v7.0-rc6' into drm-next | Dave Airlie
Linux 7.0-rc6 Requested by a few people on irc to resolve conflicts in other trees. Signed-off-by: Dave Airlie <airlied@redhat.com>
2026-03-30 | cpufreq: Add boost_freq_req QoS request | Pierre Gondois
The Power Management Quality of Service (PM QoS) framework allows aggregating constraints from multiple entities. It is currently used to manage the min/max frequency of a given policy.

Frequency constraints can come for instance from:
- Thermal framework: acpi_thermal_cpufreq_init()
- Firmware: _PPC objects: acpi_processor_ppc_init()
- User: by setting policyX/scaling_[min|max]_freq
The minimum of the max frequency constraints is used to compute the resulting maximum allowed frequency.

When enabling boost frequencies, the same frequency request object (policy->max_freq_req) that handles requests from users is used. As a result, when setting:
- scaling_max_freq
- boost
the last sysfs file used overwrites the request from the other sysfs file.

To avoid this, create a per-policy boost_freq_req to save the boost constraints instead of overwriting the last scaling_max_freq constraint.

policy_set_boost() calls the cpufreq set_boost callback. Update the newly added boost_freq_req request from there:
- whenever boost is toggled
- to cover all possible paths

In the existing .set_boost() callbacks:
- Don't update policy->max as this is done through the qos notifier cpufreq_notifier_max() which calls cpufreq_set_policy().
- Remove freq_qos_update_request() calls as the qos request is now done in policy_set_boost() and updates the new boost_freq_req

$ ## Init state
scaling_max_freq:1000000
cpuinfo_max_freq:1000000

$ echo 700000 > scaling_max_freq
scaling_max_freq:700000
cpuinfo_max_freq:1000000

$ echo 1 > ../boost
scaling_max_freq:1200000
cpuinfo_max_freq:1200000

$ echo 800000 > scaling_max_freq
scaling_max_freq:800000
cpuinfo_max_freq:1200000

$ ## Final step:
$ ## Without the patches:
$ echo 0 > ../boost
scaling_max_freq:1000000
cpuinfo_max_freq:1000000

$ ## With the patches:
$ echo 0 > ../boost
scaling_max_freq:800000
cpuinfo_max_freq:1000000

Note:
cpufreq_frequency_table_cpuinfo() updates policy->min and max from:
A.
cpufreq_boost_set_sw()
\-cpufreq_frequency_table_cpuinfo()
B.
cpufreq_policy_online()
\-cpufreq_table_validate_and_sort()
  \-cpufreq_frequency_table_cpuinfo()
Keep these updates as some drivers expect policy->min and max to be set through B.

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://patch.msgid.link/20260326204404.1401849-3-pierre.gondois@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-03-30 | rcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods | Paul E. McKenney
Now that RCU Tasks Trace is implemented in terms of SRCU-fast, the fact that each SRCU-fast grace period implies at least two RCU grace periods in turn means that each RCU Tasks Trace grace period implies at least two RCU grace periods. This commit therefore updates the documentation accordingly.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 | srcu: Fix s/they disables/they disable/ typo in srcu_read_unlock_fast() | Paul E. McKenney
Typo fix in srcu_read_unlock_fast() header comment. Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 | srcu: Fix SRCU read flavor macro comments | Paul E. McKenney
The SRCU_READ_FLAVOR_FAST and SRCU_READ_FLAVOR_FAST_UPDOWN comments need repair. The former fails to note that SRCU-fast can be used in NMI handlers, and the latter says that it goes with srcu_read_lock_fast() when it really goes with srcu_read_lock_fast_updown(). This commit therefore fixes both comments. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 | rcutorture: Add a textbook-style trivial preemptible RCU | Paul E. McKenney
This commit adds a trivial textbook implementation of preemptible RCU to rcutorture ("torture_type=trivial-preempt"), similar in spirit to the existing "torture_type=trivial" textbook implementation of non-preemptible RCU. Neither trivial RCU implementation has any value for production use; both are intended only to keep Paul honest in his introductory writings and presentations. [ paulmck: Apply kernel test robot feedback. ] Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2026-03-30 | lib/crypto: tests: Migrate ChaCha20Poly1305 self-test to KUnit | Eric Biggers
Move the ChaCha20Poly1305 test from an ad-hoc self-test to a KUnit test.

Keep the same test logic for now, just translated to KUnit.

Moving to KUnit has multiple benefits, such as:

- Consistency with the rest of the lib/crypto/ tests.

- Kernel developers familiar with KUnit, which is used kernel-wide, can quickly understand the test and how to enable and run it.

- The test will be automatically run by anyone using lib/crypto/.kunitconfig or KUnit's all_tests.config.

- Results are reported using the standard KUnit mechanism.

- It eliminates one of the few remaining back-references to crypto/ from lib/crypto/, specifically a reference to CONFIG_CRYPTO_SELFTESTS.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260327224229.137532-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-30RDMA/mana_ib: Disable RX steering on RSS QP destroyLong Li
When an RSS QP is destroyed (e.g. DPDK exit), mana_ib_destroy_qp_rss() destroys the RX WQ objects but does not disable vPort RX steering in firmware. This leaves stale steering configuration that still points to the destroyed RX objects. If traffic continues to arrive (e.g. peer VM is still transmitting) and the VF interface is subsequently brought up (mana_open), the firmware may deliver completions using stale CQ IDs from the old RX objects. These CQ IDs can be reused by the ethernet driver for new TX CQs, causing RX completions to land on TX CQs: WARNING: mana_poll_tx_cq+0x1b8/0x220 [mana] (is_sq == false) WARNING: mana_gd_process_eq_events+0x209/0x290 (cq_table lookup fails) Fix this by disabling vPort RX steering before destroying RX WQ objects. Note that mana_fence_rqs() cannot be used here because the fence completion is delivered on the CQ, which is polled by user-mode (e.g. DPDK) and not visible to the kernel driver. Refactor the disable logic into a shared mana_disable_vport_rx() in mana_en, exported for use by mana_ib, replacing the duplicate code. The ethernet driver's mana_dealloc_queues() is also updated to call this common function. Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Cc: stable@vger.kernel.org Signed-off-by: Long Li <longli@microsoft.com> Link: https://patch.msgid.link/20260325194100.1929056-1-longli@microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-30RDMA/umem: Use consistent DMA attributes when unmapping entriesLeon Romanovsky
The DMA API expects that mapping and unmapping use the same DMA attributes. The RDMA umem code did not meet this requirement, so fix the mismatch. Fixes: f03d9fadfe13 ("RDMA/core: Add weak ordering dma attr to dma mapping") Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-03-30Merge branch 'master' into rdma-nextLeon Romanovsky
Let's bring v7.0-rc6 to the -next branch, so we can merge the DMA attributes fix [1] without merge conflicts. [1] https://lore.kernel.org/all/20260323-umem-dma-attrs-v1-1-d6890f2e6a1e@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org> * master: (1688 commits) Linux 7.0-rc6 ...
2026-03-30RDMA: Properly propagate the number of CQEs as unsigned intLeon Romanovsky
Instead of checking whether the number of CQEs is negative or zero, fix the .resize_user_cq() declaration to use unsigned int. This better reflects the expected value range. The sanity check is then handled correctly in ib_uverbs. Link: https://patch.msgid.link/20260319-resize_cq-cqe-v1-1-b78c6efc1def@nvidia.com Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-03-30RDMA: Clarify that CQ resize is a user‑space verbLeon Romanovsky
The CQ resize operation is used only by uverbs. Make this explicit. Link: https://patch.msgid.link/20260318-resize_cq-type-v1-2-b2846ed18846@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-03-30RDMA/core: Remove unused ib_resize_cq() implementationLeon Romanovsky
There are no in-kernel users of the CQ resize functionality, so drop it. Link: https://patch.msgid.link/20260318-resize_cq-type-v1-1-b2846ed18846@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2026-03-30RDMA/nldev: Add dellink function pointerZhu Yanjun
Add a dellink function pointer to rdma_link_ops to allow drivers to clean up resources created during newlink. Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://patch.msgid.link/20260313023058.13020-2-yanjun.zhu@linux.dev Signed-off-by: Leon Romanovsky <leon@kernel.org>
2026-03-30Merge branch '20260125-iris-ubwc-v4-1-1ff30644ac81@oss.qualcomm.com' into drivers-for-7.1Bjorn Andersson
Merge the new helpers in UBWC driver through a topic branch, to allow them to be shared with display and video branches as well.
2026-03-30soc: qcom: ubwc: add helpers to get programmable valuesDmitry Baryshkov
Currently the database stores macrotile_mode in the data. However it can be derived from the rest of the data: it should be used for UBWC encoding >= 3.0 except for several corner cases (SM8150 and SC8180X). The ubwc_bank_spread field seems to be based on the imprecise data we had for the MDSS and DPU programming. In some cases UBWC engine inside the display controller doesn't need to program it, although bank spread is to be enabled. Bank swizzle is also currently stored as is, but it is almost standard (banks 1-3 for UBWC 1.0 and 2-3 for other versions), the only exception being Lemans (it uses only bank 3). Add helpers returning values from the config for now. They will be rewritten later, in a separate series, but having the helper now simplifies refactoring the code later. Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260125-iris-ubwc-v4-2-1ff30644ac81@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-30soc: qcom: ubwc: add helper to get min_acc lengthDmitry Baryshkov
MDSS and GPU drivers use different approaches to get min_acc length. Add a helper function that can be used by all the drivers. The helper reflects our current best guess: it blindly copies the approach adopted by the MDSS drivers, and it matches the current values selected by the GPU driver. Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Acked-by: Bjorn Andersson <andersson@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Dikshita Agarwal <dikshita.agarwal@oss.qualcomm.com> Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260125-iris-ubwc-v4-1-1ff30644ac81@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-03-30ASoC: Merge up fixesMark Brown
Merge branch 'for-7.0' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into asoc-7.1 for both ASoC and general bug fixes to support testing.
2026-03-30KVM: arm64: Allow userspace to create protected VMs when pKVM is enabledWill Deacon
Introduce a new VM type for KVM/arm64 to allow userspace to request the creation of a "protected VM" when the host has booted with pKVM enabled. For now, this feature results in a taint on first use as many aspects of a protected VM are not yet protected! Tested-by: Fuad Tabba <tabba@google.com> Tested-by: Mostafa Saleh <smostafa@google.com> Signed-off-by: Will Deacon <will@kernel.org> Link: https://patch.msgid.link/20260330144841.26181-32-will@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-03-30hvc/xen: Check console connection flagJason Andryuk
When the console out buffer is filled, __write_console() will return 0 as it cannot send any data. domU_write_console() will then spin in `while (len)` as len doesn't decrement until xenconsoled attaches. This would block a domU and nullify the parallelism of Hyperlaunch until dom0 userspace starts xenconsoled, which empties the buffer. Xen 4.21 added a connection field to the xen console page. This is set to XENCONSOLE_DISCONNECTED (1) when a domain is built, and xenconsoled will set it to XENCONSOLE_CONNECTED (0) when it connects. Update the hvc_xen driver to check the field. When the field is disconnected, drop the write with -ENOTCONN. We only drop the write when the field is XENCONSOLE_DISCONNECTED (1) to try for maximum compatibility. The Xen toolstack has historically zero initialized the console, so it should see XENCONSOLE_CONNECTED (0) by default. If an implementation used uninitialized memory, only checking for XENCONSOLE_DISCONNECTED could have the lowest chance of not connecting. This lets the hyperlaunched domU boot without stalling. Once dom0 starts xenconsoled, xl console can be used to access the domU's hvc0. Partially sync console.h from xen.git to bring in the new field. Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Link: https://patch.msgid.link/20260318235326.14568-1-jason.andryuk@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
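The write-path behaviour described above can be illustrated with a small userspace model. This is only a sketch and not the hvc_xen code: struct demo_console, demo_ring_write(), demo_backend_drain() and demo_write() are hypothetical names, the ring is reduced to a free-space counter, and checking the flag before each write attempt is an assumption. Only the two XENCONSOLE_* values and the -ENOTCONN result are taken from the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Values as stated in the commit message. */
#define XENCONSOLE_CONNECTED    0
#define XENCONSOLE_DISCONNECTED 1

/* Hypothetical stand-in for the shared console page. */
struct demo_console {
	uint8_t connection; /* XENCONSOLE_* value */
	size_t free;        /* free bytes left in the output ring */
	size_t capacity;    /* total ring size */
};

/* Copy as much as fits; returns 0 when the ring is full. */
static size_t demo_ring_write(struct demo_console *c, size_t len)
{
	size_t n = len < c->free ? len : c->free;

	c->free -= n;
	return n;
}

/* Models an attached backend emptying the ring. */
static void demo_backend_drain(struct demo_console *c)
{
	c->free = c->capacity;
}

/* Returns the number of bytes written, or -ENOTCONN if the write is dropped. */
static int demo_write(struct demo_console *c, size_t len)
{
	size_t total = len;

	while (len) {
		size_t sent;

		/* Only an explicit DISCONNECTED drops the write; any other
		 * value (including zero-initialised pages) is treated as
		 * connected. */
		if (c->connection == XENCONSOLE_DISCONNECTED)
			return -ENOTCONN;

		sent = demo_ring_write(c, len);
		len -= sent;
		if (sent == 0)
			demo_backend_drain(c); /* connected: the backend makes room */
	}
	return (int)total;
}
```

Without the connection check, a full ring with no backend attached would keep the loop spinning with `len` unchanged, which is the stall the commit describes.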
2026-03-30lib/linear_ranges: Add linear_range_get_selector_high_arrayAmit Sunil Dhamne
Add a helper function to find the selector for a given value in a linear range array. The selector is chosen so that the value it represents is higher than or equal to the given value. Signed-off-by: Amit Sunil Dhamne <amitsd@google.com> Reviewed-by: Matti Vaittinen <mazziesaccount@gmail.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://patch.msgid.link/20260325-max77759-charger-v9-4-4486dd297adc@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
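The lookup described above can be sketched in userspace as follows. This is an illustration only and makes no claim about the signature or return convention of the kernel helper: struct demo_range is a stand-in modelled on the fields of the kernel's struct linear_range, the function names are hypothetical, and the array is assumed to be sorted by ascending value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in modelled on the kernel's struct linear_range. */
struct demo_range {
	unsigned int min;     /* value represented by min_sel */
	unsigned int min_sel; /* lowest selector of the range */
	unsigned int max_sel; /* highest selector of the range */
	unsigned int step;    /* value increment per selector, 0 for a constant range */
};

static unsigned int demo_range_max(const struct demo_range *r)
{
	return r->min + (r->max_sel - r->min_sel) * r->step;
}

/* Smallest selector in one range whose value is >= val, if any. */
static bool demo_selector_high(const struct demo_range *r, unsigned int val,
			       unsigned int *sel)
{
	if (val > demo_range_max(r))
		return false;
	if (val <= r->min || r->step == 0) {
		*sel = r->min_sel;
		return true;
	}
	/* Round up to the next step boundary. */
	*sel = r->min_sel + (val - r->min + r->step - 1) / r->step;
	return true;
}

/* Walk the ranges in order; returns 0 on success, -1 if val is too high. */
static int demo_selector_high_array(const struct demo_range *r, size_t n,
				    unsigned int val, unsigned int *sel)
{
	for (size_t i = 0; i < n; i++)
		if (demo_selector_high(&r[i], val, sel))
			return 0;
	return -1;
}
```

With two ranges covering 500..800 in steps of 100 (selectors 0..3) and 1000..1750 in steps of 250 (selectors 4..7), a request for 650 rounds up to selector 2 (700), and a request for 900 falls in the gap and resolves to selector 4 (1000).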