summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2026-06-02rust/drm: Introduce DeviceContextLyude Paul
One of the tricky things about DRM bindings in Rust is the fact that initialization of a DRM device is a multi-step process. It's quite normal for a device driver to start making use of its DRM device for tasks like creating GEM objects before userspace registration happens. This is an issue in rust though, since prior to userspace registration the device is only partly initialized. This means there's a plethora of DRM device operations we can't yet expose without opening up the door to UB if the DRM device in question isn't yet registered. Additionally, this isn't something we can reliably check at runtime. And even if we could, performing an operation which requires the device be registered when the device isn't actually registered is a programmer bug, meaning there's no real way to gracefully handle such a mistake at runtime. And even if that wasn't the case, it would be horrendously annoying and noisy to have to check if a device is registered constantly throughout a driver. In order to solve this, we first take inspiration from `kernel::device::DeviceContext` and introduce `kernel::drm::DeviceContext`. This provides us with a ZST type that we can generalize over to represent contexts where a device is known to have been registered with userspace at some point in time (`Registered`), along with contexts where we can't make such a guarantee (`Uninit`). It's important to note we intentionally do not provide a `DeviceContext` which represents an unregistered device. This is because there's no reasonable way to guarantee that a device with long-living references to itself will not be registered eventually with userspace. Instead, we provide a new-type for this: `UnregisteredDevice` which can provide a guarantee that the `Device` has never been registered with userspace. To ensure this, we modify `Registration` so that creating a new `Registration` requires passing ownership of an `UnregisteredDevice`. Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com> Link: https://patch.msgid.link/20260507220044.3204919-2-lyude@redhat.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-06-02timers/migration: Turn tmigr_hierarchy level_list into a flexible arrayRosen Penev
The level_list array is allocated separately right after the parent struct. The size of the array is already known. Move level_list to the struct tail as a flexible array member and fold the two allocations into a single kzalloc_flex(). Signed-off-by: Rosen Penev <rosenp@gmail.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Assisted-by: Claude:Opus-4.7 Link: https://patch.msgid.link/20260522231618.41622-1-rosenp@gmail.com
2026-06-02timers/migration: Deactivate per-capacity hierarchies under nohz_fullFrederic Weisbecker
NOHZ_FULL CPUs global timers are guaranteed to be handled by the timekeeper CPU, which never stops its tick and therefore remains active in the hierarchy. But since the introduction of per-capacity hierarchies, this guarantee is broken because the timekeeper may not belong to the same hierarchy as all the NOHZ_FULL CPUs. Fix it with simply turning off capacity awareness when NOHZ_FULL is running and force a single hierarchy. NOHZ_FULL is not exactly optimized powerwise anyway. Fixes: 098cbaad8e57 ("timers/migration: Split per-capacity hierarchies") Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260519220926.63437-3-frederic@kernel.org
2026-06-02timers/migration: Fix hotplug migrator selection target on asymetric ↵Frederic Weisbecker
capacity machines When a top-level migrator is deactivated, either at CPU down hotplug time or when a CPU is domain isolated, a new migrator is elected among the available CPUs and woken up to take over the migration duty. However that election must happen at the scope of a given hierarchy and not globally, which the introduction of per-capacity hierarchies failed to handle. As a result a given hierarchy may end up without migrator to handle global timers. Fix it by making sure that the new migrator belongs to the same hierarchy as the outgoing CPU. Fixes: 098cbaad8e57 ("timers/migration: Split per-capacity hierarchies") Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260519220926.63437-2-frederic@kernel.org
2026-06-02sched/cputime: Handle dyntick-idle steal time correctlyrefs/merge-window/a491e1430a4fbfab275a200b4acfa057f0aab44dFrederic Weisbecker
The dyntick-idle steal time is currently accounted when the tick restarts but the stolen idle time is not subtracted from the idle time that was already accounted. This is to avoid observing the idle time going backward as the dyntick-idle cputime accessors can't reliably know in advance the stolen idle time. In order to maintain a forward progressing idle cputime while subtracting idle steal time from it, keep track of the previously accounted idle stolen time and substract it from _later_ idle cputime accounting. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-16-frederic@kernel.org
2026-06-02sched/cputime: Handle idle irqtime gracefullyFrederic Weisbecker
The dyntick-idle cputime accounting always assumes that interrupt time accounting is enabled and consequently stops elapsing the idle time during dyntick-idle interrupts. This doesn't mix up well with disabled interrupt time accounting because then idle interrupts become a cputime blind-spot. Also this feature is disabled on most configurations and the overhead of pausing dyntick-idle accounting while in idle interrupts could then be avoided. Fix the situation with conditionally pausing dyntick-idle accounting during idle interrupts only iff either native vtime (which does interrupt time accounting) or generic interrupt time accounting are enabled. Also make sure that the accumulated interrupt time is not accidentally substracted from later accounting. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-15-frederic@kernel.org
2026-06-02sched/cputime: Provide get_cpu_[idle|iowait]_time_us() off-caseFrederic Weisbecker
The last reason why get_cpu_idle/iowait_time_us() may return -1 now is if the config doesn't support nohz. The ad-hoc replacement solution by cpufreq is to compute jiffies minus the whole busy cputime. Although the intention should provide a coherent low resolution estimation of the idle and iowait time, the implementation is buggy because jiffies don't start at 0. Just provide instead a real get_cpu_[idle|iowait]_time_us() offcase. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-14-frederic@kernel.org
2026-06-02tick/sched: Consolidate idle time fetching APIsFrederic Weisbecker
Fetching the idle cputime is available through a variety of accessors all over the place depending on the different accounting flavours and needs: - idle vtime generic accounting can be accessed by kcpustat_field(), kcpustat_cpu_fetch(), get_idle/iowait_time() and get_cpu_idle/iowait_time_us() - dynticks-idle accounting can only be accessed by get_idle/iowait_time() or get_cpu_idle/iowait_time_us() - CONFIG_NO_HZ_COMMON=n idle accounting can be accessed by kcpustat_field() kcpustat_cpu_fetch(), or get_idle/iowait_time() but not by get_cpu_idle/iowait_time_us() Moreover get_idle/iowait_time() relies on get_cpu_idle/iowait_time_us() with a non-sensical conversion to microseconds and back to nanoseconds on the way. Start consolidating the APIs with removing get_idle/iowait_time() and make kcpustat_field() and kcpustat_cpu_fetch() work for all cases. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-13-frederic@kernel.org
2026-06-02tick/sched: Account tickless idle cputime only when tick is stoppedFrederic Weisbecker
There is no real point in switching to dyntick-idle cputime accounting mode if the tick is not actually stopped. This just adds overhead, notably fetching the GTOD, on each idle exit and each idle IRQ entry for no reason during short idle trips. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-12-frederic@kernel.org
2026-06-02tick/sched: Remove unused fieldsFrederic Weisbecker
Remove fields after the dyntick-idle cputime migration to scheduler code. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-11-frederic@kernel.org
2026-06-02tick/sched: Move dyntick-idle cputime accounting to cputime codeFrederic Weisbecker
Although the dynticks-idle cputime accounting is necessarily tied to the tick subsystem, the actual related accounting code has no business residing there and should be part of the scheduler cputime code. Move away the relevant pieces and state machine to where they belong. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-10-frederic@kernel.org
2026-06-02tick/sched: Remove nohz disabled special case in cputime fetchFrederic Weisbecker
Even when nohz is not runtime enabled, the dynticks idle cputime accounting can run and the common idle cputime accessors are still relevant. Remove the nohz disabled special case accordingly. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-9-frederic@kernel.org
2026-06-02tick/sched: Unify idle cputime accountingFrederic Weisbecker
The non-vtime dynticks-idle cputime accounting is a big mess that accumulates within two concurrent statistics, each having their own shortcomings: * The accounting for online CPUs which is based on the delta between tick_nohz_start_idle() and tick_nohz_stop_idle(). Pros: - Works when the tick is off - Has nsecs granularity Cons: - Account idle steal time but doesn't substract it from idle cputime. - Assumes CONFIG_IRQ_TIME_ACCOUNTING by not accounting IRQs but the IRQ time is simply ignored when CONFIG_IRQ_TIME_ACCOUNTING=n - The windows between 1) idle task scheduling and the first call to tick_nohz_start_idle() and 2) idle task between the last tick_nohz_stop_idle() and the rest of the idle time are blindspots wrt. cputime accounting (though mostly insignificant amount) - Relies on private fields outside of kernel stats, with specific accessors. * The accounting for offline CPUs which is based on ticks and the jiffies delta during which the tick was stopped. Pros: - Handles steal time correctly - Handle CONFIG_IRQ_TIME_ACCOUNTING=y and CONFIG_IRQ_TIME_ACCOUNTING=n correctly. - Handles the whole idle task - Accounts directly to kernel stats, without midlayer accumulator. Cons: - Doesn't elapse when the tick is off, which doesn't make it suitable for online CPUs. - Has TICK_NSEC granularity (jiffies) - Needs to track the dyntick-idle ticks that were accounted and substract them from the total jiffies time spent while the tick was stopped. This is an ugly workaround. Having two different accounting for a single context is not the only problem: since those accountings are of different natures, it is possible to observe the global idle time going backward after a CPU goes offline. Clean up the situation with introducing a hybrid approach that stays coherent and works for both online and offline CPUs: * Tick based or native vtime accounting operate before the idle loop is entered and resume once the idle loop prepares to exit. * When the idle loop starts, switch to dynticks-idle accounting as is done currently, except that the statistics accumulate directly to the relevant kernel stat fields. * Private dyntick cputime accounting fields are removed. * Works on both online and offline case. Further improvement will include: * Only switch to dynticks-idle cputime accounting when the tick actually goes in dynticks mode. * Handle CONFIG_IRQ_TIME_ACCOUNTING=n correctly such that the dynticks-idle accounting still elapses while on IRQs. * Correctly substract idle steal cputime from idle time Reported-by: Xin Zhao <jackzxcui1989@163.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-8-frederic@kernel.org
2026-06-02s390/time: Prepare to stop elapsing in dynticks-idleFrederic Weisbecker
Currently the tick subsystem stores the idle cputime accounting in private fields, allowing cohabitation with architecture idle vtime accounting. The former is fetched on online CPUs, the latter on offline CPUs. For consolidation purposes, architecture vtime accounting will continue to account the cputime but will make a break when the idle tick is stopped. The dyntick cputime accounting will then be relayed by the tick subsystem so that the idle cputime is still seen advancing coherently even when the tick isn't there to flush the idle vtime. Prepare for that and introduce three new APIs which will be used in subsequent patches: - vtime_dynticks_start() is deemed to be called when idle enters in dyntick mode. The idle cputime that elapsed so far is accumulated and accounted. Also idle time accounting is ignored. - vtime_dynticks_stop() is deemed to be called when idle exits from dyntick mode. The vtime entry clocks are fast-forward to current time so that idle accounting restarts elapsing from now. Also idle time accounting is resumed. - vtime_reset() is deemed to be called from dynticks idle IRQ entry to fast-forward the clock to current time so that the IRQ time is still accounted by vtime while nohz cputime is paused. Also accumulated vtime won't be flushed from dyntick-idle ticks to avoid accounting twice the idle cputime, along with nohz accounting. Co-developed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-7-frederic@kernel.org
2026-06-02powerpc/time: Prepare to stop elapsing in dynticks-idleFrederic Weisbecker
Currently the tick subsystem stores the idle cputime accounting in private fields, allowing cohabitation with architecture idle vtime accounting. The former is fetched on online CPUs, the latter on offline CPUs. For consolidation purpose, architecture vtime accounting will continue to account the cputime but will make a break when the idle tick is stopped. The dyntick cputime accounting will then be relayed by the tick subsystem so that the idle cputime is still seen advancing coherently even when the tick isn't there to flush the idle vtime. Prepare for that and introduce three new APIs which will be used in subsequent patches: - vtime_dynticks_start() is deemed to be called when idle enters in dyntick mode. The idle cputime that elapsed so far is accumulated. - vtime_dynticks_stop() is deemed to be called when idle exits from dyntick mode. The vtime entry clocks are fast-forward to current time so that idle accounting restarts elapsing from now. - vtime_reset() is deemed to be called from dynticks idle IRQ entry to fast-forward the clock to current time so that the IRQ time is still accounted by vtime while nohz cputime is paused. Also accumulated vtime won't be flushed from dyntick-idle ticks to avoid accounting twice the idle cputime, along with nohz accounting. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-6-frederic@kernel.org
2026-06-02sched/cputime: Correctly support generic vtime idle timeFrederic Weisbecker
Currently whether generic vtime is running or not, the idle cputime is fetched from the nohz accounting. However generic vtime already does its own idle cputime accounting. Only the kernel stat accessors are not plugged to support it. Read the idle generic vtime cputime when it's running, this will allow to later more clearly split nohz and vtime cputime accounting. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-5-frederic@kernel.org
2026-06-02sched/cputime: Remove superfluous and error prone kcpustat_field() parameterFrederic Weisbecker
The first parameter to kcpustat_field() is a pointer to the cpu kcpustat to be fetched from. This parameter is error prone because a copy to a kcpustat could be passed by accident instead of the original one. Also the kcpustat structure can already be retrieved with the help of the mandatory CPU argument. Remove the needless parameter. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-4-frederic@kernel.org
2026-06-02sched/idle: Handle offlining first in idle loopFrederic Weisbecker
Offline handling happens from within the inner idle loop, after the beginning of dyntick cputime accounting, nohz idle load balancing and TIF_NEED_RESCHED polling. This is not necessary and even buggy because: * There is no dyntick handling to do. And calling tick_nohz_idle_enter() messes up with the struct tick_sched reset that was performed on tick_sched_timer_dying(). * There is no nohz idle balancing to do. * Polling on TIF_RESCHED is irrelevant at this stage, there are no more tasks allowed to run. * No need to check if need_resched() before offline handling since stop_machine is done and all per-cpu kthread should be done with their job. Therefore move the offline handling at the beginning of the idle loop. This will also ease the idle cputime unification later by not elapsing idle time while offline through the call to: tick_nohz_idle_enter() -> tick_nohz_start_idle() Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com> Link: https://patch.msgid.link/20260508131647.43868-3-frederic@kernel.org
2026-06-02tick/sched: Fix TOCTOU in nohz idle time fetchFrederic Weisbecker
When the nohz idle time is fetched, the current clock timestamp is taken outside the seqcount, which can result in a race as reported by Sashiko: get_cpu_sleep_time_us() tick_nohz_start_idle() ----------------------- --------------------- now = ktime_get() write_seqcount_begin(idle_sleeptime_seq); idle_entrytime = ktime_get() tick_sched_flag_set(ts, TS_FLAG_IDLE_ACTIVE); write_seqcount_end(&ts->idle_sleeptime_seq); read_seqcount_begin(idle_sleeptime_seq) delta = now - idle_entrytime); //!! But now < idle_entrytime idle = *sleeptime + delta; read_seqcount_retry(&ts->idle_sleeptime_seq, seq) Here the read side fetches the timestamp before the write side and its update. As a result the time delta computed on the read side is negative (ktime_t is signed) and breaks the cputime monotonicity guarantee. This could possibly be fixed with reading the current clock timestamp inside the seqcount but the reader overhead might then increase. Also simply checking that the current timestamp is above the idle entry time is enough to prevent any issue of the like. Fixes: 620a30fa0bd1 ("timers/nohz: Protect idle/iowait sleep time under seqcount") Reported-by: Sashiko Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260508131647.43868-2-frederic@kernel.org
2026-06-02net: garp: fix unsigned integer underflow in garp_pdu_parse_attrYizhou Zhao
The receive-side GARP attribute parser computes dlen with reversed operands: dlen = sizeof(*ga) - ga->len; ga->len is the on-wire attribute length and includes the GARP attribute header. For normal attributes with data, ga->len is larger than sizeof(*ga), so the subtraction underflows in unsigned arithmetic. The resulting value is later passed to garp_attr_lookup(), whose length argument is u8. After truncation, the parsed data length usually no longer matches the length stored for locally registered attributes, so received Join/Leave events are ignored. This breaks the GARP receive path for common attributes, such as GVRP VLAN registration attributes. Compute the data length as the attribute length minus the header length. Fixes: eca9ebac651f ("net: Add GARP applicant-only participant") Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn> Reported-by: Ao Wang <wangao@seu.edu.cn> Reported-by: Xuewei Feng <fengxw06@126.com> Reported-by: Qi Li <qli01@tsinghua.edu.cn> Reported-by: Ke Xu <xuke@tsinghua.edu.cn> Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260527083200.42861-1-zhaoyz24@mails.tsinghua.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02selftests: drv-net: tso: add new tests for ip6tnl, ipip, and sit tunnelsDaniel Zahka
Add new tunnel test cases for ip6tnl, ipip, and sit. ip6tnl supports ipv[46] as inner l3 header, and the other two tunnels only support a single inner l3 type. Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20260529-tso-tunnels-v1-1-3771ee9eaaa9@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02time: Fix off-by-one in settimeofday() usec validationNaveen Kumar Chaudhary
The validation check uses '>' instead of '>=' when comparing tv_usec against USEC_PER_SEC, allowing the value 1000000 through. After conversion to nanoseconds (*= 1000), this produces tv_nsec == NSEC_PER_SEC, violating the timespec invariant that tv_nsec must be less than NSEC_PER_SEC. Use '>=' to reject tv_usec values that are not in the valid range of 0 to 999999. Fixes: 5e0fb1b57bea ("y2038: time: avoid timespec usage in settimeofday()") Signed-off-by: Naveen Kumar Chaudhary <naveen.osdev@gmail.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Acked-by: John Stultz <jstultz@google.com> Link: https://patch.msgid.link/4rikk44zew3s6577dugmx4jyblz7o5c57niuap6ct3td5yfm6w@gh7pcumg7qor
2026-06-02clockevents: Fix duplicate type specifier in stub function parameterNaveen Kumar Chaudhary
The stub for arch_inlined_clockevent_set_next_coupled() has 'u64 u64 cycles' in its parameter list. Since u64 is a typedef, the compiler parses the second 'u64' as the parameter name, making 'cycles' an unused token. Remove the duplicate so the parameter is correctly named. Fixes: 89f951a1e8ad ("clockevents: Provide support for clocksource coupled comparators") Signed-off-by: Naveen Kumar Chaudhary <naveen.osdev@gmail.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/7tostpvxzdn6tobmyow63a5rweatls5kux3scqp2vzhe7mv6uq@ecr746b4hyhf
2026-06-02ACPI: button: Switch over to devres-based resource managementRafael J. Wysocki
Switch over the ACPI button driver to devres-based resource management by making the following changes: * Use devm_kzalloc() for allocating button object memory. * Use devm_input_allocate_device() for allocating the input class device object. * Turn acpi_lid_remove_fs() into a devm cleanup action added by devm_acpi_lid_add_fs() which is a new wrapper around acpi_lid_add_fs(). * Add devm_acpi_button_init_wakeup() for initializing the wakeup source and make it add a custom devm action that will automatically remove the wakeup source registered by it. * Turn acpi_button_remove_event_handler() into a devm cleanup action added by devm_acpi_button_add_event_handler() which is a new wrapper around acpi_button_add_event_handler(). No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2283436.Mh6RI2rZIc@rafael.j.wysocki [ rjw: Rebased and removed unnecessary input device parent assignment ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-06-02ACPI: button: Reorganize installing and removing event handlersRafael J. Wysocki
To facilitate subsequent changes, move the code installing and removing button event handlers into two separate functions called acpi_button_add_event_handler() and acpi_button_remove_event_handler(), respectively, and rearrange it to reduce code duplication. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2714170.Lt9SDvczpP@rafael.j.wysocki
2026-06-02ACPI: button: Use string literals for generating netlink messagesRafael J. Wysocki
Instead of storing strings that never change later under acpi_device_class(device) and using them for generating netlink messages, use pointers to string literals with the same content. This also allows the clearing of the acpi_device_class(device) area during driver removal and in the probe rollback path to be dropped. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/2070791.usQuhbGJ8B@rafael.j.wysocki
2026-06-02ACPI: button: Clean up adding and removing lid procfs interfaceRafael J. Wysocki
The procfs interface is only used with lid devices which only becomes clear after looking into the function bodies of acpi_button_add_fs() and acpi_button_remove_fs(). Moreover, the only error code returned by the former of these functions is -ENODEV, so the ret local variable in it is redundant, and the return type of the latter one can be changed to void. Accordingly, rename these functions to acpi_button_add_fs() and acpi_button_remove_fs(), respectively, move the button->type checks against ACPI_BUTTON_TYPE_LID from them to their callers, and make code simplifications as per the above. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/1869050.VLH7GnMWUR@rafael.j.wysocki
2026-06-02ACPI: button: Merge two switch () statements in acpi_button_probe()Rafael J. Wysocki
Two switch () statements in acpi_button_probe() operate on the same value and the statements between them can be reordered with respect to the second one, so merge them. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3352815.5fSG56mABF@rafael.j.wysocki
2026-06-02ACPI: button: Drop redundant variable from acpi_button_probe()Rafael J. Wysocki
Local char pointer called "name" in acpi_button_probe() is redundant because its value can be assigned directly to input->name and the latter can be used in the only other place where "name" is read, so get rid of it. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3706239.iIbC2pHGDl@rafael.j.wysocki
2026-06-02ACPI: button: Rework device verification during probeRafael J. Wysocki
Instead of manually comparing the primary ID of the device (retuned by _HID) with each of the device IDs supported by the driver, use acpi_match_acpi_device() (which includes the ACPI companion device pointer check against NULL) and store the ACPI button type as driver_data in button_device_ids[], which allows a multi-branch conditional statement to be replaced with a switch () one. However, to continue preventing successful probing of devices that only have one of the supported device IDs in their _CID lists, compare the matched device ID with the primary ID of the device and return an error if they don't match. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/7960518.EvYhyI6sBW@rafael.j.wysocki [ rjw: Fixed button memory leak on probe failure ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-06-02ntsync: Honour caller's time namespace for absolute MONOTONIC timeoutsMaoyi Xie
ntsync_schedule() takes the absolute timeout from userspace and hands it to schedule_hrtimeout_range_clock() with HRTIMER_MODE_ABS. For the default CLOCK_MONOTONIC path, it does not call timens_ktime_to_host() first. A process inside a CLOCK_MONOTONIC time namespace computes the absolute timeout in its own clock view. The kernel reads the same value against the host clock. The two differ by the namespace offset. The timeout then fires too early or too late. Other users of absolute timeouts run the ktime through timens_ktime_to_host() before starting the hrtimer. ntsync was added later and missed that step. /dev/ntsync is mode 0666. Any user inside a time namespace that can open it is affected. The visible effect is wrong timeout behaviour for Wine in a container that sets a CLOCK_MONOTONIC offset. Reproducer: unshare --user --time, set the monotonic offset to -10s, issue NTSYNC_IOC_WAIT_ANY with a 100 ms absolute MONOTONIC timeout. The baseline run elapses about 100 ms. The run inside the namespace elapses about 0 ms. Apply timens_ktime_to_host() to the parsed timeout when the caller did not set NTSYNC_WAIT_REALTIME. The helper does nothing in the initial time namespace, so the fast path is unchanged. Fixes: b4a7b5fe3f51 ("ntsync: Introduce NTSYNC_IOC_WAIT_ANY.") Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://patch.msgid.link/20260528063311.3300393-3-maoyixie.tju@gmail.com
2026-06-02time/namespace: Export init_time_ns and do_timens_ktime_to_host()Maoyi Xie
timens_ktime_to_host() in compares the current time namespace against init_time_ns for the fast path. It calls do_timens_ktime_to_host() for the offset case. Both symbols are needed at link time by any caller of the inline. All current callers are builtin, but ntsync can be built as module, which prevents it from using it. Export both with EXPORT_SYMBOL_GPL. Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260528063311.3300393-2-maoyixie.tju@gmail.com
2026-06-02hsr: Remove WARN_ONCE() in hsr_addr_is_self().Kuniyuki Iwashima
syzbot reported the warning [0] in hsr_addr_is_self(), whose assumption is simply wrong. hsr->self_node is cleared in hsr_del_self_node(), which is called from hsr_dellink(). Since dev->rtnl_link_ops->dellink() is called before unregister_netdevice_many(), there is a window when user can find the device but without hsr->self_node. Let's remove WARN_ONCE() in hsr_addr_is_self(). [0]: HSR: No self node WARNING: net/hsr/hsr_framereg.c:39 at hsr_addr_is_self+0x211/0x3f0 net/hsr/hsr_framereg.c:39, CPU#0: syz.4.16848/17220 Modules linked in: CPU: 0 UID: 0 PID: 17220 Comm: syz.4.16848 Tainted: G L syzkaller #0 PREEMPT_{RT,(full)} Tainted: [L]=SOFTLOCKUP Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026 RIP: 0010:hsr_addr_is_self+0x211/0x3f0 net/hsr/hsr_framereg.c:39 Code: 33 2f 41 0f b7 dd 89 ee 09 de 31 ff e8 c8 b4 c6 f6 09 dd 74 54 e8 0f b0 c6 f6 31 ed eb 53 e8 06 b0 c6 f6 48 8d 3d 2f 50 9c 04 <67> 48 0f b9 3a 31 ed eb 42 e8 c1 13 1f 00 89 c5 31 ff 89 c6 e8 96 RSP: 0018:ffffc900041c70e0 EFLAGS: 00010283 RAX: ffffffff8afdc6ca RBX: ffffffff8afdc4e6 RCX: 0000000000080000 RDX: ffffc90010493000 RSI: 0000000000000948 RDI: ffffffff8f9a1700 RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 R10: ffffc900041c71e8 R11: fffff52000838e3f R12: dffffc0000000000 R13: ffff888041f9e3c0 R14: ffff888086ee3802 R15: 0000000000000000 FS: 00007f6fe985d6c0(0000) GS:ffff888126176000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f80bd437dac CR3: 0000000025096000 CR4: 00000000003526f0 DR0: ffffffffffffffff DR1: 00000000000001f8 DR2: 0000000000000002 DR3: ffffffffefffff15 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: <TASK> check_local_dest net/hsr/hsr_forward.c:592 [inline] fill_frame_info net/hsr/hsr_forward.c:728 [inline] hsr_forward_skb+0xa11/0x2a80 net/hsr/hsr_forward.c:739 hsr_dev_xmit+0x253/0x370 net/hsr/hsr_device.c:236 __netdev_start_xmit include/linux/netdevice.h:5368 [inline] netdev_start_xmit include/linux/netdevice.h:5377 [inline] xmit_one net/core/dev.c:3888 [inline] dev_hard_start_xmit+0x2df/0x860 net/core/dev.c:3904 __dev_queue_xmit+0x1428/0x3900 net/core/dev.c:4870 neigh_output include/net/neighbour.h:556 [inline] ip_finish_output2+0xcec/0x10b0 net/ipv4/ip_output.c:237 ip_send_skb net/ipv4/ip_output.c:1510 [inline] ip_push_pending_frames+0x8b/0x110 net/ipv4/ip_output.c:1530 raw_sendmsg+0x1547/0x1a50 net/ipv4/raw.c:659 sock_sendmsg_nosec net/socket.c:787 [inline] __sock_sendmsg net/socket.c:802 [inline] ____sys_sendmsg+0x7da/0x9c0 net/socket.c:2698 ___sys_sendmsg+0x2a5/0x360 net/socket.c:2752 __sys_sendmsg net/socket.c:2784 [inline] __do_sys_sendmsg net/socket.c:2789 [inline] __se_sys_sendmsg net/socket.c:2787 [inline] __x64_sys_sendmsg+0x1c3/0x2a0 net/socket.c:2787 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x15f/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f6feb62ce59 Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f6fe985d028 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f6feb8a6090 RCX: 00007f6feb62ce59 RDX: 0000000000000000 RSI: 0000200000000000 RDI: 0000000000000004 RBP: 00007f6feb6c2d6f R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f6feb8a6128 R14: 00007f6feb8a6090 R15: 00007ffcf01cc488 </TASK> Fixes: f266a683a480 ("net/hsr: Better frame dispatch") Reported-by: syzbot+652670cf249077eb498b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6a1a861e.b111c304.35cd64.0016.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20260530064300.340793-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02bpf: Silence unused-but-set-variable warning in bpf_for_each_reg_in_vstate_maskAmery Hung
The macro requires callers to pass a stack variable, but not all callbacks use it. Add (void)__stack to suppress the clang W=1 warning. Signed-off-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20260602175204.624401-1-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-06-02Merge tag 'batadv-next-pullrequest-20260601' of https://git.open-mesh.org/batadvJakub Kicinski
Simon Wunderlich says: ==================== This batman-adv cleanup patchset includes the following patches, all by Sven Eckelmann: - drop batman-adv specific version - MAINTAINERS housekeeping for batman-adv (two patches) - add missing includes - use atomic_xchg() for gw.reselect check - extract netdev wifi detection information object - replace inappropriate atomic access with (READ|WRITE)_ONCE (six patches) - tt: replace open-coded overflow check with helper - tvlv: avoid unnecessary OGM buffer reallocations - use neigh_node's orig_node only as id * tag 'batadv-next-pullrequest-20260601' of https://git.open-mesh.org/batadv: batman-adv: use neigh_node's orig_node only as id batman-adv: tvlv: avoid unnecessary OGM buffer reallocations batman-adv: tt: replace open-coded overflow check with helper batman-adv: replace non-atomic last_ttvn with (READ|WRITE)_ONCE batman-adv: replace non-atomic packet_size_max with (READ|WRITE)_ONCE batman-adv: replace non-atomic mesh state with (READ|WRITE)_ONCE batman-adv: replace non-atomic vlan config fields with (READ|WRITE)_ONCE batman-adv: replace non-atomic hardif config fields with (READ|WRITE)_ONCE batman-adv: replace non-atomic meshif config fields with (READ|WRITE)_ONCE batman-adv: extract netdev wifi detection information object batman-adv: use atomic_xchg() for gw.reselect check batman-adv: add missing includes MAINTAINERS: Don't send batman-adv patches to netdev MAINTAINERS: Rename batman-adv T(ree) batman-adv: drop batman-adv specific version ==================== Link: https://patch.msgid.link/20260601123629.707089-1-sw@simonwunderlich.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02Merge tag 'nf-26-06-01' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for net: 1) Fix splat with PREEMPT_RCU because smp_processor_id() in nfqueue, from Fernando Fernandez Mancera. 2) Fix possible use of pointer to old IPVS scheduler after RCU grace period when editing service, from Julian Anastasov. 3) Fix possible forever RCU walk over rt->fib6_siblings in nft_fib6, if rt is unlinked mid-iteration, apparently same issue happens in the fib6 core. From Jiayuan Chen. 4) Add mutex to guard refcount in synproxy infrastructure, since concurrent hook {un}registration can happen. From Fernando Fernandez Mancera. 5) Bail out if IRC conntrack helper fails to parse a command, do not try parsing using other command handlers, from Florian Westphal. This fixes a possible out-of-bound read. 6) Possible use-after-free in nft_tunnel by releasing template dst after all references has been dropped, from Tristan Madani. 7) Ignore conntrack template in nft_ct, from Jiayuan Chen. 8) Missing skb_ensure_writable() in ebt_snat, Yiming Qian. 9) Remove multi-register byteorder support, this allows for kernel stack info leak, from Florian Westphal. * tag 'nf-26-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_byteorder: remove multi-register support netfilter: bridge: make ebt_snat ARP rewrite writable netfilter: nft_ct: bail out on template ct in get eval netfilter: nft_tunnel: fix use-after-free on object destroy netfilter: conntrack_irc: fix possible out-of-bounds read netfilter: synproxy: add mutex to guard hook reference counting netfilter: nft_fib_ipv6: bail out of sibling walk if rt got unlinked ipvs: clear the svc scheduler ptr early on edit netfilter: xt_NFQUEUE: prefer raw_smp_processor_id ==================== Link: https://patch.msgid.link/20260601115923.433946-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02tcp: Add preempt_{disable,enable}_nested() in reqsk_queue_hash_req().Kuniyuki Iwashima
syzbot reported a weird reqsk->rsk_refcnt underflow in __inet_csk_reqsk_queue_drop(). The captured reqsk_put() in __inet_csk_reqsk_queue_drop() is called only when it successfully removes reqsk from ehash. Moreover, reqsk_timer_handler() calls another reqsk_put() after that. This indicates that the reqsk was missing both refcnts for ehash and the timer itself. Since all the syzbot reports had PREEMPT_RT enabled, the only possible scenario is that reqsk_queue_hash_req() is preempted after mod_timer() and before refcount_set(), and then the timer triggered after 1s aborts the reqsk due to its listener's close(). Let's wrap mod_timer() and refcount_set() with preempt_disable_nested() and preempt_enable_nested(). Note that inet_ehash_insert() holds the normal spin_lock() (mutex in PREEMPT_RT), so it must be called outside of preempt_disable_nested(), but this is fine. The lookup path just ignores 0 sk_refcnt entries in ehash and tries to create another reqsk, but this will fail at inet_ehash_insert(). [0]: refcount_t: underflow; use-after-free. WARNING: lib/refcount.c:28 at refcount_warn_saturate+0xb2/0x110 lib/refcount.c:28, CPU#0: ktimers/0/16 Modules linked in: CPU: 0 UID: 0 PID: 16 Comm: ktimers/0 Tainted: G L syzkaller #0 PREEMPT_{RT,(full)} Tainted: [L]=SOFTLOCKUP Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026 RIP: 0010:refcount_warn_saturate+0xb2/0x110 lib/refcount.c:28 Code: e4 7d d1 0a 67 48 0f b9 3a eb 4a e8 38 3d 23 fd 48 8d 3d e1 7d d1 0a 67 48 0f b9 3a eb 37 e8 25 3d 23 fd 48 8d 3d de 7d d1 0a <67> 48 0f b9 3a eb 24 e8 12 3d 23 fd 48 8d 3d db 7d d1 0a 67 48 0f RSP: 0000:ffffc90000157948 EFLAGS: 00010246 RAX: ffffffff84a1301b RBX: 0000000000000003 RCX: ffff88801ca98000 RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffffff8f72ae00 RBP: ffffffff99ae3b01 R08: ffff88801ca98000 R09: 0000000000000005 R10: 0000000000000100 R11: 0000000000000004 R12: ffff8880425ef568 R13: ffff8880425ef4f8 R14: ffff8880425ef578 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff888126386000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7b46710e9c CR3: 000000000dbb6000 CR4: 00000000003526f0 Call Trace: <TASK> __refcount_sub_and_test include/linux/refcount.h:400 [inline] __refcount_dec_and_test include/linux/refcount.h:432 [inline] refcount_dec_and_test include/linux/refcount.h:450 [inline] reqsk_put include/net/request_sock.h:136 [inline] __inet_csk_reqsk_queue_drop+0x3ce/0x440 net/ipv4/inet_connection_sock.c:1007 reqsk_timer_handler+0x651/0xdf0 net/ipv4/inet_connection_sock.c:1137 call_timer_fn+0x192/0x5e0 kernel/time/timer.c:1748 expire_timers kernel/time/timer.c:1799 [inline] __run_timers kernel/time/timer.c:2374 [inline] __run_timer_base+0x6a3/0x9f0 kernel/time/timer.c:2386 run_timer_base kernel/time/timer.c:2395 [inline] run_timer_softirq+0x67/0x170 kernel/time/timer.c:2403 handle_softirqs+0x1de/0x6d0 kernel/softirq.c:622 __do_softirq kernel/softirq.c:656 [inline] run_ktimerd+0x69/0x100 kernel/softirq.c:1151 smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160 kthread+0x388/0x470 kernel/kthread.c:436 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> Fixes: d2d6422f8bd1 ("x86: Allow to enable PREEMPT_RT.") Reported-by: syzbot+e809069bc15f26300526@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6a1a7bcf.0a9e871e.332604.000b.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20260601182101.3183993-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02net: Annotate sk->sk_write_space() for UDP SOCKMAP.Kuniyuki Iwashima
UDP TX skb->destructor() is sock_wfree(), and UDP holds lock_sock() only for UDP_CORK / MSG_MORE sendmsg(). Otherwise, sk->sk_write_space() may be read locklessly while SOCKMAP rewrites sk->sk_write_space(). Let's use WRITE_ONCE() and READ_ONCE() for sk->sk_write_space(). Note that the write side is annotated by commit 2ef2b20cf4e0 ("net: annotate data-races around sk->sk_{data_ready,write_space}"). Fixes: 7b98cd42b049 ("bpf: sockmap: Add UDP support") Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://patch.msgid.link/20260529193941.3897256-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02af_unix: Remove sock->state assignment.Kuniyuki Iwashima
Both struct socket and struct sock have a variable to manage its state, sock->state and sk->sk_state. When both are used, the former typically manages syscall state and the latter manages the actual connection state. AF_UNIX only uses sk->sk_state. Let's remove unnecessary assignemnts for sock->state. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260529191829.3864438-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02selftests: net: add socat syslog for PPPoL2TPQingfang Deng
As done in pppoe.sh, start socat as the syslog listener. In case the test fails, dump its log to see what's going on. Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev> Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Link: https://patch.msgid.link/20260529021146.5739-1-qingfang.deng@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02Merge branch 'net-phy-dp83822-add-optional-external-phy-clock'Jakub Kicinski
Stefan Wahren says: ==================== net: phy: dp83822: Add optional external PHY clock This small series implement support for external PHY clock for the dp83822 driver. ==================== Link: https://patch.msgid.link/20260528184642.33424-1-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02net: phy: dp83822: Add optional external PHY clockStefan Wahren
In some cases, the PHY can use an external ref clock source instead of a crystal. Add an optional clock in the PHY node to make sure that the clock source is enabled, if specified, before probing. Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260528184642.33424-3-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02net: phy: dp83822: Improve readability in dp8382x_probeStefan Wahren
Introduce a local pointer for device so devm_kzalloc() fit into a single line. Also this makes following changes easier to read. Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260528184642.33424-2-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02pcnet32: stop holding device spin lock during napi_complete_doneOscar Maes
napi_complete_done may call gro_flush_normal (though not currently, as GRO is unsupported at the moment), which may result in packet TX. This will eventually result in calling pcnet32_start_xmit - resulting in a deadlock while trying to re-acquire the already locked spin lock. It is safe to split the spinlock block into two, because the hardware registers are still protected from concurrent access, and the two blocks perform unrelated operations that don't need to happen atomically. Fixes: 5b2ec6f2be51 ("pcnet32: use napi_complete_done()") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Oscar Maes <oscmaes92@gmail.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20260528140320.5556-1-oscmaes92@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02cgroup: Migrate tasks to the root css when a controller is reboundTejun Heo
cgroup_apply_control_disable() defers kill_css_finish() while a css is still populated, relying on css_update_populated() to fire the deferred kill once the populated count reaches zero. This deadlocks when a controller is rebound out of a hierarchy. Mounting an implicit_on_dfl controller such as perf_event as a v1 hierarchy steals it off the default hierarchy, and rebind_subsystems() kills its per-cgroup csses while they are still populated. The migration run in the same step keeps the old css for a controller no longer in the hierarchy's mask, so no task is migrated off the dying csses. Their populated count never reaches zero, the deferred kill_css_finish() never fires, and the next cgroup_lock_and_drain_offline() hangs forever under cgroup_mutex. That migration is already a no-op pass over the rebound subtree. Add cgroup_rebind_ss_mask so find_existing_css_set() resolves the leaving controllers to the root css. Their tasks are migrated there, the per-cgroup csses depopulate, and cgroup_apply_control_disable() kills them synchronously. The deferral stays correct for the rmdir and controller-disable paths it was meant for. Fixes: 1dffd95575eb ("cgroup: Defer kill_css_finish() in cgroup_apply_control_disable()") Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/all/41cd159c-54e5-45e0-81df-eaf36a6c028e@sirena.org.uk/ Reported-by: Bert Karwatzki <spasswolf@web.de> Closes: https://lore.kernel.org/all/4e986b4ed7e16547805d54b6e67d09120bc4d2f2.camel@web.de/ Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Bert Karwatzki <spasswolf@web.de> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-06-02tcp: change bpf_skops_hdr_opt_len() signatureEric Dumazet
Some compilers do not inline bpf_skops_hdr_opt_len() from tcp_established_options(), forcing an expensive stack canary when CONFIG_STACKPROTECTOR_STRONG=y. Change bpf_skops_hdr_opt_len() to return @remaining by value to remove this stack canary from TCP fast path. $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new add/remove: 0/0 grow/shrink: 1/1 up/down: 10/-59 (-49) Function old new delta bpf_skops_hdr_opt_len 297 307 +10 tcp_established_options 574 515 -59 Total: Before=31456795, After=31456746, chg -0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260601093819.469626-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02net: dsa: b53: hide legacy gpiolib usage on non-mipsArnd Bergmann
The MIPS bcm53xx platform still uses the legacy gpiolib interfaces based on gpio numbers, but other platforms do not. Hide these interfaces inside of the existing #ifdef block and use the modern interfaces in the common parts of the driver to allow building it when the gpio_set_value() is left out of the kernel. Reviewed-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Linus Walleij <linusw@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patch.msgid.link/20260601165716.648230-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-06-02Merge tag 'soc-fixes-7.1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC fixes from Arnd Bergmann: "Following the previous set of fixes, this addresses another significant number of small issues found in firmware drivers (tee, optee, qcomtee, qcom ice, exynos acpm) drivers through various tools. This is about error handling, resource leaks, concurrency and a use-after-free bug. The fixes for the Qualcomm ICE driver also introduce interface changes in the UFS and MMC drivers using it. Outside of firmware drivers, there are a few fixes across the tree: - Minor driver code mistakes in the Atmel EBI memory controller, the i.MX soc ID driver and socfpga boot logic - A defconfig change to avoid a boot time regression on multiple qualcomm boards - Device tree fixes for qualcomm, at91 and gemini, addressing mostly minor configuration mistakes" * tag 'soc-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (28 commits) firmware: samsung: acpm: Fix infinite loop on sequence number exhaustion firmware: samsung: acpm: Fix missing LKMM barriers in sequence allocator firmware: samsung: acpm: Fix false timeouts and Use-After-Free in polling ARM: dts: gemini: Fix partition offsets ARM: socfpga: Fix OF node refcount leak in SMP setup soc: qcom: ice: Fix the error code when 'qcom,ice' property is not found arm64: dts: qcom: eliza: Add power-domain and iface clk for ice node arm64: dts: qcom: milos: Add power-domain and iface clk for ice node tee: qcomtee: add missing va_end in early return qcomtee_object_user_init() tee: fix params_from_user() error path in tee_ioctl_supp_recv tee: shm: fix shm leak in register_shm_helper() tee: fix tee_ioctl_object_invoke_arg padding arm64: defconfig: Enable PCI M.2 power sequencing driver scsi: ufs: ufs-qcom: Remove NULL check from devm_of_qcom_ice_get() mmc: sdhci-msm: Remove NULL check from devm_of_qcom_ice_get() soc: qcom: ice: Return proper error codes from devm_of_qcom_ice_get() instead of NULL soc: qcom: ice: Return -ENODEV if the ICE platform device is not found soc: qcom: ice: Fix race between qcom_ice_probe() and of_qcom_ice_get() ARM: dts: microchip: sam9x7: fix GMAC clock configuration firmware: samsung: acpm: Fix mailbox channel leak on probe error ...
2026-06-02ALSA: seq: oss: Reject reads that cannot fit the next eventCássio Gabriel
snd_seq_oss_read() checks whether the next queued OSS sequencer event fits in the remaining userspace buffer before removing it from the read queue. The check is inverted. It currently stops when the event is smaller than the remaining buffer, so a normal 4-byte event is not copied for an 8-byte read buffer. Conversely, an 8-byte event can be copied for a smaller read count. Break only when the remaining userspace buffer is smaller than the next event, and report -EINVAL if no complete event has been copied. This prevents an undersized read from looking like end-of-file while leaving the event queued for a later read with a large enough buffer. Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com> Link: https://patch.msgid.link/20260602-alsa-seq-oss-read-size-check-v1-1-10e59b1742e0@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-06-02RDMA/efa: Validate SQ ring size against max LLQ sizeYonatan Nachum
Validate the SQ ring size against the device's max LLQ size. This ensures that when using 128-byte WQEs, userspace cannot exceed the queue limits. On create QP, userspace provides the SQ ring size (depth x WQE size) which is validated against the max LLQ size. Fixes: 40909f664d27 ("RDMA/efa: Add EFA verbs implementation") Link: https://patch.msgid.link/r/20260526081536.1203553-1-ynachum@amazon.com Reviewed-by: Michael Margolin <mrgolin@amazon.com> Signed-off-by: Yonatan Nachum <ynachum@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>