summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2026-01-05compiler-context-analysis: Remove __cond_lock() function-like helperMarco Elver
As discussed in [1], removing __cond_lock() will improve the readability of trylock code. Now that Sparse context tracking support has been removed, we can also remove __cond_lock(). Change existing APIs to either drop __cond_lock() completely, or make use of the __cond_acquires() function attribute instead. In particular, spinlock and rwlock implementations required switching over to inline helpers rather than statement-expressions for their trylock_* variants. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250207082832.GU7145@noisy.programming.kicks-ass.net/ [1] Link: https://patch.msgid.link/20251219154418.3592607-25-elver@google.com
2026-01-05compiler-context-analysis: Remove Sparse supportMarco Elver
Remove Sparse support as discussed at [1]. The kernel codebase is still scattered with numerous places that try to appease Sparse's context tracking ("annotation for sparse", "fake out sparse", "work around sparse", etc.). Eventually, as more subsystems enable Clang's context analysis, these places will show up and need adjustment or removal of the workarounds altogether. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250207083335.GW7145@noisy.programming.kicks-ass.net/ [1] Link: https://lore.kernel.org/all/Z6XTKTo_LMj9KmbY@elver.google.com/ [2] Link: https://patch.msgid.link/20251219154418.3592607-24-elver@google.com
2026-01-05um: Fix incorrect __acquires/__releases annotationsMarco Elver
With Clang's context analysis, the compiler is a bit more strict about what goes into the __acquires/__releases annotations and can't refer to non-existent variables. On an UM build, mm_id.h is transitively included into mm_types.h, and we can observe the following error (if context analysis is enabled in e.g. stackdepot.c): In file included from lib/stackdepot.c:17: In file included from include/linux/debugfs.h:15: In file included from include/linux/fs.h:5: In file included from include/linux/fs/super.h:5: In file included from include/linux/fs/super_types.h:7: In file included from include/linux/list_lru.h:14: In file included from include/linux/xarray.h:16: In file included from include/linux/gfp.h:7: In file included from include/linux/mmzone.h:22: In file included from include/linux/mm_types.h:26: In file included from arch/um/include/asm/mmu.h:12: >> arch/um/include/shared/skas/mm_id.h:24:54: error: use of undeclared identifier 'turnstile' 24 | void enter_turnstile(struct mm_id *mm_id) __acquires(turnstile); | ^~~~~~~~~ arch/um/include/shared/skas/mm_id.h:25:53: error: use of undeclared identifier 'turnstile' 25 | void exit_turnstile(struct mm_id *mm_id) __releases(turnstile); | ^~~~~~~~~ One (discarded) option was to use token_context_lock(turnstile) to just define a token with the already used name, but that would not allow the compiler to distinguish between different mm_id-dependent instances. Another constraint is that struct mm_id is only declared and incomplete in the header, so even if we tried to construct an expression to get to the mutex instance, this would fail (including more headers transitively everywhere should also be avoided). Instead, just declare an mm_id-dependent helper to return the mutex, and use the mm_id-dependent call expression in the __acquires/__releases attributes; the compiler will consider the identity of the mutex to be the call expression. Then using __get_turnstile() in the lock/unlock wrappers (with context analysis enabled for mmu.c) the compiler will be able to verify the implementation of the wrappers as-is. We leave context analysis disabled in arch/um/kernel/skas/ for now. This change is a preparatory change to allow enabling context analysis in subsystems that include any of the above headers. No functional change intended. Closes: https://lore.kernel.org/oe-kbuild-all/202512171220.vHlvhpCr-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-23-elver@google.com
2026-01-05debugfs: Make debugfs_cancellation a context lock structMarco Elver
When compiling include/linux/debugfs.h with CONTEXT_ANALYSIS enabled, we can see this error: ./include/linux/debugfs.h:239:17: error: use of undeclared identifier 'cancellation' 239 | void __acquires(cancellation) Move the __acquires(..) attribute after the declaration, so that the compiler can see the cancellation function argument, as well as making struct debugfs_cancellation a real context lock to benefit from Clang's context analysis. This change is a preparatory change to allow enabling context analysis in subsystems that include the above header. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251219154418.3592607-22-elver@google.com
2026-01-05locking/ww_mutex: Support Clang's context analysisMarco Elver
Add support for Clang's context analysis for ww_mutex. The programming model for ww_mutex is subtly more complex than other locking primitives when using ww_acquire_ctx. Encoding the respective pre-conditions for ww_mutex lock/unlock based on ww_acquire_ctx state using Clang's context analysis makes incorrect use of the API harder. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-21-elver@google.com
2026-01-05locking/local_lock: Support Clang's context analysisMarco Elver
Add support for Clang's context analysis for local_lock_t and local_trylock_t. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-20-elver@google.com
2026-01-05locking/local_lock: Include missing headersMarco Elver
Including <linux/local_lock.h> into an empty TU will result in the compiler complaining: ./include/linux/local_lock.h: In function ‘class_local_lock_irqsave_constructor’: ./include/linux/local_lock_internal.h:95:17: error: implicit declaration of function ‘local_irq_save’; <...> 95 | local_irq_save(flags); \ | ^~~~~~~~~~~~~~ As well as (some architectures only, such as 'sh'): ./include/linux/local_lock_internal.h: In function ‘local_lock_acquire’: ./include/linux/local_lock_internal.h:33:20: error: ‘current’ undeclared (first use in this function) 33 | l->owner = current; Include missing headers to allow including local_lock.h where the required headers are not otherwise included. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251219154418.3592607-19-elver@google.com
2026-01-05locking/rwsem: Support Clang's context analysisMarco Elver
Add support for Clang's context analysis for rw_semaphore. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-18-elver@google.com
2026-01-05kref: Add context-analysis annotationsMarco Elver
Mark functions that conditionally acquire the passed lock. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251219154418.3592607-17-elver@google.com
2026-01-05srcu: Support Clang's context analysisMarco Elver
Add support for Clang's context analysis for SRCU. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://patch.msgid.link/20251219154418.3592607-16-elver@google.com
2026-01-05rcu: Support Clang's context analysisMarco Elver
Improve the existing annotations to properly support Clang's context analysis. The old annotations distinguished between RCU, RCU_BH, and RCU_SCHED; however, to more easily be able to express that "hold the RCU read lock" without caring if the normal, _bh(), or _sched() variant was used we'd have to remove the distinction of the latter variants: change the _bh() and _sched() variants to also acquire "RCU". When (and if) we introduce context locks to denote more generally that "IRQ", "BH", "PREEMPT" contexts are disabled, it would make sense to acquire these instead of RCU_BH and RCU_SCHED respectively. The above change also simplified introducing __guarded_by support, where only the "RCU" context lock needs to be held: introduce __rcu_guarded, where Clang's context analysis warns if a pointer is dereferenced without any of the RCU locks held, or updated without the appropriate helpers. The primitives rcu_assign_pointer() and friends are wrapped with context_unsafe(), which enforces using them to update RCU-protected pointers marked with __rcu_guarded. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://patch.msgid.link/20251219154418.3592607-15-elver@google.com
2026-01-05bit_spinlock: Support Clang's context analysisMarco Elver
The annotations for bit_spinlock.h have simply been using "bitlock" as the token. For Sparse, that was likely sufficient in most cases. But Clang's context analysis is more precise, and we need to ensure we can distinguish different bitlocks. To do so, add a token context, and a macro __bitlock(bitnum, addr) that is used to construct unique per-bitlock tokens. Add the appropriate test. <linux/list_bl.h> is implicitly included through other includes, and requires 2 annotations to indicate that acquisition (without release) and release (without prior acquisition) of its bitlock is intended. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-14-elver@google.com
2026-01-05bit_spinlock: Include missing <asm/processor.h>Marco Elver
Including <linux/bit_spinlock.h> into an empty TU will result in the compiler complaining: ./include/linux/bit_spinlock.h:34:4: error: call to undeclared function 'cpu_relax'; <...> 34 | cpu_relax(); | ^ 1 error generated. Include <asm/processor.h> to allow including bit_spinlock.h where <asm/processor.h> is not otherwise included. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251219154418.3592607-13-elver@google.com
2026-01-05locking/seqlock: Support Clang's context analysisMarco Elver
Add support for Clang's context analysis for seqlock_t. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-12-elver@google.com
2026-01-05locking/mutex: Support Clang's context analysisMarco Elver
Add support for Clang's context analysis for mutex. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-11-elver@google.com
2026-01-05compiler-context-analysis: Change __cond_acquires to take return valueMarco Elver
While Sparse is oblivious to the return value of conditional acquire functions, Clang's context analysis needs to know the return value which indicates successful acquisition. Add the additional argument, and convert existing uses. Notably, Clang's interpretation of the value merely relates to the use in a later conditional branch, i.e. 1 ==> context lock acquired in branch taken if condition non-zero, and 0 ==> context lock acquired in branch taken if condition is zero. Given the precise value does not matter, introduce symbolic variants to use instead of either 0 or 1, which should be more intuitive. No functional change intended. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-10-elver@google.com
2026-01-05locking/rwlock, spinlock: Support Clang's context analysisMarco Elver
Add support for Clang's context analysis for raw_spinlock_t, spinlock_t, and rwlock. This wholesale conversion is required because all three of them are interdependent. To avoid warnings in constructors, the initialization functions mark a lock as acquired when initialized before guarded variables. The test verifies that common patterns do not generate false positives. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-9-elver@google.com
2026-01-05lockdep: Annotate lockdep assertions for context analysisMarco Elver
Clang's context analysis can be made aware of functions that assert that locks are held. Presence of these annotations causes the analysis to assume the context lock is held after calls to the annotated function, and avoid false positives with complex control-flow; for example, where not all control-flow paths in a function require a held lock, and therefore marking the function with __must_hold(..) is inappropriate. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-8-elver@google.com
2026-01-05cleanup: Basic compatibility with context analysisMarco Elver
Introduce basic compatibility with cleanup.h infrastructure. We need to allow the compiler to see the acquisition and release of the context lock at the start and end of a scope. However, the current "cleanup" helpers wrap the lock in a struct passed through separate helper functions, which hides the lock alias from the compiler (no inter-procedural analysis). While Clang supports scoped guards in C++, it's not possible to apply in C code: https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#scoped-context However, together with recent improvements to Clang's alias analysis abilities, idioms such as this work correctly now: void spin_unlock_cleanup(spinlock_t **l) __releases(*l) { .. } ... { spinlock_t *lock_scope __cleanup(spin_unlock_cleanup) = &lock; spin_lock(&lock); // lock through &lock ... critical section ... } // unlock through lock_scope -[alias]-> &lock (no warnings) To generalize this pattern and make it work with existing lock guards, introduce DECLARE_LOCK_GUARD_1_ATTRS() and WITH_LOCK_GUARD_1_ATTRS(). These allow creating an explicit alias to the context lock instance that is "cleaned" up with a separate cleanup helper. This helper is a dummy function that does nothing at runtime, but has the release attributes to tell the compiler what happens at the end of the scope. Example usage: DECLARE_LOCK_GUARD_1_ATTRS(mutex, __acquires(_T), __releases(*(struct mutex **)_T)) #define class_mutex_constructor(_T) WITH_LOCK_GUARD_1_ATTRS(mutex, _T) Note: To support the for-loop based scoped helpers, the auxiliary variable must be a pointer to the "class" type because it is defined in the same statement as the guard variable. However, we initialize it with the lock pointer (despite the type mismatch, the compiler's alias analysis still works as expected). The "_unlock" attribute receives a pointer to the auxiliary variable (a double pointer to the class type), and must be cast and dereferenced appropriately. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-7-elver@google.com
2026-01-05checkpatch: Warn about context_unsafe() without commentMarco Elver
Warn about applications of context_unsafe() without a comment, to encourage documenting the reasoning behind why it was deemed safe. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-6-elver@google.com
2026-01-05Documentation: Add documentation for Compiler-Based Context AnalysisMarco Elver
Adds documentation in Documentation/dev-tools/context-analysis.rst, and adds it to the index. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-5-elver@google.com
2026-01-05compiler-context-analysis: Add test stubMarco Elver
Add a simple test stub where we will add common supported patterns that should not generate false positives for each new supported context lock. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-4-elver@google.com
2026-01-05compiler-context-analysis: Add infrastructure for Context Analysis with ClangMarco Elver
Context Analysis is a language extension, which enables statically checking that required contexts are active (or inactive), by acquiring and releasing user-definable "context locks". An obvious application is lock-safety checking for the kernel's various synchronization primitives (each of which represents a "context lock"), and checking that locking rules are not violated. Clang originally called the feature "Thread Safety Analysis" [1]. This was later changed and the feature became more flexible, gaining the ability to define custom "capabilities". Its foundations can be found in "Capability Systems" [2], used to specify the permissibility of operations to depend on some "capability" being held (or not held). Because the feature is not just able to express "capabilities" related to synchronization primitives, and "capability" is already overloaded in the kernel, the naming chosen for the kernel departs from Clang's "Thread Safety" and "capability" nomenclature; we refer to the feature as "Context Analysis" to avoid confusion. The internal implementation still makes references to Clang's terminology in a few places, such as `-Wthread-safety` being the warning option that also still appears in diagnostic messages. [1] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html [2] https://www.cs.cornell.edu/talc/papers/capabilities.pdf See more details in the kernel-doc documentation added in this and subsequent changes. Clang version 22+ is required. [peterz: disable the thing for __CHECKER__ builds] Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-3-elver@google.com
2026-01-05compiler_types: Move lock checking attributes to compiler-context-analysis.hMarco Elver
The conditional definition of lock checking macros and attributes is about to become more complex. Factor them out into their own header for better readability, and to make it obvious which features are supported by which mode (currently only Sparse). This is the first step towards generalizing towards "context analysis". No functional change intended. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251219154418.3592607-2-elver@google.com
2026-01-05drm/xe/i2c: Force polling mode in survivabilityRaag Jadav
SGUnit interrupts are not initialized in survivability. Force I2C controller to polling mode while in survivability. v2: Use helper function instead of manual check (Riana) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://patch.msgid.link/20260105080750.16605-1-raag.jadav@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2026-01-05x86,fs/resctrl: Support binary fixed point event countersTony Luck
resctrl assumes that all monitor events can be displayed as unsigned decimal integers. Hardware architecture counters may provide some telemetry events with greater precision where the event is not a simple count, but is a measurement of some sort (e.g. Joules for energy consumed). Add a new argument to resctrl_enable_mon_event() for architecture code to inform the file system that the value for a counter is a fixed-point value with a specific number of binary places. Only allow architecture to use floating point format on events that the file system has marked with mon_evt::is_floating_point which reflects the contract with user space on how the event values are displayed. Display fixed point values with values rounded to ceil(binary_bits * log10(2)) decimal places. Special case for zero binary bits to print "{value}.0". Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
2026-01-05drm/meson: venc: add support for HDMI DMT modes up to 3840x2160Martin Blumenstingl
Commit 5d0bfe448481 ("drm/meson: Add HDMI 1.4 4k modes") added support for HDMI 1.4 4k modes, which is what TVs need. For computer monitors the code is using the DMT code-path, which ends up in meson_venc_hdmi_supported_mode(), which does not allow the 4k modes yet. The datasheet for all supported SoCs mentions "4Kx2K@60". It's not clear whether "4K" here means 3840 or 4096 pixels. Allow resolutions up to 3840x2160 pixels (including middle steps, such as WQHD at 2560x1440 pixels) so they can be used with computer monitors (using the DMT code-path in the driver). Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patch.msgid.link/20251108134236.1299630-1-martin.blumenstingl@googlemail.com
2026-01-05arm64: dts: qcom: qcs6490-rb3gen2: Add TC9563 PCIe switch nodeKrishna Chaitanya Chundru
Add a node for the TC9563 PCIe switch, which has three downstream ports. Two embedded Ethernet devices are present on one of the downstream ports. As all these ports are present in the node represent the downstream ports and embedded endpoints. Power to the TC9563 is supplied through two LDO regulators, controlled by two GPIOs, which are added as fixed regulators. Configure the TC9563 through I2C. Reviewed-by: Bjorn Andersson <andersson@kernel.org> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105-tc9563-v1-1-642fd1fe7893@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-05clk: qcom: rcg2: compute 2d using duty fraction directlyTaniya Das
The duty-cycle calculation in clk_rcg2_set_duty_cycle() currently derives an intermediate percentage `duty_per = (num * 100) / den` and then computes: d = DIV_ROUND_CLOSEST(n * duty_per * 2, 100); This introduces integer truncation at the percentage step (division by `den`) and a redundant scaling by 100, which can reduce precision for large `den` and skew the final rounding. Compute `2d` directly from the duty fraction to preserve precision and avoid the unnecessary scaling: d = DIV_ROUND_CLOSEST(n * duty->num * 2, duty->den); This keeps the intended formula `d ≈ n * 2 * (num/den)` while performing a single, final rounded division, improving accuracy especially for small duty cycles or large denominators. It also removes the unused `duty_per` variable, simplifying the code. There is no functional changes beyond improved numerical accuracy. Fixes: 7f891faf596ed ("clk: qcom: clk-rcg2: Add support for duty-cycle for RCG") Signed-off-by: Taniya Das <taniya.das@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105-duty_cycle_precision-v2-1-d1d466a6330a@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-05x86/platform/olpc: Replace strcpy() with strscpy() in xo15_sci_add()Thorsten Blum
strcpy() has been deprecated¹ because it performs no bounds checking on the destination buffer, which can lead to buffer overflows. Use the safer strscpy() instead. ¹ https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20251124125455.5495-2-thorsten.blum@linux.dev
2026-01-05dt-bindings: cache: qcom,llcc: Remove duplicate llcc7_base for GlymurPankaj Patil
Drop redundant llcc7_base entry from Glymur LLCC reg-items Fixes: bd0b8028ce5f ("dt-bindings: cache: qcom,llcc: Document Glymur LLCC block") Signed-off-by: Pankaj Patil <pankaj.patil@oss.qualcomm.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260105130050.1062903-1-pankaj.patil@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2026-01-05media: chips-media: wave5: Fix Potential Probe Resource LeakBrandon Brnich
After kthread creation during probe sequence, a handful of other failures could occur. If this were to happen, the kthread is never explicitly deleted which results in a resource leak. Add explicit cleanup of this resource. Signed-off-by: Brandon Brnich <b-brnich@ti.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: platform: mtk-mdp3: add WQ_PERCPU to alloc_workqueue usersMarco Crivellari
Currently if a user enqueues a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This continues the effort to refactor workqueue APIs, which began with the introduction of new workqueues and a new alloc_workqueue flag in: commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq") commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag") This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: verisilicon: AV1: Fix tx mode bit settingBenjamin Gaignard
AV1 specification describes 3 possibles tx modes: 4x4 only, largest and select. The hardware allows 5 possibles tx modes: 4x4 only, 8x8, 16x16, 32x32 and select. Since the both aren't exactly matching we need to add a mapping function to set the correct mode on hardware. Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Fixes: 727a400686a2c ("media: verisilicon: Add Rockchip AV1 decoder") Cc: stable@vger.kernel.org Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: verisilicon: AV1: Fix enable cdef computationBenjamin Gaignard
If all the fields of the CDEF parameters are zero (which is the default), then av1_enable_cdef register needs to be unset (despite the V4L2_AV1_SEQUENCE_FLAG_ENABLE_CDEF possibly being set). Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Fixes: 727a400686a2c ("media: verisilicon: Add Rockchip AV1 decoder") Cc: stable@vger.kernel.org Reported-by: Jianfeng Liu <liujianfeng1994@gmail.com> Closes: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/4786 Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> [hverkuil: dropped Link tag since it just duplicated the Closes: URL]
2026-01-05media: verisilicon: Avoid G2 bus error while decoding H.264 and HEVCMing Qian
For the i.MX8MQ platform, there is a hardware limitation: the g1 VPU and g2 VPU cannot decode simultaneously; otherwise, it will cause below bus error and produce corrupted pictures, even potentially lead to system hang. [ 110.527986] hantro-vpu 38310000.video-codec: frame decode timed out. [ 110.583517] hantro-vpu 38310000.video-codec: bus error detected. Therefore, it is necessary to ensure that g1 and g2 operate alternately. This allows for successful multi-instance decoding of H.264 and HEVC. To achieve this, g1 and g2 share the same v4l2_m2m_dev, and then the v4l2_m2m_dev can handle the scheduling. Fixes: cb5dd5a0fa518 ("media: hantro: Introduce G2/HEVC decoder") Cc: stable@vger.kernel.org Signed-off-by: Ming Qian <ming.qian@oss.nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Co-developed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: v4l2-mem2mem: Add a kref to the v4l2_m2m_dev structureNicolas Dufresne
Adding a reference count to the v4l2_m2m_dev structure allow safely sharing it across multiple hardware nodes. This can be used to prevent running jobs concurrently on m2m cores that have some internal resource sharing. Signed-off-by: Ming Qian <ming.qian@oss.nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> [hverkuil: fix typos in v4l2_m2m_put documentation]
2026-01-05media: mediatek: vcodec: Don't try to decode 422/444 VP9Nicolas Dufresne
This is not supported by the hardware and trying to decode these leads to LAT timeout errors. Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: mediatek: vcodec: Implement manual request completionSebastian Fricke
Rework how requests are completed in the MediaTek VCodec driver, by implementing the new manual request completion feature, which allows to keep a request open while allowing to add new bitstream data. This is useful in this case, because the hardware has a LAT and a core decode work, after the LAT decode the bitstream isn't required anymore so the source buffer can be set done and the request stays open until the core decode work finishes. Signed-off-by: Sebastian Fricke <sebastian.fricke@collabora.com> Co-developed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: mc: add debugfs node to keep track of requestsHans Verkuil
Keep track of the number of requests and request objects of a media device. Helps to verify that all request-related memory is freed. Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
2026-01-05media: vicodec: add support for manual completionHans Verkuil
Manually complete the requests: this tests the manual completion code. Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
2026-01-05media: mc: add manual request completionHans Verkuil
By default when the last request object is completed, the whole request completes as well. But sometimes you want to delay this completion to an arbitrary point in time so add a manual complete mode for this. In req_queue the driver marks the request for manual completion by calling media_request_mark_manual_completion, and when the driver wants to manually complete the request it calls media_request_manual_complete(). Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
2026-01-05media: chips-media: wave5: Fix memory leak on codec_info allocation failureZilin Guan
In wave5_vpu_open_enc() and wave5_vpu_open_dec(), a vpu instance is allocated via kzalloc(). If the subsequent allocation for inst->codec_info fails, the functions return -ENOMEM without freeing the previously allocated instance, causing a memory leak. Fix this by calling kfree() on the instance in this error path to ensure it is properly released. Fixes: 9707a6254a8a6 ("media: chips-media: wave5: Add the v4l2 layer") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: chips-media: wave5: Improve performance of decoderJackson Lee
The current decoding method was to wait until each frame was decoded after feeding a bitstream. As a result, performance was low and Wave5 could not achieve max pixel processing rate. Update driver to use an asynchronous approach for decoding and feeding a bitstream in order to achieve full capabilities of the device. WAVE5 supports command-queueing to maximize performance by pipelining internal commands and by hiding wait cycle taken to receive a command from Host processor. Instead of waiting for each command to be executed before sending the next command, Host processor just places all the commands in the command-queue and goes on doing other things while the commands in the queue are processed by VPU. While Host processor handles its own tasks, it can receive VPU interrupt request (IRQ). In this case, host processor can simply exit interrupt service routine (ISR) without accessing to host interface to read the result of the command reported by VPU. After host processor completed its tasks, host processor can read the command result when host processor needs the reports and does response processing. To achieve this goal, the device_run() calls v4l2_m2m_job_finish so that next command can be sent to VPU continuously, if there is any result, then irq is triggered and gets decoded frames and returns them to upper layer. Theses processes work independently each other without waiting a decoded frame. Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com> Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com> Tested-by: Brandon Brnich <b-brnich@ti.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: chips-media: wave5: Add WARN_ON to check if dec_output_info is NULLJackson Lee
The dec_output_info should not be a null pointer, WARN_ON around it to indicates a driver issue. Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com> Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com> Tested-by: Brandon Brnich <b-brnich@ti.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: chips-media: wave5: Fix Null reference while testing flusterJackson Lee
When multi instances are created/destroyed, many interrupts happens and structures for decoder are removed. "struct vpu_instance" this structure is shared for all flow in the decoder, so if the structure is not protected by lock, Null dereference could happens sometimes. IRQ Handler was spilt to two phases and Lock was added as well. Fixes: 9707a6254a8a ("media: chips-media: wave5: Add the v4l2 layer") Cc: stable@vger.kernel.org Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com> Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com> Tested-by: Brandon Brnich <b-brnich@ti.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: chips-media: wave5: Fix SError of kernel panic when closedJackson Lee
SError of kernel panic rarely happened while testing fluster. The root cause was to enter suspend mode because timeout of autosuspend delay happened. [ 48.834439] SError Interrupt on CPU0, code 0x00000000bf000000 -- SError [ 48.834455] CPU: 0 UID: 0 PID: 1067 Comm: v4l2h265dec0:sr Not tainted 6.12.9-gc9e21a1ebd75-dirty #7 [ 48.834461] Hardware name: ti Texas Instruments J721S2 EVM/Texas Instruments J721S2 EVM, BIOS 2025.01-00345-gbaf3aaa8ecfa 01/01/2025 [ 48.834464] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 48.834468] pc : wave5_dec_clr_disp_flag+0x40/0x80 [wave5] [ 48.834488] lr : wave5_dec_clr_disp_flag+0x40/0x80 [wave5] [ 48.834495] sp : ffff8000856e3a30 [ 48.834497] x29: ffff8000856e3a30 x28: ffff0008093f6010 x27: ffff000809158130 [ 48.834504] x26: 0000000000000000 x25: ffff00080b625000 x24: ffff000804a9ba80 [ 48.834509] x23: ffff000802343028 x22: ffff000809158150 x21: ffff000802218000 [ 48.834513] x20: ffff0008093f6000 x19: ffff0008093f6000 x18: 0000000000000000 [ 48.834518] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff74009618 [ 48.834523] x14: 000000010000000c x13: 0000000000000000 x12: 0000000000000000 [ 48.834527] x11: ffffffffffffffff x10: ffffffffffffffff x9 : ffff000802343028 [ 48.834532] x8 : ffff00080b6252a0 x7 : 0000000000000038 x6 : 0000000000000000 [ 48.834536] x5 : ffff00080b625060 x4 : 0000000000000000 x3 : 0000000000000000 [ 48.834541] x2 : 0000000000000000 x1 : ffff800084bf0118 x0 : ffff800084bf0000 [ 48.834547] Kernel panic - not syncing: Asynchronous SError Interrupt [ 48.834549] CPU: 0 UID: 0 PID: 1067 Comm: v4l2h265dec0:sr Not tainted 6.12.9-gc9e21a1ebd75-dirty #7 [ 48.834554] Hardware name: ti Texas Instruments J721S2 EVM/Texas Instruments J721S2 EVM, BIOS 2025.01-00345-gbaf3aaa8ecfa 01/01/2025 [ 48.834556] Call trace: [ 48.834559] dump_backtrace+0x94/0xec [ 48.834574] show_stack+0x18/0x24 [ 48.834579] dump_stack_lvl+0x38/0x90 [ 48.834585] dump_stack+0x18/0x24 [ 48.834588] panic+0x35c/0x3e0 [ 48.834592] nmi_panic+0x40/0x8c [ 48.834595] arm64_serror_panic+0x64/0x70 [ 48.834598] do_serror+0x3c/0x78 [ 48.834601] el1h_64_error_handler+0x34/0x4c [ 48.834605] el1h_64_error+0x64/0x68 [ 48.834608] wave5_dec_clr_disp_flag+0x40/0x80 [wave5] [ 48.834615] wave5_vpu_dec_clr_disp_flag+0x54/0x80 [wave5] [ 48.834622] wave5_vpu_dec_buf_queue+0x19c/0x1a0 [wave5] [ 48.834628] __enqueue_in_driver+0x3c/0x74 [videobuf2_common] [ 48.834639] vb2_core_qbuf+0x508/0x61c [videobuf2_common] [ 48.834646] vb2_qbuf+0xa4/0x168 [videobuf2_v4l2] [ 48.834656] v4l2_m2m_qbuf+0x80/0x238 [v4l2_mem2mem] [ 48.834666] v4l2_m2m_ioctl_qbuf+0x18/0x24 [v4l2_mem2mem] [ 48.834673] v4l_qbuf+0x48/0x5c [videodev] [ 48.834704] __video_do_ioctl+0x180/0x3f0 [videodev] [ 48.834725] video_usercopy+0x2ec/0x68c [videodev] [ 48.834745] video_ioctl2+0x18/0x24 [videodev] [ 48.834766] v4l2_ioctl+0x40/0x60 [videodev] [ 48.834786] __arm64_sys_ioctl+0xa8/0xec [ 48.834793] invoke_syscall+0x44/0x100 [ 48.834800] el0_svc_common.constprop.0+0xc0/0xe0 [ 48.834804] do_el0_svc+0x1c/0x28 [ 48.834809] el0_svc+0x30/0xd0 [ 48.834813] el0t_64_sync_handler+0xc0/0xc4 [ 48.834816] el0t_64_sync+0x190/0x194 [ 48.834820] SMP: stopping secondary CPUs [ 48.834831] Kernel Offset: disabled [ 48.834833] CPU features: 0x08,00002002,80200000,4200421b [ 48.834837] Memory Limit: none [ 49.161404] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]--- Fixes: 2092b3833487 ("media: chips-media: wave5: Support runtime suspend/resume") Cc: stable@vger.kernel.org Signed-off-by: Jackson Lee <jackson.lee@chipsnmedia.com> Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Tested-by: Brandon Brnich <b-brnich@ti.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: chips-media: wave5: Fix device cleanup order to prevent kernel panicXulin Sun
Move video device unregistration to the beginning of the remove function to ensure all video operations are stopped before cleaning up the worker thread and disabling PM runtime. This prevents hardware register access after the device has been powered down. In polling mode, the hrtimer periodically triggers wave5_vpu_timer_callback() which queues work to the kthread worker. The worker executes wave5_vpu_irq_work_fn() which reads hardware registers via wave5_vdi_read_register(). The original cleanup order disabled PM runtime and powered down hardware before unregistering video devices. When autosuspend triggers and powers off the hardware, the video devices are still registered and the worker thread can still be triggered by the hrtimer, causing it to attempt reading registers from powered-off hardware. This results in a bus error (synchronous external abort) and kernel panic. This causes random kernel panics during encoding operations: Internal error: synchronous external abort: 0000000096000010 [#1] PREEMPT SMP Modules linked in: wave5 rpmsg_ctrl rpmsg_char ... CPU: 0 UID: 0 PID: 1520 Comm: vpu_irq_thread Tainted: G M W pc : wave5_vdi_read_register+0x10/0x38 [wave5] lr : wave5_vpu_irq_work_fn+0x28/0x60 [wave5] Call trace: wave5_vdi_read_register+0x10/0x38 [wave5] kthread_worker_fn+0xd8/0x238 kthread+0x104/0x120 ret_from_fork+0x10/0x20 Code: aa1e03e9 d503201f f9416800 8b214000 (b9400000) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: synchronous external abort: Fatal exception Fixes: 9707a6254a8a ("media: chips-media: wave5: Add the v4l2 layer") Cc: stable@vger.kernel.org Signed-off-by: Xulin Sun <xulin.sun@windriver.com> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: chips-media: wave5: Fix kthread worker destruction in polling modeXulin Sun
Fix the cleanup order in polling mode (irq < 0) to prevent kernel warnings during module removal. Cancel the hrtimer before destroying the kthread worker to ensure work queues are empty. In polling mode, the driver uses hrtimer to periodically trigger wave5_vpu_timer_callback() which queues work via kthread_queue_work(). The kthread_destroy_worker() function validates that both work queues are empty with WARN_ON(!list_empty(&worker->work_list)) and WARN_ON(!list_empty(&worker->delayed_work_list)). The original code called kthread_destroy_worker() before hrtimer_cancel(), creating a race condition where the timer could fire during worker destruction and queue new work, triggering the WARN_ON. This causes the following warning on every module unload in polling mode: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 1034 at kernel/kthread.c:1430 kthread_destroy_worker+0x84/0x98 Modules linked in: wave5(-) rpmsg_ctrl rpmsg_char ... Call trace: kthread_destroy_worker+0x84/0x98 wave5_vpu_remove+0xc8/0xe0 [wave5] platform_remove+0x30/0x58 ... ---[ end trace 0000000000000000 ]--- Fixes: ed7276ed2fd0 ("media: chips-media: wave5: Add hrtimer based polling support") Cc: stable@vger.kernel.org Signed-off-by: Xulin Sun <xulin.sun@windriver.com> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
2026-01-05media: chips-media: wave5: Fix PM runtime usage count underflowXulin Sun
Replace pm_runtime_put_sync() with pm_runtime_dont_use_autosuspend() in the remove path to properly pair with pm_runtime_use_autosuspend() from probe. This allows pm_runtime_disable() to handle reference count cleanup correctly regardless of current suspend state. The driver calls pm_runtime_put_sync() unconditionally in remove, but the device may already be suspended due to autosuspend configured in probe. When autosuspend has already suspended the device, the usage count is 0, and pm_runtime_put_sync() decrements it to -1. This causes the following warning on module unload: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 963 at kernel/kthread.c:1430 kthread_destroy_worker+0x84/0x98 ... vdec 30210000.video-codec: Runtime PM usage count underflow! Fixes: 9707a6254a8a ("media: chips-media: wave5: Add the v4l2 layer") Cc: stable@vger.kernel.org Signed-off-by: Xulin Sun <xulin.sun@windriver.com> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>