summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2026-05-19drm/amdkfd: Check bounds for allocate_sdma_queue restore_sdma_idDavid Francis
allocate_sdma_queue has an option where the sdma queue id can be specified (used by CRIU). We weren't bounds-checking that value. Confirm it's less than the maximum number of queues. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bfe9a7545b2a7be1c543f1741e16f2d5ec4116ae)
2026-05-19drm/amdgpu: use atomic operation to achieve lockless serializationSunil Khatri
In amdgpu_seq64_alloc there is a possibility that two difference cores from two separate NODES can try to and could get the same free slot. So this fixes that race here using atomic test_and_set clear operations. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4d50a14d346141e03a7c3905e496d91e048bc30c)
2026-05-19drm/amdkfd: Check bounds on allocate_doorbellDavid Francis
allocated_doorbell has an option to set the doorbell id to a specific value (used by CRIU). This value was not bounds checked. Check to confirm it's less than KFD_MAX_NUM_OF_QUEUES_PER_PROCESS. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1f087bb8cf9e8797633da35c85435e557ef74d06)
2026-05-19drm/amdgpu/vce3: Fix VCE 3 firmware size and offsetsTimur Kristóf
The VCPU BO contains the actual FW at an offset, but it was not calculated into the VCPU BO size. Subtract this from the FW size to make sure there is no out of bounds access. This may fix VM faults when using VCE 3. Cc: John Olender <john.olender@gmail.com> Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 15c369257bd85f47a514744f960c5a51c867716f)
2026-05-19drm/amdgpu/vce2: Fix VCE 2 firmware size and offsetsTimur Kristóf
The VCPU BO contains the actual FW at an offset, but it was not calculated into the VCPU BO size. Subtract this from the FW size to make sure there is no out of bounds access. Additionally, increase the VCE_V2_0_DATA_SIZE to have extra space after the VCE handles. Also increase the data size used for each VCE handle. The FW needs 23744 bytes, use 24K to be safe. This fixes VM faults when using VCE 2. Cc: John Olender <john.olender@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4802 Fixes: e98226221467 ("drm/amdgpu: recalculate VCE firmware BO size") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a20d21df625548c1738c0745f753c5d6eb823bc3)
2026-05-19drm/amdgpu/vce1: Stop using amdgpu_vce_resumeTimur Kristóf
The VCE1 firmware works slightly differently and is already loaded by vce_v1_0_load_fw(). It doesn't actually need to call amdgpu_vce_resume(). Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 33d8951405e2dd81ac61edebc680e2dfb6b4fc9f)
2026-05-19drm/amdgpu/vce1: Fix VCE 1 firmware size and offsetsTimur Kristóf
The VCPU BO contains the actual FW at an offset, but it was not calculated into the VCPU BO size. Subtract this from the FW size to make sure there is no out of bounds access. Make sure the stack and data offsets are aligned to the 32K TLB size. Check that the FW microcode actually fits in the space that is reserved for it. Fixes: d4a640d4b9f3 ("drm/amdgpu/vce1: Implement VCE1 IP block (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c16fe59f622a080fc457a57b3e8f14c780699449)
2026-05-19drm/amdgpu/vce1: Don't repeat GTT MGR node allocationTimur Kristóf
Only allocate entries from the GTT manager when the VCE GTT node is not allocated yet. This prevents the possibility of allocating them multiple times, which causes issues during GPU reset and suspend/resume. Fixes: 71aec08f80e7 ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8d2a20c1721cb17e22821e1b4ecbb02d475d91c5)
2026-05-19drm/amdgpu/vce1: Check if VRAM address is lower than GART.Timur Kristóf
Previously, I had assumed this was not possible so it was OK to not handle it, but now we got a report from a user who has a board that is configured this way. When the VCPU BO is already located in a low 32-bit address in VRAM (eg. when VRAM is mapped to the low address space), don't do the workaround. Fixes: 71aec08f80e7 ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f370ec9b164698a9ca1a7b59bfbea07f70df769d)
2026-05-19drm/amdgpu/vce1: Remove superfluous address checkTimur Kristóf
The same thing is already checked a few lines above. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c1dc555e760dbfc4a4710f7270f525a03d433af8)
2026-05-19drm/amdgpu/vce1: Check that the GPU address is < 128 MiBTimur Kristóf
When ensuring the low 32-bit address, make sure it is less than 128 MiB, otherwise the VCE seems to fail to initialize. This seems to be an undocumented limitation of the firmware validation mechanism. Note that in case of VCE1 the BAR address is zero and we can't change it also due to the firmware validator. When programming the mmVCE_VCPU_CACHE_OFFSETn registers, don't AND them with a mask. This is incorrect because the register mask is actually 0x0fffffff and useless because we already ensure the addresses are below the limit. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e729ae5f3ac73c861c062080ac8c3d666c972404)
2026-05-19drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti (v2)Timur Kristóf
The TLB is organized in groups of 8 entries, each one is 4K. On Tahiti, the HW requires these GART entries to be 32K-aligned. This fixes a VCE 1 firmware validation failure that can happen after suspend/resume since we use amdgpu_gtt_mgr for VCE 1. v2: - Change variable declaration order - Add comment about "V bit HW bug" Fixes: 698fa62f56aa ("drm/amdgpu: Add helper to alloc GART entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 530411b465ef0b2c0cc18c2e3d7e38422b1117d1)
2026-05-19drm/amdkfd: Fix OOB memory exposure in get_wave_state()Sunday Clement
The get_wave_state() function for v9 trusts cp_hqd_cntl_stack_size and cp_hqd_cntl_stack_offset values read directly from the MQD, which are written by GPU microcode and fully attacker-controlled on the CRIU-restore path (via AMDKFD_IOC_RESTORE_PROCESS with H3). this leads to an unbounded copy_to_user() that can leak adjacent GTT/kernel memory. If offset > size, integer underflow produces a ~4 GiB read length, if size is set to 1 MiB against a 4 KiB allocation, we leak 1 MiB of adjacent kernel memory (other queues' MQDs, ring buffers, KASLR pointers). Fix by clamping both cp_hqd_cntl_stack_size to the actual allocated buffer size (q->ctl_stack_size) and cp_hqd_cntl_stack_offset to the clamped size before performing arithmetic and copy_to_user(). This ensures we never read beyond the allocated kernel BO regardless of attacker-supplied MQD field values. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7ef144458f48d5589e36f1b3d83e83db2e5c5ba5)
2026-05-19drm/amd/pm: fix memleak of dpm_policies on smu v15Yang Wang
In smu_v15_0_fini_smc_tables, dpm_policies was not freed or NULLed, causing a memory leak. Add kfree() and NULL assignment to properly release memory and avoid dangling pointers. Fixes: 2beedc3a92b7 ("drm/amd/pm: Add initial support for smu v15_0_8"); Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 014f329074f688b9b49383e8b70e79e9ef99359e)
2026-05-19drm/amdgpu: Fix discovery offset check under VFLijo Lazar
Discovery table may be kept at offset 0 by host driver. Remove the validation check. Fixes: 01bdc7e219c4 ("drm/amdgpu: New interface to get IP discovery binary v3") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d3f5bbd007133c64a20e81ef290a93e46c75df40)
2026-05-19drm/amdgpu: remove va cursors for all mappingsSunil Khatri
va_cursor struct needs to be cleaned even if the mapping has been removed already. Also simplify it by make it a void function as return value check isn't needed as its called during tear down. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4d35a45c9b4c1ac5b6e3219f83c3db706b675fa2)
2026-05-19drm/amdgpu: reject non-user addresses early in GEM_USERPTR ioctlAmir Shetaia
amdgpu_gem_userptr_ioctl() currently accepts any value of args->addr and only discovers an out-of-range pointer much later, inside amdgpu_gem_object_create() and the HMM mirror registration path. Userspace can drive that path with kernel-side virtual addresses; the get_user_pages() layer rejects them, but only after the driver has already allocated a GEM object and started wiring up notifier state that then has to be torn down on failure. Add an access_ok() guard at the top of the ioctl, right after the existing page-alignment check and before flag validation, so any address that does not lie within the calling task's user address range is rejected with -EFAULT before any allocation occurs. No legitimate ROCm/HSA userspace passes kernel-mode pointers through this interface, so this is defense-in-depth rather than a behaviour change for valid callers; -EFAULT matches the convention already used by other uaccess-style rejections in the kernel. Also add an explicit #include <linux/uaccess.h>; access_ok() is otherwise only available transitively through other headers in this translation unit. Signed-off-by: Amir Shetaia <Amir.Shetaia@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7a076df36397d780d7e4fb595287b4980451a7f5)
2026-05-19drm/amdgpu/vpe: Force collaborate sync after TRAPAlan Liu
VPE1 could possibly hang and fail to power off at the end of commands in collaboration mode. This workaround adds a COLLAB_SYNC after TRAP to force instances synchronized to avoid VPE1 fail to power off. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alan liu <haoping.liu@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5171 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a8b749c5c5afb7e5daa2bfb95d958fb3c6b8f055) Cc: stable@vger.kernel.org
2026-05-19drm/amdgpu/userq: update the vm task info during signal ioctlSunil Khatri
Pagefaults does not have process information correctly populated as vm->task is not set during vm_init but should be updated while real submission. So setting that up during signal_ioctl to get the correct submission process details. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a9b14d88b4d83e21ab965f23d1fb7b07b87e0517)
2026-05-19drm/amdgpu/userq: cancel reset work while tear down in progressSunil Khatri
While tear down of a userq_mgr is happening when all the queues are free we should cancel any reset work if pending before exiting. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 160164609f71f774c4f661227a9b7a370a86b112)
2026-05-19drm/amdgpu: rework userq reset work handlingChristian König
It is illegal to schedule reset work from another reset work! Fix this by scheduling the userq reset work directly on the work queue of the reset domain. Not fully tested, I leave that to the IGT test cases. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit fd9200ccefab94f27877d1943761d6b0ccbd89c8)
2026-05-19drm/amdgpu/userq: pin mqd and fw object bo to avoid evictionSunil Khatri
mqd and fw objects are queue core objects which should remain valid and never be unmapped and evicted for user queues to work properly. During eviction if these buffers are evicted the hw continue to use the invalid addresses and caused page faults and system hung. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a3bbf32a336939a1d21b9561f8e53333b684b7ef)
2026-05-19drm/amdgpu/userq: use drm_exec in amdgpu_userq_fence_read_wptrSunil Khatri
To access the bo from vm mapping first lock the root bo and then the object bo of the mapping to make sure both locks are taken safely. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3aab50410653fe7eb35eb6f9c2b27e3549ab09e6)
2026-05-19drm/xe/multi_queue: Fix secondary queue error caseNiranjana Vishwanathapura
If xe_lrc_create() fails, the secondary queue added to the multi-queue group list is not removed before freeing the queue. Fix error path handling for secondary queues by removing it from the multi-queue group list at the right place. Reported-by: Sebastian Österlund <sebastian.osterlund@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7979 Fixes: d716a5088c88 ("drm/xe/multi_queue: Handle tearing down of a multi queue") Cc: stable@vger.kernel.org # v7.0+ Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260518191639.320890-2-niranjana.vishwanathapura@intel.com (cherry picked from commit d2d23c12789cf69eddc35b8d38cd8eaabd0168f1) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-05-19drm/bridge: megachips: remove bridge when irq request failsOsama Abdelkader
If devm_request_threaded_irq() fails after drm_bridge_add(), remove the bridge before returning. Keep drm_bridge_add() rather than devm_drm_bridge_add(): registration is tied to the STDP4028 device while ge_b850v3_register() may complete from either I2C probe; devm would not unwind the bridge if the other client's probe fails. Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Fixes: fcfa0ddc18ed ("drm/bridge: Drivers for megachips-stdpxxxx-ge-b850v3-fw (LVDS-DP++)") Cc: stable@vger.kernel.org Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Tested-by: Ian Ray <ian.ray@gehealthcare.com> Link: https://patch.msgid.link/20260430195700.80317-1-osama.abdelkader@gmail.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
2026-05-19drm/bridge: chipone-icn6211: use devm_drm_bridge_add in i2c probeOsama Abdelkader
Use devm_drm_bridge_add() so the bridge is released if probe fails after registration, and drop drm_bridge_remove() in chipone_i2c_probe. Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Fixes: 8dde6f7452a1 ("drm: bridge: icn6211: Add I2C configuration support") Cc: stable@vger.kernel.org Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://patch.msgid.link/20260430194944.78119-1-osama.abdelkader@gmail.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
2026-05-19drm/i915/psr: Apply Intel DPCD workaround when SDP on prior line usedJouni Högander
There is Intel specific workaround DPCD address containing workaround for case where SDP is on prior line. Apply this workaround according to values in the offset. Fixes: 61e887329e33 ("drm/i915/xelpd: Handle PSR2 SDP indication in the prior scanline") Cc: <stable@vger.kernel.org> # v5.15+ Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260515095756.2799483-4-jouni.hogander@intel.com (cherry picked from commit c3fe899fbeac86ea4a5ca9dd845b2cbc0da46249) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2026-05-19drm/i915/psr: Read Intel DPCD workaround registerJouni Högander
Read Intel DPCD workaround register and store it into intel_connector->dp.psr_caps. psr_caps was chosen as currently it contains only PSR workaround for PSR2 SDP on prior scanline implementation. Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260515095756.2799483-3-jouni.hogander@intel.com (cherry picked from commit c48ff24d0f4ab7ad696b2d35ad64ce7e049c668c) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2026-05-19drm/i915/psr: Add defininitions for INTEL_WA_REGISTER_CAPS DPCD registerJouni Högander
EDP specification says: "If either VSC SDP is unable to be transmitted 100 ns before the SU region, the Source device may optionally transmit the VSC SDP during the prior video scan line’s HBlank period There is a Intel specific drm dp register currently containing bits related how TCON can support PSR2 with SDP on prior line." Unfortunately many panels are having problems in implementing this. So there is a custom Intel specific DPCD register (INTEL_WA_REGISTER_CAPS) to figure out if this is properly implemented on a panel or if panel doesn't require that 100 ns delay before the SU region. Here are the definitions in this custom DPCD address: 0 = Panel doesn't support SDP on prior line 1 = Panel supports SDP on prior line 2 = Panel doesn't have 100ns requirement 3 = Reserved Add definitions for this new register and it's values into new header intel_dpcd.h. v2: add INTEL_DPCD_ prefix to definitions Bspec: 74741 Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260515095756.2799483-2-jouni.hogander@intel.com (cherry picked from commit 1da1c9294825f08f622c473480d185680c2a3b75) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2026-05-19drm/i915/dp: Fix readback for target_rr in Adaptive Sync SDPAnkit Nautiyal
Correct the bit-shift logic to properly readback the 10 bit target_rr from DB3 and DB4. v2: Align the style with readback for vtotal. (Ville) Fixes: 12ea89291603 ("drm/i915/dp: Add Read/Write support for Adaptive Sync SDP") Cc: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com> Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260511123218.1589830-2-ankit.k.nautiyal@intel.com (cherry picked from commit f7abc4af2b19240a145a221461dfe756cc01d74a) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2026-05-19drm/i915/display: Copy color pipeline from plane in the primary joiner pipeChaitanya Kumar Borah
When copying plane color state in a joiner configuration, use the plane in the primary joiner pipe since it carries the pipeline number selected by the user-space. This assumes that all pipes in the joiner are symmetric in their plane color capabilities. Cc: stable@vger.kernel.org # v6.19+ Fixes: a78f1b6baf4d ("drm/i915/color: Add framework to program CSC") Tested-by: Vidya Srinivas <vidya.srinivas@intel.com> Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://patch.msgid.link/20260511053213.3122314-2-chaitanya.kumar.borah@intel.com (cherry picked from commit e8308fb5e05ca08ddfb8b46f6d947a6e3fd80cd7) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2026-05-18drm/v3d: Release indirect CSD GEM reference on CPU job freeMaíra Canal
v3d_get_cpu_indirect_csd_params() takes a reference to the indirect BO via drm_gem_object_lookup() and stashes it in cpu_job->indirect_csd.indirect, but nothing on the CPU job teardown path ever drops that reference. Drop the extra reference in v3d_cpu_job_free(). The NULL check covers ioctl errors before the lookup ran and CPU job types other than V3D_CPU_JOB_TYPE_INDIRECT_CSD, which leave the field zero-initialised. Cc: stable@vger.kernel.org Fixes: 18b8413b25b7 ("drm/v3d: Create a CPU job extension for a indirect CSD job") Assisted-by: Claude:claude-opus-4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patch.msgid.link/20260515-v3d-cpu-job-leaks-v1-2-7f147cbbf935@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2026-05-18drm/v3d: Fix use-after-free of CPU job query arrays on error pathMaíra Canal
The CPU job ioctl's fail label calls kvfree() on cpu_job's timestamp and performance query arrays after v3d_job_cleanup(), which drops the job's last reference and frees cpu_job. Reading cpu_job at that point is a use-after-free. Also, on the early v3d_job_init() failure path, it is a NULL dereference, since v3d_job_deallocate() zeroes the local pointer. In the success path, the arrays are released from the scheduler's .free_job callback, but on the error path, they are freed manually, as the job was never pushed to the scheduler. While the success path deals with this correctly, the fail path doesn't. On top of that, the manual kvfree() calls only free the array storage; they don't drm_syncobj_put() the per-query syncobjs that v3d_timestamp_query_info_free() and v3d_performance_query_info_free() release on the success path. So the same fail path that triggers the use-after-free also leaks one syncobj reference per query. Unify the CPU job teardown into the CPU job's kref destructor, mirroring v3d_render_job_free(). The scheduler's .free_job slot reverts to the generic v3d_sched_job_free() and the fail label drops the manual kvfree() calls, leaving a single teardown path that is reached from both the scheduler and the ioctl error path. That removes the use-after-free, the NULL dereference, and the syncobj leak by construction. Cc: stable@vger.kernel.org Fixes: 9ba0ff3e083f ("drm/v3d: Create a CPU job extension for the timestamp query job") Assisted-by: Claude:claude-opus-4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patch.msgid.link/20260515-v3d-cpu-job-leaks-v1-1-7f147cbbf935@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2026-05-18drm/mediatek: mtk_hdmi_ddc: Fix non-static global variableLouis-Alexis Eyraud
The struct 'mtk_hdmi_ddc_driver' is not used outside of the mtk_hdmi_ddc.c file, so make it static to silence sparse warning: ``` drivers/gpu/drm/mediatek/mtk_hdmi_ddc.c:331:24: sparse: warning: symbol 'mtk_hdmi_ddc_driver' was not declared. Should it be static? ``` Fixes: c241118b6216 ("drm/mediatek: mtk_hdmi_ddc: Switch to register as module_platform_driver") Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20260429-mediatek-drm-fix-sparse-warnings-v1-4-d95c4d118b83@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2026-05-18drm/mediatek: mtk_cec: Fix non-static global variableLouis-Alexis Eyraud
The struct 'mtk_cec_driver' is not used outside of the mtk_cec.c file, so make it static to silence sparse warning: ``` drivers/gpu/drm/mediatek/mtk_cec.c:243:24: sparse: warning: symbol 'mtk_cec_driver' was not declared. Should it be static? ``` Fixes: 1e914a89ab7e ("drm/mediatek: mtk_cec: Switch to register as module_platform_driver") Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20260429-mediatek-drm-fix-sparse-warnings-v1-3-d95c4d118b83@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2026-05-18drm/mediatek: mtk_hdmi_v2: Fix non-static global variableLouis-Alexis Eyraud
The struct 'mtk_hdmi_v2_clk_names' is not used outside of the mtk_hdmi_v2.c file, so make it static to silence sparse warning: ``` drivers/gpu/drm/mediatek/mtk_hdmi_v2.c:53:12: sparse: warning: symbol 'mtk_hdmi_v2_clk_names' was not declared. Should it be static? ``` Fixes: 8d0f79886273 ("drm/mediatek: Introduce HDMI/DDC v2 for MT8195/MT8188") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202604132044.fcYjEcU8-lkp@intel.com/ Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20260429-mediatek-drm-fix-sparse-warnings-v1-2-d95c4d118b83@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2026-05-18drm/mediatek: mtk_hdmi_ddc_v2: Fix non-static global variableLouis-Alexis Eyraud
The struct 'mtk_hdmi_ddc_v2_driver' is not used outside of the mtk_hdmi_ddc_v2.c file, so make it static to silence sparse warning: ``` drivers/gpu/drm/mediatek/mtk_hdmi_ddc_v2.c:392:24: sparse: warning: symbol 'mtk_hdmi_ddc_v2_driver' was not declared. Should it be static? ``` Fixes: 8d0f79886273 ("drm/mediatek: Introduce HDMI/DDC v2 for MT8195/MT8188") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202604132044.fcYjEcU8-lkp@intel.com/ Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20260429-mediatek-drm-fix-sparse-warnings-v1-1-d95c4d118b83@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2026-05-18drm/xe: Define and use MCR version of COMMON_SLICE_CHICKEN4Gustavo Sousa
The register COMMON_SLICE_CHICKEN4 is a MCR register on both Xe2 and Xe3. Let's make sure to define a MCR version of it and use it for the relevant IP versions. Use XEHP_ as prefix for the register name, since it is MCR as of Xe_HP. v2: - Also change for one entry in lrc_tunnings, which was caught by manual testing and add corresponging Fixes tag in commit message. (Gustavo) Fixes: 8d6f16f1f082 ("drm/xe: Extend Wa_22021007897 to Xe3 platforms") Fixes: e5c13e2c505b ("drm/xe/xe2hpg: Add Wa_22021007897") Fixes: 8ccf5f6b2295 ("drm/xe/tuning: Apply windower hardware filtering setting on Xe3 and Xe3p") Bspec: 66534, 71185, 74417 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260514-rtp-mcr-check-v3-3-30dd47855fee@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> (cherry picked from commit 75f65f1a4c06da1d87f28570a9d4cdad28f13360) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-05-18drm/xe: Define and use MCR version of COMMON_SLICE_CHICKEN1Gustavo Sousa
The register COMMON_SLICE_CHICKEN1 is a MCR register on Xe2. Let's make sure to define a MCR version of it and use it for the relevant IP versions. Use XEHP_ as prefix for the register name, since it is MCR as of Xe_HP. Fixes: a5d221924e13 ("drm/xe/xe2_hpg: Add set of workarounds") Fixes: 9f18b55b6d3f ("drm/xe/xe2: Add workaround 18033852989") Bspec: 66534, 71185 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260514-rtp-mcr-check-v3-2-30dd47855fee@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> (cherry picked from commit a672725fdbfc3ea430130039d677c7dc98d59df8) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-05-18drm/xe: Define CACHE_MODE_1 as MCR registerGustavo Sousa
CACHE_MODE_1 is a MCR register for all platforms that currently use it in the Xe driver. Use XE_REG_MCR() when defining it. Fixes: 8cd7e9759766 ("drm/xe: Add missing DG2 lrc workarounds") Fixes: ff063430caa8 ("drm/xe/mtl: Add some initial MTL workarounds") Bspec: 66534, 67788 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260514-rtp-mcr-check-v3-1-30dd47855fee@intel.com Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> (cherry picked from commit 8f765f0c054e0fb39980a76b4c899b027395929d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-05-18drm/xe/pf: Fix CFI failure in debugfs accessMohanram Meenakshisundaram
Reading debugfs file (/sys/kernel/debug/dri/0/gt*/pf/adverse_events) with CFI (Control Flow Integrity) enabled, the kernel panics at xe_gt_debugfs_simple_show+0x82/0xc0. xe_gt_debugfs_simple_show() declare a function pointer expecting int return type, but xe_gt_sriov_pf_monitor_print_events() is void return type, leading to CFI failure and kernel panic. [507620.973657] CFI failure at xe_gt_debugfs_simple_show+0x82/0xc0 [xe] (target: xe_gt_sriov_pf_monitor_print_events+0x0/0x130 [xe]; expected type: 0xd72c7139) Fix xe_gt_sriov_pf_monitor_print_events() function by updating to return an int type. Fixes: 1c99d3d3edab ("drm/xe/pf: Expose PF monitor details via debugfs") Signed-off-by: Mohanram Meenakshisundaram <mohanram.meenakshisundaram@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260514174918.1556357-2-mohanram.meenakshisundaram@intel.com (cherry picked from commit ff1d386a8359746d9699ac30336e3b0684c68958) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-05-18drm/xe/vf: Fix signature of print functionsMichal Wajdeczko
We have plugged-in existing VF print functions into our GT debugfs show helper as-is, but we missed that the helper expects functions to return int, while they were defined as void. This can lead to errors being reported when CFI is enabled. Fixes: 63d8cb8fe3dd ("drm/xe/vf: Expose SR-IOV VF attributes to GT debugfs") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Mohanram Meenakshisundaram <mohanram.meenakshisundaram@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260514155726.7165-1-michal.wajdeczko@intel.com (cherry picked from commit 314e31c9a8a1c421ee4f7f755b9348aefbbca090) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-05-18drm/xe/gsc: Fix double-free of managed BO in error pathShuicheng Lin
The error path in xe_gsc_init_post_hwconfig() explicitly frees a BO allocated with xe_managed_bo_create_pin_map() via xe_bo_unpin_map_no_vm(). Since the managed BO already has a devm cleanup action registered, this causes a double-free when devm unwinds during probe failure. Remove the explicit free and let devm handle it, consistent with all other xe_managed_bo_create_pin_map() callers. Fixes: 2e5d47fe7839 ("drm/xe/uc: Use managed bo for HuC and GSC objects") Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Assisted-by: Claude:claude-opus-4.6 Link: https://patch.msgid.link/20260511154134.223696-1-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> (cherry picked from commit 71d61e3e299a17139e47f980a4d6f425b2c59bf7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-05-18drm/xe/memirq: Update interrupt handler logicMichal Wajdeczko
To workaround some corner case hardware limitations, new programming note for the memory based interrupt handler suggests to assume that some status bytes, like GT_MI_USER_INTERRUPT and GUC_INTR_GUC2HOST, are always set. Update our interrupt handler to follow the new rules. Bspec: 53672 Fixes: a6581ebe7685 ("drm/xe/vf: Introduce Memory Based Interrupts Handler") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patch.msgid.link/20260511172838.2299-2-michal.wajdeczko@intel.com (cherry picked from commit 284f4cae4579eed9dd4406f18a6c1becc69f8931) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-05-18drm/gem: Make the GEM LRU lock part of drm_deviceBoris Brezillon
Recently, a few races have been discovered in the GEM LRU logic, all of them caused by the fact the LRU lock is accessed through gem->lru->lock, and that very same lock also protects changes to gem->lru, leading to situations where gem->lru needs to first be accessed without the lock held, to then get the lru to access the lock through and finally take the lock and do the expected operation. Currently, the only driver making use of this API (MSM) declares a device-wide lock, and the user we're about to add (panthor) will do the same. There's no evidence that we will ever have a driver that wants different pools of LRUs protected by different locks under the same drm_device. So we're better off moving this lock to drm_device and always locking it through obj->dev->gem_lru_mutex, or directly through dev->gem_lru_mutex. If anyone ever needs more fine-grained locking, this can be revisited to pass some drm_gem_lru_pool object representing the pool of LRUs under a specific lock, but for now, the per-device lock seems to be enough. Fixes: e7c2af13f811 ("drm/gem: Add LRU/shrinker helper") Reported-by: Chia-I Wu <olvaffe@gmail.com> Closes: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/86 Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Link: https://patch.msgid.link/20260518-panthor-shrinker-fixes-v4-1-1920234470d5@collabora.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2026-05-18drm/bridge: it66121: acquire reset GPIO in probeJulien Chauveau
The it66121_ctx structure has a gpio_reset field, and it66121_hw_reset() calls gpiod_set_value() on it. However, the GPIO descriptor is never acquired via devm_gpiod_get(), leaving gpio_reset as NULL throughout the driver lifetime. gpiod_set_value() silently returns when passed a NULL descriptor, so the hardware reset sequence in it66121_hw_reset() is a no-op. This leaves the chip in an undefined state at probe time, which can prevent it from responding on the I2C bus. The DT binding marks reset-gpios as a required property, so all compliant device trees provide this GPIO. Add the missing devm_gpiod_get() call after enabling power supplies and before the hardware reset, so the chip is properly reset with power applied. Fixes: 988156dc2fc9 ("drm: bridge: add it66121 driver") Cc: stable@vger.kernel.org Signed-off-by: Julien Chauveau <chauveau.julien@gmail.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Tested-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patch.msgid.link/20260324193011.16583-1-chauveau.julien@gmail.com Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
2026-05-16drm/msm/snapshot: fix dumping of the unaligned regionsDmitry Baryshkov
The snapshotting code internally aligns data segment to 16 bytes. This works fine for DPU code (where most of the regions are aligned), but fails for snapshotting of the DSI data (because DSI data region is shifted by 4 bytes). Fix the code by removing length alignment and by accurately printing last registers in the region. While reworking the code also fix the 16x memory overallocation in msm_disp_state_dump_regs(). Fixes: 98659487b845 ("drm/msm: add support to take dpu snapshot") Reported-by: Salendarsingh Gaud <sgaud@qti.qualcomm.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/725449/ Message-ID: <20260516-msm-fix-dsi-dump-2-v2-1-9e49fb2d240e@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2026-05-16drm: Replace old pointer to new idrEdward Adam Davis
Commit 5e28b7b94408 introduced a logical error by failing to replace the newly generated IDR pointer to old id's pointer at the correct location within the "change handle" logic; this resulted in the issue reported by syzbot [1]. Specifically, the new IDR object pointer is intended to replace the original id's pointer during the normal execution flow. Additionally, an unnecessary conditional check for the ret exit path has been removed. [1] !RB_EMPTY_ROOT(&prime_fpriv->dmabufs) WARNING: drivers/gpu/drm/drm_prime.c:224 at drm_prime_destroy_file_private+0x48/0x60 drivers/gpu/drm/drm_prime.c:224, CPU#0: syz.0.17/5833 Call Trace: drm_file_free.part.0+0x7e6/0xcc0 drivers/gpu/drm/drm_file.c:269 drm_file_free drivers/gpu/drm/drm_file.c:237 [inline] drm_close_helper.isra.0+0x186/0x200 drivers/gpu/drm/drm_file.c:290 drm_release+0x1ab/0x360 drivers/gpu/drm/drm_file.c:438 Fixes: 5e28b7b94408 ("drm: Set old handle to NULL before prime swap in change_handle") Reported-by: syzbot+d7c9eed171647e421013@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d7c9eed171647e421013 Cc: stable@vger.kernel.org Tested-by: syzbot+d7c9eed171647e421013@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patch.msgid.link/tencent_C267296443AAA4567771176886DFF364A305@qq.com
2026-05-16Merge tag 'drm-misc-fixes-2026-05-15' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: bridge: - imx8qxp-pxl2dpi: avoid ERR_PTR with device_node cleanup gma500: - oaktrail_lvds: fix i2c handling loongson: - use managed cleanup for connector polling panfrost: - handle results from reservation locking correctly qaic: - check for integer overflows in mmap logic rocket: - handle results from reservation locking correctly ttm: - avoid infinite loop in swap out - avoid infinite loop in BO shrinking - convert -EAGAIN from dmem_cgroup_try_charge to -ENOSPC Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260515070816.GA88575@2a02-2455-9062-2500-7dec-552d-233d-9fe0.dyn6.pyur.net
2026-05-16Merge tag 'drm-xe-fixes-2026-05-14' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Madvise fix around purgeability tracking (Arvind) - Restore engine mask for specific blitter style (Roper) - Couple UAF fixes (Auld) - Drop unused ggtt_balloon field (Wajdeczko) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/agXWkM3Y98bqt6TG@intel.com