summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)Author
2025-12-12drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaimBrian Nguyen
There are additional hardware managed L2$ flushing such as the transient display. In those scenarios, page reclamation is unnecessary resulting in redundant cacheline flushes, so skip over those corresponding ranges. v2: - Elaborated on reasoning for page reclamation skip based on Tejas's discussion. (Matthew A, Tejas) v3: - Removed MEDIA_IS_ON due to racy condition resulting in removal of relevant registers and values. (Matthew A) - Moved l3 policy access to xe_pat. (Matthew A) v4: - Updated comments based on previous change. (Tejas) - Move back PAT index macros to xe_pat.c. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-21-brian3.nguyen@intel.com
2025-12-12drm/xe: Append page reclamation action to tlb invalBrian Nguyen
Add page reclamation action to tlb inval backend. The page reclamation action is paired with range tlb invalidations so both are issued at the same time. Page reclamation will issue the TLB invalidation with an invalid seqno and a H2G page reclamation action with the fence's corresponding seqno and handle the fence accordingly on page reclaim action done handler. If page reclamation fails, tlb timeout handler will be responsible for signalling fence and cleaning up. v2: - add send_page_reclaim to patch. - Remove flush_cache and use prl_sa pointer to determine PPC flush instead of explicit bool. Add NULL as fallback for others. (Matthew B) v3: - Add comments for flush_cache with media. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-20-brian3.nguyen@intel.com
2025-12-12drm/xe: Prep page reclaim in tlb inval jobBrian Nguyen
Use page reclaim list as indicator if page reclaim action is desired and pass it to tlb inval fence to handle. Job will need to maintain its own embedded copy to ensure lifetime of PRL exist until job has run. v2: - Use xe variant of WARN_ON (Michal) v3: - Add comments for PRL tile handling and flush behavior with media. (Matthew Brost) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-19-brian3.nguyen@intel.com
2025-12-12drm/xe: Suballocate BO for page reclaimBrian Nguyen
Page reclamation feature needs the PRL to be suballocated into a GGTT-mapped BO. On allocation failure, fallback to default tlb invalidation with full PPC flush. PRL's BO allocation is managed in separate pool to ensure 4K alignment for proper GGTT address. With BO, pass into TLB invalidation backend and modify fence to accomadate accordingly. v2: - Removed page reclaim related variables from TLB fence. (Matthew B) - Allocate PRL bo size to num_entries. (Matthew B) - Move PRL bo allocation to tlb_inval run_job. (Matthew B) v5: - Use xe_page_reclaim_list_valid. (Matthew B) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-18-brian3.nguyen@intel.com
2025-12-12drm/xe: Create page reclaim list on unbindBrian Nguyen
Page reclaim list (PRL) is preparation work for the page reclaim feature. The PRL is firstly owned by pt_update_ops and all other page reclaim operations will point back to this PRL. PRL generates its entries during the unbind page walker, updating the PRL. This PRL is restricted to a 4K page, so 512 page entries at most. v2: - Removed unused function. (Shuicheng) - Compacted warning checking, update commit message, spelling, etc. (Shuicheng, Matthew B) - Fix kernel docs - Moved PRL max entries overflow handling out from generate_reclaim_entry to caller (Shuicheng) - Add xe_page_reclaim_list_init for clarity. (Matthew B) - Modify xe_guc_page_reclaim_entry to use macros for greater flexbility. (Matthew B) - Add fallback for PTE outside of page reclaim supported 4K, 64K, 2M pages (Matthew B) - Invalidate PRL for early abort page walk. - Removed page reclaim related variables from tlb fence (Matthew Brost) - Remove error handling in *alloc_entries failure. (Matthew B) v3: - Fix NULL pointer dereference check. - Modify reclaim_entry to QW and bitfields accordingly. (Matthew B) - Add vm_dbg prints for PRL generation and invalidation. (Matthew B) v4: - s/GENMASK/GENMASK_ULL && s/BIT/BIT_ULL (CI) v5: - Addition of xe_page_reclaim_list_is_new() to avoid continuous allocation of PRL if consecutive VMAs cause a PRL invalidation. - Add xe_page_reclaim_list_valid() helpers for clarity. (Matthew B) - Move xe_page_reclaim_list_entries_put in xe_page_reclaim_list_invalidate. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-17-brian3.nguyen@intel.com
2025-12-12drm/xe/guc: Add page reclamation interface to GuCBrian Nguyen
Add page reclamation related changes to GuC interface, handlers, and senders to support page reclamation. Currently TLB invalidations will perform an entire PPC flush in order to prevent stale memory access for noncoherent system memory. Page reclamation is an extension of the typical TLB invalidation workflow, allowing disabling of full PPC flush and enable selective PPC flushing. Selective flushing will be decided by a list of pages whom's address is passed to GuC at time of action. Page reclamation interfaces require at least GuC FW ver 70.31.0. v2: - Moved send_page_reclaim to first patch usage. - Add comments explaining shared done handler. (Matthew B) - Add FW version fallback to disable page reclaim on older versions. (Matthew B, Shuicheng) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-16-brian3.nguyen@intel.com
2025-12-12drm/xe: Add page reclamation info to device infoOak Zeng
Starting from Xe3p, HW adds a feature assisting range based page reclamation. Introduce a bit in device info to indicate whether device has such capability. Signed-off-by: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-15-brian3.nguyen@intel.com
2025-12-12drm/xe/xe_tlb_inval: Modify fence interface to support PPC flushBrian Nguyen
Allow tlb_invalidation to control when driver wants to flush the Private Physical Cache (PPC) as a process of the tlb invalidation process. Default behavior is still to always flush the PPC but driver now has the option to disable it. v2: - Revise commit/kernel doc descriptions. (Shuicheng) - Remove unused function. (Shuicheng) - Remove bool flush_cache parameter from fence, and various function inputs. (Matthew B) Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-14-brian3.nguyen@intel.com
2025-12-12drm/xe: Do not forward invalid TLB invalidation seqnos to upper layersMatthew Brost
Certain TLB invalidation operations send multiple H2G messages per seqno with only the final H2G containing the valid seqno - the others carry an invalid seqno. The G2H handler drops these invalid seqno to aovid prematurely signaling a TLB invalidation fence. With TLB_INVALIDATION_SEQNO_INVALID used to indicate in progress multi-step TLB invalidations, reset tdr to ensure that timeout won't prematurely trigger when G2H actions are still ongoing. v2: Remove lock from xe_tlb_inval_reset_timeout. (Matthew B) v3: Squash with dependent patch from Matthew Brost' series. Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251212213225.3564537-13-brian3.nguyen@intel.com
2025-12-13Merge tag 'drm-misc-fixes-2025-12-10' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.19-rc1: - Fix stack usage warning in novatek-nt35560. - Fix s/r, i2c issues in nouveau and update string handling. - Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83. - Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties(). - Fix devcoredump crash on reading evicted bo's. - Fix bigendian handling in mgag200. - Fix probe failure in tilcdc. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patch.msgid.link/6c371dc1-08bf-4a34-895c-9ef348b6061b@linux.intel.com
2025-12-12drm/xe: Restore engine registers before restarting schedulers after GT resetJan Maslak
During GT reset recovery in do_gt_restart(), xe_uc_start() was called before xe_reg_sr_apply_mmio() restored engine-specific registers. This created a race window where the scheduler could run jobs before hardware state was fully restored. This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint- waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE) wasn't restored before jobs started executing. Breakpoints would fail to trigger SIP entry because the debug enable bit wasn't set yet. Fix by moving xe_uc_start() after all MMIO register restoration, including engine registers and CCS mode configuration, ensuring all hardware state is fully restored before any jobs can be scheduled. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Jan Maslak <jan.maslak@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com
2025-12-12drm/xe: Increase TDF timeoutJagmeet Randhawa
There are some corner cases where flushing transient data may take slightly longer than the 150us timeout we currently allow. Update the driver to use a 300us timeout instead based on the latest guidance from the hardware team. An update to the bspec to formally document this is expected to arrive soon. Fixes: c01c6066e6fa ("drm/xe/device: implement transient flush") Signed-off-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/0201b1d6ec64d3651fcbff1ea21026efa915126a.1765487866.git.jagmeet.randhawa@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-12drm/xe/cri: Enable I2C controllerRaag Jadav
Enable I2C controller for Crescent Island and while at it, rely on has_i2c flag instead of manual platform checks. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20251128084414.306265-1-raag.jadav@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-12-12drm: Fix object leak in DRM_IOCTL_GEM_CHANGE_HANDLEKarol Wachowski
Add missing drm_gem_object_put() call when drm_gem_object_lookup() successfully returns an object. This fixes a GEM object reference leak that can prevent driver modules from unloading when using prime buffers. Fixes: 53096728b891 ("drm: Add DRM prime interface to reassign GEM handle") Cc: <stable@vger.kernel.org> # v6.18+ Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20251212134133.475218-1-karol.wachowski@linux.intel.com
2025-12-12drm/{i915, xe}/panic: move panic handling to parent interfaceJani Nikula
Move the panic handling to the display parent interface, making display more independent of i915 and xe driver implementations. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/e27eca5424479e8936b786018d0af19a34f839f6.1765474612.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-12drm/i915/panic: move i915 specific panic implementation to i915Jani Nikula
The intel_panic.c implementation is i915 specific, and xe has its own. Move it to i915 core as i915_panic.c. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/8dc7af0ae1f859d17b0be269a545146c5536d8fc.1765474612.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-12drm/tests: Handle EDEADLK in set_up_atomic_state()José Expósito
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_validate_modeset test [1]: # drm_test_check_connector_changed_modeset: EXPECTATION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:162 Expected ret == 0, but ret == -35 (0xffffffffffffffdd) Change the set_up_atomic_state() helper function to return on error and restart the atomic sequence when the returned error is EDEADLK. [1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744096/test_x86_64/11762450343/artifacts/jobwatch/logs/recipes/19797909/tasks/204139142/results/945095586/logs/dmesg.log Fixes: 73d934d7b6e3 ("drm/tests: Add test for drm_atomic_helper_commit_modeset_disables()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://patch.msgid.link/20251104102535.12212-2-jose.exposito89@gmail.com
2025-12-12drm/tests: Handle EDEADLK in drm_test_check_valid_clones()José Expósito
Fedora/CentOS/RHEL CI is reporting intermittent failures while running the drm_test_check_valid_clones() KUnit test. The error log can be either [1]: # drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == 0 (0x0) Or [2] depending on the test case: # drm_test_check_valid_clones: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_atomic_state_test.c:295 Expected ret == param->expected_result, but ret == -35 (0xffffffffffffffdd) param->expected_result == -22 (0xffffffffffffffea) Restart the atomic sequence when EDEADLK is returned. [1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2113057246/test_x86_64/11802139999/artifacts/jobwatch/logs/recipes/19824965/tasks/204347800/results/946112713/logs/dmesg.log [2] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744297/test_aarch64/11762450907/artifacts/jobwatch/logs/recipes/19797942/tasks/204139727/results/945094561/logs/dmesg.log Fixes: 88849f24e2ab ("drm/tests: Add test for drm_atomic_helper_check_modeset()") Closes: https://datawarehouse.cki-project.org/issue/4004 Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://patch.msgid.link/20251104102535.12212-1-jose.exposito89@gmail.com
2025-12-12drm/tests: hdmi: Handle drm_kunit_helper_enable_crtc_connector() returning ↵José Expósito
EDEADLK Fedora/CentOS/RHEL CI is reporting intermittent failures while running the KUnit tests present in drm_hdmi_state_helper_test.c [1]. While the specific test causing the failure change between runs, all of them are caused by drm_kunit_helper_enable_crtc_connector() returning -EDEADLK. The error trace always follow this structure: # <test name>: ASSERTION FAILED at # drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c:<line> Expected ret == 0, but ret == -35 (0xffffffffffffffdd) As documented, if the drm_kunit_helper_enable_crtc_connector() function returns -EDEADLK (-35), the entire atomic sequence must be restarted. Handle this error code for all function calls. Closes: https://datawarehouse.cki-project.org/issue/4039 [1] Fixes: 6a5c0ad7e08e ("drm/tests: hdmi_state_helpers: Switch to new helper") Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Link: https://patch.msgid.link/20251104102258.10026-1-jose.exposito89@gmail.com
2025-12-12Merge tag 'drm-intel-next-fixes-2025-12-12' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next drm/i915 fixes for v6.19-rc1: - Fix format string truncation warning - FIx runtime PM reference during fbdev BO creation Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/281309f78560bcceebac8d5c0511efe66baf641c@intel.com
2025-12-11drm/xe/doc: Add documentation for Multi Queue Group GuC interfaceNiranjana Vishwanathapura
Add kernel documentation for Multi Queue group GuC interface. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-36-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/doc: Add documentation for Multi Queue GroupNiranjana Vishwanathapura
Add kernel documentation for Multi Queue group and update the corresponding rst. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-35-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Support active group after primary is destroyedNiranjana Vishwanathapura
Add support to keep the group active after the primary queue is destroyed. Instead of killing the primary queue during exec_queue destroy ioctl, kill it when all the secondary queues of the group are killed. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-34-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Tracepoint supportNiranjana Vishwanathapura
Add xe_exec_queue_create_multi_queue event with multi-queue information. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-33-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Teardown group upon job timeoutNiranjana Vishwanathapura
Upon a job timeout, teardown the multi-queue group by triggering TDR on all queues of the multi-queue group and by skipping timeout checks in them. v5: Ban the group while triggering TDR for the guc reported errors Add FIXME in TDR to take multi-queue group off HW (Matt Brost) v6: Trigger cleanup of group only for multi-queue case Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-32-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Reset GT upon CGP_SYNC failureNiranjana Vishwanathapura
If GuC doesn't response to CGP_SYNC message, trigger GT reset and cleanup of all the queues of the multi queue group. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-31-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Handle CGP context errorNiranjana Vishwanathapura
Trigger multi-queue context cleanup upon CGP context error notification from GuC. v4: Fix error message Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-30-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Set QUEUE_DRAIN_MODE for Multi Queue batchesNiranjana Vishwanathapura
To properly support soft light restore between batches being arbitrated at the CFEG, PIPE_CONTROL instructions have a new bit in the first DW, QUEUE_DRAIN_MODE. When set, this indicates to the CFEG that it should only drain the current queue. Additionally we no longer want to set the CS_STALL bit for these multi queue queues as this causes the entire pipeline to stall waiting for completion of the prior batch, preventing this soft light restore from occurring between queues in a queue group. v4: Assert !multi_queue where applicable (Matt Roper) Bspec: 56551 Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-29-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Handle tearing down of a multi queueNiranjana Vishwanathapura
As all queues of a multi queue group use the primary queue of the group to interface with GuC. Hence there is a dependency between the queues of the group. So, when primary queue of a multi queue group is cleaned up, also trigger a cleanup of the secondary queues also. During cleanup, stop and re-start submission for all queues of a multi queue group to avoid any submission happening in parallel when a queue is being cleaned up. v2: Initialize group->list_lock, add fs_reclaim dependency, remove unwanted secondary queues cleanup (Matt Brost) v3: Properly handle cleanup of multi-queue group (Matt Brost) v4: Fix IS_ENABLED(CONFIG_LOCKDEP) check (Matt Brost) Revert stopping/restarting of submissions on queues of the group in TDR as it is not needed. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-28-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add multi queue information to guc_info dumpNiranjana Vishwanathapura
Dump multi queue specific information in the guc exec queue dump. v2: Move multi queue related fields inside the multi_queue sub-structure (Matt Brost) Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-27-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add support for multi queue dynamic priority changeNiranjana Vishwanathapura
Support dynamic priority change for multi queue group queues via exec queue set_property ioctl. Issue CGP_SYNC command to GuC through the drm scheduler message interface for priority to take effect. v2: Move is_multi_queue check to exec_queue layer and assert is_multi_queue being set in guc submission layer (Matt Brost) v3: Assert CGP_SYNC message length is valid (Matt Brost) Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-26-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add exec_queue set_property ioctl supportNiranjana Vishwanathapura
This patch adds support for exec_queue set_property ioctl. It is derived from the original work which is part of https://patchwork.freedesktop.org/series/112188/ Currently only DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE_PRIORITY property can be dynamically set. v2: Check for and update kernel-doc which property this ioctl supports (Matt Brost) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-25-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Handle invalid exec queue property settingNiranjana Vishwanathapura
Only MULTI_QUEUE_PRIORITY property is valid for secondary queues of a multi queue group. MULTI_QUEUE_PRIORITY only applies to multi queue group queues. Detect invalid user queue property setting and return error. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-24-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add multi queue priority propertyNiranjana Vishwanathapura
Add support for queues of a multi queue group to set their priority within the queue group by adding property DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE_PRIORITY. This is the only other property supported by secondary queues of a multi queue group, other than DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE. v2: Add kernel doc for enum xe_multi_queue_priority, Add assert for priority values, fix includes and declarations (Matt Brost) v3: update uapi kernel-doc (Matt Brost) v4: uapi change due to rebase Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-23-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add GuC interface for multi queue supportNiranjana Vishwanathapura
Implement GuC commands and response along with the Context Group Page (CGP) interface for multi queue support. Ensure that only primary queue (q0) of a multi queue group communicate with GuC. The secondary queues of the group only need to maintain LRCA and interface with drm scheduler. Use primary queue's submit_wq for all secondary queues of a multi queue group. This serialization avoids any locking around CGP synchronization with GuC. v2: Fix G2H_LEN_DW_MULTI_QUEUE_CONTEXT value, add more comments (Matt Brost) v3: Minor code refactro, use xe_gt_assert v4: Use xe_guc_ct_wake_waiters(), remove vf recovery support (Matt Brost) Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-22-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add user interface for multi queue supportNiranjana Vishwanathapura
Multi Queue is a new mode of execution supported by the compute and blitter copy command streamers (CCS and BCS, respectively). It is an enhancement of the existing hardware architecture and leverages the same submission model. It enables support for efficient, parallel execution of multiple queues within a single context. All the queues of a group must use the same address space (VM). The new DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE execution queue property supports creating a multi queue group and adding queues to a queue group. All queues of a multi queue group share the same context. A exec queue create ioctl call with above property specified with value DRM_XE_SUPER_GROUP_CREATE will create a new multi queue group with the queue being created as the primary queue (aka q0) of the group. To add secondary queues to the group, they need to be created with the above property with id of the primary queue as the value. The properties of the primary queue (like priority, timeslice) applies to the whole group. So, these properties can't be set for secondary queues of a group. Once destroyed, the secondary queues of a multi queue group can't be replaced. However, they can be dynamically added to the group up to a total of 64 queues per group. Once the primary queue is destroyed, secondary queues can't be added to the queue group. v2: Remove group->lock, fix xe_exec_queue_group_add()/delete() function semantics, add additional comments, remove unused group->list_lock, add XE_BO_FLAG_GGTT_INVALIDATE for cgp bo, Assert LRC is valid, update uapi kernel doc. (Matt Brost) v3: Use XE_BO_FLAG_PINNED_LATE_RESTORE/USER_VRAM/GGTT_INVALIDATE flags for cgp bo (Matt) v4: Ensure queue is not a vm_bind queue uapi change due to rebase Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-21-niranjana.vishwanathapura@intel.com
2025-12-11drm/xe/multi_queue: Add multi_queue_enable_mask to gt informationNiranjana Vishwanathapura
Add multi_queue_enable_mask field to the gt information structure which is bitmask of all engine classes with multi queue support enabled. v2: Rename multi_queue_enable_mask to multi_queue_engine_class_mask (Matt Brost) Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251211010249.1647839-20-niranjana.vishwanathapura@intel.com
2025-12-12Merge tag 'amd-drm-fixes-6.19-2025-12-11' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-6.19-2025-12-11: amdgpu: - SI fix - DC reduce stack usage - HDMI fixes - VCN 4.0.5 fix - DP MST fix - DC memory allocation fix amdkfd: - SVM fix - Trap handler fix - VGPR fixes for GC 11.5 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patch.msgid.link/20251211195600.1641924-1-alexander.deucher@amd.com
2025-12-12Merge tag 'drm-misc-next-fixes-2025-12-10' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next-fixes for v6.19-rc1: - Fix uaf in panthor. - Revert 8 byte alignment constraint for pitch in dumb bo's. - Fix DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC handling renasas. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patch.msgid.link/a82c2a2a-314f-403b-85bf-9b3ee09b903c@linux.intel.com
2025-12-11drm/plane: Fix IS_ERR() vs NULL bug drm_plane_create_color_pipeline_property()Dan Carpenter
The drm_property_create_enum() function returns NULL on error, it never returns error pointers. Fix the error checking to match. Fixes: 2afc3184f3b3 ("drm/plane: Add COLOR PIPELINE property") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Simon Ser <contact@emersion.fr> Link: https://patch.msgid.link/aTK9ZR0sMgqSACow@stanley.mountain
2025-12-11drm/vblank: prefer drm_crtc_vblank_crtc() over drm_vblank_crtc()Jani Nikula
Use the higher level function where crtc is available. v2: Rebase Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/29a29e746bc90c824d4f2bd15e42817dd7d0b199.1765290097.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-11drm/vblank: use the drm_vblank_crtc() and drm_crtc_vblank_crtc() helpers moreJani Nikula
We have the helpers to avoid open coding dev->vblank[pipe] access. v2: Rebase Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/ad41f25c625d6a263b7e2e1d227cb14c5d0ce204.1765290097.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-11drm/vblank: limit vblank variable scope to atomicJani Nikula
In drm_crtc_vblank_helper_get_vblank_timestamp_internal(), we only need the vblank variable for atomic modesetting. Limit the scope to make upcoming changes easier. Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/b50f0bff654a6902ffd7ae52c31d46fad9ed7540.1765290097.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-11drm/vblank: add return value to drm_crtc_wait_one_vblank()Jani Nikula
Let drivers deal with the vblank wait failures if they so desire. If the current warning backtrace gets toned down to a simple warning message, the drivers may wish to add the backtrace themselves. Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/7f2de4dd170771991756073f037c7ca043c3e746.1765290097.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-11drm/vblank: remove superfluous pipe checkJani Nikula
Now that the pipe is crtc->pipe, there's no need to check it's within range. Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/ced963542bfb00c2f1a653e9e5f717fccbd25132.1765290097.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-11drm/vblank: remove drm_wait_one_vblank() completelyJani Nikula
There's really no need for the extra static function at all. Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/fe969aad198d3f151fafd01faca5b0e73bfd9a03.1765290097.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-11drm/vblank: Unexport drm_wait_one_vblank()Thomas Zimmermann
Make drm_wait_on_vblank() static. The function is an internal interface and not invoked directly by drivers. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/b0ab9833a85f5fb6de95ad6cb0216864bf860c9e.1765290097.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-12-11drm/xe/vf: Reset recovery_queued after issuing RESFIX_STARTSatyanarayana K V P
During VF_RESTORE or VF_RESUME, the GuC sends a migration interrupt and clears the RESFIX_START marker. If migration or resume occurs before the VF issues its own RESFIX_START, VF KMD may receive two back-to-back migration interrupts. VF then sends RESFIX_START to indicate the beginning of fixups and RESFIX_DONE to mark completion. However, the second RESFIX_START fails because the GuC is already in the RUNNING state. Clear the recovery_queued flag after sending a RESFIX_START message to ignore duplicated IRQs seen before we start actual recovery. This ensures the state is reset only after the fixup process begins, avoiding redundant work item queuing. Fixes: b5fbb94341a2 ("drm/xe/vf: Introduce RESFIX start marker support") Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251210052546.622809-6-satyanarayana.k.v.p@intel.com
2025-12-11drm/xe/vf: Fix queuing of recovery workSatyanarayana K V P
Ensure VF migration recovery work is only queued when no recovery is already queued and teardown is not in progress. Fixes: b47c0c07c350 ("drm/xe/vf: Teardown VF post migration worker on driver unload") Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tomasz Lis <tomasz.lis@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20251210052546.622809-5-satyanarayana.k.v.p@intel.com
2025-12-11drm/bridge: ti-sn65dsi83: protect device resources on unplugLuca Ceresoli
To support hot-unplug of this bridge we need to protect access to device resources in case sn65dsi83_remove() happens concurrently to other code. Some care is needed for the case when the unplug happens before sn65dsi83_atomic_disable() has a chance to enter the critical section (i.e. a successful drm_bridge_enter() call), which occurs whenever the hardware is removed while the display is active. When that happens, sn65dsi83_atomic_disable() in unable to release the resources taken by sn65dsi83_atomic_pre_enable(). To ensure those resources are released exactly once on device removal: * move the code to release them to a dedicated function * register that function when the resources are taken in sn65dsi83_atomic_pre_enable() * if sn65dsi83_atomic_disable() happens before sn65dsi83_remove() (typical non-hot-unplug case): * sn65dsi83_atomic_disable() can enter the critical section (drm_bridge_enter() returns 0) -> it releases and executes the devres action * if sn65dsi83_atomic_disable() happens after sn65dsi83_remove() (typical hot-unplug case): * sn65dsi83_remove() -> drm_bridge_unplug() prevents sn65dsi83_atomic_disable() from entering the critical section (drm_bridge_enter() returns nonzero), so sn65dsi83_atomic_disable() cannot release and execute the devres action * the devres action is executed at the end of sn65dsi83_remove() Reviewed-by: Maxime Ripard <mripard@kernel.org> Link: https://patch.msgid.link/20251112-drm-bridge-atomic-vs-remove-v3-2-85db717ce094@bootlin.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>