| Age | Commit message (Collapse) | Author |
|
There are additional hardware managed L2$ flushing such as the
transient display. In those scenarios, page reclamation is
unnecessary resulting in redundant cacheline flushes, so skip
over those corresponding ranges.
v2:
- Elaborated on reasoning for page reclamation skip based on
Tejas's discussion. (Matthew A, Tejas)
v3:
- Removed MEDIA_IS_ON due to racy condition resulting in removal of
relevant registers and values. (Matthew A)
- Moved l3 policy access to xe_pat. (Matthew A)
v4:
- Updated comments based on previous change. (Tejas)
- Move back PAT index macros to xe_pat.c.
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-21-brian3.nguyen@intel.com
|
|
Add page reclamation action to tlb inval backend. The page reclamation
action is paired with range tlb invalidations so both are issued at the
same time.
Page reclamation will issue the TLB invalidation with an invalid seqno
and a H2G page reclamation action with the fence's corresponding seqno
and handle the fence accordingly on page reclaim action done handler.
If page reclamation fails, tlb timeout handler will be responsible for
signalling fence and cleaning up.
v2:
- add send_page_reclaim to patch.
- Remove flush_cache and use prl_sa pointer to determine PPC flush
instead of explicit bool. Add NULL as fallback for others. (Matthew B)
v3:
- Add comments for flush_cache with media.
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-20-brian3.nguyen@intel.com
|
|
Use page reclaim list as indicator if page reclaim action is desired and
pass it to tlb inval fence to handle.
Job will need to maintain its own embedded copy to ensure lifetime of
PRL exist until job has run.
v2:
- Use xe variant of WARN_ON (Michal)
v3:
- Add comments for PRL tile handling and flush behavior with media.
(Matthew Brost)
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-19-brian3.nguyen@intel.com
|
|
Page reclamation feature needs the PRL to be suballocated into a
GGTT-mapped BO. On allocation failure, fallback to default tlb
invalidation with full PPC flush.
PRL's BO allocation is managed in separate pool to ensure 4K alignment
for proper GGTT address.
With BO, pass into TLB invalidation backend and modify fence to
accomadate accordingly.
v2:
- Removed page reclaim related variables from TLB fence. (Matthew B)
- Allocate PRL bo size to num_entries. (Matthew B)
- Move PRL bo allocation to tlb_inval run_job. (Matthew B)
v5:
- Use xe_page_reclaim_list_valid. (Matthew B)
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-18-brian3.nguyen@intel.com
|
|
Page reclaim list (PRL) is preparation work for the page reclaim feature.
The PRL is firstly owned by pt_update_ops and all other page reclaim
operations will point back to this PRL. PRL generates its entries during
the unbind page walker, updating the PRL.
This PRL is restricted to a 4K page, so 512 page entries at most.
v2:
- Removed unused function. (Shuicheng)
- Compacted warning checking, update commit message,
spelling, etc. (Shuicheng, Matthew B)
- Fix kernel docs
- Moved PRL max entries overflow handling out from
generate_reclaim_entry to caller (Shuicheng)
- Add xe_page_reclaim_list_init for clarity. (Matthew B)
- Modify xe_guc_page_reclaim_entry to use macros
for greater flexbility. (Matthew B)
- Add fallback for PTE outside of page reclaim supported
4K, 64K, 2M pages (Matthew B)
- Invalidate PRL for early abort page walk.
- Removed page reclaim related variables from tlb fence
(Matthew Brost)
- Remove error handling in *alloc_entries failure. (Matthew B)
v3:
- Fix NULL pointer dereference check.
- Modify reclaim_entry to QW and bitfields accordingly. (Matthew B)
- Add vm_dbg prints for PRL generation and invalidation. (Matthew B)
v4:
- s/GENMASK/GENMASK_ULL && s/BIT/BIT_ULL (CI)
v5:
- Addition of xe_page_reclaim_list_is_new() to avoid continuous
allocation of PRL if consecutive VMAs cause a PRL invalidation.
- Add xe_page_reclaim_list_valid() helpers for clarity. (Matthew B)
- Move xe_page_reclaim_list_entries_put in
xe_page_reclaim_list_invalidate.
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-17-brian3.nguyen@intel.com
|
|
Add page reclamation related changes to GuC interface, handlers, and
senders to support page reclamation.
Currently TLB invalidations will perform an entire PPC flush in order to
prevent stale memory access for noncoherent system memory. Page
reclamation is an extension of the typical TLB invalidation
workflow, allowing disabling of full PPC flush and enable selective PPC
flushing. Selective flushing will be decided by a list of pages whom's
address is passed to GuC at time of action.
Page reclamation interfaces require at least GuC FW ver 70.31.0.
v2:
- Moved send_page_reclaim to first patch usage.
- Add comments explaining shared done handler. (Matthew B)
- Add FW version fallback to disable page reclaim
on older versions. (Matthew B, Shuicheng)
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-16-brian3.nguyen@intel.com
|
|
Starting from Xe3p, HW adds a feature assisting range based page
reclamation. Introduce a bit in device info to indicate whether
device has such capability.
Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-15-brian3.nguyen@intel.com
|
|
Allow tlb_invalidation to control when driver wants to flush the
Private Physical Cache (PPC) as a process of the tlb invalidation
process.
Default behavior is still to always flush the PPC but driver now has the
option to disable it.
v2:
- Revise commit/kernel doc descriptions. (Shuicheng)
- Remove unused function. (Shuicheng)
- Remove bool flush_cache parameter from fence,
and various function inputs. (Matthew B)
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-14-brian3.nguyen@intel.com
|
|
Certain TLB invalidation operations send multiple H2G messages per seqno
with only the final H2G containing the valid seqno - the others carry an
invalid seqno. The G2H handler drops these invalid seqno to aovid
prematurely signaling a TLB invalidation fence.
With TLB_INVALIDATION_SEQNO_INVALID used to indicate in progress
multi-step TLB invalidations, reset tdr to ensure that timeout
won't prematurely trigger when G2H actions are still ongoing.
v2: Remove lock from xe_tlb_inval_reset_timeout. (Matthew B)
v3: Squash with dependent patch from Matthew Brost' series.
Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251212213225.3564537-13-brian3.nguyen@intel.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.19-rc1:
- Fix stack usage warning in novatek-nt35560.
- Fix s/r, i2c issues in nouveau and update string handling.
- Ignore spurious PLL_UNLOCK bit in ti-sn65dsi83.
- Handle IS_ERR vs NULL in drm_plane_create_hotspot_properties().
- Fix devcoredump crash on reading evicted bo's.
- Fix bigendian handling in mgag200.
- Fix probe failure in tilcdc.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/6c371dc1-08bf-4a34-895c-9ef348b6061b@linux.intel.com
|
|
During GT reset recovery in do_gt_restart(), xe_uc_start() was called
before xe_reg_sr_apply_mmio() restored engine-specific registers. This
created a race window where the scheduler could run jobs before hardware
state was fully restored.
This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint-
waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE)
wasn't restored before jobs started executing. Breakpoints would fail to
trigger SIP entry because the debug enable bit wasn't set yet.
Fix by moving xe_uc_start() after all MMIO register restoration,
including engine registers and CCS mode configuration, ensuring all
hardware state is fully restored before any jobs can be scheduled.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Jan Maslak <jan.maslak@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com
|
|
There are some corner cases where flushing transient
data may take slightly longer than the 150us timeout
we currently allow. Update the driver to use a 300us
timeout instead based on the latest guidance from
the hardware team. An update to the bspec to formally
document this is expected to arrive soon.
Fixes: c01c6066e6fa ("drm/xe/device: implement transient flush")
Signed-off-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/0201b1d6ec64d3651fcbff1ea21026efa915126a.1765487866.git.jagmeet.randhawa@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
Enable I2C controller for Crescent Island and while at it, rely on
has_i2c flag instead of manual platform checks.
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20251128084414.306265-1-raag.jadav@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
|
|
Add missing drm_gem_object_put() call when drm_gem_object_lookup()
successfully returns an object. This fixes a GEM object reference
leak that can prevent driver modules from unloading when using
prime buffers.
Fixes: 53096728b891 ("drm: Add DRM prime interface to reassign GEM handle")
Cc: <stable@vger.kernel.org> # v6.18+
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20251212134133.475218-1-karol.wachowski@linux.intel.com
|
|
Move the panic handling to the display parent interface, making display
more independent of i915 and xe driver implementations.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/e27eca5424479e8936b786018d0af19a34f839f6.1765474612.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
The intel_panic.c implementation is i915 specific, and xe has its
own. Move it to i915 core as i915_panic.c.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/8dc7af0ae1f859d17b0be269a545146c5536d8fc.1765474612.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the drm_validate_modeset test [1]:
# drm_test_check_connector_changed_modeset: EXPECTATION FAILED at
# drivers/gpu/drm/tests/drm_atomic_state_test.c:162
Expected ret == 0, but
ret == -35 (0xffffffffffffffdd)
Change the set_up_atomic_state() helper function to return on error and
restart the atomic sequence when the returned error is EDEADLK.
[1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744096/test_x86_64/11762450343/artifacts/jobwatch/logs/recipes/19797909/tasks/204139142/results/945095586/logs/dmesg.log
Fixes: 73d934d7b6e3 ("drm/tests: Add test for drm_atomic_helper_commit_modeset_disables()")
Closes: https://datawarehouse.cki-project.org/issue/4004
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patch.msgid.link/20251104102535.12212-2-jose.exposito89@gmail.com
|
|
Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the drm_test_check_valid_clones() KUnit test.
The error log can be either [1]:
# drm_test_check_valid_clones: ASSERTION FAILED at
# drivers/gpu/drm/tests/drm_atomic_state_test.c:295
Expected ret == param->expected_result, but
ret == -35 (0xffffffffffffffdd)
param->expected_result == 0 (0x0)
Or [2] depending on the test case:
# drm_test_check_valid_clones: ASSERTION FAILED at
# drivers/gpu/drm/tests/drm_atomic_state_test.c:295
Expected ret == param->expected_result, but
ret == -35 (0xffffffffffffffdd)
param->expected_result == -22 (0xffffffffffffffea)
Restart the atomic sequence when EDEADLK is returned.
[1] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2113057246/test_x86_64/11802139999/artifacts/jobwatch/logs/recipes/19824965/tasks/204347800/results/946112713/logs/dmesg.log
[2] https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/2106744297/test_aarch64/11762450907/artifacts/jobwatch/logs/recipes/19797942/tasks/204139727/results/945094561/logs/dmesg.log
Fixes: 88849f24e2ab ("drm/tests: Add test for drm_atomic_helper_check_modeset()")
Closes: https://datawarehouse.cki-project.org/issue/4004
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patch.msgid.link/20251104102535.12212-1-jose.exposito89@gmail.com
|
|
EDEADLK
Fedora/CentOS/RHEL CI is reporting intermittent failures while running
the KUnit tests present in drm_hdmi_state_helper_test.c [1].
While the specific test causing the failure change between runs, all of
them are caused by drm_kunit_helper_enable_crtc_connector() returning
-EDEADLK. The error trace always follow this structure:
# <test name>: ASSERTION FAILED at
# drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c:<line>
Expected ret == 0, but
ret == -35 (0xffffffffffffffdd)
As documented, if the drm_kunit_helper_enable_crtc_connector() function
returns -EDEADLK (-35), the entire atomic sequence must be restarted.
Handle this error code for all function calls.
Closes: https://datawarehouse.cki-project.org/issue/4039 [1]
Fixes: 6a5c0ad7e08e ("drm/tests: hdmi_state_helpers: Switch to new helper")
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Link: https://patch.msgid.link/20251104102258.10026-1-jose.exposito89@gmail.com
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915 fixes for v6.19-rc1:
- Fix format string truncation warning
- FIx runtime PM reference during fbdev BO creation
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patch.msgid.link/281309f78560bcceebac8d5c0511efe66baf641c@intel.com
|
|
Add kernel documentation for Multi Queue group GuC interface.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-36-niranjana.vishwanathapura@intel.com
|
|
Add kernel documentation for Multi Queue group and update
the corresponding rst.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-35-niranjana.vishwanathapura@intel.com
|
|
Add support to keep the group active after the primary queue is
destroyed. Instead of killing the primary queue during exec_queue
destroy ioctl, kill it when all the secondary queues of the group
are killed.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-34-niranjana.vishwanathapura@intel.com
|
|
Add xe_exec_queue_create_multi_queue event with
multi-queue information.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-33-niranjana.vishwanathapura@intel.com
|
|
Upon a job timeout, teardown the multi-queue group by
triggering TDR on all queues of the multi-queue group
and by skipping timeout checks in them.
v5: Ban the group while triggering TDR for the guc
reported errors
Add FIXME in TDR to take multi-queue group off HW
(Matt Brost)
v6: Trigger cleanup of group only for multi-queue case
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-32-niranjana.vishwanathapura@intel.com
|
|
If GuC doesn't response to CGP_SYNC message, trigger
GT reset and cleanup of all the queues of the multi
queue group.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-31-niranjana.vishwanathapura@intel.com
|
|
Trigger multi-queue context cleanup upon CGP context error
notification from GuC.
v4: Fix error message
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-30-niranjana.vishwanathapura@intel.com
|
|
To properly support soft light restore between batches
being arbitrated at the CFEG, PIPE_CONTROL instructions
have a new bit in the first DW, QUEUE_DRAIN_MODE. When
set, this indicates to the CFEG that it should only
drain the current queue.
Additionally we no longer want to set the CS_STALL bit
for these multi queue queues as this causes the entire
pipeline to stall waiting for completion of the prior
batch, preventing this soft light restore from occurring
between queues in a queue group.
v4: Assert !multi_queue where applicable (Matt Roper)
Bspec: 56551
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-29-niranjana.vishwanathapura@intel.com
|
|
As all queues of a multi queue group use the primary queue of the group
to interface with GuC. Hence there is a dependency between the queues of
the group. So, when primary queue of a multi queue group is cleaned up,
also trigger a cleanup of the secondary queues also. During cleanup, stop
and re-start submission for all queues of a multi queue group to avoid
any submission happening in parallel when a queue is being cleaned up.
v2: Initialize group->list_lock, add fs_reclaim dependency, remove
unwanted secondary queues cleanup (Matt Brost)
v3: Properly handle cleanup of multi-queue group (Matt Brost)
v4: Fix IS_ENABLED(CONFIG_LOCKDEP) check (Matt Brost)
Revert stopping/restarting of submissions on queues of the
group in TDR as it is not needed.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-28-niranjana.vishwanathapura@intel.com
|
|
Dump multi queue specific information in the guc exec queue
dump.
v2: Move multi queue related fields inside the multi_queue
sub-structure (Matt Brost)
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-27-niranjana.vishwanathapura@intel.com
|
|
Support dynamic priority change for multi queue group queues via
exec queue set_property ioctl. Issue CGP_SYNC command to GuC through
the drm scheduler message interface for priority to take effect.
v2: Move is_multi_queue check to exec_queue layer and assert
is_multi_queue being set in guc submission layer (Matt Brost)
v3: Assert CGP_SYNC message length is valid (Matt Brost)
Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-26-niranjana.vishwanathapura@intel.com
|
|
This patch adds support for exec_queue set_property ioctl.
It is derived from the original work which is part of
https://patchwork.freedesktop.org/series/112188/
Currently only DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE_PRIORITY
property can be dynamically set.
v2: Check for and update kernel-doc which property this ioctl
supports (Matt Brost)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-25-niranjana.vishwanathapura@intel.com
|
|
Only MULTI_QUEUE_PRIORITY property is valid for secondary queues of a
multi queue group. MULTI_QUEUE_PRIORITY only applies to multi queue group
queues. Detect invalid user queue property setting and return error.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-24-niranjana.vishwanathapura@intel.com
|
|
Add support for queues of a multi queue group to set
their priority within the queue group by adding property
DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE_PRIORITY.
This is the only other property supported by secondary
queues of a multi queue group, other than
DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE.
v2: Add kernel doc for enum xe_multi_queue_priority,
Add assert for priority values, fix includes and
declarations (Matt Brost)
v3: update uapi kernel-doc (Matt Brost)
v4: uapi change due to rebase
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-23-niranjana.vishwanathapura@intel.com
|
|
Implement GuC commands and response along with the Context
Group Page (CGP) interface for multi queue support.
Ensure that only primary queue (q0) of a multi queue group
communicate with GuC. The secondary queues of the group only
need to maintain LRCA and interface with drm scheduler.
Use primary queue's submit_wq for all secondary queues of a multi
queue group. This serialization avoids any locking around CGP
synchronization with GuC.
v2: Fix G2H_LEN_DW_MULTI_QUEUE_CONTEXT value, add more comments
(Matt Brost)
v3: Minor code refactro, use xe_gt_assert
v4: Use xe_guc_ct_wake_waiters(), remove vf recovery support
(Matt Brost)
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-22-niranjana.vishwanathapura@intel.com
|
|
Multi Queue is a new mode of execution supported by the compute and
blitter copy command streamers (CCS and BCS, respectively). It is an
enhancement of the existing hardware architecture and leverages the
same submission model. It enables support for efficient, parallel
execution of multiple queues within a single context. All the queues
of a group must use the same address space (VM).
The new DRM_XE_EXEC_QUEUE_SET_PROPERTY_MULTI_QUEUE execution queue
property supports creating a multi queue group and adding queues to
a queue group. All queues of a multi queue group share the same
context.
A exec queue create ioctl call with above property specified with value
DRM_XE_SUPER_GROUP_CREATE will create a new multi queue group with the
queue being created as the primary queue (aka q0) of the group. To add
secondary queues to the group, they need to be created with the above
property with id of the primary queue as the value. The properties of
the primary queue (like priority, timeslice) applies to the whole group.
So, these properties can't be set for secondary queues of a group.
Once destroyed, the secondary queues of a multi queue group can't be
replaced. However, they can be dynamically added to the group up to a
total of 64 queues per group. Once the primary queue is destroyed,
secondary queues can't be added to the queue group.
v2: Remove group->lock, fix xe_exec_queue_group_add()/delete()
function semantics, add additional comments, remove unused
group->list_lock, add XE_BO_FLAG_GGTT_INVALIDATE for cgp bo,
Assert LRC is valid, update uapi kernel doc.
(Matt Brost)
v3: Use XE_BO_FLAG_PINNED_LATE_RESTORE/USER_VRAM/GGTT_INVALIDATE
flags for cgp bo (Matt)
v4: Ensure queue is not a vm_bind queue
uapi change due to rebase
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-21-niranjana.vishwanathapura@intel.com
|
|
Add multi_queue_enable_mask field to the gt information structure
which is bitmask of all engine classes with multi queue support
enabled.
v2: Rename multi_queue_enable_mask to multi_queue_engine_class_mask
(Matt Brost)
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20251211010249.1647839-20-niranjana.vishwanathapura@intel.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-fixes-6.19-2025-12-11:
amdgpu:
- SI fix
- DC reduce stack usage
- HDMI fixes
- VCN 4.0.5 fix
- DP MST fix
- DC memory allocation fix
amdkfd:
- SVM fix
- Trap handler fix
- VGPR fixes for GC 11.5
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20251211195600.1641924-1-alexander.deucher@amd.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next-fixes for v6.19-rc1:
- Fix uaf in panthor.
- Revert 8 byte alignment constraint for pitch in dumb bo's.
- Fix DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC handling renasas.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patch.msgid.link/a82c2a2a-314f-403b-85bf-9b3ee09b903c@linux.intel.com
|
|
The drm_property_create_enum() function returns NULL on error, it never
returns error pointers. Fix the error checking to match.
Fixes: 2afc3184f3b3 ("drm/plane: Add COLOR PIPELINE property")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patch.msgid.link/aTK9ZR0sMgqSACow@stanley.mountain
|
|
Use the higher level function where crtc is available.
v2: Rebase
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/29a29e746bc90c824d4f2bd15e42817dd7d0b199.1765290097.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
We have the helpers to avoid open coding dev->vblank[pipe] access.
v2: Rebase
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/ad41f25c625d6a263b7e2e1d227cb14c5d0ce204.1765290097.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
In drm_crtc_vblank_helper_get_vblank_timestamp_internal(), we only need
the vblank variable for atomic modesetting. Limit the scope to make
upcoming changes easier.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/b50f0bff654a6902ffd7ae52c31d46fad9ed7540.1765290097.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Let drivers deal with the vblank wait failures if they so desire. If the
current warning backtrace gets toned down to a simple warning message,
the drivers may wish to add the backtrace themselves.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/7f2de4dd170771991756073f037c7ca043c3e746.1765290097.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Now that the pipe is crtc->pipe, there's no need to check it's within
range.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/ced963542bfb00c2f1a653e9e5f717fccbd25132.1765290097.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
There's really no need for the extra static function at all.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/fe969aad198d3f151fafd01faca5b0e73bfd9a03.1765290097.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Make drm_wait_on_vblank() static. The function is an internal interface
and not invoked directly by drivers.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/b0ab9833a85f5fb6de95ad6cb0216864bf860c9e.1765290097.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
During VF_RESTORE or VF_RESUME, the GuC sends a migration interrupt and
clears the RESFIX_START marker. If migration or resume occurs before the
VF issues its own RESFIX_START, VF KMD may receive two back-to-back
migration interrupts. VF then sends RESFIX_START to indicate the beginning
of fixups and RESFIX_DONE to mark completion. However, the second
RESFIX_START fails because the GuC is already in the RUNNING state.
Clear the recovery_queued flag after sending a RESFIX_START message to
ignore duplicated IRQs seen before we start actual recovery.
This ensures the state is reset only after the fixup process begins,
avoiding redundant work item queuing.
Fixes: b5fbb94341a2 ("drm/xe/vf: Introduce RESFIX start marker support")
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251210052546.622809-6-satyanarayana.k.v.p@intel.com
|
|
Ensure VF migration recovery work is only queued when no recovery is
already queued and teardown is not in progress.
Fixes: b47c0c07c350 ("drm/xe/vf: Teardown VF post migration worker on driver unload")
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20251210052546.622809-5-satyanarayana.k.v.p@intel.com
|
|
To support hot-unplug of this bridge we need to protect access to device
resources in case sn65dsi83_remove() happens concurrently to other code.
Some care is needed for the case when the unplug happens before
sn65dsi83_atomic_disable() has a chance to enter the critical section
(i.e. a successful drm_bridge_enter() call), which occurs whenever the
hardware is removed while the display is active. When that happens,
sn65dsi83_atomic_disable() in unable to release the resources taken by
sn65dsi83_atomic_pre_enable().
To ensure those resources are released exactly once on device removal:
* move the code to release them to a dedicated function
* register that function when the resources are taken in
sn65dsi83_atomic_pre_enable()
* if sn65dsi83_atomic_disable() happens before sn65dsi83_remove()
(typical non-hot-unplug case):
* sn65dsi83_atomic_disable() can enter the critical section
(drm_bridge_enter() returns 0) -> it releases and executes the
devres action
* if sn65dsi83_atomic_disable() happens after sn65dsi83_remove()
(typical hot-unplug case):
* sn65dsi83_remove() -> drm_bridge_unplug() prevents
sn65dsi83_atomic_disable() from entering the critical section
(drm_bridge_enter() returns nonzero), so sn65dsi83_atomic_disable()
cannot release and execute the devres action
* the devres action is executed at the end of sn65dsi83_remove()
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://patch.msgid.link/20251112-drm-bridge-atomic-vs-remove-v3-2-85db717ce094@bootlin.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
|