<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/gpu/drm/xe, branch v6.14</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>drm/xe: Fix exporting xe buffers multiple times</title>
<updated>2025-03-20T16:59:49+00:00</updated>
<author>
<name>Tomasz Rusinowicz</name>
<email>tomasz.rusinowicz@intel.com</email>
</author>
<published>2025-02-18T10:03:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=50af7cab7520e46680cf4633bba6801443b75856'/>
<id>50af7cab7520e46680cf4633bba6801443b75856</id>
<content type='text'>
The `struct ttm_resource-&gt;placement` contains TTM_PL_FLAG_* flags, but
it was incorrectly tested for XE_PL_* flags.
This caused xe_dma_buf_pin() to always fail when invoked for
the second time. Fix this by checking the `mem_type` field instead.

Fixes: 7764222d54b7 ("drm/xe: Disallow pinning dma-bufs in VRAM")
Cc: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Lucas De Marchi &lt;lucas.demarchi@intel.com&gt;
Cc: "Thomas Hellström" &lt;thomas.hellstrom@linux.intel.com&gt;
Cc: Michal Wajdeczko &lt;michal.wajdeczko@intel.com&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Cc: Matthew Auld &lt;matthew.auld@intel.com&gt;
Cc: Nirmoy Das &lt;nirmoy.das@intel.com&gt;
Cc: Jani Nikula &lt;jani.nikula@intel.com&gt;
Cc: intel-xe@lists.freedesktop.org
Cc: &lt;stable@vger.kernel.org&gt; # v6.8+
Signed-off-by: Tomasz Rusinowicz &lt;tomasz.rusinowicz@intel.com&gt;
Signed-off-by: Jacek Lawrynowicz &lt;jacek.lawrynowicz@linux.intel.com&gt;
Reviewed-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250218100353.2137964-1-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
(cherry picked from commit b96dabdba9b95f71ded50a1c094ee244408b2a8e)
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The `struct ttm_resource-&gt;placement` contains TTM_PL_FLAG_* flags, but
it was incorrectly tested for XE_PL_* flags.
This caused xe_dma_buf_pin() to always fail when invoked for
the second time. Fix this by checking the `mem_type` field instead.

Fixes: 7764222d54b7 ("drm/xe: Disallow pinning dma-bufs in VRAM")
Cc: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Lucas De Marchi &lt;lucas.demarchi@intel.com&gt;
Cc: "Thomas Hellström" &lt;thomas.hellstrom@linux.intel.com&gt;
Cc: Michal Wajdeczko &lt;michal.wajdeczko@intel.com&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Cc: Matthew Auld &lt;matthew.auld@intel.com&gt;
Cc: Nirmoy Das &lt;nirmoy.das@intel.com&gt;
Cc: Jani Nikula &lt;jani.nikula@intel.com&gt;
Cc: intel-xe@lists.freedesktop.org
Cc: &lt;stable@vger.kernel.org&gt; # v6.8+
Signed-off-by: Tomasz Rusinowicz &lt;tomasz.rusinowicz@intel.com&gt;
Signed-off-by: Jacek Lawrynowicz &lt;jacek.lawrynowicz@linux.intel.com&gt;
Reviewed-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250218100353.2137964-1-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
(cherry picked from commit b96dabdba9b95f71ded50a1c094ee244408b2a8e)
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/xe: remove redundant check in xe_vm_create_ioctl()</title>
<updated>2025-03-10T18:01:43+00:00</updated>
<author>
<name>Xin Wang</name>
<email>x.wang@intel.com</email>
</author>
<published>2025-03-03T00:49:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f5d4e81774c42d9c2ea3980e570f3330ff2ed5d2'/>
<id>f5d4e81774c42d9c2ea3980e570f3330ff2ed5d2</id>
<content type='text'>
The check for args-&gt;extensions is performed twice in xe_vm_create_ioctl().
This commit removes the redundant check to streamline the code.

Fixes: 7224788f6756 ("drm/xe: Kill XE_VM_PROPERTY_BIND_OP_ERROR_CAPTURE_ADDRESS extension")
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Signed-off-by: Xin Wang &lt;x.wang@intel.com&gt;
Reviewed-by: Tejas Upadhyay &lt;tejas.upadhyay@intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250303004942.951699-1-x.wang@intel.com
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
(cherry picked from commit 8da8aecf1f2d89c2b8188bcf7aa252ec146ddd12)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The check for args-&gt;extensions is performed twice in xe_vm_create_ioctl().
This commit removes the redundant check to streamline the code.

Fixes: 7224788f6756 ("drm/xe: Kill XE_VM_PROPERTY_BIND_OP_ERROR_CAPTURE_ADDRESS extension")
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Signed-off-by: Xin Wang &lt;x.wang@intel.com&gt;
Reviewed-by: Tejas Upadhyay &lt;tejas.upadhyay@intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250303004942.951699-1-x.wang@intel.com
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
(cherry picked from commit 8da8aecf1f2d89c2b8188bcf7aa252ec146ddd12)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/xe/guc_pc: Retry and wait longer for GuC PC start</title>
<updated>2025-03-10T15:53:37+00:00</updated>
<author>
<name>Rodrigo Vivi</name>
<email>rodrigo.vivi@intel.com</email>
</author>
<published>2025-03-07T16:03:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c605acb53f449f6289f042790307d7dc9e62d03d'/>
<id>c605acb53f449f6289f042790307d7dc9e62d03d</id>
<content type='text'>
In a rare situation of thermal limit during resume, GuC can
be slow and run into delays like this:

xe 0000:00:02.0: [drm] GT1: excessive init time: 667ms! \
   		 [status = 0x8002F034, timeouts = 0]
xe 0000:00:02.0: [drm] GT1: excessive init time: \
   		 [freq = 100MHz (req = 800MHz), before = 100MHz, \
   		 perf_limit_reasons = 0x1C001000]
xe 0000:00:02.0: [drm] *ERROR* GT1: GuC PC Start failed
------------[ cut here ]------------
xe 0000:00:02.0: [drm] GT1: Failed to start GuC PC: -EIO

When this happens, it entirely blocks the GPU from being used.
So, let's retry with a huge timeout in the hope that it comes back.

Also, let's collect some information on how long it usually
takes in situations like this, so perhaps the time can be tuned
later.

Cc: Vinay Belgaumkar &lt;vinay.belgaumkar@intel.com&gt;
Cc: Jonathan Cavitt &lt;jonathan.cavitt@intel.com&gt;
Cc: John Harrison &lt;John.C.Harrison@Intel.com&gt;
Reviewed-by: Jonathan Cavitt &lt;jonathan.cavitt@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250307160307.1093391-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
(cherry picked from commit b4b05e53b550a886b4754b87fd0dd2b304579e85)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In a rare situation of thermal limit during resume, GuC can
be slow and run into delays like this:

xe 0000:00:02.0: [drm] GT1: excessive init time: 667ms! \
   		 [status = 0x8002F034, timeouts = 0]
xe 0000:00:02.0: [drm] GT1: excessive init time: \
   		 [freq = 100MHz (req = 800MHz), before = 100MHz, \
   		 perf_limit_reasons = 0x1C001000]
xe 0000:00:02.0: [drm] *ERROR* GT1: GuC PC Start failed
------------[ cut here ]------------
xe 0000:00:02.0: [drm] GT1: Failed to start GuC PC: -EIO

When this happens, it entirely blocks the GPU from being used.
So, let's retry with a huge timeout in the hope that it comes back.

Also, let's collect some information on how long it usually
takes in situations like this, so perhaps the time can be tuned
later.

Cc: Vinay Belgaumkar &lt;vinay.belgaumkar@intel.com&gt;
Cc: Jonathan Cavitt &lt;jonathan.cavitt@intel.com&gt;
Cc: John Harrison &lt;John.C.Harrison@Intel.com&gt;
Reviewed-by: Jonathan Cavitt &lt;jonathan.cavitt@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250307160307.1093391-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
(cherry picked from commit b4b05e53b550a886b4754b87fd0dd2b304579e85)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/xe/pm: Temporarily disable D3Cold on BMG</title>
<updated>2025-03-10T15:42:32+00:00</updated>
<author>
<name>Rodrigo Vivi</name>
<email>rodrigo.vivi@intel.com</email>
</author>
<published>2025-03-08T00:56:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3e331a6715ee26f2fabc59dad6bb36d810707028'/>
<id>3e331a6715ee26f2fabc59dad6bb36d810707028</id>
<content type='text'>
Currently, many instability cases related to D3Cold -&gt; D0 transition
on BMG are under investigation. Among them are some bad cases where
the device is lost after 1 to 3 transitions from D3Cold to D0
on the runtime pm, with pcieport upstream bridge port link retrain
failure.

In other cases, it works fine, but with some sudden random memory
corruptions after D3cold, that could be 0xffff missed ack on GT
forcewake or GuC reload related failures.

In some other cases though, D3Cold -&gt; D0 works pretty reliably.
It looks like it is a combination of GPU cards and Host boards at
this point. So, there is no possible/available quirk at this time.

This patch disables D3Cold by default on BMG by reducing the
vram_d3cold_threshold to 0. Users and developers who want to enable
it can still do so via
$ echo 300 &gt; /sys/bus/pci/devices/&lt;addr&gt;/vram_d3cold_threshold

Fixes: 3adcf970dc7e ("drm/xe/bmg: Drop force_probe requirement")
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4037
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4395
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4396
Cc: Karthik Poosa &lt;karthik.poosa@intel.com&gt;
Reviewed-by: Lucas De Marchi &lt;lucas.demarchi@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250308005636.1475420-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
(cherry picked from commit d945cc876277851053c0cf37927c8d7bd9d0e880)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, many instability cases related to D3Cold -&gt; D0 transition
on BMG are under investigation. Among them are some bad cases where
the device is lost after 1 to 3 transitions from D3Cold to D0
on the runtime pm, with pcieport upstream bridge port link retrain
failure.

In other cases, it works fine, but with some sudden random memory
corruptions after D3cold, that could be 0xffff missed ack on GT
forcewake or GuC reload related failures.

In some other cases though, D3Cold -&gt; D0 works pretty reliably.
It looks like it is a combination of GPU cards and Host boards at
this point. So, there is no possible/available quirk at this time.

This patch disables D3Cold by default on BMG by reducing the
vram_d3cold_threshold to 0. Users and developers who want to enable
it can still do so via
$ echo 300 &gt; /sys/bus/pci/devices/&lt;addr&gt;/vram_d3cold_threshold

Fixes: 3adcf970dc7e ("drm/xe/bmg: Drop force_probe requirement")
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4037
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4395
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4396
Cc: Karthik Poosa &lt;karthik.poosa@intel.com&gt;
Reviewed-by: Lucas De Marchi &lt;lucas.demarchi@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250308005636.1475420-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
(cherry picked from commit d945cc876277851053c0cf37927c8d7bd9d0e880)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/xe/userptr: Fix an incorrect assert</title>
<updated>2025-03-10T15:42:25+00:00</updated>
<author>
<name>Thomas Hellström</name>
<email>thomas.hellstrom@linux.intel.com</email>
</author>
<published>2025-03-07T10:01:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9106713bd2ab0cacd380cda0d3f0219f2e488086'/>
<id>9106713bd2ab0cacd380cda0d3f0219f2e488086</id>
<content type='text'>
The assert incorrectly checks the total length processed which
can in fact be greater than the number of pages. Fix.

Fixes: 0a98219bcc96 ("drm/xe/hmm: Don't dereference struct page pointers without notifier lock")
Cc: Matthew Auld &lt;matthew.auld@intel.com&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250307100109.21397-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 70e5043ba85eae199b232e39921abd706b5c1fa4)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The assert incorrectly checks the total length processed which
can in fact be greater than the number of pages. Fix.

Fixes: 0a98219bcc96 ("drm/xe/hmm: Don't dereference struct page pointers without notifier lock")
Cc: Matthew Auld &lt;matthew.auld@intel.com&gt;
Cc: Matthew Brost &lt;matthew.brost@intel.com&gt;
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250307100109.21397-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 70e5043ba85eae199b232e39921abd706b5c1fa4)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/xe: Release guc ids before cancelling work</title>
<updated>2025-03-10T15:42:21+00:00</updated>
<author>
<name>Tejas Upadhyay</name>
<email>tejas.upadhyay@intel.com</email>
</author>
<published>2025-03-06T13:12:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=10c7988418d8f759ba70c4a558961e0bfa74647f'/>
<id>10c7988418d8f759ba70c4a558961e0bfa74647f</id>
<content type='text'>
A GT reset can occur in parallel with cancelling work in the
async call and can requeue these workers. To avoid that, first
release the guc ids and then cancel the work so that the workers
are not requeued.

Fixes: 8ae8a2e8dd21 ("drm/xe: Long running job update")
Fixes: 12c2f962fe71 ("drm/xe: cancel pending job timer before freeing scheduler")
Signed-off-by: Tejas Upadhyay &lt;tejas.upadhyay@intel.com&gt;
Suggested-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Reviewed-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250306131211.975503-1-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@intel.com&gt;
(cherry picked from commit 8e8d76f62329127b31c64a034b052fb9e30e92af)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A GT reset can occur in parallel with cancelling work in the
async call and can requeue these workers. To avoid that, first
release the guc ids and then cancel the work so that the workers
are not requeued.

Fixes: 8ae8a2e8dd21 ("drm/xe: Long running job update")
Fixes: 12c2f962fe71 ("drm/xe: cancel pending job timer before freeing scheduler")
Signed-off-by: Tejas Upadhyay &lt;tejas.upadhyay@intel.com&gt;
Suggested-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Reviewed-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250306131211.975503-1-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@intel.com&gt;
(cherry picked from commit 8e8d76f62329127b31c64a034b052fb9e30e92af)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/xe/userptr: Unmap userptrs in the mmu notifier</title>
<updated>2025-03-05T19:25:27+00:00</updated>
<author>
<name>Thomas Hellström</name>
<email>thomas.hellstrom@linux.intel.com</email>
</author>
<published>2025-03-04T17:33:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=333b8906336174478efbbfc1e24a89e3397ffe65'/>
<id>333b8906336174478efbbfc1e24a89e3397ffe65</id>
<content type='text'>
If userptr pages are freed after a call to the xe mmu notifier,
the device will not be blocked out from theoretically accessing
these pages unless they are also unmapped from the iommu, and
this violates some aspects of the iommu-imposed security.

Ensure that userptrs are unmapped in the mmu notifier to
mitigate this. A naive attempt would try to free the sg table, but
the sg table itself may be accessed by a concurrent bind
operation, so settle for only unmapping.

v3:
- Update lockdep asserts.
- Fix a typo (Matthew Auld)

Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr")
Cc: Oak Zeng &lt;oak.zeng@intel.com&gt;
Cc: Matthew Auld &lt;matthew.auld@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v6.10+
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Acked-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-4-thomas.hellstrom@linux.intel.com
(cherry picked from commit ba767b9d01a2c552d76cf6f46b125d50ec4147a6)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If userptr pages are freed after a call to the xe mmu notifier,
the device will not be blocked out from theoretically accessing
these pages unless they are also unmapped from the iommu, and
this violates some aspects of the iommu-imposed security.

Ensure that userptrs are unmapped in the mmu notifier to
mitigate this. A naive attempt would try to free the sg table, but
the sg table itself may be accessed by a concurrent bind
operation, so settle for only unmapping.

v3:
- Update lockdep asserts.
- Fix a typo (Matthew Auld)

Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr")
Cc: Oak Zeng &lt;oak.zeng@intel.com&gt;
Cc: Matthew Auld &lt;matthew.auld@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v6.10+
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Acked-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-4-thomas.hellstrom@linux.intel.com
(cherry picked from commit ba767b9d01a2c552d76cf6f46b125d50ec4147a6)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/xe/hmm: Don't dereference struct page pointers without notifier lock</title>
<updated>2025-03-05T19:25:23+00:00</updated>
<author>
<name>Thomas Hellström</name>
<email>thomas.hellstrom@linux.intel.com</email>
</author>
<published>2025-03-04T17:33:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0a98219bcc961edd3388960576e4353e123b4a51'/>
<id>0a98219bcc961edd3388960576e4353e123b4a51</id>
<content type='text'>
The pfns that we obtain from hmm_range_fault() point to pages that
we don't have a reference on, and the guarantee that they are still
in the cpu page-tables is that the notifier lock must be held and the
notifier seqno is still valid.

So while building the sg table and marking the pages accessed / dirty
we need to hold this lock with a validated seqno.

However, the lock is reclaim tainted which makes
sg_alloc_table_from_pages_segment() unusable, since it internally
allocates memory.

Instead build the sg-table manually. For the non-iommu case
this might lead to fewer coalesces, but if that's a problem it can
be fixed up later in the resource cursor code. For the iommu case,
the whole sg-table may still be coalesced to a single contiguous
device va region.

This avoids marking pages that we don't own dirty and accessed, and
it also avoids dereferencing struct pages that we don't own.

v2:
- Use assert to check whether hmm pfns are valid (Matthew Auld)
- Take into account that large pages may cross range boundaries
  (Matthew Auld)

v3:
- Don't unnecessarily check for a non-freed sg-table. (Matthew Auld)
- Add a missing up_read() in an error path. (Matthew Auld)

Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr")
Cc: Oak Zeng &lt;oak.zeng@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v6.10+
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Acked-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-3-thomas.hellstrom@linux.intel.com
(cherry picked from commit ea3e66d280ce2576664a862693d1da8fd324c317)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The pfns that we obtain from hmm_range_fault() point to pages that
we don't have a reference on, and the guarantee that they are still
in the cpu page-tables is that the notifier lock must be held and the
notifier seqno is still valid.

So while building the sg table and marking the pages accessed / dirty
we need to hold this lock with a validated seqno.

However, the lock is reclaim tainted which makes
sg_alloc_table_from_pages_segment() unusable, since it internally
allocates memory.

Instead build the sg-table manually. For the non-iommu case
this might lead to fewer coalesces, but if that's a problem it can
be fixed up later in the resource cursor code. For the iommu case,
the whole sg-table may still be coalesced to a single contiguous
device va region.

This avoids marking pages that we don't own dirty and accessed, and
it also avoids dereferencing struct pages that we don't own.

v2:
- Use assert to check whether hmm pfns are valid (Matthew Auld)
- Take into account that large pages may cross range boundaries
  (Matthew Auld)

v3:
- Don't unnecessarily check for a non-freed sg-table. (Matthew Auld)
- Add a missing up_read() in an error path. (Matthew Auld)

Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr")
Cc: Oak Zeng &lt;oak.zeng@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v6.10+
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Acked-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-3-thomas.hellstrom@linux.intel.com
(cherry picked from commit ea3e66d280ce2576664a862693d1da8fd324c317)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/xe/hmm: Style- and include fixes</title>
<updated>2025-03-05T19:25:18+00:00</updated>
<author>
<name>Thomas Hellström</name>
<email>thomas.hellstrom@linux.intel.com</email>
</author>
<published>2025-03-04T17:33:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e3e2e7fc4cd8414c9a966ef1b344db543f8614f4'/>
<id>e3e2e7fc4cd8414c9a966ef1b344db543f8614f4</id>
<content type='text'>
Add proper #ifndef around the xe_hmm.h header, proper spacing
and since the documentation mostly follows kerneldoc format,
make it kerneldoc. Also prepare for upcoming -stable fixes.

Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr")
Cc: Oak Zeng &lt;oak.zeng@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v6.10+
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Acked-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-2-thomas.hellstrom@linux.intel.com
(cherry picked from commit bbe2b06b55bc061c8fcec034ed26e88287f39143)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add proper #ifndef around the xe_hmm.h header, proper spacing
and since the documentation mostly follows kerneldoc format,
make it kerneldoc. Also prepare for upcoming -stable fixes.

Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr")
Cc: Oak Zeng &lt;oak.zeng@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v6.10+
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Acked-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-2-thomas.hellstrom@linux.intel.com
(cherry picked from commit bbe2b06b55bc061c8fcec034ed26e88287f39143)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/xe: Add staging tree for VM binds</title>
<updated>2025-03-05T19:25:11+00:00</updated>
<author>
<name>Matthew Brost</name>
<email>matthew.brost@intel.com</email>
</author>
<published>2025-02-28T07:30:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ae482ec8cd1a85bde3307f71921a7780086fbec0'/>
<id>ae482ec8cd1a85bde3307f71921a7780086fbec0</id>
<content type='text'>
Concurrent VM bind staging and zapping of PTEs from a userptr notifier
do not work because the view of PTEs is not stable. VM binds cannot
acquire the notifier lock during staging, as memory allocations are
required. To resolve this race condition, use a staging tree for VM
binds that is committed only under the userptr notifier lock during the
final step of the bind. This ensures a consistent view of the PTEs in
the userptr notifier.

A follow up may only use staging for VM in fault mode as this is the
only mode in which the above race exists.

v3:
 - Drop zap PTE change (Thomas)
 - s/xe_pt_entry/xe_pt_entry_staging (Thomas)

Suggested-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling")
Signed-off-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Reviewed-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-5-thomas.hellstrom@linux.intel.com
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
(cherry picked from commit 6f39b0c5ef0385eae586760d10b9767168037aa5)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Concurrent VM bind staging and zapping of PTEs from a userptr notifier
do not work because the view of PTEs is not stable. VM binds cannot
acquire the notifier lock during staging, as memory allocations are
required. To resolve this race condition, use a staging tree for VM
binds that is committed only under the userptr notifier lock during the
final step of the bind. This ensures a consistent view of the PTEs in
the userptr notifier.

A follow up may only use staging for VM in fault mode as this is the
only mode in which the above race exists.

v3:
 - Drop zap PTE change (Thomas)
 - s/xe_pt_entry/xe_pt_entry_staging (Thomas)

Suggested-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job")
Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling")
Signed-off-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Reviewed-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-5-thomas.hellstrom@linux.intel.com
Signed-off-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
(cherry picked from commit 6f39b0c5ef0385eae586760d10b9767168037aa5)
Signed-off-by: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
