<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/vfio, branch v6.9.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>VFIO: Add the SPR_DSA and SPR_IAX devices to the denylist</title>
<updated>2024-05-17T10:18:09+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2024-03-21T19:44:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fa20fd0f9cd9748490175f58091e7d695ef675a0'/>
<id>fa20fd0f9cd9748490175f58091e7d695ef675a0</id>
<content type='text'>
commit 95feb3160eef0caa6018e175a5560b816aee8e79 upstream.

Due to an erratum with the SPR_DSA and SPR_IAX devices, it is not secure to assign
these devices to virtual machines. Add the PCI IDs of these devices to the VFIO
denylist to ensure that this is handled appropriately by the VFIO subsystem.

The SPR_DSA and SPR_IAX devices are on-SOC devices for the Sapphire Rapids
(and related) family of products that perform data movement and compression.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 95feb3160eef0caa6018e175a5560b816aee8e79 upstream.

Due to an erratum with the SPR_DSA and SPR_IAX devices, it is not secure to assign
these devices to virtual machines. Add the PCI IDs of these devices to the VFIO
denylist to ensure that this is handled appropriately by the VFIO subsystem.

The SPR_DSA and SPR_IAX devices are on-SOC devices for the Sapphire Rapids
(and related) family of products that perform data movement and compression.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'vfio-v6.9-rc1' of https://github.com/awilliam/linux-vfio</title>
<updated>2024-03-15T20:21:13+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-15T20:21:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4138f02288333cb596885e9af03dd3ea2de845cb'/>
<id>4138f02288333cb596885e9af03dd3ea2de845cb</id>
<content type='text'>
Pull VFIO updates from Alex Williamson:

 - Add warning in unlikely case that device is not captured with
   driver_override (Kunwu Chan)

 - Error handling improvements in mlx5-vfio-pci to detect firmware
   tracking object error states, logging of firmware error syndrom, and
   releasing of firmware resources in aborted migration sequence (Yishai
   Hadas)

 - Correct an un-alphabetized VFIO MAINTAINERS entry (Alex Williamson)

 - Make the mdev_bus_type const and also make the class struct const for
   a couple of the vfio-mdev sample drivers (Ricardo B. Marliere)

 - Addition of a new vfio-pci variant driver for the GPU of NVIDIA's
   Grace-Hopper superchip. During initialization of the chip-to-chip
   interconnect in this hardware module, the PCI BARs of the device
   become unused in favor of a faster, coherent mechanism for exposing
   device memory. This driver primarily changes the VFIO representation
   of the device to masquerade this coherent aperture to replace the
   physical PCI BARs for userspace drivers. This also incorporates use
   of a new vma flag allowing KVM to use write combining attributes for
   uncached device memory (Ankit Agrawal)

 - Reset fixes and cleanups for the pds-vfio-pci driver. Save and
   restore files were previously leaked if the device didn't pass
   through an error state, this is resolved and later re-fixed to
   prevent access to the now freed files. Reset handling is also
   refactored to remove the complicated deferred reset mechanism (Brett
   Creeley)

 - Remove some references to pl330 in the vfio-platform amba driver
   (Geert Uytterhoeven)

 - Remove twice redundant and ugly code to unpin incidental pins of the
   zero-page (Alex Williamson)

 - Deferred reset logic is also removed from the hisi-acc-vfio-pci
   driver as a simplification (Shameer Kolothum)

 - Enforce that mlx5-vfio-pci devices must support PRE_COPY and remove
   resulting unnecessary code. There is no device firmware that has been
   available publicly without this support (Yishai Hadas)

 - Switch over to using the .remove_new callback for vfio-platform in
   support of the broader transition for a void remove function (Uwe
   Kleine-König)

 - Resolve multiple issues in interrupt code for VFIO bus drivers that
   allow calling eventfd_signal() on a NULL context. This also remove a
   potential race in INTx setup on certain hardware for vfio-pci, races
   with various mechanisms to mask INTx, and leaked virqfds in
   vfio-platform (Alex Williamson)

* tag 'vfio-v6.9-rc1' of https://github.com/awilliam/linux-vfio: (29 commits)
  vfio/fsl-mc: Block calling interrupt handler without trigger
  vfio/platform: Create persistent IRQ handlers
  vfio/platform: Disable virqfds on cleanup
  vfio/pci: Create persistent INTx handler
  vfio: Introduce interface to flush virqfd inject workqueue
  vfio/pci: Lock external INTx masking ops
  vfio/pci: Disable auto-enable of exclusive INTx IRQ
  vfio/pds: Refactor/simplify reset logic
  vfio/pds: Make sure migration file isn't accessed after reset
  vfio/platform: Convert to platform remove callback returning void
  vfio/mlx5: Enforce PRE_COPY support
  vfio/mbochs: make mbochs_class constant
  vfio/mdpy: make mdpy_class constant
  hisi_acc_vfio_pci: Remove the deferred_reset logic
  Revert "vfio/type1: Unpin zero pages"
  vfio/nvgrace-gpu: Convey kvm to map device memory region as noncached
  vfio: amba: Rename pl330_ids[] to vfio_amba_ids[]
  vfio/pds: Always clear the save/restore FDs on reset
  vfio/nvgrace-gpu: Add vfio pci variant module for grace hopper
  vfio/pci: rename and export range_intersect_range
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull VFIO updates from Alex Williamson:

 - Add warning in unlikely case that device is not captured with
   driver_override (Kunwu Chan)

 - Error handling improvements in mlx5-vfio-pci to detect firmware
   tracking object error states, logging of firmware error syndrom, and
   releasing of firmware resources in aborted migration sequence (Yishai
   Hadas)

 - Correct an un-alphabetized VFIO MAINTAINERS entry (Alex Williamson)

 - Make the mdev_bus_type const and also make the class struct const for
   a couple of the vfio-mdev sample drivers (Ricardo B. Marliere)

 - Addition of a new vfio-pci variant driver for the GPU of NVIDIA's
   Grace-Hopper superchip. During initialization of the chip-to-chip
   interconnect in this hardware module, the PCI BARs of the device
   become unused in favor of a faster, coherent mechanism for exposing
   device memory. This driver primarily changes the VFIO representation
   of the device to masquerade this coherent aperture to replace the
   physical PCI BARs for userspace drivers. This also incorporates use
   of a new vma flag allowing KVM to use write combining attributes for
   uncached device memory (Ankit Agrawal)

 - Reset fixes and cleanups for the pds-vfio-pci driver. Save and
   restore files were previously leaked if the device didn't pass
   through an error state, this is resolved and later re-fixed to
   prevent access to the now freed files. Reset handling is also
   refactored to remove the complicated deferred reset mechanism (Brett
   Creeley)

 - Remove some references to pl330 in the vfio-platform amba driver
   (Geert Uytterhoeven)

 - Remove twice redundant and ugly code to unpin incidental pins of the
   zero-page (Alex Williamson)

 - Deferred reset logic is also removed from the hisi-acc-vfio-pci
   driver as a simplification (Shameer Kolothum)

 - Enforce that mlx5-vfio-pci devices must support PRE_COPY and remove
   resulting unnecessary code. There is no device firmware that has been
   available publicly without this support (Yishai Hadas)

 - Switch over to using the .remove_new callback for vfio-platform in
   support of the broader transition for a void remove function (Uwe
   Kleine-König)

 - Resolve multiple issues in interrupt code for VFIO bus drivers that
   allow calling eventfd_signal() on a NULL context. This also remove a
   potential race in INTx setup on certain hardware for vfio-pci, races
   with various mechanisms to mask INTx, and leaked virqfds in
   vfio-platform (Alex Williamson)

* tag 'vfio-v6.9-rc1' of https://github.com/awilliam/linux-vfio: (29 commits)
  vfio/fsl-mc: Block calling interrupt handler without trigger
  vfio/platform: Create persistent IRQ handlers
  vfio/platform: Disable virqfds on cleanup
  vfio/pci: Create persistent INTx handler
  vfio: Introduce interface to flush virqfd inject workqueue
  vfio/pci: Lock external INTx masking ops
  vfio/pci: Disable auto-enable of exclusive INTx IRQ
  vfio/pds: Refactor/simplify reset logic
  vfio/pds: Make sure migration file isn't accessed after reset
  vfio/platform: Convert to platform remove callback returning void
  vfio/mlx5: Enforce PRE_COPY support
  vfio/mbochs: make mbochs_class constant
  vfio/mdpy: make mdpy_class constant
  hisi_acc_vfio_pci: Remove the deferred_reset logic
  Revert "vfio/type1: Unpin zero pages"
  vfio/nvgrace-gpu: Convey kvm to map device memory region as noncached
  vfio: amba: Rename pl330_ids[] to vfio_amba_ids[]
  vfio/pds: Always clear the save/restore FDs on reset
  vfio/nvgrace-gpu: Add vfio pci variant module for grace hopper
  vfio/pci: rename and export range_intersect_range
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/fsl-mc: Block calling interrupt handler without trigger</title>
<updated>2024-03-11T19:08:52+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2024-03-08T23:05:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7447d911af699a15f8d050dfcb7c680a86f87012'/>
<id>7447d911af699a15f8d050dfcb7c680a86f87012</id>
<content type='text'>
The eventfd_ctx trigger pointer of the vfio_fsl_mc_irq object is
initially NULL and may become NULL if the user sets the trigger
eventfd to -1.  The interrupt handler itself is guaranteed that
trigger is always valid between request_irq() and free_irq(), but
the loopback testing mechanisms to invoke the handler function
need to test the trigger.  The triggering and setting ioctl paths
both make use of igate and are therefore mutually exclusive.

The vfio-fsl-mc driver does not make use of irqfds, nor does it
support any sort of masking operations, therefore unlike vfio-pci
and vfio-platform, the flow can remain essentially unchanged.

Cc: Diana Craciun &lt;diana.craciun@oss.nxp.com&gt;
Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: cc0ee20bd969 ("vfio/fsl-mc: trigger an interrupt via eventfd")
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-8-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The eventfd_ctx trigger pointer of the vfio_fsl_mc_irq object is
initially NULL and may become NULL if the user sets the trigger
eventfd to -1.  The interrupt handler itself is guaranteed that
trigger is always valid between request_irq() and free_irq(), but
the loopback testing mechanisms to invoke the handler function
need to test the trigger.  The triggering and setting ioctl paths
both make use of igate and are therefore mutually exclusive.

The vfio-fsl-mc driver does not make use of irqfds, nor does it
support any sort of masking operations, therefore unlike vfio-pci
and vfio-platform, the flow can remain essentially unchanged.

Cc: Diana Craciun &lt;diana.craciun@oss.nxp.com&gt;
Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: cc0ee20bd969 ("vfio/fsl-mc: trigger an interrupt via eventfd")
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-8-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/platform: Create persistent IRQ handlers</title>
<updated>2024-03-11T19:08:52+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2024-03-08T23:05:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=675daf435e9f8e5a5eab140a9864dfad6668b375'/>
<id>675daf435e9f8e5a5eab140a9864dfad6668b375</id>
<content type='text'>
The vfio-platform SET_IRQS ioctl currently allows loopback triggering of
an interrupt before a signaling eventfd has been configured by the user,
which thereby allows a NULL pointer dereference.

Rather than register the IRQ relative to a valid trigger, register all
IRQs in a disabled state in the device open path.  This allows mask
operations on the IRQ to nest within the overall enable state governed
by a valid eventfd signal.  This decouples @masked, protected by the
@locked spinlock from @trigger, protected via the @igate mutex.

In doing so, it's guaranteed that changes to @trigger cannot race the
IRQ handlers because the IRQ handler is synchronously disabled before
modifying the trigger, and loopback triggering of the IRQ via ioctl is
safe due to serialization with trigger changes via igate.

For compatibility, request_irq() failures are maintained to be local to
the SET_IRQS ioctl rather than a fatal error in the open device path.
This allows, for example, a userspace driver with polling mode support
to continue to work regardless of moving the request_irq() call site.
This necessarily blocks all SET_IRQS access to the failed index.

Cc: Eric Auger &lt;eric.auger@redhat.com&gt;
Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: 57f972e2b341 ("vfio/platform: trigger an interrupt via eventfd")
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-7-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The vfio-platform SET_IRQS ioctl currently allows loopback triggering of
an interrupt before a signaling eventfd has been configured by the user,
which thereby allows a NULL pointer dereference.

Rather than register the IRQ relative to a valid trigger, register all
IRQs in a disabled state in the device open path.  This allows mask
operations on the IRQ to nest within the overall enable state governed
by a valid eventfd signal.  This decouples @masked, protected by the
@locked spinlock from @trigger, protected via the @igate mutex.

In doing so, it's guaranteed that changes to @trigger cannot race the
IRQ handlers because the IRQ handler is synchronously disabled before
modifying the trigger, and loopback triggering of the IRQ via ioctl is
safe due to serialization with trigger changes via igate.

For compatibility, request_irq() failures are maintained to be local to
the SET_IRQS ioctl rather than a fatal error in the open device path.
This allows, for example, a userspace driver with polling mode support
to continue to work regardless of moving the request_irq() call site.
This necessarily blocks all SET_IRQS access to the failed index.

Cc: Eric Auger &lt;eric.auger@redhat.com&gt;
Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: 57f972e2b341 ("vfio/platform: trigger an interrupt via eventfd")
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-7-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/platform: Disable virqfds on cleanup</title>
<updated>2024-03-11T19:08:52+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2024-03-08T23:05:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fcdc0d3d40bc26c105acf8467f7d9018970944ae'/>
<id>fcdc0d3d40bc26c105acf8467f7d9018970944ae</id>
<content type='text'>
irqfds for mask and unmask that are not specifically disabled by the
user are leaked.  Remove any irqfds during cleanup

Cc: Eric Auger &lt;eric.auger@redhat.com&gt;
Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: a7fa7c77cf15 ("vfio/platform: implement IRQ masking/unmasking via an eventfd")
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-6-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
irqfds for mask and unmask that are not specifically disabled by the
user are leaked.  Remove any irqfds during cleanup

Cc: Eric Auger &lt;eric.auger@redhat.com&gt;
Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: a7fa7c77cf15 ("vfio/platform: implement IRQ masking/unmasking via an eventfd")
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-6-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/pci: Create persistent INTx handler</title>
<updated>2024-03-11T19:08:52+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2024-03-08T23:05:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=18c198c96a815c962adc2b9b77909eec0be7df4d'/>
<id>18c198c96a815c962adc2b9b77909eec0be7df4d</id>
<content type='text'>
A vulnerability exists where the eventfd for INTx signaling can be
deconfigured, which unregisters the IRQ handler but still allows
eventfds to be signaled with a NULL context through the SET_IRQS ioctl
or through unmask irqfd if the device interrupt is pending.

Ideally this could be solved with some additional locking; the igate
mutex serializes the ioctl and config space accesses, and the interrupt
handler is unregistered relative to the trigger, but the irqfd path
runs asynchronous to those.  The igate mutex cannot be acquired from the
atomic context of the eventfd wake function.  Disabling the irqfd
relative to the eventfd registration is potentially incompatible with
existing userspace.

As a result, the solution implemented here moves configuration of the
INTx interrupt handler to track the lifetime of the INTx context object
and irq_type configuration, rather than registration of a particular
trigger eventfd.  Synchronization is added between the ioctl path and
eventfd_signal() wrapper such that the eventfd trigger can be
dynamically updated relative to in-flight interrupts or irqfd callbacks.

Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Reported-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-5-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A vulnerability exists where the eventfd for INTx signaling can be
deconfigured, which unregisters the IRQ handler but still allows
eventfds to be signaled with a NULL context through the SET_IRQS ioctl
or through unmask irqfd if the device interrupt is pending.

Ideally this could be solved with some additional locking; the igate
mutex serializes the ioctl and config space accesses, and the interrupt
handler is unregistered relative to the trigger, but the irqfd path
runs asynchronous to those.  The igate mutex cannot be acquired from the
atomic context of the eventfd wake function.  Disabling the irqfd
relative to the eventfd registration is potentially incompatible with
existing userspace.

As a result, the solution implemented here moves configuration of the
INTx interrupt handler to track the lifetime of the INTx context object
and irq_type configuration, rather than registration of a particular
trigger eventfd.  Synchronization is added between the ioctl path and
eventfd_signal() wrapper such that the eventfd trigger can be
dynamically updated relative to in-flight interrupts or irqfd callbacks.

Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Reported-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-5-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio: Introduce interface to flush virqfd inject workqueue</title>
<updated>2024-03-11T19:08:52+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2024-03-08T23:05:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b620ecbd17a03cacd06f014a5d3f3a11285ce053'/>
<id>b620ecbd17a03cacd06f014a5d3f3a11285ce053</id>
<content type='text'>
In order to synchronize changes that can affect the thread callback,
introduce an interface to force a flush of the inject workqueue.  The
irqfd pointer is only valid under spinlock, but the workqueue cannot
be flushed under spinlock.  Therefore the flush work for the irqfd is
queued under spinlock.  The vfio_irqfd_cleanup_wq workqueue is re-used
for queuing this work such that flushing the workqueue is also ordered
relative to shutdown.

Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-4-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to synchronize changes that can affect the thread callback,
introduce an interface to force a flush of the inject workqueue.  The
irqfd pointer is only valid under spinlock, but the workqueue cannot
be flushed under spinlock.  Therefore the flush work for the irqfd is
queued under spinlock.  The vfio_irqfd_cleanup_wq workqueue is re-used
for queuing this work such that flushing the workqueue is also ordered
relative to shutdown.

Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-4-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/pci: Lock external INTx masking ops</title>
<updated>2024-03-11T19:08:52+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2024-03-08T23:05:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=810cd4bb53456d0503cc4e7934e063835152c1b7'/>
<id>810cd4bb53456d0503cc4e7934e063835152c1b7</id>
<content type='text'>
Mask operations through config space changes to DisINTx may race INTx
configuration changes via ioctl.  Create wrappers that add locking for
paths outside of the core interrupt code.

In particular, irq_type is updated holding igate, therefore testing
is_intx() requires holding igate.  For example clearing DisINTx from
config space can otherwise race changes of the interrupt configuration.

This aligns interfaces which may trigger the INTx eventfd into two
camps, one side serialized by igate and the other only enabled while
INTx is configured.  A subsequent patch introduces synchronization for
the latter flows.

Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Reported-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-3-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mask operations through config space changes to DisINTx may race INTx
configuration changes via ioctl.  Create wrappers that add locking for
paths outside of the core interrupt code.

In particular, irq_type is updated holding igate, therefore testing
is_intx() requires holding igate.  For example clearing DisINTx from
config space can otherwise race changes of the interrupt configuration.

This aligns interfaces which may trigger the INTx eventfd into two
camps, one side serialized by igate and the other only enabled while
INTx is configured.  A subsequent patch introduces synchronization for
the latter flows.

Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Reported-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-3-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/pci: Disable auto-enable of exclusive INTx IRQ</title>
<updated>2024-03-11T19:08:30+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2024-03-08T23:05:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fe9a7082684eb059b925c535682e68c34d487d43'/>
<id>fe9a7082684eb059b925c535682e68c34d487d43</id>
<content type='text'>
Currently for devices requiring masking at the irqchip for INTx, ie.
devices without DisINTx support, the IRQ is enabled in request_irq()
and subsequently disabled as necessary to align with the masked status
flag.  This presents a window where the interrupt could fire between
these events, resulting in the IRQ incrementing the disable depth twice.
This would be unrecoverable for a user since the masked flag prevents
nested enables through vfio.

Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
is never auto-enabled, then unmask as required.

Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-2-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently for devices requiring masking at the irqchip for INTx, ie.
devices without DisINTx support, the IRQ is enabled in request_irq()
and subsequently disabled as necessary to align with the masked status
flag.  This presents a window where the interrupt could fire between
these events, resulting in the IRQ incrementing the disable depth twice.
This would be unrecoverable for a user since the masked flag prevents
nested enables through vfio.

Instead, invert the logic using IRQF_NO_AUTOEN such that exclusive INTx
is never auto-enabled, then unmask as required.

Cc:  &lt;stable@vger.kernel.org&gt;
Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver")
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Eric Auger &lt;eric.auger@redhat.com&gt;
Link: https://lore.kernel.org/r/20240308230557.805580-2-alex.williamson@redhat.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/pds: Refactor/simplify reset logic</title>
<updated>2024-03-11T18:10:37+00:00</updated>
<author>
<name>Brett Creeley</name>
<email>brett.creeley@amd.com</email>
</author>
<published>2024-03-08T18:21:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6a7e448c6b238c9b256b01eb198b0af93ae3157e'/>
<id>6a7e448c6b238c9b256b01eb198b0af93ae3157e</id>
<content type='text'>
The current logic for handling resets is more complicated than it needs
to be. The deferred_reset flag is used to indicate a reset is needed
and the deferred_reset_state is the requested, post-reset, state.

Also, the deferred_reset logic was added to vfio migration drivers to
prevent a circular locking dependency with respect to mm_lock and state
mutex. This is mainly because of the copy_to/from_user() functions(which
takes mm_lock) invoked under state mutex.

Remove all of the deferred reset logic and just pass the requested
next state to pds_vfio_reset() so it can be used for VMM and DSC
initiated resets.

This removes the need for pds_vfio_state_mutex_lock(), so remove that
and replace its use with a simple mutex_unlock().

Also, remove the reset_mutex as it's no longer needed since the
state_mutex can be the driver's primary protector.

Suggested-by: Shameer Kolothum &lt;shameerali.kolothum.thodi@huawei.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Signed-off-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Link: https://lore.kernel.org/r/20240308182149.22036-3-brett.creeley@amd.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current logic for handling resets is more complicated than it needs
to be. The deferred_reset flag is used to indicate a reset is needed
and the deferred_reset_state is the requested, post-reset, state.

Also, the deferred_reset logic was added to vfio migration drivers to
prevent a circular locking dependency with respect to mm_lock and state
mutex. This is mainly because of the copy_to/from_user() functions(which
takes mm_lock) invoked under state mutex.

Remove all of the deferred reset logic and just pass the requested
next state to pds_vfio_reset() so it can be used for VMM and DSC
initiated resets.

This removes the need for pds_vfio_state_mutex_lock(), so remove that
and replace its use with a simple mutex_unlock().

Also, remove the reset_mutex as it's no longer needed since the
state_mutex can be the driver's primary protector.

Suggested-by: Shameer Kolothum &lt;shameerali.kolothum.thodi@huawei.com&gt;
Reviewed-by: Shannon Nelson &lt;shannon.nelson@amd.com&gt;
Signed-off-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Link: https://lore.kernel.org/r/20240308182149.22036-3-brett.creeley@amd.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
