linux.git/drivers/iommu/amd, branch master

iommu/amd: Bounds-check devid in __rlookup_amd_iommu()

2026-05-11T08:07:52+00:00

iommu_device_register() walks every device on the PCI bus via
bus_for_each_dev() and calls amd_iommu_probe_device() for each. The
inlined check_device() path computes the device's sbdf, calls
rlookup_amd_iommu() to find the owning IOMMU, and only afterwards
verifies devid <= pci_seg->last_bdf. __rlookup_amd_iommu() indexes
rlookup_table[devid] with no bounds check of its own, so for a PCI
device whose BDF is not described by the IVRS, the lookup reads past
the end of the allocation before the caller's bounds check can run.

This was harmless before commit e874c666b15b ("iommu/amd: Change
rlookup, irq_lookup, and alias to use kvalloc()"): the table was a
zeroed page-order allocation, so the over-read returned NULL and the
caller's NULL check skipped the device. After that commit the table is
a tight kvcalloc() and the over-read returns adjacent slab contents,
which check_device() then dereferences as a struct amd_iommu *,
causing a boot-time GPF.

Seen on Google Compute Engine ct6e VMs, where the virtualized IVRS
describes only the four TPU endpoints 00:04.0-07.0; the gVNIC at
00:08.0 (devid 0x40) indexes 56 bytes past the 456-byte allocation,
into the adjacent kmalloc-512 slab object:

  pci 0000:00:04.0: Adding to iommu group 0
  pci 0000:00:05.0: Adding to iommu group 1
  pci 0000:00:06.0: Adding to iommu group 2
  pci 0000:00:07.0: Adding to iommu group 3
  Oops: general protection fault, probably for non-canonical address 0x3a64695f78746382: 0000 [#1] SMP NOPTI
  CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.22 #1
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/06/2025
  RIP: 0010:amd_iommu_probe_device+0x54/0x3a0
  Call Trace:
   __iommu_probe_device+0x107/0x520
   probe_iommu_group+0x29/0x50
   bus_for_each_dev+0x7e/0xe0
   iommu_device_register+0xc9/0x240
   iommu_go_to_state+0x9c0/0x1c60
   amd_iommu_init+0x14/0x40
   pci_iommu_init+0x16/0x60
   do_one_initcall+0x47/0x2f0

Guard the array access in __rlookup_amd_iommu(). With the fix applied
on 6.18.22, the gVNIC at 00:08.0 is skipped cleanly and the VM boots.

Fixes: e874c666b15b ("iommu/amd: Change rlookup, irq_lookup, and alias to use kvalloc()")
Cc: stable@vger.kernel.org
Reported-by: Ziyuan Chen 
Tested-by: Ziyuan Chen 
Reviewed-by: Josef Bacik 
Assisted-by: Claude:unspecified
Signed-off-by: Jose Fernandez (Anthropic) 
Signed-off-by: Joerg Roedel

iommu/amd: Remove latent out-of-bounds access in IOMMU debugfs

2026-05-11T07:52:54+00:00

In iommu_mmio_write() and iommu_capability_write(), the variables
dbg_mmio_offset and dbg_cap_offset are declared as int. However, they
are populated using kstrtou32_from_user(). If a user provides a
sufficiently large value, it can become a negative integer.

Prior to this patch, the AMD IOMMU debugfs implementation was already
protected by different mechanisms.

1. #define OFS_IN_SZ 8 ensures the user string <= 8 bytes, so
   e.g. 0xffffffff isn't a valid input.

  if (cnt > OFS_IN_SZ)
     return -EINVAL;

2. Implicit type promotion in iommu_mmio_write(), dbg_mmio_offset is int
   and iommu->mmio_phys_end is u64

  if (dbg_mmio_offset > iommu->mmio_phys_end - sizeof(u64))
      return -EINVAL;

3. The show handlers would currently catch the negative number and
   refuse to perform the read.

Replace kstrtou32_from_user() with kstrtos32_from_user() to parse the
input, and check for negative values to explicitly prevent out-of-bounds
memory accesses directly in iommu_mmio_write() and
iommu_capability_write().

Signed-off-by: Eder Zulian 
Fixes: 7a4ee419e8c1 ("iommu/amd: Add debugfs support to dump IOMMU MMIO registers")
Cc: stable@vger.kernel.org
Signed-off-by: Joerg Roedel

iommu/amd: Fix precedence order in set_dte_passthrough()

2026-05-04T08:26:16+00:00

Bitwise OR | operator has a higher precedence than the ternary ?:
operatior. It will be incorrectly evaluated as:

new->data[1] |= (FIELD_PREP(...) | dev_data->ats_enabled) ? DTE_FLAG_IOTLB : 0;

Wrap the conditional operation in parentheses to enforce the
correct evaluation order.

Fixes: 93eee2a49c1b ("iommu/amd: Refactor logic to program the host page table in DTE")
Signed-off-by: Weinan Liu 
Reviewed-by: Jason Gunthorpe 
Reviewed-by: Vasant Hegde 
Signed-off-by: Joerg Roedel

iommu/amd: Use maximum PPR log buffer size when SNP is enabled on Family 0x19

2026-04-27T11:49:36+00:00

Due to CVE-2023-20585, the PPR log buffer must use the maximum supported
size (512K) on Genoa (Family 0x19, model >= 0x10) systems when SNP is
enabled, to mitigate a potential security vulnerability. Note that Family
0x19 models below 0x10 (Milan) do not support PPR when SNP is enabled.
Hence the PPR log size increase is only applied for model >= 0x10.
All other systems continue to use the default PPR log buffer size (8K).

Apply the errata fix by making the following changes:

- Introduce global new variable (amd_iommu_pprlog_size) to have PPR log buffer
  size. Adjust variable size for Genoa family.

- Extend 'amd_iommu_apply_erratum_snp()' to also set the PPR log buffer
  size to maximum for Family 0x19 model >= 0x10 when SNP is enabled.

- Rename PPR_* macros to make it more readable.

Link: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-3016.html
Cc: Borislav Petkov 
Cc: Suravee Suthikulpanit 
Cc: Joerg Roedel 
Signed-off-by: Vasant Hegde 
Tested-by: Dheeraj Kumar Srivastava 
Signed-off-by: Joerg Roedel

iommu/amd: Use maximum Event log buffer size when SNP is enabled on Family 0x19

2026-04-27T11:49:16+00:00

Due to CVE-2023-20585, the Event log buffer must use the maximum supported
size (512K) on Milan/Genoa (Family 0x19) systems when SNP is enabled,
to mitigate a potential security vulnerability. All other systems continue to
use the default Event log buffer size (8K).

Apply the errata fix by making the following changes:

* Introduce new global variable (amd_iommu_evtlog_size) to have event log
  buffer size. Adjust variable size for family 0x19.

* Since 'iommu_snp_enable()' must be called after the core IOMMU subsystem
  is initialized, it cannot be moved to the early init stage. The SNP errata
  must also be applied after the 'iommu_snp_enable()' check. Therefore,
  'alloc_event_buffer()' and 'iommu_enable_event_buffer()' are now called
  in the IOMMU_ENABLED state, after the errata is applied.

* Adjust alloc_event_buffer() and iommu_enable_event_buffer() to handle
  all IOMMU instances.

* Also rename EVT_* macros to make it more readable.

Link: https://www.amd.com/en/resources/product-security/bulletin/amd-sb-3016.html
Cc: Borislav Petkov 
Cc: Suravee Suthikulpanit 
Cc: Joerg Roedel 
Signed-off-by: Vasant Hegde 
Tested-by: Dheeraj Kumar Srivastava 
Signed-off-by: Joerg Roedel

Merge branches 'fixes', 'arm/smmu/updates', 'arm/smmu/bindings', 'riscv', 'intel/vt-d', 'amd/amd-vi' and 'core' into next

2026-04-09T12:18:27+00:00

iommu/amd: Invalidate IRT cache for DMA aliases

2026-04-02T09:42:45+00:00

DMA aliasing causes interrupt remapping table entries (IRTEs) to be shared
between multiple device IDs. See commit 3c124435e8dd
("iommu/amd: Support multiple PCI DMA aliases in IRQ Remapping") for more
information on this. However, the AMD IOMMU driver currently invalidates
IRTE cache entries on a per-device basis whenever an IRTE is updated, not
for each alias.

This approach leaves stale IRTE cache entries when an IRTE is cached under
one DMA alias but later updated and invalidated through a different alias.
In such cases, the original device ID is never invalidated, since it is
programmed via aliasing.

This incoherency bug has been observed when IRTEs are cached for one
Non-Transparent Bridge (NTB) DMA alias, later updated via another.

Fix this by invalidating the interrupt remapping table cache for all DMA
aliases when updating an IRTE.

Co-developed-by: Lars B. Kristiansen 
Signed-off-by: Lars B. Kristiansen 
Co-developed-by: Jonas Markussen 
Signed-off-by: Jonas Markussen 
Co-developed-by: Tore H. Larsen 
Signed-off-by: Tore H. Larsen 
Signed-off-by: Magnus Kalland 
Link: https://lore.kernel.org/linux-iommu/9204da81-f821-4034-b8ad-501e43383b56@amd.com/
Signed-off-by: Joerg Roedel

iommu/amd: Fix clone_alias() to use the original device's devid

2026-04-02T07:31:24+00:00

Currently clone_alias() assumes first argument (pdev) is always the
original device pointer. This function is called by
pci_for_each_dma_alias() which based on topology decides to send
original or alias device details in first argument.

This meant that the source devid used to look up and copy the DTE
may be incorrect, leading to wrong or stale DTE entries being
propagated to alias device.

Fix this by passing the original pdev as the opaque data argument to
both the direct clone_alias() call and pci_for_each_dma_alias(). Inside
clone_alias(), retrieve the original device from data and compute devid
from it.

Fixes: 3332364e4ebc ("iommu/amd: Support multiple PCI DMA aliases in device table")
Signed-off-by: Vasant Hegde 
Signed-off-by: Joerg Roedel

iommu/dma: Always allow DMA-FQ when iommupt provides the iommu_domain

2026-04-01T07:50:20+00:00

iommupt always supports the semantics required for DMA-FQ, when drivers
are converted to use it they automatically get support.

Detect iommpt directly instead of using IOMMU_CAP_DEFERRED_FLUSH and
remove IOMMU_CAP_DEFERRED_FLUSH from converted drivers.

This will also enable DMA-FQ on RISC-V.

Signed-off-by: Jason Gunthorpe 
Reviewed-by: Kevin Tian 
Signed-off-by: Joerg Roedel

iommu/amd: Fix illegal cap/mmio access in IOMMU debugfs

2026-03-27T08:26:59+00:00

In the current AMD IOMMU debugfs, when multiple processes simultaneously
access the IOMMU mmio/cap registers using the IOMMU debugfs, illegal
access issues can occur in the following execution flow:

1. CPU1: Sets a valid access address using iommu_mmio/capability_write,
and verifies the access address's validity in iommu_mmio/capability_show

2. CPU2: Sets an invalid address using iommu_mmio/capability_write

3. CPU1: accesses the IOMMU mmio/cap registers based on the invalid
address, resulting in an illegal access.

This patch modifies the execution process to first verify the address's
validity and then access it based on the same address, ensuring
correctness and robustness.

Signed-off-by: Guanghui Feng 
Signed-off-by: Joerg Roedel