linux-stable.git/include/linux/dma-mapping.h, branch v6.1.172

dma-mapping: add missing `inline` for `dma_free_attrs`

2026-04-11T12:16:08+00:00

[ Upstream commit 2cdaff22ed26f1e619aa2b43f27bb84f2c6ef8f8 ]

Under a UML build for an upcoming series [1], I got `-Wstatic-in-inline`
for `dma_free_attrs`:

      BINDGEN rust/bindings/bindings_generated.rs - due to target missing
    In file included from rust/helpers/helpers.c:59:
    rust/helpers/dma.c:17:2: warning: static function 'dma_free_attrs' is used in an inline function with external linkage [-Wstatic-in-inline]
       17 |         dma_free_attrs(dev, size, cpu_addr, dma_handle, attrs);
          |         ^
    rust/helpers/dma.c:12:1: note: use 'static' to give inline function 'rust_helper_dma_free_attrs' internal linkage
       12 | __rust_helper void rust_helper_dma_free_attrs(struct device *dev, size_t size,
          | ^
          | static

The issue is that `dma_free_attrs` was not marked `inline` when it was
introduced alongside the rest of the stubs.

Thus mark it.
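
For illustration only, a minimal userspace sketch of the shape of the
stub after the fix. This is not the kernel header: `struct device` and
`dma_addr_t` are stand-ins declared just for this example, and the stub
body is assumed to be empty as for a !CONFIG_HAS_DMA build.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for kernel types; assumptions made for this sketch only. */
struct device;
typedef unsigned long long dma_addr_t;
static_assert(sizeof(dma_addr_t) >= 8, "stand-in must hold a 64-bit address");

/*
 * No-op stub in the style of the !CONFIG_HAS_DMA branch. Declared with
 * plain `static` (no `inline`), a call from an inline function with
 * external linkage is what clang reported as -Wstatic-in-inline in the
 * log above. Marking the stub `static inline`, like the other stubs, is
 * the fix this commit applies.
 */
static inline void dma_free_attrs(struct device *dev, size_t size,
				  void *cpu_addr, dma_addr_t dma_handle,
				  unsigned long attrs)
{
	/* Intentionally empty: there is nothing to free without DMA. */
}
```

The alternative suggested by the compiler note, giving the calling helper
internal linkage, would change the caller rather than the header.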

Fixes: ed6ccf10f24b ("dma-mapping: properly stub out the DMA API for !CONFIG_HAS_DMA")
Closes: https://lore.kernel.org/rust-for-linux/20260322194616.89847-1-ojeda@kernel.org/ [1]
Signed-off-by: Miguel Ojeda 
Signed-off-by: Marek Szyprowski 
Link: https://lore.kernel.org/r/20260325015548.70912-1-ojeda@kernel.org
Signed-off-by: Sasha Levin

dma-mapping: avoid potential unused data compilation warning

2025-06-04T12:40:01+00:00

[ Upstream commit c9b19ea63036fc537a69265acea1b18dabd1cbd3 ]

When CONFIG_NEED_DMA_MAP_STATE is not defined, dma-mapping clients might
report unused data compilation warnings for dma_unmap_*() calls
arguments. Redefine macros for those calls to let the compiler notice that
it is okay when the provided arguments are not used.
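
A hedged sketch of one way such macros can be written, assuming GNU C
(statement expressions, `__typeof__`, `__attribute__((unused))`). It is
not necessarily the exact upstream definition. `struct demo_tx_buf` and
`demo_complete()` are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: CONFIG_NEED_DMA_MAP_STATE is not defined, nothing is stored. */
#define DEFINE_DMA_UNMAP_ADDR(ADDR_NAME)
#define DEFINE_DMA_UNMAP_LEN(LEN_NAME)

/* Reference PTR once so the compiler sees it as used; evaluate to 0. */
#define dma_unmap_addr(PTR, ADDR_NAME) \
	({ __typeof__(PTR) unused_ptr_ __attribute__((unused)) = (PTR); 0; })
#define dma_unmap_len(PTR, LEN_NAME) \
	({ __typeof__(PTR) unused_ptr_ __attribute__((unused)) = (PTR); 0; })

/* The setter only references PTR; VAL is not evaluated in this sketch. */
#define dma_unmap_len_set(PTR, LEN_NAME, VAL) \
	do { __typeof__(PTR) unused_ptr_ __attribute__((unused)) = (PTR); } while (0)

/* Hypothetical driver bookkeeping structure. */
struct demo_tx_buf {
	int id;
	DEFINE_DMA_UNMAP_ADDR(dma);
	DEFINE_DMA_UNMAP_LEN(len);
};

/* Hypothetical completion path that touches `buf` only via the macros. */
static inline int demo_complete(struct demo_tx_buf *buf)
{
	assert(buf != NULL);
	dma_unmap_len_set(buf, len, 0);
	return dma_unmap_addr(buf, dma) == 0 && dma_unmap_len(buf, len) == 0;
}
```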

Reported-by: Andy Shevchenko 
Suggested-by: Jakub Kicinski 
Signed-off-by: Marek Szyprowski 
Tested-by: Andy Shevchenko 
Link: https://lore.kernel.org/r/20250415075659.428549-1-m.szyprowski@samsung.com
Signed-off-by: Sasha Levin

dma: allow dma_get_cache_alignment() to be overridden by the arch code

2024-12-14T18:53:57+00:00

commit 8c57da28dc3df4e091474a004b5596c9b88a3be0 upstream.

On arm64, ARCH_DMA_MINALIGN is larger than most cache line size
configurations deployed.  Allow an architecture to override
dma_get_cache_alignment() in order to return a run-time probed value (e.g.
cache_line_size()).
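
As a rough userspace sketch of the override pattern, assuming the arch
header provides a same-named macro and the generic header only supplies
its default under `#ifndef`. The probed value, the `cache_line_size()`
stand-in and `demo_dma_padded_size()` are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed compile-time maximum for the sketch; real values are per arch. */
#define ARCH_DMA_MINALIGN 128

/* Stand-in for a run-time probed cache line size. */
static int demo_probed_cache_line_size = 64;

static inline int cache_line_size(void)
{
	return demo_probed_cache_line_size;
}

/* "Arch header": override the generic helper with the run-time value. */
#define dma_get_cache_alignment cache_line_size

/* "Generic header": only used when the arch did not override it. */
#ifndef dma_get_cache_alignment
static inline int dma_get_cache_alignment(void)
{
	return ARCH_DMA_MINALIGN;
}
#endif

/* Hypothetical caller: pad a length up to the DMA cache alignment. */
static inline size_t demo_dma_padded_size(size_t len)
{
	size_t align = (size_t)dma_get_cache_alignment();

	assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
	return (len + align - 1) & ~(align - 1);
}
```

With the override in place, callers pad to the probed line size instead
of the static worst case.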

Link: https://lkml.kernel.org/r/20230612153201.554742-3-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas 
Reviewed-by: Christoph Hellwig 
Tested-by: Isaac J. Manjarres 
Cc: Robin Murphy 
Cc: Will Deacon 
Cc: Alasdair Kergon 
Cc: Ard Biesheuvel 
Cc: Arnd Bergmann 
Cc: Daniel Vetter 
Cc: Greg Kroah-Hartman 
Cc: Herbert Xu 
Cc: Jerry Snitselaar 
Cc: Joerg Roedel 
Cc: Jonathan Cameron 
Cc: Lars-Peter Clausen 
Cc: Logan Gunthorpe 
Cc: Marc Zyngier 
Cc: Mark Brown 
Cc: Mike Snitzer 
Cc: "Rafael J. Wysocki" 
Cc: Saravana Kannan 
Cc: Vlastimil Babka 
Signed-off-by: Andrew Morton 
Cc: Guenter Roeck 
Signed-off-by: Greg Kroah-Hartman

mm/slab: decouple ARCH_KMALLOC_MINALIGN from ARCH_DMA_MINALIGN

2024-12-14T18:53:57+00:00

commit 4ab5f8ec7d71aea5fe13a48248242130f84ac6bb upstream.

Patch series "mm, dma, arm64: Reduce ARCH_KMALLOC_MINALIGN to 8", v7.

A series reducing the kmalloc() minimum alignment on arm64 to 8 (from
128).


This patch (of 17):

In preparation for supporting a kmalloc() minimum alignment smaller than
the arch DMA alignment, decouple the two definitions.  This requires that
either the kmalloc() caches are aligned to a (run-time) cache-line size or
the DMA API bounces unaligned kmalloc() allocations.  Subsequent patches
will implement both options.

After this patch, ARCH_DMA_MINALIGN is expected to be used in static
alignment annotations and defined by an architecture to be the maximum
alignment for all supported configurations/SoCs in a single Image.
Architectures opting in to a smaller ARCH_KMALLOC_MINALIGN will need to
define its value in the arch headers.

Since ARCH_DMA_MINALIGN is now always defined, adjust the #ifdef in
dma_get_cache_alignment() so that there is no change for architectures not
requiring a minimum DMA alignment.
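
A simplified sketch of the decoupling, under stated assumptions: the
values 128 and 8 mimic the arm64 case named in the series description,
the `ARCH_HAS_DMA_MINALIGN` marker and the fallback rule for
`ARCH_KMALLOC_MINALIGN` are assumptions of this example, and
`struct demo_dma_desc` and `demo_dma_get_cache_alignment()` are
hypothetical.

```c
#include <assert.h>
#include <stdalign.h>

/* "Arch headers" (assumed values): static maximum and opted-in minimum. */
#define ARCH_DMA_MINALIGN     128
#define ARCH_KMALLOC_MINALIGN 8

/* "Generic headers": record the arch choice, else use natural alignment. */
#ifdef ARCH_DMA_MINALIGN
#define ARCH_HAS_DMA_MINALIGN
#else
#define ARCH_DMA_MINALIGN alignof(unsigned long long)
#endif

/* Simplified fallback: without an opt-in the two values stay coupled. */
#ifndef ARCH_KMALLOC_MINALIGN
#define ARCH_KMALLOC_MINALIGN ARCH_DMA_MINALIGN
#endif

static_assert(ARCH_KMALLOC_MINALIGN <= ARCH_DMA_MINALIGN,
	      "kmalloc minimum must not exceed the DMA alignment");

/* Static alignment annotation: the intended use of ARCH_DMA_MINALIGN. */
struct demo_dma_desc {
	alignas(ARCH_DMA_MINALIGN) unsigned char payload[32];
};

/* No behaviour change for arches that never required a DMA alignment. */
static inline int demo_dma_get_cache_alignment(void)
{
#ifdef ARCH_HAS_DMA_MINALIGN
	return ARCH_DMA_MINALIGN;
#else
	return 1;
#endif
}
```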

Link: https://lkml.kernel.org/r/20230612153201.554742-1-catalin.marinas@arm.com
Link: https://lkml.kernel.org/r/20230612153201.554742-2-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas 
Tested-by: Isaac J. Manjarres 
Cc: Vlastimil Babka 
Cc: Christoph Hellwig 
Cc: Robin Murphy 
Cc: Alasdair Kergon 
Cc: Ard Biesheuvel 
Cc: Arnd Bergmann 
Cc: Daniel Vetter 
Cc: Greg Kroah-Hartman 
Cc: Herbert Xu 
Cc: Joerg Roedel 
Cc: Jonathan Cameron 
Cc: Marc Zyngier 
Cc: Mark Brown 
Cc: Mike Snitzer 
Cc: Rafael J. Wysocki 
Cc: Saravana Kannan 
Cc: Will Deacon 
Cc: Jerry Snitselaar 
Cc: Lars-Peter Clausen 
Cc: Logan Gunthorpe 
Signed-off-by: Andrew Morton 
Cc: Guenter Roeck 
Signed-off-by: Greg Kroah-Hartman

dma-mapping: mark dma_supported static

2022-09-07T08:38:28+00:00

Now that the remaining users in drivers are gone, this function can be
marked static.

Signed-off-by: Christoph Hellwig

dma-mapping: add flags to dma_map_ops to indicate PCI P2PDMA support

2022-07-26T11:27:48+00:00

Add a flags member to the dma_map_ops structure with one flag to
indicate support for PCI P2PDMA.

Also, add a helper to check if a device supports PCI P2PDMA.
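
A minimal model of the flag and the helper, not the kernel code. All
`demo_`/`DEMO_` names are hypothetical, and treating a device without
ops as supporting P2PDMA is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit, modelled on the one flag the commit describes. */
#define DEMO_DMA_F_PCI_P2PDMA_SUPPORTED (1u << 0)

/* Reduced stand-ins for the ops table and the device. */
struct demo_dma_map_ops {
	unsigned int flags;
};

struct demo_device {
	const struct demo_dma_map_ops *dma_ops; /* NULL: no custom ops */
};

/* Helper in the spirit of the one added here: test the flag on the ops. */
static inline bool demo_dma_pci_p2pdma_supported(const struct demo_device *dev)
{
	assert(dev != NULL);

	/* Assumption: without custom ops the default path supports P2PDMA. */
	if (!dev->dma_ops)
		return true;

	return (dev->dma_ops->flags & DEMO_DMA_F_PCI_P2PDMA_SUPPORTED) != 0;
}
```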

Signed-off-by: Logan Gunthorpe 
Reviewed-by: Jason Gunthorpe 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Christoph Hellwig

dma-mapping: add dma_opt_mapping_size()

2022-07-19T04:05:41+00:00

Streaming DMA mapping involving an IOMMU may be much slower for larger
total mapping size. This is because every IOMMU DMA mapping requires an
IOVA to be allocated and freed. IOVA sizes above a certain limit are not
cached, which can have a big impact on DMA mapping performance.

Provide an API for device drivers to know this "optimal" limit, such that
they may try to produce mappings which don't exceed it.
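
A sketch of how a driver might consume such a limit, assuming it already
has the maximum and optimal mapping sizes in bytes. `struct
demo_dma_limits` and `demo_max_sectors()` are hypothetical and stand in
for values a driver would query from the DMA API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the per-device DMA mapping limits, in bytes. */
struct demo_dma_limits {
	size_t max_mapping_size; /* hard limit for one mapping */
	size_t opt_mapping_size; /* above this, mappings get slow */
};

/*
 * Cap the per-request size, in 512-byte sectors, so that one request
 * never needs a mapping larger than the optimal size.
 */
static inline unsigned int demo_max_sectors(const struct demo_dma_limits *lim,
					    unsigned int hw_max_sectors)
{
	size_t bytes, sectors;

	assert(lim != NULL);

	bytes = lim->max_mapping_size < lim->opt_mapping_size ?
		lim->max_mapping_size : lim->opt_mapping_size;
	sectors = bytes >> 9;

	return sectors < hw_max_sectors ? (unsigned int)sectors : hw_max_sectors;
}
```

For example, an optimal size of 128 KiB caps requests at 256 sectors even
if the hardware could take 1024.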

Signed-off-by: John Garry 
Reviewed-by: Damien Le Moal 
Acked-by: Martin K. Petersen 
Signed-off-by: Christoph Hellwig

Reinstate some of "swiotlb: rework "fix info leak with DMA_FROM_DEVICE""

2022-03-28T18:37:05+00:00

Halil Pasic points out [1] that, instead of the full revert of that
commit (revert in bddac7c1e02b), a partial revert that only reverts the
problematic case, but still keeps some of the cleanups, is probably
better.

And that partial revert [2] had already been verified by Oleksandr
Natalenko to also fix the issue, I had just missed that in the long
discussion.

So let's reinstate the cleanups from commit aa6f8dcbab47 ("swiotlb:
rework "fix info leak with DMA_FROM_DEVICE""), and effectively only
revert the part that caused problems.

Link: https://lore.kernel.org/all/20220328013731.017ae3e3.pasic@linux.ibm.com/ [1]
Link: https://lore.kernel.org/all/20220324055732.GB12078@lst.de/ [2]
Link: https://lore.kernel.org/all/4386660.LvFx2qVVIh@natalenko.name/ [3]
Suggested-by: Halil Pasic 
Tested-by: Oleksandr Natalenko 
Cc: Christoph Hellwig
Signed-off-by: Linus Torvalds

Revert "swiotlb: rework "fix info leak with DMA_FROM_DEVICE""

2022-03-26T17:42:04+00:00

This reverts commit aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13.

It turns out this breaks at least the ath9k wireless driver, and
possibly others.

What the ath9k driver does on packet receive is to set up the DMA
transfer with:

  int ath_rx_init(..)
  ..
                bf->bf_buf_addr = dma_map_single(sc->dev, skb->data,
                                                 common->rx_bufsize,
                                                 DMA_FROM_DEVICE);

and then the receive logic (through ath_rx_tasklet()) will fetch
incoming packets

  static bool ath_edma_get_buffers(..)
  ..
        dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
                                common->rx_bufsize, DMA_FROM_DEVICE);

        ret = ath9k_hw_process_rxdesc_edma(ah, rs, skb->data);
        if (ret == -EINPROGRESS) {
                /*let device gain the buffer again*/
                dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
                                common->rx_bufsize, DMA_FROM_DEVICE);
                return false;
        }

and it's worth noting how that first DMA sync:

    dma_sync_single_for_cpu(..DMA_FROM_DEVICE);

is there to make sure the CPU can read the DMA buffer (possibly by
copying it from the bounce buffer area, or by doing some cache flush).
The iommu correctly turns that into a "copy from bounce buffer" so that
the driver can look at the state of the packets.

In the meantime, the device may continue to write to the DMA buffer, but
we at least have a snapshot of the state due to that first DMA sync.

But that _second_ DMA sync:

    dma_sync_single_for_device(..DMA_FROM_DEVICE);

is telling the DMA mapping that the CPU wasn't interested in the area
because the packet wasn't there.  In the case of a DMA bounce buffer,
that is a no-op.

Note how it's not a sync for the CPU (the "for_device()" part), and it's
not a sync for data written by the CPU (the "DMA_FROM_DEVICE" part).

Or rather, it _should_ be a no-op.  That's what commit aa6f8dcbab47
broke: it made the code bounce the buffer unconditionally, and changed
the DMA_FROM_DEVICE to just unconditionally and illogically be
DMA_TO_DEVICE.

[ Side note: purely within the confines of the swiotlb driver it wasn't
  entirely illogical: The reason it did that odd DMA_FROM_DEVICE ->
  DMA_TO_DEVICE conversion thing is because inside the swiotlb driver,
  it uses just a swiotlb_bounce() helper that doesn't care about the
  whole distinction of who the sync is for - only which direction to
  bounce.

  So it took the "sync for device" to mean that the CPU must have been
  the one writing, and thought it meant DMA_TO_DEVICE. ]

Also note how the commentary in that commit was wrong, probably due to
that whole confusion, claiming that the commit makes the swiotlb code

                                  "bounce unconditionally (that is, also
    when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale
    data from the swiotlb buffer"

which is nonsensical for two reasons:

 - that "also when dir == DMA_TO_DEVICE" is nonsensical, as that was
   exactly when it always did - and should do - the bounce.

 - since this is a sync for the device (not for the CPU), we're clearly
   fundamentally not copying back stale data from the bounce buffers at
   all, because we'd be copying *to* the bounce buffers.

So that commit was just very confused.  It confused the direction of the
synchronization (to the device, not the cpu) with the direction of the
DMA (from the device).
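
To make the distinction concrete, here is a small userspace model of the
sync semantics described above. It is only a sketch: the `demo_` names,
the fixed buffer size and the two-buffer layout are assumptions of this
example and not the swiotlb implementation.

```c
#include <assert.h>
#include <string.h>

enum demo_dir { DEMO_TO_DEVICE, DEMO_FROM_DEVICE, DEMO_BIDIRECTIONAL };

#define DEMO_BUF_SIZE 8

/* Hypothetical bounced mapping: the driver buffer and its bounce copy. */
struct demo_mapping {
	unsigned char cpu_buf[DEMO_BUF_SIZE];    /* what the driver reads */
	unsigned char bounce_buf[DEMO_BUF_SIZE]; /* what the device writes */
};

/* Sync for the CPU: make device-written data visible to the driver. */
static inline void demo_sync_for_cpu(struct demo_mapping *m, enum demo_dir dir)
{
	assert(m != NULL);
	if (dir == DEMO_FROM_DEVICE || dir == DEMO_BIDIRECTIONAL)
		memcpy(m->cpu_buf, m->bounce_buf, DEMO_BUF_SIZE);
}

/* Sync for the device: make CPU-written data visible to the device. */
static inline void demo_sync_for_device(struct demo_mapping *m, enum demo_dir dir)
{
	assert(m != NULL);
	if (dir == DEMO_TO_DEVICE || dir == DEMO_BIDIRECTIONAL)
		memcpy(m->bounce_buf, m->cpu_buf, DEMO_BUF_SIZE);
	/* DEMO_FROM_DEVICE: the CPU wrote nothing, so this is a no-op. */
}
```

In this model, a `demo_sync_for_device(.., DEMO_FROM_DEVICE)` that copied
unconditionally would overwrite whatever the device had written since the
last sync for the CPU, which is the kind of breakage described for ath9k.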

Reported-and-bisected-by: Oleksandr Natalenko 
Reported-by: Olha Cherevyk 
Cc: Halil Pasic 
Cc: Christoph Hellwig 
Cc: Kalle Valo 
Cc: Robin Murphy 
Cc: Toke Høiland-Jørgensen 
Cc: Maxime Bizon 
Cc: Johannes Berg 
Signed-off-by: Linus Torvalds

swiotlb: rework "fix info leak with DMA_FROM_DEVICE"

2022-03-07T19:26:02+00:00

Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph
(the swiotlb maintainer) asked me to create an incremental fix (after I
pointed out the mix-up and asked him for guidance).
So here we go.

The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
  must take precedence over DMA_ATTR_SKIP_CPU_SYNC

Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.

Let me note that if the size used with dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.

To get this bulletproof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address, and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.

Signed-off-by: Halil Pasic 
Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig 
Signed-off-by: Linus Torvalds