| author | Koichiro Den <den@valinux.co.jp> | 2025-10-23 16:21:05 +0900 |
|---|---|---|
| committer | Jon Mason <jdmason@kudzu.us> | 2026-02-20 17:31:55 -0500 |
| commit | 322617a06c97153f7b0681ecaa55490abccff7fa (patch) | |
| tree | 47e4a6d2907f0a721721815bc75320c5185a77b5 /include/linux/i2c/git@git.tavy.me:linux.git | |
| parent | b36490b5fb9866295cc13808b04a968b13acbab3 (diff) | |
NTB: ntb_transport: Add 'tx_memcpy_offload' module option
Some platforms (e.g. R-Car S4) do not gain from using a DMAC on the TX path
in ntb_transport and end up CPU-bound on memcpy_toio(). Add a module
parameter 'tx_memcpy_offload' that moves the TX memcpy_toio() and
descriptor writes to a per-QP kernel thread. It is disabled by default.
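
To illustrate the offload idea described above, here is a minimal userspace sketch, not the driver's code. It assumes a POSIX threads environment; `struct tx_qp`, `tx_enqueue()`, `tx_worker()` and the `mw` array (a stand-in for the peer memory window) are hypothetical names, and plain `memcpy()` stands in for `memcpy_toio()`. The submitter only queues an entry, while a dedicated per-queue thread performs the copy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 8   /* hypothetical ring depth */
#define FRAME_MAX 64  /* hypothetical max frame size */

struct tx_entry { const void *buf; size_t len; };

struct tx_qp {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    struct tx_entry ring[RING_SIZE];
    unsigned int head, tail;                /* head: producer, tail: consumer */
    bool stop;
    unsigned char mw[RING_SIZE][FRAME_MAX]; /* stand-in for the peer MW */
    unsigned int copied;
};

/* Per-queue worker: performs the copy so the submitter does not have to. */
static void *tx_worker(void *arg)
{
    struct tx_qp *qp = arg;

    pthread_mutex_lock(&qp->lock);
    for (;;) {
        while (qp->head == qp->tail && !qp->stop)
            pthread_cond_wait(&qp->cond, &qp->lock);
        if (qp->head == qp->tail)
            break;                          /* stop requested, queue drained */

        unsigned int slot = qp->tail % RING_SIZE;
        struct tx_entry e = qp->ring[slot];

        /* Copy without the lock held; the slot stays reserved until tail moves. */
        pthread_mutex_unlock(&qp->lock);
        memcpy(qp->mw[slot], e.buf, e.len);
        pthread_mutex_lock(&qp->lock);

        qp->tail++;
        qp->copied++;
    }
    pthread_mutex_unlock(&qp->lock);
    return NULL;
}

/* Submitter side: only queues the entry and wakes the worker. */
static bool tx_enqueue(struct tx_qp *qp, const void *buf, size_t len)
{
    bool ok = false;

    if (len > FRAME_MAX)
        return false;
    pthread_mutex_lock(&qp->lock);
    if (qp->head - qp->tail < RING_SIZE) {
        qp->ring[qp->head % RING_SIZE] = (struct tx_entry){ buf, len };
        qp->head++;
        ok = true;
        pthread_cond_signal(&qp->cond);
    }
    pthread_mutex_unlock(&qp->lock);
    return ok;
}

/* Queues one frame, stops the worker, and reports whether the copy landed. */
static int tx_offload_demo(void)
{
    static const char frame[] = "hello";
    struct tx_qp qp = { .head = 0 };
    pthread_t worker;

    pthread_mutex_init(&qp.lock, NULL);
    pthread_cond_init(&qp.cond, NULL);
    int rc = pthread_create(&worker, NULL, tx_worker, &qp);
    assert(rc == 0);

    bool queued = tx_enqueue(&qp, frame, sizeof(frame));

    pthread_mutex_lock(&qp.lock);
    qp.stop = true;
    pthread_cond_signal(&qp.cond);
    pthread_mutex_unlock(&qp.lock);
    pthread_join(worker, NULL);

    int ok = queued && qp.copied == 1 &&
             memcmp(qp.mw[0], frame, sizeof(frame)) == 0;
    pthread_cond_destroy(&qp.cond);
    pthread_mutex_destroy(&qp.lock);
    return ok;
}
```

The design trade-off is the usual one for offloading: the submitting context returns quickly, at the cost of a wakeup and a context switch per batch of entries, which is presumably why the option is off by default.
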
This change also fixes a rare ordering hazard in ntb_tx_copy_callback()
that was observed on R-Car S4 once throughput improved with the new
module parameter: the DONE flag write to the peer MW, which is WC
mapped, could be observed after the DB/MSI trigger. Both operations are
posted PCIe MWr (often via different OB iATUs), so WC buffering and
bridges may reorder visibility. Insert dma_mb() to enforce store->load
ordering and then read back hdr->flags to flush the posted write before
ringing the doorbell / issuing MSI.
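
The sequence described above (write DONE, barrier, read back, then notify) can be sketched in userspace C as follows. This is only an illustration under stated assumptions: `atomic_thread_fence()` stands in for `dma_mb()`, a `volatile` field stands in for the WC-mapped `hdr->flags`, and `publish_done_then_notify()`, `struct db_probe`, `probe_doorbell()` and the `DESC_DONE_FLAG` value are hypothetical. Ordinary RAM does not reproduce the PCIe behaviour; on real hardware the read-back is a non-posted read that cannot complete until the earlier posted write has reached the device.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DESC_DONE_FLAG 1u /* hypothetical value, for illustration only */

struct payload_header {
    volatile uint32_t flags; /* stand-in for the header in the peer MW */
};

/*
 * Publish the DONE flag, then notify the peer.
 * Returns the value read back so callers can check that it landed.
 */
static uint32_t publish_done_then_notify(struct payload_header *hdr,
                                         void (*ring_doorbell)(void *),
                                         void *ctx)
{
    hdr->flags = DESC_DONE_FLAG;               /* 1. flag write */
    atomic_thread_fence(memory_order_seq_cst); /* 2. stand-in for dma_mb() */
    uint32_t seen = hdr->flags;                /* 3. read back the flag */
    ring_doorbell(ctx);                        /* 4. only now notify */
    return seen;
}

/* Test helper: records what the "peer" would see when the doorbell rings. */
struct db_probe {
    struct payload_header *hdr;
    uint32_t flags_at_ring;
    int rings;
};

static void probe_doorbell(void *ctx)
{
    struct db_probe *p = ctx;

    p->flags_at_ring = p->hdr->flags;
    p->rings++;
}
```
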
While at it, update tx_index with WRITE_ONCE() at the earliest possible
location to make ntb_transport_tx_free_entry() robust.
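
As a rough illustration of why publishing the index early helps a lockless reader, here is a simplified userspace sketch. It is not the driver's implementation: `struct tx_ring`, `tx_claim_slot()` and `tx_free_entries()` are hypothetical, the free-entry arithmetic is a simplified assumption, and the `WRITE_ONCE()`/`READ_ONCE()` macros are reduced volatile-cast versions of the kernel ones (they rely on the GCC/Clang `__typeof__` extension).

```c
/* Simplified stand-ins for the kernel macros (GCC/Clang __typeof__). */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

struct tx_ring {
    unsigned int tx_index;     /* next slot the sender will use */
    unsigned int remote_entry; /* index reported back by the peer */
    unsigned int tx_max_entry; /* number of slots in the ring */
};

/*
 * Claim the current slot and publish the new index immediately, before any
 * slow copy starts, so a concurrent reader never counts the slot as free.
 */
static unsigned int tx_claim_slot(struct tx_ring *r)
{
    unsigned int idx = READ_ONCE(r->tx_index);

    WRITE_ONCE(r->tx_index, (idx + 1) % r->tx_max_entry);
    return idx;
}

/* Lockless reader: one untorn load of each index, then plain arithmetic. */
static unsigned int tx_free_entries(const struct tx_ring *r)
{
    unsigned int head = READ_ONCE(r->tx_index);
    unsigned int tail = READ_ONCE(r->remote_entry);

    return tail >= head ? tail - head : r->tx_max_entry + tail - head;
}
```
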
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Diffstat (limited to 'include/linux/i2c/git@git.tavy.me:linux.git')
0 files changed, 0 insertions, 0 deletions
