linux-stable.git/drivers/pci, branch v6.8.2

PCI: brcmstb: Fix broken brcm_pcie_mdio_write() polling

2024-03-26T22:17:20+00:00

[ Upstream commit 039741a8d7c9a01c1bc84a5ac5aa770a5e138a30 ]

The MDIO_WT_DONE() macro tests bit 31, which is always 0 (== done) as
readw_poll_timeout_atomic() does a 16-bit read. Replace with the readl
variant.

[kwilczynski: commit log]
Fixes: ca5dcc76314d ("PCI: brcmstb: Replace status loops with read_poll_timeout_atomic()")
Link: https://lore.kernel.org/linux-pci/20240217133722.14391-1-wahrenst@gmx.net
Signed-off-by: Jonathan Bell 
Signed-off-by: Stefan Wahren 
Signed-off-by: Krzysztof Wilczyński 
Acked-by: Florian Fainelli 
Signed-off-by: Sasha Levin

PCI: Mark 3ware-9650SE Root Port Extended Tags as broken

2024-03-26T22:17:08+00:00

[ Upstream commit baf67aefbe7d7deafa59ca49612d163f8889934c ]

Per PCIe r6.1, sec 2.2.6.2 and 7.5.3.4, a Requester may not use 8-bit Tags
unless its Extended Tag Field Enable is set, but all Receivers/Completers
must handle 8-bit Tags correctly regardless of their Extended Tag Field
Enable.

Some devices do not handle 8-bit Tags as Completers, so add a quirk for
them.  If we find such a device, we disable Extended Tags for the entire
hierarchy to make peer-to-peer DMA possible.

The 3ware 9650SE seems to have issues with handling 8-bit tags. Mark it as
broken.

This fixes PCI Parity Errors like :

  3w-9xxx: scsi0: ERROR: (0x06:0x000C): PCI Parity Error: clearing.
  3w-9xxx: scsi0: ERROR: (0x06:0x000D): PCI Abort: clearing.
  3w-9xxx: scsi0: ERROR: (0x06:0x000E): Controller Queue Error: clearing.
  3w-9xxx: scsi0: ERROR: (0x06:0x0010): Microcontroller Error: clearing.

Link: https://lore.kernel.org/r/20240219132811.8351-1-joerg@wedekind.de
Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=202425
Signed-off-by: Jörg Wedekind 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Sasha Levin

NTB: fix possible name leak in ntb_register_device()

2024-03-26T22:17:06+00:00

[ Upstream commit aebfdfe39b9327a3077d0df8db3beb3160c9bdd0 ]

If device_register() fails in ntb_register_device(), the device name
allocated by dev_set_name() should be freed. As per the comment in
device_register(), callers should use put_device() to give up the
reference in the error path. So fix this by calling put_device() in the
error path so that the name can be freed in kobject_cleanup().

As a result of this, put_device() in the error path of
ntb_register_device() is removed and the actual error is returned.

Fixes: a1bd3baeb2f1 ("NTB: Add NTB hardware abstraction layer")
Signed-off-by: Yang Yingliang 
Reviewed-by: Ilpo Järvinen 
Reviewed-by: Manivannan Sadhasivam 
Reviewed-by: Dave Jiang 
Link: https://lore.kernel.org/r/20231201033057.1399131-1-yangyingliang@huaweicloud.com
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam 
Signed-off-by: Sasha Levin

PCI: switchtec: Fix an error handling path in switchtec_pci_probe()

2024-03-26T22:17:04+00:00

[ Upstream commit dec529b0b0572b32f9eb91c882dd1f08ca657efb ]

The commit in Fixes changed the logic on how resources are released and
introduced a new switchtec_exit_pci() that need to be called explicitly in
order to undo a corresponding switchtec_init_pci().

This was done in the remove function, but not in the probe.

Fix the probe now.

Fixes: df25461119d9 ("PCI: switchtec: Fix stdev_release() crash after surprise hot remove")
Link: https://lore.kernel.org/r/01446d2ccb91a578239915812f2b7dfbeb2882af.1703428183.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Sasha Levin

PCI/P2PDMA: Fix a sleeping issue in a RCU read section

2024-03-26T22:17:04+00:00

[ Upstream commit 1e5c66afd4a40bb7be17cb33cbb1a1085f727730 ]

It is not allowed to sleep within a RCU read section, so use GFP_ATOMIC
instead of GFP_KERNEL here.

Link: https://lore.kernel.org/r/02d9ec4a10235def0e764ff1f5be881ba12e16e8.1704397858.git.christophe.jaillet@wanadoo.fr
Fixes: ae21f835a5bd ("PCI/P2PDMA: Finish RCU conversion of pdev->p2pdma")
Signed-off-by: Christophe JAILLET 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Logan Gunthorpe 
Signed-off-by: Sasha Levin

PCI/DPC: Print all TLP Prefixes, not just the first

2024-03-26T22:17:00+00:00

[ Upstream commit 6568d82512b0a64809acff3d7a747362fa4288c8 ]

The TLP Prefix Log Register consists of multiple DWORDs (PCIe r6.1 sec
7.9.14.13) but the loop in dpc_process_rp_pio_error() keeps reading from
the first DWORD, so we print only the first PIO TLP Prefix (duplicated
several times), and we never print the second, third, etc., Prefixes.

Add the iteration count based offset calculation into the config read.

Fixes: f20c4ea49ec4 ("PCI/DPC: Add eDPC support")
Link: https://lore.kernel.org/r/20240118110815.3867-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen 
[bhelgaas: add user-visible details to commit log]
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Sasha Levin

PCI: Make pci_dev_is_disconnected() helper public for other drivers

2024-03-26T22:16:52+00:00

[ Upstream commit 39714fd73c6b60a8d27bcc5b431afb0828bf4434 ]

Make pci_dev_is_disconnected() public so that it can be called from
Intel VT-d driver to quickly fix/workaround the surprise removal
unplug hang issue for those ATS capable devices on PCIe switch downstream
hotplug capable ports.

Beside pci_device_is_present() function, this one has no config space
space access, so is light enough to optimize the normal pure surprise
removal and safe removal flow.

Acked-by: Bjorn Helgaas 
Reviewed-by: Dan Carpenter 
Tested-by: Haorong Ye 
Signed-off-by: Ethan Zhao 
Link: https://lore.kernel.org/r/20240301080727.3529832-2-haifeng.zhao@linux.intel.com
Signed-off-by: Lu Baolu 
Signed-off-by: Joerg Roedel 
Stable-dep-of: 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected")
Signed-off-by: Sasha Levin

PCI/MSI: Prevent MSI hardware interrupt number truncation

2024-02-19T15:11:01+00:00

While calculating the hardware interrupt number for a MSI interrupt, the
higher bits (i.e. from bit-5 onwards a.k.a domain_nr >= 32) of the PCI
domain number gets truncated because of the shifted value casting to return
type of pci_domain_nr() which is 'int'. This for example is resulting in
same hardware interrupt number for devices 0019:00:00.0 and 0039:00:00.0.

To address this cast the PCI domain number to 'irq_hw_number_t' before left
shifting it to calculate the hardware interrupt number.

Please note that this fixes the issue only on 64-bit systems and doesn't
change the behavior for 32-bit systems i.e. the 32-bit systems continue to
have the issue. Since the issue surfaces only if there are too many PCIe
controllers in the system which usually is the case in modern server
systems and they don't tend to run 32-bit kernels.

Fixes: 3878eaefb89a ("PCI/MSI: Enhance core to support hierarchy irqdomain")
Signed-off-by: Vidya Sagar 
Signed-off-by: Thomas Gleixner 
Tested-by: Shanker Donthineni 
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240115135649.708536-1-vidyas@nvidia.com

Merge tag 'pci-v6.8-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

2024-02-17T16:06:20+00:00

Pull pci fixes from Bjorn Helgaas:

 - Keep bridges in D0 if we need to poll downstream devices for PME to
   resolve a v6.6 regression where we failed to enumerate devices below
   bridges put in D3hot by runtime PM, e.g., NVMe drives connected via
   Thunderbolt or USB4 docks (Alex Williamson)

 - Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer

* tag 'pci-v6.8-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  MAINTAINERS: Add Siddharth Vadapalli as PCI TI DRA7XX/J721E reviewer
  PCI: Fix active state requirement in PME polling

PCI: Fix active state requirement in PME polling

2024-02-09T19:03:21+00:00

The commit noted in fixes added a bogus requirement that runtime PM managed
devices need to be in the RPM_ACTIVE state for PME polling.  In fact, only
devices in low power states should be polled.

However there's still a requirement that the device config space must be
accessible, which has implications for both the current state of the polled
device and the parent bridge, when present.  It's not sufficient to assume
the bridge remains in D0 and cases have been observed where the bridge
passes the D0 test, but the PM state indicates RPM_SUSPENDING and config
space of the polled device becomes inaccessible during pci_pme_wakeup().

Therefore, since the bridge is already effectively required to be in the
RPM_ACTIVE state, formalize this in the code and elevate the PM usage count
to maintain the state while polling the subordinate device.

This resolves a regression reported in the bugzilla below where a
Thunderbolt/USB4 hierarchy fails to scan for an attached NVMe endpoint
downstream of a bridge in a D3hot power state.

Link: https://lore.kernel.org/r/20240123185548.1040096-1-alex.williamson@redhat.com
Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling")
Reported-by: Sanath S 
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
Signed-off-by: Alex Williamson 
Signed-off-by: Bjorn Helgaas 
Tested-by: Sanath S 
Reviewed-by: Rafael J. Wysocki 
Cc: Lukas Wunner 
Cc: Mika Westerberg