linux-stable.git/drivers/pci, branch v6.2.4

PCI/DPC: Await readiness of secondary bus after reset

2023-03-10T08:29:54+00:00

commit 53b54ad074de1896f8b021615f65b27f557ce874 upstream.

pci_bridge_wait_for_secondary_bus() is called after a Secondary Bus
Reset, but not after a DPC-induced Hot Reset.

As a result, the delays prescribed by PCIe r6.0 sec 6.6.1 are not
observed and devices on the secondary bus may be accessed before
they're ready.

One affected device is Intel's Ponte Vecchio HPC GPU.  It comprises a
PCIe switch whose upstream port is not immediately ready after reset.
Because its config space is restored too early, it remains in
D0uninitialized, its subordinate devices remain inaccessible and DPC
recovery fails with messages such as:

  i915 0000:8c:00.0: can't change power state from D3cold to D0 (config space inaccessible)
  intel_vsec 0000:8e:00.1: can't change power state from D3cold to D0 (config space inaccessible)
  pcieport 0000:89:02.0: AER: device recovery failed

Fix it.

Link: https://lore.kernel.org/r/9f5ff00e1593d8d9a4b452398b98aa14d23fca11.1673769517.git.lukas@wunner.de
Tested-by: Ravi Kishore Koppuravuri 
Signed-off-by: Lukas Wunner 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Mika Westerberg 
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

PCI: Avoid FLR for AMD FCH AHCI adapters

2023-03-10T08:29:54+00:00

commit 63ba51db24ed1b8f8088a897290eb6c036c5435d upstream.

PCI passthrough to VMs does not work with AMD FCH AHCI adapters: the guest
OS fails to correctly probe devices attached to the controller due to FIS
communication failures:

  ata4: softreset failed (1st FIS failed)
  ...
  ata4.00: qc timeout after 5000 msecs (cmd 0xec)
  ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)

Forcing the "bus" reset method before unbinding & binding the adapter to
the vfio-pci driver solves this issue, e.g.:

  echo "bus" > /sys/bus/pci/devices//reset_method

gives a working guest OS, indicating that the default FLR reset method
doesn't work correctly.

Apply quirk_no_flr() to AMD FCH AHCI devices to work around this issue.

Link: https://lore.kernel.org/r/20230128013951.523247-1-damien.lemoal@opensource.wdc.com
Reported-by: Niklas Cassel 
Signed-off-by: Damien Le Moal 
Signed-off-by: Bjorn Helgaas 
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

PCI: hotplug: Allow marking devices as disconnected during bind/unbind

2023-03-10T08:29:54+00:00

commit 74ff8864cc842be994853095dba6db48e716400a upstream.

On surprise removal, pciehp_unconfigure_device() and acpiphp's
trim_stale_devices() call pci_dev_set_disconnected() to mark removed
devices as permanently offline.  Thereby, the PCI core and drivers know
to skip device accesses.

However pci_dev_set_disconnected() takes the device_lock and thus waits for
a concurrent driver bind or unbind to complete.  As a result, the driver's
->probe and ->remove hooks have no chance to learn that the device is gone.

That doesn't make any sense, so drop the device_lock and instead use atomic
xchg() and cmpxchg() operations to update the device state.

As a byproduct, an AB-BA deadlock reported by Anatoli is fixed which occurs
on surprise removal with AER concurrently performing a bus reset.

AER bus reset:

  INFO: task irq/26-aerdrv:95 blocked for more than 120 seconds.
  Tainted: G        W          6.2.0-rc3-custom-norework-jan11+
  schedule
  rwsem_down_write_slowpath
  down_write_nested
  pciehp_reset_slot                      # acquires reset_lock
  pci_reset_hotplug_slot
  pci_slot_reset                         # acquires device_lock
  pci_bus_error_reset
  aer_root_reset
  pcie_do_recovery
  aer_process_err_devices
  aer_isr

pciehp surprise removal:

  INFO: task irq/26-pciehp:96 blocked for more than 120 seconds.
  Tainted: G        W          6.2.0-rc3-custom-norework-jan11+
  schedule_preempt_disabled
  __mutex_lock
  mutex_lock_nested
  pci_dev_set_disconnected               # acquires device_lock
  pci_walk_bus
  pciehp_unconfigure_device
  pciehp_disable_slot
  pciehp_handle_presence_or_link_change
  pciehp_ist                             # acquires reset_lock

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590
Fixes: a6bd101b8f84 ("PCI: Unify device inaccessible")
Link: https://lore.kernel.org/r/3dc88ea82bdc0e37d9000e413d5ebce481cbd629.1674205689.git.lukas@wunner.de
Reported-by: Anatoli Antonovitch 
Signed-off-by: Lukas Wunner 
Signed-off-by: Bjorn Helgaas 
Cc: stable@vger.kernel.org # v4.20+
Cc: Keith Busch 
Signed-off-by: Greg Kroah-Hartman

PCI: Unify delay handling for reset and resume

2023-03-10T08:29:54+00:00

commit ac91e6980563ed53afadd925fa6585ffd2bc4a2c upstream.

Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait
for devices on the secondary bus to become accessible after reset:

Although it does call pci_dev_wait(), it erroneously passes the bridge's
pci_dev rather than that of a child.  The bridge of course is always
accessible while its secondary bus is reset, so pci_dev_wait() returns
immediately.

Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait()
function which is called from pci_bridge_secondary_bus_reset():

https://lore.kernel.org/linux-pci/20220523171517.32407-1-windy.bi.enflame@gmail.com/

However we already have pci_bridge_wait_for_secondary_bus() which does
almost exactly what we need.  So far it's only called on resume from
D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8).
Re-using it for Secondary Bus Resets is a leaner and more rational
approach than introducing a new function.

That only requires a few minor tweaks:

- Amend pci_bridge_wait_for_secondary_bus() to await accessibility of
  the first device on the secondary bus by calling pci_dev_wait() after
  performing the prescribed delays.  pci_dev_wait() needs two parameters,
  a reset reason and a timeout, which callers must now pass to
  pci_bridge_wait_for_secondary_bus().  The timeout is 1 sec for resume
  (PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit 821cdad5c46c ("PCI:
  Wait up to 60 seconds for device to become ready after FLR")).
  Introduce a PCI_RESET_WAIT macro for the 1 sec timeout.

- Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or
  -ENOTTY on error for consumption by pci_bridge_secondary_bus_reset().

- Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which
  is now performed by pci_bridge_wait_for_secondary_bus().  A static
  delay this long is only necessary for Conventional PCI, so modern
  PCIe systems benefit from shorter reset times as a side effect.

Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset")
Link: https://lore.kernel.org/r/da77c92796b99ec568bd070cbe4725074a117038.1673769517.git.lukas@wunner.de
Reported-by: Sheng Bi 
Tested-by: Ravi Kishore Koppuravuri 
Signed-off-by: Lukas Wunner 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Mika Westerberg 
Reviewed-by: Kuppuswamy Sathyanarayanan 
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Greg Kroah-Hartman

PCI/PM: Observe reset delay irrespective of bridge_d3

2023-03-10T08:29:53+00:00

commit 8ef0217227b42e2c34a18de316cee3da16c9bf1e upstream.

If a PCI bridge is suspended to D3cold upon entering system sleep,
resuming it entails a Fundamental Reset per PCIe r6.0 sec 5.8.

The delay prescribed after a Fundamental Reset in PCIe r6.0 sec 6.6.1
is sought to be observed by:

  pci_pm_resume_noirq()
    pci_pm_bridge_power_up_actions()
      pci_bridge_wait_for_secondary_bus()

However, pci_bridge_wait_for_secondary_bus() bails out if the bridge_d3
flag is not set.  That flag indicates whether a bridge is allowed to
suspend to D3cold at *runtime*.

Hence *no* delay is observed on resume from system sleep if runtime
D3cold is forbidden.  That doesn't make any sense, so drop the bridge_d3
check from pci_bridge_wait_for_secondary_bus().

The purpose of the bridge_d3 check was probably to avoid delays if a
bridge remained in D0 during suspend.  However the sole caller of
pci_bridge_wait_for_secondary_bus(), pci_pm_bridge_power_up_actions(),
is only invoked if the previous power state was D3cold.  Hence the
additional bridge_d3 check seems superfluous.

Fixes: ad9001f2f411 ("PCI/PM: Add missing link delays required by the PCIe spec")
Link: https://lore.kernel.org/r/eb37fa345285ec8bacabbf06b020b803f77bdd3d.1673769517.git.lukas@wunner.de
Tested-by: Ravi Kishore Koppuravuri 
Signed-off-by: Lukas Wunner 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Mika Westerberg 
Reviewed-by: Kuppuswamy Sathyanarayanan 
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Greg Kroah-Hartman

PCI: qcom: Fix host-init error handling

2023-03-10T08:28:56+00:00

[ Upstream commit 997e010de9134474dbfde52be03efd7d1bce902d ]

Implement the new host_deinit() callback so that the PHY is powered off
and regulators and clocks are disabled also on late host-init errors.

Link: https://lore.kernel.org/r/20221017114705.8277-2-johan+linaro@kernel.org
Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver")
Signed-off-by: Johan Hovold 
Signed-off-by: Lorenzo Pieralisi 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Manivannan Sadhasivam 
Signed-off-by: Sasha Levin

PCI: Fix dropping valid root bus resources with .end = zero

2023-03-10T08:28:56+00:00

[ Upstream commit 9d8ba74a181b1c81def21168795ed96cbe6f05ed ]

On r8a7791/koelsch:

  kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
  # cat /sys/kernel/debug/kmemleak
  unreferenced object 0xc3a34e00 (size 64):
    comm "swapper/0", pid 1, jiffies 4294937460 (age 199.080s)
    hex dump (first 32 bytes):
      b4 5d 81 f0 b4 5d 81 f0 c0 b0 a2 c3 00 00 00 00  .]...]..........
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [] __kmalloc+0xf0/0x140
      [<34bd6bc0>] resource_list_create_entry+0x18/0x38
      [<767046bc>] pci_add_resource_offset+0x20/0x68
      [] devm_of_pci_get_host_bridge_resources.constprop.0+0xb0/0x390

When coalescing two resources for a contiguous aperture, the second
resource is enlarged to cover the full contiguous range, while the first
resource is marked invalid.  This invalidation is done by clearing the
flags, start, and end members.

When adding the initial resources to the bus later, invalid resources are
skipped.  Unfortunately, the check for an invalid resource considers only
the end member, causing false positives.

E.g. on r8a7791/koelsch, root bus resource 0 ("bus 00") is skipped, and no
longer registered with pci_bus_insert_busn_res() (causing the memory leak),
nor printed:

   pci-rcar-gen2 ee090000.pci: host bridge /soc/pci@ee090000 ranges:
   pci-rcar-gen2 ee090000.pci:      MEM 0x00ee080000..0x00ee08ffff -> 0x00ee080000
   pci-rcar-gen2 ee090000.pci: PCI: revision 11
   pci-rcar-gen2 ee090000.pci: PCI host bridge to bus 0000:00
  -pci_bus 0000:00: root bus resource [bus 00]
   pci_bus 0000:00: root bus resource [mem 0xee080000-0xee08ffff]

Fix this by only skipping resources where all of the flags, start, and end
members are zero.

Fixes: 7c3855c423b17f6c ("PCI: Coalesce host bridge contiguous apertures")
Link: https://lore.kernel.org/r/da0fcd5e86c74239be79c7cb03651c0fce31b515.1676036673.git.geert+renesas@glider.be
Tested-by: Niklas Schnelle 
Signed-off-by: Geert Uytterhoeven 
Signed-off-by: Bjorn Helgaas 
Acked-by: Kai-Heng Feng 
Signed-off-by: Sasha Levin

PCI: mt7621: Delay phy ports initialization

2023-03-10T08:28:54+00:00

[ Upstream commit 0cb2a8f3456ff1cc51d571e287a48e8fddc98ec2 ]

Some devices like ZBT WE1326 and ZBT WF3526-P and some Netgear models need
to delay phy port initialization after calling the mt7621_pcie_init_port()
driver function to get into reliable boots for both warm and hard resets.

The delay required to detect the ports seems to be in the range [75-100]
milliseconds.

If the ports are not detected the controller is not functional.

There is no datasheet or something similar to really understand why this
extra delay is needed only for these devices and it is not for most of
the boards that are built on mt7621 SoC.

This issue has been reported by openWRT community and the complete
discussion is in [0]. The 100 milliseconds delay has been tested in all
devices to validate it.

Add the extra 100 milliseconds delay to fix the issue.

[0]: https://github.com/openwrt/openwrt/pull/11220

Link: https://lore.kernel.org/r/20221231074041.264738-1-sergio.paracuellos@gmail.com
Fixes: 2bdd5238e756 ("PCI: mt7621: Add MediaTek MT7621 PCIe host controller driver")
Signed-off-by: Sergio Paracuellos 
Signed-off-by: Lorenzo Pieralisi 
Signed-off-by: Sasha Levin

PCI: endpoint: pci-epf-vntb: Add epf_ntb_mw_bar_clear() num_mws kernel-doc

2023-03-10T08:28:49+00:00

[ Upstream commit fd858402c6d0a80e0b543886b9f7865c6d76d5d6 ]

8e4bfbe644a6 ("PCI: endpoint: pci-epf-vntb: fix error handle in
epf_ntb_mw_bar_init()") added a "num_mws" parameter to
epf_ntb_mw_bar_clear() but failed to add kernel-doc for num_mws.

Add kernel-doc for num_mws on epf_ntb_mw_bar_clear().

Fixes: 8e4bfbe644a6 ("PCI: endpoint: pci-epf-vntb: fix error handle in epf_ntb_mw_bar_init()")
Link: https://lore.kernel.org/r/20230103024907.293853-1-yangyingliang@huawei.com
Signed-off-by: Yang Yingliang 
Signed-off-by: Bjorn Helgaas 
Signed-off-by: Sasha Levin

PCI: switchtec: Return -EFAULT for copy_to_user() errors

2023-03-10T08:28:49+00:00

[ Upstream commit ddc10938e08cd7aac63d8385f7305f7889df5179 ]

switchtec_dev_read() didn't handle copy_to_user() errors correctly: it
assigned "rc = -EFAULT", but actually returned either "size", -ENXIO, or
-EBADMSG instead.

Update the failure cases to unlock mrpc_mutex and return -EFAULT directly.

Link: https://lore.kernel.org/r/20221216162126.207863-3-helgaas@kernel.org
Fixes: 080b47def5e5 ("MicroSemi Switchtec management interface driver")
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Logan Gunthorpe 
Signed-off-by: Sasha Levin