linux-stable.git/drivers/pci, branch v4.14.78

PCI: dwc: Fix scheduling while atomic issues

2018-10-20T07:48:51+00:00

[ Upstream commit 9024143e700f89d74b8cdaf316a3499d74fc56fe ]

When programming the inbound/outbound ATUs, we call usleep_range() after
each checking PCIE_ATU_ENABLE bit. Unfortunately, the ATU programming
can be executed in atomic context:

inbound ATU programming could be called through
pci_epc_write_header()
  =>dw_pcie_ep_write_header()
    =>dw_pcie_prog_inbound_atu()

outbound ATU programming could be called through
pci_bus_read_config_dword()
  =>dw_pcie_rd_conf()
    =>dw_pcie_prog_outbound_atu()

Fix this issue by calling mdelay() instead.

Fixes: f8aed6ec624f ("PCI: dwc: designware: Add EP mode support")
Fixes: d8bbeb39fbf3 ("PCI: designware: Wait for iATU enable")
Signed-off-by: Jisheng Zhang 
[lorenzo.pieralisi@arm.com: commit log update]
Signed-off-by: Lorenzo Pieralisi 
Signed-off-by: Bjorn Helgaas 
Acked-by: Gustavo Pimentel 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

PCI: hv: support reporting serial number as slot information

2018-10-18T07:16:22+00:00

[ Upstream commit a15f2c08c70811f120d99288d81f70d7f3d104f1 ]

The Hyper-V host API for PCI provides a unique "serial number" which
can be used as basis for sysfs PCI slot table. This can be useful
for cases where userspace wants to find the PCI device based on
serial number.

When an SR-IOV NIC is added, the host sends an attach message
with serial number. The kernel doesn't use the serial number, but
it is useful when doing the same thing in a userspace driver such
as the DPDK. By having /sys/bus/pci/slots/N it provides a direct
way to find the matching PCI device.

There maybe some cases where serial number is not unique such
as when using GPU's. But the PCI slot infrastructure will handle
that.

This has a side effect which may also be useful. The common udev
network device naming policy uses the slot information (rather
than PCI address).

Signed-off-by: Stephen Hemminger 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

PCI: Reprogram bridge prefetch registers on resume

2018-10-13T07:27:24+00:00

commit 083874549fdfefa629dfa752785e20427dde1511 upstream.

On 38+ Intel-based ASUS products, the NVIDIA GPU becomes unusable after S3
suspend/resume.  The affected products include multiple generations of
NVIDIA GPUs and Intel SoCs.  After resume, nouveau logs many errors such
as:

  fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04
        [HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
  DRM: failed to idle channel 0 [DRM]

Similarly, the NVIDIA proprietary driver also fails after resume (black
screen, 100% CPU usage in Xorg process).  We shipped a sample to NVIDIA for
diagnosis, and their response indicated that it's a problem with the parent
PCI bridge (on the Intel SoC), not the GPU.

Runtime suspend/resume works fine, only S3 suspend is affected.

We found a workaround: on resume, rewrite the Intel PCI bridge
'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32).  In the
cases that I checked, this register has value 0 and we just have to rewrite
that value.

Linux already saves and restores PCI config space during suspend/resume,
but this register was being skipped because upon resume, it already has
value 0 (the correct, pre-suspend value).

Intel appear to have previously acknowledged this behaviour and the
requirement to rewrite this register:
https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23

Based on that, rewrite the prefetch register values even when that appears
unnecessary.

We have confirmed this solution on all the affected models we have in-hands
(X542UQ, UX533FD, X530UN, V272UN).

Additionally, this solves an issue where r8169 MSI-X interrupts were broken
after S3 suspend/resume on ASUS X441UAR.  This issue was recently worked
around in commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e").  It
also fixes the same issue on RTL6186evl/8111evl on an Aimfor-tech laptop
that we had not yet patched.  I suspect it will also fix the issue that was
worked around in commit 7c53a722459c ("r8169: don't use MSI-X on
RTL8168g").

Thomas Martitz reports that this change also solves an issue where the AMD
Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive after S3
suspend/resume.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069
Signed-off-by: Daniel Drake 
Signed-off-by: Bjorn Helgaas 
Reviewed-by: Rafael J. Wysocki 
Reviewed-By: Peter Wu 
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

PCI: aardvark: Size bridges before resources allocation

2018-09-29T10:06:07+00:00

commit 91a2968e245d6ba616db37001fa1a043078b1a65 upstream.

The PCIE I/O and MEM resource allocation mechanism is that root bus
goes through the following steps:

1. Check PCI bridges' range and computes I/O and Mem base/limits.

2. Sort all subordinate devices I/O and MEM resource requirements and
   allocate the resources and writes/updates subordinate devices'
   requirements to PCI bridges I/O and Mem MEM/limits registers.

Currently, PCI Aardvark driver only handles the second step and lacks
the first step, so there is an I/O and MEM resource allocation failure
when using a PCI switch. This commit fixes that by sizing bridges
before doing the resource allocation.

Fixes: 8c39d710363c1 ("PCI: aardvark: Add Aardvark PCI host controller
driver")
Signed-off-by: Zachary Zhang 
[Thomas: edit commit log.]
Signed-off-by: Thomas Petazzoni 
Signed-off-by: Lorenzo Pieralisi 
Cc: 
Signed-off-by: Greg Kroah-Hartman

Revert "PCI: Add ACS quirk for Intel 300 series"

2018-09-29T10:06:04+00:00

commit 50ca031b51106b1b46162d4e9ecccb7edc95682f upstream.

This reverts f154a718e6cc ("PCI: Add ACS quirk for Intel 300 series").

It turns out that erratum "PCH PCIe* Controller Root Port (ACSCTLR) Appear
As Read Only" has been fixed in 300 series chipsets, even though the
datasheet [1] claims otherwise.  To make ACS work properly on 300 series
root ports, revert the faulty commit.

[1] https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/300-series-c240-series-chipset-pch-spec-update.pdf

Fixes: f154a718e6cc ("PCI: Add ACS quirk for Intel 300 series")
Signed-off-by: Mika Westerberg 
Signed-off-by: Bjorn Helgaas 
Cc: stable@vger.kernel.org	# v4.18+
Signed-off-by: Greg Kroah-Hartman

switchtec: Fix Spectre v1 vulnerability

2018-09-19T20:43:37+00:00

commit 46feb6b495f7628a6dbf36c4e6d80faf378372d4 upstream.

p.port can is indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

  drivers/pci/switch/switchtec.c:912 ioctl_port_to_pff() warn: potential spectre issue 'pcfg->dsp_pff_inst_id' [r]

Fix this by sanitizing p.port before using it to index
pcfg->dsp_pff_inst_id

Notice that given that speculation windows are large, the policy is to kill
the speculation on the first load and not worry if it can be completed with
a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Signed-off-by: Gustavo A. R. Silva 
Signed-off-by: Bjorn Helgaas 
Acked-by: Logan Gunthorpe 
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

PCI: mvebu: Fix I/O space end address calculation

2018-09-15T07:45:31+00:00

[ Upstream commit dfd0309fd7b30a5baffaf47b2fccb88b46d64d69 ]

pcie->realio.end should be the address of last byte of the area,
therefore using resource_size() of another resource is not correct, we
must substract 1 to get the address of the last byte.

Fixes: 11be65472a427 ("PCI: mvebu: Adapt to the new device tree layout")
Signed-off-by: Thomas Petazzoni 
Signed-off-by: Lorenzo Pieralisi 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

PCI: pciehp: Fix unprotected list iteration in IRQ handler

2018-08-24T11:09:23+00:00

commit 1204e35bedf4e5015cda559ed8c84789a6dae24e upstream.

Commit b440bde74f04 ("PCI: Add pci_ignore_hotplug() to ignore hotplug
events for a device") iterates over the devices on a hotplug port's
subordinate bus in pciehp's IRQ handler without acquiring pci_bus_sem.
It is thus possible for a user to cause a crash by concurrently
manipulating the device list, e.g. by disabling slot power via sysfs
on a different CPU or by initiating a remove/rescan via sysfs.

This can't be fixed by acquiring pci_bus_sem because it may sleep.
The simplest fix is to avoid the list iteration altogether and just
check the ignore_hotplug flag on the port itself.  This works because
pci_ignore_hotplug() sets the flag both on the device as well as on its
parent bridge.

We do lose the ability to print the name of the device blocking hotplug
in the debug message, but that's probably bearable.

Fixes: b440bde74f04 ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device")
Signed-off-by: Lukas Wunner 
Signed-off-by: Bjorn Helgaas 
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

PCI: pciehp: Fix use-after-free on unplug

2018-08-24T11:09:22+00:00

commit 281e878eab191cce4259abbbf1a0322e3adae02c upstream.

When pciehp is unbound (e.g. on unplug of a Thunderbolt device), the
hotplug_slot struct is deregistered and thus freed before freeing the
IRQ.  The IRQ handler and the work items it schedules print the slot
name referenced from the freed structure in various informational and
debug log messages, each time resulting in a quadruple dereference of
freed pointers (hotplug_slot -> pci_slot -> kobject -> name).

At best the slot name is logged as "(null)", at worst kernel memory is
exposed in logs or the driver crashes:

  pciehp 0000:10:00.0:pcie204: Slot((null)): Card not present

An attacker may provoke the bug by unplugging multiple devices on a
Thunderbolt daisy chain at once.  Unplugging can also be simulated by
powering down slots via sysfs.  The bug is particularly easy to trigger
in poll mode.

It has been present since the driver's introduction in 2004:
https://git.kernel.org/tglx/history/c/c16b4b14d980

Fix by rearranging teardown such that the IRQ is freed first.  Run the
work items queued by the IRQ handler to completion before freeing the
hotplug_slot struct by draining the work queue from the ->release_slot
callback which is invoked by pci_hp_deregister().

Signed-off-by: Lukas Wunner 
Signed-off-by: Bjorn Helgaas 
Cc: stable@vger.kernel.org # v2.6.4
Signed-off-by: Greg Kroah-Hartman

PCI: Skip MPS logic for Virtual Functions (VFs)

2018-08-24T11:09:22+00:00

commit 3dbe97efe8bf450b183d6dee2305cbc032e6b8a4 upstream.

PCIe r4.0, sec 9.3.5.4, "Device Control Register", shows both
Max_Payload_Size (MPS) and Max_Read_request_Size (MRRS) to be 'RsvdP' for
VFs.  Just prior to the table it states:

  "PF and VF functionality is defined in Section 7.5.3.4 except where
   noted in Table 9-16.  For VF fields marked 'RsvdP', the PF setting
   applies to the VF."

All of which implies that with respect to Max_Payload_Size Supported
(MPSS), MPS, and MRRS values, we should not be paying any attention to the
VF's fields, but rather only to the PF's.  Only looking at the PF's fields
also logically makes sense as it's the sole physical interface to the PCIe
bus.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200527
Fixes: 27d868b5e6cf ("PCI: Set MPS to match upstream bridge")
Signed-off-by: Myron Stowe 
Signed-off-by: Bjorn Helgaas 
Cc: stable@vger.kernel.org # 4.3+
Cc: Keith Busch 
Cc: Sinan Kaya 
Cc: Dongdong Liu 
Cc: Jon Mason 
Signed-off-by: Greg Kroah-Hartman