<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/pci, branch linux-4.1.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status()</title>
<updated>2018-05-23T01:36:32+00:00</updated>
<author>
<name>Mika Westerberg</name>
<email>mika.westerberg@linux.intel.com</email>
</author>
<published>2018-02-12T10:55:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a72042114282da974f8e7547a045af421d65bb35'/>
<id>a72042114282da974f8e7547a045af421d65bb35</id>
<content type='text'>
[ Upstream commit 13d3047c81505cc0fb9bdae7810676e70523c8bf ]

Mike Lothian reported that plugging in a USB-C device does not work
properly in his Dell Alienware system.  This system has an Intel Alpine
Ridge Thunderbolt controller providing USB-C functionality.  In these
systems the USB controller (xHCI) is hotplugged whenever a device is
connected to the port using ACPI-based hotplug.

The ACPI description of the root port in question is as follows:

  Device (RP01)
  {
      Name (_ADR, 0x001C0000)

      Device (PXSX)
      {
          Name (_ADR, 0x02)

          Method (_RMV, 0, NotSerialized)
          {
              // ...
          }
      }

Here _ADR 0x02 means device 0, function 2 on the bus under root port (RP01)
but that seems to be incorrect because device 0 is the upstream port of the
Alpine Ridge PCIe switch and it has no functions other than 0 (the bridge
itself).  When we get ACPI Notify() to the root port resulting from
connecting a USB-C device, Linux tries to read PCI_VENDOR_ID from device 0,
function 2 which of course always returns 0xffffffff because there is no
such function and we never find the device.

In Windows this works fine.

Now, since we get ACPI Notify() to the root port and not to the PXSX device
we should actually start our scan from there as well and not from the
non-existent PXSX device.  Fix this by checking presence of the slot itself
(function 0) if we fail to do that otherwise.

While there use pci_bus_read_dev_vendor_id() in get_slot_status(), which is
the recommended way to read Device and Vendor IDs of devices on PCI buses.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=198557
Reported-by: Mike Lothian &lt;mike@fireburn.co.uk&gt;
Signed-off-by: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 13d3047c81505cc0fb9bdae7810676e70523c8bf ]

Mike Lothian reported that plugging in a USB-C device does not work
properly in his Dell Alienware system.  This system has an Intel Alpine
Ridge Thunderbolt controller providing USB-C functionality.  In these
systems the USB controller (xHCI) is hotplugged whenever a device is
connected to the port using ACPI-based hotplug.

The ACPI description of the root port in question is as follows:

  Device (RP01)
  {
      Name (_ADR, 0x001C0000)

      Device (PXSX)
      {
          Name (_ADR, 0x02)

          Method (_RMV, 0, NotSerialized)
          {
              // ...
          }
      }

Here _ADR 0x02 means device 0, function 2 on the bus under root port (RP01)
but that seems to be incorrect because device 0 is the upstream port of the
Alpine Ridge PCIe switch and it has no functions other than 0 (the bridge
itself).  When we get ACPI Notify() to the root port resulting from
connecting a USB-C device, Linux tries to read PCI_VENDOR_ID from device 0,
function 2 which of course always returns 0xffffffff because there is no
such function and we never find the device.

In Windows this works fine.

Now, since we get ACPI Notify() to the root port and not to the PXSX device
we should actually start our scan from there as well and not from the
non-existent PXSX device.  Fix this by checking presence of the slot itself
(function 0) if we fail to do that otherwise.

While there use pci_bus_read_dev_vendor_id() in get_slot_status(), which is
the recommended way to read Device and Vendor IDs of devices on PCI buses.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=198557
Reported-by: Mike Lothian &lt;mike@fireburn.co.uk&gt;
Signed-off-by: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI: Make PCI_ROM_ADDRESS_MASK a 32-bit constant</title>
<updated>2018-05-23T01:33:55+00:00</updated>
<author>
<name>Matthias Kaehlcke</name>
<email>mka@chromium.org</email>
</author>
<published>2017-04-14T20:38:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f052f848d7805ba6219c98cad6a7feec655c8953'/>
<id>f052f848d7805ba6219c98cad6a7feec655c8953</id>
<content type='text'>
[ Upstream commit 76dc52684d0f72971d9f6cc7d5ae198061b715bd ]

A 64-bit value is not needed since a PCI ROM address consists in 32 bits.
This fixes a clang warning about "implicit conversion from 'unsigned long'
to 'u32'".

Also remove now unnecessary casts to u32 from __pci_read_base() and
pci_std_update_resource().

Signed-off-by: Matthias Kaehlcke &lt;mka@chromium.org&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 76dc52684d0f72971d9f6cc7d5ae198061b715bd ]

A 64-bit value is not needed since a PCI ROM address consists in 32 bits.
This fixes a clang warning about "implicit conversion from 'unsigned long'
to 'u32'".

Also remove now unnecessary casts to u32 from __pci_read_base() and
pci_std_update_resource().

Signed-off-by: Matthias Kaehlcke &lt;mka@chromium.org&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI: Add function 1 DMA alias quirk for Highpoint RocketRAID 644L</title>
<updated>2018-05-23T01:33:52+00:00</updated>
<author>
<name>Hans de Goede</name>
<email>hdegoede@redhat.com</email>
</author>
<published>2018-03-02T10:36:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d98f7df054bf0b11b68c4371093b3d27754a3ad2'/>
<id>d98f7df054bf0b11b68c4371093b3d27754a3ad2</id>
<content type='text'>
[ Upstream commit 1903be8222b7c278ca897c129ce477c1dd6403a8 ]

The Highpoint RocketRAID 644L uses a Marvel 88SE9235 controller, as with
other Marvel controllers this needs a function 1 DMA alias quirk.

Note the RocketRAID 642L uses the same Marvel 88SE9235 controller and
already is listed with a function 1 DMA alias quirk.

Cc: stable@vger.kernel.org
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1534106
Signed-off-by: Hans de Goede &lt;hdegoede@redhat.com&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 1903be8222b7c278ca897c129ce477c1dd6403a8 ]

The Highpoint RocketRAID 644L uses a Marvel 88SE9235 controller, as with
other Marvel controllers this needs a function 1 DMA alias quirk.

Note the RocketRAID 642L uses the same Marvel 88SE9235 controller and
already is listed with a function 1 DMA alias quirk.

Cc: stable@vger.kernel.org
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1534106
Signed-off-by: Hans de Goede &lt;hdegoede@redhat.com&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI/MSI: Stop disabling MSI/MSI-X in pci_device_shutdown()</title>
<updated>2018-05-23T01:33:43+00:00</updated>
<author>
<name>Prarit Bhargava</name>
<email>prarit@redhat.com</email>
</author>
<published>2017-01-26T19:07:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c51ddd5e3fad95587cc7ed6c096071ebb2fe79f8'/>
<id>c51ddd5e3fad95587cc7ed6c096071ebb2fe79f8</id>
<content type='text'>
[ Upstream commit fda78d7a0ead144f4b2cdb582dcba47911f4952c ]

The pci_bus_type .shutdown method, pci_device_shutdown(), is called from
device_shutdown() in the kernel restart and shutdown paths.

Previously, pci_device_shutdown() called pci_msi_shutdown() and
pci_msix_shutdown().  This disables MSI and MSI-X, which causes the device
to fall back to raising interrupts via INTx.  But the driver is still bound
to the device, it doesn't know about this change, and it likely doesn't
have an INTx handler, so these INTx interrupts cause "nobody cared"
warnings like this:

  irq 16: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.2-1.el7_UNSUPPORTED.x86_64 #1
  Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/
  ...

The MSI disabling code was added by d52877c7b1af ("pci/irq: let
pci_device_shutdown to call pci_msi_shutdown v2") because a driver left MSI
enabled and kdump failed because the kexeced kernel wasn't prepared to
receive the MSI interrupts.

Subsequent commits 1851617cd2da ("PCI/MSI: Disable MSI at enumeration even
if kernel doesn't support MSI") and  e80e7edc55ba ("PCI/MSI: Initialize MSI
capability for all architectures") changed the kexeced kernel to disable
all MSIs itself so it no longer depends on the crashed kernel to clean up
after itself.

Stop disabling MSI/MSI-X in pci_device_shutdown().  This resolves the
"nobody cared" unhandled IRQ issue above.  It also allows PCI serial
devices, which may rely on the MSI interrupts, to continue outputting
messages during reboot/shutdown.

[bhelgaas: changelog, drop pci_msi_shutdown() and pci_msix_shutdown() calls
altogether]
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=187351
Signed-off-by: Prarit Bhargava &lt;prarit@redhat.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
CC: Alex Williamson &lt;alex.williamson@redhat.com&gt;
CC: David Arcari &lt;darcari@redhat.com&gt;
CC: Myron Stowe &lt;mstowe@redhat.com&gt;
CC: Lukas Wunner &lt;lukas@wunner.de&gt;
CC: Keith Busch &lt;keith.busch@intel.com&gt;
CC: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit fda78d7a0ead144f4b2cdb582dcba47911f4952c ]

The pci_bus_type .shutdown method, pci_device_shutdown(), is called from
device_shutdown() in the kernel restart and shutdown paths.

Previously, pci_device_shutdown() called pci_msi_shutdown() and
pci_msix_shutdown().  This disables MSI and MSI-X, which causes the device
to fall back to raising interrupts via INTx.  But the driver is still bound
to the device, it doesn't know about this change, and it likely doesn't
have an INTx handler, so these INTx interrupts cause "nobody cared"
warnings like this:

  irq 16: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.2-1.el7_UNSUPPORTED.x86_64 #1
  Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.90 06/
  ...

The MSI disabling code was added by d52877c7b1af ("pci/irq: let
pci_device_shutdown to call pci_msi_shutdown v2") because a driver left MSI
enabled and kdump failed because the kexeced kernel wasn't prepared to
receive the MSI interrupts.

Subsequent commits 1851617cd2da ("PCI/MSI: Disable MSI at enumeration even
if kernel doesn't support MSI") and  e80e7edc55ba ("PCI/MSI: Initialize MSI
capability for all architectures") changed the kexeced kernel to disable
all MSIs itself so it no longer depends on the crashed kernel to clean up
after itself.

Stop disabling MSI/MSI-X in pci_device_shutdown().  This resolves the
"nobody cared" unhandled IRQ issue above.  It also allows PCI serial
devices, which may rely on the MSI interrupts, to continue outputting
messages during reboot/shutdown.

[bhelgaas: changelog, drop pci_msi_shutdown() and pci_msix_shutdown() calls
altogether]
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=187351
Signed-off-by: Prarit Bhargava &lt;prarit@redhat.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
CC: Alex Williamson &lt;alex.williamson@redhat.com&gt;
CC: David Arcari &lt;darcari@redhat.com&gt;
CC: Myron Stowe &lt;mstowe@redhat.com&gt;
CC: Lukas Wunner &lt;lukas@wunner.de&gt;
CC: Keith Busch &lt;keith.busch@intel.com&gt;
CC: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI: keystone: Fix interrupt-controller-node lookup</title>
<updated>2018-03-04T15:28:34+00:00</updated>
<author>
<name>Johan Hovold</name>
<email>johan@kernel.org</email>
</author>
<published>2017-11-17T13:38:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=209a86d22f6ad051d0df9e0ed78a90e9b3248f31'/>
<id>209a86d22f6ad051d0df9e0ed78a90e9b3248f31</id>
<content type='text'>
[ Upstream commit eac56aa3bc8af3d9b9850345d0f2da9d83529134 ]

Fix child-node lookup during initialisation which was using the wrong
OF-helper and ended up searching the whole device tree depth-first
starting at the parent rather than just matching on its children.

To make things worse, the parent pci node could end up being prematurely
freed as of_find_node_by_name() drops a reference to its first argument.
Any matching child interrupt-controller node was also leaked.

Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver")
Cc: stable &lt;stable@vger.kernel.org&gt;     # 3.18
Acked-by: Murali Karicheri &lt;m-karicheri2@ti.com&gt;
Signed-off-by: Johan Hovold &lt;johan@kernel.org&gt;
[lorenzo.pieralisi@arm.com: updated commit subject]
Signed-off-by: Lorenzo Pieralisi &lt;lorenzo.pieralisi@arm.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit eac56aa3bc8af3d9b9850345d0f2da9d83529134 ]

Fix child-node lookup during initialisation which was using the wrong
OF-helper and ended up searching the whole device tree depth-first
starting at the parent rather than just matching on its children.

To make things worse, the parent pci node could end up being prematurely
freed as of_find_node_by_name() drops a reference to its first argument.
Any matching child interrupt-controller node was also leaked.

Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver")
Cc: stable &lt;stable@vger.kernel.org&gt;     # 3.18
Acked-by: Murali Karicheri &lt;m-karicheri2@ti.com&gt;
Signed-off-by: Johan Hovold &lt;johan@kernel.org&gt;
[lorenzo.pieralisi@arm.com: updated commit subject]
Signed-off-by: Lorenzo Pieralisi &lt;lorenzo.pieralisi@arm.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI / PM: Force devices to D0 in pci_pm_thaw_noirq()</title>
<updated>2018-03-01T00:32:17+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2017-12-15T02:07:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=77ebefd06490ae38a300d1e4082ffee01baddac0'/>
<id>77ebefd06490ae38a300d1e4082ffee01baddac0</id>
<content type='text'>
[ Upstream commit 5839ee7389e893a31e4e3c9cf17b50d14103c902 ]

It is incorrect to call pci_restore_state() for devices in low-power
states (D1-D3), as that involves the restoration of MSI setup which
requires MMIO to be operational and that is only the case in D0.

However, pci_pm_thaw_noirq() may do that if the driver's "freeze"
callbacks put the device into a low-power state, so fix it by making
it force devices into D0 via pci_set_power_state() instead of trying
to "update" their power state which is pointless.

Fixes: e60514bd4485 (PCI/PM: Restore the status of PCI devices across hibernation)
Cc: 4.13+ &lt;stable@vger.kernel.org&gt; # 4.13+
Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reported-by: Maarten Lankhorst &lt;dev@mblankhorst.nl&gt;
Tested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Maarten Lankhorst &lt;dev@mblankhorst.nl&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5839ee7389e893a31e4e3c9cf17b50d14103c902 ]

It is incorrect to call pci_restore_state() for devices in low-power
states (D1-D3), as that involves the restoration of MSI setup which
requires MMIO to be operational and that is only the case in D0.

However, pci_pm_thaw_noirq() may do that if the driver's "freeze"
callbacks put the device into a low-power state, so fix it by making
it force devices into D0 via pci_set_power_state() instead of trying
to "update" their power state which is pointless.

Fixes: e60514bd4485 (PCI/PM: Restore the status of PCI devices across hibernation)
Cc: 4.13+ &lt;stable@vger.kernel.org&gt; # 4.13+
Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reported-by: Maarten Lankhorst &lt;dev@mblankhorst.nl&gt;
Tested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Maarten Lankhorst &lt;dev@mblankhorst.nl&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI/AER: Report non-fatal errors only to the affected endpoint</title>
<updated>2018-03-01T00:32:15+00:00</updated>
<author>
<name>Gabriele Paoloni</name>
<email>gabriele.paoloni@huawei.com</email>
</author>
<published>2017-09-28T14:33:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=38a7d46c65efd240a87c34e414ab6f53f7a62e00'/>
<id>38a7d46c65efd240a87c34e414ab6f53f7a62e00</id>
<content type='text'>
[ Upstream commit 86acc790717fb60fb51ea3095084e331d8711c74 ]

Previously, if an non-fatal error was reported by an endpoint, we
called report_error_detected() for the endpoint, every sibling on the
bus, and their descendents.  If any of them did not implement the
.error_detected() method, do_recovery() failed, leaving all these
devices unrecovered.

For example, the system described in the bugzilla below has two devices:

  0000:74:02.0 [19e5:a230] SAS controller, driver has .error_detected()
  0000:74:03.0 [19e5:a235] SATA controller, driver lacks .error_detected()

When a device such as 74:02.0 reported a non-fatal error, do_recovery()
failed because 74:03.0 lacked an .error_detected() method.  But per PCIe
r3.1, sec 6.2.2.2.2, such an error does not compromise the Link and
does not affect 74:03.0:

  Non-fatal errors are uncorrectable errors which cause a particular
  transaction to be unreliable but the Link is otherwise fully functional.
  Isolating Non-fatal from Fatal errors provides Requester/Receiver logic
  in a device or system management software the opportunity to recover from
  the error without resetting the components on the Link and disturbing
  other transactions in progress.  Devices not associated with the
  transaction in error are not impacted by the error.

Report non-fatal errors only to the endpoint that reported them.  We really
want to check for AER_NONFATAL here, but the current code structure doesn't
allow that.  Looking for pci_channel_io_normal is the best we can do now.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=197055
Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver")
Signed-off-by: Gabriele Paoloni &lt;gabriele.paoloni@huawei.com&gt;
Signed-off-by: Dongdong Liu &lt;liudongdong3@huawei.com&gt;
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 86acc790717fb60fb51ea3095084e331d8711c74 ]

Previously, if an non-fatal error was reported by an endpoint, we
called report_error_detected() for the endpoint, every sibling on the
bus, and their descendents.  If any of them did not implement the
.error_detected() method, do_recovery() failed, leaving all these
devices unrecovered.

For example, the system described in the bugzilla below has two devices:

  0000:74:02.0 [19e5:a230] SAS controller, driver has .error_detected()
  0000:74:03.0 [19e5:a235] SATA controller, driver lacks .error_detected()

When a device such as 74:02.0 reported a non-fatal error, do_recovery()
failed because 74:03.0 lacked an .error_detected() method.  But per PCIe
r3.1, sec 6.2.2.2.2, such an error does not compromise the Link and
does not affect 74:03.0:

  Non-fatal errors are uncorrectable errors which cause a particular
  transaction to be unreliable but the Link is otherwise fully functional.
  Isolating Non-fatal from Fatal errors provides Requester/Receiver logic
  in a device or system management software the opportunity to recover from
  the error without resetting the components on the Link and disturbing
  other transactions in progress.  Devices not associated with the
  transaction in error are not impacted by the error.

Report non-fatal errors only to the endpoint that reported them.  We really
want to check for AER_NONFATAL here, but the current code structure doesn't
allow that.  Looking for pci_channel_io_normal is the best we can do now.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=197055
Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver")
Signed-off-by: Gabriele Paoloni &lt;gabriele.paoloni@huawei.com&gt;
Signed-off-by: Dongdong Liu &lt;liudongdong3@huawei.com&gt;
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI: Create SR-IOV virtfn/physfn links before attaching driver</title>
<updated>2018-03-01T00:32:15+00:00</updated>
<author>
<name>Stuart Hayes</name>
<email>stuart.w.hayes@gmail.com</email>
</author>
<published>2017-10-04T15:57:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6afe6627430ac3c1cf24f82d2a8f6f158242fd70'/>
<id>6afe6627430ac3c1cf24f82d2a8f6f158242fd70</id>
<content type='text'>
[ Upstream commit 27d6162944b9b34c32cd5841acd21786637ee743 ]

When creating virtual functions, create the "virtfn%u" and "physfn" links
in sysfs *before* attaching the driver instead of after.  When we attach
the driver to the new virtual network interface first, there is a race when
the driver attaches to the new sends out an "add" udev event, and the
network interface naming software (biosdevname or systemd, for example)
tries to look at these links.

Signed-off-by: Stuart Hayes &lt;stuart.w.hayes@gmail.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 27d6162944b9b34c32cd5841acd21786637ee743 ]

When creating virtual functions, create the "virtfn%u" and "physfn" links
in sysfs *before* attaching the driver instead of after.  When we attach
the driver to the new virtual network interface first, there is a race when
the driver attaches to the new sends out an "add" udev event, and the
network interface naming software (biosdevname or systemd, for example)
tries to look at these links.

Signed-off-by: Stuart Hayes &lt;stuart.w.hayes@gmail.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI: Avoid bus reset if bridge itself is broken</title>
<updated>2018-03-01T00:32:15+00:00</updated>
<author>
<name>David Daney</name>
<email>david.daney@cavium.com</email>
</author>
<published>2017-09-08T08:10:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=67d39ee351dd8a4ac6c8e5b913ef36d68a7f82c4'/>
<id>67d39ee351dd8a4ac6c8e5b913ef36d68a7f82c4</id>
<content type='text'>
[ Upstream commit 357027786f3523d26f42391aa4c075b8495e5d28 ]

When checking to see if a PCI bus can safely be reset, we previously
checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET
flag set.  Children marked with that flag are known not to behave well
after a bus reset.

Some PCIe root port bridges also do not behave well after a bus reset,
sometimes causing the devices behind the bridge to become unusable.

Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device
to allow these bridges to be flagged, and prevent their secondary buses
from being reset.

Signed-off-by: David Daney &lt;david.daney@cavium.com&gt;
[jglauber@cavium.com: fixed typo]
Signed-off-by: Jan Glauber &lt;jglauber@cavium.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 357027786f3523d26f42391aa4c075b8495e5d28 ]

When checking to see if a PCI bus can safely be reset, we previously
checked to see if any of the children had their PCI_DEV_FLAGS_NO_BUS_RESET
flag set.  Children marked with that flag are known not to behave well
after a bus reset.

Some PCIe root port bridges also do not behave well after a bus reset,
sometimes causing the devices behind the bridge to become unusable.

Add a check for PCI_DEV_FLAGS_NO_BUS_RESET being set in the bridge device
to allow these bridges to be flagged, and prevent their secondary buses
from being reset.

Signed-off-by: David Daney &lt;david.daney@cavium.com&gt;
[jglauber@cavium.com: fixed typo]
Signed-off-by: Jan Glauber &lt;jglauber@cavium.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PCI: Detach driver before procfs &amp; sysfs teardown on device remove</title>
<updated>2018-01-17T17:55:39+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2017-10-11T21:35:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e9d85866c84c8ea47b04185a8343823962b1fc6b'/>
<id>e9d85866c84c8ea47b04185a8343823962b1fc6b</id>
<content type='text'>
[ Upstream commit 16b6c8bb687cc3bec914de09061fcb8411951fda ]

When removing a device, for example a VF being removed due to SR-IOV
teardown, a "soft" hot-unplug via 'echo 1 &gt; remove' in sysfs, or an actual
hot-unplug, we first remove the procfs and sysfs attributes for the device
before attempting to release the device from any driver bound to it.
Unbinding the driver from the device can take time.  The device might need
to write out data or it might be actively in use.  If it's in use by
userspace through a vfio driver, the unbind might block until the user
releases the device.  This leads to a potentially non-trivial amount of
time where the device exists, but we've torn down the interfaces that
userspace uses to examine devices, for instance lspci might generate this
sort of error:

  pcilib: Cannot open /sys/bus/pci/devices/0000:01:0a.3/config
  lspci: Unable to read the standard configuration space header of device 0000:01:0a.3

We don't seem to have any dependence on this teardown ordering in the
kernel, so let's unbind the driver first, which is also more symmetric with
the instantiation of the device in pci_bus_add_device().

Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 16b6c8bb687cc3bec914de09061fcb8411951fda ]

When removing a device, for example a VF being removed due to SR-IOV
teardown, a "soft" hot-unplug via 'echo 1 &gt; remove' in sysfs, or an actual
hot-unplug, we first remove the procfs and sysfs attributes for the device
before attempting to release the device from any driver bound to it.
Unbinding the driver from the device can take time.  The device might need
to write out data or it might be actively in use.  If it's in use by
userspace through a vfio driver, the unbind might block until the user
releases the device.  This leads to a potentially non-trivial amount of
time where the device exists, but we've torn down the interfaces that
userspace uses to examine devices, for instance lspci might generate this
sort of error:

  pcilib: Cannot open /sys/bus/pci/devices/0000:01:0a.3/config
  lspci: Unable to read the standard configuration space header of device 0000:01:0a.3

We don't seem to have any dependence on this teardown ordering in the
kernel, so let's unbind the driver first, which is also more symmetric with
the instantiation of the device in pci_bus_add_device().

Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
