<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/powerpc/kernel/eeh.c, branch linux-3.19.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>powerpc: Drop useless warning in eeh_init()</title>
<updated>2014-12-02T00:03:45+00:00</updated>
<author>
<name>Greg Kurz</name>
<email>gkurz@linux.vnet.ibm.com</email>
</author>
<published>2014-11-25T16:10:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=221195fb80daa1a0c2fd54a023081c416fe93340'/>
<id>221195fb80daa1a0c2fd54a023081c416fe93340</id>
<content type='text'>
This is what we get in dmesg when booting a pseries guest and
the hypervisor doesn't provide EEH support.

[    0.166655] EEH functionality not supported
[    0.166778] eeh_init: Failed to call platform init function (-22)

Since both powernv_eeh_init() and pseries_eeh_init() already complain when
hitting an error, it is not needed to print more (especially such an
uninformative message).

Signed-off-by: Greg Kurz &lt;gkurz@linux.vnet.ibm.com&gt;
Acked-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is what we get in dmesg when booting a pseries guest and
the hypervisor doesn't provide EEH support.

[    0.166655] EEH functionality not supported
[    0.166778] eeh_init: Failed to call platform init function (-22)

Since both powernv_eeh_init() and pseries_eeh_init() already complain when
hitting an error, it is not needed to print more (especially such an
uninformative message).

Signed-off-by: Greg Kurz &lt;gkurz@linux.vnet.ibm.com&gt;
Acked-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Dump PHB diag-data early</title>
<updated>2014-12-02T00:03:26+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>gwshan@linux.vnet.ibm.com</email>
</author>
<published>2014-11-22T10:58:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a450e8f55a57d049ac3afe218f06567e12d6b4f5'/>
<id>a450e8f55a57d049ac3afe218f06567e12d6b4f5</id>
<content type='text'>
On PowerNV platform, PHB diag-data is dumped after stopping device
drivers. In case of recursive EEH errors, the kernel is usually
crashed before dumping PHB diag-data for the second EEH error. It's
hard to locate the root cause of the second EEH error without PHB
diag-data.

The patch adds one more EEH option "eeh=early_log", which helps
dumping PHB diag-data immediately once frozen PE is detected, in
order to get the PHB diag-data for the second EEH error.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On PowerNV platform, PHB diag-data is dumped after stopping device
drivers. In case of recursive EEH errors, the kernel is usually
crashed before dumping PHB diag-data for the second EEH error. It's
hard to locate the root cause of the second EEH error without PHB
diag-data.

The patch adds one more EEH option "eeh=early_log", which helps
dumping PHB diag-data immediately once frozen PE is detected, in
order to get the PHB diag-data for the second EEH error.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Recover EEH error on ownership change for BCM5719</title>
<updated>2014-12-02T00:03:26+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>gwshan@linux.vnet.ibm.com</email>
</author>
<published>2014-11-13T23:47:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b1d76a7d57762332cd8e0c020470d43c5ad3948e'/>
<id>b1d76a7d57762332cd8e0c020470d43c5ad3948e</id>
<content type='text'>
In PCI passthrou scenario, we need simulate EEH recovery for Emulex
adapters when their ownership changes, as we did in commit 5cfb20b96
("powerpc/eeh: Emulate EEH recovery for VFIO devices"). Broadcom
BCM5719 adpaters are facing same problem and needs same cure.

Reported-by: Rajeshkumar Subramanian &lt;rajeshkumars@in.ibm.com&gt;
Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In PCI passthrou scenario, we need simulate EEH recovery for Emulex
adapters when their ownership changes, as we did in commit 5cfb20b96
("powerpc/eeh: Emulate EEH recovery for VFIO devices"). Broadcom
BCM5719 adpaters are facing same problem and needs same cure.

Reported-by: Rajeshkumar Subramanian &lt;rajeshkumars@in.ibm.com&gt;
Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Set EEH_PE_RESET on PE reset</title>
<updated>2014-12-02T00:03:26+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>gwshan@linux.vnet.ibm.com</email>
</author>
<published>2014-11-13T23:47:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=28bf36f92afc6b22ba50ceaf36ba89afa9f5c1e8'/>
<id>28bf36f92afc6b22ba50ceaf36ba89afa9f5c1e8</id>
<content type='text'>
The patch introduces additional flag EEH_PE_RESET to indicate the
corresponding PE is under reset. In turn, the PE retrieval bakcend
on PowerNV platform can return unfrozen state for the EEH core to
moving forward. Flag EEH_PE_CFG_BLOCKED isn't the correct one for
the purpose.

In PCI passthrou case, the problem is more worse: Guest doesn't
recover 6th EEH error. The PE is left in isolated (frozen) and
config blocked state on Broadcom adapters. We can't retrieve the
PE's state correctly any more, even from the host side via sysfs
/sys/bus/pci/devices/xxx/eeh_pe_state.

Reported-by: Rajeshkumar Subramanian &lt;rajeshkumars@in.ibm.com&gt;
Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The patch introduces additional flag EEH_PE_RESET to indicate the
corresponding PE is under reset. In turn, the PE retrieval bakcend
on PowerNV platform can return unfrozen state for the EEH core to
moving forward. Flag EEH_PE_CFG_BLOCKED isn't the correct one for
the purpose.

In PCI passthrou case, the problem is more worse: Guest doesn't
recover 6th EEH error. The PE is left in isolated (frozen) and
config blocked state on Broadcom adapters. We can't retrieve the
PE's state correctly any more, even from the host side via sysfs
/sys/bus/pci/devices/xxx/eeh_pe_state.

Reported-by: Rajeshkumar Subramanian &lt;rajeshkumars@in.ibm.com&gt;
Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Refactor eeh_reset_pe()</title>
<updated>2014-12-02T00:03:26+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>gwshan@linux.vnet.ibm.com</email>
</author>
<published>2014-11-13T23:47:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b85743ee95f3593ded82bca55cd3c38090ad001f'/>
<id>b85743ee95f3593ded82bca55cd3c38090ad001f</id>
<content type='text'>
The patch refactors eeh_reset_pe() in order for:

   * Varied return values for different failure cases.
   * Replace pr_err() with pr_warn() and print function name.
   * Coding style cleanup.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The patch refactors eeh_reset_pe() in order for:

   * Varied return values for different failure cases.
   * Replace pr_err() with pr_warn() and print function name.
   * Coding style cleanup.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Don't collect logs on PE with blocked config space</title>
<updated>2014-10-15T00:27:21+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>gwshan@linux.vnet.ibm.com</email>
</author>
<published>2014-10-01T07:07:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c59004cc83c3f8b182c32ca9d366d222a59ab63f'/>
<id>c59004cc83c3f8b182c32ca9d366d222a59ab63f</id>
<content type='text'>
When the PE's config space is marked as blocked, PCI config read
requests always return 0xFF's. It's pointless to collect logs in
this case.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the PE's config space is marked as blocked, PCI config read
requests always return 0xFF's. It's pointless to collect logs in
this case.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED</title>
<updated>2014-10-15T00:27:18+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>gwshan@linux.vnet.ibm.com</email>
</author>
<published>2014-10-01T07:07:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8a6b3710ccc33da1fd5c85144ad3db01c4457552'/>
<id>8a6b3710ccc33da1fd5c85144ad3db01c4457552</id>
<content type='text'>
The flag EEH_PE_RESET indicates blocking config space of the PE
during reset time. We potentially need block PE's config space
other than reset time. So it's reasonable to replace it with
EEH_PE_CFG_BLOCKED to indicate its usage.

There are no substantial code or logic changes in this patch.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The flag EEH_PE_RESET indicates blocking config space of the PE
during reset time. We potentially need block PE's config space
other than reset time. So it's reasonable to replace it with
EEH_PE_CFG_BLOCKED to indicate its usage.

There are no substantial code or logic changes in this patch.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Dump PCI config space for all child devices</title>
<updated>2014-09-30T07:15:19+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>gwshan@linux.vnet.ibm.com</email>
</author>
<published>2014-09-30T02:39:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f2e0be5e76dd626c70f5aa5c6165b4dfa1d14c64'/>
<id>f2e0be5e76dd626c70f5aa5c6165b4dfa1d14c64</id>
<content type='text'>
The PEs can be organized as nested. Current implementation doesn't
dump PCI config space for subordinate devices of child PEs. However,
the frozen PE could be caused by those subordinate devices of its
child PEs.

The patch dumps PCI config space for all subordinate devices of the
problematic PE.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The PEs can be organized as nested. Current implementation doesn't
dump PCI config space for subordinate devices of child PEs. However,
the frozen PE could be caused by those subordinate devices of its
child PEs.

The patch dumps PCI config space for all subordinate devices of the
problematic PE.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Emulate EEH recovery for VFIO devices</title>
<updated>2014-09-30T07:15:18+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>gwshan@linux.vnet.ibm.com</email>
</author>
<published>2014-09-30T02:39:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5cfb20b96f624e9852c4f3f1c4397e81ca28d5aa'/>
<id>5cfb20b96f624e9852c4f3f1c4397e81ca28d5aa</id>
<content type='text'>
When enabling EEH functionality on passed through devices (PE)
with VFIO, the devices in the PE would be removed permanently
from guest side. In that case, the PE remains frozen state.
When returning PE to host, or restarting the guest again, we
had mechanism unfreezing the PE by clearing PESTA/B frozen
bits. However, that's not enough for some adapters, which are
indicated as following "lspci" shows. Those adapters require
hot reset on the parent bus to bring their firmware back to
workable state. Otherwise, those adaptrs won't be operative
and the host (for returning case) or the guest will fail to
load the drivers for those adapters without exception.

0000:01:00.0 Ethernet controller: Emulex Corporation OneConnect \
             10Gb NIC (be3) (rev 02)
0000:01:00.0 0200: 19a2:0710 (rev 02)
0001:03:00.0 Ethernet controller: Emulex Corporation OneConnect \
             NIC (Lancer) (rev 10)
0001:03:00.0 0200: 10df:e220 (rev 10)

The patch adds mechanism to emulate EEH recovery (for hot reset
on parent PCI bus) on 3 gates to fix the issue: open/release one
adapter of the PE, enable EEH functionality on one adapter of the
PE.

Reported-by:  Murilo Fossa Vicentini &lt;muvic@br.ibm.com&gt;
Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When enabling EEH functionality on passed through devices (PE)
with VFIO, the devices in the PE would be removed permanently
from guest side. In that case, the PE remains frozen state.
When returning PE to host, or restarting the guest again, we
had mechanism unfreezing the PE by clearing PESTA/B frozen
bits. However, that's not enough for some adapters, which are
indicated as following "lspci" shows. Those adapters require
hot reset on the parent bus to bring their firmware back to
workable state. Otherwise, those adaptrs won't be operative
and the host (for returning case) or the guest will fail to
load the drivers for those adapters without exception.

0000:01:00.0 Ethernet controller: Emulex Corporation OneConnect \
             10Gb NIC (be3) (rev 02)
0000:01:00.0 0200: 19a2:0710 (rev 02)
0001:03:00.0 Ethernet controller: Emulex Corporation OneConnect \
             NIC (Lancer) (rev 10)
0001:03:00.0 0200: 10df:e220 (rev 10)

The patch adds mechanism to emulate EEH recovery (for hot reset
on parent PCI bus) on 3 gates to fix the issue: open/release one
adapter of the PE, enable EEH functionality on one adapter of the
PE.

Reported-by:  Murilo Fossa Vicentini &lt;muvic@br.ibm.com&gt;
Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Tag reset state for user owned PE</title>
<updated>2014-09-30T07:15:17+00:00</updated>
<author>
<name>Gavin Shan</name>
<email>gwshan@linux.vnet.ibm.com</email>
</author>
<published>2014-09-30T02:39:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=93e8b36d7bf5c54f1c52d8b78e34f88e52a3dfa2'/>
<id>93e8b36d7bf5c54f1c52d8b78e34f88e52a3dfa2</id>
<content type='text'>
PE would be owned by userland, which probably request PE reset
done in host side. During the reset, we should drop the PCI
config accesses to the PE with help of flag EEH_PE_RESET.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
PE would be owned by userland, which probably request PE reset
done in host side. During the reset, we should drop the PCI
config accesses to the PE with help of flag EEH_PE_RESET.

Signed-off-by: Gavin Shan &lt;gwshan@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
</feed>
