<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/powerpc/kernel/eeh_driver.c, branch v5.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>powerpc/eeh: Improve recovery of passed-through devices</title>
<updated>2019-02-05T00:55:44+00:00</updated>
<author>
<name>Sam Bobroff</name>
<email>sbobroff@linux.ibm.com</email>
</author>
<published>2018-11-29T03:16:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1ef52073fd25ea97090eaff2c8b528ebf401a12a'/>
<id>1ef52073fd25ea97090eaff2c8b528ebf401a12a</id>
<content type='text'>
Currently, the EEH recovery process considers passed-through devices
as if they were not EEH-aware, which can cause them to be removed as
part of recovery.  Because device removal requires cooperation from
the guest, this may lead to the process stalling or deadlocking.
Also, if devices are removed on the host side, they will be removed
from their IOMMU group, making recovery in the guest impossible.

Therefore, alter the recovery process so that passed-through devices
are not removed but are instead left frozen (and marked isolated)
until the guest performs it's own recovery.  If firmware thaws a
passed-through PE because it's parent PE has been thawed (because it
was not passed through), re-freeze it.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, the EEH recovery process considers passed-through devices
as if they were not EEH-aware, which can cause them to be removed as
part of recovery.  Because device removal requires cooperation from
the guest, this may lead to the process stalling or deadlocking.
Also, if devices are removed on the host side, they will be removed
from their IOMMU group, making recovery in the guest impossible.

Therefore, alter the recovery process so that passed-through devices
are not removed but are instead left frozen (and marked isolated)
until the guest performs it's own recovery.  If firmware thaws a
passed-through PE because it's parent PE has been thawed (because it
was not passed through), re-freeze it.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Add include_passed to eeh_clear_pe_frozen_state()</title>
<updated>2019-02-05T00:55:44+00:00</updated>
<author>
<name>Sam Bobroff</name>
<email>sbobroff@linux.ibm.com</email>
</author>
<published>2018-11-29T03:16:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4d8e325d9df32ef00136d7885f0c65bf124edd22'/>
<id>4d8e325d9df32ef00136d7885f0c65bf124edd22</id>
<content type='text'>
Add a parameter to eeh_clear_pe_frozen_state() that allows
passed-through PEs to be excluded. Update callers to always pass true
so that there is no change in behaviour.

This is to prepare for follow-up work for passed-through devices.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a parameter to eeh_clear_pe_frozen_state() that allows
passed-through PEs to be excluded. Update callers to always pass true
so that there is no change in behaviour.

This is to prepare for follow-up work for passed-through devices.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Add include_passed to eeh_pe_state_clear()</title>
<updated>2019-02-05T00:55:43+00:00</updated>
<author>
<name>Sam Bobroff</name>
<email>sbobroff@linux.ibm.com</email>
</author>
<published>2018-11-29T03:16:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9ed5ca66aa66e5ce2e1d8758250a4d740052c8cd'/>
<id>9ed5ca66aa66e5ce2e1d8758250a4d740052c8cd</id>
<content type='text'>
Add a parameter to eeh_pe_state_clear() that allows passed-through PEs
to be excluded. Update callers to always pass true so that there is no
change in behaviour.

Also refactor to use direct traversal, to allow the removal of some
boilerplate.

This is to prepare for follow-up work for passed-through devices.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a parameter to eeh_pe_state_clear() that allows passed-through PEs
to be excluded. Update callers to always pass true so that there is no
change in behaviour.

Also refactor to use direct traversal, to allow the removal of some
boilerplate.

This is to prepare for follow-up work for passed-through devices.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: remove sw_state from eeh_unfreeze_pe()</title>
<updated>2019-02-05T00:55:42+00:00</updated>
<author>
<name>Sam Bobroff</name>
<email>sbobroff@linux.ibm.com</email>
</author>
<published>2018-11-29T03:16:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=188fdea69fa91dcd674a3d40f060a5891d4bc45a'/>
<id>188fdea69fa91dcd674a3d40f060a5891d4bc45a</id>
<content type='text'>
eeh_unfreeze_pe() performs two operations: unfreezing a PE (which may
cause firmware to unfreeze child PEs as well) and de-isolating the PE
and it's children.

To simplify this and support future work, separate out the
de-isolation and perform it at the call sites (when necessary).

There should be no change in behaviour.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
eeh_unfreeze_pe() performs two operations: unfreezing a PE (which may
cause firmware to unfreeze child PEs as well) and de-isolating the PE
and it's children.

To simplify this and support future work, separate out the
de-isolation and perform it at the call sites (when necessary).

There should be no change in behaviour.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Cleanup eeh_pe_clear_frozen_state()</title>
<updated>2019-02-05T00:55:41+00:00</updated>
<author>
<name>Sam Bobroff</name>
<email>sbobroff@linux.ibm.com</email>
</author>
<published>2018-11-29T03:16:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3376cb91ed908eb0728900894a77d8206574dbcd'/>
<id>3376cb91ed908eb0728900894a77d8206574dbcd</id>
<content type='text'>
The 'clear_sw_state' parameter for eeh_pe_clear_frozen_state() is
redundant because it has no effect (except in the rare case of a
hardware error part way through unfreezing a tree of PEs, where it
would dangerously allow partial de-isolation before returning
failure).

It is passed down to __eeh_pe_clear_frozen_state(), and from there to
eeh_unfreeze_pe(), where it causes EEH_PE_ISOLATED to be removed
from the state of each PE during the traversal.  However, when the
traversal finishes, EEH_PE_ISOLATED is unconditionally removed by a
call to eeh_pe_state_clear() regardless of the parameter's value.

So remove the flag and pass false to eeh_unfreeze_pe() (to avoid the
rare case described above, as it was before the flag was introduced).
Also, perform the recursion directly in the function and eliminate a
bit of boilerplate.

There should be no change in functionality, except as mentioned above.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The 'clear_sw_state' parameter for eeh_pe_clear_frozen_state() is
redundant because it has no effect (except in the rare case of a
hardware error part way through unfreezing a tree of PEs, where it
would dangerously allow partial de-isolation before returning
failure).

It is passed down to __eeh_pe_clear_frozen_state(), and from there to
eeh_unfreeze_pe(), where it causes EEH_PE_ISOLATED to be removed
from the state of each PE during the traversal.  However, when the
traversal finishes, EEH_PE_ISOLATED is unconditionally removed by a
call to eeh_pe_state_clear() regardless of the parameter's value.

So remove the flag and pass false to eeh_unfreeze_pe() (to avoid the
rare case described above, as it was before the flag was introduced).
Also, perform the recursion directly in the function and eliminate a
bit of boilerplate.

There should be no change in functionality, except as mentioned above.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Declare pci_ers_result_name() as static</title>
<updated>2018-11-25T06:11:21+00:00</updated>
<author>
<name>Breno Leitao</name>
<email>leitao@debian.org</email>
</author>
<published>2018-10-22T14:54:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c36c5ffd5173bad377749c136718f3c77d896691'/>
<id>c36c5ffd5173bad377749c136718f3c77d896691</id>
<content type='text'>
Function pci_ers_result_name() is a static function, although not declared
as such. This was detected by sparse in the following warning

	arch/powerpc/kernel/eeh_driver.c:63:12: warning: symbol 'pci_ers_result_name' was not declared. Should it be static?

This patch simply declares the function a static.

Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Function pci_ers_result_name() is a static function, although not declared
as such. This was detected by sparse in the following warning

	arch/powerpc/kernel/eeh_driver.c:63:12: warning: symbol 'pci_ers_result_name' was not declared. Should it be static?

This patch simply declares the function a static.

Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Cleanup control flow in eeh_handle_normal_event()</title>
<updated>2018-10-13T11:21:25+00:00</updated>
<author>
<name>Sam Bobroff</name>
<email>sbobroff@linux.ibm.com</email>
</author>
<published>2018-09-12T01:23:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b90484ec1137424f606832a22f24d6cfc62a1427'/>
<id>b90484ec1137424f606832a22f24d6cfc62a1427</id>
<content type='text'>
Rather than mixing "if (state)" blocks and gotos, convert entirely to
"if (state)" blocks to make the state machine behaviour clearer.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rather than mixing "if (state)" blocks and gotos, convert entirely to
"if (state)" blocks to make the state machine behaviour clearer.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Cleanup eeh_ops.wait_state()</title>
<updated>2018-10-13T11:21:25+00:00</updated>
<author>
<name>Sam Bobroff</name>
<email>sbobroff@linux.ibm.com</email>
</author>
<published>2018-09-12T01:23:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fef7f905523fb96b431e5e73487a689c10c77875'/>
<id>fef7f905523fb96b431e5e73487a689c10c77875</id>
<content type='text'>
The wait_state member of eeh_ops does not need to be platform
dependent; it's just logic around eeh_ops.get_state(). Therefore,
merge the two (slightly different!) platform versions into a new
function, eeh_wait_state() and remove the eeh_ops member.

While doing this, also correct:
* The wait logic, so that it never waits longer than max_wait.
* The wait logic, so that it never waits less than
  EEH_STATE_MIN_WAIT_TIME.
* One call site where the result is treated like a bit field before
  it's checked for negative error values.
* In pseries_eeh_get_state(), rename the "state" parameter to "delay"
  because that's what it is.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The wait_state member of eeh_ops does not need to be platform
dependent; it's just logic around eeh_ops.get_state(). Therefore,
merge the two (slightly different!) platform versions into a new
function, eeh_wait_state() and remove the eeh_ops member.

While doing this, also correct:
* The wait logic, so that it never waits longer than max_wait.
* The wait logic, so that it never waits less than
  EEH_STATE_MIN_WAIT_TIME.
* One call site where the result is treated like a bit field before
  it's checked for negative error values.
* In pseries_eeh_get_state(), rename the "state" parameter to "delay"
  because that's what it is.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Cleanup eeh_pe_state_mark()</title>
<updated>2018-10-13T11:21:25+00:00</updated>
<author>
<name>Sam Bobroff</name>
<email>sbobroff@linux.ibm.com</email>
</author>
<published>2018-09-12T01:23:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e762bb891a294af00b83f54062dae4e24565edf8'/>
<id>e762bb891a294af00b83f54062dae4e24565edf8</id>
<content type='text'>
Currently, eeh_pe_state_mark() marks a PE (and it's children) with a
state and then performs additional processing if that state included
EEH_PE_ISOLATED.

The state parameter is always a constant at the call site, so
rearrange eeh_pe_state_mark() into two functions and just call the
appropriate one at each site.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, eeh_pe_state_mark() marks a PE (and it's children) with a
state and then performs additional processing if that state included
EEH_PE_ISOLATED.

The state parameter is always a constant at the call site, so
rearrange eeh_pe_state_mark() into two functions and just call the
appropriate one at each site.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/eeh: Cleanup field names in eeh_rmv_data</title>
<updated>2018-10-13T11:21:25+00:00</updated>
<author>
<name>Sam Bobroff</name>
<email>sbobroff@linux.ibm.com</email>
</author>
<published>2018-09-12T01:23:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1c5c533b149f02d0ce00fc7ab5810766398acc11'/>
<id>1c5c533b149f02d0ce00fc7ab5810766398acc11</id>
<content type='text'>
Change the name of the fields in eeh_rmv_data to clarify their usage.

Change "edev_list" to "removed_vf_list" because it does not contain
generic edevs, but rather only edevs that contain virtual functions
(which need to be removed during recovery).

Similarly, change "removed" to "removed_dev_count" because it is a
count of any removed devices, not just those in the above list.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change the name of the fields in eeh_rmv_data to clarify their usage.

Change "edev_list" to "removed_vf_list" because it does not contain
generic edevs, but rather only edevs that contain virtual functions
(which need to be removed during recovery).

Similarly, change "removed" to "removed_dev_count" because it is a
count of any removed devices, not just those in the above list.

Signed-off-by: Sam Bobroff &lt;sbobroff@linux.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
</feed>
