<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/s390, branch v5.4.232</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>s390/lcs: Fix return type of lcs_start_xmit()</title>
<updated>2023-01-18T10:41:36+00:00</updated>
<author>
<name>Nathan Chancellor</name>
<email>nathan@kernel.org</email>
</author>
<published>2022-11-03T17:01:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=20022d551f2064a194d8e0acb6cd7a85094a17b2'/>
<id>20022d551f2064a194d8e0acb6cd7a85094a17b2</id>
<content type='text'>
[ Upstream commit bb16db8393658e0978c3f0d30ae069e878264fa3 ]

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/s390/net/lcs.c:2090:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = lcs_start_xmit,
                                    ^~~~~~~~~~~~~~
  drivers/s390/net/lcs.c:2097:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = lcs_start_xmit,
                                    ^~~~~~~~~~~~~~

-&gt;ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of lcs_start_xmit() to
match the prototype's to resolve the warning and potential CFI failure,
should s390 select ARCH_SUPPORTS_CFI_CLANG in the future.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Alexandra Winter &lt;wintera@linux.ibm.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit bb16db8393658e0978c3f0d30ae069e878264fa3 ]

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/s390/net/lcs.c:2090:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = lcs_start_xmit,
                                    ^~~~~~~~~~~~~~
  drivers/s390/net/lcs.c:2097:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = lcs_start_xmit,
                                    ^~~~~~~~~~~~~~

-&gt;ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of lcs_start_xmit() to
match the prototype's to resolve the warning and potential CFI failure,
should s390 select ARCH_SUPPORTS_CFI_CLANG in the future.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Alexandra Winter &lt;wintera@linux.ibm.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/netiucv: Fix return type of netiucv_tx()</title>
<updated>2023-01-18T10:41:36+00:00</updated>
<author>
<name>Nathan Chancellor</name>
<email>nathan@kernel.org</email>
</author>
<published>2022-11-03T17:01:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4bee3c75d5bf7c2b5dc0b520410eb40449e5da31'/>
<id>4bee3c75d5bf7c2b5dc0b520410eb40449e5da31</id>
<content type='text'>
[ Upstream commit 88d86d18d7cf7e9137c95f9d212bb9fff8a1b4be ]

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/s390/net/netiucv.c:1854:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = netiucv_tx,
                                    ^~~~~~~~~~

-&gt;ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of netiucv_tx() to
match the prototype's to resolve the warning and potential CFI failure,
should s390 select ARCH_SUPPORTS_CFI_CLANG in the future.

Additionally, while in the area, remove a comment block that is no
longer relevant.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Alexandra Winter &lt;wintera@linux.ibm.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 88d86d18d7cf7e9137c95f9d212bb9fff8a1b4be ]

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/s390/net/netiucv.c:1854:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = netiucv_tx,
                                    ^~~~~~~~~~

-&gt;ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of netiucv_tx() to
match the prototype's to resolve the warning and potential CFI failure,
should s390 select ARCH_SUPPORTS_CFI_CLANG in the future.

Additionally, while in the area, remove a comment block that is no
longer relevant.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Alexandra Winter &lt;wintera@linux.ibm.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/ctcm: Fix return type of ctc{mp,}m_tx()</title>
<updated>2023-01-18T10:41:36+00:00</updated>
<author>
<name>Nathan Chancellor</name>
<email>nathan@kernel.org</email>
</author>
<published>2022-11-03T17:01:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e859e02fbfa78db5f5a1501bab493c20305cbddc'/>
<id>e859e02fbfa78db5f5a1501bab493c20305cbddc</id>
<content type='text'>
[ Upstream commit aa5bf80c3c067b82b4362cd6e8e2194623bcaca6 ]

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/s390/net/ctcm_main.c:1064:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = ctcm_tx,
                                    ^~~~~~~
  drivers/s390/net/ctcm_main.c:1072:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = ctcmpc_tx,
                                    ^~~~~~~~~

-&gt;ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of ctc{mp,}m_tx() to
match the prototype's to resolve the warning and potential CFI failure,
should s390 select ARCH_SUPPORTS_CFI_CLANG in the future.

Additionally, while in the area, remove a comment block that is no
longer relevant.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Alexandra Winter &lt;wintera@linux.ibm.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit aa5bf80c3c067b82b4362cd6e8e2194623bcaca6 ]

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/s390/net/ctcm_main.c:1064:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = ctcm_tx,
                                    ^~~~~~~
  drivers/s390/net/ctcm_main.c:1072:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .ndo_start_xmit         = ctcmpc_tx,
                                    ^~~~~~~~~

-&gt;ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int'. Adjust the return type of ctc{mp,}m_tx() to
match the prototype's to resolve the warning and potential CFI failure,
should s390 select ARCH_SUPPORTS_CFI_CLANG in the future.

Additionally, while in the area, remove a comment block that is no
longer relevant.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reviewed-by: Alexandra Winter &lt;wintera@linux.ibm.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/dasd: fix no record found for raw_track_access</title>
<updated>2022-12-08T10:22:59+00:00</updated>
<author>
<name>Stefan Haberland</name>
<email>sth@linux.ibm.com</email>
</author>
<published>2022-11-23T16:07:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=420b21235d63e87bad315191028a752ddb47e3ae'/>
<id>420b21235d63e87bad315191028a752ddb47e3ae</id>
<content type='text'>
[ Upstream commit 590ce6d96d6a224b470a3862c33a483d5022bfdb ]

For DASD devices in raw_track_access mode only full track images are
read and written.
For this purpose it is not necessary to do search operation in the
locate record extended function. The documentation even states that
this might fail if the searched record is not found on a track.

Currently the driver sets a value of 1 in the search field for the first
record after record zero. This is the default for disks not in
raw_track_access mode but record 1 might be missing on a completely
empty track.

There has not been any problem with this on IBM storage servers but it
might lead to errors with DASD devices on other vendors storage servers.

Fix this by setting the search field to 0. Record zero is always available
even on a completely empty track.

Fixes: e4dbb0f2b5dd ("[S390] dasd: Add support for raw ECKD access.")
Signed-off-by: Stefan Haberland &lt;sth@linux.ibm.com&gt;
Reviewed-by: Jan Hoeppner &lt;hoeppner@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20221123160719.3002694-4-sth@linux.ibm.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 590ce6d96d6a224b470a3862c33a483d5022bfdb ]

For DASD devices in raw_track_access mode only full track images are
read and written.
For this purpose it is not necessary to do search operation in the
locate record extended function. The documentation even states that
this might fail if the searched record is not found on a track.

Currently the driver sets a value of 1 in the search field for the first
record after record zero. This is the default for disks not in
raw_track_access mode but record 1 might be missing on a completely
empty track.

There has not been any problem with this on IBM storage servers but it
might lead to errors with DASD devices on other vendors storage servers.

Fix this by setting the search field to 0. Record zero is always available
even on a completely empty track.

Fixes: e4dbb0f2b5dd ("[S390] dasd: Add support for raw ECKD access.")
Signed-off-by: Stefan Haberland &lt;sth@linux.ibm.com&gt;
Reviewed-by: Jan Hoeppner &lt;hoeppner@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20221123160719.3002694-4-sth@linux.ibm.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>scsi: zfcp: Fix double free of FSF request when qdio send fails</title>
<updated>2022-11-25T16:42:19+00:00</updated>
<author>
<name>Benjamin Block</name>
<email>bblock@linux.ibm.com</email>
</author>
<published>2022-11-16T10:50:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1bf8ed585501bb2dd0b5f67c824eab45adfbdccd'/>
<id>1bf8ed585501bb2dd0b5f67c824eab45adfbdccd</id>
<content type='text'>
commit 0954256e970ecf371b03a6c9af2cf91b9c4085ff upstream.

We used to use the wrong type of integer in 'zfcp_fsf_req_send()' to cache
the FSF request ID when sending a new FSF request. This is used in case the
sending fails and we need to remove the request from our internal hash
table again (so we don't keep an invalid reference and use it when we free
the request again).

In 'zfcp_fsf_req_send()' we used to cache the ID as 'int' (signed and 32
bit wide), but the rest of the zfcp code (and the firmware specification)
handles the ID as 'unsigned long'/'u64' (unsigned and 64 bit wide [s390x
ELF ABI]).  For one this has the obvious problem that when the ID grows
past 32 bit (this can happen reasonably fast) it is truncated to 32 bit
when storing it in the cache variable and so doesn't match the original ID
anymore.  The second less obvious problem is that even when the original ID
has not yet grown past 32 bit, as soon as the 32nd bit is set in the
original ID (0x80000000 = 2'147'483'648) we will have a mismatch when we
cast it back to 'unsigned long'. As the cached variable is of a signed
type, the compiler will choose a sign-extending instruction to load the 32
bit variable into a 64 bit register (e.g.: 'lgf %r11,188(%r15)'). So once
we pass the cached variable into 'zfcp_reqlist_find_rm()' to remove the
request again all the leading zeros will be flipped to ones to extend the
sign and won't match the original ID anymore (this has been observed in
practice).

If we can't successfully remove the request from the hash table again after
'zfcp_qdio_send()' fails (this happens regularly when zfcp cannot notify
the adapter about new work because the adapter is already gone during
e.g. a ChpID toggle) we will end up with a double free.  We unconditionally
free the request in the calling function when 'zfcp_fsf_req_send()' fails,
but because the request is still in the hash table we end up with a stale
memory reference, and once the zfcp adapter is either reset during recovery
or shutdown we end up freeing the same memory twice.

The resulting stack traces vary depending on the kernel and have no direct
correlation to the place where the bug occurs. Here are three examples that
have been seen in practice:

  list_del corruption. next-&gt;prev should be 00000001b9d13800, but was 00000000dead4ead. (next=00000001bd131a00)
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:62!
  monitor event: 0040 ilc:2 [#1] PREEMPT SMP
  Modules linked in: ...
  CPU: 9 PID: 1617 Comm: zfcperp0.0.1740 Kdump: loaded
  Hardware name: ...
  Krnl PSW : 0704d00180000000 00000003cbeea1f8 (__list_del_entry_valid+0x98/0x140)
             R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
  Krnl GPRS: 00000000916d12f1 0000000080000000 000000000000006d 00000003cb665cd6
             0000000000000001 0000000000000000 0000000000000000 00000000d28d21e8
             00000000d3844000 00000380099efd28 00000001bd131a00 00000001b9d13800
             00000000d3290100 0000000000000000 00000003cbeea1f4 00000380099efc70
  Krnl Code: 00000003cbeea1e8: c020004f68a7        larl    %r2,00000003cc8d7336
             00000003cbeea1ee: c0e50027fd65        brasl   %r14,00000003cc3e9cb8
            #00000003cbeea1f4: af000000            mc      0,0
            &gt;00000003cbeea1f8: c02000920440        larl    %r2,00000003cd12aa78
             00000003cbeea1fe: c0e500289c25        brasl   %r14,00000003cc3fda48
             00000003cbeea204: b9040043            lgr     %r4,%r3
             00000003cbeea208: b9040051            lgr     %r5,%r1
             00000003cbeea20c: b9040032            lgr     %r3,%r2
  Call Trace:
   [&lt;00000003cbeea1f8&gt;] __list_del_entry_valid+0x98/0x140
  ([&lt;00000003cbeea1f4&gt;] __list_del_entry_valid+0x94/0x140)
   [&lt;000003ff7ff502fe&gt;] zfcp_fsf_req_dismiss_all+0xde/0x150 [zfcp]
   [&lt;000003ff7ff49cd0&gt;] zfcp_erp_strategy_do_action+0x160/0x280 [zfcp]
   [&lt;000003ff7ff4a22e&gt;] zfcp_erp_strategy+0x21e/0xca0 [zfcp]
   [&lt;000003ff7ff4ad34&gt;] zfcp_erp_thread+0x84/0x1a0 [zfcp]
   [&lt;00000003cb5eece8&gt;] kthread+0x138/0x150
   [&lt;00000003cb557f3c&gt;] __ret_from_fork+0x3c/0x60
   [&lt;00000003cc4172ea&gt;] ret_from_fork+0xa/0x40
  INFO: lockdep is turned off.
  Last Breaking-Event-Address:
   [&lt;00000003cc3e9d04&gt;] _printk+0x4c/0x58
  Kernel panic - not syncing: Fatal exception: panic_on_oops

or:

  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803
  Fault in home space mode while using kernel ASCE.
  AS:0000000063b10007 R3:0000000000000024
  Oops: 0038 ilc:3 [#1] SMP
  Modules linked in: ...
  CPU: 10 PID: 0 Comm: swapper/10 Kdump: loaded
  Hardware name: ...
  Krnl PSW : 0404d00180000000 000003ff7febaf8e (zfcp_fsf_reqid_check+0x86/0x158 [zfcp])
             R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
  Krnl GPRS: 5a6f1cfa89c49ac3 00000000aff2c4c8 6b6b6b6b6b6b6b6b 00000000000002a8
             0000000000000000 0000000000000055 0000000000000000 00000000a8515800
             0700000000000000 00000000a6e14500 00000000aff2c000 000000008003c44c
             000000008093c700 0000000000000010 00000380009ebba8 00000380009ebb48
  Krnl Code: 000003ff7febaf7e: a7f4003d            brc     15,000003ff7febaff8
             000003ff7febaf82: e32020000004        lg      %r2,0(%r2)
            #000003ff7febaf88: ec2100388064        cgrj    %r2,%r1,8,000003ff7febaff8
            &gt;000003ff7febaf8e: e3b020100020        cg      %r11,16(%r2)
             000003ff7febaf94: a774fff7            brc     7,000003ff7febaf82
             000003ff7febaf98: ec280030007c        cgij    %r2,0,8,000003ff7febaff8
             000003ff7febaf9e: e31020080004        lg      %r1,8(%r2)
             000003ff7febafa4: e33020000004        lg      %r3,0(%r2)
  Call Trace:
   [&lt;000003ff7febaf8e&gt;] zfcp_fsf_reqid_check+0x86/0x158 [zfcp]
   [&lt;000003ff7febbdbc&gt;] zfcp_qdio_int_resp+0x6c/0x170 [zfcp]
   [&lt;000003ff7febbf90&gt;] zfcp_qdio_irq_tasklet+0xd0/0x108 [zfcp]
   [&lt;0000000061d90a04&gt;] tasklet_action_common.constprop.0+0xdc/0x128
   [&lt;000000006292f300&gt;] __do_softirq+0x130/0x3c0
   [&lt;0000000061d906c6&gt;] irq_exit_rcu+0xfe/0x118
   [&lt;000000006291e818&gt;] do_io_irq+0xc8/0x168
   [&lt;000000006292d516&gt;] io_int_handler+0xd6/0x110
   [&lt;000000006292d596&gt;] psw_idle_exit+0x0/0xa
  ([&lt;0000000061d3be50&gt;] arch_cpu_idle+0x40/0xd0)
   [&lt;000000006292ceea&gt;] default_idle_call+0x52/0xf8
   [&lt;0000000061de4fa4&gt;] do_idle+0xd4/0x168
   [&lt;0000000061de51fe&gt;] cpu_startup_entry+0x36/0x40
   [&lt;0000000061d4faac&gt;] smp_start_secondary+0x12c/0x138
   [&lt;000000006292d88e&gt;] restart_int_handler+0x6e/0x90
  Last Breaking-Event-Address:
   [&lt;000003ff7febaf94&gt;] zfcp_fsf_reqid_check+0x8c/0x158 [zfcp]
  Kernel panic - not syncing: Fatal exception in interrupt

or:

  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 523b05d3ae76a000 TEID: 523b05d3ae76a803
  Fault in home space mode while using kernel ASCE.
  AS:0000000077c40007 R3:0000000000000024
  Oops: 0038 ilc:3 [#1] SMP
  Modules linked in: ...
  CPU: 3 PID: 453 Comm: kworker/3:1H Kdump: loaded
  Hardware name: ...
  Workqueue: kblockd blk_mq_run_work_fn
  Krnl PSW : 0404d00180000000 0000000076fc0312 (__kmalloc+0xd2/0x398)
             R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
  Krnl GPRS: ffffffffffffffff 523b05d3ae76abf6 0000000000000000 0000000000092a20
             0000000000000002 00000007e49b5cc0 00000007eda8f000 0000000000092a20
             00000007eda8f000 00000003b02856b9 00000000000000a8 523b05d3ae76abf6
             00000007dd662000 00000007eda8f000 0000000076fc02b2 000003e0037637a0
  Krnl Code: 0000000076fc0302: c004000000d4	brcl	0,76fc04aa
             0000000076fc0308: b904001b		lgr	%r1,%r11
            #0000000076fc030c: e3106020001a	algf	%r1,32(%r6)
            &gt;0000000076fc0312: e31010000082	xg	%r1,0(%r1)
             0000000076fc0318: b9040001		lgr	%r0,%r1
             0000000076fc031c: e30061700082	xg	%r0,368(%r6)
             0000000076fc0322: ec59000100d9	aghik	%r5,%r9,1
             0000000076fc0328: e34003b80004	lg	%r4,952
  Call Trace:
   [&lt;0000000076fc0312&gt;] __kmalloc+0xd2/0x398
   [&lt;0000000076f318f2&gt;] mempool_alloc+0x72/0x1f8
   [&lt;000003ff8027c5f8&gt;] zfcp_fsf_req_create.isra.7+0x40/0x268 [zfcp]
   [&lt;000003ff8027f1bc&gt;] zfcp_fsf_fcp_cmnd+0xac/0x3f0 [zfcp]
   [&lt;000003ff80280f1a&gt;] zfcp_scsi_queuecommand+0x122/0x1d0 [zfcp]
   [&lt;000003ff800b4218&gt;] scsi_queue_rq+0x778/0xa10 [scsi_mod]
   [&lt;00000000771782a0&gt;] __blk_mq_try_issue_directly+0x130/0x208
   [&lt;000000007717a124&gt;] blk_mq_request_issue_directly+0x4c/0xa8
   [&lt;000003ff801302e2&gt;] dm_mq_queue_rq+0x2ea/0x468 [dm_mod]
   [&lt;0000000077178c12&gt;] blk_mq_dispatch_rq_list+0x33a/0x818
   [&lt;000000007717f064&gt;] __blk_mq_do_dispatch_sched+0x284/0x2f0
   [&lt;000000007717f44c&gt;] __blk_mq_sched_dispatch_requests+0x1c4/0x218
   [&lt;000000007717fa7a&gt;] blk_mq_sched_dispatch_requests+0x52/0x90
   [&lt;0000000077176d74&gt;] __blk_mq_run_hw_queue+0x9c/0xc0
   [&lt;0000000076da6d74&gt;] process_one_work+0x274/0x4d0
   [&lt;0000000076da7018&gt;] worker_thread+0x48/0x560
   [&lt;0000000076daef18&gt;] kthread+0x140/0x160
   [&lt;000000007751d144&gt;] ret_from_fork+0x28/0x30
  Last Breaking-Event-Address:
   [&lt;0000000076fc0474&gt;] __kmalloc+0x234/0x398
  Kernel panic - not syncing: Fatal exception: panic_on_oops

To fix this, simply change the type of the cache variable to 'unsigned
long', like the rest of zfcp and also the argument for
'zfcp_reqlist_find_rm()'. This prevents truncation and wrong sign extension
and so can successfully remove the request from the hash table.

Fixes: e60a6d69f1f8 ("[SCSI] zfcp: Remove function zfcp_reqlist_find_safe")
Cc: &lt;stable@vger.kernel.org&gt; #v2.6.34+
Signed-off-by: Benjamin Block &lt;bblock@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/979f6e6019d15f91ba56182f1aaf68d61bf37fc6.1668595505.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier &lt;maier@linux.ibm.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0954256e970ecf371b03a6c9af2cf91b9c4085ff upstream.

We used to use the wrong type of integer in 'zfcp_fsf_req_send()' to cache
the FSF request ID when sending a new FSF request. This is used in case the
sending fails and we need to remove the request from our internal hash
table again (so we don't keep an invalid reference and use it when we free
the request again).

In 'zfcp_fsf_req_send()' we used to cache the ID as 'int' (signed and 32
bit wide), but the rest of the zfcp code (and the firmware specification)
handles the ID as 'unsigned long'/'u64' (unsigned and 64 bit wide [s390x
ELF ABI]).  For one this has the obvious problem that when the ID grows
past 32 bit (this can happen reasonably fast) it is truncated to 32 bit
when storing it in the cache variable and so doesn't match the original ID
anymore.  The second less obvious problem is that even when the original ID
has not yet grown past 32 bit, as soon as the 32nd bit is set in the
original ID (0x80000000 = 2'147'483'648) we will have a mismatch when we
cast it back to 'unsigned long'. As the cached variable is of a signed
type, the compiler will choose a sign-extending instruction to load the 32
bit variable into a 64 bit register (e.g.: 'lgf %r11,188(%r15)'). So once
we pass the cached variable into 'zfcp_reqlist_find_rm()' to remove the
request again all the leading zeros will be flipped to ones to extend the
sign and won't match the original ID anymore (this has been observed in
practice).

If we can't successfully remove the request from the hash table again after
'zfcp_qdio_send()' fails (this happens regularly when zfcp cannot notify
the adapter about new work because the adapter is already gone during
e.g. a ChpID toggle) we will end up with a double free.  We unconditionally
free the request in the calling function when 'zfcp_fsf_req_send()' fails,
but because the request is still in the hash table we end up with a stale
memory reference, and once the zfcp adapter is either reset during recovery
or shutdown we end up freeing the same memory twice.

The resulting stack traces vary depending on the kernel and have no direct
correlation to the place where the bug occurs. Here are three examples that
have been seen in practice:

  list_del corruption. next-&gt;prev should be 00000001b9d13800, but was 00000000dead4ead. (next=00000001bd131a00)
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:62!
  monitor event: 0040 ilc:2 [#1] PREEMPT SMP
  Modules linked in: ...
  CPU: 9 PID: 1617 Comm: zfcperp0.0.1740 Kdump: loaded
  Hardware name: ...
  Krnl PSW : 0704d00180000000 00000003cbeea1f8 (__list_del_entry_valid+0x98/0x140)
             R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
  Krnl GPRS: 00000000916d12f1 0000000080000000 000000000000006d 00000003cb665cd6
             0000000000000001 0000000000000000 0000000000000000 00000000d28d21e8
             00000000d3844000 00000380099efd28 00000001bd131a00 00000001b9d13800
             00000000d3290100 0000000000000000 00000003cbeea1f4 00000380099efc70
  Krnl Code: 00000003cbeea1e8: c020004f68a7        larl    %r2,00000003cc8d7336
             00000003cbeea1ee: c0e50027fd65        brasl   %r14,00000003cc3e9cb8
            #00000003cbeea1f4: af000000            mc      0,0
            &gt;00000003cbeea1f8: c02000920440        larl    %r2,00000003cd12aa78
             00000003cbeea1fe: c0e500289c25        brasl   %r14,00000003cc3fda48
             00000003cbeea204: b9040043            lgr     %r4,%r3
             00000003cbeea208: b9040051            lgr     %r5,%r1
             00000003cbeea20c: b9040032            lgr     %r3,%r2
  Call Trace:
   [&lt;00000003cbeea1f8&gt;] __list_del_entry_valid+0x98/0x140
  ([&lt;00000003cbeea1f4&gt;] __list_del_entry_valid+0x94/0x140)
   [&lt;000003ff7ff502fe&gt;] zfcp_fsf_req_dismiss_all+0xde/0x150 [zfcp]
   [&lt;000003ff7ff49cd0&gt;] zfcp_erp_strategy_do_action+0x160/0x280 [zfcp]
   [&lt;000003ff7ff4a22e&gt;] zfcp_erp_strategy+0x21e/0xca0 [zfcp]
   [&lt;000003ff7ff4ad34&gt;] zfcp_erp_thread+0x84/0x1a0 [zfcp]
   [&lt;00000003cb5eece8&gt;] kthread+0x138/0x150
   [&lt;00000003cb557f3c&gt;] __ret_from_fork+0x3c/0x60
   [&lt;00000003cc4172ea&gt;] ret_from_fork+0xa/0x40
  INFO: lockdep is turned off.
  Last Breaking-Event-Address:
   [&lt;00000003cc3e9d04&gt;] _printk+0x4c/0x58
  Kernel panic - not syncing: Fatal exception: panic_on_oops

or:

  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803
  Fault in home space mode while using kernel ASCE.
  AS:0000000063b10007 R3:0000000000000024
  Oops: 0038 ilc:3 [#1] SMP
  Modules linked in: ...
  CPU: 10 PID: 0 Comm: swapper/10 Kdump: loaded
  Hardware name: ...
  Krnl PSW : 0404d00180000000 000003ff7febaf8e (zfcp_fsf_reqid_check+0x86/0x158 [zfcp])
             R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
  Krnl GPRS: 5a6f1cfa89c49ac3 00000000aff2c4c8 6b6b6b6b6b6b6b6b 00000000000002a8
             0000000000000000 0000000000000055 0000000000000000 00000000a8515800
             0700000000000000 00000000a6e14500 00000000aff2c000 000000008003c44c
             000000008093c700 0000000000000010 00000380009ebba8 00000380009ebb48
  Krnl Code: 000003ff7febaf7e: a7f4003d            brc     15,000003ff7febaff8
             000003ff7febaf82: e32020000004        lg      %r2,0(%r2)
            #000003ff7febaf88: ec2100388064        cgrj    %r2,%r1,8,000003ff7febaff8
            &gt;000003ff7febaf8e: e3b020100020        cg      %r11,16(%r2)
             000003ff7febaf94: a774fff7            brc     7,000003ff7febaf82
             000003ff7febaf98: ec280030007c        cgij    %r2,0,8,000003ff7febaff8
             000003ff7febaf9e: e31020080004        lg      %r1,8(%r2)
             000003ff7febafa4: e33020000004        lg      %r3,0(%r2)
  Call Trace:
   [&lt;000003ff7febaf8e&gt;] zfcp_fsf_reqid_check+0x86/0x158 [zfcp]
   [&lt;000003ff7febbdbc&gt;] zfcp_qdio_int_resp+0x6c/0x170 [zfcp]
   [&lt;000003ff7febbf90&gt;] zfcp_qdio_irq_tasklet+0xd0/0x108 [zfcp]
   [&lt;0000000061d90a04&gt;] tasklet_action_common.constprop.0+0xdc/0x128
   [&lt;000000006292f300&gt;] __do_softirq+0x130/0x3c0
   [&lt;0000000061d906c6&gt;] irq_exit_rcu+0xfe/0x118
   [&lt;000000006291e818&gt;] do_io_irq+0xc8/0x168
   [&lt;000000006292d516&gt;] io_int_handler+0xd6/0x110
   [&lt;000000006292d596&gt;] psw_idle_exit+0x0/0xa
  ([&lt;0000000061d3be50&gt;] arch_cpu_idle+0x40/0xd0)
   [&lt;000000006292ceea&gt;] default_idle_call+0x52/0xf8
   [&lt;0000000061de4fa4&gt;] do_idle+0xd4/0x168
   [&lt;0000000061de51fe&gt;] cpu_startup_entry+0x36/0x40
   [&lt;0000000061d4faac&gt;] smp_start_secondary+0x12c/0x138
   [&lt;000000006292d88e&gt;] restart_int_handler+0x6e/0x90
  Last Breaking-Event-Address:
   [&lt;000003ff7febaf94&gt;] zfcp_fsf_reqid_check+0x8c/0x158 [zfcp]
  Kernel panic - not syncing: Fatal exception in interrupt

or:

  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 523b05d3ae76a000 TEID: 523b05d3ae76a803
  Fault in home space mode while using kernel ASCE.
  AS:0000000077c40007 R3:0000000000000024
  Oops: 0038 ilc:3 [#1] SMP
  Modules linked in: ...
  CPU: 3 PID: 453 Comm: kworker/3:1H Kdump: loaded
  Hardware name: ...
  Workqueue: kblockd blk_mq_run_work_fn
  Krnl PSW : 0404d00180000000 0000000076fc0312 (__kmalloc+0xd2/0x398)
             R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
  Krnl GPRS: ffffffffffffffff 523b05d3ae76abf6 0000000000000000 0000000000092a20
             0000000000000002 00000007e49b5cc0 00000007eda8f000 0000000000092a20
             00000007eda8f000 00000003b02856b9 00000000000000a8 523b05d3ae76abf6
             00000007dd662000 00000007eda8f000 0000000076fc02b2 000003e0037637a0
  Krnl Code: 0000000076fc0302: c004000000d4	brcl	0,76fc04aa
             0000000076fc0308: b904001b		lgr	%r1,%r11
            #0000000076fc030c: e3106020001a	algf	%r1,32(%r6)
            &gt;0000000076fc0312: e31010000082	xg	%r1,0(%r1)
             0000000076fc0318: b9040001		lgr	%r0,%r1
             0000000076fc031c: e30061700082	xg	%r0,368(%r6)
             0000000076fc0322: ec59000100d9	aghik	%r5,%r9,1
             0000000076fc0328: e34003b80004	lg	%r4,952
  Call Trace:
   [&lt;0000000076fc0312&gt;] __kmalloc+0xd2/0x398
   [&lt;0000000076f318f2&gt;] mempool_alloc+0x72/0x1f8
   [&lt;000003ff8027c5f8&gt;] zfcp_fsf_req_create.isra.7+0x40/0x268 [zfcp]
   [&lt;000003ff8027f1bc&gt;] zfcp_fsf_fcp_cmnd+0xac/0x3f0 [zfcp]
   [&lt;000003ff80280f1a&gt;] zfcp_scsi_queuecommand+0x122/0x1d0 [zfcp]
   [&lt;000003ff800b4218&gt;] scsi_queue_rq+0x778/0xa10 [scsi_mod]
   [&lt;00000000771782a0&gt;] __blk_mq_try_issue_directly+0x130/0x208
   [&lt;000000007717a124&gt;] blk_mq_request_issue_directly+0x4c/0xa8
   [&lt;000003ff801302e2&gt;] dm_mq_queue_rq+0x2ea/0x468 [dm_mod]
   [&lt;0000000077178c12&gt;] blk_mq_dispatch_rq_list+0x33a/0x818
   [&lt;000000007717f064&gt;] __blk_mq_do_dispatch_sched+0x284/0x2f0
   [&lt;000000007717f44c&gt;] __blk_mq_sched_dispatch_requests+0x1c4/0x218
   [&lt;000000007717fa7a&gt;] blk_mq_sched_dispatch_requests+0x52/0x90
   [&lt;0000000077176d74&gt;] __blk_mq_run_hw_queue+0x9c/0xc0
   [&lt;0000000076da6d74&gt;] process_one_work+0x274/0x4d0
   [&lt;0000000076da7018&gt;] worker_thread+0x48/0x560
   [&lt;0000000076daef18&gt;] kthread+0x140/0x160
   [&lt;000000007751d144&gt;] ret_from_fork+0x28/0x30
  Last Breaking-Event-Address:
   [&lt;0000000076fc0474&gt;] __kmalloc+0x234/0x398
  Kernel panic - not syncing: Fatal exception: panic_on_oops

To fix this, simply change the type of the cache variable to 'unsigned
long', like the rest of zfcp and also the argument for
'zfcp_reqlist_find_rm()'. This prevents truncation and wrong sign extension
and so can successfully remove the request from the hash table.

Fixes: e60a6d69f1f8 ("[SCSI] zfcp: Remove function zfcp_reqlist_find_safe")
Cc: &lt;stable@vger.kernel.org&gt; #v2.6.34+
Signed-off-by: Benjamin Block &lt;bblock@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/979f6e6019d15f91ba56182f1aaf68d61bf37fc6.1668595505.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier &lt;maier@linux.ibm.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/dasd: fix Oops in dasd_alias_get_start_dev due to missing pavgroup</title>
<updated>2022-09-28T09:04:08+00:00</updated>
<author>
<name>Stefan Haberland</name>
<email>sth@linux.ibm.com</email>
</author>
<published>2022-09-19T15:49:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2e473351400e3dd66f0b71eddcef82ee45a584c1'/>
<id>2e473351400e3dd66f0b71eddcef82ee45a584c1</id>
<content type='text'>
commit db7ba07108a48c0f95b74fabbfd5d63e924f992d upstream.

Fix Oops in dasd_alias_get_start_dev() function caused by the pavgroup
pointer being NULL.

The pavgroup pointer is checked on the entrance of the function but
without the lcu-&gt;lock being held. Therefore there is a race window
between dasd_alias_get_start_dev() and _lcu_update() which sets
pavgroup to NULL with the lcu-&gt;lock held.

Fix by checking the pavgroup pointer with lcu-&gt;lock held.

Cc: &lt;stable@vger.kernel.org&gt; # 2.6.25+
Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Signed-off-by: Stefan Haberland &lt;sth@linux.ibm.com&gt;
Reviewed-by: Jan Hoeppner &lt;hoeppner@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20220919154931.4123002-2-sth@linux.ibm.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit db7ba07108a48c0f95b74fabbfd5d63e924f992d upstream.

Fix Oops in dasd_alias_get_start_dev() function caused by the pavgroup
pointer being NULL.

The pavgroup pointer is checked on the entrance of the function but
without the lcu-&gt;lock being held. Therefore there is a race window
between dasd_alias_get_start_dev() and _lcu_update() which sets
pavgroup to NULL with the lcu-&gt;lock held.

Fix by checking the pavgroup pointer with lcu-&gt;lock held.

Cc: &lt;stable@vger.kernel.org&gt; # 2.6.25+
Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Signed-off-by: Stefan Haberland &lt;sth@linux.ibm.com&gt;
Reviewed-by: Jan Hoeppner &lt;hoeppner@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20220919154931.4123002-2-sth@linux.ibm.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>scsi: zfcp: Fix missing auto port scan and thus missing target ports</title>
<updated>2022-08-25T09:18:10+00:00</updated>
<author>
<name>Steffen Maier</name>
<email>maier@linux.ibm.com</email>
</author>
<published>2022-07-29T16:25:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bc98764d80ee37110484957943d159454c68c5d6'/>
<id>bc98764d80ee37110484957943d159454c68c5d6</id>
<content type='text'>
commit 4da8c5f76825269f28d6a89fa752934a4bcb6dfa upstream.

Case (1):
  The only waiter on wka_port-&gt;completion_wq is zfcp_fc_wka_port_get()
  trying to open a WKA port. As such it should only be woken up by WKA port
  *open* responses, not by WKA port close responses.

Case (2):
  A close WKA port response coming in just after having sent a new open WKA
  port request and before blocking for the open response with wait_event()
  in zfcp_fc_wka_port_get() erroneously renders the wait_event a NOP
  because the close handler overwrites wka_port-&gt;status. Hence the
  wait_event condition is erroneously true and it does not enter blocking
  state.

With non-negligible probability, the following time space sequence happens
depending on timing without this fix:

user process        ERP thread zfcp work queue tasklet system work queue
============        ========== =============== ======= =================
$ echo 1 &gt; online
zfcp_ccw_set_online
zfcp_ccw_activate
zfcp_erp_adapter_reopen
msleep scan backoff zfcp_erp_strategy
|                   ...
|                   zfcp_erp_action_cleanup
|                   ...
|                   queue delayed scan_work
|                   queue ns_up_work
|                              ns_up_work:
|                              zfcp_fc_wka_port_get
|                               open wka request
|                                              open response
|                              GSPN FC-GS
|                              RSPN FC-GS [NPIV-only]
|                              zfcp_fc_wka_port_put
|                               (--wka-&gt;refcount==0)
|                               sched delayed wka-&gt;work
|
~~~Case (1)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zfcp_erp_wait
flush scan_work
|                                                      wka-&gt;work:
|                                                      wka-&gt;status=CLOSING
|                                                      close wka request
|                              scan_work:
|                              zfcp_fc_wka_port_get
|                               (wka-&gt;status==CLOSING)
|                               wka-&gt;status=OPENING
|                               open wka request
|                               wait_event
|                               |              close response
|                               |              wka-&gt;status=OFFLINE
|                               |              wake_up /*WRONG*/
~~~Case (2)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|                                                      wka-&gt;work:
|                                                      wka-&gt;status=CLOSING
|                                                      close wka request
zfcp_erp_wait
flush scan_work
|                              scan_work:
|                              zfcp_fc_wka_port_get
|                               (wka-&gt;status==CLOSING)
|                               wka-&gt;status=OPENING
|                               open wka request
|                                              close response
|                                              wka-&gt;status=OFFLINE
|                                              wake_up /*WRONG&amp;NOP*/
|                               wait_event /*NOP*/
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|                               (wka-&gt;status!=ONLINE)
|                               return -EIO
|                              return early
                                               open response
                                               wka-&gt;status=ONLINE
                                               wake_up /*NOP*/

So we erroneously end up with no automatic port scan. This is a big problem
when it happens during boot. The timing is influenced by v3.19 commit
18f87a67e6d6 ("zfcp: auto port scan resiliency").

Fix it by fully mutually excluding zfcp_fc_wka_port_get() and
zfcp_fc_wka_port_offline(). For that to work, we make the latter block
until we got the response for a close WKA port. In order not to penalize
the system workqueue, we move wka_port-&gt;work to our own adapter workqueue.
Note that before v2.6.30 commit 828bc1212a68 ("[SCSI] zfcp: Set WKA-port to
offline on adapter deactivation"), zfcp did block in
zfcp_fc_wka_port_offline() as well, but with a different condition.

While at it, make non-functional cleanups to improve code reading in
zfcp_fc_wka_port_get(). If we cannot send the WKA port open request, don't
rely on the subsequent wait_event condition to immediately let this case
pass without blocking. Also don't want to rely on the additional condition
handling the refcount to be skipped just to finally return with -EIO.

Link: https://lore.kernel.org/r/20220729162529.1620730-1-maier@linux.ibm.com
Fixes: 5ab944f97e09 ("[SCSI] zfcp: attach and release SAN nameserver port on demand")
Cc: &lt;stable@vger.kernel.org&gt; #v2.6.28+
Reviewed-by: Benjamin Block &lt;bblock@linux.ibm.com&gt;
Signed-off-by: Steffen Maier &lt;maier@linux.ibm.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4da8c5f76825269f28d6a89fa752934a4bcb6dfa upstream.

Case (1):
  The only waiter on wka_port-&gt;completion_wq is zfcp_fc_wka_port_get()
  trying to open a WKA port. As such it should only be woken up by WKA port
  *open* responses, not by WKA port close responses.

Case (2):
  A close WKA port response coming in just after having sent a new open WKA
  port request and before blocking for the open response with wait_event()
  in zfcp_fc_wka_port_get() erroneously renders the wait_event a NOP
  because the close handler overwrites wka_port-&gt;status. Hence the
  wait_event condition is erroneously true and it does not enter blocking
  state.

With non-negligible probability, the following time space sequence happens
depending on timing without this fix:

user process        ERP thread zfcp work queue tasklet system work queue
============        ========== =============== ======= =================
$ echo 1 &gt; online
zfcp_ccw_set_online
zfcp_ccw_activate
zfcp_erp_adapter_reopen
msleep scan backoff zfcp_erp_strategy
|                   ...
|                   zfcp_erp_action_cleanup
|                   ...
|                   queue delayed scan_work
|                   queue ns_up_work
|                              ns_up_work:
|                              zfcp_fc_wka_port_get
|                               open wka request
|                                              open response
|                              GSPN FC-GS
|                              RSPN FC-GS [NPIV-only]
|                              zfcp_fc_wka_port_put
|                               (--wka-&gt;refcount==0)
|                               sched delayed wka-&gt;work
|
~~~Case (1)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
zfcp_erp_wait
flush scan_work
|                                                      wka-&gt;work:
|                                                      wka-&gt;status=CLOSING
|                                                      close wka request
|                              scan_work:
|                              zfcp_fc_wka_port_get
|                               (wka-&gt;status==CLOSING)
|                               wka-&gt;status=OPENING
|                               open wka request
|                               wait_event
|                               |              close response
|                               |              wka-&gt;status=OFFLINE
|                               |              wake_up /*WRONG*/
~~~Case (2)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|                                                      wka-&gt;work:
|                                                      wka-&gt;status=CLOSING
|                                                      close wka request
zfcp_erp_wait
flush scan_work
|                              scan_work:
|                              zfcp_fc_wka_port_get
|                               (wka-&gt;status==CLOSING)
|                               wka-&gt;status=OPENING
|                               open wka request
|                                              close response
|                                              wka-&gt;status=OFFLINE
|                                              wake_up /*WRONG&amp;NOP*/
|                               wait_event /*NOP*/
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|                               (wka-&gt;status!=ONLINE)
|                               return -EIO
|                              return early
                                               open response
                                               wka-&gt;status=ONLINE
                                               wake_up /*NOP*/

So we erroneously end up with no automatic port scan. This is a big problem
when it happens during boot. The timing is influenced by v3.19 commit
18f87a67e6d6 ("zfcp: auto port scan resiliency").

Fix it by fully mutually excluding zfcp_fc_wka_port_get() and
zfcp_fc_wka_port_offline(). For that to work, we make the latter block
until we got the response for a close WKA port. In order not to penalize
the system workqueue, we move wka_port-&gt;work to our own adapter workqueue.
Note that before v2.6.30 commit 828bc1212a68 ("[SCSI] zfcp: Set WKA-port to
offline on adapter deactivation"), zfcp did block in
zfcp_fc_wka_port_offline() as well, but with a different condition.

While at it, make non-functional cleanups to improve code reading in
zfcp_fc_wka_port_get(). If we cannot send the WKA port open request, don't
rely on the subsequent wait_event condition to immediately let this case
pass without blocking. Also don't want to rely on the additional condition
handling the refcount to be skipped just to finally return with -EIO.

Link: https://lore.kernel.org/r/20220729162529.1620730-1-maier@linux.ibm.com
Fixes: 5ab944f97e09 ("[SCSI] zfcp: attach and release SAN nameserver port on demand")
Cc: &lt;stable@vger.kernel.org&gt; #v2.6.28+
Reviewed-by: Benjamin Block &lt;bblock@linux.ibm.com&gt;
Signed-off-by: Steffen Maier &lt;maier@linux.ibm.com&gt;
Signed-off-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390/zcore: fix race when reading from hardware system area</title>
<updated>2022-08-25T09:18:05+00:00</updated>
<author>
<name>Alexander Gordeev</name>
<email>agordeev@linux.ibm.com</email>
</author>
<published>2022-07-19T05:16:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3edcd1348ba7fda6779e0bc4faaa276243661c9b'/>
<id>3edcd1348ba7fda6779e0bc4faaa276243661c9b</id>
<content type='text'>
[ Upstream commit 9ffed254d938c9e99eb7761c7f739294c84e0367 ]

Memory buffer used for reading out data from hardware system
area is not protected against concurrent access.

Reported-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Fixes: 411ed3225733 ("[S390] zfcpdump support.")
Acked-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Tested-by: Alexander Egorenkov &lt;egorenar@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/e68137f0f9a0d2558f37becc20af18e2939934f6.1658206891.git.agordeev@linux.ibm.com
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 9ffed254d938c9e99eb7761c7f739294c84e0367 ]

Memory buffer used for reading out data from hardware system
area is not protected against concurrent access.

Reported-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Fixes: 411ed3225733 ("[S390] zfcpdump support.")
Acked-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Tested-by: Alexander Egorenkov &lt;egorenar@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/e68137f0f9a0d2558f37becc20af18e2939934f6.1658206891.git.agordeev@linux.ibm.com
Signed-off-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfio/ccw: Do not change FSM state in subchannel event</title>
<updated>2022-08-25T09:18:03+00:00</updated>
<author>
<name>Eric Farman</name>
<email>farman@linux.ibm.com</email>
</author>
<published>2022-07-07T13:57:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=373343d8a79621837073368cd54dc4335c09b5b7'/>
<id>373343d8a79621837073368cd54dc4335c09b5b7</id>
<content type='text'>
[ Upstream commit cffcc109fd682075dee79bade3d60a07152a8fd1 ]

The routine vfio_ccw_sch_event() is tasked with handling subchannel events,
specifically machine checks, on behalf of vfio-ccw. It correctly calls
cio_update_schib(), and if that fails (meaning the subchannel is gone)
it makes an FSM event call to mark the subchannel Not Operational.

If that worked, however, then it decides that if the FSM state was already
Not Operational (implying the subchannel just came back), then it should
simply change the FSM to partially- or fully-open.

Remove this trickery, since a subchannel returning will require more
probing than simply "oh all is well again" to ensure it works correctly.

Fixes: bbe37e4cb8970 ("vfio: ccw: introduce a finite state machine")
Signed-off-by: Eric Farman &lt;farman@linux.ibm.com&gt;
Reviewed-by: Matthew Rosato &lt;mjrosato@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20220707135737.720765-4-farman@linux.ibm.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit cffcc109fd682075dee79bade3d60a07152a8fd1 ]

The routine vfio_ccw_sch_event() is tasked with handling subchannel events,
specifically machine checks, on behalf of vfio-ccw. It correctly calls
cio_update_schib(), and if that fails (meaning the subchannel is gone)
it makes an FSM event call to mark the subchannel Not Operational.

If that worked, however, then it decides that if the FSM state was already
Not Operational (implying the subchannel just came back), then it should
simply change the FSM to partially- or fully-open.

Remove this trickery, since a subchannel returning will require more
probing than simply "oh all is well again" to ensure it works correctly.

Fixes: bbe37e4cb8970 ("vfio: ccw: introduce a finite state machine")
Signed-off-by: Eric Farman &lt;farman@linux.ibm.com&gt;
Reviewed-by: Matthew Rosato &lt;mjrosato@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20220707135737.720765-4-farman@linux.ibm.com
Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tty: the rest, stop using tty_schedule_flip()</title>
<updated>2022-07-29T15:14:19+00:00</updated>
<author>
<name>Jiri Slaby</name>
<email>jslaby@suse.cz</email>
</author>
<published>2021-11-22T11:16:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f20912215c9ca3403b2dbbbeac26e98663defb67'/>
<id>f20912215c9ca3403b2dbbbeac26e98663defb67</id>
<content type='text'>
commit b68b914494df4f79b4e9b58953110574af1cb7a2 upstream.

Since commit a9c3f68f3cd8d (tty: Fix low_latency BUG) in 2014,
tty_flip_buffer_push() is only a wrapper to tty_schedule_flip(). We are
going to remove the latter (as it is used less), so call the former in
the rest of the users.

Cc: Richard Henderson &lt;rth@twiddle.net&gt;
Cc: Ivan Kokshaysky &lt;ink@jurassic.park.msu.ru&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: William Hubbs &lt;w.d.hubbs@gmail.com&gt;
Cc: Chris Brannon &lt;chris@the-brannons.com&gt;
Cc: Kirk Reiser &lt;kirk@reisers.ca&gt;
Cc: Samuel Thibault &lt;samuel.thibault@ens-lyon.org&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Reviewed-by: Johan Hovold &lt;johan@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
Link: https://lore.kernel.org/r/20211122111648.30379-3-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b68b914494df4f79b4e9b58953110574af1cb7a2 upstream.

Since commit a9c3f68f3cd8d (tty: Fix low_latency BUG) in 2014,
tty_flip_buffer_push() is only a wrapper to tty_schedule_flip(). We are
going to remove the latter (as it is used less), so call the former in
the rest of the users.

Cc: Richard Henderson &lt;rth@twiddle.net&gt;
Cc: Ivan Kokshaysky &lt;ink@jurassic.park.msu.ru&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: William Hubbs &lt;w.d.hubbs@gmail.com&gt;
Cc: Chris Brannon &lt;chris@the-brannons.com&gt;
Cc: Kirk Reiser &lt;kirk@reisers.ca&gt;
Cc: Samuel Thibault &lt;samuel.thibault@ens-lyon.org&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Reviewed-by: Johan Hovold &lt;johan@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
Link: https://lore.kernel.org/r/20211122111648.30379-3-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
