<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/dlm/midcomms.c, branch v6.5</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>fs: dlm: add send ack threshold and append acks to msgs</title>
<updated>2023-06-14T15:17:33+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-05-29T21:44:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1696c75f1864b338fccbc55f278039d199287ec3'/>
<id>1696c75f1864b338fccbc55f278039d199287ec3</id>
<content type='text'>
This patch changes the time when we sending an ack back to tell the
other side it can free some message because it is arrived on the
receiver node, due random reconnects e.g. TCP resets this is handled as
well on application layer to not let DLM run into a deadlock state.

The current handling has the following problems:

1. We end in situations that we only send an ack back message of 16
   bytes out and no other messages. Whereas DLM has logic to combine
   so much messages as it can in one send() socket call. This behaviour
   can be discovered by "trace-cmd start -e dlm_recv" and observing the
   ret field being 16 bytes.

2. When processing of DLM messages will never end because we receive a
   lot of messages, we will not send an ack back as it happens when
   the processing loop ends.

This patch introduces a likely and unlikely threshold case. The likely
case will send an ack back on a transmit path if the threshold is
triggered of amount of processed upper layer protocol. This will solve
issue 1 because it will be send when another normal DLM message will be
sent. It solves issue 2 because it is not part of the processing loop.

There is however a unlikely case, the unlikely case has a bigger
threshold and will be triggered when we only receive messages and do not
sent any message back. This case avoids that the sending node will keep
a lot of message for a long time as we send sometimes ack backs to tell
the sender to finally release messages.

The atomic cmpxchg() is there to provide a atomically ack send with
reset of the upper layer protocol delivery counter.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch changes the time when we sending an ack back to tell the
other side it can free some message because it is arrived on the
receiver node, due random reconnects e.g. TCP resets this is handled as
well on application layer to not let DLM run into a deadlock state.

The current handling has the following problems:

1. We end in situations that we only send an ack back message of 16
   bytes out and no other messages. Whereas DLM has logic to combine
   so much messages as it can in one send() socket call. This behaviour
   can be discovered by "trace-cmd start -e dlm_recv" and observing the
   ret field being 16 bytes.

2. When processing of DLM messages will never end because we receive a
   lot of messages, we will not send an ack back as it happens when
   the processing loop ends.

This patch introduces a likely and unlikely threshold case. The likely
case will send an ack back on a transmit path if the threshold is
triggered of amount of processed upper layer protocol. This will solve
issue 1 because it will be send when another normal DLM message will be
sent. It solves issue 2 because it is not part of the processing loop.

There is however a unlikely case, the unlikely case has a bigger
threshold and will be triggered when we only receive messages and do not
sent any message back. This case avoids that the sending node will keep
a lot of message for a long time as we send sometimes ack backs to tell
the sender to finally release messages.

The atomic cmpxchg() is there to provide a atomically ack send with
reset of the upper layer protocol delivery counter.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dlm: handle sequence numbers as atomic</title>
<updated>2023-06-14T15:17:33+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-05-29T21:44:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d00725cab226491fc347d9bc42b881a0a35419cf'/>
<id>d00725cab226491fc347d9bc42b881a0a35419cf</id>
<content type='text'>
Currently seq_next is only be read on the receive side which processed
in an ordered way. The seq_send is being protected by locks. To being
able to read the seq_next value on send side as well we convert it to an
atomic_t value. The atomic_cmpxchg() is probably not necessary, however
the atomic_inc() depends on a if coniditional and this should be handled
in an atomic context.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently seq_next is only be read on the receive side which processed
in an ordered way. The seq_send is being protected by locks. To being
able to read the seq_next value on send side as well we convert it to an
atomic_t value. The atomic_cmpxchg() is probably not necessary, however
the atomic_inc() depends on a if coniditional and this should be handled
in an atomic context.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dlm: filter ourself midcomms calls</title>
<updated>2023-06-14T15:17:33+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-05-29T21:44:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=07ee38674a0b9071fa0bbec4dbda6aad1c5e4003'/>
<id>07ee38674a0b9071fa0bbec4dbda6aad1c5e4003</id>
<content type='text'>
It makes no sense to call midcomms/lowcomms functionality for the local
node as socket functionality is only required for remote nodes. This
patch filters those calls in the upper layer of lockspace membership
handling instead of doing it in midcomms/lowcomms layer as they should
never be aware of local nodeid.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It makes no sense to call midcomms/lowcomms functionality for the local
node as socket functionality is only required for remote nodes. This
patch filters those calls in the upper layer of lockspace membership
handling instead of doing it in midcomms/lowcomms layer as they should
never be aware of local nodeid.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dlm: revert check required context while close</title>
<updated>2023-06-14T15:17:33+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-05-29T21:44:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c6b6d6dcc7f32767d57740e0552337c8de40610b'/>
<id>c6b6d6dcc7f32767d57740e0552337c8de40610b</id>
<content type='text'>
This patch reverts commit 2c3fa6ae4d52 ("dlm: check required context
while close"). The function dlm_midcomms_close(), which will call later
dlm_lowcomms_close(), is called when the cluster manager tells the node
got fenced which means on midcomms/lowcomms layer to disconnect the node
from the cluster communication. The node can rejoin the cluster later.
This patch was ensuring no new message were able to be triggered when we
are in the close() function context. This was done by checking if the
lockspace has been stopped. However there is a missing check that we
only need to check specific lockspaces where the fenced node is member
of. This is currently complicated because there is no way to easily
check if a node is part of a specific lockspace without stopping the
recovery. For now we just revert this commit as it is just a check to
finding possible leaks of stopping lockspaces before close() is called.

Cc: stable@vger.kernel.org
Fixes: 2c3fa6ae4d52 ("dlm: check required context while close")
Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch reverts commit 2c3fa6ae4d52 ("dlm: check required context
while close"). The function dlm_midcomms_close(), which will call later
dlm_lowcomms_close(), is called when the cluster manager tells the node
got fenced which means on midcomms/lowcomms layer to disconnect the node
from the cluster communication. The node can rejoin the cluster later.
This patch was ensuring no new message were able to be triggered when we
are in the close() function context. This was done by checking if the
lockspace has been stopped. However there is a missing check that we
only need to check specific lockspaces where the fenced node is member
of. This is currently complicated because there is no way to easily
check if a node is part of a specific lockspace without stopping the
recovery. For now we just revert this commit as it is just a check to
finding possible leaks of stopping lockspaces before close() is called.

Cc: stable@vger.kernel.org
Fixes: 2c3fa6ae4d52 ("dlm: check required context while close")
Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dlm: remove unnecessary waker_up() calls</title>
<updated>2023-01-23T22:27:28+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-01-12T22:18:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=723b197bbdf1e0adbab772b8e5e022c40db6a9fe'/>
<id>723b197bbdf1e0adbab772b8e5e022c40db6a9fe</id>
<content type='text'>
The wake_up() is already handled inside of midcomms_node_reset() when
switching the state to CLOSED state. So there is not need to call it
after midcomms_node_reset() again.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The wake_up() is already handled inside of midcomms_node_reset() when
switching the state to CLOSED state. So there is not need to call it
after midcomms_node_reset() again.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dlm: move state change into else branch</title>
<updated>2023-01-23T22:27:27+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-01-12T22:18:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ef7ef015eb333c8954d7165d216ec0c774d7e4e4'/>
<id>ef7ef015eb333c8954d7165d216ec0c774d7e4e4</id>
<content type='text'>
Currently we can switch at first into DLM_CLOSE_WAIT state and then do
another state change if a condition is true. Instead of doing two state
changes we handle the other state change inside an else branch of this
condition.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we can switch at first into DLM_CLOSE_WAIT state and then do
another state change if a condition is true. Instead of doing two state
changes we handle the other state change inside an else branch of this
condition.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dlm: remove newline in log_print</title>
<updated>2023-01-23T22:27:26+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-01-12T22:18:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3186409711e64ed0a0fa767fad555a5e655022da'/>
<id>3186409711e64ed0a0fa767fad555a5e655022da</id>
<content type='text'>
There is an API difference between log_print() and other printk()s to
put a newline or not. This one was introduced by mistake because
log_print() adds a newline.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is an API difference between log_print() and other printk()s to
put a newline or not. This one was introduced by mistake because
log_print() adds a newline.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dlm: reduce the shutdown timeout to 5 secs</title>
<updated>2023-01-23T22:27:20+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-01-12T22:18:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=11605353f27c23902b6fb993a57bd2f0baaff3b6'/>
<id>11605353f27c23902b6fb993a57bd2f0baaff3b6</id>
<content type='text'>
When a shutdown is stuck, time out after 5 seconds instead of
3 minutes.  After this timeout we try a forced shutdown.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a shutdown is stuck, time out after 5 seconds instead of
3 minutes.  After this timeout we try a forced shutdown.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dlm: wait until all midcomms nodes detect version</title>
<updated>2023-01-23T20:58:19+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-01-12T22:18:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b8b750e0c99f39223115f2672ac4cfa96ecb9edd'/>
<id>b8b750e0c99f39223115f2672ac4cfa96ecb9edd</id>
<content type='text'>
The current dlm version detection is very complex due to backwards
compatablilty with earlier dlm protocol versions. It takes some time to
detect if a peer node has a specific DLM version. If it's not detected,
we just cut the socket connection. There could be cases where the local
node has not detected the version yet, but the peer node has.  In these
cases, we are trying to shutdown the dlm connection with a FIN/ACK message
exchange to be sure the other peer is ready to shutdown the connection on
dlm application level.  However this mechanism is only available on DLM
protocol version 3.2 and we need to be sure the DLM version is detected
before.

To make it more robust we introduce a a "best effort" wait to wait for the
version detection before shutdown the dlm connection. This need to be
done before the kthread recoverd for recovery handling is stopped,
because recovery handling will trigger enough messages to have a version
detection going on.

It is a corner case which was detected by modprobe dlm_locktroture module
and rmmod dlm_locktorture module directly afterwards (in a looping
behaviour). In practice probably nobody would leave a lockspace immediately
after joining it.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current dlm version detection is very complex due to backwards
compatablilty with earlier dlm protocol versions. It takes some time to
detect if a peer node has a specific DLM version. If it's not detected,
we just cut the socket connection. There could be cases where the local
node has not detected the version yet, but the peer node has.  In these
cases, we are trying to shutdown the dlm connection with a FIN/ACK message
exchange to be sure the other peer is ready to shutdown the connection on
dlm application level.  However this mechanism is only available on DLM
protocol version 3.2 and we need to be sure the DLM version is detected
before.

To make it more robust we introduce a a "best effort" wait to wait for the
version detection before shutdown the dlm connection. This need to be
done before the kthread recoverd for recovery handling is stopped,
because recovery handling will trigger enough messages to have a version
detection going on.

It is a corner case which was detected by modprobe dlm_locktroture module
and rmmod dlm_locktorture module directly afterwards (in a looping
behaviour). In practice probably nobody would leave a lockspace immediately
after joining it.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dlm: ignore unexpected non dlm opts msgs</title>
<updated>2023-01-23T20:42:02+00:00</updated>
<author>
<name>Alexander Aring</name>
<email>aahringo@redhat.com</email>
</author>
<published>2023-01-12T22:18:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=89835b064fe7421a60e40770c27d38024190a0c8'/>
<id>89835b064fe7421a60e40770c27d38024190a0c8</id>
<content type='text'>
This patch ignores unexpected RCOM_NAMES/RCOM_STATUS messages.
To be backwards compatible, those messages are not part of the new
reliable DLM OPTS encapsulation header, and have their own
retransmit handling using sequence number matching  When we get
unexpected non dlm opts messages, we should allow them and let
RCOM message handling filter them out using sequence numbers.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch ignores unexpected RCOM_NAMES/RCOM_STATUS messages.
To be backwards compatible, those messages are not part of the new
reliable DLM OPTS encapsulation header, and have their own
retransmit handling using sequence number matching  When we get
unexpected non dlm opts messages, we should allow them and let
RCOM message handling filter them out using sequence numbers.

Signed-off-by: Alexander Aring &lt;aahringo@redhat.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
