<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/dlm, branch v3.6</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>dlm: fix missing dir remove</title>
<updated>2012-07-16T19:24:43+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-06-25T18:48:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=96006ea6d4eea73466e90ef353bf34e507724e77'/>
<id>96006ea6d4eea73466e90ef353bf34e507724e77</id>
<content type='text'>
I don't know exactly how, but in some cases, a dir
record is not removed, or a new one is created when
it shouldn't be.  The result is that the dir node
lookup returns a master node where the rsb does not
exist.  In this case, The master node will repeatedly
return -EBADR for requests, and the lock requests will
be stuck.

Until all possible ways for this to happen can be
eliminated, a simple and effective way to recover from
this situation is for the supposed master node to send
a standard remove message to the dir node when it
receives a request for a resource it has no rsb for.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I don't know exactly how, but in some cases, a dir
record is not removed, or a new one is created when
it shouldn't be.  The result is that the dir node
lookup returns a master node where the rsb does not
exist.  In this case, The master node will repeatedly
return -EBADR for requests, and the lock requests will
be stuck.

Until all possible ways for this to happen can be
eliminated, a simple and effective way to recover from
this situation is for the supposed master node to send
a standard remove message to the dir node when it
receives a request for a resource it has no rsb for.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: fix conversion deadlock from recovery</title>
<updated>2012-07-16T19:18:22+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-06-05T20:55:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c503a62103c46d56447f56306b52be6f844689ba'/>
<id>c503a62103c46d56447f56306b52be6f844689ba</id>
<content type='text'>
The process of rebuilding locks on a new master during
recovery could re-order the locks on the convert queue,
creating an "in place" conversion deadlock that would
not be resolved.  Fix this by not considering queue
order when granting conversions after recovery.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The process of rebuilding locks on a new master during
recovery could re-order the locks on the convert queue,
creating an "in place" conversion deadlock that would
not be resolved.  Fix this by not considering queue
order when granting conversions after recovery.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: use wait_event_timeout</title>
<updated>2012-07-16T19:18:12+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-06-05T16:23:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6d768177c282637a7943e72b4b2b148e7553ecf1'/>
<id>6d768177c282637a7943e72b4b2b148e7553ecf1</id>
<content type='text'>
Use wait_event_timeout to avoid using a timer
directly.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use wait_event_timeout to avoid using a timer
directly.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: fix race between remove and lookup</title>
<updated>2012-07-16T19:18:01+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-06-14T17:17:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=05c32f47bfae74dabff05208957768078b53cc49'/>
<id>05c32f47bfae74dabff05208957768078b53cc49</id>
<content type='text'>
It was possible for a remove message on an old
rsb to be sent after a lookup message on a new
rsb, where the rsbs were for the same resource
name.  This could lead to a missing directory
entry for the new rsb.

It is fixed by keeping a copy of the resource
name being removed until after the remove has
been sent.  A lookup checks if this in-progress
remove matches the name it is looking up.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It was possible for a remove message on an old
rsb to be sent after a lookup message on a new
rsb, where the rsbs were for the same resource
name.  This could lead to a missing directory
entry for the new rsb.

It is fixed by keeping a copy of the resource
name being removed until after the remove has
been sent.  A lookup checks if this in-progress
remove matches the name it is looking up.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: use idr instead of list for recovered rsbs</title>
<updated>2012-07-16T19:17:52+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-05-15T21:07:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1d7c484eeb167fc374294e38ae402de4097c8611'/>
<id>1d7c484eeb167fc374294e38ae402de4097c8611</id>
<content type='text'>
When a large number of resources are being recovered,
a linear search of the recover_list takes a long time.
Use an idr in place of a list.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a large number of resources are being recovered,
a linear search of the recover_list takes a long time.
Use an idr in place of a list.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: use rsbtbl as resource directory</title>
<updated>2012-07-16T19:16:19+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-05-10T15:18:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c04fecb4d9f7753e0cbff7edd03ec68f8721cdce'/>
<id>c04fecb4d9f7753e0cbff7edd03ec68f8721cdce</id>
<content type='text'>
Remove the dir hash table (dirtbl), and use
the rsb hash table (rsbtbl) as the resource
directory.  It has always been an unnecessary
duplication of information.

This improves efficiency by using a single rsbtbl
lookup in many cases where both rsbtbl and dirtbl
lookups were needed previously.

This eliminates the need to handle cases of rsbtbl
and dirtbl being out of sync.

In many cases there will be memory savings because
the dir hash table no longer exists.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the dir hash table (dirtbl), and use
the rsb hash table (rsbtbl) as the resource
directory.  It has always been an unnecessary
duplication of information.

This improves efficiency by using a single rsbtbl
lookup in many cases where both rsbtbl and dirtbl
lookups were needed previously.

This eliminates the need to handle cases of rsbtbl
and dirtbl being out of sync.

In many cases there will be memory savings because
the dir hash table no longer exists.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: NULL dereference on failure in kmem_cache_create()</title>
<updated>2012-05-15T15:39:28+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@oracle.com</email>
</author>
<published>2012-05-15T08:58:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=75af271ed5f51b1f3506c7c1d567b1f32e5c9206'/>
<id>75af271ed5f51b1f3506c7c1d567b1f32e5c9206</id>
<content type='text'>
We aren't allowed to pass NULL pointers to kmem_cache_destroy() so if
both allocations fail, it leads to a NULL dereference.

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We aren't allowed to pass NULL pointers to kmem_cache_destroy() so if
both allocations fail, it leads to a NULL dereference.

Signed-off-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: fixes for nodir mode</title>
<updated>2012-05-02T19:15:27+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-04-26T20:54:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4875647a08e35f77274838d97ca8fa44158d50e2'/>
<id>4875647a08e35f77274838d97ca8fa44158d50e2</id>
<content type='text'>
The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progess locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need grant a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progess locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need grant a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: improve error and debug messages</title>
<updated>2012-04-26T20:41:46+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-04-23T21:36:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6d40c4a708e0e996fd9c60d4093aebba5fe1f749'/>
<id>6d40c4a708e0e996fd9c60d4093aebba5fe1f749</id>
<content type='text'>
Change some existing error/debug messages to
collect more useful information, and add
some new error/debug messages to address
recently found problems.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change some existing error/debug messages to
collect more useful information, and add
some new error/debug messages to address
recently found problems.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dlm: avoid unnecessary search in search_rsb</title>
<updated>2012-04-26T20:37:56+00:00</updated>
<author>
<name>David Teigland</name>
<email>teigland@redhat.com</email>
</author>
<published>2012-04-23T19:08:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=57638bf3aa64facd9eba0e018b5773f5d2da6c2b'/>
<id>57638bf3aa64facd9eba0e018b5773f5d2da6c2b</id>
<content type='text'>
If the rsb is found in the "keep" tree, but is
not the right type (i.e. not MASTER), we can
return immediately with the result.  There's
no point in going on to search the "toss" list
as if we hadn't found it.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the rsb is found in the "keep" tree, but is
not the right type (i.e. not MASTER), we can
return immediately with the result.  There's
no point in going on to search the "toss" list
as if we hadn't found it.

Signed-off-by: David Teigland &lt;teigland@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
