<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/md/md.c, branch v2.6.33</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>md: fix some lockdep issues between md and sysfs.</title>
<updated>2010-02-10T00:26:09+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2010-02-09T05:34:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ef286f6fa673cd7fb367e1b145069d8dbfcc6081'/>
<id>ef286f6fa673cd7fb367e1b145069d8dbfcc6081</id>
<content type='text'>
======
This fix is related to
    http://bugzilla.kernel.org/show_bug.cgi?id=15142
but does not address that exact issue.
======

sysfs does like attributes being removed while they are being accessed
(i.e. read or written) and waits for the access to complete.

As accessing some md attributes takes the same lock that is held while
removing those attributes a deadlock can occur.

This patch addresses 3 issues in md that could lead to this deadlock.

Two relate to calling flush_scheduled_work while the lock is held.
This is probably a bad idea in general and as we use schedule_work to
delete various sysfs objects it is particularly bad.

In one case flush_scheduled_work is called from md_alloc (called by
md_probe) called from do_md_run which holds the lock.  This call is
only present to ensure that -&gt;gendisk is set.  However we can be sure
that gendisk is always set (though possibly we couldn't when that code
was originally written.  This is because do_md_run is called in three
different contexts:
  1/ from md_ioctl.  This requires that md_open has succeeded, and it
     fails if -&gt;gendisk is not set.
  2/ from writing a sysfs attribute.  This can only happen if the
     mddev has been registered in sysfs which happens in md_alloc
     after -&gt;gendisk has been set.
  3/ from autorun_array which is only called by autorun_devices, which
     checks for -&gt;gendisk to be set before calling autorun_array.
So the call to md_probe in do_md_run can be removed, and the check on
-&gt;gendisk can also go.


In the other case flush_scheduled_work is being called in do_md_stop,
purportedly to wait for all md_delayed_delete calls (which delete the
component rdevs) to complete.  However there really isn't any need to
wait for them - they have already been disconnected in all important
ways.

The third issue is that raid5-&gt;stop() removes some attribute names
while the lock is held.  There is already some infrastructure in place
to delay attribute removal until after the lock is released (using
schedule_work).  So extend that infrastructure to remove the
raid5_attrs_group.

This does not address all lockdep issues related to the sysfs
"s_active" lock.  The rest can be address by splitting that lockdep
context between symlinks and non-symlinks which hopefully will happen.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
======
This fix is related to
    http://bugzilla.kernel.org/show_bug.cgi?id=15142
but does not address that exact issue.
======

sysfs does like attributes being removed while they are being accessed
(i.e. read or written) and waits for the access to complete.

As accessing some md attributes takes the same lock that is held while
removing those attributes a deadlock can occur.

This patch addresses 3 issues in md that could lead to this deadlock.

Two relate to calling flush_scheduled_work while the lock is held.
This is probably a bad idea in general and as we use schedule_work to
delete various sysfs objects it is particularly bad.

In one case flush_scheduled_work is called from md_alloc (called by
md_probe) called from do_md_run which holds the lock.  This call is
only present to ensure that -&gt;gendisk is set.  However we can be sure
that gendisk is always set (though possibly we couldn't when that code
was originally written.  This is because do_md_run is called in three
different contexts:
  1/ from md_ioctl.  This requires that md_open has succeeded, and it
     fails if -&gt;gendisk is not set.
  2/ from writing a sysfs attribute.  This can only happen if the
     mddev has been registered in sysfs which happens in md_alloc
     after -&gt;gendisk has been set.
  3/ from autorun_array which is only called by autorun_devices, which
     checks for -&gt;gendisk to be set before calling autorun_array.
So the call to md_probe in do_md_run can be removed, and the check on
-&gt;gendisk can also go.


In the other case flush_scheduled_work is being called in do_md_stop,
purportedly to wait for all md_delayed_delete calls (which delete the
component rdevs) to complete.  However there really isn't any need to
wait for them - they have already been disconnected in all important
ways.

The third issue is that raid5-&gt;stop() removes some attribute names
while the lock is held.  There is already some infrastructure in place
to delay attribute removal until after the lock is released (using
schedule_work).  So extend that infrastructure to remove the
raid5_attrs_group.

This does not address all lockdep issues related to the sysfs
"s_active" lock.  The rest can be address by splitting that lockdep
context between symlinks and non-symlinks which hopefully will happen.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: allow a resync that is waiting for other resync to complete, to be aborted.</title>
<updated>2009-12-30T04:25:23+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2009-12-30T04:25:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=404e4b43fdd6daa7b4a9f81bc7e4358281d763fa'/>
<id>404e4b43fdd6daa7b4a9f81bc7e4358281d763fa</id>
<content type='text'>
If two arrays share a device, then they will not both resync at the
same time.  One will wait for the other to complete.
While waiting, the MD_RECOVERY_INTR flag is not checked so a device
failure, which would make the resync pointless, does not cause the
resync to abort, so the failed device cannot be removed (as it cannot
be remove while a resync is happening).

So add a test for MD_RECOVERY_INTR.

Reported-by: Brett Russ &lt;bruss@netezza.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If two arrays share a device, then they will not both resync at the
same time.  One will wait for the other to complete.
While waiting, the MD_RECOVERY_INTR flag is not checked so a device
failure, which would make the resync pointless, does not cause the
resync to abort, so the failed device cannot be removed (as it cannot
be remove while a resync is happening).

So add a test for MD_RECOVERY_INTR.

Reported-by: Brett Russ &lt;bruss@netezza.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: remove unnecessary code from do_md_run</title>
<updated>2009-12-30T04:20:43+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2009-12-30T04:19:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7fb9dadc91948ddd16d1d32e414b5e71ec63b97b'/>
<id>7fb9dadc91948ddd16d1d32e414b5e71ec63b97b</id>
<content type='text'>
Since commit dfc7064500061677720fa26352963c772d3ebe6b,
-&gt;hot_remove_disks has not removed non-failed devices from
an array until recovery is no longer possible.
So the code in do_md_run to get around the fact that
md_check_recovery (which calls -&gt;hot_remove_disks) would
remove partially-in-sync devices is no longer needed.

So remove it.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since commit dfc7064500061677720fa26352963c772d3ebe6b,
-&gt;hot_remove_disks has not removed non-failed devices from
an array until recovery is no longer possible.
So the code in do_md_run to get around the fact that
md_check_recovery (which calls -&gt;hot_remove_disks) would
remove partially-in-sync devices is no longer needed.

So remove it.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: make recovery started by do_md_run() visible via sync_action</title>
<updated>2009-12-30T04:20:31+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-12-22T01:18:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a2d79c324ac0c26ae9995a312a7731067a7f01fc'/>
<id>a2d79c324ac0c26ae9995a312a7731067a7f01fc</id>
<content type='text'>
By default md_do_sync() will perform recovery if no other actions are
specified.  However, action_show() relies on MD_RECOVERY_RECOVER to be
set otherwise it returns 'idle'.  So, add a missing set
MD_RECOVERY_RECOVER when starting recovery.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
By default md_do_sync() will perform recovery if no other actions are
specified.  However, action_show() relies on MD_RECOVERY_RECOVER to be
set otherwise it returns 'idle'.  So, add a missing set
MD_RECOVERY_RECOVER when starting recovery.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: fix small irregularity with start_ro module parameter</title>
<updated>2009-12-30T04:20:12+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2009-12-30T01:08:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0f9552b5dc4fe10da37fa3f4a4ca185d90fa41c9'/>
<id>0f9552b5dc4fe10da37fa3f4a4ca185d90fa41c9</id>
<content type='text'>
The start_ro modules parameter can be used to force arrays to be
started in 'auto-readonly' in which they are read-only until the first
write.  This ensures that no resync/recovery happens until something
else writes to the device.  This is important for resume-from-disk
off an md array.

However if an array is started 'readonly' (by writing 'readonly' to
the 'array_state' sysfs attribute) we want it to be really 'readonly',
not 'auto-readonly'.

So strengthen the condition to only set auto-readonly if the
array is not already read-only.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The start_ro modules parameter can be used to force arrays to be
started in 'auto-readonly' in which they are read-only until the first
write.  This ensures that no resync/recovery happens until something
else writes to the device.  This is important for resume-from-disk
off an md array.

However if an array is started 'readonly' (by writing 'readonly' to
the 'array_state' sysfs attribute) we want it to be really 'readonly',
not 'auto-readonly'.

So strengthen the condition to only set auto-readonly if the
array is not already read-only.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Fix unfortunate interaction with evms</title>
<updated>2009-12-30T01:08:49+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2009-12-30T01:08:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cbd1998377504df005302ac90d49db72a48552a6'/>
<id>cbd1998377504df005302ac90d49db72a48552a6</id>
<content type='text'>
evms configures md arrays by:
  open device
  send ioctl
  close device

for each different ioctl needed.
Since 2.6.29, the device can disappear after the 'close'
unless a significant configuration has happened to the device.
The change made by "SET_ARRAY_INFO" can too minor to stop the device
from disappearing, but important enough that losing the change is bad.

So: make sure SET_ARRAY_INFO sets mddev-&gt;ctime, and keep the device
active as long as ctime is non-zero (it gets zeroed with lots of other
things when the array is stopped).

This is suitable for -stable kernels since 2.6.29.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: stable@kernel.org</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
evms configures md arrays by:
  open device
  send ioctl
  close device

for each different ioctl needed.
Since 2.6.29, the device can disappear after the 'close'
unless a significant configuration has happened to the device.
The change made by "SET_ARRAY_INFO" can too minor to stop the device
from disappearing, but important enough that losing the change is bad.

So: make sure SET_ARRAY_INFO sets mddev-&gt;ctime, and keep the device
active as long as ctime is non-zero (it gets zeroed with lots of other
things when the array is stopped).

This is suitable for -stable kernels since 2.6.29.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: stable@kernel.org</pre>
</div>
</content>
</entry>
<entry>
<title>drivers/md/md.c: use %pU to print UUIDs</title>
<updated>2009-12-15T16:53:33+00:00</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2009-12-15T02:01:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7b75c2f8cf6d938b571d3bc3f566f883ab7472c4'/>
<id>7b75c2f8cf6d938b571d3bc3f566f883ab7472c4</id>
<content type='text'>
Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tree-wide: convert open calls to remove spaces to skip_spaces() lib function</title>
<updated>2009-12-15T16:53:32+00:00</updated>
<author>
<name>André Goddard Rosa</name>
<email>andre.goddard@gmail.com</email>
</author>
<published>2009-12-15T02:01:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e7d2860b690d4f3bed6824757c540579638e3d1e'/>
<id>e7d2860b690d4f3bed6824757c540579638e3d1e</id>
<content type='text'>
Makes use of skip_spaces() defined in lib/string.c for removing leading
spaces from strings all over the tree.

It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
   text    data     bss     dec     hex filename
  64688     584     592   65864   10148 (TOTALS-BEFORE)
  64641     584     592   65817   10119 (TOTALS-AFTER)

Also, while at it, if we see (*str &amp;&amp; isspace(*str)), we can be sure to
remove the first condition (*str) as the second one (isspace(*str)) also
evaluates to 0 whenever *str == 0, making it redundant. In other words,
"a char equals zero is never a space".

Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
and found occurrences of this pattern on 3 more files:
    drivers/leds/led-class.c
    drivers/leds/ledtrig-timer.c
    drivers/video/output.c

@@
expression str;
@@

( // ignore skip_spaces cases
while (*str &amp;&amp;  isspace(*str)) { \(str++;\|++str;\) }
|
- *str &amp;&amp;
isspace(*str)
)

Signed-off-by: André Goddard Rosa &lt;andre.goddard@gmail.com&gt;
Cc: Julia Lawall &lt;julia@diku.dk&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Richard Purdie &lt;rpurdie@rpsys.net&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Cc: Henrique de Moraes Holschuh &lt;hmh@hmh.eng.br&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Cc: Samuel Ortiz &lt;samuel@sortiz.org&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Takashi Iwai &lt;tiwai@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Makes use of skip_spaces() defined in lib/string.c for removing leading
spaces from strings all over the tree.

It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
   text    data     bss     dec     hex filename
  64688     584     592   65864   10148 (TOTALS-BEFORE)
  64641     584     592   65817   10119 (TOTALS-AFTER)

Also, while at it, if we see (*str &amp;&amp; isspace(*str)), we can be sure to
remove the first condition (*str) as the second one (isspace(*str)) also
evaluates to 0 whenever *str == 0, making it redundant. In other words,
"a char equals zero is never a space".

Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
and found occurrences of this pattern on 3 more files:
    drivers/leds/led-class.c
    drivers/leds/ledtrig-timer.c
    drivers/video/output.c

@@
expression str;
@@

( // ignore skip_spaces cases
while (*str &amp;&amp;  isspace(*str)) { \(str++;\|++str;\) }
|
- *str &amp;&amp;
isspace(*str)
)

Signed-off-by: André Goddard Rosa &lt;andre.goddard@gmail.com&gt;
Cc: Julia Lawall &lt;julia@diku.dk&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Jeff Dike &lt;jdike@addtoit.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Richard Purdie &lt;rpurdie@rpsys.net&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Kyle McMartin &lt;kyle@mcmartin.ca&gt;
Cc: Henrique de Moraes Holschuh &lt;hmh@hmh.eng.br&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Cc: Samuel Ortiz &lt;samuel@sortiz.org&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Cc: Takashi Iwai &lt;tiwai@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: add 'recovery_start' per-device sysfs attribute</title>
<updated>2009-12-14T01:58:57+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-12-13T04:17:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=06e3c817b750c131a20e82eed57a17841ea88ed2'/>
<id>06e3c817b750c131a20e82eed57a17841ea88ed2</id>
<content type='text'>
Enable external metadata arrays to manage rebuild checkpointing via a
md/dev-XXX/recovery_start attribute which reflects rdev-&gt;recovery_offset

Also update resync_start_store to allow 'none' to be written, for
consistency.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Enable external metadata arrays to manage rebuild checkpointing via a
md/dev-XXX/recovery_start attribute which reflects rdev-&gt;recovery_offset

Also update resync_start_store to allow 'none' to be written, for
consistency.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;</pre>
</div>
</content>
</entry>
<entry>
<title>md: rcu_read_lock() walk of mddev-&gt;disks in md_do_sync()</title>
<updated>2009-12-14T01:57:43+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2009-12-13T04:17:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4e59ca7da05f0d5d3ad40365c502c8b0fd24c7e3'/>
<id>4e59ca7da05f0d5d3ad40365c502c8b0fd24c7e3</id>
<content type='text'>
Other walks of this list are either under rcu_read_lock() or the list
mutation lock (mddev_lock()).  This protects against the improbable case of a
disk being removed from the array at the start of md_do_sync().

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Other walks of this list are either under rcu_read_lock() or the list
mutation lock (mddev_lock()).  This protects against the improbable case of a
disk being removed from the array at the start of md_do_sync().

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
