<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/md/raid1.c, branch v2.6.29</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>md: avoid races when stopping resync.</title>
<updated>2009-02-25T02:18:47+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2009-02-25T02:18:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=73d5c38a9536142e062c35997b044e89166e063b'/>
<id>73d5c38a9536142e062c35997b044e89166e063b</id>
<content type='text'>
There has been a race in raid10 and raid1 for a long time
which has only recently started showing up due to a scheduler changed.

When a sync_read request finishes, as soon as reschedule_retry
is called, another thread can mark the resync request as having
completed, so md_do_sync can finish, -&gt;stop can be called, and
-&gt;conf can be freed.  So using conf after reschedule_retry is not
safe.

Similarly, when finishing a sync_write, calling md_done_sync must be
the last thing we do, as it allows a chain of events which will free
conf and other data structures.

The first of these requires action in raid10.c
The second requires action in raid1.c and raid10.c

Cc: stable@kernel.org
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There has been a race in raid10 and raid1 for a long time
which has only recently started showing up due to a scheduler changed.

When a sync_read request finishes, as soon as reschedule_retry
is called, another thread can mark the resync request as having
completed, so md_do_sync can finish, -&gt;stop can be called, and
-&gt;conf can be freed.  So using conf after reschedule_retry is not
safe.

Similarly, when finishing a sync_write, calling md_done_sync must be
the last thing we do, as it allows a chain of events which will free
conf and other data structures.

The first of these requires action in raid10.c
The second requires action in raid1.c and raid10.c

Cc: stable@kernel.org
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Allow read error in a single drive raid1 to be passed up.</title>
<updated>2009-02-06T04:06:47+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2009-02-06T04:06:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4706b349f4a8312d31b3d0cf61fe721699356920'/>
<id>4706b349f4a8312d31b3d0cf61fe721699356920</id>
<content type='text'>
If a raid1 only has a single working device and gets a read error, 
we choose to simply return that error up to the filesystem (or whatever)
rather than failing the whole array.

However the codes doesn't quite do that.  We attempt a readbalance
which allocates the same drive, so we retry the read - indefinitely. 

Instead:  If read_balance in the error case chooses the same drive that just
failed, treat it as a failure and don't retry.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If a raid1 only has a single working device and gets a read error, 
we choose to simply return that error up to the filesystem (or whatever)
rather than failing the whole array.

However the codes doesn't quite do that.  We attempt a readbalance
which allocates the same drive, so we retry the read - indefinitely. 

Instead:  If read_balance in the error case chooses the same drive that just
failed, treat it as a failure and don't retry.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: don't retry recovery of raid1 that fails due to error on source drive.</title>
<updated>2009-01-08T21:31:11+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2009-01-08T21:31:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4044ba58dd15cb01797c4fd034f39ef4a75f7cc3'/>
<id>4044ba58dd15cb01797c4fd034f39ef4a75f7cc3</id>
<content type='text'>
If a raid1 has only one working drive and it has a sector which
gives an error on read, then an attempt to recover onto a spare will
fail, but as the single remaining drive is not removed from the
array, the recovery will be immediately re-attempted, resulting
in an infinite recovery loop.

So detect this situation and don't retry recovery once an error
on the lone remaining drive is detected.

Allow recovery to be retried once every time a spare is added
in case the problem wasn't actually a media error.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If a raid1 has only one working drive and it has a sector which
gives an error on read, then an attempt to recover onto a spare will
fail, but as the single remaining drive is not removed from the
array, the recovery will be immediately re-attempted, resulting
in an infinite recovery loop.

So detect this situation and don't retry recovery once an error
on the lone remaining drive is detected.

Allow recovery to be retried once every time a spare is added
in case the problem wasn't actually a media error.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: use list_for_each_entry macro directly</title>
<updated>2009-01-08T21:31:08+00:00</updated>
<author>
<name>Cheng Renquan</name>
<email>crquan@gmail.com</email>
</author>
<published>2009-01-08T21:31:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=159ec1fc060ab22b157a62364045f5e98749c4d3'/>
<id>159ec1fc060ab22b157a62364045f5e98749c4d3</id>
<content type='text'>
The rdev_for_each macro defined in &lt;linux/raid/md_k.h&gt; is identical to
list_for_each_entry_safe, from &lt;linux/list.h&gt;, it should be defined to
use list_for_each_entry_safe, instead of reinventing the wheel.

But some calls to each_entry_safe don't really need a safe version,
just a direct list_for_each_entry is enough, this could save a temp
variable (tmp) in every function that used rdev_for_each.

In this patch, most rdev_for_each loops are replaced by list_for_each_entry,
totally save many tmp vars; and only in the other situations that will call
list_del to delete an entry, the safe version is used.

Signed-off-by: Cheng Renquan &lt;crquan@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The rdev_for_each macro defined in &lt;linux/raid/md_k.h&gt; is identical to
list_for_each_entry_safe, from &lt;linux/list.h&gt;, it should be defined to
use list_for_each_entry_safe, instead of reinventing the wheel.

But some calls to each_entry_safe don't really need a safe version,
just a direct list_for_each_entry is enough, this could save a temp
variable (tmp) in every function that used rdev_for_each.

In this patch, most rdev_for_each loops are replaced by list_for_each_entry,
totally save many tmp vars; and only in the other situations that will call
list_del to delete an entry, the safe version is used.

Signed-off-by: Cheng Renquan &lt;crquan@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: build failure due to missing delay.h</title>
<updated>2008-10-15T10:57:05+00:00</updated>
<author>
<name>Stephen Rothwell</name>
<email>sfr@canb.auug.org.au</email>
</author>
<published>2008-10-14T22:09:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=255707274ea25d486b7de060a30ba4ac50593408'/>
<id>255707274ea25d486b7de060a30ba4ac50593408</id>
<content type='text'>
Today's linux-next build (powerpc ppc64_defconfig) failed like this:

drivers/md/raid1.c: In function 'sync_request':
drivers/md/raid1.c:1759: error: implicit declaration of function 'msleep_interruptible'
make[3]: *** [drivers/md/raid1.o] Error 1
make[3]: *** Waiting for unfinished jobs....
drivers/md/raid10.c: In function 'sync_request':
drivers/md/raid10.c:1749: error: implicit declaration of function 'msleep_interruptible'
make[3]: *** [drivers/md/raid10.o] Error 1
drivers/md/md.c: In function 'md_do_sync':
drivers/md/md.c:5915: error: implicit declaration of function 'msleep'

Caused by commit 6caa3b0bbdb474647f6bdd8a958ffc46f78d8d58 ("md: Remove
unnecessary #includes, #defines, and function declarations").  I added
the following patch.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Today's linux-next build (powerpc ppc64_defconfig) failed like this:

drivers/md/raid1.c: In function 'sync_request':
drivers/md/raid1.c:1759: error: implicit declaration of function 'msleep_interruptible'
make[3]: *** [drivers/md/raid1.o] Error 1
make[3]: *** Waiting for unfinished jobs....
drivers/md/raid10.c: In function 'sync_request':
drivers/md/raid10.c:1749: error: implicit declaration of function 'msleep_interruptible'
make[3]: *** [drivers/md/raid10.o] Error 1
drivers/md/md.c: In function 'md_do_sync':
drivers/md/md.c:5915: error: implicit declaration of function 'msleep'

Caused by commit 6caa3b0bbdb474647f6bdd8a958ffc46f78d8d58 ("md: Remove
unnecessary #includes, #defines, and function declarations").  I added
the following patch.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: move stats from disk to part0</title>
<updated>2008-10-09T06:56:08+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2008-08-25T10:56:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=074a7aca7afa6f230104e8e65eba3420263714a5'/>
<id>074a7aca7afa6f230104e8e65eba3420263714a5</id>
<content type='text'>
Move stats related fields - stamp, in_flight, dkstats - from disk to
part0 and unify stat handling such that...

* part_stat_*() now updates part0 together if the specified partition
  is not part0.  ie. part_stat_*() are now essentially all_stat_*().

* {disk|all}_stat_*() are gone.

* part_round_stats() is updated similary.  It handles part0 stats
  automatically and disk_round_stats() is killed.

* part_{inc|dec}_in_fligh() is implemented which automatically updates
  part0 stats for parts other than part0.

* disk_map_sector_rcu() is updated to return part0 if no part matches.
  Combined with the above changes, this makes NULL special case
  handling in callers unnecessary.

* Separate stats show code paths for disk are collapsed into part
  stats show code paths.

* Rename disk_stat_lock/unlock() to part_stat_lock/unlock()

While at it, reposition stat handling macros a bit and add missing
parentheses around macro parameters.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move stats related fields - stamp, in_flight, dkstats - from disk to
part0 and unify stat handling such that...

* part_stat_*() now updates part0 together if the specified partition
  is not part0.  ie. part_stat_*() are now essentially all_stat_*().

* {disk|all}_stat_*() are gone.

* part_round_stats() is updated similary.  It handles part0 stats
  automatically and disk_round_stats() is killed.

* part_{inc|dec}_in_fligh() is implemented which automatically updates
  part0 stats for parts other than part0.

* disk_map_sector_rcu() is updated to return part0 if no part matches.
  Combined with the above changes, this makes NULL special case
  handling in callers unnecessary.

* Separate stats show code paths for disk are collapsed into part
  stats show code paths.

* Rename disk_stat_lock/unlock() to part_stat_lock/unlock()

While at it, reposition stat handling macros a bit and add missing
parentheses around macro parameters.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: fix diskstats access</title>
<updated>2008-10-09T06:56:06+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2008-08-25T10:47:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c9959059161ddd7bf4670cf47367033d6b2f79c4'/>
<id>c9959059161ddd7bf4670cf47367033d6b2f79c4</id>
<content type='text'>
There are two variants of stat functions - ones prefixed with double
underbars which don't care about preemption and ones without which
disable preemption before manipulating per-cpu counters.  It's unclear
whether the underbarred ones assume that preemtion is disabled on
entry as some callers don't do that.

This patch unifies diskstats access by implementing disk_stat_lock()
and disk_stat_unlock() which take care of both RCU (for partition
access) and preemption (for per-cpu counter access).  diskstats access
should always be enclosed between the two functions.  As such, there's
no need for the versions which disables preemption.  They're removed
and double underbars ones are renamed to drop the underbars.  As an
extra argument is added, there's no danger of using the old version
unconverted.

disk_stat_lock() uses get_cpu() and returns the cpu index and all
diskstat functions which access per-cpu counters now has @cpu
argument to help RT.

This change adds RCU or preemption operations at some places but also
collapses several preemption ops into one at others.  Overall, the
performance difference should be negligible as all involved ops are
very lightweight per-cpu ones.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are two variants of stat functions - ones prefixed with double
underbars which don't care about preemption and ones without which
disable preemption before manipulating per-cpu counters.  It's unclear
whether the underbarred ones assume that preemtion is disabled on
entry as some callers don't do that.

This patch unifies diskstats access by implementing disk_stat_lock()
and disk_stat_unlock() which take care of both RCU (for partition
access) and preemption (for per-cpu counter access).  diskstats access
should always be enclosed between the two functions.  As such, there's
no need for the versions which disables preemption.  They're removed
and double underbars ones are renamed to drop the underbars.  As an
extra argument is added, there's no danger of using the old version
unconverted.

disk_stat_lock() uses get_cpu() and returns the cpu index and all
diskstat functions which access per-cpu counters now has @cpu
argument to help RT.

This change adds RCU or preemption operations at some places but also
collapses several preemption ops into one at others.  Overall, the
performance difference should be negligible as all involved ops are
very lightweight per-cpu ones.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: raid fixups for removal of bi_hw_segments</title>
<updated>2008-10-09T06:56:03+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>jens.axboe@oracle.com</email>
</author>
<published>2008-08-15T08:41:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=960e739d9e9f1c2346d8bdc65299ee2e1ed42218'/>
<id>960e739d9e9f1c2346d8bdc65299ee2e1ed42218</id>
<content type='text'>
Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drop vmerge accounting</title>
<updated>2008-10-09T06:56:03+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2008-08-15T08:20:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5df97b91b5d7ed426034fcc84cb6e7cf682b8838'/>
<id>5df97b91b5d7ed426034fcc84cb6e7cf682b8838</id>
<content type='text'>
Remove hw_segments field from struct bio and struct request. Without virtual
merge accounting they have no purpose.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove hw_segments field from struct bio and struct request. Without virtual
merge accounting they have no purpose.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Make mddev-&gt;array_size sector-based.</title>
<updated>2008-07-21T07:05:22+00:00</updated>
<author>
<name>Andre Noll</name>
<email>maan@systemlinux.org</email>
</author>
<published>2008-07-21T07:05:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f233ea5c9e0d8b95e4283bf6a3436b88f6fd3586'/>
<id>f233ea5c9e0d8b95e4283bf6a3436b88f6fd3586</id>
<content type='text'>
This patch renames the array_size field of struct mddev_s to array_sectors
and converts all instances to use units of 512 byte sectors instead of 1k
blocks.

Signed-off-by: Andre Noll &lt;maan@systemlinux.org&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch renames the array_size field of struct mddev_s to array_sectors
and converts all instances to use units of 512 byte sectors instead of 1k
blocks.

Signed-off-by: Andre Noll &lt;maan@systemlinux.org&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
