<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/md/raid10.c, branch v2.6.38</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>md: avoid spinlock problem in blk_throtl_exit</title>
<updated>2011-02-21T07:25:57+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-02-21T07:25:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=da9cf5050a2e3dbc3cf26a8d908482eb4485ed49'/>
<id>da9cf5050a2e3dbc3cf26a8d908482eb4485ed49</id>
<content type='text'>
blk_throtl_exit assumes that -&gt;queue_lock still exists,
so make sure that it does.
To do this, we stop redirecting -&gt;queue_lock to conf-&gt;device_lock
and leave it pointing where it is initialised - __queue_lock.

As the blk_plug functions check the -&gt;queue_lock is held, we now
take that spin_lock explicitly around the plug functions.  We don't
need the locking, just the warning removal.

This is needed for any kernel with the blk_throtl code, which is
which is 2.6.37 and later.

Cc: stable@kernel.org
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
blk_throtl_exit assumes that -&gt;queue_lock still exists,
so make sure that it does.
To do this, we stop redirecting -&gt;queue_lock to conf-&gt;device_lock
and leave it pointing where it is initialised - __queue_lock.

As the blk_plug functions check the -&gt;queue_lock is held, we now
take that spin_lock explicitly around the plug functions.  We don't
need the locking, just the warning removal.

This is needed for any kernel with the blk_throtl code, which is
which is 2.6.37 and later.

Cc: stable@kernel.org
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>FIX: md: process hangs at wait_barrier after 0-&gt;10 takeover</title>
<updated>2011-02-08T00:49:02+00:00</updated>
<author>
<name>Krzysztof Wojcik</name>
<email>krzysztof.wojcik@intel.com</email>
</author>
<published>2011-02-04T13:18:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=02214dc5461c36da26a34014cab4e1bb484edba2'/>
<id>02214dc5461c36da26a34014cab4e1bb484edba2</id>
<content type='text'>
Following symptoms were observed:
1. After raid0-&gt;raid10 takeover operation we have array with 2
missing disks.
When we add disk for rebuild, recovery process starts as expected
but it does not finish- it stops at about 90%, md126_resync process
hangs in "D" state.
2. Similar behavior is when we have mounted raid0 array and we
execute takeover to raid10. After this when we try to unmount array-
it causes process umount hangs in "D"

In scenarios above processes hang at the same function- wait_barrier
in raid10.c.
Process waits in macro "wait_event_lock_irq" until the
"!conf-&gt;barrier" condition will be true.
In scenarios above it never happens.

Reason was that at the end of level_store, after calling pers-&gt;run,
we call mddev_resume. This calls pers-&gt;quiesce(mddev, 0) with
RAID10, that calls lower_barrier.
However raise_barrier hadn't been called on that 'conf' yet,
so conf-&gt;barrier becomes negative, which is bad.

This patch introduces setting conf-&gt;barrier=1 after takeover
operation. It prevents to become barrier negative after call
lower_barrier().

Signed-off-by: Krzysztof Wojcik &lt;krzysztof.wojcik@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Following symptoms were observed:
1. After raid0-&gt;raid10 takeover operation we have array with 2
missing disks.
When we add disk for rebuild, recovery process starts as expected
but it does not finish- it stops at about 90%, md126_resync process
hangs in "D" state.
2. Similar behavior is when we have mounted raid0 array and we
execute takeover to raid10. After this when we try to unmount array-
it causes process umount hangs in "D"

In scenarios above processes hang at the same function- wait_barrier
in raid10.c.
Process waits in macro "wait_event_lock_irq" until the
"!conf-&gt;barrier" condition will be true.
In scenarios above it never happens.

Reason was that at the end of level_store, after calling pers-&gt;run,
we call mddev_resume. This calls pers-&gt;quiesce(mddev, 0) with
RAID10, that calls lower_barrier.
However raise_barrier hadn't been called on that 'conf' yet,
so conf-&gt;barrier becomes negative, which is bad.

This patch introduces setting conf-&gt;barrier=1 after takeover
operation. It prevents to become barrier negative after call
lower_barrier().

Signed-off-by: Krzysztof Wojcik &lt;krzysztof.wojcik@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md-new-param-to_sync_page_io</title>
<updated>2011-01-13T22:14:33+00:00</updated>
<author>
<name>Jonathan Brassow</name>
<email>jbrassow@redhat.com</email>
</author>
<published>2011-01-13T22:14:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ccebd4c4159462c96397ae9af9c667bb394d7b70'/>
<id>ccebd4c4159462c96397ae9af9c667bb394d7b70</id>
<content type='text'>
Add new parameter to 'sync_page_io'.

The new parameter allows us to distinguish between metadata and data
operations.  This becomes important later when we add the ability to
use separate devices for data and metadata.

Signed-off-by: Jonathan Brassow &lt;jbrassow@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add new parameter to 'sync_page_io'.

The new parameter allows us to distinguish between metadata and data
operations.  This becomes important later when we add the ability to
use separate devices for data and metadata.

Signed-off-by: Jonathan Brassow &lt;jbrassow@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Fix single printks with multiple KERN_&lt;level&gt;s</title>
<updated>2011-01-13T22:14:33+00:00</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2011-01-13T22:14:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=067032bc628598606056412594042564fcf09e22'/>
<id>067032bc628598606056412594042564fcf09e22</id>
<content type='text'>
Noticed-by: Russell King &lt;linux@arm.linux.org.uk&gt;
Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Noticed-by: Russell King &lt;linux@arm.linux.org.uk&gt;
Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: protect against NULL reference when waiting to start a raid10.</title>
<updated>2010-12-09T06:02:14+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2010-12-09T06:02:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=589a594be1fb8815b3f18e517be696c48664f728'/>
<id>589a594be1fb8815b3f18e517be696c48664f728</id>
<content type='text'>
When we fail to start a raid10 for some reason, we call
md_unregister_thread to kill the thread that was created.

Unfortunately md_thread() will then make one call into the handler
(raid10d) even though md_wakeup_thread has not been called.  This is
not safe and as md_unregister_thread is called after mddev-&gt;private
has been set to NULL, it will definitely cause a NULL dereference.

So fix this at both ends:
 - md_thread should only call the handler if THREAD_WAKEUP has been
   set.
 - raid10 should call md_unregister_thread before setting things
   to NULL just like all the other raid modules do.

This is applicable to 2.6.35 and later.

Cc: stable@kernel.org
Reported-by: "Citizen" &lt;citizen_lee@thecus.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When we fail to start a raid10 for some reason, we call
md_unregister_thread to kill the thread that was created.

Unfortunately md_thread() will then make one call into the handler
(raid10d) even though md_wakeup_thread has not been called.  This is
not safe and as md_unregister_thread is called after mddev-&gt;private
has been set to NULL, it will definitely cause a NULL dereference.

So fix this at both ends:
 - md_thread should only call the handler if THREAD_WAKEUP has been
   set.
 - raid10 should call md_unregister_thread before setting things
   to NULL just like all the other raid modules do.

This is applicable to 2.6.35 and later.

Cc: stable@kernel.org
Reported-by: "Citizen" &lt;citizen_lee@thecus.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: use separate bio pool for each md device.</title>
<updated>2010-10-28T06:36:15+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2010-10-26T07:31:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a167f663243662aa9153c01086580a11cde9ffdc'/>
<id>a167f663243662aa9153c01086580a11cde9ffdc</id>
<content type='text'>
bio_clone and bio_alloc allocate from a common bio pool.
If an md device is stacked with other devices that use this pool, or under
something like swap which uses the pool, then the multiple calls on
the pool can cause deadlocks.

So allocate a local bio pool for each md array and use that rather
than the common pool.

This pool is used both for regular IO and metadata updates.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bio_clone and bio_alloc allocate from a common bio pool.
If an md device is stacked with other devices that use this pool, or under
something like swap which uses the pool, then the multiple calls on
the pool can cause deadlocks.

So allocate a local bio pool for each md array and use that rather
than the common pool.

This pool is used both for regular IO and metadata updates.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: change type of first arg to sync_page_io.</title>
<updated>2010-10-28T06:36:11+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2010-10-27T04:16:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2b193363ef68667ad717a6723165e0dccf99470f'/>
<id>2b193363ef68667ad717a6723165e0dccf99470f</id>
<content type='text'>
Currently sync_page_io takes a 'bdev'.
Every caller passes 'rdev-&gt;bdev'.
We will soon want another field out of the rdev in sync_page_io,
So just pass the rdev instead of the bdev out of it.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently sync_page_io takes a 'bdev'.
Every caller passes 'rdev-&gt;bdev'.
We will soon want another field out of the rdev in sync_page_io,
So just pass the rdev instead of the bdev out of it.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: use bio_kmalloc rather than bio_alloc when failure is acceptable.</title>
<updated>2010-10-28T06:36:06+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2010-10-26T06:33:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6746557f0325a66f57d179126426e38a8ea66945'/>
<id>6746557f0325a66f57d179126426e38a8ea66945</id>
<content type='text'>
bio_alloc can never fail (as it uses a mempool) but an block
indefinitely, especially if the caller is holding a reference to a
previously allocated bio.

So these to places which both handle failure and hold multiple bios
should not use bio_alloc, they should use bio_kmalloc.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bio_alloc can never fail (as it uses a mempool) but an block
indefinitely, especially if the caller is holding a reference to a
previously allocated bio.

So these to places which both handle failure and hold multiple bios
should not use bio_alloc, they should use bio_kmalloc.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Fix possible deadlock with multiple mempool allocations.</title>
<updated>2010-10-28T06:34:07+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2010-10-19T01:54:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4e78064f42ad474ce9c31760861f7fb0cfc22532'/>
<id>4e78064f42ad474ce9c31760861f7fb0cfc22532</id>
<content type='text'>
It is not safe to allocate from a mempool while holding an item
previously allocated from that mempool as that can deadlock when the
mempool is close to exhaustion.

So don't use a bio list to collect the bios to write to multiple
devices in raid1 and raid10.
Instead queue each bio as it becomes available so an unplug will
activate all previously allocated bios and so a new bio has a chance
of being allocated.

This means we must set the 'remaining' count to '1' before submitting
any requests, then when all are submitted, decrement 'remaining' and
possible handle the write completion at that point.

Reported-by: Torsten Kaiser &lt;just.for.lkml@googlemail.com&gt;
Tested-by: Torsten Kaiser &lt;just.for.lkml@googlemail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is not safe to allocate from a mempool while holding an item
previously allocated from that mempool as that can deadlock when the
mempool is close to exhaustion.

So don't use a bio list to collect the bios to write to multiple
devices in raid1 and raid10.
Instead queue each bio as it becomes available so an unplug will
activate all previously allocated bios and so a new bio has a chance
of being allocated.

This means we must set the 'remaining' count to '1' before submitting
any requests, then when all are submitted, decrement 'remaining' and
possible handle the write completion at that point.

Reported-by: Torsten Kaiser &lt;just.for.lkml@googlemail.com&gt;
Tested-by: Torsten Kaiser &lt;just.for.lkml@googlemail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: use sector_t in bitmap_get_counter</title>
<updated>2010-10-28T06:32:26+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2010-10-18T23:03:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=57dab0bdf689d42972975ec646d862b0900a4bf3'/>
<id>57dab0bdf689d42972975ec646d862b0900a4bf3</id>
<content type='text'>
bitmap_get_counter returns the number of sectors covered
by the counter in a pass-by-reference variable.
In some cases this can be very large, so make it a sector_t
for safety.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bitmap_get_counter returns the number of sectors covered
by the counter in a pass-by-reference variable.
In some cases this can be very large, so make it a sector_t
for safety.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;

</pre>
</div>
</content>
</entry>
</feed>
