<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/md/md.c, branch v2.6.38</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>md: Fix - again - partition detection when array becomes active</title>
<updated>2011-02-24T06:26:41+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-02-24T06:26:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f0b4f7e2f29af678bd9af43422c537dcb6008603'/>
<id>f0b4f7e2f29af678bd9af43422c537dcb6008603</id>
<content type='text'>
Revert
    b821eaa572fd737faaf6928ba046e571526c36c6
and
    f3b99be19ded511a1bf05a148276239d9f13eefa

When I wrote the first of these I had a wrong idea about the
lifetime of 'struct block_device'.  It can disappear at any time that
the block device is not open if it falls out of the inode cache.

So relying on the 'size' recorded with it to detect when the
device size has changed and so we need to revalidate, is wrong.

Rather, we really do need the 'changed' attribute stored directly in
the mddev and set/tested as appropriate.

Without this patch, a sequence of:
   mknod / open / close / unlink

(which can cause a block_device to be created and then destroyed)
will result in a rescan of the partition table and consequence removal
and addition of partitions.
Several of these in a row can get udev racing to create and unlink and
other code can get confused.

With the patch, the rescan is only performed when needed and so there
are no races.

This is suitable for any stable kernel from 2.6.35.

Reported-by: "Wojcik, Krzysztof" &lt;krzysztof.wojcik@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: stable@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Revert
    b821eaa572fd737faaf6928ba046e571526c36c6
and
    f3b99be19ded511a1bf05a148276239d9f13eefa

When I wrote the first of these I had a wrong idea about the
lifetime of 'struct block_device'.  It can disappear at any time that
the block device is not open if it falls out of the inode cache.

So relying on the 'size' recorded with it to detect when the
device size has changed and so we need to revalidate, is wrong.

Rather, we really do need the 'changed' attribute stored directly in
the mddev and set/tested as appropriate.

Without this patch, a sequence of:
   mknod / open / close / unlink

(which can cause a block_device to be created and then destroyed)
will result in a rescan of the partition table and consequence removal
and addition of partitions.
Several of these in a row can get udev racing to create and unlink and
other code can get confused.

With the patch, the rescan is only performed when needed and so there
are no races.

This is suitable for any stable kernel from 2.6.35.

Reported-by: "Wojcik, Krzysztof" &lt;krzysztof.wojcik@intel.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: stable@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>md: correctly handle probe of an 'mdp' device.</title>
<updated>2011-02-16T02:58:51+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-02-16T02:58:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8f5f02c460b7ca74ce55ce126ce0c1e58a3f923d'/>
<id>8f5f02c460b7ca74ce55ce126ce0c1e58a3f923d</id>
<content type='text'>
'mdp' devices are md devices with preallocated device numbers
for partitions. As such it is possible to mknod and open a partition
before opening the whole device.

this causes  md_probe() to be called with a device number of a
partition, which in-turn calls mddev_find with such a number.

However mddev_find expects the number of a 'whole device' and
does the wrong thing with partition numbers.

So add code to mddev_find to remove the 'partition' part of
a device number and just work with the 'whole device'.

This patch addresses https://bugzilla.kernel.org/show_bug.cgi?id=28652

Reported-by: hkmaly@bigfoot.com
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: &lt;stable@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
'mdp' devices are md devices with preallocated device numbers
for partitions. As such it is possible to mknod and open a partition
before opening the whole device.

this causes  md_probe() to be called with a device number of a
partition, which in-turn calls mddev_find with such a number.

However mddev_find expects the number of a 'whole device' and
does the wrong thing with partition numbers.

So add code to mddev_find to remove the 'partition' part of
a device number and just work with the 'whole device'.

This patch addresses https://bugzilla.kernel.org/show_bug.cgi?id=28652

Reported-by: hkmaly@bigfoot.com
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: &lt;stable@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: don't set_capacity before array is active.</title>
<updated>2011-02-16T02:58:38+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-02-16T02:58:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cbe6ef1d2622e08e272600b3cb6040bed60f0450'/>
<id>cbe6ef1d2622e08e272600b3cb6040bed60f0450</id>
<content type='text'>
If the desired size of an array is set (via sysfs) before the array is
active (which is the normal sequence), we currrently call set_capacity
immediately.
This means that a subsequent 'open' (as can be caused by some
udev-triggers program) will notice the new size and try to probe for
partitions.  However as the array isn't quite ready yet the read will
fail.  Then when the array is read, as the size doesn't change again
we don't try to re-probe.

So when setting array size via sysfs, only call set_capacity if the
array is already active.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the desired size of an array is set (via sysfs) before the array is
active (which is the normal sequence), we currrently call set_capacity
immediately.
This means that a subsequent 'open' (as can be caused by some
udev-triggers program) will notice the new size and try to probe for
partitions.  However as the array isn't quite ready yet the read will
fail.  Then when the array is read, as the size doesn't change again
we don't try to re-probe.

So when setting array size via sysfs, only call set_capacity if the
array is already active.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md_make_request: don't touch the bio after calling make_request</title>
<updated>2011-02-07T22:53:28+00:00</updated>
<author>
<name>Chris Mason</name>
<email>chris.mason@oracle.com</email>
</author>
<published>2011-02-08T00:21:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e91ece5590b3c728624ab57043fc7a05069c604a'/>
<id>e91ece5590b3c728624ab57043fc7a05069c604a</id>
<content type='text'>
md_make_request was calling bio_sectors() for part_stat_add
after it was calling the make_request function.  This is
bad because the make_request function can free the bio and
because the bi_size field can change around.

The fix here was suggested by Jens Axboe.  It saves the
sector count before the make_request call.  I hit this
with CONFIG_DEBUG_PAGEALLOC turned on while trying to break
his pretty fusionio card.

Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Chris Mason &lt;chris.mason@oracle.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
md_make_request was calling bio_sectors() for part_stat_add
after it was calling the make_request function.  This is
bad because the make_request function can free the bio and
because the bi_size field can change around.

The fix here was suggested by Jens Axboe.  It saves the
sector count before the make_request call.  I hit this
with CONFIG_DEBUG_PAGEALLOC turned on while trying to break
his pretty fusionio card.

Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Chris Mason &lt;chris.mason@oracle.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Don't allow slot_store while resync/recovery is happening.</title>
<updated>2011-02-02T00:57:13+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-02-02T00:57:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c6751b2bde477f56ceef67aa1d298ce44e8e2e23'/>
<id>c6751b2bde477f56ceef67aa1d298ce44e8e2e23</id>
<content type='text'>
Activating a spare in an array while resync/recovery is already
happening can lead the that spare being marked in-sync when it isn't
really.
So don't allow the 'slot' to be set (this activating the device)
while resync/recovery is happening.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Activating a spare in an array while resync/recovery is already
happening can lead the that spare being marked in-sync when it isn't
really.
So don't allow the 'slot' to be set (this activating the device)
while resync/recovery is happening.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: don't clear curr_resync_completed at end of resync.</title>
<updated>2011-01-31T03:30:27+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-01-31T03:30:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7281f8129c362436237b82c8c026494dd36479dc'/>
<id>7281f8129c362436237b82c8c026494dd36479dc</id>
<content type='text'>
There is no need to set this to zero at this point.  It will be
set to zero by remove_and_add_spares or at the start of
md_do_sync at the latest.
And setting it to zero before MD_RECOVERY_RUNNING is cleared can
make a 'zero' appear briefly in the 'sync_completed' sysfs attribute
just as resync is finishing.

So simply remove this setting to zero.


Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no need to set this to zero at this point.  It will be
set to zero by remove_and_add_spares or at the start of
md_do_sync at the latest.
And setting it to zero before MD_RECOVERY_RUNNING is cleared can
make a 'zero' appear briefly in the 'sync_completed' sysfs attribute
just as resync is finishing.

So simply remove this setting to zero.


Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Don't use remove_and_add_spares to remove failed devices from a read-only array</title>
<updated>2011-01-31T02:47:13+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-01-31T02:47:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a8c42c7f476b5bb39bb3a5b32d5473b9a46cadb9'/>
<id>a8c42c7f476b5bb39bb3a5b32d5473b9a46cadb9</id>
<content type='text'>
remove_and_add_spares is called in two places where the needs really
are very different.
remove_and_add_spares should not be called on an array which is about
to be reshaped as some extra devices might have been manually added
and that would remove them.  However if the array is 'read-auto',
that will currently happen, which is bad.

So in the 'ro != 0' case don't call remove_and_add_spares but simply
remove the failed devices as the comment suggests is needed.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
remove_and_add_spares is called in two places where the needs really
are very different.
remove_and_add_spares should not be called on an array which is about
to be reshaped as some extra devices might have been manually added
and that would remove them.  However if the array is 'read-auto',
that will currently happen, which is bad.

So in the 'ro != 0' case don't call remove_and_add_spares but simply
remove the failed devices as the comment suggests is needed.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Remove the AllReserved flag for component devices.</title>
<updated>2011-01-31T01:10:09+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-01-31T01:10:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f21e9ff7f77d41ceca4e1e5ee5a4efa5ad7a5e40'/>
<id>f21e9ff7f77d41ceca4e1e5ee5a4efa5ad7a5e40</id>
<content type='text'>
This flag is not needed and is used badly.

Devices that are included in a native-metadata array are reserved
exclusively for that array - and currently have AllReserved set.
They all are bd_claimed for the rdev and so cannot be shared.

Devices that are included in external-metadata arrays can be shared
among multiple arrays - providing there is no overlap.
These are bd_claimed for md in general - not for a particular rdev.

When changing the amount of a device that is used in an array we need
to check for overlap.  This currently includes a check on AllReserved
So even without overlap, sharing with an AllReserved device is not
allowed.
However the bd_claim usage already precludes sharing with these
devices, so the test on AllReserved is not needed.  And in fact it is
wrong.

As this is the only use of AllReserved, simply remove all usage and
definition of AllReserved.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This flag is not needed and is used badly.

Devices that are included in a native-metadata array are reserved
exclusively for that array - and currently have AllReserved set.
They all are bd_claimed for the rdev and so cannot be shared.

Devices that are included in external-metadata arrays can be shared
among multiple arrays - providing there is no overlap.
These are bd_claimed for md in general - not for a particular rdev.

When changing the amount of a device that is used in an array we need
to check for overlap.  This currently includes a check on AllReserved
So even without overlap, sharing with an AllReserved device is not
allowed.
However the bd_claim usage already precludes sharing with these
devices, so the test on AllReserved is not needed.  And in fact it is
wrong.

As this is the only use of AllReserved, simply remove all usage and
definition of AllReserved.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: revert change to raid_disks on failure.</title>
<updated>2011-01-31T00:57:42+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-01-31T00:57:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=de171cb9a52598cc023adceafc6c166112401386'/>
<id>de171cb9a52598cc023adceafc6c166112401386</id>
<content type='text'>
If we try to update_raid_disks and it fails, we should put
'delta_disks' back to zero.  This is important because some code,
such as slot_store, assumes that delta_disks has been validated.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we try to update_raid_disks and it fails, we should put
'delta_disks' back to zero.  This is important because some code,
such as slot_store, assumes that delta_disks has been validated.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: restore multiple bd_link_disk_holder() support</title>
<updated>2011-01-14T17:44:22+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-01-14T17:43:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=49731baa41df404c2c3f44555869ab387363af43'/>
<id>49731baa41df404c2c3f44555869ab387363af43</id>
<content type='text'>
Commit e09b457b (block: simplify holder symlink handling) incorrectly
assumed that there is only one link at maximum.  dm may use multiple
links and expects block layer to track reference count for each link,
which is different from and unrelated to the exclusive device holder
identified by @holder when the device is opened.

Remove the single holder assumption and automatic removal of the link
and revive the per-link reference count tracking.  The code
essentially behaves the same as before commit e09b457b sans the
unnecessary kobject reference count dancing.

While at it, note that this facility should not be used by anyone else
than the current ones.  Sysfs symlinks shouldn't be abused like this
and the whole thing doesn't belong in the block layer at all.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Milan Broz &lt;mbroz@redhat.com&gt;
Cc: Jun'ichi Nomura &lt;j-nomura@ce.jp.nec.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: linux-raid@vger.kernel.org
Cc: Kay Sievers &lt;kay.sievers@vrfy.org&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit e09b457b (block: simplify holder symlink handling) incorrectly
assumed that there is only one link at maximum.  dm may use multiple
links and expects block layer to track reference count for each link,
which is different from and unrelated to the exclusive device holder
identified by @holder when the device is opened.

Remove the single holder assumption and automatic removal of the link
and revive the per-link reference count tracking.  The code
essentially behaves the same as before commit e09b457b sans the
unnecessary kobject reference count dancing.

While at it, note that this facility should not be used by anyone else
than the current ones.  Sysfs symlinks shouldn't be abused like this
and the whole thing doesn't belong in the block layer at all.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Milan Broz &lt;mbroz@redhat.com&gt;
Cc: Jun'ichi Nomura &lt;j-nomura@ce.jp.nec.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: linux-raid@vger.kernel.org
Cc: Kay Sievers &lt;kay.sievers@vrfy.org&gt;
Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
