<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/block, branch v3.6-rc4</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>block: Don't use static to define "void *p" in show_partition_start()</title>
<updated>2012-08-03T08:42:00+00:00</updated>
<author>
<name>Jianpeng Ma</name>
<email>majianpeng@gmail.com</email>
</author>
<published>2012-08-03T08:42:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0676806707281e27b13d44323bed580a8160b7a4'/>
<id>0676806707281e27b13d44323bed580a8160b7a4</id>
<content type='text'>
I met a odd prblem:read /proc/partitions may return zero.

I wrote a file test.c:
int main()
{
	char buff[4096];
	int ret;
	int fd;
	printf("pid=%d\n",getpid());
	while (1) {
		fd = open("/proc/partitions", O_RDONLY);
		if (fd &lt; 0) {
			printf("open error %s\n", strerror(errno));
			return 0;
		}
		ret = read(fd, buff, 4096);
		if (ret &lt;= 0)
			printf("ret=%d, %s, %ld\n", ret,
				strerror(errno), lseek(fd,0,SEEK_CUR));
		close(fd);
	}
	exit(0);
}

You can reproduce by:
1:while true;do cat /proc/partitions &gt; /dev/null ;done
2:./test

I reviewed the code and found:

&gt;&gt; static void *show_partition_start(struct seq_file *seqf, loff_t *pos)
&gt;&gt; {
&gt;&gt; 	static void *p;
&gt;&gt;
&gt;&gt; 	p = disk_seqf_start(seqf, pos);
&gt;&gt; 	if (!IS_ERR_OR_NULL(p) &amp;&amp; !*pos)
&gt;&gt; 		seq_puts(seqf, "major minor  #blocks  name\n\n");
&gt;&gt; 	return p;
&gt;&gt; }
		test								cat /proc/partitions
	p = disk_seqf_start()(Not NULL)
									p = disk_seqf_start()(NULL because pos)
	if (!IS_ERR_OR_NULL(p) &amp;&amp; !*pos)

Signed-off-by: Jianpeng Ma &lt;majianpeng@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I met a odd prblem:read /proc/partitions may return zero.

I wrote a file test.c:
int main()
{
	char buff[4096];
	int ret;
	int fd;
	printf("pid=%d\n",getpid());
	while (1) {
		fd = open("/proc/partitions", O_RDONLY);
		if (fd &lt; 0) {
			printf("open error %s\n", strerror(errno));
			return 0;
		}
		ret = read(fd, buff, 4096);
		if (ret &lt;= 0)
			printf("ret=%d, %s, %ld\n", ret,
				strerror(errno), lseek(fd,0,SEEK_CUR));
		close(fd);
	}
	exit(0);
}

You can reproduce by:
1:while true;do cat /proc/partitions &gt; /dev/null ;done
2:./test

I reviewed the code and found:

&gt;&gt; static void *show_partition_start(struct seq_file *seqf, loff_t *pos)
&gt;&gt; {
&gt;&gt; 	static void *p;
&gt;&gt;
&gt;&gt; 	p = disk_seqf_start(seqf, pos);
&gt;&gt; 	if (!IS_ERR_OR_NULL(p) &amp;&amp; !*pos)
&gt;&gt; 		seq_puts(seqf, "major minor  #blocks  name\n\n");
&gt;&gt; 	return p;
&gt;&gt; }
		test								cat /proc/partitions
	p = disk_seqf_start()(Not NULL)
									p = disk_seqf_start()(NULL because pos)
	if (!IS_ERR_OR_NULL(p) &amp;&amp; !*pos)

Signed-off-by: Jianpeng Ma &lt;majianpeng@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: Add blk_bio_map_sg() helper</title>
<updated>2012-08-02T21:42:04+00:00</updated>
<author>
<name>Asias He</name>
<email>asias@redhat.com</email>
</author>
<published>2012-08-02T21:42:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=85b9f66a41eb8ee3f1dfc95707412705463cdd97'/>
<id>85b9f66a41eb8ee3f1dfc95707412705463cdd97</id>
<content type='text'>
Add a helper to map a bio to a scatterlist, modelled after
blk_rq_map_sg.

This helper is useful for any driver that wants to create
a scatterlist from its -&gt;make_request_fn method.

Changes in v2:
 - Use __blk_segment_map_sg to avoid duplicated code
 - Add cocbook style function comment

Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Shaohua Li &lt;shli@kernel.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a helper to map a bio to a scatterlist, modelled after
blk_rq_map_sg.

This helper is useful for any driver that wants to create
a scatterlist from its -&gt;make_request_fn method.

Changes in v2:
 - Use __blk_segment_map_sg to avoid duplicated code
 - Add cocbook style function comment

Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Shaohua Li &lt;shli@kernel.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: Introduce __blk_segment_map_sg() helper</title>
<updated>2012-08-02T21:42:03+00:00</updated>
<author>
<name>Asias He</name>
<email>asias@redhat.com</email>
</author>
<published>2012-08-02T21:42:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=963ab9e5da95c654bb3ab937cc478de4f7088a96'/>
<id>963ab9e5da95c654bb3ab937cc478de4f7088a96</id>
<content type='text'>
Split the mapping code in blk_rq_map_sg() to a helper
__blk_segment_map_sg(), so that other mapping function, e.g.
blk_bio_map_sg(), can share the code.

Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Shaohua Li &lt;shli@kernel.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Suggested-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Suggested-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Split the mapping code in blk_rq_map_sg() to a helper
__blk_segment_map_sg(), so that other mapping function, e.g.
blk_bio_map_sg(), can share the code.

Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Shaohua Li &lt;shli@kernel.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Suggested-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Suggested-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: split discard into aligned requests</title>
<updated>2012-08-02T07:48:50+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2012-08-02T07:48:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c6e666345e1b79c62ba82339cc7d55a89cb73f88'/>
<id>c6e666345e1b79c62ba82339cc7d55a89cb73f88</id>
<content type='text'>
When a disk has large discard_granularity and small max_discard_sectors,
discards are not split with optimal alignment.  In the limit case of
discard_granularity == max_discard_sectors, no request could be aligned
correctly, so in fact you might end up with no discarded logical blocks
at all.

Another example that helps showing the condition in the patch is with
discard_granularity == 64, max_discard_sectors == 128.  A request that is
submitted for 256 sectors 2..257 will be split in two: 2..129, 130..257.
However, only 2 aligned blocks out of 3 are included in the request;
128..191 may be left intact and not discarded.  With this patch, the
first request will be truncated to ensure good alignment of what's left,
and the split will be 2..127, 128..255, 256..257.  The patch will also
take into account the discard_alignment.

At most one extra request will be introduced, because the first request
will be reduced by at most granularity-1 sectors, and granularity
must be less than max_discard_sectors.  Subsequent requests will run
on round_down(max_discard_sectors, granularity) sectors, as in the
current code.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Tested-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a disk has large discard_granularity and small max_discard_sectors,
discards are not split with optimal alignment.  In the limit case of
discard_granularity == max_discard_sectors, no request could be aligned
correctly, so in fact you might end up with no discarded logical blocks
at all.

Another example that helps showing the condition in the patch is with
discard_granularity == 64, max_discard_sectors == 128.  A request that is
submitted for 256 sectors 2..257 will be split in two: 2..129, 130..257.
However, only 2 aligned blocks out of 3 are included in the request;
128..191 may be left intact and not discarded.  With this patch, the
first request will be truncated to ensure good alignment of what's left,
and the split will be 2..127, 128..255, 256..257.  The patch will also
take into account the discard_alignment.

At most one extra request will be introduced, because the first request
will be reduced by at most granularity-1 sectors, and granularity
must be less than max_discard_sectors.  Subsequent requests will run
on round_down(max_discard_sectors, granularity) sectors, as in the
current code.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Tested-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: reorganize rounding of max_discard_sectors</title>
<updated>2012-08-02T07:48:49+00:00</updated>
<author>
<name>Paolo Bonzini</name>
<email>pbonzini@redhat.com</email>
</author>
<published>2012-08-02T07:48:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f6ff53d3611b564661896be23369b54d84941a0e'/>
<id>f6ff53d3611b564661896be23369b54d84941a0e</id>
<content type='text'>
Mostly a preparation for the next patch.

In principle this fixes an infinite loop if max_discard_sectors &lt; granularity,
but that really shouldn't happen.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Tested-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Mostly a preparation for the next patch.

In principle this fixes an infinite loop if max_discard_sectors &lt; granularity,
but that really shouldn't happen.

Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Tested-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-3.6/drivers' of git://git.kernel.dk/linux-block</title>
<updated>2012-08-01T16:06:47+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-08-01T16:06:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eff0d13f3823f35d70228cd151d2a2c89288ff32'/>
<id>eff0d13f3823f35d70228cd151d2a2c89288ff32</id>
<content type='text'>
Pull block driver changes from Jens Axboe:

 - Making the plugging support for drivers a bit more sane from Neil.
   This supersedes the plugging change from Shaohua as well.

 - The usual round of drbd updates.

 - Using a tail add instead of a head add in the request completion for
   ndb, making us find the most completed request more quickly.

 - A few floppy changes, getting rid of a duplicated flag and also
   running the floppy init async (since it takes forever in boot terms)
   from Andi.

* 'for-3.6/drivers' of git://git.kernel.dk/linux-block:
  floppy: remove duplicated flag FD_RAW_NEED_DISK
  blk: pass from_schedule to non-request unplug functions.
  block: stack unplug
  blk: centralize non-request unplug handling.
  md: remove plug_cnt feature of plugging.
  block/nbd: micro-optimization in nbd request completion
  drbd: announce FLUSH/FUA capability to upper layers
  drbd: fix max_bio_size to be unsigned
  drbd: flush drbd work queue before invalidate/invalidate remote
  drbd: fix potential access after free
  drbd: call local-io-error handler early
  drbd: do not reset rs_pending_cnt too early
  drbd: reset congestion information before reporting it in /proc/drbd
  drbd: report congestion if we are waiting for some userland callback
  drbd: differentiate between normal and forced detach
  drbd: cleanup, remove two unused global flags
  floppy: Run floppy initialization asynchronous
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull block driver changes from Jens Axboe:

 - Making the plugging support for drivers a bit more sane from Neil.
   This supersedes the plugging change from Shaohua as well.

 - The usual round of drbd updates.

 - Using a tail add instead of a head add in the request completion for
   ndb, making us find the most completed request more quickly.

 - A few floppy changes, getting rid of a duplicated flag and also
   running the floppy init async (since it takes forever in boot terms)
   from Andi.

* 'for-3.6/drivers' of git://git.kernel.dk/linux-block:
  floppy: remove duplicated flag FD_RAW_NEED_DISK
  blk: pass from_schedule to non-request unplug functions.
  block: stack unplug
  blk: centralize non-request unplug handling.
  md: remove plug_cnt feature of plugging.
  block/nbd: micro-optimization in nbd request completion
  drbd: announce FLUSH/FUA capability to upper layers
  drbd: fix max_bio_size to be unsigned
  drbd: flush drbd work queue before invalidate/invalidate remote
  drbd: fix potential access after free
  drbd: call local-io-error handler early
  drbd: do not reset rs_pending_cnt too early
  drbd: reset congestion information before reporting it in /proc/drbd
  drbd: report congestion if we are waiting for some userland callback
  drbd: differentiate between normal and forced detach
  drbd: cleanup, remove two unused global flags
  floppy: Run floppy initialization asynchronous
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-3.6/core' of git://git.kernel.dk/linux-block</title>
<updated>2012-08-01T16:02:41+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-08-01T16:02:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8cf1a3fce0b95050b63d451c9d561da0da2aa4d6'/>
<id>8cf1a3fce0b95050b63d451c9d561da0da2aa4d6</id>
<content type='text'>
Pull core block IO bits from Jens Axboe:
 "The most complicated part if this is the request allocation rework by
  Tejun, which has been queued up for a long time and has been in
  for-next ditto as well.

  There are a few commits from yesterday and today, mostly trivial and
  obvious fixes.  So I'm pretty confident that it is sound.  It's also
  smaller than usual."

* 'for-3.6/core' of git://git.kernel.dk/linux-block:
  block: remove dead func declaration
  block: add partition resize function to blkpg ioctl
  block: uninitialized ioc-&gt;nr_tasks triggers WARN_ON
  block: do not artificially constrain max_sectors for stacking drivers
  blkcg: implement per-blkg request allocation
  block: prepare for multiple request_lists
  block: add q-&gt;nr_rqs[] and move q-&gt;rq.elvpriv to q-&gt;nr_rqs_elvpriv
  blkcg: inline bio_blkcg() and friends
  block: allocate io_context upfront
  block: refactor get_request[_wait]()
  block: drop custom queue draining used by scsi_transport_{iscsi|fc}
  mempool: add @gfp_mask to mempool_create_node()
  blkcg: make root blkcg allocation use %GFP_KERNEL
  blkcg: __blkg_lookup_create() doesn't need radix preload
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull core block IO bits from Jens Axboe:
 "The most complicated part if this is the request allocation rework by
  Tejun, which has been queued up for a long time and has been in
  for-next ditto as well.

  There are a few commits from yesterday and today, mostly trivial and
  obvious fixes.  So I'm pretty confident that it is sound.  It's also
  smaller than usual."

* 'for-3.6/core' of git://git.kernel.dk/linux-block:
  block: remove dead func declaration
  block: add partition resize function to blkpg ioctl
  block: uninitialized ioc-&gt;nr_tasks triggers WARN_ON
  block: do not artificially constrain max_sectors for stacking drivers
  blkcg: implement per-blkg request allocation
  block: prepare for multiple request_lists
  block: add q-&gt;nr_rqs[] and move q-&gt;rq.elvpriv to q-&gt;nr_rqs_elvpriv
  blkcg: inline bio_blkcg() and friends
  block: allocate io_context upfront
  block: refactor get_request[_wait]()
  block: drop custom queue draining used by scsi_transport_{iscsi|fc}
  mempool: add @gfp_mask to mempool_create_node()
  blkcg: make root blkcg allocation use %GFP_KERNEL
  blkcg: __blkg_lookup_create() doesn't need radix preload
</pre>
</div>
</content>
</entry>
<entry>
<title>block: remove dead func declaration</title>
<updated>2012-08-01T10:25:54+00:00</updated>
<author>
<name>Yuanhan Liu</name>
<email>yuanhan.liu@linux.intel.com</email>
</author>
<published>2012-08-01T10:25:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=80799fbb7d10c30df78015b3fa21f7ffcfc0eb2c'/>
<id>80799fbb7d10c30df78015b3fa21f7ffcfc0eb2c</id>
<content type='text'>
__generic_unplug_device() function is removed with commit
7eaceaccab5f40bbfda044629a6298616aeaed50, which forgot to
remove the declaration at meantime. Here remove it.

Signed-off-by: Yuanhan Liu &lt;yuanhan.liu@linux.intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
__generic_unplug_device() function is removed with commit
7eaceaccab5f40bbfda044629a6298616aeaed50, which forgot to
remove the declaration at meantime. Here remove it.

Signed-off-by: Yuanhan Liu &lt;yuanhan.liu@linux.intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: add partition resize function to blkpg ioctl</title>
<updated>2012-08-01T10:24:18+00:00</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2012-08-01T10:24:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c83f6bf98dc1f1a194118b3830706cebbebda8c4'/>
<id>c83f6bf98dc1f1a194118b3830706cebbebda8c4</id>
<content type='text'>
Add a new operation code (BLKPG_RESIZE_PARTITION) to the BLKPG ioctl that
allows altering the size of an existing partition, even if it is currently
in use.

This patch converts hd_struct-&gt;nr_sects into sequence counter because
One might extend a partition while IO is happening to it and update of
nr_sects can be non-atomic on 32bit machines with 64bit sector_t. This
can lead to issues like reading inconsistent size of a partition. Sequence
counter have been used so that readers don't have to take bdev mutex lock
as we call sector_in_part() very frequently.

Now all the access to hd_struct-&gt;nr_sects should happen using sequence
counter read/update helper functions part_nr_sects_read/part_nr_sects_write.
There is one exception though, set_capacity()/get_capacity(). I think
theoritically race should exist there too but this patch does not
modify set_capacity()/get_capacity() due to sheer number of call sites
and I am afraid that change might break something. I have left that as a
TODO item. We can handle it later if need be. This patch does not introduce
any new races as such w.r.t set_capacity()/get_capacity().

v2: Add CONFIG_LBDAF test to UP preempt case as suggested by Phillip.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Phillip Susi &lt;psusi@ubuntu.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new operation code (BLKPG_RESIZE_PARTITION) to the BLKPG ioctl that
allows altering the size of an existing partition, even if it is currently
in use.

This patch converts hd_struct-&gt;nr_sects into sequence counter because
One might extend a partition while IO is happening to it and update of
nr_sects can be non-atomic on 32bit machines with 64bit sector_t. This
can lead to issues like reading inconsistent size of a partition. Sequence
counter have been used so that readers don't have to take bdev mutex lock
as we call sector_in_part() very frequently.

Now all the access to hd_struct-&gt;nr_sects should happen using sequence
counter read/update helper functions part_nr_sects_read/part_nr_sects_write.
There is one exception though, set_capacity()/get_capacity(). I think
theoritically race should exist there too but this patch does not
modify set_capacity()/get_capacity() due to sheer number of call sites
and I am afraid that change might break something. I have left that as a
TODO item. We can handle it later if need be. This patch does not introduce
any new races as such w.r.t set_capacity()/get_capacity().

v2: Add CONFIG_LBDAF test to UP preempt case as suggested by Phillip.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Phillip Susi &lt;psusi@ubuntu.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: uninitialized ioc-&gt;nr_tasks triggers WARN_ON</title>
<updated>2012-08-01T10:17:27+00:00</updated>
<author>
<name>Olof Johansson</name>
<email>olof@lixom.net</email>
</author>
<published>2012-08-01T10:17:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4638a83e8615de9c16c39dfed234951d0f468cf1'/>
<id>4638a83e8615de9c16c39dfed234951d0f468cf1</id>
<content type='text'>
Hi,

I'm using the old-fashioned 'dump' backup tool, and I noticed that it spews the
below warning as of 3.5-rc1 and later (3.4 is fine):

[   10.886893] ------------[ cut here ]------------
[   10.886904] WARNING: at include/linux/iocontext.h:140 copy_process+0x1488/0x1560()
[   10.886905] Hardware name: Bochs
[   10.886906] Modules linked in:
[   10.886908] Pid: 2430, comm: dump Not tainted 3.5.0-rc7+ #27
[   10.886908] Call Trace:
[   10.886911]  [&lt;ffffffff8107ce8a&gt;] warn_slowpath_common+0x7a/0xb0
[   10.886912]  [&lt;ffffffff8107ced5&gt;] warn_slowpath_null+0x15/0x20
[   10.886913]  [&lt;ffffffff8107c088&gt;] copy_process+0x1488/0x1560
[   10.886914]  [&lt;ffffffff8107c244&gt;] do_fork+0xb4/0x340
[   10.886918]  [&lt;ffffffff8108effa&gt;] ? recalc_sigpending+0x1a/0x50
[   10.886919]  [&lt;ffffffff8108f6b2&gt;] ? __set_task_blocked+0x32/0x80
[   10.886920]  [&lt;ffffffff81091afa&gt;] ? __set_current_blocked+0x3a/0x60
[   10.886923]  [&lt;ffffffff81051db3&gt;] sys_clone+0x23/0x30
[   10.886925]  [&lt;ffffffff8179bd73&gt;] stub_clone+0x13/0x20
[   10.886927]  [&lt;ffffffff8179baa2&gt;] ? system_call_fastpath+0x16/0x1b
[   10.886928] ---[ end trace 32a14af7ee6a590b ]---

Reproducing is easy, I can hit it on a KVM system with a very basic
config (x86_64 make defconfig + enable the drivers needed). To hit it,
just install dump (on debian/ubuntu, not sure what the package might be
called on Fedora), and:

dump -o -f /tmp/foo /

You'll see the warning in dmesg once it forks off the I/O process and
starts dumping filesystem contents.

I bisected it down to the following commit:

commit f6e8d01bee036460e03bd4f6a79d014f98ba712e
Author: Tejun Heo &lt;tj@kernel.org&gt;
Date:   Mon Mar 5 13:15:26 2012 -0800

    block: add io_context-&gt;active_ref

    Currently ioc-&gt;nr_tasks is used to decide two things - whether an ioc
    is done issuing IOs and whether it's shared by multiple tasks.  This
    patch separate out the first into ioc-&gt;active_ref, which is acquired
    and released using {get|put}_io_context_active() respectively.

    This will be used to associate bio's with a given task.  This patch
    doesn't introduce any visible behavior change.

    Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
    Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
    Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;

It seems like the init of ioc-&gt;nr_tasks was removed in that patch,
so it starts out at 0 instead of 1.

Tejun, is the right thing here to add back the init, or should something else
be done?

The below patch removes the warning, but I haven't done any more extensive
testing on it.

Signed-off-by: Olof Johansson &lt;olof@lixom.net&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Hi,

I'm using the old-fashioned 'dump' backup tool, and I noticed that it spews the
below warning as of 3.5-rc1 and later (3.4 is fine):

[   10.886893] ------------[ cut here ]------------
[   10.886904] WARNING: at include/linux/iocontext.h:140 copy_process+0x1488/0x1560()
[   10.886905] Hardware name: Bochs
[   10.886906] Modules linked in:
[   10.886908] Pid: 2430, comm: dump Not tainted 3.5.0-rc7+ #27
[   10.886908] Call Trace:
[   10.886911]  [&lt;ffffffff8107ce8a&gt;] warn_slowpath_common+0x7a/0xb0
[   10.886912]  [&lt;ffffffff8107ced5&gt;] warn_slowpath_null+0x15/0x20
[   10.886913]  [&lt;ffffffff8107c088&gt;] copy_process+0x1488/0x1560
[   10.886914]  [&lt;ffffffff8107c244&gt;] do_fork+0xb4/0x340
[   10.886918]  [&lt;ffffffff8108effa&gt;] ? recalc_sigpending+0x1a/0x50
[   10.886919]  [&lt;ffffffff8108f6b2&gt;] ? __set_task_blocked+0x32/0x80
[   10.886920]  [&lt;ffffffff81091afa&gt;] ? __set_current_blocked+0x3a/0x60
[   10.886923]  [&lt;ffffffff81051db3&gt;] sys_clone+0x23/0x30
[   10.886925]  [&lt;ffffffff8179bd73&gt;] stub_clone+0x13/0x20
[   10.886927]  [&lt;ffffffff8179baa2&gt;] ? system_call_fastpath+0x16/0x1b
[   10.886928] ---[ end trace 32a14af7ee6a590b ]---

Reproducing is easy, I can hit it on a KVM system with a very basic
config (x86_64 make defconfig + enable the drivers needed). To hit it,
just install dump (on debian/ubuntu, not sure what the package might be
called on Fedora), and:

dump -o -f /tmp/foo /

You'll see the warning in dmesg once it forks off the I/O process and
starts dumping filesystem contents.

I bisected it down to the following commit:

commit f6e8d01bee036460e03bd4f6a79d014f98ba712e
Author: Tejun Heo &lt;tj@kernel.org&gt;
Date:   Mon Mar 5 13:15:26 2012 -0800

    block: add io_context-&gt;active_ref

    Currently ioc-&gt;nr_tasks is used to decide two things - whether an ioc
    is done issuing IOs and whether it's shared by multiple tasks.  This
    patch separate out the first into ioc-&gt;active_ref, which is acquired
    and released using {get|put}_io_context_active() respectively.

    This will be used to associate bio's with a given task.  This patch
    doesn't introduce any visible behavior change.

    Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
    Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
    Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;

It seems like the init of ioc-&gt;nr_tasks was removed in that patch,
so it starts out at 0 instead of 1.

Tejun, is the right thing here to add back the init, or should something else
be done?

The below patch removes the warning, but I haven't done any more extensive
testing on it.

Signed-off-by: Olof Johansson &lt;olof@lixom.net&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: stable@kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
