<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/mm, branch linux-2.6.24.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>alloc_percpu() fails to allocate percpu data</title>
<updated>2008-04-19T01:53:21+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>dada1@cosmosbay.com</email>
</author>
<published>2008-03-28T18:42:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2f95fda842dd607e2d02f973b22a5aacf78bbd1b'/>
<id>2f95fda842dd607e2d02f973b22a5aacf78bbd1b</id>
<content type='text'>
upstream commit: be852795e1c8d3829ddf3cb1ce806113611fa555

Some oprofile results obtained while using tbench on a 2x2 cpu machine were
very surprising.

For example, the loopback_xmit() function was using a high number of CPU
cycles to perform the statistics updates, which are supposed to be very cheap
since they use percpu data:

        pcpu_lstats = netdev_priv(dev);
        lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
        lb_stats-&gt;packets++;  /* HERE : serious contention */
        lb_stats-&gt;bytes += skb-&gt;len;

struct pcpu_lstats is a small structure containing two longs.  It appears
that on my 32-bit platform, alloc_percpu(8) packs the per-CPU copies into a
single cache line, instead of giving each CPU a separate cache line.

Using the following patch gave me an impressive boost in various benchmarks
(6% in tbench).  All percpu_counters hit this bug too.

A long-term fix (i.e. &gt;= 2.6.26) would be to let each CPU allocate its own
block of memory, so that we don't need to round up sizes to L1_CACHE_BYTES,
or to merge the SGI stuff of course...
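
To illustrate the rounding idea, here is a minimal user-space sketch, not the
kernel patch itself.  The helper names and the 64-byte line size used in the
note below are assumptions for illustration only.  It models per-CPU copies
packed back to back and shows that rounding the object size up to the cache
line size puts each copy in its own line.

```c
/* Illustrative model only; these helpers are hypothetical, not kernel code. */

/* Round size up to the next multiple of line (line must be nonzero). */
unsigned long round_up_to_line(unsigned long size, unsigned long line)
{
    return ((size + line - 1) / line) * line;
}

/* Index of the cache line that holds byte offset off. */
unsigned long line_index(unsigned long off, unsigned long line)
{
    return off / line;
}

/*
 * Returns 1 if the copies for cpu 0 and cpu 1 would land in the same cache
 * line when objects of obj_size are packed back to back, starting at a
 * line boundary.  Returns 0 otherwise.
 */
int cpus_share_line(unsigned long obj_size, unsigned long line)
{
    return line_index(0, line) == line_index(obj_size, line);
}
```

With an assumed 64-byte line, cpus_share_line(8, 64) returns 1, while
cpus_share_line(round_up_to_line(8, 64), 64) returns 0.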

Note: SLUB vs SLAB is important here to *show* the improvement, since they
don't have the same minimum allocation sizes (8 bytes vs 32 bytes).  This
could very well explain the regressions some people reported when they
switched to SLUB.

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
upstream commit: be852795e1c8d3829ddf3cb1ce806113611fa555

Some oprofile results obtained while using tbench on a 2x2 cpu machine were
very surprising.

For example, the loopback_xmit() function was using a high number of CPU
cycles to perform the statistics updates, which are supposed to be very cheap
since they use percpu data:

        pcpu_lstats = netdev_priv(dev);
        lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
        lb_stats-&gt;packets++;  /* HERE : serious contention */
        lb_stats-&gt;bytes += skb-&gt;len;

struct pcpu_lstats is a small structure containing two longs.  It appears
that on my 32-bit platform, alloc_percpu(8) packs the per-CPU copies into a
single cache line, instead of giving each CPU a separate cache line.

Using the following patch gave me an impressive boost in various benchmarks
(6% in tbench).  All percpu_counters hit this bug too.

A long-term fix (i.e. &gt;= 2.6.26) would be to let each CPU allocate its own
block of memory, so that we don't need to round up sizes to L1_CACHE_BYTES,
or to merge the SGI stuff of course...

Note: SLUB vs SLAB is important here to *show* the improvement, since they
don't have the same minimum allocation sizes (8 bytes vs 32 bytes).  This
could very well explain the regressions some people reported when they
switched to SLUB.

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PERCPU : __percpu_alloc_mask() can dynamically size percpu_data storage</title>
<updated>2008-04-19T01:53:21+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>dada1@cosmosbay.com</email>
</author>
<published>2008-03-28T18:42:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=28680bfb8269703def997e2269caf9bfe2de489c'/>
<id>28680bfb8269703def997e2269caf9bfe2de489c</id>
<content type='text'>
upstream commit: b3242151906372f30f57feaa43b4cac96a23edb1

Instead of allocating a fixed-size array of NR_CPUS pointers for percpu_data,
we can use nr_cpu_ids, which is generally &lt; NR_CPUS.

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
upstream commit: b3242151906372f30f57feaa43b4cac96a23edb1

Instead of allocating a fixed-size array of NR_CPUS pointers for percpu_data,
we can use nr_cpu_ids, which is generally &lt; NR_CPUS.

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>slab: fix cache_cache bootstrap in kmem_cache_init()</title>
<updated>2008-04-19T01:53:20+00:00</updated>
<author>
<name>Daniel Yeisley</name>
<email>dan.yeisley@unisys.com</email>
</author>
<published>2008-03-26T21:37:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ce4039e002eab66502cd7d2cbcc6fcbdcbf828ee'/>
<id>ce4039e002eab66502cd7d2cbcc6fcbdcbf828ee</id>
<content type='text'>
upstream commit: ec1f5eeeb5a79a0d48036de649a3498da42db565

Commit 556a169dab38b5100df6f4a45b655dddd3db94c1 ("slab: fix bootstrap on
memoryless node") introduced bootstrap-time cache_cache list3s for all nodes
but forgot that initkmem_list3 needs to be accessed by [somevalue + node]. This
patch fixes list_add() corruption in mm/slab.c seen on the ES7000.
 
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Olaf Hering &lt;olaf@aepfle.de&gt;
Signed-off-by: Dan Yeisley &lt;dan.yeisley@unisys.com&gt;
Signed-off-by: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
upstream commit: ec1f5eeeb5a79a0d48036de649a3498da42db565

Commit 556a169dab38b5100df6f4a45b655dddd3db94c1 ("slab: fix bootstrap on
memoryless node") introduced bootstrap-time cache_cache list3s for all nodes
but forgot that initkmem_list3 needs to be accessed by [somevalue + node]. This
patch fixes list_add() corruption in mm/slab.c seen on the ES7000.
 
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Olaf Hering &lt;olaf@aepfle.de&gt;
Signed-off-by: Dan Yeisley &lt;dan.yeisley@unisys.com&gt;
Signed-off-by: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>slab: NUMA slab allocator migration bugfix</title>
<updated>2008-03-24T18:48:36+00:00</updated>
<author>
<name>Joe Korty</name>
<email>joe.korty@ccur.com</email>
</author>
<published>2008-03-05T23:04:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0837aca7b42f1bc9cd8dc1cd4d10aa2aaddaf84d'/>
<id>0837aca7b42f1bc9cd8dc1cd4d10aa2aaddaf84d</id>
<content type='text'>
NUMA slab allocator cpu migration bugfix

The NUMA slab allocator (specifically, cache_alloc_refill)
does not refresh its local copies of which cpu and which
numa node it is on when it drops and reacquires the irq
block that it inherited from its caller.  As a result,
those values become invalid if an attempt to migrate the
process to another numa node occurred while the irq block
had been dropped.

The solution is to make cache_alloc_refill reload these
variables whenever it drops and reacquires the irq block.

The error is very difficult to hit.  When it does occur,
one gets the following oops + stack traceback bits in
check_spinlock_acquired:

	kernel BUG at mm/slab.c:2417
	cache_alloc_refill+0xe6
	kmem_cache_alloc+0xd0
	...

This patch was developed against 2.6.23, then ported to and
compile-tested only against 2.6.25-rc4.

Signed-off-by: Joe Korty &lt;joe.korty@ccur.com&gt;
Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
NUMA slab allocator cpu migration bugfix

The NUMA slab allocator (specifically, cache_alloc_refill)
does not refresh its local copies of which cpu and which
numa node it is on when it drops and reacquires the irq
block that it inherited from its caller.  As a result,
those values become invalid if an attempt to migrate the
process to another numa node occurred while the irq block
had been dropped.

The solution is to make cache_alloc_refill reload these
variables whenever it drops and reacquires the irq block.

The error is very difficult to hit.  When it does occur,
one gets the following oops + stack traceback bits in
check_spinlock_acquired:

	kernel BUG at mm/slab.c:2417
	cache_alloc_refill+0xe6
	kmem_cache_alloc+0xd0
	...

This patch was developed against 2.6.23, then ported to and
compile-tested only against 2.6.25-rc4.

Signed-off-by: Joe Korty &lt;joe.korty@ccur.com&gt;
Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>hugetlb: ensure we do not reference a surplus page after handing it to buddy</title>
<updated>2008-03-24T18:47:19+00:00</updated>
<author>
<name>Andy Whitcroft</name>
<email>apw@shadowen.org</email>
</author>
<published>2008-02-24T02:10:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=83e8acc059f9b3075de6b4386488f09abc9a4868'/>
<id>83e8acc059f9b3075de6b4386488f09abc9a4868</id>
<content type='text'>
commit: e5df70ab194543522397fa3da8c8f80564a0f7d3

When we free a page via free_huge_page and we detect that we are in surplus
the page will be returned to the buddy.  After this we no longer own the page.

However, at the end of free_huge_page we clear out our mapping pointer from
page private.  Even where the page is not a surplus we free the page to
the hugepage pool, drop the pool locks and then clear page private.  In
either case the page may have been reallocated.  BAD.

Make sure we clear out page private before we free the page.

Signed-off-by: Andy Whitcroft &lt;apw@shadowen.org&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit: e5df70ab194543522397fa3da8c8f80564a0f7d3

When we free a page via free_huge_page and we detect that we are in surplus
the page will be returned to the buddy.  After this we no longer own the page.

However, at the end of free_huge_page we clear out our mapping pointer from
page private.  Even where the page is not a surplus we free the page to
the hugepage pool, drop the pool locks and then clear page private.  In
either case the page may have been reallocated.  BAD.

Make sure we clear out page private before we free the page.

Signed-off-by: Andy Whitcroft &lt;apw@shadowen.org&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>iov_iter_advance() fix</title>
<updated>2008-03-24T18:47:10+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2008-03-11T01:50:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b6845726368e5b7b086e6d6438c9380bf5b7bc1c'/>
<id>b6845726368e5b7b086e6d6438c9380bf5b7bc1c</id>
<content type='text'>
commit: f7009264c519603b8ec67c881bd368a56703cfc9

iov_iter_advance() skips over zero-length iovecs, however it does not properly
terminate at the end of the iovec array.  Fix this by checking against
i-&gt;count before we skip a zero-length iov.

The bug was reproduced with a test program that repeatedly passes randomly
generated iovs to writev.  The fix was verified with the same program, which
also checked that the file contained the correct data after each writev.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Tested-by: "Kevin Coffman" &lt;kwc@citi.umich.edu&gt;
Cc: "Alexey Dobriyan" &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit: f7009264c519603b8ec67c881bd368a56703cfc9

iov_iter_advance() skips over zero-length iovecs, however it does not properly
terminate at the end of the iovec array.  Fix this by checking against
i-&gt;count before we skip a zero-length iov.

The bug was reproduced with a test program that repeatedly passes randomly
generated iovs to writev.  The fix was verified with the same program, which
also checked that the file contained the correct data after each writev.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Tested-by: "Kevin Coffman" &lt;kwc@citi.umich.edu&gt;
Cc: "Alexey Dobriyan" &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>SLUB: Deal with annoying gcc warning on kfree()</title>
<updated>2008-02-26T00:18:57+00:00</updated>
<author>
<name>Christoph Lameter</name>
<email>clameter@sgi.com</email>
</author>
<published>2008-02-08T01:47:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=227db665f6f946d376d48785b08d2b0cd1f21aad'/>
<id>227db665f6f946d376d48785b08d2b0cd1f21aad</id>
<content type='text'>
patch 5bb983b0cce9b7b281af15730f7019116dd42568 in mainline.

gcc 4.2 spits out an annoying warning if one casts a const void *
pointer to a void * pointer. No warning is generated if the
conversion is done through an assignment.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
patch 5bb983b0cce9b7b281af15730f7019116dd42568 in mainline.

gcc 4.2 spits out an annoying warning if one casts a const void *
pointer to a void * pointer. No warning is generated if the
conversion is done through an assignment.

Signed-off-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Be more robust about bad arguments in get_user_pages()</title>
<updated>2008-02-26T00:18:44+00:00</updated>
<author>
<name>Jonathan Corbet</name>
<email>corbet@lwn.net</email>
</author>
<published>2008-02-11T23:17:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=58e6cf1df821c76f245a45da05f4ac8f880e3296'/>
<id>58e6cf1df821c76f245a45da05f4ac8f880e3296</id>
<content type='text'>
patch 900cf086fd2fbad07f72f4575449e0d0958f860f in mainline.

So I spent a while pounding my head against my monitor trying to figure
out the vmsplice() vulnerability - how could a failure to check for
*read* access turn into a root exploit? It turns out that it's a buffer
overflow problem which is made easy by the way get_user_pages() is
coded.

In particular, "len" is a signed int, and it is only checked at the
*end* of a do {} while() loop.  So, if it is passed in as zero, the loop
will execute once and decrement len to -1.  At that point, the loop will
proceed until the next invalid address is found; in the process, it will
likely overflow the pages array passed in to get_user_pages().
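
The loop behaviour described above can be modelled in a few lines of
user-space C.  This is only a sketch: the function names and the mapped_pages
parameter are hypothetical and stand in for the real page walk, and the guard
covers only the zero-page case discussed here.

```c
/* Illustrative model only; these functions are hypothetical, not kernel code. */

/*
 * Mimics a loop that tests len only at the bottom.  mapped_pages stands in
 * for the number of valid pages before the first invalid address.
 * Returns how many entries of the pages array would be written.
 */
int grabbed_without_guard(int len, int mapped_pages)
{
    int grabbed = 0;

    do {
        if (grabbed == mapped_pages)
            break;      /* reached an invalid address: stop */
        grabbed++;
        len--;
    } while (len);      /* a len of 0 on entry is -1 here, so the loop goes on */

    return grabbed;
}

/* Same loop, but a request for zero pages returns before the loop runs. */
int grabbed_with_guard(int len, int mapped_pages)
{
    if (len == 0)
        return 0;
    return grabbed_without_guard(len, mapped_pages);
}
```

In this model, grabbed_without_guard(0, 100) returns 100, far more than the
zero pages requested, while grabbed_with_guard(0, 100) returns 0.  A normal
request such as a len of 3 returns 3 from both functions.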

I think that, if get_user_pages() has been asked to grab zero pages,
that's what it should do.  Thus this patch; it is, among other things,
enough to block the (already fixed) root exploit and any others which
might be lurking in similar code.  I also think that the number of pages
should be unsigned, but changing the prototype of this function probably
requires some more careful review.

Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
patch 900cf086fd2fbad07f72f4575449e0d0958f860f in mainline.

So I spent a while pounding my head against my monitor trying to figure
out the vmsplice() vulnerability - how could a failure to check for
*read* access turn into a root exploit? It turns out that it's a buffer
overflow problem which is made easy by the way get_user_pages() is
coded.

In particular, "len" is a signed int, and it is only checked at the
*end* of a do {} while() loop.  So, if it is passed in as zero, the loop
will execute once and decrement len to -1.  At that point, the loop will
proceed until the next invalid address is found; in the process, it will
likely overflow the pages array passed in to get_user_pages().

I think that, if get_user_pages() has been asked to grab zero pages,
that's what it should do.  Thus this patch; it is, among other things,
enough to block the (already fixed) root exploit and any others which
might be lurking in similar code.  I also think that the number of pages
should be unsigned, but changing the prototype of this function probably
requires some more careful review.

Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>hugetlb: add locking for overcommit sysctl</title>
<updated>2008-02-26T00:18:32+00:00</updated>
<author>
<name>Nishanth Aravamudan</name>
<email>nacc@us.ibm.com</email>
</author>
<published>2008-02-08T12:18:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=091a61f602b7db7f4d1fdcb41e6ff9a97a6e0cce'/>
<id>091a61f602b7db7f4d1fdcb41e6ff9a97a6e0cce</id>
<content type='text'>
patch a3d0c6aa1bb342b9b2c7b123b52ac2f48a4d4d0a in mainline.

When I replaced hugetlb_dynamic_pool with nr_overcommit_hugepages I used
proc_doulongvec_minmax() directly.  However, hugetlb.c's locking rules
require that all counter modifications occur under the hugetlb_lock.  Add a
callback into the hugetlb code similar to the one for nr_hugepages.  Grab
the lock around the manipulation of nr_overcommit_hugepages in
proc_doulongvec_minmax().

Signed-off-by: Nishanth Aravamudan &lt;nacc@us.ibm.com&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: William Lee Irwin III &lt;wli@holomorphy.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;


</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
patch a3d0c6aa1bb342b9b2c7b123b52ac2f48a4d4d0a in mainline.

When I replaced hugetlb_dynamic_pool with nr_overcommit_hugepages I used
proc_doulongvec_minmax() directly.  However, hugetlb.c's locking rules
require that all counter modifications occur under the hugetlb_lock.  Add a
callback into the hugetlb code similar to the one for nr_hugepages.  Grab
the lock around the manipulation of nr_overcommit_hugepages in
proc_doulongvec_minmax().

Signed-off-by: Nishanth Aravamudan &lt;nacc@us.ibm.com&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: William Lee Irwin III &lt;wli@holomorphy.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>fix writev regression: pan hanging unkillable and un-straceable</title>
<updated>2008-02-08T19:46:29+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>nickpiggin@yahoo.com.au</email>
</author>
<published>2008-02-02T14:01:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=145eb46ca9f10a16790a59a327bcb59362bf40bc'/>
<id>145eb46ca9f10a16790a59a327bcb59362bf40bc</id>
<content type='text'>
patch 124d3b7041f9a0ca7c43a6293e1cae4576c32fd5 in mainline.

Frederik Himpe reported an unkillable and un-straceable pan process.

Zero length iovecs can go into an infinite loop in writev, because the 
iovec iterator does not always advance over them.

The sequence required to trigger this is not trivial. I think it 
requires that a zero-length iovec be followed by a non-zero-length iovec 
which causes a pagefault in the atomic usercopy. This causes the writev 
code to drop back into single-segment copy mode, which then tries to 
copy the 0 bytes of the zero-length iovec; a zero length copy looks like 
a failure though, so it loops.

Put a test into iov_iter_advance to catch zero-length iovecs. We could 
just put the test in the fallback path, but I feel it is more robust to 
skip over zero-length iovecs throughout the code (iovec iterator may be 
used in filesystems too, so it should be robust).

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
patch 124d3b7041f9a0ca7c43a6293e1cae4576c32fd5 in mainline.

Frederik Himpe reported an unkillable and un-straceable pan process.

Zero length iovecs can go into an infinite loop in writev, because the 
iovec iterator does not always advance over them.

The sequence required to trigger this is not trivial. I think it 
requires that a zero-length iovec be followed by a non-zero-length iovec 
which causes a pagefault in the atomic usercopy. This causes the writev 
code to drop back into single-segment copy mode, which then tries to 
copy the 0 bytes of the zero-length iovec; a zero length copy looks like 
a failure though, so it loops.

Put a test into iov_iter_advance to catch zero-length iovecs. We could 
just put the test in the fallback path, but I feel it is more robust to 
skip over zero-length iovecs throughout the code (iovec iterator may be 
used in filesystems too, so it should be robust).

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</pre>
</div>
</content>
</entry>
</feed>
