<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/ipv4/ipvs/ip_vs_ctl.c, branch v2.6.23</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>[NETFILTER]: Fix/improve deadlock condition on module removal netfilter</title>
<updated>2007-09-11T09:28:26+00:00</updated>
<author>
<name>Neil Horman</name>
<email>nhorman@tuxdriver.com</email>
</author>
<published>2007-09-11T09:28:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=16fcec35e7d7c4faaa4709f6434a4a25b06d25e3'/>
<id>16fcec35e7d7c4faaa4709f6434a4a25b06d25e3</id>
<content type='text'>
So I've had a deadlock reported to me.  I've found that the sequence of
events goes like this:

1) process A (modprobe) runs to remove ip_tables.ko

2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket,
increasing the ip_tables socket_ops use count

3) process A acquires a file lock on the file ip_tables.ko, calls remove_module
in the kernel, which in turn executes the ip_tables module cleanup routine,
which calls nf_unregister_sockopt

4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the
calling process into uninterruptible sleep, expecting the process using the
socket option code to wake it up when it exits the kernel

4) the user of the socket option code (process B) in do_ipt_get_ctl, calls
ipt_find_table_lock, which in this case calls request_module to load
ip_tables_nat.ko

5) request_module forks a copy of modprobe (process C) to load the module and
blocks until modprobe exits.

6) Process C. forked by request_module process the dependencies of
ip_tables_nat.ko, of which ip_tables.ko is one.

7) Process C attempts to lock the request module and all its dependencies, it
blocks when it attempts to lock ip_tables.ko (which was previously locked in
step 3)

Theres not really any great permanent solution to this that I can see, but I've
developed a two part solution that corrects the problem

Part 1) Modifies the nf_sockopt registration code so that, instead of using a
use counter internal to the nf_sockopt_ops structure, we instead use a pointer
to the registering modules owner to do module reference counting when nf_sockopt
calls a modules set/get routine.  This prevents the deadlock by preventing set 4
from happening.

Part 2) Enhances the modprobe utilty so that by default it preforms non-blocking
remove operations (the same way rmmod does), and add an option to explicity
request blocking operation.  So if you select blocking operation in modprobe you
can still cause the above deadlock, but only if you explicity try (and since
root can do any old stupid thing it would like....  :)  ).

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
So I've had a deadlock reported to me.  I've found that the sequence of
events goes like this:

1) process A (modprobe) runs to remove ip_tables.ko

2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket,
increasing the ip_tables socket_ops use count

3) process A acquires a file lock on the file ip_tables.ko, calls remove_module
in the kernel, which in turn executes the ip_tables module cleanup routine,
which calls nf_unregister_sockopt

4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the
calling process into uninterruptible sleep, expecting the process using the
socket option code to wake it up when it exits the kernel

4) the user of the socket option code (process B) in do_ipt_get_ctl, calls
ipt_find_table_lock, which in this case calls request_module to load
ip_tables_nat.ko

5) request_module forks a copy of modprobe (process C) to load the module and
blocks until modprobe exits.

6) Process C. forked by request_module process the dependencies of
ip_tables_nat.ko, of which ip_tables.ko is one.

7) Process C attempts to lock the request module and all its dependencies, it
blocks when it attempts to lock ip_tables.ko (which was previously locked in
step 3)

Theres not really any great permanent solution to this that I can see, but I've
developed a two part solution that corrects the problem

Part 1) Modifies the nf_sockopt registration code so that, instead of using a
use counter internal to the nf_sockopt_ops structure, we instead use a pointer
to the registering modules owner to do module reference counting when nf_sockopt
calls a modules set/get routine.  This prevents the deadlock by preventing set 4
from happening.

Part 2) Enhances the modprobe utilty so that by default it preforms non-blocking
remove operations (the same way rmmod does), and add an option to explicity
request blocking operation.  So if you select blocking operation in modprobe you
can still cause the above deadlock, but only if you explicity try (and since
root can do any old stupid thing it would like....  :)  ).

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[IPVS]: Use IP_VS_WAIT_WHILE when encessary.</title>
<updated>2007-08-14T05:52:15+00:00</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2007-08-10T22:50:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cae7ca3d3de48851e929de9469397749638df779'/>
<id>cae7ca3d3de48851e929de9469397749638df779</id>
<content type='text'>
For architectures that don't have a volatile atomic_ts constructs like
while (atomic_read(&amp;something)); might result in endless loops since a
barrier() is missing which forces the compiler to generate code that
actually reads memory contents.
Fix this in ipvs by using the IP_VS_WAIT_WHILE macro which resolves to
while (expr) { cpu_relax(); }
(why isn't this open coded btw?)

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For architectures that don't have a volatile atomic_ts constructs like
while (atomic_read(&amp;something)); might result in endless loops since a
barrier() is missing which forces the compiler to generate code that
actually reads memory contents.
Fix this in ipvs by using the IP_VS_WAIT_WHILE macro which resolves to
while (expr) { cpu_relax(); }
(why isn't this open coded btw?)

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[IPV4]: Clean up duplicate includes in net/ipv4/</title>
<updated>2007-08-14T05:52:02+00:00</updated>
<author>
<name>Jesper Juhl</name>
<email>jesper.juhl@gmail.com</email>
</author>
<published>2007-08-10T22:17:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f49f9967b263cc88b48d912172afdc621bcb0a3c'/>
<id>f49f9967b263cc88b48d912172afdc621bcb0a3c</id>
<content type='text'>
This patch cleans up duplicate includes in
	net/ipv4/

Signed-off-by: Jesper Juhl &lt;jesper.juhl@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch cleans up duplicate includes in
	net/ipv4/

Signed-off-by: Jesper Juhl &lt;jesper.juhl@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[NET]: Make all initialized struct seq_operations const.</title>
<updated>2007-07-11T06:07:31+00:00</updated>
<author>
<name>Philippe De Muyter</name>
<email>phdm@macqel.be</email>
</author>
<published>2007-07-11T06:07:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=56b3d975bbce65f655c5612b4822da671f9fd9b2'/>
<id>56b3d975bbce65f655c5612b4822da671f9fd9b2</id>
<content type='text'>
Make all initialized struct seq_operations in net/ const

Signed-off-by: Philippe De Muyter &lt;phdm@macqel.be&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make all initialized struct seq_operations in net/ const

Signed-off-by: Philippe De Muyter &lt;phdm@macqel.be&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>unify flush_work/flush_work_keventd and rename it to cancel_work_sync</title>
<updated>2007-05-09T19:30:53+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@tv-sign.ru</email>
</author>
<published>2007-05-09T09:34:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=28e53bddf814485699a4142bc056fd37d4e11dd4'/>
<id>28e53bddf814485699a4142bc056fd37d4e11dd4</id>
<content type='text'>
flush_work(wq, work) doesn't need the first parameter, we can use cwq-&gt;wq
(this was possible from the very beginnig, I missed this).  So we can unify
flush_work_keventd and flush_work.

Also, rename flush_work() to cancel_work_sync() and fix all callers.
Perhaps this is not the best name, but "flush_work" is really bad.

(akpm: this is why the earlier patches bypassed maintainers)

Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: Jeff Garzik &lt;jeff@garzik.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Jens Axboe &lt;jens.axboe@oracle.com&gt;
Cc: Tejun Heo &lt;htejun@gmail.com&gt;
Cc: Auke Kok &lt;auke-jan.h.kok@intel.com&gt;,
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
flush_work(wq, work) doesn't need the first parameter, we can use cwq-&gt;wq
(this was possible from the very beginnig, I missed this).  So we can unify
flush_work_keventd and flush_work.

Also, rename flush_work() to cancel_work_sync() and fix all callers.
Perhaps this is not the best name, but "flush_work" is really bad.

(akpm: this is why the earlier patches bypassed maintainers)

Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: Jeff Garzik &lt;jeff@garzik.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Jens Axboe &lt;jens.axboe@oracle.com&gt;
Cc: Tejun Heo &lt;htejun@gmail.com&gt;
Cc: Auke Kok &lt;auke-jan.h.kok@intel.com&gt;,
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: flush defense_work before module unload</title>
<updated>2007-05-09T19:30:52+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@tv-sign.ru</email>
</author>
<published>2007-05-09T09:34:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c214b2cc5f9be7c236f9b91acf524688ff0e3e72'/>
<id>c214b2cc5f9be7c236f9b91acf524688ff0e3e72</id>
<content type='text'>
net/ipv4/ipvs/ip_vs_core.c

	module_exit
	    ip_vs_cleanup
		ip_vs_control_cleanup
		    cancel_rearming_delayed_work
	// done

This is unsafe.  The module may be unloaded and the memory may be freed
while defense_work's handler is still running/preempted.

Do flush_work(&amp;defense_work.work) after cancel_rearming_delayed_work().

Alternatively, we could add flush_work() to cancel_rearming_delayed_work(),
but note that we can't change cancel_delayed_work() in the same manner
because it may be called from atomic context.

Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
net/ipv4/ipvs/ip_vs_core.c

	module_exit
	    ip_vs_cleanup
		ip_vs_control_cleanup
		    cancel_rearming_delayed_work
	// done

This is unsafe.  The module may be unloaded and the memory may be freed
while defense_work's handler is still running/preempted.

Do flush_work(&amp;defense_work.work) after cancel_rearming_delayed_work().

Alternatively, we could add flush_work() to cancel_rearming_delayed_work(),
but note that we can't change cancel_delayed_work() in the same manner
because it may be called from atomic context.

Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] sysctl: remove insert_at_head from register_sysctl</title>
<updated>2007-02-14T16:09:59+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2007-02-14T08:34:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0b4d414714f0d2f922d39424b0c5c82ad900a381'/>
<id>0b4d414714f0d2f922d39424b0c5c82ad900a381</id>
<content type='text'>
The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name.  Which is
pain for caching and the proc interface never implemented.

I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.

So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.

Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Acked-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Acked-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Andi Kleen &lt;ak@muc.de&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Corey Minyard &lt;minyard@acm.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: "John W. Linville" &lt;linville@tuxdriver.com&gt;
Cc: James Bottomley &lt;James.Bottomley@steeleye.com&gt;
Cc: Jan Kara &lt;jack@ucw.cz&gt;
Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;
Cc: Mark Fasheh &lt;mark.fasheh@oracle.com&gt;
Cc: David Chinner &lt;dgc@sgi.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name.  Which is
pain for caching and the proc interface never implemented.

I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.

So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.

Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Acked-by: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Acked-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: "Luck, Tony" &lt;tony.luck@intel.com&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Andi Kleen &lt;ak@muc.de&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Corey Minyard &lt;minyard@acm.org&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: "John W. Linville" &lt;linville@tuxdriver.com&gt;
Cc: James Bottomley &lt;James.Bottomley@steeleye.com&gt;
Cc: Jan Kara &lt;jack@ucw.cz&gt;
Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;
Cc: Mark Fasheh &lt;mark.fasheh@oracle.com&gt;
Cc: David Chinner &lt;dgc@sgi.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] mark struct file_operations const 7</title>
<updated>2007-02-12T17:48:46+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2007-02-12T08:55:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9a32144e9d7b4e21341174b1a83b82a82353be86'/>
<id>9a32144e9d7b4e21341174b1a83b82a82353be86</id>
<content type='text'>
Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Many struct file_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>WorkStruct: make allyesconfig</title>
<updated>2006-11-22T14:57:56+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2006-11-22T14:57:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c4028958b6ecad064b1a6303a6a5906d4fe48d73'/>
<id>c4028958b6ecad064b1a6303a6a5906d4fe48d73</id>
<content type='text'>
Fix up for make allyesconfig.

Signed-Off-By: David Howells &lt;dhowells@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix up for make allyesconfig.

Signed-Off-By: David Howells &lt;dhowells@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[IPVS]: ipvs annotations</title>
<updated>2006-09-29T01:03:04+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2006-09-28T21:29:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=014d730d56b559eacb11e91969a1f41c3feb36f9'/>
<id>014d730d56b559eacb11e91969a1f41c3feb36f9</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
