<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ipv4/af_inet.c, branch linux-3.3.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>per-netns ipv4 sysctl_tcp_mem</title>
<updated>2011-12-13T00:04:11+00:00</updated>
<author>
<name>Glauber Costa</name>
<email>glommer@parallels.com</email>
</author>
<published>2011-12-11T21:47:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3dc43e3e4d0b52197d3205214fe8f162f9e0c334'/>
<id>3dc43e3e4d0b52197d3205214fe8f162f9e0c334</id>
<content type='text'>
This patch allows each namespace to independently set up
its levels for tcp memory pressure thresholds. This patch
alone does not buy much: we need to make this values
per group of process somehow. This is achieved in the
patches that follows in this patchset.

Signed-off-by: Glauber Costa &lt;glommer@parallels.com&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
CC: David S. Miller &lt;davem@davemloft.net&gt;
CC: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch allows each namespace to independently set up
its levels for tcp memory pressure thresholds. This patch
alone does not buy much: we need to make this values
per group of process somehow. This is achieved in the
patches that follows in this patchset.

Signed-off-by: Glauber Costa &lt;glommer@parallels.com&gt;
Reviewed-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
CC: David S. Miller &lt;davem@davemloft.net&gt;
CC: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: introduce and use netdev_features_t for device features sets</title>
<updated>2011-11-16T22:43:10+00:00</updated>
<author>
<name>Michał Mirosław</name>
<email>mirq-linux@rere.qmqm.pl</email>
</author>
<published>2011-11-15T15:29:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c8f44affb7244f2ac3e703cab13d55ede27621bb'/>
<id>c8f44affb7244f2ac3e703cab13d55ede27621bb</id>
<content type='text'>
v2:	add couple missing conversions in drivers
	split unexporting netdev_fix_features()
	implemented %pNF
	convert sock::sk_route_(no?)caps

Signed-off-by: Michał Mirosław &lt;mirq-linux@rere.qmqm.pl&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
v2:	add couple missing conversions in drivers
	split unexporting netdev_fix_features()
	implemented %pNF
	convert sock::sk_route_(no?)caps

Signed-off-by: Michał Mirosław &lt;mirq-linux@rere.qmqm.pl&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: reduce percpu needs for icmpmsg mibs</title>
<updated>2011-11-09T21:04:20+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-11-08T13:04:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=acb32ba3dee66d58704caeeb8c6ff95f60efdc66'/>
<id>acb32ba3dee66d58704caeeb8c6ff95f60efdc66</id>
<content type='text'>
Reading /proc/net/snmp on a machine with a lot of cpus is very expensive
(can be ~88000 us).

This is because ICMPMSG MIB uses 4096 bytes per cpu, and folding values
for all possible cpus can read 16 Mbytes of memory.

ICMP messages are not considered as fast path on a typical server, and
eventually few cpus handle them anyway. We can afford an atomic
operation instead of using percpu data.

This saves 4096 bytes per cpu and per network namespace.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reading /proc/net/snmp on a machine with a lot of cpus is very expensive
(can be ~88000 us).

This is because ICMPMSG MIB uses 4096 bytes per cpu, and folding values
for all possible cpus can read 16 Mbytes of memory.

ICMP messages are not considered as fast path on a typical server, and
eventually few cpus handle them anyway. We can afford an atomic
operation instead of using percpu data.

This saves 4096 bytes per cpu and per network namespace.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: compat_ioctl is local to af_inet.c, make it static</title>
<updated>2011-10-19T23:24:39+00:00</updated>
<author>
<name>Gerrit Renker</name>
<email>gerrit@erg.abdn.ac.uk</email>
</author>
<published>2011-10-15T09:26:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=686dc6b64b58e69715ce92177da0732a6464db69'/>
<id>686dc6b64b58e69715ce92177da0732a6464db69</id>
<content type='text'>
ipv4: compat_ioctl is local to af_inet.c, make it static

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ipv4: compat_ioctl is local to af_inet.c, make it static

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: ipv4: relax AF_INET check in bind()</title>
<updated>2011-08-30T22:57:00+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-08-30T22:57:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=29c486df6a208432b370bd4be99ae1369ede28d8'/>
<id>29c486df6a208432b370bd4be99ae1369ede28d8</id>
<content type='text'>
commit d0733d2e29b65 (Check for mistakenly passed in non-IPv4 address)
added regression on legacy apps that use bind() with AF_UNSPEC family.

Relax the check, but make sure the bind() is done on INADDR_ANY
addresses, as AF_UNSPEC has probably no sane meaning for other
addresses.

Bugzilla reference : https://bugzilla.kernel.org/show_bug.cgi?id=42012

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Reported-and-bisected-by: Rene Meier &lt;r_meier@freenet.de&gt;
CC: Marcus Meissner &lt;meissner@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d0733d2e29b65 (Check for mistakenly passed in non-IPv4 address)
added regression on legacy apps that use bind() with AF_UNSPEC family.

Relax the check, but make sure the bind() is done on INADDR_ANY
addresses, as AF_UNSPEC has probably no sane meaning for other
addresses.

Bugzilla reference : https://bugzilla.kernel.org/show_bug.cgi?id=42012

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Reported-and-bisected-by: Rene Meier &lt;r_meier@freenet.de&gt;
CC: Marcus Meissner &lt;meissner@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6</title>
<updated>2011-07-06T06:23:37+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-07-06T06:23:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e12fe68ce34d60c04bb1ddb1d3cc5c3022388fe4'/>
<id>e12fe68ce34d60c04bb1ddb1d3cc5c3022388fe4</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bind() fix error return on wrong address family</title>
<updated>2011-07-05T04:37:41+00:00</updated>
<author>
<name>Marcus Meissner</name>
<email>meissner@novell.com</email>
</author>
<published>2011-07-04T01:30:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c349a528cd47e2272ded0ea358363855e86180da'/>
<id>c349a528cd47e2272ded0ea358363855e86180da</id>
<content type='text'>
Hi,

Reinhard Max also pointed out that the error should EAFNOSUPPORT according
to POSIX.

The Linux manpages have it as EINVAL, some other OSes (Minix, HPUX, perhaps BSD) use
EAFNOSUPPORT. Windows uses WSAEFAULT according to MSDN.

Other protocols error values in their af bind() methods in current mainline git as far
as a brief look shows:
	EAFNOSUPPORT: atm, appletalk, l2tp, llc, phonet, rxrpc
	EINVAL: ax25, bluetooth, decnet, econet, ieee802154, iucv, netlink, netrom, packet, rds, rose, unix, x25,
	No check?: can/raw, ipv6/raw, irda, l2tp/l2tp_ip

Ciao, Marcus

Signed-off-by: Marcus Meissner &lt;meissner@suse.de&gt;
Cc: Reinhard Max &lt;max@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Hi,

Reinhard Max also pointed out that the error should EAFNOSUPPORT according
to POSIX.

The Linux manpages have it as EINVAL, some other OSes (Minix, HPUX, perhaps BSD) use
EAFNOSUPPORT. Windows uses WSAEFAULT according to MSDN.

Other protocols error values in their af bind() methods in current mainline git as far
as a brief look shows:
	EAFNOSUPPORT: atm, appletalk, l2tp, llc, phonet, rxrpc
	EINVAL: ax25, bluetooth, decnet, econet, ieee802154, iucv, netlink, netrom, packet, rds, rose, unix, x25,
	No check?: can/raw, ipv6/raw, irda, l2tp/l2tp_ip

Ciao, Marcus

Signed-off-by: Marcus Meissner &lt;meissner@suse.de&gt;
Cc: Reinhard Max &lt;max@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6</title>
<updated>2011-06-21T05:29:08+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-06-21T05:29:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9f6ec8d697c08963d83880ccd35c13c5ace716ea'/>
<id>9f6ec8d697c08963d83880ccd35c13c5ace716ea</id>
<content type='text'>
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
	drivers/net/wireless/rtlwifi/pci.c
	net/netfilter/ipvs/ip_vs_core.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
	drivers/net/wireless/rtlwifi/pci.c
	net/netfilter/ipvs/ip_vs_core.c
</pre>
</div>
</content>
</entry>
<entry>
<title>net: rfs: enable RFS before first data packet is received</title>
<updated>2011-06-17T19:27:31+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-06-17T03:45:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1eddceadb0d6441cd39b2c38705a8f5fec86e770'/>
<id>1eddceadb0d6441cd39b2c38705a8f5fec86e770</id>
<content type='text'>
Le jeudi 16 juin 2011 à 23:38 -0400, David Miller a écrit :
&gt; From: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
&gt; Date: Fri, 17 Jun 2011 00:50:46 +0100
&gt;
&gt; &gt; On Wed, 2011-06-15 at 04:15 +0200, Eric Dumazet wrote:
&gt; &gt;&gt; @@ -1594,6 +1594,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
&gt; &gt;&gt;  			goto discard;
&gt; &gt;&gt;
&gt; &gt;&gt;  		if (nsk != sk) {
&gt; &gt;&gt; +			sock_rps_save_rxhash(nsk, skb-&gt;rxhash);
&gt; &gt;&gt;  			if (tcp_child_process(sk, nsk, skb)) {
&gt; &gt;&gt;  				rsk = nsk;
&gt; &gt;&gt;  				goto reset;
&gt; &gt;&gt;
&gt; &gt;
&gt; &gt; I haven't tried this, but it looks reasonable to me.
&gt; &gt;
&gt; &gt; What about IPv6?  The logic in tcp_v6_do_rcv() looks very similar.
&gt;
&gt; Indeed ipv6 side needs the same fix.
&gt;
&gt; Eric please add that part and resubmit.  And in fact I might stick
&gt; this into net-2.6 instead of net-next-2.6
&gt;

OK, here is the net-2.6 based one then, thanks !

[PATCH v2] net: rfs: enable RFS before first data packet is received

First packet received on a passive tcp flow is not correctly RFS
steered.

One sock_rps_record_flow() call is missing in inet_accept()

But before that, we also must record rxhash when child socket is setup.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Tom Herbert &lt;therbert@google.com&gt;
CC: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
CC: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@conan.davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Le jeudi 16 juin 2011 à 23:38 -0400, David Miller a écrit :
&gt; From: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
&gt; Date: Fri, 17 Jun 2011 00:50:46 +0100
&gt;
&gt; &gt; On Wed, 2011-06-15 at 04:15 +0200, Eric Dumazet wrote:
&gt; &gt;&gt; @@ -1594,6 +1594,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
&gt; &gt;&gt;  			goto discard;
&gt; &gt;&gt;
&gt; &gt;&gt;  		if (nsk != sk) {
&gt; &gt;&gt; +			sock_rps_save_rxhash(nsk, skb-&gt;rxhash);
&gt; &gt;&gt;  			if (tcp_child_process(sk, nsk, skb)) {
&gt; &gt;&gt;  				rsk = nsk;
&gt; &gt;&gt;  				goto reset;
&gt; &gt;&gt;
&gt; &gt;
&gt; &gt; I haven't tried this, but it looks reasonable to me.
&gt; &gt;
&gt; &gt; What about IPv6?  The logic in tcp_v6_do_rcv() looks very similar.
&gt;
&gt; Indeed ipv6 side needs the same fix.
&gt;
&gt; Eric please add that part and resubmit.  And in fact I might stick
&gt; this into net-2.6 instead of net-next-2.6
&gt;

OK, here is the net-2.6 based one then, thanks !

[PATCH v2] net: rfs: enable RFS before first data packet is received

First packet received on a passive tcp flow is not correctly RFS
steered.

One sock_rps_record_flow() call is missing in inet_accept()

But before that, we also must record rxhash when child socket is setup.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Tom Herbert &lt;therbert@google.com&gt;
CC: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
CC: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@conan.davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>snmp: reduce percpu needs by 50%</title>
<updated>2011-06-11T23:23:59+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-06-10T19:45:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8f0ea0fe3a036a47767f9c80e81b13e379a1f43b'/>
<id>8f0ea0fe3a036a47767f9c80e81b13e379a1f43b</id>
<content type='text'>
SNMP mibs use two percpu arrays, one used in BH context, another in USER
context. With increasing number of cpus in machines, and fact that ipv6
uses per network device ipstats_mib, this is consuming a lot of memory
if many network devices are registered.

commit be281e554e2a (ipv6: reduce per device ICMP mib sizes) shrinked
percpu needs for ipv6, but we can reduce memory use a bit more.

With recent percpu infrastructure (irqsafe_cpu_inc() ...), we no longer
need this BH/USER separation since we can update counters in a single
x86 instruction, regardless of the BH/USER context.

Other arches than x86 might need to disable irq in their
irqsafe_cpu_inc() implementation : If this happens to be a problem, we
can make SNMP_ARRAY_SZ arch dependent, but a previous poll
( https://lkml.org/lkml/2011/3/17/174 ) to arch maintainers did not
raise strong opposition.

Only on 32bit arches, we need to disable BH for 64bit counters updates
done from USER context (currently used for IP MIB)

This also reduces vmlinux size :

1) x86_64 build
$ size vmlinux.before vmlinux.after
   text	   data	    bss	    dec	    hex	filename
7853650	1293772	1896448	11043870	 a8841e	vmlinux.before
7850578	1293772	1896448	11040798	 a8781e	vmlinux.after

2) i386  build
$ size vmlinux.before vmlinux.afterpatch
   text	   data	    bss	    dec	    hex	filename
6039335	 635076	3670016	10344427	 9dd7eb	vmlinux.before
6037342	 635076	3670016	10342434	 9dd022	vmlinux.afterpatch

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;andi@firstfloor.org&gt;
CC: Ingo Molnar &lt;mingo@elte.hu&gt;
CC: Tejun Heo &lt;tj@kernel.org&gt;
CC: Christoph Lameter &lt;cl@linux-foundation.org&gt;
CC: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org
CC: linux-arch@vger.kernel.org
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SNMP mibs use two percpu arrays, one used in BH context, another in USER
context. With increasing number of cpus in machines, and fact that ipv6
uses per network device ipstats_mib, this is consuming a lot of memory
if many network devices are registered.

commit be281e554e2a (ipv6: reduce per device ICMP mib sizes) shrinked
percpu needs for ipv6, but we can reduce memory use a bit more.

With recent percpu infrastructure (irqsafe_cpu_inc() ...), we no longer
need this BH/USER separation since we can update counters in a single
x86 instruction, regardless of the BH/USER context.

Other arches than x86 might need to disable irq in their
irqsafe_cpu_inc() implementation : If this happens to be a problem, we
can make SNMP_ARRAY_SZ arch dependent, but a previous poll
( https://lkml.org/lkml/2011/3/17/174 ) to arch maintainers did not
raise strong opposition.

Only on 32bit arches, we need to disable BH for 64bit counters updates
done from USER context (currently used for IP MIB)

This also reduces vmlinux size :

1) x86_64 build
$ size vmlinux.before vmlinux.after
   text	   data	    bss	    dec	    hex	filename
7853650	1293772	1896448	11043870	 a8841e	vmlinux.before
7850578	1293772	1896448	11040798	 a8781e	vmlinux.after

2) i386  build
$ size vmlinux.before vmlinux.afterpatch
   text	   data	    bss	    dec	    hex	filename
6039335	 635076	3670016	10344427	 9dd7eb	vmlinux.before
6037342	 635076	3670016	10342434	 9dd022	vmlinux.afterpatch

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;andi@firstfloor.org&gt;
CC: Ingo Molnar &lt;mingo@elte.hu&gt;
CC: Tejun Heo &lt;tj@kernel.org&gt;
CC: Christoph Lameter &lt;cl@linux-foundation.org&gt;
CC: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org
CC: linux-arch@vger.kernel.org
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
