<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ipv4/af_inet.c, branch linux-3.2.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>net: ping: do not abuse udp_poll()</title>
<updated>2017-09-15T17:30:50+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2017-06-03T16:29:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c6b7065b8625bd1924e27bb4792560829b5cd603'/>
<id>c6b7065b8625bd1924e27bb4792560829b5cd603</id>
<content type='text'>
commit 77d4b1d36926a9b8387c6b53eeba42bcaaffcea3 upstream.

Alexander reported various KASAN messages triggered in recent kernels

The problem is that ping sockets should not use udp_poll() in the first
place, and recent changes in UDP stack finally exposed this old bug.

Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Cc: Solar Designer &lt;solar@openwall.com&gt;
Cc: Vasiliy Kulikov &lt;segoon@openwall.com&gt;
Cc: Lorenzo Colitti &lt;lorenzo@google.com&gt;
Acked-By: Lorenzo Colitti &lt;lorenzo@google.com&gt;
Tested-By: Lorenzo Colitti &lt;lorenzo@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[bwh: Backported to 3.2:
 - Drop IPv6 bits
 - Adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 77d4b1d36926a9b8387c6b53eeba42bcaaffcea3 upstream.

Alexander reported various KASAN messages triggered in recent kernels

The problem is that ping sockets should not use udp_poll() in the first
place, and recent changes in UDP stack finally exposed this old bug.

Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Cc: Solar Designer &lt;solar@openwall.com&gt;
Cc: Vasiliy Kulikov &lt;segoon@openwall.com&gt;
Cc: Lorenzo Colitti &lt;lorenzo@google.com&gt;
Acked-By: Lorenzo Colitti &lt;lorenzo@google.com&gt;
Tested-By: Lorenzo Colitti &lt;lorenzo@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[bwh: Backported to 3.2:
 - Drop IPv6 bits
 - Adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add validation for the socket syscall protocol argument</title>
<updated>2015-12-30T02:26:03+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2015-12-14T21:03:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ef6d51d24d878be2291d7af783441356eb77649d'/>
<id>ef6d51d24d878be2291d7af783441356eb77649d</id>
<content type='text'>
[ Upstream commit 79462ad02e861803b3840cc782248c7359451cd9 ]

郭永刚 reported that one could simply crash the kernel as root by
using a simple program:

	int socket_fd;
	struct sockaddr_in addr;
	addr.sin_port = 0;
	addr.sin_addr.s_addr = INADDR_ANY;
	addr.sin_family = 10;

	socket_fd = socket(10,3,0x40000000);
	connect(socket_fd , &amp;addr,16);

AF_INET, AF_INET6 sockets actually only support 8-bit protocol
identifiers. inet_sock's skc_protocol field thus is sized accordingly,
thus larger protocol identifiers simply cut off the higher bits and
store a zero in the protocol fields.

This could lead to e.g. NULL function pointer because as a result of
the cut off inet_num is zero and we call down to inet_autobind, which
is NULL for raw sockets.

kernel: Call Trace:
kernel:  [&lt;ffffffff816db90e&gt;] ? inet_autobind+0x2e/0x70
kernel:  [&lt;ffffffff816db9a4&gt;] inet_dgram_connect+0x54/0x80
kernel:  [&lt;ffffffff81645069&gt;] SYSC_connect+0xd9/0x110
kernel:  [&lt;ffffffff810ac51b&gt;] ? ptrace_notify+0x5b/0x80
kernel:  [&lt;ffffffff810236d8&gt;] ? syscall_trace_enter_phase2+0x108/0x200
kernel:  [&lt;ffffffff81645e0e&gt;] SyS_connect+0xe/0x10
kernel:  [&lt;ffffffff81779515&gt;] tracesys_phase2+0x84/0x89

I found no particular commit which introduced this problem.

CVE: CVE-2015-8543
Cc: Cong Wang &lt;cwang@twopensource.com&gt;
Reported-by: 郭永刚 &lt;guoyonggang@360.cn&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[bwh: Backported to 3.2: open-code U8_MAX]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 79462ad02e861803b3840cc782248c7359451cd9 ]

郭永刚 reported that one could simply crash the kernel as root by
using a simple program:

	int socket_fd;
	struct sockaddr_in addr;
	addr.sin_port = 0;
	addr.sin_addr.s_addr = INADDR_ANY;
	addr.sin_family = 10;

	socket_fd = socket(10,3,0x40000000);
	connect(socket_fd , &amp;addr,16);

AF_INET, AF_INET6 sockets actually only support 8-bit protocol
identifiers. inet_sock's skc_protocol field thus is sized accordingly,
thus larger protocol identifiers simply cut off the higher bits and
store a zero in the protocol fields.

This could lead to e.g. NULL function pointer because as a result of
the cut off inet_num is zero and we call down to inet_autobind, which
is NULL for raw sockets.

kernel: Call Trace:
kernel:  [&lt;ffffffff816db90e&gt;] ? inet_autobind+0x2e/0x70
kernel:  [&lt;ffffffff816db9a4&gt;] inet_dgram_connect+0x54/0x80
kernel:  [&lt;ffffffff81645069&gt;] SYSC_connect+0xd9/0x110
kernel:  [&lt;ffffffff810ac51b&gt;] ? ptrace_notify+0x5b/0x80
kernel:  [&lt;ffffffff810236d8&gt;] ? syscall_trace_enter_phase2+0x108/0x200
kernel:  [&lt;ffffffff81645e0e&gt;] SyS_connect+0xe/0x10
kernel:  [&lt;ffffffff81779515&gt;] tracesys_phase2+0x84/0x89

I found no particular commit which introduced this problem.

CVE: CVE-2015-8543
Cc: Cong Wang &lt;cwang@twopensource.com&gt;
Reported-by: 郭永刚 &lt;guoyonggang@360.cn&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[bwh: Backported to 3.2: open-code U8_MAX]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: use a stronger hash for tcp</title>
<updated>2013-03-06T03:24:21+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-02-21T12:18:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0fd0ff7e1fcc4b4bc5d17ab1d200f23dea7c681d'/>
<id>0fd0ff7e1fcc4b4bc5d17ab1d200f23dea7c681d</id>
<content type='text'>
[ Upstream commit 08dcdbf6a7b9d14c2302c5bd0c5390ddf122f664 ]

It looks like its possible to open thousands of TCP IPv6
sessions on a server, all landing in a single slot of TCP hash
table. Incoming packets have to lookup sockets in a very
long list.

We should hash all bits from foreign IPv6 addresses, using
a salt and hash mix, not a simple XOR.

inet6_ehashfn() can also separately use the ports, instead
of xoring them.

Reported-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 08dcdbf6a7b9d14c2302c5bd0c5390ddf122f664 ]

It looks like its possible to open thousands of TCP IPv6
sessions on a server, all landing in a single slot of TCP hash
table. Incoming packets have to lookup sockets in a very
long list.

We should hash all bits from foreign IPv6 addresses, using
a salt and hash mix, not a simple XOR.

inet6_ehashfn() can also separately use the ports, instead
of xoring them.

Reported-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: compat_ioctl is local to af_inet.c, make it static</title>
<updated>2011-10-19T23:24:39+00:00</updated>
<author>
<name>Gerrit Renker</name>
<email>gerrit@erg.abdn.ac.uk</email>
</author>
<published>2011-10-15T09:26:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=686dc6b64b58e69715ce92177da0732a6464db69'/>
<id>686dc6b64b58e69715ce92177da0732a6464db69</id>
<content type='text'>
ipv4: compat_ioctl is local to af_inet.c, make it static

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ipv4: compat_ioctl is local to af_inet.c, make it static

Signed-off-by: Gerrit Renker &lt;gerrit@erg.abdn.ac.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: ipv4: relax AF_INET check in bind()</title>
<updated>2011-08-30T22:57:00+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-08-30T22:57:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=29c486df6a208432b370bd4be99ae1369ede28d8'/>
<id>29c486df6a208432b370bd4be99ae1369ede28d8</id>
<content type='text'>
commit d0733d2e29b65 (Check for mistakenly passed in non-IPv4 address)
added regression on legacy apps that use bind() with AF_UNSPEC family.

Relax the check, but make sure the bind() is done on INADDR_ANY
addresses, as AF_UNSPEC has probably no sane meaning for other
addresses.

Bugzilla reference : https://bugzilla.kernel.org/show_bug.cgi?id=42012

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Reported-and-bisected-by: Rene Meier &lt;r_meier@freenet.de&gt;
CC: Marcus Meissner &lt;meissner@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d0733d2e29b65 (Check for mistakenly passed in non-IPv4 address)
added regression on legacy apps that use bind() with AF_UNSPEC family.

Relax the check, but make sure the bind() is done on INADDR_ANY
addresses, as AF_UNSPEC has probably no sane meaning for other
addresses.

Bugzilla reference : https://bugzilla.kernel.org/show_bug.cgi?id=42012

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Reported-and-bisected-by: Rene Meier &lt;r_meier@freenet.de&gt;
CC: Marcus Meissner &lt;meissner@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6</title>
<updated>2011-07-06T06:23:37+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-07-06T06:23:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e12fe68ce34d60c04bb1ddb1d3cc5c3022388fe4'/>
<id>e12fe68ce34d60c04bb1ddb1d3cc5c3022388fe4</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>net: bind() fix error return on wrong address family</title>
<updated>2011-07-05T04:37:41+00:00</updated>
<author>
<name>Marcus Meissner</name>
<email>meissner@novell.com</email>
</author>
<published>2011-07-04T01:30:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c349a528cd47e2272ded0ea358363855e86180da'/>
<id>c349a528cd47e2272ded0ea358363855e86180da</id>
<content type='text'>
Hi,

Reinhard Max also pointed out that the error should EAFNOSUPPORT according
to POSIX.

The Linux manpages have it as EINVAL, some other OSes (Minix, HPUX, perhaps BSD) use
EAFNOSUPPORT. Windows uses WSAEFAULT according to MSDN.

Other protocols error values in their af bind() methods in current mainline git as far
as a brief look shows:
	EAFNOSUPPORT: atm, appletalk, l2tp, llc, phonet, rxrpc
	EINVAL: ax25, bluetooth, decnet, econet, ieee802154, iucv, netlink, netrom, packet, rds, rose, unix, x25,
	No check?: can/raw, ipv6/raw, irda, l2tp/l2tp_ip

Ciao, Marcus

Signed-off-by: Marcus Meissner &lt;meissner@suse.de&gt;
Cc: Reinhard Max &lt;max@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Hi,

Reinhard Max also pointed out that the error should EAFNOSUPPORT according
to POSIX.

The Linux manpages have it as EINVAL, some other OSes (Minix, HPUX, perhaps BSD) use
EAFNOSUPPORT. Windows uses WSAEFAULT according to MSDN.

Other protocols error values in their af bind() methods in current mainline git as far
as a brief look shows:
	EAFNOSUPPORT: atm, appletalk, l2tp, llc, phonet, rxrpc
	EINVAL: ax25, bluetooth, decnet, econet, ieee802154, iucv, netlink, netrom, packet, rds, rose, unix, x25,
	No check?: can/raw, ipv6/raw, irda, l2tp/l2tp_ip

Ciao, Marcus

Signed-off-by: Marcus Meissner &lt;meissner@suse.de&gt;
Cc: Reinhard Max &lt;max@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6</title>
<updated>2011-06-21T05:29:08+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-06-21T05:29:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9f6ec8d697c08963d83880ccd35c13c5ace716ea'/>
<id>9f6ec8d697c08963d83880ccd35c13c5ace716ea</id>
<content type='text'>
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
	drivers/net/wireless/rtlwifi/pci.c
	net/netfilter/ipvs/ip_vs_core.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
	drivers/net/wireless/rtlwifi/pci.c
	net/netfilter/ipvs/ip_vs_core.c
</pre>
</div>
</content>
</entry>
<entry>
<title>net: rfs: enable RFS before first data packet is received</title>
<updated>2011-06-17T19:27:31+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-06-17T03:45:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1eddceadb0d6441cd39b2c38705a8f5fec86e770'/>
<id>1eddceadb0d6441cd39b2c38705a8f5fec86e770</id>
<content type='text'>
Le jeudi 16 juin 2011 à 23:38 -0400, David Miller a écrit :
&gt; From: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
&gt; Date: Fri, 17 Jun 2011 00:50:46 +0100
&gt;
&gt; &gt; On Wed, 2011-06-15 at 04:15 +0200, Eric Dumazet wrote:
&gt; &gt;&gt; @@ -1594,6 +1594,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
&gt; &gt;&gt;  			goto discard;
&gt; &gt;&gt;
&gt; &gt;&gt;  		if (nsk != sk) {
&gt; &gt;&gt; +			sock_rps_save_rxhash(nsk, skb-&gt;rxhash);
&gt; &gt;&gt;  			if (tcp_child_process(sk, nsk, skb)) {
&gt; &gt;&gt;  				rsk = nsk;
&gt; &gt;&gt;  				goto reset;
&gt; &gt;&gt;
&gt; &gt;
&gt; &gt; I haven't tried this, but it looks reasonable to me.
&gt; &gt;
&gt; &gt; What about IPv6?  The logic in tcp_v6_do_rcv() looks very similar.
&gt;
&gt; Indeed ipv6 side needs the same fix.
&gt;
&gt; Eric please add that part and resubmit.  And in fact I might stick
&gt; this into net-2.6 instead of net-next-2.6
&gt;

OK, here is the net-2.6 based one then, thanks !

[PATCH v2] net: rfs: enable RFS before first data packet is received

First packet received on a passive tcp flow is not correctly RFS
steered.

One sock_rps_record_flow() call is missing in inet_accept()

But before that, we also must record rxhash when child socket is setup.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Tom Herbert &lt;therbert@google.com&gt;
CC: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
CC: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@conan.davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Le jeudi 16 juin 2011 à 23:38 -0400, David Miller a écrit :
&gt; From: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
&gt; Date: Fri, 17 Jun 2011 00:50:46 +0100
&gt;
&gt; &gt; On Wed, 2011-06-15 at 04:15 +0200, Eric Dumazet wrote:
&gt; &gt;&gt; @@ -1594,6 +1594,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
&gt; &gt;&gt;  			goto discard;
&gt; &gt;&gt;
&gt; &gt;&gt;  		if (nsk != sk) {
&gt; &gt;&gt; +			sock_rps_save_rxhash(nsk, skb-&gt;rxhash);
&gt; &gt;&gt;  			if (tcp_child_process(sk, nsk, skb)) {
&gt; &gt;&gt;  				rsk = nsk;
&gt; &gt;&gt;  				goto reset;
&gt; &gt;&gt;
&gt; &gt;
&gt; &gt; I haven't tried this, but it looks reasonable to me.
&gt; &gt;
&gt; &gt; What about IPv6?  The logic in tcp_v6_do_rcv() looks very similar.
&gt;
&gt; Indeed ipv6 side needs the same fix.
&gt;
&gt; Eric please add that part and resubmit.  And in fact I might stick
&gt; this into net-2.6 instead of net-next-2.6
&gt;

OK, here is the net-2.6 based one then, thanks !

[PATCH v2] net: rfs: enable RFS before first data packet is received

First packet received on a passive tcp flow is not correctly RFS
steered.

One sock_rps_record_flow() call is missing in inet_accept()

But before that, we also must record rxhash when child socket is setup.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Tom Herbert &lt;therbert@google.com&gt;
CC: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
CC: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@conan.davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>snmp: reduce percpu needs by 50%</title>
<updated>2011-06-11T23:23:59+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-06-10T19:45:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8f0ea0fe3a036a47767f9c80e81b13e379a1f43b'/>
<id>8f0ea0fe3a036a47767f9c80e81b13e379a1f43b</id>
<content type='text'>
SNMP mibs use two percpu arrays, one used in BH context, another in USER
context. With increasing number of cpus in machines, and fact that ipv6
uses per network device ipstats_mib, this is consuming a lot of memory
if many network devices are registered.

commit be281e554e2a (ipv6: reduce per device ICMP mib sizes) shrinked
percpu needs for ipv6, but we can reduce memory use a bit more.

With recent percpu infrastructure (irqsafe_cpu_inc() ...), we no longer
need this BH/USER separation since we can update counters in a single
x86 instruction, regardless of the BH/USER context.

Other arches than x86 might need to disable irq in their
irqsafe_cpu_inc() implementation : If this happens to be a problem, we
can make SNMP_ARRAY_SZ arch dependent, but a previous poll
( https://lkml.org/lkml/2011/3/17/174 ) to arch maintainers did not
raise strong opposition.

Only on 32bit arches, we need to disable BH for 64bit counters updates
done from USER context (currently used for IP MIB)

This also reduces vmlinux size :

1) x86_64 build
$ size vmlinux.before vmlinux.after
   text	   data	    bss	    dec	    hex	filename
7853650	1293772	1896448	11043870	 a8841e	vmlinux.before
7850578	1293772	1896448	11040798	 a8781e	vmlinux.after

2) i386  build
$ size vmlinux.before vmlinux.afterpatch
   text	   data	    bss	    dec	    hex	filename
6039335	 635076	3670016	10344427	 9dd7eb	vmlinux.before
6037342	 635076	3670016	10342434	 9dd022	vmlinux.afterpatch

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;andi@firstfloor.org&gt;
CC: Ingo Molnar &lt;mingo@elte.hu&gt;
CC: Tejun Heo &lt;tj@kernel.org&gt;
CC: Christoph Lameter &lt;cl@linux-foundation.org&gt;
CC: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org
CC: linux-arch@vger.kernel.org
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SNMP mibs use two percpu arrays, one used in BH context, another in USER
context. With increasing number of cpus in machines, and fact that ipv6
uses per network device ipstats_mib, this is consuming a lot of memory
if many network devices are registered.

commit be281e554e2a (ipv6: reduce per device ICMP mib sizes) shrinked
percpu needs for ipv6, but we can reduce memory use a bit more.

With recent percpu infrastructure (irqsafe_cpu_inc() ...), we no longer
need this BH/USER separation since we can update counters in a single
x86 instruction, regardless of the BH/USER context.

Other arches than x86 might need to disable irq in their
irqsafe_cpu_inc() implementation : If this happens to be a problem, we
can make SNMP_ARRAY_SZ arch dependent, but a previous poll
( https://lkml.org/lkml/2011/3/17/174 ) to arch maintainers did not
raise strong opposition.

Only on 32bit arches, we need to disable BH for 64bit counters updates
done from USER context (currently used for IP MIB)

This also reduces vmlinux size :

1) x86_64 build
$ size vmlinux.before vmlinux.after
   text	   data	    bss	    dec	    hex	filename
7853650	1293772	1896448	11043870	 a8841e	vmlinux.before
7850578	1293772	1896448	11040798	 a8781e	vmlinux.after

2) i386  build
$ size vmlinux.before vmlinux.afterpatch
   text	   data	    bss	    dec	    hex	filename
6039335	 635076	3670016	10344427	 9dd7eb	vmlinux.before
6037342	 635076	3670016	10342434	 9dd022	vmlinux.afterpatch

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Andi Kleen &lt;andi@firstfloor.org&gt;
CC: Ingo Molnar &lt;mingo@elte.hu&gt;
CC: Tejun Heo &lt;tj@kernel.org&gt;
CC: Christoph Lameter &lt;cl@linux-foundation.org&gt;
CC: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org
CC: linux-arch@vger.kernel.org
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
