<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/core/dev.c, branch v3.5.7</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>net: do not disable sg for packets requiring no checksum</title>
<updated>2012-10-12T20:47:05+00:00</updated>
<author>
<name>Ed Cashin</name>
<email>ecashin@coraid.com</email>
</author>
<published>2012-09-19T15:49:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b9f8fc18d7f8e1ef260e99ddb027e1e4f2820122'/>
<id>b9f8fc18d7f8e1ef260e99ddb027e1e4f2820122</id>
<content type='text'>
[ Upstream commit c0d680e577ff171e7b37dbdb1b1bf5451e851f04 ]

A change in a series of VLAN-related changes appears to have
inadvertently disabled the use of the scatter gather feature of
network cards for transmission of non-IP ethernet protocols like ATA
over Ethernet (AoE).  Below is a reference to the commit that
introduces a "harmonize_features" function that turns off scatter
gather when the NIC does not support hardware checksumming for the
ethernet protocol of an sk buff.

  commit f01a5236bd4b140198fbcc550f085e8361fd73fa
  Author: Jesse Gross &lt;jesse@nicira.com&gt;
  Date:   Sun Jan 9 06:23:31 2011 +0000

      net offloading: Generalize netif_get_vlan_features().

The can_checksum_protocol function is not equipped to consider a
protocol that does not require checksumming.  Calling it for a
protocol that requires no checksum is inappropriate.

The patch below has harmonize_features call can_checksum_protocol when
the protocol needs a checksum, so that the network layer is not forced
to perform unnecessary skb linearization on the transmission of AoE
packets.  Unnecessary linearization results in decreased performance
and increased memory pressure, as reported here:

  http://www.spinics.net/lists/linux-mm/msg15184.html

The problem has probably not been widely experienced yet, because
only recently has the kernel.org-distributed aoe driver acquired the
ability to use payloads of over a page in size, with the patchset
recently included in the mm tree:

  https://lkml.org/lkml/2012/8/28/140

The coraid.com-distributed aoe driver already could use payloads of
greater than a page in size, but its users generally do not use the
newest kernels.

Signed-off-by: Ed Cashin &lt;ecashin@coraid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit c0d680e577ff171e7b37dbdb1b1bf5451e851f04 ]

A change in a series of VLAN-related changes appears to have
inadvertently disabled the use of the scatter gather feature of
network cards for transmission of non-IP ethernet protocols like ATA
over Ethernet (AoE).  Below is a reference to the commit that
introduces a "harmonize_features" function that turns off scatter
gather when the NIC does not support hardware checksumming for the
ethernet protocol of an sk buff.

  commit f01a5236bd4b140198fbcc550f085e8361fd73fa
  Author: Jesse Gross &lt;jesse@nicira.com&gt;
  Date:   Sun Jan 9 06:23:31 2011 +0000

      net offloading: Generalize netif_get_vlan_features().

The can_checksum_protocol function is not equipped to consider a
protocol that does not require checksumming.  Calling it for a
protocol that requires no checksum is inappropriate.

The patch below has harmonize_features call can_checksum_protocol when
the protocol needs a checksum, so that the network layer is not forced
to perform unnecessary skb linearization on the transmission of AoE
packets.  Unnecessary linearization results in decreased performance
and increased memory pressure, as reported here:

  http://www.spinics.net/lists/linux-mm/msg15184.html

The problem has probably not been widely experienced yet, because
only recently has the kernel.org-distributed aoe driver acquired the
ability to use payloads of over a page in size, with the patchset
recently included in the mm tree:

  https://lkml.org/lkml/2012/8/28/140

The coraid.com-distributed aoe driver already could use payloads of
greater than a page in size, but its users generally do not use the
newest kernels.

Signed-off-by: Ed Cashin &lt;ecashin@coraid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: small bug on rxhash calculation</title>
<updated>2012-10-12T20:47:03+00:00</updated>
<author>
<name>Chema Gonzalez</name>
<email>chema@google.com</email>
</author>
<published>2012-09-07T13:40:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fd5b2aaa47e613e4e546f39f117fe3178f0aafb3'/>
<id>fd5b2aaa47e613e4e546f39f117fe3178f0aafb3</id>
<content type='text'>
[ Upstream commit 6862234238e84648c305526af2edd98badcad1e0 ]

In the current rxhash calculation function, while the
sorting of the ports/addrs is coherent (you get the
same rxhash for packets sharing the same 4-tuple, in
both directions), ports and addrs are sorted
independently. This implies packets from a connection
between the same addresses but crossed ports hash to
the same rxhash.

For example, traffic between A=S:l and B=L:s is hashed
(in both directions) from {L, S, {s, l}}. The same
rxhash is obtained for packets between C=S:s and D=L:l.

This patch ensures that you either swap both addrs and ports,
or you swap none. Traffic between A and B, and traffic
between C and D, get their rxhash from different sources
({L, S, {l, s}} for A&lt;-&gt;B, and {L, S, {s, l}} for C&lt;-&gt;D)

The patch is co-written with Eric Dumazet &lt;edumazet@google.com&gt;

Signed-off-by: Chema Gonzalez &lt;chema@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 6862234238e84648c305526af2edd98badcad1e0 ]

In the current rxhash calculation function, while the
sorting of the ports/addrs is coherent (you get the
same rxhash for packets sharing the same 4-tuple, in
both directions), ports and addrs are sorted
independently. This implies packets from a connection
between the same addresses but crossed ports hash to
the same rxhash.

For example, traffic between A=S:l and B=L:s is hashed
(in both directions) from {L, S, {s, l}}. The same
rxhash is obtained for packets between C=S:s and D=L:l.

This patch ensures that you either swap both addrs and ports,
or you swap none. Traffic between A and B, and traffic
between C and D, get their rxhash from different sources
({L, S, {l, s}} for A&lt;-&gt;B, and {L, S, {s, l}} for C&lt;-&gt;D)

The patch is co-written with Eric Dumazet &lt;edumazet@google.com&gt;

Signed-off-by: Chema Gonzalez &lt;chema@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>af_packet: don't emit packet on orig fanout group</title>
<updated>2012-10-02T17:38:54+00:00</updated>
<author>
<name>Eric Leblond</name>
<email>eric@regit.org</email>
</author>
<published>2012-08-16T22:02:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c182b9556632ce565dc7ecbb49759334628f75bb'/>
<id>c182b9556632ce565dc7ecbb49759334628f75bb</id>
<content type='text'>
[ Upstream commit c0de08d04215031d68fa13af36f347a6cfa252ca ]

If a packet is emitted on one socket in one group of fanout sockets,
it is transmitted again. It is thus read again on one of the sockets
of the fanout group. This result in a loop for software which
generate packets when receiving one.
This retransmission is not the intended behavior: a fanout group
must behave like a single socket. The packet should not be
transmitted on a socket if it originates from a socket belonging
to the same fanout group.

This patch fixes the issue by changing the transmission check to
take fanout group info account.

Reported-by: Aleksandr Kotov &lt;a1k@mail.ru&gt;
Signed-off-by: Eric Leblond &lt;eric@regit.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit c0de08d04215031d68fa13af36f347a6cfa252ca ]

If a packet is emitted on one socket in one group of fanout sockets,
it is transmitted again. It is thus read again on one of the sockets
of the fanout group. This result in a loop for software which
generate packets when receiving one.
This retransmission is not the intended behavior: a fanout group
must behave like a single socket. The packet should not be
transmitted on a socket if it originates from a socket belonging
to the same fanout group.

This patch fixes the issue by changing the transmission check to
take fanout group info account.

Reported-by: Aleksandr Kotov &lt;a1k@mail.ru&gt;
Signed-off-by: Eric Leblond &lt;eric@regit.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/core: Fix potential memory leak in dev_set_alias()</title>
<updated>2012-10-02T17:38:40+00:00</updated>
<author>
<name>Alexey Khoroshilov</name>
<email>khoroshilov@ispras.ru</email>
</author>
<published>2012-08-08T00:33:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b5db88321043909060745304c1e485a6a2839d31'/>
<id>b5db88321043909060745304c1e485a6a2839d31</id>
<content type='text'>
[ Upstream commit 7364e445f62825758fa61195d237a5b8ecdd06ec ]

Do not leak memory by updating pointer with potentially NULL realloc return value.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov &lt;khoroshilov@ispras.ru&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 7364e445f62825758fa61195d237a5b8ecdd06ec ]

Do not leak memory by updating pointer with potentially NULL realloc return value.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov &lt;khoroshilov@ispras.ru&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Allow driver to limit number of GSO segments per skb</title>
<updated>2012-10-02T17:38:39+00:00</updated>
<author>
<name>Ben Hutchings</name>
<email>bhutchings@solarflare.com</email>
</author>
<published>2012-07-30T15:57:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=525e07b8b6f18aec949c9f68672a0a3e3d8c38b1'/>
<id>525e07b8b6f18aec949c9f68672a0a3e3d8c38b1</id>
<content type='text'>
[ Upstream commit 30b678d844af3305cda5953467005cebb5d7b687 ]

A peer (or local user) may cause TCP to use a nominal MSS of as little
as 88 (actual MSS of 76 with timestamps).  Given that we have a
sufficiently prodigious local sender and the peer ACKs quickly enough,
it is nevertheless possible to grow the window for such a connection
to the point that we will try to send just under 64K at once.  This
results in a single skb that expands to 861 segments.

In some drivers with TSO support, such an skb will require hundreds of
DMA descriptors; a substantial fraction of a TX ring or even more than
a full ring.  The TX queue selected for the skb may stall and trigger
the TX watchdog repeatedly (since the problem skb will be retried
after the TX reset).  This particularly affects sfc, for which the
issue is designated as CVE-2012-3412.

Therefore:
1. Add the field net_device::gso_max_segs holding the device-specific
   limit.
2. In netif_skb_features(), if the number of segments is too high then
   mask out GSO features to force fall back to software GSO.

Signed-off-by: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 30b678d844af3305cda5953467005cebb5d7b687 ]

A peer (or local user) may cause TCP to use a nominal MSS of as little
as 88 (actual MSS of 76 with timestamps).  Given that we have a
sufficiently prodigious local sender and the peer ACKs quickly enough,
it is nevertheless possible to grow the window for such a connection
to the point that we will try to send just under 64K at once.  This
results in a single skb that expands to 861 segments.

In some drivers with TSO support, such an skb will require hundreds of
DMA descriptors; a substantial fraction of a TX ring or even more than
a full ring.  The TX queue selected for the skb may stall and trigger
the TX watchdog repeatedly (since the problem skb will be retried
after the TX reset).  This particularly affects sfc, for which the
issue is designated as CVE-2012-3412.

Therefore:
1. Add the field net_device::gso_max_segs holding the device-specific
   limit.
2. In netif_skb_features(), if the number of segments is too high then
   mask out GSO features to force fall back to software GSO.

Signed-off-by: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: feed /dev/random with the MAC address when registering a device</title>
<updated>2012-08-15T14:52:48+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2012-07-05T01:23:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b8570c10637c4e3666b6da473a17fa5484d3b06c'/>
<id>b8570c10637c4e3666b6da473a17fa5484d3b06c</id>
<content type='text'>
commit 7bf2357524408b97fec58344caf7397f8140c3fd upstream.

Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 7bf2357524408b97fec58344caf7397f8140c3fd upstream.

Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>net: Statically initialize init_net.dev_base_head</title>
<updated>2012-07-18T20:32:27+00:00</updated>
<author>
<name>Rustad, Mark D</name>
<email>mark.d.rustad@intel.com</email>
</author>
<published>2012-07-18T09:06:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=734b65417b24d6eea3e3d7457e1f11493890ee1d'/>
<id>734b65417b24d6eea3e3d7457e1f11493890ee1d</id>
<content type='text'>
This change eliminates an initialization-order hazard most
recently seen when netprio_cgroup is built into the kernel.

With thanks to Eric Dumazet for catching a bug.

Signed-off-by: Mark Rustad &lt;mark.d.rustad@intel.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This change eliminates an initialization-order hazard most
recently seen when netprio_cgroup is built into the kernel.

With thanks to Eric Dumazet for catching a bug.

Signed-off-by: Mark Rustad &lt;mark.d.rustad@intel.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: cgroup: fix out of bounds accesses</title>
<updated>2012-07-09T21:50:54+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-07-08T21:45:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=91c68ce2b26319248a32d7baa1226f819d283758'/>
<id>91c68ce2b26319248a32d7baa1226f819d283758</id>
<content type='text'>
dev-&gt;priomap is allocated by extend_netdev_table() called from
update_netdev_tables().
And this is only called if write_priomap() is called.

But if write_priomap() is not called, it seems we can have out of bounds
accesses in cgrp_destroy(), read_priomap() &amp; skb_update_prio()

With help from Gao Feng

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Cc: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Acked-by: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
dev-&gt;priomap is allocated by extend_netdev_table() called from
update_netdev_tables().
And this is only called if write_priomap() is called.

But if write_priomap() is not called, it seems we can have out of bounds
accesses in cgrp_destroy(), read_priomap() &amp; skb_update_prio()

With help from Gao Feng

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Cc: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Acked-by: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Downgrade CAP_SYS_MODULE deprecated message from error to warning.</title>
<updated>2012-06-28T23:55:05+00:00</updated>
<author>
<name>Vinson Lee</name>
<email>vlee@twitter.com</email>
</author>
<published>2012-06-27T14:32:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7cecb523adedcaf8acba5e14d47559d8bc3f40d7'/>
<id>7cecb523adedcaf8acba5e14d47559d8bc3f40d7</id>
<content type='text'>
Make logging level consistent with other deprecation messages in net
subsystem.

Signed-off-by: Vinson Lee &lt;vlee@twitter.com&gt;
Cc: David Mackey &lt;tdmackey@twitter.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make logging level consistent with other deprecation messages in net
subsystem.

Signed-off-by: Vinson Lee &lt;vlee@twitter.com&gt;
Cc: David Mackey &lt;tdmackey@twitter.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: remove skb_orphan_try()</title>
<updated>2012-06-15T22:30:15+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-06-14T06:42:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=62b1a8ab9b3660bb820d8dfe23148ed6cda38574'/>
<id>62b1a8ab9b3660bb820d8dfe23148ed6cda38574</id>
<content type='text'>
Orphaning skb in dev_hard_start_xmit() makes bonding behavior
unfriendly for applications sending big UDP bursts : Once packets
pass the bonding device and come to real device, they might hit a full
qdisc and be dropped. Without orphaning, the sender is automatically
throttled because sk-&gt;sk_wmemalloc reaches sk-&gt;sk_sndbuf (assuming
sk_sndbuf is not too big)

We could try to defer the orphaning adding another test in
dev_hard_start_xmit(), but all this seems of little gain,
now that BQL tends to make packets more likely to be parked
in Qdisc queues instead of NIC TX ring, in cases where performance
matters.

Reverts commits :
fc6055a5ba31 net: Introduce skb_orphan_try()
87fd308cfc6b net: skb_tx_hash() fix relative to skb_orphan_try()
and removes SKBTX_DRV_NEEDS_SK_REF flag

Reported-and-bisected-by: Jean-Michel Hautbois &lt;jhautbois@gmail.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Tested-by: Oliver Hartkopp &lt;socketcan@hartkopp.net&gt;
Acked-by: Oliver Hartkopp &lt;socketcan@hartkopp.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Orphaning skb in dev_hard_start_xmit() makes bonding behavior
unfriendly for applications sending big UDP bursts : Once packets
pass the bonding device and come to real device, they might hit a full
qdisc and be dropped. Without orphaning, the sender is automatically
throttled because sk-&gt;sk_wmemalloc reaches sk-&gt;sk_sndbuf (assuming
sk_sndbuf is not too big)

We could try to defer the orphaning adding another test in
dev_hard_start_xmit(), but all this seems of little gain,
now that BQL tends to make packets more likely to be parked
in Qdisc queues instead of NIC TX ring, in cases where performance
matters.

Reverts commits :
fc6055a5ba31 net: Introduce skb_orphan_try()
87fd308cfc6b net: skb_tx_hash() fix relative to skb_orphan_try()
and removes SKBTX_DRV_NEEDS_SK_REF flag

Reported-and-bisected-by: Jean-Michel Hautbois &lt;jhautbois@gmail.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Tested-by: Oliver Hartkopp &lt;socketcan@hartkopp.net&gt;
Acked-by: Oliver Hartkopp &lt;socketcan@hartkopp.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
