<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ipv4/ip_output.c, branch linux-3.4.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Patch for 3.2.x, 3.4.x IP identifier regression</title>
<updated>2015-02-02T09:05:26+00:00</updated>
<author>
<name>Jeffrey Knockel</name>
<email>jeffk@cs.unm.edu</email>
</author>
<published>2014-12-12T06:14:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fd873bf1ce5477514515e82aa8acdc7ec06a9b97'/>
<id>fd873bf1ce5477514515e82aa8acdc7ec06a9b97</id>
<content type='text'>
commit c3b4ccb8b03769e2867fabecc078483ee6710ccf upstream.

With commits 73f156a6e8c1 ("inetpeer: get rid of ip_id_count") and
04ca6973f7c1 ("ip: make IP identifiers less predictable"), IP
identifiers are generated from a counter chosen from an array of
counters indexed by the hash of the outgoing packet header's source
address, destination address, and protocol number.  Thus, in
__ip_make_skb(), we must now call ip_select_ident() only after setting
these fields in the IP header to prevent IP identifiers from being
generated from bogus counters.

IP id sequence before fix: 18174, 5789, 5953, 59420, 59637, ...
After fix: 5967, 6185, 6374, 6600, 6795, 6892, 7051, 7288, ...

Signed-off-by: Jeffrey Knockel &lt;jeffk@cs.unm.edu&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
[Backported to 3.4: adjust context]
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c3b4ccb8b03769e2867fabecc078483ee6710ccf upstream.

With commits 73f156a6e8c1 ("inetpeer: get rid of ip_id_count") and
04ca6973f7c1 ("ip: make IP identifiers less predictable"), IP
identifiers are generated from a counter chosen from an array of
counters indexed by the hash of the outgoing packet header's source
address, destination address, and protocol number.  Thus, in
__ip_make_skb(), we must now call ip_select_ident() only after setting
these fields in the IP header to prevent IP identifiers from being
generated from bogus counters.

IP id sequence before fix: 18174, 5789, 5953, 59420, 59637, ...
After fix: 5967, 6185, 6374, 6600, 6795, 6892, 7051, 7288, ...

Signed-off-by: Jeffrey Knockel &lt;jeffk@cs.unm.edu&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
[Backported to 3.4: adjust context]
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inetpeer: get rid of ip_id_count</title>
<updated>2014-08-14T00:42:35+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-02T12:26:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ad52eef552c7896ec6024ee72fc126167fe5c4e2'/>
<id>ad52eef552c7896ec6024ee72fc126167fe5c4e2</id>
<content type='text'>
[ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ]

Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ]

Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inet: fix possible memory corruption with UDP_CORK and UFO</title>
<updated>2013-11-04T12:23:41+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2013-10-21T22:07:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=478e9a72b5e0cc9ecd8b787db90dec01d283123c'/>
<id>478e9a72b5e0cc9ecd8b787db90dec01d283123c</id>
<content type='text'>
[ This is a simplified -stable version of a set of upstream commits. ]

This is a replacement patch only for stable which does fix the problems
handled by the following two commits in -net:

"ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9)
"ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b)

Three frames are written on a corked udp socket for which the output
netdevice has UFO enabled.  If the first and third frame are smaller than
the mtu and the second one is bigger, we enqueue the second frame with
skb_append_datato_frags without initializing the gso fields. This leads
to the third frame appended regulary and thus constructing an invalid skb.

This fixes the problem by always using skb_append_datato_frags as soon
as the first frag got enqueued to the skb without marking the packet
as SKB_GSO_UDP.

The problem with only two frames for ipv6 was fixed by "ipv6: udp
packets following an UFO enqueued packet need also be handled by UFO"
(2811ebac2521ceac84f2bdae402455baa6a7fb47).

Cc: Jiri Pirko &lt;jiri@resnulli.us&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ This is a simplified -stable version of a set of upstream commits. ]

This is a replacement patch only for stable which does fix the problems
handled by the following two commits in -net:

"ip_output: do skb ufo init for peeked non ufo skb as well" (e93b7d748be887cd7639b113ba7d7ef792a7efb9)
"ip6_output: do skb ufo init for peeked non ufo skb as well" (c547dbf55d5f8cf615ccc0e7265e98db27d3fb8b)

Three frames are written on a corked udp socket for which the output
netdevice has UFO enabled.  If the first and third frame are smaller than
the mtu and the second one is bigger, we enqueue the second frame with
skb_append_datato_frags without initializing the gso fields. This leads
to the third frame appended regulary and thus constructing an invalid skb.

This fixes the problem by always using skb_append_datato_frags as soon
as the first frag got enqueued to the skb without marking the packet
as SKB_GSO_UDP.

The problem with only two frames for ipv6 was fixed by "ipv6: udp
packets following an UFO enqueued packet need also be handled by UFO"
(2811ebac2521ceac84f2bdae402455baa6a7fb47).

Cc: Jiri Pirko &lt;jiri@resnulli.us&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ip: generate unique IP identificator if local fragmentation is allowed</title>
<updated>2013-10-13T22:42:48+00:00</updated>
<author>
<name>Ansis Atteka</name>
<email>aatteka@nicira.com</email>
</author>
<published>2013-09-18T22:29:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f72299da3e1a010a3d77fbed0b9ee6abd0a19911'/>
<id>f72299da3e1a010a3d77fbed0b9ee6abd0a19911</id>
<content type='text'>
[ Upstream commit 703133de331a7a7df47f31fb9de51dc6f68a9de8 ]

If local fragmentation is allowed, then ip_select_ident() and
ip_select_ident_more() need to generate unique IDs to ensure
correct defragmentation on the peer.

For example, if IPsec (tunnel mode) has to encrypt large skbs
that have local_df bit set, then all IP fragments that belonged
to different ESP datagrams would have used the same identificator.
If one of these IP fragments would get lost or reordered, then
peer could possibly stitch together wrong IP fragments that did
not belong to the same datagram. This would lead to a packet loss
or data corruption.

Signed-off-by: Ansis Atteka &lt;aatteka@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 703133de331a7a7df47f31fb9de51dc6f68a9de8 ]

If local fragmentation is allowed, then ip_select_ident() and
ip_select_ident_more() need to generate unique IDs to ensure
correct defragmentation on the peer.

For example, if IPsec (tunnel mode) has to encrypt large skbs
that have local_df bit set, then all IP fragments that belonged
to different ESP datagrams would have used the same identificator.
If one of these IP fragments would get lost or reordered, then
peer could possibly stitch together wrong IP fragments that did
not belong to the same datagram. This would lead to a packet loss
or data corruption.

Signed-off-by: Ansis Atteka &lt;aatteka@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ip: use ip_hdr() in __ip_make_skb() to retrieve IP header</title>
<updated>2013-10-13T22:42:47+00:00</updated>
<author>
<name>Ansis Atteka</name>
<email>aatteka@nicira.com</email>
</author>
<published>2013-09-18T22:29:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=832ae42a43dd7ea2a39d7cc0687363d0039da850'/>
<id>832ae42a43dd7ea2a39d7cc0687363d0039da850</id>
<content type='text'>
[ Upstream commit 749154aa56b57652a282cbde57a57abc278d1205 ]

skb-&gt;data already points to IP header, but for the sake of
consistency we can also use ip_hdr() to retrieve it.

Signed-off-by: Ansis Atteka &lt;aatteka@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 749154aa56b57652a282cbde57a57abc278d1205 ]

skb-&gt;data already points to IP header, but for the sake of
consistency we can also use ip_hdr() to retrieve it.

Signed-off-by: Ansis Atteka &lt;aatteka@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Remove all #inclusions of asm/system.h</title>
<updated>2012-03-28T17:30:03+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2012-03-28T17:30:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9ffc93f203c18a70623f21950f1dd473c9ec48cd'/>
<id>9ffc93f203c18a70623f21950f1dd473c9ec48cd</id>
<content type='text'>
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it.  Performed with the following command:

perl -p -i -e 's!^#\s*include\s*&lt;asm/system[.]h&gt;.*\n!!' `grep -Irl '^#\s*include\s*&lt;asm/system[.]h&gt;' *`

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it.  Performed with the following command:

perl -p -i -e 's!^#\s*include\s*&lt;asm/system[.]h&gt;.*\n!!' `grep -Irl '^#\s*include\s*&lt;asm/system[.]h&gt;' *`

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.</title>
<updated>2011-12-05T20:20:19+00:00</updated>
<author>
<name>David Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-12-02T16:52:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2721745501a26d0dc3b88c0d2f3aa11471891388'/>
<id>2721745501a26d0dc3b88c0d2f3aa11471891388</id>
<content type='text'>
To reflect the fact that a refrence is not obtained to the
resulting neighbour entry.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To reflect the fact that a refrence is not obtained to the
resulting neighbour entry.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: use a 64bit load/store in output path</title>
<updated>2011-12-01T18:28:54+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-11-30T19:00:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=84f9307c5da84a7c8513e7607dc8427d2cbd63e3'/>
<id>84f9307c5da84a7c8513e7607dc8427d2cbd63e3</id>
<content type='text'>
gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In IP header, daddr immediately follows saddr, this wont change in the
future. We only need to make sure our flowi4 (saddr,daddr) fields wont
break the rule.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In IP header, daddr immediately follows saddr, this wont change in the
future. We only need to make sure our flowi4 (saddr,daddr) fields wont
break the rule.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT</title>
<updated>2011-10-24T07:06:21+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-10-24T07:06:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=66b13d99d96a1a69f47a6bc3dc47f45955967377'/>
<id>66b13d99d96a1a69f47a6bc3dc47f45955967377</id>
<content type='text'>
There is a long standing bug in linux tcp stack, about ACK messages sent
on behalf of TIME_WAIT sockets.

In the IP header of the ACK message, we choose to reflect TOS field of
incoming message, and this might break some setups.

Example of things that were broken :
  - Routing using TOS as a selector
  - Firewalls
  - Trafic classification / shaping

We now remember in timewait structure the inet tos field and use it in
ACK generation, and route lookup.

Notes :
 - We still reflect incoming TOS in RST messages.
 - We could extend MuraliRaja Muniraju patch to report TOS value in
netlink messages for TIME_WAIT sockets.
 - A patch is needed for IPv6

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is a long standing bug in linux tcp stack, about ACK messages sent
on behalf of TIME_WAIT sockets.

In the IP header of the ACK message, we choose to reflect TOS field of
incoming message, and this might break some setups.

Example of things that were broken :
  - Routing using TOS as a selector
  - Firewalls
  - Trafic classification / shaping

We now remember in timewait structure the inet tos field and use it in
ACK generation, and route lookup.

Notes :
 - We still reflect incoming TOS in RST messages.
 - We could extend MuraliRaja Muniraju patch to report TOS value in
netlink messages for TIME_WAIT sockets.
 - A patch is needed for IPv6

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add skb frag size accessors</title>
<updated>2011-10-19T07:10:46+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-10-18T21:00:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9e903e085262ffbf1fc44a17ac06058aca03524a'/>
<id>9e903e085262ffbf1fc44a17ac06058aca03524a</id>
<content type='text'>
To ease skb-&gt;truesize sanitization, its better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To ease skb-&gt;truesize sanitization, its better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
