<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ipv4/tcp_input.c, branch linux-4.2.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>tcp: initialize tp-&gt;copied_seq in case of cross SYN connection</title>
<updated>2015-12-15T05:25:38+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-11-26T16:18:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=24be0611037fd6ceddbf2e6206163a2cfbcaa5d9'/>
<id>24be0611037fd6ceddbf2e6206163a2cfbcaa5d9</id>
<content type='text'>
[ Upstream commit 142a2e7ece8d8ac0e818eb2c91f99ca894730e2a ]

Dmitry provided a syzkaller (http://github.com/google/syzkaller)
generated program that triggers the WARNING at
net/ipv4/tcp.c:1729 in tcp_recvmsg() :

WARN_ON(tp-&gt;copied_seq != tp-&gt;rcv_nxt &amp;&amp;
        !(flags &amp; (MSG_PEEK | MSG_TRUNC)));

His program is specifically attempting a Cross SYN TCP exchange,
that we support (for the pleasure of hackers ?), but it looks we
lack proper tcp-&gt;copied_seq initialization.

Thanks again Dmitry for your report and testings.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Tested-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 142a2e7ece8d8ac0e818eb2c91f99ca894730e2a ]

Dmitry provided a syzkaller (http://github.com/google/syzkaller)
generated program that triggers the WARNING at
net/ipv4/tcp.c:1729 in tcp_recvmsg() :

WARN_ON(tp-&gt;copied_seq != tp-&gt;rcv_nxt &amp;&amp;
        !(flags &amp; (MSG_PEEK | MSG_TRUNC)));

His program is specifically attempting a Cross SYN TCP exchange,
that we support (for the pleasure of hackers ?), but it looks we
lack proper tcp-&gt;copied_seq initialization.

Thanks again Dmitry for your report and testings.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Tested-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: fix potential huge kmalloc() calls in TCP_REPAIR</title>
<updated>2015-12-15T05:25:37+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-11-19T05:03:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a37ce6a70998a744ec01d5f7310b79822b409c72'/>
<id>a37ce6a70998a744ec01d5f7310b79822b409c72</id>
<content type='text'>
[ Upstream commit 5d4c9bfbabdb1d497f21afd81501e5c54b0c85d9 ]

tcp_send_rcvq() is used for re-injecting data into tcp receive queue.

Problems :

- No check against size is performed, allowed user to fool kernel in
  attempting very large memory allocations, eventually triggering
  OOM when memory is fragmented.

- In case of fault during the copy we do not return correct errno.

Lets use alloc_skb_with_frags() to cook optimal skbs.

Fixes: 292e8d8c8538 ("tcp: Move rcvq sending to tcp_input.c")
Fixes: c0e88ff0f256 ("tcp: Repair socket queues")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Acked-by: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5d4c9bfbabdb1d497f21afd81501e5c54b0c85d9 ]

tcp_send_rcvq() is used for re-injecting data into tcp receive queue.

Problems :

- No check against size is performed, allowed user to fool kernel in
  attempting very large memory allocations, eventually triggering
  OOM when memory is fragmented.

- In case of fault during the copy we do not return correct errno.

Lets use alloc_skb_with_frags() to cook optimal skbs.

Fixes: 292e8d8c8538 ("tcp: Move rcvq sending to tcp_input.c")
Fixes: c0e88ff0f256 ("tcp: Repair socket queues")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Acked-by: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: don't use F-RTO on non-recurring timeouts</title>
<updated>2015-07-16T00:17:21+00:00</updated>
<author>
<name>Yuchung Cheng</name>
<email>ycheng@google.com</email>
</author>
<published>2015-07-13T19:10:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f82b681a511f4d61069e9586a9cf97bdef371ef3'/>
<id>f82b681a511f4d61069e9586a9cf97bdef371ef3</id>
<content type='text'>
Currently F-RTO may repeatedly send new data packets on non-recurring
timeouts in CA_Loss mode. This is a bug because F-RTO (RFC5682)
should only be used on either new recovery or recurring timeouts.

This exacerbates the recovery progress during frequent timeout &amp;
repair, because we prioritize sending new data packets instead of
repairing the holes when the bandwidth is already scarce.

Fix it by correcting the test of a new recovery episode.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently F-RTO may repeatedly send new data packets on non-recurring
timeouts in CA_Loss mode. This is a bug because F-RTO (RFC5682)
should only be used on either new recovery or recurring timeouts.

This exacerbates the recovery progress during frequent timeout &amp;
repair, because we prioritize sending new data packets instead of
repairing the holes when the bandwidth is already scarce.

Fix it by correcting the test of a new recovery episode.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: fill shinfo-&gt;gso_size at last moment</title>
<updated>2015-06-11T23:33:11+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-06-11T16:15:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f69ad292cfd13aa7ee00847320c6bb9ba2154e87'/>
<id>f69ad292cfd13aa7ee00847320c6bb9ba2154e87</id>
<content type='text'>
In commit cd7d8498c9a5 ("tcp: change tcp_skb_pcount() location") we stored
gso_segs in a temporary cache hot location.

This patch does the same for gso_size.

This allows to save 2 cache line misses in tcp xmit path for
the last packet that is considered but not sent because of
various conditions (cwnd, tso defer, receiver window, TSQ...)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In commit cd7d8498c9a5 ("tcp: change tcp_skb_pcount() location") we stored
gso_segs in a temporary cache hot location.

This patch does the same for gso_size.

This allows to save 2 cache line misses in tcp xmit path for
the last packet that is considered but not sent because of
various conditions (cwnd, tso defer, receiver window, TSQ...)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: fill shinfo-&gt;gso_type at last moment</title>
<updated>2015-06-11T23:33:11+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-06-11T16:15:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=51466a7545b73b7ad7bcfb33410d2823ccfaa501'/>
<id>51466a7545b73b7ad7bcfb33410d2823ccfaa501</id>
<content type='text'>
Our goal is to touch skb_shinfo(skb) only when absolutely needed,
to avoid two cache line misses in TCP output path for last skb
that is considered but not sent because of various conditions
(cwnd, tso defer, receiver window, TSQ...)

A packet is GSO only when skb_shinfo(skb)-&gt;gso_size is not zero.

We can set skb_shinfo(skb)-&gt;gso_type to sk-&gt;sk_gso_type even for
non GSO packets.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Our goal is to touch skb_shinfo(skb) only when absolutely needed,
to avoid two cache line misses in TCP output path for last skb
that is considered but not sent because of various conditions
(cwnd, tso defer, receiver window, TSQ...)

A packet is GSO only when skb_shinfo(skb)-&gt;gso_size is not zero.

We can set skb_shinfo(skb)-&gt;gso_type to sk-&gt;sk_gso_type even for
non GSO packets.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: export tcp_enter_cwr()</title>
<updated>2015-06-11T07:09:12+00:00</updated>
<author>
<name>Kenneth Klette Jonassen</name>
<email>kennetkl@ifi.uio.no</email>
</author>
<published>2015-06-10T17:08:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7782ad8bb55c18f634a6713ec38c8f1ae86ad0b5'/>
<id>7782ad8bb55c18f634a6713ec38c8f1ae86ad0b5</id>
<content type='text'>
Upcoming tcp_cdg uses tcp_enter_cwr() to initiate PRR. Export this
function so that CDG can be compiled as a module.

Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: David Hayes &lt;davihay@ifi.uio.no&gt;
Cc: Andreas Petlund &lt;apetlund@simula.no&gt;
Cc: Dave Taht &lt;dave.taht@bufferbloat.net&gt;
Cc: Nicolas Kuhn &lt;nicolas.kuhn@telecom-bretagne.eu&gt;
Signed-off-by: Kenneth Klette Jonassen &lt;kennetkl@ifi.uio.no&gt;
Acked-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Upcoming tcp_cdg uses tcp_enter_cwr() to initiate PRR. Export this
function so that CDG can be compiled as a module.

Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Stephen Hemminger &lt;stephen@networkplumber.org&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: David Hayes &lt;davihay@ifi.uio.no&gt;
Cc: Andreas Petlund &lt;apetlund@simula.no&gt;
Cc: Dave Taht &lt;dave.taht@bufferbloat.net&gt;
Cc: Nicolas Kuhn &lt;nicolas.kuhn@telecom-bretagne.eu&gt;
Signed-off-by: Kenneth Klette Jonassen &lt;kennetkl@ifi.uio.no&gt;
Acked-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2015-05-23T05:22:35+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-05-23T05:22:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=36583eb54d46c36a447afd6c379839f292397429'/>
<id>36583eb54d46c36a447afd6c379839f292397429</id>
<content type='text'>
Conflicts:
	drivers/net/ethernet/cadence/macb.c
	drivers/net/phy/phy.c
	include/linux/skbuff.h
	net/ipv4/tcp.c
	net/switchdev/switchdev.c

Switchdev was a case of RTNH_H_{EXTERNAL --&gt; OFFLOAD}
renaming overlapping with net-next changes of various
sorts.

phy.c was a case of two changes, one adding a local
variable to a function whilst the second was removing
one.

tcp.c overlapped a deadlock fix with the addition of new tcp_info
statistic values.

macb.c involved the addition of two zyncq device entries.

skbuff.h involved adding back ipv4_daddr to nf_bridge_info
whilst net-next changes put two other existing members of
that struct into a union.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/net/ethernet/cadence/macb.c
	drivers/net/phy/phy.c
	include/linux/skbuff.h
	net/ipv4/tcp.c
	net/switchdev/switchdev.c

Switchdev was a case of RTNH_H_{EXTERNAL --&gt; OFFLOAD}
renaming overlapping with net-next changes of various
sorts.

phy.c was a case of two changes, one adding a local
variable to a function whilst the second was removing
one.

tcp.c overlapped a deadlock fix with the addition of new tcp_info
statistic values.

macb.c involved the addition of two zyncq device entries.

skbuff.h involved adding back ipv4_daddr to nf_bridge_info
whilst net-next changes put two other existing members of
that struct into a union.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: fix a potential deadlock in tcp_get_info()</title>
<updated>2015-05-22T17:46:06+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-05-22T04:51:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d654976cbf852ee20612ee10dbe57cdacda9f452'/>
<id>d654976cbf852ee20612ee10dbe57cdacda9f452</id>
<content type='text'>
Taking socket spinlock in tcp_get_info() can deadlock, as
inet_diag_dump_icsk() holds the &amp;hashinfo-&gt;ehash_locks[i],
while packet processing can use the reverse locking order.

We could avoid this locking for TCP_LISTEN states, but lockdep would
certainly get confused as all TCP sockets share same lockdep classes.

[  523.722504] ======================================================
[  523.728706] [ INFO: possible circular locking dependency detected ]
[  523.734990] 4.1.0-dbg-DEV #1676 Not tainted
[  523.739202] -------------------------------------------------------
[  523.745474] ss/18032 is trying to acquire lock:
[  523.750002]  (slock-AF_INET){+.-...}, at: [&lt;ffffffff81669d44&gt;] tcp_get_info+0x2c4/0x360
[  523.758129]
[  523.758129] but task is already holding lock:
[  523.763968]  (&amp;(&amp;hashinfo-&gt;ehash_locks[i])-&gt;rlock){+.-...}, at: [&lt;ffffffff816bcb75&gt;] inet_diag_dump_icsk+0x1d5/0x6c0
[  523.774661]
[  523.774661] which lock already depends on the new lock.
[  523.774661]
[  523.782850]
[  523.782850] the existing dependency chain (in reverse order) is:
[  523.790326]
-&gt; #1 (&amp;(&amp;hashinfo-&gt;ehash_locks[i])-&gt;rlock){+.-...}:
[  523.796599]        [&lt;ffffffff811126bb&gt;] lock_acquire+0xbb/0x270
[  523.802565]        [&lt;ffffffff816f5868&gt;] _raw_spin_lock+0x38/0x50
[  523.808628]        [&lt;ffffffff81665af8&gt;] __inet_hash_nolisten+0x78/0x110
[  523.815273]        [&lt;ffffffff816819db&gt;] tcp_v4_syn_recv_sock+0x24b/0x350
[  523.822067]        [&lt;ffffffff81684d41&gt;] tcp_check_req+0x3c1/0x500
[  523.828199]        [&lt;ffffffff81682d09&gt;] tcp_v4_do_rcv+0x239/0x3d0
[  523.834331]        [&lt;ffffffff816842fe&gt;] tcp_v4_rcv+0xa8e/0xc10
[  523.840202]        [&lt;ffffffff81658fa3&gt;] ip_local_deliver_finish+0x133/0x3e0
[  523.847214]        [&lt;ffffffff81659a9a&gt;] ip_local_deliver+0xaa/0xc0
[  523.853440]        [&lt;ffffffff816593b8&gt;] ip_rcv_finish+0x168/0x5c0
[  523.859624]        [&lt;ffffffff81659db7&gt;] ip_rcv+0x307/0x420

Lets use u64_sync infrastructure instead. As a bonus, 64bit
arches get optimized, as these are nop for them.

Fixes: 0df48c26d841 ("tcp: add tcpi_bytes_acked to tcp_info")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Taking socket spinlock in tcp_get_info() can deadlock, as
inet_diag_dump_icsk() holds the &amp;hashinfo-&gt;ehash_locks[i],
while packet processing can use the reverse locking order.

We could avoid this locking for TCP_LISTEN states, but lockdep would
certainly get confused as all TCP sockets share same lockdep classes.

[  523.722504] ======================================================
[  523.728706] [ INFO: possible circular locking dependency detected ]
[  523.734990] 4.1.0-dbg-DEV #1676 Not tainted
[  523.739202] -------------------------------------------------------
[  523.745474] ss/18032 is trying to acquire lock:
[  523.750002]  (slock-AF_INET){+.-...}, at: [&lt;ffffffff81669d44&gt;] tcp_get_info+0x2c4/0x360
[  523.758129]
[  523.758129] but task is already holding lock:
[  523.763968]  (&amp;(&amp;hashinfo-&gt;ehash_locks[i])-&gt;rlock){+.-...}, at: [&lt;ffffffff816bcb75&gt;] inet_diag_dump_icsk+0x1d5/0x6c0
[  523.774661]
[  523.774661] which lock already depends on the new lock.
[  523.774661]
[  523.782850]
[  523.782850] the existing dependency chain (in reverse order) is:
[  523.790326]
-&gt; #1 (&amp;(&amp;hashinfo-&gt;ehash_locks[i])-&gt;rlock){+.-...}:
[  523.796599]        [&lt;ffffffff811126bb&gt;] lock_acquire+0xbb/0x270
[  523.802565]        [&lt;ffffffff816f5868&gt;] _raw_spin_lock+0x38/0x50
[  523.808628]        [&lt;ffffffff81665af8&gt;] __inet_hash_nolisten+0x78/0x110
[  523.815273]        [&lt;ffffffff816819db&gt;] tcp_v4_syn_recv_sock+0x24b/0x350
[  523.822067]        [&lt;ffffffff81684d41&gt;] tcp_check_req+0x3c1/0x500
[  523.828199]        [&lt;ffffffff81682d09&gt;] tcp_v4_do_rcv+0x239/0x3d0
[  523.834331]        [&lt;ffffffff816842fe&gt;] tcp_v4_rcv+0xa8e/0xc10
[  523.840202]        [&lt;ffffffff81658fa3&gt;] ip_local_deliver_finish+0x133/0x3e0
[  523.847214]        [&lt;ffffffff81659a9a&gt;] ip_local_deliver+0xaa/0xc0
[  523.853440]        [&lt;ffffffff816593b8&gt;] ip_rcv_finish+0x168/0x5c0
[  523.859624]        [&lt;ffffffff81659db7&gt;] ip_rcv+0x307/0x420

Lets use u64_sync infrastructure instead. As a bonus, 64bit
arches get optimized, as these are nop for them.

Fixes: 0df48c26d841 ("tcp: add tcpi_bytes_acked to tcp_info")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: don't over-send F-RTO probes</title>
<updated>2015-05-19T20:36:57+00:00</updated>
<author>
<name>Yuchung Cheng</name>
<email>ycheng@google.com</email>
</author>
<published>2015-05-18T19:31:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b7b0ed910cd8450db6d98cd4361c644bb1c88412'/>
<id>b7b0ed910cd8450db6d98cd4361c644bb1c88412</id>
<content type='text'>
After sending the new data packets to probe (step 2), F-RTO may
incorrectly send more probes if the next ACK advances SND_UNA and
does not sack new packet. However F-RTO RFC 5682 probes at most
once. This bug may cause sender to always send new data instead of
repairing holes, inducing longer HoL blocking on the receiver for
the application.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After sending the new data packets to probe (step 2), F-RTO may
incorrectly send more probes if the next ACK advances SND_UNA and
does not sack new packet. However F-RTO RFC 5682 probes at most
once. This bug may cause sender to always send new data instead of
repairing holes, inducing longer HoL blocking on the receiver for
the application.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: only undo on partial ACKs in CA_Loss</title>
<updated>2015-05-19T20:36:57+00:00</updated>
<author>
<name>Yuchung Cheng</name>
<email>ycheng@google.com</email>
</author>
<published>2015-05-18T19:31:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=da34ac7626b571d262f92b93f11eb32dd58d9c4e'/>
<id>da34ac7626b571d262f92b93f11eb32dd58d9c4e</id>
<content type='text'>
Undo based on TCP timestamps should only happen on ACKs that advance
SND_UNA, according to the Eifel algorithm in RFC 3522:

Section 3.2:

  (4) If the value of the Timestamp Echo Reply field of the
      acceptable ACK's Timestamps option is smaller than the
      value of RetransmitTS, then proceed to step (5),

Section Terminology:
   We use the term 'acceptable ACK' as defined in [RFC793].  That is an
   ACK that acknowledges previously unacknowledged data.

This is because upon receiving an out-of-order packet, the receiver
returns the last timestamp that advances RCV_NXT, not the current
timestamp of the packet in the DUPACK. Without checking the flag,
the DUPACK will cause tcp_packet_delayed() to return true and
tcp_try_undo_loss() will revert cwnd reduction.

Note that we check the condition in CA_Recovery already by only
calling tcp_try_undo_partial() if FLAG_SND_UNA_ADVANCED is set or
tcp_try_undo_recovery() if snd_una crosses high_seq.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Undo based on TCP timestamps should only happen on ACKs that advance
SND_UNA, according to the Eifel algorithm in RFC 3522:

Section 3.2:

  (4) If the value of the Timestamp Echo Reply field of the
      acceptable ACK's Timestamps option is smaller than the
      value of RetransmitTS, then proceed to step (5),

Section Terminology:
   We use the term 'acceptable ACK' as defined in [RFC793].  That is an
   ACK that acknowledges previously unacknowledged data.

This is because upon receiving an out-of-order packet, the receiver
returns the last timestamp that advances RCV_NXT, not the current
timestamp of the packet in the DUPACK. Without checking the flag,
the DUPACK will cause tcp_packet_delayed() to return true and
tcp_try_undo_loss() will revert cwnd reduction.

Note that we check the condition in CA_Recovery already by only
calling tcp_try_undo_partial() if FLAG_SND_UNA_ADVANCED is set or
tcp_try_undo_recovery() if snd_una crosses high_seq.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
