<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/include/net/tcp.h, branch v3.13</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>tcp: properly handle stretch acks in slow start</title>
<updated>2013-11-05T00:57:59+00:00</updated>
<author>
<name>Yuchung Cheng</name>
<email>ycheng@google.com</email>
</author>
<published>2013-10-31T18:07:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9f9843a751d0a2057f9f3d313886e7e5e6ebaac9'/>
<id>9f9843a751d0a2057f9f3d313886e7e5e6ebaac9</id>
<content type='text'>
Slow start now increases cwnd by 1 if an ACK acknowledges some packets,
regardless the number of packets. Consequently slow start performance
is highly dependent on the degree of the stretch ACKs caused by
receiver or network ACK compression mechanisms (e.g., delayed-ACK,
GRO, etc).  But slow start algorithm is to send twice the amount of
packets of packets left so it should process a stretch ACK of degree
N as if N ACKs of degree 1, then exits when cwnd exceeds ssthresh. A
follow up patch will use the remainder of the N (if greater than 1)
to adjust cwnd in the congestion avoidance phase.

In addition this patch retires the experimental limited slow start
(LSS) feature. LSS has multiple drawbacks but questionable benefit. The
fractional cwnd increase in LSS requires a loop in slow start even
though it's rarely used. Configuring such an increase step via a global
sysctl on different BDPS seems hard. Finally and most importantly the
slow start overshoot concern is now better covered by the Hybrid slow
start (hystart) enabled by default.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Slow start now increases cwnd by 1 if an ACK acknowledges some packets,
regardless the number of packets. Consequently slow start performance
is highly dependent on the degree of the stretch ACKs caused by
receiver or network ACK compression mechanisms (e.g., delayed-ACK,
GRO, etc).  But slow start algorithm is to send twice the amount of
packets of packets left so it should process a stretch ACK of degree
N as if N ACKs of degree 1, then exits when cwnd exceeds ssthresh. A
follow up patch will use the remainder of the N (if greater than 1)
to adjust cwnd in the congestion avoidance phase.

In addition this patch retires the experimental limited slow start
(LSS) feature. LSS has multiple drawbacks but questionable benefit. The
fractional cwnd increase in LSS requires a loop in slow start even
though it's rarely used. Configuring such an increase step via a global
sysctl on different BDPS seems hard. Finally and most importantly the
slow start overshoot concern is now better covered by the Hybrid slow
start (hystart) enabled by default.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp_memcontrol: Remove the per netns control.</title>
<updated>2013-10-21T22:43:02+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2013-10-19T23:25:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a4fe34bf902b8f709c635ab37f1f39de0b86cff2'/>
<id>a4fe34bf902b8f709c635ab37f1f39de0b86cff2</id>
<content type='text'>
The code that is implemented is per memory cgroup not per netns, and
having per netns bits is just confusing.  Remove the per netns bits to
make it easier to see what is really going on.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The code that is implemented is per memory cgroup not per netns, and
having per netns bits is just confusing.  Remove the per netns bits to
make it easier to see what is really going on.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: switch tcp_fastopen key generation to net_get_random_once</title>
<updated>2013-10-19T23:45:35+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2013-10-19T19:48:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=222e83d2e0aecb6a5e8d42b1a8d51332a1eba960'/>
<id>222e83d2e0aecb6a5e8d42b1a8d51332a1eba960</id>
<content type='text'>
Changed key initialization of tcp_fastopen cookies to net_get_random_once.

If the user sets a custom key net_get_random_once must be called at
least once to ensure we don't overwrite the user provided key when the
first cookie is generated later on.

Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Changed key initialization of tcp_fastopen cookies to net_get_random_once.

If the user sets a custom key net_get_random_once must be called at
least once to ensure we don't overwrite the user provided key when the
first cookie is generated later on.

Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inet: split syncookie keys for ipv4 and ipv6 and initialize with net_get_random_once</title>
<updated>2013-10-19T23:45:35+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2013-10-19T19:48:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b23a002fc6f0c19846ee0382f019429af54a27e9'/>
<id>b23a002fc6f0c19846ee0382f019429af54a27e9</id>
<content type='text'>
This patch splits the secret key for syncookies for ipv4 and ipv6 and
initializes them with net_get_random_once. This change was the reason I
did this series. I think the initialization of the syncookie_secret is
way to early.

Cc: Florian Westphal &lt;fw@strlen.de&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch splits the secret key for syncookies for ipv4 and ipv6 and
initializes them with net_get_random_once. This change was the reason I
did this series. I think the initialization of the syncookie_secret is
way to early.

Cc: Florian Westphal &lt;fw@strlen.de&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: rename tcp_tso_segment()</title>
<updated>2013-10-18T17:38:39+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-10-18T17:36:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=28be6e07e8bccee76b51bca8fdba52c1b28fc77c'/>
<id>28be6e07e8bccee76b51bca8fdba52c1b28fc77c</id>
<content type='text'>
Rename tcp_tso_segment() to tcp_gso_segment(), to better reflect
what is going on, and ease grep games.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rename tcp_tso_segment() to tcp_gso_segment(), to better reflect
what is going on, and ease grep games.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inet: rename ir_loc_port to ir_num</title>
<updated>2013-10-10T18:37:35+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-10-10T07:04:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b44084c2c822f99dd3f2334b288b7e463d222662'/>
<id>b44084c2c822f99dd3f2334b288b7e463d222662</id>
<content type='text'>
In commit 634fb979e8f ("inet: includes a sock_common in request_sock")
I forgot that the two ports in sock_common do not have same byte order :

skc_dport is __be16 (network order), but skc_num is __u16 (host order)

So sparse complains because ir_loc_port (mapped into skc_num) is
considered as __u16 while it should be __be16

Let rename ir_loc_port to ireq-&gt;ir_num (analogy with inet-&gt;inet_num),
and perform appropriate htons/ntohs conversions.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In commit 634fb979e8f ("inet: includes a sock_common in request_sock")
I forgot that the two ports in sock_common do not have same byte order :

skc_dport is __be16 (network order), but skc_num is __u16 (host order)

So sparse complains because ir_loc_port (mapped into skc_num) is
considered as __u16 while it should be __be16

Let rename ir_loc_port to ireq-&gt;ir_num (analogy with inet-&gt;inet_num),
and perform appropriate htons/ntohs conversions.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inet: includes a sock_common in request_sock</title>
<updated>2013-10-10T04:08:07+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-10-09T22:21:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=634fb979e8f3a70f04c1f2f519d0cd1142eb5c1a'/>
<id>634fb979e8f3a70f04c1f2f519d0cd1142eb5c1a</id>
<content type='text'>
TCP listener refactoring, part 5 :

We want to be able to insert request sockets (SYN_RECV) into main
ehash table instead of the per listener hash table to allow RCU
lookups and remove listener lock contention.

This patch includes the needed struct sock_common in front
of struct request_sock

This means there is no more inet6_request_sock IPv6 specific
structure.

Following inet_request_sock fields were renamed as they became
macros to reference fields from struct sock_common.
Prefix ir_ was chosen to avoid name collisions.

loc_port   -&gt; ir_loc_port
loc_addr   -&gt; ir_loc_addr
rmt_addr   -&gt; ir_rmt_addr
rmt_port   -&gt; ir_rmt_port
iif        -&gt; ir_iif

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
TCP listener refactoring, part 5 :

We want to be able to insert request sockets (SYN_RECV) into main
ehash table instead of the per listener hash table to allow RCU
lookups and remove listener lock contention.

This patch includes the needed struct sock_common in front
of struct request_sock

This means there is no more inet6_request_sock IPv6 specific
structure.

Following inet_request_sock fields were renamed as they became
macros to reference fields from struct sock_common.
Prefix ir_ was chosen to avoid name collisions.

loc_port   -&gt; ir_loc_port
loc_addr   -&gt; ir_loc_addr
rmt_addr   -&gt; ir_rmt_addr
rmt_port   -&gt; ir_rmt_port
iif        -&gt; ir_iif

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp/dccp: remove twchain</title>
<updated>2013-10-09T03:19:24+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-10-03T07:22:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=05dbc7b59481ca891bbcfe6799a562d48159fbf7'/>
<id>05dbc7b59481ca891bbcfe6799a562d48159fbf7</id>
<content type='text'>
TCP listener refactoring, part 3 :

Our goal is to hash SYN_RECV sockets into main ehash for fast lookup,
and parallel SYN processing.

Current inet_ehash_bucket contains two chains, one for ESTABLISH (and
friend states) sockets, another for TIME_WAIT sockets only.

As the hash table is sized to get at most one socket per bucket, it
makes little sense to have separate twchain, as it makes the lookup
slightly more complicated, and doubles hash table memory usage.

If we make sure all socket types have the lookup keys at the same
offsets, we can use a generic and faster lookup. It turns out TIME_WAIT
and ESTABLISHED sockets already have common lookup fields for IPv4.

[ INET_TW_MATCH() is no longer needed ]

I'll provide a follow-up to factorize IPv6 lookup as well, to remove
INET6_TW_MATCH()

This way, SYN_RECV pseudo sockets will be supported the same.

A new sock_gen_put() helper is added, doing either a sock_put() or
inet_twsk_put() [ and will support SYN_RECV later ].

Note this helper should only be called in real slow path, when rcu
lookup found a socket that was moved to another identity (freed/reused
immediately), but could eventually be used in other contexts, like
sock_edemux()

Before patch :

dmesg | grep "TCP established"

TCP established hash table entries: 524288 (order: 11, 8388608 bytes)

After patch :

TCP established hash table entries: 524288 (order: 10, 4194304 bytes)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
TCP listener refactoring, part 3 :

Our goal is to hash SYN_RECV sockets into main ehash for fast lookup,
and parallel SYN processing.

Current inet_ehash_bucket contains two chains, one for ESTABLISH (and
friend states) sockets, another for TIME_WAIT sockets only.

As the hash table is sized to get at most one socket per bucket, it
makes little sense to have separate twchain, as it makes the lookup
slightly more complicated, and doubles hash table memory usage.

If we make sure all socket types have the lookup keys at the same
offsets, we can use a generic and faster lookup. It turns out TIME_WAIT
and ESTABLISHED sockets already have common lookup fields for IPv4.

[ INET_TW_MATCH() is no longer needed ]

I'll provide a follow-up to factorize IPv6 lookup as well, to remove
INET6_TW_MATCH()

This way, SYN_RECV pseudo sockets will be supported the same.

A new sock_gen_put() helper is added, doing either a sock_put() or
inet_twsk_put() [ and will support SYN_RECV later ].

Note this helper should only be called in real slow path, when rcu
lookup found a socket that was moved to another identity (freed/reused
immediately), but could eventually be used in other contexts, like
sock_edemux()

Before patch :

dmesg | grep "TCP established"

TCP established hash table entries: 524288 (order: 11, 8388608 bytes)

After patch :

TCP established hash table entries: 524288 (order: 10, 4194304 bytes)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: syncookies: reduce cookie lifetime to 128 seconds</title>
<updated>2013-09-24T14:39:58+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2013-09-20T20:32:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8c27bd75f04fb9cb70c69c3cfe24f4e6d8e15906'/>
<id>8c27bd75f04fb9cb70c69c3cfe24f4e6d8e15906</id>
<content type='text'>
We currently accept cookies that were created less than 4 minutes ago
(ie, cookies with counter delta 0-3).  Combined with the 8 mss table
values, this yields 32 possible values (out of 2**32) that will be valid.

Reducing the lifetime to &lt; 2 minutes halves the guessing chance while
still providing a large enough period.

While at it, get rid of jiffies value -- they overflow too quickly on
32 bit platforms.

getnstimeofday is used to create a counter that increments every 64s.
perf shows getnstimeofday cost is negible compared to sha_transform;
normal tcp initial sequence number generation uses getnstimeofday, too.

Reported-by: Jakob Lell &lt;jakob@jakoblell.com&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We currently accept cookies that were created less than 4 minutes ago
(ie, cookies with counter delta 0-3).  Combined with the 8 mss table
values, this yields 32 possible values (out of 2**32) that will be valid.

Reducing the lifetime to &lt; 2 minutes halves the guessing chance while
still providing a large enough period.

While at it, get rid of jiffies value -- they overflow too quickly on
32 bit platforms.

getnstimeofday is used to create a counter that increments every 64s.
perf shows getnstimeofday cost is negible compared to sha_transform;
normal tcp initial sequence number generation uses getnstimeofday, too.

Reported-by: Jakob Lell &lt;jakob@jakoblell.com&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp.h: Remove extern from function prototypes</title>
<updated>2013-09-23T20:29:40+00:00</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2013-09-23T18:33:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5c9f30236f53f3fbee170e2452dfa0e5dce4727b'/>
<id>5c9f30236f53f3fbee170e2452dfa0e5dce4727b</id>
<content type='text'>
There are a mix of function prototypes with and without extern
in the kernel sources.  Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler.  Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are a mix of function prototypes with and without extern
in the kernel sources.  Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler.  Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
