<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/ipv4/tcp_timer.c, branch v2.6.27</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>net: convert BUG_TRAP to generic WARN_ON</title>
<updated>2008-07-26T04:43:18+00:00</updated>
<author>
<name>Ilpo Järvinen</name>
<email>ilpo.jarvinen@helsinki.fi</email>
</author>
<published>2008-07-26T04:43:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=547b792cac0a038b9dbf958d3c120df3740b5572'/>
<id>547b792cac0a038b9dbf958d3c120df3740b5572</id>
<content type='text'>
Removes legacy reinvent-the-wheel type thing. The generic
machinery integrates much better to automated debugging aids
such as kerneloops.org (and others), and is unambiguous due to
better naming. Non-intuively BUG_TRAP() is actually equal to
WARN_ON() rather than BUG_ON() though some might actually be
promoted to BUG_ON() but I left that to future.

I could make at least one BUILD_BUG_ON conversion.

Signed-off-by: Ilpo Järvinen &lt;ilpo.jarvinen@helsinki.fi&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Removes legacy reinvent-the-wheel type thing. The generic
machinery integrates much better to automated debugging aids
such as kerneloops.org (and others), and is unambiguous due to
better naming. Non-intuively BUG_TRAP() is actually equal to
WARN_ON() rather than BUG_ON() though some might actually be
promoted to BUG_ON() but I left that to future.

I could make at least one BUILD_BUG_ON conversion.

Signed-off-by: Ilpo Järvinen &lt;ilpo.jarvinen@helsinki.fi&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mib: add net to NET_INC_STATS_BH</title>
<updated>2008-07-17T03:31:16+00:00</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@openvz.org</email>
</author>
<published>2008-07-17T03:31:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=de0744af1fe2d0a3d428f6af0f2fe1f6179b1a9c'/>
<id>de0744af1fe2d0a3d428f6af0f2fe1f6179b1a9c</id>
<content type='text'>
Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: de-bloat a bit with factoring NET_INC_STATS_BH out</title>
<updated>2008-07-03T08:05:41+00:00</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@openvz.org</email>
</author>
<published>2008-07-03T08:05:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=40b215e594b65a3488576c9d24b367548e18902a'/>
<id>40b215e594b65a3488576c9d24b367548e18902a</id>
<content type='text'>
There are some places in TCP that select one MIB index to
bump snmp statistics like this:

	if (&lt;something&gt;)
		NET_INC_STATS_BH(&lt;some_id&gt;);
	else if (&lt;something_else&gt;)
		NET_INC_STATS_BH(&lt;some_other_id&gt;);
	...
	else
		NET_INC_STATS_BH(&lt;default_id&gt;);

or in a more tricky but still similar way.

On the other hand, this NET_INC_STATS_BH is a camouflaged
increment of percpu variable, which is not that small.

Factoring those cases out de-bloats 235 bytes on non-preemptible
i386 config and drives parts of the code into 80 columns.

add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-235 (-235)
function                                     old     new   delta
tcp_fastretrans_alert                       1437    1424     -13
tcp_dsack_set                                137     124     -13
tcp_xmit_retransmit_queue                    690     676     -14
tcp_try_undo_recovery                        283     265     -18
tcp_sacktag_write_queue                     1550    1515     -35
tcp_update_reordering                        162     106     -56
tcp_retransmit_timer                         990     904     -86

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are some places in TCP that select one MIB index to
bump snmp statistics like this:

	if (&lt;something&gt;)
		NET_INC_STATS_BH(&lt;some_id&gt;);
	else if (&lt;something_else&gt;)
		NET_INC_STATS_BH(&lt;some_other_id&gt;);
	...
	else
		NET_INC_STATS_BH(&lt;default_id&gt;);

or in a more tricky but still similar way.

On the other hand, this NET_INC_STATS_BH is a camouflaged
increment of percpu variable, which is not that small.

Factoring those cases out de-bloats 235 bytes on non-preemptible
i386 config and drives parts of the code into 80 columns.

add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-235 (-235)
function                                     old     new   delta
tcp_fastretrans_alert                       1437    1424     -13
tcp_dsack_set                                137     124     -13
tcp_xmit_retransmit_queue                    690     676     -14
tcp_try_undo_recovery                        283     265     -18
tcp_sacktag_write_queue                     1550    1515     -35
tcp_update_reordering                        162     106     -56
tcp_retransmit_timer                         990     904     -86

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6</title>
<updated>2008-06-14T03:52:39+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2008-06-14T03:52:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4ae127d1b6c71f9240dd4245f240e6dd8fc98014'/>
<id>4ae127d1b6c71f9240dd4245f240e6dd8fc98014</id>
<content type='text'>
Conflicts:

	drivers/net/smc911x.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:

	drivers/net/smc911x.c
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Revert 'process defer accept as established' changes.</title>
<updated>2008-06-12T23:34:35+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2008-06-12T23:31:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ec0a196626bd12e0ba108d7daa6d95a4fb25c2c5'/>
<id>ec0a196626bd12e0ba108d7daa6d95a4fb25c2c5</id>
<content type='text'>
This reverts two changesets, ec3c0982a2dd1e671bad8e9d26c28dcba0039d87
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0adbf471c7a6b80102e38e1d5a346b3b38
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").

This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.

Ilpo Järvinen first spotted the locking problems.  The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.

Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock.  The normal ordering is parent listening
socket --&gt; child socket, but this code path would require the
reverse lock ordering.

Next is a problem noticed by Vitaliy Gusev, he noted:

----------------------------------------
&gt;--- a/net/ipv4/tcp_timer.c
&gt;+++ b/net/ipv4/tcp_timer.c
&gt;@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
&gt; 		goto death;
&gt; 	}
&gt;
&gt;+	if (tp-&gt;defer_tcp_accept.request &amp;&amp; sk-&gt;sk_state == TCP_ESTABLISHED) {
&gt;+		tcp_send_active_reset(sk, GFP_ATOMIC);
&gt;+		goto death;

Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------

Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:

----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------

So revert this thing for now.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts two changesets, ec3c0982a2dd1e671bad8e9d26c28dcba0039d87
("[TCP]: TCP_DEFER_ACCEPT updates - process as established") and
the follow-on bug fix 9ae27e0adbf471c7a6b80102e38e1d5a346b3b38
("tcp: Fix slab corruption with ipv6 and tcp6fuzz").

This change causes several problems, first reported by Ingo Molnar
as a distcc-over-loopback regression where connections were getting
stuck.

Ilpo Järvinen first spotted the locking problems.  The new function
added by this code, tcp_defer_accept_check(), only has the
child socket locked, yet it is modifying state of the parent
listening socket.

Fixing that is non-trivial at best, because we can't simply just grab
the parent listening socket lock at this point, because it would
create an ABBA deadlock.  The normal ordering is parent listening
socket --&gt; child socket, but this code path would require the
reverse lock ordering.

Next is a problem noticed by Vitaliy Gusev, he noted:

----------------------------------------
&gt;--- a/net/ipv4/tcp_timer.c
&gt;+++ b/net/ipv4/tcp_timer.c
&gt;@@ -481,6 +481,11 @@ static void tcp_keepalive_timer (unsigned long data)
&gt; 		goto death;
&gt; 	}
&gt;
&gt;+	if (tp-&gt;defer_tcp_accept.request &amp;&amp; sk-&gt;sk_state == TCP_ESTABLISHED) {
&gt;+		tcp_send_active_reset(sk, GFP_ATOMIC);
&gt;+		goto death;

Here socket sk is not attached to listening socket's request queue. tcp_done()
will not call inet_csk_destroy_sock() (and tcp_v4_destroy_sock() which should
release this sk) as socket is not DEAD. Therefore socket sk will be lost for
freeing.
----------------------------------------

Finally, Alexey Kuznetsov argues that there might not even be any
real value or advantage to these new semantics even if we fix all
of the bugs:

----------------------------------------
Hiding from accept() sockets with only out-of-order data only
is the only thing which is impossible with old approach. Is this really
so valuable? My opinion: no, this is nothing but a new loophole
to consume memory without control.
----------------------------------------

So revert this thing for now.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: remove CVS keywords</title>
<updated>2008-06-12T04:00:38+00:00</updated>
<author>
<name>Adrian Bunk</name>
<email>bunk@kernel.org</email>
</author>
<published>2008-06-11T05:46:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0b040829952d84bf2a62526f0e24b624e0699447'/>
<id>0b040829952d84bf2a62526f0e24b624e0699447</id>
<content type='text'>
This patch removes CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch removes CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[TCP]: Format addresses appropriately in debug messages.</title>
<updated>2008-04-14T11:09:36+00:00</updated>
<author>
<name>YOSHIFUJI Hideaki</name>
<email>yoshfuji@linux-ipv6.org</email>
</author>
<published>2008-04-14T11:09:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=569508c964a8b5235e00998523bc3acd3f6aff01'/>
<id>569508c964a8b5235e00998523bc3acd3f6aff01</id>
<content type='text'>
Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[IPV4]: Use NIPQUAD_FMT to format ipv4 addresses.</title>
<updated>2008-04-14T11:09:00+00:00</updated>
<author>
<name>YOSHIFUJI Hideaki</name>
<email>yoshfuji@linux-ipv6.org</email>
</author>
<published>2008-04-14T11:09:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a7d632b6b4ad1c92746ed409e41f9dc571ec04e2'/>
<id>a7d632b6b4ad1c92746ed409e41f9dc571ec04e2</id>
<content type='text'>
And use %u to format port.

Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
And use %u to format port.

Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[TCP]: TCP_DEFER_ACCEPT updates - process as established</title>
<updated>2008-03-21T23:33:01+00:00</updated>
<author>
<name>Patrick McManus</name>
<email>mcmanus@ducksong.com</email>
</author>
<published>2008-03-21T23:33:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ec3c0982a2dd1e671bad8e9d26c28dcba0039d87'/>
<id>ec3c0982a2dd1e671bad8e9d26c28dcba0039d87</id>
<content type='text'>
Change TCP_DEFER_ACCEPT implementation so that it transitions a
connection to ESTABLISHED after handshake is complete instead of
leaving it in SYN-RECV until some data arrvies. Place connection in
accept queue when first data packet arrives from slow path.

Benefits:
  - established connection is now reset if it never makes it
   to the accept queue

 - diagnostic state of established matches with the packet traces
   showing completed handshake

 - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
   enforced with reasonable accuracy instead of rounding up to next
   exponential back-off of syn-ack retry.

Signed-off-by: Patrick McManus &lt;mcmanus@ducksong.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change TCP_DEFER_ACCEPT implementation so that it transitions a
connection to ESTABLISHED after handshake is complete instead of
leaving it in SYN-RECV until some data arrvies. Place connection in
accept queue when first data packet arrives from slow path.

Benefits:
  - established connection is now reset if it never makes it
   to the accept queue

 - diagnostic state of established matches with the packet traces
   showing completed handshake

 - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be
   enforced with reasonable accuracy instead of rounding up to next
   exponential back-off of syn-ack retry.

Signed-off-by: Patrick McManus &lt;mcmanus@ducksong.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[TCP]: Do not purge sk_forward_alloc entirely in tcp_delack_timer().</title>
<updated>2008-01-28T23:01:42+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2008-01-11T05:56:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9993e7d313e80bdc005d09c7def91903e0068f07'/>
<id>9993e7d313e80bdc005d09c7def91903e0068f07</id>
<content type='text'>
Otherwise we beat heavily on the global tcp_memory atomics
when all of the sockets in the system are slowly sending
perioding packet clumps.

Noticed and suggested by Eric Dumazet.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Otherwise we beat heavily on the global tcp_memory atomics
when all of the sockets in the system are slowly sending
perioding packet clumps.

Noticed and suggested by Eric Dumazet.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
