<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/sched, branch v2.6.14</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>[NET]: Disable NET_SCH_CLK_CPU for SMP x86 hosts</title>
<updated>2005-10-13T21:41:44+00:00</updated>
<author>
<name>Andi Kleen</name>
<email>ak@suse.de</email>
</author>
<published>2005-10-13T21:41:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=34cb711ba922f53cca45443b8c3c1078873cf599'/>
<id>34cb711ba922f53cca45443b8c3c1078873cf599</id>
<content type='text'>
Opterons with frequency scaling have fully unsynchronized TSCs
running at different frequencies, so using TSCs there is not a good idea. 
Also some other x86 boxes have this problem. gettimeofday should be good 
enough, so just disable it.

Signed-off-by: Andi Kleen &lt;ak@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Opterons with frequency scaling have fully unsynchronized TSCs
running at different frequencies, so using TSCs there is not a good idea. 
Also some other x86 boxes have this problem. gettimeofday should be good 
enough, so just disable it.

Signed-off-by: Andi Kleen &lt;ak@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[INET]: speedup inet (tcp/dccp) lookups</title>
<updated>2005-10-03T21:13:38+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>dada1@cosmosbay.com</email>
</author>
<published>2005-10-03T21:13:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=81c3d5470ecc70564eb9209946730fe2be93ad06'/>
<id>81c3d5470ecc70564eb9209946730fe2be93ad06</id>
<content type='text'>
Arnaldo and I agreed it could be applied now, because I have other
pending patches depending on this one (Thank you Arnaldo)

(The other important patch moves skc_refcnt in a separate cache line,
so that the SMP/NUMA performance doesnt suffer from cache line ping pongs)

1) First some performance data :
--------------------------------

tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()

The most time critical code is :

sk_for_each(sk, node, &amp;head-&gt;chain) {
     if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
         goto hit; /* You sunk my battleship! */
}

The sk_for_each() does use prefetch() hints but only the begining of
"struct sock" is prefetched.

As INET_MATCH first comparison uses inet_sk(__sk)-&gt;daddr, wich is far
away from the begining of "struct sock", it has to bring into CPU
cache cold cache line. Each iteration has to use at least 2 cache
lines.

This can be problematic if some chains are very long.

2) The goal
-----------

The idea I had is to change things so that INET_MATCH() may return
FALSE in 99% of cases only using the data already in the CPU cache,
using one cache line per iteration.

3) Description of the patch
---------------------------

Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
filling a 32 bits hole on 64 bits platform.

struct sock_common {
	unsigned short		skc_family;
	volatile unsigned char	skc_state;
	unsigned char		skc_reuse;
	int			skc_bound_dev_if;
	struct hlist_node	skc_node;
	struct hlist_node	skc_bind_node;
	atomic_t		skc_refcnt;
+	unsigned int		skc_hash;
	struct proto		*skc_prot;
};

Store in this 32 bits field the full hash, not masked by (ehash_size -
1) Using this full hash as the first comparison done in INET_MATCH
permits us immediatly skip the element without touching a second cache
line in case of a miss.

Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
sk_hash and tw_hash) already contains the slot number if we mask with
(ehash_size - 1)

File include/net/inet_hashtables.h

64 bits platforms :
#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)-&gt;sk_hash == (__hash))
     ((*((__u64 *)&amp;(inet_sk(__sk)-&gt;daddr)))== (__cookie))   &amp;&amp;  \
     ((*((__u32 *)&amp;(inet_sk(__sk)-&gt;dport))) == (__ports))   &amp;&amp;  \
     (!((__sk)-&gt;sk_bound_dev_if) || ((__sk)-&gt;sk_bound_dev_if == (__dif))))

32bits platforms:
#define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)-&gt;sk_hash == (__hash))                 &amp;&amp;  \
     (inet_sk(__sk)-&gt;daddr          == (__saddr))   &amp;&amp;  \
     (inet_sk(__sk)-&gt;rcv_saddr      == (__daddr))   &amp;&amp;  \
     (!((__sk)-&gt;sk_bound_dev_if) || ((__sk)-&gt;sk_bound_dev_if == (__dif))))


- Adds a prefetch(head-&gt;chain.first) in 
__inet_lookup_established()/__tcp_v4_check_established() and 
__inet6_lookup_established()/__tcp_v6_check_established() and 
__dccp_v4_check_established() to bring into cache the first element of the 
list, before the {read|write}_lock(&amp;head-&gt;lock);

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Acked-by: Arnaldo Carvalho de Melo &lt;acme@ghostprotocols.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Arnaldo and I agreed it could be applied now, because I have other
pending patches depending on this one (Thank you Arnaldo)

(The other important patch moves skc_refcnt in a separate cache line,
so that the SMP/NUMA performance doesnt suffer from cache line ping pongs)

1) First some performance data :
--------------------------------

tcp_v4_rcv() wastes a *lot* of time in __inet_lookup_established()

The most time critical code is :

sk_for_each(sk, node, &amp;head-&gt;chain) {
     if (INET_MATCH(sk, acookie, saddr, daddr, ports, dif))
         goto hit; /* You sunk my battleship! */
}

The sk_for_each() does use prefetch() hints but only the begining of
"struct sock" is prefetched.

As INET_MATCH first comparison uses inet_sk(__sk)-&gt;daddr, wich is far
away from the begining of "struct sock", it has to bring into CPU
cache cold cache line. Each iteration has to use at least 2 cache
lines.

This can be problematic if some chains are very long.

2) The goal
-----------

The idea I had is to change things so that INET_MATCH() may return
FALSE in 99% of cases only using the data already in the CPU cache,
using one cache line per iteration.

3) Description of the patch
---------------------------

Adds a new 'unsigned int skc_hash' field in 'struct sock_common',
filling a 32 bits hole on 64 bits platform.

struct sock_common {
	unsigned short		skc_family;
	volatile unsigned char	skc_state;
	unsigned char		skc_reuse;
	int			skc_bound_dev_if;
	struct hlist_node	skc_node;
	struct hlist_node	skc_bind_node;
	atomic_t		skc_refcnt;
+	unsigned int		skc_hash;
	struct proto		*skc_prot;
};

Store in this 32 bits field the full hash, not masked by (ehash_size -
1) Using this full hash as the first comparison done in INET_MATCH
permits us immediatly skip the element without touching a second cache
line in case of a miss.

Suppress the sk_hashent/tw_hashent fields since skc_hash (aliased to
sk_hash and tw_hash) already contains the slot number if we mask with
(ehash_size - 1)

File include/net/inet_hashtables.h

64 bits platforms :
#define INET_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)-&gt;sk_hash == (__hash))
     ((*((__u64 *)&amp;(inet_sk(__sk)-&gt;daddr)))== (__cookie))   &amp;&amp;  \
     ((*((__u32 *)&amp;(inet_sk(__sk)-&gt;dport))) == (__ports))   &amp;&amp;  \
     (!((__sk)-&gt;sk_bound_dev_if) || ((__sk)-&gt;sk_bound_dev_if == (__dif))))

32bits platforms:
#define TCP_IPV4_MATCH(__sk, __hash, __cookie, __saddr, __daddr, __ports, __dif)\
     (((__sk)-&gt;sk_hash == (__hash))                 &amp;&amp;  \
     (inet_sk(__sk)-&gt;daddr          == (__saddr))   &amp;&amp;  \
     (inet_sk(__sk)-&gt;rcv_saddr      == (__daddr))   &amp;&amp;  \
     (!((__sk)-&gt;sk_bound_dev_if) || ((__sk)-&gt;sk_bound_dev_if == (__dif))))


- Adds a prefetch(head-&gt;chain.first) in 
__inet_lookup_established()/__tcp_v4_check_established() and 
__inet6_lookup_established()/__tcp_v6_check_established() and 
__dccp_v4_check_established() to bring into cache the first element of the 
list, before the {read|write}_lock(&amp;head-&gt;lock);

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Acked-by: Arnaldo Carvalho de Melo &lt;acme@ghostprotocols.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] timer initialization cleanup: DEFINE_TIMER</title>
<updated>2005-09-09T21:03:48+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2005-09-09T20:10:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8d06afab73a75f40ae2864e6c296356bab1ab473'/>
<id>8d06afab73a75f40ae2864e6c296356bab1ab473</id>
<content type='text'>
Clean up timer initialization by introducing DEFINE_TIMER a'la
DEFINE_SPINLOCK.  Build and boot-tested on x86.  A similar patch has been
been in the -RT tree for some time.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Clean up timer initialization by introducing DEFINE_TIMER a'la
DEFINE_SPINLOCK.  Build and boot-tested on x86.  A similar patch has been
been in the -RT tree for some time.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[LIB]: Make TEXTSEARCH_BM plain tristate like the others</title>
<updated>2005-08-29T23:11:11+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2005-08-25T23:23:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=29cb9f9c5502f6218cd3ea574efe46a5e55522d2'/>
<id>29cb9f9c5502f6218cd3ea574efe46a5e55522d2</id>
<content type='text'>
And select it when the relevant modules are enabled.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
And select it when the relevant modules are enabled.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[NETLINK]: Convert netlink users to use group numbers instead of bitmasks</title>
<updated>2005-08-29T23:00:54+00:00</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2005-08-15T02:29:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ac6d439d2097b72ea0cbc2322ce1263a38bc1fd0'/>
<id>ac6d439d2097b72ea0cbc2322ce1263a38bc1fd0</id>
<content type='text'>
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[NET]: Deinline netif_carrier_{on,off}().</title>
<updated>2005-08-29T22:57:08+00:00</updated>
<author>
<name>Denis Vlasenko</name>
<email>vda@ilport.com.ua</email>
</author>
<published>2005-08-11T22:32:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0a242efc4fb859b2da506cdf8f3366231602e4ff'/>
<id>0a242efc4fb859b2da506cdf8f3366231602e4ff</id>
<content type='text'>
# grep -r 'netif_carrier_o[nf]' linux-2.6.12 | wc -l
246

# size vmlinux.org vmlinux.carrier
text    data     bss     dec     hex filename
4339634 1054414  259296 5653344  564360 vmlinux.org
4337710 1054414  259296 5651420  563bdc vmlinux.carrier

And this ain't an allyesconfig kernel!

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
# grep -r 'netif_carrier_o[nf]' linux-2.6.12 | wc -l
246

# size vmlinux.org vmlinux.carrier
text    data     bss     dec     hex filename
4339634 1054414  259296 5653344  564360 vmlinux.org
4337710 1054414  259296 5651420  563bdc vmlinux.carrier

And this ain't an allyesconfig kernel!

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[NET]: Kill skb-&gt;tc_classid</title>
<updated>2005-08-29T22:31:18+00:00</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2005-08-10T02:25:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=abc3bc58047efa72ee9c2e208cbeb73d261ad703'/>
<id>abc3bc58047efa72ee9c2e208cbeb73d261ad703</id>
<content type='text'>
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()</title>
<updated>2005-08-23T17:12:44+00:00</updated>
<author>
<name>Thomas Graf</name>
<email>tgraf@suug.ch</email>
</author>
<published>2005-08-23T17:12:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0fbbeb1ba43bd04f0f1d4f161b7f72437a1c8a03'/>
<id>0fbbeb1ba43bd04f0f1d4f161b7f72437a1c8a03</id>
<content type='text'>
qdisc_create_dflt() is missing to destroy the newly allocated
default qdisc if the initialization fails resulting in leaks
of all kinds. The only caller in mainline which may trigger
this bug is sch_tbf.c in tbf_create_dflt_qdisc().

Note: qdisc_create_dflt() doesn't fulfill the official locking
      requirements of qdisc_destroy() but since the qdisc could
      never be seen by the outside world this doesn't matter
      and it can stay as-is until the locking of pkt_sched
      is cleaned up.

Signed-off-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
qdisc_create_dflt() is missing to destroy the newly allocated
default qdisc if the initialization fails resulting in leaks
of all kinds. The only caller in mainline which may trigger
this bug is sch_tbf.c in tbf_create_dflt_qdisc().

Note: qdisc_create_dflt() doesn't fulfill the official locking
      requirements of qdisc_destroy() but since the qdisc could
      never be seen by the outside world this doesn't matter
      and it can stay as-is until the locking of pkt_sched
      is cleaned up.

Signed-off-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[EMATCH]: Remove feature ifdefs in meta ematch.</title>
<updated>2005-07-25T02:44:23+00:00</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2005-07-25T02:44:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7686ee1ad976efeddf10583f013462c66408ae51'/>
<id>7686ee1ad976efeddf10583f013462c66408ae51</id>
<content type='text'>
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Acked-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Acked-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PKT_SCHED]: em_meta: Kill TCF_META_ID_{INDEV,SECURITY,TCVERDICT}</title>
<updated>2005-07-22T21:43:52+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2005-07-22T21:43:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=261688d01ec07d3a265b8ace6ec68310fbd96a96'/>
<id>261688d01ec07d3a265b8ace6ec68310fbd96a96</id>
<content type='text'>
More unusable TCF_META_* match types that need to get eliminated
before 2.6.13 goes out the door.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Thomas Graf &lt;tgraf@suug.ch&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
More unusable TCF_META_* match types that need to get eliminated
before 2.6.13 goes out the door.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Thomas Graf &lt;tgraf@suug.ch&gt;
</pre>
</div>
</content>
</entry>
</feed>
