<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/sched, branch v2.6.35</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>net sched: fix race in mirred device removal</title>
<updated>2010-07-25T04:04:20+00:00</updated>
<author>
<name>stephen hemminger</name>
<email>shemminger@vyatta.com</email>
</author>
<published>2010-07-22T18:45:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3b87956ea645fb4de7e59c7d0aa94de04be72615'/>
<id>3b87956ea645fb4de7e59c7d0aa94de04be72615</id>
<content type='text'>
This fixes hang when target device of mirred packet classifier
action is removed.

If a mirror or redirection action is configured to cause packets
to go to another device, the classifier holds a ref count, but was assuming
the adminstrator cleaned up all redirections before removing. The fix
is to add a notifier and cleanup during unregister.

The new list is implicitly protected by RTNL mutex because
it is held during filter add/delete as well as notifier.

Signed-off-by: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
Acked-by: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This fixes hang when target device of mirred packet classifier
action is removed.

If a mirror or redirection action is configured to cause packets
to go to another device, the classifier holds a ref count, but was assuming
the adminstrator cleaned up all redirections before removing. The fix
is to add a notifier and cleanup during unregister.

The new list is implicitly protected by RTNL mutex because
it is held during filter add/delete as well as notifier.

Signed-off-by: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
Acked-by: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>act_nat: not all of the ICMP packets need an IP header payload</title>
<updated>2010-07-13T03:00:19+00:00</updated>
<author>
<name>Changli Gao</name>
<email>xiaosuo@gmail.com</email>
</author>
<published>2010-07-09T15:33:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=70c2efa5a32a7d38e66224844032160317fa7887'/>
<id>70c2efa5a32a7d38e66224844032160317fa7887</id>
<content type='text'>
not all of the ICMP packets need an IP header payload, so we check the length
of the skbs only when the packets should have an IP header payload.

Based upon analysis and initial patch by Rodrigo Partearroyo González.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
Acked-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
----
 net/sched/act_nat.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
not all of the ICMP packets need an IP header payload, so we check the length
of the skbs only when the packets should have an IP header payload.

Based upon analysis and initial patch by Rodrigo Partearroyo González.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
Acked-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
----
 net/sched/act_nat.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Clear IFF_XMIT_DST_RELEASE for teql interfaces</title>
<updated>2010-06-16T21:47:30+00:00</updated>
<author>
<name>Tom Hughes</name>
<email>tom@compton.nu</email>
</author>
<published>2010-06-15T22:24:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fa68a7822780fdc1295f7efb7e4313e62b447e75'/>
<id>fa68a7822780fdc1295f7efb7e4313e62b447e75</id>
<content type='text'>
https://bugzilla.kernel.org/show_bug.cgi?id=16183

The sch_teql module, which can be used to load balance over a set of
underlying interfaces, stopped working after 2.6.30 and has been
broken in all kernels since then for any underlying interface which
requires the addition of link level headers.

The problem is that the transmit routine relies on being able to
access the destination address in the skb in order to do address
resolution once it has decided which underlying interface it is going
to transmit through.

In 2.6.31 the IFF_XMIT_DST_RELEASE flag was introduced, and set by
default for all interfaces, which causes the destination address to be
released before the transmit routine for the interface is called.

The solution is to clear that flag for teql interfaces.

Signed-off-by: Tom Hughes &lt;tom@compton.nu&gt;
Acked-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
https://bugzilla.kernel.org/show_bug.cgi?id=16183

The sch_teql module, which can be used to load balance over a set of
underlying interfaces, stopped working after 2.6.30 and has been
broken in all kernels since then for any underlying interface which
requires the addition of link level headers.

The problem is that the transmit routine relies on being able to
access the destination address in the skb in order to do address
resolution once it has decided which underlying interface it is going
to transmit through.

In 2.6.31 the IFF_XMIT_DST_RELEASE flag was introduced, and set by
default for all interfaces, which causes the destination address to be
released before the transmit routine for the interface is called.

The solution is to clear that flag for teql interfaces.

Signed-off-by: Tom Hughes &lt;tom@compton.nu&gt;
Acked-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>act_pedit: access skb-&gt;data safely</title>
<updated>2010-06-03T10:28:27+00:00</updated>
<author>
<name>Changli Gao</name>
<email>xiaosuo@gmail.com</email>
</author>
<published>2010-06-02T04:55:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=db2c24175d149b55784f7cb2c303622ce962c1ae'/>
<id>db2c24175d149b55784f7cb2c303622ce962c1ae</id>
<content type='text'>
access skb-&gt;data safely

we should use skb_header_pointer() and skb_store_bits() to access skb-&gt;data to
handle small or non-linear skbs.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
----
 net/sched/act_pedit.c |   24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
access skb-&gt;data safely

we should use skb_header_pointer() and skb_store_bits() to access skb-&gt;data to
handle small or non-linear skbs.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
----
 net/sched/act_pedit.c |   24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cls_u32: use skb_header_pointer() to dereference data safely</title>
<updated>2010-06-02T14:32:42+00:00</updated>
<author>
<name>Changli Gao</name>
<email>xiaosuo@gmail.com</email>
</author>
<published>2010-06-02T14:32:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fbc2e7d9cf49e0bf89b9e91fd60a06851a855c5d'/>
<id>fbc2e7d9cf49e0bf89b9e91fd60a06851a855c5d</id>
<content type='text'>
use skb_header_pointer() to dereference data safely

the original skb-&gt;data dereference isn't safe, as there isn't any skb-&gt;len or
skb_is_nonlinear() check. skb_header_pointer() is used instead in this patch.
And when the skb isn't long enough, we terminate the function u32_classify()
immediately with -1.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
use skb_header_pointer() to dereference data safely

the original skb-&gt;data dereference isn't safe, as there isn't any skb-&gt;len or
skb_is_nonlinear() check. skb_header_pointer() is used instead in this patch.
And when the skb isn't long enough, we terminate the function u32_classify()
immediately with -1.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>act_nat: fix the wrong checksum when addr isn't in old_addr/mask</title>
<updated>2010-06-02T13:51:34+00:00</updated>
<author>
<name>Changli Gao</name>
<email>xiaosuo@gmail.com</email>
</author>
<published>2010-05-29T14:26:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=33c29dde7d04dc0ec0edb649d20ccf1351c13a06'/>
<id>33c29dde7d04dc0ec0edb649d20ccf1351c13a06</id>
<content type='text'>
fix the wrong checksum when addr isn't in old_addr/mask

For TCP and UDP packets, when addr isn't in old_addr/mask we don't do SNAT or
DNAT, and we should not update layer 4 checksum.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
----
 net/sched/act_nat.c |    4 ++++
 1 file changed, 4 insertions(+)
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fix the wrong checksum when addr isn't in old_addr/mask

For TCP and UDP packets, when addr isn't in old_addr/mask we don't do SNAT or
DNAT, and we should not update layer 4 checksum.

Signed-off-by: Changli Gao &lt;xiaosuo@gmail.com&gt;
----
 net/sched/act_nat.c |    4 ++++
 1 file changed, 4 insertions(+)
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cls_cgroup: Store classid in struct sock</title>
<updated>2010-05-24T07:12:34+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2010-05-24T07:12:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f845172531fb7410c7fb7780b1a6e51ee6df7d52'/>
<id>f845172531fb7410c7fb7780b1a6e51ee6df7d52</id>
<content type='text'>
Up until now cls_cgroup has relied on fetching the classid out of
the current executing thread.  This runs into trouble when a packet
processing is delayed in which case it may execute out of another
thread's context.

Furthermore, even when a packet is not delayed we may fail to
classify it if soft IRQs have been disabled, because this scenario
is indistinguishable from one where a packet unrelated to the
current thread is processed by a real soft IRQ.

In fact, the current semantics is inherently broken, as a single
skb may be constructed out of the writes of two different tasks.
A different manifestation of this problem is when the TCP stack
transmits in response of an incoming ACK.  This is currently
unclassified.

As we already have a concept of packet ownership for accounting
purposes in the skb-&gt;sk pointer, this is a natural place to store
the classid in a persistent manner.

This patch adds the cls_cgroup classid in struct sock, filling up
an existing hole on 64-bit :)

The value is set at socket creation time.  So all sockets created
via socket(2) automatically gains the ID of the thread creating it.
Whenever another process touches the socket by either reading or
writing to it, we will change the socket classid to that of the
process if it has a valid (non-zero) classid.

For sockets created on inbound connections through accept(2), we
inherit the classid of the original listening socket through
sk_clone, possibly preceding the actual accept(2) call.

In order to minimise risks, I have not made this the authoritative
classid.  For now it is only used as a backup when we execute
with soft IRQs disabled.  Once we're completely happy with its
semantics we can use it as the sole classid.

Footnote: I have rearranged the error path on cls_group module
creation.  If we didn't do this, then there is a window where
someone could create a tc rule using cls_group before the cgroup
subsystem has been registered.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Up until now cls_cgroup has relied on fetching the classid out of
the current executing thread.  This runs into trouble when a packet
processing is delayed in which case it may execute out of another
thread's context.

Furthermore, even when a packet is not delayed we may fail to
classify it if soft IRQs have been disabled, because this scenario
is indistinguishable from one where a packet unrelated to the
current thread is processed by a real soft IRQ.

In fact, the current semantics is inherently broken, as a single
skb may be constructed out of the writes of two different tasks.
A different manifestation of this problem is when the TCP stack
transmits in response of an incoming ACK.  This is currently
unclassified.

As we already have a concept of packet ownership for accounting
purposes in the skb-&gt;sk pointer, this is a natural place to store
the classid in a persistent manner.

This patch adds the cls_cgroup classid in struct sock, filling up
an existing hole on 64-bit :)

The value is set at socket creation time.  So all sockets created
via socket(2) automatically gains the ID of the thread creating it.
Whenever another process touches the socket by either reading or
writing to it, we will change the socket classid to that of the
process if it has a valid (non-zero) classid.

For sockets created on inbound connections through accept(2), we
inherit the classid of the original listening socket through
sk_clone, possibly preceding the actual accept(2) call.

In order to minimise risks, I have not made this the authoritative
classid.  For now it is only used as a backup when we execute
with soft IRQs disabled.  Once we're completely happy with its
semantics we can use it as the sole classid.

Footnote: I have rearranged the error path on cls_group module
creation.  If we didn't do this, then there is a window where
someone could create a tc rule using cls_group before the cgroup
subsystem has been registered.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net_sched: Fix qdisc_notify()</title>
<updated>2010-05-24T06:11:07+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-05-22T20:37:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=53b0f08042f04813cd1a7473dacd3edfacb28eb3'/>
<id>53b0f08042f04813cd1a7473dacd3edfacb28eb3</id>
<content type='text'>
Ben Pfaff reported a kernel oops and provided a test program to
reproduce it.

https://kerneltrap.org/mailarchive/linux-netdev/2010/5/21/6277805

tc_fill_qdisc() should not be called for builtin qdisc, or it
dereference a NULL pointer to get device ifindex.

Fix is to always use tc_qdisc_dump_ignore() before calling
tc_fill_qdisc().

Reported-by: Ben Pfaff &lt;blp@nicira.com&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Ben Pfaff reported a kernel oops and provided a test program to
reproduce it.

https://kerneltrap.org/mailarchive/linux-netdev/2010/5/21/6277805

tc_fill_qdisc() should not be called for builtin qdisc, or it
dereference a NULL pointer to get device ifindex.

Fix is to always use tc_qdisc_dump_ignore() before calling
tc_fill_qdisc().

Reported-by: Ben Pfaff &lt;blp@nicira.com&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Remove unnecessary returns from void function()s</title>
<updated>2010-05-18T06:23:14+00:00</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2010-05-18T06:08:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3fa21e07e6acefa31f974d57fba2b6920a7ebd1a'/>
<id>3fa21e07e6acefa31f974d57fba2b6920a7ebd1a</id>
<content type='text'>
This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
  xargs perl -i -e 'local $/ ; while (&lt;&gt;) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
  xargs perl -i -e 'local $/ ; while (&lt;&gt;) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net sched: cleanup and rate limit warning</title>
<updated>2010-05-18T06:23:13+00:00</updated>
<author>
<name>stephen hemminger</name>
<email>shemminger@vyatta.com</email>
</author>
<published>2010-05-11T14:24:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b60b6592baa69c43a5a0f55d6300a7feaab15338'/>
<id>b60b6592baa69c43a5a0f55d6300a7feaab15338</id>
<content type='text'>
If the user has a bad classification configuration, and gets a packet
that goes through too many steps. Chances are more packets will arrive,
and the message spew will overrun syslog because it is not rate limited.
And because it is not tagged with appropriate priority it can't not be screened.

Added the qdisc to the message to try and give some more context when
the message does arrive.

Signed-off-by: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
Acked-by: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the user has a bad classification configuration, and gets a packet
that goes through too many steps. Chances are more packets will arrive,
and the message spew will overrun syslog because it is not rate limited.
And because it is not tagged with appropriate priority it can't not be screened.

Added the qdisc to the message to try and give some more context when
the message does arrive.

Signed-off-by: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
Acked-by: Jamal Hadi Salim &lt;hadi@cyberus.ca&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
