<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ipv6/ip6_input.c, branch v3.10.78</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>net: Fix memory leak if TPROXY used with TCP early demux</title>
<updated>2014-02-06T19:08:17+00:00</updated>
<author>
<name>Holger Eitzenberger</name>
<email>holger@eitzenberger.org</email>
</author>
<published>2014-01-27T09:33:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=873c4941de470ec803d409150df3d9d4b705f578'/>
<id>873c4941de470ec803d409150df3d9d4b705f578</id>
<content type='text'>
[ Upstream commit a452ce345d63ddf92cd101e4196569f8718ad319 ]

I see a memory leak when using a transparent HTTP proxy using TPROXY
together with TCP early demux and Kernel v3.8.13.15 (Ubuntu stable):

unreferenced object 0xffff88008cba4a40 (size 1696):
  comm "softirq", pid 0, jiffies 4294944115 (age 8907.520s)
  hex dump (first 32 bytes):
    0a e0 20 6a 40 04 1b 37 92 be 32 e2 e8 b4 00 00  .. j@..7..2.....
    02 00 07 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff810b710a&gt;] kmem_cache_alloc+0xad/0xb9
    [&lt;ffffffff81270185&gt;] sk_prot_alloc+0x29/0xc5
    [&lt;ffffffff812702cf&gt;] sk_clone_lock+0x14/0x283
    [&lt;ffffffff812aaf3a&gt;] inet_csk_clone_lock+0xf/0x7b
    [&lt;ffffffff8129a893&gt;] netlink_broadcast+0x14/0x16
    [&lt;ffffffff812c1573&gt;] tcp_create_openreq_child+0x1b/0x4c3
    [&lt;ffffffff812c033e&gt;] tcp_v4_syn_recv_sock+0x38/0x25d
    [&lt;ffffffff812c13e4&gt;] tcp_check_req+0x25c/0x3d0
    [&lt;ffffffff812bf87a&gt;] tcp_v4_do_rcv+0x287/0x40e
    [&lt;ffffffff812a08a7&gt;] ip_route_input_noref+0x843/0xa55
    [&lt;ffffffff812bfeca&gt;] tcp_v4_rcv+0x4c9/0x725
    [&lt;ffffffff812a26f4&gt;] ip_local_deliver_finish+0xe9/0x154
    [&lt;ffffffff8127a927&gt;] __netif_receive_skb+0x4b2/0x514
    [&lt;ffffffff8127aa77&gt;] process_backlog+0xee/0x1c5
    [&lt;ffffffff8127c949&gt;] net_rx_action+0xa7/0x200
    [&lt;ffffffff81209d86&gt;] add_interrupt_randomness+0x39/0x157

But there are many more, resulting in the machine going OOM after some
days.

From looking at the TPROXY code, and with help from Florian, I see
that the memory leak is introduced in tcp_v4_early_demux():

  void tcp_v4_early_demux(struct sk_buff *skb)
  {
    /* ... */

    iph = ip_hdr(skb);
    th = tcp_hdr(skb);

    if (th-&gt;doff &lt; sizeof(struct tcphdr) / 4)
        return;

    sk = __inet_lookup_established(dev_net(skb-&gt;dev), &amp;tcp_hashinfo,
                       iph-&gt;saddr, th-&gt;source,
                       iph-&gt;daddr, ntohs(th-&gt;dest),
                       skb-&gt;skb_iif);
    if (sk) {
        skb-&gt;sk = sk;

where the socket is assigned unconditionally to skb-&gt;sk, also bumping
the refcnt on it.  This is problematic, because in our case the skb
has already a socket assigned in the TPROXY target.  This then results
in the leak I see.

The very same issue seems to be with IPv6, but haven't tested.

Reviewed-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Holger Eitzenberger &lt;holger@eitzenberger.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit a452ce345d63ddf92cd101e4196569f8718ad319 ]

I see a memory leak when using a transparent HTTP proxy using TPROXY
together with TCP early demux and Kernel v3.8.13.15 (Ubuntu stable):

unreferenced object 0xffff88008cba4a40 (size 1696):
  comm "softirq", pid 0, jiffies 4294944115 (age 8907.520s)
  hex dump (first 32 bytes):
    0a e0 20 6a 40 04 1b 37 92 be 32 e2 e8 b4 00 00  .. j@..7..2.....
    02 00 07 01 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff810b710a&gt;] kmem_cache_alloc+0xad/0xb9
    [&lt;ffffffff81270185&gt;] sk_prot_alloc+0x29/0xc5
    [&lt;ffffffff812702cf&gt;] sk_clone_lock+0x14/0x283
    [&lt;ffffffff812aaf3a&gt;] inet_csk_clone_lock+0xf/0x7b
    [&lt;ffffffff8129a893&gt;] netlink_broadcast+0x14/0x16
    [&lt;ffffffff812c1573&gt;] tcp_create_openreq_child+0x1b/0x4c3
    [&lt;ffffffff812c033e&gt;] tcp_v4_syn_recv_sock+0x38/0x25d
    [&lt;ffffffff812c13e4&gt;] tcp_check_req+0x25c/0x3d0
    [&lt;ffffffff812bf87a&gt;] tcp_v4_do_rcv+0x287/0x40e
    [&lt;ffffffff812a08a7&gt;] ip_route_input_noref+0x843/0xa55
    [&lt;ffffffff812bfeca&gt;] tcp_v4_rcv+0x4c9/0x725
    [&lt;ffffffff812a26f4&gt;] ip_local_deliver_finish+0xe9/0x154
    [&lt;ffffffff8127a927&gt;] __netif_receive_skb+0x4b2/0x514
    [&lt;ffffffff8127aa77&gt;] process_backlog+0xee/0x1c5
    [&lt;ffffffff8127c949&gt;] net_rx_action+0xa7/0x200
    [&lt;ffffffff81209d86&gt;] add_interrupt_randomness+0x39/0x157

But there are many more, resulting in the machine going OOM after some
days.

From looking at the TPROXY code, and with help from Florian, I see
that the memory leak is introduced in tcp_v4_early_demux():

  void tcp_v4_early_demux(struct sk_buff *skb)
  {
    /* ... */

    iph = ip_hdr(skb);
    th = tcp_hdr(skb);

    if (th-&gt;doff &lt; sizeof(struct tcphdr) / 4)
        return;

    sk = __inet_lookup_established(dev_net(skb-&gt;dev), &amp;tcp_hashinfo,
                       iph-&gt;saddr, th-&gt;source,
                       iph-&gt;daddr, ntohs(th-&gt;dest),
                       skb-&gt;skb_iif);
    if (sk) {
        skb-&gt;sk = sk;

where the socket is assigned unconditionally to skb-&gt;sk, also bumping
the refcnt on it.  This is problematic, because in our case the skb
has already a socket assigned in the TPROXY target.  This then results
in the leak I see.

The very same issue seems to be with IPv6, but haven't tested.

Reviewed-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Holger Eitzenberger &lt;holger@eitzenberger.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: don't accept node local multicast traffic from the wire</title>
<updated>2013-03-29T18:57:33+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2013-03-26T08:13:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1c4a154e5253687c51123956dfcee9e9dfa8542d'/>
<id>1c4a154e5253687c51123956dfcee9e9dfa8542d</id>
<content type='text'>
Erik Hugne's errata proposal (Errata ID: 3480) to RFC4291 has been
verified: http://www.rfc-editor.org/errata_search.php?eid=3480

We have to check for pkt_type and loopback flag because either the
packets are allowed to travel over the loopback interface (in which case
pkt_type is PACKET_HOST and IFF_LOOPBACK flag is set) or they travel
over a non-loopback interface back to us (in which case PACKET_TYPE is
PACKET_LOOPBACK and IFF_LOOPBACK flag is not set).

Cc: Erik Hugne &lt;erik.hugne@ericsson.com&gt;
Cc: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Erik Hugne's errata proposal (Errata ID: 3480) to RFC4291 has been
verified: http://www.rfc-editor.org/errata_search.php?eid=3480

We have to check for pkt_type and loopback flag because either the
packets are allowed to travel over the loopback interface (in which case
pkt_type is PACKET_HOST and IFF_LOOPBACK flag is set) or they travel
over a non-loopback interface back to us (in which case PACKET_TYPE is
PACKET_LOOPBACK and IFF_LOOPBACK flag is not set).

Cc: Erik Hugne &lt;erik.hugne@ericsson.com&gt;
Cc: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: stop multicast forwarding to process interface scoped addresses</title>
<updated>2013-03-08T17:28:20+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2013-03-08T02:07:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ddf64354af4a702ee0b85d0a285ba74c7278a460'/>
<id>ddf64354af4a702ee0b85d0a285ba74c7278a460</id>
<content type='text'>
v2:
a) used struct ipv6_addr_props

v3:
a) reverted changes for ipv6_addr_props

v4:
a) do not use __ipv6_addr_needs_scope_id

Cc: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
v2:
a) used struct ipv6_addr_props

v3:
a) reverted changes for ipv6_addr_props

v4:
a) do not use __ipv6_addr_needs_scope_id

Cc: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv[4|6]: correct dropwatch false positive in local_deliver_finish</title>
<updated>2013-03-01T20:56:29+00:00</updated>
<author>
<name>Neil Horman</name>
<email>nhorman@tuxdriver.com</email>
</author>
<published>2013-03-01T07:44:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d8c6f4b9b7848bca8babfc0ae43a50c8ab22fbb9'/>
<id>d8c6f4b9b7848bca8babfc0ae43a50c8ab22fbb9</id>
<content type='text'>
I had a report recently of a user trying to use dropwatch to localise some frame
loss, and they were getting false positives.  Turned out they were using a user
space SCTP stack that used raw sockets to grab frames.  When we don't have a
registered protocol for a given packet, we record it as a drop, even if a raw
socket receieves the frame.  We should only record the drop in the event a raw
socket doesnt exist to receive the frames

Tested by the reported successfully

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Reported-by: William Reich &lt;reich@ulticom.com&gt;
Tested-by: William Reich &lt;reich@ulticom.com&gt;
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: William Reich &lt;reich@ulticom.com&gt;
CC: eric.dumazet@gmail.com
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I had a report recently of a user trying to use dropwatch to localise some frame
loss, and they were getting false positives.  Turned out they were using a user
space SCTP stack that used raw sockets to grab frames.  When we don't have a
registered protocol for a given packet, we record it as a drop, even if a raw
socket receieves the frame.  We should only record the drop in the event a raw
socket doesnt exist to receive the frames

Tested by the reported successfully

Signed-off-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Reported-by: William Reich &lt;reich@ulticom.com&gt;
Tested-by: William Reich &lt;reich@ulticom.com&gt;
CC: "David S. Miller" &lt;davem@davemloft.net&gt;
CC: William Reich &lt;reich@ulticom.com&gt;
CC: eric.dumazet@gmail.com
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: don't accept multicast traffic with scope 0</title>
<updated>2013-02-11T19:00:54+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2013-02-10T05:35:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=20314092c1b41894d8c181bf9aa6f022be2416aa'/>
<id>20314092c1b41894d8c181bf9aa6f022be2416aa</id>
<content type='text'>
v2:
a) moved before multicast source address check
b) changed comment to netdev style

Cc: Erik Hugne &lt;erik.hugne@ericsson.com&gt;
Cc: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Acked-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
v2:
a) moved before multicast source address check
b) changed comment to netdev style

Cc: Erik Hugne &lt;erik.hugne@ericsson.com&gt;
Cc: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Acked-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Store Router Alert option in IP6CB directly.</title>
<updated>2013-01-14T01:17:14+00:00</updated>
<author>
<name>YOSHIFUJI Hideaki / 吉藤英明</name>
<email>yoshfuji@linux-ipv6.org</email>
</author>
<published>2013-01-13T05:02:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=dd3332bfcb2223458f553f341d3388cb84040e6a'/>
<id>dd3332bfcb2223458f553f341d3388cb84040e6a</id>
<content type='text'>
Router Alert option is very small and we can store the value
itself in the skb.

Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Router Alert option is very small and we can store the value
itself in the skb.

Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input().</title>
<updated>2013-01-14T01:17:14+00:00</updated>
<author>
<name>YOSHIFUJI Hideaki / 吉藤英明</name>
<email>yoshfuji@linux-ipv6.org</email>
</author>
<published>2013-01-13T05:02:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=daad151263cf334d57fcc0270e2483d4b4639650'/>
<id>daad151263cf334d57fcc0270e2483d4b4639650</id>
<content type='text'>
Move generalized version of ipv6_is_mld() to header,
and use it from ip6_mc_input().

Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move generalized version of ipv6_is_mld() to header,
and use it from ip6_mc_input().

Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: TCP early demux cleanup</title>
<updated>2012-07-30T21:53:21+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-07-29T21:06:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cca32e4bf999a34ac08d959f351f2b30bcd02460'/>
<id>cca32e4bf999a34ac08d959f351f2b30bcd02460</id>
<content type='text'>
early_demux() handlers should be called in RCU context, and as we
use skb_dst_set_noref(skb, dst), caller must not exit from RCU context
before dst use (skb_dst(skb)) or release (skb_drop(dst))

Therefore, rcu_read_lock()/rcu_read_unlock() pairs around
-&gt;early_demux() are confusing and not needed :

Protocol handlers are already in an RCU read lock section.
(__netif_receive_skb() does the rcu_read_lock() )

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
early_demux() handlers should be called in RCU context, and as we
use skb_dst_set_noref(skb, dst), caller must not exit from RCU context
before dst use (skb_dst(skb)) or release (skb_drop(dst))

Therefore, rcu_read_lock()/rcu_read_unlock() pairs around
-&gt;early_demux() are confusing and not needed :

Protocol handlers are already in an RCU read lock section.
(__netif_receive_skb() does the rcu_read_lock() )

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Early TCP socket demux</title>
<updated>2012-07-26T22:50:39+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-07-26T12:18:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c7109986db3c945f50ceed884a30e0fd8af3b89b'/>
<id>c7109986db3c945f50ceed884a30e0fd8af3b89b</id>
<content type='text'>
This is the IPv6 missing bits for infrastructure added in commit
41063e9dd1195 (ipv4: Early TCP socket demux.)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the IPv6 missing bits for infrastructure added in commit
41063e9dd1195 (ipv4: Early TCP socket demux.)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>inet: Sanitize inet{,6} protocol demux.</title>
<updated>2012-06-20T01:56:21+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-06-20T01:56:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f9242b6b28d61295f2bf7e8adfb1060b382e5381'/>
<id>f9242b6b28d61295f2bf7e8adfb1060b382e5381</id>
<content type='text'>
Don't pretend that inet_protos[] and inet6_protos[] are hashes, thay
are just a straight arrays.  Remove all unnecessary hash masking.

Document MAX_INET_PROTOS.

Use RAW_HTABLE_SIZE when appropriate.

Reported-by: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Don't pretend that inet_protos[] and inet6_protos[] are hashes, thay
are just a straight arrays.  Remove all unnecessary hash masking.

Document MAX_INET_PROTOS.

Use RAW_HTABLE_SIZE when appropriate.

Reported-by: Ben Hutchings &lt;bhutchings@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
