<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/net/ipv6, branch v4.4.32</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>udp: fix IP_CHECKSUM handling</title>
<updated>2016-11-15T06:46:39+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2016-10-24T01:03:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d46c76765da696502837d823227d4c32c28d8c05'/>
<id>d46c76765da696502837d823227d4c32c28d8c05</id>
<content type='text'>
[ Upstream commit 10df8e6152c6c400a563a673e9956320bfce1871 ]

The first bug was added in commit ad6f939ab193 ("ip: Add offset parameter to
ip_cmsg_recv"): Tom missed that ipv4 udp messages could be received on an
AF_INET6 socket. ip_cmsg_recv(msg, skb) should have been replaced by
ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr));

Then commit e6afc8ace6dd ("udp: remove headers from UDP packets before
queueing") forgot to adjust the offsets now that UDP headers are pulled
before skbs are put in the receive queue.

Fixes: ad6f939ab193 ("ip: Add offset parameter to ip_cmsg_recv")
Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Sam Kumar &lt;samanthakumar@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Tested-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 10df8e6152c6c400a563a673e9956320bfce1871 ]

The first bug was added in commit ad6f939ab193 ("ip: Add offset parameter to
ip_cmsg_recv"): Tom missed that ipv4 udp messages could be received on an
AF_INET6 socket. ip_cmsg_recv(msg, skb) should have been replaced by
ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr));

Then commit e6afc8ace6dd ("udp: remove headers from UDP packets before
queueing") forgot to adjust the offsets now that UDP headers are pulled
before skbs are put in the receive queue.

Fixes: ad6f939ab193 ("ip: Add offset parameter to ip_cmsg_recv")
Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Sam Kumar &lt;samanthakumar@google.com&gt;
Cc: Willem de Bruijn &lt;willemb@google.com&gt;
Tested-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: add recursion limit to GRO</title>
<updated>2016-11-15T06:46:38+00:00</updated>
<author>
<name>Sabrina Dubroca</name>
<email>sd@queasysnail.net</email>
</author>
<published>2016-10-20T13:58:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3cb00b90e8b1bd59382f5e1304dd751f9674f027'/>
<id>3cb00b90e8b1bd59382f5e1304dd751f9674f027</id>
<content type='text'>
[ Upstream commit fcd91dd449867c6bfe56a81cabba76b829fd05cd ]

Currently, GRO can do unlimited recursion through the gro_receive
handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
to one level with encap_mark, but both VLAN and TEB still have this
problem.  Thus, the kernel is vulnerable to a stack overflow, if we
receive a packet composed entirely of VLAN headers.

This patch adds a recursion counter to the GRO layer to prevent stack
overflow.  When a gro_receive function hits the recursion limit, GRO is
aborted for this skb and it is processed normally.  This recursion
counter is put in the GRO CB, but could be turned into a percpu counter
if we run out of space in the CB.

Thanks to Vladimír Beneš &lt;vbenes@redhat.com&gt; for the initial bug report.

Fixes: CVE-2016-7039
Fixes: 9b174d88c257 ("net: Add Transparent Ethernet Bridging GRO support.")
Fixes: 66e5133f19e9 ("vlan: Add GRO support for non hardware accelerated vlan")
Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Reviewed-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Acked-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: Tom Herbert &lt;tom@herbertland.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit fcd91dd449867c6bfe56a81cabba76b829fd05cd ]

Currently, GRO can do unlimited recursion through the gro_receive
handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
to one level with encap_mark, but both VLAN and TEB still have this
problem.  Thus, the kernel is vulnerable to a stack overflow, if we
receive a packet composed entirely of VLAN headers.

This patch adds a recursion counter to the GRO layer to prevent stack
overflow.  When a gro_receive function hits the recursion limit, GRO is
aborted for this skb and it is processed normally.  This recursion
counter is put in the GRO CB, but could be turned into a percpu counter
if we run out of space in the CB.

Thanks to Vladimír Beneš &lt;vbenes@redhat.com&gt; for the initial bug report.

Fixes: CVE-2016-7039
Fixes: 9b174d88c257 ("net: Add Transparent Ethernet Bridging GRO support.")
Fixes: 66e5133f19e9 ("vlan: Add GRO support for non hardware accelerated vlan")
Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Reviewed-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Acked-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: Tom Herbert &lt;tom@herbertland.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: correctly add local routes when lo goes up</title>
<updated>2016-11-15T06:46:38+00:00</updated>
<author>
<name>Nicolas Dichtel</name>
<email>nicolas.dichtel@6wind.com</email>
</author>
<published>2016-10-12T08:10:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e635b4766174381572b95f8fae153e7f1f36cf65'/>
<id>e635b4766174381572b95f8fae153e7f1f36cf65</id>
<content type='text'>
[ Upstream commit a220445f9f4382c36a53d8ef3e08165fa27f7e2c ]

The goal of the patch is to fix this scenario:
 ip link add dummy1 type dummy
 ip link set dummy1 up
 ip link set lo down ; ip link set lo up

After that sequence, the local route to the link layer address of dummy1 is
not there anymore.

When the loopback is set down, all local routes are deleted by
addrconf_ifdown()/rt6_ifdown(). At this time, the rt6_info entry still
exists, because the corresponding idev has a reference on it. After the rcu
grace period, dst_rcu_free() is called, and thus ___dst_free(), which will
set obsolete to DST_OBSOLETE_DEAD.

In this case, init_loopback() is called before dst_rcu_free(), thus
obsolete is still set to something &lt;= 0. So, the function doesn't add the
route again. To avoid that race, let's check the rt6 refcnt instead.

Fixes: 25fb6ca4ed9c ("net IPv6 : Fix broken IPv6 routing table after loopback down-up")
Fixes: a881ae1f625c ("ipv6: don't call addrconf_dst_alloc again when enable lo")
Fixes: 33d99113b110 ("ipv6: reallocate addrconf router for ipv6 address when lo device up")
Reported-by: Francesco Santoro &lt;francesco.santoro@6wind.com&gt;
Reported-by: Samuel Gauthier &lt;samuel.gauthier@6wind.com&gt;
CC: Balakumaran Kannan &lt;Balakumaran.Kannan@ap.sony.com&gt;
CC: Maruthi Thotad &lt;Maruthi.Thotad@ap.sony.com&gt;
CC: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
CC: Weilong Chen &lt;chenweilong@huawei.com&gt;
CC: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit a220445f9f4382c36a53d8ef3e08165fa27f7e2c ]

The goal of the patch is to fix this scenario:
 ip link add dummy1 type dummy
 ip link set dummy1 up
 ip link set lo down ; ip link set lo up

After that sequence, the local route to the link layer address of dummy1 is
not there anymore.

When the loopback is set down, all local routes are deleted by
addrconf_ifdown()/rt6_ifdown(). At this time, the rt6_info entry still
exists, because the corresponding idev has a reference on it. After the rcu
grace period, dst_rcu_free() is called, and thus ___dst_free(), which will
set obsolete to DST_OBSOLETE_DEAD.

In this case, init_loopback() is called before dst_rcu_free(), thus
obsolete is still set to something &lt;= 0. So, the function doesn't add the
route again. To avoid that race, let's check the rt6 refcnt instead.

Fixes: 25fb6ca4ed9c ("net IPv6 : Fix broken IPv6 routing table after loopback down-up")
Fixes: a881ae1f625c ("ipv6: don't call addrconf_dst_alloc again when enable lo")
Fixes: 33d99113b110 ("ipv6: reallocate addrconf router for ipv6 address when lo device up")
Reported-by: Francesco Santoro &lt;francesco.santoro@6wind.com&gt;
Reported-by: Samuel Gauthier &lt;samuel.gauthier@6wind.com&gt;
CC: Balakumaran Kannan &lt;Balakumaran.Kannan@ap.sony.com&gt;
CC: Maruthi Thotad &lt;Maruthi.Thotad@ap.sony.com&gt;
CC: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
CC: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
CC: Weilong Chen &lt;chenweilong@huawei.com&gt;
CC: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ip6_tunnel: fix ip6_tnl_lookup</title>
<updated>2016-11-15T06:46:38+00:00</updated>
<author>
<name>Vadim Fedorenko</name>
<email>junk@yandex-team.ru</email>
</author>
<published>2016-10-11T19:47:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f9d4850af3c89934620f3f0363167da9cbb3f167'/>
<id>f9d4850af3c89934620f3f0363167da9cbb3f167</id>
<content type='text'>
[ Upstream commit 68d00f332e0ba7f60f212be74ede290c9f873bc5 ]

Commit ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel
endpoints.") introduced support for wildcards in tunnel endpoints,
but in some rare circumstances ip6_tnl_lookup selects the wrong tunnel
interface, relying only on the source or destination address of the packet
and not checking for the presence of a wildcard in the tunnel endpoints. Later,
in ip6_tnl_rcv, these packets can be discarded because of a difference in
ipproto, even if the fallback device has the proper ipproto configuration.

This patch adds checks for a wildcard endpoint in the tunnel, avoiding such
behavior.

Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
Signed-off-by: Vadim Fedorenko &lt;junk@yandex-team.ru&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 68d00f332e0ba7f60f212be74ede290c9f873bc5 ]

Commit ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel
endpoints.") introduced support for wildcards in tunnel endpoints,
but in some rare circumstances ip6_tnl_lookup selects the wrong tunnel
interface, relying only on the source or destination address of the packet
and not checking for the presence of a wildcard in the tunnel endpoints. Later,
in ip6_tnl_rcv, these packets can be discarded because of a difference in
ipproto, even if the fallback device has the proper ipproto configuration.

This patch adds checks for a wildcard endpoint in the tunnel, avoiding such
behavior.

Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
Signed-off-by: Vadim Fedorenko &lt;junk@yandex-team.ru&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: tcp: restore IP6CB for pktoptions skbs</title>
<updated>2016-11-15T06:46:37+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2016-10-12T17:01:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=705b5aca17c3a30ff93c53eb51368c8fcc9b49b8'/>
<id>705b5aca17c3a30ff93c53eb51368c8fcc9b49b8</id>
<content type='text'>
[ Upstream commit 8ce48623f0cf3d632e32448411feddccb693d351 ]

Baozeng Ding reported the following KASAN splat:

BUG: KASAN: use-after-free in ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 at addr ffff880029c84ec8
Read of size 1 by task poc/25548
Call Trace:
 [&lt;ffffffff82cf43c9&gt;] dump_stack+0x12e/0x185 /lib/dump_stack.c:15
 [&lt;     inline     &gt;] print_address_description /mm/kasan/report.c:204
 [&lt;ffffffff817ced3b&gt;] kasan_report_error+0x48b/0x4b0 /mm/kasan/report.c:283
 [&lt;     inline     &gt;] kasan_report /mm/kasan/report.c:303
 [&lt;ffffffff817ced9e&gt;] __asan_report_load1_noabort+0x3e/0x40 /mm/kasan/report.c:321
 [&lt;ffffffff85c71da1&gt;] ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 /net/ipv6/datagram.c:687
 [&lt;ffffffff85c734c3&gt;] ip6_datagram_recv_ctl+0x33/0x40
 [&lt;ffffffff85c0b07c&gt;] do_ipv6_getsockopt.isra.4+0xaec/0x2150
 [&lt;ffffffff85c0c7f6&gt;] ipv6_getsockopt+0x116/0x230
 [&lt;ffffffff859b5a12&gt;] tcp_getsockopt+0x82/0xd0 /net/ipv4/tcp.c:3035
 [&lt;ffffffff855fb385&gt;] sock_common_getsockopt+0x95/0xd0 /net/core/sock.c:2647
 [&lt;     inline     &gt;] SYSC_getsockopt /net/socket.c:1776
 [&lt;ffffffff855f8ba2&gt;] SyS_getsockopt+0x142/0x230 /net/socket.c:1758
 [&lt;ffffffff8685cdc5&gt;] entry_SYSCALL_64_fastpath+0x23/0xc6
Memory state around the buggy address:
 ffff880029c84d80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff880029c84e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
&gt; ffff880029c84e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                              ^
 ffff880029c84f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff880029c84f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

He also provided a syzkaller reproducer.

The issue is that ip6_datagram_recv_specific_ctl() expects to find IP6CB
data that was moved to a different place in tcp_v6_rcv().

This patch moves tcp_v6_restore_cb() up and calls it from
tcp_v6_do_rcv() when np-&gt;pktoptions is set.

Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Baozeng Ding &lt;sploving1@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 8ce48623f0cf3d632e32448411feddccb693d351 ]

Baozeng Ding reported the following KASAN splat:

BUG: KASAN: use-after-free in ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 at addr ffff880029c84ec8
Read of size 1 by task poc/25548
Call Trace:
 [&lt;ffffffff82cf43c9&gt;] dump_stack+0x12e/0x185 /lib/dump_stack.c:15
 [&lt;     inline     &gt;] print_address_description /mm/kasan/report.c:204
 [&lt;ffffffff817ced3b&gt;] kasan_report_error+0x48b/0x4b0 /mm/kasan/report.c:283
 [&lt;     inline     &gt;] kasan_report /mm/kasan/report.c:303
 [&lt;ffffffff817ced9e&gt;] __asan_report_load1_noabort+0x3e/0x40 /mm/kasan/report.c:321
 [&lt;ffffffff85c71da1&gt;] ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 /net/ipv6/datagram.c:687
 [&lt;ffffffff85c734c3&gt;] ip6_datagram_recv_ctl+0x33/0x40
 [&lt;ffffffff85c0b07c&gt;] do_ipv6_getsockopt.isra.4+0xaec/0x2150
 [&lt;ffffffff85c0c7f6&gt;] ipv6_getsockopt+0x116/0x230
 [&lt;ffffffff859b5a12&gt;] tcp_getsockopt+0x82/0xd0 /net/ipv4/tcp.c:3035
 [&lt;ffffffff855fb385&gt;] sock_common_getsockopt+0x95/0xd0 /net/core/sock.c:2647
 [&lt;     inline     &gt;] SYSC_getsockopt /net/socket.c:1776
 [&lt;ffffffff855f8ba2&gt;] SyS_getsockopt+0x142/0x230 /net/socket.c:1758
 [&lt;ffffffff8685cdc5&gt;] entry_SYSCALL_64_fastpath+0x23/0xc6
Memory state around the buggy address:
 ffff880029c84d80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff880029c84e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
&gt; ffff880029c84e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                              ^
 ffff880029c84f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff880029c84f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

He also provided a syzkaller reproducer.

The issue is that ip6_datagram_recv_specific_ctl() expects to find IP6CB
data that was moved to a different place in tcp_v6_rcv().

This patch moves tcp_v6_restore_cb() up and calls it from
tcp_v6_do_rcv() when np-&gt;pktoptions is set.

Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Baozeng Ding &lt;sploving1@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route</title>
<updated>2016-11-15T06:46:37+00:00</updated>
<author>
<name>Nikolay Aleksandrov</name>
<email>nikolay@cumulusnetworks.com</email>
</author>
<published>2016-09-25T21:08:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6eb0061fa630ae97c733a4dcbe3e23333ebe8626'/>
<id>6eb0061fa630ae97c733a4dcbe3e23333ebe8626</id>
<content type='text'>
[ Upstream commit 2cf750704bb6d7ed8c7d732e071dd1bc890ea5e8 ]

Since the commit below, the ipmr/ip6mr rtnl_unicast() code uses the portid
instead of the previous dst_pid, which was copied from in_skb's portid.
Since the skb is new, the portid is 0 at that point, so the packets are sent
to the kernel and we get scheduling while atomic or a deadlock (depending
on where it happens) by trying to acquire rtnl twice.
Also, since this is RTM_GETROUTE, it can be triggered by a normal user.

Here's the sleeping while atomic trace:
[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[ 7858.212881] 2 locks held by swapper/0/0:
[ 7858.213013]  #0:  (((&amp;mrt-&gt;ipmr_expire_timer))){+.-...}, at: [&lt;ffffffff810fbbf5&gt;] call_timer_fn+0x5/0x350
[ 7858.213422]  #1:  (mfc_unres_lock){+.....}, at: [&lt;ffffffff8161e005&gt;] ipmr_expire_process+0x25/0x130
[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 7858.214108]  0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
[ 7858.214412]  ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
[ 7858.214716]  000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
[ 7858.215251] Call Trace:
[ 7858.215412]  &lt;IRQ&gt;  [&lt;ffffffff813a7804&gt;] dump_stack+0x85/0xc1
[ 7858.215662]  [&lt;ffffffff810a4a72&gt;] ___might_sleep+0x192/0x250
[ 7858.215868]  [&lt;ffffffff810a4b9f&gt;] __might_sleep+0x6f/0x100
[ 7858.216072]  [&lt;ffffffff8165bea3&gt;] mutex_lock_nested+0x33/0x4d0
[ 7858.216279]  [&lt;ffffffff815a7a5f&gt;] ? netlink_lookup+0x25f/0x460
[ 7858.216487]  [&lt;ffffffff8157474b&gt;] rtnetlink_rcv+0x1b/0x40
[ 7858.216687]  [&lt;ffffffff815a9a0c&gt;] netlink_unicast+0x19c/0x260
[ 7858.216900]  [&lt;ffffffff81573c70&gt;] rtnl_unicast+0x20/0x30
[ 7858.217128]  [&lt;ffffffff8161cd39&gt;] ipmr_destroy_unres+0xa9/0xf0
[ 7858.217351]  [&lt;ffffffff8161e06f&gt;] ipmr_expire_process+0x8f/0x130
[ 7858.217581]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.217785]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.217990]  [&lt;ffffffff810fbc95&gt;] call_timer_fn+0xa5/0x350
[ 7858.218192]  [&lt;ffffffff810fbbf5&gt;] ? call_timer_fn+0x5/0x350
[ 7858.218415]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.218656]  [&lt;ffffffff810fde10&gt;] run_timer_softirq+0x260/0x640
[ 7858.218865]  [&lt;ffffffff8166379b&gt;] ? __do_softirq+0xbb/0x54f
[ 7858.219068]  [&lt;ffffffff816637c8&gt;] __do_softirq+0xe8/0x54f
[ 7858.219269]  [&lt;ffffffff8107a948&gt;] irq_exit+0xb8/0xc0
[ 7858.219463]  [&lt;ffffffff81663452&gt;] smp_apic_timer_interrupt+0x42/0x50
[ 7858.219678]  [&lt;ffffffff816625bc&gt;] apic_timer_interrupt+0x8c/0xa0
[ 7858.219897]  &lt;EOI&gt;  [&lt;ffffffff81055f16&gt;] ? native_safe_halt+0x6/0x10
[ 7858.220165]  [&lt;ffffffff810d64dd&gt;] ? trace_hardirqs_on+0xd/0x10
[ 7858.220373]  [&lt;ffffffff810298e3&gt;] default_idle+0x23/0x190
[ 7858.220574]  [&lt;ffffffff8102a20f&gt;] arch_cpu_idle+0xf/0x20
[ 7858.220790]  [&lt;ffffffff810c9f8c&gt;] default_idle_call+0x4c/0x60
[ 7858.221016]  [&lt;ffffffff810ca33b&gt;] cpu_startup_entry+0x39b/0x4d0
[ 7858.221257]  [&lt;ffffffff8164f995&gt;] rest_init+0x135/0x140
[ 7858.221469]  [&lt;ffffffff81f83014&gt;] start_kernel+0x50e/0x51b
[ 7858.221670]  [&lt;ffffffff81f82120&gt;] ? early_idt_handler_array+0x120/0x120
[ 7858.221894]  [&lt;ffffffff81f8243f&gt;] x86_64_start_reservations+0x2a/0x2c
[ 7858.222113]  [&lt;ffffffff81f8257c&gt;] x86_64_start_kernel+0x13b/0x14a

Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2cf750704bb6d7ed8c7d732e071dd1bc890ea5e8 ]

Since the commit below, the ipmr/ip6mr rtnl_unicast() code uses the portid
instead of the previous dst_pid, which was copied from in_skb's portid.
Since the skb is new, the portid is 0 at that point, so the packets are sent
to the kernel and we get scheduling while atomic or a deadlock (depending
on where it happens) by trying to acquire rtnl twice.
Also, since this is RTM_GETROUTE, it can be triggered by a normal user.

Here's the sleeping while atomic trace:
[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[ 7858.212881] 2 locks held by swapper/0/0:
[ 7858.213013]  #0:  (((&amp;mrt-&gt;ipmr_expire_timer))){+.-...}, at: [&lt;ffffffff810fbbf5&gt;] call_timer_fn+0x5/0x350
[ 7858.213422]  #1:  (mfc_unres_lock){+.....}, at: [&lt;ffffffff8161e005&gt;] ipmr_expire_process+0x25/0x130
[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 7858.214108]  0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
[ 7858.214412]  ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
[ 7858.214716]  000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
[ 7858.215251] Call Trace:
[ 7858.215412]  &lt;IRQ&gt;  [&lt;ffffffff813a7804&gt;] dump_stack+0x85/0xc1
[ 7858.215662]  [&lt;ffffffff810a4a72&gt;] ___might_sleep+0x192/0x250
[ 7858.215868]  [&lt;ffffffff810a4b9f&gt;] __might_sleep+0x6f/0x100
[ 7858.216072]  [&lt;ffffffff8165bea3&gt;] mutex_lock_nested+0x33/0x4d0
[ 7858.216279]  [&lt;ffffffff815a7a5f&gt;] ? netlink_lookup+0x25f/0x460
[ 7858.216487]  [&lt;ffffffff8157474b&gt;] rtnetlink_rcv+0x1b/0x40
[ 7858.216687]  [&lt;ffffffff815a9a0c&gt;] netlink_unicast+0x19c/0x260
[ 7858.216900]  [&lt;ffffffff81573c70&gt;] rtnl_unicast+0x20/0x30
[ 7858.217128]  [&lt;ffffffff8161cd39&gt;] ipmr_destroy_unres+0xa9/0xf0
[ 7858.217351]  [&lt;ffffffff8161e06f&gt;] ipmr_expire_process+0x8f/0x130
[ 7858.217581]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.217785]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.217990]  [&lt;ffffffff810fbc95&gt;] call_timer_fn+0xa5/0x350
[ 7858.218192]  [&lt;ffffffff810fbbf5&gt;] ? call_timer_fn+0x5/0x350
[ 7858.218415]  [&lt;ffffffff8161dfe0&gt;] ? ipmr_net_init+0x180/0x180
[ 7858.218656]  [&lt;ffffffff810fde10&gt;] run_timer_softirq+0x260/0x640
[ 7858.218865]  [&lt;ffffffff8166379b&gt;] ? __do_softirq+0xbb/0x54f
[ 7858.219068]  [&lt;ffffffff816637c8&gt;] __do_softirq+0xe8/0x54f
[ 7858.219269]  [&lt;ffffffff8107a948&gt;] irq_exit+0xb8/0xc0
[ 7858.219463]  [&lt;ffffffff81663452&gt;] smp_apic_timer_interrupt+0x42/0x50
[ 7858.219678]  [&lt;ffffffff816625bc&gt;] apic_timer_interrupt+0x8c/0xa0
[ 7858.219897]  &lt;EOI&gt;  [&lt;ffffffff81055f16&gt;] ? native_safe_halt+0x6/0x10
[ 7858.220165]  [&lt;ffffffff810d64dd&gt;] ? trace_hardirqs_on+0xd/0x10
[ 7858.220373]  [&lt;ffffffff810298e3&gt;] default_idle+0x23/0x190
[ 7858.220574]  [&lt;ffffffff8102a20f&gt;] arch_cpu_idle+0xf/0x20
[ 7858.220790]  [&lt;ffffffff810c9f8c&gt;] default_idle_call+0x4c/0x60
[ 7858.221016]  [&lt;ffffffff810ca33b&gt;] cpu_startup_entry+0x39b/0x4d0
[ 7858.221257]  [&lt;ffffffff8164f995&gt;] rest_init+0x135/0x140
[ 7858.221469]  [&lt;ffffffff81f83014&gt;] start_kernel+0x50e/0x51b
[ 7858.221670]  [&lt;ffffffff81f82120&gt;] ? early_idt_handler_array+0x120/0x120
[ 7858.221894]  [&lt;ffffffff81f8243f&gt;] x86_64_start_reservations+0x2a/0x2c
[ 7858.222113]  [&lt;ffffffff81f8257c&gt;] x86_64_start_kernel+0x13b/0x14a

Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()</title>
<updated>2016-11-15T06:46:36+00:00</updated>
<author>
<name>Lance Richardson</name>
<email>lrichard@redhat.com</email>
</author>
<published>2016-09-23T19:50:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4f312a802994e2bb7439262fdc43a0b8bd535697'/>
<id>4f312a802994e2bb7439262fdc43a0b8bd535697</id>
<content type='text'>
[ Upstream commit db32e4e49ce2b0e5fcc17803d011a401c0a637f6 ]

Similar to commit 3be07244b733 ("ip6_gre: fix flowi6_proto value in
xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup.

Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value.
This affected output route lookup for packets sent on an ip6gretap device
in cases where routing was dependent on the value of flowi6_proto.

Since the correct proto is already set in the tunnel flowi6 template via
commit 252f3f5a1189 ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit
path."), simply delete the line setting the incorrect flowi6_proto value.

Suggested-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Reviewed-by: Shmulik Ladkani &lt;shmulik.ladkani@gmail.com&gt;
Signed-off-by: Lance Richardson &lt;lrichard@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit db32e4e49ce2b0e5fcc17803d011a401c0a637f6 ]

Similar to commit 3be07244b733 ("ip6_gre: fix flowi6_proto value in
xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup.

Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value.
This affected output route lookup for packets sent on an ip6gretap device
in cases where routing was dependent on the value of flowi6_proto.

Since the correct proto is already set in the tunnel flowi6 template via
commit 252f3f5a1189 ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit
path."), simply delete the line setting the incorrect flowi6_proto value.

Suggested-by: Jiri Benc &lt;jbenc@redhat.com&gt;
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Reviewed-by: Shmulik Ladkani &lt;shmulik.ladkani@gmail.com&gt;
Signed-off-by: Lance Richardson &lt;lrichard@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tunnels: Remove encapsulation offloads on decap.</title>
<updated>2016-10-31T10:13:59+00:00</updated>
<author>
<name>Jesse Gross</name>
<email>jesse@kernel.org</email>
</author>
<published>2016-03-19T16:32:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9f9818f8c1cf44055634297247620be4755e7af2'/>
<id>9f9818f8c1cf44055634297247620be4755e7af2</id>
<content type='text'>
commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168 upstream.

If a packet is either locally encapsulated or processed through GRO
it is marked with the offloads that it requires. However, when it is
decapsulated these tunnel offload indications are not removed. This
means that if we receive an encapsulated TCP packet, aggregate it with
GRO, decapsulate, and retransmit the resulting frame on a NIC that does
not support encapsulation, we won't be able to take advantage of hardware
offloads even though it is just a simple TCP packet at this point.

This fixes the problem by stripping off encapsulation offload indications
when packets are decapsulated.

The performance impacts of this bug are significant. In a test where a
Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
and bridged to a VM, performance is improved by 60% (5Gbps-&gt;8Gbps) as a
result of avoiding unnecessary segmentation at the VM tap interface.

Reported-by: Ramu Ramamurthy &lt;sramamur@linux.vnet.ibm.com&gt;
Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
Signed-off-by: Jesse Gross &lt;jesse@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
(backported from commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168)
[adapt iptunnel_pull_header arguments, avoid 7f290c9]
Signed-off-by: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Signed-off-by: Juerg Haefliger &lt;juerg.haefliger@hpe.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168 upstream.

If a packet is either locally encapsulated or processed through GRO
it is marked with the offloads that it requires. However, when it is
decapsulated these tunnel offload indications are not removed. This
means that if we receive an encapsulated TCP packet, aggregate it with
GRO, decapsulate, and retransmit the resulting frame on a NIC that does
not support encapsulation, we won't be able to take advantage of hardware
offloads even though it is just a simple TCP packet at this point.

This fixes the problem by stripping off encapsulation offload indications
when packets are decapsulated.

The performance impacts of this bug are significant. In a test where a
Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
and bridged to a VM, performance is improved by 60% (5Gbps-&gt;8Gbps) as a
result of avoiding unnecessary segmentation at the VM tap interface.

Reported-by: Ramu Ramamurthy &lt;sramamur@linux.vnet.ibm.com&gt;
Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
Signed-off-by: Jesse Gross &lt;jesse@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
(backported from commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168)
[adapt iptunnel_pull_header arguments, avoid 7f290c9]
Signed-off-by: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Signed-off-by: Juerg Haefliger &lt;juerg.haefliger@hpe.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tunnels: Don't apply GRO to multiple layers of encapsulation.</title>
<updated>2016-10-31T10:13:59+00:00</updated>
<author>
<name>Jesse Gross</name>
<email>jesse@kernel.org</email>
</author>
<published>2016-03-19T16:32:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5699b3431e0b14736867484b8669ead2d40f575e'/>
<id>5699b3431e0b14736867484b8669ead2d40f575e</id>
<content type='text'>
commit fac8e0f579695a3ecbc4d3cac369139d7f819971 upstream.

When drivers express support for TSO of encapsulated packets, they
only mean that they can do it for one layer of encapsulation.
Supporting additional levels would mean updating, at a minimum,
more IP length fields and they are unaware of this.

No encapsulation device expresses support for handling offloaded
encapsulated packets, so we won't generate these types of frames
in the transmit path. However, GRO doesn't have a check for
multiple levels of encapsulation and will attempt to build them.

UDP tunnel GRO actually does prevent this situation but it only
handles multiple UDP tunnels stacked on top of each other. This
generalizes that solution to prevent any kind of tunnel stacking
that would cause problems.

Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Jesse Gross &lt;jesse@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Juerg Haefliger &lt;juerg.haefliger@hpe.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit fac8e0f579695a3ecbc4d3cac369139d7f819971 upstream.

When drivers express support for TSO of encapsulated packets, they
only mean that they can do it for one layer of encapsulation.
Supporting additional levels would mean updating, at a minimum,
more IP length fields and they are unaware of this.

No encapsulation device expresses support for handling offloaded
encapsulated packets, so we won't generate these types of frames
in the transmit path. However, GRO doesn't have a check for
multiple levels of encapsulation and will attempt to build them.

UDP tunnel GRO actually does prevent this situation but it only
handles multiple UDP tunnels stacked on top of each other. This
generalizes that solution to prevent any kind of tunnel stacking
that would cause problems.

Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Jesse Gross &lt;jesse@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Juerg Haefliger &lt;juerg.haefliger@hpe.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: properly scale window in tcp_v[46]_reqsk_send_ack()</title>
<updated>2016-09-30T08:18:34+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2016-08-22T18:31:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c867a11289ee74c4594a96ccccac16f4cc29519e'/>
<id>c867a11289ee74c4594a96ccccac16f4cc29519e</id>
<content type='text'>
[ Upstream commit 20a2b49fc538540819a0c552877086548cff8d8d ]

When sending an ack in SYN_RECV state, we must scale the offered
window if wscale option was negotiated and accepted.

Tested:
 Following packetdrill test demonstrates the issue :

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

// Establish a connection.
+0 &lt; S 0:0(0) win 20000 &lt;mss 1000,sackOK,wscale 7, nop, TS val 100 ecr 0&gt;
+0 &gt; S. 0:0(0) ack 1 win 28960 &lt;mss 1460,sackOK, TS val 100 ecr 100, nop, wscale 7&gt;

+0 &lt; . 1:11(10) ack 1 win 156 &lt;nop,nop,TS val 99 ecr 100&gt;
// check that window is properly scaled !
+0 &gt; . 1:1(0) ack 1 win 226 &lt;nop,nop,TS val 200 ecr 100&gt;
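
A minimal sketch of the arithmetic the last packet above checks, written
in Python for illustration only. The helper names are hypothetical and are
not kernel functions. It assumes the receive window is still the 28960
bytes offered in the SYN-ACK and that wscale 7 was accepted by both sides.

```python
# Illustrative only: these helper names are not kernel functions.
MAX_WINDOW_FIELD = 0xFFFF  # the TCP header window field is 16 bits wide


def advertised_window(rcv_wnd, rcv_wscale):
    """Return the value to place in the TCP header window field.

    rcv_wnd is the receive window in bytes; rcv_wscale is the shift count
    negotiated through the window scale option (0 when none was accepted).
    """
    # Dividing by 2**rcv_wscale is the same as shifting right by rcv_wscale.
    return min(rcv_wnd // (2 ** rcv_wscale), MAX_WINDOW_FIELD)


def effective_window(field, snd_wscale):
    """Return the byte count a peer infers from a received window field."""
    return field * (2 ** snd_wscale)


scaled = advertised_window(28960, 7)
print(scaled)                       # 226
print(effective_window(scaled, 7))  # 28928
```

Under these assumptions the scaled field is 226, the value expected in the
final ack. Sending the unscaled 28960 instead would be read by the peer as
a window 128 times larger than the one actually offered.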

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Acked-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Tested-by: Holger Hoffstätte &lt;holger@applied-asynchrony.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 20a2b49fc538540819a0c552877086548cff8d8d ]

When sending an ack in SYN_RECV state, we must scale the offered
window if wscale option was negotiated and accepted.

Tested:
 Following packetdrill test demonstrates the issue :

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

// Establish a connection.
+0 &lt; S 0:0(0) win 20000 &lt;mss 1000,sackOK,wscale 7, nop, TS val 100 ecr 0&gt;
+0 &gt; S. 0:0(0) ack 1 win 28960 &lt;mss 1460,sackOK, TS val 100 ecr 100, nop, wscale 7&gt;

+0 &lt; . 1:11(10) ack 1 win 156 &lt;nop,nop,TS val 99 ecr 100&gt;
// check that window is properly scaled !
+0 &gt; . 1:1(0) ack 1 win 226 &lt;nop,nop,TS val 200 ecr 100&gt;

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Acked-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Tested-by: Holger Hoffstätte &lt;holger@applied-asynchrony.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
