linux-stable.git/net/ipv6, branch v4.4.32

udp: fix IP_CHECKSUM handling

2016-11-15T06:46:39+00:00

[ Upstream commit 10df8e6152c6c400a563a673e9956320bfce1871 ]

First bug was added in commit ad6f939ab193 ("ip: Add offset parameter to
ip_cmsg_recv") : Tom missed that ipv4 udp messages could be received on
AF_INET6 socket. ip_cmsg_recv(msg, skb) should have been replaced by
ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr));

Then commit e6afc8ace6dd ("udp: remove headers from UDP packets before
queueing") forgot to adjust the offsets now UDP headers are pulled
before skb are put in receive queue.

Fixes: ad6f939ab193 ("ip: Add offset parameter to ip_cmsg_recv")
Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet 
Cc: Sam Kumar 
Cc: Willem de Bruijn 
Tested-by: Willem de Bruijn 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: add recursion limit to GRO

2016-11-15T06:46:38+00:00

[ Upstream commit fcd91dd449867c6bfe56a81cabba76b829fd05cd ]

Currently, GRO can do unlimited recursion through the gro_receive
handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
to one level with encap_mark, but both VLAN and TEB still have this
problem.  Thus, the kernel is vulnerable to a stack overflow, if we
receive a packet composed entirely of VLAN headers.

This patch adds a recursion counter to the GRO layer to prevent stack
overflow.  When a gro_receive function hits the recursion limit, GRO is
aborted for this skb and it is processed normally.  This recursion
counter is put in the GRO CB, but could be turned into a percpu counter
if we run out of space in the CB.

Thanks to Vladimír Beneš  for the initial bug report.

Fixes: CVE-2016-7039
Fixes: 9b174d88c257 ("net: Add Transparent Ethernet Bridging GRO support.")
Fixes: 66e5133f19e9 ("vlan: Add GRO support for non hardware accelerated vlan")
Signed-off-by: Sabrina Dubroca 
Reviewed-by: Jiri Benc 
Acked-by: Hannes Frederic Sowa 
Acked-by: Tom Herbert 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ipv6: correctly add local routes when lo goes up

2016-11-15T06:46:38+00:00

[ Upstream commit a220445f9f4382c36a53d8ef3e08165fa27f7e2c ]

The goal of the patch is to fix this scenario:
 ip link add dummy1 type dummy
 ip link set dummy1 up
 ip link set lo down ; ip link set lo up

After that sequence, the local route to the link layer address of dummy1 is
not there anymore.

When the loopback is set down, all local routes are deleted by
addrconf_ifdown()/rt6_ifdown(). At this time, the rt6_info entry still
exists, because the corresponding idev has a reference on it. After the rcu
grace period, dst_rcu_free() is called, and thus ___dst_free(), which will
set obsolete to DST_OBSOLETE_DEAD.

In this case, init_loopback() is called before dst_rcu_free(), thus
obsolete is still sets to something <= 0. So, the function doesn't add the
route again. To avoid that race, let's check the rt6 refcnt instead.

Fixes: 25fb6ca4ed9c ("net IPv6 : Fix broken IPv6 routing table after loopback down-up")
Fixes: a881ae1f625c ("ipv6: don't call addrconf_dst_alloc again when enable lo")
Fixes: 33d99113b110 ("ipv6: reallocate addrconf router for ipv6 address when lo device up")
Reported-by: Francesco Santoro 
Reported-by: Samuel Gauthier 
CC: Balakumaran Kannan 
CC: Maruthi Thotad 
CC: Sabrina Dubroca 
CC: Hannes Frederic Sowa 
CC: Weilong Chen 
CC: Gao feng 
Signed-off-by: Nicolas Dichtel 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ip6_tunnel: fix ip6_tnl_lookup

2016-11-15T06:46:38+00:00

[ Upstream commit 68d00f332e0ba7f60f212be74ede290c9f873bc5 ]

The commit ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel
endpoints.") introduces support for wildcards in tunnels endpoints,
but in some rare circumstances ip6_tnl_lookup selects wrong tunnel
interface relying only on source or destination address of the packet
and not checking presence of wildcard in tunnels endpoints. Later in
ip6_tnl_rcv this packets can be dicarded because of difference in
ipproto even if fallback device have proper ipproto configuration.

This patch adds checks of wildcard endpoint in tunnel avoiding such
behavior

Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
Signed-off-by: Vadim Fedorenko 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ipv6: tcp: restore IP6CB for pktoptions skbs

2016-11-15T06:46:37+00:00

[ Upstream commit 8ce48623f0cf3d632e32448411feddccb693d351 ]

Baozeng Ding reported following KASAN splat :

BUG: KASAN: use-after-free in ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 at addr ffff880029c84ec8
Read of size 1 by task poc/25548
Call Trace:
 [] dump_stack+0x12e/0x185 /lib/dump_stack.c:15
 [<     inline     >] print_address_description /mm/kasan/report.c:204
 [] kasan_report_error+0x48b/0x4b0 /mm/kasan/report.c:283
 [<     inline     >] kasan_report /mm/kasan/report.c:303
 [] __asan_report_load1_noabort+0x3e/0x40 /mm/kasan/report.c:321
 [] ip6_datagram_recv_specific_ctl+0x13f1/0x15c0 /net/ipv6/datagram.c:687
 [] ip6_datagram_recv_ctl+0x33/0x40
 [] do_ipv6_getsockopt.isra.4+0xaec/0x2150
 [] ipv6_getsockopt+0x116/0x230
 [] tcp_getsockopt+0x82/0xd0 /net/ipv4/tcp.c:3035
 [] sock_common_getsockopt+0x95/0xd0 /net/core/sock.c:2647
 [<     inline     >] SYSC_getsockopt /net/socket.c:1776
 [] SyS_getsockopt+0x142/0x230 /net/socket.c:1758
 [] entry_SYSCALL_64_fastpath+0x23/0xc6
Memory state around the buggy address:
 ffff880029c84d80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff880029c84e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> ffff880029c84e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                              ^
 ffff880029c84f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff880029c84f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

He also provided a syzkaller reproducer.

Issue is that ip6_datagram_recv_specific_ctl() expects to find IP6CB
data that was moved at a different place in tcp_v6_rcv()

This patch moves tcp_v6_restore_cb() up and calls it from
tcp_v6_do_rcv() when np->pktoptions is set.

Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Signed-off-by: Eric Dumazet 
Reported-by: Baozeng Ding 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route

2016-11-15T06:46:37+00:00

[ Upstream commit 2cf750704bb6d7ed8c7d732e071dd1bc890ea5e8 ]

Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid
instead of the previous dst_pid which was copied from in_skb's portid.
Since the skb is new the portid is 0 at that point so the packets are sent
to the kernel and we get scheduling while atomic or a deadlock (depending
on where it happens) by trying to acquire rtnl two times.
Also since this is RTM_GETROUTE, it can be triggered by a normal user.

Here's the sleeping while atomic trace:
[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[ 7858.212881] 2 locks held by swapper/0/0:
[ 7858.213013]  #0:  (((&mrt->ipmr_expire_timer))){+.-...}, at: [] call_timer_fn+0x5/0x350
[ 7858.213422]  #1:  (mfc_unres_lock){+.....}, at: [] ipmr_expire_process+0x25/0x130
[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 7858.214108]  0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
[ 7858.214412]  ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
[ 7858.214716]  000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
[ 7858.215251] Call Trace:
[ 7858.215412]    [] dump_stack+0x85/0xc1
[ 7858.215662]  [] ___might_sleep+0x192/0x250
[ 7858.215868]  [] __might_sleep+0x6f/0x100
[ 7858.216072]  [] mutex_lock_nested+0x33/0x4d0
[ 7858.216279]  [] ? netlink_lookup+0x25f/0x460
[ 7858.216487]  [] rtnetlink_rcv+0x1b/0x40
[ 7858.216687]  [] netlink_unicast+0x19c/0x260
[ 7858.216900]  [] rtnl_unicast+0x20/0x30
[ 7858.217128]  [] ipmr_destroy_unres+0xa9/0xf0
[ 7858.217351]  [] ipmr_expire_process+0x8f/0x130
[ 7858.217581]  [] ? ipmr_net_init+0x180/0x180
[ 7858.217785]  [] ? ipmr_net_init+0x180/0x180
[ 7858.217990]  [] call_timer_fn+0xa5/0x350
[ 7858.218192]  [] ? call_timer_fn+0x5/0x350
[ 7858.218415]  [] ? ipmr_net_init+0x180/0x180
[ 7858.218656]  [] run_timer_softirq+0x260/0x640
[ 7858.218865]  [] ? __do_softirq+0xbb/0x54f
[ 7858.219068]  [] __do_softirq+0xe8/0x54f
[ 7858.219269]  [] irq_exit+0xb8/0xc0
[ 7858.219463]  [] smp_apic_timer_interrupt+0x42/0x50
[ 7858.219678]  [] apic_timer_interrupt+0x8c/0xa0
[ 7858.219897]    [] ? native_safe_halt+0x6/0x10
[ 7858.220165]  [] ? trace_hardirqs_on+0xd/0x10
[ 7858.220373]  [] default_idle+0x23/0x190
[ 7858.220574]  [] arch_cpu_idle+0xf/0x20
[ 7858.220790]  [] default_idle_call+0x4c/0x60
[ 7858.221016]  [] cpu_startup_entry+0x39b/0x4d0
[ 7858.221257]  [] rest_init+0x135/0x140
[ 7858.221469]  [] start_kernel+0x50e/0x51b
[ 7858.221670]  [] ? early_idt_handler_array+0x120/0x120
[ 7858.221894]  [] x86_64_start_reservations+0x2a/0x2c
[ 7858.222113]  [] x86_64_start_kernel+0x13b/0x14a

Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
Signed-off-by: Nikolay Aleksandrov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()

2016-11-15T06:46:36+00:00

[ Upstream commit db32e4e49ce2b0e5fcc17803d011a401c0a637f6 ]

Similar to commit 3be07244b733 ("ip6_gre: fix flowi6_proto value in
xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup.

Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value.
This affected output route lookup for packets sent on an ip6gretap device
in cases where routing was dependent on the value of flowi6_proto.

Since the correct proto is already set in the tunnel flowi6 template via
commit 252f3f5a1189 ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit
path."), simply delete the line setting the incorrect flowi6_proto value.

Suggested-by: Jiri Benc 
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Reviewed-by: Shmulik Ladkani 
Signed-off-by: Lance Richardson 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

tunnels: Remove encapsulation offloads on decap.

2016-10-31T10:13:59+00:00

commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168 upstream.

If a packet is either locally encapsulated or processed through GRO
it is marked with the offloads that it requires. However, when it is
decapsulated these tunnel offload indications are not removed. This
means that if we receive an encapsulated TCP packet, aggregate it with
GRO, decapsulate, and retransmit the resulting frame on a NIC that does
not support encapsulation, we won't be able to take advantage of hardware
offloads even though it is just a simple TCP packet at this point.

This fixes the problem by stripping off encapsulation offload indications
when packets are decapsulated.

The performance impacts of this bug are significant. In a test where a
Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated,
and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a
result of avoiding unnecessary segmentation at the VM tap interface.

Reported-by: Ramu Ramamurthy 
Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE")
Signed-off-by: Jesse Gross 
Signed-off-by: David S. Miller 
(backported from commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168)
[adapt iptunnel_pull_header arguments, avoid 7f290c9]
Signed-off-by: Stefan Bader 
Signed-off-by: Juerg Haefliger 
Signed-off-by: Greg Kroah-Hartman

tunnels: Don't apply GRO to multiple layers of encapsulation.

2016-10-31T10:13:59+00:00

commit fac8e0f579695a3ecbc4d3cac369139d7f819971 upstream.

When drivers express support for TSO of encapsulated packets, they
only mean that they can do it for one layer of encapsulation.
Supporting additional levels would mean updating, at a minimum,
more IP length fields and they are unaware of this.

No encapsulation device expresses support for handling offloaded
encapsulated packets, so we won't generate these types of frames
in the transmit path. However, GRO doesn't have a check for
multiple levels of encapsulation and will attempt to build them.

UDP tunnel GRO actually does prevent this situation but it only
handles multiple UDP tunnels stacked on top of each other. This
generalizes that solution to prevent any kind of tunnel stacking
that would cause problems.

Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Jesse Gross 
Signed-off-by: David S. Miller 
Signed-off-by: Juerg Haefliger 
Signed-off-by: Greg Kroah-Hartman

tcp: properly scale window in tcp_v[46]_reqsk_send_ack()

2016-09-30T08:18:34+00:00

[ Upstream commit 20a2b49fc538540819a0c552877086548cff8d8d ]

When sending an ack in SYN_RECV state, we must scale the offered
window if wscale option was negotiated and accepted.

Tested:
 Following packetdrill test demonstrates the issue :

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0

+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

// Establish a connection.
+0 < S 0:0(0) win 20000 
+0 > S. 0:0(0) ack 1 win 28960 

+0 < . 1:11(10) ack 1 win 156 
// check that window is properly scaled !
+0 > . 1:1(0) ack 1 win 226 

Signed-off-by: Eric Dumazet 
Cc: Yuchung Cheng 
Cc: Neal Cardwell 
Acked-by: Yuchung Cheng 
Acked-by: Neal Cardwell 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
Tested-by: Holger Hoffstätte