linux-stable.git/net/ipv6, branch v3.18.48

Fix "tunnels: Don't apply GRO to multiple layers of encapsulation."

2017-02-08T08:43:05+00:00

Commit dd4fff23f0f4c7c5414f50c091c78a7e423f85da which is commit
fac8e0f579695a3ecbc4d3cac369139d7f819971 upstream, seems to have
included the sit_gro_receive function, yet it never is used, causing an
obvious warning message.  Hook it up to the correct sit_offload
structure.

Note, for 3.16, the backport of fac8e0f579695a3ecbc4d3cac369139d7f819971
does not include this function, nor the ipip case.  I'm guessing that
this is not correct for 3.18, as one of the functions was included, but
could be totally wrong.

Cc: Jesse Gross 
Cc: David S. Miller 
Cc: Ben Hutchings 
Cc: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

netfilter: Fix switch statement warnings with recent gcc.

2017-02-08T08:43:03+00:00

commit c1f866767777d1c6abae0ec57effffcb72017c00 upstream.

More recent GCC warns about two kinds of switch statement uses:

1) Switching on an enumeration, but not having an explicit case
   statement for all members of the enumeration.  To show the
   compiler this is intentional, we simply add a default case
   with nothing more than a break statement.

2) Switching on a boolean value.  I think this warning is dumb
   but nevertheless you get it wholesale with -Wswitch.

This patch cures all such warnings in netfilter.

Signed-off-by: David S. Miller 
Acked-by: Pablo Neira Ayuso 
Signed-off-by: Greg Kroah-Hartman

tunnels: Don't apply GRO to multiple layers of encapsulation.

2017-01-15T14:49:53+00:00

[ Upstream commit fac8e0f579695a3ecbc4d3cac369139d7f819971 ]

When drivers express support for TSO of encapsulated packets, they
only mean that they can do it for one layer of encapsulation.
Supporting additional levels would mean updating, at a minimum,
more IP length fields and they are unaware of this.

No encapsulation device expresses support for handling offloaded
encapsulated packets, so we won't generate these types of frames
in the transmit path. However, GRO doesn't have a check for
multiple levels of encapsulation and will attempt to build them.

UDP tunnel GRO actually does prevent this situation but it only
handles multiple UDP tunnels stacked on top of each other. This
generalizes that solution to prevent any kind of tunnel stacking
that would cause problems.

Fixes: bf5a755f ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Jesse Gross 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin

udp: properly support MSG_PEEK with truncated buffers

2016-11-19T15:15:10+00:00

[ Upstream commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191 ]

Backport of this upstream commit into stable kernels :
89c22d8c3b27 ("net: Fix skb csum races when peeking")
exposed a bug in udp stack vs MSG_PEEK support, when user provides
a buffer smaller than skb payload.

In this case,
skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr),
                                 msg->msg_iov);
returns -EFAULT.

This bug does not happen in upstream kernels since Al Viro did a great
job to replace this into :
skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
This variant is safe vs short buffers.

For the time being, instead reverting Herbert Xu patch and add back
skb->ip_summed invalid changes, simply store the result of
udp_lib_checksum_complete() so that we avoid computing the checksum a
second time, and avoid the problematic
skb_copy_and_csum_datagram_iovec() call.

This patch can be applied on recent kernels as it avoids a double
checksumming, then backported to stable kernels as a bug fix.

Signed-off-by: Eric Dumazet 
Acked-by: Herbert Xu 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin

ipv6: addrconf: fix dev refcont leak when DAD failed

2016-10-06T02:40:20+00:00

[ Upstream commit 751eb6b6042a596b0080967c1a529a9fe98dac1d ]

In general, when DAD detected IPv6 duplicate address, ifp->state
will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
delayed work, the call tree should be like this:

ndisc_recv_ns
  -> addrconf_dad_failure        <- missing ifp put
     -> addrconf_mod_dad_work
       -> schedule addrconf_dad_work()
         -> addrconf_dad_stop()  <- missing ifp hold before call it

addrconf_dad_failure() called with ifp refcont holding but not put.
addrconf_dad_work() call addrconf_dad_stop() without extra holding
refcount. This will not cause any issue normally.

But the race between addrconf_dad_failure() and addrconf_dad_work()
may cause ifp refcount leak and netdevice can not be unregister,
dmesg show the following messages:

IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
...
unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Cc: stable@vger.kernel.org
Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing
to workqueue")
Signed-off-by: Wei Yongjun 
Signed-off-by: David S. Miller 

Signed-off-by: Sasha Levin

netfilter: x_tables: speed up jump target validation

2016-08-03T15:34:46+00:00

[ Upstream commit f4dc77713f8016d2e8a3295e1c9c53a21f296def ]

The dummy ruleset I used to test the original validation change was broken,
most rules were unreachable and were not tested by mark_source_chains().

In some cases rulesets that used to load in a few seconds now require
several minutes.

sample ruleset that shows the behaviour:

echo "*filter"
for i in $(seq 0 100000);do
        printf ":chain_%06x - [0:0]\n" $i
done
for i in $(seq 0 100000);do
   printf -- "-A INPUT -j chain_%06x\n" $i
   printf -- "-A INPUT -j chain_%06x\n" $i
   printf -- "-A INPUT -j chain_%06x\n" $i
done
echo COMMIT

[ pipe result into iptables-restore ]

This ruleset will be about 74mbyte in size, with ~500k searches
though all 500k[1] rule entries. iptables-restore will take forever
(gave up after 10 minutes)

Instead of always searching the entire blob for a match, fill an
array with the start offsets of every single ipt_entry struct,
then do a binary search to check if the jump target is present or not.

After this change ruleset restore times get again close to what one
gets when reverting 36472341017529e (~3 seconds on my workstation).

[1] every user-defined rule gets an implicit RETURN, so we get
300k jumps + 100k userchains + 100k returns -> 500k rule entries

Fixes: 36472341017529e ("netfilter: x_tables: validate targets of jumps")
Reported-by: Jeff Wu 
Tested-by: Jeff Wu 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Florian Westphal 
Signed-off-by: Sasha Levin

netfilter: x_tables: introduce and use xt_copy_counters_from_user

2016-07-12T12:48:37+00:00

[ Upstream commit d7591f0c41ce3e67600a982bab6989ef0f07b3ce ]

The three variants use same copy&pasted code, condense this into a
helper and use that.

Make sure info.name is 0-terminated.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin

netfilter: x_tables: do compat validation via translate_table

2016-07-12T12:48:37+00:00

[ Upstream commit 09d9686047dbbe1cf4faa558d3ecc4aae2046054 ]

This looks like refactoring, but its also a bug fix.

Problem is that the compat path (32bit iptables, 64bit kernel) lacks a few
sanity tests that are done in the normal path.

For example, we do not check for underflows and the base chain policies.

While its possible to also add such checks to the compat path, its more
copy&pastry, for instance we cannot reuse check_underflow() helper as
e->target_offset differs in the compat case.

Other problem is that it makes auditing for validation errors harder; two
places need to be checked and kept in sync.

At a high level 32 bit compat works like this:
1- initial pass over blob:
   validate match/entry offsets, bounds checking
   lookup all matches and targets
   do bookkeeping wrt. size delta of 32/64bit structures
   assign match/target.u.kernel pointer (points at kernel
   implementation, needed to access ->compatsize etc.)

2- allocate memory according to the total bookkeeping size to
   contain the translated ruleset

3- second pass over original blob:
   for each entry, copy the 32bit representation to the newly allocated
   memory.  This also does any special match translations (e.g.
   adjust 32bit to 64bit longs, etc).

4- check if ruleset is free of loops (chase all jumps)

5-first pass over translated blob:
   call the checkentry function of all matches and targets.

The alternative implemented by this patch is to drop steps 3&4 from the
compat process, the translation is changed into an intermediate step
rather than a full 1:1 translate_table replacement.

In the 2nd pass (step #3), change the 64bit ruleset back to a kernel
representation, i.e. put() the kernel pointer and restore ->u.user.name .

This gets us a 64bit ruleset that is in the format generated by a 64bit
iptables userspace -- we can then use translate_table() to get the
'native' sanity checks.

This has two drawbacks:

1. we re-validate all the match and target entry structure sizes even
though compat translation is supposed to never generate bogus offsets.
2. we put and then re-lookup each match and target.

THe upside is that we get all sanity tests and ruleset validations
provided by the normal path and can remove some duplicated compat code.

iptables-restore time of autogenerated ruleset with 300k chains of form
-A CHAIN0001 -m limit --limit 1/s -j CHAIN0002
-A CHAIN0002 -m limit --limit 1/s -j CHAIN0003

shows no noticeable differences in restore times:
old:   0m30.796s
new:   0m31.521s
64bit: 0m25.674s

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin

netfilter: x_tables: xt_compat_match_from_user doesn't need a retval

2016-07-12T12:48:36+00:00

[ Upstream commit 0188346f21e6546498c2a0f84888797ad4063fc5 ]

Always returned 0.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin

netfilter: ip6_tables: simplify translate_compat_table args

2016-07-12T12:48:36+00:00

[ Upstream commit 329a0807124f12fe1c8032f95d8a8eb47047fb0e ]

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin