linux-stable.git/net/ipv4/ip_output.c, branch linux-4.3.y

net: preserve IP control block during GSO segmentation

2016-01-31T19:25:51+00:00

[ Upstream commit 9207f9d45b0ad071baa128e846d7e7ed85016df3 ]

skb_gso_segment() uses the skb control block during segmentation.
This patch reserves 32 bytes of room for the previous control block,
which is copied into all resulting segments.

This patch fixes a kernel crash when fragmenting forwarded packets.
Fragmentation requires a valid IP CB in the skb in order to clear IP
options. The patch also removes the custom save/restore in the OVS code,
which is now redundant.

Signed-off-by: Konstantin Khlebnikov 
Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman
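
The layout described in the commit above can be illustrated with a small
userspace sketch. It is not the kernel code: toy_skb, toy_gso_scratch,
toy_segment and the 48-byte control block size are hypothetical names and
values chosen for illustration. Only the 32 bytes of reserved room comes
from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CB_SIZE      48 /* assumed size of the per-packet control block */
#define TOY_PREV_CB_ROOM 32 /* room reserved for the previous control block */

/* Hypothetical stand-in for an skb: only the control block and a length. */
struct toy_skb {
    unsigned char cb[TOY_CB_SIZE];
    size_t len;
};

/* Hypothetical segmentation scratch state, stored after the reserved room. */
struct toy_gso_scratch {
    int seg_index;
};

/*
 * Split `in` into segments of at most `mss` bytes. Returns the number of
 * segments written to `out`, or 0 if they do not fit into `max_segs`.
 */
static size_t toy_segment(const struct toy_skb *in, size_t mss,
                          struct toy_skb *out, size_t max_segs)
{
    size_t nsegs, i, remaining;

    assert(in != NULL && out != NULL && mss > 0);
    remaining = in->len;
    nsegs = (in->len + mss - 1) / mss;
    if (nsegs > max_segs)
        return 0;

    for (i = 0; i < nsegs; i++) {
        struct toy_gso_scratch scratch = { .seg_index = (int)i };

        memset(&out[i], 0, sizeof(out[i]));
        /* The previous control block is copied into every segment. */
        memcpy(out[i].cb, in->cb, TOY_PREV_CB_ROOM);
        /* Segmentation scratch data never touches the reserved room. */
        memcpy(out[i].cb + TOY_PREV_CB_ROOM, &scratch, sizeof(scratch));
        out[i].len = remaining < mss ? remaining : mss;
        remaining -= out[i].len;
    }
    return nsegs;
}
```

Because the scratch data sits behind the reserved room, a later stage such
as fragmentation can still read the earlier layer's state from the first
32 bytes of each segment without a separate save/restore step.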

udp: disallow UFO for sockets with SO_NO_CHECK option

2016-01-31T19:25:51+00:00

[ Upstream commit 40ba330227ad00b8c0cdf2f425736ff9549cc423 ]

Commit acf8dd0a9d0b ("udp: only allow UFO for packets from SOCK_DGRAM
sockets") disallows UFO for packets sent from raw sockets. We need to do
the same for SOCK_DGRAM sockets with the SO_NO_CHECK option, though
for a slightly different reason: while such a socket would override the
CHECKSUM_PARTIAL set by ip_ufo_append_data(), gso_size is still set and
the bad offloading flags warning is triggered in __skb_gso_segment().

In the IPv6 case, the SO_NO_CHECK option is ignored, but we need to
disallow UFO for packets sent by sockets with the UDP_NO_CHECK6_TX option.

Signed-off-by: Michal Kubecek 
Tested-by: Shannon Nelson 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman
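
The eligibility rule from the commit above could be sketched as a single
predicate. This is a simplified userspace illustration, not the kernel's
actual condition: toy_sock, toy_ufo_allowed and their fields are
hypothetical, and the real check involves further conditions that are
omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_sock_type { TOY_SOCK_DGRAM, TOY_SOCK_RAW };

/* Hypothetical socket state relevant to the UFO decision. */
struct toy_sock {
    enum toy_sock_type type;
    bool no_check_tx; /* SO_NO_CHECK (IPv4) or UDP_NO_CHECK6_TX (IPv6) set */
};

/*
 * UFO is only allowed for datagram sockets that still want a transmit
 * checksum, on a device that offers UFO, for a payload exceeding the MTU.
 */
static bool toy_ufo_allowed(const struct toy_sock *sk, bool dev_has_ufo,
                            bool exceeds_mtu)
{
    assert(sk != NULL);
    return exceeds_mtu && dev_has_ufo &&
           sk->type == TOY_SOCK_DGRAM && !sk->no_check_tx;
}
```

With this shape, a raw socket and a no-checksum datagram socket both fall
back to the non-offloaded path, so gso_size is never set for them.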

net: Use VRF index for oif in ip_send_unicast_reply

2015-08-14T05:43:21+00:00

If no output device is specified, use the VRF device when the input
device is enslaved to one. This is needed to ensure that TCP ACKs and
resets go out via the VRF device.

Signed-off-by: David Ahern 
Signed-off-by: David S. Miller
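
The selection rule in the commit above amounts to a simple fallback. The
sketch below is a userspace illustration under assumptions: toy_netdev,
its vrf_master_ifindex field and toy_reply_oif are hypothetical and do not
correspond to kernel structures or helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of the device a packet arrived on. */
struct toy_netdev {
    int ifindex;
    int vrf_master_ifindex; /* 0 when the device is not enslaved to a VRF */
};

/*
 * Pick the output interface index for a reply. An explicitly requested
 * device wins; otherwise fall back to the VRF device of the input device.
 * A return value of 0 means "unspecified, let routing decide".
 */
static int toy_reply_oif(int requested_oif, const struct toy_netdev *in_dev)
{
    assert(in_dev != NULL);
    if (requested_oif != 0)
        return requested_oif;
    return in_dev->vrf_master_ifindex;
}
```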

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next

2015-06-15T21:30:32+00:00

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

This is a rather large (and late) patchset that contains Netfilter updates
for net-next. Most relevantly: br_netfilter fixes, ipset RCU support, removal
of the x_tables percpu ruleset copy, and a rework of the nf_tables netdev
support. More specifically, they are:

1) Warn the user when there is a better protocol conntracker available, from
   Marcelo Ricardo Leitner.

2) Fix forwarding of IPv6 fragmented traffic in br_netfilter, from Bernhard
   Thaler. This comes with several patches to prepare the change in the
   first place.

3) Get rid of special MTU handling of PPPoE/VLAN frames for br_netfilter. This
   is not needed anymore since we now use the largest fragment size to
   refragment, from Florian Westphal.

4) Restore vlan tag when refragmenting in br_netfilter, also from Florian.

5) Get rid of the percpu ruleset copy in x_tables, from Florian, plus a
   follow-up patch from Eric Dumazet to refine it.

6) Several ipset cleanups, fixes and finally RCU support, from Jozsef Kadlecsik.

7) Get rid of parens in Netfilter Kconfig files.

8) Attach the net_device to the basechain as opposed to the initial per table
   approach in the nf_tables netdev family.

9) Subscribe to netdev events to detect the removal and registration of a
   device that is referenced by a basechain.
====================

Signed-off-by: David S. Miller

net: ipv4: un-inline ip_finish_output2

2015-06-12T21:19:17+00:00

        text    data     bss     dec     hex filename
old:   16527      44       0   16571    40bb net/ipv4/ip_output.o
new:   14935      44       0   14979    3a83 net/ipv4/ip_output.o

Suggested-by: Eric Dumazet 
Signed-off-by: Florian Westphal 
Signed-off-by: David S. Miller

net: ip_fragment: remove BRIDGE_NETFILTER mtu special handling

2015-06-12T12:16:46+00:00

Since commit d6b915e29f4adea9
("ip_fragment: don't forward defragmented DF packet"), the largest
fragment size is available in the IPCB.

Therefore we no longer need to care about the 'encapsulation'
overhead of stripped PPPoE/VLAN headers, since ip_do_fragment
doesn't use the device MTU in such cases.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso

ip_fragment: don't forward defragmented DF packet

2015-05-27T17:03:31+00:00

We currently always send fragments without the DF bit set.

Thus, given the following setup:

mtu1500 - mtu1500:1400 - mtu1400:1280 - mtu1280
   A           R1              R2         B

Where R1 and R2 run Linux with netfilter defragmentation/conntrack
enabled, if host A sends a fragmented packet _with_ DF set to B, R1
will respond with an ICMP too big error if one of these fragments exceeds
1400 bytes.

However, if R1 receives fragments of sizes 1200 and 100, it will
forward the reassembled packet without refragmenting, i.e.
R2 will send an ICMP error in response to a packet that was never sent,
citing an MTU that the original sender never exceeded.

The other minor issue is that a refragmentation on R1 will conceal the
MTU of R2-B, since refragmentation does not set the DF bit on the fragments.

This modifies ip_fragment so that we track the largest fragment size seen
both for DF and non-DF packets, and set frag_max_size to the largest
value.

If the DF fragment size is larger than or equal to the non-DF one, we
consider the packet a path MTU probe:
we set the DF bit on the reassembled skb and also tag it with a new IPCB
flag to force refragmentation even if the skb fits the outdev MTU.

We will also set the DF bit on each fragment in this case.

Joint work with Hannes Frederic Sowa.

Reported-by: Jesse Gross 
Signed-off-by: Florian Westphal 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller
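
The bookkeeping described in the commit above can be sketched in userspace
as follows. This is an illustration under assumptions, not the kernel
implementation: toy_frag, toy_reasm_info, toy_reassemble and
toy_must_refragment are hypothetical names, and treating "no DF fragment
seen" as "not a probe" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one received fragment. */
struct toy_frag {
    unsigned int size;
    bool df; /* fragment arrived with the DF bit set */
};

/* What the sketch records on the reassembled packet. */
struct toy_reasm_info {
    unsigned int frag_max_size; /* largest fragment seen, DF or not */
    bool pmtu_probe;            /* treat packet as a path MTU probe */
};

static struct toy_reasm_info toy_reassemble(const struct toy_frag *frags,
                                            size_t n)
{
    unsigned int max_df = 0, max_nondf = 0;
    struct toy_reasm_info info;
    size_t i;

    assert(frags != NULL || n == 0);
    /* Track the largest size separately for DF and non-DF fragments. */
    for (i = 0; i < n; i++) {
        if (frags[i].df) {
            if (frags[i].size > max_df)
                max_df = frags[i].size;
        } else if (frags[i].size > max_nondf) {
            max_nondf = frags[i].size;
        }
    }
    info.frag_max_size = max_df > max_nondf ? max_df : max_nondf;
    /* Probe if the largest DF fragment is at least as big as any other. */
    info.pmtu_probe = max_df > 0 && max_df >= max_nondf;
    return info;
}

/* A probe is refragmented even when the packet would fit the output MTU. */
static bool toy_must_refragment(const struct toy_reasm_info *info,
                                unsigned int pkt_len, unsigned int out_mtu)
{
    assert(info != NULL);
    return info->pmtu_probe || pkt_len > out_mtu;
}
```

For the 1200 and 100 byte DF fragments from the example, the sketch marks
the packet as a probe with frag_max_size 1200, so it would be refragmented
on output even though the reassembled packet fits a 1400 byte MTU.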

net: ipv4: avoid repeated calls to ip_skb_dst_mtu helper

2015-05-27T17:03:30+00:00

ip_skb_dst_mtu is a small inline helper, but it's called in several places.

           text    data     bss     dec     hex filename
before:   17061      44       0   17105    42d1 net/ipv4/ip_output.o
after:    16805      44       0   16849    41d1 net/ipv4/ip_output.o

Signed-off-by: Florian Westphal 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller

net: skbuff: add skb_append_pagefrags and use it

2015-05-25T04:06:58+00:00

Signed-off-by: Hannes Frederic Sowa 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller

bridge_netfilter: No ICMP packet on IPv4 fragmentation error

2015-05-19T04:15:39+00:00

When bridge netfilter re-fragments an IP packet for output, all
packets that cannot be re-fragmented to their original input size
should be silently discarded.

However, the current bridge netfilter output path generates an ICMP
packet with a 'size exceeded MTU' message for such packets. This is a bug.

This patch refactors the ip_fragment() API to allow two separate
use cases. The bridge netfilter use case will not
send ICMP; the routing output will, as before.

Signed-off-by: Andy Zhou 
Acked-by: Florian Westphal 
Signed-off-by: David S. Miller
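
The split described in the commit above could look roughly like the
following userspace sketch: a core routine that never reports errors, and
a routing wrapper that does. All names (toy_pkt, toy_do_fragment,
toy_route_fragment) are hypothetical, and the failure condition used here
(DF set and length above the MTU) is a simplifying assumption, not the
kernel's exact test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_result { TOY_FRAGMENTED, TOY_DROPPED };

/* Hypothetical packet; icmp_sent counts error reports sent to the sender. */
struct toy_pkt {
    unsigned int len;
    bool df;
    unsigned int icmp_sent;
};

static bool toy_cannot_fragment(const struct toy_pkt *pkt, unsigned int mtu)
{
    return pkt->df && pkt->len > mtu;
}

/* Core routine (bridge netfilter use case): drop silently, never send ICMP. */
static enum toy_result toy_do_fragment(struct toy_pkt *pkt, unsigned int mtu)
{
    assert(pkt != NULL);
    if (toy_cannot_fragment(pkt, mtu))
        return TOY_DROPPED;
    return TOY_FRAGMENTED;
}

/* Routing output use case: report the error first, then use the core. */
static enum toy_result toy_route_fragment(struct toy_pkt *pkt,
                                          unsigned int mtu)
{
    assert(pkt != NULL);
    if (toy_cannot_fragment(pkt, mtu))
        pkt->icmp_sent++;
    return toy_do_fragment(pkt, mtu);
}
```

Keeping the error reporting in a thin wrapper lets both callers share the
fragmentation logic while only the routing path notifies the sender.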