linux-stable.git/net/ipv4, branch v3.2.100

net: igmp: fix source address check for IGMPv3 reports

2018-03-03T15:51:02+00:00

commit ad23b750933ea7bf962678972a286c78a8fa36aa upstream.

Commit "net: igmp: Use correct source address on IGMPv3 reports"
introduced a check to validate the source address of locally generated
IGMPv3 packets.
Instead of checking the local interface address directly, it uses
inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the
local subnet (or equal to the point-to-point address if used).

This breaks for point-to-point interfaces, so check against
ifa->ifa_local directly.

Cc: Kevin Cernekee 
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Reported-by: Sebastian Gottschall 
Signed-off-by: Felix Fietkau 
Signed-off-by: David S. Miller 
Signed-off-by: Ben Hutchings

net: igmp: Use correct source address on IGMPv3 reports

2018-03-03T15:51:02+00:00

commit a46182b00290839fa3fa159d54fd3237bd8669f0 upstream.

Closing a multicast socket after the final IPv4 address is deleted
from an interface can generate a membership report that uses the
source IP from a different interface.  The following test script, run
from an isolated netns, reproduces the issue:

    #!/bin/bash

    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link set dummy0 up
    ip link set dummy1 up
    ip addr add 10.1.1.1/24 dev dummy0
    ip addr add 192.168.99.99/24 dev dummy1

    tcpdump -U -i dummy0 &
    socat EXEC:"sleep 2" \
        UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 &

    sleep 1
    ip addr del 10.1.1.1/24 dev dummy0
    sleep 5
    kill %tcpdump

RFC 3376 specifies that the report must be sent with a valid IP source
address from the destination subnet, or from address 0.0.0.0.  Add an
extra check to make sure this is the case.

Signed-off-by: Kevin Cernekee 
Reviewed-by: Andrew Lunn 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()

2018-03-03T15:50:57+00:00

commit 20b50d79974ea3192e8c3ab7faf4e536e5f14d8f upstream.

Commit 8f659a03a0ba ("net: ipv4: fix for a race condition in
raw_sendmsg") fixed the issue of possibly inconsistent ->hdrincl handling
due to concurrent updates by reading this bit-field member into a local
variable and using the thus stabilized value in subsequent tests.

However, aforementioned commit also adds the (correct) comment that

  /* hdrincl should be READ_ONCE(inet->hdrincl)
   * but READ_ONCE() doesn't work with bit fields
   */

because as it stands, the compiler is free to shortcut or even eliminate
the local variable at its will.

Note that I have not seen anything like this happening in reality and thus,
the concern is a theoretical one.

However, in order to be on the safe side, emulate a READ_ONCE() on the
bit-field by doing it on the local 'hdrincl' variable itself:

	int hdrincl = inet->hdrincl;
	hdrincl = READ_ONCE(hdrincl);

This breaks the chain in the sense that the compiler is not allowed
to replace subsequent reads from hdrincl with reloads from inet->hdrincl.

Fixes: 8f659a03a0ba ("net: ipv4: fix for a race condition in raw_sendmsg")
Signed-off-by: Nicolai Stange 
Reviewed-by: Stefano Brivio 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2: use ACCESS_ONCE()]
Signed-off-by: Ben Hutchings

xfrm: Return error on unknown encap_type in init_state

2018-03-03T15:50:56+00:00

commit bcfd09f7837f5240c30fd2f52ee7293516641faa upstream.

Currently esp will happily create an xfrm state with an unknown
encap type for IPv4, without setting the necessary state parameters.
This patch fixes it by returning -EINVAL.

There is a similar problem in IPv6 where if the mode is unknown
we will skip initialisation while returning zero.  However, this
is harmless as the mode has already been checked further up the
stack.  This patch removes this anomaly by aligning the IPv6
behaviour with IPv4 and treating unknown modes (which cannot
actually happen) as transport mode.

Fixes: 38320c70d282 ("[IPSEC]: Use crypto_aead and authenc in ESP")
Signed-off-by: Herbert Xu 
Signed-off-by: Steffen Klassert 
Signed-off-by: Ben Hutchings

xfrm: Reinject transport-mode packets through tasklet

2018-03-03T15:50:49+00:00

commit acf568ee859f098279eadf551612f103afdacb4e upstream.

This is an old bugbear of mine:

https://www.mail-archive.com/netdev@vger.kernel.org/msg03894.html

By crafting special packets, it is possible to cause recursion
in our kernel when processing transport-mode packets at levels
that are only limited by packet size.

The easiest one is with DNAT, but an even worse one is where
UDP encapsulation is used in which case you just have to insert
an UDP encapsulation header in between each level of recursion.

This patch avoids this problem by reinjecting tranport-mode packets
through a tasklet.

Fixes: b05e106698d9 ("[IPV4/6]: Netfilter IPsec input hooks")
Signed-off-by: Herbert Xu 
Signed-off-by: Steffen Klassert 
[bwh: Backported to 3.2:
 - netfilter finish callbacks only receive an sk_buff pointer
 - Adjust context]
Signed-off-by: Ben Hutchings

tcp md5sig: Use skb's saddr when replying to an incoming segment

2018-03-03T15:50:46+00:00

commit 30791ac41927ebd3e75486f9504b6d2280463bf0 upstream.

The MD5-key that belongs to a connection is identified by the peer's
IP-address. When we are in tcp_v4(6)_reqsk_send_ack(), we are replying
to an incoming segment from tcp_check_req() that failed the seq-number
checks.

Thus, to find the correct key, we need to use the skb's saddr and not
the daddr.

This bug seems to have been there since quite a while, but probably got
unnoticed because the consequences are not catastrophic. We will call
tcp_v4_reqsk_send_ack only to send a challenge-ACK back to the peer,
thus the connection doesn't really fail.

Fixes: 9501f9722922 ("tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().")
Signed-off-by: Christoph Paasch 
Reviewed-by: Eric Dumazet 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

net: ipv4: fix for a race condition in raw_sendmsg

2018-03-03T15:50:46+00:00

commit 8f659a03a0ba9289b9aeb9b4470e6fb263d6f483 upstream.

inet->hdrincl is racy, and could lead to uninitialized stack pointer
usage, so its value should be read only once.

Fixes: c008ba5bdc9f ("ipv4: Avoid reading user iov twice after raw_probe_proto_opt")
Signed-off-by: Mohamed Ghannam 
Reviewed-by: Eric Dumazet 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2:
 - flowi4 flags don't depend on hdrincl
 - Adjust context]
Signed-off-by: Ben Hutchings

ipv4: Avoid reading user iov twice after raw_probe_proto_opt

2018-03-03T15:50:46+00:00

commit c008ba5bdc9fa830e1a349b20b0be5a137bdef7a upstream.

Ever since raw_probe_proto_opt was added it had the problem of
causing the user iov to be read twice, once during the probe for
the protocol header and once again in ip_append_data.

This is a potential security problem since it means that whatever
we're probing may be invalid.  This patch plugs the hole by
firstly advancing the iov so we don't read the same spot again,
and secondly saving what we read the first time around for use
by ip_append_data.

Signed-off-by: Herbert Xu 
Signed-off-by: David S. Miller 
Signed-off-by: Ben Hutchings

ipv4: Use standard iovec primitive in raw_probe_proto_opt

2018-03-03T15:50:45+00:00

commit 32b5913a931fd753faf3d4e1124b2bc2edb364da upstream.

The function raw_probe_proto_opt tries to extract the first two
bytes from the user input in order to seed the IPsec lookup for
ICMP packets.  In doing so it's processing iovec by hand and
overcomplicating things.

This patch replaces the manual iovec processing with a call to
memcpy_fromiovecend.

Signed-off-by: Herbert Xu 
Signed-off-by: David S. Miller 
Signed-off-by: Ben Hutchings

tcp: fix tcp_mtu_probe() vs highest_sack

2018-01-01T20:51:00+00:00

commit 2b7cda9c35d3b940eb9ce74b30bbd5eb30db493d upstream.

Based on SNMP values provided by Roman, Yuchung made the observation
that some crashes in tcp_sacktag_walk() might be caused by MTU probing.

Looking at tcp_mtu_probe(), I found that when a new skb was placed
in front of the write queue, we were not updating tcp highest sack.

If one skb is freed because all its content was copied to the new skb
(for MTU probing), then tp->highest_sack could point to a now freed skb.

Bad things would then happen, including infinite loops.

This patch renames tcp_highest_sack_combine() and uses it
from tcp_mtu_probe() to fix the bug.

Note that I also removed one test against tp->sacked_out,
since we want to replace tp->highest_sack regardless of whatever
condition, since keeping a stale pointer to freed skb is a recipe
for disaster.

Fixes: a47e5a988a57 ("[TCP]: Convert highest_sack to sk_buff to allow direct access")
Signed-off-by: Eric Dumazet 
Reported-by: Alexei Starovoitov 
Reported-by: Roman Gushchin 
Reported-by: Oleksandr Natalenko 
Acked-by: Alexei Starovoitov 
Acked-by: Neal Cardwell 
Acked-by: Yuchung Cheng 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings