linux-stable.git/net/core, branch linux-4.3.y

net: bpf: reject invalid shifts

2016-01-31T19:25:51+00:00

[ Upstream commit 229394e8e62a4191d592842cf67e80c62a492937 ]

On ARM64, a BUG() is triggered in the eBPF JIT if a filter with a
constant shift that can't be encoded in the immediate field of the
UBFM/SBFM instructions is passed to the JIT.  Since these shifts
amounts, which are negative or >= regsize, are invalid, reject them in
the eBPF verifier and the classic BPF filter checker, for all
architectures.

Signed-off-by: Rabin Vincent 
Acked-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: preserve IP control block during GSO segmentation

2016-01-31T19:25:51+00:00

[ Upstream commit 9207f9d45b0ad071baa128e846d7e7ed85016df3 ]

Skb_gso_segment() uses skb control block during segmentation.
This patch adds 32-bytes room for previous control block which
will be copied into all resulting segments.

This patch fixes kernel crash during fragmenting forwarded packets.
Fragmentation requires valid IP CB in skb for clearing ip options.
Also patch removes custom save/restore in ovs code, now it's redundant.

Signed-off-by: Konstantin Khlebnikov 
Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: pktgen: fix null ptr deref in skb allocation

2016-01-31T19:25:51+00:00

[ Upstream commit 3de03596dfeee48bc803c1d1a6daf60a459929f3 ]

Fix possible null pointer dereference that may occur when calling
skb_reserve() on a null skb.

Fixes: 879c7220e82 ("net: pktgen: Observe needed_headroom of the device")
Signed-off-by: John Fastabend 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: possible use after free in dst_release

2016-01-31T19:25:50+00:00

[ Upstream commit 07a5d38453599052aff0877b16bb9c1585f08609 ]

dst_release should not access dst->flags after decrementing
__refcnt to 0. The dst_entry may be in dst_busy_list and
dst_gc_task may dst_destroy it before dst_release gets a chance
to access dst->flags.

Fixes: d69bbf88c8d0 ("net: fix a race in dst_release()")
Fixes: 27b75c95f10d ("net: avoid RCU for NOCACHE dst")
Signed-off-by: Francesco Ruggeri 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: check both type and procotol for tcp sockets

2016-01-23T04:55:48+00:00

[ Upstream commit ac5cc977991d2dce85fc734a6c71ddb33f6fe3c1 ]

Dmitry reported the following out-of-bound access:

Call Trace:
 [] __asan_report_load4_noabort+0x3e/0x40
mm/kasan/report.c:294
 [] sock_setsockopt+0x1284/0x13d0 net/core/sock.c:880
 [<     inline     >] SYSC_setsockopt net/socket.c:1746
 [] SyS_setsockopt+0x1fe/0x240 net/socket.c:1729
 [] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

This is because we mistake a raw socket as a tcp socket.
We should check both sk->sk_type and sk->sk_protocol to ensure
it is a tcp socket.

Willem points out __skb_complete_tx_timestamp() needs to fix as well.

Reported-by: Dmitry Vyukov 
Cc: Willem de Bruijn 
Cc: Eric Dumazet 
Signed-off-by: Cong Wang 
Acked-by: Willem de Bruijn 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

skbuff: Fix offset error in skb_reorder_vlan_header

2016-01-23T04:55:48+00:00

[ Upstream commit f654861569872d10dcb79d9d7ca219b316f94ff0 ]

skb_reorder_vlan_header is called after the vlan header has
been pulled.  As a result the offset of the begining of
the mac header has been incrased by 4 bytes (VLAN_HLEN).
When moving the mac addresses, include this incrase in
the offset calcualation so that the mac addresses are
copied correctly.

Fixes: a6e18ff1117 (vlan: Fix untag operations of stacked vlans with REORDER_HEADER off)
CC: Nicolas Dichtel 
CC: Patrick McHardy 
Signed-off-by: Vladislav Yasevich 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

vlan: Fix untag operations of stacked vlans with REORDER_HEADER off

2016-01-23T04:55:48+00:00

[ Upstream commit a6e18ff111701b4ff6947605bfbe9594ec42a6e8 ]

When we have multiple stacked vlan devices all of which have
turned off REORDER_HEADER flag, the untag operation does not
locate the ethernet addresses correctly for nested vlans.
The reason is that in case of REORDER_HEADER flag being off,
the outer vlan headers are put back and the mac_len is adjusted
to account for the presense of the header.  Then, the subsequent
untag operation, for the next level vlan, always use VLAN_ETH_HLEN
to locate the begining of the ethernet header and that ends up
being a multiple of 4 bytes short of the actuall beginning
of the mac header (the multiple depending on the how many vlan
encapsulations ethere are).

As a reslult, if there are multiple levles of vlan devices
with REODER_HEADER being off, the recevied packets end up
being dropped.

To solve this, we use skb->mac_len as the offset.  The value
is always set on receive path and starts out as a ETH_HLEN.
The value is also updated when the vlan header manupations occur
so we know it will be correct.

Signed-off-by: Vladislav Yasevich 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

sctp: update the netstamp_needed counter when copying sockets

2016-01-23T04:55:44+00:00

[ Upstream commit 01ce63c90170283a9855d1db4fe81934dddce648 ]

Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
related to disabling sock timestamp.

When SCTP accepts an association or peel one off, it copies sock flags
but forgot to call net_enable_timestamp() if a packet timestamping flag
was copied, leading to extra calls to net_disable_timestamp() whenever
such clones were closed.

The fix is to call net_enable_timestamp() whenever we copy a sock with
that flag on, like tcp does.

Reported-by: Dmitry Vyukov 
Signed-off-by: Marcelo Ricardo Leitner 
Acked-by: Vlad Yasevich 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net/neighbour: fix crash at dumping device-agnostic proxy entries

2015-12-15T05:41:04+00:00

[ Upstream commit 6adc5fd6a142c6e2c80574c1db0c7c17dedaa42e ]

Proxy entries could have null pointer to net-device.

Signed-off-by: Konstantin Khlebnikov 
Fixes: 84920c1420e2 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2")
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net, scm: fix PaX detected msg_controllen overflow in scm_detach_fds

2015-12-15T05:41:02+00:00

[ Upstream commit 6900317f5eff0a7070c5936e5383f589e0de7a09 ]

David and HacKurx reported a following/similar size overflow triggered
in a grsecurity kernel, thanks to PaX's gcc size overflow plugin:

(Already fixed in later grsecurity versions by Brad and PaX Team.)

[ 1002.296137] PAX: size overflow detected in function scm_detach_fds net/core/scm.c:314
               cicus.202_127 min, count: 4, decl: msg_controllen; num: 0; context: msghdr;
[ 1002.296145] CPU: 0 PID: 3685 Comm: scm_rights_recv Not tainted 4.2.3-grsec+ #7
[ 1002.296149] Hardware name: Apple Inc. MacBookAir5,1/Mac-66F35F19FE2A0D05, [...]
[ 1002.296153]  ffffffff81c27366 0000000000000000 ffffffff81c27375 ffffc90007843aa8
[ 1002.296162]  ffffffff818129ba 0000000000000000 ffffffff81c27366 ffffc90007843ad8
[ 1002.296169]  ffffffff8121f838 fffffffffffffffc fffffffffffffffc ffffc90007843e60
[ 1002.296176] Call Trace:
[ 1002.296190]  [] dump_stack+0x45/0x57
[ 1002.296200]  [] report_size_overflow+0x38/0x60
[ 1002.296209]  [] scm_detach_fds+0x2ce/0x300
[ 1002.296220]  [] unix_stream_read_generic+0x609/0x930
[ 1002.296228]  [] unix_stream_recvmsg+0x4f/0x60
[ 1002.296236]  [] ? unix_set_peek_off+0x50/0x50
[ 1002.296243]  [] sock_recvmsg+0x47/0x60
[ 1002.296248]  [] ___sys_recvmsg+0xe2/0x1e0
[ 1002.296257]  [] __sys_recvmsg+0x46/0x80
[ 1002.296263]  [] SyS_recvmsg+0x2c/0x40
[ 1002.296271]  [] entry_SYSCALL_64_fastpath+0x12/0x85

Further investigation showed that this can happen when an *odd* number of
fds are being passed over AF_UNIX sockets.

In these cases CMSG_LEN(i * sizeof(int)) and CMSG_SPACE(i * sizeof(int)),
where i is the number of successfully passed fds, differ by 4 bytes due
to the extra CMSG_ALIGN() padding in CMSG_SPACE() to an 8 byte boundary
on 64 bit. The padding is used to align subsequent cmsg headers in the
control buffer.

When the control buffer passed in from the receiver side *lacks* these 4
bytes (e.g. due to buggy/wrong API usage), then msg->msg_controllen will
overflow in scm_detach_fds():

  int cmlen = CMSG_LEN(i * sizeof(int));  <--- cmlen w/o tail-padding
  err = put_user(SOL_SOCKET, &cm->cmsg_level);
  if (!err)
    err = put_user(SCM_RIGHTS, &cm->cmsg_type);
  if (!err)
    err = put_user(cmlen, &cm->cmsg_len);
  if (!err) {
    cmlen = CMSG_SPACE(i * sizeof(int));  <--- cmlen w/ 4 byte extra tail-padding
    msg->msg_control += cmlen;
    msg->msg_controllen -= cmlen;         <--- iff no tail-padding space here ...
  }                                            ... wrap-around

F.e. it will wrap to a length of 18446744073709551612 bytes in case the
receiver passed in msg->msg_controllen of 20 bytes, and the sender
properly transferred 1 fd to the receiver, so that its CMSG_LEN results
in 20 bytes and CMSG_SPACE in 24 bytes.

In case of MSG_CMSG_COMPAT (scm_detach_fds_compat()), I haven't seen an
issue in my tests as alignment seems always on 4 byte boundary. Same
should be in case of native 32 bit, where we end up with 4 byte boundaries
as well.

In practice, passing msg->msg_controllen of 20 to recvmsg() while receiving
a single fd would mean that on successful return, msg->msg_controllen is
being set by the kernel to 24 bytes instead, thus more than the input
buffer advertised. It could f.e. become an issue if such application later
on zeroes or copies the control buffer based on the returned msg->msg_controllen
elsewhere.

Maximum number of fds we can send is a hard upper limit SCM_MAX_FD (253).

Going over the code, it seems like msg->msg_controllen is not being read
after scm_detach_fds() in scm_recv() anymore by the kernel, good!

Relevant recvmsg() handler are unix_dgram_recvmsg() (unix_seqpacket_recvmsg())
and unix_stream_recvmsg(). Both return back to their recvmsg() caller,
and ___sys_recvmsg() places the updated length, that is, new msg_control -
old msg_control pointer into msg->msg_controllen (hence the 24 bytes seen
in the example).

Long time ago, Wei Yongjun fixed something related in commit 1ac70e7ad24a
("[NET]: Fix function put_cmsg() which may cause usr application memory
overflow").

RFC3542, section 20.2. says:

  The fields shown as "XX" are possible padding, between the cmsghdr
  structure and the data, and between the data and the next cmsghdr
  structure, if required by the implementation. While sending an
  application may or may not include padding at the end of last
  ancillary data in msg_controllen and implementations must accept both
  as valid. On receiving a portable application must provide space for
  padding at the end of the last ancillary data as implementations may
  copy out the padding at the end of the control message buffer and
  include it in the received msg_controllen. When recvmsg() is called
  if msg_controllen is too small for all the ancillary data items
  including any trailing padding after the last item an implementation
  may set MSG_CTRUNC.

Since we didn't place MSG_CTRUNC for already quite a long time, just do
the same as in 1ac70e7ad24a to avoid an overflow.

Btw, even man-page author got this wrong :/ See db939c9b26e9 ("cmsg.3: Fix
error in SCM_RIGHTS code sample"). Some people must have copied this (?),
thus it got triggered in the wild (reported several times during boot by
David and HacKurx).

No Fixes tag this time as pre 2002 (that is, pre history tree).

Reported-by: David Sterba 
Reported-by: HacKurx 
Cc: PaX Team 
Cc: Emese Revfy 
Cc: Brad Spengler 
Cc: Wei Yongjun 
Cc: Eric Dumazet 
Reviewed-by: Hannes Frederic Sowa 
Signed-off-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman