linux-stable.git/include, branch v3.16.40

tcp: take care of truncations done by sk_filter()

2017-02-23T03:54:46+00:00

commit ac6e780070e30e4c35bd395acfe9191e6268bdd3 upstream.

With syzkaller help, Marco Grassi found a bug in TCP stack,
crashing in tcp_collapse()

Root cause is that sk_filter() can truncate the incoming skb,
but TCP stack was not really expecting this to happen.
It probably was expecting a simple DROP or ACCEPT behavior.

We first need to make sure no part of TCP header could be removed.
Then we need to adjust TCP_SKB_CB(skb)->end_seq

Many thanks to syzkaller team and Marco for giving us a reproducer.

Signed-off-by: Eric Dumazet 
Reported-by: Marco Grassi 
Reported-by: Vladis Dronov 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings

dccp: limit sk_filter trim to payload

2017-02-23T03:54:46+00:00

commit 4f0c40d94461cfd23893a17335b2ab78ecb333c8 upstream.

Dccp verifies packet integrity, including length, at initial rcv in
dccp_invalid_packet, later pulls headers in dccp_enqueue_skb.

A call to sk_filter in-between can cause __skb_pull to wrap skb->len.
skb_copy_datagram_msg interprets this as a negative value, so
(correctly) fails with EFAULT. The negative length is reported in
ioctl SIOCINQ or possibly in a DCCP_WARN in dccp_close.

Introduce an sk_receive_skb variant that caps how small a filter
program can trim packets, and call this in dccp with the header
length. Excessively trimmed packets are now processed normally and
queued for reception as 0B payloads.

Fixes: 7c657876b63c ("[DCCP]: Initial implementation")
Signed-off-by: Willem de Bruijn 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
Signed-off-by: Ben Hutchings

rose: limit sk_filter trim to payload

2017-02-23T03:54:46+00:00

commit f4979fcea7fd36d8e2f556abef86f80e0d5af1ba upstream.

Sockets can have a filter program attached that drops or trims
incoming packets based on the filter program return value.

Rose requires data packets to have at least ROSE_MIN_LEN bytes. It
verifies this on arrival in rose_route_frame and unconditionally pulls
the bytes in rose_recvmsg. The filter can trim packets to below this
value in-between, causing pull to fail, leaving the partial header at
the time of skb_copy_datagram_msg.

Place a lower bound on the size to which sk_filter may trim packets
by introducing sk_filter_trim_cap and call this for rose packets.

Signed-off-by: Willem de Bruijn 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings

net: Add __sock_queue_rcv_skb()

2017-02-23T03:54:46+00:00

Extraxcted from commit e6afc8ace6dd5cef5e812f26c72579da8806f5ac
"udp: remove headers from UDP packets before queueing".

Signed-off-by: Ben Hutchings

can: raw: raw_setsockopt: limit number of can_filter that can be set

2017-02-23T03:54:43+00:00

commit 332b05ca7a438f857c61a3c21a88489a21532364 upstream.

This patch adds a check to limit the number of can_filters that can be
set via setsockopt on CAN_RAW sockets. Otherwise allocations > MAX_ORDER
are not prevented resulting in a warning.

Reference: https://lkml.org/lkml/2016/12/2/230

Reported-by: Andrey Konovalov 
Tested-by: Andrey Konovalov 
Signed-off-by: Marc Kleine-Budde 
Signed-off-by: Ben Hutchings

ip6_tunnel: Clear IP6CB in ip6tunnel_xmit()

2017-02-23T03:54:27+00:00

commit 23f4ffedb7d751c7e298732ba91ca75d224bc1a6 upstream.

skb->cb may contain data from previous layers. In the observed scenario,
the garbage data were misinterpreted as IP6CB(skb)->frag_max_size, so
that small packets sent through the tunnel are mistakenly fragmented.

This patch unconditionally clears the control buffer in ip6tunnel_xmit(),
which affects ip6_tunnel, ip6_udp_tunnel and ip6_gre. Currently none of
these tunnels set IP6CB(skb)->flags, otherwise it needs to be done earlier.

Signed-off-by: Eli Cooper 
Signed-off-by: David S. Miller 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings

netfilter: nf_tables: fix type mismatch with error return from nft_parse_u32_check

2017-02-23T03:54:23+00:00

commit f1d505bb762e30bf316ff5d3b604914649d6aed3 upstream.

Commit 36b701fae12ac ("netfilter: nf_tables: validate maximum value of
u32 netlink attributes") introduced nft_parse_u32_check with a return
value of "unsigned int", yet on error it returns "-ERANGE".

This patch corrects the mismatch by changing the return value to "int",
which happens to match the actual users of nft_parse_u32_check already.

Found by Coverity, CID 1373930.

Note that commit 21a9e0f1568ea ("netfilter: nft_exthdr: fix error
handling in nft_exthdr_init()) attempted to address the issue, but
did not address the return type of nft_parse_u32_check.

Signed-off-by: John W. Linville 
Cc: Laura Garcia Liebana 
Cc: Pablo Neira Ayuso 
Cc: Dan Carpenter 
Fixes: 36b701fae12ac ("netfilter: nf_tables: validate maximum value...")
Signed-off-by: Pablo Neira Ayuso 
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings

target: Make EXTENDED_COPY 0xe4 failure return COPY TARGET DEVICE NOT REACHABLE

2017-02-23T03:54:18+00:00

commit 449a137846c84829a328757cd21fd9ca65c08519 upstream.

This patch addresses a bug where EXTENDED_COPY across multiple LUNs
results in a CHECK_CONDITION when the source + destination are not
located on the same physical node.

ESX Host environments expect sense COPY_ABORTED w/ COPY TARGET DEVICE
NOT REACHABLE to be returned when this occurs, in order to signal
fallback to local copy method.

As described in section 6.3.3 of spc4r22:

  "If it is not possible to complete processing of a segment because the
   copy manager is unable to establish communications with a copy target
   device, because the copy target device does not respond to INQUIRY,
   or because the data returned in response to INQUIRY indicates
   an unsupported logical unit, then the EXTENDED COPY command shall be
   terminated with CHECK CONDITION status, with the sense key set to
   COPY ABORTED, and the additional sense code set to COPY TARGET DEVICE
   NOT REACHABLE."

Tested on v4.1.y with ESX v5.5u2+ with BlockCopy across multiple nodes.

Reported-by: Nixon Vincent 
Tested-by: Nixon Vincent 
Cc: Nixon Vincent 
Tested-by: Dinesh Israni 
Signed-off-by: Dinesh Israni 
Cc: Dinesh Israni 
Signed-off-by: Nicholas Bellinger 
[bwh: Backported to 3.16: generate the sense data in
 transport_send_check_condition_and_sense() rather than adding to
 sense_info_table]
Signed-off-by: Ben Hutchings

ipc/sem.c: fix complex_count vs. simple op race

2017-02-23T03:54:12+00:00

commit 5864a2fd3088db73d47942370d0f7210a807b9bc upstream.

Commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") introduced a
race:

sem_lock has a fast path that allows parallel simple operations.
There are two reasons why a simple operation cannot run in parallel:
 - a non-simple operations is ongoing (sma->sem_perm.lock held)
 - a complex operation is sleeping (sma->complex_count != 0)

As both facts are stored independently, a thread can bypass the current
checks by sleeping in the right positions.  See below for more details
(or kernel bugzilla 105651).

The patch fixes that by creating one variable (complex_mode)
that tracks both reasons why parallel operations are not possible.

The patch also updates stale documentation regarding the locking.

With regards to stable kernels:
The patch is required for all kernels that include the
commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") (3.10?)

The alternative is to revert the patch that introduced the race.

The patch is safe for backporting, i.e. it makes no assumptions
about memory barriers in spin_unlock_wait().

Background:
Here is the race of the current implementation:

Thread A: (simple op)
- does the first "sma->complex_count == 0" test

Thread B: (complex op)
- does sem_lock(): This includes an array scan. But the scan can't
  find Thread A, because Thread A does not own sem->lock yet.
- the thread does the operation, increases complex_count,
  drops sem_lock, sleeps

Thread A:
- spin_lock(&sem->lock), spin_is_locked(sma->sem_perm.lock)
- sleeps before the complex_count test

Thread C: (complex op)
- does sem_lock (no array scan, complex_count==1)
- wakes up Thread B.
- decrements complex_count

Thread A:
- does the complex_count test

Bug:
Now both thread A and thread C operate on the same array, without
any synchronization.

Fixes: 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()")
Link: http://lkml.kernel.org/r/1469123695-5661-1-git-send-email-manfred@colorfullife.com
Reported-by: 
Cc: "H. Peter Anvin" 
Cc: Peter Zijlstra 
Cc: Davidlohr Bueso 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: <1vier1@web.de>
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
[bwh: Backported to 3.16:
 - We missed out on some earlier memory barrier changes
 - Use set_mb instead of smp_store_mb]
Signed-off-by: Ben Hutchings

compiler: Allow 1- and 2-byte smp_load_acquire() and smp_store_release()

2017-02-23T03:54:11+00:00

commit 536fa402221f09633e7c5801b327055ab716a363 upstream.

CPUs without single-byte and double-byte loads and stores place some
"interesting" requirements on concurrent code.  For example (adapted
from Peter Hurley's test code), suppose we have the following structure:

	struct foo {
		spinlock_t lock1;
		spinlock_t lock2;
		char a; /* Protected by lock1. */
		char b; /* Protected by lock2. */
	};
	struct foo *foop;

Of course, it is common (and good) practice to place data protected
by different locks in separate cache lines.  However, if the locks are
rarely acquired (for example, only in rare error cases), and there are
a great many instances of the data structure, then memory footprint can
trump false-sharing concerns, so that it can be better to place them in
the same cache cache line as above.

But if the CPU does not support single-byte loads and stores, a store
to foop->a will do a non-atomic read-modify-write operation on foop->b,
which will come as a nasty surprise to someone holding foop->lock2.  So we
now require CPUs to support single-byte and double-byte loads and stores.
Therefore, this commit adjusts the definition of __native_word() to allow
these sizes to be used by smp_load_acquire() and smp_store_release().

Signed-off-by: Paul E. McKenney 
Cc: Peter Zijlstra 
Signed-off-by: Ben Hutchings