linux-stable.git/kernel/bpf, branch v4.9.78

bpf, array: fix overflow in max_entries and undefined behavior in index_mask

2018-01-17T08:38:55+00:00

commit bbeb6e4323dad9b5e0ee9f60c223dd532e2403b1 upstream.

syzkaller tried to allocate a map with 0xfffffffd entries from within a
userns, and thus as an unprivileged user. With the logic recently added in
b2157399cc98 ("bpf: prevent out-of-bounds speculation"), we round
max_entries up to the next power of two for unprivileged users so that we
can apply proper masking into potentially zeroed-out map slots.

However, this generates an index_mask of 0xffffffff, so adding 1 overflows
into a new max_entries of 0. This passes allocation etc., and later, on map
access, we still enforce bounds against the original attr->max_entries
value of 0xfffffffd, triggering GPFs all over the place. Thus, bail out on
overflow in such a case.

Moreover, on 32 bit archs roundup_pow_of_two() cannot be used either,
since fls_long(max_entries - 1) can result in 32, and 1UL << 32 in 32 bit
space is undefined. Therefore, do this by hand in a 64 bit variable.
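
As an illustration only, the overflow check and the 64 bit mask computation
described above might look roughly like the userspace sketch below. The
helper names fls32() and array_round_up() are hypothetical and are not taken
from the patch; the sketch also assumes the caller has already rejected a
request for 0 entries.

```c
#include <errno.h>
#include <stdint.h>

/* 1-based position of the highest set bit, 0 for x == 0.
 * Hypothetical userspace stand-in for fls_long() on a 32 bit value.
 */
static unsigned int fls32(uint32_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Hypothetical helper: compute index_mask and the rounded-up max_entries
 * for an unprivileged array map. Assumes requested != 0.
 * Returns 0 on success, -E2BIG if rounding up overflows 32 bits.
 */
static int array_round_up(uint32_t requested, uint32_t *max_entries,
			  uint32_t *index_mask)
{
	uint64_t mask64;

	/* Shift in a 64 bit variable: fls32() may return 32, and a
	 * 32 bit shift by 32 would be undefined.
	 */
	mask64 = 1ULL << fls32(requested - 1);
	mask64 -= 1;
	*index_mask = (uint32_t)mask64;

	/* index_mask + 1 wraps to 0 when index_mask is 0xffffffff. */
	*max_entries = *index_mask + 1;
	if (*max_entries < requested)
		return -E2BIG;
	return 0;
}
```

With these assumptions, a request for 5 entries yields max_entries 8 and
index_mask 7, while a request for 0xfffffffd entries is rejected with -E2BIG.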

This fixes all the issues triggered by syzkaller's reproducers.

Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
Reported-by: syzbot+b0efb8e572d01bce1ae0@syzkaller.appspotmail.com
Reported-by: syzbot+6c15e9744f75f2364773@syzkaller.appspotmail.com
Reported-by: syzbot+d2f5524fb46fd3b312ee@syzkaller.appspotmail.com
Reported-by: syzbot+61d23c95395cc90dbc2b@syzkaller.appspotmail.com
Reported-by: syzbot+0d363c942452cca68c01@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann 
Signed-off-by: Alexei Starovoitov 
Signed-off-by: Greg Kroah-Hartman

bpf: prevent out-of-bounds speculation

2018-01-17T08:38:55+00:00

commit b2157399cc9898260d6031c5bfe45fe137c1fbe7 upstream.

Under speculation, CPUs may mis-predict branches in bounds checks. Thus,
memory accesses under a bounds check may be speculated even if the
bounds check fails, providing a primitive for building a side channel.

To avoid leaking kernel data, round up array-based maps and mask the index
after the bounds check, so that a speculated load with an out-of-bounds index
will load either a valid value from the array or zero from the padded area.

Unconditionally mask the index for all array types, even when max_entries
is not rounded up to a power of 2 for the root user.
When a map is created by an unpriv user, generate a sequence of bpf insns
that includes an AND operation to make sure that JITed code includes
the same 'index & index_mask' operation.
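
A minimal sketch of the masked lookup, assuming a simplified map layout: the
struct toy_array type and toy_array_lookup() are hypothetical stand-ins, and
only the max_entries and index_mask names come from the text above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified array map. */
struct toy_array {
	uint32_t max_entries;   /* bound enforced on every access */
	uint32_t index_mask;    /* power-of-two allocation size minus 1 */
	uint32_t elem_size;     /* size of one element in bytes */
	unsigned char *value;   /* index_mask + 1 elements, padding zeroed */
};

static void *toy_array_lookup(const struct toy_array *array, uint32_t index)
{
	if (index >= array->max_entries)
		return NULL;

	/* For an in-bounds index the AND is a no-op. If the CPU speculates
	 * past the check above, the masked index still points into the
	 * allocated (and zero-padded) area.
	 */
	return array->value +
	       (size_t)array->elem_size * (index & array->index_mask);
}
```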

If a prog_array map is created by an unpriv user, replace
  bpf_tail_call(ctx, map, index);
with
  if (index < max_entries) {
    index &= map->index_mask;
    bpf_tail_call(ctx, map, index);
  }
(along with roundup to power of 2) to prevent out-of-bounds speculation.
There is a secondary, redundant 'if (index >= max_entries)' check in the
interpreter and in all JITs, but it can be optimized later if necessary.

Other array-like maps (cpumap, devmap, sockmap, perf_event_array, cgroup_array)
cannot be used by unpriv, so no changes there.

That fixes bpf side of "Variant 1: bounds check bypass (CVE-2017-5753)" on
all architectures with and without JIT.

v2->v3:
Daniel noticed that an attack can potentially be crafted via syscall commands
without loading the program, so add masking to those paths as well.

Signed-off-by: Alexei Starovoitov 
Acked-by: John Fastabend 
Signed-off-by: Daniel Borkmann 
Cc: Jiri Slaby 
[ Backported to 4.9 - gregkh ]
Signed-off-by: Greg Kroah-Hartman

bpf: refactor fixup_bpf_calls()

2018-01-17T08:38:55+00:00

commit 79741b3bdec01a8628368fbcfccc7d189ed606cb upstream.

Reduce indent and make it iterate over instructions similar to
convert_ctx_accesses(). Also convert the hard BUG_ON into a soft verifier error.

Signed-off-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
Cc: Jiri Slaby 
[Backported to 4.9.y - gregkh]
Signed-off-by: Greg Kroah-Hartman

bpf: move fixup_bpf_calls() function

2018-01-17T08:38:55+00:00

commit e245c5c6a5656e4d61aa7bb08e9694fd6e5b2b9d upstream.

No functional change.
Move fixup_bpf_calls() to verifier.c;
it is being refactored in the next patch.

Signed-off-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
Cc: Jiri Slaby 
[backported to 4.9 - gregkh]
Signed-off-by: Greg Kroah-Hartman

bpf/verifier: Fix states_equal() comparison of pointer and UNKNOWN

2017-12-29T16:43:00+00:00

An UNKNOWN_VALUE is not supposed to be derived from a pointer, unless
pointer leaks are allowed.  Therefore, states_equal() must not treat
a state with a pointer in a register as "equal" to a state with an
UNKNOWN_VALUE in that register.

This was fixed differently upstream, but the code around here was
largely rewritten in 4.14 by commit f1174f77b50c "bpf/verifier: rework
value tracking".  The bug can be detected by the bpf/verifier sub-test
"pointer/scalar confusion in state equality check (way 1)".

Signed-off-by: Ben Hutchings 
Cc: Edward Cree 
Cc: Jann Horn 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann

bpf: fix incorrect sign extension in check_alu_op()

2017-12-25T13:23:47+00:00

From: Jann Horn 

[ Upstream commit 95a762e2c8c942780948091f8f2a4f32fce1ac6f ]

Distinguish between
BPF_ALU64|BPF_MOV|BPF_K (load 32-bit immediate, sign-extended to 64-bit)
and BPF_ALU|BPF_MOV|BPF_K (load 32-bit immediate, zero-padded to 64-bit);
only perform sign extension in the first case.
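
The difference between the two encodings can be sketched in plain C as
follows. The function names mov64_imm() and mov32_imm() are hypothetical and
only model the resulting register value, not the verifier code itself.

```c
#include <stdint.h>

/* BPF_ALU64 | BPF_MOV | BPF_K: the 32-bit immediate is sign-extended
 * to 64 bits.
 */
static uint64_t mov64_imm(int32_t imm)
{
	return (uint64_t)(int64_t)imm;
}

/* BPF_ALU | BPF_MOV | BPF_K: the 32-bit immediate is zero-padded,
 * so the upper 32 bits of the register are always 0.
 */
static uint64_t mov32_imm(int32_t imm)
{
	return (uint64_t)(uint32_t)imm;
}
```

For an immediate of -1, the first form gives 0xffffffffffffffff and the
second gives 0x00000000ffffffff.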

Starting with v4.14, this is exploitable by unprivileged users as long as
the unprivileged_bpf_disabled sysctl isn't set.

Debian assigned CVE-2017-16995 for this issue.

v3:
 - add CVE number (Ben Hutchings)

Fixes: 484611357c19 ("bpf: allow access into map value arrays")
Signed-off-by: Jann Horn 
Acked-by: Edward Cree 
Signed-off-by: Alexei Starovoitov 
Signed-off-by: Daniel Borkmann 
Signed-off-by: Greg Kroah-Hartman

bpf: reject out-of-bounds stack pointer calculation

2017-12-25T13:23:47+00:00

From: Jann Horn 

Reject programs that compute wildly out-of-bounds stack pointers.
Otherwise, pointers can be computed with an offset that doesn't fit into an
`int`, causing security issues in the stack memory access check (as well as
signed integer overflow during offset addition).
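
One possible shape of such a check is sketched below. It is not the code
from this patch: toy_stack_ptr_add() and TOY_MAX_STACK are hypothetical, the
512 byte limit is an assumption, and the tracked offset is assumed to
already lie within [-TOY_MAX_STACK, 0] on entry.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_STACK 512	/* assumed stack size limit in bytes */

/* Hypothetical check: add a constant to a tracked stack offset, where
 * offsets are <= 0 relative to the frame pointer.
 * Returns 0 and updates *off on success, -EACCES otherwise.
 */
static int toy_stack_ptr_add(int64_t *off, int64_t delta)
{
	int64_t sum;

	/* Reject huge constants first, so the addition cannot overflow. */
	if (delta < -TOY_MAX_STACK || delta > TOY_MAX_STACK)
		return -EACCES;

	sum = *off + delta;
	if (sum > 0 || sum < -TOY_MAX_STACK)
		return -EACCES;

	*off = sum;
	return 0;
}
```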

This is a fix specifically for the v4.9 stable tree because the mainline
code looks very different at this point.

Fixes: 7bca0a9702edf ("bpf: enhance verifier to understand stack pointer arithmetic")
Signed-off-by: Jann Horn 
Acked-by: Daniel Borkmann 
Signed-off-by: Greg Kroah-Hartman

bpf: fix branch pruning logic

2017-12-25T13:23:47+00:00

From: Alexei Starovoitov 

[ Upstream commit c131187db2d3fa2f8bf32fdf4e9a4ef805168467 ]

When the verifier detects that a register contains a runtime constant
and it is compared with another constant, it prunes exploration
of the branch that is guaranteed not to be taken at runtime.
This is all correct, but a malicious program may be constructed
in such a way that it always has a constant comparison and
the other branch is never taken under any conditions.
In this case, that path through the program will not be explored
by the verifier. It won't be taken at run time either, but since
all instructions are JITed, the malicious program may cause JITs
to complain about using reserved fields, etc.
To fix the issue, we have to track the instructions explored by
the verifier and sanitize instructions that are dead at run time
with NOPs. We cannot reject such dead code, because llvm generates
it for valid C code: it doesn't do as much data flow
analysis as the verifier does.
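
A rough sketch of the sanitizing pass, under simplifying assumptions: the
struct toy_insn type, the TOY_NOP opcode and the seen[] array are all
hypothetical, standing in for the real instruction format and for whatever
per-instruction bookkeeping the verifier keeps.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction; real BPF instructions have more fields. */
struct toy_insn {
	uint8_t code;
	int32_t imm;
};

#define TOY_NOP 0x00	/* hypothetical no-op opcode */

/* Overwrite every instruction the verifier never explored with a NOP.
 * seen[i] is true if instruction i was reached during verification.
 * Returns the number of instructions that were patched.
 */
static size_t toy_sanitize_dead_code(struct toy_insn *insns,
				     const bool *seen, size_t len)
{
	const struct toy_insn nop = { .code = TOY_NOP, .imm = 0 };
	size_t i, patched = 0;

	for (i = 0; i < len; i++) {
		if (seen[i])
			continue;
		insns[i] = nop;
		patched++;
	}
	return patched;
}
```

Patching in place keeps the program length and all jump offsets unchanged,
which is why the dead instructions are replaced rather than removed.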

Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
Signed-off-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Signed-off-by: Daniel Borkmann 
Signed-off-by: Greg Kroah-Hartman

bpf: adjust insn_aux_data when patching insns

2017-12-25T13:23:47+00:00

From: Alexei Starovoitov 

[ Upstream commit 8041902dae5299c1f194ba42d14383f734631009 ]

convert_ctx_accesses() replaces a single bpf instruction with a set of
instructions. Adjust the corresponding insn_aux_data while patching.
This is needed to make sure subsequent 'for (all insn)' loops
have matching insn and insn_aux_data.
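
The adjustment can be pictured with the userspace sketch below. The struct
toy_aux type and toy_adjust_aux() are hypothetical; the sketch assumes that
one instruction at index off was replaced by cnt >= 1 instructions and that
new_len is the program length after patching.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-instruction auxiliary data. */
struct toy_aux {
	uint32_t tag;
};

/* Grow the aux array to match a patched program. Entries before off keep
 * their position, the entry for the patched instruction and everything
 * after it move up by cnt - 1, and the new slots in between are zeroed.
 * Returns the new array (the old one is freed), or NULL on allocation
 * failure, in which case the old array is left untouched.
 */
static struct toy_aux *toy_adjust_aux(struct toy_aux *old, size_t new_len,
				      size_t off, size_t cnt)
{
	struct toy_aux *new_data;

	if (cnt == 1)
		return old;	/* length unchanged, nothing to move */

	new_data = calloc(new_len, sizeof(*new_data));
	if (!new_data)
		return NULL;

	memcpy(new_data, old, sizeof(*old) * off);
	memcpy(new_data + off + cnt - 1, old + off,
	       sizeof(*old) * (new_len - off - cnt + 1));
	free(old);
	return new_data;
}
```

For example, replacing instruction 1 of a 3-instruction program with
3 instructions gives a 5-entry array in which the old entries 1 and 2 end up
at indexes 3 and 4.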

Signed-off-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

bpf: fix lockdep splat

2017-12-14T08:28:23+00:00

[ Upstream commit 89ad2fa3f043a1e8daae193bcb5fe34d5f8caf28 ]

pcpu_freelist_pop() needs the same lockdep awareness as
pcpu_freelist_populate() to avoid a false positive.

 [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]

 switchto-defaul/12508 [HC0[0]:SC0[6]:HE0:SE0] is trying to acquire:
  (&htab->buckets[i].lock){......}, at: [] __htab_percpu_map_update_elem+0x1cb/0x300

 and this task is already holding:
  (dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2){+.-...}, at: [] __dev_queue_xmit+0x868/0x1240
 which would create a new lock dependency:
  (dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2){+.-...} -> (&htab->buckets[i].lock){......}

 but this new dependency connects a SOFTIRQ-irq-safe lock:
  (dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2){+.-...}
 ... which became SOFTIRQ-irq-safe at:
   [] __lock_acquire+0x42b/0x1f10
   [] lock_acquire+0xbc/0x1b0
   [] _raw_spin_lock+0x38/0x50
   [] __dev_queue_xmit+0x868/0x1240
   [] dev_queue_xmit+0x10/0x20
   [] ip_finish_output2+0x439/0x590
   [] ip_finish_output+0x150/0x2f0
   [] ip_output+0x7d/0x260
   [] ip_local_out+0x5e/0xe0
   [] ip_queue_xmit+0x205/0x620
   [] tcp_transmit_skb+0x5a8/0xcb0
   [] tcp_write_xmit+0x242/0x1070
   [] __tcp_push_pending_frames+0x3c/0xf0
   [] tcp_rcv_established+0x312/0x700
   [] tcp_v4_do_rcv+0x11c/0x200
   [] tcp_v4_rcv+0xaa2/0xc30
   [] ip_local_deliver_finish+0xa7/0x240
   [] ip_local_deliver+0x66/0x200
   [] ip_rcv_finish+0xdd/0x560
   [] ip_rcv+0x295/0x510
   [] __netif_receive_skb_core+0x988/0x1020
   [] __netif_receive_skb+0x21/0x70
   [] process_backlog+0x6f/0x230
   [] net_rx_action+0x229/0x420
   [] __do_softirq+0xd8/0x43d
   [] do_softirq_own_stack+0x1c/0x30
   [] do_softirq+0x55/0x60
   [] __local_bh_enable_ip+0xa8/0xb0
   [] cpu_startup_entry+0x1c7/0x500
   [] start_secondary+0x113/0x140

 to a SOFTIRQ-irq-unsafe lock:
  (&head->lock){+.+...}
 ... which became SOFTIRQ-irq-unsafe at:
 ...  [] __lock_acquire+0x82f/0x1f10
   [] lock_acquire+0xbc/0x1b0
   [] _raw_spin_lock+0x38/0x50
   [] pcpu_freelist_pop+0x7a/0xb0
   [] htab_map_alloc+0x50c/0x5f0
   [] SyS_bpf+0x265/0x1200
   [] entry_SYSCALL_64_fastpath+0x12/0x17

 other info that might help us debug this:

 Chain exists of:
   dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2 --> &htab->buckets[i].lock --> &head->lock

  Possible interrupt unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&head->lock);
                                local_irq_disable();
                                lock(dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2);
                                lock(&htab->buckets[i].lock);
   <Interrupt>
     lock(dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2);

  *** DEADLOCK ***

Fixes: e19494edab82 ("bpf: introduce percpu_freelist")
Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman