<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/include/linux/bpf_verifier.h, branch v4.20-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>bpf: fix partial copy of map_ptr when dst is scalar</title>
<updated>2018-10-31T23:53:17+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2018-10-31T23:05:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0962590e553331db2cc0aef2dc35c57f6300dbbe'/>
<id>0962590e553331db2cc0aef2dc35c57f6300dbbe</id>
<content type='text'>
ALU operations on pointers such as scalar_reg += map_value_ptr are
handled in adjust_ptr_min_max_vals(). Problem is however that map_ptr
and range in the register state share a union, so transferring state
through dst_reg-&gt;range = ptr_reg-&gt;range is just buggy as any new
map_ptr in the dst_reg is then truncated (or null) for subsequent
checks. Fix this by adding a raw member and use it for copying state
over to dst_reg.

Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Edward Cree &lt;ecree@solarflare.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ALU operations on pointers such as scalar_reg += map_value_ptr are
handled in adjust_ptr_min_max_vals(). Problem is however that map_ptr
and range in the register state share a union, so transferring state
through dst_reg-&gt;range = ptr_reg-&gt;range is just buggy as any new
map_ptr in the dst_reg is then truncated (or null) for subsequent
checks. Fix this by adding a raw member and use it for copying state
over to dst_reg.

Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Edward Cree &lt;ecree@solarflare.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: add verifier callback to get stack usage info for offloaded progs</title>
<updated>2018-10-08T08:24:12+00:00</updated>
<author>
<name>Quentin Monnet</name>
<email>quentin.monnet@netronome.com</email>
</author>
<published>2018-10-07T11:56:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c941ce9c282cc606e6517356fcc186a9da2b4ab9'/>
<id>c941ce9c282cc606e6517356fcc186a9da2b4ab9</id>
<content type='text'>
In preparation for BPF-to-BPF calls in offloaded programs, add a new
function attribute to the struct bpf_prog_offload_ops so that drivers
supporting eBPF offload can hook at the end of program verification, and
potentially extract information collected by the verifier.

Implement a minimal callback (returning 0) in the drivers providing the
structs, namely netdevsim and nfp.

This will be useful in the nfp driver, in later commits, to extract the
number of subprograms as well as the stack depth for those subprograms.

Signed-off-by: Quentin Monnet &lt;quentin.monnet@netronome.com&gt;
Reviewed-by: Jiong Wang &lt;jiong.wang@netronome.com&gt;
Reviewed-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In preparation for BPF-to-BPF calls in offloaded programs, add a new
function attribute to the struct bpf_prog_offload_ops so that drivers
supporting eBPF offload can hook at the end of program verification, and
potentially extract information collected by the verifier.

Implement a minimal callback (returning 0) in the drivers providing the
structs, namely netdevsim and nfp.

This will be useful in the nfp driver, in later commits, to extract the
number of subprograms as well as the stack depth for those subprograms.

Signed-off-by: Quentin Monnet &lt;quentin.monnet@netronome.com&gt;
Reviewed-by: Jiong Wang &lt;jiong.wang@netronome.com&gt;
Reviewed-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Add reference tracking to verifier</title>
<updated>2018-10-03T00:53:47+00:00</updated>
<author>
<name>Joe Stringer</name>
<email>joe@wand.net.nz</email>
</author>
<published>2018-10-02T20:35:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fd978bf7fd312581a7ca454a991f0ffb34c4204b'/>
<id>fd978bf7fd312581a7ca454a991f0ffb34c4204b</id>
<content type='text'>
Allow helper functions to acquire a reference and return it into a
register. Specific pointer types such as the PTR_TO_SOCKET will
implicitly represent such a reference. The verifier must ensure that
these references are released exactly once in each path through the
program.

To achieve this, this commit assigns an id to the pointer and tracks it
in the 'bpf_func_state', then when the function or program exits,
verifies that all of the acquired references have been freed. When the
pointer is passed to a function that frees the reference, it is removed
from the 'bpf_func_state` and all existing copies of the pointer in
registers are marked invalid.

Signed-off-by: Joe Stringer &lt;joe@wand.net.nz&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow helper functions to acquire a reference and return it into a
register. Specific pointer types such as the PTR_TO_SOCKET will
implicitly represent such a reference. The verifier must ensure that
these references are released exactly once in each path through the
program.

To achieve this, this commit assigns an id to the pointer and tracks it
in the 'bpf_func_state', then when the function or program exits,
verifies that all of the acquired references have been freed. When the
pointer is passed to a function that frees the reference, it is removed
from the 'bpf_func_state` and all existing copies of the pointer in
registers are marked invalid.

Signed-off-by: Joe Stringer &lt;joe@wand.net.nz&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Add PTR_TO_SOCKET verifier type</title>
<updated>2018-10-03T00:53:47+00:00</updated>
<author>
<name>Joe Stringer</name>
<email>joe@wand.net.nz</email>
</author>
<published>2018-10-02T20:35:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c64b7983288e636356f7f5f652de4813e1cfedac'/>
<id>c64b7983288e636356f7f5f652de4813e1cfedac</id>
<content type='text'>
Teach the verifier a little bit about a new type of pointer, a
PTR_TO_SOCKET. This pointer type is accessed from BPF through the
'struct bpf_sock' structure.

Signed-off-by: Joe Stringer &lt;joe@wand.net.nz&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Teach the verifier a little bit about a new type of pointer, a
PTR_TO_SOCKET. This pointer type is accessed from BPF through the
'struct bpf_sock' structure.

Signed-off-by: Joe Stringer &lt;joe@wand.net.nz&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Add iterator for spilled registers</title>
<updated>2018-10-03T00:53:46+00:00</updated>
<author>
<name>Joe Stringer</name>
<email>joe@wand.net.nz</email>
</author>
<published>2018-10-02T20:35:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f3709f69b7c5cba6323cc03c29b64293b93be817'/>
<id>f3709f69b7c5cba6323cc03c29b64293b93be817</id>
<content type='text'>
Add this iterator for spilled registers, it concentrates the details of
how to get the current frame's spilled registers into a single macro
while clarifying the intention of the code which is calling the macro.

Signed-off-by: Joe Stringer &lt;joe@wand.net.nz&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add this iterator for spilled registers, it concentrates the details of
how to get the current frame's spilled registers into a single macro
while clarifying the intention of the code which is calling the macro.

Signed-off-by: Joe Stringer &lt;joe@wand.net.nz&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf/verifier: per-register parent pointers</title>
<updated>2018-08-30T01:52:12+00:00</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2018-08-22T19:02:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=679c782de14bd48c19dd74cd1af20a2bc05dd936'/>
<id>679c782de14bd48c19dd74cd1af20a2bc05dd936</id>
<content type='text'>
By giving each register its own liveness chain, we elide the skip_callee()
 logic.  Instead, each register's parent is the state it inherits from;
 both check_func_call() and prepare_func_exit() automatically connect
 reg states to the correct chain since when they copy the reg state across
 (r1-r5 into the callee as args, and r0 out as the return value) they also
 copy the parent pointer.

Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
By giving each register its own liveness chain, we elide the skip_callee()
 logic.  Instead, each register's parent is the state it inherits from;
 both check_func_call() and prepare_func_exit() automatically connect
 reg states to the correct chain since when they copy the reg state across
 (r1-r5 into the callee as args, and r0 out as the return value) they also
 copy the parent pointer.

Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2018-05-26T23:46:15+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2018-05-26T23:46:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5b79c2af667c0e2684f2a6dbf6439074b78f490c'/>
<id>5b79c2af667c0e2684f2a6dbf6439074b78f490c</id>
<content type='text'>
Lots of easy overlapping changes in the confict
resolutions here.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Lots of easy overlapping changes in the confict
resolutions here.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2018-05-26T02:54:42+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-05-26T02:54:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=03250e1028057173b212341015d5fbf53327042c'/>
<id>03250e1028057173b212341015d5fbf53327042c</id>
<content type='text'>
Pull networking fixes from David Miller:
 "Let's begin the holiday weekend with some networking fixes:

   1) Whoops need to restrict cfg80211 wiphy names even more to 64
      bytes. From Eric Biggers.

   2) Fix flags being ignored when using kernel_connect() with SCTP,
      from Xin Long.

   3) Use after free in DCCP, from Alexey Kodanev.

   4) Need to check rhltable_init() return value in ipmr code, from Eric
      Dumazet.

   5) XDP handling fixes in virtio_net from Jason Wang.

   6) Missing RTA_TABLE in rtm_ipv4_policy[], from Roopa Prabhu.

   7) Need to use IRQ disabling spinlocks in mlx4_qp_lookup(), from Jack
      Morgenstein.

   8) Prevent out-of-bounds speculation using indexes in BPF, from
      Daniel Borkmann.

   9) Fix regression added by AF_PACKET link layer cure, from Willem de
      Bruijn.

  10) Correct ENIC dma mask, from Govindarajulu Varadarajan.

  11) Missing config options for PMTU tests, from Stefano Brivio"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits)
  ibmvnic: Fix partial success login retries
  selftests/net: Add missing config options for PMTU tests
  mlx4_core: allocate ICM memory in page size chunks
  enic: set DMA mask to 47 bit
  ppp: remove the PPPIOCDETACH ioctl
  ipv4: remove warning in ip_recv_error
  net : sched: cls_api: deal with egdev path only if needed
  vhost: synchronize IOTLB message with dev cleanup
  packet: fix reserve calculation
  net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands
  net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
  bpf: properly enforce index mask to prevent out-of-bounds speculation
  net/mlx4: Fix irq-unsafe spinlock usage
  net: phy: broadcom: Fix bcm_write_exp()
  net: phy: broadcom: Fix auxiliary control register reads
  net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy
  net/mlx4: fix spelling mistake: "Inrerface" -&gt; "Interface" and rephrase message
  ibmvnic: Only do H_EOI for mobility events
  tuntap: correctly set SOCKWQ_ASYNC_NOSPACE
  virtio-net: fix leaking page for gso packet during mergeable XDP
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull networking fixes from David Miller:
 "Let's begin the holiday weekend with some networking fixes:

   1) Whoops need to restrict cfg80211 wiphy names even more to 64
      bytes. From Eric Biggers.

   2) Fix flags being ignored when using kernel_connect() with SCTP,
      from Xin Long.

   3) Use after free in DCCP, from Alexey Kodanev.

   4) Need to check rhltable_init() return value in ipmr code, from Eric
      Dumazet.

   5) XDP handling fixes in virtio_net from Jason Wang.

   6) Missing RTA_TABLE in rtm_ipv4_policy[], from Roopa Prabhu.

   7) Need to use IRQ disabling spinlocks in mlx4_qp_lookup(), from Jack
      Morgenstein.

   8) Prevent out-of-bounds speculation using indexes in BPF, from
      Daniel Borkmann.

   9) Fix regression added by AF_PACKET link layer cure, from Willem de
      Bruijn.

  10) Correct ENIC dma mask, from Govindarajulu Varadarajan.

  11) Missing config options for PMTU tests, from Stefano Brivio"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits)
  ibmvnic: Fix partial success login retries
  selftests/net: Add missing config options for PMTU tests
  mlx4_core: allocate ICM memory in page size chunks
  enic: set DMA mask to 47 bit
  ppp: remove the PPPIOCDETACH ioctl
  ipv4: remove warning in ip_recv_error
  net : sched: cls_api: deal with egdev path only if needed
  vhost: synchronize IOTLB message with dev cleanup
  packet: fix reserve calculation
  net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands
  net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
  bpf: properly enforce index mask to prevent out-of-bounds speculation
  net/mlx4: Fix irq-unsafe spinlock usage
  net: phy: broadcom: Fix bcm_write_exp()
  net: phy: broadcom: Fix auxiliary control register reads
  net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy
  net/mlx4: fix spelling mistake: "Inrerface" -&gt; "Interface" and rephrase message
  ibmvnic: Only do H_EOI for mobility events
  tuntap: correctly set SOCKWQ_ASYNC_NOSPACE
  virtio-net: fix leaking page for gso packet during mergeable XDP
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: properly enforce index mask to prevent out-of-bounds speculation</title>
<updated>2018-05-24T15:15:43+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2018-05-24T00:32:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c93552c443ebc63b14e26e46d2e76941c88e0d71'/>
<id>c93552c443ebc63b14e26e46d2e76941c88e0d71</id>
<content type='text'>
While reviewing the verifier code, I recently noticed that the
following two program variants in relation to tail calls can be
loaded.

Variant 1:

  # bpftool p d x i 15
    0: (15) if r1 == 0x0 goto pc+3
    1: (18) r2 = map[id:5]
    3: (05) goto pc+2
    4: (18) r2 = map[id:6]
    6: (b7) r3 = 7
    7: (35) if r3 &gt;= 0xa0 goto pc+2
    8: (54) (u32) r3 &amp;= (u32) 255
    9: (85) call bpf_tail_call#12
   10: (b7) r0 = 1
   11: (95) exit

  # bpftool m s i 5
    5: prog_array  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B
  # bpftool m s i 6
    6: prog_array  flags 0x0
        key 4B  value 4B  max_entries 160  memlock 4096B

Variant 2:

  # bpftool p d x i 20
    0: (15) if r1 == 0x0 goto pc+3
    1: (18) r2 = map[id:8]
    3: (05) goto pc+2
    4: (18) r2 = map[id:7]
    6: (b7) r3 = 7
    7: (35) if r3 &gt;= 0x4 goto pc+2
    8: (54) (u32) r3 &amp;= (u32) 3
    9: (85) call bpf_tail_call#12
   10: (b7) r0 = 1
   11: (95) exit

  # bpftool m s i 8
    8: prog_array  flags 0x0
        key 4B  value 4B  max_entries 160  memlock 4096B
  # bpftool m s i 7
    7: prog_array  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B

In both cases the index masking inserted by the verifier in order
to control out of bounds speculation from a CPU via b2157399cc98
("bpf: prevent out-of-bounds speculation") seems to be incorrect
in what it is enforcing. In the 1st variant, the mask is applied
from the map with the significantly larger number of entries where
we would allow to a certain degree out of bounds speculation for
the smaller map, and in the 2nd variant where the mask is applied
from the map with the smaller number of entries, we get buggy
behavior since we truncate the index of the larger map.

The original intent from commit b2157399cc98 is to reject such
occasions where two or more different tail call maps are used
in the same tail call helper invocation. However, the check on
the BPF_MAP_PTR_POISON is never hit since we never poisoned the
saved pointer in the first place! We do this explicitly for map
lookups but in case of tail calls we basically used the tail
call map in insn_aux_data that was processed in the most recent
path which the verifier walked. Thus any prior path that stored
a pointer in insn_aux_data at the helper location was always
overridden.

Fix it by moving the map pointer poison logic into a small helper
that covers both BPF helpers with the same logic. After that in
fixup_bpf_calls() the poison check is then hit for tail calls
and the program rejected. Latter only happens in unprivileged
case since this is the *only* occasion where a rewrite needs to
happen, and where such rewrite is specific to the map (max_entries,
index_mask). In the privileged case the rewrite is generic for
the insn-&gt;imm / insn-&gt;code update so multiple maps from different
paths can be handled just fine since all the remaining logic
happens in the instruction processing itself. This is similar
to the case of map lookups: in case there is a collision of
maps in fixup_bpf_calls() we must skip the inlined rewrite since
this will turn the generic instruction sequence into a non-
generic one. Thus the patch_call_imm will simply update the
insn-&gt;imm location where the bpf_map_lookup_elem() will later
take care of the dispatch. Given we need this 'poison' state
as a check, the information of whether a map is an unpriv_array
gets lost, so enforcing it prior to that needs an additional
state. In general this check is needed since there are some
complex and tail call intensive BPF programs out there where
LLVM tends to generate such code occasionally. We therefore
convert the map_ptr rather into map_state to store all this
w/o extra memory overhead, and the bit whether one of the maps
involved in the collision was from an unpriv_array thus needs
to be retained as well there.

Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While reviewing the verifier code, I recently noticed that the
following two program variants in relation to tail calls can be
loaded.

Variant 1:

  # bpftool p d x i 15
    0: (15) if r1 == 0x0 goto pc+3
    1: (18) r2 = map[id:5]
    3: (05) goto pc+2
    4: (18) r2 = map[id:6]
    6: (b7) r3 = 7
    7: (35) if r3 &gt;= 0xa0 goto pc+2
    8: (54) (u32) r3 &amp;= (u32) 255
    9: (85) call bpf_tail_call#12
   10: (b7) r0 = 1
   11: (95) exit

  # bpftool m s i 5
    5: prog_array  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B
  # bpftool m s i 6
    6: prog_array  flags 0x0
        key 4B  value 4B  max_entries 160  memlock 4096B

Variant 2:

  # bpftool p d x i 20
    0: (15) if r1 == 0x0 goto pc+3
    1: (18) r2 = map[id:8]
    3: (05) goto pc+2
    4: (18) r2 = map[id:7]
    6: (b7) r3 = 7
    7: (35) if r3 &gt;= 0x4 goto pc+2
    8: (54) (u32) r3 &amp;= (u32) 3
    9: (85) call bpf_tail_call#12
   10: (b7) r0 = 1
   11: (95) exit

  # bpftool m s i 8
    8: prog_array  flags 0x0
        key 4B  value 4B  max_entries 160  memlock 4096B
  # bpftool m s i 7
    7: prog_array  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B

In both cases the index masking inserted by the verifier in order
to control out of bounds speculation from a CPU via b2157399cc98
("bpf: prevent out-of-bounds speculation") seems to be incorrect
in what it is enforcing. In the 1st variant, the mask is applied
from the map with the significantly larger number of entries where
we would allow to a certain degree out of bounds speculation for
the smaller map, and in the 2nd variant where the mask is applied
from the map with the smaller number of entries, we get buggy
behavior since we truncate the index of the larger map.

The original intent from commit b2157399cc98 is to reject such
occasions where two or more different tail call maps are used
in the same tail call helper invocation. However, the check on
the BPF_MAP_PTR_POISON is never hit since we never poisoned the
saved pointer in the first place! We do this explicitly for map
lookups but in case of tail calls we basically used the tail
call map in insn_aux_data that was processed in the most recent
path which the verifier walked. Thus any prior path that stored
a pointer in insn_aux_data at the helper location was always
overridden.

Fix it by moving the map pointer poison logic into a small helper
that covers both BPF helpers with the same logic. After that in
fixup_bpf_calls() the poison check is then hit for tail calls
and the program rejected. Latter only happens in unprivileged
case since this is the *only* occasion where a rewrite needs to
happen, and where such rewrite is specific to the map (max_entries,
index_mask). In the privileged case the rewrite is generic for
the insn-&gt;imm / insn-&gt;code update so multiple maps from different
paths can be handled just fine since all the remaining logic
happens in the instruction processing itself. This is similar
to the case of map lookups: in case there is a collision of
maps in fixup_bpf_calls() we must skip the inlined rewrite since
this will turn the generic instruction sequence into a non-
generic one. Thus the patch_call_imm will simply update the
insn-&gt;imm location where the bpf_map_lookup_elem() will later
take care of the dispatch. Given we need this 'poison' state
as a check, the information of whether a map is an unpriv_array
gets lost, so enforcing it prior to that needs an additional
state. In general this check is needed since there are some
complex and tail call intensive BPF programs out there where
LLVM tends to generate such code occasionally. We therefore
convert the map_ptr rather into map_state to store all this
w/o extra memory overhead, and the bit whether one of the maps
involved in the collision was from an unpriv_array thus needs
to be retained as well there.

Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Prevent memory disambiguation attack</title>
<updated>2018-05-19T18:44:24+00:00</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2018-05-15T16:27:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=af86ca4e3088fe5eacf2f7e58c01fa68ca067672'/>
<id>af86ca4e3088fe5eacf2f7e58c01fa68ca067672</id>
<content type='text'>
Detect code patterns where malicious 'speculative store bypass' can be used
and sanitize such patterns.

 39: (bf) r3 = r10
 40: (07) r3 += -216
 41: (79) r8 = *(u64 *)(r7 +0)   // slow read
 42: (7a) *(u64 *)(r10 -72) = 0  // verifier inserts this instruction
 43: (7b) *(u64 *)(r8 +0) = r3   // this store becomes slow due to r8
 44: (79) r1 = *(u64 *)(r6 +0)   // cpu speculatively executes this load
 45: (71) r2 = *(u8 *)(r1 +0)    // speculatively arbitrary 'load byte'
                                 // is now sanitized

Above code after x86 JIT becomes:
 e5: mov    %rbp,%rdx
 e8: add    $0xffffffffffffff28,%rdx
 ef: mov    0x0(%r13),%r14
 f3: movq   $0x0,-0x48(%rbp)
 fb: mov    %rdx,0x0(%r14)
 ff: mov    0x0(%rbx),%rdi
103: movzbq 0x0(%rdi),%rsi

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Detect code patterns where malicious 'speculative store bypass' can be used
and sanitize such patterns.

 39: (bf) r3 = r10
 40: (07) r3 += -216
 41: (79) r8 = *(u64 *)(r7 +0)   // slow read
 42: (7a) *(u64 *)(r10 -72) = 0  // verifier inserts this instruction
 43: (7b) *(u64 *)(r8 +0) = r3   // this store becomes slow due to r8
 44: (79) r1 = *(u64 *)(r6 +0)   // cpu speculatively executes this load
 45: (71) r2 = *(u8 *)(r1 +0)    // speculatively arbitrary 'load byte'
                                 // is now sanitized

Above code after x86 JIT becomes:
 e5: mov    %rbp,%rdx
 e8: add    $0xffffffffffffff28,%rdx
 ef: mov    0x0(%r13),%r14
 f3: movq   $0x0,-0x48(%rbp)
 fb: mov    %rdx,0x0(%r14)
 ff: mov    0x0(%rbx),%rdi
103: movzbq 0x0(%rdi),%rsi

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
