<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/include/linux/filter.h, branch v6.1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>bpf: use bpf_prog_pack for bpf_dispatcher</title>
<updated>2022-09-27T03:40:43+00:00</updated>
<author>
<name>Song Liu</name>
<email>song@kernel.org</email>
</author>
<published>2022-09-26T18:47:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=19c02415da2345d0dda2b5c4495bc17cc14b18b5'/>
<id>19c02415da2345d0dda2b5c4495bc17cc14b18b5</id>
<content type='text'>
Allocate bpf_dispatcher with bpf_prog_pack_alloc so that bpf_dispatcher
can share pages with bpf programs.

arch_prepare_bpf_dispatcher() is updated to provide a RW buffer as working
area for arch code to write to.

This also fixes CPA W^X warnning like:

CPA refuse W^X violation: 8000000000000163 -&gt; 0000000000000163 range: ...

Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20220926184739.3512547-2-song@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allocate bpf_dispatcher with bpf_prog_pack_alloc so that bpf_dispatcher
can share pages with bpf programs.

arch_prepare_bpf_dispatcher() is updated to provide a RW buffer as working
area for arch code to write to.

This also fixes CPA W^X warnning like:

CPA refuse W^X violation: 8000000000000163 -&gt; 0000000000000163 range: ...

Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20220926184739.3512547-2-song@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Move nf_conn extern declarations to filter.h</title>
<updated>2022-09-20T21:41:35+00:00</updated>
<author>
<name>Daniel Xu</name>
<email>dxu@dxuuu.xyz</email>
</author>
<published>2022-09-20T14:15:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fdf214978a71b2749d26f6da2b1d51d9ac23831d'/>
<id>fdf214978a71b2749d26f6da2b1d51d9ac23831d</id>
<content type='text'>
We're seeing the following new warnings on netdev/build_32bit and
netdev/build_allmodconfig_warn CI jobs:

    ../net/core/filter.c:8608:1: warning: symbol
    'nf_conn_btf_access_lock' was not declared. Should it be static?
    ../net/core/filter.c:8611:5: warning: symbol 'nfct_bsa' was not
    declared. Should it be static?

Fix by ensuring extern declaration is present while compiling filter.o.

Fixes: 864b656f82cc ("bpf: Add support for writing to nf_conn:mark")
Signed-off-by: Daniel Xu &lt;dxu@dxuuu.xyz&gt;
Link: https://lore.kernel.org/r/2bd2e0283df36d8a4119605878edb1838d144174.1663683114.git.dxu@dxuuu.xyz
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We're seeing the following new warnings on netdev/build_32bit and
netdev/build_allmodconfig_warn CI jobs:

    ../net/core/filter.c:8608:1: warning: symbol
    'nf_conn_btf_access_lock' was not declared. Should it be static?
    ../net/core/filter.c:8611:5: warning: symbol 'nfct_bsa' was not
    declared. Should it be static?

Fix by ensuring extern declaration is present while compiling filter.o.

Fixes: 864b656f82cc ("bpf: Add support for writing to nf_conn:mark")
Signed-off-by: Daniel Xu &lt;dxu@dxuuu.xyz&gt;
Link: https://lore.kernel.org/r/2bd2e0283df36d8a4119605878edb1838d144174.1663683114.git.dxu@dxuuu.xyz
Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Use bpf_capable() instead of CAP_SYS_ADMIN for blinding decision</title>
<updated>2022-09-16T20:11:57+00:00</updated>
<author>
<name>Yauheni Kaliuta</name>
<email>ykaliuta@redhat.com</email>
</author>
<published>2022-09-05T09:01:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bfeb7e399bacae4ee46ad978f5fce3e47f0978d6'/>
<id>bfeb7e399bacae4ee46ad978f5fce3e47f0978d6</id>
<content type='text'>
The full CAP_SYS_ADMIN requirement for blinding looks too strict nowadays.
These days given unprivileged BPF is disabled by default, the main users
for constant blinding coming from unprivileged in particular via cBPF -&gt; eBPF
migration (e.g. old-style socket filters).

Signed-off-by: Yauheni Kaliuta &lt;ykaliuta@redhat.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20220831090655.156434-1-ykaliuta@redhat.com
Link: https://lore.kernel.org/bpf/20220905090149.61221-1-ykaliuta@redhat.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The full CAP_SYS_ADMIN requirement for blinding looks too strict nowadays.
These days given unprivileged BPF is disabled by default, the main users
for constant blinding coming from unprivileged in particular via cBPF -&gt; eBPF
migration (e.g. old-style socket filters).

Signed-off-by: Yauheni Kaliuta &lt;ykaliuta@redhat.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20220831090655.156434-1-ykaliuta@redhat.com
Link: https://lore.kernel.org/bpf/20220905090149.61221-1-ykaliuta@redhat.com
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: net: Change sk_getsockopt() to take the sockptr_t argument</title>
<updated>2022-09-03T03:34:30+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>martin.lau@kernel.org</email>
</author>
<published>2022-09-02T00:28:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4ff09db1b79b98b4a2a7511571c640b76cab3beb'/>
<id>4ff09db1b79b98b4a2a7511571c640b76cab3beb</id>
<content type='text'>
This patch changes sk_getsockopt() to take the sockptr_t argument
such that it can be used by bpf_getsockopt(SOL_SOCKET) in a
latter patch.

security_socket_getpeersec_stream() is not changed.  It stays
with the __user ptr (optval.user and optlen.user) to avoid changes
to other security hooks.  bpf_getsockopt(SOL_SOCKET) also does not
support SO_PEERSEC.

Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
Link: https://lore.kernel.org/r/20220902002802.2888419-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch changes sk_getsockopt() to take the sockptr_t argument
such that it can be used by bpf_getsockopt(SOL_SOCKET) in a
latter patch.

security_socket_getpeersec_stream() is not changed.  It stays
with the __user ptr (optval.user and optlen.user) to avoid changes
to other security hooks.  bpf_getsockopt(SOL_SOCKET) also does not
support SO_PEERSEC.

Signed-off-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
Link: https://lore.kernel.org/r/20220902002802.2888419-1-kafai@fb.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf, x86: fix freeing of not-finalized bpf_prog_pack</title>
<updated>2022-07-13T00:33:22+00:00</updated>
<author>
<name>Song Liu</name>
<email>song@kernel.org</email>
</author>
<published>2022-07-06T00:26:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1d5f82d9dd477d5c66e0214a68c3e4f308eadd6d'/>
<id>1d5f82d9dd477d5c66e0214a68c3e4f308eadd6d</id>
<content type='text'>
syzbot reported a few issues with bpf_prog_pack [1], [2]. This only happens
with multiple subprogs. In jit_subprogs(), we first call bpf_int_jit_compile()
on each sub program. And then, we call it on each sub program again. jit_data
is not freed in the first call of bpf_int_jit_compile(). Similarly we don't
call bpf_jit_binary_pack_finalize() in the first call of bpf_int_jit_compile().

If bpf_int_jit_compile() failed for one sub program, we will call
bpf_jit_binary_pack_finalize() for this sub program. However, we don't have a
chance to call it for other sub programs. Then we will hit "goto out_free" in
jit_subprogs(), and call bpf_jit_free on some subprograms that haven't got
bpf_jit_binary_pack_finalize() yet.

At this point, bpf_jit_binary_pack_free() is called and the whole 2MB page is
freed erroneously.

Fix this with a custom bpf_jit_free() for x86_64, which calls
bpf_jit_binary_pack_finalize() if necessary. Also, with custom
bpf_jit_free(), bpf_prog_aux-&gt;use_bpf_prog_pack is not needed any more,
remove it.

Fixes: 1022a5498f6f ("bpf, x86_64: Use bpf_jit_binary_pack_alloc")
[1] https://syzkaller.appspot.com/bug?extid=2f649ec6d2eea1495a8f
[2] https://syzkaller.appspot.com/bug?extid=87f65c75f4a72db05445
Reported-by: syzbot+2f649ec6d2eea1495a8f@syzkaller.appspotmail.com
Reported-by: syzbot+87f65c75f4a72db05445@syzkaller.appspotmail.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20220706002612.4013790-1-song@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
syzbot reported a few issues with bpf_prog_pack [1], [2]. This only happens
with multiple subprogs. In jit_subprogs(), we first call bpf_int_jit_compile()
on each sub program. And then, we call it on each sub program again. jit_data
is not freed in the first call of bpf_int_jit_compile(). Similarly we don't
call bpf_jit_binary_pack_finalize() in the first call of bpf_int_jit_compile().

If bpf_int_jit_compile() failed for one sub program, we will call
bpf_jit_binary_pack_finalize() for this sub program. However, we don't have a
chance to call it for other sub programs. Then we will hit "goto out_free" in
jit_subprogs(), and call bpf_jit_free on some subprograms that haven't got
bpf_jit_binary_pack_finalize() yet.

At this point, bpf_jit_binary_pack_free() is called and the whole 2MB page is
freed erroneously.

Fix this with a custom bpf_jit_free() for x86_64, which calls
bpf_jit_binary_pack_finalize() if necessary. Also, with custom
bpf_jit_free(), bpf_prog_aux-&gt;use_bpf_prog_pack is not needed any more,
remove it.

Fixes: 1022a5498f6f ("bpf, x86_64: Use bpf_jit_binary_pack_alloc")
[1] https://syzkaller.appspot.com/bug?extid=2f649ec6d2eea1495a8f
[2] https://syzkaller.appspot.com/bug?extid=87f65c75f4a72db05445
Reported-by: syzbot+2f649ec6d2eea1495a8f@syzkaller.appspotmail.com
Reported-by: syzbot+87f65c75f4a72db05445@syzkaller.appspotmail.com
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Link: https://lore.kernel.org/r/20220706002612.4013790-1-song@kernel.org
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf, x64: Add predicate for bpf2bpf with tailcalls support in JIT</title>
<updated>2022-06-21T16:52:04+00:00</updated>
<author>
<name>Tony Ambardar</name>
<email>tony.ambardar@gmail.com</email>
</author>
<published>2022-06-17T10:57:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=95acd8817e66d031d2e6ee7def3f1e1874819317'/>
<id>95acd8817e66d031d2e6ee7def3f1e1874819317</id>
<content type='text'>
The BPF core/verifier is hard-coded to permit mixing bpf2bpf and tail
calls for only x86-64. Change the logic to instead rely on a new weak
function 'bool bpf_jit_supports_subprog_tailcalls(void)', which a capable
JIT backend can override.

Update the x86-64 eBPF JIT to reflect this.

Signed-off-by: Tony Ambardar &lt;Tony.Ambardar@gmail.com&gt;
[jakub: drop MIPS bits and tweak patch subject]
Signed-off-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20220617105735.733938-2-jakub@cloudflare.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The BPF core/verifier is hard-coded to permit mixing bpf2bpf and tail
calls for only x86-64. Change the logic to instead rely on a new weak
function 'bool bpf_jit_supports_subprog_tailcalls(void)', which a capable
JIT backend can override.

Update the x86-64 eBPF JIT to reflect this.

Signed-off-by: Tony Ambardar &lt;Tony.Ambardar@gmail.com&gt;
[jakub: drop MIPS bits and tweak patch subject]
Signed-off-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20220617105735.733938-2-jakub@cloudflare.com
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: move bpf_prog to bpf.h</title>
<updated>2022-06-17T02:27:29+00:00</updated>
<author>
<name>Delyan Kratunov</name>
<email>delyank@fb.com</email>
</author>
<published>2022-06-14T23:10:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d687f621c518d791b5fffde8add3112d869b0b1b'/>
<id>d687f621c518d791b5fffde8add3112d869b0b1b</id>
<content type='text'>
In order to add a version of bpf_prog_run_array which accesses the
bpf_prog-&gt;aux member, bpf_prog needs to be more than a forward
declaration inside bpf.h.

Given that filter.h already includes bpf.h, this merely reorders
the type declarations for filter.h users. bpf.h users now have access to
bpf_prog internals.

Signed-off-by: Delyan Kratunov &lt;delyank@fb.com&gt;
Link: https://lore.kernel.org/r/3ed7824e3948f22d84583649ccac0ff0d38b6b58.1655248076.git.delyank@fb.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In order to add a version of bpf_prog_run_array which accesses the
bpf_prog-&gt;aux member, bpf_prog needs to be more than a forward
declaration inside bpf.h.

Given that filter.h already includes bpf.h, this merely reorders
the type declarations for filter.h users. bpf.h users now have access to
bpf_prog internals.

Signed-off-by: Delyan Kratunov &lt;delyank@fb.com&gt;
Link: https://lore.kernel.org/r/3ed7824e3948f22d84583649ccac0ff0d38b6b58.1655248076.git.delyank@fb.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Fix net.core.bpf_jit_harden race</title>
<updated>2022-03-16T22:13:36+00:00</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2022-03-09T12:33:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d2a3b7c5becc3992f8e7d2b9bf5eacceeedb9a48'/>
<id>d2a3b7c5becc3992f8e7d2b9bf5eacceeedb9a48</id>
<content type='text'>
It is the bpf_jit_harden counterpart to commit 60b58afc96c9 ("bpf: fix
net.core.bpf_jit_enable race"). bpf_jit_harden will be tested twice
for each subprog if there are subprogs in bpf program and constant
blinding may increase the length of program, so when running
"./test_progs -t subprogs" and toggling bpf_jit_harden between 0 and 2,
jit_subprogs may fail because constant blinding increases the length
of subprog instructions during extra passs.

So cache the value of bpf_jit_blinding_enabled() during program
allocation, and use the cached value during constant blinding, subprog
JITing and args tracking of tail call.

Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20220309123321.2400262-4-houtao1@huawei.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is the bpf_jit_harden counterpart to commit 60b58afc96c9 ("bpf: fix
net.core.bpf_jit_enable race"). bpf_jit_harden will be tested twice
for each subprog if there are subprogs in bpf program and constant
blinding may increase the length of program, so when running
"./test_progs -t subprogs" and toggling bpf_jit_harden between 0 and 2,
jit_subprogs may fail because constant blinding increases the length
of subprog instructions during extra passs.

So cache the value of bpf_jit_blinding_enabled() during program
allocation, and use the cached value during constant blinding, subprog
JITing and args tracking of tail call.

Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20220309123321.2400262-4-houtao1@huawei.com
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Remove BPF_SKB_DELIVERY_TIME_NONE and rename s/delivery_time_/tstamp_/</title>
<updated>2022-03-10T21:57:05+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2022-03-09T09:05:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9bb984f28d5bcb917d35d930fcfb89f90f9449fd'/>
<id>9bb984f28d5bcb917d35d930fcfb89f90f9449fd</id>
<content type='text'>
This patch is to simplify the uapi bpf.h regarding to the tstamp type
and use a similar way as the kernel to describe the value stored
in __sk_buff-&gt;tstamp.

My earlier thought was to avoid describing the semantic and
clock base for the rcv timestamp until there is more clarity
on the use case, so the __sk_buff-&gt;delivery_time_type naming instead
of __sk_buff-&gt;tstamp_type.

With some thoughts, it can reuse the UNSPEC naming.  This patch first
removes BPF_SKB_DELIVERY_TIME_NONE and also

rename BPF_SKB_DELIVERY_TIME_UNSPEC to BPF_SKB_TSTAMP_UNSPEC
and    BPF_SKB_DELIVERY_TIME_MONO   to BPF_SKB_TSTAMP_DELIVERY_MONO.

The semantic of BPF_SKB_TSTAMP_DELIVERY_MONO is the same:
__sk_buff-&gt;tstamp has delivery time in mono clock base.

BPF_SKB_TSTAMP_UNSPEC means __sk_buff-&gt;tstamp has the (rcv)
tstamp at ingress and the delivery time at egress.  At egress,
the clock base could be found from skb-&gt;sk-&gt;sk_clockid.
__sk_buff-&gt;tstamp == 0 naturally means NONE, so NONE is not needed.

With BPF_SKB_TSTAMP_UNSPEC for the rcv tstamp at ingress,
the __sk_buff-&gt;delivery_time_type is also renamed to __sk_buff-&gt;tstamp_type
which was also suggested in the earlier discussion:
https://lore.kernel.org/bpf/b181acbe-caf8-502d-4b7b-7d96b9fc5d55@iogearbox.net/

The above will then make __sk_buff-&gt;tstamp and __sk_buff-&gt;tstamp_type
the same as its kernel skb-&gt;tstamp and skb-&gt;mono_delivery_time
counter part.

The internal kernel function bpf_skb_convert_dtime_type_read() is then
renamed to bpf_skb_convert_tstamp_type_read() and it can be simplified
with the BPF_SKB_DELIVERY_TIME_NONE gone.  A BPF_ALU32_IMM(BPF_AND)
insn is also saved by using BPF_JMP32_IMM(BPF_JSET).

The bpf helper bpf_skb_set_delivery_time() is also renamed to
bpf_skb_set_tstamp().  The arg name is changed from dtime
to tstamp also.  It only allows setting tstamp 0 for
BPF_SKB_TSTAMP_UNSPEC and it could be relaxed later
if there is use case to change mono delivery time to
non mono.

prog-&gt;delivery_time_access is also renamed to prog-&gt;tstamp_type_access.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20220309090509.3712315-1-kafai@fb.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch is to simplify the uapi bpf.h regarding to the tstamp type
and use a similar way as the kernel to describe the value stored
in __sk_buff-&gt;tstamp.

My earlier thought was to avoid describing the semantic and
clock base for the rcv timestamp until there is more clarity
on the use case, so the __sk_buff-&gt;delivery_time_type naming instead
of __sk_buff-&gt;tstamp_type.

With some thoughts, it can reuse the UNSPEC naming.  This patch first
removes BPF_SKB_DELIVERY_TIME_NONE and also

rename BPF_SKB_DELIVERY_TIME_UNSPEC to BPF_SKB_TSTAMP_UNSPEC
and    BPF_SKB_DELIVERY_TIME_MONO   to BPF_SKB_TSTAMP_DELIVERY_MONO.

The semantic of BPF_SKB_TSTAMP_DELIVERY_MONO is the same:
__sk_buff-&gt;tstamp has delivery time in mono clock base.

BPF_SKB_TSTAMP_UNSPEC means __sk_buff-&gt;tstamp has the (rcv)
tstamp at ingress and the delivery time at egress.  At egress,
the clock base could be found from skb-&gt;sk-&gt;sk_clockid.
__sk_buff-&gt;tstamp == 0 naturally means NONE, so NONE is not needed.

With BPF_SKB_TSTAMP_UNSPEC for the rcv tstamp at ingress,
the __sk_buff-&gt;delivery_time_type is also renamed to __sk_buff-&gt;tstamp_type
which was also suggested in the earlier discussion:
https://lore.kernel.org/bpf/b181acbe-caf8-502d-4b7b-7d96b9fc5d55@iogearbox.net/

The above will then make __sk_buff-&gt;tstamp and __sk_buff-&gt;tstamp_type
the same as its kernel skb-&gt;tstamp and skb-&gt;mono_delivery_time
counter part.

The internal kernel function bpf_skb_convert_dtime_type_read() is then
renamed to bpf_skb_convert_tstamp_type_read() and it can be simplified
with the BPF_SKB_DELIVERY_TIME_NONE gone.  A BPF_ALU32_IMM(BPF_AND)
insn is also saved by using BPF_JMP32_IMM(BPF_JSET).

The bpf helper bpf_skb_set_delivery_time() is also renamed to
bpf_skb_set_tstamp().  The arg name is changed from dtime
to tstamp also.  It only allows setting tstamp 0 for
BPF_SKB_TSTAMP_UNSPEC and it could be relaxed later
if there is use case to change mono delivery time to
non mono.

prog-&gt;delivery_time_access is also renamed to prog-&gt;tstamp_type_access.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20220309090509.3712315-1-kafai@fb.com
</pre>
</div>
</content>
</entry>
<entry>
<title>bpf: Add __sk_buff-&gt;delivery_time_type and bpf_skb_set_skb_delivery_time()</title>
<updated>2022-03-03T14:38:49+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2022-03-02T19:56:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8d21ec0e46ed6e39994accff8eb4f2be3d2e76b5'/>
<id>8d21ec0e46ed6e39994accff8eb4f2be3d2e76b5</id>
<content type='text'>
* __sk_buff-&gt;delivery_time_type:
This patch adds __sk_buff-&gt;delivery_time_type.  It tells if the
delivery_time is stored in __sk_buff-&gt;tstamp or not.

It will be most useful for ingress to tell if the __sk_buff-&gt;tstamp
has the (rcv) timestamp or delivery_time.  If delivery_time_type
is 0 (BPF_SKB_DELIVERY_TIME_NONE), it has the (rcv) timestamp.

Two non-zero types are defined for the delivery_time_type,
BPF_SKB_DELIVERY_TIME_MONO and BPF_SKB_DELIVERY_TIME_UNSPEC.  For UNSPEC,
it can only happen in egress because only mono delivery_time can be
forwarded to ingress now.  The clock of UNSPEC delivery_time
can be deduced from the skb-&gt;sk-&gt;sk_clockid which is how
the sch_etf doing it also.

* Provide forwarded delivery_time to tc-bpf@ingress:
With the help of the new delivery_time_type, the tc-bpf has a way
to tell if the __sk_buff-&gt;tstamp has the (rcv) timestamp or
the delivery_time.  During bpf load time, the verifier will learn if
the bpf prog has accessed the new __sk_buff-&gt;delivery_time_type.
If it does, it means the tc-bpf@ingress is expecting the
skb-&gt;tstamp could have the delivery_time.  The kernel will then
read the skb-&gt;tstamp as-is during bpf insn rewrite without
checking the skb-&gt;mono_delivery_time.  This is done by adding a
new prog-&gt;delivery_time_access bit.  The same goes for
writing skb-&gt;tstamp.

* bpf_skb_set_delivery_time():
The bpf_skb_set_delivery_time() helper is added to allow setting both
delivery_time and the delivery_time_type at the same time.  If the
tc-bpf does not need to change the delivery_time_type, it can directly
write to the __sk_buff-&gt;tstamp as the existing tc-bpf has already been
doing.  It will be most useful at ingress to change the
__sk_buff-&gt;tstamp from the (rcv) timestamp to
a mono delivery_time and then bpf_redirect_*().

bpf only has mono clock helper (bpf_ktime_get_ns), and
the current known use case is the mono EDT for fq, and
only mono delivery time can be kept during forward now,
so bpf_skb_set_delivery_time() only supports setting
BPF_SKB_DELIVERY_TIME_MONO.  It can be extended later when use cases
come up and the forwarding path also supports other clock bases.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* __sk_buff-&gt;delivery_time_type:
This patch adds __sk_buff-&gt;delivery_time_type.  It tells if the
delivery_time is stored in __sk_buff-&gt;tstamp or not.

It will be most useful for ingress to tell if the __sk_buff-&gt;tstamp
has the (rcv) timestamp or delivery_time.  If delivery_time_type
is 0 (BPF_SKB_DELIVERY_TIME_NONE), it has the (rcv) timestamp.

Two non-zero types are defined for the delivery_time_type,
BPF_SKB_DELIVERY_TIME_MONO and BPF_SKB_DELIVERY_TIME_UNSPEC.  For UNSPEC,
it can only happen in egress because only mono delivery_time can be
forwarded to ingress now.  The clock of UNSPEC delivery_time
can be deduced from the skb-&gt;sk-&gt;sk_clockid which is how
the sch_etf doing it also.

* Provide forwarded delivery_time to tc-bpf@ingress:
With the help of the new delivery_time_type, the tc-bpf has a way
to tell if the __sk_buff-&gt;tstamp has the (rcv) timestamp or
the delivery_time.  During bpf load time, the verifier will learn if
the bpf prog has accessed the new __sk_buff-&gt;delivery_time_type.
If it does, it means the tc-bpf@ingress is expecting the
skb-&gt;tstamp could have the delivery_time.  The kernel will then
read the skb-&gt;tstamp as-is during bpf insn rewrite without
checking the skb-&gt;mono_delivery_time.  This is done by adding a
new prog-&gt;delivery_time_access bit.  The same goes for
writing skb-&gt;tstamp.

* bpf_skb_set_delivery_time():
The bpf_skb_set_delivery_time() helper is added to allow setting both
delivery_time and the delivery_time_type at the same time.  If the
tc-bpf does not need to change the delivery_time_type, it can directly
write to the __sk_buff-&gt;tstamp as the existing tc-bpf has already been
doing.  It will be most useful at ingress to change the
__sk_buff-&gt;tstamp from the (rcv) timestamp to
a mono delivery_time and then bpf_redirect_*().

bpf only has mono clock helper (bpf_ktime_get_ns), and
the current known use case is the mono EDT for fq, and
only mono delivery time can be kept during forward now,
so bpf_skb_set_delivery_time() only supports setting
BPF_SKB_DELIVERY_TIME_MONO.  It can be extended later when use cases
come up and the forwarding path also supports other clock bases.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
