<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/ipv4/fib_rules.c, branch v6.2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>ipv4: remove unnecessary type castings</title>
<updated>2022-04-30T14:12:58+00:00</updated>
<author>
<name>Yu Zhe</name>
<email>yuzhe@nfschina.com</email>
</author>
<published>2022-04-29T02:14:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2e47eece158a7e5d2205be42e9c44b87302de1a7'/>
<id>2e47eece158a7e5d2205be42e9c44b87302de1a7</id>
<content type='text'>
remove unnecessary void* type castings.

Signed-off-by: Yu Zhe &lt;yuzhe@nfschina.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
remove unnecessary void* type castings.

Signed-off-by: Yu Zhe &lt;yuzhe@nfschina.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: Reject again rules with high DSCP values</title>
<updated>2022-02-10T15:33:33+00:00</updated>
<author>
<name>Guillaume Nault</name>
<email>gnault@redhat.com</email>
</author>
<published>2022-02-10T12:24:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dc513a405cade3e47bcda8de27c7a7bf6eeddd18'/>
<id>dc513a405cade3e47bcda8de27c7a7bf6eeddd18</id>
<content type='text'>
Commit 563f8e97e054 ("ipv4: Stop taking ECN bits into account in
fib4-rules") replaced the validation test on frh-&gt;tos. While the new
test is stricter for ECN bits, it doesn't detect the use of high order
DSCP bits. This would be fine if IPv4 could properly handle them. But
currently, most IPv4 lookups are done with the three high DSCP bits
masked. Therefore, using these bits doesn't lead to the expected
result.

Let's reject such configurations again, so that nobody starts to
use and make any assumption about how the stack handles the three high
order DSCP bits in fib4 rules.

Fixes: 563f8e97e054 ("ipv4: Stop taking ECN bits into account in fib4-rules")
Signed-off-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 563f8e97e054 ("ipv4: Stop taking ECN bits into account in
fib4-rules") replaced the validation test on frh-&gt;tos. While the new
test is stricter for ECN bits, it doesn't detect the use of high order
DSCP bits. This would be fine if IPv4 could properly handle them. But
currently, most IPv4 lookups are done with the three high DSCP bits
masked. Therefore, using these bits doesn't lead to the expected
result.

Let's reject such configurations again, so that nobody starts to
use and make any assumption about how the stack handles the three high
order DSCP bits in fib4 rules.

Fixes: 563f8e97e054 ("ipv4: Stop taking ECN bits into account in fib4-rules")
Signed-off-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: Stop taking ECN bits into account in fib4-rules</title>
<updated>2022-02-08T04:12:46+00:00</updated>
<author>
<name>Guillaume Nault</name>
<email>gnault@redhat.com</email>
</author>
<published>2022-02-04T13:58:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=563f8e97e054451d167327336a53b7381517a998'/>
<id>563f8e97e054451d167327336a53b7381517a998</id>
<content type='text'>
Use the new dscp_t type to replace the tos field of struct fib4_rule,
so that fib4-rules consistently ignore ECN bits.

Before this patch, fib4-rules did accept rules with the high order ECN
bit set (but not the low order one). Also, it relied on its callers
masking the ECN bits of -&gt;flowi4_tos to prevent those from influencing
the result. This was brittle and a few call paths still do the lookup
without masking the ECN bits first.

After this patch fib4-rules only compare the DSCP bits. ECN can't
influence the result anymore, even if the caller didn't mask these
bits. Also, fib4-rules now must have both ECN bits cleared or they will
be rejected.

Signed-off-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Acked-by: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the new dscp_t type to replace the tos field of struct fib4_rule,
so that fib4-rules consistently ignore ECN bits.

Before this patch, fib4-rules did accept rules with the high order ECN
bit set (but not the low order one). Also, it relied on its callers
masking the ECN bits of -&gt;flowi4_tos to prevent those from influencing
the result. This was brittle and a few call paths still do the lookup
without masking the ECN bits first.

After this patch fib4-rules only compare the DSCP bits. ECN can't
influence the result anymore, even if the caller didn't mask these
bits. Also, fib4-rules now must have both ECN bits cleared or they will
be rejected.

Signed-off-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Acked-by: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fib: rules: remove duplicated nla policies</title>
<updated>2021-12-16T15:18:35+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2021-12-16T12:05:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=92e1bcee067f1722703bf61047ef674be1b6d1c0'/>
<id>92e1bcee067f1722703bf61047ef674be1b6d1c0</id>
<content type='text'>
The attributes are identical in all implementations so move the ipv4 one
into the core and remove the per-family nla policies.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The attributes are identical in all implementations so move the ipv4 one
into the core and remove the per-family nla policies.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: convert fib_num_tclassid_users to atomic_t</title>
<updated>2021-12-02T11:56:04+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2021-12-02T02:26:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=213f5f8f31f10aa1e83187ae20fb7fa4e626b724'/>
<id>213f5f8f31f10aa1e83187ae20fb7fa4e626b724</id>
<content type='text'>
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh")
changes to net-&gt;ipv4.fib_num_tclassid_users were protected by RTNL.

After the change, this is no longer the case, as free_fib_info_rcu()
runs after rcu grace period, without rtnl being held.

Fixes: faa041a40b9f ("ipv4: Create cleanup helper for fib_nh")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh")
changes to net-&gt;ipv4.fib_num_tclassid_users were protected by RTNL.

After the change, this is no longer the case, as free_fib_info_rcu()
runs after rcu grace period, without rtnl being held.

Fixes: faa041a40b9f ("ipv4: Create cleanup helper for fib_nh")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: fix memory leak in fib6_rule_suppress</title>
<updated>2021-11-29T14:43:35+00:00</updated>
<author>
<name>msizanoen1</name>
<email>msizanoen@qtmlabs.xyz</email>
</author>
<published>2021-11-23T12:48:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cdef485217d30382f3bf6448c54b4401648fe3f1'/>
<id>cdef485217d30382f3bf6448c54b4401648fe3f1</id>
<content type='text'>
The kernel leaks memory when a `fib` rule is present in IPv6 nftables
firewall rules and a suppress_prefix rule is present in the IPv6 routing
rules (used by certain tools such as wg-quick). In such scenarios, every
incoming packet will leak an allocation in `ip6_dst_cache` slab cache.

After some hours of `bpftrace`-ing and source code reading, I tracked
down the issue to ca7a03c41753 ("ipv6: do not free rt if
FIB_LOOKUP_NOREF is set on suppress rule").

The problem with that change is that the generic `args-&gt;flags` always have
`FIB_LOOKUP_NOREF` set[1][2] but the IPv6-specific flag
`RT6_LOOKUP_F_DST_NOREF` might not be, leading to `fib6_rule_suppress` not
decreasing the refcount when needed.

How to reproduce:
 - Add the following nftables rule to a prerouting chain:
     meta nfproto ipv6 fib saddr . mark . iif oif missing drop
   This can be done with:
     sudo nft create table inet test
     sudo nft create chain inet test test_chain '{ type filter hook prerouting priority filter + 10; policy accept; }'
     sudo nft add rule inet test test_chain meta nfproto ipv6 fib saddr . mark . iif oif missing drop
 - Run:
     sudo ip -6 rule add table main suppress_prefixlength 0
 - Watch `sudo slabtop -o | grep ip6_dst_cache` to see memory usage increase
   with every incoming ipv6 packet.

This patch exposes the protocol-specific flags to the protocol
specific `suppress` function, and check the protocol-specific `flags`
argument for RT6_LOOKUP_F_DST_NOREF instead of the generic
FIB_LOOKUP_NOREF when decreasing the refcount, like this.

[1]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L71
[2]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L99

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215105
Fixes: ca7a03c41753 ("ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The kernel leaks memory when a `fib` rule is present in IPv6 nftables
firewall rules and a suppress_prefix rule is present in the IPv6 routing
rules (used by certain tools such as wg-quick). In such scenarios, every
incoming packet will leak an allocation in `ip6_dst_cache` slab cache.

After some hours of `bpftrace`-ing and source code reading, I tracked
down the issue to ca7a03c41753 ("ipv6: do not free rt if
FIB_LOOKUP_NOREF is set on suppress rule").

The problem with that change is that the generic `args-&gt;flags` always have
`FIB_LOOKUP_NOREF` set[1][2] but the IPv6-specific flag
`RT6_LOOKUP_F_DST_NOREF` might not be, leading to `fib6_rule_suppress` not
decreasing the refcount when needed.

How to reproduce:
 - Add the following nftables rule to a prerouting chain:
     meta nfproto ipv6 fib saddr . mark . iif oif missing drop
   This can be done with:
     sudo nft create table inet test
     sudo nft create chain inet test test_chain '{ type filter hook prerouting priority filter + 10; policy accept; }'
     sudo nft add rule inet test test_chain meta nfproto ipv6 fib saddr . mark . iif oif missing drop
 - Run:
     sudo ip -6 rule add table main suppress_prefixlength 0
 - Watch `sudo slabtop -o | grep ip6_dst_cache` to see memory usage increase
   with every incoming ipv6 packet.

This patch exposes the protocol-specific flags to the protocol
specific `suppress` function, and check the protocol-specific `flags`
argument for RT6_LOOKUP_F_DST_NOREF instead of the generic
FIB_LOOKUP_NOREF when decreasing the refcount, like this.

[1]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L71
[2]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L99

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215105
Fixes: ca7a03c41753 ("ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule")
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fib: use indirect call wrappers in the most common fib_rules_ops</title>
<updated>2020-07-29T00:42:31+00:00</updated>
<author>
<name>Brian Vazquez</name>
<email>brianvv@google.com</email>
</author>
<published>2020-07-26T22:48:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b9aaec8f0be518096d1377082e0abe6a85e86ff3'/>
<id>b9aaec8f0be518096d1377082e0abe6a85e86ff3</id>
<content type='text'>
This avoids another inderect call per RX packet which save us around
20-40 ns.

Changelog:

v1 -&gt; v2:
- Move declaraions to fib_rules.h to remove warnings

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Brian Vazquez &lt;brianvv@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This avoids another inderect call per RX packet which save us around
20-40 ns.

Changelog:

v1 -&gt; v2:
- Move declaraions to fib_rules.h to remove warnings

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Brian Vazquez &lt;brianvv@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: fib_notifier: propagate extack down to the notifier block callback</title>
<updated>2019-10-04T18:10:56+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@mellanox.com</email>
</author>
<published>2019-10-03T09:49:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b7a595577ef3dc9add2b3e6d00869d017306bfbe'/>
<id>b7a595577ef3dc9add2b3e6d00869d017306bfbe</id>
<content type='text'>
Since errors are propagated all the way up to the caller, propagate
possible extack of the caller all the way down to the notifier block
callback.

Signed-off-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since errors are propagated all the way up to the caller, propagate
possible extack of the caller all the way down to the notifier block
callback.

Signed-off-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2019-06-07T18:00:14+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-06-07T18:00:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a6cdeeb16bff89c8486324f53577db058cbe81ba'/>
<id>a6cdeeb16bff89c8486324f53577db058cbe81ba</id>
<content type='text'>
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: Prepare for fib6_nh from a nexthop object</title>
<updated>2019-06-05T02:26:49+00:00</updated>
<author>
<name>David Ahern</name>
<email>dsahern@gmail.com</email>
</author>
<published>2019-06-04T03:19:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dcb1ecb50edf8219c3bd851de35897fb024c423b'/>
<id>dcb1ecb50edf8219c3bd851de35897fb024c423b</id>
<content type='text'>
Convert more IPv4 code to use fib_nh_common over fib_nh to enable routes
to use a fib6_nh based nexthop. In the end, only code not using a
nexthop object in a fib_info should directly access fib_nh in a fib_info
without checking the famiy and going through fib_nh_common. Those
functions will be marked when it is not directly evident.

Signed-off-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert more IPv4 code to use fib_nh_common over fib_nh to enable routes
to use a fib6_nh based nexthop. In the end, only code not using a
nexthop object in a fib_info should directly access fib_nh in a fib_info
without checking the famiy and going through fib_nh_common. Those
functions will be marked when it is not directly evident.

Signed-off-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
