| Age | Commit message (Collapse) | Author |
|
The netmap_ring struct starts with various const members and rencent
clang warns about leaving them uninitialized. Having them const in the
first place is highly suspicious since they are updated with various
macros but using hand-coded __DECONST(). But fixing that is a more
invasive change that I am unable to test.
```
.../freebsd/sys/dev/netmap/netmap_kloop.c:320:21: error: default initialization of an object of type 'struct netmap_ring' with const member leaves the object uninitialized [-Werror,-Wdefault-const-init-field-unsafe]
320 | struct netmap_ring shadow_ring; /* shadow copy of the netmap_ring */
| ^
.../freebsd/sys/net/netmap.h:290:16: note: member 'buf_ofs' declared 'const' here
290 | const int64_t buf_ofs;
| ^
```
Test Plan: Compiles
Reviewed by: vmaffione, jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D52568
|
|
This is useful when dev.netmap.port_numa_affinity is set to 1. When
interfaces attach, they get a memory allocator that is copied from
nm_mem. Parameters in nm_mem can be set using sysctls, but this happens
after their values are copied.
To work around this, we can make it possible to set these memory
parameters as tunables.
Reviewed by: vmaffione
MFC after: 1 week
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D54178
|
|
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D52045
|
|
We bump the object reference count prior to mapping it into the kernel
map, at which point the vm_map_entry owns the reference. Then, if
vm_map_wire() fails, vm_map_remove() will release the reference, so we
should avoid decrementing it in the error path.
Reported by: Ilja van Sprundel <ivansprundel@ioactive.com>
Reviewed by: vmaffione
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D53066
|
|
Include opt_inet.h and opt_inet6.h early in the files including
virtio_net.h, since they use INET and/or INET6.
While there, remove redundant inclusion of sys/types.h, since it is
included already by sys/param.h.
There was a discussion to include opt_inet.h and opt_inet6.h also
in virtio_net.h. glebius suggested to add a mechanism for files
to check, if required opt_*.h files were included. virtio_net.h
will be the first consumer of this mechanism.
Reviewed by: glebius, Peter Lei
MFC after: 3 days
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D52046
|
|
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D49337
|
|
No functional change intended.
MFC after: 1 week
|
|
Each netmap adapter associated with a physical adapter is attached to a
netmap memory pool. contigmalloc() is used to allocate physically
contiguous memory for the pool, but ideally we would ensure that all
such memory is allocated from the NUMA domain local to the adapter.
Augment netmap's memory pools with a NUMA domain ID, similar to how
IOMMU groups are handled in the Linux port. That is, when attaching to
a physical adapter, ensure that the associated memory pools are local to
the adapter's associated memory domain, creating new pools as needed.
Some types of ifnets do not have any defined NUMA affinity; in this case
the domain ID in question is the sentinel value -1.
Add a sysctl, dev.netmap.port_numa_affinity, which can be used to enable
the new behaviour. Keep it disabled by now to avoid surprises in case
netmap applications are relying on zero-copy optimizations to forward
packets between ports belonging to different NUMA domains.
Reviewed by: vmaffione
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D46666
|
|
No functional change intended.
Reviewed by: vmaffione
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D46664
|
|
As of 9e6544dd6e02c46b805d11ab925c4f3b18ad7a4b contigfree(9) is no longer
needed and should not be used anymore. We leave a wrapper for 3rd party
code in at least 15.x but remove (almost) all other cases from the tree.
This leaves one use of contigfree(9) untouched; that was the original
trigger for 9e6544dd6e02 and is handled in D45813 (to be committed
seperately later).
Sponsored by: The FreeBSD Foundation
Reviewed by: markj, kib
Tested by: pho (10h stress test run)
Differential Revision: https://reviews.freebsd.org/D46099
|
|
Change 4787572d0580 made if_alloc_domain() never fail, then also do the
wrappers if_alloc(), if_alloc_dev(), and if_gethandle().
No functional change intended.
Reviewed by: kp, imp, glebius, stevek
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D45740
|
|
No functional change intended.
MFC after: 1 week
|
|
netmap_generic keeps a pool of mbufs for handling transfers, these mbufs
have an external buffer attached to them.
If some cases other parts of the network stack can chain these mbufs,
when this happens the normal pool destructor function can end up
free'ing the pool mbufs twice:
- A first time if a pool mbuf has been chained with another mbuf when
its chain is freed
- A second time when its entry in the pool is freed
Additionally, if other parts of the stack demote a pool mbuf its
interface reference will be cleared. In this case we deference a NULL
pointer when trying to free the mbuf through the destructor. Store a
reference to the adapter in ext_arg1 with the destructor callback so we
can find the correct adapter when free'ing a pool mbuf.
This change enables using netmap with epair interfaces.
Reviewed By: vmaffione
MFC after: 1 week
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D44371
|
|
The CSB_WRITE() and _READ() macros respectively write to and read from
userspace memory and so can in principle fault. However, we do not
check for errors and will proceed blindly if they fail. Add assertions
to verify that they do not.
This is in preparation for annotating copyin() and related functions
with __result_use_check.
Reviewed by: vmaffione
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43200
|
|
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.
Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/
Sponsored by: Netflix
|
|
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
|
Remove /^/[*/]\s*\$FreeBSD\$.*\n/
|
|
Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
|
|
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
|
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
|
|
ifnets are allowed to pass batches of multiple packets to if_input,
linked by the m_nextpkt pointer. iflib_rxeof() sometimes does this, for
example. Netmap's generic mode did not handle this and would only
deliver the first packet in the batch, leaking the rest.
PR: 270636
Reviewed by: vmaffione
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39426
|
|
In emulated mode, the FreeBSD netmap port attempts to perform zero-copy
transmission. This works as follows: the kernel ring is populated with
mbuf headers to which netmap buffers are attached. When transmitting,
the mbuf refcount is initialized to 2, and when the counter value has
been decremented to 1 netmap infers that the driver has freed the mbuf
and thus transmission is complete.
This scheme does not generalize to the situation where netmap is
attaching to a software interface which may transmit packets among
multiple "queues", as is the case with bridge or lagg interfaces. In
that case, we would be relying on backing hardware drivers to free
transmitted mbufs promptly, but this isn't guaranteed; a driver may
reasonably defer freeing a small number of transmitted buffers
indefinitely. If such a buffer ends up at the tail of a netmap transmit
ring, further transmits can end up blocked indefinitely.
Fix the problem by removing the zero-copy scheme (which is also not
implemented in the Linux port of netmap). Instead, the kernel ring is
populated with regular mbuf clusters into which netmap buffers are
copied by nm_os_generic_xmit_frame(). The refcounting scheme is
preserved, and this lets us avoid allocating a fresh cluster per
transmitted packet in the common case. If the transmit ring is full, a
callout is used to free the "stuck" mbuf, avoiding the queue deadlock
described above.
Furthermore, when recycling mbuf clusters, be sure to fully reinitialize
the mbuf header instead of simply re-setting M_PKTHDR. Some software
interfaces, like if_vlan, may set fields in the header which should be
reset before the mbuf is reused.
Reviewed by: vmaffione
MFC after: 1 month
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38065
|
|
The previous code unsuccesfully attempted to report a precise error for
each option in the user list. Moreover, commit 253b2ec199b broke some
ctrl-api-test (see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260547).
With this patch we bail out as soon as an unrecoverable error is detected and
we properly check for copy boundaries. EOPNOTSUPP no longer immediately
returns an error, so that any other option in the list may be examined
by the caller code and a precise report of the (un)supported options can
be returned to the user.
With this patch, all ctrl-api-test unit tests pass again.
PR: 260547
Submitted by: giuseppe.lettieri@unipi.it
Reviewed by: vmaffione
MFC after: 14 days
|
|
The save_if_input function pointer was meant to save the previous
value of ifp->if_input before replacing it with the emulated
adapter hook.
However, the same pointer value is already stored in the if_input
field of the netmap_adapter struct, to be used for host TX ring processing.
Reuse the netmap_adapter if_input field to simplify the code
and save some space.
MFC after: 14 days
|
|
MFC after: 7 days
|
|
No functional change intended.
Reviewed by: vmaffione
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39001
|
|
Reviewed by: vmaffione, zlei
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37814
|
|
Reported by: zlei
MFC after 3 days
|
|
Right now we have little visibility into packet drops within netmap.
Start trying to make packet loss issues more visible by counting queue
drops in the transmit path, and in the input path for interfaces running
in emulated mode, where we place received packets in a bounded software
queue that is processed by rxsync.
Reviewed by: vmaffione
MFC after: 1 week
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38064
|
|
Reviewed by: vmaffione
MFC after: 1 week
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38063
|
|
The check is ok by default, since the default value of
netmap_generic_ringsize is 1024. But we should check against the
configured "ring" size.
Reviewed by: vmaffione
MFC after: 1 week
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38062
|
|
Per the removed comments these fields should be loaded only once, since
they can in principle be modified concurrently, though this would be a
violation of the userspace contract with netmap.
No functional change intended.
Reviewed by: vmaffione
MFC after: 1 week
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38061
|
|
Reviewed by: zlei
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37436
|
|
MFC after: 7 days
|
|
MFC after: 7 days
|
|
netmap_monitor_stop() called nm_monitor_none() after the head of
the zero-copy monitors had been reset, thus thinking that there
was nothing left to do.
MFC after: 7 days
|
|
Netmap users on FreeBSD are not supposed to import code from the
github netmap repository anymore. They should use the code that
is available in the src repo. We can therefore drop the compatibility
code.
MFC after: 7 days
|
|
MFC after: 1 week
|
|
A distinct number of double-semicolons have ended up in FreeBSD. Take a
pass at getting rid of many of these harmless typos.
Reviewed by: emaste, rrs
Pull Request: https://github.com/freebsd/freebsd-src/pull/609
Differential Revision: https://reviews.freebsd.org/D31716
|
|
- s/the the/the/
MFC after: 3 days
|
|
With clang 15, the following -Werror warning is produced:
sys/dev/netmap/if_re_netmap.h:179:8: error: variable 'n' set but not used [-Werror,-Wunused-but-set-variable]
u_int n;
^
The 'n' variable appears to have been a debugging aid that has never
been used for anything, so remove it.
MFC after: 3 days
|
|
|
|
|
|
The total size of the user-provided nmreq was first computed and then
trusted during the copyin. This might lead to kernel memory corruption
and escape from jails/containers.
Reported by: Lucas Leong (@_wmliang_) of Trend Micro Zero Day Initiative
Security: CVE-2022-23084
MFC after: 3 days
|
|
An unsanitized field in an option could be abused, causing an integer
overflow followed by kernel memory corruption. This might be used
to escape jails/containers.
Reported by: Reno Robert and Lucas Leong (@_wmliang_) of Trend Micro
Zero Day Initiative
Security: CVE-2022-23085
|
|
The new dev.netmap.max_bridges sysctl tunable can be set in
loader.conf(5) to change the default maximum number of VALE
switches that can be created. Current defaults is 8.
MFC after: 2 weeks
|
|
Symptom: when a single extmem memory region is provided to netmap
multiple times, for multiple interfaces, the memory region is
never released by netmap once all the existing file descriptors
are closed.
Fix the relevant condition in netmap_mem_drop(): release the memory
when the last user of netmap_adapter is gone, rather then when
the last user of netmap_mem_d is gone.
MFC after: 2 weeks
|
|
MFC after: 1 week
|
|
|
|
- make sure rings are disabled during resets
- introduce netmap_update_hostrings_mode(), with support
for multiple host rings
- always initialize ni_bufs_head in netmap_if
ni_bufs_head was not properly initialized when no external buffers were
requestedx and contained the ni_bufs_head from the last request. This
was causing spurious buffer frees when alternating between apps that
used external buffers and apps that did not use them.
- check na validitity under lock on detach
- netmap_mem: fix leak on error path
- nm_dispatch: fix compilation on Raspberry Pi
MFC after: 2 weeks
|