| Age | Commit message (Collapse) | Author |
|
Each netmap adapter associated with a physical adapter is attached to a
netmap memory pool. contigmalloc() is used to allocate physically
contiguous memory for the pool, but ideally we would ensure that all
such memory is allocated from the NUMA domain local to the adapter.
Augment netmap's memory pools with a NUMA domain ID, similar to how
IOMMU groups are handled in the Linux port. That is, when attaching to
a physical adapter, ensure that the associated memory pools are local to
the adapter's associated memory domain, creating new pools as needed.
Some types of ifnets do not have any defined NUMA affinity; in this case
the domain ID in question is the sentinel value -1.
Add a sysctl, dev.netmap.port_numa_affinity, which can be used to enable
the new behaviour. Keep it disabled by now to avoid surprises in case
netmap applications are relying on zero-copy optimizations to forward
packets between ports belonging to different NUMA domains.
Reviewed by: vmaffione
MFC after: 2 weeks
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D46666
|
|
netmap_generic keeps a pool of mbufs for handling transfers, these mbufs
have an external buffer attached to them.
If some cases other parts of the network stack can chain these mbufs,
when this happens the normal pool destructor function can end up
free'ing the pool mbufs twice:
- A first time if a pool mbuf has been chained with another mbuf when
its chain is freed
- A second time when its entry in the pool is freed
Additionally, if other parts of the stack demote a pool mbuf its
interface reference will be cleared. In this case we deference a NULL
pointer when trying to free the mbuf through the destructor. Store a
reference to the adapter in ext_arg1 with the destructor callback so we
can find the correct adapter when free'ing a pool mbuf.
This change enables using netmap with epair interfaces.
Reviewed By: vmaffione
MFC after: 1 week
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D44371
|
|
The CSB_WRITE() and _READ() macros respectively write to and read from
userspace memory and so can in principle fault. However, we do not
check for errors and will proceed blindly if they fail. Add assertions
to verify that they do not.
This is in preparation for annotating copyin() and related functions
with __result_use_check.
Reviewed by: vmaffione
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D43200
|
|
Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
|
|
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
|
|
In emulated mode, the FreeBSD netmap port attempts to perform zero-copy
transmission. This works as follows: the kernel ring is populated with
mbuf headers to which netmap buffers are attached. When transmitting,
the mbuf refcount is initialized to 2, and when the counter value has
been decremented to 1 netmap infers that the driver has freed the mbuf
and thus transmission is complete.
This scheme does not generalize to the situation where netmap is
attaching to a software interface which may transmit packets among
multiple "queues", as is the case with bridge or lagg interfaces. In
that case, we would be relying on backing hardware drivers to free
transmitted mbufs promptly, but this isn't guaranteed; a driver may
reasonably defer freeing a small number of transmitted buffers
indefinitely. If such a buffer ends up at the tail of a netmap transmit
ring, further transmits can end up blocked indefinitely.
Fix the problem by removing the zero-copy scheme (which is also not
implemented in the Linux port of netmap). Instead, the kernel ring is
populated with regular mbuf clusters into which netmap buffers are
copied by nm_os_generic_xmit_frame(). The refcounting scheme is
preserved, and this lets us avoid allocating a fresh cluster per
transmitted packet in the common case. If the transmit ring is full, a
callout is used to free the "stuck" mbuf, avoiding the queue deadlock
described above.
Furthermore, when recycling mbuf clusters, be sure to fully reinitialize
the mbuf header instead of simply re-setting M_PKTHDR. Some software
interfaces, like if_vlan, may set fields in the header which should be
reset before the mbuf is reused.
Reviewed by: vmaffione
MFC after: 1 month
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38065
|
|
The save_if_input function pointer was meant to save the previous
value of ifp->if_input before replacing it with the emulated
adapter hook.
However, the same pointer value is already stored in the if_input
field of the netmap_adapter struct, to be used for host TX ring processing.
Reuse the netmap_adapter if_input field to simplify the code
and save some space.
MFC after: 14 days
|
|
MFC after: 7 days
|
|
Reviewed by: vmaffione, zlei
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37814
|
|
Reviewed by: vmaffione
MFC after: 1 week
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38063
|
|
MFC after: 7 days
|
|
Netmap users on FreeBSD are not supposed to import code from the
github netmap repository anymore. They should use the code that
is available in the src repo. We can therefore drop the compatibility
code.
MFC after: 7 days
|
|
MFC after: 1 week
|
|
A distinct number of double-semicolons have ended up in FreeBSD. Take a
pass at getting rid of many of these harmless typos.
Reviewed by: emaste, rrs
Pull Request: https://github.com/freebsd/freebsd-src/pull/609
Differential Revision: https://reviews.freebsd.org/D31716
|
|
The new dev.netmap.max_bridges sysctl tunable can be set in
loader.conf(5) to change the default maximum number of VALE
switches that can be created. Current defaults is 8.
MFC after: 2 weeks
|
|
- make sure rings are disabled during resets
- introduce netmap_update_hostrings_mode(), with support
for multiple host rings
- always initialize ni_bufs_head in netmap_if
ni_bufs_head was not properly initialized when no external buffers were
requestedx and contained the ni_bufs_head from the last request. This
was causing spurious buffer frees when alternating between apps that
used external buffers and apps that did not use them.
- check na validitity under lock on detach
- netmap_mem: fix leak on error path
- nm_dispatch: fix compilation on Raspberry Pi
MFC after: 2 weeks
|
|
This is the April update to vendor/wpa committed upstream
2021/04/07.
This is MFV efec8223892b3e677acb46eae84ec3534989971f.
Suggested by: philip
Reviewed by: philip
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D29744
|
|
Explicitly disable ring synchronization before calling
callbacks that may result in a hardware reset.
Before this patch we relied on capturing the down/up events which,
however, may not be issued by all drivers.
|
|
No functional changes intended.
|
|
|
|
This feature enables applications to ask netmap to transmit or
receive packets starting at a user-specified offset from the
beginning of the netmap buffer. This is meant to ease those
packet manipulation operations such as pushing or popping packet
headers, that may be useful to implement software switches,
routers and other packet processors.
To use the feature, drivers (e.g., iflib, vtnet, etc.) must have
explicit support. This change does not add support for any driver,
but introduces the necessary kernel changes. However, offsets support
is already included for VALE ports and pipes.
|
|
Changes imported from the netmap github.
|
|
This function returns NULL if the ring identified by
queue id and direction is in netmap mode. Otherwise
return the corresponding kring.
Use this function to replace vtnet_netmap_queue_on().
MFC after: 1 week
Notes:
svn path=/head/; revision=362076
|
|
Clean up obsolete sysctl descriptions and add missing ones.
PR: 243838
Reviewed by: bcr
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23546
Notes:
svn path=/head/; revision=357663
|
|
The netmap passthrough subsystem requires proper support in the
hypervisor. In particular, two PCI device ids (from the Red Hat
PCI vendor id 0x1b36) need to be assigned to the two netmap
virtual devices. We then disable these devices until the ids have
not been assigned, in order to avoid conflicts with other
virtual devices emulated by upstream QEMU.
PR: 241774
MFC after: 3 days
Notes:
svn path=/head/; revision=356704
|
|
- Rework option processing.
- Use larger integers for memory size values in the
memory management code.
MFC after: 2 weeks
Notes:
svn path=/head/; revision=351657
|
|
This change adds a counter (kqueue_users) to keep track of how many
kqueue users are referencing a given struct nm_selinfo.
In this way, nm_os_selwakeup() can schedule the kevent notification
task only when kqueue is actually being used.
This is important to avoid wasting CPU in the common case where
kqueue is not used.
Reviewed by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19177
Notes:
svn path=/head/; revision=344253
|
|
Changelist:
- Replace ND, D and RD macros with nm_prdis, nm_prinf, nm_prerr
and nm_prlim, to avoid possible naming conflicts.
- Add netmap_krings_mode_commit() helper function and use that
to reduce code duplication.
- Refactor pipes control code to export some functions that
can be reused by the veth driver (on Linux) and epair(4).
- Add check to reject API requests with version less than 11.
- Small code refactoring for the null adapter.
MFC after: 1 week
Notes:
svn path=/head/; revision=343772
|
|
Add SYNC_KLOOP_MODE option, and add support for direct mode, where application
executes the TXSYNC and RXSYNC in the context of the ioeventfd wake up callback.
MFC after: 5 days
Notes:
svn path=/head/; revision=343689
|
|
When using poll(), select() or kevent() on netmap file descriptors,
netmap executes the equivalent of NIOCTXSYNC and NIOCRXSYNC commands,
before collecting the events that are ready. In other words, the
poll/kevent callback has side effects. This is done to avoid the
overhead of two system call per iteration (e.g., poll() + ioctl(NIOC*XSYNC)).
When the kqueue subsystem invokes the kqueue(9) f_event callback
(netmap_knrw), it holds the lock of the struct knlist object associated
to the netmap port (the lock is provided at initialization, by calling
knlist_init_mtx).
However, netmap_knrw() may need to wake up another netmap port (or even
the same one), which means that it may need to call knote().
Since knote() needs the lock of the struct knlist object associated to
the to-be-wake-up netmap port, it is possible to have a lock order reversal
problem (AB/BA deadlock).
This change prevents the deadlock by executing the knote() call in a
per-selinfo taskqueue, where it is possible to hold a mutex.
Reviewed by: aleksandr.fedorov_itglobal.com
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18956
Notes:
svn path=/head/; revision=343579
|
|
Changelist:
- Add the proper memory barriers in the kloop ring processing
functions.
- Fix memory barriers usage in the user helpers (nm_sync_kloop_appl_write,
nm_sync_kloop_appl_read).
- Fix nm_kr_txempty() helper to look at rhead rather than rcur. This
is important since the kloop can read a value of rcur which is ahead
of the value of rhead (see explanation in nm_sync_kloop_appl_write)
- Remove obsolete ptnetmap_guest_write_kring_csb() and
ptnet_guest_read_kring_csb(), and update if_ptnet(4) to use those.
- Prepare in advance the arguments for netmap_sync_kloop_[tr]x_ring(),
to make the kloop faster.
- Provide kernel and user implementation for nm_ldld_barrier() and
nm_ldst_barrier()
MFC after: 2 weeks
Notes:
svn path=/head/; revision=343346
|
|
The nm_os_selwakeup function needs to call knote() to wake up kqueue(9)
users. However, this function can be called from different code paths,
with different lock requirements.
This patch fixes the knote() call argument to match the relavant lock state.
Also, comments have been updated to reflect current code.
PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219846
Reported by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18876
Notes:
svn path=/head/; revision=343344
|
|
This code validates the netmap buf_size against the interface MTU
and maximum descriptor size, to make sure the values are consistent.
Moving this functionality to its own function is needed because this
function is also called by Linux-specific code.
MFC after: 3 days
Notes:
svn path=/head/; revision=342300
|
|
Changelist:
- Replace netmap passthrough host support with a more general
mechanism to call TXSYNC/RXSYNC from an in-kernel event-loop.
No kernel threads are used to use this feature: the application
is required to spawn a thread (or a process) and issue a
SYNC_KLOOP_START (NIOCCTRL) command in the thread body. The
kernel loop is executed by the ioctl implementation, which returns
to userspace only when a different thread calls SYNC_KLOOP_STOP
or the netmap file descriptor is closed.
- Update the if_ptnet driver to cope with the new data structures,
and prune all the obsolete ptnetmap code.
- Add support for "null" netmap ports, useful to allocate netmap_if,
netmap_ring and netmap buffers to be used by specialized applications
(e.g. hypervisors). TXSYNC/RXSYNC on these ports have no effect.
- Various fixes and code refactoring.
Sponsored by: Sunny Valley Networks
Differential Revision: https://reviews.freebsd.org/D18015
Notes:
svn path=/head/; revision=341516
|
|
Changelist:
- Move large parts of VALE code to a new file and header netmap_bdg.[ch].
This is useful to reuse the code within upcoming projects.
- Improvements and bug fixes to pipes and monitors.
- Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to
handle differences between FreeBSD and Linux.
- Introduce some new helper functions to handle more host rings and fake
rings (netmap_all_rings(), netmap_real_rings(), ...)
- Added new sysctl to enable/disable hw checksum in emulated netmap mode.
- nm_inject: add support for NS_MOREFRAG
Approved by: gnn (mentor)
Differential Revision: https://reviews.freebsd.org/D17364
Notes:
svn path=/head/; revision=339639
|
|
Approved by: sbruno
Notes:
svn path=/head/; revision=333778
|
|
Changelist:
- Turn tx_rings and rx_rings arrays into arrays of pointers to kring
structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib,
vtnet and ptnet drivers to cope with the change.
- Generalize the nm_config() callback to accept a struct containing many
parameters.
- Introduce NKR_FAKERING to support buffers sharing (used for netmap
pipes)
- Improved API for external VALE modules.
- Various bug fixes and improvements to the netmap memory allocator,
including support for externally (userspace) allocated memory.
- Refactoring of netmap pipes: now linked rings share the same netmap
buffers, with a separate set of kring pointers (rhead, rcur, rtail).
Buffer swapping does not need to happen anymore.
- Large refactoring of the control API towards an extensible solution;
the goal is to allow the addition of more commands and extension of
existing ones (with new options) without the need of hacks or the
risk of running out of configuration space.
A new NIOCCTRL ioctl has been added to handle all the requests of the
new control API, which cover all the functionalities so far supported.
The netmap API bumps from 11 to 12 with this patch. Full backward
compatibility is provided for the old control command (NIOCREGIF), by
means of a new netmap_legacy module. Many parts of the old netmap.h
header has now been moved to netmap_legacy.h (included by netmap.h).
Approved by: hrs (mentor)
Notes:
svn path=/head/; revision=332423
|
|
Changelist:
- remove unused nkr_slot_flags
- new nm_intr adapter callback to enable/disable interrupts
- remove unused sysctls and document the other sysctls
- new infrastructure to support NS_MOREFRAG for NIC ports
- support for external memory allocator (for now linux-only),
including linux-specific changes in common headers
- optimizations within netmap pipes datapath
- improvements on VALE control API
- new nm_parse() helper function in netmap_user.h
- various bug fixes and code clean up
Approved by: hrs (mentor)
Notes:
svn path=/head/; revision=332319
|
|
The change upgrades the driver to use the split Communication Status
Block (CSB) format. In this way the variables written by the guest
and read by the host are allocated in a different cacheline than
the variables written by the host and read by the guest; this is
needed to avoid cache thrashing.
Approved by: hrs (mentor)
Notes:
svn path=/head/; revision=332047
|
|
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Notes:
svn path=/head/; revision=326255
|
|
version.
This commit contains mostly refactoring, a few fixes and minor added
functionality.
Submitted by: Vincenzo Maffione <v.maffione at gmail.com>
Requested by: many
Sponsored by: Rubicon Communications, LLC (Netgate)
Notes:
svn path=/head/; revision=319881
|
|
- use PCI_VENDOR and PCI_DEVICE ids from a publicly allocated range
(thanks to RedHat)
- export memory pool information through PCI registers
- improve mechanism for configuring passthrough on different hypervisors
Code is from Vincenzo Maffione as a follow up to his GSOC work.
Notes:
svn path=/head/; revision=308000
|
|
fix build on 32 bit platforms
simplify logic in netmap_virt.h
The commands (in net/netmap.h) to configure communication with the
hypervisor may be revised soon.
At the moment they are unused so this will not be a change of API.
Notes:
svn path=/head/; revision=307574
|
|
#include <sys/selinfo.h> should be here as all drivers that support
netmap need to use this file regardless.
Notes:
svn path=/head/; revision=307569
|
|
This commit, long overdue, contains contributions in the last 2 years
from Stefano Garzarella, Giuseppe Lettieri, Vincenzo Maffione, including:
+ fixes on monitor ports
+ the 'ptnet' virtual device driver, and ptnetmap backend, for
high speed virtual passthrough on VMs (bhyve fixes in an upcoming commit)
+ improved emulated netmap mode
+ more robust error handling
+ removal of stale code
+ various fixes to code and documentation (some mixup between RX and TX
parameters, and private and public variables)
We also include an additional tool, nmreplay, which is functionally
equivalent to tcpreplay but operating on netmap ports.
Notes:
svn path=/head/; revision=307394
|
|
netmap_kern.h currently requires all drivers including it to include
selinfo.h.
Submitted by: mmacy@nextbsd.org
Reviewed by: gnn
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D5334
Notes:
svn path=/head/; revision=306772
|
|
Notes:
svn path=/head/; revision=285699
|
|
Notes:
svn path=/head/; revision=285695
|
|
(detected by jenkins run with gcc 4.9)
Update documentation on the use of netmap_priv_d,
rename the refcount and use the same structure in
FreeBSD and linux
No functional changes.
Notes:
svn path=/head/; revision=285359
|
|
This commit contains large contributions from Giuseppe Lettieri and
Stefano Garzarella, is partly supported by grants from Verisign and Cisco,
and brings in the following:
- fix zerocopy monitor ports and introduce copying monitor ports
(the latter are lower performance but give access to all traffic
in parallel with the application)
- exclusive open mode, useful to implement solutions that recover
from crashes of the main netmap client (suggested by Patrick Kelsey)
- revised memory allocator in preparation for the 'passthrough mode'
(ptnetmap) recently presented at bsdcan. ptnetmap is described in
S. Garzarella, G. Lettieri, L. Rizzo;
Virtual device passthrough for high speed VM networking,
ACM/IEEE ANCS 2015, Oakland (CA) May 2015
http://info.iet.unipi.it/~luigi/research.html
- fix rx CRC handing on ixl
- add module dependencies for netmap when building drivers as modules
- minor simplifications to device-specific routines (*txsync, *rxsync)
- general code cleanup (remove unused variables, introduce macros
to access rings and remove duplicate code,
Applications do not need to be recompiled, unless of course
they want to use the new features (monitors and exclusive open).
Those willing to try this code on stable/10 can just update the
sys/dev/netmap/*, sys/net/netmap* with the version in HEAD
and apply the small patches to individual device drivers.
MFC after: 1 month
Sponsored by: (partly) Verisign, Cisco
Notes:
svn path=/head/; revision=285349
|