path: root/sys/dev/netmap/netmap_freebsd.c
Age | Commit message | Author
2025-10-18 | kqueue: handle copy for netmap filters | Konstantin Belousov
Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D52045
2025-10-17 | netmap: Fix error handling in nm_os_extmem_create() | Mark Johnston
We bump the object reference count prior to mapping it into the kernel map, at which point the vm_map_entry owns the reference. Then, if vm_map_wire() fails, vm_map_remove() will release the reference, so we should avoid decrementing it in the error path. Reported by: Ilja van Sprundel <ivansprundel@ioactive.com> Reviewed by: vmaffione MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D53066
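The ownership rule described in this commit can be sketched in a small userspace model. This is only an illustration under stated assumptions: `toy_obj`, `toy_map_remove()` and `toy_extmem_create()` are hypothetical stand-ins, not the kernel's VM API, and only the reference count is modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a VM object; only the reference count is modeled. */
struct toy_obj {
	int refs;
};

/* Toy stand-in for removing the mapping: the map entry drops the
 * reference it owns. */
static void
toy_map_remove(struct toy_obj *o)
{
	o->refs--;
}

/*
 * Map and wire an object (hypothetical model). wire_ok simulates the
 * outcome of the wiring step. Returns 0 on success, -1 on failure.
 */
static int
toy_extmem_create(struct toy_obj *o, bool wire_ok)
{
	o->refs++;	/* reference taken for the mapping */
	/* From here on the map entry owns that reference. */
	if (!wire_ok) {
		toy_map_remove(o);	/* releases the entry's reference */
		/* No extra o->refs-- here: that would release it twice. */
		return (-1);
	}
	return (0);
}

/* Refcount left on an object with one prior reference after a failed
 * create. */
int
toy_refs_after_failed_create(void)
{
	struct toy_obj o = { .refs = 1 };
	int rc = toy_extmem_create(&o, false);

	assert(rc == -1);
	return (o.refs);
}
```

In this model the caller's original reference survives the failed call, which is the behavior the commit message argues for.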
2025-03-27 | netmap: Add a cdev_pg_path hook that returns the name of the cdev | John Baldwin
Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D49337
2024-11-26 | kern: Make fileops and filterops tables const where possible | Mark Johnston
No functional change intended. MFC after: 1 week
2024-06-28 | net: Remove unneeded NULL check for the allocated ifnet | Zhenlei Huang
Change 4787572d0580 made if_alloc_domain() never fail, and with it the wrappers if_alloc(), if_alloc_dev(), and if_gethandle(). No functional change intended. Reviewed by: kp, imp, glebius, stevek MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D45740
2024-06-16 | netmap: Use device_set_descf() | Mark Johnston
No functional change intended. MFC after: 1 week
2023-08-16 | sys: Remove $FreeBSD$: one-line .c comment pattern | Warner Losh
Remove /^/[*/]\s*\$FreeBSD\$.*\n/
2023-05-12 | spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD | Warner Losh
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
2023-04-05 | netmap: Handle packet batches in generic mode | Mark Johnston
ifnets are allowed to pass batches of multiple packets to if_input, linked by the m_nextpkt pointer. iflib_rxeof() sometimes does this, for example. Netmap's generic mode did not handle this and would only deliver the first packet in the batch, leaking the rest. PR: 270636 Reviewed by: vmaffione MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D39426
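A minimal sketch of walking such a batch, assuming a toy packet type in place of a real mbuf. `toy_pkt` and `toy_deliver_one()` are hypothetical; only the `nextpkt` link mirrors the `m_nextpkt` chaining mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy packet: only the batch link is modeled. */
struct toy_pkt {
	struct toy_pkt *nextpkt;
	int delivered;
};

/* Hypothetical per-packet handler; here it just marks the packet. */
static void
toy_deliver_one(struct toy_pkt *p)
{
	p->delivered = 1;
}

/*
 * Walk the whole batch, unlinking each packet before handing it on.
 * Handling only the head would leave the rest of the chain unprocessed.
 * Returns the number of packets delivered.
 */
int
toy_deliver_batch(struct toy_pkt *head)
{
	int n = 0;

	while (head != NULL) {
		struct toy_pkt *next = head->nextpkt;

		head->nextpkt = NULL;	/* detach from the batch */
		toy_deliver_one(head);
		head = next;
		n++;
	}
	return (n);
}

/* Builds a three-packet batch and delivers it. */
int
toy_batch_demo(void)
{
	struct toy_pkt c = { NULL, 0 };
	struct toy_pkt b = { &c, 0 };
	struct toy_pkt a = { &b, 0 };
	int n = toy_deliver_batch(&a);

	assert(a.delivered && b.delivered && c.delivered);
	return (n);
}
```

Saving the `next` pointer before clearing the link is what keeps the remaining packets reachable.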
2023-04-05 | netmap: Fix queue stalls with generic interfaces | Mark Johnston
In emulated mode, the FreeBSD netmap port attempts to perform zero-copy transmission. This works as follows: the kernel ring is populated with mbuf headers to which netmap buffers are attached. When transmitting, the mbuf refcount is initialized to 2, and when the counter value has been decremented to 1 netmap infers that the driver has freed the mbuf and thus transmission is complete.
This scheme does not generalize to the situation where netmap is attaching to a software interface which may transmit packets among multiple "queues", as is the case with bridge or lagg interfaces. In that case, we would be relying on backing hardware drivers to free transmitted mbufs promptly, but this isn't guaranteed; a driver may reasonably defer freeing a small number of transmitted buffers indefinitely. If such a buffer ends up at the tail of a netmap transmit ring, further transmits can end up blocked indefinitely.
Fix the problem by removing the zero-copy scheme (which is also not implemented in the Linux port of netmap). Instead, the kernel ring is populated with regular mbuf clusters into which netmap buffers are copied by nm_os_generic_xmit_frame(). The refcounting scheme is preserved, and this lets us avoid allocating a fresh cluster per transmitted packet in the common case. If the transmit ring is full, a callout is used to free the "stuck" mbuf, avoiding the queue deadlock described above.
Furthermore, when recycling mbuf clusters, be sure to fully reinitialize the mbuf header instead of simply re-setting M_PKTHDR. Some software interfaces, like if_vlan, may set fields in the header which should be reset before the mbuf is reused.
Reviewed by: vmaffione MFC after: 1 month Sponsored by: Zenarmor Sponsored by: OPNsense Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D38065
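The refcount convention described in this commit (start at 2, treat 1 as "transmission complete") can be modeled as follows. This is a sketch only: `toy_txbuf` and its helpers are hypothetical and do not reproduce the mbuf API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy transmit buffer; only the reference count is modeled. */
struct toy_txbuf {
	int refcnt;
};

/* On transmit, one reference is held by the ring and one by the driver. */
static void
toy_tx_start(struct toy_txbuf *b)
{
	b->refcnt = 2;
}

/* The driver "frees" the buffer: this only drops the driver's reference. */
static void
toy_driver_free(struct toy_txbuf *b)
{
	assert(b->refcnt > 1);
	b->refcnt--;
}

/* The ring infers completion once only its own reference is left. */
static bool
toy_tx_done(const struct toy_txbuf *b)
{
	return (b->refcnt == 1);
}

/* Returns 1 if completion is seen only after the driver frees the buffer. */
int
toy_tx_demo(void)
{
	struct toy_txbuf b;
	bool before, after;

	toy_tx_start(&b);
	before = toy_tx_done(&b);	/* still in flight */
	toy_driver_free(&b);
	after = toy_tx_done(&b);	/* driver released it */
	return (!before && after);
}
```

If the driver never calls its free routine, `toy_tx_done()` stays false indefinitely, which is the stall the commit addresses with a callout.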
2023-03-14 | netmap: get rid of save_if_input for emulated adapters | Vincenzo Maffione
The save_if_input function pointer was meant to save the previous value of ifp->if_input before replacing it with the emulated adapter hook. However, the same pointer value is already stored in the if_input field of the netmap_adapter struct, to be used for host TX ring processing. Reuse the netmap_adapter if_input field to simplify the code and save some space. MFC after: 14 days
2023-03-09 | netmap: Remove obsolete compatibility defines | Mark Johnston
No functional change intended. Reviewed by: vmaffione MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D39001
2023-02-14 | Mechanically convert netmap(4) to IfAPI | Justin Hibbits
Reviewed by: vmaffione, zlei Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D37814
2023-02-08 | netmap: drop redundant if_mtu assignment | Vincenzo Maffione
Reported by: zlei MFC after: 3 days
2023-01-13 | if_lagg: Allow lagg interfaces to be used with netmap | Tom Jones
Reviewed by: zlei Sponsored by: Zenarmor Sponsored by: OPNsense Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D37436
2022-12-24 | netmap: drop compatibility FreeBSD code | Vincenzo Maffione
Netmap users on FreeBSD are not supposed to import code from the github netmap repository anymore. They should use the code that is available in the src repo. We can therefore drop the compatibility code. MFC after: 7 days
2022-05-10 | netmap: Remove unused devclass arguments to DRIVER_MODULE. | John Baldwin
2021-04-02 | netmap: several typo fixes | Vincenzo Maffione
No functional changes intended.
2021-03-29 | netmap: add kernel support for the "offsets" feature | Vincenzo Maffione
This feature enables applications to ask netmap to transmit or receive packets starting at a user-specified offset from the beginning of the netmap buffer. This is meant to ease packet manipulation operations, such as pushing or popping packet headers, that are useful when implementing software switches, routers and other packet processors. To use the feature, drivers (e.g., iflib, vtnet, etc.) must have explicit support. This change does not add support for any driver, but introduces the necessary kernel changes. However, offsets support is already included for VALE ports and pipes.
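Why a per-buffer offset makes header push and pop cheap can be shown with a toy buffer. This sketch does not use the netmap API: `toy_slot`, `toy_push_header()` and `toy_pop_header()` are hypothetical names, and the 16-byte starting offset is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 64

/* Toy buffer slot: the packet occupies buf[off .. off + len). */
struct toy_slot {
	unsigned char buf[TOY_BUF_SIZE];
	size_t off;
	size_t len;
};

/* Prepend a header by moving the start offset back; no payload copy. */
static int
toy_push_header(struct toy_slot *s, const void *hdr, size_t hlen)
{
	if (hlen > s->off)
		return (-1);	/* not enough headroom */
	s->off -= hlen;
	s->len += hlen;
	memcpy(s->buf + s->off, hdr, hlen);
	return (0);
}

/* Strip a header by moving the start offset forward. */
static int
toy_pop_header(struct toy_slot *s, size_t hlen)
{
	if (hlen > s->len)
		return (-1);
	s->off += hlen;
	s->len -= hlen;
	return (0);
}

/* Push then pop a 4-byte header; returns the final offset. */
int
toy_offset_demo(void)
{
	struct toy_slot s = { .off = 16, .len = 10 };
	const unsigned char tag[4] = { 0x81, 0x00, 0x00, 0x2a };

	if (toy_push_header(&s, tag, sizeof(tag)) != 0)
		return (-1);
	assert(s.off == 12 && s.len == 14);
	if (toy_pop_header(&s, sizeof(tag)) != 0)
		return (-1);
	return ((int)s.off);
}
```

With headroom reserved in front of the packet, both operations touch only the header bytes and the two counters.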
2021-03-20 | netmap: fix issues in nm_os_extmem_create() | Vincenzo Maffione
- Call vm_object_reference() before vm_map_lookup_done().
- Use vm_mmap_to_errno() to convert vm_map_* return values to errno.
- Fix memory leak of e->obj.
Reported by: markj Reviewed by: markj MFC after: 1 week
2019-12-22 | Make page busy state deterministic on free. Pages must be xbusy when | Jeff Roberson
removed from objects including calls to free. Pages must not be xbusy when freed and not on an object. Strengthen assertions to match these expectations. In practice very little code had to change busy handling to meet these rules but we can now make stronger guarantees to busy holders and avoid conditionally dropping busy in free. Refine vm_page_remove() and vm_page_replace() semantics now that we have stronger guarantees about busy state. This removes redundant and potentially problematic code that has proliferated. Discussed with: markj Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D22822 Notes: svn path=/head/; revision=356002
2019-10-15 | (4/6) Protect page valid with the busy lock. | Jeff Roberson
Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21594 Notes: svn path=/head/; revision=353539
2019-09-09 | Change synchronization rules for vm_page reference counting. | Mark Johnston
There are several mechanisms by which a vm_page reference is held, preventing the page from being freed back to the page allocator. In particular, holding the page's object lock is sufficient to prevent the page from being freed; holding the busy lock or a wiring is sufficient as well. These references are protected by the page lock, which must therefore be acquired for many per-page operations. This results in false sharing since the page locks are external to the vm_page structures themselves and each lock protects multiple structures.
Transition to using an atomically updated per-page reference counter. The object's reference is counted using a flag bit in the counter. A second flag bit is used to atomically block new references via pmap_extract_and_hold() while removing managed mappings of a page. Thus, the reference count of a page is guaranteed not to increase if the page is unbusied, unmapped, and the object's write lock is held. As a consequence of this, the page lock no longer protects a page's identity; operations which move pages between objects are now synchronized solely by the objects' locks.
The vm_page_wire() and vm_page_unwire() KPIs are changed. The former requires that either the object lock or the busy lock is held. The latter no longer has a return value and may free the page if it releases the last reference to that page. vm_page_unwire_noq() behaves the same as before; the caller is responsible for checking its return value and freeing or enqueuing the page as appropriate. vm_page_wire_mapped() is introduced for use in pmap_extract_and_hold(). It fails if the page is concurrently being unmapped, typically triggering a fallback to the fault handler. vm_page_wire() no longer requires the page lock and vm_page_unwire() now internally acquires the page lock when releasing the last wiring of a page (since the page lock still protects a page's queue state). In particular, synchronization details are no longer leaked into the caller.
The change excises the page lock from several frequently executed code paths. In particular, vm_object_terminate() no longer bounces between page locks as it releases an object's pages, and direct I/O and sendfile(SF_NOCACHE) completions no longer require the page lock. In these latter cases we now get linear scalability in the common scenario where different threads are operating on different files.
__FreeBSD_version is bumped. The DRM ports have been updated to accommodate the KPI changes.
Reviewed by: jeff (earlier version) Tested by: gallatin (earlier version), pho Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D20486 Notes: svn path=/head/; revision=352110
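The idea of a reference counter with a flag bit that blocks new references can be sketched with C11 atomics. This is a userspace approximation under stated assumptions: `toy_page`, `TOY_REF_BLOCKED` and the helper names are hypothetical, and the bit layout is not the kernel's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit: while set, no new references may be taken. */
#define TOY_REF_BLOCKED	0x40000000u

struct toy_page {
	_Atomic uint32_t ref_count;
};

/* Try to take a reference; fails if new references are blocked. */
static bool
toy_wire_mapped(struct toy_page *p)
{
	uint32_t old = atomic_load(&p->ref_count);

	do {
		if ((old & TOY_REF_BLOCKED) != 0)
			return (false);
		/* On failure, old is reloaded and the flag is rechecked. */
	} while (!atomic_compare_exchange_weak(&p->ref_count, &old, old + 1));
	return (true);
}

/* Block new references, e.g. while mappings are being removed. */
static void
toy_block(struct toy_page *p)
{
	atomic_fetch_or(&p->ref_count, TOY_REF_BLOCKED);
}

/* Returns the plain count if the flag behaved as expected, else -1. */
int
toy_ref_demo(void)
{
	struct toy_page p;
	bool first, second;
	uint32_t count;

	atomic_init(&p.ref_count, 0);
	first = toy_wire_mapped(&p);	/* succeeds, count becomes 1 */
	toy_block(&p);
	second = toy_wire_mapped(&p);	/* refused while blocked */
	count = atomic_load(&p.ref_count) & ~TOY_REF_BLOCKED;
	return ((first && !second) ? (int)count : -1);
}
```

Keeping the flag and the count in one word lets a single compare-and-swap both check the flag and take the reference, so no separate lock is needed for that step.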
2019-07-01 | netmap: fix two panics with emulated adapter | Vincenzo Maffione
This patch fixes 2 panics. The first one is due to the current VNET not being set in the emulated adapter transmission path. The second one is caused by the M_PKTHDR flag not being set when preallocated mbufs are recycled in the transmit path. Submitted by: aleksandr.fedorov@itglobal.com Reviewed by: vmaffione MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D20824 Notes: svn path=/head/; revision=349581
2019-05-21 | Add two missing eventhandler.h headers | Conrad Meyer
These are obviously missing from the .c files, but don't show up in any tinderbox configuration (due to latent header pollution of some kind). It seems some configurations don't have this pollution, and the includes are obviously missing, so go ahead and add them. Reported by: Peter Jeremy <peter AT rulingia.com> X-MFC-With: r347984 Notes: svn path=/head/; revision=348022
2019-02-18 | netmap: don't schedule kqueue notify task when kqueue is not used | Vincenzo Maffione
This change adds a counter (kqueue_users) to keep track of how many kqueue users are referencing a given struct nm_selinfo. In this way, nm_os_selwakeup() can schedule the kevent notification task only when kqueue is actually being used. This is important to avoid wasting CPU in the common case where kqueue is not used. Reviewed by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com> MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D19177 Notes: svn path=/head/; revision=344253
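A minimal model of the counter-gated wakeup described here. The struct and function names are hypothetical; the real code works on struct nm_selinfo and a taskqueue, neither of which is reproduced.

```c
#include <assert.h>

/* Toy selinfo: counts kqueue users and how often a task was scheduled. */
struct toy_selinfo {
	int kqueue_users;
	int tasks_scheduled;
};

static void
toy_kq_attach(struct toy_selinfo *si)
{
	si->kqueue_users++;
}

static void
toy_kq_detach(struct toy_selinfo *si)
{
	assert(si->kqueue_users > 0);
	si->kqueue_users--;
}

/* Schedule the notification task only if someone is using kqueue. */
static void
toy_selwakeup(struct toy_selinfo *si)
{
	if (si->kqueue_users > 0)
		si->tasks_scheduled++;
}

/* Wake up three times, with a kqueue user attached only for the second. */
int
toy_selwakeup_demo(void)
{
	struct toy_selinfo si = { 0, 0 };

	toy_selwakeup(&si);	/* no users: nothing scheduled */
	toy_kq_attach(&si);
	toy_selwakeup(&si);	/* one user: task scheduled */
	toy_kq_detach(&si);
	toy_selwakeup(&si);	/* no users again: nothing scheduled */
	return (si.tasks_scheduled);
}
```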
2019-02-05 | netmap: refactor logging macros and pipes | Vincenzo Maffione
Changelist:
- Replace ND, D and RD macros with nm_prdis, nm_prinf, nm_prerr and nm_prlim, to avoid possible naming conflicts.
- Add netmap_krings_mode_commit() helper function and use that to reduce code duplication.
- Refactor pipes control code to export some functions that can be reused by the veth driver (on Linux) and epair(4).
- Add check to reject API requests with version less than 11.
- Small code refactoring for the null adapter.
MFC after: 1 week Notes: svn path=/head/; revision=343772
2019-01-30 | netmap: fix lock order reversal related to kqueue usage | Vincenzo Maffione
When using poll(), select() or kevent() on netmap file descriptors, netmap executes the equivalent of NIOCTXSYNC and NIOCRXSYNC commands, before collecting the events that are ready. In other words, the poll/kevent callback has side effects. This is done to avoid the overhead of two system calls per iteration (e.g., poll() + ioctl(NIOC*XSYNC)). When the kqueue subsystem invokes the kqueue(9) f_event callback (netmap_knrw), it holds the lock of the struct knlist object associated to the netmap port (the lock is provided at initialization, by calling knlist_init_mtx). However, netmap_knrw() may need to wake up another netmap port (or even the same one), which means that it may need to call knote(). Since knote() needs the lock of the struct knlist object associated to the netmap port to be woken up, it is possible to have a lock order reversal problem (AB/BA deadlock). This change prevents the deadlock by executing the knote() call in a per-selinfo taskqueue, where it is possible to hold a mutex. Reviewed by: aleksandr.fedorov_itglobal.com MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D18956 Notes: svn path=/head/; revision=343579
2019-01-29 | netmap: add notifications on kloop stop | Vincenzo Maffione
On sync-kloop stop, send a wake-up signal to the kloop, so that waiting for the timeout is not needed. Also, improve logging in netmap_freebsd.c. MFC after: 3 days Notes: svn path=/head/; revision=343549
2019-01-23 | netmap: fix knote() argument to match the mutex state | Vincenzo Maffione
The nm_os_selwakeup function needs to call knote() to wake up kqueue(9) users. However, this function can be called from different code paths, with different lock requirements. This patch fixes the knote() call argument to match the relevant lock state. Also, comments have been updated to reflect current code. PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219846 Reported by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com> Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D18876 Notes: svn path=/head/; revision=343344
2018-12-05 | netmap: align codebase to the current upstream (760279cfb2730a585) | Vincenzo Maffione
Changelist:
- Replace netmap passthrough host support with a more general mechanism to call TXSYNC/RXSYNC from an in-kernel event-loop. No kernel threads are needed to use this feature: the application is required to spawn a thread (or a process) and issue a SYNC_KLOOP_START (NIOCCTRL) command in the thread body. The kernel loop is executed by the ioctl implementation, which returns to userspace only when a different thread calls SYNC_KLOOP_STOP or the netmap file descriptor is closed.
- Update the if_ptnet driver to cope with the new data structures, and prune all the obsolete ptnetmap code.
- Add support for "null" netmap ports, useful to allocate netmap_if, netmap_ring and netmap buffers to be used by specialized applications (e.g. hypervisors). TXSYNC/RXSYNC on these ports have no effect.
- Various fixes and code refactoring.
Sponsored by: Sunny Valley Networks Differential Revision: https://reviews.freebsd.org/D18015 Notes: svn path=/head/; revision=341516
2018-11-28 | netmap: set IFCAP_NETMAP in if_capabilities | Vincenzo Maffione
Revision r307394 removed (by mistake) the code that sets IFCAP_NETMAP in if_capabilities on netmap_attach. This patch reverts this change. Reviewed by: np Approved by: gnn (mentor) MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D17987 Notes: svn path=/head/; revision=341144
2018-10-23 | netmap: align codebase to the current upstream (sha 8374e1a7e6941) | Vincenzo Maffione
Changelist:
- Move large parts of VALE code to a new file and header netmap_bdg.[ch]. This is useful to reuse the code within upcoming projects.
- Improvements and bug fixes to pipes and monitors.
- Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to handle differences between FreeBSD and Linux.
- Introduce some new helper functions to handle more host rings and fake rings (netmap_all_rings(), netmap_real_rings(), ...)
- Added new sysctl to enable/disable hw checksum in emulated netmap mode.
- nm_inject: add support for NS_MOREFRAG
Approved by: gnn (mentor) Differential Revision: https://reviews.freebsd.org/D17364 Notes: svn path=/head/; revision=339639
2018-08-14 | Fix several memory leaks. | David Bright
The libkqueue tests have several places that leak memory by using an idiom like: puts(kevent_to_str(kevp)); Rework to save the pointer returned from kevent_to_str() and then free() it after it has been used. Reported by: asomers (pointer to Coverity), Coverity CID: 1296063, 1296064, 1296065, 1296066, 1296067, 1350287, 1394960 Sponsored by: Dell EMC Notes: svn path=/head/; revision=337812
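The leak and its fix can be illustrated with a stand-in function. `toy_event_to_str()` below is hypothetical and only assumes, as the commit message implies for kevent_to_str(), that the returned string is heap-allocated and owned by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: returns a heap string the caller must free. */
static char *
toy_event_to_str(int ident)
{
	char *s = malloc(32);

	if (s != NULL)
		snprintf(s, 32, "ident=%d", ident);
	return (s);
}

/*
 * Leaky idiom: puts(toy_event_to_str(ident)); the pointer is lost.
 * Fixed idiom: keep the pointer, use it, then free it.
 * Returns the length of the printed string, or -1 on allocation failure.
 */
int
toy_print_event(int ident)
{
	char *s = toy_event_to_str(ident);
	int len;

	if (s == NULL)
		return (-1);
	len = (int)strlen(s);
	puts(s);
	free(s);	/* released after its last use */
	return (len);
}
```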
2018-05-19 | netmap: compare e1 with e2, not with itself | Matt Macy
Notes: svn path=/head/; revision=333865
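The class of bug named in the subject line looks roughly like the following. The struct and field are hypothetical, chosen only to show a comparison that accidentally uses the same operand twice; the actual netmap code is not reproduced.

```c
#include <assert.h>

/* Hypothetical entry type with a single identifying field. */
struct toy_ext {
	const void *obj;
};

/*
 * Buggy form (always true):  return (e1->obj == e1->obj);
 * Fixed form compares the two different entries.
 */
static int
toy_same(const struct toy_ext *e1, const struct toy_ext *e2)
{
	return (e1->obj == e2->obj);
}

/* Returns 1 if distinct entries differ and an entry equals itself. */
int
toy_same_demo(void)
{
	int x = 0, y = 0;
	struct toy_ext a = { &x };
	struct toy_ext b = { &y };

	assert(toy_same(&a, &a));
	return (!toy_same(&a, &b));
}
```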
2018-05-18 | netmap: pull fix for 32-bit support from upstream | Matt Macy
Approved by: sbruno Notes: svn path=/head/; revision=333778
2018-04-13 | Fix build on 32-bit systems. | Brooks Davis
Notes: svn path=/head/; revision=332488
2018-04-12 | netmap: align codebase to the current upstream (commit id 3fb001303718146) | Vincenzo Maffione
Changelist:
- Turn tx_rings and rx_rings arrays into arrays of pointers to kring structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib, vtnet and ptnet drivers to cope with the change.
- Generalize the nm_config() callback to accept a struct containing many parameters.
- Introduce NKR_FAKERING to support buffers sharing (used for netmap pipes)
- Improved API for external VALE modules.
- Various bug fixes and improvements to the netmap memory allocator, including support for externally (userspace) allocated memory.
- Refactoring of netmap pipes: now linked rings share the same netmap buffers, with a separate set of kring pointers (rhead, rcur, rtail). Buffer swapping does not need to happen anymore.
- Large refactoring of the control API towards an extensible solution; the goal is to allow the addition of more commands and extension of existing ones (with new options) without the need of hacks or the risk of running out of configuration space. A new NIOCCTRL ioctl has been added to handle all the requests of the new control API, which cover all the functionalities so far supported. The netmap API bumps from 11 to 12 with this patch. Full backward compatibility is provided for the old control command (NIOCREGIF), by means of a new netmap_legacy module. Many parts of the old netmap.h header have now been moved to netmap_legacy.h (included by netmap.h).
Approved by: hrs (mentor) Notes: svn path=/head/; revision=332423
2018-04-09 | netmap: align codebase to upstream version v11.4 | Vincenzo Maffione
Changelist:
- remove unused nkr_slot_flags
- new nm_intr adapter callback to enable/disable interrupts
- remove unused sysctls and document the other sysctls
- new infrastructure to support NS_MOREFRAG for NIC ports
- support for external memory allocator (for now linux-only), including linux-specific changes in common headers
- optimizations within netmap pipes datapath
- improvements on VALE control API
- new nm_parse() helper function in netmap_user.h
- various bug fixes and code clean up
Approved by: hrs (mentor) Notes: svn path=/head/; revision=332319
2017-11-27 | sys/dev: further adoption of SPDX licensing ID tags. | Pedro F. Giffuni
Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Notes: svn path=/head/; revision=326255
2017-07-21 | Restore the changes done in r313982: Replace zero with NULL for pointers. | Luiz Otavio O Souza
Spotted by: Harry Schmalzbauer MFC after: 1 week Sponsored by: Rubicon Communications, LLC (Netgate) Notes: svn path=/head/; revision=321321
2017-06-12 | Update the current version of netmap to bring it in sync with the github | Luiz Otavio O Souza
version. This commit contains mostly refactoring, a few fixes and minor added functionality. Submitted by: Vincenzo Maffione <v.maffione at gmail.com> Requested by: many Sponsored by: Rubicon Communications, LLC (Netgate) Notes: svn path=/head/; revision=319881
2017-02-20 | sys/dev: Replace zero with NULL for pointers. | Pedro F. Giffuni
Makes things easier to read, plus architectures may set NULL to something different than zero. Found with: devel/coccinelle MFC after: 3 weeks Notes: svn path=/head/; revision=313982
2017-01-02 | [netmap] call RLOCK /and/ RUNLOCK. | Adrian Chadd
Reported by: olivier Notes: svn path=/head/; revision=311045
2016-12-30 | [netmap] fix locking regressions | Adrian Chadd
* Firmware oriented NICs may need to sleep in their configuration paths. Use RLOCK instead of WLOCK to allow this to again occur. This fixes netmap on cxgbe.
* Change the worker lock to a normal mutex rather than a spin lock. Drivers shouldn't be doing netmap work from the fast interrupt handlers, so it's not required to be a spinlock.
Submitted by: luigi, Vincenzo Maffione <v.maffione@gmail.com> Reviewed by: jhb Notes: svn path=/head/; revision=310822
2016-11-30 | netmap: add cast to fix powerpc64 LINT kernel | Ed Maste
Attempt to fix powerpc64 LINT kernel broken by r308000. Netmap's use of a uint64_t wchan seems odd, but in the interest of minimizing this change just cast through uintptr_t to silence the compiler warning. Reviewed by: jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D8669 Notes: svn path=/head/; revision=309306
2016-10-27 | Various fixes for ptnet/ptnetmap (passthrough of netmap ports). In detail: | Luigi Rizzo
- use PCI_VENDOR and PCI_DEVICE ids from a publicly allocated range (thanks to RedHat)
- export memory pool information through PCI registers
- improve mechanism for configuring passthrough on different hypervisors
Code is from Vincenzo Maffione as a follow up to his GSOC work. Notes: svn path=/head/; revision=308000
2016-10-21 | netmap: Unbreak LINT-VIMAGE building | Sepherosa Ziehau
Sponsored by: Microsoft Notes: svn path=/head/; revision=307706
2016-10-21 | netmap: Unbreak i386 LINT building | Sepherosa Ziehau
Sponsored by: Microsoft Notes: svn path=/head/; revision=307703
2016-10-18 | remove stale and unused code from various files | Luigi Rizzo
fix build on 32 bit platforms
simplify logic in netmap_virt.h
The commands (in net/netmap.h) to configure communication with the hypervisor may be revised soon. At the moment they are unused so this will not be a change of API.
Notes: svn path=/head/; revision=307574