summaryrefslogtreecommitdiff
path: root/sys/dev/netmap/netmap_generic.c
AgeCommit message (Collapse)Author
2024-03-26netmap: Address errors on memory free in netmap_genericTom Jones
netmap_generic keeps a pool of mbufs for handling transfers, these mbufs have an external buffer attached to them. If some cases other parts of the network stack can chain these mbufs, when this happens the normal pool destructor function can end up free'ing the pool mbufs twice: - A first time if a pool mbuf has been chained with another mbuf when its chain is freed - A second time when its entry in the pool is freed Additionally, if other parts of the stack demote a pool mbuf its interface reference will be cleared. In this case we deference a NULL pointer when trying to free the mbuf through the destructor. Store a reference to the adapter in ext_arg1 with the destructor callback so we can find the correct adapter when free'ing a pool mbuf. This change enables using netmap with epair interfaces. Reviewed By: vmaffione MFC after: 1 week Relnotes: yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D44371
2023-08-16sys: Remove $FreeBSD$: one-line .c patternWarner Losh
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-05-12spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSDWarner Losh
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
2023-04-05netmap: Fix queue stalls with generic interfacesMark Johnston
In emulated mode, the FreeBSD netmap port attempts to perform zero-copy transmission. This works as follows: the kernel ring is populated with mbuf headers to which netmap buffers are attached. When transmitting, the mbuf refcount is initialized to 2, and when the counter value has been decremented to 1 netmap infers that the driver has freed the mbuf and thus transmission is complete. This scheme does not generalize to the situation where netmap is attaching to a software interface which may transmit packets among multiple "queues", as is the case with bridge or lagg interfaces. In that case, we would be relying on backing hardware drivers to free transmitted mbufs promptly, but this isn't guaranteed; a driver may reasonably defer freeing a small number of transmitted buffers indefinitely. If such a buffer ends up at the tail of a netmap transmit ring, further transmits can end up blocked indefinitely. Fix the problem by removing the zero-copy scheme (which is also not implemented in the Linux port of netmap). Instead, the kernel ring is populated with regular mbuf clusters into which netmap buffers are copied by nm_os_generic_xmit_frame(). The refcounting scheme is preserved, and this lets us avoid allocating a fresh cluster per transmitted packet in the common case. If the transmit ring is full, a callout is used to free the "stuck" mbuf, avoiding the queue deadlock described above. Furthermore, when recycling mbuf clusters, be sure to fully reinitialize the mbuf header instead of simply re-setting M_PKTHDR. Some software interfaces, like if_vlan, may set fields in the header which should be reset before the mbuf is reused. Reviewed by: vmaffione MFC after: 1 month Sponsored by: Zenarmor Sponsored by: OPNsense Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D38065
2023-03-11netmap: get rid of WNA() macroVincenzo Maffione
MFC after: 7 days
2023-02-14Mechanically convert netmap(4) to IfAPIJustin Hibbits
Reviewed by: vmaffione, zlei Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D37814
2023-01-23netmap: Try to count packet drops in emulated modeMark Johnston
Right now we have little visibility into packet drops within netmap. Start trying to make packet loss issues more visible by counting queue drops in the transmit path, and in the input path for interfaces running in emulated mode, where we place received packets in a bounded software queue that is processed by rxsync. Reviewed by: vmaffione MFC after: 1 week Sponsored by: Zenarmor Sponsored by: OPNsense Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D38064
2023-01-23netmap: Fix a queue length check in the generic port rx pathMark Johnston
The check is ok by default, since the default value of netmap_generic_ringsize is 1024. But we should check against the configured "ring" size. Reviewed by: vmaffione MFC after: 1 week Sponsored by: Zenarmor Sponsored by: OPNsense Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D38062
2022-12-24netmap: drop compatibility FreeBSD codeVincenzo Maffione
Netmap users on FreeBSD are not supposed to import code from the github netmap repository anymore. They should use the code that is available in the src repo. We can therefore drop the compatibility code. MFC after: 7 days
2021-04-02netmap: several typo fixesVincenzo Maffione
No functional changes intended.
2021-03-29netmap: add kernel support for the "offsets" featureVincenzo Maffione
This feature enables applications to ask netmap to transmit or receive packets starting at a user-specified offset from the beginning of the netmap buffer. This is meant to ease those packet manipulation operations such as pushing or popping packet headers, that may be useful to implement software switches, routers and other packet processors. To use the feature, drivers (e.g., iflib, vtnet, etc.) must have explicit support. This change does not add support for any driver, but introduces the necessary kernel changes. However, offsets support is already included for VALE ports and pipes.
2019-10-28netmap: enter NET_EPOCH on generic txsyncVincenzo Maffione
After r353292, netmap generic adapter on if_vlan interfaces panics on asserting the NET_EPOCH. In more detail, this happens when nm_os_generic_xmit_frame() is called, that is in the generic txsync routine. Fix the issue by entering the NET_EPOCH during the generic txsync. We amortize the cost of entering/exiting over a whole batch of transmissions. PR: 241489 Reported by: Aleksandr Fedorov <aleksandr.fedorov@itglobal.com> Notes: svn path=/head/; revision=354137
2019-07-13netmap: fix bug introduced by r349752Vincenzo Maffione
r349752 introduced a NULL pointer reference bug in the emulated netmap code. Reported by: lwhsu MFC after: 3 days Notes: svn path=/head/; revision=349966
2019-07-04netmap: fix kernel pointer printing in netmap_generic.cVincenzo Maffione
Print the adapter name rather than the address of the adapter to avoid kernel address leakage. PR: Bug 238642 Submitted by: Fuqian Huang <huangfq.daxian@gmail.com> Reviewed by: vmaffione MFC after: 1 week Notes: svn path=/head/; revision=349752
2019-02-05netmap: refactor logging macros and pipesVincenzo Maffione
Changelist: - Replace ND, D and RD macros with nm_prdis, nm_prinf, nm_prerr and nm_prlim, to avoid possible naming conflicts. - Add netmap_krings_mode_commit() helper function and use that to reduce code duplication. - Refactor pipes control code to export some functions that can be reused by the veth driver (on Linux) and epair(4). - Add check to reject API requests with version less than 11. - Small code refactoring for the null adapter. MFC after: 1 week Notes: svn path=/head/; revision=343772
2018-12-05netmap: align codebase to the current upstream (760279cfb2730a585)Vincenzo Maffione
Changelist: - Replace netmap passthrough host support with a more general mechanism to call TXSYNC/RXSYNC from an in-kernel event-loop. No kernel threads are used to use this feature: the application is required to spawn a thread (or a process) and issue a SYNC_KLOOP_START (NIOCCTRL) command in the thread body. The kernel loop is executed by the ioctl implementation, which returns to userspace only when a different thread calls SYNC_KLOOP_STOP or the netmap file descriptor is closed. - Update the if_ptnet driver to cope with the new data structures, and prune all the obsolete ptnetmap code. - Add support for "null" netmap ports, useful to allocate netmap_if, netmap_ring and netmap buffers to be used by specialized applications (e.g. hypervisors). TXSYNC/RXSYNC on these ports have no effect. - Various fixes and code refactoring. Sponsored by: Sunny Valley Networks Differential Revision: https://reviews.freebsd.org/D18015 Notes: svn path=/head/; revision=341516
2018-10-23netmap: align codebase to the current upstream (sha 8374e1a7e6941)Vincenzo Maffione
Changelist: - Move large parts of VALE code to a new file and header netmap_bdg.[ch]. This is useful to reuse the code within upcoming projects. - Improvements and bug fixes to pipes and monitors. - Introduce nm_os_onattach(), nm_os_onenter() and nm_os_onexit() to handle differences between FreeBSD and Linux. - Introduce some new helper functions to handle more host rings and fake rings (netmap_all_rings(), netmap_real_rings(), ...) - Added new sysctl to enable/disable hw checksum in emulated netmap mode. - nm_inject: add support for NS_MOREFRAG Approved by: gnn (mentor) Differential Revision: https://reviews.freebsd.org/D17364 Notes: svn path=/head/; revision=339639
2018-05-18netmap: pull fix for 32-bit support from upstreamMatt Macy
Approved by: sbruno Notes: svn path=/head/; revision=333778
2018-04-12netmap: align codebase to the current upstream (commit id 3fb001303718146)Vincenzo Maffione
Changelist: - Turn tx_rings and rx_rings arrays into arrays of pointers to kring structs. This patch includes fixes for ixv, ixl, ix, re, cxgbe, iflib, vtnet and ptnet drivers to cope with the change. - Generalize the nm_config() callback to accept a struct containing many parameters. - Introduce NKR_FAKERING to support buffers sharing (used for netmap pipes) - Improved API for external VALE modules. - Various bug fixes and improvements to the netmap memory allocator, including support for externally (userspace) allocated memory. - Refactoring of netmap pipes: now linked rings share the same netmap buffers, with a separate set of kring pointers (rhead, rcur, rtail). Buffer swapping does not need to happen anymore. - Large refactoring of the control API towards an extensible solution; the goal is to allow the addition of more commands and extension of existing ones (with new options) without the need of hacks or the risk of running out of configuration space. A new NIOCCTRL ioctl has been added to handle all the requests of the new control API, which cover all the functionalities so far supported. The netmap API bumps from 11 to 12 with this patch. Full backward compatibility is provided for the old control command (NIOCREGIF), by means of a new netmap_legacy module. Many parts of the old netmap.h header has now been moved to netmap_legacy.h (included by netmap.h). Approved by: hrs (mentor) Notes: svn path=/head/; revision=332423
2018-04-09netmap: align codebase to upstream version v11.4Vincenzo Maffione
Changelist: - remove unused nkr_slot_flags - new nm_intr adapter callback to enable/disable interrupts - remove unused sysctls and document the other sysctls - new infrastructure to support NS_MOREFRAG for NIC ports - support for external memory allocator (for now linux-only), including linux-specific changes in common headers - optimizations within netmap pipes datapath - improvements on VALE control API - new nm_parse() helper function in netmap_user.h - various bug fixes and code clean up Approved by: hrs (mentor) Notes: svn path=/head/; revision=332319
2017-11-27sys/dev: further adoption of SPDX licensing ID tags.Pedro F. Giffuni
Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Notes: svn path=/head/; revision=326255
2017-10-09Shorten list of arguments to mbuf external storage freeing function.Gleb Smirnoff
All of these arguments are stored in m_ext, so there is no reason to pass them in the argument list. Not all functions need the second argument, some don't even need the first one. The second argument lives in next cache line, so not dereferencing it is a performance gain. This was discovered in sendfile(2), which will be covered by next commits. The second goal of this commit is to bring even more flexibility to m_ext mbufs, allowing to create more fields in m_ext, opaque to the generic mbuf code, and potentially set and dereferenced by subsystems. Reviewed by: gallatin, kbowling Differential Revision: https://reviews.freebsd.org/D12615 Notes: svn path=/head/; revision=324446
2017-07-21Do not allow the use of the loopback interface in netmap.Luiz Otavio O Souza
The generic support in netmap send the packets using if_transmit() and the loopback do not support packets coming from if_transmit()/if_start(). This avoids the use of the loopback interface and the subsequent crash that happens when the application send packets to the loopback interface. Details in: https://github.com/luigirizzo/netmap/issues/322 Reported by: Vincenzo Maffione <v.maffione@gmail.com> Sponsored by: Rubicon Communications, LLC (Netgate) Notes: svn path=/head/; revision=321317
2017-06-12Update the current version of netmap to bring it in sync with the githubLuiz Otavio O Souza
version. This commit contains mostly refactoring, a few fixes and minor added functionality. Submitted by: Vincenzo Maffione <v.maffione at gmail.com> Requested by: many Sponsored by: Rubicon Communications, LLC (Netgate) Notes: svn path=/head/; revision=319881
2017-02-14Unbreak the gcc build of netmap.Mark Johnston
This fixes several LINT targets. Reviewed by: Vincenzo Maffione Notes: svn path=/head/; revision=313747
2017-01-12Fix panic on mb_free_ext() due to NULL destructor.Sean Bruno
This used to happen because of the SET_MBUF_DESTRUCTOR() called on unregif. Submitted by: Vincenzo Maffione <v.maffione@gmail.com> Notes: svn path=/head/; revision=311986
2016-10-18remove stale and unused code from various filesLuigi Rizzo
fix build on 32 bit platforms simplify logic in netmap_virt.h The commands (in net/netmap.h) to configure communication with the hypervisor may be revised soon. At the moment they are unused so this will not be a change of API. Notes: svn path=/head/; revision=307574
2016-10-16Import the current version of netmap, aligned with the one on github.Luigi Rizzo
This commit, long overdue, contains contributions in the last 2 years from Stefano Garzarella, Giuseppe Lettieri, Vincenzo Maffione, including: + fixes on monitor ports + the 'ptnet' virtual device driver, and ptnetmap backend, for high speed virtual passthrough on VMs (bhyve fixes in an upcoming commit) + improved emulated netmap mode + more robust error handling + removal of stale code + various fixes to code and documentation (some mixup between RX and TX parameters, and private and public variables) We also include an additional tool, nmreplay, which is functionally equivalent to tcpreplay but operating on netmap ports. Notes: svn path=/head/; revision=307394
2016-05-03sys/dev: minor spelling fixes.Pedro F. Giffuni
Most affect comments, very few have user-visible effects. Notes: svn path=/head/; revision=298955
2016-03-26Plug leak in m_unshare.Navdeep Parhar
m_unshare passes on the source mbuf's flags as-is to m_getcl and this results in a leak if the flags include M_NOFREE. The fix is to clear the bits not listed in M_COPYALL before calling m_getcl. M_RDONLY should probably be filtered out too but that's outside the scope of this fix. Add assertions in the zone_mbuf and zone_pack ctors to catch similar bugs. Update netmap_get_mbuf to not pass M_NOFREE to m_getcl. It's not clear what the original code was trying to do but it's likely incorrect. Updated code is no different functionally but it avoids the newly added assertions. Reviewed by: gnn@ Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D5698 Notes: svn path=/head/; revision=297298
2015-07-10Sync netmap sources with the version in our private tree.Luigi Rizzo
This commit contains large contributions from Giuseppe Lettieri and Stefano Garzarella, is partly supported by grants from Verisign and Cisco, and brings in the following: - fix zerocopy monitor ports and introduce copying monitor ports (the latter are lower performance but give access to all traffic in parallel with the application) - exclusive open mode, useful to implement solutions that recover from crashes of the main netmap client (suggested by Patrick Kelsey) - revised memory allocator in preparation for the 'passthrough mode' (ptnetmap) recently presented at bsdcan. ptnetmap is described in S. Garzarella, G. Lettieri, L. Rizzo; Virtual device passthrough for high speed VM networking, ACM/IEEE ANCS 2015, Oakland (CA) May 2015 http://info.iet.unipi.it/~luigi/research.html - fix rx CRC handing on ixl - add module dependencies for netmap when building drivers as modules - minor simplifications to device-specific routines (*txsync, *rxsync) - general code cleanup (remove unused variables, introduce macros to access rings and remove duplicate code, Applications do not need to be recompiled, unless of course they want to use the new features (monitors and exclusive open). Those willing to try this code on stable/10 can just update the sys/dev/netmap/*, sys/net/netmap* with the version in HEAD and apply the small patches to individual device drivers. MFC after: 1 month Sponsored by: (partly) Verisign, Cisco Notes: svn path=/head/; revision=285349
2014-11-10sync a comment with our internal repoLuigi Rizzo
Notes: svn path=/head/; revision=274353
2014-08-16Update to the current version of netmap.Luigi Rizzo
Mostly bugfixes or features developed in the past 6 months, so this is a 10.1 candidate. Basically no user API changes (some bugfixes in sys/net/netmap_user.h). In detail: 1. netmap support for virtio-net, including in netmap mode. Under bhyve and with a netmap backend [2] we reach over 1Mpps with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode. 2. (kernel) add support for multiple memory allocators, so we can better partition physical and virtual interfaces giving access to separate users. The most visible effect is one additional argument to the various kernel functions to compute buffer addresses. All netmap-supported drivers are affected, but changes are mechanical and trivial 3. (kernel) simplify the prototype for *txsync() and *rxsync() driver methods. All netmap drivers affected, changes mostly mechanical. 4. add support for netmap-monitor ports. Think of it as a mirroring port on a physical switch: a netmap monitor port replicates traffic present on the main port. Restrictions apply. Drive carefully. 5. if_lem.c: support for various paravirtualization features, experimental and disabled by default. Most of these are described in our ANCS'13 paper [1]. Paravirtualized support in netmap mode is new, and beats the numbers in the paper by a large factor (under qemu-kvm, we measured gues-host throughput up to 10-12 Mpps). A lot of refactoring and additional documentation in the files in sys/dev/netmap, but apart from #2 and #3 above, almost nothing of this stuff is visible to other kernel parts. Example programs in tools/tools/netmap have been updated with bugfixes and to support more of the existing features. This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline. A lot of this code has been contributed by my colleagues at UNIPI, including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella. MFC after: 3 days. Notes: svn path=/head/; revision=270063
2014-07-11Fix style bug: rename the refcount field of m_ext to ext_cnt, to matchGleb Smirnoff
other members. Sponsored by: Nginx, Inc. Notes: svn path=/head/; revision=268530
2014-06-10change the netmap mbuf destructor so the same code works also on FreeBSD 9.Luigi Rizzo
For head and 10 this change has no effect, but on stable/9 it would cause panics when using emulated netmap on top of a standard device driver. Notes: svn path=/head/; revision=267328
2014-06-06better handling of netmap emulation over standard device drivers:Luigi Rizzo
plug a potential mbuf leak, and detect bogus drivers that return ENOBUFS even when the packet has been queued. MFC after: 3 days Notes: svn path=/head/; revision=267180
2014-06-06move netmap_getna() to a freebsd-specific fileLuigi Rizzo
Notes: svn path=/head/; revision=267170
2014-06-06remove two debugging messages, align comments with the codeLuigi Rizzo
in our development trunk Notes: svn path=/head/; revision=267163
2014-06-02Introduce a procedural interface to the ifnet structure. The newMarcel Moolenaar
interface allows the ifnet structure to be defined as an opaque type in NIC drivers. This then allows the ifnet structure to be changed without a need to change or recompile NIC drivers. Put differently, NIC drivers can be written and compiled once and be used with different network stack implementations, provided of course that those network stack implementations have an API and ABI compatible interface. This commit introduces the 'if_t' type to replace 'struct ifnet *' as the type of a network interface. The 'if_t' type is defined as 'void *' to enable the compiler to perform type conversion to 'struct ifnet *' and vice versa where needed and without warnings. The functions that implement the API are the only functions that need to have an explicit cast. The MII code has been converted to use the driver API to avoid unnecessary code churn. Code churn comes from having to work with both converted and unconverted drivers in correlation with having callback functions that take an interface. By converting the MII code first, the callback functions can be defined so that the compiler will perform the typecasts automatically. As soon as all drivers have been converted, the if_t type can be redefined as needed and the API functions can be fix to not need an explicit cast. The immediate benefactors of this change are: 1. Juniper Networks - The network stack implementation in Junos is entirely different from FreeBSD's one and this change allows Juniper to build "stock" NIC drivers that can be used in combination with both the FreeBSD and Junos stacks. 2. FreeBSD - This change opens the door towards changing ifnet and implementing new features and optimizations in the network stack without it requiring a change in the many NIC drivers FreeBSD has. Submitted by: Anuranjan Shukla <anshukla@juniper.net> Reviewed by: glebius@ Obtained from: Juniper Networks, Inc. Notes: svn path=/head/; revision=266974
2014-02-15This new version of netmap brings you the following:Luigi Rizzo
- netmap pipes, providing bidirectional blocking I/O while moving 100+ Mpps between processes using shared memory channels (no mistake: over one hundred million. But mind you, i said *moving* not *processing*); - kqueue support (BHyVe needs it); - improved user library. Just the interface name lets you select a NIC, host port, VALE switch port, netmap pipe, and individual queues. The upcoming netmap-enabled libpcap will use this feature. - optional extra buffers associated to netmap ports, for applications that need to buffer data yet don't want to make copies. - segmentation offloading for the VALE switch, useful between VMs. and a number of bug fixes and performance improvements. My colleagues Giuseppe Lettieri and Vincenzo Maffione did a substantial amount of work on these features so we owe them a big thanks. There are some external repositories that can be of interest: https://code.google.com/p/netmap our public repository for netmap/VALE code, including linux versions and other stuff that does not belong here, such as python bindings. https://code.google.com/p/netmap-libpcap a clone of the libpcap repository with netmap support. With this any libpcap client has access to most netmap feature with no recompilation. E.g. tcpdump can filter packets at 10-15 Mpps. https://code.google.com/p/netmap-ipfw a userspace version of ipfw+dummynet which uses netmap to send/receive packets. Speed is up in the 7-10 Mpps range per core for simple rulesets. Both netmap-libpcap and netmap-ipfw will be merged upstream at some point, but while this happens it is useful to have access to them. And yes, this code will be merged soon. It is infinitely better than the version currently in 10 and 9. MFC after: 3 days Notes: svn path=/head/; revision=261909
2014-01-16netmap_user.h:Luigi Rizzo
add separate rx/tx ring indexes add ring specifier in nm_open device name netmap.c, netmap_vale.c more consistent errno numbers netmap_generic.c correctly handle failure in registering interfaces. tools/tools/netmap/ massive cleanup of the example programs (a lot of common code is now in netmap_user.h.) nm_util.[ch] are going away soon. pcap.c will also go when i commit the native netmap support for libpcap. Notes: svn path=/head/; revision=260700
2014-01-06It is 2014 and we have a new version of netmap.Luigi Rizzo
Most relevant features: - netmap emulation on any NIC, even those without native netmap support. On the ixgbe we have measured about 4Mpps/core/queue in this mode, which is still a lot more than with sockets/bpf. - seamless interconnection of VALE switch, NICs and host stack. If you disable accelerations on your NIC (say em0) ifconfig em0 -txcsum -txcsum you can use the VALE switch to connect the NIC and the host stack: vale-ctl -h valeXX:em0 allowing sharing the NIC with other netmap clients. - THE USER API HAS SLIGHTLY CHANGED (head/cur/tail pointers instead of pointers/count as before). This was unavoidable to support, in the future, multiple threads operating on the same rings. Netmap clients require very small source code changes to compile again. On the plus side, the new API should be easier to understand and the internals are a lot simpler. The manual page has been updated extensively to reflect the current features and give some examples. This is the result of work of several people including Giuseppe Lettieri, Vincenzo Maffione, Michio Honda and myself, and has been financially supported by EU projects CHANGE and OPENLAB, from NetApp University Research Fund, NEC, and of course the Universita` di Pisa. Notes: svn path=/head/; revision=260368
2013-12-15split netmap code according to functions:Luigi Rizzo
- netmap.c base code - netmap_freebsd.c FreeBSD-specific code - netmap_generic.c emulate netmap over standard drivers - netmap_mbq.c simple mbuf tailq - netmap_mem2.c memory management - netmap_vale.c VALE switch simplify devce-specific code Notes: svn path=/head/; revision=259412