path: root/sys/dev/netmap/netmap.c
Age | Commit message | Author
2016-05-17 | Don't repeat the the word 'the' | Eitan Adler
(one manual change to fix grammar) Confirmed With: db Approved by: secteam (not really, but this is a comment typo fix) Notes: svn path=/head/; revision=300050
2016-05-03 | sys/dev: minor spelling fixes. | Pedro F. Giffuni
Most affect comments, very few have user-visible effects. Notes: svn path=/head/; revision=298955
2015-12-25 | Fix typo (s/harware/hardware/) | Kevin Lo
Notes: svn path=/head/; revision=292730
2015-09-07 | Don't call enable_all_rings if the adapter has been freed. | Adrian Chadd
This is a subtle use-after-free race that results in some very undesirable hang behaviour. Reviewed by: pkelsey Obtained from: Kip Macy, NextBSD (https://github.com/NextBSD/NextBSD/commit/91a9bd1dbb33dafb41684d054e59d73976de9654) Notes: svn path=/head/; revision=287543
2015-07-19 | add a use count so the netmap module cannot be unloaded while in use. | Luigi Rizzo
Notes: svn path=/head/; revision=285699
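The idea of a use count that blocks unloading can be sketched in plain C. This is only an illustration: `struct toy_module`, `toy_open()`, `toy_close()` and `toy_unload()` are hypothetical names, and the log does not show how the kernel code actually implements the count.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a loadable module; not the netmap code. */
struct toy_module {
	int use_count;	/* number of currently open handles */
	int loaded;	/* 1 while the module is resident */
};

/* Opening a handle takes a reference on the module. */
static int
toy_open(struct toy_module *m)
{
	if (!m->loaded)
		return (ENXIO);
	m->use_count++;
	return (0);
}

/* Closing a handle drops the reference taken by toy_open(). */
static void
toy_close(struct toy_module *m)
{
	assert(m->use_count > 0);
	m->use_count--;
}

/* Unload is refused with EBUSY while any handle is still open. */
static int
toy_unload(struct toy_module *m)
{
	if (m->use_count > 0)
		return (EBUSY);
	m->loaded = 0;
	return (0);
}
```

With one handle open, `toy_unload()` returns `EBUSY`; once the handle is closed the unload succeeds.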
2015-07-10 | staticize functions only used in netmap.c | Luigi Rizzo
(detected by jenkins run with gcc 4.9)
Update documentation on the use of netmap_priv_d, rename the refcount and use the same structure in FreeBSD and linux.
No functional changes.
Notes: svn path=/head/; revision=285359
2015-07-10 | Sync netmap sources with the version in our private tree. | Luigi Rizzo
This commit contains large contributions from Giuseppe Lettieri and Stefano Garzarella, is partly supported by grants from Verisign and Cisco, and brings in the following:
- fix zerocopy monitor ports and introduce copying monitor ports (the latter are lower performance but give access to all traffic in parallel with the application)
- exclusive open mode, useful to implement solutions that recover from crashes of the main netmap client (suggested by Patrick Kelsey)
- revised memory allocator in preparation for the 'passthrough mode' (ptnetmap) recently presented at bsdcan. ptnetmap is described in S. Garzarella, G. Lettieri, L. Rizzo; Virtual device passthrough for high speed VM networking, ACM/IEEE ANCS 2015, Oakland (CA) May 2015 http://info.iet.unipi.it/~luigi/research.html
- fix rx CRC handling on ixl
- add module dependencies for netmap when building drivers as modules
- minor simplifications to device-specific routines (*txsync, *rxsync)
- general code cleanup (remove unused variables, introduce macros to access rings and remove duplicate code)
Applications do not need to be recompiled, unless of course they want to use the new features (monitors and exclusive open).
Those willing to try this code on stable/10 can just update the sys/dev/netmap/*, sys/net/netmap* with the version in HEAD and apply the small patches to individual device drivers.
MFC after: 1 month
Sponsored by: (partly) Verisign, Cisco
Notes: svn path=/head/; revision=285349
2015-04-11 | netmap: improve the netmap attach message on FreeBSD. | Rui Paulo
MFC after: 1 week Notes: svn path=/head/; revision=281406
2015-02-14 | two minor changes from the master netmap version: | Luigi Rizzo
1. handle errors from nm_config(), if any (none of the FreeBSD drivers currently returns an error on this function, so this change is a no-op at this time)
2. use a full memory barrier on ioctls
Notes: svn path=/head/; revision=278774
2015-02-14 | whitespace change: | Luigi Rizzo
clarify the role of MAKEDEV_ETERNAL_KLD, and remove an old #ifdef __FreeBSD__ since the code is valid on all platforms. Notes: svn path=/head/; revision=278773
2015-01-24 | Change the permissions from 0660 to 0600. | Adrian Chadd
Otherwise people in wheel can do things with netmap, including but not limited to promisc transmit/receive. Approved by: luigi MFC after: 1 week Notes: svn path=/head/; revision=277653
2014-11-13 | add support for private knote lock (reduces lock contention), | Luigi Rizzo
adapting OS_selrecord accordingly. Problem and fix suggested by adrian and jmg Notes: svn path=/head/; revision=274459
2014-09-25 | fix a panic when passing ifioctl from a netmap file descriptor to | Luigi Rizzo
the underlying device. This needs to be merged to 10.1 Reported by: Patrick Kelsey MFC after: 3 days Notes: svn path=/head/; revision=272111
2014-08-16 | Update to the current version of netmap. | Luigi Rizzo
Mostly bugfixes or features developed in the past 6 months, so this is a 10.1 candidate. Basically no user API changes (some bugfixes in sys/net/netmap_user.h). In detail:
1. netmap support for virtio-net, including in netmap mode. Under bhyve and with a netmap backend [2] we reach over 1Mpps with standard APIs (e.g. libpcap), and 5-8 Mpps in netmap mode.
2. (kernel) add support for multiple memory allocators, so we can better partition physical and virtual interfaces giving access to separate users. The most visible effect is one additional argument to the various kernel functions to compute buffer addresses. All netmap-supported drivers are affected, but changes are mechanical and trivial.
3. (kernel) simplify the prototype for *txsync() and *rxsync() driver methods. All netmap drivers affected, changes mostly mechanical.
4. add support for netmap-monitor ports. Think of it as a mirroring port on a physical switch: a netmap monitor port replicates traffic present on the main port. Restrictions apply. Drive carefully.
5. if_lem.c: support for various paravirtualization features, experimental and disabled by default. Most of these are described in our ANCS'13 paper [1]. Paravirtualized support in netmap mode is new, and beats the numbers in the paper by a large factor (under qemu-kvm, we measured guest-host throughput up to 10-12 Mpps).
A lot of refactoring and additional documentation in the files in sys/dev/netmap, but apart from #2 and #3 above, almost nothing of this stuff is visible to other kernel parts.
Example programs in tools/tools/netmap have been updated with bugfixes and to support more of the existing features.
This is meant to go into 10.1 so we plan an MFC before the Aug.22 deadline.
A lot of this code has been contributed by my colleagues at UNIPI, including Giuseppe Lettieri, Vincenzo Maffione, Stefano Garzarella.
MFC after: 3 days.
Notes: svn path=/head/; revision=270063
2014-06-09 | Fixes from Franco Fichtner on transparent mode | Luigi Rizzo
* The way rings are updated changed with the last API bump. Also sync ->head when moving slots in netmap_sw_to_nic().
* Remove a crashing selrecord() call.
* Unclog the logic surrounding netmap_rxsync_from_host().
* Add timestamping to RX host ring.
* Remove a couple of obsolete comments.
Submitted by: Franco Fichtner
MFC after: 3 days
Sponsored by: Packetwerk
Notes: svn path=/head/; revision=267284
2014-06-06 | introduce mbq_lock() and mbq_unlock() for the mbq, | Luigi Rizzo
so it is easier to build the same code on linux (this generalizes the change in svn 267142) MFC after: 3 days Notes: svn path=/head/; revision=267177
2014-06-06 | align comments with the ones in our development trunk | Luigi Rizzo
Notes: svn path=/head/; revision=267165
2014-06-06 | prevent a panic when the netdev/ifp is not set in attach | Luigi Rizzo
(internal c63a7b85) MFC after: 3 days Notes: svn path=/head/; revision=267150
2014-06-06 | Use mtx_lock_spin/mtx_unlock_spin primitives on spin lock | Andrey Zonov
Reviewed by: luigi MFC after: 1 week Notes: svn path=/head/; revision=267142
2014-06-05 | whitespace change: remove trailing whitespace | Luigi Rizzo
Notes: svn path=/head/; revision=267128
2014-02-18 | two small changes: | Luigi Rizzo
- intercept FIONBIO and FIOASYNC ioctls on netmap file descriptors. libpcap calls them to set non blocking I/O on the file descriptor, for netmap this is a no-op because there is no read/write, but not intercepting would cause fcntl() to return -1
- rate limit and put under netmap.verbose some messages that occur when threads use concurrently the same file descriptor.
Notes: svn path=/head/; revision=262149
2014-02-15 | This new version of netmap brings you the following: | Luigi Rizzo
- netmap pipes, providing bidirectional blocking I/O while moving 100+ Mpps between processes using shared memory channels (no mistake: over one hundred million. But mind you, i said *moving* not *processing*);
- kqueue support (BHyVe needs it);
- improved user library. Just the interface name lets you select a NIC, host port, VALE switch port, netmap pipe, and individual queues. The upcoming netmap-enabled libpcap will use this feature.
- optional extra buffers associated to netmap ports, for applications that need to buffer data yet don't want to make copies.
- segmentation offloading for the VALE switch, useful between VMs.
and a number of bug fixes and performance improvements.
My colleagues Giuseppe Lettieri and Vincenzo Maffione did a substantial amount of work on these features so we owe them a big thanks.
There are some external repositories that can be of interest:
https://code.google.com/p/netmap
  our public repository for netmap/VALE code, including linux versions and other stuff that does not belong here, such as python bindings.
https://code.google.com/p/netmap-libpcap
  a clone of the libpcap repository with netmap support. With this any libpcap client has access to most netmap features with no recompilation. E.g. tcpdump can filter packets at 10-15 Mpps.
https://code.google.com/p/netmap-ipfw
  a userspace version of ipfw+dummynet which uses netmap to send/receive packets. Speed is up in the 7-10 Mpps range per core for simple rulesets.
Both netmap-libpcap and netmap-ipfw will be merged upstream at some point, but while this happens it is useful to have access to them.
And yes, this code will be merged soon. It is infinitely better than the version currently in 10 and 9.
MFC after: 3 days
Notes: svn path=/head/; revision=261909
2014-01-16 | netmap_user.h: | Luigi Rizzo
  add separate rx/tx ring indexes
  add ring specifier in nm_open device name
netmap.c, netmap_vale.c
  more consistent errno numbers
netmap_generic.c
  correctly handle failure in registering interfaces.
tools/tools/netmap/
  massive cleanup of the example programs (a lot of common code is now in netmap_user.h.)
nm_util.[ch] are going away soon. pcap.c will also go when i commit the native netmap support for libpcap.
Notes: svn path=/head/; revision=260700
2014-01-09 | Fix build with VIMAGE. | Gleb Smirnoff
Notes: svn path=/head/; revision=260462
2014-01-07 | fix use after free when releasing a netmap adapter. | Luigi Rizzo
Submitted by: Giuseppe Lettieri Notes: svn path=/head/; revision=260411
2014-01-06 | It is 2014 and we have a new version of netmap. | Luigi Rizzo
Most relevant features:
- netmap emulation on any NIC, even those without native netmap support. On the ixgbe we have measured about 4Mpps/core/queue in this mode, which is still a lot more than with sockets/bpf.
- seamless interconnection of VALE switch, NICs and host stack. If you disable accelerations on your NIC (say em0)
    ifconfig em0 -txcsum -txcsum
  you can use the VALE switch to connect the NIC and the host stack:
    vale-ctl -h valeXX:em0
  allowing sharing the NIC with other netmap clients.
- THE USER API HAS SLIGHTLY CHANGED (head/cur/tail pointers instead of pointers/count as before). This was unavoidable to support, in the future, multiple threads operating on the same rings. Netmap clients require very small source code changes to compile again. On the plus side, the new API should be easier to understand and the internals are a lot simpler.
The manual page has been updated extensively to reflect the current features and give some examples.
This is the result of work of several people including Giuseppe Lettieri, Vincenzo Maffione, Michio Honda and myself, and has been financially supported by EU projects CHANGE and OPENLAB, from NetApp University Research Fund, NEC, and of course the Universita` di Pisa.
Notes: svn path=/head/; revision=260368
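The head/cur/tail scheme mentioned in this entry can be illustrated with a toy ring. This is a sketch under assumptions, not the definitions from the netmap headers: `struct toy_ring` and the `toy_ring_*` helpers are hypothetical, and the ownership rule used here (the application may use slots from `cur` up to, but not including, `tail`) is the assumed reading of the three indexes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified ring: only the three indexes and the size. */
struct toy_ring {
	uint32_t num_slots;	/* total slots in the circular ring */
	uint32_t head;		/* first slot owned by the application */
	uint32_t cur;		/* application's current position */
	uint32_t tail;		/* first slot still owned by the kernel */
};

/* Index following i, wrapping at the end of the ring. */
static uint32_t
toy_ring_next(const struct toy_ring *r, uint32_t i)
{
	return (i + 1 == r->num_slots ? 0 : i + 1);
}

/* Slots the application may still use: distance from cur to tail. */
static uint32_t
toy_ring_space(const struct toy_ring *r)
{
	int64_t n = (int64_t)r->tail - (int64_t)r->cur;

	if (n < 0)
		n += r->num_slots;
	return ((uint32_t)n);
}

/* Walk up to max slots, then release them by moving head up to cur. */
static uint32_t
toy_ring_consume(struct toy_ring *r, uint32_t max)
{
	uint32_t done = 0;

	assert(r->cur < r->num_slots && r->tail < r->num_slots);
	while (done < max && r->cur != r->tail) {
		/* a real client would process slot r->cur here */
		r->cur = toy_ring_next(r, r->cur);
		done++;
	}
	r->head = r->cur;
	return (done);
}
```

Compared with a pointer-plus-count interface, nothing here needs a separate "available" counter: the space is always derived from the indexes themselves.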
2013-12-18 | Fix build. | Gleb Smirnoff
Notes: svn path=/head/; revision=259538
2013-12-15 | split netmap code according to functions: | Luigi Rizzo
- netmap.c base code
- netmap_freebsd.c FreeBSD-specific code
- netmap_generic.c emulate netmap over standard drivers
- netmap_mbq.c simple mbuf tailq
- netmap_mem2.c memory management
- netmap_vale.c VALE switch
simplify device-specific code
Notes: svn path=/head/; revision=259412
2013-11-06 | remove a debugging message | Luigi Rizzo
Notes: svn path=/head/; revision=257758
2013-11-05 | remove some test code. | Luigi Rizzo
Notes: svn path=/head/; revision=257666
2013-11-05 | fix a bug when a device has 1 tx (or rx) queue and more than | Luigi Rizzo
one queue of a different type. Submitted by: Vincenzo Maffione MFC after: 3 days Notes: svn path=/head/; revision=257665
2013-11-05 | check errors on return from netmap_attach() | Luigi Rizzo
Submitted by: Giuseppe Lettieri MFC after: 3 days Notes: svn path=/head/; revision=257664
2013-11-02 | circumvent a couple of warnings: | Luigi Rizzo
- on line 2550 intentionally overriding a const qualifier
- on line 3219 intentionally converting uint64_t to a pointer
Notes: svn path=/head/; revision=257550
2013-11-01 | update to the latest netmap snapshot. | Luigi Rizzo
This includes the following:
- use separate memory regions for VALE ports
- locking fixes
- some simplifications in the NIC-specific routines
- performance improvements for the VALE switch
- some new features in the pkt-gen test program
- documentation updates
There are small API changes that require programs to be recompiled (NETMAP_API has been bumped so you will detect old binaries at runtime). In particular:
- struct netmap_slot now is 16 bytes to support an extra pointer, which may save one data copy when using VALE ports or VMs;
- the struct netmap_if has two extra fields;
MFC after: 3 days
Notes: svn path=/head/; revision=257529
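As a rough illustration of how a slot with an extra pointer-sized field comes to 16 bytes, consider the layout below. `struct toy_slot` and its field names are assumptions made for this sketch; the authoritative definition of struct netmap_slot lives in the netmap headers and is not reproduced in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout for illustration only. */
struct toy_slot {
	uint32_t buf_idx;	/* index of the packet buffer */
	uint16_t len;		/* packet length in bytes */
	uint16_t flags;		/* per-slot flags */
	uint64_t ptr;		/* the extra pointer-sized field */
};

/* 4 + 2 + 2 + 8 bytes, with no padding between the fields. */
static_assert(sizeof(struct toy_slot) == 16, "slot should be 16 bytes");
```

A size change of this kind alters the ring layout seen by applications, which is consistent with the entry's note that programs have to be recompiled.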
2013-10-26 | The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare | Gleb Smirnoff
for this event, adding if_var.h to files that do need it. Also, add all the includes that until now were pulled in implicitly via if_var.h pollution. Sponsored by: Netflix Sponsored by: Nginx, Inc. Notes: svn path=/head/; revision=257176
2013-06-05 | - fix a bug in the previous commit that was dropping the last packet | Luigi Rizzo
from each batch flowing on the VALE switch
- feature: add glue for 'indirect' buffers on the sender side: if a slot has NS_INDIRECT set, the netmap buffer contains pointer(s) to the actual userspace buffers, which are accessed with copyin(). The feature is not finalised yet, as it will likely need to deal with some iovec variant for proper scatter/gather support. This will save one copy for clients (e.g. qemu) that cannot use the netmap buffer directly.
A curiosity: on amd64 copyin() appears to be 10-15% faster than pkt_copy() or bcopy() at least for sizes of 256 and greater.
Notes: svn path=/head/; revision=251425
2013-05-30 | Bring in a number of new features, mostly implemented by Michio Honda: | Luigi Rizzo
- the VALE switch now supports up to 254 destinations per switch, unicast or broadcast (multicast goes to all ports).
- we can attach hw interfaces and the host stack to a VALE switch, which means we will be able to use it more or less as a native bridge (minor tweaks still necessary). A 'vale-ctl' program is supplied in tools/tools/netmap to attach/detach ports to the switch, and list current configuration.
- the lookup function in the VALE switch can be reassigned to something else, similar to the pf hooks. This will enable attaching the firewall, or other processing functions (e.g. in-kernel openvswitch) directly on the netmap port.
The internal API used by device drivers does not change.
Userspace applications should be recompiled because we bump NETMAP_API as we now use some fields in the struct nmreq that were previously ignored -- otherwise, data structures are the same.
Manpages will be committed separately.
Notes: svn path=/head/; revision=251139
2013-05-02 | remove trailing whitespace | Luigi Rizzo
Notes: svn path=/head/; revision=250184
2013-04-30 | Partial cleanup in preparation for upcoming changes: | Luigi Rizzo
- netmap_rx_irq()/netmap_tx_irq() can now be called by FreeBSD drivers hiding the logic for handling NIC interrupts in netmap mode. This also simplifies the case of NICs attached to VALE switches. Individual drivers will be updated with separate commits.
- use the same refcount() API for FreeBSD and linux
- plus some comments, typos and formatting fixes
Portions contributed by Michio Honda
Notes: svn path=/head/; revision=250107
2013-04-29 | whitespace changes: | Luigi Rizzo
remove $Id$ lines, and add blank lines around some #if / #elif /#endif Notes: svn path=/head/; revision=250052
2013-04-19 | mostly whitespace changes: | Luigi Rizzo
- remove vestiges of the old memory allocator
- clean up some comments
Notes: svn path=/head/; revision=249659
2013-03-09 | Switch the vm_object mutex to be a rwlock. This will enable in the | Attilio Rao
future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes.
The change is mostly mechanical but a few notes are reported:
* The KPI changes as follows:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details)
  - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. For this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs.
The KPI is heavily broken by this commit. Third-party ports must be updated accordingly (I can think off-hand of VirtualBox, for example).
Sponsored by: EMC / Isilon storage division
Reviewed by: jeff
Reviewed by: pjd (ZFS specific review)
Discussed with: alc
Tested by: pho
Notes: svn path=/head/; revision=248084
2013-01-23 | Add support for transparent mode while in netmap. | Luigi Rizzo
By setting dev.netmap.fwd=1 (or enabling the feature with a per-ring flag), packets are forwarded between the NIC and the host stack unless the netmap client clears the NS_FORWARD flag on the individual descriptors. This feature greatly simplifies applications where some traffic (think of ARP, control traffic, ssh sessions...) must be processed by the host stack, whereas the bulk is handled by the netmap process which simply (un)marks packets that should not be forwarded. The default is chosen so that now a netmap receiver operates in a mode very similar to bpf. Of course there is no free lunch: traffic to/from the host stack still operates at OS speed (or less, as there is one extra copy in one direction). HOWEVER, since traffic goes to the user process before being reinjected, and reinjection occurs in a user context, you get some form of livelock protection for free. Notes: svn path=/head/; revision=245836
2013-01-23 | control some debugging messages with dev.netmap.verbose | Luigi Rizzo
add infrastructure to adapt to changes in number of queues and buffers at runtime Notes: svn path=/head/; revision=245835
2012-10-19 | Fix build. | Gleb Smirnoff
Notes: svn path=/head/; revision=241723
2012-10-19 | This is an import of code, mostly from Giuseppe Lettieri, | Luigi Rizzo
that revises the netmap memory allocator so that the various parameters (number and size of buffers, rings, descriptors) can be modified at runtime through sysctl variables. The changes become effective when no netmap clients are active.
The API is mostly unchanged, although the NIOCUNREGIF ioctl now does not bring the interface back to normal mode; you need to close the file descriptor for that. This change was necessary to track who is using the mapped region, and since it is a simplification of the API there was no incentive in trying to preserve NIOCUNREGIF. We will remove the ioctl from the kernel next time we need a real API change (and version bump).
Among other things, buffer allocation when opening devices is now much faster: it used to take O(N^2) time, now it is linear.
Submitted by: Giuseppe Lettieri
Notes: svn path=/head/; revision=241719
2012-08-09 | Improve lock and unlock symmetry | Ed Maste
- Move destruction of per-ring locks to netmap_dtor_locked to mirror the initialization that happens in NIOCREGIF. Otherwise unloading a netmap-capable interface that was never put into netmap mode would try to mtx_destroy an uninitialized mutex, and panic.
- Destroy core_lock in netmap_detach, mirroring init in netmap_attach.
- Also comment out the knlist_destroy for now as there is currently no knlist_init.
Sponsored by: ADARA Networks
Reviewed by: luigi@
Notes: svn path=/head/; revision=239149
2012-08-08 | Fix whitespace (missing newline) | Ed Maste
Notes: svn path=/head/; revision=239141
2012-08-08 | Clarify comments about number of tx / rx rings | Ed Maste
Notes: svn path=/head/; revision=239140
2012-08-02 | fix some signed/unsigned warnings in the netmap code. | Luigi Rizzo
Unfortunately the original drivers still have a lot of sign conversion/comparison warnings. Notes: svn path=/head/; revision=238985