path: root/lib/libvmmapi
Age | Commit message | Author
2025-12-17 | vmm: Add ability to destroy VMs on close | Bojan Novković
This change adds the ability to tie a virtual machine's lifecycle to a /dev/vmmctl file descriptor. A user can request `vmmctl` to destroy a virtual machine on close using the `VMMCTL_CREATE_DESTROY_ON_CLOSE` flag when creating the virtual machine. `vmmctl` tracks such virtual machines in per-descriptor lists. Differential Revision: https://reviews.freebsd.org/D53729 Reviewed by: markj Sponsored by: The FreeBSD Foundation Sponsored by: Klara, Inc. MFC after: 3 months
2025-07-27 | libvmmapi: Add support for setting up and configuring guest NUMA domains | Bojan Novković
This patch reworks libvmmapi to provide support for emulating NUMA domains in guests. More specifically, it reworks 'vm_setup_memory' to set up system memory segments for each guest NUMA domain. An emulated NUMA domain is described by a 'struct vmdom' in vmmapi.h. Aside from its size in bytes, each domain can be configured to use a specific domainset(9) policy and domain mask. 'vm_setup_memory' now takes two additional arguments - an array of struct vmdoms and the array's size. It then proceeds to set up a memory segment for each specified domain using the existing memory mapping scheme. If no domain info is passed, the memory setup falls back to the original, non-NUMA behaviour. Differential Revision: https://reviews.freebsd.org/D44566 Reviewed by: markj
2025-07-27 | vmm: Add support for guest NUMA emulation | Bojan Novković
This change adds the necessary kernelspace bits required for supporting NUMA domains in bhyve VMs. The layout of system memory segments and how they're created has been reworked. Each guest NUMA domain will now have its own memory segment. Furthermore, this change allows users to tweak the domain's backing vm_object domainset(9) policy. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D44565
2025-02-06 | libvmmapi: Fix auto-loading of vmm.ko | Mark Johnston
- We should autoload vmm.ko when creating a VM with vm_openf(), to preserve behaviour prior to commit 99127fd10362. - kldload(2) returns a non-zero value upon success, so the existing code was wrong. Reviewed by: jhb Reported by: olivier Fixes: 99127fd10362 ("libvmmapi: Use the vmmctl device file to create and destroy VMs") Differential Revision: https://reviews.freebsd.org/D48797
2024-12-17 | riscv vmm: add SSTC extension check. | Ruslan Bukin
Check if RISC-V SSTC is available and advertise it to the guest. This is needed for Eswin EIC7700, which does not include SSTC. Since we don't have a mechanism for reporting extension presence from the kernel to userspace, use vm_cap_type for now. Reviewed by: mhorne, markj Differential Revision: https://reviews.freebsd.org/D48058
2024-11-05 | libvmmapi: Use the vmmctl device file to create and destroy VMs | Mark Johnston
This deprecates the vm_create() and vm_open() interfaces and introduces vm_openf(), which takes flags controlling its behaviour. In particular, it will optionally create a VM first, and it can optionally reinitialize an existing VM. This enables some simplification of existing consumers. Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D47030
2024-10-31 | bhyve/riscv: Initial import. | Ruslan Bukin
Add machine-dependent parts for bhyve hypervisor to support virtualization on RISC-V ISA. No objection: markj Sponsored by: UK Research and Innovation Differential Revision: https://reviews.freebsd.org/D45512
2024-04-10 | libvmmapi: Conditionalize compilation of some functions | Mark Johnston
Hide definitions of several functions that currently don't have implementations in the arm64 vmm port. In particular, add a WITH_VMMAPI_SNAPSHOT preprocessor variable that can be used to enable compilation of save/restore functions, and conditionalize compilation of some functions only used by amd64 bhyve. If in the long term they remain amd64-only, they can move to vmmapi_machdep.c, but for now it's not clear to me that that's the right thing to do. MFC after: 2 weeks Sponsored by: Innovate UK
2024-04-10 | libvmmapi: Zero out the structure passed to VM_GET_MEMSEG | Mark Johnston
Avoid assuming that the kernel zeros the name buffer; it does not do this for zero-length segments. MFC after: 2 weeks Sponsored by: Innovate UK
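The fix described above can be illustrated with a minimal sketch. The request structure, its field sizes, and `fake_get_memseg()` are hypothetical stand-ins for the real ioctl and are not taken from libvmmapi; the point is only that zeroing the request first makes an unwritten name read back as an empty string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SEG_NAME_MAX 16	/* hypothetical size, for illustration only */

/* Hypothetical request structure; the real VM_GET_MEMSEG layout may differ. */
struct memseg_req {
	int	segid;
	size_t	len;
	char	name[SEG_NAME_MAX];
};

/*
 * Stand-in for the kernel side: it writes the name only for segments
 * with a non-zero length, mirroring the behaviour the commit describes.
 */
static int
fake_get_memseg(struct memseg_req *req)
{
	if (req->segid == 0) {
		req->len = 4096;
		snprintf(req->name, sizeof(req->name), "sysmem");
	} else {
		req->len = 0;	/* name buffer deliberately left untouched */
	}
	return (0);
}

static int
get_memseg(int segid, size_t *lenp, char *name, size_t namesiz)
{
	struct memseg_req req;

	/* Zero the whole request so that an unwritten name reads as "". */
	memset(&req, 0, sizeof(req));
	req.segid = segid;
	if (fake_get_memseg(&req) != 0)
		return (-1);
	*lenp = req.len;
	snprintf(name, namesiz, "%s", req.name);
	return (0);
}
```

Without the `memset()`, the copy out of `req.name` for a zero-length segment would read uninitialized stack memory.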
2024-04-10 | libvmmapi: Make vm_raise_msi() a common function | Mark Johnston
Currently, bhyve PCI emulation uses vm_lapic_msi() to raise an MSI in the guest. The arm64 port has a similar function, vm_raise_msi(). Add vm_raise_msi() on amd64 as well and have it simply call vm_lapic_msi() so that bhyve can use a common, generically named function. Reviewed by: corvink, andrew, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41752
2024-04-10 | libvmmapi: Add arm64 support | Mark Johnston
- Define wrappers for some MD ioctls. - Provide a list of vmm device ioctls for cap_ioctl_limit(). - Disable use of the lowmem region. Reviewed by: corvink MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41005
2024-04-10 | libvmmapi: Make memory segment handling a bit more abstract | Mark Johnston
libvmmapi leaves a hole at [3GB, 4GB) in the guest physical address space. This hole is not used in the arm64 port, which maps everything above 4GB. This change makes the code a bit more general to accommodate arm64 more naturally. In particular: - Remove vm_set_lowmem_limit(): it is unused and doesn't have well-defined constraints, e.g., nothing prevents a consumer from setting a lowmem limit above the highmem base. - Define a constant for the highmem base and use that everywhere that the base is currently hard-coded. - Make the lowmem limit a compile-time constant instead of a vmctx field. - Store segment info in an array. - Add vm_get_highmem_base(), for use in bhyve since the current value is hard-coded in some places. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41004
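As a rough sketch of the layout this message describes, the following splits a guest memory size around the [3GB, 4GB) hole using compile-time constants and an array of segments. The constant names, the `struct seg` type, and `layout_memory()` are assumptions made for illustration and are not the library's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GB		(1024ULL * 1024 * 1024)
/* Assumed values, taken from the [3GB, 4GB) hole described above. */
#define LOWMEM_LIMIT	(3 * GB)
#define HIGHMEM_BASE	(4 * GB)

/* Segment info kept in an array instead of separate context fields. */
enum { SEG_LOWMEM, SEG_HIGHMEM, SEG_COUNT };

struct seg {
	uint64_t base;	/* guest physical address where the segment starts */
	uint64_t size;	/* size in bytes, 0 if the segment is unused */
};

static void
layout_memory(uint64_t memsize, struct seg segs[SEG_COUNT])
{
	uint64_t low;

	/* Memory below the limit goes to lowmem; the remainder starts at 4GB. */
	low = memsize < LOWMEM_LIMIT ? memsize : LOWMEM_LIMIT;
	segs[SEG_LOWMEM].base = 0;
	segs[SEG_LOWMEM].size = low;
	segs[SEG_HIGHMEM].base = HIGHMEM_BASE;
	segs[SEG_HIGHMEM].size = memsize - low;
}
```

With these assumed constants, a 2GB guest fits entirely in lowmem, while an 8GB guest gets 3GB of lowmem and 5GB starting at the highmem base. A port that maps everything above 4GB, as the arm64 port is described as doing, could set the lowmem limit to zero under the same scheme.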
2024-04-10 | libvmmapi: Move PCI passthrough ioctl wrappers into a separate file | Mark Johnston
The arm64 port doesn't implement PCI passthrough and in particular doesn't define the ioctls used by these wrappers. It might be that the ppt ioctl interface will require modification to support arm64. Until that's sorted out one way or another, put this code in a separate file so that it's easy to conditionally compile. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41003
2024-04-10 | libvmmapi: Move more amd64-specific ioctl wrappers to vmmapi_machdep.c | Mark Johnston
No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41002
2024-04-10 | libvmmapi: Split the ioctl list into MI and MD lists | Mark Johnston
To enable use in capability mode, libvmmapi needs a list of all the ioctls that might be invoked on the vmm device handle. Some of these ioctls are amd64-specific. Move the ioctl list to vmmapi_machdep.c and define a list of MI ioctls so that the arm64 port can build its own list without duplicating common ioctls. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41001
2024-04-10 | libvmmapi: Move VM capability names to vmmapi_machdep.c | Mark Johnston
Add some missing entries while here. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D41000
2024-04-10 | libvmmapi: Move some ioctl wrappers to vmmapi_machdep.c | Mark Johnston
ioctls relating to segments and various x86-specific interrupt controllers are easy candidates to move to vmmapi_machdep.c. In vmmapi.h I'm just ifdefing MD prototypes for now. We could instead split vmmapi.h into multiple headers, e.g., vmmapi.h and vmmapi_machdep.h, but it's not obvious to me yet that that's the right approach. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D40999
2024-04-10 | libvmmapi: Add a subdirectory for amd64-specific code | Mark Johnston
Move vmmapi_freebsd.c there. It contains x86-specific code used only by bhyveload(8). Move vcpu_reset() into vmmapi_machdep.c. It is also x86-specific. No functional change intended. Reviewed by: corvink, jhb MFC after: 2 weeks Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D40998
2024-04-08 | libvmmapi: add missing capability strings | Rob Norris
Signed-off-by: Rob Norris <robn@despairlabs.com> Reviewed by: markj MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D44642
2023-11-26 | lib: Automated cleanup of cdefs and other formatting | Warner Losh
Apply the following automated changes to try to eliminate no-longer-needed sys/cdefs.h includes as well as now-empty blank lines in a row. Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/ Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/ Remove /\n+#if.*\n#endif.*\n+/ Remove /^#if.*\n#endif.*\n/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/ Sponsored by: Netflix
2023-08-16 | Remove $FreeBSD$: one-line sh pattern | Warner Losh
Remove /^\s*#[#!]?\s*\$FreeBSD\$.*$\n/
2023-08-16 | Remove $FreeBSD$: one-line .c pattern | Warner Losh
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 | Remove $FreeBSD$: two-line .h pattern | Warner Losh
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-06-08 | libvmmapi: Remove some unneeded includes | Mark Johnston
These are amd64-specific and so can't be used when targeting arm64, but they don't appear to be needed. No functional change intended. MFC after: 1 week Sponsored by: The FreeBSD Foundation
2023-05-23 | vmm: Avoid embedding cpuset_t ioctl ABIs | Mark Johnston
Commit 0bda8d3e9f7a ("vmm: permit some IPIs to be handled by userspace") embedded cpuset_t into the vmm(4) ioctl ABI. This was a mistake since we otherwise have some leeway to change the cpuset_t for the whole system, but we want to keep the vmm ioctl ABI stable. Rework IPI reporting to avoid this problem. Along the way, make VM_RUN a bit more efficient: - Split vmexit metadata out of the main VM_RUN structure. This data is only written by the kernel. - Have userspace pass a cpuset_t pointer and cpusetsize in the VM_RUN structure, as is done for cpuset syscalls. - Have the destination CPU mask for VM_EXITCODE_IPIs live outside the vmexit info structure, and make VM_RUN copy it out separately. Zero out any extra bytes in the CPU mask, like cpuset syscalls do. - Modify the vmexit handler prototype to take a full VM_RUN structure. PR: 271330 Reviewed by: corvink, jhb (previous versions) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D40113
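The "pointer plus size" convention mentioned above can be sketched as follows: the producer copies out its own mask and zeroes any extra bytes in the caller's larger buffer, so the two sides do not need to agree on a fixed mask size. `copyout_cpumask()` is a hypothetical helper written for this illustration, and returning `ERANGE` for a buffer that is too small is an assumption here, not something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a producer-side CPU mask of ksize bytes into a caller-supplied
 * buffer of usize bytes. Bytes beyond ksize are zeroed so the caller
 * never sees stale data in the tail of a larger mask.
 */
static int
copyout_cpumask(const unsigned char *kmask, size_t ksize,
    unsigned char *umask, size_t usize)
{
	/* Assumed policy: refuse a buffer that cannot hold the whole mask. */
	if (usize < ksize)
		return (ERANGE);
	memcpy(umask, kmask, ksize);
	memset(umask + ksize, 0, usize - ksize);
	return (0);
}
```

Because the size travels with the pointer, the mask type can grow on one side without changing the layout of the request structure itself.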
2023-05-12 | spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD | Warner Losh
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix
2023-04-18 | Update/fix Makefile.depend for userland | Simon J. Gerraty
2023-03-24 | bhyve: Remove vmctx member from struct vm_snapshot_meta. | John Baldwin
This is a userland-only pointer that isn't relevant to the kernel and doesn't belong in the ioctl structure shared between userland and the kernel. For the kernel, the old structure for the ioctl is still supported under COMPAT_FREEBSD13. This changes vm_snapshot_req() in libvmmapi to accept an explicit vmctx argument. It also changes vm_snapshot_guest2host_addr to take an explicit vmctx argument. As part of this change, move the declaration for this function and its wrapper macro from vmm_snapshot.h to snapshot.h as it is a userland-only API. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D38125
2023-03-24 | libvmmapi: Add a struct vcpu and use it in most APIs. | John Baldwin
This replaces the 'struct vm, int vcpuid' tuple passed to most API calls and is similar to the changes recently made in vmm(4) in the kernel. struct vcpu is an opaque type managed by libvmmapi. For now it stores a pointer to the VM context and an integer id. As an immediate effect this removes the divergence between the kernel and userland for the instruction emulation code introduced by the recent vmm(4) changes. Since this is a major change to the vmmapi API, bump VMMAPI_VERSION to 0x200 (2.0) and the shared library major version. While here (and since the major version is bumped), remove unused vcpu argument from vm_setup_pptdev_msi*(). Add new functions vm_suspend_all_cpus() and vm_resume_all_cpus() for use by the debug server. The underlying ioctl (which uses a vcpuid of -1) remains unchanged, but the userlevel API now uses separate functions for global CPU suspend/resume. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D38124
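The handle pattern described above might look roughly like this. All `demo_` names are hypothetical and chosen so as not to be mistaken for the library's API; the sketch only shows a handle that bundles a context pointer with an integer id, as the message says struct vcpu does for now.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the library's VM context. */
struct demo_vmctx {
	const char *name;
};

/*
 * In a real library only a forward declaration would appear in the
 * public header, keeping the type opaque to consumers. The definition
 * is shown here so that the sketch is self-contained.
 */
struct demo_vcpu {
	struct demo_vmctx *ctx;	/* owning VM context */
	int vcpuid;		/* index of this vCPU within the VM */
};

static struct demo_vcpu *
demo_vcpu_open(struct demo_vmctx *ctx, int vcpuid)
{
	struct demo_vcpu *vcpu;

	assert(vcpuid >= 0);
	vcpu = malloc(sizeof(*vcpu));
	if (vcpu == NULL)
		return (NULL);
	vcpu->ctx = ctx;
	vcpu->vcpuid = vcpuid;
	return (vcpu);
}

static void
demo_vcpu_close(struct demo_vcpu *vcpu)
{
	free(vcpu);
}

/* Accessors replace passing the (context, id) pair to every call. */
static struct demo_vmctx *
demo_vcpu_ctx(const struct demo_vcpu *vcpu)
{
	return (vcpu->ctx);
}

static int
demo_vcpu_id(const struct demo_vcpu *vcpu)
{
	return (vcpu->vcpuid);
}
```

Calls that previously took a context and an id would instead take the single handle, which leaves room for the handle to carry more per-vCPU state later without another signature change.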
2023-03-06 | libvmm: add missing ioctl's to vm_ioctl_cmds | Vitaliy Gusev
Reviewed by: corvink, markj MFC after: 1 week Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D38866
2022-11-18 | vmm: Use struct vcpu in the instruction emulation code. | John Baldwin
This passes struct vcpu down in place of struct vm and an integer vcpu index through the in-kernel instruction emulation code. To minimize userland disruption, helper macros are used for the vCPU arguments passed into and through the shared instruction emulation code. A few other APIs used by the instruction emulation code have also been updated to accept struct vcpu in the kernel including vm_get/set_register and vm_inject_fault. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37161
2022-11-18 | bhyve: Remove unused vm and vcpu arguments from vm_copy routines. | John Baldwin
The arguments identifying the VM and vCPU are only needed for vm_copy_setup. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37158
2022-10-24 | libvmmapi: Provide an interface for limiting rights on the device fd | Mark Johnston
Currently libvmmapi provides a way to get a list of the allowed ioctls on the vmm device file, so that bhyve can limit rights on the device file fd. The interface is rather strange: it allocates a copy of the list but returns a const pointer, so the caller has to cast away the const in order to free it without aggravating the compiler. As far as I can see, there's no reason to make a copy of the array, but changing vm_get_ioctls() to not do that would break compatibility. So this change just introduces a better interface: move all rights-limiting logic into libvmmapi. Any new operations on the fd should be wrapped by libvmmapi, so also discourage use of vm_get_device_fd(). Currently bhyve uses it only when limiting rights on the device fd. No functional change intended. Reviewed by: jhb MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D37098
2022-07-27 | bhyve: Initialize more registers in vcpu_reset() | Corvin Köhne
- Clear CR2, EFER, and R8-15 to zero. - Reset DR6 and DR7 to their documented reset values. - Reset interrupt shadow state. - Document the reason CR0 is reset to a value that doesn't match its documented value. Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D35622 Sponsored by: Beckhoff Automation GmbH & Co. KG
2022-07-11 | libvmm: add __BEGIN_DECLS/__END_DECLS for linking with c++ binaries | Vitaliy Gusev
Reviewed by: jhb, markj, imp Sponsored by: vStack MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D35719
2022-06-30 | libvmmapi: Add vm_close() | Vitaliy Gusev
Currently there is no way to safely free a vm structure without leaking the fd. vm_destroy() closes the fd but also destroys the VM whereas in some cases a VM needs to be opened (vm_open) and then closed (vm_close). Reviewed by: jhb Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D35073
2022-03-17 | libvmm: constify vm_get_name() | Robert Wing
Allows callers of vm_get_name() to retrieve the vm name without having to allocate a buffer. While in the vicinity, do minor cleanup in vm_snapshot_basic_metadata(). Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D34290
2022-03-10 | bhyve: add ROM emulation | Corvin Köhne
Some PCI devices, especially GPUs, require a ROM to work properly. The ROM is executed by boot firmware to initialize the device. To add a ROM to a device, use the new ROM option for passthru devices (e.g. -s passthru,0/2/0,rom=<path>/<to>/<rom>). It's necessary that the ROM is executed by the boot firmware. It won't be executed by any OS. Additionally, the boot firmware should be configured to execute the ROM file. For that reason, it's only possible to use a ROM when using OVMF with enabled bus enumeration. Differential Revision: https://reviews.freebsd.org/D33129 Sponsored by: Beckhoff Automation GmbH & Co. KG MFC after: 1 month
2022-02-07 | Extend the VMM stats interface to support a dynamic count of statistics. | John Baldwin
- Add a starting index to 'struct vmstats' and change the VM_STATS ioctl to fetch the 64 stats starting at that index. A compat shim for <= 13 continues to fetch only the first 64 stats. - Extend vm_get_stats() in libvmmapi to use a loop and a static thread local buffer which grows to hold the stats needed. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D27463
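The chunked fetch loop described above can be sketched like this. The request structure, `fake_stats_ioctl()`, and the total of 150 statistics are invented for the example and do not reflect the real `struct vmstats` or VM_STATS ioctl; the stopping rule (a short chunk means the end was reached) is likewise an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK		64	/* each request returns up to 64 stats */
#define TOTAL_STATS	150	/* pretend the "kernel" exports 150 stats */

/* Hypothetical request; the real struct vmstats has other fields too. */
struct stats_req {
	int		index;		/* in: first statistic to fetch */
	int		num_entries;	/* out: how many were returned */
	uint64_t	statbuf[CHUNK];
};

/* Stand-in for the ioctl: returns stats [index, index + CHUNK). */
static void
fake_stats_ioctl(struct stats_req *req)
{
	int i, n;

	n = TOTAL_STATS - req->index;
	if (n < 0)
		n = 0;
	if (n > CHUNK)
		n = CHUNK;
	for (i = 0; i < n; i++)
		req->statbuf[i] = (uint64_t)(req->index + i) * 10;
	req->num_entries = n;
}

static uint64_t *
get_all_stats(int *countp)
{
	/* Thread-local buffer that only ever grows, reused across calls. */
	static _Thread_local uint64_t *buf;
	static _Thread_local int cap;
	struct stats_req req;
	uint64_t *nbuf;
	int have, need;

	have = 0;
	do {
		req.index = have;
		fake_stats_ioctl(&req);
		need = have + req.num_entries;
		if (need > cap) {
			nbuf = realloc(buf, (size_t)need * sizeof(*buf));
			if (nbuf == NULL)
				return (NULL);
			buf = nbuf;
			cap = need;
		}
		if (req.num_entries > 0)
			memcpy(buf + have, req.statbuf,
			    (size_t)req.num_entries * sizeof(*buf));
		have = need;
	} while (req.num_entries == CHUNK);	/* short chunk: done */

	*countp = have;
	return (buf);
}
```

The returned pointer stays owned by the helper, so callers must not free it and must copy the data if they need it to survive the next call on the same thread.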
2021-09-15 | Remove an always-true check. | John Baldwin
This fixes a -Wtype-limits error from GCC 9. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D31936
2021-07-26 | libvmmapi: Fix warnings and stop overriding WARNS | Mark Johnston
- Avoid shadowing the global optarg. - Sprinkle __unused. - Cast nitems() to int. - Fix sign in vm_copy_setup(). Reviewed by: grehan MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D31306
2021-05-11 | libvmm: explicitly save and restore errno in vm_open() | Robert Wing
In commit 6bb140e3ca895a14, vm_destroy() was replaced with free() to preserve errno. However, it's possible that free() may change the errno as well. Keep the free() call, but explicitly save and restore errno. Noted by: jhb Fixes: 6bb140e3ca895a14
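A minimal sketch of the save-and-restore pattern this message describes, using a hypothetical `ctx_open()` rather than the library's vm_open():

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>

/* Hypothetical context type, for illustration only. */
struct ctx {
	int fd;
};

static struct ctx *
ctx_open(const char *path)
{
	struct ctx *c;
	int saved_errno;

	c = malloc(sizeof(*c));
	if (c == NULL)
		return (NULL);
	c->fd = open(path, O_RDWR);
	if (c->fd < 0) {
		/* free() is allowed to modify errno, so preserve it. */
		saved_errno = errno;
		free(c);
		errno = saved_errno;
		return (NULL);
	}
	return (c);
}
```

The caller can then report the failure from `open()` with `strerror(errno)` or `err(3)` without risking a value clobbered during cleanup.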
2021-03-19 | bhyve: support relocating fbuf and passthru data BARs | D Scott Phillips
We want to allow the UEFI firmware to enumerate and assign addresses to PCI devices so we can boot from NVMe[1]. Address assignment of PCI BARs is properly handled by the PCI emulation code in general, but a few specific cases need additional support. fbuf and passthru map additional objects into the guest physical address space and so need to handle address updates. Here we add a callback to emulated PCI devices to inform them of a BAR configuration change. fbuf and passthru then watch for these BAR changes and relocate the frame buffer memory segment and passthru device mmio area respectively. We also add new VM_MUNMAP_MEMSEG and VM_UNMAP_PPTDEV_MMIO ioctls to vmm(4) to facilitate the unmapping needed for address updates. [1]: https://github.com/freebsd/uefi-edk2/pull/9/ Originally by: scottph MFC After: 1 week Sponsored by: Intel Corporation Reviewed by: grehan Approved by: philip (mentor) Differential Revision: https://reviews.freebsd.org/D24066
2021-03-06 | bhyvectl: print a better error message when vm_open() fails | Robert Wing
Use errno to print a more descriptive error message when vm_open() fails. libvmm: preserve errno when vm_device_open() fails. vm_destroy() squashes errno by making a dive into sysctlbyname() - we can safely skip vm_destroy() here since it's not doing any critical cleanup at this point. Replace vm_destroy() with a free() call. PR: 250671 MFC after: 3 days Submitted by: marko@apache.org Reviewed by: grehan Differential Revision: https://reviews.freebsd.org/D29109
2021-02-17 | libvmm: clean up vmmapi.h | Robert Wing
struct checkpoint_op, enum checkpoint_opcodes, and MAX_SNAPSHOT_VMNAME are not vmm specific, move them out of the vmmapi header. They are used for the save/restore functionality that bhyve(8) provides and are better suited in usr.sbin/bhyve/snapshot.h Since bhyvectl(8) requires these, the Makefile for bhyvectl has been modified to include usr.sbin/bhyve/snapshot.h Reviewed by: kevans, grehan Differential Revision: https://reviews.freebsd.org/D28410
2020-11-24 | Honor the disabled setting for MSI-X interrupts for passthrough devices. | John Baldwin
Add a new ioctl to disable all MSI-X interrupts for a PCI passthrough device and invoke it if a write to the MSI-X capability registers disables MSI-X. This avoids leaving MSI-X interrupts enabled on the host if a guest device driver has disabled them (e.g. as part of detaching a guest device driver). This was found by Chelsio QA when testing that a Linux guest could switch from MSI-X to MSI interrupts when using the cxgb4vf driver. While here, explicitly fail requests to enable MSI on a passthrough device if MSI-X is enabled and vice versa. Reported by: Sony Arpita Das @ Chelsio Reviewed by: grehan, markj MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D27212 Notes: svn path=/head/; revision=368003
2020-05-15 | vmm(4), bhyve(8): Expose kernel-emulated special devices to userspace | Conrad Meyer
Expose the special kernel LAPIC, IOAPIC, and HPET devices to userspace for use in, e.g., fallback instruction emulation (when userspace has a newer instruction decode/emulation layer than the kernel vmm(4)). Plumb the ioctl through libvmmapi and register the memory ranges in bhyve(8). Reviewed by: grehan Differential Revision: https://reviews.freebsd.org/D24525 Notes: svn path=/head/; revision=361082
2020-05-05 | Initial support for bhyve save and restore. | John Baldwin
Save and restore (also known as suspend and resume) permits a snapshot to be taken of a guest's state that can later be resumed. In the current implementation, bhyve(8) creates a UNIX domain socket that is used by bhyvectl(8) to send a request to save a snapshot (and optionally exit after the snapshot has been taken). A snapshot currently consists of two files: the first holds a copy of guest RAM, and the second file holds other guest state such as vCPU register values and device model state. To resume a guest, bhyve(8) must be started with a matching pair of command line arguments to instantiate the same set of device models as well as a pointer to the saved snapshot. While the current implementation is useful for several use cases, it has a few limitations. The file format for saving the guest state is tied to the ABI of internal bhyve structures and is not self-describing (in that it does not communicate the set of device models present in the system). In addition, the state saved for some device models closely matches the internal data structures which might prove a challenge for compatibility of snapshot files across a range of bhyve versions. The file format also does not currently support versioning of individual chunks of state. As a result, the current file format is not a fixed binary format and future revisions to save and restore will break binary compatibility of snapshot files. The goal is to move to a more flexible format that adds versioning, etc. and at that point to commit to providing a reasonable level of compatibility. As a result, the current implementation is not enabled by default. It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option for userland builds, and the kernel option BHYVE_SNAPSHOT.
Submitted by: Mihai Tiganus, Flavius Anton, Darius Mihai Submitted by: Elena Mihailescu, Mihai Carabas, Sergiu Weisz Relnotes: yes Sponsored by: University Politehnica of Bucharest Sponsored by: Matthew Grooms (student scholarships) Sponsored by: iXsystems Differential Revision: https://reviews.freebsd.org/D19495 Notes: svn path=/head/; revision=360648
2020-04-21 | Map negative types passed to vm_capability_type2name to NULL. | John Baldwin
Submitted by: vangyzen Notes: svn path=/head/; revision=360178
2020-04-21 | Add description string for VM_CAP_BPT_EXIT. | John Baldwin
While here, replace the array of mapping structures with an array of string pointers where the index is the capability value. Submitted by: Rob Fairbanks <rob.fx907@gmail.com> Reviewed by: rgrimes MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D24289 Notes: svn path=/head/; revision=360166
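The two capability-name changes above (an array of string pointers indexed by the capability value, and negative types mapping to NULL) can be sketched together as follows. The enum values, strings, and function names are hypothetical and simplified for the example; they are not the library's actual table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical capability values; the real enum lives in the vmm headers. */
enum cap_type {
	CAP_HALT_EXIT,
	CAP_PAUSE_EXIT,
	CAP_BPT_EXIT,
	CAP_MAX
};

/* The array index is the capability value, so no mapping struct is needed. */
static const char *cap_names[] = {
	[CAP_HALT_EXIT]	 = "hlt_exit",
	[CAP_PAUSE_EXIT] = "pause_exit",
	[CAP_BPT_EXIT]	 = "bpt_exit",
};

#define NITEMS(a)	(sizeof(a) / sizeof((a)[0]))

static const char *
cap_type2name(int type)
{
	/* Negative and too-large values both map to NULL. */
	if (type >= 0 && type < (int)NITEMS(cap_names))
		return (cap_names[type]);
	return (NULL);
}

static int
cap_name2type(const char *name)
{
	int i;

	for (i = 0; i < (int)NITEMS(cap_names); i++) {
		/* Skip gaps left by capabilities that have no string. */
		if (cap_names[i] != NULL && strcmp(cap_names[i], name) == 0)
			return (i);
	}
	return (-1);
}
```

Checking the lower bound explicitly matters because `type` is a signed `int`; without it, a negative value would index before the start of the array.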