path: root/sys/compat/linux
Age | Commit message | Author
7 dayslinux: on vnet detach call clean_unrhdr(9) alwaysGleb Smirnoff
The assumption was incorrect, and the current VIMAGE implementation leaves a possibility for some interfaces to still exist in a jail that is going away. Fixes: 607f11055d2d421770963162a4d9a99cdd136152
13 dayslinux: add hidraw ioctl handlerAlex S
First step towards getting the Linux version of SDL with HIDAPI gamepad drivers to work. Not quite complete, as SDL expects to find some information in sysfs as well. Signed-off-by: Alex S <iwtcex@gmail.com> Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/1938
13 dayslinux: support termios2 ioctlsmothcompute
Signed-off-by: mothcompute <mothcompute@protonmail.com> Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/1949
2025-12-28compat: linux: use appropriate variables for copying out old timersKyle Evans
We copyout &l_oval but do the conversions into &l_val, leaving us with stack garbage. A build with an LLVM21 cross-toolchain seems to catch this. Reported by: Florian Limberger <flo purplekraken com> Reviewed by: markj Fixes: a1fd2911ddb06 ("linux(4): Implement timer_settime64 syscall.") MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D52985
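The bug class described above can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct sketch_native_its`, `struct sketch_linux_its`, and both functions are hypothetical stand-ins, and memcpy() stands in for copyout(9). The point is only that the structure receiving the conversion must be the same one that is copied out.

```c
#include <string.h>

/* Hypothetical stand-ins for the native and Linux timer value layouts. */
struct sketch_native_its { long interval_sec; long value_sec; };
struct sketch_linux_its { long long interval_sec; long long value_sec; };

/* Convert the native old value into the Linux layout. */
static void
sketch_native_to_linux_its(const struct sketch_native_its *src,
    struct sketch_linux_its *dst)
{
	dst->interval_sec = src->interval_sec;
	dst->value_sec = src->value_sec;
}

/*
 * Export the old timer value to 'user_out'.  memcpy() stands in for
 * copyout(9).  The converted structure (l_oval) is the one copied out;
 * copying a different, never-written local would expose stack garbage.
 */
static void
sketch_export_old_value(const struct sketch_native_its *oval,
    struct sketch_linux_its *user_out)
{
	struct sketch_linux_its l_oval;

	sketch_native_to_linux_its(oval, &l_oval);
	memcpy(user_out, &l_oval, sizeof(l_oval));
}
```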
2025-12-11linux: Implement F_DUPFD_QUERY fcntl with kcmp(2) KCMP_FILERicardo Branco
Signed-off-by: Ricardo Branco <rbranco@suse.de> Reviewed by: kib Pull Request: https://github.com/freebsd/freebsd-src/pull/1920
2025-12-11linux: Add support for kcmp(2) system callRicardo Branco
Signed-off-by: Ricardo Branco <rbranco@suse.de> Reviewed by: kib Pull Request: https://github.com/freebsd/freebsd-src/pull/1920
2025-12-10linux: fix unr(9) leak on module unloadGleb Smirnoff
Suggested by: jhb Fixes: 607f11055d2d421770963162a4d9a99cdd136152
2025-12-10linux: fix panic on kldunloadGleb Smirnoff
The vnet_deregister_sysuninit() that is called by linker unload sequence also calls every registered destructor before unregistering it. IMHO, this is not correct in principle, but for now plug the regression right in the code that introduced the panic. Fixes: 607f11055d2d421770963162a4d9a99cdd136152
2025-12-09linux: fix build without VIMAGEGleb Smirnoff
Fixes: fbf05d2147b1add8b760be166c4b1fd4499ebce8
2025-12-08linux: store Linux Ethernet interface number in struct ifnetGleb Smirnoff
The old approach where we go through the list of interfaces and count them has bugs. One obvious bug with this dynamic translation is that once an Ethernet interface in the middle of the list goes away, all interfaces following it would change their Linux names. A bigger problem is the ifnet arrival and departure times. For example, linsysfs has an event handler for ifnet_arrival_event, and of course it wants to resolve the name. This accidentally works, due to a bug in if_attach() where we call if_link_ifnet() before invoking all the event handlers. Once the bug is fixed, linsysfs won't be able to resolve the name the old way. The other side is ifnet_departure_event, where there is no bug: the event handlers are called after if_unlink_ifnet(). This means the old translation won't work for departure event handlers. One example is netlink. This change gives Netlink a chance to emit a proper Linux interface departure message. However, there is another problem in Netlink: the ifnet pointer is lost in the Netlink translation layer. Plug this with a cookie in the netlink writer structure that can be set by the route layer and used by the Netlink Linux translation layer. This part of the diff seems unrelated, but it is hard to make it a separate change, as the old KPI goes away and to use the new one we need the pointer. Differential Revision: https://reviews.freebsd.org/D54077
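The renumbering problem can be shown with a small userspace sketch. Everything here is hypothetical (`struct sketch_if` is not struct ifnet, and the function names are invented for illustration); it only contrasts a number derived by counting with a number stored once at attach time.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified interface record; not the real struct ifnet. */
struct sketch_if {
	bool present;	/* still linked into the interface list */
	bool ether;	/* Ethernet-type interface */
	int linux_idx;	/* number assigned once, at attach time */
};

/*
 * Old approach: derive N in "ethN" by counting the Ethernet interfaces
 * that precede 'target' in the list.  Returns -1 if the target has
 * already been unlinked, which is the departure-event problem.
 */
static int
sketch_counted_eth_index(const struct sketch_if *ifs, size_t n, size_t target)
{
	int idx = 0;

	if (target >= n || !ifs[target].present)
		return (-1);
	for (size_t i = 0; i < target; i++)
		if (ifs[i].present && ifs[i].ether)
			idx++;
	return (idx);
}

/* New approach: the number lives in the record and never changes. */
static int
sketch_stored_eth_index(const struct sketch_if *ifs, size_t target)
{
	return (ifs[target].linux_idx);
}
```

With three Ethernet interfaces, removing the middle one makes the counted index of the last interface drop from 2 to 1, while the stored index stays 2 and remains available for the departing interface itself.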
2025-12-08linux: separate all ifnet(9) related code into linux_ifnet.cGleb Smirnoff
Remove linux_use_real_ifname(). It is no longer used outside of the file since 3ab3c9c29cf0. There is no functional change. Reviewed by: melifaro, dchagin Differential Revision: https://reviews.freebsd.org/D54076
2025-10-20linux: Make the macro LINUX_IOCTL_SET publicZhenlei Huang
Some other drivers want to register and unregister Linux ioctl handlers. Move the macro LINUX_IOCTL_SET from tdfx_linux.h to linux_ioctl.h so that they can also benefit from it. While here, rename the declaration of the Linux ioctl function to be consistent with the name of the handler. Meanwhile, drop a comment about the macro, since its function is obvious. Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D53158
2025-10-18kqueuex(2): add KQUEUE_CPONFORKKonstantin Belousov
The created kqueue is copied on fork, together with the registered events. This means that a new kqueue is created at the same fd index as the parent's kqueue, and all registered events are copied into the new kqueue (when possible). The current active events list is also duplicated. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D52045
2025-10-15linux: add translation for MCAST_JOIN_GROUP family of socket optionsGleb Smirnoff
Differential Revision: https://reviews.freebsd.org/D52937
2025-10-15linux: make linux_to_bsd_sockaddr() use memory supplied by callerGleb Smirnoff
No functional change. Differential Revision: https://reviews.freebsd.org/D52936
2025-10-14imgact: Mark brandinfo and note structures as constMark Johnston
No functional change intended. Reviewed by: olce, kib, emaste MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D53062
2025-09-17linux: getsockopt(): Simplify exporting groups a bitOlivier Certner
No functional change (intended). Go through conversion to a 'l_gid_t' before copying out in order to cope with differing group types (except for not representable values, of course). This is what is done, e.g., for getgroups() in 'linux_misc.c'. As Linux's group type is the same as ours on all architectures, we could as well just stop bothering and copy out our memory representation, eliminating the loop here. Whatever the choice, though, it has to be consistent here and there. Introduce 'out' of type 'l_gid_t' to avoid performing "by hand" array arithmetics when copying out. MFC after: 5 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D52280
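The per-element conversion can be sketched as follows. This is a userspace illustration under stated assumptions: `sketch_gid_t` and `sketch_l_gid_t` are hypothetical types deliberately given different widths, memcpy() stands in for copyout(9), and the function name is invented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef unsigned long sketch_gid_t;	/* stand-in for the native group type */
typedef uint32_t sketch_l_gid_t;	/* stand-in for Linux's l_gid_t */

/*
 * Export 'n' groups one element at a time.  Each value is first
 * converted to the Linux type, then copied out.  'out' is typed as a
 * pointer to the Linux type, so advancing it needs no manual byte
 * arithmetic.
 */
static void
sketch_export_groups(const sketch_gid_t *groups, size_t n,
    sketch_l_gid_t *out)
{
	for (size_t i = 0; i < n; i++, out++) {
		sketch_l_gid_t l_gid = (sketch_l_gid_t)groups[i];

		memcpy(out, &l_gid, sizeof(l_gid));	/* copyout(9) stand-in */
	}
}
```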
2025-09-17linux: setgroups16(): Pre-extend the groups arrayOlivier Certner
For the size we know we will need in the end. No functional change (intended). MFC after: 5 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D52279
2025-09-17linux: setgroups(): Avoid allocation under the process lockOlivier Certner
This was missed in commit 838d9858251e ("Rework the credential code to support larger values of NGROUPS (...)"). No functional change (intended). MFC after: 5 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D52278
2025-09-17linux: setgroups(): Fix the group number's upper limitOlivier Certner
'ngroups_max' is the maximum number of supplementary groups the system will accept, and this has not changed. Fixes: 9da2fe96ff2e ("kern: fix setgroups(2) and getgroups(2) to match other platforms") MFC after: 5 days MFC to: stable/15 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D52277
2025-09-17linux: Simplify further getgroups() after 'cr_gid' not in cr_groups[]Olivier Certner
No functional change (intended). While here, fix/improve style a bit and in setgroups(). MFC after: 5 days MFC to: stable/15 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D52276
2025-09-11Revert "linux: fix reporting NL_RTM_DELLINK to Netlink sockets"Gleb Smirnoff
I received a report that a certain Linux application would crash on a message about the departure of an interface with a FreeBSD name. Looks like dropping NL_RTM_DELLINK is a lesser evil than relaying them with FreeBSD names. This reverts commit 554907bac3b264863a051f75eedc35d180d3e18c.
2025-08-22netlink: do not pass writer to the Linux translation layerGleb Smirnoff
Another flaw in the KPI between Netlink and Linuxulator is that we pass the on-stack writer structure. This structure belongs to someone that we can't even identify inside nl_send(), and we shall not tamper with it. The Linux translation layer needs a writer, because it actually composes a new message. Instead of reusing someone's writer and trying to repair it in all possible cases where the translation process tampers with the writer, just let Linuxulator use its own writer. See also b977dd1ea5fb. PR: 288892 Reviewed by: melifaro Differential Revision: https://reviews.freebsd.org/D51928
2025-08-22linux: fix reporting NL_RTM_DELLINK to Netlink socketsGleb Smirnoff
The problem is that ifname_bsd_to_linux_name() requires the interface to exist. But when we are in the context of ifnet_departure_event EVENTHANDLER(9), it does not. Instead of silently dropping the message, let's send the FreeBSD name verbatim. At the moment special translation is done for IFT_LOOPBACK and IFT_ETHER only, and these two kinds of interfaces usually don't depart. So, this is not a final fix, but definitely an improvement. While here, simplify the associated code. Differential Revision: https://reviews.freebsd.org/D51927
2025-07-31kern: add a new ucred flag for groups having been setKyle Evans
Now that we can legitimately have ngroups == 0 as a result of calling crsetgroups(), set a flag when we've set groups for the sake of sanity checking usage of crextend(). While it's true this flag will only really be used under INVARIANTS, it's only the second flag bit that we're adding in 16 years. Reviewed by: olce Differential Revision: https://reviews.freebsd.org/D51646
2025-07-30kern: start tracking cr_gid outside of cr_groups[]Kyle Evans
This is the (mostly) kernel side of de-conflating cr_gid and the supplemental groups. The pre-existing behavior for getgroups() and setgroups() is retained to keep the user <-> kernel boundary functionally the same while we audit use of these syscalls, but we can remove a lot of the internal special-casing just by reorganizing ucred like this. struct xucred has been altered because the cr_gid macro becomes problematic if ucred has a real cr_gid member but xucred does not. Most notably, they both also have cr_groups[] members, so the definition means that we could easily have situations where we end up using the first supplemental group as the egid in some places. We really can't change the ABI of xucred, so instead we alias the first member to the `cr_gid` name and maintain the status quo. This also fixes the Linux setgroups(2)/getgroups(2) implementation to more cleanly preserve the group set, now that we don't need to special case cr_groups[0]. __FreeBSD_version bumped for the `struct ucred` ABI break. For relnotes: downstreams and out-of-tree modules absolutely must fix any references to cr_groups[0] in their code. These are almost exclusively incorrect in the new world, and cr_gid should be used instead. There is a cr_gid macro available in earlier FreeBSD versions that can be used to avoid having version-dependent conditionals to refer to the effective group id. Surrounding code may need to be adjusted if it peels off the first element of cr_groups and uses the others as the supplemental groups, since the supplemental groups start at cr_groups[0] now if &cr_groups[0] != &cr_gid. Relnotes: yes (see last paragraph) Co-authored-by: olce Differential Revision: https://reviews.freebsd.org/D51489
2025-07-26kern: allow kern_shm_open2 of an anonymous preconstructed shmfdKyle Evans
The motivation here is for future changes to the coredump code to be able to build up a coredump into a shmfd instead of a vnode, which then gets tapped out to userland via a character device. This also opens up the possibility that it's useful for the kernel to be able to construct a shmfd and pass it out to a process that shouldn't need to write to it. Reviewed by: emaste, kib, markj Differential Revision: https://reviews.freebsd.org/D51336
2025-07-04linux: Add inotify supportMark Johnston
Implement the Linux inotify system calls using the native implementation in vfs_inotify.c. PR: 240874 Reviewed by: brooks MFC after: 3 months Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D50761
2025-06-24sys: Add AT_HWCAP3 and AT_HWCAP4Andrew Turner
It is likely we will need these on arm64. Add them in preparation for flags in these to be added at some point in the future. Reviewed by: brooks, imp, kib Sponsored by: Arm Ltd Differential Revision: https://reviews.freebsd.org/D51005
2025-06-13linux: Fix usage of ptrace(PT_GET_SC_ARGS)Mark Johnston
The native handler expects the argument to be a pointer to an array of 8 syscall arguments, whereas the emulation provided an array that holds up to 6. Handle this by adding a new range of Linuxulator-specific ptrace commands. In particular, introduce PTLINUX_GET_SC_ARGS, which always copies exactly six arguments. This fixes the problem and removes the hack of checking the target thread ABI to decide whether to apply a Linux-specific quirk to PT_GET_SC_ARGS. Reviewed by: kib MFC after: 2 weeks Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D50758
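The size mismatch and its fix can be sketched in userspace C. The names below are hypothetical (this is not the PTLINUX_GET_SC_ARGS implementation); the sketch only shows a copy that is bounded by the Linux argument count rather than the native one, so a six-slot caller buffer is never overrun.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_NATIVE_NARGS	8	/* slots the native handler fills */
#define SKETCH_LINUX_NARGS	6	/* slots a Linux caller provides */

/*
 * Copy exactly six syscall arguments into the caller's buffer,
 * regardless of how many the native side tracks.
 */
static void
sketch_get_sc_args(const unsigned long native[SKETCH_NATIVE_NARGS],
    unsigned long out[SKETCH_LINUX_NARGS])
{
	memcpy(out, native, SKETCH_LINUX_NARGS * sizeof(out[0]));
}
```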
2025-06-11machine/stdarg.h -> sys/stdarg.hBrooks Davis
Switch to using sys/stdarg.h for va_list type and va_* builtins. Make an attempt to insert the include in a sensible place. Where style(9) was followed this is easy, where it was ignored, aim for the first block of sys/*.h headers and don't get too fussy or try to fix other style bugs. Reviewed by: imp Exp-run by: antoine (PR 286274) Pull Request: https://github.com/freebsd/freebsd-src/pull/1595
2025-04-21exec: Remove parameter 'segflg' from exec_copyin_args()Wuyang Chung
In the kernel, "copyin" means copying data from user address space to kernel address space. But the function exec_copyin_args() has a parameter 'segflg' that is used to specify the address space of the parameter 'fname'. In the source code there are two places where 'segflg' is not UIO_USERSPACE. In both cases the 'fname' argument is NULL, so the 'segflg' argument is not important there. So it is safe to remove the parameter 'segflg' from the function exec_copyin_args(). Reviewed by: markj, jhb MFC after: 2 weeks Pull Request: https://github.com/freebsd/freebsd-src/pull/1590
2025-03-10linux: Handle IP_RECVTOS cmsg typeAlex S
This unbreaks apps using GameNetworkingSockets from Valve.
2025-03-10linux: Fix a typo in linux_recvmsg_commonAlex S
We are supposed to check the result of bsd_to_linux_sockopt_level here rather than its input.
2025-02-24umtx: Add a helper for unlocked umtxq_busy() callsMark Johnston
This seems like a natural complement to umtxq_unbusy_unlocked(). No functional change intended. Reviewed by: olce, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D49124
2025-01-03cpu_set_upcall / linux_clone_thread: Remove calls to cpu_thread_cleanJohn Baldwin
This is intended to clean state of a thread at the end of its lifecycle during wait(), not the beginning of its life cycle. Reviewed by: kib Sponsored by: AFRL, DARPA Differential Revision: https://reviews.freebsd.org/D48023
2024-11-29linux(4): Fix typo `xatrr` in function nameMinseo Kim
Correct `xatrr_to_extattr` to `xattr_to_extattr`. Signed-off-by: Minseo Kim <kimminss0@outlook.kr> Reviewed by: imp,emaste,markj Pull Request: https://github.com/freebsd/freebsd-src/pull/1533
2024-11-13linux sendfile: Fix handling of non-blocking socketsMark Johnston
FreeBSD sendfile() may perform a partial transfer and return EAGAIN if the socket is non-blocking. Linux sendfile() expects no error in this case, so squash EAGAIN. PR: 282495 Tested by: pieter@krikkit.xyz MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D47447
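The error squashing can be sketched as a small pure function. This is an illustration only: the function name is hypothetical, and limiting the squash to the case where some bytes were sent is an assumption made for the sketch, not something the commit message states.

```c
#include <errno.h>
#include <sys/types.h>

/*
 * Decide which error a Linux caller should see after a native
 * sendfile() attempt.  'error' is the native error code (0 on success)
 * and 'sbytes' is the number of bytes actually sent.  A partial
 * transfer that ended in EAGAIN is reported as success, and the caller
 * then returns the byte count.  (Assumption: only squash when
 * sbytes > 0.)
 */
static int
sketch_linux_sendfile_error(int error, off_t sbytes)
{
	if (error == EAGAIN && sbytes > 0)
		return (0);
	return (error);
}
```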
2024-10-21linux: support IUTF8Edward Tomasz Napierala
Make Linuxulator support the recently added IUTF8 termios(4) flag. Reviewed By: dchagin, emaste, imp Differential Revision: https://reviews.freebsd.org/D44525
2024-10-15linux.h: don't redefine lower_32_bits if already definedWarner Losh
systrace.c fails to build if we're using a common compiler.h for both openzfs and linuxkpi. The issue is easy enough to fix: don't redefine lower_32_bits if it's already defined in linux.h, since it's the least 'standardized'. This will allow systrace.c to build using an equivalent macro. MFC After: 3 days Sponsored by: Netflix
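The guard pattern looks roughly like this. The macro body is a sketch of a typical definition and is not copied from linux.h; only the `#ifndef` wrapper is the point.

```c
#include <stdint.h>

/*
 * Define the helper only if another header has not already done so,
 * so that an equivalent definition seen earlier wins without a
 * redefinition error.
 */
#ifndef lower_32_bits
#define	lower_32_bits(n)	((uint32_t)((n) & 0xffffffff))
#endif
```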
2024-09-24linuxulator: ignore AT_NO_AUTOMOUNT for all stat variantsEd Maste
Commit ff39d74aa99a ignored AT_NO_AUTOMOUNT for statx(), but did not change fstat64() or newfstatat(), which also take an equivalent flags argument. Add a linux_to_bsd_stat_flags() helper and use it in all three places. PR: 281526 Reviewed by: trasz Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D46711
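A shared flag-translation helper of the kind described might look like the sketch below. The function name, the `SKETCH_` constants, and the -1 error return are all hypothetical stand-ins chosen for illustration; the real helper and its constants live in the kernel sources.

```c
/* Illustrative flag values; the SKETCH_ prefix marks them as stand-ins. */
#define SKETCH_LINUX_AT_SYMLINK_NOFOLLOW	0x0100
#define SKETCH_LINUX_AT_NO_AUTOMOUNT		0x0800
#define SKETCH_LINUX_AT_EMPTY_PATH		0x1000
#define SKETCH_BSD_AT_SYMLINK_NOFOLLOW		0x0200
#define SKETCH_BSD_AT_EMPTY_PATH		0x4000

/*
 * Translate Linux stat flags to native ones.  Returns 0 and stores the
 * result in '*out_flags', or returns -1 if an unsupported bit is set.
 * AT_NO_AUTOMOUNT is accepted and silently dropped.
 */
static int
sketch_linux_to_bsd_stat_flags(int linux_flags, int *out_flags)
{
	int flags = 0;
	int known = SKETCH_LINUX_AT_SYMLINK_NOFOLLOW |
	    SKETCH_LINUX_AT_NO_AUTOMOUNT | SKETCH_LINUX_AT_EMPTY_PATH;

	if ((linux_flags & ~known) != 0)
		return (-1);
	if (linux_flags & SKETCH_LINUX_AT_SYMLINK_NOFOLLOW)
		flags |= SKETCH_BSD_AT_SYMLINK_NOFOLLOW;
	if (linux_flags & SKETCH_LINUX_AT_EMPTY_PATH)
		flags |= SKETCH_BSD_AT_EMPTY_PATH;
	*out_flags = flags;
	return (0);
}
```

Routing every stat variant through one helper keeps the set of accepted flags identical across them, which is the inconsistency the commit addresses.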
2024-08-11faccessat(2): Honor AT_SYMLINK_NOFOLLOWFernando Apesteguía
Make the system call honor `AT_SYMLINK_NOFOLLOW`. Also enable this from `linux_faccessat2`, where the issue first arose. Update manual pages accordingly. PR: 275295 Reported by: kenrap@kennethraplee.com Approved by: kib@ Differential Revision: https://reviews.freebsd.org/D46267
2024-06-14linux: Translate Linux NVME ioctls to the lower layers.Chuck Tuffli
The lower layers implement an ABI-compatible Linux ioctl for a few of the Linux IOCTLs. Translate them and pass them down. Since they are ABI compatible, just use the nvme ioctl name. Co-Authored-by: Warner Losh <imp@bsdimp.com> Reviewed by: chuck Differential Revision: https://reviews.freebsd.org/D45416
2024-06-06linux: Allows writing to the vdso from the kernelAndrew Turner
We need to write to the vdso in the kernel to perform fixups. Move it from .rodata to .data so these can be run. Reported by: cy Sponsored by: Arm Ltd
2024-06-05linux64: Fix the build on arm64 with bti checkingAndrew Turner
When we enable checking for BTI on arm64 we need to include an ELF note in all object files linked into a module. As using objcopy from a binary to an ELF object file doesn't add the note, switch to using .incbin from an assembly file. This allows us to add the needed note without affecting the included object. Reviewed by: imp, kib, emaste Sponsored by: Arm Ltd Differential Revision: https://reviews.freebsd.org/D45468
2024-05-31Redefine CLOCK_BOOTTIME to alias CLOCK_MONOTONIC, not CLOCK_UPTIMEVal Packett
The suspend-awareness situation with monotonic clocks across platforms is kind of a mess, let's try not making it worse. On Linux, CLOCK_MONOTONIC does NOT count suspended time, and CLOCK_BOOTTIME was introduced to INCLUDE suspended time. On OpenBSD, CLOCK_MONOTONIC DOES count suspended time, and CLOCK_UPTIME was introduced to EXCLUDE suspended time. On macOS, it's the same as OpenBSD, but with CLOCK_UPTIME_RAW. Right now, we do not have a monotonic clock that counts suspended time. We have CLOCK_UPTIME as a distinct ID alias, and CLOCK_BOOTTIME as a preprocessor alias, both being effectively `CLOCK_MONOTONIC` for now. When we introduce a suspend-aware clock in the future, it would make a lot more sense to do it the OpenBSD/macOS way, i.e. to make CLOCK_MONOTONIC include suspended time and make CLOCK_UPTIME exclude it, because that's what the name CLOCK_UPTIME implies: a deviation from the default intended for the uptime command to allow it to only show the time the system was actually up and not suspended. Let's change the define right now to make sure software using the define would not end up using the ID of the wrong clock in the future, and fix the IDs in the Linux compat code to match the expected changes too. See https://bugzilla.mozilla.org/show_bug.cgi?id=1824084 for more discussion. Fixes: 155f15118a77 ("clock_gettime: Add Linux aliases for CLOCK_*") Fixes: 25ada637362d ("Map Linux CLOCK_BOOTTIME to native CLOCK_UPTIME.") Sponsored by: https://www.patreon.com/valpackett Reviewed by: kib, imp Differential Revision: https://reviews.freebsd.org/D39270
2024-05-29minor style tweak.Warner Losh
checkstyle9 doesn't check for this construct... Fixes: 6d849754b996
2024-05-29linux: implement PR_CHILD_SET_SUBREAPERSon Phan Trung
Reviewed by: imp, dchagin Pull Request: https://github.com/freebsd/freebsd-src/pull/1260
2024-05-28linux: allow RTM_GETADDR without full ifaddrmsg argumentGleb Smirnoff
Even modern glibc uses truncated argument for RTM_GETADDR when it wants to list all addresses in a system. See sysdeps/unix/sysv/linux/ifaddrs.c:__netlink_sendreq(). It sends a one char payload. Linux kernel allows that as long as given socket is not marked as a 'strict'. We have a similar flag in the general netlink code and it is checked in sys/netlink/netlink_message_parser.h:nl_parse_header(). If the flag is not present, parser will allocate a temporary zeroed buffer to make the message correct. The checks added in b977dd1ea5fb blocked such message before the parser. My reading of glibc says that there are two types of messages that are sent with __netlink_sendreq() - RTM_GETLINK and RTM_GETADDR. The RTM_GETLINK is binary compatible between Linux and FreeBSD and thus doesn't need any ABI handler. PR: 279012 Fixes: b977dd1ea5fbc2df3f1279330be4d089322eb2cf
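The lenient handling of a truncated request can be sketched as follows. This is a userspace illustration, not nl_parse_header(): `struct sketch_hdr` and `sketch_parse_header()` are hypothetical, and the header layout is only loosely modeled on an address message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size request header. */
struct sketch_hdr {
	unsigned char family;
	unsigned char prefixlen;
	unsigned char flags;
	unsigned char scope;
	unsigned int index;
};

/*
 * Parse a request header from 'payload'.  A short payload is rejected
 * on a strict socket.  Otherwise it is copied into a zeroed header, so
 * the missing fields read as 0.
 */
static bool
sketch_parse_header(const void *payload, size_t len, bool strict,
    struct sketch_hdr *out)
{
	if (len < sizeof(*out)) {
		if (strict)
			return (false);
		memset(out, 0, sizeof(*out));
		memcpy(out, payload, len);
		return (true);
	}
	memcpy(out, payload, sizeof(*out));
	return (true);
}
```

A one-byte payload carrying only the address family therefore parses as a full header with every other field zero, unless the socket is strict.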
2024-05-23linux: Update linux manpage to mention mqueuefsRicardo Branco
Reviewed by: imp, kib Pull Request: https://github.com/freebsd/freebsd-src/pull/1248