summaryrefslogtreecommitdiff
path: root/include/uapi/linux
AgeCommit message (Collapse)Author
2019-08-17drop_monitor: Add support for packet alert mode for hardware dropsIdo Schimmel
In a similar fashion to software drops, extend drop monitor to send netlink events when packets are dropped by the underlying hardware. The main difference is that instead of encoding the program counter (PC) from which kfree_skb() was called in the netlink message, we encode the hardware trap name. The two are mostly equivalent since they should both help the user understand why the packet was dropped. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-15Merge tag 'linux-can-next-for-5.4-20190814' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2019-08-14 this is a pull request for net-next/master consisting of 41 patches. The first two patches are for the kvaser_pciefd driver: Christer Beskow removes unnecessary code in the kvaser_pciefd_pwm_stop() function, YueHaibing removes the unused including of <linux/version.h>. In the next patch YueHaibing also removes the unused including of <linux/version.h> in the f81601 driver. In the ti_hecc driver the next 6 patches are by me and fix checkpatch warnings. YueHaibing's patch removes an unused variable in the ti_hecc_mailbox_read() function. The next 6 patches all target the xilinx_can driver. Anssi Hannula's patch fixes a chip start failure with an invalid bus. The patch by Venkatesh Yadav Abbarapu skips an error message in case of a deferred probe. The 3 patches by Appana Durga Kedareswara rao fix the RX and TX path for CAN-FD frames. Srinivas Neeli's patch fixes the bit timing calculations for CAN-FD. The next 12 patches are by me and several checkpatch warnings in the af_can, raw and bcm components. Thomas Gleixner provides a patch for the bcm, which switches the timer to HRTIMER_MODE_SOFT and removes the hrtimer_tasklet. Then 6 more patches by me for the gw component, which fix checkpatch warnings, followed by 2 patches by Oliver Hartkopp to add CAN-FD support. The vcan driver gets 3 patches by me, fixing checkpatch warnings. And finally a patch by Andre Hartmann to fix typos in CAN's netlink header. ====================
2019-08-14netfilter: remove deprecation warnings from uapi headers.Jeremy Sowden
There are two netfilter userspace headers which contain deprecation warnings. While these headers are not used within the kernel, they are compiled stand-alone for header-testing. Pablo informs me that userspace iptables still refer to these headers, and the intention was to use xt_LOG.h instead and remove these, but userspace was never updated. Remove the warnings. Fixes: 2a475c409fe8 ("kbuild: remove all netfilter headers from header-test blacklist.") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-14USB: usbfs: Add a capability flag for runtime suspendAlan Stern
The recent commit 7794f486ed0b ("usbfs: Add ioctls for runtime power management") neglected to add a corresponding capability flag. This patch rectifies the omission. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: Mayuresh Kulkarni <mkulkarni@opensource.cirrus.com> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1908131613490.1941-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for net-next: 1) Rename mss field to mss_option field in synproxy, from Fernando Mancera. 2) Use SYSCTL_{ZERO,ONE} definitions in conntrack, from Matteo Croce. 3) More strict validation of IPVS sysctl values, from Junwei Hu. 4) Remove unnecessary spaces after on the right hand side of assignments, from yangxingwu. 5) Add offload support for bitwise operation. 6) Extend the nft_offload_reg structure to store immediate date. 7) Collapse several ip_set header files into ip_set.h, from Jeremy Sowden. 8) Make netfilter headers compile with CONFIG_KERNEL_HEADER_TEST=y, from Jeremy Sowden. 9) Fix several sparse warnings due to missing prototypes, from Valdis Kletnieks. 10) Use static lock initialiser to ensure connlabel spinlock is initialized on boot time to fix sched/act_ct.c, patch from Florian Westphal. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski
Daniel Borkmann says: ==================== The following pull-request contains BPF updates for your *net-next* tree. There is a small merge conflict in libbpf (Cc Andrii so he's in the loop as well): for (i = 1; i <= btf__get_nr_types(btf); i++) { t = (struct btf_type *)btf__type_by_id(btf, i); if (!has_datasec && btf_is_var(t)) { /* replace VAR with INT */ t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0); <<<<<<< HEAD /* * using size = 1 is the safest choice, 4 will be too * big and cause kernel BTF validation failure if * original variable took less than 4 bytes */ t->size = 1; *(int *)(t+1) = BTF_INT_ENC(0, 0, 8); } else if (!has_datasec && kind == BTF_KIND_DATASEC) { ======= t->size = sizeof(int); *(int *)(t + 1) = BTF_INT_ENC(0, 0, 32); } else if (!has_datasec && btf_is_datasec(t)) { >>>>>>> 72ef80b5ee131e96172f19e74b4f98fa3404efe8 /* replace DATASEC with STRUCT */ Conflict is between the two commits 1d4126c4e119 ("libbpf: sanitize VAR to conservative 1-byte INT") and b03bc6853c0e ("libbpf: convert libbpf code to use new btf helpers"), so we need to pick the sanitation fixup as well as use the new btf_is_datasec() helper and the whitespace cleanup. Looks like the following: [...] if (!has_datasec && btf_is_var(t)) { /* replace VAR with INT */ t->info = BTF_INFO_ENC(BTF_KIND_INT, 0, 0); /* * using size = 1 is the safest choice, 4 will be too * big and cause kernel BTF validation failure if * original variable took less than 4 bytes */ t->size = 1; *(int *)(t + 1) = BTF_INT_ENC(0, 0, 8); } else if (!has_datasec && btf_is_datasec(t)) { /* replace DATASEC with STRUCT */ [...] The main changes are: 1) Addition of core parts of compile once - run everywhere (co-re) effort, that is, relocation of fields offsets in libbpf as well as exposure of kernel's own BTF via sysfs and loading through libbpf, from Andrii. More info on co-re: http://vger.kernel.org/bpfconf2019.html#session-2 and http://vger.kernel.org/lpc-bpf2018.html#session-2 2) Enable passing input flags to the BPF flow dissector to customize parsing and allowing it to stop early similar to the C based one, from Stanislav. 3) Add a BPF helper function that allows generating SYN cookies from XDP and tc BPF, from Petar. 4) Add devmap hash-based map type for more flexibility in device lookup for redirects, from Toke. 5) Improvements to XDP forwarding sample code now utilizing recently enabled devmap lookups, from Jesper. 6) Add support for reporting the effective cgroup progs in bpftool, from Jakub and Takshak. 7) Fix reading kernel config from bpftool via /proc/config.gz, from Peter. 8) Fix AF_XDP umem pages mapping for 32 bit architectures, from Ivan. 9) Follow-up to add two more BPF loop tests for the selftest suite, from Alexei. 10) Add perf event output helper also for other skb-based program types, from Allan. 11) Fix a co-re related compilation error in selftests, from Yonghong. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-08-13can: netlink: fix documentation typosAndre Hartmann
This patch fixes some documentation typos in struct can_bittiming_const. Signed-off-by: Andre Hartmann <aha_1980@gmx.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-08-13can: gw: add support for CAN FD framesOliver Hartkopp
Introduce CAN FD support which needs an extension of the netlink API to pass CAN FD type content to the kernel which has a different size to Classic CAN. Additionally the struct canfd_frame has a new 'flags' element that can now be modified with can-gw. The new CGW_FLAGS_CAN_FD option flag defines whether the routing job handles Classic CAN or CAN FD frames. This setting is very strict at reception time and enables the new possibilities, e.g. CGW_FDMOD_* and modifying the flags element of struct canfd_frame, only when CGW_FLAGS_CAN_FD is set. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-08-13can: gw: use struct canfd_frame as internal data structureOliver Hartkopp
To prepare the CAN FD support this patch implements the first adaptions in data structures for CAN FD without changing the current functionality. Additionally some code at the end of this patch is moved or indented to simplify the review of the next implementation step. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-08-13PCI: Add #defines for some of PCIe spec r4.0 featuresVidya Sagar
Add #defines only for the Data Link Feature and Physical Layer 16.0 GT/s features as defined in PCIe spec r4.0, sec 7.7.4 for Data Link Feature and sec 7.7.5 for Physical Layer 16.0 GT/s. Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Thierry Reding <treding@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2019-08-13netfilter: add missing includes to a number of header-files.Jeremy Sowden
A number of netfilter header-files used declarations and definitions from other headers without including them. Added include directives to make those declarations and definitions available. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-08-12fs-verity: add SHA-512 supportEric Biggers
Add SHA-512 support to fs-verity. This is primarily a demonstration of the trivial changes needed to support a new hash algorithm in fs-verity; most users will still use SHA-256, due to the smaller space required to store the hashes. But some users may prefer SHA-512. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-12fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctlEric Biggers
Add a root-only variant of the FS_IOC_REMOVE_ENCRYPTION_KEY ioctl which removes all users' claims of the key, not just the current user's claim. I.e., it always removes the key itself, no matter how many users have added it. This is useful for forcing a directory to be locked, without having to figure out which user ID(s) the key was added under. This is planned to be used by a command like 'sudo fscrypt lock DIR --all-users' in the fscrypt userspace tool (http://github.com/google/fscrypt). Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-12fscrypt: allow unprivileged users to add/remove keys for v2 policiesEric Biggers
Allow the FS_IOC_ADD_ENCRYPTION_KEY and FS_IOC_REMOVE_ENCRYPTION_KEY ioctls to be used by non-root users to add and remove encryption keys from the filesystem-level crypto keyrings, subject to limitations. Motivation: while privileged fscrypt key management is sufficient for some users (e.g. Android and Chromium OS, where a privileged process manages all keys), the old API by design also allows non-root users to set up and use encrypted directories, and we don't want to regress on that. Especially, we don't want to force users to continue using the old API, running into the visibility mismatch between files and keyrings and being unable to "lock" encrypted directories. Intuitively, the ioctls have to be privileged since they manipulate filesystem-level state. However, it's actually safe to make them unprivileged if we very carefully enforce some specific limitations. First, each key must be identified by a cryptographic hash so that a user can't add the wrong key for another user's files. For v2 encryption policies, we use the key_identifier for this. v1 policies don't have this, so managing keys for them remains privileged. Second, each key a user adds is charged to their quota for the keyrings service. Thus, a user can't exhaust memory by adding a huge number of keys. By default each non-root user is allowed up to 200 keys; this can be changed using the existing sysctl 'kernel.keys.maxkeys'. Third, if multiple users add the same key, we keep track of those users of the key (of which there remains a single copy), and won't really remove the key, i.e. "lock" the encrypted files, until all those users have removed it. This prevents denial of service attacks that would be possible under simpler schemes, such allowing the first user who added a key to remove it -- since that could be a malicious user who has compromised the key. Of course, encryption keys should be kept secret, but the idea is that using encryption should never be *less* secure than not using encryption, even if your key was compromised. We tolerate that a user will be unable to really remove a key, i.e. unable to "lock" their encrypted files, if another user has added the same key. But in a sense, this is actually a good thing because it will avoid providing a false notion of security where a key appears to have been removed when actually it's still in memory, available to any attacker who compromises the operating system kernel. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-12fscrypt: v2 encryption policy supportEric Biggers
Add a new fscrypt policy version, "v2". It has the following changes from the original policy version, which we call "v1" (*): - Master keys (the user-provided encryption keys) are only ever used as input to HKDF-SHA512. This is more flexible and less error-prone, and it avoids the quirks and limitations of the AES-128-ECB based KDF. Three classes of cryptographically isolated subkeys are defined: - Per-file keys, like used in v1 policies except for the new KDF. - Per-mode keys. These implement the semantics of the DIRECT_KEY flag, which for v1 policies made the master key be used directly. These are also planned to be used for inline encryption when support for it is added. - Key identifiers (see below). - Each master key is identified by a 16-byte master_key_identifier, which is derived from the key itself using HKDF-SHA512. This prevents users from associating the wrong key with an encrypted file or directory. This was easily possible with v1 policies, which identified the key by an arbitrary 8-byte master_key_descriptor. - The key must be provided in the filesystem-level keyring, not in a process-subscribed keyring. The following UAPI additions are made: - The existing ioctl FS_IOC_SET_ENCRYPTION_POLICY can now be passed a fscrypt_policy_v2 to set a v2 encryption policy. It's disambiguated from fscrypt_policy/fscrypt_policy_v1 by the version code prefix. - A new ioctl FS_IOC_GET_ENCRYPTION_POLICY_EX is added. It allows getting the v1 or v2 encryption policy of an encrypted file or directory. The existing FS_IOC_GET_ENCRYPTION_POLICY ioctl could not be used because it did not have a way for userspace to indicate which policy structure is expected. The new ioctl includes a size field, so it is extensible to future fscrypt policy versions. - The ioctls FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY, and FS_IOC_GET_ENCRYPTION_KEY_STATUS now support managing keys for v2 encryption policies. Such keys are kept logically separate from keys for v1 encryption policies, and are identified by 'identifier' rather than by 'descriptor'. The 'identifier' need not be provided when adding a key, since the kernel will calculate it anyway. This patch temporarily keeps adding/removing v2 policy keys behind the same permission check done for adding/removing v1 policy keys: capable(CAP_SYS_ADMIN). However, the next patch will carefully take advantage of the cryptographically secure master_key_identifier to allow non-root users to add/remove v2 policy keys, thus providing a full replacement for v1 policies. (*) Actually, in the API fscrypt_policy::version is 0 while on-disk fscrypt_context::format is 1. But I believe it makes the most sense to advance both to '2' to have them be in sync, and to consider the numbering to start at 1 except for the API quirk. Reviewed-by: Paul Crowley <paulcrowley@google.com> Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-12fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctlEric Biggers
Add a new fscrypt ioctl, FS_IOC_GET_ENCRYPTION_KEY_STATUS. Given a key specified by 'struct fscrypt_key_specifier' (the same way a key is specified for the other fscrypt key management ioctls), it returns status information in a 'struct fscrypt_get_key_status_arg'. The main motivation for this is that applications need to be able to check whether an encrypted directory is "unlocked" or not, so that they can add the key if it is not, and avoid adding the key (which may involve prompting the user for a passphrase) if it already is. It's possible to use some workarounds such as checking whether opening a regular file fails with ENOKEY, or checking whether the filenames "look like gibberish" or not. However, no workaround is usable in all cases. Like the other key management ioctls, the keyrings syscalls may seem at first to be a good fit for this. Unfortunately, they are not. Even if we exposed the keyring ID of the ->s_master_keys keyring and gave everyone Search permission on it (note: currently the keyrings permission system would also allow everyone to "invalidate" the keyring too), the fscrypt keys have an additional state that doesn't map cleanly to the keyrings API: the secret can be removed, but we can be still tracking the files that were using the key, and the removal can be re-attempted or the secret added again. After later patches, some applications will also need a way to determine whether a key was added by the current user vs. by some other user. Reserved fields are included in fscrypt_get_key_status_arg for this and other future extensions. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-12fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctlEric Biggers
Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY. This ioctl removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY. It wipes the secret key itself, then "locks" the encrypted files and directories that had been unlocked using that key -- implemented by evicting the relevant dentries and inodes from the VFS caches. The problem this solves is that many fscrypt users want the ability to remove encryption keys, causing the corresponding encrypted directories to appear "locked" (presented in ciphertext form) again. Moreover, users want removing an encryption key to *really* remove it, in the sense that the removed keys cannot be recovered even if kernel memory is compromised, e.g. by the exploit of a kernel security vulnerability or by a physical attack. This is desirable after a user logs out of the system, for example. In many cases users even already assume this to be the case and are surprised to hear when it's not. It is not sufficient to simply unlink the master key from the keyring (or to revoke or invalidate it), since the actual encryption transform objects are still pinned in memory by their inodes. Therefore, to really remove a key we must also evict the relevant inodes. Currently one workaround is to run 'sync && echo 2 > /proc/sys/vm/drop_caches'. But, that evicts all unused inodes in the system rather than just the inodes associated with the key being removed, causing severe performance problems. Moreover, it requires root privileges, so regular users can't "lock" their encrypted files. Another workaround, used in Chromium OS kernels, is to add a new VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of drop_caches that operates on a single super_block. It does: shrink_dcache_sb(sb); invalidate_inodes(sb, false); But it's still a hack. Yet, the major users of filesystem encryption want this feature badly enough that they are actually using these hacks. To properly solve the problem, start maintaining a list of the inodes which have been "unlocked" using each master key. Originally this wasn't possible because the kernel didn't keep track of in-use master keys at all. But, with the ->s_master_keys keyring it is now possible. Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY. It finds the specified master key in ->s_master_keys, then wipes the secret key itself, which prevents any additional inodes from being unlocked with the key. Then, it syncs the filesystem and evicts the inodes in the key's list. The normal inode eviction code will free and wipe the per-file keys (in ->i_crypt_info). Note that freeing ->i_crypt_info without evicting the inodes was also considered, but would have been racy. Some inodes may still be in use when a master key is removed, and we can't simply revoke random file descriptors, mmap's, etc. Thus, the ioctl simply skips in-use inodes, and returns -EBUSY to indicate that some inodes weren't evicted. The master key *secret* is still removed, but the fscrypt_master_key struct remains to keep track of the remaining inodes. Userspace can then retry the ioctl to evict the remaining inodes. Alternatively, if userspace adds the key again, the refreshed secret will be associated with the existing list of inodes so they remain correctly tracked for future key removals. The ioctl doesn't wipe pagecache pages. Thus, we tolerate that after a kernel compromise some portions of plaintext file contents may still be recoverable from memory. This can be solved by enabling page poisoning system-wide, which security conscious users may choose to do. But it's very difficult to solve otherwise, e.g. note that plaintext file contents may have been read in other places than pagecache pages. Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is initially restricted to privileged users only. This is sufficient for some use cases, but not all. A later patch will relax this restriction, but it will require introducing key hashes, among other changes. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-12fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctlEric Biggers
Add a new fscrypt ioctl, FS_IOC_ADD_ENCRYPTION_KEY. This ioctl adds an encryption key to the filesystem's fscrypt keyring ->s_master_keys, making any files encrypted with that key appear "unlocked". Why we need this ~~~~~~~~~~~~~~~~ The main problem is that the "locked/unlocked" (ciphertext/plaintext) status of encrypted files is global, but the fscrypt keys are not. fscrypt only looks for keys in the keyring(s) the process accessing the filesystem is subscribed to: the thread keyring, process keyring, and session keyring, where the session keyring may contain the user keyring. Therefore, userspace has to put fscrypt keys in the keyrings for individual users or sessions. But this means that when a process with a different keyring tries to access encrypted files, whether they appear "unlocked" or not is nondeterministic. This is because it depends on whether the files are currently present in the inode cache. Fixing this by consistently providing each process its own view of the filesystem depending on whether it has the key or not isn't feasible due to how the VFS caches work. Furthermore, while sometimes users expect this behavior, it is misguided for two reasons. First, it would be an OS-level access control mechanism largely redundant with existing access control mechanisms such as UNIX file permissions, ACLs, LSMs, etc. Encryption is actually for protecting the data at rest. Second, almost all users of fscrypt actually do need the keys to be global. The largest users of fscrypt, Android and Chromium OS, achieve this by having PID 1 create a "session keyring" that is inherited by every process. This works, but it isn't scalable because it prevents session keyrings from being used for any other purpose. On general-purpose Linux distros, the 'fscrypt' userspace tool [1] can't similarly abuse the session keyring, so to make 'sudo' work on all systems it has to link all the user keyrings into root's user keyring [2]. This is ugly and raises security concerns. Moreover it can't make the keys available to system services, such as sshd trying to access the user's '~/.ssh' directory (see [3], [4]) or NetworkManager trying to read certificates from the user's home directory (see [5]); or to Docker containers (see [6], [7]). By having an API to add a key to the *filesystem* we'll be able to fix the above bugs, remove userspace workarounds, and clearly express the intended semantics: the locked/unlocked status of an encrypted directory is global, and encryption is orthogonal to OS-level access control. Why not use the add_key() syscall ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We use an ioctl for this API rather than the existing add_key() system call because the ioctl gives us the flexibility needed to implement fscrypt-specific semantics that will be introduced in later patches: - Supporting key removal with the semantics such that the secret is removed immediately and any unused inodes using the key are evicted; also, the eviction of any in-use inodes can be retried. - Calculating a key-dependent cryptographic identifier and returning it to userspace. - Allowing keys to be added and removed by non-root users, but only keys for v2 encryption policies; and to prevent denial-of-service attacks, users can only remove keys they themselves have added, and a key is only really removed after all users who added it have removed it. Trying to shoehorn these semantics into the keyrings syscalls would be very difficult, whereas the ioctls make things much easier. However, to reuse code the implementation still uses the keyrings service internally. Thus we get lockless RCU-mode key lookups without having to re-implement it, and the keys automatically show up in /proc/keys for debugging purposes. References: [1] https://github.com/google/fscrypt [2] https://goo.gl/55cCrI#heading=h.vf09isp98isb [3] https://github.com/google/fscrypt/issues/111#issuecomment-444347939 [4] https://github.com/google/fscrypt/issues/116 [5] https://bugs.launchpad.net/ubuntu/+source/fscrypt/+bug/1770715 [6] https://github.com/google/fscrypt/issues/128 [7] https://askubuntu.com/questions/1130306/cannot-run-docker-on-an-encrypted-filesystem Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-12fscrypt: use FSCRYPT_* definitions, not FS_*Eric Biggers
Update fs/crypto/ to use the new names for the UAPI constants rather than the old names, then make the old definitions conditional on !__KERNEL__. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-12fscrypt: use FSCRYPT_ prefix for uapi constantsEric Biggers
Prefix all filesystem encryption UAPI constants except the ioctl numbers with "FSCRYPT_" rather than with "FS_". This namespaces the constants more appropriately and makes it clear that they are related specifically to the filesystem encryption feature, and to the 'fscrypt_*' structures. With some of the old names like "FS_POLICY_FLAGS_VALID", it was not immediately clear that the constant had anything to do with encryption. This is also useful because we'll be adding more encryption-related constants, e.g. for the policy version, and we'd otherwise have to choose whether to use unclear names like FS_POLICY_V1 or inconsistent names like FS_ENCRYPTION_POLICY_V1. For source compatibility with existing userspace programs, keep the old names defined as aliases to the new names. Finally, as long as new names are being defined anyway, I skipped defining new names for the fscrypt mode numbers that aren't actually used: INVALID (0), AES_256_GCM (2), AES_256_CBC (3), SPECK128_256_XTS (7), and SPECK128_256_CTS (8). Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-12fs, fscrypt: move uapi definitions to new header <linux/fscrypt.h>Eric Biggers
More fscrypt definitions are being added, and we shouldn't use a disproportionate amount of space in <linux/fs.h> for fscrypt stuff. So move the fscrypt definitions to a new header <linux/fscrypt.h>. For source compatibility with existing userspace programs, <linux/fs.h> still includes the new header. Reviewed-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Biggers <ebiggers@google.com>
2019-08-11Merge tag 'v5.3-rc4' into nextDmitry Torokhov
Sync up with mainline to bring in device_property_count_u32 andother newer APIs.
2019-08-12Merge 5.3-rc4 into usb-nextGreg Kroah-Hartman
We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-12Merge 5.3-rc4 into char-misc-nextGreg Kroah-Hartman
We need the char-misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-11drop_monitor: Expose tail drop counterIdo Schimmel
Previous patch made the length of the per-CPU skb drop list configurable. Expose a counter that shows how many packets could not be enqueued to this list. This allows users determine the desired queue length. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-11drop_monitor: Make drop queue length configurableIdo Schimmel
In packet alert mode, each CPU holds a list of dropped skbs that need to be processed in process context and sent to user space. To avoid exhausting the system's memory the maximum length of this queue is currently set to 1000. Allow users to tune the length of this queue according to their needs. The configured length is reported to user space when drop monitor configuration is queried. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-11drop_monitor: Add a command to query current configurationIdo Schimmel
Users should be able to query the current configuration of drop monitor before they start using it. Add a command to query the existing configuration which currently consists of alert mode and packet truncation length. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-11drop_monitor: Allow truncation of dropped packetsIdo Schimmel
When sending dropped packets to user space it is not always necessary to copy the entire packet as usually only the headers are of interest. Allow user to specify the truncation length and add the original length of the packet as additional metadata to the netlink message. By default no truncation is performed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-11drop_monitor: Add packet alert modeIdo Schimmel
So far drop monitor supported only one alert mode in which a summary of locations in which packets were recently dropped was sent to user space. This alert mode is sufficient in order to understand that packets were dropped, but lacks information to perform a more detailed analysis. Add a new alert mode in which the dropped packet itself is passed to user space along with metadata: The drop location (as program counter and resolved symbol), ingress netdevice and drop timestamp. More metadata can be added in the future. To avoid performing expensive operations in the context in which kfree_skb() is invoked (can be hard IRQ), the dropped skb is cloned and queued on per-CPU skb drop list. Then, in process context the netlink message is allocated, prepared and finally sent to user space. The per-CPU skb drop list is limited to 1000 skbs to prevent exhausting the system's memory. Subsequent patches will make this limit configurable and also add a counter that indicates how many skbs were tail dropped. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-11drop_monitor: Add alert mode operationsIdo Schimmel
The next patch is going to add another alert mode in which the dropped packet is notified to user space, instead of only a summary of recent drops. Abstract the differences between the modes by adding alert mode operations. The operations are selected based on the currently configured mode and associated with the probes and the work item just before tracing starts. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-09sock: make cookie generation global instead of per netnsDaniel Borkmann
Generating and retrieving socket cookies are a useful feature that is exposed to BPF for various program types through bpf_get_socket_cookie() helper. The fact that the cookie counter is per netns is quite a limitation for BPF in practice in particular for programs in host namespace that use socket cookies as part of a map lookup key since they will be causing socket cookie collisions e.g. when attached to BPF cgroup hooks or cls_bpf on tc egress in host namespace handling container traffic from veth or ipvlan devices with peer in different netns. Change the counter to be global instead. Socket cookie consumers must assume the value as opqaue in any case. Not every socket must have a cookie generated and knowledge of the counter value itself does not provide much value either way hence conversion to global is fine. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Willem de Bruijn <willemb@google.com> Cc: Martynas Pumputis <m@lambda.lt> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-09Merge tag 'drm-fixes-2019-08-09' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Usual fixes roundup. Nothing too crazy or serious, one non-released ioctl is removed in the amdkfd driver. core: - mode parser strncpy fix i915: - GLK DSI escape clock setting - HDCP memleak fix tegra: - one gpiod/of regression fix amdgpu: - fix VCN to handle the latest navi10 firmware - fix for fan control on navi10 - properly handle SMU metrics table on navi10 - fix a resume regression on Stoney - kfd revert a GWS ioctl vmwgfx: - memory leak fix rockchip: - suspend fix" * tag 'drm-fixes-2019-08-09' of git://anongit.freedesktop.org/drm/drm: drm/vmwgfx: fix memory leak when too many retries have occurred Revert "drm/amdkfd: New IOCTL to allocate queue GWS" Revert "drm/amdgpu: fix transform feedback GDS hang on gfx10 (v2)" drm/amdgpu: pin the csb buffer on hw init for gfx v8 drm/rockchip: Suspend DP late drm/i915: Fix wrong escape clock divisor init for GLK drm/i915: fix possible memory leak in intel_hdcp_auth_downstream() drm/modes: Fix unterminated strncpy drm/amd/powerplay: correct navi10 vcn powergate drm/amd/powerplay: honor hw limit on fetching metrics data for navi10 drm/amd/powerplay: Allow changing of fan_control in smu_v11_0 drm/amd/amdgpu/vcn_v2_0: Move VCN 2.0 specific dec ring test to vcn_v2_0 drm/amd/amdgpu/vcn_v2_0: Mark RB commands as KMD commands drm/tegra: Fix gpiod_get_from_of_node() regression
2019-08-09Merge tag 'drm-misc-next-2019-08-08' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.4: UAPI Changes: - HDCP: Add a Content protection type property Cross-subsystem Changes: Core Changes: - Continue to rework the include dependencies - fb: Remove the unused drm_gem_fbdev_fb_create function - drm-dp-helper: Make the link rate calculation more tolerant to non-explicitly defined, yet supported, rates - fb-helper: Map DRM client buffer only when required, and instanciate a shadow buffer when the device has a dirty function or says so - connector: Add a helper to link the DDC adapter used by that connector to the userspace - vblank: Switch from DRM_WAIT_ON to wait_event_interruptible_timeout - dma-buf: Fix a stack corruption - ttm: Embed a drm_gem_object struct to make ttm_buffer_object a superclass of GEM, and convert drivers to use it. - hdcp: Improvements to report the content protection type to the userspace Driver Changes: - Remove drm_gem_prime_import/export from being defined in the drivers - Drop DRM_AUTH usage from drivers - Continue to drop drmP.h - Convert drivers to the connector ddc helper - ingenic: Add support for more panel-related cases - komeda: Support for dual-link - lima: Reduce logging - mpag200: Fix the cursor support - panfrost: Export GPU features register to userspace through an ioctl - pl111: Remove the CLD pads wiring support from the DT - rockchip: Rework to use DRM PSR helpers, fix a bug in the VOP_WIN_GET macro - sun4i: Improve support for color encoding and range - tinydrm: Rework SPI support, improve MIPI-DBI support, move to drm/tiny - vkms: Rework of the CRC tracking - bridges: - sii902x: Add support for audio graph card - tc358767: Rework AUX data handling code - ti-sn65dsi86: Add Debugfs and proper DSI mode flags support - panels - Support for GiantPlus GPM940B0, Sharp LQ070Y3DG3B, Ortustech COM37H3M, Novatek NT39016, Sharp LS020B1DD01D, Raydium RM67191, Boe Himax8279d, Sharp LD-D5116Z01B - Conversion of the device tree bindings to the YAML description - jh057n00900: Rework the enable / disable path - fbdev: - ssd1307fb: Support more devices based on that controller Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190808121423.xzpedzkpyecvsiy4@flea
2019-08-09usbfs: Add ioctls for runtime power managementAlan Stern
It has been requested that usbfs should implement runtime power management, instead of forcing the device to remain at full power as long as the device file is open. This patch introduces that new feature. It does so by adding three new usbfs ioctls: USBDEVFS_FORBID_SUSPEND: Prevents the device from going into runtime suspend (and causes a resume if the device is already suspended). USBDEVFS_ALLOW_SUSPEND: Allows the device to go into runtime suspend. Some time may elapse before the device actually is suspended, depending on things like the autosuspend delay. USBDEVFS_WAIT_FOR_RESUME: Blocks until the call is interrupted by a signal or at least one runtime resume has occurred since the most recent ALLOW_SUSPEND ioctl call (which may mean immediately, even if the device is currently suspended). In the latter case, the device is prevented from suspending again just as if FORBID_SUSPEND was called before the ioctl returns. For backward compatibility, when the device file is first opened runtime suspends are forbidden. The userspace program can then allow suspends whenever it wants, and either resume the device directly (by forbidding suspends again) or wait for a resume from some other source (such as a remote wakeup). URBs submitted to a suspended device will fail or will complete with an appropriate error code. This combination of ioctls is sufficient for user programs to have nearly the same degree of control over a device's runtime power behavior as kernel drivers do. Still lacking is documentation for the new ioctls. I intend to add it later, after the existing documentation for the usbfs userspace API is straightened out into a reasonable form. Suggested-by: Mayuresh Kulkarni <mkulkarni@opensource.cirrus.com> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1908071013220.1514-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-08Merge tag 'drm-fixes-5.3-2019-08-07' of ↵Dave Airlie
git://people.freedesktop.org/~agd5f/linux into drm-fixes drm-fixes-5.3-2019-08-07: amdgpu: - Fixes VCN to handle the latest navi10 firmware - Fixes for fan control on navi10 - Properly handle SMU metrics table on navi10 - Fix a resume regression on Stoney amdkfd: - Revert new GWS ioctl. It's not ready. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190807184221.3323-1-alexander.deucher@amd.com
2019-08-07Revert "drm/amdkfd: New IOCTL to allocate queue GWS"Alex Deucher
This reverts commit 1a058c3376765ee31d65e28cbbb9d4ff15120056. This interface is still in too much flux. Revert until it's sorted out. Acked-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Just minor overlapping changes in the conflicts here. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: "Yeah I should have sent a pull request last week, so there is a lot more here than usual: 1) Fix memory leak in ebtables compat code, from Wenwen Wang. 2) Several kTLS bug fixes from Jakub Kicinski (circular close on disconnect etc.) 3) Force slave speed check on link state recovery in bonding 802.3ad mode, from Thomas Falcon. 4) Clear RX descriptor bits before assigning buffers to them in stmmac, from Jose Abreu. 5) Several missing of_node_put() calls, mostly wrt. for_each_*() OF loops, from Nishka Dasgupta. 6) Double kfree_skb() in peak_usb can driver, from Stephane Grosjean. 7) Need to hold sock across skb->destructor invocation, from Cong Wang. 8) IP header length needs to be validated in ipip tunnel xmit, from Haishuang Yan. 9) Use after free in ip6 tunnel driver, also from Haishuang Yan. 10) Do not use MSI interrupts on r8169 chips before RTL8168d, from Heiner Kallweit. 11) Upon bridge device init failure, we need to delete the local fdb. From Nikolay Aleksandrov. 12) Handle erros from of_get_mac_address() properly in stmmac, from Martin Blumenstingl. 13) Handle concurrent rename vs. dump in netfilter ipset, from Jozsef Kadlecsik. 14) Setting NETIF_F_LLTX on mac80211 causes complete breakage with some devices, so revert. From Johannes Berg. 15) Fix deadlock in rxrpc, from David Howells. 16) Fix Kconfig deps of enetc driver, we must have PHYLIB. From Yue Haibing. 17) Fix mvpp2 crash on module removal, from Matteo Croce. 18) Fix race in genphy_update_link, from Heiner Kallweit. 19) bpf_xdp_adjust_head() stopped working with generic XDP when we fixes generic XDP to support stacked devices properly, fix from Jesper Dangaard Brouer. 20) Unbalanced RCU locking in rt6_update_exception_stamp_rt(), from David Ahern. 21) Several memory leaks in new sja1105 driver, from Vladimir Oltean" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (214 commits) net: dsa: sja1105: Fix memory leak on meta state machine error path net: dsa: sja1105: Fix memory leak on meta state machine normal path net: dsa: sja1105: Really fix panic on unregistering PTP clock net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well net: dsa: sja1105: Fix broken learning with vlan_filtering disabled net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus() net: sched: sample: allow accessing psample_group with rtnl net: sched: police: allow accessing police->params with rtnl net: hisilicon: Fix dma_map_single failed on arm64 net: hisilicon: fix hip04-xmit never return TX_BUSY net: hisilicon: make hip04_tx_reclaim non-reentrant tc-testing: updated vlan action tests with batch create/delete net sched: update vlan action for batched events operations net: stmmac: tc: Do not return a fragment entry net: stmmac: Fix issues when number of Queues >= 4 net: stmmac: xgmac: Fix XGMAC selftests be2net: disable bh with spin_lock in be_process_mcc net: cxgb3_main: Fix a resource leak in a error path in 'init_one()' net: ethernet: sun4i-emac: Support phy-handle property for finding PHYs net: bridge: move default pvid init/deinit to NETDEV_REGISTER/UNREGISTER ...
2019-08-06arm64: Introduce prctl() options to control the tagged user addresses ABICatalin Marinas
It is not desirable to relax the ABI to allow tagged user addresses into the kernel indiscriminately. This patch introduces a prctl() interface for enabling or disabling the tagged ABI with a global sysctl control for preventing applications from enabling the relaxed ABI (meant for testing user-space prctl() return error checking without reconfiguring the kernel). The ABI properties are inherited by threads of the same application and fork()'ed children but cleared on execve(). A Kconfig option allows the overall disabling of the relaxed ABI. The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle MTE-specific settings like imprecise vs precise exceptions. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Will Deacon <will@kernel.org>
2019-08-05fpga: dfl: fme: add DFL_FPGA_FME_PORT_RELEASE/ASSIGN ioctl support.Wu Hao
In order to support virtualization usage via PCIe SRIOV, this patch adds two ioctls under FPGA Management Engine (FME) to release and assign back the port device. In order to safely turn Port from PF into VF and enable PCIe SRIOV, it requires user to invoke this PORT_RELEASE ioctl to release port firstly to remove userspace interfaces, and then configure the PF/VF access register in FME. After disable SRIOV, it requires user to invoke this PORT_ASSIGN ioctl to attach the port back to PF. Ioctl interfaces: * DFL_FPGA_FME_PORT_RELEASE Release platform device of given port, it deletes port platform device to remove related userspace interfaces on PF. After this function, then it's safe to configure PF/VF access mode to VF, and enable VFs via SRIOV. * DFL_FPGA_FME_PORT_ASSIGN Assign platform device of given port back to PF. After configure PF/VF access mode to PF, this ioctl adds port platform device back to re-enable related userspace interfaces on PF. Signed-off-by: Zhang Yi Z <yi.z.zhang@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Acked-by: Alan Tull <atull@kernel.org> Acked-by: Moritz Fischer <mdf@kernel.org> Signed-off-by: Moritz Fischer <mdf@kernel.org> Link: https://lore.kernel.org/r/1564914022-3710-2-git-send-email-hao.wu@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-03net/socket: fix GCC8+ Wpacked-not-aligned warningsQian Cai
There are a lot of those warnings with GCC8+ 64-bit, In file included from ./include/linux/sctp.h:42, from net/core/skbuff.c:47: ./include/uapi/linux/sctp.h:395:1: warning: alignment 4 of 'struct sctp_paddr_change' is less than 8 [-Wpacked-not-aligned] } __attribute__((packed, aligned(4))); ^ ./include/uapi/linux/sctp.h:728:1: warning: alignment 4 of 'struct sctp_setpeerprim' is less than 8 [-Wpacked-not-aligned] } __attribute__((packed, aligned(4))); ^ ./include/uapi/linux/sctp.h:727:26: warning: 'sspp_addr' offset 4 in 'struct sctp_setpeerprim' isn't aligned to 8 [-Wpacked-not-aligned] struct sockaddr_storage sspp_addr; ^~~~~~~~~ ./include/uapi/linux/sctp.h:741:1: warning: alignment 4 of 'struct sctp_prim' is less than 8 [-Wpacked-not-aligned] } __attribute__((packed, aligned(4))); ^ ./include/uapi/linux/sctp.h:740:26: warning: 'ssp_addr' offset 4 in 'struct sctp_prim' isn't aligned to 8 [-Wpacked-not-aligned] struct sockaddr_storage ssp_addr; ^~~~~~~~ ./include/uapi/linux/sctp.h:792:1: warning: alignment 4 of 'struct sctp_paddrparams' is less than 8 [-Wpacked-not-aligned] } __attribute__((packed, aligned(4))); ^ ./include/uapi/linux/sctp.h:784:26: warning: 'spp_address' offset 4 in 'struct sctp_paddrparams' isn't aligned to 8 [-Wpacked-not-aligned] struct sockaddr_storage spp_address; ^~~~~~~~~~~ ./include/uapi/linux/sctp.h:905:1: warning: alignment 4 of 'struct sctp_paddrinfo' is less than 8 [-Wpacked-not-aligned] } __attribute__((packed, aligned(4))); ^ ./include/uapi/linux/sctp.h:899:26: warning: 'spinfo_address' offset 4 in 'struct sctp_paddrinfo' isn't aligned to 8 [-Wpacked-not-aligned] struct sockaddr_storage spinfo_address; ^~~~~~~~~~~~~~ This is because the commit 20c9c825b12f ("[SCTP] Fix SCTP socket options to work with 32-bit apps on 64-bit kernels.") added "packed, aligned(4)" GCC attributes to some structures but one of the members, i.e, "struct sockaddr_storage" in those structures has the attribute, "aligned(__alignof__ (struct sockaddr *)" which is 8-byte on 64-bit systems, so the commit overwrites the designed alignments for "sockaddr_storage". To fix this, "struct sockaddr_storage" needs to be aligned to 4-byte as it is only used in those packed sctp structure which is part of UAPI, and "struct __kernel_sockaddr_storage" is used in some other places of UAPI that need not to change alignments in order to not breaking userspace. Use an implicit alignment for "struct __kernel_sockaddr_storage" so it can keep the same alignments as a member in both packed and un-packed structures without breaking UAPI. Suggested-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-02crypto: add header include guardsMasahiro Yamada
Add header include guards in case they are included multiple times. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-08-01pidfd: add P_PIDFD to waitid()Christian Brauner
This adds the P_PIDFD type to waitid(). One of the last remaining bits for the pidfd api is to make it possible to wait on pidfds. With P_PIDFD added to waitid() the parts of userspace that want to use the pidfd api to exclusively manage processes can do so now. One of the things this will unblock in the future is the ability to make it possible to retrieve the exit status via waitid(P_PIDFD) for non-parent processes if handed a _suitable_ pidfd that has this feature set. This is similar to what you can do on FreeBSD with kqueue(). It might even end up being possible to wait on a process as a non-parent if an appropriate property is enabled on the pidfd. With P_PIDFD no scoping of the process identified by the pidfd is possible, i.e. it explicitly blocks things such as wait4(-1), wait4(0), waitid(P_ALL), waitid(P_PGID) etc. It only allows for semantics equivalent to wait4(pid), waitid(P_PID). Users that need scoping should rely on pid-based wait*() syscalls for now. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Howells <dhowells@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Andy Lutomirsky <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20190727222229.6516-2-christian@brauner.io
2019-07-31net: bridge: mcast: add delete due to fast-leave mdb flagNikolay Aleksandrov
In user-space there's no way to distinguish why an mdb entry was deleted and that is a problem for daemons which would like to keep the mdb in sync with remote ends (e.g. mlag) but would also like to converge faster. In almost all cases we'd like to age-out the remote entry for performance and convergence reasons except when fast-leave is enabled. In that case we want explicit immediate remote delete, thus add mdb flag which is set only when the entry is being deleted due to fast-leave. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== netfilter fixes for net The following patchset contains Netfilter fixes for your net tree: 1) memleak in ebtables from the error path for the 32/64 compat layer, from Florian Westphal. 2) Fix inverted meta ifname/ifidx matching when no interface is set on either from the input/output path, from Phil Sutter. 3) Remove goto label in nft_meta_bridge, also from Phil. 4) Missing include guard in xt_connlabel, from Masahiro Yamada. 5) Two patch to fix ipset destination MAC matching coming from Stephano Brivio, via Jozsef Kadlecsik. 6) Fix set rename and listing concurrency problem, from Shijie Luo. Patch also coming via Jozsef Kadlecsik. 7) ebtables 32/64 compat missing base chain policy in rule count, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-07-31cfg80211: add support for parsing OBBS_PD attributesJohn Crispin
Add the data structure, policy and parsing code allowing userland to send the OBSS PD information into the kernel. Signed-off-by: John Crispin <john@phrozen.org> Link: https://lore.kernel.org/r/20190730163701.18836-2-john@phrozen.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-07-30bpf: add bpf_tcp_gen_syncookie helperPetar Penkov
This helper function allows BPF programs to try to generate SYN cookies, given a reference to a listener socket. The function works from XDP and with an skb context since bpf_skc_lookup_tcp can lookup a socket in both cases. Signed-off-by: Petar Penkov <ppenkov@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-30ppdev: add header include guardMasahiro Yamada
Add a header include guard just in case. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Link: https://lore.kernel.org/r/20190728152739.9249-1-yamada.masahiro@socionext.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-29xdp: Add devmap_hash map type for looking up devices by hashed indexToke Høiland-Jørgensen
A common pattern when using xdp_redirect_map() is to create a device map where the lookup key is simply ifindex. Because device maps are arrays, this leaves holes in the map, and the map has to be sized to fit the largest ifindex, regardless of how many devices actually are actually needed in the map. This patch adds a second type of device map where the key is looked up using a hashmap, instead of being used as an array index. This allows maps to be densely packed, so they can be smaller. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-07-29Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio/vhost fixes from Michael Tsirkin: - Fixes in the iommu and balloon devices. - Disable the meta-data optimization for now - I hope we can get it fixed shortly, but there's no point in making users suffer crashes while we are working on that. * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: disable metadata prefetch optimization iommu/virtio: Update to most recent specification balloon: fix up comments mm/balloon_compaction: avoid duplicate page removal