linux.git - Linux kernel source tree

Age	Commit message	Author
2019-10-29	io_uring: add support for IORING_OP_ACCEPT	Jens Axboe
	This allows an application to call accept4() in an async fashion. Like other opcodes, we first try a non-blocking accept, then punt to async context if we have to. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-29	Merge tag 'fuse-fixes-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse	Linus Torvalds
	Pull fuse fixes from Miklos Szeredi: "Mostly virtiofs fixes, but also fixes a regression and couple of longstanding data/metadata writeback ordering issues"
	* tag 'fuse-fixes-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
	  fuse: redundant get_fuse_inode() calls in fuse_writepages_fill()
	  fuse: Add changelog entries for protocols 7.1 - 7.8
	  fuse: truncate pending writes on O_TRUNC
	  fuse: flush dirty data/metadata before non-truncate setattr
	  virtiofs: Remove set but not used variable 'fc'
	  virtiofs: Retry request submission from worker context
	  virtiofs: Count pending forgets as in_flight forgets
	  virtiofs: Set FR_SENT flag only after request has been sent
	  virtiofs: No need to check fpq->connected state
	  virtiofs: Do not end request in submission context
	  fuse: don't advise readdirplus for negative lookup
	  fuse: don't dereference req->args on finished request
	  virtio-fs: don't show mount options
	  virtio-fs: Change module name to virtiofs.ko
2019-10-29	io_uring: add support for canceling timeout requests	Jens Axboe
	We might have cases where the need for a specific timeout is gone, so add support for canceling an existing timeout operation. This works like the POLL_REMOVE command, where the application passes in the user_data of the timeout it wishes to cancel in the sqe->addr field. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-29	io_uring: add support for absolute timeouts	Jens Axboe
	This is a pretty trivial addition on top of the relative timeouts we have now, but it's handy for ensuring tighter timing for those that are building scheduling primitives on top of io_uring. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-10-29	io_uring: allow application controlled CQ ring size	Jens Axboe
	We currently size the CQ ring as twice the SQ ring, to allow some flexibility in not overflowing the CQ ring. This is done because the SQE lifetime differs from that of the IO request itself: the SQE is consumed as soon as the kernel has seen the entry. Certain applications don't need a huge SQ ring size, since they just submit IO in batches. But they may have a lot of requests pending, and hence need a big CQ ring to hold them all. By allowing the application to control the CQ ring size multiplier, we can cater to those applications more efficiently. If an application wants to define its own CQ ring size, it must set IORING_SETUP_CQSIZE in the setup flags, and fill out io_uring_params->cq_entries. The value must be a power of two. Signed-off-by: Jens Axboe <axboe@kernel.dk>
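The sizing rule described above can be sketched as follows. This is a minimal userspace model, not the kernel's code: every `sketch_`/`SKETCH_` name is hypothetical, the flag value is illustrative, and only the checks stated in the commit message are modelled (the kernel may apply further limits).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IORING_SETUP_CQSIZE; the value is illustrative. */
#define SKETCH_SETUP_CQSIZE (1U << 3)

static bool sketch_is_power_of_two(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Return the number of CQ entries to allocate, or 0 if the request is
 * rejected. Without the flag, the CQ ring is twice the SQ ring. With the
 * flag, the caller-supplied size is used and must be a power of two.
 */
static uint32_t sketch_cq_entries(uint32_t sq_entries, uint32_t flags,
				  uint32_t requested_cq_entries)
{
	if (flags & SKETCH_SETUP_CQSIZE) {
		if (!sketch_is_power_of_two(requested_cq_entries))
			return 0;
		return requested_cq_entries;
	}
	return 2 * sq_entries;
}
```

An application that submits in small batches but keeps many requests in flight would pass a small `sq_entries` together with a large power-of-two `requested_cq_entries`.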
2019-10-29	io_uring: add support for IORING_REGISTER_FILES_UPDATE	Jens Axboe
	Allows the application to remove/replace/add files to/from a file set. Passes in a struct:
	    struct io_uring_files_update { __u32 offset; __s32 *fds; };
	that holds an array of fds; the size of the array is passed in through the usual nr_args part of the io_uring_register() system call. The logic is as follows:
	1) If ->fds[i] is -1, the existing file at i + ->offset is removed from the set.
	2) If ->fds[i] is a valid fd, the existing file at i + ->offset is replaced with ->fds[i].
	For case #2, if the existing file slot is currently empty (fd == -1), the new fd is simply added to the array. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
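The update rules above can be modelled in a few lines. This is an illustrative sketch only: it treats the registered file set as a plain array of descriptors with -1 marking an empty slot, and `sketch_files_update` is a hypothetical name, not a kernel or liburing function.

```c
#include <stddef.h>

/*
 * Apply nr_args updates starting at slot `offset`.
 * Returns the number of slots that were updated.
 */
static int sketch_files_update(int *set, size_t set_len, size_t offset,
			       const int *fds, size_t nr_args)
{
	int done = 0;
	size_t i;

	for (i = 0; i < nr_args; i++) {
		size_t slot = offset + i;

		if (slot >= set_len)
			break;			/* update runs past the set */
		if (fds[i] == -1)
			set[slot] = -1;		/* case 1: remove the file */
		else
			set[slot] = fds[i];	/* case 2: replace, or add if empty */
		done++;
	}
	return done;
}
```

In this model, removing from an already empty slot is a no-op, and adding to an empty slot is the same assignment as replacing a populated one.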
2019-10-28	net: Fix misspellings of "configure" and "configuration"	Geert Uytterhoeven
	Fix various misspellings of "configuration" and "configure". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-28	Merge tag 'v5.4-rc5' into rdma.git for-next	Jason Gunthorpe
	Linux 5.4-rc5 For dependencies in the next patches Conflict resolved by keeping the delete of the unlock. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28	seccomp: rework define for SECCOMP_USER_NOTIF_FLAG_CONTINUE	Christian Brauner
	Switch from BIT(0) to (1UL << 0). First, there are already two different forms used in the header, so there's no need to add a third. Second, the BIT() macro is kernel internal and afaict not actually exposed to userspace. Maybe there's some magic there I'm missing but it definitely causes issues when compiling a program that tries to use SECCOMP_USER_NOTIF_FLAG_CONTINUE. It currently fails in the following way:
	    # github.com/lxc/lxd/lxd
	    /usr/bin/ld: $WORK/b001/_x003.o: in function `__do_user_notification_continue':
	    lxd/main_checkfeature.go:240: undefined reference to `BIT'
	    collect2: error: ld returned 1 exit status
	Switching to (1UL << 0) should prevent that and is more in line with what is already done in the rest of the header. Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20191024212539.4059-1-christian.brauner@ubuntu.com Signed-off-by: Kees Cook <keescook@chromium.org>
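A small sketch of why the plain shift form is safer in an exported header. The `SKETCH_` and `sketch_` names are hypothetical stand-ins; the point is only that `(1UL << 0)` is a self-contained constant expression, whereas `BIT(0)` depends on a macro that userspace does not have and is therefore treated as a call to an undefined function.

```c
/*
 * A header exported to userspace should not depend on kernel-internal
 * macros. The flag below needs nothing but the C language itself.
 */
#define SKETCH_FLAG_CONTINUE (1UL << 0)

/* Set the (hypothetical) continue flag in a flags word. */
static unsigned long sketch_set_continue(unsigned long flags)
{
	return flags | SKETCH_FLAG_CONTINUE;
}

/* Test whether the (hypothetical) continue flag is set. */
static int sketch_has_continue(unsigned long flags)
{
	return (flags & SKETCH_FLAG_CONTINUE) != 0;
}
```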
2019-10-28	RDMA/vmw_pvrdma: Use resource ids from physical device if available	Bryan Tan
	This change allows the RDMA stack to use physical resource numbers if they are passed up from the device. This is accomplished by separating the concept of the QP number from the QP handle. Previously, the two were the same, as the QP number was exposed to the guest and also used to reference a virtual QP in the device backend. With physical resource numbers exposed, the QP number given to the guest is the number assigned from the physical HCA's QP, while the QP handle is still the internal handle used to reference a virtual QP. Regardless of whether the device is exposing physical ids, the driver will still try to pick up the QP handle from the backend if possible. The MR keys exposed to the guest will also be the MR keys created by the physical HCA, instead of virtual MR keys. The distinction between handle and keys is already present for MRs so there is no need to do anything special here. A new version of the create QP response has been added to the device API to pass up the QP number and handle. The driver will also report these to userspace in the udata response if userspace supports it or not create the queuepair if not. I also had to do a refactor of the destroy qp code to reuse it if we fail to copy to userspace. Link: https://lore.kernel.org/r/20191028181444.19448-1-aditr@vmware.com Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Bryan Tan <bryantan@vmware.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28	rdma: Remove nes ABI header	Jason Gunthorpe
	This was missed when nes was removed. Fixes: 2d3c72ed5041 ("rdma: Remove nes") Link: https://lore.kernel.org/r/20191024135059.GA20084@ziepe.ca Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-28	ASoC: SOF: token: add tokens for PCM compatible with D0i3 substate	Keyon Jie
	Add stream token SOF_TKN_STREAM_PLAYBACK_COMPATIBLE_D0I3 and SOF_TKN_STREAM_CAPTURE_COMPATIBLE_D0I3 to denote if the stream can be opened at low power d0i3 status or not. Signed-off-by: Keyon Jie <yang.jie@linux.intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20191025224122.7718-9-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2019-10-26	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf-next 2019-10-27 The following pull-request contains BPF updates for your net-next tree. We've added 52 non-merge commits during the last 11 day(s) which contain a total of 65 files changed, 2604 insertions(+), 1100 deletions(-). The main changes are:
	1) Revolutionize BPF tracing by using in-kernel BTF to type check BPF assembly code. The work here teaches BPF verifier to recognize kfree_skb()'s first argument as 'struct sk_buff *' in tracepoints such that verifier allows direct use of bpf_skb_event_output() helper used in tc BPF et al (w/o probing memory access) that dumps skb data into perf ring buffer. Also add direct loads to probe memory in order to speed up/replace bpf_probe_read() calls, from Alexei Starovoitov.
	2) Big batch of changes to improve libbpf and BPF kselftests. Besides others: generalization of libbpf's CO-RE relocation support to now also include field existence relocations, revamp the BPF kselftest Makefile to add test runner concept allowing to exercise various ways to build BPF programs, and teach bpf_object__open() and friends to automatically derive BPF program type/expected attach type from section names to ease their use, from Andrii Nakryiko.
	3) Fix deadlock in stackmap's build-id lookup on rq_lock(), from Song Liu.
	4) Allow to read BTF as raw data from bpftool. Most notable use case is to dump /sys/kernel/btf/vmlinux through this, from Jiri Olsa.
	5) Use bpf_redirect_map() helper in libbpf's AF_XDP helper prog which manages to improve "rx_drop" performance by ~4%, from Björn Töpel.
	6) Fix to restore the flow dissector after reattach BPF test and also fix error handling in bpf_helper_defs.h generation, from Jakub Sitnicki.
	7) Improve verifier's BTF ctx access for use outside of raw_tp, from Martin KaFai Lau.
	8) Improve documentation for AF_XDP with new sections and to reflect latest features, from Magnus Karlsson.
	9) Add back 'version' section parsing to libbpf for old kernels, from John Fastabend.
	10) Fix strncat bounds error in libbpf's libbpf_prog_type_by_name(), from KP Singh.
	11) Turn on -mattr=+alu32 in LLVM by default for BPF kselftests in order to improve insn coverage for built BPF progs, from Yonghong Song.
	12) Misc minor cleanups and fixes, from various others.
	==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-26	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next	David S. Miller
	Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for net-next, more specifically:
	* Updates for ipset:
	1) Coding style fix for ipset comment extension, from Jeremy Sowden.
	2) De-inline many functions in ipset, from Jeremy Sowden.
	3) Move ipset function definition from header to source file.
	4) Move ip_set_put_flags() to source, export it as a symbol, remove inline.
	5) Move range_to_mask() to the source file where this is used.
	6) Move ip_set_get_ip_port() to the source file where this is used.
	* IPVS selftests and netns improvements:
	7) Two patches to speedup ipvs netns dismantle, from Haishuang Yan.
	8) Three patches to add selftest script for ipvs, also from Haishuang Yan.
	* Conntrack updates and new nf_hook_slow_list() function:
	9) Document ct ecache extension, from Florian Westphal.
	10) Skip ct extensions from ctnetlink dump, from Florian.
	11) Free ct extension immediately, from Florian.
	12) Skip access to ecache extension from nf_ct_deliver_cached_events() this is not correct as reported by Syzbot.
	13) Add and use nf_hook_slow_list(), from Florian.
	* Flowtable infrastructure updates:
	14) Move priority to nf_flowtable definition.
	15) Dynamic allocation of per-device hooks in flowtables.
	16) Allow to include netdevice only once in flowtable definitions.
	17) Rise maximum number of devices per flowtable.
	* Netfilter hardware offload infrastructure updates:
	18) Add nft_flow_block_chain() helper function.
	19) Pass callback list to nft_setup_cb_call().
	20) Add nft_flow_cls_offload_setup() helper function.
	21) Remove rules for the unregistered device via netdevice event.
	22) Support for multiple devices in a basechain definition at the ingress hook.
	22) Add nft_chain_offload_cmd() helper function.
	23) Add nft_flow_block_offload_init() helper function.
	24) Rewind in case of failing to bind multiple devices to hook.
	25) Typo in IPv6 tproxy module description, from Norman Rasmussen.
	==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-25	tcp: add TCP_INFO status for failed client TFO	Jason Baron
	The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether or not data-in-SYN was ack'd on both the client and server side. We'd like to gather more information on the client-side in the failure case in order to indicate the reason for the failure. This can be useful for not only debugging TFO, but also for creating TFO socket policies. For example, if a middle box removes the TFO option or drops a data-in-SYN, we can detect this case, and turn off TFO for these connections saving the extra retransmits. The newly added tcpi_fastopen_client_fail status is 2 bits and has the following 4 states:
	1) TFO_STATUS_UNSPEC Catch-all state which includes when TFO is disabled via black hole detection, which is indicated via LINUX_MIB_TCPFASTOPENBLACKHOLE.
	2) TFO_COOKIE_UNAVAILABLE If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie is available in the cache.
	3) TFO_DATA_NOT_ACKED Data was sent with SYN, we received a SYN/ACK but it did not cover the data portion. Cookie is not accepted by server because the cookie may be invalid or the server may be overloaded.
	4) TFO_SYN_RETRANSMITTED Data was sent with SYN, we received a SYN/ACK which did not cover the data after at least 1 additional SYN was sent (without data). It may be the case that a middle-box is dropping data-in-SYN packets. Thus, it would be more efficient to not use TFO on this connection to avoid extra retransmits during connection establishment.
	These new fields do not cover all the cases where TFO may fail, but other failures, such as SYN/ACK + data being dropped, will result in the connection not becoming established. And a connection blackhole after session establishment shows up as a stalled connection.
	Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Christoph Paasch <cpaasch@apple.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
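The four states above fit in a 2-bit field, which a monitoring tool could decode roughly as follows. This is a sketch under assumptions: the `SKETCH_` enum mirrors the names in the commit message, the numeric values are assumed to follow the listed order, and the "disable TFO" policy is a hypothetical example of the socket policy the message mentions.

```c
/* Local mirror of the four client-side failure states (values assumed). */
enum sketch_tfo_client_fail {
	SKETCH_TFO_STATUS_UNSPEC = 0,
	SKETCH_TFO_COOKIE_UNAVAILABLE,
	SKETCH_TFO_DATA_NOT_ACKED,
	SKETCH_TFO_SYN_RETRANSMITTED,
};

/* Map the 2-bit status to a printable name. */
static const char *sketch_tfo_fail_name(unsigned int status)
{
	switch (status & 0x3) {		/* only two bits are meaningful */
	case SKETCH_TFO_COOKIE_UNAVAILABLE:
		return "TFO_COOKIE_UNAVAILABLE";
	case SKETCH_TFO_DATA_NOT_ACKED:
		return "TFO_DATA_NOT_ACKED";
	case SKETCH_TFO_SYN_RETRANSMITTED:
		return "TFO_SYN_RETRANSMITTED";
	default:
		return "TFO_STATUS_UNSPEC";
	}
}

/*
 * Hypothetical policy: stop using TFO towards a peer once a SYN had to be
 * retransmitted without data, since a middle-box may be dropping
 * data-in-SYN packets on that path.
 */
static int sketch_should_disable_tfo(unsigned int status)
{
	return (status & 0x3) == SKETCH_TFO_SYN_RETRANSMITTED;
}
```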
2019-10-25	fcntl: fix typo in RWH_WRITE_LIFE_NOT_SET r/w hint name	Eugene Syromiatnikov
	According to commit message in the original commit c75b1d9421f8 ("fs: add fcntl() interface for setting/getting write life time hints"), as well as userspace library[1] and man page update[2], R/W hint constants are intended to have RWH_* prefix. However, RWF_WRITE_LIFE_NOT_SET retained "RWF_*" prefix used in the early versions of the proposed patch set[3]. Rename it and provide the old name as a synonym for the new one for backward compatibility. [1] https://github.com/axboe/fio/commit/bd553af6c849 [2] https://github.com/mkerrisk/man-pages/commit/580082a186fd [3] https://www.mail-archive.com/linux-block@vger.kernel.org/msg09638.html Fixes: c75b1d9421f8 ("fs: add fcntl() interface for setting/getting write life time hints") Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
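The rename-with-synonym pattern described above can be sketched as below. The `SKETCH_` names stand in for the real constants, and the value 0 is an assumption made for illustration; the point is that the old spelling becomes an alias of the new one, so existing userspace code keeps compiling.

```c
/* New, correctly prefixed name (value assumed for illustration). */
#define SKETCH_RWH_WRITE_LIFE_NOT_SET	0

/* Old name kept as a synonym for backward compatibility. */
#define SKETCH_RWF_WRITE_LIFE_NOT_SET	SKETCH_RWH_WRITE_LIFE_NOT_SET

/* Code written against either spelling sees the same value. */
static int sketch_hint_is_unset(int hint)
{
	return hint == SKETCH_RWH_WRITE_LIFE_NOT_SET;
}
```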
2019-10-26	Merge tag 'drm-next-5.5-2019-10-09' of git://people.freedesktop.org/~agd5f/linux into drm-next	Dave Airlie
	drm-next-5.5-2019-10-09:
	amdgpu:
	- Additional RAS enablement for vega20
	- RAS page retirement and bad page storage in EEPROM
	- No GPU reset with unrecoverable RAS errors
	- Reserve vram for page tables rather than trying to evict
	- Fix issues with GPU reset and xgmi hives
	- DC i2c over aux fixes
	- Direct submission for clears, PTE/PDE updates
	- Improvements to help support recoverable GPU page faults
	- Silence harmless SAD block messages
	- Clean up code for creating a bo at a fixed location
	- Initial DC HDCP support
	- Lots of documentation fixes
	- GPU reset for renoir
	- Add IH clockgating support for soc15 asics
	- Powerplay improvements
	- DC MST cleanups
	- Add support for MSI-X
	- Misc cleanups and bug fixes
	amdkfd:
	- Query KFD device info by asic type rather than pci ids
	- Add navi14 support
	- Add renoir support
	- Add navi12 support
	- gfx10 trap handler improvements
	- pasid cleanups
	- Check against device cgroup
	ttm:
	- Return -EBUSY with pipelining with no_gpu_wait
	radeon:
	- Silence harmless SAD block messages
	device_cgroup:
	- Export devcgroup_check_permission
	Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191010041713.3412-1-alexander.deucher@amd.com
2019-10-26	crypto: ccp - Retry SEV INIT command in case of integrity check failure.	Ashish Kalra
	SEV INIT command loads the SEV related persistent data from NVS and initializes the platform context. The firmware validates the persistent state. If validation fails, the firmware will reset the persistent state and return an integrity check failure status. At this point, a subsequent INIT command should succeed, so retry the command. The INIT command retry is only done during driver initialization. Additional enums along with SEV_RET_SECURE_DATA_INVALID are added to sev_ret_code to maintain continuity and relevance of enum values. Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
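The retry described above amounts to "run INIT, and run it once more if the firmware reports an integrity check failure". A minimal sketch with a fake firmware callback follows; all `sketch_`/`SKETCH_` names and the numeric status values are hypothetical, and this is not the driver's code.

```c
/* Illustrative status codes; the real sev_ret_code values are not used here. */
#define SKETCH_RET_SUCCESS		0
#define SKETCH_RET_SECURE_DATA_INVALID	1

typedef int (*sketch_init_fn)(void *ctx);

/*
 * Issue INIT. If the firmware reports that the persistent state failed
 * validation, it has already reset that state, so a single retry is
 * expected to succeed.
 */
static int sketch_sev_init_with_retry(sketch_init_fn init, void *ctx)
{
	int rc = init(ctx);

	if (rc == SKETCH_RET_SECURE_DATA_INVALID)
		rc = init(ctx);
	return rc;
}

/* Fake firmware: fails the integrity check on the first call only. */
static int sketch_fake_firmware_init(void *ctx)
{
	int *calls = ctx;

	if ((*calls)++ == 0)
		return SKETCH_RET_SECURE_DATA_INVALID;
	return SKETCH_RET_SUCCESS;
}
```

Retrying exactly once keeps a persistently failing device from looping forever during initialization.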
2019-10-25	dma-buf: Add dma-buf heaps framework	Andrew F. Davis
	This framework allows a unified userspace interface for dma-buf exporters, allowing userland to allocate specific types of memory for use in dma-buf sharing. Each heap is given its own device node, which a user can allocate a dma-buf fd from using the DMA_HEAP_IOC_ALLOC ioctl. This code is an evolution of the Android ION implementation, and a big thanks is due to its authors/maintainers over time for their effort: Rebecca Schultz Zavin, Colin Cross, Benjamin Gaignard, Laura Abbott, and many other contributors! Cc: Laura Abbott <labbott@redhat.com> Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Liam Mark <lmark@codeaurora.org> Cc: Pratik Patel <pratikp@codeaurora.org> Cc: Brian Starkey <Brian.Starkey@arm.com> Cc: Vincent Donnefort <Vincent.Donnefort@arm.com> Cc: Sudipto Paul <Sudipto.Paul@arm.com> Cc: Andrew F. Davis <afd@ti.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chenbo Feng <fengc@google.com> Cc: Alistair Strachan <astrachan@google.com> Cc: Hridya Valsaraju <hridya@google.com> Cc: Hillf Danton <hdanton@sina.com> Cc: dri-devel@lists.freedesktop.org Reviewed-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Reviewed-by: Brian Starkey <brian.starkey@arm.com> Acked-by: Laura Abbott <labbott@redhat.com> Tested-by: Ayan Kumar Halder <ayan.halder@arm.com> Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20191021190310.85221-2-john.stultz@linaro.org
2019-10-24	drm: Spelling s/connet/connect/	Geert Uytterhoeven
	Fix misspellings of "connector" and "connection" Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20191024151737.29287-1-geert+renesas@glider.be
2019-10-24	media: v4l2-core: Add new metadata format	Vandana BN
	Add new metadata format to support metadata output in vivid. Signed-off-by: Vandana BN <bnvandana@gmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-10-24	Merge remote-tracking branch 'kvmarm/kvm-arm64/stolen-time' into kvmarm-master/next	Marc Zyngier
2019-10-23	compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t	Arnd Bergmann
	The ppp_idle structure is defined in terms of __kernel_time_t, which is defined as 'long' on all architectures, and this usage is not affected by the y2038 problem since it transports a time interval rather than an absolute time. However, the ppp user space defines the same structure as time_t, which may be 64-bit wide on new libc versions even on 32-bit architectures. It's easy enough to just handle both possible structure layouts on all architectures, to deal with the possibility that a user space ppp implementation comes with its own ppp_idle structure definition, as well as to document the fact that the driver is y2038-safe. Doing this also avoids the need for a special compat mode translation, since 32-bit and 64-bit kernels now support the same interfaces. The old 32-bit structure is also available on native 64-bit architectures now, but this is harmless. Cc: netdev@vger.kernel.org Cc: linux-ppp@vger.kernel.org Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
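Handling both structure layouts can be sketched as follows. This is a userspace model with hypothetical `sketch_` names: the layout is chosen here from the buffer size, as a simplified stand-in for the driver distinguishing the two variants of the ioctl.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The two possible user space layouts of the idle-time structure. */
struct sketch_ppp_idle32 { int32_t xmit_idle; int32_t recv_idle; };
struct sketch_ppp_idle64 { int64_t xmit_idle; int64_t recv_idle; };

/*
 * Fill `buf` with the idle intervals in whichever layout matches `len`.
 * Returns 0 on success, -1 if the size matches neither layout.
 * The values are intervals, not absolute times, so narrowing to 32 bits
 * is not a y2038 problem.
 */
static int sketch_fill_idle(void *buf, size_t len, int64_t xmit, int64_t recv)
{
	if (len == sizeof(struct sketch_ppp_idle32)) {
		struct sketch_ppp_idle32 v = {
			.xmit_idle = (int32_t)xmit,
			.recv_idle = (int32_t)recv,
		};

		memcpy(buf, &v, sizeof(v));
		return 0;
	}
	if (len == sizeof(struct sketch_ppp_idle64)) {
		struct sketch_ppp_idle64 v = {
			.xmit_idle = xmit,
			.recv_idle = recv,
		};

		memcpy(buf, &v, sizeof(v));
		return 0;
	}
	return -1;
}
```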
2019-10-23	Merge drm/drm-next into drm-misc-next	Sean Paul
	Parroting Daniel's backmerge justification from 2e79e22e092acd55da0b2db066e4826d7d152c41: Thierry needs fd70c7755bf0 ("drm/bridge: tc358767: fix max_tu_symbol value") to be able to merge his dp_link patch series. Signed-off-by: Sean Paul <seanpaul@chromium.org>
2019-10-23	Revert "drm/omap: add OMAP_BO flags to affect buffer allocation"	Sean Paul
	This reverts commit 23b482252836ab3c5e6b3b20ed3038449cbc7679. This patch does not have an acceptable open source userspace implementation, and as such it does not meet the requirements for adding new UAPI. Discussion is in the Link. Link: https://lists.freedesktop.org/archives/dri-devel/2019-October/240586.html Fixes: 23b482252836 ("drm/omap: add OMAP_BO flags to affect buffer allocation") Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Jean-Jacques Hiblot <jjhiblot@ti.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Acked-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Acked-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20191022204733.235801-1-sean@poorly.run
2019-10-23	fuse: Add changelog entries for protocols 7.1 - 7.8	Alan Somers
	Retroactively add changelog entry for FUSE protocols 7.1 through 7.8. Signed-off-by: Alan Somers <asomers@FreeBSD.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-10-23	netfilter: nf_tables: support for multiple devices per netdev hook	Pablo Neira Ayuso
	This patch allows you to register one netdev basechain to multiple devices. This adds a new NFTA_HOOK_DEVS netlink attribute to specify the list of netdevices. Basechains store a list of hooks. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-23	Merge v5.4-rc4 into drm-next	Daniel Vetter
	Thierry needs fd70c7755bf0 ("drm/bridge: tc358767: fix max_tu_symbol value") to be able to merge his dp_link patch series. Some adjacent changes conflicts, plus some clashes in i915 due to cherry-picking and git trying to be helpful and leaving both versions in. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2019-10-22	Merge branch 'mlx5-rd-sgl' into rdma.git for-next	Jason Gunthorpe
	From Yamin Friedman: ==================== This series from Yamin implements a long-standing "TODO" in rw.c. It allows the driver to specify a cut-over point where it is faster to build a lkey MR rather than do a large SGL for RDMA READ operations. mlx5 HW gets a notable performance boost by switching to MRs. ==================== Based on the mlx5-next branch from git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux for dependencies
	* branch 'mlx5-rd-sgl': (3 commits)
	  RDMA/mlx5: Add capability for max sge to get optimized performance
	  RDMA/rw: Support threshold for registration vs scattering to local pages
	  net/mlx5: Expose optimal performance scatter entries capability
2019-10-21	clone3: add CLONE_CLEAR_SIGHAND	Christian Brauner
	Reset all signal handlers of the child not set to SIG_IGN to SIG_DFL. Mutually exclusive with CLONE_SIGHAND to not disturb other threads' signal handlers. In the spirit of closer cooperation between glibc developers and kernel developers (cf. [2]) this patchset came out of a discussion on the glibc mailing list for improving posix_spawn() (cf. [1], [3], [4]). Kernel support for this feature has been explicitly requested by glibc and I see no reason not to help them with this. The child helper process on Linux posix_spawn must ensure that no signal handlers are enabled, so the signal disposition must be either SIG_DFL or SIG_IGN. However, it requires a sigprocmask to obtain the current signal mask and at least _NSIG sigaction calls to reset the signal handlers for each posix_spawn call or complex state tracking that might lead to data corruption in glibc. Adding this flag lets glibc avoid these problems.
	[1]: https://www.sourceware.org/ml/libc-alpha/2019-10/msg00149.html
	[2]: https://lwn.net/Articles/799331/ '[...] by asking for better cooperation with the C-library projects in general. They should be copied on patches containing ABI changes, for example. I noted that there are often times where C-library developers wish the kernel community had done things differently; how could those be avoided in the future? Members of the audience suggested that more glibc developers should perhaps join the linux-api list. The other suggestion was to "copy Florian on everything".'
	[3]: https://www.sourceware.org/ml/libc-alpha/2019-10/msg00158.html
	[4]: https://www.sourceware.org/ml/libc-alpha/2019-10/msg00160.html
	Cc: Florian Weimer <fweimer@redhat.com> Cc: libc-alpha@sourceware.org Cc: linux-api@vger.kernel.org Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Link: https://lore.kernel.org/r/20191014104538.3096-1-christian.brauner@ubuntu.com
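For comparison, the per-signal loop that userspace would otherwise need looks roughly like this. It is a sketch of the semantics only (handlers that are neither SIG_IGN nor SIG_DFL are reset to SIG_DFL), not glibc's or the kernel's implementation; the `sketch_` names are hypothetical and the loop bound `SIGRTMAX` is an assumption about which signal numbers to visit.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Trivial handler used only to have something to reset. */
static void sketch_noop_handler(int sig)
{
	(void)sig;
}

/*
 * Reset every caught signal to SIG_DFL, leaving ignored signals ignored.
 * Returns the number of handlers that were reset.
 */
static int sketch_reset_handlers(void)
{
	int sig, reset = 0;

	for (sig = 1; sig <= SIGRTMAX; sig++) {
		struct sigaction cur, dfl;

		if (sigaction(sig, NULL, &cur) != 0)
			continue;	/* number not usable with sigaction */
		if (cur.sa_handler == SIG_IGN || cur.sa_handler == SIG_DFL)
			continue;	/* nothing to reset */

		dfl.sa_handler = SIG_DFL;
		dfl.sa_flags = 0;
		sigemptyset(&dfl.sa_mask);
		if (sigaction(sig, &dfl, NULL) == 0)
			reset++;
	}
	return reset;
}
```

Each iteration costs at least one system call, which is the per-spawn overhead the commit message says the new flag avoids.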
2019-10-21	KVM: arm64: Provide VCPU attributes for stolen time	Steven Price
	Allow user space to inform the KVM host where in the physical memory map the paravirtualized time structures should be located. User space can set an attribute on the VCPU providing the IPA base address of the stolen time structure for that VCPU. This must be repeated for every VCPU in the VM. The address is given in terms of the physical address visible to the guest and must be 64 byte aligned. The guest will discover the address via a hypercall. Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
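The alignment requirement above can be checked with a single mask before user space hands the address to the kernel. A minimal sketch follows; `sketch_stolen_time_ipa_ok` is a hypothetical helper, not part of the KVM API.

```c
#include <stdint.h>

/* The base address of the stolen time structure must be 64 byte aligned. */
#define SKETCH_STOLEN_TIME_ALIGN 64ULL

/* Return 1 if `ipa` satisfies the alignment rule, 0 otherwise. */
static int sketch_stolen_time_ipa_ok(uint64_t ipa)
{
	return (ipa & (SKETCH_STOLEN_TIME_ALIGN - 1)) == 0;
}
```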
2019-10-21	KVM: arm/arm64: Allow user injection of external data aborts	Christoffer Dall
	In some scenarios, such as a buggy guest or incorrect configuration of the VMM and firmware description data, userspace will detect a memory access to a portion of the IPA space which is not mapped to any MMIO region. In this case, the appropriate action is to inject an external abort to the guest. The kernel already has functionality to inject an external abort, but we need to wire up a signal from user space that lets user space tell the kernel to do this. It turns out, we already have the set event functionality which we can perfectly reuse for this. Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-10-21	KVM: arm/arm64: Allow reporting non-ISV data aborts to userspace	Christoffer Dall
	For a long time, if a guest accessed memory outside of a memslot using any of the load/store instructions in the architecture which don't supply decoding information in the ESR_EL2 (the ISV bit is not set), the kernel would print the following message and terminate the VM as a result of returning -ENOSYS to userspace: load/store instruction decoding not implemented The reason behind this message is that KVM assumes that all accesses outside a memslot are MMIO accesses which should be handled by userspace, and we originally expected to eventually implement some sort of decoding of load/store instructions where the ISV bit was not set. However, it turns out that many of the instructions which don't provide decoding information on abort are not safe to use for MMIO accesses, and the remaining few that would potentially make sense to use on MMIO accesses, such as those with register writeback, are not used in practice. It also turns out that fetching an instruction from guest memory can be a pretty horrible affair, involving stopping all CPUs on SMP systems, handling multiple corner cases of address translation in software, and more. It doesn't appear likely that we'll ever implement this in the kernel. What is much more common is that a user has misconfigured his/her guest and is actually not accessing an MMIO region, but just hitting some random hole in the IPA space. In this scenario, the error message above is almost misleading and has led to a great deal of confusion over the years. It is, nevertheless, ABI to userspace, and we therefore need to introduce a new capability that userspace explicitly enables to change behavior. This patch introduces KVM_CAP_ARM_NISV_TO_USER (NISV meaning Non-ISV) which does exactly that, and introduces a new exit reason to report the event to userspace. User space can then emulate an exception to the guest, restart the guest, suspend the guest, or take any other appropriate action as per the policy of the running system.
Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alexander Graf <graf@amazon.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2019-10-21	media: videodev2.h: add V4L2_DEC_CMD_FLUSH	Hans Verkuil
	Add this new V4L2_DEC_CMD_FLUSH decoder command and document it. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alexandre Courbot <acourbot@chromium.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-10-21	media: vb2: add V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF	Hans Verkuil
	This patch adds support for the V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF flag. It also adds a new V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF capability. Drivers should set vb2_queue->subsystem_flags to VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF to indicate support for this flag. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-10-21	KVM: PPC: Report single stepping capability	Fabiano Rosas
	When calling the KVM_SET_GUEST_DEBUG ioctl, userspace might request the next instruction to be single stepped via the KVM_GUESTDBG_SINGLESTEP control bit of the kvm_guest_debug structure. This patch adds the KVM_CAP_PPC_GUEST_DEBUG_SSTEP capability in order to inform userspace about the state of single stepping support. We currently don't have support for guest single stepping implemented in Book3S HV so the capability is only present for Book3S PR and BookE. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2019-10-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	David S. Miller
	Several cases of overlapping changes which were for the most part trivially resolvable. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-18	drm/fourcc: Fix undefined left shift in DRM_FORMAT_BIG_ENDIAN macros	Adam Jackson
	1<<31 is undefined because it's a signed int and C is terrible. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191018175041.613780-1-ajax@redhat.com
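The difference can be shown with a short sketch: `1 << 31` shifts a signed `int` into its sign bit, which is undefined behaviour in C, while `1U << 31` is a well-defined unsigned value. The `SKETCH_`/`sketch_` names are hypothetical stand-ins for the real macro and its users.

```c
#include <stdint.h>

/* Unsigned literal: bit 31 is set without touching a sign bit. */
#define SKETCH_FORMAT_BIG_ENDIAN (1U << 31)

/* Mark a fourcc-style format code as big endian. */
static uint32_t sketch_mark_big_endian(uint32_t fourcc)
{
	return fourcc | SKETCH_FORMAT_BIG_ENDIAN;
}

/* Test whether the big endian bit is set. */
static int sketch_is_big_endian(uint32_t fourcc)
{
	return (fourcc & SKETCH_FORMAT_BIG_ENDIAN) != 0;
}
```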
2019-10-18	drm/syncobj: extend syncobj query ability v3	Chunming Zhou
	User space needs a flexible query ability so that the UMD can get the last signaled or submitted point. v2: add sanitizer checking. v3: rebase Change-Id: I6512b430524ebabe715e602a2bf5abb0a7e780ea Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Christian König <Christian.Koenig@amd.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/series/64044/
2019-10-17	RDMA/uapi: Fix and re-organize the usage of rdma_driver_id	Yishai Hadas
	Fix 'enum rdma_driver_id' so that the other driver values stay as they were before RDMA_DRIVER_CXGB3 was deleted. As these values are UAPI, deleting one driver id must not change the others. Fixes: 30e0f6cf5acb ("RDMA/iw_cxgb3: Remove the iw_cxgb3 module from kernel") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20191015075419.18185-2-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
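Why deleting an enumerator matters for UAPI can be sketched as follows. The enum below is hypothetical and is not the real `enum rdma_driver_id`; the actual commit may preserve the values in a different way. The sketch only shows that an implicit-valued enum renumbers everything after a removed entry unless the following value is pinned explicitly.

```c
#include <assert.h>

/*
 * Hypothetical driver ids, for illustration only.
 * Originally: UNKNOWN = 0, A = 1, B = 2, C = 3, D = 4.
 * EXAMPLE_DRIVER_B was removed. Without the explicit "= 3" below, C and D
 * would silently become 2 and 3 and break existing user space.
 */
enum example_driver_id {
	EXAMPLE_DRIVER_UNKNOWN,
	EXAMPLE_DRIVER_A,
	/* value 2 stays reserved for the removed EXAMPLE_DRIVER_B */
	EXAMPLE_DRIVER_C = 3,
	EXAMPLE_DRIVER_D,
};

/* Compile-time guards that the user-visible values did not move. */
static_assert(EXAMPLE_DRIVER_C == 3, "UAPI value must not change");
static_assert(EXAMPLE_DRIVER_D == 4, "UAPI value must not change");

/* Returns 1 for ids that still name a driver, 0 otherwise. */
static inline int example_driver_id_known(unsigned int id)
{
	switch (id) {
	case EXAMPLE_DRIVER_A:
	case EXAMPLE_DRIVER_C:
	case EXAMPLE_DRIVER_D:
		return 1;
	default:
		return 0; /* includes UNKNOWN and the reserved value 2 */
	}
}
```

The static assertions turn an accidental renumbering into a build failure rather than a silent ABI break.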
2019-10-17	bpf: Check types of arguments passed into helpers	Alexei Starovoitov
	Introduce a new helper that reuses the existing skb perf_event output implementation, but can be called from raw_tracepoint programs that receive 'struct sk_buff *' as a tracepoint argument or can walk other kernel data structures to an skb pointer. In order to do that, teach the verifier to resolve true C types of bpf helpers into in-kernel BTF ids. The type of kernel pointer passed by a raw tracepoint into a bpf program will be tracked by the verifier all the way until it's passed into a helper function. For example: the kfree_skb() kernel function calls trace_kfree_skb(skb, loc); the bpf program receives that skb pointer and may eventually pass it into the bpf_skb_output() bpf helper, which in-kernel is implemented via the bpf_skb_event_output() kernel function. Its first argument in the kernel is 'struct sk_buff *'. The verifier makes sure that types match all the way. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191016032505.2089704-11-ast@kernel.org
2019-10-17	bpf: Add attach_btf_id attribute to program load	Alexei Starovoitov
	Add attach_btf_id attribute to prog_load command. It's similar to the existing expected_attach_type attribute, which is used in several cgroup based program types. Unfortunately expected_attach_type is ignored for tracing programs and cannot be reused for a new purpose. Hence introduce attach_btf_id to verify bpf programs against a given in-kernel BTF type id at load time. It is strictly checked to be valid for raw_tp programs only. In later patches it will become: btf_id == 0 semantics of existing raw_tp progs. btf_id > 0 raw_tp with BTF and additional type safety. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191016032505.2089704-5-ast@kernel.org
2019-10-16	serial: fsl_linflexuart: Be consistent with the name	Stefan-Gabriel Mirea
	For consistency reasons, spell the controller name as "LINFlexD" in comments and documentation. Signed-off-by: Stefan-Gabriel Mirea <stefan-gabriel.mirea@nxp.com> Link: https://lore.kernel.org/r/1571230107-8493-4-git-send-email-stefan-gabriel.mirea@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-15	ethtool: Add support for 400Gbps (50Gbps per lane) link modes	Jiri Pirko
	Add support for the 400Gbps speed, with link modes of 50Gbps per lane. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-15	iommu: Introduce guest PASID bind function	Jacob Pan
	Guest shared virtual address (SVA) may require host to shadow guest PASID tables. Guest PASID can also be allocated from the host via enlightened interfaces. In this case, guest needs to bind the guest mm, i.e. cr3 in guest physical address to the actual PASID table in the host IOMMU. Nesting will be turned on such that guest virtual address can go through a two level translation: - 1st level translates GVA to GPA - 2nd level translates GPA to HPA This patch introduces APIs to bind guest PASID data to the assigned device entry in the physical IOMMU. [Diagram, flattened in this listing: on the guest side, the vIOMMU PASID entry refers to the guest process mm (FL only), and a PASID cache flush carries the GP; the host shadows this, with a GP->HP* conversion; on the host side, the pIOMMU PASID entry has its FL bound for GVA-GPA and, for nested translation, its SL set to GPA-HPA.] Where: - FL = First level/stage one page tables - SL = Second level/stage two page tables - GP = Guest PASID - HP = Host PASID * Conversion needed if non-identity GP-HP mapping option is chosen. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
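The two-level translation described in the commit message above (GVA to GPA, then GPA to HPA) can be sketched with toy tables. This is a deliberately simplified model, not the IOMMU API introduced by the patch: all `ex_*` and `EX_*` names are hypothetical, each "page table" is a flat array, and real hardware walks multi-level tables in which the first-level table pointers are themselves guest physical addresses.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12
#define EX_PAGE_MASK  ((1ULL << EX_PAGE_SHIFT) - 1)
#define EX_NPAGES     4
#define EX_FAULT      UINT64_MAX /* sentinel for "no translation" */

/* Toy flat table: index is the input page number, value the output page. */
struct ex_table {
	uint64_t out_page[EX_NPAGES];
};

/* Translate one address through one table, keeping the page offset. */
static inline uint64_t ex_translate(const struct ex_table *t, uint64_t addr)
{
	uint64_t page = addr >> EX_PAGE_SHIFT;

	if (addr == EX_FAULT || page >= EX_NPAGES)
		return EX_FAULT;
	return (t->out_page[page] << EX_PAGE_SHIFT) | (addr & EX_PAGE_MASK);
}

/*
 * Nested translation: the first level (owned by the guest) maps GVA to GPA,
 * the second level (owned by the host) maps GPA to HPA.
 */
static inline uint64_t ex_nested_translate(const struct ex_table *fl,
					   const struct ex_table *sl,
					   uint64_t gva)
{
	uint64_t gpa = ex_translate(fl, gva);

	return ex_translate(sl, gpa);
}
```

A fault at either level propagates to the caller, which mirrors the idea that both the guest-owned and the host-owned mappings must be valid for a device access to succeed.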
2019-10-15	iommu: Introduce cache_invalidate API	Yi L Liu
	In any virtualization use case, when the first translation stage is "owned" by the guest OS, the host IOMMU driver has no knowledge of caching structure updates unless the guest invalidation activities are trapped by the virtualizer and passed down to the host. Since the invalidation data can be obtained from user space and will be written into the physical IOMMU, we must allow security checks at various layers. Therefore, a generic invalidation data format is proposed here; model-specific IOMMU drivers need to convert it into their own format. Signed-off-by: Yi L Liu <yi.l.liu@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-10-15	Merge drm/drm-next into drm-intel-next-queued	Joonas Lahtinen
	Backmerging to pull in HDR DP code: https://lists.freedesktop.org/archives/dri-devel/2019-September/236453.html Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-10-14	drm/i915/perf: allow holding preemption on filtered ctx	Lionel Landwerlin
	We would like to make use of perf in Vulkan. The Vulkan API is much lower level than OpenGL, with applications directly exposed to the concept of command buffers (pretty much equivalent to our batch buffers). In Vulkan, queries are always limited in scope to a command buffer. In OpenGL, the lack of a command buffer concept meant that a query's duration could span multiple command buffers. With that restriction gone in Vulkan, we would like to simplify measuring performance just by measuring the deltas between the counter snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the more complex scheme we currently have in the GL driver, using 2 MI_RECORD_PERF_COUNT commands and doing some post processing on the stream of OA reports, coming from the global OA buffer, to remove any unrelated deltas in between the 2 MI_RECORD_PERF_COUNT. Disabling preemption only applies to the single context for which we want to query performance counters, and is considered a privileged operation, by default protected by CAP_SYS_ADMIN. It is possible to enable it for a normal user by disabling the paranoid stream setting. v2: Store preemption setting in intel_context (Chris) v3: Use priorities to avoid preemption rather than the HW mechanism v4: Just modify the port priority reporting function v5: Add nopreempt flag on gem context and always flag requests appropriately, regardless of OA reconfiguration. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-4-chris@chris-wilson.co.uk
2019-10-14	drm/i915/perf: Allow dynamic reconfiguration of the OA stream	Chris Wilson
	Introduce a new perf_ioctl command to change the OA configuration of the active stream. This allows the OA stream to be reconfigured between batch buffers, giving greater flexibility in sampling. We inject a request into the OA context to reconfigure the stream asynchronously on the GPU in between and ordered with execbuffer calls. Original patch for dynamic reconfiguration by Lionel Landwerlin. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-3-chris@chris-wilson.co.uk
2019-10-14	drm/i915: add support for perf configuration queries	Lionel Landwerlin
	Listing configurations at the moment is supported only through sysfs. This might cause issues for applications wanting to list configurations from a container where sysfs isn't available. This change adds a way to query the number of configurations and their content through the i915 query uAPI. v2: Fix sparse warnings (Lionel) Add support to query configuration using uuid (Lionel) v3: Fix some inconsistency in uapi header (Lionel) Fix unlocking when not locked issue (Lionel) Add debug messages (Lionel) v4: Fix missing unlock (Dan) v5: Drop lock when copying config content to userspace (Chris) v6: Drop lock when copying config list to userspace (Chris) Fix deadlock when calling i915_perf_get_oa_config() under perf.metrics_lock (Lionel) Add i915_oa_config_get() (Chris) Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/932 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191014201404.22468-2-chris@chris-wilson.co.uk