linux.git/arch/riscv/include/asm, branch v4.17

riscv: there is no

2018-04-24T17:54:23+00:00

So don't list it as generic-y.

Signed-off-by: Christoph Hellwig 
Signed-off-by: Palmer Dabbelt

Merge tag 'riscv-for-linus-4.17-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

2018-04-04T23:43:47+00:00

Pull RISC-V updates from Palmer Dabbelt:
 "This contains the new features we'd like to incorporate into the
  RISC-V port for 4.17. We might have a bit more stuff land later in the
  merge window, but I wanted to get this out earlier just so everyone
  can see where we currently stand.

  A short summary of the changes is:

   - We've added support for dynamic ftrace on RISC-V targets.

   - There have been a handful of cleanups to our atomic and locking
     routines. They now more closely match the released RISC-V memory
     model draft.

   - Our module loading support has been cleaned up and is now enabled
     by default, despite some limitations still existing.

   - A patch to define COMMANDLINE_FORCE instead of COMMANDLINE_OVERRIDE
     so the generic device tree code picks up handling all our command
     line stuff.

  There's more information in the merge commits for each patch set"

* tag 'riscv-for-linus-4.17-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: (21 commits)
  RISC-V: Rename CONFIG_CMDLINE_OVERRIDE to CONFIG_CMDLINE_FORCE
  RISC-V: Add definition of relocation types
  RISC-V: Enable module support in defconfig
  RISC-V: Support SUB32 relocation type in kernel module
  RISC-V: Support ADD32 relocation type in kernel module
  RISC-V: Support ALIGN relocation type in kernel module
  RISC-V: Support RVC_BRANCH/JUMP relocation type in kernel modulewq
  RISC-V: Support HI20/LO12_I/LO12_S relocation type in kernel module
  RISC-V: Support CALL relocation type in kernel module
  RISC-V: Support GOT_HI20/CALL_PLT relocation type in kernel module
  RISC-V: Add section of GOT.PLT for kernel module
  RISC-V: Add sections of PLT and GOT for kernel module
  riscv/atomic: Strengthen implementations with fences
  riscv/spinlock: Strengthen implementations with fences
  riscv/barrier: Define __smp_{store_release,load_acquire}
  riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support
  riscv/ftrace: Add DYNAMIC_FTRACE_WITH_REGS support
  riscv/ftrace: Add ARCH_SUPPORTS_FTRACE_OPS support
  riscv/ftrace: Add dynamic function graph tracer support
  riscv/ftrace: Add dynamic function tracer support
  ...

RISC-V: Fixes to module loading

2018-04-03T03:43:14+00:00

This cleans up the module support that was commited earlier to work with
what's actually emitted from our GCC port as it lands upstream.  Most of
the work here is adding new relocations to the kernel.

There's some limitations on module loading imposed by the kernel:

* The kernel doesn't support linker relaxation, which is necessary to
  support R_RISCV_ALIGN.  In order to get reliable module building
  you're going to need to a GCC that supports the new '-mno-relax',
  which IIRC isn't going to be out until 8.1.0.  It's somewhat unlikely
  that R_RISCV_ALIGN will appear in a module even without '-mno-relax'
  support, so issues shouldn't be common.

* There is no large code model for RISC-V, which means modules must be
  loaded within a 32-bit signed offset of the kernel.  We don't
  currently have any mechanism for ensuring this memory remains free or
  moving pages around, so issues here might be common.

I fixed a singcle merge conflict in arch/riscv/kernel/Makefile.

RISC-V: Assorted memory model fixes

2018-04-03T03:36:33+00:00

These fixes fall into three categories

* The definiton of __smp_{store_release,load_acquire}, which allow us to
  emit a full fence when unnecessary.
* Fixes to avoid relying on the behavior of "*.aqrl" atomics, as those
  are specified in the currently released RISC-V memory model draft in
  a way that makes them useless for Linux.  This might change in the
  future, but now the code matches the memory model spec as it's written
  so at least we're getting closer to something sane.  The actual fix is
  to delete the RISC-V specific atomics and drop back to generic
  versions that use the new fences from above.
* Cleanups to our atomic macros, which are mostly non-functional
  changes.

Unfortunately I haven't given these as thorough of a testing as I
probably should have, but I've poked through the code and they seem
generally OK.

RISC-V: Add section of GOT.PLT for kernel module

2018-04-03T03:00:54+00:00

Separate the function symbol address from .plt to .got.plt section.

The original plt entry has trampoline code with symbol address,
there is a 32-bit padding bwtween jar instruction and symbol address.

Extract the symbol address to .got.plt to reduce the module size.

Signed-off-by: Zong Li 
Signed-off-by: Palmer Dabbelt

RISC-V: Add sections of PLT and GOT for kernel module

2018-04-03T03:00:54+00:00

The address of external symbols will locate more than 32-bit offset
in 64-bit kernel with sv39 or sv48 virtual addressing.

Module loader emits the GOT and PLT entries for data symbols and
function symbols respectively.

The PLT entry is a trampoline code for jumping to the 64-bit
real address. The GOT entry is just the data symbol address.

Signed-off-by: Zong Li 
Signed-off-by: Palmer Dabbelt

riscv/atomic: Strengthen implementations with fences

2018-04-03T02:59:44+00:00

Atomics present the same issue with locking: release and acquire
variants need to be strengthened to meet the constraints defined
by the Linux-kernel memory consistency model [1].

Atomics present a further issue: implementations of atomics such
as atomic_cmpxchg() and atomic_add_unless() rely on LR/SC pairs,
which do not give full-ordering with .aqrl; for example, current
implementations allow the "lr-sc-aqrl-pair-vs-full-barrier" test
below to end up with the state indicated in the "exists" clause.

In order to "synchronize" LKMM and RISC-V's implementation, this
commit strengthens the implementations of the atomics operations
by replacing .rl and .aq with the use of ("lightweigth") fences,
and by replacing .aqrl LR/SC pairs in sequences such as:

  0:      lr.w.aqrl  %0, %addr
          bne        %0, %old, 1f
          ...
          sc.w.aqrl  %1, %new, %addr
          bnez       %1, 0b
  1:

with sequences of the form:

  0:      lr.w       %0, %addr
          bne        %0, %old, 1f
          ...
          sc.w.rl    %1, %new, %addr   /* SC-release   */
          bnez       %1, 0b
          fence      rw, rw            /* "full" fence */
  1:

following Daniel's suggestion.

These modifications were validated with simulation of the RISC-V
memory consistency model.

C lr-sc-aqrl-pair-vs-full-barrier

{}

P0(int *x, int *y, atomic_t *u)
{
	int r0;
	int r1;

	WRITE_ONCE(*x, 1);
	r0 = atomic_cmpxchg(u, 0, 1);
	r1 = READ_ONCE(*y);
}

P1(int *x, int *y, atomic_t *v)
{
	int r0;
	int r1;

	WRITE_ONCE(*y, 1);
	r0 = atomic_cmpxchg(v, 0, 1);
	r1 = READ_ONCE(*x);
}

exists (u=1 /\ v=1 /\ 0:r1=0 /\ 1:r1=0)

[1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2
    https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM
    https://marc.info/?l=linux-kernel&m=151633436614259&w=2

Suggested-by: Daniel Lustig 
Signed-off-by: Andrea Parri 
Cc: Palmer Dabbelt 
Cc: Albert Ou 
Cc: Daniel Lustig 
Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 
Cc: Ingo Molnar 
Cc: Linus Torvalds 
Cc: linux-riscv@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Palmer Dabbelt

riscv/spinlock: Strengthen implementations with fences

2018-04-03T02:59:43+00:00

Current implementations map locking operations using .rl and .aq
annotations.  However, this mapping is unsound w.r.t. the kernel
memory consistency model (LKMM) [1]:

Referring to the "unlock-lock-read-ordering" test reported below,
Daniel wrote:

  "I think an RCpc interpretation of .aq and .rl would in fact
   allow the two normal loads in P1 to be reordered [...]

   The intuition would be that the amoswap.w.aq can forward from
   the amoswap.w.rl while that's still in the store buffer, and
   then the lw x3,0(x4) can also perform while the amoswap.w.rl
   is still in the store buffer, all before the l1 x1,0(x2)
   executes.  That's not forbidden unless the amoswaps are RCsc,
   unless I'm missing something.

   Likewise even if the unlock()/lock() is between two stores.
   A control dependency might originate from the load part of
   the amoswap.w.aq, but there still would have to be something
   to ensure that this load part in fact performs after the store
   part of the amoswap.w.rl performs globally, and that's not
   automatic under RCpc."

Simulation of the RISC-V memory consistency model confirmed this
expectation.

In order to "synchronize" LKMM and RISC-V's implementation, this
commit strengthens the implementations of the locking operations
by replacing .rl and .aq with the use of ("lightweigth") fences,
resp., "fence rw,  w" and "fence r , rw".

C unlock-lock-read-ordering

{}
/* s initially owned by P1 */

P0(int *x, int *y)
{
        WRITE_ONCE(*x, 1);
        smp_wmb();
        WRITE_ONCE(*y, 1);
}

P1(int *x, int *y, spinlock_t *s)
{
        int r0;
        int r1;

        r0 = READ_ONCE(*y);
        spin_unlock(s);
        spin_lock(s);
        r1 = READ_ONCE(*x);
}

exists (1:r0=1 /\ 1:r1=0)

[1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2
    https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXM
    https://marc.info/?l=linux-kernel&m=151633436614259&w=2

Signed-off-by: Andrea Parri 
Cc: Palmer Dabbelt 
Cc: Albert Ou 
Cc: Daniel Lustig 
Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 
Cc: Ingo Molnar 
Cc: Linus Torvalds 
Cc: linux-riscv@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Palmer Dabbelt

riscv/barrier: Define __smp_{store_release,load_acquire}

2018-04-03T02:59:43+00:00

Introduce __smp_{store_release,load_acquire}, and rely on the generic
definitions for smp_{store_release,load_acquire}. This avoids the use
of full ("rw,rw") fences on SMP.

Signed-off-by: Andrea Parri 
Signed-off-by: Palmer Dabbelt

riscv/ftrace: Add DYNAMIC_FTRACE_WITH_REGS support

2018-04-03T02:59:13+00:00

Cc: Greentime Hu 
Signed-off-by: Alan Kao 
Signed-off-by: Palmer Dabbelt