linux-stable.git/kernel/events/callchain.c, branch v4.15

perf/callchain: Force USER_DS when invoking perf_callchain_user()

2017-05-10T05:54:00+00:00

Perf can generate and record a user callchain in response to a synchronous
request, such as a tracepoint firing. If this happens under set_fs(KERNEL_DS),
then we can end up walking the user stack (and dereferencing/saving whatever we
find there) without the protections usually afforded by checks such as
access_ok.

Rather than play whack-a-mole with each architecture's stack unwinding
implementation, fix the root of the problem by ensuring that we force USER_DS
when invoking perf_callchain_user from the perf core.

Reported-by: Al Viro 
Signed-off-by: Will Deacon 
Acked-by: Peter Zijlstra 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Thomas Gleixner 
Signed-off-by: Ingo Molnar

sched/headers: Prepare for new header dependencies before moving code to

2017-03-02T07:42:36+00:00

We are going to split  out of , which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder  file that just
maps to  to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds 
Cc: Mike Galbraith 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

perf core: Per event callchain limit

2016-05-30T15:41:44+00:00

Additionally to being able to control the system wide maximum depth via
/proc/sys/kernel/perf_event_max_stack, now we are able to ask for
different depths per event, using perf_event_attr.sample_max_stack for
that.

This uses an u16 hole at the end of perf_event_attr, that, when
perf_event_attr.sample_type has the PERF_SAMPLE_CALLCHAIN, if
sample_max_stack is zero, means use perf_event_max_stack, otherwise
it'll be bounds checked under callchain_mutex.

Cc: Adrian Hunter 
Cc: Alexander Shishkin 
Cc: Alexei Starovoitov 
Cc: Brendan Gregg 
Cc: David Ahern 
Cc: Frederic Weisbecker 
Cc: He Kuang 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Masami Hiramatsu 
Cc: Milian Wolff 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Cc: Wang Nan 
Cc: Zefan Li 
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

perf core: Separate accounting of contexts and real addresses in a stack trace

2016-05-17T02:11:53+00:00

The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.

So allocate a bunch of extra entries for contexts, and do the accounting
via perf_callchain_entry_ctx struct members.

A new sysctl, kernel.perf_event_max_contexts_per_stack is also
introduced for investigating possible bugs in the callchain
implementation by some arch.

Cc: Adrian Hunter 
Cc: Alexander Shishkin 
Cc: Alexei Starovoitov 
Cc: Brendan Gregg 
Cc: David Ahern 
Cc: Frederic Weisbecker 
Cc: He Kuang 
Cc: Jiri Olsa 
Cc: Masami Hiramatsu 
Cc: Milian Wolff 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Cc: Wang Nan 
Cc: Zefan Li 
Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

perf core: Add perf_callchain_store_context() helper

2016-05-17T02:11:52+00:00

We need have different helpers to account how many contexts we have in
the sample and for real addresses, so do it now as a prep patch, to
ease review.

Cc: David Ahern 
Cc: Frederic Weisbecker 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/n/tip-q964tnyuqrxw5gld18vizs3c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

perf core: Add a 'nr' field to perf_event_callchain_context

2016-05-17T02:11:51+00:00

We will use it to count how many addresses are in the entry->ip[] array,
excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
return the number of entries specified by the user via the relevant
sysctl, kernel.perf_event_max_contexts, or via the per event
perf_event_attr.sample_max_stack knob.

This way we keep the perf_sample->ip_callchain->nr meaning, that is the
number of entries, be it real addresses or PERF_CONTEXT_ entries, while
honouring the max_stack knobs, i.e. the end result will be max_stack
entries if we have at least that many entries in a given stack trace.

Cc: David Ahern 
Cc: Frederic Weisbecker 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

perf core: Pass max stack as a perf_callchain_entry context

2016-05-17T02:11:50+00:00

This makes perf_callchain_{user,kernel}() receive the max stack
as context for the perf_callchain_entry, instead of accessing
the global sysctl_perf_event_max_stack.

Cc: Adrian Hunter 
Cc: Alexander Shishkin 
Cc: Alexei Starovoitov 
Cc: Brendan Gregg 
Cc: David Ahern 
Cc: Frederic Weisbecker 
Cc: He Kuang 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Masami Hiramatsu 
Cc: Milian Wolff 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Cc: Wang Nan 
Cc: Zefan Li 
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

perf core: Generalize max_stack sysctl handler

2016-05-17T02:11:49+00:00

So that it can be used for other stack related knobs, such as the
upcoming one to tweak the max number of of contexts per stack sample.

In all those cases we can only change the value if there are no perf
sessions collecting stacks, so they need to grab that mutex, etc.

Cc: David Ahern 
Cc: Frederic Weisbecker 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/n/tip-8t3fk94wuzp8m2z1n4gc0s17@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

perf core: Allow setting up max frame stack depth via sysctl

2016-04-27T13:20:39+00:00

The default remains 127, which is good for most cases, and not even hit
most of the time, but then for some cases, as reported by Brendan, 1024+
deep frames are appearing on the radar for things like groovy, ruby.

And in some workloads putting a _lower_ cap on this may make sense. One
that is per event still needs to be put in place tho.

The new file is:

  # cat /proc/sys/kernel/perf_event_max_stack
  127

Chaging it:

  # echo 256 > /proc/sys/kernel/perf_event_max_stack
  # cat /proc/sys/kernel/perf_event_max_stack
  256

But as soon as there is some event using callchains we get:

  # echo 512 > /proc/sys/kernel/perf_event_max_stack
  -bash: echo: write error: Device or resource busy
  #

Because we only allocate the callchain percpu data structures when there
is a user, which allows for changing the max easily, its just a matter
of having no callchain users at that point.

Reported-and-Tested-by: Brendan Gregg 
Reviewed-by: Frederic Weisbecker 
Acked-by: Alexei Starovoitov 
Acked-by: David Ahern 
Cc: Adrian Hunter 
Cc: Alexander Shishkin 
Cc: He Kuang 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Masami Hiramatsu 
Cc: Milian Wolff 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Cc: Wang Nan 
Cc: Zefan Li 
Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

perf: generalize perf_callchain

2016-02-20T05:21:44+00:00

. avoid walking the stack when there is no room left in the buffer
. generalize get_perf_callchain() to be called from bpf helper

Signed-off-by: Alexei Starovoitov 
Signed-off-by: David S. Miller