linux-stable.git/fs/proc/array.c, branch linux-5.15.y

procfs: fix missing RCU protection when reading real_parent in do_task_stat()

2026-03-04T12:19:37+00:00

[ Upstream commit 76149d53502cf17ef3ae454ff384551236fba867 ]

When reading /proc/[pid]/stat, do_task_stat() accesses task->real_parent
without proper RCU protection, which leads to:

  cpu 0                               cpu 1
  -----                               -----
  do_task_stat
    var = task->real_parent
                                      release_task
                                        call_rcu(delayed_put_task_struct)
    task_tgid_nr_ns(var)
      rcu_read_lock   <--- Too late to protect task->real_parent!
      task_pid_ptr    <--- UAF!
      rcu_read_unlock

This patch uses task_ppid_nr_ns() instead of task_tgid_nr_ns() to add
proper RCU protection for accessing task->real_parent.

Link: https://lkml.kernel.org/r/20260128083007.3173016-1-alexjlzheng@tencent.com
Fixes: 06fffb1267c9 ("do_task_stat: don't take rcu_read_lock()")
Signed-off-by: Jinliang Zheng 
Acked-by: Oleg Nesterov 
Cc: David Hildenbrand 
Cc: Ingo Molnar 
Cc: Lorenzo Stoakes 
Cc: Mateusz Guzik 
Cc: ruippan 
Cc: Usama Arif 
Signed-off-by: Andrew Morton 
Signed-off-by: Sasha Levin

fs/proc: do_task_stat: use __for_each_thread()

2025-07-17T16:30:47+00:00

commit 7904e53ed5a20fc678c01d5d1b07ec486425bb6a upstream.

do/while_each_thread should be avoided when possible.

Link: https://lkml.kernel.org/r/20230909164501.GA11581@redhat.com
Signed-off-by: Oleg Nesterov 
Cc: Eric W. Biederman 
Signed-off-by: Andrew Morton 
Stable-dep-of: 7601df8031fd ("fs/proc: do_task_stat: use sig->stats_lock to gather the threads/children stats")
Cc: stable@vger.kernel.org
[ mheyne: adjusted context ]
Signed-off-by: Maximilian Heyne 
Signed-off-by: Greg Kroah-Hartman

fs/proc: do_task_stat: use sig->stats_lock to gather the threads/children stats

2025-05-02T05:44:24+00:00

commit 7601df8031fd67310af891897ef6cc0df4209305 upstream.

lock_task_sighand() can trigger a hard lockup.  If NR_CPUS threads call
do_task_stat() at the same time and the process has NR_THREADS, it will
spin with irqs disabled O(NR_CPUS * NR_THREADS) time.

Change do_task_stat() to use sig->stats_lock to gather the statistics
outside of ->siglock protected section, in the likely case this code will
run lockless.

Link: https://lkml.kernel.org/r/20240123153357.GA21857@redhat.com
Signed-off-by: Oleg Nesterov 
Signed-off-by: Dylan Hatch 
Cc: Eric W. Biederman 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Sasha Levin 
Signed-off-by: David Sauerwein 
Signed-off-by: Greg Kroah-Hartman

fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sighand()

2024-03-15T14:48:22+00:00

[ Upstream commit 60f92acb60a989b14e4b744501a0df0f82ef30a3 ]

Patch series "fs/proc: do_task_stat: use sig->stats_".

do_task_stat() has the same problem as getrusage() had before "getrusage:
use sig->stats_lock rather than lock_task_sighand()": a hard lockup.  If
NR_CPUS threads call lock_task_sighand() at the same time and the process
has NR_THREADS, spin_lock_irq will spin with irqs disabled O(NR_CPUS *
NR_THREADS) time.

This patch (of 3):

thread_group_cputime() does its own locking, we can safely shift
thread_group_cputime_adjusted() which does another for_each_thread loop
outside of ->siglock protected section.

Not only this removes for_each_thread() from the critical section with
irqs disabled, this removes another case when stats_lock is taken with
siglock held.  We want to remove this dependency, then we can change the
users of stats_lock to not disable irqs.

Link: https://lkml.kernel.org/r/20240123153313.GA21832@redhat.com
Link: https://lkml.kernel.org/r/20240123153355.GA21854@redhat.com
Signed-off-by: Oleg Nesterov 
Signed-off-by: Dylan Hatch 
Cc: Eric W. Biederman 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Sasha Levin

proc: Use task_is_running() for wchan in /proc/$pid/stat

2024-03-15T14:48:22+00:00

[ Upstream commit 4e046156792c26bef8a4e30be711777fc8578257 ]

The implementations of get_wchan() can be expensive. The only information
imparted here is whether or not a process is currently blocked in the
scheduler (and even this doesn't need to be exact). Avoid doing the
heavy lifting of stack walking and just report that information by using
task_is_running().

Signed-off-by: Kees Cook 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://lkml.kernel.org/r/20211008111626.211281780@infradead.org
Stable-dep-of: 60f92acb60a9 ("fs/proc: do_task_stat: move thread_group_cputime_adjusted() outside of lock_task_sighand()")
Signed-off-by: Sasha Levin

proc: stop using seq_get_buf in proc_task_name

2021-09-08T18:50:25+00:00

Use seq_escape_str and seq_printf instead of poking holes into the
seq_file abstraction.

Link: https://lkml.kernel.org/r/20210810151945.1795567-1-hch@lst.de
Signed-off-by: Christoph Hellwig 
Acked-by: Christian Brauner 
Cc: Alexey Dobriyan 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

2021-06-29T03:39:26+00:00

Pull user namespace rlimit handling update from Eric Biederman:
 "This is the work mainly by Alexey Gladkov to limit rlimits to the
  rlimits of the user that created a user namespace, and to allow users
  to have stricter limits on the resources created within a user
  namespace."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  cred: add missing return error code when set_cred_ucounts() failed
  ucounts: Silence warning in dec_rlimit_ucounts
  ucounts: Set ucount_max to the largest positive value the type can hold
  kselftests: Add test to check for rlimit changes in different user namespaces
  Reimplement RLIMIT_MEMLOCK on top of ucounts
  Reimplement RLIMIT_SIGPENDING on top of ucounts
  Reimplement RLIMIT_MSGQUEUE on top of ucounts
  Reimplement RLIMIT_NPROC on top of ucounts
  Use atomic_t for ucounts reference counting
  Add a reference to ucounts for each cred
  Increase size of ucounts to atomic_long_t

Reimplement RLIMIT_SIGPENDING on top of ucounts

2021-04-30T19:14:02+00:00

The rlimit counter is tied to uid in the user_namespace. This allows
rlimit values to be specified in userns even if they are already
globally exceeded by the user. However, the value of the previous
user_namespaces cannot be exceeded.

Changelog

v11:
* Revert most of changes to fix performance issues.

v10:
* Fix memory leak on get_ucounts failure.

Signed-off-by: Alexey Gladkov 
Link: https://lkml.kernel.org/r/df9d7764dddd50f28616b7840de74ec0f81711a8.1619094428.git.legion@kernel.org
Signed-off-by: Eric W. Biederman

seccomp: Fix CONFIG tests for Seccomp_filters

2021-03-31T05:33:50+00:00

Strictly speaking, seccomp filters are only used
when CONFIG_SECCOMP_FILTER.
This patch fixes the condition to enable "Seccomp_filters"
in /proc/$pid/status.

Signed-off-by: Kenta Tada 
Fixes: c818c03b661c ("seccomp: Report number of loaded filters in /proc/$pid/status")
Signed-off-by: Kees Cook 
Link: https://lore.kernel.org/r/OSBPR01MB26772D245E2CF4F26B76A989F5669@OSBPR01MB2677.jpnprd01.prod.outlook.com

proc: provide details on indirect branch speculation

2020-12-16T06:46:15+00:00

Similar to speculation store bypass, show information about the indirect
branch speculation mode of a task in /proc/$pid/status.

For testing/benchmarking, I needed to see whether IB (Indirect Branch)
speculation (see Spectre-v2) is enabled on a task, to see whether an
IBPB instruction should be executed on an address space switch.
Unfortunately, this information isn't available anywhere else and
currently the only way to get it is to hack the kernel to expose it
(like this change).  It also helped expose a bug with conditional IB
speculation on certain CPUs.

Another place this could be useful is to audit the system when using
sanboxing.  With this change, I can confirm that seccomp-enabled
process have IB speculation force disabled as expected when the kernel
command line parameter `spectre_v2_user=seccomp`.

Since there's already a 'Speculation_Store_Bypass' field, I used that
as precedent for adding this one.

[amistry@google.com: remove underscores from field name to workaround documentation issue]
  Link: https://lkml.kernel.org/r/20201106131015.v2.1.I7782b0cedb705384a634cfd8898eb7523562da99@changeid

Link: https://lkml.kernel.org/r/20201030172731.1.I7782b0cedb705384a634cfd8898eb7523562da99@changeid
Signed-off-by: Anand K Mistry 
Cc: Anthony Steinhauser 
Cc: Thomas Gleixner 
Cc: Anand K Mistry 
Cc: Alexey Dobriyan 
Cc: Alexey Gladkov 
Cc: Jonathan Corbet 
Cc: Kees Cook 
Cc: Mauro Carvalho Chehab 
Cc: Michal Hocko 
Cc: Mike Rapoport 
Cc: NeilBrown 
Cc: Peter Zijlstra 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds