<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/signal.c, branch v5.2-rc4</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>signal: improve comments</title>
<updated>2019-06-05T13:06:07+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>christian@brauner.io</email>
</author>
<published>2019-06-04T13:18:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c732327f04a3818f35fa97d07b1d64d31b691d78'/>
<id>c732327f04a3818f35fa97d07b1d64d31b691d78</id>
<content type='text'>
Improve the comments for pidfd_send_signal().
First, the comment still referred to a file descriptor for a process as a
"task file descriptor" which stems from way back at the beginning of the
discussion. Replace this with "pidfd" for consistency.
Second, the wording for the explanation of the arguments to the syscall
was a bit inconsistent, e.g. some used the past tense some used present
tense. Make the wording more consistent.

Signed-off-by: Christian Brauner &lt;christian@brauner.io&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Improve the comments for pidfd_send_signal().
First, the comment still referred to a file descriptor for a process as a
"task file descriptor" which stems from way back at the beginning of the
discussion. Replace this with "pidfd" for consistency.
Second, the wording for the explanation of the arguments to the syscall
was a bit inconsistent, e.g. some used the past tense some used present
tense. Make the wording more consistent.

Signed-off-by: Christian Brauner &lt;christian@brauner.io&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/signal.c: trace_signal_deliver when signal_group_exit</title>
<updated>2019-06-01T22:51:32+00:00</updated>
<author>
<name>Zhenliang Wei</name>
<email>weizhenliang@huawei.com</email>
</author>
<published>2019-06-01T05:30:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=98af37d624ed8c83f1953b1b6b2f6866011fc064'/>
<id>98af37d624ed8c83f1953b1b6b2f6866011fc064</id>
<content type='text'>
In the fixes commit, removing SIGKILL from each thread signal mask and
executing "goto fatal" directly will skip the call to
"trace_signal_deliver".  At this point, the delivery tracking of the
SIGKILL signal will be inaccurate.

Therefore, we need to add trace_signal_deliver before "goto fatal" after
executing sigdelset.

Note: SEND_SIG_NOINFO matches the fact that SIGKILL doesn't have any info.

Link: http://lkml.kernel.org/r/20190425025812.91424-1-weizhenliang@huawei.com
Fixes: cf43a757fd4944 ("signal: Restore the stop PTRACE_EVENT_EXIT")
Signed-off-by: Zhenliang Wei &lt;weizhenliang@huawei.com&gt;
Reviewed-by: Christian Brauner &lt;christian@brauner.io&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Ivan Delalande &lt;colona@arista.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Deepa Dinamani &lt;deepa.kernel@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the fixes commit, removing SIGKILL from each thread signal mask and
executing "goto fatal" directly will skip the call to
"trace_signal_deliver".  At this point, the delivery tracking of the
SIGKILL signal will be inaccurate.

Therefore, we need to add trace_signal_deliver before "goto fatal" after
executing sigdelset.

Note: SEND_SIG_NOINFO matches the fact that SIGKILL doesn't have any info.

Link: http://lkml.kernel.org/r/20190425025812.91424-1-weizhenliang@huawei.com
Fixes: cf43a757fd4944 ("signal: Restore the stop PTRACE_EVENT_EXIT")
Signed-off-by: Zhenliang Wei &lt;weizhenliang@huawei.com&gt;
Reviewed-by: Christian Brauner &lt;christian@brauner.io&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Ivan Delalande &lt;colona@arista.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Deepa Dinamani &lt;deepa.kernel@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Add SPDX license identifier for missed files</title>
<updated>2019-05-21T08:50:45+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-05-19T12:08:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=457c89965399115e5cd8bf38f9c597293405703d'/>
<id>457c89965399115e5cd8bf38f9c597293405703d</id>
<content type='text'>
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>signal: unconditionally leave the frozen state in ptrace_stop()</title>
<updated>2019-05-16T17:43:58+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2019-05-16T17:38:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=05b289263772b0698589abc47771264a685cd365'/>
<id>05b289263772b0698589abc47771264a685cd365</id>
<content type='text'>
Alex Xu reported a regression in strace, caused by the introduction of
the cgroup v2 freezer. The regression can be reproduced by stracing
the following simple program:

  #include &lt;unistd.h&gt;

  int main() {
      write(1, "a", 1);
      return 0;
  }

An attempt to run strace ./a.out leads to the infinite loop:
  [ pre-main omitted ]
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  [ repeats forever ]

The problem occurs because the traced task leaves ptrace_stop()
(and the signal handling loop) with the frozen bit set. So let's
call cgroup_leave_frozen(true) unconditionally after sleeping
in ptrace_stop().

With this patch applied, strace works as expected:
  [ pre-main omitted ]
  write(1, "a", 1)                        = 1
  exit_group(0)                           = ?
  +++ exited with 0 +++

Reported-by: Alex Xu &lt;alex_y_xu@yahoo.ca&gt;
Fixes: 76f969e8948d ("cgroup: cgroup v2 freezer")
Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Alex Xu reported a regression in strace, caused by the introduction of
the cgroup v2 freezer. The regression can be reproduced by stracing
the following simple program:

  #include &lt;unistd.h&gt;

  int main() {
      write(1, "a", 1);
      return 0;
  }

An attempt to run strace ./a.out leads to the infinite loop:
  [ pre-main omitted ]
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  write(1, "a", 1)                        = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
  [ repeats forever ]

The problem occurs because the traced task leaves ptrace_stop()
(and the signal handling loop) with the frozen bit set. So let's
call cgroup_leave_frozen(true) unconditionally after sleeping
in ptrace_stop().

With this patch applied, strace works as expected:
  [ pre-main omitted ]
  write(1, "a", 1)                        = 1
  exit_group(0)                           = ?
  +++ exited with 0 +++

Reported-by: Alex Xu &lt;alex_y_xu@yahoo.ca&gt;
Fixes: 76f969e8948d ("cgroup: cgroup v2 freezer")
Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kernel/signal.c: annotate implicit fall through</title>
<updated>2019-05-15T02:52:50+00:00</updated>
<author>
<name>Mathieu Malaterre</name>
<email>malat@debian.org</email>
</author>
<published>2019-05-14T22:44:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b028fb612849add771679c1b99a50d99264c9632'/>
<id>b028fb612849add771679c1b99a50d99264c9632</id>
<content type='text'>
There is a plan to build the kernel with -Wimplicit-fallthrough and this
place in the code produced a warning (W=1).

This commit remove the following warning:

  kernel/signal.c:795:13: warning: this statement may fall through [-Wimplicit-fallthrough=]

Link: http://lkml.kernel.org/r/20190114203505.17875-1-malat@debian.org
Signed-off-by: Mathieu Malaterre &lt;malat@debian.org&gt;
Acked-by: Gustavo A. R. Silva &lt;gustavo@embeddedor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is a plan to build the kernel with -Wimplicit-fallthrough and this
place in the code produced a warning (W=1).

This commit remove the following warning:

  kernel/signal.c:795:13: warning: this statement may fall through [-Wimplicit-fallthrough=]

Link: http://lkml.kernel.org/r/20190114203505.17875-1-malat@debian.org
Signed-off-by: Mathieu Malaterre &lt;malat@debian.org&gt;
Acked-by: Gustavo A. R. Silva &lt;gustavo@embeddedor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2019-05-09T20:52:12+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-05-09T20:52:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=abde77eb5c66b2f98539c4644b54f34b7e179e6b'/>
<id>abde77eb5c66b2f98539c4644b54f34b7e179e6b</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:
 "This includes Roman's cgroup2 freezer implementation.

  It's a separate machanism from cgroup1 freezer. Instead of blocking
  user tasks in arbitrary uninterruptible sleeps, the new implementation
  extends jobctl stop - frozen tasks are trapped in jobctl stop until
  thawed and can be killed and ptraced. Lots of thanks to Oleg for
  sheperding the effort.

  Other than that, there are a few trivial changes"

* 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: never call do_group_exit() with task-&gt;frozen bit set
  kernel: cgroup: fix misuse of %x
  cgroup: get rid of cgroup_freezer_frozen_exit()
  cgroup: prevent spurious transition into non-frozen state
  cgroup: Remove unused cgrp variable
  cgroup: document cgroup v2 freezer interface
  cgroup: add tracing points for cgroup v2 freezer
  cgroup: make TRACE_CGROUP_PATH irq-safe
  kselftests: cgroup: add freezer controller self-tests
  kselftests: cgroup: don't fail on cg_kill_all() error in cg_destroy()
  cgroup: cgroup v2 freezer
  cgroup: protect cgroup-&gt;nr_(dying_)descendants by css_set_lock
  cgroup: implement __cgroup_task_count() helper
  cgroup: rename freezer.c into legacy_freezer.c
  cgroup: remove extra cgroup_migrate_finish() call
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull cgroup updates from Tejun Heo:
 "This includes Roman's cgroup2 freezer implementation.

  It's a separate machanism from cgroup1 freezer. Instead of blocking
  user tasks in arbitrary uninterruptible sleeps, the new implementation
  extends jobctl stop - frozen tasks are trapped in jobctl stop until
  thawed and can be killed and ptraced. Lots of thanks to Oleg for
  sheperding the effort.

  Other than that, there are a few trivial changes"

* 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: never call do_group_exit() with task-&gt;frozen bit set
  kernel: cgroup: fix misuse of %x
  cgroup: get rid of cgroup_freezer_frozen_exit()
  cgroup: prevent spurious transition into non-frozen state
  cgroup: Remove unused cgrp variable
  cgroup: document cgroup v2 freezer interface
  cgroup: add tracing points for cgroup v2 freezer
  cgroup: make TRACE_CGROUP_PATH irq-safe
  kselftests: cgroup: add freezer controller self-tests
  kselftests: cgroup: don't fail on cg_kill_all() error in cg_destroy()
  cgroup: cgroup v2 freezer
  cgroup: protect cgroup-&gt;nr_(dying_)descendants by css_set_lock
  cgroup: implement __cgroup_task_count() helper
  cgroup: rename freezer.c into legacy_freezer.c
  cgroup: remove extra cgroup_migrate_finish() call
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: never call do_group_exit() with task-&gt;frozen bit set</title>
<updated>2019-05-09T14:56:47+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2019-05-08T20:34:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f2b31bb598248c04721cb8485e6091a9feb045ac'/>
<id>f2b31bb598248c04721cb8485e6091a9feb045ac</id>
<content type='text'>
I've got two independent reports that cgroup_task_frozen() check
in cgroup_exit() has been triggered by lkp libhugetlbfs-test and
LTP ptrace01 tests.

For example:
[   44.576072] WARNING: CPU: 1 PID: 3028 at kernel/cgroup/cgroup.c:5932 cgroup_exit+0x148/0x160
[   44.577724] Modules linked in: crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sr_mod cdrom
bochs_drm sg ttm ata_generic pata_acpi ppdev drm_kms_helper snd_pcm syscopyarea aesni_intel snd_timer
sysfillrect sysimgblt snd crypto_simd cryptd glue_helper soundcore fb_sys_fops joydev drm serio_raw pcspkr
ata_piix libata i2c_piix4 floppy parport_pc parport ip_tables
[   44.583106] CPU: 1 PID: 3028 Comm: ptrace-write-hu Not tainted 5.1.0-rc3-00053-g9262503 #5
[   44.584600] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   44.586116] RIP: 0010:cgroup_exit+0x148/0x160
[   44.587135] Code: 0f 84 50 ff ff ff 48 8b 85 c8 0c 00 00 48 8b 78 70 e8 ec 2e 00 00 e9 3b ff ff ff f0 ff 43 60
0f 88 72 21 89 00 e9 48 ff ff ff &lt;0f&gt; 0b e9 1b ff ff ff e8 3c 73 f4 ff 66 90 66 2e 0f 1f 84 00 00 00
[   44.590113] RSP: 0018:ffffb25702dcfd30 EFLAGS: 00010002
[   44.591167] RAX: ffff96a7fee32410 RBX: ffff96a7ff1d6000 RCX: dead000000000200
[   44.592446] RDX: ffff96a7ff1d6080 RSI: ffff96a7fec75290 RDI: ffff96a7fec75290
[   44.593715] RBP: ffff96a7fec745c0 R08: ffff96a7fec74658 R09: 0000000000000000
[   44.594985] R10: 0000000000000000 R11: 0000000000000001 R12: ffff96a7fec75101
[   44.596266] R13: ffff96a7fec745c0 R14: ffff96a7ff3bde30 R15: ffff96a7fec75130
[   44.597550] FS:  0000000000000000(0000) GS:ffff96a7dd700000(0000) knlGS:0000000000000000
[   44.598950] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[   44.600098] CR2: 00000000f7a00000 CR3: 000000000d20e000 CR4: 00000000000406e0
[   44.601417] Call Trace:
[   44.602777]  do_exit+0x337/0xc40
[   44.603677]  do_group_exit+0x3a/0xa0
[   44.604610]  get_signal+0x12e/0x8d0
[   44.605533]  ? __switch_to_asm+0x40/0x70
[   44.606503]  do_signal+0x36/0x650
[   44.607409]  ? __switch_to_asm+0x40/0x70
[   44.608383]  ? __schedule+0x267/0x860
[   44.609329]  exit_to_usermode_loop+0x89/0xf0
[   44.610349]  do_fast_syscall_32+0x251/0x2e3
[   44.611357]  entry_SYSENTER_compat+0x7f/0x91
[   44.612376] ---[ end trace e4ca5cfc4b7f7964 ]---

The problem is caused by the ptrace_signal() call in the for loop
in get_signal(). There is a cgroup_enter_frozen() call inside
ptrace_signal(), so after exit from ptrace_signal() the task-&gt;frozen
bit might be set. In this case do_group_exit() can be called with the
task-&gt;frozen bit set and trigger the warning. This is only place where
we can leave the loop with the task-&gt;frozen bit set and without
setting JOBCTL_TRAP_FREEZE and TIF_SIGPENDING.

To resolve this problem, let's move cgroup_leave_frozen(true) call to
just after the fatal label. If the task is going to die, the frozen
bit must be cleared no matter how we get into this point.

Reported-by: kernel test robot &lt;rong.a.chen@intel.com&gt;
Reported-by: Qian Cai &lt;cai@lca.pw&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I've got two independent reports that cgroup_task_frozen() check
in cgroup_exit() has been triggered by lkp libhugetlbfs-test and
LTP ptrace01 tests.

For example:
[   44.576072] WARNING: CPU: 1 PID: 3028 at kernel/cgroup/cgroup.c:5932 cgroup_exit+0x148/0x160
[   44.577724] Modules linked in: crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sr_mod cdrom
bochs_drm sg ttm ata_generic pata_acpi ppdev drm_kms_helper snd_pcm syscopyarea aesni_intel snd_timer
sysfillrect sysimgblt snd crypto_simd cryptd glue_helper soundcore fb_sys_fops joydev drm serio_raw pcspkr
ata_piix libata i2c_piix4 floppy parport_pc parport ip_tables
[   44.583106] CPU: 1 PID: 3028 Comm: ptrace-write-hu Not tainted 5.1.0-rc3-00053-g9262503 #5
[   44.584600] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[   44.586116] RIP: 0010:cgroup_exit+0x148/0x160
[   44.587135] Code: 0f 84 50 ff ff ff 48 8b 85 c8 0c 00 00 48 8b 78 70 e8 ec 2e 00 00 e9 3b ff ff ff f0 ff 43 60
0f 88 72 21 89 00 e9 48 ff ff ff &lt;0f&gt; 0b e9 1b ff ff ff e8 3c 73 f4 ff 66 90 66 2e 0f 1f 84 00 00 00
[   44.590113] RSP: 0018:ffffb25702dcfd30 EFLAGS: 00010002
[   44.591167] RAX: ffff96a7fee32410 RBX: ffff96a7ff1d6000 RCX: dead000000000200
[   44.592446] RDX: ffff96a7ff1d6080 RSI: ffff96a7fec75290 RDI: ffff96a7fec75290
[   44.593715] RBP: ffff96a7fec745c0 R08: ffff96a7fec74658 R09: 0000000000000000
[   44.594985] R10: 0000000000000000 R11: 0000000000000001 R12: ffff96a7fec75101
[   44.596266] R13: ffff96a7fec745c0 R14: ffff96a7ff3bde30 R15: ffff96a7fec75130
[   44.597550] FS:  0000000000000000(0000) GS:ffff96a7dd700000(0000) knlGS:0000000000000000
[   44.598950] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[   44.600098] CR2: 00000000f7a00000 CR3: 000000000d20e000 CR4: 00000000000406e0
[   44.601417] Call Trace:
[   44.602777]  do_exit+0x337/0xc40
[   44.603677]  do_group_exit+0x3a/0xa0
[   44.604610]  get_signal+0x12e/0x8d0
[   44.605533]  ? __switch_to_asm+0x40/0x70
[   44.606503]  do_signal+0x36/0x650
[   44.607409]  ? __switch_to_asm+0x40/0x70
[   44.608383]  ? __schedule+0x267/0x860
[   44.609329]  exit_to_usermode_loop+0x89/0xf0
[   44.610349]  do_fast_syscall_32+0x251/0x2e3
[   44.611357]  entry_SYSENTER_compat+0x7f/0x91
[   44.612376] ---[ end trace e4ca5cfc4b7f7964 ]---

The problem is caused by the ptrace_signal() call in the for loop
in get_signal(). There is a cgroup_enter_frozen() call inside
ptrace_signal(), so after exit from ptrace_signal() the task-&gt;frozen
bit might be set. In this case do_group_exit() can be called with the
task-&gt;frozen bit set and trigger the warning. This is only place where
we can leave the loop with the task-&gt;frozen bit set and without
setting JOBCTL_TRAP_FREEZE and TIF_SIGPENDING.

To resolve this problem, let's move cgroup_leave_frozen(true) call to
just after the fatal label. If the task is going to die, the frozen
bit must be cleared no matter how we get into this point.

Reported-by: kernel test robot &lt;rong.a.chen@intel.com&gt;
Reported-by: Qian Cai &lt;cai@lca.pw&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'pidfd-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux</title>
<updated>2019-05-07T19:30:24+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-05-07T19:30:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eac7078a0fff1e72cf2b641721e3f55ec7e5e21e'/>
<id>eac7078a0fff1e72cf2b641721e3f55ec7e5e21e</id>
<content type='text'>
Pull pidfd updates from Christian Brauner:
 "This patchset makes it possible to retrieve pidfds at process creation
  time by introducing the new flag CLONE_PIDFD to the clone() system
  call. Linus originally suggested to implement this as a new flag to
  clone() instead of making it a separate system call.

  After a thorough review from Oleg CLONE_PIDFD returns pidfds in the
  parent_tidptr argument. This means we can give back the associated pid
  and the pidfd at the same time. Access to process metadata information
  thus becomes rather trivial.

  As has been agreed, CLONE_PIDFD creates file descriptors based on
  anonymous inodes similar to the new mount api. They are made
  unconditional by this patchset as they are now needed by core kernel
  code (vfs, pidfd) even more than they already were before (timerfd,
  signalfd, io_uring, epoll etc.). The core patchset is rather small.
  The bulky looking changelist is caused by David's very simple changes
  to Kconfig to make anon inodes unconditional.

  A pidfd comes with additional information in fdinfo if the kernel
  supports procfs. The fdinfo file contains the pid of the process in
  the callers pid namespace in the same format as the procfs status
  file, i.e. "Pid:\t%d".

  To remove worries about missing metadata access this patchset comes
  with a sample/test program that illustrates how a combination of
  CLONE_PIDFD and pidfd_send_signal() can be used to gain race-free
  access to process metadata through /proc/&lt;pid&gt;.

  Further work based on this patchset has been done by Joel. His work
  makes pidfds pollable. It finished too late for this merge window. I
  would prefer to have it sitting in linux-next for a while and send it
  for inclusion during the 5.3 merge window"

* tag 'pidfd-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  samples: show race-free pidfd metadata access
  signal: support CLONE_PIDFD with pidfd_send_signal
  clone: add CLONE_PIDFD
  Make anon_inodes unconditional
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull pidfd updates from Christian Brauner:
 "This patchset makes it possible to retrieve pidfds at process creation
  time by introducing the new flag CLONE_PIDFD to the clone() system
  call. Linus originally suggested to implement this as a new flag to
  clone() instead of making it a separate system call.

  After a thorough review from Oleg CLONE_PIDFD returns pidfds in the
  parent_tidptr argument. This means we can give back the associated pid
  and the pidfd at the same time. Access to process metadata information
  thus becomes rather trivial.

  As has been agreed, CLONE_PIDFD creates file descriptors based on
  anonymous inodes similar to the new mount api. They are made
  unconditional by this patchset as they are now needed by core kernel
  code (vfs, pidfd) even more than they already were before (timerfd,
  signalfd, io_uring, epoll etc.). The core patchset is rather small.
  The bulky looking changelist is caused by David's very simple changes
  to Kconfig to make anon inodes unconditional.

  A pidfd comes with additional information in fdinfo if the kernel
  supports procfs. The fdinfo file contains the pid of the process in
  the callers pid namespace in the same format as the procfs status
  file, i.e. "Pid:\t%d".

  To remove worries about missing metadata access this patchset comes
  with a sample/test program that illustrates how a combination of
  CLONE_PIDFD and pidfd_send_signal() can be used to gain race-free
  access to process metadata through /proc/&lt;pid&gt;.

  Further work based on this patchset has been done by Joel. His work
  makes pidfds pollable. It finished too late for this merge window. I
  would prefer to have it sitting in linux-next for a while and send it
  for inclusion during the 5.3 merge window"

* tag 'pidfd-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  samples: show race-free pidfd metadata access
  signal: support CLONE_PIDFD with pidfd_send_signal
  clone: add CLONE_PIDFD
  Make anon_inodes unconditional
</pre>
</div>
</content>
</entry>
<entry>
<title>signal: support CLONE_PIDFD with pidfd_send_signal</title>
<updated>2019-05-07T12:31:03+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>christian@brauner.io</email>
</author>
<published>2019-04-17T20:50:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2151ad1b067275730de1b38c7257478cae47d29e'/>
<id>2151ad1b067275730de1b38c7257478cae47d29e</id>
<content type='text'>
Let pidfd_send_signal() use pidfds retrieved via CLONE_PIDFD.  With this
patch pidfd_send_signal() becomes independent of procfs.  This fullfils
the request made when we merged the pidfd_send_signal() patchset.  The
pidfd_send_signal() syscall is now always available allowing for it to
be used by users without procfs mounted or even users without procfs
support compiled into the kernel.

Signed-off-by: Christian Brauner &lt;christian@brauner.io&gt;
Co-developed-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: "Michael Kerrisk (man-pages)" &lt;mtk.manpages@gmail.com&gt;
Cc: Andy Lutomirsky &lt;luto@kernel.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Aleksa Sarai &lt;cyphar@cyphar.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Let pidfd_send_signal() use pidfds retrieved via CLONE_PIDFD.  With this
patch pidfd_send_signal() becomes independent of procfs.  This fullfils
the request made when we merged the pidfd_send_signal() patchset.  The
pidfd_send_signal() syscall is now always available allowing for it to
be used by users without procfs mounted or even users without procfs
support compiled into the kernel.

Signed-off-by: Christian Brauner &lt;christian@brauner.io&gt;
Co-developed-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: "Michael Kerrisk (man-pages)" &lt;mtk.manpages@gmail.com&gt;
Cc: Andy Lutomirsky &lt;luto@kernel.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Aleksa Sarai &lt;cyphar@cyphar.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup: prevent spurious transition into non-frozen state</title>
<updated>2019-05-06T15:39:06+00:00</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2019-04-26T17:59:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cb2c4cd87874a7975b7b8615866b3a87bae10aab'/>
<id>cb2c4cd87874a7975b7b8615866b3a87bae10aab</id>
<content type='text'>
If freezing of a cgroup races with waking of a task from
the frozen state (like waiting in vfork() or in do_signal_stop()),
a spurious transition of the cgroup state can happen.

The task enters cgroup_leave_frozen(true), the cgroup-&gt;nr_frozen_tasks
counter decrements, and the cgroup is switched to the unfrozen state.

To prevent it, let's reserve cgroup_leave_frozen(true) for
terminating processes and use cgroup_leave_frozen(false) otherwise.

To avoid busy-looping in the signal handling loop waiting
for JOBCTL_TRAP_FREEZE set from the cgroup freezing path,
let's do it explicitly in cgroup_leave_frozen(), if the task
is going to stay frozen.

Suggested-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If freezing of a cgroup races with waking of a task from
the frozen state (like waiting in vfork() or in do_signal_stop()),
a spurious transition of the cgroup state can happen.

The task enters cgroup_leave_frozen(true), the cgroup-&gt;nr_frozen_tasks
counter decrements, and the cgroup is switched to the unfrozen state.

To prevent it, let's reserve cgroup_leave_frozen(true) for
terminating processes and use cgroup_leave_frozen(false) otherwise.

To avoid busy-looping in the signal handling loop waiting
for JOBCTL_TRAP_FREEZE set from the cgroup freezing path,
let's do it explicitly in cgroup_leave_frozen(), if the task
is going to stay frozen.

Suggested-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
