I have been able to reproduce the problem in two ways.
The first occurs at random, over spans of time, when running in release
mode.
The second seems to occur every time I run internal tools linked against
libprofiler under gdb/cgdb.
I have been unable to produce a simplified reproducer that can be shared.
What steps will reproduce the problem?
1. compile code in debug mode, linked against libprofiler.so
2. run executable in cgdb
3. wait
4. interrupt execution and observe that:
a. all but one thread are waiting in poll, epoll, pthread_cond_wait, etc.
b. one thread is stuck in a fork system call, on the ARCH_FORK line
c. CPU is at 100%
What is the expected output? What do you see instead?
The program is expected to finish normally.
The program hangs 'forever' in a call to fork(), on the ARCH_FORK() macro, with
$rax = -ERESTARTNOINTR.
What version of the product are you using? On what operating system?
2.2.1 / 2.4
RHEL6
Please provide any additional information below.
I have a quick (incomplete) fix (attached) for this that uses pthread_atfork and
pthread_sigmask to block SIGPROF before a fork and then unblock it
afterwards. In my testing, this always prevents the hang.
I have shared my fix with Developer Services at my job and they have
indicated that they would prefer this solution to be patched into the
gperftools source code.
While this is probably sufficient for the use case at my job, it feels
incomplete for the purposes of patching into the gperftools codebase.
Original issue reported on code.google.com by Sam.J.Ja...@gmail.com on 20 Jul 2015 at 5:18
Thanks for the bug report.
I would like to understand it a bit more. It's great that blocking SIGPROF
during fork helps your case, but I'm really curious why not blocking it causes
fork to spin. Is it because the signal always triggers during fork? But then how
is that possible?
Can you please submit a test program that causes this behavior? Or maybe
elaborate more on your findings?
Original comment by alkondratenko on 21 Jul 2015 at 2:44
The signal does not always trigger during fork when run in release mode.
However, as far as I can tell, it does always trigger under GDB/CGDB.
From my understanding, the kernel handles this return code by re-attempting the
interrupted syscall (resetting $rax and moving the instruction pointer back). Why
this gets trapped in a spin is beyond me, though.
As I mentioned, I have so far been unable to create a reproducer, but I
will keep looking into it.
Original comment by Sam.J.Ja...@gmail.com on 21 Jul 2015 at 3:10