Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent fork from happening in known fork unsafe API #10864

Merged
merged 1 commit into from
Sep 5, 2024

Conversation

casperisfine
Copy link
Contributor

@casperisfine casperisfine commented May 29, 2024

[Feature #20590]

For better of for worse, fork(2) remain the primary provider of parallelism in Ruby programs. Even though it's frowned uppon in many circles, and a lot of literature will simply state that only async-signal safe APIs are safe to use after fork(), in practice most APIs work well as long as you are careful about not forking while another thread is holding a pthread mutex.

One of the APIs that is known cause fork safety issues is getaddrinfo. If you fork while another thread is inside getaddrinfo, a mutex may be left locked in the child, with no way to unlock it.

I think we could reduce the impact of these problem by preventing in for the most notorious and common cases, by locking around fork(2) and known unsafe APIs with a read-write lock.

FYI: @mame @beauraF

This comment has been minimized.

@@ -498,9 +524,6 @@ rb_getaddrinfo(const char *hostp, const char *portp, const struct addrinfo *hint
errno = err;
return EAI_SYSTEM;
}
pthread_detach(th);

rb_thread_call_without_gvl2(wait_getaddrinfo, arg, cancel_getaddrinfo, arg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this deleted on purpose? it seems important to me! (it's the bit that actually waits on the condvar for the lookup to be done, right?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, just a bad rebase I think, let me check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fixed it. It should have been replaced by fork_safe_async_getaddrinfo(arg);

static void *
fork_safe_async_getaddrinfo(void *ptr)
{
return rb_thread_prevent_fork(async_getaddrinfo, ptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akr pointed out that it is wrong to call rb_thread_prevent_fork here.

If Ruby's getaddrinfo is interrupted, such as by Ctrl+C, the pthread in question calling getaddrinfo(3) may remain even after async_getaddrinfo returns. So this does not satisfy your purpose (to avoid fork is called during getaddrinfo(3) execution).
Instead, you should guard the call of getaddrinfo(3) (and freeaddrinfo(3) too?), in do_getaddrinfo.

Note that rb_thread_prevent_fork would be called with no GVL in this case. Because rb_ensure requires GVL, rb_thread_prevent_fork must not use it. I think rb_thread_prevent_fork doesn't have to guard longjmp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @mame. I finally figured my linking error, and made changes based on your review.

I think it's much simpler now, how does it look?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@casperisfine Thanks! It looks good to me. @akr Could you review it too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akr said it looks okay

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's great. Does that means you are good with merging it, or do you wish to add it at a dev meeting first? https://bugs.ruby-lang.org/issues/20590

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 I rebased it and enabled auto-merge. Thanks again!

@casperisfine casperisfine force-pushed the fork-lock branch 2 times, most recently from d784192 to ad730ba Compare August 1, 2024 10:46
@byroot byroot enabled auto-merge (rebase) September 5, 2024 09:00
auto-merge was automatically disabled September 5, 2024 09:00

Head branch was pushed to by a user without write access

…unsafe API

[Feature #20590]

For better of for worse, fork(2) remain the primary provider of
parallelism in Ruby programs. Even though it's frowned uppon in
many circles, and a lot of literature will simply state that only
async-signal safe APIs are safe to use after `fork()`, in practice
most APIs work well as long as you are careful about not forking
while another thread is holding a pthread mutex.

One of the APIs that is known cause fork safety issues is `getaddrinfo`.
If you fork while another thread is inside `getaddrinfo`, a mutex
may be left locked in the child, with no way to unlock it.

I think we could reduce the impact of these problem by preventing
in for the most notorious and common cases, by locking around
`fork(2)` and known unsafe APIs with a read-write lock.
@casperisfine casperisfine changed the title Proof of Concept: Allow to prevent fork from happening in known fork unsafe API Prevent fork from happening in known fork unsafe API Sep 5, 2024
@byroot byroot enabled auto-merge (rebase) September 5, 2024 09:11
@byroot byroot merged commit 63cbe3f into ruby:master Sep 5, 2024
107 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants