Socket is already closed when CURL_POLL_REMOVE notification is received #14976

Closed
pkropachev opened this issue Sep 20, 2024 · 8 comments

@pkropachev

pkropachev commented Sep 20, 2024

I did this

Hello Team!

We ran into a problem: it looks like the socket is sometimes already closed when the CURLMOPT_SOCKETFUNCTION callback is called with the CURL_POLL_REMOVE argument. See the ephiperfifo.c example.
Immediately after the callback is called, we check the fd with fcntl() or epoll_ctl(), and both syscalls fail with errno 9 (EBADF).

Could you please confirm whether libcurl guarantees that the fd is still open when the CURLMOPT_SOCKETFUNCTION callback is called with CURL_POLL_REMOVE, or whether that is not guaranteed?

As far as I understand, a possible workaround would be to register a CURLOPT_CLOSESOCKETFUNCTION callback and close the fd ourselves?
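
A rough sketch of what that workaround might look like, under some assumptions: `app_ctx` and `close_socket_cb()` are hypothetical names, the application polls with epoll, and the callback takes a plain `int` because `curl_socket_t` is `int` on POSIX (this keeps the sketch independent of `<curl/curl.h>`). Whether this actually avoids the problem has not been confirmed in this thread.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical application state, handed to libcurl through
   CURLOPT_CLOSESOCKETDATA. */
struct app_ctx {
  int epfd; /* epoll instance the application polls on */
};

/* Same shape as libcurl's close-socket callback,
   int cb(void *clientp, curl_socket_t item). */
static int close_socket_cb(void *clientp, int fd)
{
  struct app_ctx *ctx = clientp;
  assert(ctx != NULL);

  /* Best effort: deregister while fd is still valid. A failure here
     (for example ENOENT when fd was never registered) is ignored so
     that the descriptor is always closed and never leaked. */
  (void)epoll_ctl(ctx->epfd, EPOLL_CTL_DEL, fd, NULL);

  /* libcurl expects 0 on success and 1 on error. */
  return close(fd) == 0 ? 0 : 1;
}
```

With libcurl, this would be registered per easy handle with `curl_easy_setopt(easy, CURLOPT_CLOSESOCKETFUNCTION, close_socket_cb)` and `curl_easy_setopt(easy, CURLOPT_CLOSESOCKETDATA, &ctx)`. A CURL_POLL_REMOVE handler that runs afterwards would then have to tolerate ENOENT or EBADF from its own EPOLL_CTL_DEL.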

Unfortunately it's difficult to reproduce, so I can't provide a simple example that reproduces it.

Thanks!

I expected the following

No response

curl/libcurl version

curl 8.6.0

operating system

Ubuntu 20.04.6 LTS

@bagder
Member

bagder commented Sep 22, 2024

It would be helpful if you tested the latest version just to see if the problem still remains. libcurl is meant to remove the socket before closing it.

But without a way to reproduce it, or a detailed log from you where it happens, I don't think we can do much!

@pkropachev
Author

OK, sure. I will try to reproduce it. Actually, I guess our problem is related to setting the CURLMOPT_MAXCONNECTS option. I'm not sure we are using it properly. Either way, I will provide an example and the resulting behavior.

@steve-chavez

I'm also seeing this on libcurl 8.0.1. Digging through the source code, I found that this was fixed in #4211, but no test was added. So a regression may have happened after 7.66.0 (I haven't confirmed it was fixed there, though).

I also got here after following the ephiperfifo example. The error is hard for me to reproduce; it happens when making ~10K requests.

@steve-chavez

steve-chavez commented Sep 24, 2024

Just in case it helps: for me this only happens when name resolution is involved; I can't reproduce it locally. All the requests that succeed go through a CURL_POLL_OUT -> CURL_POLL_IN -> CURL_POLL_REMOVE sequence. The ones that fail do CURL_POLL_OUT -> CURL_POLL_REMOVE and then time out if an fcntl check is done to avoid the EBADF error with EPOLL_CTL_DEL.

Edit: fails with AsynchDNS support, with or without c-ares.

@steve-chavez

steve-chavez commented Sep 27, 2024

I can confirm that I no longer get the EBADF starting with 8.8.0, but there are timeouts, as mentioned in #15079 (comment).

It seems the problem in the older versions was that timeouts were happening, but the file descriptors were removed before the epoll_ctl call fired.

@bagder
Member

bagder commented Oct 6, 2024

My reading of the info provided in this issue is that this problem does not exist in the latest release?

@pkropachev
Author

Let me check on my side as well.

@bagder
Member

bagder commented Oct 13, 2024

Presumed fixed.

bagder closed this as completed Oct 13, 2024