Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix splice_impl and tee_impl: Fix readiness and make returned fut Send #28

Merged
merged 13 commits into from
Jun 26, 2022

Conversation

NobodyXu
Copy link
Contributor

@NobodyXu NobodyXu commented Jun 20, 2022

Fixed #26

Fix:

  • splice_impl: Make auto-generated Future to be Send.
  • splice_impl: Fix readiness detection using test_read_write_readiness
  • tee_impl: Fix readiness detection using test_read_write_readiness
  • test_read_write_readiness: Implemented using libc::poll

Signed-off-by: Jiahao XU Jiahao_XU@outlook.com

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@pkolaczk
Copy link

Wow! That was fast!

@pkolaczk
Copy link

Not it compiles fine for me!

@NobodyXu NobodyXu changed the title Fix splice_impl: Make the ret fut to be Send Fix splice_impl: Make the returned fut to be Send Jun 20, 2022
@NobodyXu NobodyXu changed the title Fix splice_impl: Make the returned fut to be Send Fix splice_impl: Make the returned Future to be Send Jun 20, 2022
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@NobodyXu NobodyXu changed the title Fix splice_impl: Make the returned Future to be Send Fix splice_impl and tee_impl: Fix readiness and make returned fut Send Jun 21, 2022
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@codecov-commenter
Copy link

codecov-commenter commented Jun 21, 2022

Codecov Report

Merging #28 (54c92a4) into master (ec7414e) will decrease coverage by 5.80%.
The diff coverage is 31.70%.

@@            Coverage Diff             @@
##           master      #28      +/-   ##
==========================================
- Coverage   81.93%   76.13%   -5.81%     
==========================================
  Files           1        1              
  Lines         620      683      +63     
==========================================
+ Hits          508      520      +12     
- Misses        112      163      +51     
Impacted Files Coverage Δ
src/lib.rs 76.13% <31.70%> (-5.81%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec7414e...54c92a4. Read the comment docs.

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
src/lib.rs Outdated Show resolved Hide resolved
@NobodyXu NobodyXu force-pushed the fix/splice_impl branch 4 times, most recently from 6dc30ea to a5f0331 Compare June 21, 2022 07:38
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
by issueing a read/write of zero size.

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
by issueing a read/write of zero size.

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Do not use `as_mut_slice` method of array.

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@pkolaczk
Copy link

After testing it a bit more with dd piped to nc, I noticed it sometimes locks up.

@NobodyXu
Copy link
Contributor Author

After testing it a bit more with dd piped to nc, I noticed it sometimes locks up.

I think you can use strace again to see which syscall is blocking it, or you can share the relevant code, otherwise I can't help you find the problem.

@pkolaczk
Copy link

Yeah, of course. Trying to debug it.

@pkolaczk
Copy link

strace logs before it locked up:

[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536
[pid 82854] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65218
[pid 82854] splice(11, NULL, 10, NULL, 318, SPLICE_F_NONBLOCK) = -1 EAGAIN (Zasoby chwilowo niedostępne)
[pid 82854] poll([{fd=11, events=POLLIN}, {fd=10, events=POLLOUT}], 2, 0) = 1 ([{fd=11, revents=POLLIN}])
[pid 82854] epoll_wait(3, [{events=EPOLLOUT, data={u32=16777222, u64=16777222}}, {events=EPOLLIN|EPOLLOUT, data={u32=16777219, u64=16777219}}, {events=EPOLLIN|EPOLLOUT, data={u32=50331649, u64=50331649}}, {events=EPOLLIN, data={u32=16777221, u64=16777221}}], 1024, -1) = 4
[pid 82854] epoll_wait(3, [{events=EPOLLOUT, data={u32=33554434, u64=33554434}}, {events=EPOLLOUT, data={u32=33554436, u64=33554436}}], 1024, -1) = 2
[pid 82854] epoll_wait(3, [{events=EPOLLIN|EPOLLOUT, data={u32=16777219, u64=16777219}}, {events=EPOLLIN|EPOLLOUT, data={u32=50331649, u64=50331649}}], 1024, -1) = 2
[pid 82854] epoll_wait(3, 

that last line is really truncated like that

@NobodyXu
Copy link
Contributor Author

@pkolaczk Looks like it's keep issueing epoll_wait syscall, which should be issued by tokio itself.

The syscalls showed in your comments all have timeout set to -1, means it will wait indefinitely.
Which means that tokio is handling some events generated by your code.

I am not sure on what caused it to happen.

@pkolaczk
Copy link

pkolaczk commented Jun 21, 2022

Which means that tokio is handling some events generated by your code.

The problem is, there is no code running except that one single loop I showed you, and the main loop that waits for the next connection.
I even removed that select! totally and now transferring data in one direction only.
It looks to me it works fine when streams aren't blocked but fails once it locks on the write side and splice returns EAGAIN.
It does not lockup when I send just very tiny amounts of data through it.

[pid 91903] splice(7, NULL, 12, NULL, 65536, SPLICE_F_NONBLOCK) = 65536    // read full buffer
[pid 91903] splice(11, NULL, 10, NULL, 65536, SPLICE_F_NONBLOCK) = 65218  // could not write all of this
[pid 91903] splice(11, NULL, 10, NULL, 318, SPLICE_F_NONBLOCK) = -1 EAGAIN  // now tried to write the rest but got EAGAIN

src/lib.rs Outdated Show resolved Hide resolved
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@pkolaczk
Copy link

It doesn't lock up after the fix.

@NobodyXu
Copy link
Contributor Author

@pkolaczk Is there any other regressions/bugs you encountered or any performance-wise problem?

@pkolaczk
Copy link

At the moment it looks like it works, but I haven't tested it very extensively.

One thing that blocks me from testing it better is the fact that this crate forces me to use a different representation of my streams, instead of just the OwnedReadHalf and OwnedWriteHalf. Which makes it hard to integrate with the rest of the code, where I have more tests and benchmarks. It cannot just swap a zero-copy implementation into the existing code base. Type aliases solve it only to some degree, as there is some other code now that relies on my stream representation to offer stuff that is not available on AsyncFd (e.g. fetching the peer addresses). I could probably get away from this by wrapping + some additional traits, but this adds a lot of accidental complexity and honestly I'd want to avoid it.

Having said that, this PR is already a huge improvement over what was before, so I'm for merging it and maybe let's address the API thing and eventual perf improvements in a separate PR?

@NobodyXu
Copy link
Contributor Author

@yskszk63 Can you review this PR please?

@yskszk63
Copy link
Owner

@NobodyXu @pkolaczk
Sorry for the late response.
Thanks for the great work!

@yskszk63 yskszk63 merged commit 73ffe44 into yskszk63:master Jun 26, 2022
@yskszk63
Copy link
Owner

It has been released as v0.2.12.

@NobodyXu NobodyXu deleted the fix/splice_impl branch June 27, 2022 02:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Splice between a pipe and a socket
4 participants