
fix(terminal): repeat retry screen instruction sending during stdin cache replay (#3372) #3506

Closed
wants to merge 1 commit into from

Conversation

lypanov

@lypanov lypanov commented Jul 18, 2024

The lack of re-retry on screen instruction playback from the stdin_cache ends up breaking sixel support, as screen instructions such as the pixel dimension detection response are unreliably replayed when the screen is not yet ready. This is a regression in sixel support introduced in 0.35.0.

Unsure if this is universal or something specific to our platform. However, given that it was seen by other users across multiple terminal emulators, I started to look into it myself.

Existing tests pass, though I'd much prefer to have written a test for this relatively complicated regression. Alas, it'll be a while before I can invest the time, so I wanted to push the PR already and get some feedback / see how you feel about it.

@lypanov lypanov marked this pull request as draft July 18, 2024 16:18
@lypanov lypanov marked this pull request as ready for review July 18, 2024 16:43
@lypanov lypanov changed the title fix(terminal): repeat instruction sending on error (#3372) fix(terminal): repeat retry screen instruction sending during stdn cache replay (#3372) Jul 19, 2024
@lypanov lypanov changed the title fix(terminal): repeat retry screen instruction sending during stdn cache replay (#3372) fix(terminal): repeat retry screen instruction sending during stdin cache replay (#3372) Jul 19, 2024
This ended up breaking sixel support as screen instructions such as
  pixel dimension detection response were unreliably replayed when
  the screen was not yet ready.
@lypanov lypanov force-pushed the repeat_instruction_retries branch from f24f715 to 587c953 Compare July 19, 2024 13:44
@lypanov
Author

lypanov commented Jul 19, 2024

Force-pushed a fix for formatting.

@imsnif
Member

imsnif commented Jul 19, 2024

Great find! Correct me if I'm wrong though - doesn't this create an infinite loop with no delay and no exit condition? Or am I missing something?

@lypanov
Author

lypanov commented Jul 19, 2024

> Great find! Correct me if I'm wrong though - doesn't this create an infinite loop with no delay and no exit condition? Or am I missing something?

If the needed session never reaches a ready state, it could loop infinitely. As it stands, it already results in a burst of logging activity, so I originally had a small sleep between retry attempts, but removed it in the end as I'm not sure it's worth the extra complexity (I never saw more than 15 retries). Any artificial exit condition (as in, anything other than the normal path of the session becoming ready) would probably mean keeping track of retry counts, which I'm not convinced makes sense from an overhead perspective. Are there other conditions I'm missing? Could a single screen instruction failure result in this retrying infinitely? My Rust is pretty basic, but my reading is that the only condition that results in a retry is the screen not being ready.

To guard against a plausible "logging is inundated by retries" scenario, where unrelated development work results in the screen never reaching a ready state, the logging itself could be modified to make this clear, or an exponential backoff could be added.
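For illustration, the exponential-backoff idea mentioned above could be sketched roughly as follows. This is only a sketch under assumptions: `send_instruction` is a hypothetical stand-in for whatever attempts the screen instruction send, not the actual zellij API.

```rust
use std::time::Duration;

// Hypothetical sketch: retry a send with exponential backoff instead of an
// unbounded tight loop. `send_instruction` returns true once the screen is
// ready and the send succeeds. Returns the attempt number that succeeded.
fn retry_with_backoff<F>(mut send_instruction: F, max_attempts: u32) -> Result<u32, ()>
where
    F: FnMut() -> bool,
{
    let mut delay = Duration::from_millis(5);
    for attempt in 1..=max_attempts {
        if send_instruction() {
            return Ok(attempt);
        }
        std::thread::sleep(delay);
        // Double the delay on each retry, capped at 100ms.
        delay = (delay * 2).min(Duration::from_millis(100));
    }
    Err(())
}

fn main() {
    // Simulate a screen that becomes ready on the third attempt.
    let mut calls = 0;
    let result = retry_with_backoff(
        || {
            calls += 1;
            calls >= 3
        },
        10,
    );
    assert_eq!(result, Ok(3));
}
```

The cap keeps worst-case latency bounded while still giving a slow session time to come up, and `max_attempts` is one way to express the artificial exit condition discussed above.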

Let me know your thoughts / preferences!

@manueldeprada
Contributor

Thanks for the work @lypanov
The fix works as expected for me.

@imsnif
Member

imsnif commented Jul 26, 2024

@lypanov - I would like to add some sort of pause between retries. I think that would be our best option. Otherwise, even killing the process if the server becomes unresponsive would be challenging: the machine will probably go to 100% CPU, and it would generally be an even more unpleasant experience when things crash.

@imsnif
Member

imsnif commented Aug 13, 2024

No stress ofc, but any updates on this?

@lypanov
Author

lypanov commented Aug 13, 2024

Sorry for the delay, Aram. I've postponed this for a week or two more, as I'm currently working on a personal project after a layoff.

@manueldeprada
Contributor

@imsnif take a look here: 64a1cbc

Is adding a 100 ms sleep there enough? Or were you thinking of another kind of pause?

@imsnif
Member

imsnif commented Aug 16, 2024

@manueldeprada - 100ms is a lot, but I think it might even work with 5-10ms. You can test this by creating an infinite loop in this case (eg. not handling the instructions and just pushing them over and over into the Vec) and seeing if you can kill the process without -9. (make sure to compile with --release or to install with cargo x install <path>).
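The suggestion above (a few milliseconds of sleep whenever the screen isn't ready, so a wedged server doesn't spin at 100% CPU) might look roughly like this sketch. The `Instruction` type and `screen_ready` check are illustrative stand-ins, not the actual zellij internals.

```rust
use std::thread;
use std::time::Duration;

// Illustrative stand-in for a cached screen instruction.
#[derive(Clone, Debug, PartialEq)]
struct Instruction(String);

// Hypothetical sketch: drain the stdin cache, but when the screen is not yet
// ready, requeue the instruction and sleep briefly instead of busy-looping.
fn replay_stdin_cache(
    mut cache: Vec<Instruction>,
    mut screen_ready: impl FnMut() -> bool,
) -> Vec<Instruction> {
    let mut handled = Vec::new();
    while let Some(instruction) = cache.pop() {
        if screen_ready() {
            handled.push(instruction);
        } else {
            // Screen not ready: put the instruction back and back off for
            // a few milliseconds so the loop stays interruptible and cheap.
            cache.push(instruction);
            thread::sleep(Duration::from_millis(5));
        }
    }
    handled
}

fn main() {
    let cache = vec![Instruction("a".into()), Instruction("b".into())];
    // Simulate a screen that becomes ready after two polls.
    let mut polls = 0;
    let handled = replay_stdin_cache(cache, || {
        polls += 1;
        polls > 2
    });
    assert_eq!(handled.len(), 2);
}
```

With a 5ms sleep, even a loop that never exits only wakes ~200 times per second, which is why the CPU impact in the test below is negligible.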

@lypanov
Author

lypanov commented Aug 16, 2024 via email

@imsnif
Member

imsnif commented Aug 16, 2024

While I don't always follow Zero One Infinity as can be seen in the code base, I think this is a good place to follow it. In other words: we have no way of knowing how many attempts will be the right amount (especially considering old hardware, temporary hiccups, etc.)

I hear you about the log spam, but we have spooling mechanisms in place so at worst it shouldn't take up more than a few hundred kilobytes if memory serves.

@manueldeprada
Contributor

manueldeprada commented Aug 20, 2024

> @manueldeprada - 100ms is a lot, but I think it might even work with 5-10ms. You can test this by creating an infinite loop in this case (eg. not handling the instructions and just pushing them over and over into the Vec) and seeing if you can kill the process without -9. (make sure to compile with --release or to install with cargo x install <path>).

I just tested with 5ms, --release and a while true loop (without handle_instruction), and the impact on CPU seems totally negligible:
(screenshot of CPU usage omitted)

@manueldeprada
Contributor

This PR can be closed now :)

Thanks for your efforts @imsnif

and thanks so much for initially pushing this PR @lypanov !!!

@lypanov
Author

lypanov commented Oct 16, 2024

I'll try to find the time to at least push my branch, which fixes the aspect ratio issue too. Though that'll require a lot more conversation than this relatively trivial one.

@lypanov lypanov closed this Oct 16, 2024