Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix regen_certs.yml workflow #34484

Open
jcscottiii opened this issue Jun 17, 2022 · 3 comments
Open

Fix regen_certs.yml workflow #34484

jcscottiii opened this issue Jun 17, 2022 · 3 comments

Comments

@jcscottiii
Copy link
Contributor

jcscottiii commented Jun 17, 2022

Hi everyone! This is my first bug. So I have some questions /options at the end due to my lack of knowledge. Any help would be great. I can create the PR for whatever option we pick.

The regen_certs workflow is currently failing.

Error

Uncaught Exception: TypeError: %d format: a real number is required, not NoneType

Debugging

Looking at the stack trace, this is the impacted area:

wpt/tools/serve/serve.py

Lines 1226 to 1233 in 22d4255

server.stop(timeout=1)
if server.proc.exitcode == 0:
logger.info('Status of subprocess "%s": exited correctly', subproc.name)
else:
logger.warning('Status of subprocess "%s": failed. Exit with non-zero status: %d',
subproc.name, subproc.exitcode)
failed_subproc += 1

Line 1226 - Calls the stop (which calls multiprocess.join()). The timeout is currently set to 1 second.

Line 1231 - Has the uncaught exception. Exitcode is None still.

Hypothesis

Seems like the process is still alive. Which would explain why exit code is None.

Verification

Added this line to print if is still alive
image

It indeed did is still alive

image

Resolution

Options:

  1. Increase the stop timeout to 5 seconds. This worked for me locally. Simple solution. Also add a check for exit code in the else statement
  2. If it indeed needs to stop within 1 second, count it as a failure. Q: Is it supposed to complete within 1 second?

Bigger question: What caused it to take longer for the thread to join all of a sudden? I have not dug deep into that.

@KyleJu
Copy link
Contributor

KyleJu commented Jun 17, 2022

Thanks for walking us through the analysis and I think what you said there makes lots sense. In terms of your solutions, I am up for increasing the timeout but hesitate to do it directly. [wpt/tools/serve/serve.py](https://github.com/web-platform-tests/wpt/blob/22d425520a0de4af40b56b5a35bc18b07cd9996c/tools/serve/serve.py#L1226-L1233) file is used/shared internally (inside chromium) and externally (upstream GitHub) through a two-way sync and might not be prudent if we simply increase the timeout from 1 to 5 seconds. If the issue is caused by serve.py itself, I would expect to see similar bugs being filed on GitHub or Chromium directly a while back. Looking at the error messages:

Arguments: ('http-private on port 36793', None)

The timeout seems specific for the http-private server, so I would suggest

  • Find the minimum seconds it takes for http-private to join and set it as an additional parameter in here, but keeps the default timeout to 1 second
  • Try to figure out why http-private takes longer to join, although we have to dig into the changes when this workflow started failing. Basically look through the changes from Oct 11 (last time this workflow passed) to Nov 11 (first failure)

@jcscottiii
Copy link
Contributor Author

Quick comparison of the commits between those dates: 44642fa...035df0a

@foolip
Copy link
Member

foolip commented Jul 6, 2022

Fixing this might help with #34715. I also reckon the process is still running, but the question is if it would exit if we keep waiting. I think the first step should be to fix the logging if the process is still running, since that might happen again in the future for some other reason.

I'd do that by logging something different if subproc.exitcode is None.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants