Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: improve test-https-server-keep-alive-timeout #13312

Closed
wants to merge 3 commits into from

Conversation

Trott
Copy link
Member

@Trott Trott commented May 30, 2017

The test is flaky under load. These changes greatly improve reliability.

  • Use a recurring interval to determine when the test should end rather
    than a timer.
  • Increase server timeout to 500ms to allow for events being delayed by
    system load

Changing to an interval has the added benefit of reducing the test run
time from over 2 seconds to under 1 second.

Fixes: #13307

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • commit message follows commit guidelines
Affected core subsystem(s)

test https

@Trott Trott added https Issues or PRs related to the https subsystem. test Issues and PRs related to the tests. labels May 30, 2017
}));
});

test(function serverNoEndKeepAliveTimeoutWithPipeline(cb) {
let socket;
let destroyedSockets = 0;
let timeoutCount = 0;
let requestCount = 0;
process.on('exit', () => {
assert.strictEqual(timeoutCount, 1);
assert.strictEqual(requestCount, 3);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert.strictEqual(requestCount, 3); [](start = 4, length = 36)

Do we need this since you are clearing the interval with requestCount === 3 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it in to make sure it doesn't subsequently increase to 4 or higher. Probably not needed but also maybe doesn't hurt?

@@ -31,20 +31,18 @@ function run() {

test(function serverKeepAliveTimeoutWithPipeline(cb) {
let socket;
let destroyedSockets = 0;
let timeoutCount = 0;
let requestCount = 0;
process.on('exit', function() {
assert.strictEqual(timeoutCount, 1);
assert.strictEqual(requestCount, 3);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the test be refactored to eliminate the need for the exit handler?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I'd rather that be in a separate PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that since this PR is focused on robustness, it will be "offtopic" to remove, but now it "cries out" to remove this in favor of a mustCall(x,3) in line (new file) 37, and corresponding 63 & 66.

Copy link
Contributor

@aqrln aqrln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@lpinca
Copy link
Member

lpinca commented May 31, 2017

Maybe I'm missing something but why is the polling/interval needed? Isn't wrapping the callback of server.setTimeout() with common.mustCall() sufficient?

@Trott
Copy link
Member Author

Trott commented May 31, 2017

Maybe I'm missing something but why is the polling/interval needed? Isn't wrapping the callback of server.setTimeout() with common.mustCall() sufficient?

@lpinca I added an additional commit to remove the interval. Seems to work (including under load). LGTY?

@lpinca
Copy link
Member

lpinca commented May 31, 2017

LGTM. It's a lot cleaner 👍 .

@refack
Copy link
Contributor

refack commented May 31, 2017

LGTM. It's a lot cleaner 👍 .

Personally I like the setInterval(fn, 1) idiom (#13252 (comment)) since it's less arbitrary than 500, but not enough for asking another change 🤷‍♂️
I do think we should think about this... #13346 just so we don't drive contributors crazy with back and forth.

@Trott
Copy link
Member Author

Trott commented May 31, 2017

Personally I like the setInterval(fn, 1) idiom (#13252 (comment)) since it's less arbitrary than 500,

I don't think setInterval() can be applied here, or at least not trivially, because the 500 is not the timeout for a setTimeout() call but a server.setTimeout(). Unfortunate/confusing name, but there it is.

@lpinca
Copy link
Member

lpinca commented May 31, 2017

What @Trott said. In this case the interval was cleared when the server.setTimeout() callback was fired making the whole interval useless imo.

@Trott
Copy link
Member Author

Trott commented May 31, 2017

(Oh, and the 500 ms server.setTimeout() has been there all along. It didn't replace the setInterval(). Just in case that's the source of confusion. In the existing test, it's a 200 ms timeout though.)

@refack
Copy link
Contributor

refack commented May 31, 2017

Yeah ok, but still the setInterval had if (socket && socket.destroyed && requestCount === 3). Who counts to 3 now?
AFAICT the 500 is there to give the 3 req enough time to fire? So it is an "empirical" solution not an "analytical" one, right? On an extremely busy machine the 3 will not get to the server and the test will flake.

Again I'm not objecting, just thinking out loud. I'm fine with K.I.S.S.

@Trott
Copy link
Member Author

Trott commented Jun 1, 2017

@lpinca
Copy link
Member

lpinca commented Jun 1, 2017

Yeah ok, but still the setInterval had if (socket && socket.destroyed && requestCount === 3). Who counts to 3 now?

I don't think this is relevant. As it was, the interval would have never been cleared if requestCount !== 3 making the test fail for timeout? The requestCount is still there and checked when process exits.

I wonder if that small value (50) for server.keepAliveTimeout is intended though.

Copy link
Contributor

@refack refack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just nits

@@ -31,20 +31,18 @@ function run() {

test(function serverKeepAliveTimeoutWithPipeline(cb) {
let socket;
let destroyedSockets = 0;
let timeoutCount = 0;
let requestCount = 0;
process.on('exit', function() {
assert.strictEqual(timeoutCount, 1);
assert.strictEqual(requestCount, 3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that since this PR is focused on robustness, it will be "offtopic" to remove, but now it "cries out" to remove this in favor of a mustCall(x,3) in line (new file) 37, and corresponding 63 & 66.

socket.destroy();
});
server.close();
Copy link
Contributor

@refack refack Jun 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just for readability, could you put comment that // this ends the test or do a:

function endTest(socket) {
  socket.destroy();
  server.close();
  cb();
}
...
server.setTimeout(500, common.mustCall(endTest));

Here and in L41?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Trott added 3 commits June 1, 2017 11:13
The test is flaky under load. These changes greatly improve reliability.

* Use a recurring interval to determine when the test should end rather
  than a timer.
* Increase server timeout to 500ms to allow for events being delayed by
  system load

Changing to an interval has the added benefit of reducing the test run
time from over 2 seconds to under 1 second.

Fixes: nodejs#13307
@Trott
Copy link
Member Author

Trott commented Jun 1, 2017

jasnell pushed a commit that referenced this pull request Jun 1, 2017
The test is flaky under load. These changes greatly improve reliability.

* Use a recurring interval to determine when the test should end rather
  than a timer.
* Increase server timeout to 500ms to allow for events being delayed by
  system load

Changing to an interval has the added benefit of reducing the test run
time from over 2 seconds to under 1 second.

Fixes: #13307

PR-URL: #13312
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
@jasnell
Copy link
Member

jasnell commented Jun 1, 2017

Landed in ce5745b

@jasnell jasnell closed this Jun 1, 2017
jasnell pushed a commit that referenced this pull request Jun 5, 2017
The test is flaky under load. These changes greatly improve reliability.

* Use a recurring interval to determine when the test should end rather
  than a timer.
* Increase server timeout to 500ms to allow for events being delayed by
  system load

Changing to an interval has the added benefit of reducing the test run
time from over 2 seconds to under 1 second.

Fixes: #13307

PR-URL: #13312
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
@jasnell jasnell mentioned this pull request Jun 5, 2017
aqrln pushed a commit that referenced this pull request Jun 9, 2017
Make the same reliability changes that were applied to the https test in
ce5745b.

Refs: #13312
PR-URL: #13448
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
addaleax pushed a commit that referenced this pull request Jun 10, 2017
Make the same reliability changes that were applied to the https test in
ce5745b.

Refs: #13312
PR-URL: #13448
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
@gibfahn gibfahn mentioned this pull request Jun 15, 2017
3 tasks
@Trott Trott deleted the deflake-timeout branch January 13, 2022 22:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
https Issues or PRs related to the https subsystem. test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate flaky test-https-server-keep-alive-timeout
7 participants