Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: unify header treatment #46528

Merged

Conversation

marco-ippolito
Copy link
Member

@marco-ippolito marco-ippolito commented Feb 6, 2023

resolves: #46395
I'm not sure what the test test/parallel/test-http-server-response-standalone.js impacted by this change was trying to achieve @Trott

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/http
  • @nodejs/net

@nodejs-github-bot nodejs-github-bot added http Issues or PRs related to the http subsystem. needs-ci PRs that need a full CI run. labels Feb 6, 2023
@Trott
Copy link
Member

Trott commented Feb 6, 2023

I'm not sure what the test test/parallel/test-http-server-response-standalone.js impacted by this change was trying to achieve @Trott

It was added in #14387 by @mcollina so maybe he can add context if that PR doesn't supply the information you need.

Copy link
Contributor

@climba03003 climba03003 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look right to me.
The header should be process before ._send, so it can be ensure either merge with data and pending for first write always receive the same byte.

@ShogunPanda
Copy link
Contributor

@climba03003 Are you referring to the test, right?

@climba03003
Copy link
Contributor

climba03003 commented Feb 6, 2023

@climba03003 Are you referring to the test, right?

I means the implementation.
Combining header and data is used to increase the throughput.
Remove !encoding is a de-optimization since most of the people doesn't specify encoding.

The trade-off is process the header with consistence encoding before combine to data.
Add a bit of processing time, but still combine header and data in most of the case.

node/lib/_http_outgoing.js

Lines 438 to 587 in a2a954b

function _storeHeader(firstLine, headers) {
// firstLine in the case of request is: 'GET /index.html HTTP/1.1\r\n'
// in the case of response it is: 'HTTP/1.1 200 OK\r\n'
const state = {
connection: false,
contLen: false,
te: false,
date: false,
expect: false,
trailer: false,
header: firstLine
};
if (headers) {
if (headers === this[kOutHeaders]) {
for (const key in headers) {
const entry = headers[key];
processHeader(this, state, entry[0], entry[1], false);
}
} else if (ArrayIsArray(headers)) {
if (headers.length && ArrayIsArray(headers[0])) {
for (let i = 0; i < headers.length; i++) {
const entry = headers[i];
processHeader(this, state, entry[0], entry[1], true);
}
} else {
if (headers.length % 2 !== 0) {
throw new ERR_INVALID_ARG_VALUE('headers', headers);
}
for (let n = 0; n < headers.length; n += 2) {
processHeader(this, state, headers[n + 0], headers[n + 1], true);
}
}
} else {
for (const key in headers) {
if (ObjectPrototypeHasOwnProperty(headers, key)) {
processHeader(this, state, key, headers[key], true);
}
}
}
}
let { header } = state;
// Date header
if (this.sendDate && !state.date) {
header += 'Date: ' + utcDate() + '\r\n';
}
// Force the connection to close when the response is a 204 No Content or
// a 304 Not Modified and the user has set a "Transfer-Encoding: chunked"
// header.
//
// RFC 2616 mandates that 204 and 304 responses MUST NOT have a body but
// node.js used to send out a zero chunk anyway to accommodate clients
// that don't have special handling for those responses.
//
// It was pointed out that this might confuse reverse proxies to the point
// of creating security liabilities, so suppress the zero chunk and force
// the connection to close.
if (this.chunkedEncoding && (this.statusCode === 204 ||
this.statusCode === 304)) {
debug(this.statusCode + ' response should not use chunked encoding,' +
' closing connection.');
this.chunkedEncoding = false;
this.shouldKeepAlive = false;
}
// keep-alive logic
if (this._removedConnection) {
// shouldKeepAlive is generally true for HTTP/1.1. In that common case,
// even if the connection header isn't sent, we still persist by default.
this._last = !this.shouldKeepAlive;
} else if (!state.connection) {
const shouldSendKeepAlive = this.shouldKeepAlive &&
(state.contLen || this.useChunkedEncodingByDefault || this.agent);
if (shouldSendKeepAlive && this.maxRequestsOnConnectionReached) {
header += 'Connection: close\r\n';
} else if (shouldSendKeepAlive) {
header += 'Connection: keep-alive\r\n';
if (this._keepAliveTimeout && this._defaultKeepAlive) {
const timeoutSeconds = MathFloor(this._keepAliveTimeout / 1000);
let max = '';
if (~~this._maxRequestsPerSocket > 0) {
max = `, max=${this._maxRequestsPerSocket}`;
}
header += `Keep-Alive: timeout=${timeoutSeconds}${max}\r\n`;
}
} else {
this._last = true;
header += 'Connection: close\r\n';
}
}
if (!state.contLen && !state.te) {
if (!this._hasBody) {
// Make sure we don't end the 0\r\n\r\n at the end of the message.
this.chunkedEncoding = false;
} else if (!this.useChunkedEncodingByDefault) {
this._last = true;
} else if (!state.trailer &&
!this._removedContLen &&
typeof this._contentLength === 'number') {
header += 'Content-Length: ' + this._contentLength + '\r\n';
} else if (!this._removedTE) {
header += 'Transfer-Encoding: chunked\r\n';
this.chunkedEncoding = true;
} else {
// We should only be able to get here if both Content-Length and
// Transfer-Encoding are removed by the user.
// See: test/parallel/test-http-remove-header-stays-removed.js
debug('Both Content-Length and Transfer-Encoding are removed');
}
}
// Test non-chunked message does not have trailer header set,
// message will be terminated by the first empty line after the
// header fields, regardless of the header fields present in the
// message, and thus cannot contain a message body or 'trailers'.
if (this.chunkedEncoding !== true && state.trailer) {
throw new ERR_HTTP_TRAILER_INVALID();
}
this._header = header + '\r\n';
this._headerSent = false;
// Wait until the first body chunk, or close(), is sent to flush,
// UNLESS we're sending Expect: 100-continue.
if (state.expect) this._send('');
}
function processHeader(self, state, key, value, validate) {
if (validate)
validateHeaderName(key);
if (ArrayIsArray(value)) {
if (
(value.length < 2 || !isCookieField(key)) &&
(!self[kUniqueHeaders] || !self[kUniqueHeaders].has(StringPrototypeToLowerCase(key)))
) {
// Retain for(;;) loop for performance reasons
// Refs: https://github.com/nodejs/node/pull/30958
for (let i = 0; i < value.length; i++)
storeHeader(self, state, key, value[i], validate);
return;
}
value = ArrayPrototypeJoin(value, '; ');
}
storeHeader(self, state, key, value, validate);
}

@ShogunPanda
Copy link
Contributor

I knew about the optimization. I was double checking.
I guess when encoding it's not set we have to make sure we correctly encode the headers. Hoping it does not result in big performance loss.

@climba03003
Copy link
Contributor

climba03003 commented Feb 6, 2023

I guess when encoding it's not set we have to make sure we correctly encode the headers. Hoping it does not result in big performance loss.

Inside the code, node.js is always expecting the header is latin1 encoding.
So, we can do something like Buffer.from(headerValue, 'latin').toString('utf-8') in _storeHeader.
Maybe there is another approach which is more performant.

In this case, we can expect the non-string branch always utf-8. Since we do the transformation already.
In string branch, latin1 or utf-8 encoding doesn't really matter. Since the byte already changed.

// from utf-8 to latin1
const first = Buffer.from('å', 'utf8').toString('latin1')
console.log(first, Buffer.from(first)) // <Buffer c3 83 c2 a5>

// here is what should be done before hand
const second = Buffer.from('å', 'latin1').toString('utf8')
console.log(second, Buffer.from(second)) // <Buffer ef bf bd>

// when data is latin1
const third = Buffer.from('ef bf bd', 'hex')
console.log(third, Buffer.from(third, 'latin1')) // <Buffer ef>
// when data is utf-8 or no encoding
const fouth = Buffer.from('ef bf bd', 'hex')
console.log(fouth, Buffer.from(fouth, 'utf8')) // <Buffer ef>

Edit: hmm, buffer truncated doesn't seems correct.

@climba03003
Copy link
Contributor

climba03003 commented Feb 6, 2023

The problem is actually cause by content-disposition allow latin1 but other header is always restricted to ASCII.

Maybe special handling of content-disposition in _storeHeader should do the trick without massive change.
Same as nodejs/undici#1903 (comment)

@marco-ippolito
Copy link
Member Author

marco-ippolito commented Feb 6, 2023

@climba03003 I've changed it to check inside processHeader if the header is content-disposition and there is content-length it encodes it in latin1. It will not impact performance but it feels more like a workaround rather than a full solution.

@ShogunPanda
Copy link
Contributor

The RFC itself is a special case so I guess we have to act in "workaround way".

LGTM.

@ShogunPanda
Copy link
Contributor

@nodejs/http Are we good on this?

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@marco-ippolito
Copy link
Member Author

I think this is going to be a breaking change wdyt? @ronag @Trott

@ronag
Copy link
Member

ronag commented Feb 15, 2023

I think it's more of a bug fix than a breaking change...

@ShogunPanda ShogunPanda added the request-ci Add this label to start a Jenkins CI on a PR. label Feb 20, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Feb 20, 2023
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@ShogunPanda ShogunPanda added the commit-queue Add this label to land a pull request using GitHub Actions. label Feb 21, 2023
@ShogunPanda ShogunPanda added the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Feb 21, 2023
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Feb 21, 2023
@nodejs-github-bot nodejs-github-bot merged commit 3fd4343 into nodejs:main Feb 21, 2023
@nodejs-github-bot
Copy link
Collaborator

Landed in 3fd4343

targos pushed a commit that referenced this pull request Mar 13, 2023
PR-URL: #46528
Fixes: #46395
Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Robert Nagy <ronagy@icloud.com>
danielleadams pushed a commit that referenced this pull request Apr 11, 2023
PR-URL: #46528
Fixes: #46395
Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Robert Nagy <ronagy@icloud.com>
codebytere added a commit to electron/electron that referenced this pull request Apr 17, 2023
codebytere added a commit to electron/electron that referenced this pull request Apr 17, 2023
codebytere added a commit to electron/electron that referenced this pull request Apr 18, 2023
* chore: bump node in DEPS to v18.16.0

* build,test: add proper support for IBM i

nodejs/node#46739

* lib: enforce use of trailing commas

nodejs/node#46881

* src: add initial support for single executable applications

nodejs/node#45038

* lib: do not crash using workers with disabled shared array buffers

nodejs/node#41023

* src: remove shadowed variable in OptionsParser::Parse

nodejs/node#46672

* src: allow embedder control of code generation policy

nodejs/node#46368

* src: allow optional Isolate termination in node::Stop()

nodejs/node#46583

* lib: fix BroadcastChannel initialization location

nodejs/node#46864

* chore: fixup patch indices

* chore: sync filenames.json

* fix: add simdutf dep to src/inspector BUILD.gn

- nodejs/node#46471
- nodejs/node#46472

* deps: replace url parser with Ada

nodejs/node#46410

* tls: support automatic DHE

nodejs/node#46978

* fixup! src: add initial support for single executable applications

* http: unify header treatment

nodejs/node#46528

* fix: libc++ buffer overflow in string_view ctor

nodejs/node#46410

* test: include strace openat test

nodejs/node#46150

* fixup! fixup! src: add initial support for single executable applications

---------

Co-authored-by: electron-roller[bot] <84116207+electron-roller[bot]@users.noreply.github.com>
Co-authored-by: Shelley Vohr <shelley.vohr@gmail.com>
@ArsalanDotMe
Copy link
Contributor

This PR introduced a bug #50978 and there is an associated PR #50977 to fix it. Please review when possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. http Issues or PRs related to the http subsystem. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

http: unify the header treatment
8 participants