Memory issue with version v3.667.0 #6553

Open
JeremieDoctrine opened this issue Oct 9, 2024 · 18 comments
Assignees
kuhe

Labels
bug · This issue is a bug.
closing-soon · This issue will automatically close in 4 days unless further comments are made.
p0 · This issue is the highest priority.
potential-regression · Marking this issue as a potential regression to be checked by a team member.
workaround-available · This issue has a workaround available.

Comments

@JeremieDoctrine commented Oct 9, 2024

Describe the bug

We have three services using the AWS SDK v3.667.0.
This is the memory usage per version (screenshot: CleanShot 2024-10-09 at 16 01 50@2x).

Regression Issue

  • Select this option if this issue appears to be a regression.

SDK version number

@aws-sdk/client-dynamodb@3.667.0, @aws-sdk/lib-dynamodb@3.667.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v22.4.1

Reproduction Steps

Run npm start; the container immediately runs out of memory.

Observed Behavior

Memory increases at boot time

Expected Behavior

Memory does not increase at boot time

Possible Solution

No response

Additional Information/Context

No response

@JeremieDoctrine JeremieDoctrine added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Oct 9, 2024
@github-actions github-actions bot added the potential-regression Marking this issue as a potential regression to be checked by team member label Oct 9, 2024
@JeremieDoctrine (Author) commented Oct 9, 2024

I know there is very little information in this ticket; I don't have much time to investigate.
But I'm quite sure this version is bugged, and I wanted to create an issue because I'm sure other people will run into the same problem.

@Chrisp1tv commented:

Hello!
I'm working on a Node.js (version 22.9.0) project, and I've been observing the same issue when using the AWS SDK with Node.js streams since this version, 3.667.0. We can provide more details if it helps to debug the issue :)

@lucavb commented Oct 9, 2024

In our case it is also making our Nest app go OOM on startup.

@FelixK-Witt commented Oct 9, 2024

I can confirm that our deployments run out of memory during startup too when upgrading from 3.666.0 to 3.667.0. We're using a Node/Express server.

@MathieuGuillet commented:

Same here with Node.js / Nest and these dependencies:

    "@aws-sdk/client-s3": "^3.667.0",
    "@aws-sdk/credential-providers": "^3.667.0",
    "@aws-sdk/lib-storage": "^3.667.0",
    "@aws-sdk/lib-storage": "^3.667.0",
    "@aws-sdk/rds-signer": "^3.667.0",

@krukid commented Oct 9, 2024

Same here, the memory leak occurred with these AWS deps:

aws-sdk@2.1691.0
@aws-sdk/credential-providers@3.667.0

Rolling back to @aws-sdk/credential-providers@3.664.0 fixed the issue.

@kuhe (Contributor) commented Oct 9, 2024

Hi, we are marking the 3.666.0 series of @aws-sdk/* packages as latest on npm while we prepare a fix.

@kuhe kuhe added the p0 This issue is the highest priority label Oct 9, 2024
@kuhe kuhe self-assigned this Oct 9, 2024
@kuhe kuhe removed the needs-triage This issue or PR still needs to be triaged. label Oct 9, 2024
@kuhe kuhe added the pending-release This issue will be fixed by an approved PR that hasn't been released yet. label Oct 9, 2024
@kuhe (Contributor) commented Oct 9, 2024

I'm sorry for not catching this bug. A fix has been made in PR #6555.

We will release the new version later today.

The root cause is a call to the memoized credentials provider function inside the user-agent middleware. The credentials provider may itself make an SDK call (e.g. to STS), and during that invocation it loops back into the same middleware.

After resolving the immediate issue, we will investigate how to improve test coverage to avoid recurrences.
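
To sketch the shape of that loop in plain JavaScript (a simplified, hypothetical illustration with made-up names, not the SDK's actual middleware code): the provider issues a nested request, the nested request runs the user-agent middleware, and the middleware resolves the provider again, so nothing ever bottoms out.

// Simplified, hypothetical sketch of the recursion described above
// (plain functions with made-up names, not the SDK's real internals).
function buildClient() {
    const resolveCredentials = async () => {
        // A real provider may call STS here; that nested request goes
        // through the same middleware stack as any other operation.
        // Memoization doesn't help yet, because nothing has resolved.
        return send({ operation: 'AssumeRoleWithWebIdentity' });
    };

    const userAgentMiddleware = (next) => async (args) => {
        // Resolving credentials inside the middleware re-enters send(),
        // which runs this middleware again: unbounded recursion.
        await resolveCredentials();
        return next(args);
    };

    const send = async (command) =>
        userAgentMiddleware(async (args) => args)({ command });

    return { send };
}

// buildClient().send({ operation: 'GetObject' }); // would never terminate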

@kuhe (Contributor) commented Oct 9, 2024

https://github.com/aws/aws-sdk-js-v3/releases/tag/v3.668.0 has been released with what I believe is the fix.

That said, does anyone have a more specific reproduction setup for this issue?

@kuhe kuhe added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed pending-release This issue will be fixed by an approved PR that hasn't been released yet. labels Oct 9, 2024
@matt-halliday commented Oct 10, 2024

> https://github.com/aws/aws-sdk-js-v3/releases/tag/v3.668.0 has been released with what I believe is the fix.
>
> That said, does anyone have a more specific reproduction setup for this issue?

I don't have a full repro repo... can probably create one if needed?

We were seeing it in a Node API running in K8s with a service role granting GetObject access to a bucket.

A request would come in:

/get-foo/{id}

Our S3Client would go something like this:

try {
    const get = new GetObjectCommand({
        Bucket: 'bucket',
        Key: id, // the id from the request path
    });
    // we get here
    const data = await s3Client.send(get);
    // in SDK v3, Body is a stream, so read it before parsing
    return JSON.parse(await data.Body.transformToString());
} catch (err) {
    // we never get here
}

It would crash immediately...

@angusjellis commented:

We've observed the same behaviour. We have a simple tool that includes the AWS SDK as part of its dependencies.

It instantiates an S3 client before doing anything else, using an OIDC token from GitLab CI.

When the tool runs on a container with 2 GB of RAM, it works as expected on version 3.666, but on version 3.667 we can pinpoint that it runs out of memory as soon as the S3 client is instantiated.

@matt-halliday commented Oct 10, 2024

> We've observed the same behaviour. We have a simple tool that includes the AWS SDK as part of its dependencies.
>
> It instantiates an S3 client before doing anything else, using an OIDC token from GitLab CI.
>
> When the tool runs on a container with 2 GB of RAM, it works as expected on version 3.666, but on version 3.667 we can pinpoint that it runs out of memory as soon as the S3 client is instantiated.

Interesting, our client instantiated OK, but exploded whenever we tried to use it.

Edit: now I remember we were initially getting crash loops on startup, increased resource limits to get around that, and then saw the crashes on calling the client. Fog of war.

@aqeelat commented Oct 10, 2024

@kuhe @aws-sdk/types is not updated. The latest is still 3.664.0.

@kuhe (Contributor) commented Oct 10, 2024

Not all packages get updated to every version. The clients on version 3.668.0 still use @aws-sdk/types@3.667.0, which is not an affected package. The affected package is @aws-sdk/middleware-user-agent@3.667.0.

@kuhe (Contributor) commented Oct 10, 2024

It looks like container credentials may be the precondition, which I'll investigate.

@matt-halliday commented:

> It looks like container credentials may be the precondition, which I'll investigate.

Using a profile, I couldn't recreate it locally with a basic benchmark, so that's quite possible... I've attached some nasty code if it helps. You can change the client version in package.json and run npm i && npm run benchmark for output in the console.

benchmark.zip
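
In case it helps, a minimal stand-alone check along the same lines would look roughly like this (a sketch with placeholder region/bucket/key values, run as an ES module so top-level await works; not the attached benchmark):

// Log heap usage around client creation and a single request
// (placeholder values; on an affected version the numbers jump).
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';

const heapMB = () => Math.round(process.memoryUsage().heapUsed / 1024 / 1024);

console.log('before client:', heapMB(), 'MB');
const s3Client = new S3Client({ region: 'us-east-1' });
console.log('after client:', heapMB(), 'MB');

try {
    // Depending on the credential provider in use, the growth shows up
    // here rather than at instantiation.
    await s3Client.send(new GetObjectCommand({ Bucket: 'my-bucket', Key: 'my-key' }));
} catch (err) {
    console.error('request failed:', err.name);
}
console.log('after request:', heapMB(), 'MB');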

@kuhe (Contributor) commented Oct 10, 2024

I believe this is fixed with v3.668.0, so I'll be closing the issue soon unless anyone reports the issue as persisting in v3.668.0, but I'll comment with the root cause details when I'm able to determine the preconditions.

@kuhe kuhe added closing-soon This issue will automatically close in 4 days unless further comments are made. workaround-available This issue has a workaround available. and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. labels Oct 10, 2024
@kuhe (Contributor) commented Oct 10, 2024

My testing shows the issue affected AssumeRoleWithWebIdentity but not AssumeRole. AssumeRoleWithWebIdentity usually coincides with usage of process.env.AWS_WEB_IDENTITY_TOKEN_FILE.
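
For anyone checking whether they are on that path, the affected setup is roughly the one below (a sketch with placeholder values; in CI environments the default provider chain usually picks this up from AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN without any explicit configuration):

// Sketch of a client using web identity credentials (placeholder values).
import { S3Client } from '@aws-sdk/client-s3';
import { fromTokenFile } from '@aws-sdk/credential-providers';

const s3 = new S3Client({
    region: 'us-east-1',
    credentials: fromTokenFile({
        // In CI these usually come from AWS_WEB_IDENTITY_TOKEN_FILE and
        // AWS_ROLE_ARN rather than being passed explicitly.
        webIdentityTokenFile: '/var/run/secrets/tokens/oidc-token',
        roleArn: 'arn:aws:iam::123456789012:role/example-role',
    }),
});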

@github-actions github-actions bot removed the closing-soon This issue will automatically close in 4 days unless further comments are made. label Oct 11, 2024
@kuhe kuhe added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Oct 11, 2024
Projects
None yet
Development

No branches or pull requests

10 participants