Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(gatsby-source-filesystem): allow adjusting createRemoteFileNode retry/timeout settings via env vars #24535

Merged
merged 5 commits into from
Jul 16, 2020

Conversation

webbugt
Copy link
Contributor

@webbugt webbugt commented May 27, 2020

Description

For developers with awful internet connection (like me), when using createRemoteFileNode, default timeouts of 30000ms are too short. Out of 300 files, it'd fail 1-4 images due to timeout and mess up the build process.
Thus, it would be useful to overwrite default remote file download settings (retry limits and timeouts) with .env variables.
For my specific case, just doubling all the values solved all issues.

For developers with awful internet connection (like me), when using createRemoteFileNode, default timeouts of 30000ms are too short. Out of 300 files, it'd fail 1-4 images due to timeout and mess up the build process. 
Thus, it would be useful to overwrite default remote file download settings (retry limits and timeouts) with .env variables. 
For my specific case, just doubling all the values solved all issues.
@webbugt webbugt requested a review from a team as a code owner May 27, 2020 15:08
@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label May 27, 2020
@wardpeet wardpeet added status: needs core review Currently awaiting review from Core team member type: feature or enhancement Issue that is not a bug and requests the addition of a new feature or enhancement. and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels May 28, 2020
@pieh
Copy link
Contributor

pieh commented May 28, 2020

What values you are using now? I wonder if we should just not bump "defaults", so this can be more failure-proof out of the box for flaky connections.

@pieh
Copy link
Contributor

pieh commented May 28, 2020

That said - I do see value in users being able to define their own, so I'm overall OK with this, but we should document it somewhere in gatsby-source-filesystem/README.md.

We have note:

To prevent concurrent requests overload of processRemoteNode, you can adjust the 200 default concurrent downloads, with GATSBY_CONCURRENT_DOWNLOAD environment variable.

It is currently bit weirdly placed in "Options" section, but it's really just "option" for createRemoteFileNode and not plugin option. So what I would suggest is add "Troubleshooting" section below docs for createRemoteFileNode, mention problem (ideally with example error you are getting on flaky connection so users can find it when searching for this speciffic error) and list available env vars with some explanation and their default values

@webbugt webbugt requested a review from a team as a code owner June 4, 2020 20:14
@LekoArts
Copy link
Contributor

This can also fix #24856

@webbugt
Copy link
Contributor Author

webbugt commented Jun 16, 2020

This should be it then. Do I need to expand the documentation or change the defaults?
I've used these settings, although for my specific problem only changing the timeout would probably suffice. I have a really bad connection at home and had to urgently build a large site.

GATSBY_CONCURRENT_DOWNLOAD=2 (could probably bump it to 10 now)

GATSBY_STALL_RETRY_LIMIT=6
GATSBY_STALL_TIMEOUT=60000
GATSBY_CONNECTION_RETRY_LIMIT=10
GATSBY_CONNECTION_TIMEOUT=60000

@polarathene
Copy link
Contributor

A related PR resolved the timeout issue I was hitting, which seemed to be from the timeout being before the requests were actually made/sent.

I still had trouble getting all images to consistently download(only 24 images), triggering a retry failure. In my case, the retries weren't likely to be that helpful, I have a ~10Mbps(~1MB/sec) download connection over wifi(G or N, it's an old device from 2008). Default concurrent requests was too much, I only tried the example of GATSBY_CONCURRENT_DOWNLOAD=2, which worked first time without any problems. Download time(81 sec, ~150MB) seemed to be no slower(possibly faster) than my best attempt before(22/24 downloaded).

It would probably be a useful feature if failures lowered the allowed concurrent requests(eg halving, concurrent_requests / (2 * successive_retry_attempts), such that the limit is halved as soon as a retry needs to happen, then halved again if a 2nd retry needs to try a 3rd time, and so forth(not halving whenever a retry for any request is attempted).

@webbugt
Copy link
Contributor Author

webbugt commented Jun 23, 2020

It would probably be a useful feature if failures lowered the allowed concurrent requests(eg halving, concurrent_requests / (2 * successive_retry_attempts), such that the limit is halved as soon as a retry needs to happen, then halved again if a 2nd retry needs to try a 3rd time, and so forth(not halving whenever a retry for any request is attempted).

Setting GATSBY_CONCURRENT_DOWNLOAD didn't solve my issue, which is why I had to add these env variables. Even when set to 1, it would miss some files, I really don't know why but every time for me some files timeout in 30s, but all work file with timeout set to 60.

@polarathene
Copy link
Contributor

@webbugt yes, the related PR I linked to helps address that issue. It delays the timeout timer from starting until the request is made, not when it's added the queue(something like that). It'd be great if either one of these PRs got merged though.

@pieh pieh changed the title fix(gatsby-source-filesystem) Remote File Node env variables feat(gatsby-source-filesystem): allow adjusting createRemoteFileNode retry/timeout settings via env vars Jul 16, 2020
Copy link
Contributor

@pieh pieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in. Solutions discussed here ( auto adjusting concurrency settings in response to failures ) sounds good, but this can act as kind of stopgap solution

@pieh pieh merged commit c08fa4a into gatsbyjs:master Jul 16, 2020
@gatsbot
Copy link

gatsbot bot commented Jul 16, 2020

Holy buckets, @webbugt — we just merged your PR to Gatsby! 💪💜

Gatsby is built by awesome people like you. Let us say “thanks” in two ways:

  1. We’d like to send you some Gatsby swag. As a token of our appreciation, you can go to the Gatsby Swag Store and log in with your GitHub account to get a coupon code good for one free piece of swag. We’ve got Gatsby t-shirts, stickers, hats, scrunchies, and much more. (You can also unlock even more free swag with 5 contributions — wink wink nudge nudge.) See gatsby.dev/swag for details.
  2. We just invited you to join the Gatsby organization on GitHub. This will add you to our team of maintainers. Accept the invite by visiting https://github.com/orgs/gatsbyjs/invitation. By joining the team, you’ll be able to label issues, review pull requests, and merge approved pull requests.

If there’s anything we can do to help, please don’t hesitate to reach out to us: tweet at @gatsbyjs and we’ll come a-runnin’.

Thanks again!

mwfrost pushed a commit to mwfrost/gatsby that referenced this pull request Apr 20, 2023
…` retry/timeout settings via env vars (gatsbyjs#24535)

* (gatsby-source-filesystem) Remote File Node env variables

For developers with awful internet connection (like me), when using createRemoteFileNode, default timeouts of 30000ms are too short. Out of 300 files, it'd fail 1-4 images due to timeout and mess up the build process. 
Thus, it would be useful to overwrite default remote file download settings (retry limits and timeouts) with .env variables. 
For my specific case, just doubling all the values solved all issues.

* Update create-remote-file-node.js

removed ";"s

* updated readme

* update

Co-authored-by: Michal Piechowiak <misiek.piechowiak@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: needs core review Currently awaiting review from Core team member type: feature or enhancement Issue that is not a bug and requests the addition of a new feature or enhancement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants