Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend branch caching logic to PR content #18638

Closed
rarkins opened this issue Oct 28, 2022 · 11 comments · Fixed by #18850
Closed

Extend branch caching logic to PR content #18638

rarkins opened this issue Oct 28, 2022 · 11 comments · Fixed by #18850
Assignees
Labels
core:cache performance Performance-related improvements priority-2-high Bugs impacting wide number of users or very important features status:in-progress Someone is working on implementation type:feature Feature (new functionality)

Comments

@rarkins
Copy link
Collaborator

rarkins commented Oct 28, 2022

What would you like Renovate to be able to do?

Improve performance by using intelligent cache logic to reduce work validating PR bodies.

If you have any ideas on how this should be implemented, please tell us here.

This should reuse the same data structures we have for branch caches already.

The primary goal here is to avoid the changelog lookups which take up many network requests and time.

We should work out what logic we can use to decide whether to check if a PR body needs updating. If the "fingerprint" holds, then we skip the PR body check altogether.

There is likely two parts to this fingerprinting decision:

  • Has the config fingerprint changed? If so then we always validate it
  • How old are the updates, and how long since we last validated it?

For example:

  • If any release is <2 hours old, maybe we always validate (because release notes are often written/updated soon after release)
  • If >2 <24 hours old, we check every 24 hours
  • If >1d <7d old, we check daily
  • Otherwise, weekly or never?

We also need to think about what to do with releases where we don't have timestamps. In this case we probably use the PR creation date as the starting point and not the release timestamp.

Is this a feature you are interested in implementing yourself?

Maybe

@rarkins rarkins added type:feature Feature (new functionality) priority-2-high Bugs impacting wide number of users or very important features status:requirements Full requirements are not yet known, so implementation should not be started performance Performance-related improvements labels Oct 28, 2022
@viceice
Copy link
Member

viceice commented Oct 28, 2022

We should also include such intelligent caching to changelog retrieval to cache release notes from github releases much longer for old releases. so we can avoid the network call and used local cache.

Maybe a HEAD request can improve more to check for last modified, so we save bandwidth too.

@rarkins
Copy link
Collaborator Author

rarkins commented Oct 29, 2022

Simple initial approach for PR caching: if the PR hasn't been edited in 24 hours, and there's no config change or upgrades change, then skip checking release notes or updating the PR.

Platforms like GitHub have an "updated at" field in PRs, but we are interested in the PR body, not necessarily all of the PR. e.g. if someone approves it or adds a label, that should not trigger us to check it again.

@rarkins
Copy link
Collaborator Author

rarkins commented Nov 2, 2022

If PR config fingerprint is unchanged, and PR body hasn't been edited in >24 hours, then skip the PR update check (including changelog lookups)

@rarkins
Copy link
Collaborator Author

rarkins commented Nov 3, 2022

Ultimately we want to avoid reaching this clause unnecessarily:

if (config.fetchReleaseNotes) {
// fetch changelogs when not already done;
await embedChangelogs(upgrades);
}

Currently we con't generate the pr body until further down:

const prBody = getPrBody(config, {
debugData: updatePrDebugData(existingPr?.bodyStruct?.debugData),
});

@RahulGautamSingh
Copy link
Collaborator

Simple initial approach for PR caching: if the PR hasn't been edited in 24 hours, and there's no config change or upgrades change, then skip checking release notes or updating the PR.

Platforms like GitHub have an "updated at" field in PRs, but we are interested in the PR body, not necessarily all of the PR. e.g. if someone approves it or adds a label, that should not trigger us to check it again.

Could we not store the last edited time in branchCache and use it?

@RahulGautamSingh
Copy link
Collaborator

RahulGautamSingh commented Nov 7, 2022

if (existingPr) {
logger.debug('Found existing PR');
}

I think we can add the if-check here:

  if (existingPr) {
    logger.debug('Found existing PR');
    // check pr config fingerprint
     QN. Should we use filtered config from the start or begin with whole config and filter later?
     
    // check upgrades same
    QN. Focus only on upgrades fields related to PR body
    
    // check last edited time
  }

Should we create a new PR cache for this, (IMO no)? Adding an object to the existing BranchCache will be simpler.

@RahulGautamSingh
Copy link
Collaborator

} else if (!config.committedFiles && !config.rebaseRequested) {
logger.debug(
{
prTitle,
},
'PR body changed'
);
}

If we only update the last-edited-time inside this if-block we won't need to worry about other PR updates.

@rarkins
Copy link
Collaborator Author

rarkins commented Nov 7, 2022

Yes add to branch cache

@rarkins rarkins added status:in-progress Someone is working on implementation core:cache and removed status:requirements Full requirements are not yet known, so implementation should not be started labels Nov 25, 2022
@RahulGautamSingh
Copy link
Collaborator

RahulGautamSingh commented Jan 8, 2023

This is the old logic to skip an PR update.

// Check if existing PR needs updating
const existingPrTitle = stripEmojis(existingPr.title);
const existingPrBodyHash = existingPr.bodyStruct?.hash;
const newPrTitle = stripEmojis(prTitle);
const newPrBodyHash = hashBody(prBody);
if (
existingPrTitle === newPrTitle &&
existingPrBodyHash === newPrBodyHash
) {
// TODO: types (#7154)
logger.debug(`${existingPr.displayNumber!} does not need updating`);
return { type: 'with-pr', pr: existingPr };
}

I think it will be safe to use this to filter out the fields for the prFingerprint and wanted confirmation on the same.
The prFingerprint will consists of the prTitle and the fields from BranchConfig & BranchUpgradeConfig that are directly used to construct the prBody.

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 10, 2023

I guess so?

@renovate-release
Copy link
Collaborator

🎉 This issue has been resolved in version 34.148.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
core:cache performance Performance-related improvements priority-2-high Bugs impacting wide number of users or very important features status:in-progress Someone is working on implementation type:feature Feature (new functionality)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants