Optimise Export Memory Usage #9388

Open
simonr44 wants to merge 3 commits into base: hotfix

Conversation

simonr44
Contributor

@simonr44 commented Dec 3, 2021

Description

This PR passes records to the export function in export_utils in small batches rather than all at once, to minimise the memory consumption of the CSV export process.

Motivation and Context

When exporting a large number of records at once, PHP memory consumption can quickly exceed the default memory_limit. This change processes the records in small batches, consuming less memory.

Large exports of roughly 100k records can easily exceed 750MB of peak memory consumption. With this change, consumption remains below the PHP default memory_limit of 128MB (approximately 74MB in testing, though this will vary with record contents).
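
For context, here is a minimal sketch of the batching idea. This is not the actual export_utils code; `fetchBatch`, `$output`, and the batch size are hypothetical placeholders. The point is that rows are written out as soon as each batch is fetched, so peak memory scales with the batch size rather than the total record count.

```php
<?php
// Illustrative sketch only -- not the real export_utils implementation.
// fetchBatch() is a hypothetical callback returning up to $batchSize rows
// (as associative arrays) starting at $offset.
function exportInBatches(callable $fetchBatch, $output, int $batchSize = 1000): void
{
    $offset = 0;
    do {
        $batch = $fetchBatch($offset, $batchSize);
        foreach ($batch as $row) {
            fputcsv($output, $row); // stream each row straight to the output
        }
        $offset += $batchSize;
        // Only the current batch is held in memory; it is released before
        // the next one is fetched.
    } while (count($batch) === $batchSize);
}
```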

How To Test This

Select all records on the list view and perform a bulk-action export.
This is best performed on a test DB with many thousands of records.
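
To compare memory figures before and after the change, one option (my own suggestion, not part of this PR) is to log peak usage at the end of the export request:

```php
<?php
// Illustrative only: logs peak memory when the export request finishes.
register_shutdown_function(function () {
    $peakMb = memory_get_peak_usage(true) / 1024 / 1024;
    error_log(sprintf('CSV export peak memory: %.1f MB', $peakMb));
});
```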

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Final checklist

  • My code follows the code style of this project found here.
  • My change requires a change to the documentation.
  • I have read the How to Contribute guidelines.

@jack7anderson7 added the Status:Assessed and Branch:Hotfix labels on Jul 21, 2022
@jack7anderson7 previously approved these changes on Jul 21, 2022
Contributor

@clemente-raposo left a comment


LGTM

@clemente-raposo added the Status: Requires Testing and Status: Passed Code Review labels on Jun 26, 2024
@chris001
Contributor

It would be great to have this for import as well, if it is not already there, so that users no longer have to import large numbers of records via the command line.

@johnM2401
Contributor

Hey @simonr44!
After some testing, this does indeed seem to cut resource costs quite notably.
On a dataset of about 100k records locally, it reduces memory usage by roughly 400MB.

However, I've noticed one possible issue.

It seems the header row is added back into the CSV for each batch of records.

I've grabbed a CSV from pre-fix and post-fix (left and right respectively in the screenshot below), using the same dataset.
However, the post-fix CSV appears to have 110 more rows:
[screenshot]


After sorting A->Z on the "Name" column, I noticed that there were now 110 extra rows containing the header row values.
See screenshots:
[screenshots]

I imagine the CSV should only have the initial header row?
This might cause data issues if the CSV is ever re-imported to a CRM.
Could you have a look when you get a chance?

Thank you!
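
For reference, one way to avoid the repeated header (a sketch with hypothetical names, not a patch against export_utils) is to emit it only before the first row is written:

```php
<?php
// Sketch: write the header row once, then only data rows for every batch.
// $batches is a hypothetical iterable of row batches; $output is the CSV stream.
$headerWritten = false;
foreach ($batches as $batch) {
    foreach ($batch as $row) {
        if (!$headerWritten) {
            fputcsv($output, array_keys($row)); // header only for the very first row
            $headerWritten = true;
        }
        fputcsv($output, $row);
    }
}
```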

@johnM2401 added the Status:Requires Updates label on Jun 28, 2024