Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix data-corruption issue with s3boto and s3boto3 multipart uploads #504

Merged
merged 6 commits into from
Jun 1, 2018

Conversation

jnm
Copy link
Contributor

@jnm jnm commented May 30, 2018

This is a consolidation of #197 and #365 with merge conflicts resolved and flake8 niggles addressed. All credit for the fixes belongs to @vinayinvicible and @tommwatson.

Since the data-corruption problem is long-standing and severe, please consider merging these fixes and their included unit tests. If there are any issues with the proposed changes that make you hesitate to merge, please bring them up candidly—I'll work to address them ASAP!

I believe that this PR should:

Additionally, #450 is related but cannot be closed by this PR because it also addresses #383, which is a separate issue about file reading.

@codecov-io
Copy link

codecov-io commented May 31, 2018

Codecov Report

Merging #504 into master will increase coverage by 0.3%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master     #504     +/-   ##
=========================================
+ Coverage    76.1%   76.41%   +0.3%     
=========================================
  Files          11       11             
  Lines        1578     1590     +12     
=========================================
+ Hits         1201     1215     +14     
+ Misses        377      375      -2
Impacted Files Coverage Δ
storages/backends/s3boto.py 88.03% <100%> (+0.58%) ⬆️
storages/backends/s3boto3.py 87.12% <100%> (+0.54%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f07eab...175ad80. Read the comment docs.

@jnm
Copy link
Contributor Author

jnm commented May 31, 2018

I replaced method.assert_not_called() with self.assertFalse(method.called) to restore Python 3.4 compatibility. The wondrous green check marks have appeared!

@jschneier
Copy link
Owner

Taking a look at this one now.

@jnm
Copy link
Contributor Author

jnm commented Jun 3, 2018

@jschneier, thanks for looking at this so quickly. You're going to hate me, but these changes introduce a memory leak because the temporary file is never truncated. I'm working on another PR right now to correct the problem. Sorry!

@jschneier
Copy link
Owner

jschneier commented Jun 3, 2018 via email

jnm added a commit to jnm/django-storages that referenced this pull request Jun 4, 2018
by truncating the buffer after uploading it. Follows the approach of jschneier#169.
@jnm
Copy link
Contributor Author

jnm commented Jun 4, 2018

Thanks. Here's what I've got: #506. This time, I tried creating a 10GB ZIP on the fly and streaming it to S3 before making the PR :) It seems to work well.

nitely pushed a commit to satellogic/django-storages that referenced this pull request Jul 30, 2018
…schneier#504)

* fix for files with size more than buffer size

fixes jschneier#160

* fix(s3bot3): spool buffer file to end of all uploaded parts after each
             part upload.

             Fixes jschneier#364, similar issue to jschneier#160 for s3boto3. Inspired by
             vinayinvicible's fix for jschneier#160.

* Fix style issues flagged by flake8

* `At least two spaces before inline comment (E261)`
* `Imports are incorrectly sorted.`

* Fix s3boto3 test incompatibility with Python 3.4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants