Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#3158 - [Jib core] Tar archives with same contents are not reproducible #3159

Merged
merged 16 commits into from
Apr 13, 2021

Conversation

davidtron
Copy link
Contributor

@davidtron davidtron commented Mar 20, 2021

Fixes #3158.

Sets the mod time on TarArchiveEntry to DEFAULT_MODIFICATION_TIME (Epoch second 1) in order to create reproducible Tar images. Associated unit test creates 2 tars output streams from the same input 2 seconds apart to show the failure if we don't explicitly set the TarArchiveEntry mod time

@google-cla google-cla bot added the cla: yes label Mar 20, 2021
@codecov
Copy link

codecov bot commented Mar 20, 2021

Codecov Report

Merging #3159 (dd70bba) into master (5c5cf90) will increase coverage by 0.16%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #3159      +/-   ##
============================================
+ Coverage     71.09%   71.25%   +0.16%     
- Complexity     2319     2329      +10     
============================================
  Files           279      279              
  Lines          9822     9850      +28     
  Branches        992      977      -15     
============================================
+ Hits           6983     7019      +36     
  Misses         2493     2493              
+ Partials        346      338       -8     
Impacted Files Coverage Δ Complexity Δ
...com/google/cloud/tools/jib/image/ImageTarball.java 92.06% <100.00%> (+0.53%) 9.00 <1.00> (+1.00)
...m/google/cloud/tools/jib/tar/TarStreamBuilder.java 100.00% <100.00%> (ø) 9.00 <0.00> (ø)
...oud/tools/jib/gradle/ExtraDirectoryParameters.java 77.27% <0.00%> (-16.85%) 6.00% <0.00%> (+1.00%) ⬇️
.../cloud/tools/jib/maven/JibPluginConfiguration.java 67.60% <0.00%> (-0.29%) 54.00% <0.00%> (ø%)
.../main/java/com/google/cloud/tools/jib/cli/Jar.java 43.75% <0.00%> (ø) 14.00% <0.00%> (ø%)
...ain/java/com/google/cloud/tools/jib/cli/Build.java 18.91% <0.00%> (ø) 4.00% <0.00%> (ø%)
...m/google/cloud/tools/jib/cli/CommonCliOptions.java 96.61% <0.00%> (+0.05%) 41.00% <0.00%> (+1.00%)
...d/tools/jib/gradle/ExtraDirectoriesParameters.java 89.18% <0.00%> (+0.30%) 11.00% <0.00%> (ø%)
.../google/cloud/tools/jib/cli/logging/CliLogger.java 92.30% <0.00%> (+0.64%) 17.00% <0.00%> (+1.00%)
...jib/plugins/common/JavaContainerBuilderHelper.java 95.45% <0.00%> (+0.90%) 21.00% <0.00%> (+8.00%)
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5c5cf90...dd70bba. Read the comment docs.

@loosebazooka
Copy link
Member

I think this might be working as intended. Where are you encountering this as an issue that needs to be fixed? ReproducibleLayerBuilder adds in the behavior of reproducibility.

@davidtron
Copy link
Contributor Author

The problem occurs when I create a Tar file using Jib
Jib.from(RegistryImage.named(image)).containerize(Containerizer.to(TarImage.at(outputPath).named(image)));
I've put the steps to reproduce the issue #3158 and an example unit test that will fail without the proposed change to set the tar modtime to a fixed time.

My question is - should Jib produce a reproducible tar file when the input files are the same but the command has been run at a different time?

@loosebazooka
Copy link
Member

loosebazooka commented Mar 23, 2021

Ah okay, I see.

I think this needs to be solved in ImageTarBall's calls to TarStreamBuilder (#addByteEntry and #addBlobEntry). Those methods should be allowing a timestamp parameter.

@chanseokoh does that make sense?

Copy link
Member

@chanseokoh chanseokoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your contribution! I think we can accept the general direction to reset timestamps. Please check out the comment below.

@chanseokoh
Copy link
Member

chanseokoh commented Mar 23, 2021

Note to self: CHANGELOG

@chanseokoh
Copy link
Member

@davidtron just checking in, do you plan to continue the work?

@davidtron
Copy link
Contributor Author

@chanseokoh yes just updated the code to pull up to ImageTarball as suggested

Copy link
Member

@chanseokoh chanseokoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Can you add some tests? I think its' not difficult to do so in TarStreamBuilderTest, as I see other tests open a tar and go over TarArchiveEntry?

davidtron and others added 7 commits April 9, 2021 09:37
…arball.java

Co-authored-by: Chanseok Oh <chanseok@google.com>
…mBuilderTest.java

Co-authored-by: Chanseok Oh <chanseok@google.com>
…mBuilderTest.java

Co-authored-by: Chanseok Oh <chanseok@google.com>
…mBuilder.java

Co-authored-by: Chanseok Oh <chanseok@google.com>
…mBuilder.java

Co-authored-by: Chanseok Oh <chanseok@google.com>
@davidtron
Copy link
Contributor Author

davidtron commented Apr 9, 2021

Just updating the test, I found that when we add a TarEntry to TarStreamBuilder we don't manipulate the time, instead rely on the fact that ReproducibleLayerBuilder which delegates to UniqueTarArchiveEntries to manage the mod time.

Just wondering if there is a clear way to articulate that in TarStreamBuilderTest so others don't waste time trying to work out why it only tests the time on entries created via addByteEntry and addBlobEntry but not addTarArchiveEntry

Copy link
Member

@chanseokoh chanseokoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, thanks for your contribution!

TarStreamBuilder is completely independent of ReproducibleLayerBuilder, so no need to bring that into your thinking. It's just that addTarArchiveEntry(TarArchiveEntry) accepts a fully-built tar entry as-is (we should not manipulate the entry at all). OTOH, the other add methods internally create a new tar entry with the given properties, so it's only natural to set those properties.

So I'm not exactly sure how you're writing tests, but you can test all of them to check if they all work in the intended way. If addTarArchiveEntry(TarArchiveEntry) adds an entry whose modTime is set to, say, 2021-04-09, then you can just check if the result tar archive has the very entry with the same modTime?

Copy link
Member

@chanseokoh chanseokoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

I actually wanted to have separate and dedicated unit tests, but given the back-and-forth, I can take care of it after merging this.

@chanseokoh chanseokoh merged commit 367fa66 into GoogleContainerTools:master Apr 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Jib core] Tar archives with same contents are not reproducible
4 participants