Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRC issue with deflate compression and a read loop #414

Closed
rddesmond opened this issue Sep 6, 2019 · 7 comments
Closed

CRC issue with deflate compression and a read loop #414

rddesmond opened this issue Sep 6, 2019 · 7 comments
Labels
fixed Issue or bug has been fixed zlib Zlib compression library

Comments

@rddesmond
Copy link

rddesmond commented Sep 6, 2019

With a particular output buffer, a zip file with an AES encrypted (this may be red herring) and zlib compressed item fails reading with a CRC error. I noticed it with a buffer size of 8192 and a uncompressed item of size 7077903, but I saw it at some other combinations.

On investigation, it appears that zlib::inflate may read data from the input buffer and/or write it to the output buffer, but not neccessarily both. "If inflate returns Z_OK and with zero avail_out, it must be called again after making room in the output buffer because there might be more output pending." This means: zlib may consume X bytes, decompress them to an internal buffer, and then output the decompressed results over multiple calls. In the current implementation of mz_strm_zlib, the decompress loop will be escaped if there is no data to read and the minizip controlled buffer is empty, though there may be more data to output.

main.cpp.zip is a test script that shows the problem.

Tested against minizip 2.8.9 (also 2.8.1) with

  • iconv 1.15
  • OpenSSL 1.0.2p
  • bzip2 1.0.6
  • zlib 1.2.11
@nmoinvaz
Copy link
Member

nmoinvaz commented Sep 6, 2019

Does changing mz_strm_zlib.c at line 208 from:

while (zlib->zstream.avail_out > 0);

To:

while ((zlib->zstream.avail_out > 0)  || (flush == Z_FINISH && err == Z_OK));

Solve the issue?

@rddesmond
Copy link
Author

rddesmond commented Sep 6, 2019

No, but the issue disappears if I remove mz_strm_zlib.c:171 and mz_strm_zlib.c:172, where it doesn't continue to check inflate if the parent stream doesn't have any more data.

--- minizip-2.8.9/mz_strm_zlib.c
+++ new/minizip-2.8.9/mz_strm_zlib.c	2019-09-05 23:11:18.851591105 -0400
@@ -168,8 +168,8 @@
 
             if (read < 0)
                 return read;
-            if (read == 0)
-                break;
+//            if (read == 0)
+//                break;
 
             zlib->zstream.next_in = zlib->buffer;
             zlib->zstream.avail_in = read;

I'm testing to see if that is sufficient and doesn't break anything, though so far it looks like it is working. Do you think any other decompression libraries need a similar change? It seems reasonable that because they can decompress unpredictable amounts, they will have to have some internal mechanism to store it when it doesn't fit in the output.

@nmoinvaz
Copy link
Member

nmoinvaz commented Sep 7, 2019

minizip probably uses inflate a bit differently than other libraries with the flush parameter Z_SYNC_FLUSH. Other libraries might use Z_NO_FLUSH followed by Z_FINISH. In the Z_FINISH state it would finish writing all remaining bytes.

@nmoinvaz nmoinvaz closed this as completed Sep 7, 2019
@nmoinvaz nmoinvaz added 2.0 fixed Issue or bug has been fixed zlib Zlib compression library labels Sep 7, 2019
@nmoinvaz
Copy link
Member

nmoinvaz commented Sep 7, 2019

I have made a similar change to the bzip and lzma streams in case the problem is also there. Thanks reporting and fixing.

@rddesmond
Copy link
Author

Thank you!

@nmoinvaz
Copy link
Member

I took another look at this. I'm not so sure && (in_bytes > 0 || out_bytes > 0) is needed. Also I found a bug in the main.cpp in the test_assert. It wouldn't return the right error number because err would be evaluated twice in a #define.

@rddesmond
Copy link
Author

rddesmond commented Sep 18, 2019

Thank for the second look. The most important part of my PR was the removal of the if (read == 0) short circuit. The addition to the while loop clause (in_bytes > 0 || out_bytes > 0) is probably not needed, but I was a little cautious to make sure we didn't get stuck in a no-work loop. I added it to give an additional "has done no work" end case to the loop, though it looks like Z_STREAM_END covers that case completely.

I see now that you added the while loop clause but not the removal of the short circuit to the other method. The removal of the short circuit is the change that really fixed it. (In zlib, I didn't encounter truncated output in the other streams, but certainly could have missed it.)

Good catch on test_assert. I got sloppy because it was just code to show the problem and not in the PR.. but if you want it in it, I'm happy to revise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fixed Issue or bug has been fixed zlib Zlib compression library
Projects
None yet
Development

No branches or pull requests

2 participants