
fix: Propagate EncoderBuilder's setting of block_checksum #52

Conversation

jshearer

It appears that when BlockChecksum was originally introduced, the setting on EncoderBuilder never got passed down into LZ4FPreferences, resulting in errors like this. In this case, it's Go's lz4 library throwing the error:

lz4: invalid block checksum: got a1d59126; expected e48f521c

I confirmed that applying this patch locally fixes all of the invalid block checksum errors. I didn't have time to dig deeper into why this error was happening in the first place -- I would expect it to just result in the block checksum flag being turned on, and correct block checksums to be emitted, but somewhere along the line that appears to not be working correctly. Nevertheless, I believe this change fixes the ability to enable/disable block checksums correctly.
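The bug described here is a common builder-pattern pitfall: a setting is stored on the builder but never copied into the configuration struct handed to the underlying encoder. The following is a minimal, self-contained sketch of that pattern; the types are hypothetical stand-ins and not the real `lz4-rs`/`LZ4FPreferences` definitions.

```rust
// Hypothetical stand-ins illustrating the propagation bug, not the real lz4-rs types.

#[derive(Clone, Copy, PartialEq, Debug)]
enum BlockChecksum {
    NoBlockChecksum,
    BlockChecksumEnabled,
}

// Stand-in for the preferences struct passed down to the frame encoder.
#[derive(Debug)]
struct Preferences {
    block_checksum: BlockChecksum,
}

struct EncoderBuilder {
    block_checksum: BlockChecksum,
}

impl EncoderBuilder {
    fn new() -> Self {
        EncoderBuilder { block_checksum: BlockChecksum::NoBlockChecksum }
    }

    fn block_checksum(mut self, mode: BlockChecksum) -> Self {
        self.block_checksum = mode;
        self
    }

    // The bug: the preferences are built with a hard-coded value,
    // silently ignoring whatever the caller set on the builder.
    fn build_broken(&self) -> Preferences {
        Preferences { block_checksum: BlockChecksum::NoBlockChecksum }
    }

    // The fix: propagate the builder's stored setting into the preferences.
    fn build_fixed(&self) -> Preferences {
        Preferences { block_checksum: self.block_checksum }
    }
}

fn main() {
    let builder = EncoderBuilder::new().block_checksum(BlockChecksum::BlockChecksumEnabled);
    // Before the patch, the caller's setting is dropped on the floor:
    assert_eq!(builder.build_broken().block_checksum, BlockChecksum::NoBlockChecksum);
    // After the patch, it reaches the preferences struct:
    assert_eq!(builder.build_fixed().block_checksum, BlockChecksum::BlockChecksumEnabled);
}
```

The failure mode is easy to miss in testing because the encoder still produces valid output; only the frame header flag disagrees with the caller's intent, which surfaces later as a checksum mismatch in a stricter decoder.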

jshearer added a commit to estuary/flow that referenced this pull request Sep 23, 2024
While investigating the cause of LZ4 compression issues related to franz-go (see comments here #1651), I found `lz4_flex` which is a pure-Rust lz4 implementation which appears to be safer and faster than `lz4`/`lz4-sys` that `kafka-protocol` is using. Now that tychedelia/kafka-protocol-rs#81 allows us to use our own compression, and `lz4`'s configuration of block checksums is broken (fix here 10XGenomics/lz4-rs#52), I thought it would be a good time to swap to `lz4_flex`.

pmarks commented Sep 26, 2024

@jshearer interesting, thanks for the report. This change ends up turning BlockChecksums off by default, so presumably that's why you no longer see the problem in Go. I'm going to take this patch, but also change the default to BlockChecksums ON, because that is the current behavior of the library. With that change you'll be able to disable checksums in the builder to fix the issue you're seeing in Go. I'll investigate further.


pmarks commented Sep 26, 2024

lz4-rs has been setting the BlockChecksum during encoding. This checksum is always tested during decompression, and we haven't seen any errors. That makes me suspect that the bug is actually in the go lz4 impl, since the default for lz4 in general is to not do block checksumming. I'm superseding this PR with #55 which keeps BlockChecksum ON by default, but will actually switch it off if you disable it in EncoderBuilder.
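The default-plus-override semantics described for #55 (checksums stay ON by default, but the builder setting is honored when the caller disables them) can be sketched as follows; again, these are hypothetical minimal types, not the actual `lz4-rs` API.

```rust
// Hypothetical sketch of the #55 semantics: default ON, caller can opt out.

#[derive(Clone, Copy, PartialEq, Debug)]
enum BlockChecksum {
    NoBlockChecksum,
    BlockChecksumEnabled,
}

struct EncoderBuilder {
    block_checksum: BlockChecksum,
}

impl EncoderBuilder {
    // Default matches the library's historical output: block checksums on.
    fn new() -> Self {
        EncoderBuilder { block_checksum: BlockChecksum::BlockChecksumEnabled }
    }

    fn block_checksum(mut self, mode: BlockChecksum) -> Self {
        self.block_checksum = mode;
        self
    }

    // Flag that would be written into the frame preferences.
    fn checksums_enabled(&self) -> bool {
        self.block_checksum == BlockChecksum::BlockChecksumEnabled
    }
}

fn main() {
    // Untouched builder keeps the historical behavior:
    assert!(EncoderBuilder::new().checksums_enabled());
    // Explicitly disabling is now actually honored:
    assert!(!EncoderBuilder::new()
        .block_checksum(BlockChecksum::NoBlockChecksum)
        .checksums_enabled());
}
```

Keeping the default ON avoids silently changing the bytes emitted by existing users, while still giving callers interoperating with stricter decoders a working off switch.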

pmarks closed this Sep 26, 2024
jshearer (Author)

jshearer commented Sep 26, 2024

That makes me suspect that the bug is actually in the go lz4 impl

I think you're right: I tested this with lz4_flex as well and got the same error (once I turned on block checksumming, since it's off by default like you said), so I don't think this EncoderBuilder issue is causal. It's probably good for it to be on by default anyway, at least from an "exercising surface area" perspective, if not necessarily a performance one.
