Clarify WriteRequest.write_offset spec for compressed blobs #193

Merged on Apr 22, 2021 (9 commits)

Conversation

nodirg (Contributor) commented on Apr 19, 2021:

The CAS compression spec currently states that WriteRequest.write_offset is the offset in the uncompressed blob, but it does not match Google's implementation on the client or server side.

The original intention was to encode each chunk separately, but having the entire stream compressed seems better, also because it allows reading from the data source (e.g. disk) and writing to the CAS to happen concurrently.

The google-cla bot added the "cla: yes" label (pull requests whose authors are covered by a CLA with Google) on Apr 19, 2021.
nodirg (Contributor, Author) commented on Apr 19, 2021:

@ramymedhat


EricBurnett (Collaborator) commented:

I'm not sure which Google implementations you're referring to, but it being the offset into the uncompressed data is by design and correctly specified, and at least some Google implementations (e.g. RBE) get this right.

This has been implemented incorrectly in a few cases, e.g. bazelbuild/remote-apis-sdks#273 fixed using the wrong offset in the SDK. Please flag anything still doing it wrong (internally if it's a non-public implementation) and we'll get those implementations fixed too.

> The original intention was to encode each chunk separately, but having the entire stream compressed seems better, also because it allows reading from the data source (e.g. disk) and writing to the CAS to happen concurrently.

Most CAS implementations don't support resumable uploads today so you can construct them however you wish (including streaming a compressed file from disk, compressing+concatenating chunks, or compressing on the fly), since you'll have to start over regardless if the stream gets broken. But yeah, it is a weakness of this scheme that resuming a partial upload of a large, compressed-on-disk file may require decompressing/recompressing the whole thing. That's a little unfortunate, but an uncommon enough corner case that I don't think it's worth solving for, and making the offsets be into the compressed data is problematic in its own right. Happy to expand if you need more information.

nodirg (Contributor, Author) commented on Apr 19, 2021:

https://github.com/bazelbuild/remote-apis-sdks/blob/40d1266c08bdd688d691a2e4090a01d498318030/go/pkg/chunker/chunker.go#L206 increases the offset by the length of the compressed chunk. That offset is then used as WriteRequest.WriteOffset: https://github.com/bazelbuild/remote-apis-sdks/blob/40d1266c08bdd688d691a2e4090a01d498318030/go/pkg/client/bytestream.go#L49
This implementation is used in production.

I am working on a faster implementation of the uploader. I tried the following scheme:

  • Read a chunk from the uncompressed reader
  • Compress it with zstd.Encoder.EncodeAll()
  • Pass the compressed bytes as WriteRequest.Data and use the uncompressed offset as WriteRequest.WriteOffset

but the server responded with an error message about an invalid offset. I can provide the code, but it will take me some time.
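
For reference, a minimal sketch of the per-chunk scheme described above, assuming the github.com/klauspost/compress/zstd encoder and the generated google.golang.org/genproto/googleapis/bytestream types; the already-opened stream, resource name, and function name are placeholders rather than the actual uploader code:

```go
package casupload

import (
	"github.com/klauspost/compress/zstd"
	bspb "google.golang.org/genproto/googleapis/bytestream"
)

// uploadPerChunk compresses each chunk independently with EncodeAll and
// sends it with the *uncompressed* offset as WriteOffset -- the scheme
// the server rejected with an invalid-offset error.
func uploadPerChunk(stream bspb.ByteStream_WriteClient, resourceName string, chunks [][]byte) error {
	enc, err := zstd.NewWriter(nil) // nil writer: only EncodeAll is used
	if err != nil {
		return err
	}
	defer enc.Close()

	var uncompressedOffset int64
	for i, chunk := range chunks {
		compressed := enc.EncodeAll(chunk, nil) // each call emits a complete zstd frame
		req := &bspb.WriteRequest{
			WriteOffset: uncompressedOffset, // offset in uncompressed bytes
			Data:        compressed,
			FinishWrite: i == len(chunks)-1,
		}
		if i == 0 {
			req.ResourceName = resourceName
		}
		if err := stream.Send(req); err != nil {
			return err
		}
		uncompressedOffset += int64(len(chunk))
	}
	// The caller would then call stream.CloseAndRecv() to get the WriteResponse.
	return nil
}
```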

ramymedhat commented:

@rubensf

EricBurnett (Collaborator) commented:

Ahh. Hmm. I confused read and write there, which was a mistake. In particular:

  • The read offset is specified once, at the beginning of the stream.
  • The write offset is specified at the beginning of the stream and on every subsequent message.
  • The write offset in the bytestream spec is defined as:

    // In the first WriteRequest of a Write() action, it indicates
    // the initial offset for the Write() call. The value must be equal to
    // the committed_size that a call to QueryWriteStatus() would return.
    //
    // On subsequent calls, this value must be set and must be equal to
    // the sum of the first write_offset and the sizes of all data bundles
    // sent previously on this stream.
That's more awkward than I remembered :(. Apologies for the confusion! And more to the point, I only confirmed that the RBE read implementation matches the spec, but if you're telling me the write implementation does not, that's unfortunate but good to know.

I'm...not actually sure what the right resolution is in this case. Making the write path use compressed offsets and the read path use uncompressed offsets would be easy I suppose, and I think more correctly matched to a strict reading of the ByteStream spec where it specifies the sum of the sizes of all data bundles sent on the stream. On the other hand, that's inconsistent with Read, and arguably problematic for anyone implementing resumable uploads - where it's not expected they'd commit the incoming bytes to storage byte-for-byte equally, and so wouldn't necessarily be able to handle resumptions of the form "the next bytes in a compressed stream, not aligned to any particular chunk boundary or interpretable without the exact bytes of the broken stream on hand". (Or to put it another way, I'm still pretty sure QueryWriteStatus() should return the number of post-decompression bytes successfully persisted.)

"Correct" in the ideal case would be to have the write offset in the initial chunk be the offset in the uncompressed bytes, but for subsequent chunks to be specified in terms of the sum of the lengths of the data bundles for validation purposes...except there's only one field in ByteStream and so this'd have you summing compressed and uncompressed bytes together today. Ugh.

rubensf (Contributor) commented on Apr 20, 2021:

Thinking about this has been quite a brain twist! But for the most part I concur with what Eric wrote.

Any scenario where the subsequent write_offsets refer to the uncompressed size would inherently disobey the spec (...the sum of the first write_offset and the sizes of all data bundles sent previously on this stream...), and thus be a no-go.

Changing the first write_offset to refer to the compressed size, however, would require QueryWriteStatus to return the compressed size, and doing that would effectively require us to expose one more API where the client can check the compressed -> uncompressed mapping. That sounds like more work than it's worth. FWIW, I don't think this would mean "not aligned to any particular chunk boundary", since the server is the one specifying the committed_size for the client to use, and I'm confident the server will prefer an interpretable boundary as the committed_size :)

Additionally, the current implementation by RBE's client and server technically adheres to the spec already: subsequent calls are already set to the first write_offset plus the size of all data bundles sent. The easiest solution here is likely to clarify the RE spec to explain this. Granted, I'm likely biased since I worked on the SDK implementation.

Of course, we could always ignore what ByteStream specifies and establish our own expectations, but in that scenario we might as well write our own REByteStream. That's not unreasonable, but I don't think the pros outweigh the cons.

rubensf (Contributor) commented on Apr 20, 2021:

> The original intention was to encode each chunk separately, but having the entire stream compressed seems better, also because it allows reading from the data source (e.g. disk) and writing to the CAS to happen concurrently.

Also, I'm not sure what you mean here. IIUC, writing to the CAS is sequential, and whether write_offset refers to the uncompressed or the compressed offset shouldn't affect potential performance.

FWIW, the API doesn't specify anything regarding chunks for writes; the SDK uses chunks as an implementation detail. In any case, zstd has frames, which are effectively another abstraction for chunks, each compressed as a single unit. The SDK also already reads from disk and compresses data concurrently, though SDK-specific discussions shouldn't happen on this thread.
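
As a small illustration of the frame point (assuming the github.com/klauspost/compress/zstd package; the helper name is made up): each EncodeAll call emits a self-contained zstd frame, and a streaming decoder reads a concatenation of such frames back as one stream.

```go
package zstdframes

import (
	"bytes"
	"io"

	"github.com/klauspost/compress/zstd"
)

// roundTripFrames compresses each chunk into its own zstd frame,
// concatenates the frames, and decodes the result as a single stream.
func roundTripFrames(chunks [][]byte) ([]byte, error) {
	enc, err := zstd.NewWriter(nil) // nil writer: only EncodeAll is used
	if err != nil {
		return nil, err
	}
	defer enc.Close()

	var compressed bytes.Buffer
	for _, c := range chunks {
		compressed.Write(enc.EncodeAll(c, nil)) // one complete frame per chunk
	}

	dec, err := zstd.NewReader(&compressed)
	if err != nil {
		return nil, err
	}
	defer dec.Close()
	return io.ReadAll(dec) // yields the original chunks, concatenated
}
```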

nodirg changed the title from "Fix incorrect protobuf spec" to "Clarify WriteRequest.write_offset spec for compressed blobs" on Apr 20, 2021.
nodirg (Contributor, Author) commented on Apr 20, 2021:

Thanks. I've clarified the spec. PTAL.

@rubensf you are right. The difference in the spec doesn't require reading and writing to be sequential.

mostynb (Contributor) left a comment:

This complexity is caused by ByteStream not quite fitting our use case exactly, and our desire for an incremental improvement to uncompressed ByteStream. For REAPIv3, IMO we should replace ByteStream with our own protocol (there are a few people probably interested in working on replacements, but I don't know the time scale for REAPIv3, so maybe it's too early to start).

To summarise the reasoning behind the current design:

  • The initial write_offset must refer to an offset in the uncompressed blob, because we have no way to specify a specific compressed version of the blob. (In most cases, I expect server implementations to only support 0 here anyway.)

  • Subsequent write_offset values refer to the amount of compressed data received so far (plus the initial uncompressed write_offset), because the ByteStream spec requires it, but also because it would be less useful for the server to check uncompressed write offsets. If we count compressed data here, the server can do some simple checks that it agrees with the client on how much compressed data has been sent. On the other hand, if we were to count uncompressed data, the server might not be able to perform checks until all the data has been sent anyway (e.g. if a block-based compressor is used), in which case it could just decompress everything and check the size + hash of the uncompressed blob. (A sketch of this offset bookkeeping follows below.)
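
A minimal sketch of that bookkeeping, under stated assumptions: it uses the github.com/klauspost/compress/zstd streaming encoder and the generated google.golang.org/genproto/googleapis/bytestream types; the resource name, stream setup, and request sizing are placeholders, not any particular implementation. The first request carries the uncompressed initial offset (usually 0); every later request carries that offset plus the compressed bytes sent so far.

```go
package casupload

import (
	"io"

	"github.com/klauspost/compress/zstd"
	bspb "google.golang.org/genproto/googleapis/bytestream"
)

// writeSender adapts a ByteStream Write stream to io.Writer so the zstd
// encoder can stream into it while tracking compressed bytes sent.
type writeSender struct {
	stream       bspb.ByteStream_WriteClient
	resourceName string
	initialOff   int64 // offset into the *uncompressed* blob, usually 0
	compressed   int64 // compressed bytes sent so far
	sentFirst    bool
}

func (w *writeSender) Write(p []byte) (int, error) {
	req := &bspb.WriteRequest{
		// First request: the uncompressed initial offset.
		// Later requests: initial offset + compressed bytes already sent.
		WriteOffset: w.initialOff + w.compressed,
		Data:        p, // a real client would also cap request sizes
	}
	if !w.sentFirst {
		req.ResourceName = w.resourceName
		w.sentFirst = true
	}
	if err := w.stream.Send(req); err != nil {
		return 0, err
	}
	w.compressed += int64(len(p))
	return len(p), nil
}

// uploadCompressed streams src through zstd into an already-opened Write stream.
func uploadCompressed(stream bspb.ByteStream_WriteClient, resourceName string, src io.Reader) error {
	ws := &writeSender{stream: stream, resourceName: resourceName}
	enc, err := zstd.NewWriter(ws)
	if err != nil {
		return err
	}
	if _, err := io.Copy(enc, src); err != nil {
		return err
	}
	if err := enc.Close(); err != nil { // flushes remaining compressed data through Write
		return err
	}
	// A final (empty) request marks the end of the write at the expected offset.
	return stream.Send(&bspb.WriteRequest{
		WriteOffset: ws.initialOff + ws.compressed,
		FinishWrite: true,
	})
}
```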

(Review comment on build/bazel/remote/execution/v2/remote_execution.proto: outdated, resolved.)
EricBurnett (Collaborator) commented:

Change LGTM! Thanks Nodir.

I wonder if we should go further and call out explicitly that this is nonsensical (adding uncompressed offset to compressed byte lengths gives...nothing meaningful) but done to fit the semantics of ByteStream, and/or that this will be split into distinct fields in V3 as we get off ByteStream, but that's optional. The text as written looks good, it's just going to reasonably make people question why it's defined so strangely :).

mostynb (Contributor) commented on Apr 20, 2021:

> Change LGTM! Thanks Nodir.
>
> I wonder if we should go further and call out explicitly that this is nonsensical (adding uncompressed offset to compressed byte lengths gives...nothing meaningful) but done to fit the semantics of ByteStream, and/or that this will be split into distinct fields in V3 as we get off ByteStream, but that's optional. The text as written looks good, it's just going to reasonably make people question why it's defined so strangely :).

This also sounds good to me.

The implementation in bazel-remote simply ignores the field in subsequent requests because it isn't useful.

nodirg (Contributor, Author) commented on Apr 21, 2021:

Acked the nonsensical piece explicitly in the comment.

mostynb (Contributor) commented on Apr 21, 2021:

lgtm, but I would remove "This might be improved in the next version." since it doesn't add any useful info (you could say this about any part of this file).

nodirg (Contributor, Author) commented on Apr 21, 2021:

> lgtm, but I would remove "This might be improved in the next version." since it doesn't add any useful info (you could say this about any part of this file).

done

nodirg (Contributor, Author) commented on Apr 21, 2021:

Seems like everyone is happy? How do I get an official "LGTM"? Note that I don't have permissions to submit this CL.

nodirg (Contributor, Author) commented on Apr 22, 2021:

Thanks for approving, @mostynb!

Folks, I don't have merging permissions. Please merge it for me. Thanks.
