[6.4.0] Add support for the BLAKE3 digest function (#19191)
* Add BLAKE3 digest function to remote_execution.proto

(cherry picked from commit e382abf)

* Add BLAKE3 source code to third_party

(cherry picked from commit 0ff4f6d)

* Add BLAKE3 source code to third_party

This PR adds the BLAKE3 C and asm sources to third_party, and includes a BUILD
file to build them.

PiperOrigin-RevId: 541539341
Change-Id: I49b1edce20a7d0f986e29712e6050e4e0b9c1d44

(cherry picked from commit a3a569e)

* Support new-style digest functions

This PR adds support for new-style digest functions to the remote execution
library code. The remote-apis spec says:

```
// * `digest_function` is a lowercase string form of a `DigestFunction.Value`
//   enum, indicating which digest function was used to compute `hash`. If the
//   digest function used is one of MD5, MURMUR3, SHA1, SHA256, SHA384, SHA512,
//   or VSO, this component MUST be omitted. In that case the server SHOULD
//   infer the digest function using the length of the `hash` and the digest
//   functions announced in the server's capabilities.
```
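The quoted rule determines whether the digest function name appears as a path component in blob resource names. A minimal, self-contained sketch of that decision (the enum and method names here are illustrative stand-ins for the proto's `DigestFunction.Value` and Bazel's helpers, not the actual API):

```java
import java.util.EnumSet;
import java.util.Locale;

public class BlobResourceName {
    // Illustrative stand-in for the remote-apis DigestFunction.Value enum.
    enum DigestFunction { MD5, MURMUR3, SHA1, SHA256, SHA384, SHA512, VSO, BLAKE3 }

    // Functions whose resource names MUST omit the digest_function component.
    static final EnumSet<DigestFunction> OLD_STYLE =
        EnumSet.of(DigestFunction.MD5, DigestFunction.MURMUR3, DigestFunction.SHA1,
            DigestFunction.SHA256, DigestFunction.SHA384, DigestFunction.SHA512,
            DigestFunction.VSO);

    static String blobName(String instance, DigestFunction fn, String hash, long size) {
        if (OLD_STYLE.contains(fn)) {
            // Old style: the server infers the function from the hash length.
            return String.format("%s/blobs/%s/%d", instance, hash, size);
        }
        // New style: the lowercase enum name precedes the hash.
        return String.format(
            "%s/blobs/%s/%s/%d", instance, fn.name().toLowerCase(Locale.ROOT), hash, size);
    }

    public static void main(String[] args) {
        System.out.println(blobName("remote", DigestFunction.SHA256, "abc", 42)); // remote/blobs/abc/42
        System.out.println(blobName("remote", DigestFunction.BLAKE3, "abc", 42)); // remote/blobs/blake3/abc/42
    }
}
```

The `ByteStreamUploader` and `PathConverterImpl` hunks later in this commit implement the same branch against the real proto enum.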

PiperOrigin-RevId: 543691155
Change-Id: If8c386d923db1b24dff6054c8ab3f783409b7f13

(cherry picked from commit 88412ce)

* Add BLAKE3 hasher to vfs

This PR adds the Blake3Hasher and Blake3HashFunction classes to vfs and
makes them available under the flag --digest_function=BLAKE3.

PiperOrigin-RevId: 550525978
Change-Id: Iedc0886c51755585d56b4d8f47676d3be5bbedba

(cherry picked from commit cc49d68)

* Simplify blake3 hasher and improve performance

Improve BLAKE3 performance by reducing copies.

Rather than buffering bytes and hashing them in one shot at the end, it is faster to update the BLAKE3 hasher incrementally as bytes arrive.
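The buffering-vs-streaming difference is easy to see in isolation. A sketch of the two approaches, using the JDK's `MessageDigest` with SHA-256 as a stand-in (the BLAKE3 hasher in this PR is a JNI wrapper, but the update pattern is the same):

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class IncrementalHash {
    // Buffered approach: copy every chunk into a growing buffer, hash once at the end.
    static byte[] hashBuffered(byte[][] chunks) throws NoSuchAlgorithmException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            buffer.write(chunk, 0, chunk.length); // extra copy per chunk
        }
        return MessageDigest.getInstance("SHA-256").digest(buffer.toByteArray());
    }

    // Streaming approach: feed each chunk to the digest as it arrives; no buffer, no copies.
    static byte[] hashStreaming(byte[][] chunks) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        for (byte[] chunk : chunks) {
            md.update(chunk); // hasher state advances in place
        }
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[][] chunks = {"hello ".getBytes(), "world".getBytes()};
        // Both approaches produce the same digest; streaming skips the intermediate buffer.
        System.out.println(Arrays.equals(hashBuffered(chunks), hashStreaming(chunks))); // true
    }
}
```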

Performance with buffering:
```
 Benchmark                   (inputSize)  Mode  Cnt      Score     Error  Units
 Blake3Benchmark.blake3Hash            1  avgt   10   1766.697 ± 709.515  ns/op
 Blake3Benchmark.blake3Hash           10  avgt   10   1466.253 ±  19.474  ns/op
 Blake3Benchmark.blake3Hash          100  avgt   10   1522.845 ±  15.480  ns/op
 Blake3Benchmark.blake3Hash         1000  avgt   10   2254.156 ±   8.588  ns/op
 Blake3Benchmark.blake3Hash        10000  avgt   10   4660.881 ±  28.637  ns/op
 Blake3Benchmark.blake3Hash       100000  avgt   10  24283.191 ±  32.754  ns/op
 Blake3Benchmark.sha256Hash            1  avgt   10    742.091 ±   6.307  ns/op
 Blake3Benchmark.sha256Hash           10  avgt   10    757.844 ±  12.042  ns/op
 Blake3Benchmark.sha256Hash          100  avgt   10    942.902 ± 555.874  ns/op
 Blake3Benchmark.sha256Hash         1000  avgt   10   1208.336 ± 508.392  ns/op
 Blake3Benchmark.sha256Hash        10000  avgt   10   4871.231 ± 494.507  ns/op
 Blake3Benchmark.sha256Hash       100000  avgt   10  40686.231 ±  63.814  ns/op
```

Performance without buffering (after this CL):

```
 Benchmark                   (inputSize)  Mode  Cnt      Score     Error  Units
 Blake3Benchmark.blake3Hash            1  avgt   10   1021.075 ±  11.640  ns/op
 Blake3Benchmark.blake3Hash           10  avgt   10   1029.561 ±  19.850  ns/op
 Blake3Benchmark.blake3Hash          100  avgt   10   1070.509 ±  12.140  ns/op
 Blake3Benchmark.blake3Hash         1000  avgt   10   1685.043 ±  13.963  ns/op
 Blake3Benchmark.blake3Hash        10000  avgt   10   3939.516 ±  13.212  ns/op
 Blake3Benchmark.blake3Hash       100000  avgt   10  21730.550 ±  22.976  ns/op
 Blake3Benchmark.sha256Hash            1  avgt   10    745.943 ±   9.853  ns/op
 Blake3Benchmark.sha256Hash           10  avgt   10    747.649 ±  17.381  ns/op
 Blake3Benchmark.sha256Hash          100  avgt   10    962.802 ± 590.879  ns/op
 Blake3Benchmark.sha256Hash         1000  avgt   10   1189.069 ± 412.327  ns/op
 Blake3Benchmark.sha256Hash        10000  avgt   10   4594.978 ±  21.833  ns/op
 Blake3Benchmark.sha256Hash       100000  avgt   10  39224.248 ± 229.265  ns/op
```

PiperOrigin-RevId: 551225043
Change-Id: I57dc0700b2f44d6faf75ffbd29f1607544e54f29

(cherry picked from commit d922ab3)

* Ensure namedSetOfFiles URIs specify blob type correctly

I noticed when testing the BLAKE3 digest function that uploaded files were being referenced incorrectly in the BES because they were missing the digest type. This CL fixes that.

Before: https://github.com/bazelbuild/bazel/assets/141737/52781f1b-b897-48f0-8956-f63c57b59436

After: https://github.com/bazelbuild/bazel/assets/141737/01ebc61b-3512-4ca5-8e2d-f47ad5f086b7

PiperOrigin-RevId: 553405394
Change-Id: Ic976e5a58f80ab8b5270b4aedc6204c22562533a

(cherry picked from commit 1929367)
Signed-off-by: Brentley Jones <github@brentleyjones.com>

* Separate new-style-hash content in DiskCache

DiskCache always stores files in /root/{cas/ac}/digestHash. This change keeps the current behavior for old-style digest functions, but for new-style digest functions inserts a directory between /root/ and {cas/ac} to disambiguate the digest function type.

This prevents issues that could theoretically happen if there were hash collisions between two digest functions sharing the same cache directory.
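The resulting on-disk layout can be sketched as follows, with hypothetical names (the real logic lives in Bazel's DiskCache, and the old-style set is abbreviated to SHA256 here):

```java
import java.nio.file.Path;
import java.util.Locale;

public class DiskCacheLayout {
    // Illustrative stand-in; the real enum is DigestFunction.Value from remote-apis.
    enum DigestFunction { SHA256, BLAKE3 }

    static boolean isOldStyle(DigestFunction fn) {
        return fn == DigestFunction.SHA256; // old-style set abbreviated for the sketch
    }

    // Old style keeps root/{cas,ac}/hash; new style inserts the function name
    // between the root and the cas/ac directory.
    static Path blobPath(Path root, DigestFunction fn, String kind, String hash) {
        Path base = isOldStyle(fn)
            ? root
            : root.resolve(fn.name().toLowerCase(Locale.ROOT));
        return base.resolve(kind).resolve(hash);
    }

    public static void main(String[] args) {
        Path root = Path.of("/root");
        System.out.println(blobPath(root, DigestFunction.SHA256, "cas", "abc")); // /root/cas/abc
        System.out.println(blobPath(root, DigestFunction.BLAKE3, "cas", "abc")); // /root/blake3/cas/abc
    }
}
```

Because the BLAKE3 paths live under their own directory, a hash collision between two digest functions can never alias to the same cache file.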

PiperOrigin-RevId: 554477775
Change-Id: Ibef994e432764c260a3cab821ca6583c231c5b50

(cherry picked from commit f17b280)
Signed-off-by: Brentley Jones <github@brentleyjones.com>

---------

Signed-off-by: Brentley Jones <github@brentleyjones.com>
Co-authored-by: Tyler Williams <williams.tyler@gmail.com>
brentleyjones and tylerwilliams authored Aug 8, 2023
1 parent 5dd60a6 commit 03ba2ba
Showing 30 changed files with 1,021 additions and 67 deletions.
1 change: 1 addition & 0 deletions BUILD
@@ -93,6 +93,7 @@ genrule(
pkg_tar(
name = "bootstrap-jars",
srcs = [
"@blake3",
"@com_google_protobuf//:protobuf_java",
"@com_google_protobuf//:protobuf_java_util",
"@com_google_protobuf//:protobuf_javalite",
1 change: 1 addition & 0 deletions MODULE.bazel
@@ -16,6 +16,7 @@ bazel_dep(name = "platforms", version = "0.0.5")
bazel_dep(name = "rules_pkg", version = "0.7.0")
bazel_dep(name = "stardoc", version = "0.5.0", repo_name = "io_bazel_skydoc")
bazel_dep(name = "zstd-jni", version = "1.5.2-3")
bazel_dep(name = "blake3", version = "1.3.3")
bazel_dep(name = "zlib", version = "1.2.13")

# The following are required when building without WORKSPACE SUFFIX
8 changes: 8 additions & 0 deletions WORKSPACE
@@ -292,6 +292,14 @@ dist_http_archive(
strip_prefix = "zstd-jni-1.5.2-3",
)

dist_http_archive(
name = "blake3",
build_file = "//third_party:blake3/blake3.BUILD",
patch_cmds = EXPORT_WORKSPACE_IN_BUILD_BAZEL_FILE,
patch_cmds_win = EXPORT_WORKSPACE_IN_BUILD_BAZEL_FILE_WIN,
strip_prefix = "BLAKE3-1.3.3",
)

http_archive(
name = "org_snakeyaml",
build_file_content = """
15 changes: 15 additions & 0 deletions distdir_deps.bzl
@@ -256,6 +256,21 @@ DIST_DEPS = {
"additional_distfiles",
],
},
"blake3": {
"archive": "1.3.3.zip",
"sha256": "bb529ba133c0256df49139bd403c17835edbf60d2ecd6463549c6a5fe279364d",
"urls": [
"https://github.com/BLAKE3-team/BLAKE3/archive/refs/tags/1.3.3.zip",
],
"used_in": [
"additional_distfiles",
],
"license_kinds": [
"@rules_license//licenses/spdx:Apache-2.0",
],
"license_text": "LICENSE",
"package_version": "1.3.3",
},
###################################################
#
# Build time dependencies for testing and packaging
2 changes: 2 additions & 0 deletions src/main/java/com/google/devtools/build/lib/BUILD
@@ -102,6 +102,7 @@ filegroup(
"//src/main/java/com/google/devtools/build/lib/versioning:srcs",
"//src/main/java/com/google/devtools/build/lib/vfs/inmemoryfs:srcs",
"//src/main/java/com/google/devtools/build/lib/vfs:srcs",
"//src/main/java/com/google/devtools/build/lib/vfs/bazel:srcs",
"//src/main/java/com/google/devtools/build/lib/windows:srcs",
"//src/main/java/com/google/devtools/build/lib/worker:srcs",
"//src/main/java/com/google/devtools/build/skyframe:srcs",
@@ -403,6 +404,7 @@ java_library(
"//src/main/java/com/google/devtools/build/lib/vfs",
"//src/main/java/com/google/devtools/build/lib/vfs:output_service",
"//src/main/java/com/google/devtools/build/lib/vfs:pathfragment",
"//src/main/java/com/google/devtools/build/lib/vfs/bazel",
"//src/main/java/com/google/devtools/build/lib/windows",
"//src/main/java/com/google/devtools/build/lib/worker:worker_metric",
"//src/main/java/com/google/devtools/build/skyframe",
1 change: 1 addition & 0 deletions src/main/java/com/google/devtools/build/lib/bazel/BUILD
@@ -143,6 +143,7 @@ java_library(
"//src/main/java/com/google/devtools/build/lib/util:os",
"//src/main/java/com/google/devtools/build/lib/vfs",
"//src/main/java/com/google/devtools/build/lib/vfs:pathfragment",
"//src/main/java/com/google/devtools/build/lib/vfs/bazel",
"//src/main/java/com/google/devtools/build/lib/windows",
"//src/main/java/com/google/devtools/common/options",
"//src/main/protobuf:failure_details_java_proto",
@@ -31,6 +31,7 @@
import com.google.devtools.build.lib.vfs.FileSystem;
import com.google.devtools.build.lib.vfs.JavaIoFileSystem;
import com.google.devtools.build.lib.vfs.PathFragment;
import com.google.devtools.build.lib.vfs.bazel.BazelHashFunctions;
import com.google.devtools.build.lib.windows.WindowsFileSystem;
import com.google.devtools.common.options.OptionsParsingException;
import com.google.devtools.common.options.OptionsParsingResult;
@@ -44,6 +45,10 @@
* com.google.devtools.build.lib.vfs.FileSystem} class use {@code SHA256} by default.
*/
public class BazelFileSystemModule extends BlazeModule {
static {
BazelHashFunctions.ensureRegistered();
}

@Override
public ModuleFileSystem getFileSystem(
OptionsParsingResult startupOptions, PathFragment realExecRootBase)
@@ -13,13 +13,16 @@
// limitations under the License.
package com.google.devtools.build.lib.remote;

import static com.google.devtools.build.lib.remote.util.DigestUtil.isOldStyleDigestFunction;
import static com.google.devtools.build.lib.remote.util.RxFutures.toCompletable;
import static com.google.devtools.build.lib.remote.util.RxFutures.toListenableFuture;
import static com.google.devtools.build.lib.remote.util.RxFutures.toSingle;
import static com.google.devtools.build.lib.remote.util.Utils.grpcAwareErrorMessage;

import build.bazel.remote.execution.v2.Digest;
import build.bazel.remote.execution.v2.DigestFunction;
import build.bazel.remote.execution.v2.RequestMetadata;
import com.google.common.base.Ascii;
import com.google.common.base.Preconditions;
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Maps;
@@ -133,13 +136,21 @@ private static final class PathMetadata {
private final boolean directory;
private final boolean remote;
private final boolean omitted;

PathMetadata(Path path, Digest digest, boolean directory, boolean remote, boolean omitted) {
private final DigestFunction.Value digestFunction;

PathMetadata(
Path path,
Digest digest,
boolean directory,
boolean remote,
boolean omitted,
DigestFunction.Value digestFunction) {
this.path = path;
this.digest = digest;
this.directory = directory;
this.remote = remote;
this.omitted = omitted;
this.digestFunction = digestFunction;
}

public Path getPath() {
@@ -161,20 +172,27 @@ public boolean isRemote() {
public boolean isOmitted() {
return omitted;
}

public DigestFunction.Value getDigestFunction() {
return digestFunction;
}
}

/**
* Collects metadata for {@code file}. Depending on the underlying filesystem used this method
* might do I/O.
*/
private PathMetadata readPathMetadata(Path file) throws IOException {
DigestUtil digestUtil = new DigestUtil(xattrProvider, file.getFileSystem().getDigestFunction());

if (file.isDirectory()) {
return new PathMetadata(
file,
/* digest= */ null,
/* directory= */ true,
/* remote= */ false,
/* omitted= */ false);
/* omitted= */ false,
/* digestFunction= */ digestUtil.getDigestFunction());
}

PathFragment filePathFragment = file.asFragment();
@@ -190,9 +208,14 @@ private PathMetadata readPathMetadata(Path file) throws IOException {
}
}

DigestUtil digestUtil = new DigestUtil(xattrProvider, file.getFileSystem().getDigestFunction());
Digest digest = digestUtil.compute(file);
return new PathMetadata(file, digest, /* directory= */ false, isRemoteFile(file), omitted);
return new PathMetadata(
file,
digest,
/* directory= */ false,
isRemoteFile(file),
omitted,
digestUtil.getDigestFunction());
}

private static void processQueryResult(
@@ -209,7 +232,8 @@ private static void processQueryResult(
file.getDigest(),
file.isDirectory(),
/* remote= */ true,
file.isOmitted());
file.isOmitted(),
file.getDigestFunction());
knownRemotePaths.add(remotePathMetadata);
}
}
@@ -305,7 +329,8 @@ private Single<List<PathMetadata>> uploadLocalFiles(
// set remote to true so the PathConverter will use bytestream://
// scheme to convert the URI for this file
/* remote= */ true,
path.isOmitted()))
path.isOmitted(),
path.getDigestFunction()))
.onErrorResumeNext(
error -> {
reportUploadError(error, path.getPath(), path.getDigest());
@@ -341,7 +366,8 @@ private Single<PathConverter> upload(Set<Path> files) {
/* digest= */ null,
/* directory= */ false,
/* remote= */ false,
/* omitted= */ false);
/* omitted= */ false,
DigestFunction.Value.SHA256);
}
})
.collect(Collectors.toList())
@@ -385,7 +411,7 @@ public ReferenceCounted touch(Object o) {
private static class PathConverterImpl implements PathConverter {

private final String remoteServerInstanceName;
private final Map<Path, Digest> pathToDigest;
private final Map<Path, PathMetadata> pathToMetadata;
private final Set<Path> skippedPaths;
private final Set<Path> localPaths;

@@ -395,18 +421,18 @@ private static class PathConverterImpl implements PathConverter {
RemoteBuildEventUploadMode remoteBuildEventUploadMode) {
Preconditions.checkNotNull(uploads);
this.remoteServerInstanceName = remoteServerInstanceName;
pathToDigest = Maps.newHashMapWithExpectedSize(uploads.size());
pathToMetadata = Maps.newHashMapWithExpectedSize(uploads.size());
ImmutableSet.Builder<Path> skippedPaths = ImmutableSet.builder();
ImmutableSet.Builder<Path> localPaths = ImmutableSet.builder();
for (PathMetadata pair : uploads) {
Path path = pair.getPath();
Digest digest = pair.getDigest();
for (PathMetadata metadata : uploads) {
Path path = metadata.getPath();
Digest digest = metadata.getDigest();
if (digest != null) {
// Always use bytestream:// in MINIMAL mode
if (remoteBuildEventUploadMode == RemoteBuildEventUploadMode.MINIMAL) {
pathToDigest.put(path, digest);
} else if (pair.isRemote()) {
pathToDigest.put(path, digest);
pathToMetadata.put(path, metadata);
} else if (metadata.isRemote()) {
pathToMetadata.put(path, metadata);
} else {
localPaths.add(path);
}
@@ -427,18 +453,34 @@ public String apply(Path path) {
return String.format("file://%s", path.getPathString());
}

Digest digest = pathToDigest.get(path);
if (digest == null) {
PathMetadata metadata = pathToMetadata.get(path);
if (metadata == null) {
if (skippedPaths.contains(path)) {
return null;
}
// It's a programming error to reference a file that has not been uploaded.
throw new IllegalStateException(
String.format("Illegal file reference: '%s'", path.getPathString()));
}
return String.format(
"bytestream://%s/blobs/%s/%d",
remoteServerInstanceName, digest.getHash(), digest.getSizeBytes());

Digest digest = metadata.getDigest();
DigestFunction.Value digestFunction = metadata.getDigestFunction();
String out;
if (isOldStyleDigestFunction(digestFunction)) {
out =
String.format(
"bytestream://%s/blobs/%s/%d",
remoteServerInstanceName, digest.getHash(), digest.getSizeBytes());
} else {
out =
String.format(
"bytestream://%s/blobs/%s/%s/%d",
remoteServerInstanceName,
Ascii.toLowerCase(digestFunction.getValueDescriptor().getName()),
digest.getHash(),
digest.getSizeBytes());
}
return out;
}
}
}
@@ -15,12 +15,14 @@

import static com.google.common.base.Preconditions.checkArgument;
import static com.google.common.util.concurrent.Futures.immediateVoidFuture;
import static com.google.devtools.build.lib.remote.util.DigestUtil.isOldStyleDigestFunction;
import static com.google.devtools.build.lib.remote.util.Utils.getFromFuture;
import static com.google.devtools.build.lib.remote.util.Utils.waitForBulkTransfer;
import static java.lang.String.format;
import static java.util.concurrent.TimeUnit.SECONDS;

import build.bazel.remote.execution.v2.Digest;
import build.bazel.remote.execution.v2.DigestFunction;
import com.google.bytestream.ByteStreamGrpc;
import com.google.bytestream.ByteStreamGrpc.ByteStreamFutureStub;
import com.google.bytestream.ByteStreamGrpc.ByteStreamStub;
@@ -69,6 +71,7 @@ class ByteStreamUploader {
private final CallCredentialsProvider callCredentialsProvider;
private final long callTimeoutSecs;
private final RemoteRetrier retrier;
private final DigestFunction.Value digestFunction;

@Nullable private final Semaphore openedFilePermits;

@@ -89,14 +92,16 @@ class ByteStreamUploader {
CallCredentialsProvider callCredentialsProvider,
long callTimeoutSecs,
RemoteRetrier retrier,
int maximumOpenFiles) {
int maximumOpenFiles,
DigestFunction.Value digestFunction) {
checkArgument(callTimeoutSecs > 0, "callTimeoutSecs must be gt 0.");
this.instanceName = instanceName;
this.channel = channel;
this.callCredentialsProvider = callCredentialsProvider;
this.callTimeoutSecs = callTimeoutSecs;
this.retrier = retrier;
this.openedFilePermits = maximumOpenFiles != -1 ? new Semaphore(maximumOpenFiles) : null;
this.digestFunction = digestFunction;
}

@VisibleForTesting
@@ -175,11 +180,26 @@ public ListenableFuture<Void> uploadBlobAsync(
MoreExecutors.directExecutor());
}

private static String buildUploadResourceName(
private String buildUploadResourceName(
String instanceName, UUID uuid, Digest digest, boolean compressed) {
String template =
compressed ? "uploads/%s/compressed-blobs/zstd/%s/%d" : "uploads/%s/blobs/%s/%d";
String resourceName = format(template, uuid, digest.getHash(), digest.getSizeBytes());

String resourceName;

if (isOldStyleDigestFunction(digestFunction)) {
String template =
compressed ? "uploads/%s/compressed-blobs/zstd/%s/%d" : "uploads/%s/blobs/%s/%d";
resourceName = format(template, uuid, digest.getHash(), digest.getSizeBytes());
} else {
String template =
compressed ? "uploads/%s/compressed-blobs/zstd/%s/%s/%d" : "uploads/%s/blobs/%s/%s/%d";
resourceName =
format(
template,
uuid,
Ascii.toLowerCase(digestFunction.getValueDescriptor().getName()),
digest.getHash(),
digest.getSizeBytes());
}
if (!Strings.isNullOrEmpty(instanceName)) {
resourceName = instanceName + "/" + resourceName;
}