-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: generate sharded DAG index on client and invoke w index/add
#1451
Conversation
@@ -7,7 +7,7 @@ import varint from 'varint' | |||
*/ | |||
|
|||
/** Byte length of a CBOR encoded CAR header with zero roots. */ | |||
const NO_ROOTS_HEADER_LENGTH = 17 | |||
const NO_ROOTS_HEADER_LENGTH = 18 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦 FML this was wrong. Either it was wrong the whole time or @ipld/car
started encoding the header differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would be great to just export it from @ipld/car
@@ -19,7 +19,7 @@ export async function randomBytes(size) { | |||
} else { | |||
crypto.getRandomValues(chunk) | |||
} | |||
size -= bytes.length | |||
size -= chunk.length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FML we were creating random data in tests where the last 65,536 bytes were random but the rest was ZEROs.
for (const slice of slices.values()) { | ||
slice[0] += diff | ||
} | ||
controller.enqueue(await encodeCAR(blocks, slices, rootCID)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Putting a root CID in the last CAR shard is a massive PITA. I wish we didn't do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, I thought we were putting it in the first shard.
Let's put it as a separate thing that is not in the set of shards. It would be easier to change that now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I would like all shards to be rootless and track the DAG root in upload/add
and/or content claims.
f8fc6d5
to
6d691de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments. I am not sure about the placement of the DAG block. Maybe better to change the index to keep it separate.
Review of test.index.js
was not done as I still need to understand how that code works.
|
||
// add the CAR shard itself to the slices | ||
meta.slices.set(meta.cid.multihash, [0, meta.size]) | ||
shardIndexes.push(meta.slices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check with ShardedDAGIndex
, but I thought this was supposed to be the first slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are sorted when encoded anyway so it doesn't matter where you put it?
{ message: 'failed index/add invocation' } | ||
) | ||
}) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will any of these tests fail if the DAG block is in the wrong place or missing? If no, then that would be good. Or, change the index to put the DAGblock in a different place than the others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but there probably should be a test that asserts the additional slice that spans the whole blob is added...
@@ -7,7 +7,7 @@ import varint from 'varint' | |||
*/ | |||
|
|||
/** Byte length of a CBOR encoded CAR header with zero roots. */ | |||
const NO_ROOTS_HEADER_LENGTH = 17 | |||
const NO_ROOTS_HEADER_LENGTH = 18 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would be great to just export it from @ipld/car
…m:web3-storage/w3up into feat/generate-sharded-dag-index-in-client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
some suggestions for follow ups in review, + filecoin/offer
things I believe MUST be added
@@ -183,5 +195,24 @@ async function uploadBlockStream( | |||
if (!root) throw new Error('missing root CID') | |||
|
|||
await Upload.add(conf, root, shards, options) | |||
|
|||
const index = ShardedDAGIndex.create(root) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should we have an option to not index? I mean, let's do this by default, but we wanted this to be optional, perhaps just an issue where user can tweak their indexing intentions would be good enough for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
|
||
// Store the index in the space | ||
const indexDigest = await Blob.add(conf, indexBytes.ok, options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this is great to store in the space, but we will need to look into deleting this on remove from space to not have its usage. Maybe we can just fill in issue now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Co-authored-by: Vasco Santos <santos.vasco10@gmail.com>
Co-authored-by: Vasco Santos <santos.vasco10@gmail.com>
…m:web3-storage/w3up into feat/generate-sharded-dag-index-in-client
🤖 I have created a release *beep* *boop* --- ## [15.0.0](upload-api-v14.0.0...upload-api-v15.0.0) (2024-05-15) ### ⚠ BREAKING CHANGES * delegated capabilities required to use `uploadFile`, `uploadDirectory` and `uploadCAR` have changed. In order to use these methods your agent will now need to be delegated `blob/add`, `index/add`, `filecoin/offer` and `upload/add` capabilities. Note: no code changes are required. ### Features * generate sharded DAG index on client and invoke w `index/add` ([#1451](#1451)) ([a6d9026](a6d9026)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
🤖 I have created a release *beep* *boop* --- ## [19.0.0](access-v18.4.0...access-v19.0.0) (2024-05-15) ### ⚠ BREAKING CHANGES * delegated capabilities required to use `uploadFile`, `uploadDirectory` and `uploadCAR` have changed. In order to use these methods your agent will now need to be delegated `blob/add`, `index/add`, `filecoin/offer` and `upload/add` capabilities. Note: no code changes are required. ### Features * generate sharded DAG index on client and invoke w `index/add` ([#1451](#1451)) ([a6d9026](a6d9026)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
🤖 I have created a release *beep* *boop* --- ## [15.0.0](upload-client-v14.1.1...upload-client-v15.0.0) (2024-05-15) ### ⚠ BREAKING CHANGES * delegated capabilities required to use `uploadFile`, `uploadDirectory` and `uploadCAR` have changed. In order to use these methods your agent will now need to be delegated `blob/add`, `index/add`, `filecoin/offer` and `upload/add` capabilities. Note: no code changes are required. ### Features * generate sharded DAG index on client and invoke w `index/add` ([#1451](#1451)) ([a6d9026](a6d9026)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Alan Shaw <alan.shaw@protocol.ai>
🤖 I have created a release *beep* *boop* --- ## [14.0.0](w3up-client-v13.1.1...w3up-client-v14.0.0) (2024-06-04) ### ⚠ BREAKING CHANGES * **upload-api:** integrate agent store for idempotence & invocation/receipt persistence ([#1444](#1444)) * delegated capabilities required to use `uploadFile`, `uploadDirectory` and `uploadCAR` have changed. In order to use these methods your agent will now need to be delegated `blob/add`, `index/add`, `filecoin/offer` and `upload/add` capabilities. Note: no code changes are required. ### Features * generate sharded DAG index on client and invoke w `index/add` ([#1451](#1451)) ([a6d9026](a6d9026)) * **upload-api:** integrate agent store for idempotence & invocation/receipt persistence ([#1444](#1444)) ([c9bf33e](c9bf33e)) ### Fixes * check for blob/accept receipts before blob/add is concluded ([#1459](#1459)) ([462518c](462518c)) * export blob client ([#1485](#1485)) ([7944077](7944077)) * rename blob and index client capabilities ([#1478](#1478)) ([17e3a31](17e3a31)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Vasco Santos <santos.vasco10@gmail.com>
This PR generates index data for blocks as CAR shards are constructed. Once all shards have been successfully sent, a sharded DAG index is encoded and stored (using
blob/add
) and thenindex/add
is invoked with the CID of the index as a parameter.