Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: aggregation spec to use height instead of size #66

Merged
merged 2 commits into from
Jul 18, 2023

Conversation

vasco-santos
Copy link
Contributor

Renames size to height for piece/aggregate based on described in storacha/data-segment#13

Note that:

  • given piece and aggregate are both perfect binary trees, leaf count of can be derived as 2 ** height, and the size of the tree can be derived by leafCount * Node.Size (32).

# https://github.com/filecoin-project/go-state-types/blob/1e6cf0d47cdda75383ef036fc2725d1cf51dbde8/abi/piece.go#L47-L50
# Uses `height` field instead of `size`. `height` field can be used to derive `leafCount` and consequently `size`, while
# allowing the usage of smaller numbers instead of `bigint`.
type PieceInfo {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Gozala by this change, I am considering to just rename PieceInfo to PieceProof instead.

Motivation would be to create a new name instead of relying on PieceInfo that is already known to have cid + size. We could just name it PieceProof and add comment saying it can be used to derive PieceInfo. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's fine to leave as PieceInfo, the point is for this information to now be the "piece information" as we'd not materialize and pass around a size value unless for display.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't want to call it a PieceProof as it can be confused with InclusionProof (in fact that's how it a appears in the go implementation).

If you want to rename perhaps call it a PieceDetail, personally I don't have strong feelings about the name. In fact it is a mistake that height is not part of the CID, but I'm not sure it's worth trying to fix that.

@@ -132,7 +132,7 @@ A Storefront principal can invoke a capabilty to offer an aggregate that is read
"offer": { "/": "bafy...many-cars" }, /* dag-cbor CID with offer content */
"piece": {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be pieceProof or even just proof? See rationale on other review comment

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please lets not use term proof as we'll have inclusion proofs and ucans have their own proofs and it's going to get very confusing. I'm open to different name, but proof is just too confusing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually can we just call this an aggregate instead ? Alternatively we could change it all to

  "pieces": { "/": "bafy...pieces" } // dag-cbor containing pieces
  "aggregate": { "/": "bafy...aggregate-cid" },
  "height": 4 // height of the aggregate tree

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it's kind of a mistake that height / size is not part of the CID itself.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vasco-santos we should also check with spade team if they even need or care about the pieces here, I suspect they are not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually can we just call this an aggregate instead ?

we previously went back from aggregate to piece given they by the end of the day are the same. Have height and cid, and perhaps in the future they could be encoded in the CID as you say.

Copy link
Member

@alanshaw alanshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I think a minor clarification on "perfect" binary tree would help but I think the meaning is obvious so not a blocker.

@@ -132,7 +132,7 @@ A Storefront principal can invoke a capabilty to offer an aggregate that is read
"offer": { "/": "bafy...many-cars" }, /* dag-cbor CID with offer content */
"piece": {
"link": { "/": "commitment...aggregate-proof" },
"size": 10102020203013342343
"height": 4 /* height of the perfect binary tree for the aggregate */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "perfect" the right word? "Perfectly balanced" perhaps?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the wikipedia

A perfect binary tree is a binary tree in which all interior nodes have two children and all leaves have the same depth or same level

Which is why I use that term in data-segment lib

# https://github.com/filecoin-project/go-state-types/blob/1e6cf0d47cdda75383ef036fc2725d1cf51dbde8/abi/piece.go#L47-L50
# Uses `height` field instead of `size`. `height` field can be used to derive `leafCount` and consequently `size`, while
# allowing the usage of smaller numbers instead of `bigint`.
type PieceInfo {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's fine to leave as PieceInfo, the point is for this information to now be the "piece information" as we'd not materialize and pass around a size value unless for display.

@@ -132,7 +132,7 @@ A Storefront principal can invoke a capabilty to offer an aggregate that is read
"offer": { "/": "bafy...many-cars" }, /* dag-cbor CID with offer content */
"piece": {
"link": { "/": "commitment...aggregate-proof" },
"size": 10102020203013342343
"height": 4 /* height of the perfect binary tree for the aggregate */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the wikipedia

A perfect binary tree is a binary tree in which all interior nodes have two children and all leaves have the same depth or same level

Which is why I use that term in data-segment lib

@@ -132,7 +132,7 @@ A Storefront principal can invoke a capabilty to offer an aggregate that is read
"offer": { "/": "bafy...many-cars" }, /* dag-cbor CID with offer content */
"piece": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please lets not use term proof as we'll have inclusion proofs and ucans have their own proofs and it's going to get very confusing. I'm open to different name, but proof is just too confusing

@@ -132,7 +132,7 @@ A Storefront principal can invoke a capabilty to offer an aggregate that is read
"offer": { "/": "bafy...many-cars" }, /* dag-cbor CID with offer content */
"piece": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually can we just call this an aggregate instead ? Alternatively we could change it all to

  "pieces": { "/": "bafy...pieces" } // dag-cbor containing pieces
  "aggregate": { "/": "bafy...aggregate-cid" },
  "height": 4 // height of the aggregate tree

@@ -132,7 +132,7 @@ A Storefront principal can invoke a capabilty to offer an aggregate that is read
"offer": { "/": "bafy...many-cars" }, /* dag-cbor CID with offer content */
"piece": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it's kind of a mistake that height / size is not part of the CID itself.

@@ -132,7 +132,7 @@ A Storefront principal can invoke a capabilty to offer an aggregate that is read
"offer": { "/": "bafy...many-cars" }, /* dag-cbor CID with offer content */
"piece": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vasco-santos we should also check with spade team if they even need or care about the pieces here, I suspect they are not.

Invoking `aggregate/offer` capability submits an aggregate to a broker service for inclusion in one or more Filecoin deals. The `nb.offer` field represents a "Ferry" aggregate offer that is ready for a Filecoin deal. Its value is the DAG-CBOR CID that refers to a "Ferry" offer. It encodes a dag-cbor block with an array of entries representing all the CAR files to include in the aggregated deal. This block MUST be included in the CAR file that transports the invocation. Its format is:
Invoking `aggregate/offer` capability submits an aggregate to a broker service for inclusion in one or more Filecoin deals. The `nb.piece` field represents the proof of the `piece` to be offered for the deal. It contains the proof CID `piece.link` together with the `piece.height` of the perfect binary tree computed for this aggregate. The `height` can be used to derive the leaf count (`2 ** height`), which can then be used to derive the `size` of the piece (`leafCount * Node.Size` where `Node.Size` is 32).

The `nb.offer` field represents a "Ferry" aggregate offer that is ready for a Filecoin deal. Its value is the DAG-CBOR CID that refers to a "Ferry" offer. It encodes a dag-cbor block with an array of entries representing all the CAR files to include in the aggregated deal. This block MUST be included in the CAR file that transports the invocation. Its format is:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect CARs in offer to be in same order as they appear in the aggregate ? I think we need a note whether they SHOULD, MUST, MAY or are RECOMMENDED to follow the order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, they MUST because other party might intend to validate before submitting . I will add

# https://github.com/filecoin-project/go-state-types/blob/1e6cf0d47cdda75383ef036fc2725d1cf51dbde8/abi/piece.go#L47-L50
# Uses `height` field instead of `size`. `height` field can be used to derive `leafCount` and consequently `size`, while
# allowing the usage of smaller numbers instead of `bigint`.
type PieceInfo {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't want to call it a PieceProof as it can be confused with InclusionProof (in fact that's how it a appears in the go implementation).

If you want to rename perhaps call it a PieceDetail, personally I don't have strong feelings about the name. In fact it is a mistake that height is not part of the CID, but I'm not sure it's worth trying to fix that.

@vasco-santos vasco-santos merged commit b2ba739 into main Jul 18, 2023
2 checks passed
@vasco-santos vasco-santos deleted the fix/aggregation-spec-to-use-height-instead-of-size branch July 18, 2023 07:58
vasco-santos added a commit to storacha/w3up that referenced this pull request Jul 25, 2023
…together with client and api (#831)

Renames `size` to `height` for `piece`/`aggregate` based on described in
storacha/data-segment#13

Note that:
- took the opportunity to update this to finally rely on `data-segment`
lib as it is now ready to be used (both in tests, and to validate
aggregate)
- given `piece` and `aggregate` are both perfect binary trees, leaf
count of can be derived as `2 ** height`, and the size of the tree can
be derived by `leafCount * Node.Size` (32).

Needs storacha/specs#66
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants