feat: blob implementation #1342

Merged
merged 27 commits into from
Apr 12, 2024
27 commits
1a456e8
feat: blob implementation
vasco-santos Mar 25, 2024
dece9c4
feat: http put
vasco-santos Apr 3, 2024
e09310b
fix: upgrade ucanto libs
vasco-santos Apr 4, 2024
71b4550
fix: wire conclude and accept together
vasco-santos Apr 5, 2024
05bade8
chore: address review
vasco-santos Apr 5, 2024
ab02d08
fix: minor tweaks
vasco-santos Apr 5, 2024
8931124
chore: add location claim
vasco-santos Apr 5, 2024
9fac96c
chore: address review
vasco-santos Apr 8, 2024
730b601
chore: address review
vasco-santos Apr 8, 2024
92500c6
chore: add await error
vasco-santos Apr 8, 2024
c88f634
chore: update based on https://github.com/web3-storage/specs/pull/117
vasco-santos Apr 9, 2024
5a49c72
chore: address review
vasco-santos Apr 9, 2024
6579dde
fix: derive blob provider from last 32 bytes from content
vasco-santos Apr 9, 2024
7618768
fix: try to fix tests
vasco-santos Apr 9, 2024
4de3b96
chore: address review
vasco-santos Apr 9, 2024
e58e864
chore: clean up
vasco-santos Apr 9, 2024
5adcb0d
fix: adopt fully ucan await pipeline
vasco-santos Apr 10, 2024
6f2bf8f
fix: add expiration to allocate receipt and wire up space for accept
vasco-santos Apr 10, 2024
dc5e821
chore: re-enable attw
vasco-santos Apr 11, 2024
0cc774a
chore: rename exp to ttl and make optional for blob accept
vasco-santos Apr 11, 2024
a40e9f2
fix: address review comments for capabilities and types
vasco-santos Apr 11, 2024
e5e60d5
fix: address review comments for upload api
vasco-santos Apr 11, 2024
f32d7ab
refactor: make schedule functions the same as invocation creation
vasco-santos Apr 11, 2024
b91dd04
test: simplify and add tests
vasco-santos Apr 11, 2024
8b0d4ab
test: add all storage tests
vasco-santos Apr 11, 2024
4cf99fa
chore: address last review comments
vasco-santos Apr 12, 2024
15d732c
fix: remove unused dep
vasco-santos Apr 12, 2024
7 changes: 6 additions & 1 deletion packages/capabilities/package.json
@@ -56,6 +56,10 @@
"types": "./dist/src/filecoin/dealer.d.ts",
"import": "./src/filecoin/dealer.js"
},
"./web3.storage/blob": {
"types": "./dist/src/web3.storage/blob.d.ts",
"import": "./src/web3.storage/blob.js"
},
Contributor (@Gozala, Apr 10, 2024)

We should probably expose the other capabilities as well.

Suggested change
- },
+ },
+ "./blob": {
+ "types": "./dist/src/blob.d.ts",
+ "import": "./src/blob.js"
+ },
+ "./http": {
+ "types": "./dist/src/http.d.ts",
+ "import": "./src/http.js"
+ },

"./types": {
"types": "./dist/src/types.d.ts",
"import": "./src/types.js"
@@ -88,7 +92,8 @@
"@ucanto/principal": "^9.0.1",
"@ucanto/transport": "^9.1.1",
"@ucanto/validator": "^9.0.2",
- "@web3-storage/data-segment": "^3.2.0"
+ "@web3-storage/data-segment": "^3.2.0",
+ "uint8arrays": "^5.0.3"
},
"devDependencies": {
"@web3-storage/eslint-config-w3up": "workspace:^",
75 changes: 75 additions & 0 deletions packages/capabilities/src/blob.js
@@ -0,0 +1,75 @@
/**
* Blob Capabilities.
*
* A blob is a fixed-size byte array addressed by its multihash.
* Blobs are usually used to represent a set of IPLD blocks at different byte ranges.
*
* These can be imported directly with:
* ```js
* import * as Blob from '@web3-storage/capabilities/blob'
* ```
*
* @module
*/
import { capability, Schema } from '@ucanto/validator'
import { equalBlob, equalWith, SpaceDID } from './utils.js'

/**
* Agent capabilities for Blob protocol
*/

/**
* This capability can only be delegated (not invoked). It allows the audience to
* derive any `blob/`-prefixed capability for the (memory) space identified
* by the DID in the `with` field.
*/
export const blob = capability({
can: 'blob/*',
/**
* DID of the (memory) space where Blob is intended to
* be stored.
*/
with: SpaceDID,
derives: equalWith,
})

/**
* Blob description for being ingested by the service.
*/
export const content = Schema.struct({
/**
* A multihash digest of the blob payload bytes, uniquely identifying blob.
*/
digest: Schema.bytes(),
/**
* Number of bytes contained by this blob. The service will provision a write target
* for this exact size. An attempt to write a larger blob will fail.
*/
size: Schema.integer(),
})

/**
* The `blob/add` capability allows an agent to store a Blob into a (memory) space
* identified by the did:key in the `with` field. The agent should compute the blob multihash
* and size and provide them under the `nb.blob` field, allowing the service to provision
* a write location for the agent to PUT the desired Blob into.
*/
export const add = capability({
can: 'blob/add',
/**
* DID of the (memory) space where Blob is intended to
* be stored.
*/
with: SpaceDID,
nb: Schema.struct({
/**
* Blob to be added on the space.
*/
blob: content,
Member

Why not nb: content?

Contributor Author

It is probably better not to tie the implementation to this. The code follows the spec, so I would recommend discussing it at the spec level via an issue or PR rather than here; see storacha/specs#118

Contributor

I agree with @vasco-santos on discussing this in the spec; however, since the initial spec was merged, I'll respond here. I think we'd like to maintain some flexibility to extend the capability over time, e.g. I anticipate that we may want to allow setting expiry and perhaps a target region in the future. If we made nb: content, those extensions would be more difficult.

Relatedly, I'm also exploring the idea of making content into a multihash that captures payload size, per multiformats/multihash#163, which is also why we want to treat content as a unit as opposed to a set of fields.

}),
derives: equalBlob,
})

// ⚠️ We export imports here so they are not omitted in generated typedefs
// @see https://github.com/microsoft/TypeScript/issues/51548
export { Schema }
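
A minimal sketch of the plain-object shape a `blob/add` capability takes, given the `content` struct above. It assumes the payload is hashed with sha2-256, whose multihash header is the bytes `0x12 0x20`; the helper names `toSha256Multihash` and `blobAddCapability` and the DID value are hypothetical and not part of this PR.

```js
// Multihash header for sha2-256: hash function code 0x12, digest length 0x20 (32 bytes).
const SHA2_256_CODE = 0x12
const SHA2_256_SIZE = 0x20

/**
 * Wraps a raw 32-byte sha2-256 hash into multihash bytes.
 * Hypothetical helper; in practice a multiformats library would do this.
 *
 * @param {Uint8Array} hash
 */
function toSha256Multihash(hash) {
  if (hash.length !== SHA2_256_SIZE) {
    throw new RangeError(`expected ${SHA2_256_SIZE} hash bytes, got ${hash.length}`)
  }
  const digest = new Uint8Array(2 + hash.length)
  digest[0] = SHA2_256_CODE
  digest[1] = SHA2_256_SIZE
  digest.set(hash, 2)
  return digest
}

/**
 * Builds the plain-object shape of a `blob/add` capability.
 *
 * @param {string} space - DID of the space
 * @param {Uint8Array} hash - raw sha2-256 hash of the payload
 * @param {number} size - payload size in bytes
 */
function blobAddCapability(space, hash, size) {
  return {
    can: 'blob/add',
    with: space,
    // `blob` is kept as a unit so sibling fields can be added to `nb` later.
    nb: { blob: { digest: toSha256Multihash(hash), size } },
  }
}

// Placeholder values for illustration only.
const example = blobAddCapability('did:key:z6MkExample', new Uint8Array(32).fill(7), 1024)
```

Keeping `blob` nested under `nb` mirrors the reasoning in the review thread above: new fields can later sit next to `blob` without changing the `content` struct itself.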
49 changes: 49 additions & 0 deletions packages/capabilities/src/http.js
@@ -0,0 +1,49 @@
/**
* HTTP Capabilities.
*
* These can be imported directly with:
* ```js
* import * as HTTP from '@web3-storage/capabilities/http'
* ```
*
* @module
*/
import { capability, Schema, ok } from '@ucanto/validator'
import { content } from './blob.js'
import { equal, equalBody, equalWith, SpaceDID, Await, and } from './utils.js'

/**
* `http/put` capability invocation MAY be performed by any authorized agent on behalf of the subject
* as long as they have referenced `body` content to do so.
*/
export const put = capability({
can: 'http/put',
/**
* DID of the (memory) space where Blob is intended to
* be stored.
*/
with: SpaceDID,
nb: Schema.struct({
/**
* Description of body to send (digest/size).
*/
body: content,
/**
* HTTP(S) location that can receive blob content via HTTP PUT request.
*/
url: Schema.string().or(Await),
/**
* HTTP headers.
*/
headers: Schema.dictionary({ value: Schema.string() }).or(Await),
}),
derives: (claim, from) => {
return (
and(equalWith(claim, from)) ||
and(equalBody(claim, from)) ||
and(equal(claim.nb.url, from.nb, 'url')) ||
and(equal(claim.nb.headers, from.nb, 'headers')) ||
ok({})
)
},
})
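
The `and(...) || ... || ok({})` chain in `derives` returns the first failing check and otherwise falls through to success. A minimal sketch of that pattern follows. The `ok`, `fail`, `and` and `equal` functions here are simplified stand-ins written for this example, not the real helpers from `@ucanto/validator` or `./utils.js`, and the sketch compares each field of the claim with the same field of the parent.

```js
// Stand-ins that only mimic the { ok } / { error } result shape (assumption).
const ok = (value) => ({ ok: value })
const fail = (message) => ({ error: { name: 'Failure', message } })

// Returns the result only when it is an error, so `||` keeps going on success.
const and = (result) => (result.error ? result : undefined)

// Hypothetical constraint check: the parent either leaves the field open
// or the child value must match it.
const equal = (child, parent, name) =>
  parent === undefined || String(child) === String(parent)
    ? ok({})
    : fail(`${name}: ${child} violates constraint ${parent}`)

const derives = (claim, from) =>
  and(equal(claim.with, from.with, 'with')) ||
  and(equal(claim.nb.url, from.nb.url, 'url')) ||
  ok({})

// Placeholder delegation and claims for illustration only.
const from = { with: 'did:key:zSpace', nb: { url: 'https://example.com/a' } }
const good = derives({ with: 'did:key:zSpace', nb: { url: 'https://example.com/a' } }, from)
const bad = derives({ with: 'did:key:zSpace', nb: { url: 'https://example.com/b' } }, from)
```

Here `good` carries an `ok` value and `bad` carries an `error` naming the `url` constraint, because evaluation stops at the first check that fails.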
10 changes: 10 additions & 0 deletions packages/capabilities/src/index.js
@@ -19,6 +19,9 @@ import * as DealTracker from './filecoin/deal-tracker.js'
import * as UCAN from './ucan.js'
import * as Plan from './plan.js'
import * as Usage from './usage.js'
import * as Blob from './blob.js'
import * as W3sBlob from './web3.storage/blob.js'
import * as HTTP from './http.js'

export {
Access,
@@ -63,6 +66,7 @@ export const abilitiesAsStrings = [
Access.access.can,
Access.authorize.can,
UCAN.attest.can,
UCAN.conclude.can,
Customer.get.can,
Consumer.has.can,
Consumer.get.can,
@@ -86,4 +90,10 @@ export const abilitiesAsStrings = [
Plan.get.can,
Usage.usage.can,
Usage.report.can,
Blob.blob.can,
Blob.add.can,
W3sBlob.blob.can,
W3sBlob.allocate.can,
W3sBlob.accept.can,
HTTP.put.can,
]
117 changes: 116 additions & 1 deletion packages/capabilities/src/types.ts
@@ -21,6 +21,9 @@ import {
import { space, info } from './space.js'
import * as provider from './provider.js'
import { top } from './top.js'
import * as BlobCaps from './blob.js'
import * as W3sBlobCaps from './web3.storage/blob.js'
import * as HTTPCaps from './http.js'
import * as StoreCaps from './store.js'
import * as UploadCaps from './upload.js'
import * as AccessCaps from './access.js'
@@ -41,6 +44,10 @@ export type ISO8601Date = string

export type { Unit, PieceLink }

export interface UCANAwait<Selector extends string = string, Task = unknown> {
'ucan/await': [Selector, Link<Task>]
}

/**
* An IPLD Link that has the CAR codec code.
*/
@@ -439,6 +446,95 @@ export interface UploadNotFound extends Ucanto.Failure {

export type UploadGetFailure = UploadNotFound | Ucanto.Failure

// HTTP
export type HTTPPut = InferInvokedCapability<typeof HTTPCaps.put>

// Blob
export type Blob = InferInvokedCapability<typeof BlobCaps.blob>
export type BlobAdd = InferInvokedCapability<typeof BlobCaps.add>
export type ServiceBlob = InferInvokedCapability<typeof W3sBlobCaps.blob>
export type BlobAllocate = InferInvokedCapability<typeof W3sBlobCaps.allocate>
export type BlobAccept = InferInvokedCapability<typeof W3sBlobCaps.accept>

export type BlobMultihash = Uint8Array
export interface BlobModel {
digest: BlobMultihash
size: number
}

// Blob add
export interface BlobAddSuccess {
site: UCANAwait<'.out.ok.site'>
}
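
The `site` field above is a `ucan/await` reference (a selector plus a task link) rather than a concrete value. A minimal sketch of how such a reference might be resolved once the awaited task has a receipt follows. It assumes the selector is a dotted path into the receipt; `resolveAwait`, the `receipts` map and the string task identifiers are hypothetical and used only for illustration.

```js
/**
 * Resolves a `ucan/await` reference against known receipts.
 * `receipts` maps a task link (as a string) to its receipt (assumption).
 *
 * @param {{ 'ucan/await': [string, unknown] }} awaited
 * @param {Map<string, any>} receipts
 */
function resolveAwait(awaited, receipts) {
  const [selector, task] = awaited['ucan/await']
  const receipt = receipts.get(String(task))
  if (!receipt) {
    return { error: { name: 'AwaitError', message: `no receipt for ${task}` } }
  }
  // Walk the dotted selector, e.g. '.out.ok.site' selects receipt.out.ok.site
  let value = receipt
  for (const key of selector.split('.').filter(Boolean)) {
    if (value == null || typeof value !== 'object' || !(key in value)) {
      return { error: { name: 'AwaitError', message: `cannot select ${selector}` } }
    }
    value = value[key]
  }
  return { ok: value }
}

// Placeholder identifiers stand in for real task and delegation links.
const receipts = new Map([['task-accept', { out: { ok: { site: 'link-to-claim' } } }]])
const resolved = resolveAwait({ 'ucan/await': ['.out.ok.site', 'task-accept'] }, receipts)
const pending = resolveAwait({ 'ucan/await': ['.out.ok.site', 'task-unknown'] }, receipts)
```

In this sketch a missing receipt or an unselectable path is reported with the `AwaitError` name declared in the types of this diff.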

export interface BlobSizeOutsideOfSupportedRange extends Ucanto.Failure {
name: 'BlobSizeOutsideOfSupportedRange'
}

export interface AwaitError extends Ucanto.Failure {
name: 'AwaitError'
}

// TODO: We need Ucanto.Failure because provideAdvanced can't handle errors without it
export type BlobAddFailure =
| BlobSizeOutsideOfSupportedRange
| AwaitError
| StorageGetError
| Ucanto.Failure

export interface BlobListItem {
blob: BlobModel
insertedAt: ISO8601Date
}

// Blob allocate
export interface BlobAllocateSuccess {
size: number
address?: BlobAddress
}
Comment on lines +491 to +494
Contributor

I would expect address to always be there; otherwise http/put will not be able to reference url or headers.

Suggested change
- export interface BlobAllocateSuccess {
- size: number
- address?: BlobAddress
- }
+ export interface BlobAllocateSuccess {
+ size: number
+ address: BlobAddress
+ }

Contributor

Nit: Reading this PR got me wondering whether we should have used memory/allocate and memory/commit instead, which would avoid confusion between the service blob and the user blob. That way address: BlobAddress would have become memory: AllocatedMemory.

Contributor Author

Can we please align this with the spec https://github.com/web3-storage/specs/blob/fix/align-blob-with-impl/w3-blob.md#allocate-blob ?

It is difficult to manage what is expected if we are always diverging from it. For a second invocation with a different nonce, I think it makes sense not to create a presigned URL for something that is already there, but I am happy to reconsider.

Contributor Author (@vasco-santos, Apr 11, 2024)

I already raised this a week ago in #1342 (comment), and it gets difficult to make progress this way :/
I would prefer to get this merged and list what we need to improve, rather than blocking an iterative approach on this huge piece of work, where there is always something else and the details in the spec are never stable.

Contributor

  1. The whole rename thing was just a thought; I was not implying that we should do it, sorry for not being clear about this. I just wanted to know whether it made sense to you. If so, we can update the spec and then the implementation, as you say here.
  2. You are also correct to point out that address?: BlobAddress comes from the spec. I was honestly surprised looking at the spec now; clearly I overlooked the issue that I have since pointed out with regard to awaits. I'll fix the spec and we can change this in the implementation as a follow-up.
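
The optional `address` discussed in this thread could be handled on the agent side roughly as follows. This is a minimal sketch, assuming (per the comment above about a second invocation) that the service omits `address` when the blob is already there and no presigned URL is created; `nextStep` and its return shape are hypothetical.

```js
/**
 * Decides what the agent should do after `blob/allocate` succeeds.
 *
 * @param {{ size: number, address?: { url: string, headers: Record<string, string> } }} allocation
 */
function nextStep(allocation) {
  if (!allocation.address) {
    // No write target was handed out, so there is nothing to upload.
    return { type: 'skip-put' }
  }
  return {
    type: 'put',
    url: allocation.address.url,
    headers: allocation.address.headers,
  }
}

// Placeholder allocation results for illustration only.
const first = nextStep({
  size: 1024,
  address: { url: 'https://example.com/upload', headers: { 'content-length': '1024' } },
})
const repeated = nextStep({ size: 0 })
```

If `address` became mandatory, as the suggested change proposes, the first branch would disappear and `http/put` could always reference `url` and `headers`.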


export interface BlobAddress {
url: ToString<URL>
headers: Record<string, string>
expiresAt: ISO8601Date
Contributor

I suggest we call this ttl, since it has well-defined semantics that match exactly what we do here.

Suggested change
- expiresAt: ISO8601Date
+ ttl: ISO8601Date

Contributor

Also, isn't the ttl already present via the X-Amz-Expires query param? Perhaps we should not duplicate it here, but rather have a function that derives the ttl from the BlobAddress. That would prevent the two from getting out of sync.

Contributor Author

Yesterday we talked about not making this provider-specific, and about the limitations that AWS imposes. See #1342 (comment)

Contributor Author

I want to have a date here; otherwise I cannot infer whether the presigned URL has expired and, therefore, whether allocation needs to run again. A TTL is usually how much longer something remains available, rather than a date, so we would have the same issue that X-Amz-Expires has. We could also have a receipt time to avoid this. Thoughts?

Contributor

Per Wikipedia on TTL:

Time to live (TTL) or hop limit is a mechanism which limits the lifespan or lifetime of data in a computer or network. TTL may be implemented as a counter or timestamp attached to or embedded in the data.

I have mostly seen timestamps used for TTL, but in datagrams it is indeed more of a remaining lifetime, so I think you are correct. Relatedly, the wiki mentions the Expires header and the same-named cookie, so I agree that TTL may not be a good name.

How about renaming it to expires, just to align with the HTTP header and the cookie? I'm also OK with leaving it as is.

Regarding my comment about not replicating data already in the headers, my primary motivation was to avoid the two getting out of sync, but after thinking more about it I think you are correct. We may choose to expire it sooner than what's in the presigned URL, so perhaps having the two out of sync is not so bad. Having our expiry later than what's in the headers is still bad, but I think it's OK.

}
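
The difference argued in the thread above, an absolute expiry date versus a relative TTL, can be sketched as follows. With `expiresAt` the check needs only the current time, while a relative TTL also needs the time at which the address was issued. Both function names and the `issuedAt` parameter are hypothetical.

```js
/**
 * Absolute expiry: the address alone is enough to decide.
 *
 * @param {{ expiresAt: string }} address - ISO 8601 date, as in BlobAddress
 * @param {Date} [now]
 */
function isExpired(address, now = new Date()) {
  return now.getTime() >= Date.parse(address.expiresAt)
}

/**
 * Relative TTL: the issue time must be known as well.
 *
 * @param {string} issuedAt - ISO 8601 date the address was issued (assumed input)
 * @param {number} ttlSeconds
 * @param {Date} [now]
 */
function isExpiredByTtl(issuedAt, ttlSeconds, now = new Date()) {
  return now.getTime() >= Date.parse(issuedAt) + ttlSeconds * 1000
}

// Placeholder address; an expired one would mean allocation has to run again.
const address = { expiresAt: '2024-04-11T12:00:00.000Z' }
const needsNewAllocation = isExpired(address, new Date('2024-04-11T12:30:00.000Z'))
```

The "receipt time" mentioned in the thread would play the role of `issuedAt` in the second function.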

// Raised when the user space does not have enough capacity to allocate the blob.
export interface NotEnoughStorageCapacity extends Ucanto.Failure {
name: 'NotEnoughStorageCapacity'
}

export type BlobAllocateFailure = NotEnoughStorageCapacity | Ucanto.Failure

// Blob accept
export interface BlobAcceptSuccess {
// A Link for a delegation with the site commitment for the added blob.
site: Link
}

export interface AllocatedMemoryHadNotBeenWrittenTo extends Ucanto.Failure {
name: 'AllocatedMemoryHadNotBeenWrittenTo'
}

// TODO: We should type the store errors and add them here, instead of Ucanto.Failure
export type BlobAcceptFailure =
| AllocatedMemoryHadNotBeenWrittenTo
| Ucanto.Failure

// Storage errors
export type StoragePutError = StorageOperationError
export type StorageGetError = StorageOperationError | RecordNotFound

// An operation on the storage failed with an unexpected error
export interface StorageOperationError extends Error {
name: 'StorageOperationFailed'
}

// The requested record was not found in the storage
export interface RecordNotFound extends Error {
name: 'RecordNotFound'
}

// Store
export type Store = InferInvokedCapability<typeof StoreCaps.store>
export type StoreAdd = InferInvokedCapability<typeof StoreCaps.add>
@@ -530,6 +626,7 @@ export interface UploadListSuccess extends ListResponse<UploadListItem> {}

export type UCANRevoke = InferInvokedCapability<typeof UCANCaps.revoke>
export type UCANAttest = InferInvokedCapability<typeof UCANCaps.attest>
export type UCANConclude = InferInvokedCapability<typeof UCANCaps.conclude>

export interface Timestamp {
/**
@@ -540,6 +637,8 @@ export interface Timestamp {

export type UCANRevokeSuccess = Timestamp

export type UCANConcludeSuccess = Timestamp

/**
* Error is raised when the `UCAN` being revoked is not supplied or its proof chain
* leading to the supplied `scope` is not supplied.
@@ -578,6 +677,15 @@ export type UCANRevokeFailure =
| UnauthorizedRevocation
| RevocationsStoreFailure

/**
* Error is raised when a receipt is received for an unknown invocation
*/
export interface ReferencedInvocationNotFound extends Ucanto.Failure {
name: 'ReferencedInvocationNotFound'
}

export type UCANConcludeFailure = ReferencedInvocationNotFound | Ucanto.Failure

// Admin
export type Admin = InferInvokedCapability<typeof AdminCaps.admin>
export type AdminUploadInspect = InferInvokedCapability<
@@ -686,6 +794,7 @@ export type ServiceAbilityArray = [
Access['can'],
AccessAuthorize['can'],
UCANAttest['can'],
UCANConclude['can'],
CustomerGet['can'],
ConsumerHas['can'],
ConsumerGet['can'],
@@ -708,7 +817,13 @@ export type ServiceAbilityArray = [
AdminStoreInspect['can'],
PlanGet['can'],
Usage['can'],
- UsageReport['can']
+ UsageReport['can'],
Blob['can'],
BlobAdd['can'],
ServiceBlob['can'],
BlobAllocate['can'],
BlobAccept['can'],
HTTPPut['can']
]

/**
30 changes: 29 additions & 1 deletion packages/capabilities/src/ucan.js
@@ -2,7 +2,7 @@
* UCAN core capabilities.
*/

- import { capability, Schema } from '@ucanto/validator'
+ import { capability, Schema, ok } from '@ucanto/validator'
import * as API from '@ucanto/interface'
import { equalWith, equal, and, checkLink } from './utils.js'

@@ -74,6 +74,34 @@ export const revoke = capability({
),
})

/**
* The `ucan/conclude` capability represents a receipt using a special UCAN capability.
*
* The UCAN invocation specification defines a receipt record, which is a cryptographically
* signed description of the invocation output and requested effects. The receipt
* structure is very similar to a UCAN, except that it has no notion of expiry, nor is it
* possible to delegate the ability to issue a receipt to another principal.
*/
export const conclude = capability({
can: 'ucan/conclude',
/**
* DID of the principal representing the Conclusion Authority.
* MUST be the DID of the audience of the invocation that was run.
*/
with: Schema.did(),
nb: Schema.struct({
/**
* CID of the content with the Receipt.
*/
receipt: Schema.link(),
}),
derives: (claim, from) =>
// With field MUST be the same
and(equalWith(claim, from)) ||
and(checkLink(claim.nb.receipt, from.nb.receipt, 'nb.receipt')) ||
ok({}),
})
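
A minimal sketch of what handling a `ucan/conclude` invocation might look like, using the `ReferencedInvocationNotFound` failure declared in `types.ts` of this diff. It is not the handler from this PR. The `receipts` and `invocations` lookups, the `ran` field on the receipt, the string identifiers, and the `time` field on the success value are all assumptions made for illustration (the body of `Timestamp` is not shown in the diff).

```js
/**
 * Hypothetical handler logic for `ucan/conclude`.
 *
 * @param {{ can: string, with: string, nb: { receipt: unknown } }} capability
 * @param {Map<string, { ran: unknown }>} receipts - receipt link -> receipt (assumed store)
 * @param {Set<string>} invocations - links of invocations the service knows about (assumed store)
 * @param {number} [now] - time to report on success
 */
function conclude(capability, receipts, invocations, now = Date.now()) {
  const receipt = receipts.get(String(capability.nb.receipt))
  // Assumption: a receipt points back at the invocation it concludes via `ran`.
  if (!receipt || !invocations.has(String(receipt.ran))) {
    return {
      error: {
        name: 'ReferencedInvocationNotFound',
        message: `no known invocation for receipt ${capability.nb.receipt}`,
      },
    }
  }
  return { ok: { time: now } }
}

// Placeholder stores and identifiers for illustration only.
const knownReceipts = new Map([['receipt-1', { ran: 'task-1' }]])
const knownInvocations = new Set(['task-1'])
const accepted = conclude(
  { can: 'ucan/conclude', with: 'did:web:example.com', nb: { receipt: 'receipt-1' } },
  knownReceipts,
  knownInvocations
)
const rejected = conclude(
  { can: 'ucan/conclude', with: 'did:web:example.com', nb: { receipt: 'receipt-2' } },
  knownReceipts,
  knownInvocations
)
```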

/**
* Issued by a trusted authority (usually the one handling the invocation) that attests
* that a specific UCAN delegation has been considered authentic.