Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: require content-type parser to set content-type #423

3 changes: 0 additions & 3 deletions packages/interop/src/verified-fetch-unixfs-dir.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -42,13 +42,11 @@ describe('@helia/verified-fetch - unixfs directory', () => {
expect(resp).to.be.ok()
const text = await resp.text()
expect(text).to.equal('Don\'t we all.')
expect(resp.headers.get('content-type')).to.equal('text/plain')
})

it('can return an image for unixfs pathed data', async () => {
const resp = await verifiedFetch('ipfs://QmbQDovX7wRe9ek7u6QXe9zgCXkTzoUSsTFJEkrYV1HrVR/1 - Barrel - Part 1.png')
expect(resp).to.be.ok()
expect(resp.headers.get('content-type')).to.equal('image/png')
const imgData = await resp.blob()
expect(imgData).to.be.ok()
expect(imgData.size).to.equal(24848)
Expand All @@ -65,7 +63,6 @@ describe('@helia/verified-fetch - unixfs directory', () => {
it('loads path /ipfs/bafybeidbclfqleg2uojchspzd4bob56dqetqjsj27gy2cq3klkkgxtpn4i/685.txt', async () => {
const resp = await verifiedFetch('ipfs://bafybeidbclfqleg2uojchspzd4bob56dqetqjsj27gy2cq3klkkgxtpn4i/685.txt')
expect(resp).to.be.ok()
expect(resp.headers.get('content-type')).to.equal('text/plain')
const text = await resp.text()
// npx kubo@0.25.0 cat '/ipfs/bafybeidbclfqleg2uojchspzd4bob56dqetqjsj27gy2cq3klkkgxtpn4i/685.txt'
expect(text).to.equal(`Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc non imperdiet nunc. Proin ac quam ut nibh eleifend aliquet. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia curae; Sed ligula dolor, imperdiet sagittis arcu et, semper tincidunt urna. Donec et tempor augue, quis sollicitudin metus. Curabitur semper ullamcorper aliquet. Mauris hendrerit sodales lectus eget fermentum. Proin sollicitudin vestibulum commodo. Vivamus nec lectus eu augue aliquet dignissim nec condimentum justo. In hac habitasse platea dictumst. Mauris vel sem neque.
Expand Down
32 changes: 30 additions & 2 deletions packages/verified-fetch/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@
* const fetch = await createVerifiedFetch({
* gateways: ['https://trustless-gateway.link'],
* routers: ['http://delegated-ipfs.dev']
*})
* })
*
* const resp = await fetch('ipfs://bafy...')
*
Expand Down Expand Up @@ -112,6 +112,26 @@
* const json = await resp.json()
* ```
*
* ### Custom content-type parsing
*
* By default, `@helia/verified-fetch` does not set the `Content-Type` header. This is because the `.json()`, `.text()`, `.blob()`, and `.arrayBuffer()` methods will usually work as expected. You can provide a `contentTypeParser` function to the `createVerifiedFetch` function to handle parsing the content type. The function you provide will be passed the first bytes we receive from the network.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could link to #422 if desired.

Suggested change
* By default, `@helia/verified-fetch` does not set the `Content-Type` header. This is because the `.json()`, `.text()`, `.blob()`, and `.arrayBuffer()` methods will usually work as expected. You can provide a `contentTypeParser` function to the `createVerifiedFetch` function to handle parsing the content type. The function you provide will be passed the first bytes we receive from the network.
* By default, `@helia/verified-fetch` does not set the `Content-Type` header. This is because the `.json()`, `.text()`, `.blob()`, and `.arrayBuffer()` methods will usually work as expected. You can provide a `contentTypeParser` function to the `createVerifiedFetch` function to handle parsing the content type. The function you provide will be passed the first bytes we receive from the network. See https://github.com/ipfs/helia/issues/422 for our thoughts and reasoning.

*
* @example Customizing content-type parsing
*
* ```typescript
* import { createVerifiedFetch } from '@helia/verified-fetch'
* import { fileTypeFromBuffer } from '@sgtpooki/file-type'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not referencing mime-types because it only does recognition based on extension, and the troubles @2color mentioned with ipfs-examples/helia-examples#285.

Not referencing file-types because of the lack of browser support, that has been tested and verified in https://github.com/SgtPooki/file-type

*
* const fetch = await createVerifiedFetch({
* gateways: ['https://trustless-gateway.link'],
* routers: ['http://delegated-ipfs.dev'],
* contentTypeParser: async (bytes) => {
2color marked this conversation as resolved.
Show resolved Hide resolved
* // call to some magic-byte recognition library like magic-bytes, file-type, or your own custom byte recognition
* return fileTypeFromBuffer(bytes)?.mime ?? 'application/octet-stream'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* return fileTypeFromBuffer(bytes)?.mime ?? 'application/octet-stream'
* return (await fileTypeFromBuffer(bytes))?.mime ?? 'application/octet-stream'

* }
* })
* ```
*
* ## Comparison to fetch
*
* This module attempts to act as similarly to the `fetch()` API as possible.
Expand Down Expand Up @@ -229,7 +249,7 @@
import { trustlessGateway } from '@helia/block-brokers'
import { createHeliaHTTP } from '@helia/http'
import { delegatedHTTPRouting } from '@helia/routers'
import { VerifiedFetch as VerifiedFetchClass } from './verified-fetch.js'
import { VerifiedFetch as VerifiedFetchClass, type ContentTypeParser } from './verified-fetch.js'
import type { Helia } from '@helia/interface'
import type { IPNSRoutingEvents, ResolveDnsLinkProgressEvents, ResolveProgressEvents } from '@helia/ipns'
import type { GetEvents } from '@helia/unixfs'
Expand Down Expand Up @@ -262,8 +282,16 @@ export interface VerifiedFetch {
export interface CreateVerifiedFetchWithOptions {
gateways: string[]
routers?: string[]
/**
* A function to handle parsing content type from bytes. The function you provide will be passed the first set of
* bytes we receive from the network, and should return a string that will be used as the value for the `Content-Type`
* header in the response.
*/
contentTypeParser?: ContentTypeParser
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add contentTypeParser to verifiedFetch options? I don't think we need to allow passing it for each call, but we could. LMK

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps not in the first instance to keep it simple? We can always add it later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to accept an options param here that allows for the VerifiedFetchInit options and just pass that through? That seems better and less prone to CreateVerifiedFetchWithOptions growing unnecessarily large.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, I think maybe we want to create a separation between the createVerifiedFetch options arg type and the VerifiedFetch class init arg type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats why I went with this initially. we could always update if we really need to, but i don't imagine createVerifiedFetch init args will vary much

}

export type { ContentTypeParser }

export type BubbledProgressEvents =
// unixfs
GetEvents |
Expand Down
55 changes: 0 additions & 55 deletions packages/verified-fetch/src/utils/get-content-type.ts

This file was deleted.

Original file line number Diff line number Diff line change
@@ -1,27 +1,25 @@
import { CustomProgressEvent } from 'progress-events'
import { getContentType } from './get-content-type.js'
import type { VerifiedFetchInit } from '../index.js'
import type { ComponentLogger } from '@libp2p/interface'

/**
* Converts an async iterator of Uint8Array bytes to a stream and attempts to determine the content type of those bytes.
* Converts an async iterator of Uint8Array bytes to a stream and returns the first chunk of bytes
*/
export async function getStreamAndContentType (iterator: AsyncIterable<Uint8Array>, path: string, logger: ComponentLogger, options?: Pick<VerifiedFetchInit, 'onProgress'>): Promise<{ contentType: string, stream: ReadableStream<Uint8Array> }> {
const log = logger.forComponent('helia:verified-fetch:get-stream-and-content-type')
export async function getStreamFromAsyncIterable (iterator: AsyncIterable<Uint8Array>, path: string, logger: ComponentLogger, options?: Pick<VerifiedFetchInit, 'onProgress'>): Promise<{ stream: ReadableStream<Uint8Array>, firstChunk: Uint8Array }> {
const log = logger.forComponent('helia:verified-fetch:get-stream-from-async-iterable')
const reader = iterator[Symbol.asyncIterator]()
const { value, done } = await reader.next()
const { value: firstChunk, done } = await reader.next()

if (done === true) {
log.error('No content found for path', path)
throw new Error('No content found')
}

const contentType = await getContentType({ bytes: value, path })
const stream = new ReadableStream({
async start (controller) {
// the initial value is already available
options?.onProgress?.(new CustomProgressEvent<void>('verified-fetch:request:progress:chunk'))
controller.enqueue(value)
controller.enqueue(firstChunk)
},
async pull (controller) {
const { value, done } = await reader.next()
Expand All @@ -40,5 +38,8 @@ export async function getStreamAndContentType (iterator: AsyncIterable<Uint8Arra
}
})

return { contentType, stream }
return {
stream,
firstChunk
}
}
33 changes: 24 additions & 9 deletions packages/verified-fetch/src/verified-fetch.ts
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@
import { code as jsonCode } from 'multiformats/codecs/json'
import { decode, code as rawCode } from 'multiformats/codecs/raw'
import { CustomProgressEvent } from 'progress-events'
import { getStreamAndContentType } from './utils/get-stream-and-content-type.js'
import { getStreamFromAsyncIterable } from './utils/get-stream-from-async-iterable.js'
import { parseResource } from './utils/parse-resource.js'
import { walkPath, type PathWalkerFn } from './utils/walk-path.js'
import type { CIDDetail, Resource, VerifiedFetchInit as VerifiedFetchOptions } from './index.js'
Expand All @@ -29,12 +29,15 @@
pathWalker?: PathWalkerFn
}

export interface ContentTypeParser {
(bytes: Uint8Array): Promise<string>
}

/**
* Potential future options for the VerifiedFetch constructor.
*/
// eslint-disable-next-line @typescript-eslint/no-empty-interface
interface VerifiedFetchInit {

contentTypeParser?: ContentTypeParser
}

interface FetchHandlerFunctionArg {
Expand Down Expand Up @@ -72,6 +75,7 @@
private readonly json: JSON
private readonly pathWalker: PathWalkerFn
private readonly log: Logger
private readonly contentTypeParser: ContentTypeParser | undefined

constructor ({ helia, ipns, unixfs, dagJson, json, dagCbor, pathWalker }: VerifiedFetchComponents, init?: VerifiedFetchInit) {
this.helia = helia
Expand All @@ -87,6 +91,7 @@
this.json = json ?? heliaJson(helia)
this.dagCbor = dagCbor ?? heliaDagCbor(helia)
this.pathWalker = pathWalker ?? walkPath
this.contentTypeParser = init?.contentTypeParser
this.log.trace('created VerifiedFetch instance')
}

Expand Down Expand Up @@ -133,13 +138,13 @@
private async handleDagCbor ({ cid, path, options }: FetchHandlerFunctionArg): Promise<Response> {
this.log.trace('fetching %c/%s', cid, path)
options?.onProgress?.(new CustomProgressEvent<CIDDetail>('verified-fetch:request:start', { cid: cid.toString(), path }))
const result = await this.dagCbor.get(cid, {
const result = await this.dagCbor.get<Uint8Array>(cid, {
signal: options?.signal,
onProgress: options?.onProgress
})
options?.onProgress?.(new CustomProgressEvent<CIDDetail>('verified-fetch:request:end', { cid: cid.toString(), path }))
const response = new Response(JSON.stringify(result), { status: 200 })
response.headers.set('content-type', 'application/json')
const response = new Response(result, { status: 200 })
await this.setContentType(result, response)
return response
}

Expand Down Expand Up @@ -179,11 +184,11 @@
options?.onProgress?.(new CustomProgressEvent<CIDDetail>('verified-fetch:request:end', { cid: resolvedCID.toString(), path: '' }))
this.log('got async iterator for %c/%s', cid, path)

const { contentType, stream } = await getStreamAndContentType(asyncIter, path ?? '', this.helia.logger, {
const { stream, firstChunk } = await getStreamFromAsyncIterable(asyncIter, path ?? '', this.helia.logger, {
onProgress: options?.onProgress
})
const response = new Response(stream, { status: 200 })
response.headers.set('content-type', contentType)
await this.setContentType(firstChunk, response)

return response
}
Expand All @@ -194,10 +199,20 @@
const result = await this.helia.blockstore.get(cid)
options?.onProgress?.(new CustomProgressEvent<CIDDetail>('verified-fetch:request:end', { cid: cid.toString(), path }))
const response = new Response(decode(result), { status: 200 })
response.headers.set('content-type', 'application/octet-stream')
await this.setContentType(result, response)
return response
}

private async setContentType (bytes: Uint8Array, response: Response): Promise<void> {
if (this.contentTypeParser != null) {
try {
response.headers.set('content-type', await this.contentTypeParser(bytes))
} catch (err) {
this.log.error('Error parsing content type', err)
}

Check warning on line 212 in packages/verified-fetch/src/verified-fetch.ts

View check run for this annotation

Codecov / codecov/patch

packages/verified-fetch/src/verified-fetch.ts#L211-L212

Added lines #L211 - L212 were not covered by tests
}
}

/**
* Determines the format requested by the client, defaults to `null` if no format is requested.
*
Expand Down
34 changes: 0 additions & 34 deletions packages/verified-fetch/test/get-content-type.spec.ts

This file was deleted.

Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
import { defaultLogger } from '@libp2p/logger'
import { expect } from 'aegir/chai'
import sinon from 'sinon'
import { getStreamAndContentType } from '../src/utils/get-stream-and-content-type.js'
import { getStreamFromAsyncIterable } from '../src/utils/get-stream-from-async-iterable.js'

describe('getStreamAndContentType', () => {
describe('getStreamFromAsyncIterable', () => {
let onProgressSpy: sinon.SinonSpy

beforeEach(() => {
Expand All @@ -12,23 +12,26 @@ describe('getStreamAndContentType', () => {

it('should throw an error if no content is found', async () => {
const iterator = (async function * () { })()
await expect(getStreamAndContentType(iterator, 'test', defaultLogger())).to.be.rejectedWith('No content found')
await expect(getStreamFromAsyncIterable(iterator, 'test', defaultLogger())).to.be.rejectedWith('No content found')
})

it('should return the correct content type and a readable stream', async () => {
const iterator = (async function * () { yield new TextEncoder().encode('Hello, world!') })()
const { contentType, stream } = await getStreamAndContentType(iterator, 'test.txt', defaultLogger(), { onProgress: onProgressSpy })
expect(contentType).to.equal('text/plain')
const chunks = new TextEncoder().encode('Hello, world!')
const iterator = (async function * () { yield chunks })()
const { firstChunk, stream } = await getStreamFromAsyncIterable(iterator, 'test.txt', defaultLogger(), { onProgress: onProgressSpy })
expect(firstChunk).to.equal(chunks)
const reader = stream.getReader()
const { value } = await reader.read()
expect(onProgressSpy.callCount).to.equal(1)
expect(new TextDecoder().decode(value)).to.equal('Hello, world!')
})

it('should handle multiple chunks of data', async () => {
const iterator = (async function * () { yield new TextEncoder().encode('Hello,'); yield new TextEncoder().encode(' world!') })()
const { contentType, stream } = await getStreamAndContentType(iterator, 'test.txt', defaultLogger(), { onProgress: onProgressSpy })
expect(contentType).to.equal('text/plain')
const textEncoder = new TextEncoder()
const chunks = ['Hello,', ' world!'].map((txt) => textEncoder.encode(txt))
const iterator = (async function * () { yield chunks[0]; yield chunks[1] })()
const { firstChunk, stream } = await getStreamFromAsyncIterable(iterator, 'test.txt', defaultLogger(), { onProgress: onProgressSpy })
expect(firstChunk).to.equal(chunks[0])
const reader = stream.getReader()
let result = ''
let chunk
Expand All @@ -42,20 +45,23 @@ describe('getStreamAndContentType', () => {
it('should include last value done is true', async () => {
// if done === true and there is a value
const LIMIT = 5
let actualFirstChunk: Uint8Array
const iterator: AsyncIterable<Uint8Array> = {
[Symbol.asyncIterator] () {
let i = 0
return {
async next () {
const done = i === LIMIT
const value = new Uint8Array([i++])
actualFirstChunk = actualFirstChunk ?? value
return Promise.resolve({ value, done })
}
}
}
}
const { contentType, stream } = await getStreamAndContentType(iterator, 'test.txt', defaultLogger(), { onProgress: onProgressSpy })
expect(contentType).to.equal('text/plain')
const { firstChunk, stream } = await getStreamFromAsyncIterable(iterator, 'test.txt', defaultLogger(), { onProgress: onProgressSpy })
// @ts-expect-error - actualFirstChunk is not used before set, because the await above.
expect(firstChunk).to.equal(actualFirstChunk)
const reader = stream.getReader()
const result = []
let chunk
Expand Down
Loading
Loading