Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: doublehashes can use different hashes #28

Merged
merged 2 commits into from
Oct 27, 2023

Conversation

hsanjuan
Copy link
Collaborator

Currently, we tried to parse double-hashes as CIDv0s, which implies that SHA256 was used to hash the double-hash. Instead we should try to parse a multihash directly, and support any hashing function for the double-hash.

The rule14 test has been changed to use a blake3 double-hash, which previously broke at the parsing level.

Currently, we tried to parse double-hashes as CIDv0s, which implies that
SHA256 was used to hash the double-hash. Instead we should try to parse a
multihash directly, and support any hashing function for the double-hash.

The rule14 test has been changed to use a blake3 double-hash, which previously
broke at the parsing level.
@hsanjuan hsanjuan self-assigned this Oct 24, 2023
denylist.go Outdated
Comment on lines 374 to 387
// attempt to parse a b58btc-encoded multihash
rule = strings.TrimPrefix(rule, "//")
c, err := cid.Decode(rule)
mh, err := multihash.FromB58String(rule)
if err == nil {
prefix := c.Prefix()
if prefix.Version != 0 {
return fmt.Errorf("double-hash is not a raw-multihash (cidv0) (%s:%d)", dl.Filename, number)
dmh, err := multihash.Decode(mh)
if err != nil {
return fmt.Errorf("what appears to be a multihash b58 string cannot be decoded (%s:%d): %w", dl.Filename, number, err)
}
e.Multihash = c.Hash()

e.Multihash = mh
// we use the multihash codec to group double-hashes
// with the same hashing function.
mhType = c.Prefix().MhType
mhType = dmh.Code
} else { // Assume a hex-encoded sha256 string
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lidel there is a problem here...

some of the badbits hex-encoded strings parse just fine as B58 strings, resulting in valid multihashes with weird code, which then make the whole error checking break for anything later.

I assume the other way around also is an issue: some b58 strings could perhaps be hex-decoded... ? If you have thoughts, they are welcome.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating two rules sgtm.

Assume a double-hash could be parsed as both a b58btc string and hex string,
but handle cases when that happens and it's obviously wrong:

- Multihashes of type Identity
- Multihashes of unknown hash functions
- Hex strings of length different than 64 chars
lidel added a commit to ipfs/kubo that referenced this pull request Oct 27, 2023
this ensures we test something other than default sha256

Ref. ipfs-shipyard/nopfs#28
Copy link
Member

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated tests in ipfs/kubo#10161 pass with this, thanks!
I also updated example in the spec added in ipfs/specs#383

@lidel
Copy link
Member

lidel commented Oct 27, 2023

(merging to consolidate changes while working on fix for ipfs/kubo#10161 (comment))

@lidel lidel merged commit 4134efe into master Oct 27, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants