Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datamodel Length() not usable with lazy nodes (e.g. ADLs) #531

Open
aschmahmann opened this issue Jul 6, 2023 · 2 comments
Open

datamodel Length() not usable with lazy nodes (e.g. ADLs) #531

aschmahmann opened this issue Jul 6, 2023 · 2 comments

Comments

@aschmahmann
Copy link

aschmahmann commented Jul 6, 2023

Noticed while discovering ipfs/go-unixfsnode#54

The contract on node.Length() is

// Length returns the length of a list, or the number of entries in a map,
// or -1 if the node is not of list nor map kind.
Length() int64

This means that if there is any sort of lazy loading or computation involved in working with a node that is definitely a list or map that there is nothing valid to return.

  1. Could return -1 but that's a contract violation
  2. Could return some number (e.g. go-unixfsnode's HAMT returns the largest number of entries it can confirm https://github.com/ipfs/go-unixfsnode/blob/364a5496925b291900d67c35b8b898e1a30b7f0a/hamt/shardeddir.go#L262-L289), but that's also a contract violation since the user expects "the number of entries in a map" not "some number <= the number of entries in a map" and the application has no indication that anything has gone wrong.

I suspect the options here are:

  1. Change Length to return an error
  2. Change the contract to allow -1 to mean any type of error
  3. Add a new function (name TBD) that does return an error and deprecate Length

While 2 seems the easiest, it has the ability to break a bunch of code without people noticing. Simple compile-time checked fixes like 1 seem a safer option.

Whether 1 or 3 is preferred seems like a matter of taste.


Note: the other functions on the Node interface which could potentially run into a similar issue are Kind(), IsAbsent(), and IsNull() However, it seems like these are less likely to be lazily constructed than Length and therefore less likely to be a problem.

Similarly, the map and list iterators can return errors once they're in use (and they can be checked in advance with Kind()) and so are also less problematic.

@rvagg
Copy link
Member

rvagg commented Jul 7, 2023

I agree that this is a problem, and it would be good to resolve it without too much of a hack.

I know that during development there was an attempt to reign in the verbosity of the API—it's already very verbose but it could be a lot more by allowing everything to error .. but maybe they should in a world where we're building multi-layered abstractions over these things with complex behaviour underneath.

In terms of options:

  1. Change Length to return an error

I'd prefer this, but it's going to be very painful breaking that contract, with everyone needing to make the jump together. Are we prepared to deal with that pain?

  1. Change the contract to allow -1 to mean any type of error

This could end up being more dangerous; a length is often used for iteration. Silently changing contract on this could leave users exposed to surprising, or damaging, behaviour. It might be best to be noisy about it.

  1. Add a new function (name TBD) that does return an error and deprecate Length

Does this buy us anything that option 1 doesn't give us? It breaks the interface so the same dependency upgrade pain will exist, we just get to wriggle out of having to check for err except that you get deprecation warnings when you don't! So it seems like a worse version of 1.

Another option exists, to ignore the problem and keep on papering over it. The unixfs-preload one is pretty annoying because one way to handle problems like this is to hold on to an error and return it on a later call to an error returning API call, but with unixfs-preload, no more calls to the Node are made so it gets lost entirely!

rvagg added a commit that referenced this issue Jul 7, 2023
Closes: #531

ADL and other Node implementations that need to perform non-trivial work in
calculating their Length(), such as by loading multiple child blocks, may
encounter errors which should not be silently ignored.
@rvagg
Copy link
Member

rvagg commented Jul 7, 2023

#532, just to see what that would look like internally

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants