
fix: error event propagation #233

Merged
2 commits merged into master from fix/error-handling on May 4, 2018

Conversation

hugomrdias
Member

When using the Node.js EventEmitter, error events should always be handled and propagated. I'm doing a fix on the ipfs cli related to issue ipfs/js-ipfs#1180 and couldn't catch errors coming from ipfsd; this PR fixes that.
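
For context, a minimal sketch (plain EventEmitter, not the actual ipfsd-ctl code) of the kind of propagation being described: re-emitting an inner emitter's 'error' event on an outer object so callers can handle it.

const EventEmitter = require('events')

// Sketch only: a wrapper that forwards 'error' events from an inner
// emitter so consumers of the wrapper can listen for them.
class Wrapper extends EventEmitter {
  constructor (inner) {
    super()
    // Without this forwarding, an 'error' emitted on `inner` with no
    // listener attached would be thrown by Node and crash the process.
    inner.on('error', (err) => this.emit('error', err))
  }
}

const inner = new EventEmitter()
const wrapper = new Wrapper(inner)
wrapper.on('error', (err) => console.error('caught:', err.message))
inner.emit('error', new Error('boom')) // prints "caught: boom"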

@hugomrdias hugomrdias added the kind/bug A bug in existing code (including security flaws) label Apr 27, 2018
@hugomrdias hugomrdias self-assigned this Apr 27, 2018
@ghost ghost added the status/in-progress In progress label Apr 27, 2018

@dryajov dryajov left a comment

LGTM. Although CI is failing, I don't think it's related to these changes, but to the recent changes made with class-is, which I believe require releasing js-ipfs and js-ipfs-api.

@@ -5,6 +5,8 @@ yarn.lock
**/*.log
test/repo-tests*

.vscode
.eslintrc
Member

Is there any reason we need to ignore the .eslintrc file if it's not being used? Usually aegir takes care of all our configs, linting included.

Member Author

Editors need to find an eslint config to provide linter messages and fix problems. Because the config is hidden inside aegir > eslint-config-aegir > .eslintrc, the editor can't find it.
Because of this, I need to manually add this file in every repo to have a normal ctrl-s workflow.
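
For illustration only (this PR just gitignores the file; its contents are not part of the diff), such a local editor config might be as small as an .eslintrc.js along these lines:

// Hypothetical local .eslintrc.js, not tracked by git. Assumes ESLint can
// resolve the shareable config: `extends: 'aegir'` maps to the
// eslint-config-aegir package that aegir itself depends on.
module.exports = {
  extends: 'aegir'
}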

Member

Editors will usually allow pointing their linting tools to custom locations, and with many contributors/editors being used I don't think it would be practical to add a custom entry for each to the .gitignore file.

I don't know what our current guidelines in regards to .gitignore are, maybe @diasdavid or @victorbjelkholm can provide more context.

Member Author

This works for all editors because this is all based on how eslint works, nothing fancy here 😄

@hugomrdias
Member Author

hugomrdias commented Apr 30, 2018

Yes, tests fail because of ipfs. With ipfs master linked locally, everything is green 👍.

@daviddias
Member

@travisperson wanna review here? Seems that we have to release this one with CI failing.

travisperson
travisperson previously approved these changes May 1, 2018

@travisperson travisperson left a comment

Ya this LGTM

@@ -130,6 +130,7 @@ class FactoryInProc {
}

const node = new Node(options)
node.once('error', err => callback(err, node))

@Mr0grog Mr0grog May 2, 2018

If a node emits an error after starting, won’t this result in spawn’s callback getting called more than once? (First with a success, then, later, with an error.)
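
To make the concern concrete, a stripped-down sketch (plain EventEmitter and a hypothetical spawnLike helper, not the real FactoryInProc code) of how the same callback can fire twice:

const EventEmitter = require('events')

// spawnLike is a made-up stand-in for spawn: it reports success once,
// but the once('error') handler stays attached afterwards.
function spawnLike (callback) {
  const node = new EventEmitter()
  node.once('error', (err) => callback(err, node))
  setImmediate(() => callback(null, node)) // pretend startup succeeded
  return node
}

const node = spawnLike((err) => {
  console.log('spawn callback, err =', err && err.message)
})

// A later runtime error re-triggers the same callback:
setTimeout(() => node.emit('error', new Error('late failure')), 10)
// Prints:
//   spawn callback, err = null
//   spawn callback, err = late failure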

Member Author

Shouldn't happen, but you're right, the error should come first. Good catch, thank you.

@daviddias daviddias merged commit db3dd68 into master May 4, 2018
@ghost ghost removed the status/in-progress In progress label May 4, 2018
@daviddias daviddias deleted the fix/error-handling branch May 4, 2018 10:58
@Mr0grog

Mr0grog commented May 4, 2018

That didn’t actually fix the problem! Try this:

const IPFSFactory = require('.')
const factory = IPFSFactory.create({type: 'proc', exec: require('ipfs')})

factory.spawn({disposable:true}, (error, daemon) => {
  // This callback should only ever execute once, but instead, it’s about to happen twice.
  console.log('Daemon spawned!')
  if (error) {
    return console.error('Spawn error:', error)
  }
  
  // Do an operation, just to make sure we're working
  daemon.api.id((error, id) => {
    if (error) {
      console.error('Could not get ID!')
    }
    else {
      console.log('Daemon ID:', id.id)
    }
    
    // Do something to make stopping fail (this is arbitrary, but let's say
    // anything *could* happen to put the daemon in a bad state)
    daemon.exec._repo.close(error => {
      console.log('Repo prematurely closed')
      
      // And finally, stop the daemon
      console.log('Stopping daemon...')
      daemon.stop(error => console.log('Daemon stopped:', error || 'no error'))
    })
  })
})

That should print:

Swarm listening on /ip4/127.0.0.1/tcp/59666/ipfs/QmbDGaByZoomn3NQQjZzwaPWH6ei3ptzWK7a7ECtS35DKL
Daemon spawned!
Daemon ID: QmbDGaByZoomn3NQQjZzwaPWH6ei3ptzWK7a7ECtS35DKL
Repo prematurely closed
Stopping daemon...
Daemon stopped: Error: repo is already closed
    [...stack trace removed...]

But instead, you get:

Swarm listening on /ip4/127.0.0.1/tcp/59666/ipfs/QmbDGaByZoomn3NQQjZzwaPWH6ei3ptzWK7a7ECtS35DKL
Daemon spawned!
Daemon ID: QmbDGaByZoomn3NQQjZzwaPWH6ei3ptzWK7a7ECtS35DKL
Repo prematurely closed
Stopping daemon...
Daemon spawned!
Spawn error: Error: repo is already closed
    [...stack trace removed...]
Daemon stopped: Error: repo is already closed
    [...stack trace removed...]

Note the repeated Daemon spawned! logs.

@dryajov
Member

dryajov commented May 4, 2018

That's a great catch @Mr0grog and a pretty tricky scenario.

I see two options:

  1. Remove the error handler and wrap in a try/catch:
try {
    const node = new Node(options)

    series([
      (cb) => node.once('ready', cb),
      (cb) => node.repo._isInitialized(err => {
        // if err exists, repo failed to find config or the ipfs-repo package
        // version is different than that of the existing repo.
        node.initialized = !err
        cb()
      }),
      (cb) => options.init
        ? node.init(cb)
        : cb(),
      (cb) => options.start
        ? node.start(options.args, cb)
        : cb()
    ], (err) => callback(err, node))
} catch (err) {
  return callback(err)
}

Wrapping the initialization in a try/catch should take care of all possible error scenarios, but... See next case.

  2. Remove the error handler once series is done:
    const node = new Node(options)
    const errHandler = (err) => callback(err, node) 
    node.once('error', errHandler)

    series([
      (cb) => node.once('ready', cb),
      (cb) => node.repo._isInitialized(err => {
        // if err exists, repo failed to find config or the ipfs-repo package
        // version is different than that of the existing repo.
        node.initialized = !err
        cb()
      }),
      (cb) => options.init
        ? node.init(cb)
        : cb(),
      (cb) => options.start
        ? node.start(options.args, cb)
        : cb()
    ], (err) => {
      node.removeListener('error', errHandler)
      callback(err, node)
     })

This would be the recommended approach (case 1 can be seen as an anti-pattern of sorts), but it seems like there could still be cases (though most likely legitimate bugs in ipfs) where an error is thrown but initialization/startup is not aborted correctly, so spawn's callback would again be called twice.

There might be a better approach, so would love to hear suggestions.
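
For illustration only (not proposed verbatim in this thread), wrapping the callback in a once-only guard would make a second call impossible regardless of which path reports the error; the standalone once module on npm provides the same behaviour.

// Sketch of a once-only guard; `done` can safely be handed to both the
// error handler and the series callback.
function onceOnly (fn) {
  let called = false
  return (...args) => {
    if (called) return
    called = true
    fn(...args)
  }
}

// Hypothetical usage:
// const done = onceOnly(callback)
// node.once('error', (err) => done(err, node))
// series([...], (err) => done(err, node))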

@dryajov
Member

dryajov commented May 4, 2018

BTW, if no error handler is registered, EventEmitter handles it as a special case and throws the error, see https://nodejs.org/api/events.html#events_error_events. I think this is pretty safe in our case, because we control the code paths and handle the error with a proper try/catch.
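
A quick demonstration of that documented behaviour (emit is synchronous, so the throw from an unhandled 'error' event can be caught with an ordinary try/catch):

const EventEmitter = require('events')

const ee = new EventEmitter()
try {
  ee.emit('error', new Error('unhandled')) // no listener registered: throws
} catch (err) {
  console.log('emit threw:', err.message) // "emit threw: unhandled"
}

ee.on('error', (err) => console.log('handled:', err.message))
ee.emit('error', new Error('handled now')) // "handled: handled now"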

@Mr0grog

Mr0grog commented May 4, 2018

Yeah, I would agree that approach 2 is probably the right one.

Some other thoughts that affect this code but are probably out of scope:

  • It might be nice if the new Node(options) statement was wrapped in a try/catch, so a caller only has to worry about the callback and not also wrap factory.spawn() in a try/catch themselves. That would be a change from the current behavior, though.
  • It might also be nice if the error handler on node.exec that propagates the error up to node wasn’t a once handler, so users can reliably listen for error events on node (since some folks are certainly asking for a future where a single error does not shut down a node).

@dryajov
Member

dryajov commented May 4, 2018

> It might be nice if the new Node(options) statement was wrapped in a try/catch, so a caller only has to worry about the callback and not also wrap factory.spawn() in a try/catch themselves. That would be a change from the current behavior, though.

If this is happening, it should be considered a bug. But things being as they are, and error handling being a bit of a mix (callbacks, EventEmitter, and possibly throws in some third-party code), it might not be a bad idea, though callbacks in particular should not affect this code path, since we're newing the in-process nodes.

> It might also be nice if the error handler on node.exec that propagates the error up to node wasn’t a once handler, so users can reliably listen for error events on node...

Hmm... Not sure if I agree... Error events, especially in the EventEmitter sense, are exceptions. If the exception was not handled by the library/function/method internally, it should terminate the process (albeit in a graceful manner), but I would not try to recover from such errors. So, an error should only be thrown when recovery is not possible. Another way of looking at it: using exceptions to control flow is mostly wrong.

> (since some folks are certainly asking for a future where a single error does not shut down a node)

I think in cases where users want to recover from the error, it should not be thrown as an error in the first place, but handled internally by ipfs and reported as a warning or logged somewhere, etc...

@Mr0grog

Mr0grog commented May 4, 2018

> If this is happening, it should be considered a bug… though callbacks in particular should not affect this code path, since we're newing the in-process nodes.

Hmmm, I’m not sure I follow. new IPFS() is designed to throw. If I did this:

factory.spawn({disposable: true, EXPERIMENTAL: {pubsub: 'hello!'}}, (error, node) => {
  // this never runs because factory.spawn throws an exception synchronously
})

that throws (yes it’s dumb, but still, it’s a thing that could happen). I’m not sure it’s reasonable to expect that this library will be so perfectly locked to whatever version of js-ipfs you have installed that it can knowingly prevent every possible error case. You could expect that, in which case this would be a bug, but that seems like a pretty hard-to-support expectation in reality.

> Hmm... Not sure if I agree... Error events, especially in the EventEmitter sense, are exceptions. If the exception was not handled by the library/function/method internally, it should terminate the process (albeit in a graceful manner), but I would not try to recover from such errors.

Totally! I’m not suggesting stopping that — I’m suggesting making the same facility available to a user of ipfsd as to a user of js-ipfs — that you can always listen for the error event (and if you don’t listen, it would still throw, as it always has). Maybe think of it this way:

factory.spawn({disposable: true}, (error, node) => {
  if (error) return console.error('uhoh:', error)

  // Should this be something you can do? All I was suggesting was yes :)
  // If you don’t subscribe, it would still throw — node.exec’s `error` is
  // handled, but it’s only handled by forwarding to node’s `error`, which
  // would still throw if a user doesn’t explicitly do the following line:
  node.on('error', error => {
    // do whatever you want here, same as you might with an IPFS node you
    // created via `new IPFS()`
  })
})

> it should not be thrown as an error in the first place, but handled internally by ipfs and reported as a warning or logged somewhere, etc...

Also agree! 😄 I should have linked this related conversation where Victor suggested the same. But now I think we are getting way off topic from this issue.

@dryajov
Member

dryajov commented May 4, 2018

> Hmmm, I’m not sure I follow. new IPFS() is designed to throw. If I did this:

I'm actually agreeing 👍 sorry for not being clear: we should wrap new Node(...) in a try/catch. We should also standardize on our error reporting strategy. We seem to rely on EventEmitter to propagate errors, so we should stick to it; a perfect example of the current inconsistency is the joi config validation, where some errors get reported through .emit('error') (and throw if no handler is registered) while the joi validation itself seems to throw exclusively.

> Totally! I’m not suggesting stopping that — I’m suggesting making the same facility available to a user of ipfsd as to a user of js-ipfs — that you can always listen for the error event (and if you don’t listen, it would still throw, as it always has)

I was referring to this earlier point:

> It might also be nice if the error handler on node.exec that propagates the error up to node wasn’t a once handler, so users can reliably listen for error events on node

I don't think we should change the .once for an .on, which is what I believe you're suggesting, but rather treat all errors as fatal and propagate only once, which is what we do already - https://github.com/ipfs/js-ipfsd-ctl/blob/master/src/ipfsd-in-proc.js#L75. But I'm not totally opposed to it, though I still stand by the point on all errors being fatal 😄
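
To make the trade-off concrete, a small sketch with generic emitters (not the actual ipfsd-in-proc code): once forwards only the first error, while on keeps forwarding so callers can keep listening on the outer emitter.

const EventEmitter = require('events')

// Forward 'error' events from `inner` to `outer` using either .once or .on.
function forward (inner, outer, method) {
  inner[method]('error', (err) => outer.emit('error', err))
}

const innerA = new EventEmitter()
const outerA = new EventEmitter()
forward(innerA, outerA, 'once')
outerA.on('error', (err) => console.log('outerA saw:', err.message))
innerA.emit('error', new Error('first'))  // forwarded
innerA.on('error', () => {})              // swallow so the next emit won't throw
innerA.emit('error', new Error('second')) // NOT forwarded

const innerB = new EventEmitter()
const outerB = new EventEmitter()
forward(innerB, outerB, 'on')
outerB.on('error', (err) => console.log('outerB saw:', err.message))
innerB.emit('error', new Error('first'))  // forwarded
innerB.emit('error', new Error('second')) // forwarded as well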


I think we have several actionable items from this discussion (feel free to add more):

  • Fix calling spawn twice on error, see - fix: error event propagation #233 (comment)
  • Figure out what our error propagation strategy is when creating js-ipfs nodes and fix all that don't conform.
  • Figure out which errors are fatal when creating a js-ipfs node and which aren't - in which case they should not be thrown as errors.

@Mr0grog

Mr0grog commented May 4, 2018

👍 to pretty much all of that!

> Figure out which errors are fatal and which aren't - in which case they should not be thrown as errors.

I think this one is purely a js-ipfs matter and not anything to do with js-ipfsd-ctl, right?

@dryajov
Member

dryajov commented May 4, 2018

@Mr0grog awesome! Yes, 2nd and 3rd issues are js-ipfs specific 👍
