
child_process IPC is very slow, about 100-10000x slower than I expected (scales with msg size) #3145

Closed
avoidwork opened this issue Oct 1, 2015 · 22 comments
Labels
child_process · cluster · performance

Comments

@avoidwork

Hi,

I'm not sure if my expectations of IPC are unreasonable; if so, please tell me and close this issue. I planned to use a forked child_process to do some background ops for an nwjs app: send roughly 40MB of JSON data to the forked process and get back a plain object describing that data. I measured the timing at roughly 250-300 seconds on a maxed-out 2015 MacBook Pro (sad face); a Worker in Chromium does the same job in 1-2 milliseconds.

I then timed the example from the documentation on my maxed-out MacBook Air (less RAM, slower CPU, fast SSD):

// index.js
var cp = require('child_process'),
    n = cp.fork(__dirname + '/sub.js'),
    precise = require('precise'),
    timer = precise();

n.on('message', function(m) {
  timer.stop();
  console.log('PARENT got message:', m);
  console.log('Message received in', timer.diff() / 1000000, 'ms');
});

timer.start();
n.send({ hello: 'world' });

// sub.js
process.on('message', function(m) {
  console.log('CHILD got message:', m);
});

process.send({ foo: 'bar' });

// Console output:
// PARENT got message: { foo: 'bar' }
// CHILD got message: { hello: 'world' }
// Message received in 94.963261 ms

In both hardware scenarios, a simple small text message takes 90-100ms. Skipping the console writes saves roughly 5-10ms in the provided example.

@bnoordhuis
Member

The biggest reason Chromium is faster is that worker and main thread run in the same process, whereas node.js sends the message over a socket to the worker. Using shared memory has been discussed in the past but it's fraught with platform-specific perils.

The other reason is that it employs a fairly efficient binary-ish serializer/deserializer (blink::SerializedScriptValueFactory), whereas node.js uses plain JSON. We can make some strides there, although we'd probably end up with a ton of dependencies if we lifted it from Blink wholesale. I'll look into that when I have time.
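For a rough sense of the serialization cost alone, here is a minimal sketch (mine, not Node's internals; the ~40MB payload size mirrors the original report) of the JSON round trip every IPC message currently pays:

// json-cost.js — measure stringify + parse for a large payload
const payload = { data: 'x'.repeat(40 * 1024 * 1024) }; // ~40MB, as in the report

const start = process.hrtime();
const wire = JSON.stringify(payload); // what the sending side does
const copy = JSON.parse(wire);        // what the receiving side does
const [s, ns] = process.hrtime(start);
console.log('JSON round trip:', (s * 1e3 + ns / 1e6).toFixed(1), 'ms for', wire.length, 'bytes');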

@bnoordhuis bnoordhuis added cluster Issues and PRs related to the cluster subsystem. child_process Issues and PRs related to the child_process subsystem. labels Oct 1, 2015
@avoidwork
Author

Thanks, I appreciate that! 👍

@Rush

Rush commented Oct 16, 2015

@bnoordhuis Even with all the IPC overhead, do you think 100ms is reasonable? I think it shows something is terribly wrong, either with the test or with node. To be honest, I have used child_process messaging a lot in some projects and I never noticed such huge delays.

@avoidwork
Author

The latency manifests in multiple versions of node; I tested 0.12.x and 4.1.0 when I filed this bug. I also first ran the code in nwjs 0.11.6, which I believe uses one of the io.js 1.5.x versions.

It's consistent across machines and versions of node/io.js; have you timed your messages? You can take the sample test above to validate the issue; precise just puts process.hrtime into an object to hold state.
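(Roughly, a sketch of the idea behind precise — not its actual source:)

// what the timer boils down to: hold an hrtime start value, diff on stop
function precise() {
  let start, stopped;
  return {
    start() { start = process.hrtime(); return this; },
    stop() { stopped = process.hrtime(start); return this; },
    diff() { return stopped[0] * 1e9 + stopped[1]; } // nanoseconds
  };
}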

@Rush

Rush commented Oct 16, 2015

Obviously your test is wrong, on several counts:

  • You are including child_process startup time in your measurement
  • The child sends its message from the global scope, not from the message event handler
  • Run repeatedly, the code gets optimized by V8, so results keep improving
// index.js
var cp = require('child_process');
var n = cp.fork(__dirname + '/sub.js');
var precise = require('precise');
var assert = require('assert');

function runTest() {
  var timer = precise();

  var expectingId = Math.floor(Math.random() * 1000);
  console.log('Sending', {id: expectingId});
  timer.start();
  n.send({ id: expectingId });

  n.once('message', function(m) {
    timer.stop();
    console.log('Message received in', timer.diff() / 1000000, 'ms');
    console.log('PARENT got message back:', m);
    assert(m.id == expectingId);
    setTimeout(runTest, 500);
  });
}
runTest();

// sub.js
process.on('message', function(m) {
  process.send({ id: m.id });
});

My results:

~/node-test ❯ node index.js
Sending { id: 838 }
Message received in 52.125594 ms
PARENT got message back: { id: 838 }
Sending { id: 729 }
Message received in 0.509135 ms
PARENT got message back: { id: 729 }
Sending { id: 524 }
Message received in 0.122227 ms
PARENT got message back: { id: 524 }
Sending { id: 15 }
Message received in 0.141159 ms
PARENT got message back: { id: 15 }
Sending { id: 85 }
Message received in 0.122186 ms
PARENT got message back: { id: 85 }
Sending { id: 880 }
Message received in 0.145977 ms
PARENT got message back: { id: 880 }
Sending { id: 367 }
Message received in 0.143486 ms
PARENT got message back: { id: 367 }

Sub-millisecond performance is pretty good, and in my opinion this issue can be closed. I used child_process for caching in a cluster environment and I know it works well.

@Rush

Rush commented Oct 16, 2015

Also, if you want to use it in a web-worker fashion, keep your child process alive, or maintain a pool of child processes, to avoid paying the startup time on every message.
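A minimal sketch of that pool idea (my own names and pool size; sub.js is the echo child from the example above):

// pool.js — round-robin over pre-forked, warm children
const { fork } = require('child_process');

const POOL_SIZE = 2;
const pool = Array.from({ length: POOL_SIZE }, () => fork(__dirname + '/sub.js'));
let next = 0;

function send(msg, onReply) {
  const child = pool[next];          // pick a warm child
  next = (next + 1) % POOL_SIZE;
  child.once('message', onReply);    // assumes one reply per request
  child.send(msg);
}

send({ id: 1 }, (m) => console.log('reply:', m));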

@avoidwork
Author

As I stated earlier, that is the example from the documentation. Putting process.hrtime() in front of the send is valid; I'm testing the entire thing.

You're assuming multiple messages, and that's wrong. In the browser I can and do hold a warmed-up blob ready to go; that's impossible in this case 👍

MacBook Air results:

jason@avoidwork:~/Projects$ node test
Sending { id: 793 }
Message received in 85.733802 ms
PARENT got message back: { id: 793 }
Sending { id: 91 }
Message received in 0.904732 ms
PARENT got message back: { id: 91 }
Sending { id: 538 }
Message received in 0.268293 ms
PARENT got message back: { id: 538 }
Sending { id: 692 }
Message received in 0.228595 ms
PARENT got message back: { id: 692 }

@Rush

Rush commented Oct 16, 2015

What is wrong is your statement that "child_process IPC is very slow". Also, I am not assuming anything; I have merely provided an example showing that IPC performs very well, which disproves your statement. Should you wish to start a new process for every single message, that's your choice, but you need to know that the cost is very high.

@avoidwork
Author

lol, geez buddy... seriously.

The initial message size directly impacts the startup time; that's the bug. You're nitpicking phrasing that only seems to bother you.

@avoidwork
Author

For context, if the startup always took 80-100ms, I wouldn't have filed this bug, because that's consistent and I can work with it.

@Rush

Rush commented Oct 16, 2015

Even sending messages several MB long, I get sub-millisecond performance. IPC is super good. The startup time is something you cannot optimize away, simply because node is a big program. Don't be mad at reality. Keep a process in the background that is ready to serve your messages; just rework your code and you sacrifice nothing but RAM.

@avoidwork
Author

Your last comment confirms the bug, so I don't understand the point you're trying to make. Yes, the JIT has to warm up; yes, there is a cold-start time; I'm not arguing any of that.

  • I cannot have a process sitting around for a single message, eating up 24MB of RAM (heap)
  • I want a fast first message, but I'm willing to accept it taking more time than a Chromium worker, due to the different methodology

None of this changes the ticket, or the fact that it's consistent. I fail to see how you think it's acceptable for the latency to rival a round trip across North America when the message never leaves the machine.

What you consider 'super good' is only 'super good' under your conditions, and there's a whole world out there where your conditions are irrelevant, much like how you're dismissing my use case as irrelevant because it's not an issue you're running into.

Simple fact: IPC takes 100x longer than a worker (we both confirmed this), and the initial response scales linearly in time with the initial message body (the bug).

@ChALkeR ChALkeR added the performance Issues and PRs related to the performance of Node.js. label Feb 16, 2016
@Fishrock123
Contributor

Any progress here? I assume this is still an issue?

@andrasq

andrasq commented Jul 8, 2016

I too get 30ms for the first message, but after subtracting out the child_process fork and startup time (by either pausing, or having the child send an "I've started" message first), the IPC itself is under 3ms. Subsequent messages are under 1ms.

Sending a long message, however, is different. Messages up to 400kb still take just 2ms, but 4MB takes 177ms and 40MB takes 11,000ms, which looks O(n^2) in the message length. The same spike in latency occurs for both long strings and arrays of numbers. Timed with node-6.2.1.
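A minimal sketch of that handshake (my own message names): the child announces readiness first, and only then does the parent start the timer, so fork/startup cost is excluded.

// parent.js — time the IPC round trip only
const { fork } = require('child_process');
const child = fork(__dirname + '/sub.js');

child.once('message', () => {        // child says it has started
  const start = process.hrtime();
  child.once('message', () => {      // the reply we are actually timing
    const [s, ns] = process.hrtime(start);
    console.log('IPC round trip:', s * 1e3 + ns / 1e6, 'ms');
    child.disconnect();
  });
  child.send({ ping: true });
});

// sub.js — announce readiness, then echo
process.send({ ready: true });
process.on('message', () => process.send({ pong: true }));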

@rahatarmanahmed
Contributor

Can confirm @andrasq's O(n^2) claim in v7.4.0:

// index.js
const fork = require('child_process').fork
const child = fork('./child.js', process.argv.slice(2))

child.once('message', m => {
    console.time('time')
    child.once('message', m => console.timeEnd('time'))
})
// child.js
const size = 2 ** parseInt(process.argv[2])
console.log(`Sending ${size} byte buffer over IPC`)
const b = Buffer.allocUnsafe(size)
process.send('start')
process.send(b)

$ for i in `seq 2 25`; do node index.js $i; done;
Sending 4 byte buffer over IPC
time: 0.502ms

Sending 8 byte buffer over IPC
time: 0.422ms

Sending 16 byte buffer over IPC
time: 0.455ms

Sending 32 byte buffer over IPC
time: 0.426ms

Sending 64 byte buffer over IPC
time: 0.425ms

Sending 128 byte buffer over IPC
time: 0.438ms

Sending 256 byte buffer over IPC
time: 0.623ms

Sending 512 byte buffer over IPC
time: 0.493ms

Sending 1024 byte buffer over IPC
time: 0.516ms

Sending 2048 byte buffer over IPC
time: 0.622ms

Sending 4096 byte buffer over IPC
time: 0.996ms

Sending 8192 byte buffer over IPC
time: 1.698ms

Sending 16384 byte buffer over IPC
time: 4.633ms

Sending 32768 byte buffer over IPC
time: 10.535ms

Sending 65536 byte buffer over IPC
time: 40.233ms

Sending 131072 byte buffer over IPC
time: 64.867ms

Sending 262144 byte buffer over IPC
time: 144.252ms

Sending 524288 byte buffer over IPC
time: 324.077ms

Sending 1048576 byte buffer over IPC
time: 725.892ms

Sending 2097152 byte buffer over IPC
time: 1835.453ms

Sending 4194304 byte buffer over IPC
time: 5148.031ms

Sending 8388608 byte buffer over IPC
time: 15828.809ms

Sending 16777216 byte buffer over IPC
time: 60626.902ms

Sending 33554432 byte buffer over IPC
time: 222937.077ms

Not sure how @Rush was able to send several-MB messages in sub-millisecond time.

I don't know why IPC would grow O(n^2); is there something inherent about IPC that would cause this?

@addaleax
Member

addaleax commented Jan 6, 2017

I don't know why IPC would grow O(n^2), is there something inherent about IPC that would cause this?

I’m not sure how this relates to the original report here, but I heard O(n^2) and my mind jumped to #10557 – so I’m pretty sure that particular problem is getting fixed right now! 🎉
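For intuition, here is a simplified sketch of the quadratic pattern that #10557 removes (my own code, not the real Node internals): rescanning the whole accumulated buffer for a message delimiter on every incoming chunk does O(n^2) total work, while resuming the search where the last one stopped is linear.

function handle(msg) { /* deliver one complete message */ }

let buf = '';

// Quadratic: each chunk rescans the entire buffer from index 0,
// even the bytes already known to contain no delimiter.
function onDataSlow(chunk) {
  buf += chunk;
  let i;
  while ((i = buf.indexOf('\n')) !== -1) {
    handle(buf.slice(0, i));
    buf = buf.slice(i + 1);
  }
}

// Linear: remember where the previous failed search ended,
// so every byte is scanned only once.
let searchFrom = 0;
function onDataFast(chunk) {
  buf += chunk;
  let i;
  while ((i = buf.indexOf('\n', searchFrom)) !== -1) {
    handle(buf.slice(0, i));
    buf = buf.slice(i + 1);
    searchFrom = 0;
  }
  searchFrom = buf.length; // no delimiter yet; skip these bytes next time
}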

@rahatarmanahmed
Contributor

@addaleax Nice! I compiled that branch and the performance does seem to scale linearly now! (I probably shouldn't have tested with powers of 2, though.)

$ for i in `seq 2 25`; do ~/git/node/out/Release/node index.js $i; done;
Sending 4 byte buffer over IPC
time: 0.620ms

Sending 8 byte buffer over IPC
time: 0.448ms

Sending 16 byte buffer over IPC
time: 0.454ms

Sending 32 byte buffer over IPC
time: 0.452ms

Sending 64 byte buffer over IPC
time: 0.453ms

Sending 128 byte buffer over IPC
time: 0.638ms

Sending 256 byte buffer over IPC
time: 0.474ms

Sending 512 byte buffer over IPC
time: 0.500ms

Sending 1024 byte buffer over IPC
time: 0.554ms

Sending 2048 byte buffer over IPC
time: 0.648ms

Sending 4096 byte buffer over IPC
time: 0.803ms

Sending 8192 byte buffer over IPC
time: 1.163ms

Sending 16384 byte buffer over IPC
time: 3.606ms

Sending 32768 byte buffer over IPC
time: 8.794ms

Sending 65536 byte buffer over IPC
time: 21.187ms

Sending 131072 byte buffer over IPC
time: 59.570ms

Sending 262144 byte buffer over IPC
time: 118.836ms

Sending 524288 byte buffer over IPC
time: 221.794ms

Sending 1048576 byte buffer over IPC
time: 428.668ms

Sending 2097152 byte buffer over IPC
time: 853.978ms

Sending 4194304 byte buffer over IPC
time: 1724.333ms

Sending 8388608 byte buffer over IPC
time: 3457.981ms

Sending 16777216 byte buffer over IPC
time: 6980.072ms

Sending 33554432 byte buffer over IPC
time: 17186.927ms

@avoidwork
Author

thank you!

italoacasas pushed a commit to italoacasas/node that referenced this issue on Jan 18, Jan 23, Jan 25, and Jan 27, 2017 (same squashed commit as below)
addaleax pushed a commit that referenced this issue May 9, 2017
Squashed from:
- child_process: stop indexOf() on whole IPC buffer
- child_process: get rid of forEach() and slice() in IPC
- child_process: get rid of another forEach() in IPC

Fixes: #3145
PR-URL: #10557
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
andrew749 pushed a commit to michielbaird/node that referenced this issue Jul 19, 2017 (same squashed commit as above)
@wagenaartje

I'm still experiencing this first-time message delay. Any updates on this?

@bnoordhuis
Member

@wagenaartje If your copy of node is up-to-date, please file a new issue and include steps to reproduce.

@davalapar

I'm on Node 10 and I also encountered this, but what I noticed is that only the initial messages get a latency over 100ms. Subsequent messages eventually reach microsecond-scale latencies (which are good enough for most IPC tasks).

Initial messages (sent as soon as the worker is initialized):

[ 0, 227980396 ] 228 'ms'
[ 0, 226999916 ] 227 'ms'
[ 0, 229141968 ] 229 'ms'
[ 0, 251178945 ] 251 'ms'

Subsequent messages (in this example, sent every 500ms):

[ 0, 198992 ] 0 'ms'
[ 0, 245317 ] 0 'ms'
[ 0, 215574 ] 0 'ms'
[ 0, 390349 ] 0 'ms'
[ 0, 299803 ] 0 'ms'

The second value in each array is nanoseconds, from the process.hrtime API.
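(For reference, this is how those [seconds, nanoseconds] tuples convert to the quoted millisecond figures — a small sketch, not davalapar's test code:)

const start = process.hrtime();
setTimeout(() => {
  const [s, ns] = process.hrtime(start);             // e.g. [ 0, 227980396 ]
  console.log(Math.round(s * 1e3 + ns / 1e6), 'ms'); // -> 228 'ms'
}, 228);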

@yokyj

yokyj commented Mar 20, 2019

(Quoting @rahatarmanahmed's benchmark above, from 4 bytes at 0.620ms up to 33554432 bytes at 17186.927ms.)

Don't you think it's still slow? Sending a 1MB object costs 0.5 seconds; that's even worse than HTTP.
