repeated decompression slower after upgrading from node12 #43044

Closed

airtable-keyhanvakil opened this issue May 10, 2022 · 4 comments

@airtable-keyhanvakil (Contributor) commented May 10, 2022

Version

v16.15.0

Platform

Linux ip-172-33-6-102 5.11.0-1027-aws #30~20.04.1-Ubuntu SMP Thu Jan 13 11:46:53 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Subsystem

No response

What steps will reproduce the bug?

  1. Save the following file as repro.js:
const zlib = require('zlib');

function makeBigObject(leaves, depth) {
    if (depth === 0) {
        return "howdy";
    } else {
        const ret = {};
        for (let i = 0; i < leaves; ++i) {
            ret[i] = makeBigObject(leaves, depth - 1);
        }
        return ret;
    }
}

console.log('make large object to emulate large heap');
const o = makeBigObject(20000, 2);

const x = [];
for (let i = 0; i < 100000; i++) {
    x.push({r: Math.random(), v: Math.random() > 0.5});
}
const y = JSON.stringify(x);
const z = zlib.gzipSync(y).toString('base64');

console.log('allocate external memory');
// Each Buffer lives outside the JS heap: 10 x 1 GB = ~10 GB of "external" memory.
const a = [];
for (let i = 0; i < 10; i++) a.push(Buffer.alloc(1_000_000_000));

let t = 0;
console.log('start benchmark');
console.time('decompression');
for (let i = 0; i < 10; i++) {
    const valAsBuffer = zlib.gunzipSync(Buffer.from(z, 'base64'));
    t += valAsBuffer.toString('utf8').length;
    if (t > 1e30) {
        console.log(o, x, a); // force these to remain in memory
    }
}
console.timeEnd('decompression');

  2. Run it with different versions of node (particularly comparing v12 vs v16).

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior?

Performance should be comparable to node 12's, not roughly 7x worse.

What do you see instead?

$ node-v12.20.1 --max_old_space_size=5000 repro.js
decompression: 450.166ms
$ node-v14.19.2 --max_old_space_size=5000 repro.js
decompression: 3.102s
$ node-v16.15.0 --max_old_space_size=5000 repro.js
decompression: 3.123s
$ node-v18.1.0 --max_old_space_size=5000 repro.js
decompression: 3.267s

When I run with --trace-gc, I see several log lines like the following, citing "external memory pressure", which do not appear with earlier versions:

[1542056:0x68b2e00]    21008 ms: Mark-sweep 3460.9 (3615.0) -> 3460.9 (3615.0) MB, 3017.3 / 0.0 ms  (+ 0.8 ms in 2 steps since start of marking, biggest step 0.8 ms, walltime since start of marking 3040 ms) (average mu = 0.543, current mu = 0.009) finalize incremental marking via stack guard; GC in old space requested
[1542056:0x68b2e00]    21022 ms: Scavenge 3460.9 (3615.0) -> 3460.9 (3615.0) MB, 7.0 / 0.0 ms  (average mu = 0.543, current mu = 0.009) external memory pressure; 
[1542056:0x68b2e00]    21030 ms: Scavenge 3460.9 (3615.0) -> 3460.9 (3615.0) MB, 7.4 / 0.0 ms  (average mu = 0.543, current mu = 0.009) external memory pressure; 
[1542056:0x68b2e00]    21039 ms: Scavenge 3460.9 (3615.0) -> 3460.9 (3615.0) MB, 7.6 / 0.0 ms  (average mu = 0.543, current mu = 0.009) external memory pressure; 

The low mutator utilization ("mu" in the trace) suggests that the GC is the problem.

These GC pauses get even longer as you increase the size of the heap (by changing the makeBigObject arguments), and the benchmark slows down further.
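
For reference, a trace like the one above can be produced by adding --trace-gc to the benchmark commands above:

$ node-v16.15.0 --trace-gc --max_old_space_size=5000 repro.js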

Additional information

No response

@bnoordhuis (Member) commented May 10, 2022

Thanks for the report. I would probably not qualify this as a bug. The garbage collector tries to strike a CPU/memory balance that's arguably right for most programs but not for the particular setup you picked (large heap, short-lived script).

The GC activates when the increase in external memory (the zlib buffers) exceeds a threshold that's relative to the size of the JS heap. The specifics don't matter too much but it's a heuristic to stop external memory from ballooning.
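
To make "external memory" concrete, here is a minimal sketch (not from this thread) that watches V8's external-memory counter via the standard process.memoryUsage():

const zlib = require('zlib');

// Compressing 50 MB of zeros yields a tiny payload; decompressing it
// allocates a large Buffer outside the JS heap ("external" memory).
const payload = zlib.gzipSync(Buffer.alloc(50_000_000));

console.log('external before:', process.memoryUsage().external);
const out = zlib.gunzipSync(payload);
console.log('external after: ', process.memoryUsage().external, out.length);

// A rapid rise in this counter, relative to the JS heap size, is what
// triggers the "external memory pressure" GCs seen in the trace above.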

Worth mentioning: V8 tweaked this heuristic based on feedback from us; that's why you see a difference after v12.

Is this an issue for you or was it just a curiosity you noticed?

@airtable-keyhanvakil (Contributor, Author) commented

Thanks for looking into it! It's fine by me to close this as working-as-intended.

> Is this an issue for you or was it just a curiosity you noticed?

Yes, unfortunately: my company is upgrading from node 12 to 16 and ran into this. This decompression path got significantly slower, which slows down one of our important workloads by 3x.

I confess this is a weird thing to do, so we probably deserve it.

By chance, do you have ideas for how we can work around it, or a more specific indication of what changed?

@bnoordhuis (Member) commented

Setting --min_semi_space_size=<mb> --max_semi_space_size=<mb> to some high value can help. The default is 16 MB. Small caveat: there are multiple semi-spaces (nurseries), so the actual increase in memory is three times what you specify.

--min_semi_space_size=<mb> isn't strictly needed, but it can help get short-lived scripts over the initial "hump".
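
For example (the 64 MB value is illustrative, not a recommendation):

$ node --min_semi_space_size=64 --max_semi_space_size=64 --max_old_space_size=5000 repro.js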

@airtable-keyhanvakil (Contributor, Author) commented

> Setting --min_semi_space_size=<mb> --max_semi_space_size=<mb> to some high value can help. The default is 16 MB. Small caveat: there are multiple semi-spaces (nurseries), so the actual increase in memory is three times what you specify.
>
> --min_semi_space_size=<mb> isn't strictly needed, but it can help get short-lived scripts over the initial "hump".

Thanks, this was good advice. For the actual use case I was able to tune this so that it was only a 70% slowdown versus node 12. Unfortunately, that still wasn't acceptable, so we ended up implementing the functionality we needed as a native addon and just reused the relevant buffers in native-land.
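
A rough, pure-JS sketch of that reuse idea (the real fix was a native addon; this version only reuses the base64-decode scratch buffer, since Node's zlib API still allocates a fresh output Buffer per call):

const zlib = require('zlib');

// Hypothetical helper, not the addon from this thread: decode base64 into
// one preallocated scratch Buffer instead of allocating a new one per call.
const scratch = Buffer.allocUnsafe(16_000_000); // sized for the workload

function gunzipBase64(z) {
    const n = scratch.write(z, 'base64'); // bytes decoded into scratch
    return zlib.gunzipSync(scratch.subarray(0, n));
}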

I am going to close this issue out now since it sounds like this is working-as-intended.
