What would be a recommended concurrency for hashing multiple files asynchronously? #35
It would be nice to have in the readme a suggestion of what concurrency to use when hashing multiple files concurrently. I don't have the knowledge on this to even guess. Thank you!

Comments
That really depends on the machine (disk type/speed and number of cores), Node.js version, and file size though. But maybe there's like a safe number. Something that always saturates all CPU cores, but doesn't overload the system. This will need some manual experimentation. I don't have an answer for this. I would guess something like the number of CPU cores times 2-4.
const concurrency = require('os').cpus().length * 3;
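A minimal sketch of that heuristic in practice, assuming the hasha and p-queue packages used later in this thread (hashFiles is a hypothetical helper name, and the 3× multiplier is just the guess above):

const os = require('os');
const {default: PQueue} = require('p-queue');
const hasha = require('hasha');

// Guess from above: logical cores times 2-4 (3 picked as a midpoint).
const concurrency = os.cpus().length * 3;

// Hash every file with at most `concurrency` hashes in flight at once.
const hashFiles = files => {
	const queue = new PQueue({concurrency});
	// addAll resolves with the array of hashes once every queued task finishes.
	return queue.addAll(files.map(file => () => hasha.fromFile(file)));
};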
This will depend on machine, file size, and algorithm as @sindresorhus has suggested. However, you're unlikely to see significant benefits beyond a concurrency around your logical core count, as the benchmarks below show:

// node_modules zipped
BENCHMARK: 50 FILES @ 10 ITERATIONS FILE SIZE: 27877036
Concurrency: 1 Total: 56791 ms Average: 5679.1 ms Cores: 12
Concurrency: 6 Total: 24740 ms Average: 2474 ms Cores: 12
Concurrency: 12 Total: 24488 ms Average: 2448.8 ms Cores: 12
Concurrency: 24 Total: 21875 ms Average: 2187.5 ms Cores: 12
Concurrency: 36 Total: 21975 ms Average: 2197.5 ms Cores: 12
Concurrency: 48 Total: 21945 ms Average: 2194.5 ms Cores: 12

// random jpg
BENCHMARK: 50 FILES @ 10 ITERATIONS FILE SIZE: 549590
Concurrency: 1 Total: 1292 ms Average: 129.1 ms Cores: 12
Concurrency: 6 Total: 506 ms Average: 50.6 ms Cores: 12
Concurrency: 12 Total: 490 ms Average: 49 ms Cores: 12
Concurrency: 24 Total: 482 ms Average: 48.2 ms Cores: 12
Concurrency: 36 Total: 471 ms Average: 47.1 ms Cores: 12
Concurrency: 48 Total: 473 ms Average: 47.3 ms Cores: 12

The benchmark script:

const logicalProcessors = require('os').cpus().length;
const fs = require('fs');
const {default: PQueue} = require('p-queue');
const hasha = require('.');

// Update w/ your files
const files = [];
for (let i = 1; i <= 50; i++) {
	files.push(`./fixtures/${i}.zip`);
}

const stats = fs.statSync(files[0]);
const count = 10;

const benchmark = async concurrency => {
	const queue = new PQueue({concurrency});
	let total = 0;
	const benchmarkStart = Date.now();

	for (let i = 0; i < count; i++) {
		const iterationStart = Date.now();
		// Hash all files at the given concurrency; addAll resolves when every task is done.
		await queue.addAll(files.map(f => () => hasha.fromFile(f)));
		const iterationEnd = Date.now();
		total += (iterationEnd - iterationStart);
	}

	const benchmarkEnd = Date.now();
	console.log(`Concurrency: ${queue.concurrency}\t\tTotal: ${benchmarkEnd - benchmarkStart} ms\tAverage: ${total / count} ms\tCores: ${logicalProcessors}`);
};

const run = async () => {
	console.log(`BENCHMARK: ${files.length} FILES @ ${count} ITERATIONS\tFILE SIZE: ${stats.size}`);
	await benchmark(1);
	await benchmark(logicalProcessors / 2);
	await benchmark(logicalProcessors);
	await benchmark(logicalProcessors * 2);
	await benchmark(logicalProcessors * 3);
	await benchmark(logicalProcessors * 4);
};

run();
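One more variable mentioned above is the algorithm: hasha defaults to SHA-512, which is relatively expensive. A sketch of the same queue call with a cheaper algorithm swapped in ('md5' is just an example value):

// Same benchmark line, but with an explicit algorithm option.
await queue.addAll(files.map(f => () => hasha.fromFile(f, {algorithm: 'md5'})));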
@brandon93s Cool, thank you!