What would be a recommended concurrency for hashing multiple files asynchronously? #35

Open
papb opened this issue Oct 13, 2020 · 4 comments


@papb

papb commented Oct 13, 2020

It would be nice if the readme suggested what concurrency to use when hashing multiple files concurrently. I don't know enough about this to even guess.

Thank you!

@sindresorhus
Owner

That really depends on the machine (disk type/speed and number of cores), Node.js version, and file size though. But maybe there's like a safe number. Something that always saturates all CPU cores, but doesn't overload the system. This will need some manual experimentation. I don't have an answer for this. I would guess something like the number of CPU cores times 2-4.

@Richienb
Contributor

> I would guess something like the number of CPU cores times 2-4.

const concurrency = require("os").cpus().length * 3

@brandon93s

This will depend on machine, file size, and algorithm, as @sindresorhus has suggested. However, you're unlikely to see significant benefits beyond require('os').cpus().length, since these algorithms are CPU-intensive. Note that my test machine has 6 physical cores; the os module reports logical processors rather than physical cores, so with hyperthreading it returns 12. Below you will find a benchmark implementation and results in support of require('os').cpus().length:

// node_modules zipped
BENCHMARK:  50 FILES @ 10 ITERATIONS    FILE SIZE: 27877036
Concurrency: 1          Total: 56791 ms     Average: 5679.1 ms      Cores: 12
Concurrency: 6          Total: 24740 ms     Average: 2474 ms        Cores: 12
Concurrency: 12         Total: 24488 ms     Average: 2448.8 ms      Cores: 12
Concurrency: 24         Total: 21875 ms     Average: 2187.5 ms      Cores: 12
Concurrency: 36         Total: 21975 ms     Average: 2197.5 ms      Cores: 12
Concurrency: 48         Total: 21945 ms     Average: 2194.5 ms      Cores: 12
// random jpg
BENCHMARK:  50 FILES @ 10 ITERATIONS    FILE SIZE: 549590
Concurrency: 1          Total: 1292 ms      Average: 129.1 ms       Cores: 12
Concurrency: 6          Total: 506 ms       Average: 50.6 ms        Cores: 12
Concurrency: 12         Total: 490 ms       Average: 49 ms          Cores: 12
Concurrency: 24         Total: 482 ms       Average: 48.2 ms        Cores: 12
Concurrency: 36         Total: 471 ms       Average: 47.1 ms        Cores: 12
Concurrency: 48         Total: 473 ms       Average: 47.3 ms        Cores: 12
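
One way to read the numbers above is as speedups over concurrency 1. The sketch below does that arithmetic for the zipped node_modules run; the totals are copied from that table, and the variable names are only illustrative. On these figures, concurrency 12 comes out at roughly 2.3x and concurrency 24 at roughly 2.6x, so this one machine shows a modest gain past the logical core count for large files and essentially none for the small JPGs. Treat that as a single data point, not a general rule.

```javascript
// Totals (ms) copied from the zipped node_modules benchmark above,
// keyed by concurrency level.
const totals = new Map([
	[1, 56791],
	[6, 24740],
	[12, 24488],
	[24, 21875],
	[36, 21975],
	[48, 21945]
]);

// Speedup of each level relative to the concurrency-1 baseline.
const baseline = totals.get(1);
const speedups = new Map(
	[...totals].map(([concurrency, total]) => [concurrency, baseline / total])
);

for (const [concurrency, speedup] of speedups) {
	console.log(`Concurrency ${concurrency}: ${speedup.toFixed(2)}x`);
}
```
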
// Benchmark implementation
const logicalProcessors = require('os').cpus().length;
const fs = require('fs');
const {default: PQueue} = require('p-queue');
const hasha = require('.');

// Update w/ your files
const files = [];
for (let i = 1; i <= 50; i++) {
	files.push(`./fixtures/${i}.zip`);
}

const stats = fs.statSync(files[0]);

// Number of iterations per concurrency level
const count = 10;

const benchmark = async concurrency => {
	const queue = new PQueue({concurrency});
	let total = 0;
	const benchmarkStart = Date.now();
	for (let i = 0; i < count; i++) {
		const iterationStart = Date.now();
		await queue.addAll(files.map(f => () => hasha.fromFile(f)));
		const iterationEnd = Date.now();
		total += (iterationEnd - iterationStart);
	}

	const benchmarkEnd = Date.now();
	console.log(`Concurrency: ${queue.concurrency}\t\tTotal: ${benchmarkEnd - benchmarkStart} ms\tAverage: ${total / count} ms\tCores: ${logicalProcessors}`);
};

const run = async () => {
	console.log(`BENCHMARK:  ${files.length} FILES @ ${count} ITERATIONS\tFILE SIZE: ${stats.size}`);
	await benchmark(1);
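	// Floor and clamp so an odd or single-core count still yields a valid integer concurrency
	await benchmark(Math.max(1, Math.floor(logicalProcessors / 2)));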
	await benchmark(logicalProcessors);
	await benchmark(logicalProcessors * 2);
	await benchmark(logicalProcessors * 3);
	await benchmark(logicalProcessors * 4);
};

run().catch(error => {
	console.error(error);
	process.exitCode = 1;
});
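
For anyone who wants to apply the suggestion without pulling in p-queue, here is a minimal sketch of a concurrency limiter that defaults to require('os').cpus().length. The helper name mapWithConcurrency and its parameters are made up for illustration and are not part of hasha or p-queue. It only caps how many tasks are in flight; whether that translates into a speedup depends on the task actually running off the main thread.

```javascript
const os = require('os');

// Run `task` over every item with at most `concurrency` calls in flight.
// Resolves to the results in the same order as `items`.
// (Hypothetical helper, not an API of hasha or p-queue.)
const mapWithConcurrency = async (items, task, concurrency = os.cpus().length) => {
	const results = new Array(items.length);
	let next = 0;

	// Each worker keeps claiming the next unprocessed index until none are left.
	const worker = async () => {
		while (next < items.length) {
			const index = next++;
			results[index] = await task(items[index]);
		}
	};

	// Never start more workers than items, and always start at least one.
	const workerCount = Math.max(1, Math.min(Math.floor(concurrency), items.length));
	await Promise.all(Array.from({length: workerCount}, () => worker()));
	return results;
};

// Possible usage with the same call the benchmark makes:
// const hashes = await mapWithConcurrency(files, f => hasha.fromFile(f));
```

If any task rejects, the returned promise rejects with that error while tasks already started keep running; a production version would probably want explicit cancellation or error collection.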

@papb
Author

papb commented Nov 26, 2020

@brandon93s Cool, thank you!
