Convert npm's compressed tarballs to uncompressed. #40
Hi @Daniel15, I was experimenting with git lfs to get around the issue, but unfortunately it appeared to be quite slow: every checkout from lfs was taking 3 seconds per package to download. I haven't done any research yet, but I hope there is a way to speed up that process, as with 1k packages it would take forever to check out.
Thanks for raising this @Daniel15. Initially I'm a little wary of diverging from npm's approach by using tarballs which aren't identical to theirs, however the point raised is a good and valid one, so it should be investigated. Unless someone offers a PR, I'll pick this up as part of performance improvements after bugs and features have been completed.
Digging into it this morning, it seems to be as simple as running eg. I've run this from my mac then tried an I will probably implement this behind an option such as
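The exact command from that comment was lost in extraction, but the conversion being described can be sketched as a one-liner, assuming `gunzip` is on the PATH and `package.tgz` is a placeholder filename:

```shell
# Decompress a .tgz to a .tar without unpacking its contents.
# -c writes to stdout, leaving the original archive untouched.
gunzip -c package.tgz > package.tar

# Sanity check: list the entries of the resulting tar archive.
tar -tf package.tar
```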
Note to self, I've sketched up this script which should convert a .tgz to a .tar on Mac and Linux:

```js
const fs = require('fs');
const path = require('path');
const spawn = require('child_process').spawn;

function getTarFromTgz(tgz, tar) {
  return new Promise((resolve, reject) => {
    const gunzip = spawn('gunzip', ['-c', tgz]);
    const wStream = fs.createWriteStream(tar);

    // Pipe the binary output straight to disk; setting a utf8
    // encoding on stdout would corrupt the tar data.
    gunzip.stdout.pipe(wStream);

    gunzip.on('close', (code) => {
      if (code !== 0) {
        reject(new Error(`gunzip process exited with code ${code}`));
      } else {
        resolve(tar);
      }
    });
  });
}

// EXAMPLE USAGE
const readFrom = path.resolve('package.tgz');
const writeTo = path.resolve('package.tar');
getTarFromTgz(readFrom, writeTo)
  .then(tar => console.log(tar), err => console.error(err));
```

I will probably release this without Windows support initially, then add that in a later version.
Increasing priority as users are being affected: https://github.com/jmeas/moolah/issues/433.
A few more details, for the record: I didn't figure the issue out entirely, but it seemed sometimes git would not be sure if things were up to date and push lots of "objects" up to the repo as I worked on feature branches. An interesting thing to note is that shrinkpack was just in the git history, but not in the latest commit. In other words, I had added it and then removed it. The pushes were about 50mb, which was particularly problematic for me because I:
We were evaluating introducing this on some internal apps at Netflix, too, but we're concerned because our apps have many more dependencies than Moolah and other team members also use cell connections sometimes. I have a decent data plan but one night of work with 50mb per push would likely eat through it :) An example log of pushing a few files containing only text:
This assumes
There definitely is. It should be as simple as passing a stream from
perfect @DrewML 😎👌
@JamieMason you posted a comment with test branch (
thanks @szarouski, looking into it now
I see the problem, running
should be fixed now in
@JamieMason np, thanks for looking into that. I get a different message now:
Update: sorry, I ran
Update 2: I tried manually adding lodash.assign to dependencies, and after running shrinkpack I get the error from the top of this comment.
Looking into this, there is an issue with child_process.spawn on Windows which I have added a fix for. I've tested commit e54ed6a on Windows and it is now working. [1] http://stackoverflow.com/questions/27688804/how-do-i-debug-error-spawn-enoent-on-node-js
Just discovered this project today through a comment on Reddit. Interesting project!
One thing that came to mind is that source control systems may not handle compressed tarballs very well, as every update to a package would result in a new copy of the entire file in the repo, which can make the repo very large. Perhaps uncompressed tarballs (ie. `.tar` files, not `.tar.gz` or `.tgz`) would be better since they can actually be diffed? I'm not sure if Git still treats `tar`s as binary though, in which case it might not actually add any value. Just a thought :)