Optimizing static bundle size? #296
Personally I think that the benefits of using modular code outweigh the size overhead by far. We should also remember that these files can usually be cached very aggressively (as long as they contain only application code). By splitting code into multiple lazy bundles that are loaded on demand, I think the syntactic noise is no longer a problem. It's good to think about it and to optimize things, but IMHO there are more important problems than file size 😉
gzip can reduce the overhead of repeating patterns.
It would be very interesting to build a CommonJS inliner tool as a pre-minification step, before using uglify for the lower-level optimizations. Would it be possible to get some real-world figures for how much of a difference this require overhead makes percentage-wise in d3, both with and without gzip and uglify?
Fortunately, this was easy to test with real data, because @sebmarkbage has already done the conversion automatically; see the fork sebmarkbage/d3. Using this fork, I ran the following commands:

```shell
browserify d3.js -o d3.bundle.js
uglifyjs d3.bundle.js -c -m > d3.bundle.min.js
```

The generated bundle was 352K, minified 156K, and minified and gzipped 52K. The concatenated files are 264K, 124K and 44K respectively, which translates to overheads of 33%, 26% and 18%. This is quite a bit higher than my initial estimate of 6%, which could have something to do with the automatic conversion; it's quite possible that doing the conversion manually would produce more optimal results. But anyway, now we have a ballpark figure.
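As a sanity check, the overhead percentages follow directly from those sizes (a quick sketch; sizes in KB, taken from the figures above):

```javascript
// Bundling overhead relative to plain concatenation, from the sizes above.
var bundled = { raw: 352, minified: 156, gzipped: 52 };  // browserify + uglify
var concat  = { raw: 264, minified: 124, gzipped: 44 };  // concatenated files

Object.keys(bundled).forEach(function (key) {
  var overhead = Math.round((bundled[key] / concat[key] - 1) * 100);
  console.log(key + ': ' + overhead + '% overhead');
});
// → raw: 33% overhead, minified: 26% overhead, gzipped: 18% overhead
```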
There's a lot of overhead in internal exports that are shared between modules, especially when those are non-constant: they can't be renamed within the modules. You could force a rename of internal exports; that should save you a few bytes. There are also modules that are purely internal, which could be inlined together with other modules.

This code base is actually an excellent example of where static exports and `import *` are useful, since there are so many cyclic dependencies and late updates to shared variables. An ES6-based module system makes this very easy, and more static assumptions lead to smaller files.

Converting this code base to ES6-style modules is simple. Add a file called all.js containing `export * from ...;` for every file in the Makefile, then add `import * from "all"` to the top of every file, and add `export` before every top-level variable or function declaration. Then you can easily do a sound analysis to package it, or convert it to Node-style modules. Therefore I'd recommend using ES6 modules in the source code.

You could also make unsound assumptions in tools like browserify to overcome the limitations of the Node-style module format; then you could make the packaging even smaller. However, maintaining the current code base as idiomatic Node modules would be a significant shift in style, as is evident from the conversion.
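As a sketch of that recipe (file names are illustrative, and the bare `import * from` form is the pre-standard syntax being discussed here, not valid standardized ES2015):

```js
// all.js — one `export *` line per file listed in the Makefile
export * from "./scale";
export * from "./interpolate";

// scale.js — pulls in the shared namespace, exports its own declarations
import * from "./all";
export var scale = {};

// interpolate.js — can reference `scale` even with cyclic dependencies
import * from "./all";
export function interpolate(a, b) { /* ... */ }
```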
Oh, and if you don't want to have incompatible syntax in your editor and other tooling, you can use Labeled Modules instead of ES6: https://github.com/sebmarkbage/link.js
I would also add that there is a lot of duplication that could be avoided; I am sure there are more instances, as d3 is a big library. For the above, I would also imagine that you could just do `d3.interpolate = require('./lib/interpolate');` and now all of the methods are exposed under `d3.interpolate.array`, etc. I think if you work from a modular standpoint from the start, you end up with a slightly different view on your requires and how to best make use of the space. Once you have everything organized via modules and require, you can very easily generate a dependency graph to start seeing what is used where. You can also make tools to identify unused requires and clean things up. The important takeaway is that it is MUCH easier to both test the code in parts and figure out wtf is going on when you have a module system.
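The namespacing pattern described there can be sketched like this (`./lib/interpolate` is an illustrative path; a stand-in object replaces the `require` call so the snippet is self-contained):

```javascript
// What require('./lib/interpolate') might return: one object per submodule.
var interpolate = {
  number: function (a, b) { return function (t) { return a * (1 - t) + b * t; }; },
  array: function (a, b) {
    return function (t) { return a.map(function (x, i) { return x * (1 - t) + b[i] * t; }); };
  }
};

// One assignment exposes every method under the d3.interpolate namespace.
var d3 = {};
d3.interpolate = interpolate;

console.log(d3.interpolate.number(0, 10)(0.5)); // → 5
```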
I've used an alternative bundler which mangles the require paths into numeric ids.
@sokra based on the numbers you have provided, I don't see how you got your 2%. Anyhow, I think all of this is trivial, and the more important thing is for the project to start using require. We are talking about less than a 10K difference when gzipped. This is beyond trivial and will get better with time. The gains of sane JS development outweigh this 10K, IMHO.
@shtylman I just want to point out that it's not the require mechanism itself but the repeated module paths that account for the size difference.
I think @sokra means 45K compared to 44K 😄. I really like @sokra's approach of moving all the resolving stuff into the bundling process. But as @shtylman pointed out: 10K isn't worth the discussion, IMHO. I'd appreciate it if someone wrote a module that squeezes out another 10K, but in production there are other ways of saving much more size, like serving minified images (via JPEGmini or PNGGauntlet).
@sokra Which alternative bundler did you use?
In this case the only (relevant) difference is that for a module

```javascript
var add = require("./math").add;
exports.increment = function increment(i) {
    return add(i, 1);
};
```

browserify generates modules like this:

```javascript
0: [function(require, module, exports) {
    var add = require("./math").add;
    exports.increment = function increment(i) {
        return add(i, 1);
    };
}, {"./math": 1}]
```

and webpack generates modules like this:

```javascript
0: function(module, exports, require) {
    var add = require(/* ./math */1).add;
    exports.increment = function increment(i) {
        return add(i, 1);
    };
}
```

That's the whole magic of the 10K. Minimized:

```javascript
0:[function(n,a,t){var r=n("./math").add;t.increment=function(n){return r(n,1)}},{"./math":1}]
0:function(n,r,a){var t=a(1).add;r.increment=function(n){return t(n,1)}}
```
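For a rough per-module measurement, the two minified lines above differ like this (the strings are copied verbatim from the snippet):

```javascript
// Byte difference between the browserify and webpack minified module forms.
var browserifyForm = '0:[function(n,a,t){var r=n("./math").add;t.increment=function(n){return r(n,1)}},{"./math":1}]';
var webpackForm = '0:function(n,r,a){var t=a(1).add;r.increment=function(n){return t(n,1)}}';
console.log(browserifyForm.length - webpackForm.length + ' bytes saved per module');
```

Across a bundle with hundreds of modules, per-module savings of this size add up to the kind of 10K figure discussed here.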
I think we all agree that there are big benefits to using modular code, but I think the purpose of this ticket is to ensure that there isn't a massive overhead in doing so.

I work on a project which currently has >2000 require statements, rising as we increase modularity. We recently switched to browserify from a home-rolled solution, and everything was great, except for a ~10% increase in the size of our minified JavaScript, adding an overhead of around 80K! While a significant chunk of that would be removed by gzip during transfer, we also care about overall size for another reason: we cache JavaScript on the client using localStorage, which is quite size-sensitive.

10% overhead isn't massive, but it can be avoided using an approach like the one @sokra suggested, meaning everyone can get the advantages of sane modular development without incurring an overhead. I've hacked together a solution in browserify to prove the kind of gains this gives; see rowanbeentje/node-detective@d4f6cd0. This changes browserify to perform require() lookups by index. The implementation isn't massively neat; suggestions on how to implement this in a neater fashion are accepted, though I'm hoping @substack will now jump in and implement it properly considering the advantages shown :)
Nice! Replacing module filenames with ids makes total sense to me. 😄 Why do you cache JavaScript using localStorage? Shouldn't the HTTP cache be used for static resources, in combination with long-time caching and hashed bundle names?
@jhnns That's very much a discussion for elsewhere, but think mobile phones, offline usage, managed upgrades, and appCache issues :) |
+1 for (optionally) replacing the module filenames with ids! |
i think browserify shouldn't replace filenames with ids, it just complicates things and is not worth the effort
@guybrush Is that with 16-character filename hashes? As above, the solution I had up and running for a while, which uses numerically indexed ids, saved around 80K (10% of minified size) for a large project; I keep meaning to come back to it...
Here is a tool for converting module names to integer ids: intreq.
@substack thanks for the link! We'd used browserify in the past for front-end stuff, but I recently experimented with using browserify to minify waterline. It's adapter-based and supports streams, so I was curious, and a bit skeptical, whether we could use it client-side. Ran it through browserify. There's only one problem: download size. To make it a realistic solution for us to use on projects as an ORM with things like Angular, we've got to get it a little smaller. I reckon we could require fewer things, but we haven't invested the time yet. I'll try out intreq and report back on the gains.
I am having a bit of trouble setting up intreq with browserify. I used a custom packer function like this:

```javascript
function pack(params) {
    var intreq = require('intreq'),
        browserPack = require('browser-pack');
    params.raw = false;
    params.sourceMapPrefix = '//#';
    return intreq().pipe(browserPack(params));
}

browserify({ pack: pack });
```

The above code snippet is meant to resemble the default packer. Any hints on what I might be doing wrong?
I finally got around to this: bundle-collapser. Give it a bundle.js as input and it collapses the module paths down to integer ids.
@substack awesome news!! |
Nice work! |
So nice! |
Nice! |
The static bundling in browserify 2 looks very promising. However, one of my concerns is that it will increase the code size of the generated bundle as compared to simple concatenation, due to the boilerplate for each require'able file.
I expect it is possible to reduce the size of the generated static bundle to levels comparable to concatenation, but this might break the design goals of browserify (say, by not allowing bundled files to be require'd from outside the bundle). So I wanted to hear your thoughts on this issue before I considered taking a crack at it myself.
As a contrived example, consider the following file, length.js:
Ignoring the fixed overhead (196 bytes) for the bundle, the incremental size of just this file minified in the static bundle is about 61 bytes:
In a non-browserify world, the length function might instead be implemented as:
Which minifies to 30 bytes:
So, for the purpose of discussion, we can estimate that there is a per-file overhead of about 30 bytes when using static bundling versus concatenation. (Of course, building custom bundles using only the needed code via static analysis would produce a bundle that is far smaller than concatenating everything, but for this discussion I’m concerned with the default case, say where d3js.org provides a pre-built bundle with default functionality for convenience.)
If I extend this 30 bytes to D3’s 200 or so separate files, the resulting overhead is about 6KB, which represents about a 5% overhead on top of the 124KB d3.min.js. This isn’t huge, but at the same time, if I can avoid the overhead, that would make me happy.
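The arithmetic behind that estimate, as a quick check (the 30-byte and 200-file figures are the rough numbers from above):

```javascript
// Estimated total wrapper overhead for D3's default bundle.
var perFileOverhead = 30;      // bytes of boilerplate per bundled file
var fileCount = 200;           // approximate number of D3 source files
var minifiedSize = 124 * 1000; // d3.min.js is about 124KB

var totalOverhead = perFileOverhead * fileCount; // 6000 bytes ≈ 6KB
var percent = totalOverhead / minifiedSize * 100;
console.log(totalOverhead + ' bytes, ' + percent.toFixed(1) + '%'); // → 6000 bytes, 4.8%
```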
One approach to reducing the size of the static bundle is to try to make it equivalent to concatenation. This might not be possible with certain edge cases, such as circular require's, but it wouldn’t be hard for a common usage pattern. For example, the minified length.js in the bundle might appear as:
This is only 31 bytes, identical to concatenation. And subsequently, all instances of `require("length")` would be replaced inline with `L`. (In practice, browserify could use long names for each required module, such as `_require_length`, and then uglifyjs could reduce them to minimal variable names without collision.) Assuming that all bundled modules are listed in dependency order, this should function equivalently to the current approach, but be quite a bit smaller.
I guess the biggest downside of this approach is that non-exported local variables in the file would need to be namespaced so as to avoid leaking into other modules. This might be challenging to implement and would further increase the size of the non-minified bundle.
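That long-name scheme can be sketched like so (the module body is made up here, since the original length.js contents aren't shown in this thread):

```javascript
// Hypothetical concatenation-style output: each module becomes a single
// top-level var with a collision-free long name, and each
// require("length") call site is replaced inline with that name.
var _require_length = function (s) { return s.length; }; // stands in for length.js

// Elsewhere in the bundle, `require("length")("hello")` becomes:
var result = _require_length("hello");
console.log(result); // → 5
```

uglifyjs would then shorten `_require_length` to a one-letter name, giving output essentially identical to hand-written concatenation.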
Anyway, curious to hear your thoughts. I might still be willing to go ahead given all the other benefits, but I always like to have my cake and eat it too.