meshopt compression: grouping buffer views and compression size #337
-
The
This results in a reasonably small number of buffer views per glTF asset, regardless of the number of mesh primitives. For a couple of reasons, though, I'd like to understand the tradeoffs of having far more buffer views (say, one per accessor) instead. I understand this would inflate the size of the JSON data considerably, but is that the only effect? Does it have any implications for the compressed binary payload, e.g. a better compression ratio for larger units of compression? I've done some quick tests trying to check this, and it doesn't seem to matter, but I'm hoping to get a sanity check on that conclusion. Thanks!
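A minimal sketch of the kind of quick test described above, assuming the MeshoptEncoder module from meshoptimizer's npm package and Node's zlib; the data is synthetic and the constants are arbitrary, so the printed sizes only illustrate the per-stream overhead, not any real asset:

```ts
// Hypothetical quick test, not the one actually run for this question: encode the
// same synthetic vertex data either as one meshopt stream or as many small streams
// (a stand-in for "one buffer view per accessor"), and compare raw + gzipped sizes.
import { MeshoptEncoder } from 'meshoptimizer';
import { gzipSync } from 'node:zlib';

const STRIDE = 16;        // bytes per vertex; must be a multiple of 4 for the codec
const VERTEX_COUNT = 64_000;
const CHUNKS = 1_000;     // 1000 small streams of 64 vertices each

await MeshoptEncoder.ready;

// Synthetic, mildly coherent data so the codec has something to exploit.
const data = new Uint8Array(VERTEX_COUNT * STRIDE);
for (let i = 0; i < data.length; i++) data[i] = (i >> 4) & 0xff;

// One large compression unit.
const whole = MeshoptEncoder.encodeVertexBuffer(data, VERTEX_COUNT, STRIDE);

// Many small compression units.
const perChunk = VERTEX_COUNT / CHUNKS;
const parts: Uint8Array[] = [];
for (let i = 0; i < CHUNKS; i++) {
  const slice = data.subarray(i * perChunk * STRIDE, (i + 1) * perChunk * STRIDE);
  parts.push(MeshoptEncoder.encodeVertexBuffer(slice, perChunk, STRIDE));
}

// Concatenate the small streams so they can be gzipped as one payload, like a GLB.
const split = new Uint8Array(parts.reduce((n, p) => n + p.byteLength, 0));
let offset = 0;
for (const p of parts) { split.set(p, offset); offset += p.byteLength; }

console.log('one stream  :', whole.byteLength, 'bytes /', gzipSync(whole).byteLength, 'gzipped');
console.log(`${CHUNKS} streams:`, split.byteLength, 'bytes /', gzipSync(split).byteLength, 'gzipped');
```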
-
So "one per accessor" is an extreme example, in the sense that it's the one with the highest granularity and as such it will see the highest penalty. Here's what you could expect.

First, of course, the glTF JSON data is inflated noticeably. This can be a significant problem on models built out of a lot of small pieces with different materials, or short animations of deep node trees, or other cases like that.

Second, you will see some increase in the binary data as well. For example, both the attribute and the triangle codec have ~16-24 bytes of "padding" (that can store important information), so if you have many small streams, the overhead of that data grows. This may not seem like a big problem, but imagine the animation use case with a lot of dummy single-keyframe curves: now suddenly you are paying ~4 bytes for the input, ~8 bytes for the output, ~24 bytes each of padding for the input/output, some bytes of JSON, etc. The good news is that gzip is likely to compress these extra bytes reasonably well, but they do inflate the size before compression.

Beyond padding, you can also see slightly reduced compression due to the extra "breaks" in possibly related data. This effect is usually comparatively minor, because all of the compression algorithms involved are fairly local, but there are worst cases, such as a lot of single-keyframe tracks with similar outputs.

That's about it from the transmission-size perspective. Modulo JSON waste, you'll probably see comparable sizes after gzip but somewhat inflated sizes before gzip, reflecting the degree to which you end up compressing tiny buffers.

The reason gltfpack packs buffer views aggressively is not just that, however; it's also to maximize loading efficiency. On that front, first, there's some overhead to parse the extra JSON data, and there's some overhead to decompress: you need to copy some data across the JS-WASM-JS boundary, call some JS/WASM functions, etc. That overhead isn't very big, but if we're talking about thousands of small compressed blocks, it adds up. The index codec also has a non-trivial fixed cost per decode (on the order of "initialize 300 bytes of memory", but if you are decompressing a lot of cubes, that may be a problem!). In addition, the attribute codec relies heavily on SIMD for efficiency; it reaches peak performance on ~64 elements and decent performance on ~16 elements, but shorter sequences will not decode at peak throughput. Which, again, isn't a problem if you just have a few tiny sequences, because everything is fast anyway, but can be a problem if you have thousands of 10-element sequences instead of ten sequences of a thousand elements each.

Second, while web loaders create individual WebGL objects for individual primitives, this is fairly wasteful. gltfpack gives the loader the opportunity to minimize memory waste and the associated cost of tracking GL objects, by allowing it to create one GL object per bufferView with a given target usage (glTF semantics make creating one GL buffer per glTF buffer infeasible, because glTF buffers may mix data with different access settings). To get there, it minimizes the number of bufferViews, using a key similar to the one you describe.
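As a rough sketch of that grouping idea, and not gltfpack's actual implementation (which is C++), here is how grouping accessors into shared buffer views might look; the key fields chosen here (bufferView target, byteStride, and meshopt filter) are only an assumed notion of "compatible":

```ts
// Simplified glTF-like records; in a real asset these would come from parsing
// accessors/bufferViews, and the grouping key would likely include more fields.
interface AccessorData {
  name: string;
  target: number;     // 34962 (ARRAY_BUFFER) or 34963 (ELEMENT_ARRAY_BUFFER)
  byteStride: number; // bytes per element
  filter?: string;    // EXT_meshopt_compression filter, if any (assumed key field)
  bytes: Uint8Array;  // raw accessor payload
}

interface PackedView {
  target: number;
  byteStride: number;
  accessors: { name: string; byteOffset: number; count: number }[];
  parts: Uint8Array[]; // concatenated later into one bufferView / compression unit
}

function packBufferViews(accessors: AccessorData[]): PackedView[] {
  const views = new Map<string, PackedView>();
  for (const a of accessors) {
    // Accessors that could share a GL buffer (same target/stride/filter) share a view.
    const key = `${a.target}:${a.byteStride}:${a.filter ?? 'none'}`;
    let view = views.get(key);
    if (!view) {
      view = { target: a.target, byteStride: a.byteStride, accessors: [], parts: [] };
      views.set(key, view);
    }
    const byteOffset = view.parts.reduce((n, p) => n + p.byteLength, 0);
    view.accessors.push({ name: a.name, byteOffset, count: a.bytes.byteLength / a.byteStride });
    view.parts.push(a.bytes);
  }
  return [...views.values()];
}
```

Each packed view would then be one compression unit and, in a loader that respects the bufferView target, one GL buffer.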
Now, whether any of these are particularly interesting things to worry about for glTF-Transform, I'm not sure; but essentially the consideration for gltfpack is that every time you split a buffer view, you lose a bit of transmission size, a bit of decompression time, a bit of loading efficiency, and in some cases a bit of rendering performance due to increased GL object switching [given an optimal glTF loader], so gltfpack makes sure you don't need to pay these costs.
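To put rough numbers on the loading-cost side, here is a small hypothetical timing sketch, assuming Node (for performance.now and top-level await) and the MeshoptEncoder/MeshoptDecoder modules from meshoptimizer's npm package; the absolute timings depend entirely on the platform and WASM build, so the only point is that per-call overhead and short SIMD runs add up when blocks are tiny:

```ts
// Hypothetical timing sketch: decode the same total vertex count as one large block
// vs. many 10-element blocks, to show the effect of per-call and per-stream costs.
import { MeshoptEncoder, MeshoptDecoder } from 'meshoptimizer';

const STRIDE = 16;
const TOTAL = 256_000;
const SMALL = 10; // "thousands of 10-element sequences"

await Promise.all([MeshoptEncoder.ready, MeshoptDecoder.ready]);

const src = new Uint8Array(TOTAL * STRIDE);
for (let i = 0; i < src.length; i++) src[i] = (i >> 4) & 0xff;

const big = MeshoptEncoder.encodeVertexBuffer(src, TOTAL, STRIDE);
// Reuse one tiny encoded block for every small decode; fine for timing purposes.
const tiny = MeshoptEncoder.encodeVertexBuffer(src.subarray(0, SMALL * STRIDE), SMALL, STRIDE);

const out = new Uint8Array(TOTAL * STRIDE);

let t = performance.now();
MeshoptDecoder.decodeVertexBuffer(out, TOTAL, STRIDE, big);
console.log(`1 x ${TOTAL}-element block :`, (performance.now() - t).toFixed(1), 'ms');

t = performance.now();
for (let i = 0; i < TOTAL / SMALL; i++) {
  // One JS->WASM round trip (plus copies) per tiny block.
  const dst = out.subarray(i * SMALL * STRIDE, (i + 1) * SMALL * STRIDE);
  MeshoptDecoder.decodeVertexBuffer(dst, SMALL, STRIDE, tiny);
}
console.log(`${TOTAL / SMALL} x ${SMALL}-element blocks:`, (performance.now() - t).toFixed(1), 'ms');
```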