
re-subroutinizing SourceSansPro-Regular.otf yields slightly bigger file #9

Open

anthrotype opened this issue Jun 1, 2020 · 13 comments

@anthrotype
Member

I tried downloading SourceSansPro-Regular.otf and ran python -m cffsubr on it.
Comparing the resulting CFF tables, I see that the original table is smaller than the one produced by cffsubr.

I was wondering why this is the case?
Is SourceSansPro-Regular using some different library to do the subroutinization than the one used by tx tool? Or is it passing different options that I am not aware of?

How do you explain the diff?
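For reference, the comparison can be reproduced with a short fontTools script. This is a minimal sketch: the local file name is a placeholder, and cffsubr.subroutinize is assumed to be the package's public in-place API.

```python
def pct_change(before: int, after: int) -> float:
    """Size change of the subroutinized table, as a percentage of the original."""
    return (after - before) / before * 100.0


def main() -> None:
    # fontTools and cffsubr are imported here so the helper above stays dependency-free.
    from fontTools.ttLib import TTFont
    import cffsubr

    font = TTFont("SourceSansPro-Regular.otf")  # hypothetical local copy
    before = len(font.getTableData("CFF "))
    cffsubr.subroutinize(font)  # assumed public API; modifies the font in place
    after = len(font["CFF "].compile(font))
    print(f"CFF before: {before} bytes, after: {after} bytes "
          f"({pct_change(before, after):+.2f}%)")


if __name__ == "__main__":
    main()
```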

@anthrotype
Member Author

anthrotype commented Jun 1, 2020

/cc @khaledhosny who is our only known user so far

@khaledhosny
Contributor

Source Sans Pro uses makeotf, so the difference might come from makeotf's subroutinizer.

@josh-hadley
Contributor

I'm looking into this so I can explain in more detail, but basically what @khaledhosny said is the root of it: Source Sans Pro was built with makeotf, and although the core subroutinization code is more or less identical between makeotfexe and tx, the result of subroutinization can be affected by the order in which the input glyphs are processed (not necessarily the font's glyph order -- it's the order in which the glyphs are analyzed for subroutinization).

On a related note: I built the latest SSP (from master branch) with the current AFDKO/makeotf, then ran the result of that through cffsubr. In that case, the original table is slightly larger than the one produced by cffsubr.

@cjchapman

cjchapman commented Jun 2, 2020

Historical note: I ported tx's faster subroutinizer code to makeotf in AFDKO PR #882, which went out in AFDKO 3.0.0. Prior to that, makeotf used different subroutinizer code.

@cjchapman

cjchapman commented Jun 2, 2020

As Josh mentioned, the tx and makeotf subroutinizers are nearly identical (since AFDKO 3.0.0). If anyone wants to compare them, they are in these two files:

@anthrotype
Member Author

Thank you for the insights. In cffsubr I am calling tx with the option to keep the glyph order; I thought it was required to ensure that I can reinsert the modified CFF table back into the sfnt container. But maybe that's not the case, and I can let tx find the optimal order?

@cjchapman

You definitely need the +b "preserve glyph order" option for tx, otherwise you'll have problems like cmap mapping Unicodes to the wrong glyph indices.

@josh-hadley
Contributor

Yeah, what @cjchapman said. Don't remove +b unless you want a whole new set of problems to deal with (and likely still have size differences from makeotf).

To reiterate: the difference in the subroutinization result seems to be caused by the order in which glyphs are analyzed for subroutinization -- which is not necessarily the font's glyph order (I'm working up a test case to demonstrate/prove/describe this in detail).

@josh-hadley
Contributor

Some additional information and data for this:

  1. tx appears to perform the analysis for subroutinization based on the Adobe Standard Encoding order (that is: it looks at those glyphs, if present, first), regardless of the font's glyph order.
  2. makeotfexe probably performs the analysis based on font glyph order.
  3. Source Sans Pro is not in Adobe Standard Encoding order.
  4. When the analysis order is the same, you get the same subroutinization results. Therefore, we can say conclusively that the core makeotfexe subroutinization is the same as tx's. It's the analysis order that contributes to the differences you see.

I've attached some test files that support the above (I did not bother trying to prove makeotfexe's behavior exhaustively as I think it can be inferred from the other findings). The file SSP-limited.otf is SourceSansPro built with makeotfexe using a modified GOADB, containing mostly glyphs not in Adobe Standard Encoding. SSP-limited-tx.otf is the result of running tx -cff +S +b on SSP-limited.otf, then stuffing the resulting CFF table into the file. There are still some slight differences in the CFF table, but the subrs and glyph charstrings are identical.
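For anyone wanting to reproduce that round trip, here is a sketch of running tx -cff +S +b and stuffing the resulting CFF table back into the sfnt container with fontTools. The file names are placeholders, and it assumes tx is on PATH and writes the converted table to stdout when no destination file is given.

```python
import subprocess


def tx_args(input_path: str, preserve_order: bool = True) -> list:
    """Build the tx command line: -cff writes a bare CFF table, +S subroutinizes,
    and +b preserves glyph order (see the discussion above)."""
    args = ["tx", "-cff", "+S"]
    if preserve_order:
        args.append("+b")
    return args + [input_path]


def replace_cff(otf_path: str, cff_data: bytes, out_path: str) -> None:
    """Decompile the raw CFF blob and swap it into the sfnt container."""
    from fontTools.ttLib import TTFont, newTable

    font = TTFont(otf_path)
    table = newTable("CFF ")
    table.decompile(cff_data, font)
    font["CFF "] = table
    font.save(out_path)


if __name__ == "__main__":
    proc = subprocess.run(tx_args("SSP-limited.otf"),
                          check=True, capture_output=True)
    replace_cff("SSP-limited.otf", proc.stdout, "SSP-limited-tx.otf")
```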

To sum up: the difference in this particular case is because the font's glyph set is not in Adobe Standard Encoding order. makeotfexe (apparently) analyzes for subroutinization based on font glyph order, whereas tx analyzes based on Adobe Standard Encoding order. If you want the same subroutinization results between makeotfexe and tx, you need to have the font's glyphs in Adobe Standard Encoding order.

SSP-limited-test.zip
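To make the ordering difference concrete, here is a toy, pure-Python illustration of the two analysis orders described above. The glyph names and the truncated Standard Encoding list are illustrative only; the full 256-entry list is available as fontTools.encodings.StandardEncoding.

```python
# A small slice of Adobe Standard Encoding order, for illustration only.
STANDARD_ENCODING = ["space", "exclam", "quotedbl", "A", "B", "a", "b"]


def analysis_order_tx(font_glyph_order):
    """Visit Standard Encoding glyphs first (if present), then the rest in font order."""
    present = set(font_glyph_order)
    in_std = [g for g in STANDARD_ENCODING if g in present]
    rest = [g for g in font_glyph_order if g not in set(STANDARD_ENCODING)]
    return in_std + rest


def analysis_order_makeotfexe(font_glyph_order):
    """Visit glyphs in the font's own glyph order."""
    return list(font_glyph_order)


glyphs = ["b", "A", "uni0416", "space"]
print(analysis_order_tx(glyphs))          # Standard Encoding glyphs come first
print(analysis_order_makeotfexe(glyphs))  # font glyph order
```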

@anthrotype
Member Author

Thanks for the analysis.

> tx appears to perform the analysis for subroutinization based on the Adobe Standard Encoding order...
> makeotfexe probably performs the analysis based on glyph order

Any particular reasons why tx and makeotf should differ in this regard?

@anthrotype
Member Author

> There are still some slight differences in the CFF table

I noticed those too. In particular, tx seems to use "ExpertEncoding", whereas makeotf prefers "StandardEncoding". I don't know what that means, but I wonder if it has any relationship with the difference you noticed in subroutinization order.

Another difference is that tx drops FamilyBlues and FamilyOtherBlues; apparently in this font these have the same values as BlueValues and OtherBlues respectively. Maybe tx considers them redundant and discards them? Is that good/safe?

@miguelsousa
Member

If FamilyBlues and FamilyOtherBlues have the same values as BlueValues and OtherBlues, it's correct to not include the Family set in the font.

As for the usage of ExpertEncoding, I find that strange. Not sure what consequences it may have.

@josh-hadley
Contributor

> Any particular reasons why tx and makeotf should differ in this regard?

I don't know for sure; this all happened way before my time at Adobe and I think the people who made those decisions are not around anymore. My guess would be because Adobe Standard Encoding provided a consistent starting point for compressing the character/glyph sets that were popular at the time this scheme was developed (in tx, anyway).

> As for the usage of ExpertEncoding, I find that strange. Not sure what consequences it may have.

I suspect that's an anomaly from the somewhat unusual set that I chose for this experiment. Probably tx uses some heuristic like % of charset present to set "StandardEncoding" and this font has a very small percentage of that. I would not expect to see this change in more normal cases. And I'm not sure it has any real consequences in an OpenType font anyway: the font's 'cmap' subtables will ultimately determine the character set(s) and encoding(s).

Bringing all of this back around to cffsubr: to really solve this well, I think we need to build only the subroutinizer (from the code that @cjchapman mentions here) into either a standalone executable or a C extension, so we truly isolate subroutinization from the other parts of the CFF/CFF2 tables.

A shorter-term workaround might be to see if we can extract only the relevant data from the tx-subroutinized CFF table (local & global subrs, charstrings, maybe some other bits) and stuff those back into the pre-subroutinized CFF, rather than take the entire converted table which might have other undesired diffs.
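A rough sketch of that workaround with fontTools' CFF object model might look like the following. It assumes a non-CID CFF font (a single Private dict), and which fields are actually safe to transplant is exactly the open question, so treat this as a sketch rather than a recipe.

```python
# The CFF pieces the workaround would copy from the tx output (assumption).
TRANSPLANT_FIELDS = ["GlobalSubrs", "Private.Subrs", "CharStrings"]


def transplant(subroutinized_path: str, original_path: str, out_path: str) -> None:
    """Copy only the subroutinization-related pieces from a tx-subroutinized font
    into the pre-subroutinized font, leaving the rest of the CFF table alone."""
    from fontTools.ttLib import TTFont

    src = TTFont(subroutinized_path)["CFF "].cff  # tx output
    dst_font = TTFont(original_path)              # original, pre-subroutinization
    dst = dst_font["CFF "].cff

    src_top, dst_top = src.topDictIndex[0], dst.topDictIndex[0]
    dst.GlobalSubrs = src.GlobalSubrs               # global subrs
    dst_top.Private.Subrs = src_top.Private.Subrs   # local subrs (non-CID only)
    dst_top.CharStrings = src_top.CharStrings       # subroutinized charstrings
    dst_font.save(out_path)
```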

5 participants