Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Array.join to build up strings in readPostScriptTable(). #5074

Merged
merged 1 commit into from
Jul 25, 2014

Conversation

nnethercote
Copy link
Contributor

This avoids about 5 MiB of string allocations on one test case.

@CodingFabian
Copy link
Contributor

i wonder if you could just put the bytes into the array and then invoke bytesToString(buf) ?

@Snuffleupagus
Copy link
Collaborator

i wonder if you could just put the bytes into the array and then invoke bytesToString(buf) ?

I haven't looked at this particular case, but in general we may need to be careful when using that kind of pattern. (See the recent regression fixed by PR #5069.)

@Snuffleupagus
Copy link
Collaborator

/botio test

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.22.172.223:8877/2849d2e44f5e4c5/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/0c287cf78c4bfa6/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/2849d2e44f5e4c5/output.txt

Total script time: 21.21 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux)


Failed

Full output at http://107.21.233.14:8877/0c287cf78c4bfa6/output.txt

Total script time: 23.46 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://107.21.233.14:8877/0c287cf78c4bfa6/reftest-analyzer.html#web=eq.log

@nnethercote
Copy link
Contributor Author

The test failure appears to be due to a miniscule difference. Not sure why that happens; this should be functionally identical to the old code.

i wonder if you could just put the bytes into the array and then invoke bytesToString(buf) ?

You mean, do font.getBytes() and pass the result into bytesToString()? I think it would work, but getBytes() calls subarray() which creates a new object. So I prefer the code in the patch.

@Snuffleupagus
Copy link
Collaborator

The test failure appears to be due to a miniscule difference. Not sure why that happens; this should be functionally identical to the old code.

This was (likely) caused by the recent Firefox update, since the exact same test failure can be observed in #3891 (comment). I've generated new reference images for Linux, so just to make absolutely sure I'll re-trigger the test here once that completes.

@Snuffleupagus
Copy link
Collaborator

/botio-linux test

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://107.21.233.14:8877/c55e45fa44d7622/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/c55e45fa44d7622/output.txt

Total script time: 23.19 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@fkaelberer
Copy link
Contributor

i wonder if you could just put the bytes into the array and then invoke bytesToString(buf) ?

You mean, do font.getBytes() and pass the result into bytesToString()?

@nnethercote I think Fabian might be thinking of collecting the byte codes of font.getByte() in an array and then applying bytesToString() to it. This potentially saves memory as numbers should have a smaller memory footprint than 1-char strings (can you confirm this?), and bytesToString() is much faster than calling String.fromCharCode() for each byte separately.

i wonder if you could just put the bytes into the array and then invoke bytesToString(buf) ?

I haven't looked at this particular case, but in general we may need to be careful when using that kind of pattern. (See the recent regression fixed by PR #5069.)

At least in the case of #5069, bytesToString did not cause the regression, see my comment there.

@nnethercote
Copy link
Contributor Author

This potentially saves memory as numbers should have a smaller memory footprint than 1-char strings (can you confirm this?),

In Firefox, single-byte ASCII strings are statically provided (and thus shareable) and so don't take up any additional memory.

and bytesToString() is much faster than calling String.fromCharCode() for each byte separately.

That's probably true. I'll update the patch accordingly.

@nnethercote
Copy link
Contributor Author

I just realized that using bytesToString() is wrong here -- it assumes the input is a typed array, not a vanilla array! The difference only manifests if the length is greater than 8192, whereupon it calls subarray(). I'll repost the original patch that uses fromCharCode+join.

Also, I did some microbenchmarking. On short arrays, bytesToString() is about 20% slower than fromCharCode+join. They are about the same for arrays of length 20, and once the length gets to 100s or 1000s bytesToString() is about 2x faster. So the potential win isn't clear-cut.

This avoids about 5 MiB of string allocations on one test case.
@timvandermeij
Copy link
Contributor

/botio test

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/232a48198b72f57/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.22.172.223:8877/1d07354d42a3364/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/1d07354d42a3364/output.txt

Total script time: 21.38 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/232a48198b72f57/output.txt

Total script time: 23.29 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@timvandermeij
Copy link
Contributor

/botio-linux preview

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/f5bf540bd157a58/output.txt

timvandermeij added a commit that referenced this pull request Jul 25, 2014
Use Array.join to build up strings in readPostScriptTable().
@timvandermeij timvandermeij merged commit 62e6265 into mozilla:master Jul 25, 2014
@timvandermeij
Copy link
Contributor

Nice work, thanks!

@fkaelberer
Copy link
Contributor

I just realized that using bytesToString() is wrong here -- it assumes the input is a typed array, not a vanilla array! The difference only manifests if the length is greater than 8192, whereupon it calls subarray(). I'll repost the original patch that uses fromCharCode+join.

Also, I did some microbenchmarking. On short arrays, bytesToString() is about 20% slower than fromCharCode+join. They are about the same for arrays of length 20, and once the length gets to 100s or 1000s bytesToString() is about 2x faster. So the potential win isn't clear-cut.

@nnethercote I can confirm your test, it is slower indeed. Sorry for the extra work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants