
Faster MD5 and Parallel reports generation #123

Merged

Conversation

vuspenskiy
Contributor

@vuspenskiy vuspenskiy commented Jul 7, 2018

Accelerates coveralls file generation by using DigestInputStream and parallelizes report generation for source files.

For context: we're using a monorepo, and the coveralls sbt task takes so much time that CI times out (lemurheavy/coveralls-public#1154).
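The streaming-hash idea behind the PR can be sketched as follows. This is an illustrative example, not the plugin's actual code: a `DigestInputStream` updates the MD5 digest as a side effect of reading, so the file is traversed once as raw bytes instead of being loaded as lines and re-joined into a string before hashing.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StreamingMd5 {

    // Read the stream once; the digest is updated as bytes pass through.
    static String md5Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (DigestInputStream dis = new DigestInputStream(in, md)) {
            byte[] buf = new byte[8192];
            while (dis.read(buf) != -1) {
                // nothing to do: the DigestInputStream feeds md as we read
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // MD5 of the bytes "hello\n" (same as `echo hello | md5sum`)
        String hash = md5Hex(new ByteArrayInputStream("hello\n".getBytes("UTF-8")));
        System.out.println(hash); // b1946ac92492d2347c6235b4d2611184
    }
}
```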

@gslowikowski
Member

gslowikowski commented Jul 14, 2018

Hi @vuspenskiy

How large is your repository and what are coveralls execution times?
I've checked your PR on the Scala repo (a large project) and coveralls execution times are between 4 and 6 seconds both with and without your changes. I've measured only the time of coveralls.json file generation, without uploading to coveralls.io.

@vuspenskiy
Contributor Author

@gslowikowski about 200 kLOC across 2,500 files; the largest file is 1,300 lines.

I used the forked version in our CircleCI. Before, it was timing out after 10 minutes; now it completes in 43 seconds (everything in sbt ';project outParentProject;coveralls').
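The other half of the speedup is running the per-file work in parallel. A minimal sketch of that idea, using Java parallel streams (the plugin itself is Scala, and `buildRecord` here is a hypothetical stand-in for the per-file hashing and coverage-record building):

```java
import java.util.List;
import java.util.stream.Collectors;

public class ParallelReports {

    // Stand-in for the per-file work: hashing the source and building
    // its coverage record for coveralls.json.
    static String buildRecord(String sourceFile) {
        return "{\"name\":\"" + sourceFile + "\"}";
    }

    public static void main(String[] args) {
        List<String> files = List.of("A.scala", "B.scala", "C.scala");

        // parallelStream() fans the independent per-file work out across
        // cores; collect() still preserves the original input order.
        List<String> records = files.parallelStream()
                .map(ParallelReports::buildRecord)
                .collect(Collectors.toList());

        System.out.println(records.size()); // 3
    }
}
```

This works because each file's record is independent of the others, so the only ordering requirement is on the final collected list.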

@gslowikowski
Member

How large is coveralls.json file?

I will merge your PR; I just want to understand what the root cause of such long execution times can be.

@vuspenskiy
Contributor Author

Hi @gslowikowski, thank you!

I think it was about 7 MB.

@gslowikowski gslowikowski merged commit c88bc5c into scoverage:master Oct 28, 2018
@gslowikowski
Member

Version 1.2.7 was released recently.

I'd like to add one piece of information.

During the first run after the upgrade, most users will see all files reported as changed.

This is because the MD5 is calculated differently now. Previously, every source file was loaded as a sequence of lines, the lines were joined with the newline character (\n), and the MD5 was calculated from the result. Now the MD5 is calculated from the raw file content.

Where is the difference? Previously, if a source file had Windows line endings, they were converted to Unix ones; additionally, a newline at the end of the file was lost. Now the source file content is not transformed in any way before the MD5 is calculated.
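The difference can be reproduced directly. A small sketch (illustrative, not the plugin's code): hashing the re-joined lines of a file with Windows line endings (the old behavior) yields a different digest than hashing the raw bytes (the new behavior), which is why every file shows up as changed once after the upgrade.

```java
import java.security.MessageDigest;

public class Md5Difference {

    static String md5Hex(byte[] bytes) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest(bytes)) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // The file as stored on disk, with Windows line endings.
        byte[] raw = "line1\r\nline2\r\n".getBytes("UTF-8");

        // Old behavior: split into lines and re-join with '\n'.
        // This drops the '\r' characters and the trailing newline.
        String joined = String.join("\n", new String(raw, "UTF-8").split("\r?\n"));

        System.out.println(md5Hex(joined.getBytes("UTF-8"))); // hash of "line1\nline2"
        System.out.println(md5Hex(raw));                      // hash of the raw bytes
        // The two digests differ, so coveralls sees the file as changed.
    }
}
```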
