Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I am experiencing race condition while using SimpleCov with Parallel_tests #559

Closed
aminariana opened this issue Feb 16, 2017 · 24 comments
Closed

Comments

@aminariana
Copy link

aminariana commented Feb 16, 2017

Repro:
I'm using simplecov (0.13.0), ruby "2.3.3", parallel_tests (2.13.0)

# In spec_helper.rb:
if ENV['COVERAGE']
  require 'simplecov'
  process_name = "process##{Process.pid.to_s}:#{ENV['TEST_ENV_NUMBER'] || '1'}"
  puts "Starting Code Coverage in #{process_name}"
  SimpleCov.command_name process_name
  SimpleCov.merge_timeout 120 # Expire parallel process coverage cache after x seconds
  SimpleCov.start 'rails'
  SimpleCov.refuse_coverage_drop
  SimpleCov.maximum_coverage_drop 1
  SimpleCov.minimum_coverage 100
end

Ran:

COVERAGE=true RAILS_ENV=test bundle exec rake parallel:spec

Expected:
The last of the 8 processes reports 100% coverage, which should make SimpleCov pass.

Actual:
The result of the 3rd-last process was picked as the merged result!

...
Coverage report generated for process#1306:3, process#1311:7, process#1313:8, process#1317:4, process#1318:5, process#1319:6 to /Users/amin/api/coverage. 1713 / 1801 LOC (95.11%) covered.
Coverage report generated for process#1306:3, process#1309:, process#1311:7, process#1313:8, process#1315:2, process#1317:4, process#1318:5, process#1319:6 to /Users/amin/api/coverage. 1754 / 1801 LOC (97.39%) covered.
Coverage report generated for process#1306:3, process#1309:, process#1311:7, process#1313:8, process#1315:2, process#1317:4, process#1318:5, process#1319:6 to /Users/amin/api/coverage. 1801 / 1801 LOC (100.0%) covered.
Coverage (95.11%) is below the expected minimum coverage (100.00%).

287 examples, 0 failures

Took 35 seconds
Tests Failed
@bf4
Copy link
Collaborator

bf4 commented Feb 16, 2017

Have you tried setting SimpleCov.pid = Process.pid before forking?

@aminariana
Copy link
Author

I did just now, and it didn't help. Also please notice that SimpleCov.start already does this for you:
https://github.com/colszowka/simplecov/blob/abdf6a294f030a9407606adcf541d52505622b1b/lib/simplecov.rb#L47

@PragTob
Copy link
Collaborator

PragTob commented Feb 16, 2017

Which version of simplecov are you using? Which Rails and Ruby version? Which testing framework?

Nothing obvious sticks out from over here :(


  SimpleCov.refuse_coverage_drop
  SimpleCov.maximum_coverage_drop 1

sort of override each other, refuse_coverage_drop sets maximum_coverage_drop to 0 but that's just an aside here :)

@aminariana
Copy link
Author

Coverage MUST NOT DROP! :-)
I had listed all the versions in the OP. Also Rspec v3.5.0

I just ran it again, and this time the results only up to the second-last process got considered even though the last process reached 100%. I'm sure there's a race condition somehow:

...
Coverage report generated for process#22110:3, process#22112:4, process#22114:6, process#22115:2, process#22116:7, process#22118:8, process#22120:5 to /Users/amin/api/coverage. 1754 / 1801 LOC (97.39%) covered.

Coverage report generated for process#22108:, process#22110:3, process#22112:4, process#22114:6, process#22115:2, process#22116:7, process#22118:8, process#22120:5 to /Users/amin/api/coverage. 1801 / 1801 LOC (100.0%) covered.
Coverage (97.39%) is below the expected minimum coverage (100.00%).

287 examples, 0 failures

Took 33 seconds
Tests Failed

@PragTob
Copy link
Collaborator

PragTob commented Feb 17, 2017

Sorry must have tiredly missed the versions :(

All we can do atm is look at the merging code and see if anything pops out as maybe being a race condition or something. If you could share a sample with us that reproduces this it'd be great but probably hard to do as it might also be related to something local (race conditions and execution speed).

If that's not possibly I'll likely post some puts debugging I'd like to see when I get to look into how the code works :)

@aminariana
Copy link
Author

I am so glad I documented the results from earlier! I started wiping out the resultset from previous run by adding the following to the spec_helper.rb:

  # Wipe out SimpleCov results
  config.before(:suite) do
    %x( rm -rf ./coverage )
  end

Result: The parallel_tests with SimpleCov still fails, BUT NO LONGER complains about:

Coverage (97.39%) is below the expected minimum coverage (100.00%).

I believe the cause of the self-contradicting summary below has changed by the action I took above:

...
Coverage report generated for process_#24691, process_2#24688, process_3#24693, process_4#24686, process_5#24694, process_6#24695, process_7#24699, process_8#24698 to /Users/amin/api/coverage. 1801 / 1801 LOC (100.0%) covered.

287 examples, 0 failures

Took 44 seconds
Tests Failed

New hypothesis:
Can deprecation warnings cause SimpleCov to fail? I'll paste the entire output below, which contains Rails 5.0 deprecation warnings from a certain gem (apipie) I use:

DEPRECATION WARNING: alias_method_chain is deprecated. Please, use Module#prepend instead. From module, you can access the original method using super. (called from included at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor/recorder.rb:149)
8 processes for 28 specs, ~ 3 specs per process
DEPRECATION WARNING: alias_method_chain is deprecated. Please, use Module#prepend instead. From module, you can access the original method using super. (called from included at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor/recorder.rb:149)
DEPRECATION WARNING: alias_method_chain is deprecated. Please, use Module#prepend instead. From module, you can access the original method using super. (called from included at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor/recorder.rb:149)
DEPRECATION WARNING: alias_method_chain is deprecated. Please, use Module#prepend instead. From module, you can access the original method using super. (called from included at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor/recorder.rb:149)
DEPRECATION WARNING: alias_method_chain is deprecated. Please, use Module#prepend instead. From module, you can access the original method using super. (called from included at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor/recorder.rb:149)
DEPRECATION WARNING: alias_method_chain is deprecated. Please, use Module#prepend instead. From module, you can access the original method using super. (called from included at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor/recorder.rb:149)
DEPRECATION WARNING: alias_method_chain is deprecated. Please, use Module#prepend instead. From module, you can access the original method using super. (called from included at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor/recorder.rb:149)
DEPRECATION WARNING: alias_method_chain is deprecated. Please, use Module#prepend instead. From module, you can access the original method using super. (called from included at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor/recorder.rb:149)
DEPRECATION WARNING: alias_method_chain is deprecated. Please, use Module#prepend instead. From module, you can access the original method using super. (called from included at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor/recorder.rb:149)
Starting Code Coverage in process_#24691
Starting Code Coverage in process_2#24688
Starting Code Coverage in process_8#24698
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from block (2 levels) in <class:Railtie> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor.rb:12)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from block (2 levels) in <class:Railtie> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor.rb:12)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from block (2 levels) in <class:Railtie> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor.rb:12)
Starting Code Coverage in process_3#24693
Starting Code Coverage in process_4#24686
Starting Code Coverage in process_5#24694
Starting Code Coverage in process_6#24695
Starting Code Coverage in process_7#24699
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from block (2 levels) in <class:Railtie> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor.rb:12)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from block (2 levels) in <class:Railtie> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor.rb:12)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from block (2 levels) in <class:Railtie> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor.rb:12)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from block (2 levels) in <class:Railtie> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor.rb:12)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from block (2 levels) in <class:Railtie> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/lib/apipie/extractor.rb:12)

Randomized with seed 54211

Randomized with seed 29581

Randomized with seed 55046
.
Randomized with seed 34341

Randomized with seed 26528

Randomized with seed 61688

Randomized with seed 28212

Randomized with seed 49854
..DEPRECATION WARNING: around_filter is deprecated and will be removed in Rails 5.1. Use around_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:8)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:9)
.DEPRECATION WARNING: around_filter is deprecated and will be removed in Rails 5.1. Use around_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:8)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:9)
..DEPRECATION WARNING: around_filter is deprecated and will be removed in Rails 5.1. Use around_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:8)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:9)
..........DEPRECATION WARNING: around_filter is deprecated and will be removed in Rails 5.1. Use around_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:8)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:9)
......DEPRECATION WARNING: around_filter is deprecated and will be removed in Rails 5.1. Use around_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:8)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:9)
............................................DEPRECATION WARNING: around_filter is deprecated and will be removed in Rails 5.1. Use around_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:8)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:9)
............................................

Finished in 7.79 seconds (files took 4.78 seconds to load)
35 examples, 0 failures

Randomized with seed 28212

Coverage report generated for process_4#24686 to /Users/amin/api/coverage. 632 / 2633 LOC (24.0%) covered.
..Coverage (24.00%) is below the expected minimum coverage (100.00%).
............................................................DEPRECATION WARNING: around_filter is deprecated and will be removed in Rails 5.1. Use around_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:8)
DEPRECATION WARNING: before_filter is deprecated and will be removed in Rails 5.1. Use before_action instead. (called from <class:ApipiesController> at /Users/amin/.rvm/gems/ruby-2.3.3/bundler/gems/apipie-rails-74541408375d/app/controllers/apipie/apipies_controller.rb:9)
..........

Finished in 13.42 seconds (files took 4.81 seconds to load)
37 examples, 0 failures

Randomized with seed 49854

......Coverage report generated for process_4#24686, process_7#24699 to /Users/amin/api/coverage. 1251 / 1934 LOC (64.68%) covered.
...

Finished in 14.6 seconds (files took 4.41 seconds to load)
23 examples, 0 failures

Randomized with seed 54211

..Coverage (64.68%) is below the expected minimum coverage (100.00%).
.Coverage report generated for process_4#24686, process_7#24699, process_8#24698 to /Users/amin/api/coverage. 1353 / 1918 LOC (70.54%) covered.
....Coverage (70.54%) is below the expected minimum coverage (100.00%).
........

Finished in 15.4 seconds (files took 4.7 seconds to load)
40 examples, 0 failures

Randomized with seed 34341

.Coverage report generated for process_4#24686, process_6#24695, process_7#24699, process_8#24698 to /Users/amin/api/coverage. 1475 / 1865 LOC (79.09%) covered.
.Coverage (79.09%) is below the expected minimum coverage (100.00%).
..............................................

Finished in 25.89 seconds (files took 4.53 seconds to load)
35 examples, 0 failures

Randomized with seed 29581

.Coverage report generated for process_#24691, process_4#24686, process_6#24695, process_7#24699, process_8#24698 to /Users/amin/api/coverage. 1577 / 1845 LOC (85.47%) covered.
....Coverage (85.47%) is below the expected minimum coverage (100.00%).
...............

Finished in 27.32 seconds (files took 4.77 seconds to load)
43 examples, 0 failures

Randomized with seed 26528

Coverage report generated for process_#24691, process_3#24693, process_4#24686, process_6#24695, process_7#24699, process_8#24698 to /Users/amin/api/coverage. 1650 / 1822 LOC (90.56%) covered.
.Coverage (90.56%) is below the expected minimum coverage (100.00%).
........

Finished in 31.94 seconds (files took 4.76 seconds to load)
43 examples, 0 failures

Randomized with seed 61688

Coverage report generated for process_#24691, process_3#24693, process_4#24686, process_5#24694, process_6#24695, process_7#24699, process_8#24698 to /Users/amin/api/coverage. 1710 / 1801 LOC (94.95%) covered.
.Coverage (94.95%) is below the expected minimum coverage (100.00%).
...

Finished in 37.28 seconds (files took 4.56 seconds to load)
31 examples, 0 failures

Randomized with seed 55046

Coverage report generated for process_#24691, process_2#24688, process_3#24693, process_4#24686, process_5#24694, process_6#24695, process_7#24699, process_8#24698 to /Users/amin/api/coverage. 1801 / 1801 LOC (100.0%) covered.

287 examples, 0 failures

Took 44 seconds
Tests Failed

@aminariana
Copy link
Author

For less ambiguity, I have conducted the following experiments:

  1. Commented out all thresholds:
# SimpleCov.minimum_coverage 100
# SimpleCov.refuse_coverage_drop
# SimpleCov.maximum_coverage_drop 1

Result:
Run passes (!)

Conclusion:
Failure definitely has to do with SimpleCov exit status.

  1. Uncomment only a single criterion:
SimpleCov.minimum_coverage 100

Result:
Run reports 100.0% covered, but also Tests Failed.

Conclusion:
Failure definitely has to do with threshold of coverage.

  1. Reduce threshold to 50%:
SimpleCov.minimum_coverage 50

Result:
Run passes

Conclusion:
It's not blind to the amount of coverage

  1. At 97% threshold, it fails
  2. At 90% threshold, it fails
  3. At 80% threshold, it fails
  4. At 65% threshold, it passes (!)
  5. At 72% threshold, it fails (!)
  6. At 72% threshold, it passes (!!!!!)

OK, I'm back to thinking this is certainly a race condition.

@aminariana
Copy link
Author

One more experiment:

Fix the minimum threshold at 100%

  1. Reduce the number of cores from Macbook Pro's default 8 to 2:
PARALLEL_TEST_PROCESSORS=2

Result is failure despite the second process hitting 100%:

Coverage report generated for process_2#26502 to /Users/amin/api/coverage. 1484 / 1831 LOC (81.05%) covered.
Coverage (81.05%) is below the expected minimum coverage (100.00%).
.....

Finished in 1 minute 17.19 seconds (files took 2.46 seconds to load)
151 examples, 0 failures

Randomized with seed 50459

Coverage report generated for process_#26503, process_2#26502 to /Users/amin/api/coverage. 1801 / 1801 LOC (100.0%) covered.

287 examples, 0 failures

Took 81 seconds (1:21)
Tests Failed
  1. Drop the number of cores to one:
PARALLEL_TEST_PROCESSORS=1

Result: pass!

Finished in 2 minutes 9.2 seconds (files took 2.39 seconds to load)
287 examples, 0 failures

Randomized with seed 17317

Coverage report generated for process_#26602 to /Users/amin/api/coverage. 1801 / 1801 LOC (100.0%) covered.

287 examples, 0 failures

Took 133 seconds (2:13)

Note: This inconsistency not only exhibits itself on my Macbook, but also on a remote (Travis CI) Unix system.

These experiments prove, I think beyond reasonable doubt, that SimpleCov has a race condition.

@aminariana
Copy link
Author

OMG

I see a bug #64 that has a comment from @jshraibman-mdsol on Jan 7, 2013 saying:

PR #185 should totally fix this (parallel_tests race condition) issue.

Seven months later, @colszowka wrote:

As @jshraibman-mdsol wrote an (embarassing) while ago, PR #185 should have fixed this problem. It will be released in a couple of minutes as part of version 0.8.0.pre2

I am reporting the same bug 4 years (!!!) later, and I'm looking at PR #185 where a lock pattern is implemented and supposedly merged with master, but I just checked master and it doesn't have it! See the commit:

4f3b4c7

This is a regression bug. The fix PR #185 somehow got lost on master.

@aminariana
Copy link
Author

aminariana commented Feb 17, 2017

The problems I'm seeing are somewhat consistent with @ronwsmith 's description at the bottom of #223 (except I do see LOC numbers)

Note to self:
The locking mechanism has moved to: https://github.com/colszowka/simplecov/blob/257e26394c464c4ab388631b4eff1aa98c37d3f1/lib/simplecov/result_merger.rb

@aminariana
Copy link
Author

aminariana commented Feb 18, 2017

SOLVED!

The SimpleCov race condition is definitely a bug. After 16 hours wasted, I came up with my own UNIX solution, which I will describe to anybody stuck with a parallel_test + SimpleCov mysterious failure problem:

  1. Your testing script should wipe previous coverage like:
rm -rf ./coverage
COVERAGE=true RAILS_ENV=test bundle exec rake parallel:spec
  1. Use the thread-safety of appending to Unix files to (a) count the number of processes starting, (b) count how many processes ended, and (c) whether your current ending process is the last process. Then if you're the last process, see if you have reached the threshold:
# In spec_helper.rb
if ENV['COVERAGE']
  require 'simplecov'
  process_name = "process_#{ENV['TEST_ENV_NUMBER']}##{Process.pid.to_s}"
  SimpleCov.command_name process_name
  threshold = 100
  SimpleCov.minimum_coverage 0
  SimpleCov.start 'rails' do
    %x( mkdir -p ./coverage/ )
    %x( echo "#{process_name}" >> ./coverage/.started )
  end
  SimpleCov.at_exit do
    SimpleCov.result.format!
    actual = SimpleCov.result.covered_percent.to_i
    success = (actual >= threshold)
    status = success ? "DONE" : "..."
    %x( echo "#{process_name}: #{actual} of #{threshold} #{status}" >> ./coverage/.done )
    processes_started = %x[ cat ./coverage/.started | wc -l ].to_i
    processes_done = %x[ cat ./coverage/.done | wc -l ].to_i
    if (processes_started == processes_done)
      print "TOTAL COVERAGE: #{actual} of #{threshold} -"
      if success
        puts "\e[0;32m SUCCESS! \e[0m\n"
        exit 0
      else
        puts "\e[0;31m FAIL! \e[0m\n"
        exit 1
      end
    end
  end
end

Note, the minimum_coverage of 0 is important to trick SimpleCov to calculate coverage. But the our constant threshold is getting compared to actual only when the last process completes, and doesn't rely in any way on SimpleCov buggy logic.

Result for 110% threshold, as expected:

TOTAL COVERAGE: 100 of 110 - FAIL! 

287 examples, 0 failures
Tests Failed

Result for 100% threshold, as expected:

TOTAL COVERAGE: 100 of 100 - SUCCESS! 

287 examples, 0 failures

You can take that to the bank! :)

@aminariana
Copy link
Author

@bf4 I just now after my above workaround started seeing the wisdom of your first reply -- but SimpleCov.pid is always being assigned after forking inside the process, no? I don't see at what level a non-forked SimpleCov class object can exist. With parallel_tests, we're always inside one of the processes, with no "controller" process except the shell script that calls the parallel_tests command.

@aminariana
Copy link
Author

aminariana commented Feb 18, 2017

I should leave one more comment for anybody looking at the history of PRs: using none of the past versions of SimpleCov solved this problem for me, whereas my own workaround did. So it's not a regression bug; it's a real bug that has never been fixed, at least for someone running a Macbook Pro, Rails 5, SimpleCov on Parallel_tests.

Double confirmation:
My Travis Continuous Integration (running Unix) also is passing now, using the workaround I provided above.

Build use to take 10 minutes, now it takes 2 minutes!

@josephks
Copy link

#185 got reverted because it caused problems for Windows users. See the changelog for v0.8.2.

@yosh1kura
Copy link

yosh1kura commented Dec 5, 2017

#559 (comment)

I set it up as per the above URL,

If you do not remove exit 0, it will not correctly judge when test becomes error.

       if success
         puts "\ e [0; 32 m SUCCESS! \ e [0 m \ n"
       else
         puts "\ e [0; 31 m FAIL! \ e [0 m \ n"
         exit 1
       end

@aminariana
Copy link
Author

In the last year since 2017, I've written many more tests, causing my processes to get loads more unevenly. My solution above does fail from time to time. I have found multiple underlying causes. For example, SimpleCov is like Schroeder's cat in that if you look at "SimpleCov.result.covered_percent" at the wrong time, you might alter the outcome.

Also it seems like ParallelTests puts the most load on the last process. I'm now using this heuristic to avoid the race condition by having the last starting process be the one waiting for others. The following solution seems to run into LESS RACE CONDITIONS than my previous (upvoted) solution above, and it's also less verbose and more elegant, though I won't guarantee that it completely avoids race conditions:

if ENV['COVERAGE']
  require 'simplecov'
  SimpleCov.command_name "process_#{ENV['TEST_ENV_NUMBER']}##{Process.pid.to_s}"
  SimpleCov.start 'rails'
  SimpleCov.at_exit do
    if ParallelTests.last_process?
      ParallelTests.wait_for_other_processes_to_finish
      SimpleCov.result.format!
      actual = SimpleCov.result.covered_percent.round(2)
      print "TOTAL COVERAGE: #{actual}% -"
      if (actual == 100)
        puts "\e[0;32m SUCCESS! \e[0m\n"
      else
        puts "\e[0;31m FAIL! \e[0m\n"
        exit 1
      end
    end
  end
end

@fabriciofreitag
Copy link

+1

@PragTob
Copy link
Collaborator

PragTob commented Jun 25, 2019

This should be fixed with the merge of #706 thanks to @f1sherman ! Release hopefully incoming within the next days (0.17.0).

@PragTob PragTob closed this as completed Jun 25, 2019
@PragTob
Copy link
Collaborator

PragTob commented Jun 25, 2019

Ah yes, please reopen/comment if it persits on master/after release.

@wkirby
Copy link

wkirby commented Oct 23, 2019

@PragTob I am, unfortunately, still seeing this issue even after the release of 0.17.1. I have been unable to track down the cause of the problem, but it appears that something about the order in which the processes from parallel_tests finish.

On our CI machine, we run with 8 cores, and thus, by default, 8 spec processes. Whenever a spec that is not spec8 finishes last, that coverage is reported out as the total coverage.

Here is as much relevant information as I can put together:

# spec_helper.rb

SimpleCov.minimum_coverage 80

SimpleCov.command_name 'specs' + (ENV['TEST_ENV_NUMBER'] || '')
SimpleCov.merge_timeout 3600
# .rspec_parallel

--require spec_helper
--format RSpec::Instafail
--format progress
--format RspecJunitFormatter --out coverage/test-results/rspec<%= ENV['TEST_ENV_NUMBER'] %>.xml

And here is some sample output by the CI:

...
Finished in 4 minutes 9.7 seconds (files took 36.17 seconds to load)
197 examples, 0 failures, 2 pending
Coverage (75.72%) is below the expected minimum coverage (80.00%).
Coverage report generated for specs, specs2, specs3, specs4, specs5, specs6, specs8 to /opt/atlassian/pipelines/agent/build/coverage. 7784 / 10280 LOC (75.72%) covered.
........
Finished in 4 minutes 13.6 seconds (files took 35.95 seconds to load)
264 examples, 0 failures, 0 pending
Coverage (54.03%) is below the expected minimum coverage (80.00%).
Coverage report generated for specs7 to /opt/atlassian/pipelines/agent/build/coverage. 5554 / 10280 LOC (54.03%) covered.
SimpleCov failed with exit 2
Tests Failed
1822 examples, 0 failures, 24 pendings
Took 293 seconds (4:53)

I am able to work around this problem by forcing parallel specs to use a different number of processes, but that will eventually fail, and I have to bump the number around again. It's sort of a game of musical chairs: identify a configuration where the final specs process finishes last, and use that until it stops working.

Let me know if I can provide any further information that might help diagnose.

@wkirby
Copy link

wkirby commented Nov 7, 2019

@PragTob I had time to come back to this and add some debugging statements through the SimpleCov.at_exit block, and I no longer think this is a race condition from ParallelSpec processes.

================================================================
SIMPLECOV DEBUGGING
Number of Processes running at Exit: 3
Value of ENV['TEST_ENV_NUMBER']: 5
Is this the Last Process started?: No
Is this SimpleCov's Final Result Process?: No
Covered Precent at Exit: 74.26
================================================================
..............................*......**.....*........................................*....................................
Finished in 3 minutes 53.4 seconds (files took 33.81 seconds to load)
255 examples, 0 failures, 5 pending
Coverage (78.16%) is below the expected minimum coverage (80.00%).
================================================================
SIMPLECOV DEBUGGING
Number of Processes running at Exit: 2
Value of ENV['TEST_ENV_NUMBER']: 4
Is this the Last Process started?: No
Is this SimpleCov's Final Result Process?: No
Covered Precent at Exit: 78.16
================================================================
....................................................
Finished in 4 minutes 26.7 seconds (files took 35.47 seconds to load)
306 examples, 0 failures, 5 pending
Coverage (50.62%) is below the expected minimum coverage (80.00%).
================================================================
SIMPLECOV DEBUGGING
Number of Processes running at Exit: 1
Value of ENV['TEST_ENV_NUMBER']: 7
Is this the Last Process started?: No
Is this SimpleCov's Final Result Process?: Yes
Covered Precent at Exit: 50.62
================================================================

We can clearly see that SimpleCov is correctly identifying the final process to exit, but the call to SimpleCov.result.covered_percent.round(2) is returning the result of just that process's covered percent.

This behavior is still intermittent, and the exact same codebase can pass or fail by adjusting the number of processes running the parallel specs. I'm turning my attention to the ResultMerger class to see if I can identify why that final process does not appear to get the merged result from SimpleCov.result.

@wkirby
Copy link

wkirby commented Nov 7, 2019

@PragTob closing this out as I've found the issue. The issue does not appear to be one with SimpleCov — at least, not directly. We finally tracked the issue down to the SimpleCov::ResultsMerger.results method — which iterates through the stored results and drops out those that don't meet the merge timeout threshold.

What we saw was that the final process was dropping all but the most recent results from the merged results, and tracked that down to Time.now - result.created_at returning a wildly different value in the final process. In the end, the culprit was Timecop, and a test where the time-travel function was never reset. When running in our CI, that test suite was isolated from others, which meant for the final process, Time.now was travelled 2 weeks into the future.

In order to see this behavior we had to have several minor coincidences — we had an error in Timecop usage, that test got chunked into a parallel process that had no other Timecop tests (avoiding something else resetting it), and that chunk was reliably the last one to finish.

I don't think there's any change needed to SimpleCov to address this, except maybe a note in the README that any code that affects the value of Time.now (i.e. Timecop) will potentially cause incorrect coverage reports.

tl;dr messing with time causes unforeseen issues

@PragTob
Copy link
Collaborator

PragTob commented Nov 13, 2019

@wkirby thank you so much for looking into it so much 💚

That... does sound quite gruel overall as in hard to track down. I'd wish there was something I know we could do, maybe I'll come up with something down the line but right now I can think of nothing off the top of my head.

Or well... maybe I do. We maybe use monotonic time. Hm.

@wkirby
Copy link

wkirby commented Nov 13, 2019

I appreciate that. The only thing I thought of was a check if Timecop is installed — the same way there's a check for ParallelTest and if so, make a call to Time.now_without_mock_time instead of Time.now.

That goes further down the line towards handling weird interactions with other gems, though, and I'm loathe to suggest that it's the responsibility of Simplecov to deal with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants