Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLite Crash on Coverage 5.0a2 inside Hypothesis #702

Closed
TylerADavis opened this issue Sep 8, 2018 · 17 comments
Closed

SQLite Crash on Coverage 5.0a2 inside Hypothesis #702

TylerADavis opened this issue Sep 8, 2018 · 17 comments

Comments

@TylerADavis
Copy link

I was running some unit tests using hypothesis (link) which relies on coverage, and I ran into a crash. Specifically, I received the following error message:
coverage.misc.CoverageException: Couldn't use data file '/Users/me/my_project/.coverage': UNIQUE constraint failed: file.path

I have experienced this issue both on macOS and Linux while running python 3.6, and the issue does not persist with COVERAGE_STORAGE=json.

I have included a gist below which contains files to reproduce the error. Specifically, the error occurs when the hypothesis assume statement in test.py fails more than half the time and when wrap is called.

To reproduce the error, run the build.sh script, which will initialize a virtualenv, install required modules, and run a short example which should throw the error.

Additionally, I am including a vagrantfile which will run build.shin a VM, thus providing a clean slate for testing.

https://gist.github.com/TylerADavis/393ee6382b4f645774100ae0c490c5ba

Please let me know if there is any other information I could provide that would be useful.

@nedbat
Copy link
Owner

nedbat commented Sep 8, 2018

Thanks, this reproduces well, even without Vagrant in the mix.

@nedbat
Copy link
Owner

nedbat commented Sep 8, 2018

It looks like Hypothesis is creating a number of CoverageData objects, each of which is trying to write to the .coverage file. @DRMacIver thoughts?

@DRMacIver
Copy link
Contributor

DRMacIver commented Sep 8, 2018

It looks like Hypothesis is creating a number of CoverageData objects, each of which is trying to write to the .coverage file. @DRMacIver thoughts?

Hmm. I thought the way Hypothesis was using those CoverageData objects they weren't trying to write to the file at all. I don't know whether my belief there was wrong or whether this has changed as part of the new version of Coverage.

I'm currently arguing that we should pull the Coverage support in Hypothesis anyway (HypothesisWorks/hypothesis#1551). This is not due to problems with coverage-the-library, I just don't think coverage information in general has proven useful in the sort of testing we can feasibly do in normal runs.

Given that, my recommended workaround is to put use_coverage=False in your Hypothesis settings, as I don't think you're losing much by doing that. I'll try to make sure we fix this (probably by removing the feature altogether) before you ship the final version of 5.0.

@DRMacIver
Copy link
Contributor

HypothesisWorks/hypothesis#1564 will fix this.

@nedbat
Copy link
Owner

nedbat commented Sep 8, 2018

@DRMacIver Coverage.py 5.0 uses SQLite as the data storage, instead of a JSON-like file. So the file is created well before any save() call. This perfectly explains the problem. I'm wondering whether to retain a memory-only data scenario also?

@nedbat
Copy link
Owner

nedbat commented Sep 8, 2018

Or at least, avoid creating the file until save()....

@DRMacIver
Copy link
Contributor

In the immediate future, the Hypothesis side of this is going away, so do what you like. :-) In the medium to long-term, I would definitely need to be able to to have in memory only coverage data if I were to keep using coverage in the kind of ways that I want to for coverage-guided testing purposes.

I'm a little confused about how creating the file without saving it causes this error though. It seems like data must be be written to it in order to violate a unique constraint?

@nedbat
Copy link
Owner

nedbat commented Sep 8, 2018

The current SQLite support writes to the .coverage file before save() is called, as data is added to the CoverageData object. That shouldn't be hard to change though.

@nedbat nedbat changed the title SQLite Crash on Coverage 5.0a2 SQLite Crash on Coverage 5.0a2 inside Hypothesis Oct 13, 2018
@anarcat
Copy link

anarcat commented Oct 22, 2018

I am also seeing a reproducible crash when testing feed2exec in python 3.6 and 3.7 with pytest-cov 2.6, which i do not get in python 3.5:

INTERNALERROR> coverage.misc.CoverageException: Couldn't use data file '/home/anarcat/src/feed2exec/.coverage': UNIQUE constraint failed: file.path

i wonder if this could be something with changes in the sqlite3 standard library? or is this really just a bug in 5.0 alpha?

i first thought this was #716 which similarly stumbles upon the new sqlite storage...

thanks!

@nedbat
Copy link
Owner

nedbat commented Oct 22, 2018

@anarcat That is a separate problem: you need to be sure to use [run] parallel = True if your tests run simultaneously: #716 (comment)

@anarcat
Copy link

anarcat commented Oct 23, 2018

i confirm setting parallel = True fixes this magically. it seems weird i need to tweak this - i don't remember enabling parallel builds - but at least it works, thanks! :)

@nedbat
Copy link
Owner

nedbat commented Oct 23, 2018

@anarcat It seems like this is going to happen more and more. Can you help me out? How are you running your tests? What can I tell people to look out for?

@anarcat
Copy link

anarcat commented Oct 23, 2018

I have no idea. :) The source for my project is here. Tests are ran by gitlab CI or locally (it doesn't matter) with this tox.ini file, which in turn calls setuptools (or distutils or whatever it's called) with the pytest-runner plugin. The setup.py test alias is defined in setup.cfg and then pytest magically finds the tests in feed2exec/tests. There's a minor betamax config in conftest.py but it should matter here. I have more test docs if you want as well..

@blueyed
Copy link
Contributor

blueyed commented Oct 23, 2018

I've bisected this to d2f77ab.

The problem appears to be that the file_map is only read when opening the DB, but another subprocess might have added the file later when the ID is needed.

#723 fixes this for me.

My test case is https://github.com/Vimjas/covimerage/ (Vimjas/covimerage@bb84cc9), using tox -e py37-coveragepy5-coverage -- -x. (The initial failure is due to an autouse fixture, but also without this it fails, but later.)

To be clear: I am seeing the problem @anarcat mentioned, and which might be different from the original issue, sorry.

blueyed added a commit to blueyed/covimerage that referenced this issue Oct 23, 2018
blueyed added a commit to Vimjas/covimerage that referenced this issue Oct 23, 2018
@nedbat
Copy link
Owner

nedbat commented Nov 25, 2018

For anarcat's "UNIQUE constraint failed: file.path" problem, I would like people to use parallel=True. Or should I make that happen all the time anyway? I don't think I trust SQLite's concurrency enough to have multiple processes accessing the same database.

@strichter
Copy link
Contributor

@nedbat We did experiments yesterday and sqlite3 does properly lock files during writes.

A few things could be tried to solve the problem:

  1. Since write calls to the database are idempotent, I think it is safe to set the isolation_level to "IMMEDIATE" circumventing the transaction machinery completely. We have run it with this setting against our entire code base run (14k tests, 3 days total runtime) and it has not produced a single failure.

  2. @anarcat If you run your tests with threads, I would try applying this PR: Ensure sqldata thread safety. #760

blueyed added a commit to blueyed/coveragepy that referenced this issue Mar 31, 2019
@nedbat
Copy link
Owner

nedbat commented Apr 8, 2019

I would rather not use "isolation_level = immediate" because in the future, writes to the database may not be idempotent (for example, counting line executions, rather than just true/false per line).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants