Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid URL requirement name collisions #1149

Merged
merged 4 commits into from
May 27, 2020

Conversation

geokala
Copy link
Contributor

@geokala geokala commented May 19, 2020

Fixes #1145.

This avoids cache collisions when using URLs which both end with the same filename where a later dependency depends on an earlier one, e.g.
file:///tmp/mydep1/master.zip
file:///tmp/mydep2/master.zip
(where mydep2 depends on mydep1)

Changelog-friendly one-liner: Avoid name collisions with URL based requirements.

Contributor checklist
  • Provided the tests for the changes.
  • Gave a clear one-line description in the PR (that the maintainers can add to CHANGELOG.md on release).
  • Assign the PR to an existing or new milestone for the target version (following Semantic Versioning).

Copy link
Member

@atugushev atugushev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @geokala,

Thanks for the PR! I've skimmed the patch and noticed that it could break the cache when hashing files. Could you check this out?
https://github.com/jazzband/pip-tools/blob/65f2ebea007ade0dd723cec6f417df1131164eb1/piptools/repositories/pypi.py#L318

@atugushev atugushev changed the title Avoid URL requirement name collisions (#1145) Avoid URL requirement name collisions May 19, 2020
@codecov
Copy link

codecov bot commented May 19, 2020

Codecov Report

Merging #1149 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1149   +/-   ##
=======================================
  Coverage   99.49%   99.49%           
=======================================
  Files          36       36           
  Lines        2752     2773   +21     
  Branches      326      328    +2     
=======================================
+ Hits         2738     2759   +21     
  Misses          8        8           
  Partials        6        6           
Impacted Files Coverage Δ
piptools/repositories/pypi.py 95.52% <100.00%> (+0.11%) ⬆️
tests/conftest.py 100.00% <100.00%> (ø)
tests/test_repository_pypi.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0ce999b...7bb1002. Read the comment docs.

@geokala geokala force-pushed the multiple-requirements-same-name branch from 65f2ebe to ecdf92e Compare May 20, 2020 16:43
@geokala
Copy link
Contributor Author

geokala commented May 20, 2020

Re: Cache directory, good point. I've factored out the logic to a _get_download_path function and used it in both places it appears to be needed.
I've not introduced a specific test for this as I believe it constitutes internal implementation detail rather than external behaviour that should be asserted upon, but let me know if a test is required.

collisions.
"""
if ireq.link:
salt = hashlib.sha256(ireq.link.url_without_fragment.encode()).hexdigest()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've researched pip codebase and have found this function. It probably makes sense to do the same for caching files. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that looks like it'd work nicely. Only issue would be that it's depending on another pip internal function, so increases the risk of breakage every time they update.
If you're comfortable with that then I can switch it to use that this evening.

Otherwise, I can grab the nesting part of it to give the same behaviour and avoid that dependency.

What's your preference?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would grab the nesting part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that'll be in the next push. Just looking at the test now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update _get_download_path in accordance with this function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do- but what's the objective?
I can do a straight copy of that function, which will not break on changes to that function's name, but will break if there are changes to the vendored packages that pip uses. Otherwise, what's currently in this PR is a partial copy which should address the concerns but may cause slight cache bloat.
Is the preference here to reduce cache bloat but at the possible expense of stability? If so, I could change it to call that function directly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention is to kill two birds with one stone. But, I don't want to block this improvement due to mysterious bugs and rather approve this as is. Thanks for your contribution!

@geokala geokala force-pushed the multiple-requirements-same-name branch 2 times, most recently from dfea595 to 07fdf9a Compare May 22, 2020 16:33
@AndydeCleyre
Copy link
Contributor

Can we get a nice explanation of the problem being solved, as the docstring for the test?

@geokala geokala force-pushed the multiple-requirements-same-name branch from 07fdf9a to 9790601 Compare May 23, 2020 09:34
@geokala
Copy link
Contributor Author

geokala commented May 23, 2020

@AndydeCleyre Sure, I've just added that- hopefully it's not too verbose?

@geokala
Copy link
Contributor Author

geokala commented May 23, 2020

Not sure why CI is failing at the moment, will take a look when it deigns to give me logs.

@atugushev
Copy link
Member

@geokala

Not sure why CI is failing at the moment, will take a look when it deigns to give me logs.

I've fixed it in geokala#1. You could merge this if it works.

@atugushev
Copy link
Member

@geokala

FYI, also fixed test test_name_collision due to changes in geokala@2c4c571.

@geokala
Copy link
Contributor Author

geokala commented May 24, 2020

@atugushev Awesome, thanks for that- makes sense, so I've merged it into my PR.

@geokala
Copy link
Contributor Author

geokala commented May 24, 2020

Looks like CI is failing on 2.7 pip latest (I think?):
File "/tmp/easy_install-P_Otzu/jaraco.windows-5.0.0/temp/easy_install-B_XSyo/path.py-12.4.0/temp/easy_install-kjgCqJ/more-itertools-8.3.0/more_itertools/init.py", line 1, in
File "/tmp/easy_install-P_Otzu/jaraco.windows-5.0.0/temp/easy_install-B_XSyo/path.py-12.4.0/temp/easy_install-kjgCqJ/more-itertools-8.3.0/more_itertools/more.py", line 482
yield from iterable
^
SyntaxError: invalid syntax

=================================== log end ====================================
ERROR: FAIL could not package project - v = InvocationError(u'/opt/hostedtoolcache/Python/2.7.18/x64/bin/python setup.py sdist --formats=zip --dist-dir /home/runner/work/pip-tools/pip-tools/.tox/dist', 1)

Has pip already stopped supporting 2.7 for make_sdist or something?

@geokala
Copy link
Contributor Author

geokala commented May 24, 2020

A local run of tox --notest -p auto --parallel-live doesn't give me that error, though of course my environment doesn't match travis.

@geokala
Copy link
Contributor Author

geokala commented May 24, 2020

So yeah, looks like more-itertools is being used by easy_install but doesn't support python2.7 any more- I'm guessing I already had an older version of easy_install installed so didn't run into this problem.
Presumably this means that something needs pinning- easy_install or more-itertools- and presumably it wants pinning in tox?

@atugushev
Copy link
Member

atugushev commented May 25, 2020

Unfortunately, recent setuptools-scm-4.0.0 broke our CI. Working on a fix in #1151.

@atugushev
Copy link
Member

@geokala

Please rebase onto master it should be fixed now.

Madda and others added 4 commits May 26, 2020 08:50
Without changing the dir, manual runs of pytest in the pip-tools
repo will result in sdists with unexpected content

Co-Authored-By: geokala <[email protected]>
@geokala geokala force-pushed the multiple-requirements-same-name branch from e546c9f to 7bb1002 Compare May 26, 2020 07:51
@geokala
Copy link
Contributor Author

geokala commented May 26, 2020

Rebased. Fingers crossed.

Copy link
Member

@atugushev atugushev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@atugushev atugushev added this to the 5.2.0 milestone May 27, 2020
@atugushev atugushev merged commit 008865a into jazzband:master May 27, 2020
@atugushev
Copy link
Member

@geokala

Great first-time contribution 🚀 Thanks!

@geokala
Copy link
Contributor Author

geokala commented May 27, 2020

No worries, and thanks for your help!

@geokala geokala deleted the multiple-requirements-same-name branch May 27, 2020 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multiple entries in requirements.in with the same file cause collision
3 participants