Cache git dependencies as wheels #7473

Merged
radoering merged 2 commits into python-poetry:master from maksbotan/cache-git-deps
Apr 9, 2023

Conversation

maksbotan
Contributor

Currently, poetry install clones, builds, and installs every git dependency that is not already present in the environment. This is fine on a developer's machine but not in CI, where the environment is always fresh: installing git dependencies takes significant time on every CI run, especially if a dependency has C extensions that need to be built.

This commit builds a wheel for every git dependency that has an exact reference hash in the lock file and is not required to be installed in editable mode, stores that wheel in a cache directory, and installs from it instead of cloning the repository again.
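
The caching rule described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Poetry's actual implementation: `git_cache_dir`, `find_cached_wheel`, and `should_use_cache` are hypothetical names, and the SHA-256 based directory layout is invented for the example.

```python
import hashlib
from pathlib import Path
from typing import Optional


def git_cache_dir(cache_root: Path, url: str, resolved_rev: str) -> Path:
    """Hypothetical helper: map (repository URL, exact commit hash) to a cache directory."""
    key = hashlib.sha256(f"{url}@{resolved_rev}".encode()).hexdigest()
    # Assumed layout: split the digest into short segments so no single
    # directory ends up with a huge number of entries.
    return cache_root / "git" / key[:2] / key[2:4] / key[4:]


def find_cached_wheel(cache_dir: Path) -> Optional[Path]:
    """Return a previously built wheel from cache_dir, or None on a cache miss."""
    if not cache_dir.is_dir():
        return None
    wheels = sorted(cache_dir.glob("*.whl"))
    return wheels[0] if wheels else None


def should_use_cache(resolved_rev: Optional[str], editable: bool) -> bool:
    """Only dependencies pinned to an exact commit and not installed in
    editable mode are candidates for the wheel cache."""
    return bool(resolved_rev) and not editable
```

Keying on the exact commit hash rather than a branch or tag name is what makes reuse safe: a branch can move, while a commit hash always refers to the same source tree.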

Pull Request Check List

  • Added tests for changed code.
  • Updated documentation for changed code.

This is a copy of #6896, which was automatically closed when the feature/wheel-installer-and-builder branch was merged and deleted.

@neersighted @radoering may I ping you about this feature? 🙏

If it's OK, I'll try to write a test.

@maksbotan maksbotan force-pushed the maksbotan/cache-git-deps branch from b1ec957 to ae240c0 Compare March 17, 2023 10:59
@maksbotan
Contributor Author

@sdispater may I ask for your attention on this? Now that 1.4.0 has been released, this change can be incorporated into master. IMO this is an important improvement.

Member

@radoering radoering left a comment

Please note #7621 and see my review comments.

Further, I wonder if it might make sense to mark the cache directories of git dependencies somehow for debugging purposes, e.g. by creating some created_from_git_dependency file? This would also give you an easy way to get rid of all cached wheels for git dependencies without nuking your complete artifacts cache. 🤔

src/poetry/installation/executor.py (outdated)
src/poetry/installation/executor.py (outdated)
src/poetry/installation/executor.py (outdated)
src/poetry/installation/chef.py (outdated)
@maksbotan
Contributor Author

Thank you for the review, @radoering. I've addressed your comments in a separate commit; I'll squash everything before the merge.

> Further, I wonder if it might make sense to mark the cache directories of git dependencies somehow for debugging purposes, e.g. by creating some created_from_git_dependency file? This would also give you an easy way to get rid of all cached wheels for git dependencies without nuking your complete artifacts cache. 🤔

I think this would make sense. What is a proper way to touch a file in poetry, or should I just grab shutil or something for that?

Contributor Author

@maksbotan maksbotan left a comment

Some tests fail after my refactoring. I'll look at that later if the overall approach is OK.

src/poetry/installation/executor.py (outdated)
src/poetry/installation/executor.py (outdated)
src/poetry/installation/chef.py (outdated)
@maksbotan maksbotan force-pushed the maksbotan/cache-git-deps branch from ae240c0 to dc4d7e5 Compare March 20, 2023 21:14
@radoering
Member

> What is a proper way to touch a file in poetry, or should I just grab shutil or something for that?

path.touch()
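
A minimal sketch of the marker-file idea using `pathlib`, for illustration only. The marker name comes from the review suggestion above; whether the merged code uses that exact name is not confirmed in this thread, and `mark_git_cache_dir` and `git_cache_dirs` are hypothetical helpers.

```python
from pathlib import Path
from typing import List

# Name suggested in the review; treat it as an assumption, not the merged value.
MARKER_NAME = "created_from_git_dependency"


def mark_git_cache_dir(cache_dir: Path) -> Path:
    """Create the cache directory if needed and drop an empty marker file in it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    marker = cache_dir / MARKER_NAME
    # Path.touch() creates an empty file, or only updates the modification
    # time if the file already exists, so calling this twice is harmless.
    marker.touch()
    return marker


def git_cache_dirs(artifacts_root: Path) -> List[Path]:
    """List every cache directory below artifacts_root that carries the marker."""
    return sorted(p.parent for p in artifacts_root.rglob(MARKER_NAME))
```

With such a marker in place, the directories returned by `git_cache_dirs` could be deleted selectively, leaving the rest of the artifacts cache untouched.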

@maksbotan maksbotan force-pushed the maksbotan/cache-git-deps branch from a1f1fce to 0047164 Compare April 4, 2023 13:06
@maksbotan
Contributor Author

@radoering I've addressed your comments and fixed the failing tests.

Member

@radoering radoering left a comment

LGTM after some minor changes and with some tests.

  • in test_cache.py, we should add tests for get_cache_directory_for_git and get_cached_archive_for_git
  • in test_executor.py, we probably should extend test_executor_should_write_pep610_url_references_for_git similar to test_executor_should_write_pep610_url_references_for_wheel_urls or test_executor_should_write_pep610_url_references_for_non_wheel_urls

src/poetry/installation/executor.py (outdated)
src/poetry/installation/executor.py (outdated)
src/poetry/installation/executor.py (outdated)
src/poetry/utils/cache.py (outdated)
src/poetry/utils/cache.py (outdated)
@maksbotan maksbotan force-pushed the maksbotan/cache-git-deps branch from 0047164 to e7c71af Compare April 7, 2023 13:27
Contributor Author

@maksbotan maksbotan left a comment

Hi @radoering!

I've addressed your comments and added some tests. I did not add a test for get_cached_archive_for_git, as it is a wrapper around _get_cached_archive, which is tested elsewhere anyway. Moreover, there was no test for get_cached_archive_for_link anyway.

src/poetry/utils/cache.py (outdated)
src/poetry/installation/executor.py (outdated)
src/poetry/installation/executor.py (outdated)
src/poetry/installation/executor.py (outdated)
src/poetry/utils/cache.py (outdated)
@radoering
Member

> Moreover, there was no test for get_cached_archive_for_link anyway.

Actually, there are two tests for get_cached_archive_for_link. 😉 Apart from that, you're right. I just added a small smoke test that checks that none of the internal assertions is raised.

Further, I fixed the lost cleanup for editable git installs. Thereby, I noticed that the content of PEP 610 direct_url.json had been wrong for editable git installs and fixed that, too. Please take a look at my commits and verify that they do not break anything for you.
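
For context, PEP 610 defines different `direct_url.json` shapes for a VCS install and for an editable install from a local directory. The sketch below builds both. It only illustrates the format from the PEP; the thread does not say which fields were wrong before the fix, and `vcs_direct_url` and `editable_dir_direct_url` are hypothetical helpers, not Poetry functions.

```python
import json
from pathlib import Path
from typing import Optional


def vcs_direct_url(url: str, commit_id: str, requested_revision: Optional[str] = None) -> dict:
    """PEP 610 record for a package installed directly from a git repository."""
    vcs_info = {"vcs": "git", "commit_id": commit_id}
    if requested_revision is not None:
        # Optional in PEP 610: the branch, tag, or ref the user asked for.
        vcs_info["requested_revision"] = requested_revision
    return {"url": url, "vcs_info": vcs_info}


def editable_dir_direct_url(source_dir: Path) -> dict:
    """PEP 610 record for an editable install from a local source directory."""
    return {
        "url": source_dir.resolve().as_uri(),  # file:// URL of the checkout
        "dir_info": {"editable": True},
    }


if __name__ == "__main__":
    # Example values are placeholders, not taken from the pull request.
    print(json.dumps(vcs_direct_url("https://example.com/repo.git", "0123abcd", "main"), indent=2))
    print(json.dumps(editable_dir_direct_url(Path(".")), indent=2))
```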

@maksbotan
Contributor Author

@radoering thank you!

I've run a few tests with my production use case and this branch seems to work correctly.

Should I rebase these commits now? Or perhaps squash into one?

maksbotan and others added 2 commits April 9, 2023 21:22
@radoering radoering force-pushed the maksbotan/cache-git-deps branch from 9116742 to d5cee69 Compare April 9, 2023 19:23
@radoering radoering enabled auto-merge (rebase) April 9, 2023 19:25
@radoering radoering merged commit d5f83ff into python-poetry:master Apr 9, 2023
radoering pushed a commit that referenced this pull request Apr 9, 2023
@maksbotan
Contributor Author

Thank you! Could you give an estimate of when this will land in a release? I understand it would be in the 1.5.x series?

@radoering
Member

It will be in 1.5. I wouldn't expect it in April. 1.4.2 was released about a week ago and we didn't do too much for 1.5 until then.

@maksbotan
Contributor Author

Thanks.

mwalbeck pushed a commit to mwalbeck/docker-python-poetry that referenced this pull request May 23, 2023
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [poetry](https://python-poetry.org/) ([source](https://github.com/python-poetry/poetry), [changelog](https://python-poetry.org/history/)) | minor | `1.4.2` -> `1.5.0` |

---

### Release Notes

<details>
<summary>python-poetry/poetry</summary>

### [`v1.5.0`](https://github.com/python-poetry/poetry/blob/HEAD/CHANGELOG.md#150---2023-05-19)

[Compare Source](python-poetry/poetry@1.4.2...1.5.0)

##### Added

-   **Introduce the new source priorities `explicit` and `supplemental`** ([#7658](python-poetry/poetry#7658),
    [#6879](python-poetry/poetry#6879)).
-   **Introduce the option to configure the priority of the implicit PyPI source** ([#7801](python-poetry/poetry#7801)).
-   Add handling for corrupt cache files ([#7453](python-poetry/poetry#7453)).
-   Improve caching of URL and git dependencies ([#7693](python-poetry/poetry#7693),
    [#7473](python-poetry/poetry#7473)).
-   Add option to skip installing directory dependencies ([#6845](python-poetry/poetry#6845),
    [#7923](python-poetry/poetry#7923)).
-   Add `--executable` option to `poetry env info` ([#7547](python-poetry/poetry#7547)).
-   Add `--top-level` option to `poetry show` ([#7415](python-poetry/poetry#7415)).
-   Add `--lock` option to `poetry remove` ([#7917](python-poetry/poetry#7917)).
-   Add experimental `POETRY_REQUESTS_TIMEOUT` option ([#7081](python-poetry/poetry#7081)).
-   Improve performance of wheel inspection by avoiding unnecessary file copy operations ([#7916](python-poetry/poetry#7916)).

##### Changed

-   **Remove the old deprecated installer and the corresponding setting `experimental.new-installer`** ([#7356](python-poetry/poetry#7356)).
-   **Introduce `priority` key for sources and deprecate flags `default` and `secondary`** ([#7658](python-poetry/poetry#7658)).
-   Deprecate `poetry run <entry point>` if the entry point was not previously installed via `poetry install` ([#7606](python-poetry/poetry#7606)).
-   Only write the lock file if the installation succeeds ([#7498](python-poetry/poetry#7498)).
-   Do not write the unused package category into the lock file ([#7637](python-poetry/poetry#7637)).

##### Fixed

-   Fix an issue where Poetry's internal pyproject.toml continually grows larger with empty lines ([#7705](python-poetry/poetry#7705)).
-   Fix an issue where Poetry crashes due to corrupt cache files ([#7453](python-poetry/poetry#7453)).
-   Fix an issue where the `Retry-After` in HTTP responses was not respected and retries were handled inconsistently ([#7072](python-poetry/poetry#7072)).
-   Fix an issue where Poetry silently ignored invalid groups ([#7529](python-poetry/poetry#7529)).
-   Fix an issue where Poetry does not find a compatible Python version if not given explicitly ([#7771](python-poetry/poetry#7771)).
-   Fix an issue where the `direct_url.json` of an editable install from a git dependency was invalid ([#7473](python-poetry/poetry#7473)).
-   Fix an issue where error messages from build backends were not decoded correctly ([#7781](python-poetry/poetry#7781)).
-   Fix an infinite loop when adding certain dependencies ([#7405](python-poetry/poetry#7405)).
-   Fix an issue where pre-commit hooks skip pyproject.toml files in subdirectories ([#7239](python-poetry/poetry#7239)).
-   Fix an issue where pre-commit hooks do not use the expected Python version ([#6989](python-poetry/poetry#6989)).
-   Fix an issue where an unclear error message is printed if the project name is the same as one of its dependencies ([#7757](python-poetry/poetry#7757)).
-   Fix an issue where `poetry install` returns a zero exit status even though the build script failed ([#7812](python-poetry/poetry#7812)).
-   Fix an issue where an existing `.venv` was not used if `in-project` was not set ([#7792](python-poetry/poetry#7792)).
-   Fix an issue where multiple extras passed to `poetry add` were not parsed correctly ([#7836](python-poetry/poetry#7836)).
-   Fix an issue where `poetry shell` did not send a newline to `fish` ([#7884](python-poetry/poetry#7884)).
-   Fix an issue where `poetry update --lock` printed operations that were not executed ([#7915](python-poetry/poetry#7915)).
-   Fix an issue where `poetry add --lock` did perform a full update of all dependencies ([#7920](python-poetry/poetry#7920)).
-   Fix an issue where `poetry shell` did not work with `nushell` ([#7919](python-poetry/poetry#7919)).
-   Fix an issue where subprocess calls failed on Python 3.7 ([#7932](python-poetry/poetry#7932)).
-   Fix an issue where keyring was called even though the password was stored in an environment variable ([#7928](python-poetry/poetry#7928)).

##### Docs

-   Add information about what to use instead of `--dev` ([#7647](python-poetry/poetry#7647)).
-   Promote semantic versioning less aggressively ([#7517](python-poetry/poetry#7517)).
-   Explain Poetry's own versioning scheme in the FAQ ([#7517](python-poetry/poetry#7517)).
-   Update documentation for configuration with environment variables ([#6711](python-poetry/poetry#6711)).
-   Add details how to disable the virtualenv prompt ([#7874](python-poetry/poetry#7874)).
-   Improve documentation on whether to commit `poetry.lock` ([#7506](python-poetry/poetry#7506)).
-   Improve documentation of `virtualenv.create` ([#7608](python-poetry/poetry#7608)).

##### poetry-core ([`1.6.0`](https://github.com/python-poetry/poetry-core/releases/tag/1.6.0))

-   Improve error message for invalid markers ([#569](python-poetry/poetry-core#569)).
-   Increase robustness when deleting temporary directories on Windows ([#460](python-poetry/poetry-core#460)).
-   Replace `tomlkit` with `tomli`, which changes the interface of some *internal* classes ([#483](python-poetry/poetry-core#483)).
-   Deprecate `Package.category` ([#561](python-poetry/poetry-core#561)).
-   Fix a performance regression in marker handling ([#568](python-poetry/poetry-core#568)).
-   Fix an issue where wildcard version constraints were not handled correctly ([#402](python-poetry/poetry-core#402)).
-   Fix an issue where `poetry build` created duplicate Python classifiers if they were specified manually ([#578](python-poetry/poetry-core#578)).
-   Fix an issue where local versions were not handled correctly ([#579](python-poetry/poetry-core#579)).

</details>

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://git.walbeck.it/walbeck-it/docker-python-poetry/pulls/717
Co-authored-by: renovate-bot
Co-committed-by: renovate-bot

github-actions bot commented Mar 3, 2024

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 3, 2024