-
-
Notifications
You must be signed in to change notification settings - Fork 580
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow cache location configuration #878
Conversation
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
This follows the logic proposed in aboutcode-org#357 and is intended to fix aboutcode-org#685. Signed-off-by: Haiko Schol <[email protected]>
@haikoschol Thank you! |
Codecov Report
@@ Coverage Diff @@
## develop #878 +/- ##
===========================================
+ Coverage 79.06% 79.16% +0.09%
===========================================
Files 93 93
Lines 11456 11454 -2
===========================================
+ Hits 9058 9067 +9
+ Misses 2398 2387 -11
Continue to review full report at Codecov.
|
Signed-off-by: Haiko Schol <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a few comments for your review.
src/commoncode/fileutils.py
Outdated
from functools import partial | ||
from hashlib import md5 | ||
from os.path import abspath | ||
from os.path import dirname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind keeping the imports sorted in a given "block" (e.g. here the stdlib imports)
src/licensedcode/__init__.py
Outdated
from os.path import join | ||
from os.path import exists | ||
|
||
from commoncode import fileutils | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda like 2 empty lines after imports
src/commoncode/fileutils.py
Outdated
else: | ||
cache_root = os.path.expanduser('~') | ||
|
||
# tree_hash = tree_checksum() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this
src/licensedcode/cache.py
Outdated
from os.path import join | ||
from functools import partial |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep imports sorted here and elsewhere
def get_or_build_index_through_cache( | ||
check_consistency=DEV_MODE, | ||
return_index=True, | ||
# used for testing only | ||
_tree_base_dir=src_dir, | ||
_tree_base_dir=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to minimize the need to import global variables. It tricky to avoid circular imports and with lower-case names, it is easy to get them mixed up with local variables.
src/scancode/cache.py
Outdated
|
||
from scancode import scans_cache_dir | ||
|
||
from licensedcode import DEV_MODE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be across all scrancode, not just licensedcode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I thought about what would be a better place for it, probably commoncode
. But it seemed like moving a few of those globals would drag a whole lot of stuff with it, so I opted for keeping changes to a minimum.
I can try to move all globals to commoncode/__init__.py
(and uppercase them while I'm at it). Should I?
src/commoncode/fileutils.py
Outdated
if dev_mode: | ||
cache_root = dirname(dirname(dirname(abspath(__file__)))) | ||
else: | ||
cache_root = os.path.expanduser('~') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may need an abspath too
cache_root = os.environ.get(TMP_DIR_ENV) | ||
|
||
if cache_root and not os.path.exists(cache_root): | ||
raise OSError('{} is set to non-existent directory: {}'.format(TMP_DIR_ENV, cache_root)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why raise an error there? The directory should be created instead IMHO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting the environment variable requires user intervention (unlike automatically picking cache dirs based on install method). Therefore I'd argue the user wants to be notified if there is a typo in the path, for example.
Creating a directory in the wrong place and proceeding with the scan might lead to filling up the wrong file system and could be time-consuming to diagnose.
|
||
if not cache_root: | ||
if dev_mode: | ||
cache_root = dirname(dirname(dirname(abspath(__file__)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the root of scancode's code, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I decided against using the global src_code
to avoid too many imports across modules. But see my comment below regarding DEV_MODE
.
Signed-off-by: Haiko Schol <[email protected]>
Signed-off-by: Haiko Schol <[email protected]>
@pombredanne Can we please get an updated review from you on this? This is blocking to resolve #685 for us. |
@sschuberth Looking at this now! Sorry for the delay |
@haikoschol I would like to rework this on top of the new code in #885 based on the detailed description in #685 (comment) |
As discussed offline, this has been superseded by #885. |
This is a first attempt to address #357 and #685.
I tried to move around as few things as possible, but I'm still not totally happy with the result. IMHO The most problematic part are import side effects, especially the creation of directories.
Instead of the environment variable
SCANCODE_TEMP_DIR
the code looks forSCANCODE_TMP
, because that was already in use incommoncode/fileutils.py
.