-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
License scanner #12
License scanner #12
Conversation
…inding. Wrapped the graph part in its own function which we may move to another file in the future?
Moving the comment from @martinfontanet here:
Question: The I can see two options:
What do you think would be the best solution for this ? (basically so that |
It's difficult to choose what would be the best option. IMO, we should redefine It would make sense to make Here is an example of what if path is a url:
self.path = git.clone(url) // path is now the folder where the repo was cloned
self.files = FilesMetadata(path)
self.lang = self.files.get_prog_lang()
license_path = self.files.get_license_path()
self.license = LicenseMetadata(license_path) What do you think? |
Makes sense to me. It means that we don't need to worry about finding the license file in |
Ok. I will rewrite the function a bit so that it takes a valid license path as a parameter, and move the rest of the function to the FilesMetaData/Repo class (?). Then the search-for-a-licensefile part of the function can be re-used. Also I will make a new .py file for "turning stuff into triples" so I can move the add_license_to_graph function there. |
Yes, I think this could go into
Great! Thanks, this could also be part of a separate PR |
…. This is now in the files.py class.
Ok, I think this should work. I did ask chatGPT to help me with the class part, so please pay extra attention there. It looks like it works and makes sense to me, but I'm still struggling with wrapping my head around classes: I think that will still take some time. I also have a suspicion that my auto-black reformat is slightly different than yours... I think in the future I don't want to commit the auto-reformat of the other files it also found in the repo (like the tests, git, and init .py's). Should I find a different way to reformat my code into Black so that it matches yours (do you have a link to whatever tool you use?) |
No it works properly :) I think the files in question had never been reformatted 😄 |
* recursive directory search with `os.walk` * more elaborate regex to avoid picking up source files * rename method to an action (locate_licenses) * add docstrings and doctest
* LicenseMetadata restored with docstring * single get_licenses method adapted from find_licenses for multiple paths
For some reason, when running from scancode.api import get_licenses
get_licenses('LICENSE') I keep running into this error: 349 def get_index(force=False, index_all_languages=False):
350 """
351 Return and eventually build and cache a LicenseIndex.
352 """
--> 353 return get_cache(force=force, index_all_languages=index_all_languages).index
AttributeError: 'LicenseCache' object has no attribute 'index' I suspect this is a version issue with scancode-toolkit, as they seem to have recently refactored the license part. |
OK this is actually a known bug in |
I introduced a few bugs when refactoring the code 👀 now the unit test from the docstring passes (i.e. license correctly parsed). If this is OK for @martinfontanet I think this could be merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay!
It looks good to me :)
Code is supposed to: