Combine license matches in new LicenseDetection #2961

Merged: 86 commits, Nov 11, 2022

Conversation

@AyanSinhaMahapatra (Member) commented on May 17, 2022

See #2878

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in a uniquely named feature branch and have no merge conflicts 📁

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Remove primary license expression attributes, move matches to last.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Move the LicenseMatch serialization functions into
the LicenseDetection function.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Create LicenseDetection functions and enable them in the API.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Removes the unknown licenses CLI option, as this should be
default behaviour.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Update local unknown license reference code to use
LicenseDetection functions.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
This updates the unknown license dereferencing code to also
look for the referenced filename in root of the codebase.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
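
The lookup described in the commit above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function name `find_referenced_file`, its parameters, and the assumption that the directory of the referencing file is checked before the codebase root are all hypothetical.

```python
from pathlib import PurePosixPath


def find_referenced_file(referenced_filename, referencing_path, codebase_paths, root):
    """
    Return the path of ``referenced_filename`` in the codebase, or None.
    Hypothetical helper: the lookup order is an assumption for illustration.
    """
    paths = set(codebase_paths)

    # Assumed first step: look next to the file that contains the reference.
    sibling = PurePosixPath(referencing_path).parent / referenced_filename
    if str(sibling) in paths:
        return str(sibling)

    # Fallback described in the commit message: look in the codebase root.
    at_root = PurePosixPath(root) / referenced_filename
    if str(at_root) in paths:
        return str(at_root)

    # The reference cannot be resolved; callers must handle this case.
    return None


codebase = {
    "project/COPYING",
    "project/src/main.c",
    "project/lib/LICENSE",
    "project/lib/util.c",
}
found_at_root = find_referenced_file("COPYING", "project/src/main.c", codebase, "project")
found_as_sibling = find_referenced_file("LICENSE", "project/lib/util.c", codebase, "project")
not_found = find_referenced_file("NOTICE", "project/src/main.c", codebase, "project")
```

Returning `None` instead of raising keeps the unresolved case explicit, so a caller can leave the unknown reference in place when no target file exists.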
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Adds tests for:

- unknown license intros
- unknown local license references

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>

@pombredanne (Member) left a comment

Thanks! That's quite a piece of work! 🙇 I provided some comments, but I have not reviewed everything yet. Of note: you should use LicenseMatch and other objects everywhere rather than carrying around dictionaries of scan data.
Instead, deserialize these to objects at the boundaries, as early as possible.
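
One way to read this suggestion is sketched below. It is only an illustration of converting at the boundaries: `MatchRecord`, its fields, and `combined_line_span` are hypothetical names and are not ScanCode's actual `LicenseMatch` API.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class MatchRecord:
    """Hypothetical stand-in for a match object; not ScanCode's LicenseMatch."""
    license_expression: str
    start_line: int
    end_line: int
    score: float

    @classmethod
    def from_dict(cls, data):
        # Deserialize once, at the boundary where scan data enters the code.
        return cls(
            license_expression=data["license_expression"],
            start_line=int(data["start_line"]),
            end_line=int(data["end_line"]),
            score=float(data["score"]),
        )

    def to_dict(self):
        # Serialize once, at the boundary where results leave the code.
        return asdict(self)


def combined_line_span(matches):
    """Internal logic works on objects, never on raw dictionaries."""
    return (
        min(m.start_line for m in matches),
        max(m.end_line for m in matches),
    )


raw_scan_data = [
    {"license_expression": "mit", "start_line": 1, "end_line": 20, "score": 100},
    {"license_expression": "apache-2.0", "start_line": 25, "end_line": 40, "score": 95.5},
]
matches = [MatchRecord.from_dict(d) for d in raw_scan_data]
span = combined_line_span(matches)
```

With this shape, a missing or misspelled key fails immediately in `from_dict` rather than deep inside the detection logic.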

Resolved review threads:

  • src/licensedcode/cache.py (2 threads, outdated)
  • src/licensedcode/data/rules/lead-in_unknown_2.yml (outdated)
  • src/scancode/api.py (outdated)
  • src/licensedcode/detection.py (6 threads, 5 outdated)

@pombredanne (Member) left a comment

Here is some feedback on the format. We need to streamline it to its essence and avoid repeating details, removing noise where possible.

Updates implementation to address review comments.

- Remove code that wasn't being used
- Update docstrings
- Use constants and Enums
- Fix case where local reference can't be found
- Move back constants to API
- Modify test expectations
- Other misc changes

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Also adds a test without the --license-references option.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
LicenseMatch data now is based on a license-expression instead
of a license key.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Add new command line option to not inline license, licenseDB and license
detection level information.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Modify code to align to the LicenseDetection model everywhere.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
@AyanSinhaMahapatra changed the title from "[WIP] Add LicenseDetection" to "Add LicenseDetection" on Jun 1, 2022
Modify and update spdx and other output plugins to use LicenseDetection
output data format for licenses.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
* Rename `licenses` to `license_detections`
* Rename `license_expressions` to `detected_license_expression`
* Rename `spdx_license_expressions` to `detected_license_expression_spdx`

Also fix other uses of these attributes around scancode.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
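
For consumers of scan results, the renames listed above could be checked with a small helper like the one below. This is a hedged sketch: `RENAMED_ATTRIBUTES` and `find_legacy_attributes` are hypothetical names, the sample resources are made up, and the helper only flags old attribute names. It does not convert values, whose shape may also differ between the old and new attributes.

```python
# Old attribute name -> new attribute name, as listed in the commit message.
RENAMED_ATTRIBUTES = {
    "licenses": "license_detections",
    "license_expressions": "detected_license_expression",
    "spdx_license_expressions": "detected_license_expression_spdx",
}


def find_legacy_attributes(resource):
    """
    Return {old_name: new_name} for each legacy attribute name present in
    the ``resource`` mapping. Hypothetical helper for illustration only.
    """
    return {
        old: new
        for old, new in RENAMED_ATTRIBUTES.items()
        if old in resource
    }


# Made-up sample data, not actual scan output.
old_style_resource = {"path": "src/main.c", "licenses": [], "license_expressions": []}
new_style_resource = {
    "path": "src/main.c",
    "license_detections": [],
    "detected_license_expression": None,
}

legacy_found = find_legacy_attributes(old_style_resource)
legacy_missing = find_legacy_attributes(new_style_resource)
```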
* in case of unknown references being present without top-level
  detected, dereference using license detections in legalese/readme
  files at codebase root.
* add example cases from samba/samba, sugarlabs/physics, debian fusiondirectory,
  and paddlenlp as tests.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
* add a new attribute `rule_url` to LicenseMatch results.
* deleted License attributes 'scancode_data_url' and 'scancode_text_url' and
  replaced them with a single License attribute 'scancode_url'
* regen test expectations accordingly

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
replaces the `package` in referenced_filename with a temporary
`INHERIT_LICENSE_FROM_PACKAGE` which will be removed later in
favour of a flag.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>

@AyanSinhaMahapatra (Member, Author) commented:

@pombredanne this is all ready!

AyanSinhaMahapatra and others added 6 commits November 10, 2022 19:43
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>

Resolved review threads:

  • CHANGELOG.rst (2 threads, outdated)
4 participants