Update scancode.io models to handle new scancode-toolkit scan fields #436

Closed
JonoYang opened this issue May 11, 2022 · 2 comments
@JonoYang
Member

JonoYang commented May 11, 2022

In scancode JSON output, all Packages detected in a scan are listed in a top-level attribute named packages. Likewise, all detected Dependencies are placed in the top-level dependencies attribute. The packages field can contain multiple copies of the same package if that package was detected multiple times in the same codebase. Each copy has a different package_uid. A package_uid is the purl of the package with a qualifier named uuid that is specific to the scancode run, e.g. pkg:pypi/[email protected]?uuid=9c19275c-c3fe-43dd-b6ec-a4f2bf65810f
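
As a rough illustration of the package_uid format described above, here is a minimal sketch that splits a package_uid into its plain purl and its uuid qualifier using only string operations. The helper name `split_package_uid` and the sample values are hypothetical and not part of scancode-toolkit or scancode.io.

```python
def split_package_uid(package_uid):
    """Return (purl_without_uuid, uuid) for a package_uid string.

    Hypothetical helper: it only strips the ``uuid`` qualifier and leaves
    any other qualifiers and the optional ``#subpath`` untouched.
    """
    main, hash_sep, subpath = package_uid.partition("#")
    base, _, query = main.partition("?")

    uuid = None
    kept = []
    for part in query.split("&"):
        if not part:
            continue
        key, _, value = part.partition("=")
        if key == "uuid":
            uuid = value
        else:
            kept.append(part)

    purl = base
    if kept:
        purl += "?" + "&".join(kept)
    purl += hash_sep + subpath
    return purl, uuid


# Made-up package_uid values: the same package detected twice in one scan.
uids = [
    "pkg:pypi/example@1.0.0?uuid=11111111-1111-1111-1111-111111111111",
    "pkg:pypi/example@1.0.0?uuid=22222222-2222-2222-2222-222222222222",
]
distinct_purls = {split_package_uid(uid)[0] for uid in uids}
```

Both sample values reduce to the same purl, which is why the purl alone cannot tell two detected copies apart.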

For each Resource that belongs to a Package, the Resource's for_packages field is populated with the package_uid of that Package.
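
The link between Resources and Packages could be followed roughly like this. This is only a sketch over a hand-written, heavily trimmed dictionary: it assumes the scan lists Resources under a top-level `files` key, and the paths, the package_uid value, and the function name `resources_by_package_uid` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical, trimmed scan data; real scancode output has many more fields.
scan = {
    "packages": [
        {
            "package_uid": "pkg:pypi/example@1.0.0?uuid=11111111-1111-1111-1111-111111111111",
            "name": "example",
        },
    ],
    "files": [
        {
            "path": "example/setup.py",
            "for_packages": [
                "pkg:pypi/example@1.0.0?uuid=11111111-1111-1111-1111-111111111111",
            ],
        },
        {"path": "README.md", "for_packages": []},
    ],
}


def resources_by_package_uid(scan):
    """Map each package_uid to the paths of the Resources that reference it."""
    index = defaultdict(list)
    for resource in scan.get("files", []):
        for package_uid in resource.get("for_packages", []):
            index[package_uid].append(resource["path"])
    return dict(index)


index = resources_by_package_uid(scan)
```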

We will need to create a DiscoveredDependency model to handle the entries in the new top-level dependencies attribute of a scan.
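
To give a feel for the data such a model would hold, here is a plain-Python sketch using a dataclass instead of a Django model. The class name `DiscoveredDependencySketch` and the field names are assumptions chosen for illustration; the actual model and the exact fields emitted by scancode-toolkit may differ.

```python
from dataclasses import dataclass, fields


@dataclass
class DiscoveredDependencySketch:
    # All field names below are assumptions, not a confirmed schema.
    purl: str
    dependency_uid: str
    for_package_uid: str
    scope: str = ""
    is_runtime: bool = False
    is_optional: bool = False
    is_resolved: bool = False

    @classmethod
    def from_scan_entry(cls, entry):
        """Build an instance from one dependency dict, ignoring unknown keys."""
        known = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in entry.items() if key in known})


# Made-up dependency entry, including one key the sketch does not model.
entry = {
    "purl": "pkg:pypi/requests",
    "dependency_uid": "pkg:pypi/requests?uuid=33333333-3333-3333-3333-333333333333",
    "for_package_uid": "pkg:pypi/example@1.0.0?uuid=11111111-1111-1111-1111-111111111111",
    "scope": "install",
    "is_runtime": True,
    "unmodeled_key": "ignored",
}
dependency = DiscoveredDependencySketch.from_scan_entry(entry)
```

Keeping for_package_uid on the dependency is what would let it be tied back to one specific detected copy of a package rather than to a bare purl.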

We also need to modify the DiscoveredPackage model to better store/query the new package_uid values. Currently, we put the package_uids for a package in the extra_data field.

The serializers of these models will have to be updated as well.

The value in a Resource's for_packages field is not a purl but a package_uid, which identifies a particular instance of a Package detected during a scan. In the new output, multiple copies of the same package can appear in the top-level packages field, each with a different package_uid. We'll have to find a way to keep the package_uids on Resources and to display them properly in the for_packages field of the scancode.io JSON output.

JonoYang added a commit that referenced this issue Jul 21, 2022
    * Update scan_for_application_packages to save detected Package data to the CodebaseResource it is from, then iterate through the CodebaseResources with Package data and use the proper Package handler to process the Package data
    * Create DiscoveredDependency model
    * Add package_data JSON field to CodebaseResource

Signed-off-by: Jono Yang <[email protected]>
JonoYang added a commit that referenced this issue Jul 21, 2022
    * Increase field sizes in DiscoveredDependency

Signed-off-by: Jono Yang <[email protected]>
@pombredanne pombredanne added this to the v32.0.0 milestone Jul 28, 2022
@tdruez
Contributor

tdruez commented Aug 31, 2022

@JonoYang are we now ready to close this one?

@JonoYang
Member Author

@tdruez This is finished now that #486 is merged.
