[colic] Add copyright flag for extraction of copyright information #50

Merged 2 commits on Aug 9, 2019
2 changes: 1 addition & 1 deletion README.md
@@ -103,7 +103,7 @@ currently available backends are:
 - **CoDep** extracts package and class dependencies of a Python module and serialized them as JSON structures, composed of edges and nodes, thus easing the bridging with front-end technologies for graph visualizations. It combines [PyReverse](https://pypi.org/project/pyreverse/) and [NetworkX](https://networkx.github.io/).
 - **CoQua** retrieves code quality insights, such as checks about line-code’s length, well-formed variable names, unused imported modules and code clones. It uses [PyLint](https://www.pylint.org/) and [Flake8](http://flake8.pycqa.org/en/latest/index.html). The tools can be activated by passing the corresponding category: `code_quality_pylint` or `code_quality_flake8`.
 - **CoVuln** scans the code to identify security vulnerabilities such as potential SQL and Shell injections, hard-coded passwords and weak cryptographic key size. It relies on [Bandit](https://github.com/PyCQA/bandit).
-- **CoLic** scans the code to extract license information. It currently supports [Nomos](https://github.com/fossology/fossology/tree/master/src/nomos) and [ScanCode](https://github.com/nexB/scancode-toolkit). They can be activated by passing the corresponding category: `code_license_nomos`, `code_license_scancode`, or `code_license_scancode_cli`.
+- **CoLic** scans the code to extract license & copyright information. It currently supports [Nomos](https://github.com/fossology/fossology/tree/master/src/nomos) and [ScanCode](https://github.com/nexB/scancode-toolkit). They can be activated by passing the corresponding category: `code_license_nomos`, `code_license_scancode`, or `code_license_scancode_cli`.
 - **CoLang** gathers insights about code language distribution of a git repository. It relies on [Linguist](https://github.com/github/linguist) and [Cloc](http://cloc.sourceforge.net/) tools. They can be activated by passing the corresponding category: `code_language_linguist` or `code_language_cloc`.

 ### How to develop a backend
2 changes: 1 addition & 1 deletion bin/graal
@@ -37,7 +37,7 @@ Repositories are reached using specific backends, which are:
 cocom Fetch code complexity for many programming languages
 codep Fetch package and class dependencies of Python modules
 colang Fetch code language distribution
-colic Fetch license information
+colic Fetch license & copyright information
 coqua Fetch code quality data of Python code
 covuln Fetch security vulnerabilities in Python code

17 changes: 11 additions & 6 deletions graal/backends/core/analyzers/scancode.py
@@ -53,21 +53,26 @@ def __init__(self, exec_path, cli=False):
_ = subprocess.check_output([exec_path]).decode("utf-8")

     def __analyze_scancode(self, file_path):
-        """Add information about license using scancode
+        """Add information about license and copyright using scancode
         :param file_path: file path (in case of scancode)
         """
-        result = {'licenses': []}
+        result = {
+            'licenses': [],
+            'copyrights': [],
+        }
         try:
-            msg = subprocess.check_output([self.exec_path, '--json-pp', '-', '--license', file_path]).decode("utf-8")
+            msg = subprocess.check_output(
+                [self.exec_path, '--json-pp', '-', '--license', '--copyright', file_path]).decode("utf-8")
         except subprocess.CalledProcessError as e:
             raise GraalError(cause="Scancode failed at %s, %s" % (file_path, e.output.decode("utf-8")))
         finally:
             subprocess._cleanup()

-        licenses_raw = json.loads(msg)
-        if 'files' in licenses_raw:
-            result['licenses'] = licenses_raw['files'][0]['licenses']
+        scancode_raw = json.loads(msg)
+        if 'files' in scancode_raw:
+            result['licenses'] = scancode_raw['files'][0]['licenses']
+            result['copyrights'] = scancode_raw['files'][0]['copyrights']

         return result

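The parsing step in this hunk indexes `['files'][0]['copyrights']` directly, which assumes the scanner always emits that key. A minimal sketch of a more defensive variant is shown below. It is not part of this PR: `extract_license_info` and `SAMPLE_OUTPUT` are hypothetical names, and the sample JSON is a trimmed-down stand-in rather than real ScanCode output.

```python
import json

# Hypothetical, trimmed-down stand-in for the scanner's JSON report;
# a real report carries many more fields per file.
SAMPLE_OUTPUT = json.dumps({
    "files": [
        {
            "path": "example.py",
            "licenses": [{"key": "gpl-3.0-plus"}],
            "copyrights": [{"value": "Copyright (c) 2019 Example Author"}],
        }
    ]
})


def extract_license_info(raw_json):
    """Return the 'licenses' and 'copyrights' lists of the first scanned file.

    Missing keys fall back to empty lists instead of raising KeyError.
    """
    result = {'licenses': [], 'copyrights': []}
    report = json.loads(raw_json)

    files = report.get('files') or []
    if files:
        first = files[0]
        result['licenses'] = first.get('licenses', [])
        result['copyrights'] = first.get('copyrights', [])

    return result


info = extract_license_info(SAMPLE_OUTPUT)
print(info['copyrights'])
```

Whether the fallback is desirable depends on the caller: silently returning empty lists hides a scanner run that was started without `--copyright`, so a stricter caller may prefer the direct indexing used in the PR.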
5 changes: 3 additions & 2 deletions graal/backends/core/colic.py
@@ -44,7 +44,7 @@
 class CoLic(Graal):
     """CoLic backend.

-    This class extends the Graal backend. It gathers license information
+    This class extends the Graal backend. It gathers license & copyright information
     using Nomos, Scancode or Scancode-cli

     :param uri: URI of the Git repository
@@ -209,7 +209,8 @@ def analyze(self, file_path):

         :returns a dict containing the results of the analysis, like the one below
         {
-            'licenses': [..]
+            'licenses': [..],
+            'copyrights': [..]
         }
         """
         if self.kind == SCANCODE_CLI:
2 changes: 2 additions & 0 deletions tests/test_colic.py
@@ -282,12 +282,14 @@ def test_analyze(self):
         analysis = license_analyzer.analyze(file_path)

         self.assertIn('licenses', analysis)
+        self.assertIn('copyrights', analysis)

         file_paths = [os.path.join(self.tmp_data_path, ANALYZER_TEST_FILE)]
         license_analyzer = LicenseAnalyzer(SCANCODE_CLI_PATH, kind=SCANCODE_CLI)
         analysis = license_analyzer.analyze(file_paths)

         self.assertIn('licenses', analysis[0])
+        self.assertIn('copyrights', analysis[0])


 class TestCoLicCommand(unittest.TestCase):
2 changes: 2 additions & 0 deletions tests/test_scancode.py
@@ -53,6 +53,7 @@ def test_analyze_scancode(self):
         result = scancode.analyze(**kwargs)

         self.assertIn('licenses', result)
+        self.assertIn('copyrights', result)

     @unittest.mock.patch('subprocess.check_output')
     def test_analyze_error(self, check_output_mock):
@@ -86,6 +87,7 @@ def test_analyze_scancode_cli(self):
         result = scancode_cli.analyze(**kwargs)

         self.assertIn('licenses', result[0])
+        self.assertIn('copyrights', result[0])

     def test_analyze_error(self):
         """Test whether an exception is thrown in case of error"""
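The tests in this PR only assert that the new key exists in the result. A possible complement, sketched below under stated assumptions, is to mock `subprocess.check_output` (as the existing error test does) and also check that the scanner is invoked with both flags. `scan_file`, `TestScanFile`, and the executable path are hypothetical stand-ins for illustration; they are not the project's analyzer or test classes.

```python
import json
import subprocess
import unittest
import unittest.mock


def scan_file(exec_path, file_path):
    """Hypothetical stand-in for the analyzer: run the scanner, parse its JSON."""
    msg = subprocess.check_output(
        [exec_path, '--json-pp', '-', '--license', '--copyright', file_path]
    ).decode("utf-8")
    first = json.loads(msg)['files'][0]
    return {'licenses': first['licenses'], 'copyrights': first['copyrights']}


class TestScanFile(unittest.TestCase):

    @unittest.mock.patch('subprocess.check_output')
    def test_scan_file(self, check_output_mock):
        # Fake scanner output, so no real executable is needed.
        fake = {'files': [{'licenses': [],
                           'copyrights': [{'value': 'Copyright (c) Example'}]}]}
        check_output_mock.return_value = json.dumps(fake).encode('utf-8')

        result = scan_file('/hypothetical/scancode', 'example.py')

        self.assertIn('licenses', result)
        self.assertIn('copyrights', result)

        # The first positional argument is the command list passed to the scanner.
        command = check_output_mock.call_args[0][0]
        self.assertIn('--license', command)
        self.assertIn('--copyright', command)
```

Saved as a module, this can be run with `python -m unittest`. Asserting on the command list guards against a regression where the `--copyright` flag is dropped while the result dict still carries an empty `copyrights` default.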