Support multiple input path in CLI #875 #1397 #1399

Merged
pombredanne merged 9 commits into develop from 1397-multiple-inputs on Mar 7, 2019

pombredanne
Member

@pombredanne pombredanne commented Feb 26, 2019

This adds support for #875 and #1397, that is, multiple input paths and a new --include option.

  • you can now enter multiple input paths, with the constraint that they must all be relative paths and share a common root directory.
  • this also adds a new --include option that works in tandem with --ignore and has the same semantics. Includes are processed first, then ignores are applied on top.
  • it is now possible to run this:
    find samples/ | grep zlib | grep -v ada | grep -v iostrea | xargs scancode --include="*JGroup*" --ignore "*.S" --json-pp -
    or this:
    git diff --name-only master | xargs scancode -i --json-pp -
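
To illustrate the include-then-ignore precedence described above, here is a minimal sketch. It assumes plain glob matching with the standard `fnmatch` module against the whole path and each path segment; the names `select_paths` and `_matches` are hypothetical and are not ScanCode's actual API or matching rules.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def _matches(path, patterns):
    """Return True if the whole path or any of its segments matches a glob pattern."""
    parts = PurePosixPath(path).parts
    return any(
        fnmatchcase(path, pattern)
        or any(fnmatchcase(part, pattern) for part in parts)
        for pattern in patterns
    )


def select_paths(paths, includes=(), ignores=()):
    """
    Keep the paths that pass the includes first, then drop the ignored ones.
    With no include pattern, every path is a candidate (assumed behavior).
    """
    selected = []
    for path in paths:
        # Step 1: includes are processed first.
        if includes and not _matches(path, includes):
            continue
        # Step 2: ignores are applied on top of the included set.
        if _matches(path, ignores):
            continue
        selected.append(path)
    return selected
```

With the patterns from the first command line above, `select_paths(paths, includes=["*JGroup*"], ignores=["*.S"])` would keep only paths containing `JGroup` and then drop any of those ending in `.S`.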

Reported-by: Nico Bucher @nicobucher
Signed-off-by: Philippe Ombredanne [email protected]

@pombredanne
Member Author

@nicobucher ping... were you able to test this branch a bit? feedback is mucho welcomed!

@nicobucher

Hi @pombredanne, I was able to quickly try this out.
It seems to do exactly what I was looking for!

I will play around with this branch in the next days. Perhaps I can give more detailed feedback then.

@pombredanne
Member Author

@nicobucher thank you for the feedback. Note that the current implementation performs poorly because the logic goes this way:

  1. find the root, i.e. the shared ancestor directory of all the file arguments (which must all be relative paths)
  2. walk the whole tree under that root
  3. keep only the files that match the requested paths.

If you have a large codebase (say a Linux kernel) and you modified only one file, this would walk 70K files first. So this can be optimized later if needed.
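
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this branch: `common_root`, `select_requested` and the `base_dir` parameter are hypothetical names, and a requested directory is assumed to select every file below it.

```python
import os


def common_root(relative_paths):
    """Step 1: return the shared ancestor of the inputs, which must all be relative."""
    for path in relative_paths:
        if os.path.isabs(path):
            raise ValueError("input paths must be relative: %r" % (path,))
    return os.path.commonpath([os.path.normpath(p) for p in relative_paths])


def select_requested(relative_paths, base_dir="."):
    """Walk the whole tree under the common root and keep only the requested files."""
    requested = {os.path.normpath(p) for p in relative_paths}
    root = common_root(relative_paths)
    # A single file argument is its own "common path": walk its parent instead.
    if not os.path.isdir(os.path.join(base_dir, root)):
        root = os.path.dirname(root)

    kept = []
    # Step 2: walk everything under the root, even files nobody asked for.
    for dirpath, _dirnames, filenames in os.walk(os.path.join(base_dir, root)):
        for name in filenames:
            rel = os.path.normpath(
                os.path.relpath(os.path.join(dirpath, name), base_dir))
            # Step 3: keep a file if it was requested directly or lies
            # under a requested directory.
            if any(rel == r or rel.startswith(r + os.sep) for r in requested):
                kept.append(rel)
    return sorted(kept)
```

The cost concern is visible in the loop: the walk visits every file under the root and only then filters, so one changed file in a large tree still pays for a full traversal.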

Signed-off-by: Philippe Ombredanne <[email protected]>
The handling of ignores and includes/excludes was not
fully correct and precedence was not given to includes


Signed-off-by: Philippe Ombredanne <[email protected]>
This way ScanCode can be called as a simple function without
the Click UI/UX and requirements

Signed-off-by: Philippe Ombredanne <[email protected]>
Signed-off-by: Philippe Ombredanne <[email protected]>
See ticket #1400 for more details

This is an example of how to call ScanCode as a function from Python 2
or Python 3. The benefit is that once the server process has loaded the
license index and imported its modules, there is no per-call
import/loading penalty anymore.

This uses execnet, which is the multiprocessing library used by
py.test and is therefore a rather stable and high-quality engine.

Signed-off-by: Philippe Ombredanne <[email protected]>
@pombredanne pombredanne force-pushed the 1397-multiple-inputs branch from 767ef38 to 8afa686 Compare March 5, 2019 14:59
@codecov

codecov bot commented Mar 5, 2019

Codecov Report

Merging #1399 into develop will increase coverage by 0.17%.
The diff coverage is 70.55%.

@@             Coverage Diff             @@
##           develop    #1399      +/-   ##
===========================================
+ Coverage    83.65%   83.83%   +0.17%     
===========================================
  Files          119      119              
  Lines        13949    14245     +296     
===========================================
+ Hits         11669    11942     +273     
- Misses        2280     2303      +23
Impacted Files                          Coverage Δ
src/formattedcode/output_jsonlines.py   100% <ø> (ø) ⬆️
src/plugincode/__init__.py              80% <0%> (-2.76%) ⬇️
src/scancode/__init__.py                75% <100%> (+0.3%) ⬆️
src/commoncode/fileutils.py             82.5% <100%> (+1.38%) ⬆️
src/commoncode/ignore.py                72.97% <100%> (ø) ⬆️
src/scancode/plugin_ignore.py           76.36% <68.42%> (-20.52%) ⬇️
src/scancode/cli.py                     77.11% <69.69%> (+1.01%) ⬆️
src/commoncode/fileset.py               70.58% <75.86%> (-3.86%) ⬇️
src/formattedcode/output_json.py        77.58% <90%> (+0.22%) ⬆️
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 887ae13...96069fd.

Signed-off-by: Philippe Ombredanne <[email protected]>
This can be a list or tuple or a string.

Signed-off-by: Philippe Ombredanne <[email protected]>
This is to support a possible namespace registration of all ScanCode
non-SPDX licenses as discussed in #536 and #532

Signed-off-by: Philippe Ombredanne <[email protected]>
@pombredanne pombredanne force-pushed the 1397-multiple-inputs branch from e0096b3 to 96069fd Compare March 6, 2019 18:37
@pombredanne
Member Author

@johnmhoran I moved the script to create one SPDX doc per license to this branch... https://github.com/nexB/scancode-toolkit/pull/1399/files#diff-09a4d8eb1415f0acf8c0e24ec59fc3d3

@johnmhoran
Member

Thanks for the heads-up @pombredanne.

@pombredanne
Member Author

All green ... merging now.

@pombredanne pombredanne merged commit bd57613 into develop Mar 7, 2019
@pombredanne pombredanne deleted the 1397-multiple-inputs branch March 7, 2019 09:07