An error occurs when parsing a general text file. #3179

Closed
soimkim opened this issue Dec 23, 2022 · 9 comments

@soimkim
Contributor

soimkim commented Dec 23, 2022

Description

An error occurs when parsing a general text file.
Error Message:

Traceback (most recent call last):
  File "/home/soim/git/venv/lib/python3.8/site-packages/scancode/interrupt.py", line 91, in interruptible
    return NO_ERROR, func(*(args or ()), **(kwargs or {}))
  File "/home/soim/git/venv/lib/python3.8/site-packages/scancode/api.py", line 171, in get_licenses
    idx = cache.get_index()
  File "/home/soim/git/venv/lib/python3.8/site-packages/licensedcode/cache.py", line 353, in get_index
    return get_cache(force=force, index_all_languages=index_all_languages).index
AttributeError: index

How To Reproduce

Run a license scan on the directory that contains the file shown below:
scancode --json-pp - --license .
File:

$ cat fosslight_log_221223_1022.txt
[   INFO] Checking copyright/license writing rules:
  Compliant: OK
  Summary:
    Open Source Package File: N/A
    Detected Licenses: N/A
    Files without license / total: 0 / 0
    Files without copyright / total: 0 / 0
  Files without license and copyright: N/A
  Files without license: N/A
  Files without copyright: N/A
  Tool Info:
    OS: Linux 5.15.0-56-generic
    Analyze path: test
    Python version: 3
    fosslight_prechecker version: fosslight_prechecker v3.0.11

[WARNING] Created file name: /home/soim/git/temp/fosslight_lint_221223_1022.yaml

System configuration

For bug reports, it really helps us to know:

  • What OS are you running on? Ubuntu 20.04
  • What version of scancode-toolkit was used to generate the scan file? scancode-toolkit==31.2.1
  • What installation method was used to install/run scancode? pip
soimkim added the bug label Dec 23, 2022
@soimkim
Contributor Author

soimkim commented Dec 23, 2022

https://github.com/nexB/scancode-toolkit/blob/e72110bb3b0cd41ac269eaf80aa131e06d412ec0/src/licensedcode/cache.py#L413
The AttributeError occurs because the object returned by the load_cache_file function is not a LicenseCache.
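
The failure mode described here can be guarded against in a generic way. The following is a minimal sketch, not ScanCode's actual code: `LicenseCacheSketch` and `load_cache_or_rebuild` are hypothetical names, and the sketch assumes the cache is a pickle file that may deserialize into something other than the expected type.

```python
import pickle


class LicenseCacheSketch:
    """Hypothetical stand-in for a cache object that exposes an ``index``."""

    def __init__(self, index):
        self.index = index


def load_cache_or_rebuild(path, rebuild):
    """Return the cached object at ``path``, or call ``rebuild()`` if it is unusable."""
    try:
        with open(path, "rb") as fh:
            cached = pickle.load(fh)
    except (OSError, EOFError, pickle.UnpicklingError, AttributeError):
        # Missing, truncated, or otherwise unreadable cache file
        cached = None

    # Reject anything of the wrong type or without a usable ``index`` attribute
    if not isinstance(cached, LicenseCacheSketch) or not hasattr(cached, "index"):
        cached = rebuild()
    return cached
```

With a check like this, a stale or incompatible cache file would trigger a rebuild instead of surfacing later as an AttributeError at the point where `.index` is read.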

@soimkim
Contributor Author

soimkim commented Dec 23, 2022

As a workaround, run the following command once after installing ScanCode; the error no longer occurs afterwards.
scancode --reindex-licenses

@pombredanne
Member

@soimkim Thank you for the report. I am struggling with the same issue while making a new release in https://github.com/nexB/scancode-toolkit/tree/v31.2.2-branch-hotfix for the other report you made in #3171; it fails the same way at https://github.com/nexB/scancode-toolkit/actions/runs/3761821678
This is a total head scratcher. It consumed all of yesterday, and it is a top priority!

@AyanSinhaMahapatra
Member

@soimkim this should be fixed in scancode-toolkit 31.2.3, which was released by @pombredanne. Please give it a try!

@soimkim
Contributor Author

soimkim commented Dec 23, 2022

@AyanSinhaMahapatra, the same error still occurs in version 31.2.3.

temp$ ls
fosslight_log_221224_0814.txt
(venv) temp$ pip freeze | grep scancode
scancode-toolkit==31.2.3
(venv) temp$ scancode --json-pp - --license --package .
Setup plugins...
Collect file inventory...
Scan files for: licenses, packages with 1 process(es)...
[####################] 2
{
  "headers": [
    {
      "tool_name": "scancode-toolkit",
      "tool_version": "31.2.3",
      "options": {
        "input": [
          "."
        ],
        "--json-pp": "-",
        "--license": true,
        "--package": true
      },
      "notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.",
      "start_timestamp": "2022-12-23T231553.975677",
      "end_timestamp": "2022-12-23T231556.685130",
      "output_format_version": "2.0.0",
      "duration": 2.7094662189483643,
      "message": null,
      "errors": [
        "Path: temp/fosslight_log_221224_0814.txt"
      ],
      "warnings": [],
      "extra_data": {
        "system_environment": {
          "operating_system": "linux",
          "cpu_architecture": "64",
          "platform": "Linux-5.15.0-56-generic-x86_64-with-glibc2.29",
          "platform_version": "#62~20.04.1-Ubuntu SMP Tue Nov 22 21:24:20 UTC 2022",
          "python_version": "3.8.10 (default, Nov 14 2022, 12:59:47) \n[GCC 9.4.0]"
        },
        "spdx_license_list_version": "3.18",
        "files_count": 1
      }
    }
  ],
  "dependencies": [],
  "packages": [],
  "files": [
    {
      "path": "temp",
      "type": "directory",
      "licenses": [],
      "license_expressions": [],
      "percentage_of_license_text": 0,
      "package_data": [],
      "for_packages": [],
      "scan_errors": []
    },
    {
      "path": "temp/fosslight_log_221224_0814.txt",
      "type": "file",
      "licenses": [],
      "license_expressions": [],
      "percentage_of_license_text": 0,
      "package_data": [],
      "for_packages": [],
      "scan_errors": [
        "ERROR: for scanner: licenses:\nERROR: Unknown error:\nTraceback (most recent call last):\n  File \"/home/soim/git/scanner/fosslight_source_scanner/venv/lib/python3.8/site-packages/scancode/interrupt.py\", line 91, in interruptible\n    return NO_ERROR, func(*(args or ()), **(kwargs or {}))\n  File \"/home/soim/git/scanner/fosslight_source_scanner/venv/lib/python3.8/site-packages/scancode/api.py\", line 171, in get_licenses\n    idx = cache.get_index()\n  File \"/home/soim/git/scanner/fosslight_source_scanner/venv/lib/python3.8/site-packages/licensedcode/cache.py\", line 353, in get_index\n    return get_cache(force=force, index_all_languages=index_all_languages).index\nAttributeError: index\n"
      ]
    }
  ]
}Scanning done.
Some files failed to scan properly:
Path: temp/fosslight_log_221224_0814.txt
Summary:        licenses, packages with 1 process(es)
Errors count:   1
Scan Speed:     25.69 files/sec.
Initial counts: 2 resource(s): 1 file(s) and 1 directorie(s)
Final counts:   2 resource(s): 1 file(s) and 1 directorie(s)
Timings:
  scan_start: 2022-12-23T231553.975677
  scan_end:   2022-12-23T231556.685130
  setup_scan:licenses: 2.66s
  setup: 2.66s
  total: 2.71s
Removing temporary files...done.
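
Failures like the one above are reported per file in the `scan_errors` field of the JSON output. As a minimal sketch, assuming the JSON document has been saved on its own (without the surrounding progress messages), the affected paths can be collected as follows; `files_with_scan_errors` is a hypothetical helper name, not part of ScanCode.

```python
import json


def files_with_scan_errors(scan_json_text):
    """Map each file path to its list of scan errors, skipping clean entries."""
    scan = json.loads(scan_json_text)
    return {
        entry["path"]: entry["scan_errors"]
        for entry in scan.get("files", [])
        if entry.get("scan_errors")
    }


# Tiny sample shaped like the "files" section of the output above
sample = json.dumps({
    "files": [
        {"path": "temp", "type": "directory", "scan_errors": []},
        {
            "path": "temp/fosslight_log_221224_0814.txt",
            "type": "file",
            "scan_errors": ["ERROR: for scanner: licenses: ..."],
        },
    ]
})
failed = files_with_scan_errors(sample)
```

Here `failed` contains only the text file entry, since the directory entry has an empty `scan_errors` list.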

@AyanSinhaMahapatra
Member

AyanSinhaMahapatra commented Jan 9, 2023

I could reproduce this, by the way, with the following setup:

  • What OS are you running on? Ubuntu 20.04
  • What version of scancode-toolkit was used to generate the scan file? scancode-toolkit==31.2.3
  • What installation method was used to install/run scancode? pip
  • Python version: 3.8.10
ERROR: for scanner: licenses:
ERROR: Unknown error:
Traceback (most recent call last):
  File "/home/soim/git/scanner/fosslight_source_scanner/venv/lib/python3.8/site-packages/scancode/interrupt.py", line 91, in interruptible
    return NO_ERROR, func(*(args or ()), **(kwargs or {}))
  File "/home/soim/git/scanner/fosslight_source_scanner/venv/lib/python3.8/site-packages/scancode/api.py", line 171, in get_licenses
    idx = cache.get_index()
  File "/home/soim/git/scanner/fosslight_source_scanner/venv/lib/python3.8/site-packages/licensedcode/cache.py", line 353, in get_index
    return get_cache(force=force, index_all_languages=index_all_languages).index
AttributeError: index

@pombredanne
Member

@soimkim we found a workaround and the source of the bug:

$ pip install --upgrade attrs==22.1.0
Collecting attrs==22.1.0
  Using cached attrs-22.1.0-py2.py3-none-any.whl (58 kB)
Installing collected packages: attrs
  Attempting uninstall: attrs
    Found existing installation: attrs 22.2.0
    Uninstalling attrs-22.2.0:
      Successfully uninstalled attrs-22.2.0
Successfully installed attrs-22.1.0
$ python
Python 3.8.15 (default, Jan  9 2023, 12:00:24) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from licensedcode.cache import *;idx=get_index()
>>> type(idx)
<class 'licensedcode.index.LicenseIndex'>

So the culprit is the new attrs release (22.2.0, per the pip output above) published on December 21st. Objects pickled with the previous version of attrs (here, the pickled license index) cannot be unpickled with the newer version.
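
To illustrate how a change in pickled state format can end in an AttributeError like the one reported, here is a minimal sketch. It is an assumption for illustration only and is not taken from the attrs or ScanCode source: `IndexHolder` is a hypothetical slotted class whose `__setstate__` expects a dict, while the stored state is a tuple.

```python
class IndexHolder:
    """Hypothetical slotted class; not the real LicenseCache."""

    __slots__ = ("index",)

    def __setstate__(self, state):
        # "New" format: state is expected to be a dict of slot name -> value
        for name in self.__slots__:
            if name in state:
                setattr(self, name, state[name])


# "Old" format: state stored as a tuple of values, as an older release might have written it
old_state = ("the-license-index",)

# pickle restores objects roughly like this: create without __init__, then apply the state
restored = IndexHolder.__new__(IndexHolder)
restored.__setstate__(old_state)

try:
    restored.index
except AttributeError:
    # The slot was never filled, so reading it fails
    state_was_lost = True
else:
    state_was_lost = False
```

Restoring the state raises no error in this sketch; the problem only shows up later, when the unset attribute is read.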

We will have a permanent fix in v32.0.
In the meantime, as a recap here are three workarounds:

  • Force-downgrade attrs with pip install --upgrade attrs==22.1.0 after a pip install scancode-toolkit
  • OR use the published constraints to pin the versions to the ones used in the release:
      curl -o requirements.txt https://raw.githubusercontent.com/nexB/scancode-toolkit/v31.2.3/requirements.txt
      pip install scancode-toolkit --constraint requirements.txt
  • OR reindex the licenses once after the scancode-toolkit install with scancode --reindex-licenses
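
Before choosing a workaround, it may help to check which attrs version is actually installed. The following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). The helper names are hypothetical, and the version parsing assumes plain dotted integers such as 22.2.0, so it would not handle pre-release suffixes.

```python
from importlib import metadata

PINNED_ATTRS = "22.1.0"  # version named in the first workaround


def parse_version(text):
    """Turn '22.2.0' into (22, 2, 0); assumes plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def is_newer_than_pinned(installed, pinned=PINNED_ATTRS):
    """Return True if the installed version is newer than the pinned one."""
    return parse_version(installed) > parse_version(pinned)


def installed_attrs_version():
    """Return the installed attrs version string, or None if attrs is absent."""
    try:
        return metadata.version("attrs")
    except metadata.PackageNotFoundError:
        return None
```

If `installed_attrs_version()` returns a version for which `is_newer_than_pinned` is true, one of the three workarounds above applies.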

pombredanne added a commit that referenced this issue Jan 9, 2023
We have vendored attrs only for its use in licensedcode.models.
With this, we avoid updates to the attrs library that would make
unpickling the license index fail.

Reported-by: Soim @soimkim
Reference: #3192
Reference: #3179
Signed-off-by: Philippe Ombredanne <[email protected]>
pombredanne added a commit that referenced this issue Jan 9, 2023
Signed-off-by: Philippe Ombredanne <[email protected]>
@AyanSinhaMahapatra
Member

@soimkim we just pushed a bugfix release, scancode-toolkit 31.2.4 (by @pombredanne). I can no longer reproduce the bug with v31.2.4, so please give it a try and let us know!

@soimkim
Contributor Author

soimkim commented Jan 11, 2023

I confirmed that installing scancode-toolkit 31.2.4 fixes the issue.
Thank you for your quick response.
