Fix documentation errors

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
aboutcode-org · Feb 17, 2021 · 1b23422
1 parent 74f3d45

Showing 4 changed files with 43 additions and 29 deletions.
diff --git a/.github/workflows/ci-docs.yml b/.github/workflows/ci-docs.yml
@@ -28,7 +28,7 @@ jobs:
         run:  cd docs && pip install -r requirements.txt	
 
       - name: Check Sphinx Documentation build minimally
-        run: sphinx-build -E source build
+        run: sphinx-build -E ./source ./build
 
       - name: Check for documentation style errors
         run: ./scripts/doc8_style_check.sh

diff --git a/INSTALL.rst b/INSTALL.rst
@@ -1,12 +1,11 @@
 Quickstart - Scancode Plugin
 ----------------------------
 
-``scancode-results-analyzer`` can be installed as a scancode post-scan plugin,
-using ``pip``.
+``scancode-results-analyzer`` can be installed as a scancode post-scan plugin.
 
 1. Clone the Repository and navigate to the ``scancode-results-analyzer`` directory.
 
-2. Configure::
+2. Configure (Installs the requirements, and scancode-toolkit with the plugin)::
 
     ./configure
 

diff --git a/docs/source/how-analysis-is-performed/cases-incorrect-scans.rst b/docs/source/how-analysis-is-performed/cases-incorrect-scans.rst
@@ -70,53 +70,69 @@ All Issue Types
 ---------------
 
 .. list-table::
-    :widths: 15 15
+    :widths: 5 15 15
     :header-rows: 1
 
-    * - ``text/notice/tag/reference``
+    * - ``license``
       - ``issue_type::classification_id``
+      - ``Description``
 
     * - ``text``
       - ``text-legal-lic-files``
+      - The matched text is present in a file whose name is a known legal filename.
 
     * - ``text``
       - ``text-non-legal-lic-files``
+      - The matched license text is present in a file whose name is not a known legal filename.
 
     * - ``text``
-      - ``text-lic-text-fragments``
+      - ``lic-text-fragments``
+      - Only parts of a larger license text are detected.
 
     * - ``notice``
-      - ``notice-and-or-with-notice``
+      - ``and-or-with-notice``
+      - A notice with a complex license expression (i.e. exceptions, choices or combinations).
 
     * - ``notice``
-      - ``notice-single-key-notice``
+      - ``single-key-notice``
+      - A notice with a single license.
 
     * - ``notice``
       - ``notice-has-unknown-match``
+      - License notices with unknown licenses detected.
 
     * - ``notice``
       - ``notice-false-positive``
+      - A piece of code/text is incorrectly detected as a license.
 
     * - ``tag``
-      - ``tag-tag-coverage``
+      - ``tag-low-coverage``
+      - A part of a license tag is detected.
 
     * - ``tag``
-      - ``tag-other-tag-structures``
+      - ``other-tag-structures``
+      - A new/common structure of tags is detected, with scope for being handled differently.
 
     * - ``tag``
       - ``tag-false-positives``
+      - A piece of code/text is incorrectly detected as a license.
 
     * - ``reference``
-      - ``reference-lead-in-or-unknown-refs``
+      - ``lead-in-or-unknown-reference``
+      - Lead-ins to known license references are detected.
 
     * - ``reference``
-      - ``reference-low-coverage-refs``
+      - ``low-coverage-reference``
+      - License references with an incomplete match.
 
     * - ``reference``
       - ``reference-to-local-file``
+      - Matched to an unknown rule as the license information is present in another file,
+        which is referred to in this matched piece of text.
 
     * - ``reference``
       - ``reference-false-positive``
+      - A piece of code/text is incorrectly detected as a license.
 
 .. _case_lic_text:
 

diff --git a/docs/source/how-analysis-is-performed/selecting-incorrect-unique.rst b/docs/source/how-analysis-is-performed/selecting-incorrect-unique.rst
@@ -98,20 +98,21 @@ this is efficient enough, and passes through the list of matches once.
 File-regions with Incorrect Scans
 ---------------------------------
 
-The attribute ``license_scan_analysis_result`` in the analysis results has information on if the
+The attribute ``issue_id`` in the analysis results has information on if the
 file-region has any license detection issue in it, based on coverage values, presence of extra words
 or false positive tags.
 
 .. note::
 
-    The 6 possible values of ``license_scan_analysis_result`` are:
+    The 5 possible values of ``issue_id`` are:
 
-    1. ``correct-license-detection``
-    2. ``imperfect-match-coverage``
-    3. ``near-perfect-match-coverage``
-    4. ``extra-words``
-    5. ``false-positive``
-    6. ``unknown-match``
+    1. ``imperfect-match-coverage``
+    2. ``near-perfect-match-coverage``
+    3. ``extra-words``
+    4. ``false-positive``
+    5. ``unknown-match``
+
+    If we do not have an issue, it is a correct license detection.
 
 Scancode detects most licenses accurately, so our focus is only on the parts where the detection has
 issues, and so primarily in the first step we separate this from the Correct Scans.
@@ -126,7 +127,7 @@ So in ``Step 1``::
     are wrong detections, and also detections where all the matches have a perfect
     ``match_coverage``, i.e. 100.
 
-These fall into the first category::
+These fall into the first category:
 
     1. ``correct-license-detection``
 
@@ -151,7 +152,7 @@ There is also another case where ``score != matched_coverage * rule_relevance``,
 some extra words, i.e. the entire rule was matched, but there were some extra words which caused the
 decrease in score.
 
-So the 3 category of issues as classified in this step are::
+So the 3 categories of issues classified in this step are:
 
     2. ``imperfect-match-coverage``
     3. ``near-perfect-match-coverage``
@@ -165,12 +166,12 @@ less than a threshold (i.e. say less than 4 words) and the start-line of the mat
 be more than a threshold (i.e. say more than 1000) for it to be considered a false positive.
 
 This is the ``Step 3`` and here an NLP sentence classifier could be used to improve accuracy.
-The issue class is called::
+The issue class is called:
 
     5. ``false-positives``
 
 Even if all the matches have perfect `match_coverage`, if there are `unknown` license
-matches there, there's likely a license detection issue. This issue is a::
+matches there, there's likely a license detection issue. This issue is a:
 
     6. ``unknown-match``
 
@@ -212,8 +213,6 @@ I.e. the policy is::
     “matched_rule_identifier” and “match_coverage” across these multiple files, we keep only
     one file among them and discard the others.
 
-This is performed in the summary plugin, where all the unique license detection issues are
-reported in the summary together, each with a list of their occurrences.
 
 For example, in `scancode-toolkit#1920 <https://github.com/nexB/scancode-toolkit/issues/1920>`_, socat-2.0.0 has
 multiple (6) files with each file having the same 3 matched rules and match_coverage sets, i.e. -
@@ -226,5 +225,5 @@ So, we need to keep only one of these files, as the others have the same license
 
 .. note::
 
-    This isn't followed in the ``scancode`` ``post-scan plugin`` as the processing is per-file,
-    and this is a codebase-level operation.
+    This is performed in the summary plugin, where all the unique license detection issues are
+    reported in the summary together, each with a list of their occurrences.
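
The de-duplication policy described in the last hunk (keep one file among those sharing the same set of ``matched_rule_identifier`` and ``match_coverage`` values) could be sketched roughly as follows. This is only an illustration, not the summary plugin's actual code: the input shape (a mapping from file path to a list of match dicts) and the function names ``match_signature``, ``group_by_signature`` and ``unique_files`` are assumptions made for the example.

```python
from typing import Dict, FrozenSet, List, Tuple

# Assumed input shape: {file_path: [match, ...]}, where every match is a dict
# carrying at least the two keys named in the documentation.
Signature = FrozenSet[Tuple[str, float]]


def match_signature(matches: List[dict]) -> Signature:
    """Build an order-independent signature from one file's matches."""
    return frozenset(
        (str(m["matched_rule_identifier"]), float(m["match_coverage"]))
        for m in matches
    )


def group_by_signature(files: Dict[str, List[dict]]) -> Dict[Signature, List[str]]:
    """Group file paths that share the same (rule, coverage) set."""
    groups: Dict[Signature, List[str]] = {}
    for path, matches in files.items():
        groups.setdefault(match_signature(matches), []).append(path)
    return groups


def unique_files(files: Dict[str, List[dict]]) -> List[str]:
    """Keep the first file of every group and discard the others."""
    return [paths[0] for paths in group_by_signature(files).values()]
```

The groups returned by ``group_by_signature`` also give the list of occurrences for each unique issue, which is what the note says the summary reports. Using a ``frozenset`` means the order of matches within a file does not affect whether two files are treated as duplicates.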