Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug: match-results-forward command generates invalid Sarif (duplicated rules). #2505

Merged
merged 13 commits into from
Aug 19, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/ReleaseHistory.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

## Unreleased

* BUGFIX: Remove duplicated rule definitions when executing `match-results-forward` commands for results with sub-rule ids. [#2486](https://github.com/microsoft/sarif-sdk/pull/2486)
* BUGFIX: Update `merge` command to properly produce runs by tool and version when passed the `--merge-runs` argument. [#2488](https://github.com/microsoft/sarif-sdk/pull/2488)
* BUGFIX: Eliminate `IOException` and `DirectoryNotFoundException` exceptions thrown by `merge` command when splitting by rule (due to invalid file characters in rule ids).

Expand Down
8 changes: 5 additions & 3 deletions src/Sarif/Visitors/RunMergingVisitor.cs
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,10 @@ public override Result VisitResult(Result node)
}

// Cache RuleId and set Result.RuleIndex to the (new) index
string ruleId = node.ResolvedRuleId(CurrentRun);
string resultRuleId = node.ResolvedRuleId(CurrentRun);
ReportingDescriptor rule = node.GetRule(CurrentRun);
string ruleId = rule?.Id ?? resultRuleId;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string ruleId = rule?.Id ?? resultRuleId;

Is there a case where rule.Id might not be null and resultRuleId may be another value?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if the ruleId from result (resultRuleId) is a sub ruleId. The rule's Id (rule.Id) is the parent ruleId.

e.g. a result has sub rule id ("SEC105/001/allocator"), and a rule index points to its parent rule definition in rules array.

      "results": [
        {
          "ruleId": "SEC105/001/allocator",
          "ruleIndex": 0,
          ...
        }
          "rules": [
            {
              "id": "SEC105/001", // index 0
              ...
            }

The fix here is to use the parent ruleId instead of sub ruleId as the RuleIdToIndex dictionary's key, to avoid the (parent) rule definition to be added multiple times.


if (!string.IsNullOrEmpty(ruleId))
{
if (RuleIdToIndex.TryGetValue(ruleId, out int ruleIndex))
Copy link
Collaborator Author

@yongyan-gh yongyan-gh Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RuleIdToIndex

It uses this rule-id to index mapping to track all referenced rules.
If there are multiple unique sub ruleIds of a same parent rule in the results, it caches the same parent rule for each unique sub ruleId, which caused the invalid Sarif output (same rule duplicated multiple times) #ByDesign

Expand All @@ -87,15 +90,14 @@ public override Result VisitResult(Result node)
}
else
{
ReportingDescriptor rule = node.GetRule(CurrentRun);
int newIndex = Rules.Count;
Rules.Add(rule);

RuleIdToIndex[ruleId] = newIndex;
node.RuleIndex = newIndex;
}

node.RuleId = ruleId;
node.RuleId ??= resultRuleId;
}

Result result = base.VisitResult(node);
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,9 @@
// Copyright (c) Microsoft. All rights reserved.
// Licensed under the MIT license. See LICENSE file in the project root for full license information.

using System;
using System.Collections.Generic;
using System.Linq;

using FluentAssertions;

Expand Down Expand Up @@ -359,5 +361,135 @@ private void VerifyRuleMatches(Run mergedRun, Run sourceRun, int firstResultInde
firstResultIndex++;
}
}

[Fact]
public void RunMergingVisitor_MergeSubRulesProperly()
{
var previousRun = new Run
{
Tool = new Tool
{
Driver = new ToolComponent
{
Name = "Test Tool",
Rules = new List<ReportingDescriptor>()
{
new ReportingDescriptor() { Id = "Rule/001" },
new ReportingDescriptor() { Id = "Rule/002" },
new ReportingDescriptor() { Id = "Rule/003" }, // not referenced by this run's results
new ReportingDescriptor() { Id = "Rule004" },
new ReportingDescriptor() { Id = "Rule/006" }, // no reference
}
},
},
Results = new List<Result>
{
new Result { RuleId = "Rule/001/SubRule1", RuleIndex = 0 },
new Result { RuleId = "Rule/001/SubRule2", RuleIndex = 0 },
new Result { RuleId = "Rule/002/SubRule1", RuleIndex = 1 },
new Result { RuleId = "Rule/002/SubRule2", RuleIndex = 1 },
new Result { RuleId = "Rule004", RuleIndex = 3 }, // not a sub rule
}
};

var currentRun = new Run
{
Tool = new Tool
{
Driver = new ToolComponent
{
Name = "Test Tool",
Rules = new List<ReportingDescriptor>()
{
new ReportingDescriptor() { Id = "Rule/004" }, // no reference
new ReportingDescriptor() { Id = "Rule/002" },
new ReportingDescriptor() { Id = "Rule/003" },
new ReportingDescriptor() { Id = "Rule/005" },
}
},
},
Results = new List<Result>
{
new Result { RuleId = "Rule/002/SubRule3", RuleIndex = 1 },
new Result { RuleId = "Rule/002/SubRule4", RuleIndex = 1 },
new Result { RuleId = "Rule/003/SubRule1", RuleIndex = 2 },
new Result { RuleId = "Rule/003/SubRule2", RuleIndex = 2 },
new Result { RuleIndex = 3 }, // RuleId is not set ("Rule/005")
new Result { RuleId = "Rule/DoesNotExist" }, // rule definition does not exist, should create a empty rule for it.
}
};

Run mergedRun = currentRun.DeepClone();
Run baselineRunCopy = previousRun.DeepClone();

mergedRun.MergeResultsFrom(baselineRunCopy);

// We should get Rule/001, Rule/002, Rule/003, Rule004, Rule/005 and 1 extra rule "Rule/DoesNotExist".
// Rule/004 and Rule/006 weren't referenced.
var expectedRules = new Dictionary<string, int>();
mergedRun.Tool.Driver.Rules.Count.Should().Be(6);
foreach (ReportingDescriptor rule in mergedRun.Tool.Driver.Rules)
{
if (!expectedRules.ContainsKey(rule.Id))
{
expectedRules[rule.Id] = 0;
}
expectedRules[rule.Id]++;
}

expectedRules["Rule/001"].Should().Be(1);
expectedRules["Rule/002"].Should().Be(1);
expectedRules["Rule/003"].Should().Be(1);
expectedRules["Rule004"].Should().Be(1);
expectedRules["Rule/005"].Should().Be(1);
expectedRules["Rule/DoesNotExist"].Should().Be(1);
expectedRules.ContainsKey("Rule/004").Should().BeFalse();
expectedRules.ContainsKey("Rule/006").Should().BeFalse();

mergedRun.Results.Count.Should().Be(11);
foreach (Result result in mergedRun.Results)
{
string subRuleId = result.RuleId;
string parentRuleId = result.GetRule(mergedRun).Id;
if (subRuleId != parentRuleId)
{
subRuleId.StartsWith(parentRuleId).Should().BeTrue();
}
}
Copy link
Contributor

@marmegh marmegh Aug 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can consolidate all of this:

int rule_001Count = 0;
int rule_002Count = 0;
int rule_003Count = 0;
int rule004Count = 0;
int rule_005Count = 0;
int rule_006Count = 0;
int rule_DoesNotExistCount = 0;

mergedRun.Results.Count.Should().Be(11); //Even better if you have a way of calculating this expected value from the inputs above.

foreach (Result result in mergedRune.Results)
{
    ...// keep everything the same and add something like this:
   if(result.Id == "Rule/001")
   {
       rule_001Count+=1;
   }
}

rule_001Count.Should().Be(expectedRule001Count);
//etc

We're trying to avoid iterating over the same lists multiple times. Let me know if this makes sense.
#Resolved

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, updated & removed duplicated iterations

}

[Fact]
public void RunMergingVisitor_WhenCurrentRunIsNull_ShouldThrowException()
{
Run currentRun = null;
var baselineRun = new Run
{
Tool = new Tool
{
Driver = new ToolComponent
{
Name = "Test Tool",
Rules = new List<ReportingDescriptor>()
{
new ReportingDescriptor() { Id = "Rule001" },
new ReportingDescriptor() { Id = "Rule002" },
}
},
},
Results = new List<Result>
{
new Result { RuleIndex = 1 },
new Result { RuleIndex = 0 },
new Result { RuleIndex = 1 }
}
};

RunMergingVisitor currentVisitor = new RunMergingVisitor();
currentVisitor.Visit(baselineRun);

currentVisitor.CurrentRun = currentRun;
Action action = () => currentVisitor.VisitResult(baselineRun.Results[0]);
action.Should().Throw<InvalidOperationException>("Expect VisitResult method throw InvalidOperationException when RunMergingVisitor.CurrentRun is null");
}
}
}