Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: exclude known instrumentation jars from being erroneously identified #2796

Merged
merged 2 commits into from
Apr 22, 2024

Conversation

kzantow
Copy link
Contributor

@kzantow kzantow commented Apr 19, 2024

This PR adds a check for known manifest keys that are likely only to be present in instrumentation jars in order to omit these from being erroneously identified as the targeted counterparts. For example, a New Relic distribution contains a number of jars named such as spring- and tomcat- which contain only instrumentation code, not the actual products but were being falsely identified as the corresponding spring and tomcat. This PR adds an exclusion when Weave-Classes is found in the manifest to at least omit these from being incorrectly identified.

// their targeted counterparts, e.g. newrelic spring and tomcat instrumentation
if _, ok := manifest.Main.Get("Weave-Classes"); ok {
log.Debugf("excluding archive due to Weave-Classes manifest entry: %s", j.location)
return nil, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to skip matching on this jar entirely, or is there a similar heuristic where we could correctly find the NewRelic jar?

Copy link
Contributor Author

@kzantow kzantow Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are finding the newrelic jar, these are subcomponents of it. E.g. this PR reports:

$go run ./cmd/syft ~/Downloads/newrelic-java-8.9.1.zip 
NAME                       VERSION  TYPE                         
agent-bridge               8.9.1    java-archive                  
agent-bridge-datastore     8.9.1    java-archive                  
newrelic                   8.9.1    java-archive                  
newrelic-api               8.9.1    java-archive  (+1 duplicate)  
newrelic-api-javadoc                java-archive                  
newrelic-api-sources                java-archive                  
newrelic-security-agent    1.1.0    java-archive                  
newrelic-security-api      1.1.0    java-archive                  
newrelic-weaver-api        8.9.1    java-archive                  
newrelic-weaver-scala-api  8.9.1    java-archive  

I don't think it's especially valuable to try to identify these separately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see.

I misunderstood the comment at https://github.com/anchore/syft/pull/2796/files#diff-0965089b69371402427667e60ad9d7ad5698ae0b330b5c7430e2b0c671655fe5R1356 // we expect no packages to be discovered from this - I thought that meant we wouldn't find the New Relic jar at all.

What is meant by that comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is for a manifest with the Weave-Classes attribute being excluded. I updated the comment to hopefully be more clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should note I can't seem to find any vulnerabilities reported against the Java agent or related components anywhere, though. But it seems that the vulnerabilities reported are not against individual components but rather the platform: https://nvd.nist.gov/products/cpe/search/results?namingFormat=2.3&keyword=newrelic

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there are currently no known public vulnerabilities filed against the Newrelic java agent components

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and there are currently no separate maven components published for the instrumentation jars, so when there are vulns in future they would likely be placed against one or multiple of the published maven jars which we are still capturing (though probably the groupie and artifact I'd may be incorrect, but that is a separate issue)

@westonsteimel
Copy link
Contributor

westonsteimel commented Apr 20, 2024

I haven't had a chance to look yet, but does this approach also work for the mentioned newrelic-security-agent.jar from their doc https://docs.newrelic.com/docs/apm/agents/java-agent/troubleshooting/java-agent-identified-with-security-vulnerabilities/?

https://repo1.maven.org/maven2/com/newrelic/agent/java/security/newrelic-security-agent/

I'm guessing it may since it appears that it is also bundled within the Java agent jar

@westonsteimel
Copy link
Contributor

Okay, I verified this also works for newrelic-security-agent.jar

@willmurphyscode
Copy link
Contributor

We had a discussion of this issue over slack; posting a summary here.

Before this change, the behavior is:

  1. Syft finds the newrelic jar itself
  2. Syft mis-identifies the components of the new relic jar as the thing they instrument (so it thinks the jar that instruments spring apps is spring itself, for example).

After this change, the behavior is:

  1. Syft finds the newrelic jar itself
  2. Syft omits the component jars of the new relic jar entirely
  3. Syft omits any jar that has Weave-Classes set in its manifest key entirely

Some research

  1. Weave-Classes is rarely used, only 74 hits on GitHub as of this writing: https://github.com/search?q=%22Weave-Classes%22&type=code
  2. The only places we've seen Weave-Classes in a manifest is as part of a component jar of New Relic; but these component jars don't seem to have CVEs reported directly against them, and we'll still find the parent jar.
  3. @kzantow and @spiffcs tried experiments where they inverted the order of heuristics here to see whether checking the manifest first before the filename would improve things. It instead makes some things, especially CPE generation for the component jars, much much worse. (This is what I was trying to suggest in an earlier comment, but I was misremembering some details of the Java cataloger and didn't ask very clearly.

I think this is good to approve, given the additional research, especially the lack of jars using Weave-Classes in their manifest.

@kzantow is my summary of the "after this change" behavior correct?

@kzantow
Copy link
Contributor Author

kzantow commented Apr 22, 2024

@willmurphyscode your summary is accurate -- to be especially clear: there is the distinct possibility Syft will start missing valid jars. I think the likelihood of that is extremely low; but this should benefit any New Relic Java agent users considerably.

@kzantow kzantow merged commit f7d3d55 into anchore:main Apr 22, 2024
11 checks passed
@kzantow kzantow deleted the fix/exclude-weave-jars branch April 22, 2024 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants