Do report deduplication after an Analysis run like what is done during the Store command #4042

Open
vodorok opened this issue Oct 16, 2023 · 0 comments
Labels
analyzer 📈 Related to the analyze commands (analysis driver) enhancement 🌟

vodorok commented Oct 16, 2023

Currently, when a report folder is stored and it contains multiple identical reports originating from a header file, only one of those reports is kept and stored.

This can be extremely wasteful for large report folders where the raw report count is in the millions. It is also confusing: when the store command uploads the report folder, it prints the raw report count, which differs from the number of reports actually stored.

See the following examples:
CodeChecker parse <results_folder>

----==== Checker Statistics ====----                                      
-------------------------------------------------------------------------                                                                           
Checker name                               | Severity | Number of reports                                                                           
------------------------------------------------------------------------- 
cppcheck-uninitMemberVar                   | MEDIUM   |                 4
cppcheck-invalidPrintfArgType_uint         | MEDIUM   |                 1
cppcheck-invalidPrintfArgType_sint         | MEDIUM   |                 2
cppcoreguidelines-virtual-class-destructor | MEDIUM   |                12
cppcoreguidelines-special-member-functions | LOW      |                17
bugprone-sizeof-expression                 | HIGH     |                 1 
-------------------------------------------------------------------------
----=================----            
                                     
----==== File Statistics ====----    
--------------------------------     
File name    | Number of reports                                          
--------------------------------                                          
tinyxml2.h   |                36                                          
tinyxml2.cpp |                 1                                          
--------------------------------                                          
----=================----      

Store result on GUI: (screenshot omitted)

CodeChecker store <results_folder>

----==== Checker Statistics ====----                                                                                                                
-------------------------------------------------------------------------                                                                           
Checker name                               | Severity | Number of reports                                                                           
-------------------------------------------------------------------------                                                                           
cppcheck-uninitMemberVar                   | MEDIUM   |                 7                                                                           
cppcheck-invalidPrintfArgType_uint         | MEDIUM   |                 1                                                                           
cppcheck-invalidPrintfArgType_sint         | MEDIUM   |                 4
cppcoreguidelines-virtual-class-destructor | MEDIUM   |                24
cppcoreguidelines-special-member-functions | LOW      |                34                                                                           
bugprone-sizeof-expression                 | HIGH     |                 1
-------------------------------------------------------------------------
----=================----

----==== File Statistics ====----
--------------------------------
File name    | Number of reports
--------------------------------
tinyxml2.h   |                70
tinyxml2.cpp |                 1
--------------------------------
----=================----

It would be beneficial to deduplicate identical reports during or after the analysis run. For the best compatibility, this should use the same algorithm as the store handler.
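To illustrate the idea, here is a minimal sketch of deduplicating reports after an analysis run. It is not the store handler's actual algorithm: the `Report` class, its fields, and the choice of identity key (checker, file, position, message) are assumptions made only for this example. A real implementation would reuse whatever identity the store handler already relies on.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Report:
    """Hypothetical, simplified report record (not CodeChecker's own class)."""
    checker_name: str
    file_path: str
    line: int
    column: int
    message: str


def deduplicate(reports: Iterable[Report]) -> List[Report]:
    """Keep the first occurrence of each identical report, preserving order.

    Two reports count as identical when all of their fields are equal.
    This is an assumed identity key, chosen for illustration only.
    """
    seen = set()
    unique = []
    for report in reports:
        if report in seen:
            # Same finding, e.g. reported again via another translation
            # unit that includes the same header.
            continue
        seen.add(report)
        unique.append(report)
    return unique


if __name__ == "__main__":
    # The same header finding reported from two translation units,
    # plus one finding from a source file.
    raw = [
        Report("cppcheck-uninitMemberVar", "tinyxml2.h", 10, 5, "msg"),
        Report("cppcheck-uninitMemberVar", "tinyxml2.h", 10, 5, "msg"),
        Report("bugprone-sizeof-expression", "tinyxml2.cpp", 42, 9, "msg"),
    ]
    print(len(raw), len(deduplicate(raw)))
```

Running such a pass once at the end of the analysis would make the counts printed locally match what ends up stored, provided the key matches the one used on the server side.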

@vodorok vodorok self-assigned this Oct 16, 2023
@vodorok vodorok added the analyzer 📈 Related to the analyze commands (analysis driver) label Oct 16, 2023
@vodorok vodorok added this to the release 6.23.0 milestone Oct 16, 2023