Adding basic framework for the CLI, no changes to old cli. #261

Open
wants to merge 21 commits into
base: main

shaynakapadia
Collaborator

Summary

This MR adds the basic setup for the new non-interactive surfactant cli.

If merged this pull request will

  • add a surfactant cli load command
  • add a surfactant cli save command
  • add a new subfolder surfactant/cmd/cli_commands that contains the cli classes related to each subcommand (base, save, load)
  • add an initial structure for serializing SBOMs (for now it just saves the SBOM as JSON, because pickling does not work yet)

Proposed changes

The changes here will migrate the existing cli interface to the new structure. Proposed workflow is below:

surfactant cli load sbom.json # Loads the sbom into surfactant; surfactant saves it in ~/.surfactant in a serialized form
surfactant cli find --containerPath=^123* # Loads from serialized form and finds subset that matches args
surfactant cli add --installPath 123/ /bin/ # Adds new install path based on containerpath
surfactant cli merge # merges changes to the subset from find back into the main sbom
surfactant cli find --uuid 123 # Find one entry to edit based on uuid
surfactant cli edit --components="IsAGRAF" # Editing an array by picking the element, this one edits a specific component in this entry
Current Value: {"name": "IsAGRAF", "Vendor": "Rockwell Collins Automation"}
New Value: {"name": "IsAGRAF", "Vendor": ["Rockwell Collins Automation"], "version": "1.2.3"} 
surfactant cli edit --name # Edit a string value
Current Value: oldname.out
New Value: 1.2.3.CPO.out
surfactant cli merge # Merge changes back into the rest of the SBOM
surfactant cli save new_sbom.json # save edited sbom to a new file
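
The find / edit / merge steps in the workflow above can be sketched in plain Python. This is only an illustration under assumed names: the entry layout, the `find` and `merge` helpers, and the sample values are hypothetical and are not taken from the PR's code.

```python
import re

# Hypothetical in-memory SBOM: a list of entry dicts identified by "UUID".
sbom = [
    {"UUID": "a1", "name": "oldname.out", "containerPath": ["123/bin/tool"]},
    {"UUID": "b2", "name": "other.out", "containerPath": ["456/lib/other"]},
]

def find(entries, field, pattern):
    """Return copies of the entries whose `field` values match the regex."""
    regex = re.compile(pattern)
    subset = []
    for entry in entries:
        values = entry.get(field, [])
        if isinstance(values, str):
            values = [values]
        if any(regex.search(value) for value in values):
            subset.append(dict(entry))  # shallow copy so edits stay in the subset
    return subset

def merge(entries, subset):
    """Write edited subset entries back into the main list, matched by UUID."""
    edited = {entry["UUID"]: entry for entry in subset}
    return [edited.get(entry["UUID"], entry) for entry in entries]

subset = find(sbom, "containerPath", r"^123")  # stands in for `cli find`
subset[0]["name"] = "1.2.3.CPO.out"            # stands in for `cli edit --name`
sbom = merge(sbom, subset)                     # stands in for `cli merge`
```

Keeping the subset separate until `merge` is what lets several edits accumulate before the main SBOM is touched.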

@shaynakapadia shaynakapadia self-assigned this Sep 23, 2024
@shaynakapadia shaynakapadia marked this pull request as draft September 23, 2024 22:19
@shaynakapadia shaynakapadia marked this pull request as ready for review October 7, 2024 21:10
Collaborator

@nightlark nightlark left a comment

It will be interesting to see how the performance for (de)serializing larger SBOMs is -- it looks like that will need to happen for every command that gets run?

@nightlark nightlark added the enhancement New feature or request label Oct 18, 2024
@shaynakapadia
Collaborator Author

shaynakapadia commented Oct 21, 2024

> It will be interesting to see how the performance for (de)serializing larger SBOMs is -- it looks like that will need to happen for every command that gets run?

Ran some timing on the surfactant cli load cmd, which both serializes and deserializes. Not sure why the 134 KB and 11.5 MB SBOMs are slower than the next larger ones, but it could be the nesting or something. Right now the serialization isn't really serialization; it's just writing JSON to a file. I was running into issues with Python pickle, so I am working on figuring out a workaround.

| SBOM Size | Avg Time (sec) |
|-----------|----------------|
| 134 KB    | 0.720          |
| 783 KB    | 0.708          |
| 5.8 MB    | 1.356          |
| 11.5 MB   | 3.641          |
| 72.7 MB   | 2.806          |

Update: Compared msgpack to json, and msgpack is a bit faster, mostly on the packing, but also on the other tasks.

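| size     | json_pack | msgpack | json_unpack | msgunpack | json_write | msg_write | json_read | msg_read |
|----------|-----------|---------|-------------|-----------|------------|-----------|-----------|----------|
| 725.7 MB | 14.833    | 12.3212 | 10.531      | 10.486    | 0.1092     | 0.081     | 0.0894    | 0.0704   |
| 72.7 MB  | 1.4576    | 1.1906  | 0.9776      | 0.9380    | 0.0102     | 0.0080    | 0.0084    | 0.0060   |
| 11.5 MB  | 0.8108    | 0.6802  | 2.4992      | 2.4526    | 0.0022     | 0.0012    | 0.0010    | 0.0010   |
| 5.8 MB   | 0.3154    | 0.2556  | 0.5292      | 0.5348    | 0.0010     | 0.0010    | 0.0010    | 0.0010   |
| 783 KB   | 0.0560    | 0.0446  | 0.1458      | 0.1420    | 0.0000     | 0.0000    | 0.0008    | 0.0004   |
| 134 KB   | 0.0090    | 0.0072  | 0.0198      | 0.0198    | 0.0000     | 0.0000    | 0.0000    | 0.0000   |
| 237 B    | 0.0000    | 0.0000  | 0.0000      | 0.0000    | 0.0000     | 0.0000    | 0.0000    | 0.0000   |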
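
For anyone who wants to reproduce numbers like these, a rough timing harness could look like the sketch below. It only covers the stdlib `json` side; the `average_time` helper, the run count of 5, and the placeholder data are assumptions, not the script that produced the table above.

```python
import json
import time

def average_time(func, *args, runs=5):
    """Return the mean wall-clock time of func(*args) over `runs` calls, in seconds."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        func(*args)
        total += time.perf_counter() - start
    return total / runs

# Placeholder data standing in for a loaded SBOM; not a real SBOM layout.
sample = {"software": [{"UUID": str(i), "name": f"file{i}"} for i in range(10_000)]}

packed = json.dumps(sample)
dump_time = average_time(json.dumps, sample)
load_time = average_time(json.loads, packed)
print(f"json.dumps: {dump_time:.4f} s, json.loads: {load_time:.4f} s")
```

The same `average_time` call can wrap any other pack/unpack function to compare formats on identical input.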

@shaynakapadia
Collaborator Author

So pickling is in fact significantly faster if I pickle the class directly. I can't do this unless I take care of the mappingproxy type first, but with some very minimal pre- and post-processing it works:

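import pickle
from types import MappingProxyType

def serialize(sbom):
    # mappingproxy objects cannot be pickled, so swap each field's metadata for a plain dict
    for field in sbom.__dataclass_fields__.values():
        field.metadata = {}
    return pickle.dumps(sbom)

def deserialize(data):
    sbom = pickle.loads(data)
    # Put back the read-only (empty) mappingproxy metadata after loading
    for field in sbom.__dataclass_fields__.values():
        field.metadata = MappingProxyType({})
    return sbom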

Timing results here:

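| Size     | json_dumps | msgpack | pickle_dumps | json_loads | msgunpack | pickle_loads |
|----------|------------|---------|--------------|------------|-----------|--------------|
| 725.7 MB | 15.6412    | 12.4518 | 1.3130       | 10.6634    | 10.5556   | 0.7808       |
| 72.7 MB  | 1.5228     | 1.2606  | 0.0946       | 1.0528     | 1.0048    | 0.0794       |
| 11.5 MB  | 0.8552     | 0.7242  | 0.0304       | 2.6202     | 2.6132    | 0.0502       |
| 5.8 MB   | 0.3380     | 0.2650  | 0.0092       | 0.5446     | 0.5404    | 0.0112       |
| 783 KB   | 0.0560     | 0.0480  | 0.0012       | 0.1524     | 0.1472    | 0.0020       |
| 134 KB   | 0.0090     | 0.0070  | 0.0000       | 0.0202     | 0.0200    | 0.0000       |
| 237 B    | 0.0000     | 0.0000  | 0.0000       | 0.0000     | 0.0000    | 0.0000       |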
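
As background on the mappingproxy issue mentioned in this comment, the small sketch below shows that a `MappingProxyType` object cannot be pickled on its own, while a plain `dict` copy of the same data can. The variable names and sample data are illustrative only and are not part of the PR.

```python
import pickle
from types import MappingProxyType

proxy = MappingProxyType({"key": "value"})

# Pickling the read-only proxy directly raises TypeError.
try:
    pickle.dumps(proxy)
    error = None
except TypeError as exc:
    error = str(exc)

# Converting to a plain dict first makes the same data picklable;
# the proxy can be rebuilt around the loaded dict afterwards.
restored = MappingProxyType(pickle.loads(pickle.dumps(dict(proxy))))
```

This is the same convert-before, restore-after pattern as the pre- and post-processing above, applied to a single value.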

self.sbom_filename = "sbom_cli"
self.subset_filename = "subset_cli"
# Create data directory
self.data_dir = self._get_cli_sbom_dir()
Collaborator

Can the function in the ConfigManager (https://github.com/LLNL/Surfactant/blob/main/surfactant/configmanager.py#L145) get used instead? I think the one difference is it uses Local AppData on Windows -- it doesn't get synchronized between multiple systems, which is probably okay (esp. if the SBOM happens to be very big).
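
To illustrate the kind of per-platform choice being discussed, here is a stdlib-only sketch of picking a non-synchronized, per-user data directory. It is not the ConfigManager code: `cli_data_dir`, its parameters, and the `XDG_DATA_HOME` fallback for non-Windows systems are all assumptions made for the example.

```python
import os
from pathlib import Path

def cli_data_dir(platform=os.name, env=os.environ, home=None):
    """Pick a per-user, non-roaming directory for cached SBOM data (hypothetical helper)."""
    home = Path(home) if home is not None else Path.home()
    if platform == "nt":
        # Local AppData stays on this machine, unlike the roaming profile.
        base = Path(env.get("LOCALAPPDATA", home / "AppData" / "Local"))
    else:
        # Assumed fallback for other systems; the real function may differ.
        base = Path(env.get("XDG_DATA_HOME", home / ".local" / "share"))
    return base / "surfactant"

print(cli_data_dir("posix", {}, "/home/user"))
```

Passing the platform and environment in as arguments keeps the helper easy to test without touching the real environment.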

Collaborator

@nightlark nightlark left a comment

nit: cli_save.py and cli_load.py are using a different docstring style than the other files.

Comment on lines +18 to +23
sbom An internal record of sbom entries the class adds to as it finds more matches.
subset An internal record of the subset of sbom entries from the last cli find call.
sbom_filename: A string value of the filename where the loaded sbom is stored.
subset_filename: A string value of the filename where the current subset result from the "cli find" command is stored.
match_functions A dictionary of functions that provide matching functionality for given SBOM fields (i.e. uuid, sha256, installpath, etc)
camel_case_conversions A dictionary of string conversions from all lowercase to camelcase. Used to convert python click options to match the SBOM attribute's case
Collaborator

Suggested change
- sbom An internal record of sbom entries the class adds to as it finds more matches.
- subset An internal record of the subset of sbom entries from the last cli find call.
- sbom_filename: A string value of the filename where the loaded sbom is stored.
- subset_filename: A string value of the filename where the current subset result from the "cli find" command is stored.
- match_functions A dictionary of functions that provide matching functionality for given SBOM fields (i.e. uuid, sha256, installpath, etc)
- camel_case_conversions A dictionary of string conversions from all lowercase to camelcase. Used to convert python click options to match the SBOM attribute's case
+ sbom: An internal record of sbom entries the class adds to as it finds more matches.
+ subset: An internal record of the subset of sbom entries from the last cli find call.
+ sbom_filename: A string value of the filename where the loaded sbom is stored.
+ subset_filename: A string value of the filename where the current subset result from the "cli find" command is stored.
+ match_functions: A dictionary of functions that provide matching functionality for given SBOM fields (i.e. uuid, sha256, installpath, etc)
+ camel_case_conversions: A dictionary of string conversions from all lowercase to camelcase. Used to convert python click options to match the SBOM attribute's case
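
The `camel_case_conversions` attribute described in this docstring could be built roughly as follows. This is a guess at the shape of the mapping based only on the docstring text: the `SBOM_FIELDS` list and the `to_sbom_field` helper are hypothetical names, not identifiers from the PR.

```python
# Hypothetical SBOM attribute names, used only for this example.
SBOM_FIELDS = ["UUID", "sha256", "installPath", "containerPath", "fileName"]

# Map each all-lowercase option name back to the attribute's real spelling.
camel_case_conversions = {name.lower(): name for name in SBOM_FIELDS}

def to_sbom_field(option_name):
    """Translate a lowercase option name to its SBOM attribute; pass unknown names through."""
    return camel_case_conversions.get(option_name, option_name)

print(to_sbom_field("installpath"))
```

Deriving the mapping from one list of attribute names avoids keeping two hand-written spellings in sync.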
