This driver runs select NuGet Package Explorer analyzers on NuGet packages and saves the results to CSV. This driver was added to investigate the extent of package reproducibility on NuGet.org.
|                                    |      |
| ---------------------------------- | ---- |
| `CatalogScanDriverType` enum value | `NuGetPackageExplorerToCsv` |
| Driver implementation | `NuGetPackageExplorerToCsvDriver` |
| Processing mode | process latest catalog leaf per package ID and version |
| Cursor dependencies | V3 package content: this driver needs the .nupkg from the package content resource |
| Components using driver output | Kusto ingestion via `KustoIngestionMessageProcessor`, since this driver produces CSV data |
| Temporary storage config | Table Storage:<br />`CsvRecordTableName` (name prefix): holds CSV records before they are added to a CSV blob<br />`TaskStateTableName` (name prefix): tracks completion of CSV blob aggregation |
| Persistent storage config | Blob Storage:<br />`NuGetPackageExplorerContainerName`: contains CSVs for the `NuGetPackageExplorers` table<br />`NuGetPackageExplorerFileContainerName`: contains CSVs for the `NuGetPackageExplorerFiles` table |
| Output CSV tables | `NuGetPackageExplorerFiles`<br />`NuGetPackageExplorers` |
This driver downloads the whole package (.nupkg) to disk from the NuGet.org V3 package content resource. This is required because the NuGet Package Explorer APIs cannot operate on a generic, seekable package stream. After the download completes, a NuGet Package Explorer `ZipPackage` is instantiated and passed to the NuGet Package Explorer `SymbolValidator`, which performs much of the desired validation.
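The download step can be sketched roughly as follows. This is an illustrative Python sketch, not the driver's implementation (the driver itself is a .NET component). It assumes the public NuGet.org package content base URL and a version string that is already normalized; `nupkg_url` and `download_nupkg` are hypothetical helper names.

```python
import shutil
import tempfile
import urllib.request

# Assumption: the NuGet.org V3 package content ("flat container") base URL.
PACKAGE_CONTENT_BASE = "https://api.nuget.org/v3-flatcontainer"


def nupkg_url(package_id: str, version: str, base_url: str = PACKAGE_CONTENT_BASE) -> str:
    """Build the .nupkg URL. The resource expects a lowercase ID and version."""
    lower_id = package_id.lower()
    lower_version = version.lower()  # assumed to be normalized already
    return f"{base_url}/{lower_id}/{lower_version}/{lower_id}.{lower_version}.nupkg"


def download_nupkg(package_id: str, version: str) -> str:
    """Download the whole package to a temporary file and return its path."""
    with urllib.request.urlopen(nupkg_url(package_id, version)) as response:
        # delete=False keeps the file on disk so a later step can open it by path.
        with tempfile.NamedTemporaryFile(suffix=".nupkg", delete=False) as temp_file:
            shutil.copyfileobj(response, temp_file)
            return temp_file.name
```

The caller is responsible for deleting the temporary file once analysis is finished.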
NuGet Package Explorer validation has many failure modes, so the result CSV record captures the failure instead of throwing an exception and blocking the processing pipeline. For example, analysis of some packages fails with a timeout, while for others it fails because of a poorly (or creatively) authored package structure or metadata.
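The idea of recording a failure in the output row rather than throwing can be illustrated with a small Python sketch. The result type names, the `AnalysisRecord` shape, and the `analyze` helper are all hypothetical and are not taken from the driver's actual schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class ResultType(Enum):
    # Hypothetical result types, for illustration only.
    AVAILABLE = "Available"
    TIMEOUT = "Timeout"
    INVALID_METADATA = "InvalidMetadata"


@dataclass
class AnalysisRecord:
    package_id: str
    version: str
    result_type: ResultType
    detail: Optional[str] = None


def analyze(package_id: str, version: str, validator: Callable[[], str]) -> AnalysisRecord:
    """Run the validator and map known failures to a record instead of raising."""
    try:
        detail = validator()
    except TimeoutError as ex:
        return AnalysisRecord(package_id, version, ResultType.TIMEOUT, str(ex))
    except ValueError as ex:
        # Stand-in for a badly authored package structure or metadata.
        return AnalysisRecord(package_id, version, ResultType.INVALID_METADATA, str(ex))
    return AnalysisRecord(package_id, version, ResultType.AVAILABLE, detail)
```

With this shape, every package yields exactly one row, and failed analyses can still be counted and grouped by result type downstream.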
Package-level analysis is stored in the `NuGetPackageExplorers` table. More granular file-level analysis is stored in the `NuGetPackageExplorerFiles` table.
This workflow parses symbol data related to a package, so some Source Link information is available. This can serve as a source of package repository information in addition to the .nuspec `<repository>` element read by the `PackageManifestToCsv` driver.
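As a rough illustration of how Source Link data can point back to a repository, the Python sketch below reads a Source Link JSON document (a `documents` map from local path patterns to URL patterns) and derives a GitHub repository URL. It assumes the URL patterns use the `raw.githubusercontent.com` host; the function name is hypothetical and this is not how the driver itself processes the data.

```python
import json
from typing import Optional
from urllib.parse import urlparse


def github_repo_from_source_link(source_link_json: str) -> Optional[str]:
    """Return a GitHub repository URL inferred from a Source Link document, if any."""
    documents = json.loads(source_link_json).get("documents", {})
    for url_pattern in documents.values():
        parsed = urlparse(url_pattern)
        # Assumption: raw GitHub URLs look like /{owner}/{repo}/{commit}/*
        if parsed.hostname == "raw.githubusercontent.com":
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2:
                return f"https://github.com/{parts[0]}/{parts[1]}"
    return None
```

Other hosts (for example Azure DevOps or GitLab) use different URL shapes and would need their own handling.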