-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: develop a crate to parse cwl documents #4
Conversation
WalkthroughThis pull request introduces enhancements to the Changes
Possibly related issues
Possibly related PRs
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAttention: Patch coverage is
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
🧹 Nitpick comments (14)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (1)
1-2
: Fix import ordering to satisfy formatting constraints.The pipeline indicates a formatting issue with the import order. Ensure the imports are arranged according to your project's style guidelines, e.g., grouped by external vs. internal crates or alphabetically sorted.
You might apply a diff similar to the following (exact ordering may vary based on your style guide):
-use crate::schema::types::{Any, CwlSchemaType, Documentation}; -use crate::schema::requirements::CommandLineToolRequirement; +use crate::schema::requirements::CommandLineToolRequirement; +use crate::schema::types::{Any, CwlSchemaType, Documentation};🧰 Tools
🪛 GitHub Actions: Continuous Integration
[warning] 1-1: Formatting issue: Import order needs to be reorganized
zefiro-core/zefiro-cwl-parser/src/schema/document.rs (1)
102-103
: Add test coverage forto_yaml
method.The lines for writing YAML (
to_yaml
) are not tested according to coverage reports. Consider implementing a quick test that verifies correct serialization to an in-memory buffer.Would you like a quick example test snippet to ensure coverage?
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 102-103: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L102-L103
Added lines #L102 - L103 were not covered by testszefiro-core/zefiro-cwl-parser/src/lib.rs (1)
3-3
: Add an extra newline to address formatting requirements.According to the pipeline, an additional newline is required after the module declarations to maintain consistent styling. For example:
pub mod schema; pub mod values; -pub use crate::schema::document::CwlSchema; + +pub use crate::schema::document::CwlSchema; pub use crate::values::document::CwlValues;🧰 Tools
🪛 GitHub Actions: Continuous Integration
[warning] 3-3: Formatting issue: Extra newline needed after imports
zefiro-core/zefiro-cwl-parser/src/values/types.rs (2)
6-9
: Consider renaming the struct or importing with an alias to avoid confusion withstd::fs::File
.While this
File
struct works as intended, its name might be confused with Rust’s standard libraryFile
. You can either rename it or import it with an alias to prevent ambiguity.
17-23
: Same naming consideration forDirectory
.Similar to the
File
struct, consider naming thisCwlDirectory
or another descriptive name to avoid overshadowing possible Rust standard library usage.zefiro-core/zefiro-cwl-parser/src/schema/types.rs (1)
12-39
: Clarity check forCwlSchemaType
.Overall, the enum covers various CWL schema types well. However, the
Any(String)
variant name could conflict with theAny
enum. Consider renaming for improved clarity.zefiro-core/zefiro-cwl-parser/src/values/document.rs (1)
10-14
: Fix the missing trailing comma to comply with Rust style guidelines.The pipeline indicates a formatting issue. Adding a trailing comma after the last struct field helps maintain consistent formatting.
pub struct CwlValues { #[serde(flatten)] - values: HashMap<String, CwlValueType> + values: HashMap<String, CwlValueType>, }🧰 Tools
🪛 GitHub Actions: Continuous Integration
[warning] 10-10: Formatting issue: Missing comma in struct field
zefiro-core/zefiro-cwl-parser/examples/data/clt-step-schema.yml (3)
21-22
: Review the output location modification approachThe
outputEval
expression modifies thelocation
property in-place:outputEval: ${self[0].location += inputs.output_location_subdir; return self[0]}This mutation-based approach could lead to unexpected side effects. Consider creating a new location instead:
outputEval: ${return Object.assign({}, self[0], {location: self[0].location + inputs.output_location_subdir})}
26-29
: Document resource requirementsThe resource requirements would benefit from documentation explaining why these specific values were chosen:
- coresMin: 2
- outdirMin: 1000 (MB)
- ramMin: 1024 (MB)
Consider adding comments to explain the resource calculations and any empirical data supporting these values.
31-32
: Improve readability of time limit expressionThe time limit expression
$(60*60*1)
(3600 seconds) could be more readable.Consider using a more explicit expression or adding a comment:
timelimit: $(60*60*1) # 1 hour in seconds # or timelimit: 3600 # 1 hour in seconds.github/workflows/cd.yml (1)
46-46
: Add newline at end of fileAdd a newline character at the end of the file to comply with POSIX standards.
# cargo publish --manifest-path zefiro-core/Cargo.toml \ - # --locked --token ${{ secrets.CARGO_REGISTRY_TOKEN }} + # --locked --token ${{ secrets.CARGO_REGISTRY_TOKEN }} +🧰 Tools
🪛 yamllint (1.35.1)
[error] 46-46: no new line character at the end of file
(new-line-at-end-of-file)
zefiro-core/zefiro-cwl-parser/examples/data/wf-step-schema.yml (1)
5-10
: Consider using semantic names for workflow inputsThe input IDs are prefixed with
step__
, which couples them tightly to the step name. Consider using more semantic names that reflect their purpose rather than their usage location.- - id: step__in_file + - id: input_files type: type: array items: File - - id: step__out_file + - id: output_filename type: stringzefiro-core/zefiro-cwl-parser/README.md (2)
7-8
: Enhance specification support documentationThe overview is vague about which CWL v1.2 fields are supported. Consider:
- Adding a table of supported/unsupported fields
- Linking to specific CWL v1.2 specification sections
- Documenting any limitations or planned features
🧰 Tools
🪛 GitHub Check: Typos
[warning] 7-7:
"stuctures" should be "structures".🪛 GitHub Actions: Continuous Integration
[error] 7-7: Typo: 'stuctures' should be 'structures'
22-58
: Enhance code examples with error handlingThe example code uses the
?
operator but doesn't show:
- The error types that might be returned
- How to handle different error cases
- The complete function signature with Result type
Consider adding:
fn main() -> Result<(), Box<dyn std::error::Error>> { // Existing example code Ok(()) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (18)
.github/workflows/cd.yml
(1 hunks)Cargo.toml
(1 hunks)zefiro-core/Cargo.toml
(0 hunks)zefiro-core/zefiro-cwl-parser/Cargo.toml
(1 hunks)zefiro-core/zefiro-cwl-parser/README.md
(1 hunks)zefiro-core/zefiro-cwl-parser/examples/data/clt-step-schema.yml
(1 hunks)zefiro-core/zefiro-cwl-parser/examples/data/clt-step-values.yml
(1 hunks)zefiro-core/zefiro-cwl-parser/examples/data/wf-step-schema.yml
(1 hunks)zefiro-core/zefiro-cwl-parser/src/lib.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/mod.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/types.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/mod.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/types.rs
(1 hunks)
💤 Files with no reviewable changes (1)
- zefiro-core/Cargo.toml
✅ Files skipped from review due to trivial changes (3)
- zefiro-core/zefiro-cwl-parser/src/values/mod.rs
- zefiro-core/zefiro-cwl-parser/src/schema/mod.rs
- zefiro-core/zefiro-cwl-parser/Cargo.toml
🧰 Additional context used
🪛 GitHub Check: Typos
zefiro-core/zefiro-cwl-parser/README.md
[warning] 7-7:
"stuctures" should be "structures".
🪛 GitHub Actions: Continuous Integration
zefiro-core/zefiro-cwl-parser/README.md
[error] 7-7: Typo: 'stuctures' should be 'structures'
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
[warning] 1-1: Formatting issue: Import order needs to be reorganized
zefiro-core/zefiro-cwl-parser/src/lib.rs
[warning] 3-3: Formatting issue: Extra newline needed after imports
Cargo.toml
[error] Package believes it's in a workspace when it's not. Add 'zefiro-core' to the workspace.members
array in /home/runner/work/zefiro/zefiro/Cargo.toml or add an empty [workspace]
table to the package's manifest.
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
[warning] 3-7: Formatting issue: Import statement needs to be split into multiple lines
zefiro-core/zefiro-cwl-parser/src/values/document.rs
[warning] 10-10: Formatting issue: Missing comma in struct field
zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs
[warning] 43-43: Formatting issue: Missing comma at the end of struct field
🪛 yamllint (1.35.1)
.github/workflows/cd.yml
[error] 46-46: no new line character at the end of file
(new-line-at-end-of-file)
🪛 GitHub Check: codecov/patch
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
[warning] 102-103: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L102-L103
Added lines #L102 - L103 were not covered by tests
zefiro-core/zefiro-cwl-parser/src/values/document.rs
[warning] 72-73: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L72-L73
Added lines #L72 - L73 were not covered by tests
🔇 Additional comments (19)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (5)
11-30
: All good: CommandLineTool struct.The schema definition looks correct and comprehensive, with optional fields appropriately wrapped in
Option
. Serialization attributes are used correctly. This ensures robust and flexible serialization/deserialization.
32-47
: CommandInputParameter struct looks good.Fields and Serde attributes appear to be correctly defined, and optional fields are handled properly.
48-61
: CommandOutputParameter struct looks good.No issues found. It closely parallels the input parameter approach and maintains consistency.
62-77
: CommandLineBinding struct is well-defined.Optional parameters are correctly handled. The positioning and prefix approach is consistent with CWL specs.
78-88
: OutputBinding struct is well-structured.Serialization attributes, including
skip_serializing_if
, help keep the output concise. The fields align well with CWL’s command line binding pattern.zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs (1)
1-91
: Workflow schema logic is sound.All structs adhere to CWL v1.2 naming and layout, with Serde usage consistent across optional fields.
🧰 Tools
🪛 GitHub Actions: Continuous Integration
[warning] 43-43: Formatting issue: Missing comma at the end of struct field
zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs (1)
1-94
: ResourceRequirement defaults and additional structures look good.The default values and usage of
skip_serializing_none
align well with CWL conventions, ensuring a robust specification of requirements.zefiro-core/zefiro-cwl-parser/src/schema/document.rs (1)
1-126
: Overall logic is well-structured.The
CwlSchema
enum and associated methods effectively handle reading and writing CWL objects from/to YAML. The version and class checks are consistent with the CWL v1.2 spec.🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 102-103: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L102-L103
Added lines #L102 - L103 were not covered by tests🪛 GitHub Actions: Continuous Integration
[warning] 3-7: Formatting issue: Import statement needs to be split into multiple lines
zefiro-core/zefiro-cwl-parser/src/lib.rs (1)
1-5
: Export statements look appropriate.Exposing
CwlSchema
andCwlValues
publicly is clear and aligns well with your crate’s structure.🧰 Tools
🪛 GitHub Actions: Continuous Integration
[warning] 3-3: Formatting issue: Extra newline needed after imports
zefiro-core/zefiro-cwl-parser/src/values/types.rs (1)
32-44
: EnumCwlValueType
is well-structured.The usage of
#[serde(untagged)]
for flexible deserialization is clear and consistent. Good job!zefiro-core/zefiro-cwl-parser/src/schema/types.rs (5)
6-10
: “Any” enum usage is acceptable, but be cautious with large YAML values.Relying on
Any(YValue)
is useful for flexible parsing, but ensure you handle memory usage if large YAML documents are parsed.
42-47
: Documentation enum approach is good.Structuring single vs. multi-line documentation with separate variants is a neat approach.
49-54
: Multiple format support is well-implemented.Splitting single vs. multiple formats is helpful for validation and clarity in serialization.
56-61
:Scatter
enum usage is clear.Good use of untagged variants for scattering single or multiple parameters.
63-68
:Source
enum is consistent with the existing pattern.Keeping single vs. multiple source variants consistent with the rest of the schema design is logical and straightforward.
zefiro-core/zefiro-cwl-parser/examples/data/clt-step-values.yml (1)
1-4
: File references look correct.This YAML snippet aligns with the introduced
File
struct usage. Ensure the placeholdersinputLocation
andreadgroup
are resolved at runtime or replaced with valid values..github/workflows/cd.yml (1)
43-46
: Review commented publish stepThe commented section for publishing zefiro-core needs attention:
- Is this intentionally commented out?
- Should we add a similar step for publishing zefiro-cwl-parser?
Please clarify if we should:
- Remove this commented section if it's no longer needed
- Add a step for publishing zefiro-cwl-parser
- Keep both steps if both crates need publishing
🧰 Tools
🪛 yamllint (1.35.1)
[error] 46-46: no new line character at the end of file
(new-line-at-end-of-file)
zefiro-core/zefiro-cwl-parser/examples/data/wf-step-schema.yml (2)
53-56
: Verify resource requirements alignment with execution environmentThe resource requirements specify:
- 2 cores minimum
- 1000 (units?) minimum outdir size
- 1024 (MB?) minimum RAM
Please ensure:
- The units for
outdirMin
andramMin
are documented- These requirements align with the execution environment capabilities
62-64
: Validate scatter configuration for parallel executionThe workflow uses
dotproduct
scatter method onin_file
. This means:
- Each file in the input array will be processed independently
- The execution order is not guaranteed
- All executions must succeed for the workflow to succeed
Please ensure error handling is in place for partial failures:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
zefiro-core/zefiro-cwl-parser/README.md (2)
22-58
: Consider adding error handling examplesWhile the example is comprehensive, it would be beneficial to demonstrate proper error handling. Consider:
- Showing how to handle specific error types
- Adding comments about what errors might occur
- Providing an example with explicit error handling instead of using the
?
operatorExample enhancement:
// Example with explicit error handling let schema = match CwlSchema::from_path("workflow.yml") { Ok(schema) => schema, Err(e) => { eprintln!("Failed to parse CWL schema: {}", e); // Handle specific error types return Err(e); } };
1-2
: Consider adding a table of contentsFor better navigation, consider adding a table of contents at the beginning of the README. This would be especially helpful as the documentation grows with more examples and features.
Example:
## Table of Contents - [Overview](#overview) - [Usage](#usage) - [Parsing CWL Schema Documents](#parsing-cwl-schema-documents) - [Parsing CWL Values Documents](#parsing-cwl-values-documents)zefiro-core/zefiro-cwl-parser/src/main.rs (2)
6-6
: Consider more user-friendly error handling.
Using.expect("Failed to deserialize CWL values document")
can abruptly terminate the program. Depending on use cases, a more graceful error handling approach (e.g., returningResult
or printing a user-facing message) may be preferable.
17-19
: Evaluate reactivating or removing the commented-out logic.
If you plan to iterate over schema values for debugging or validation, consider reintroducing the code in lines 17–19 or removing it if it's no longer needed. Dead code can become confusing over time.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (5)
.github/workflows/ci.yml
(1 hunks)zefiro-core/Cargo.toml
(1 hunks)zefiro-core/zefiro-cwl-parser/Cargo.toml
(1 hunks)zefiro-core/zefiro-cwl-parser/README.md
(1 hunks)zefiro-core/zefiro-cwl-parser/src/main.rs
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- zefiro-core/zefiro-cwl-parser/Cargo.toml
🧰 Additional context used
🪛 GitHub Actions: Continuous Integration
zefiro-core/Cargo.toml
[error] Multiple workspace roots found in the same workspace: '/home/runner/work/zefiro/zefiro/zefiro-core' and '/home/runner/work/zefiro/zefiro'
🔇 Additional comments (5)
zefiro-core/zefiro-cwl-parser/README.md (3)
1-8
: LGTM! Clear and informative overview.The overview section effectively communicates the library's purpose and limitations, with helpful links to the CWL specification.
16-16
: Update dependency version placeholderThe dependency line still contains a placeholder that needs to be replaced with the actual version.
70-75
: Clarify supported storage backendsThe example shows an S3 URI (
s3://bucket/input.txt
), but there's no mention of supported storage backends or URI schemes. Please clarify:
- What storage backends are supported?
- What URI schemes are valid?
- Are there any prerequisites for using S3 or other remote storage?
Let's check if there's any documentation about storage backends:
zefiro-core/zefiro-cwl-parser/src/main.rs (1)
7-12
: Validate the content of iterated values.
Before matching on the types (e.g.,CwlValueType::File
), consider adding checks to ensure that the values are valid and consistent. If the file path or associated data is invalid, you might risk panics or silent errors down the line..github/workflows/ci.yml (1)
24-26
: Good practice for deterministic builds.
Switching from--offline
to--locked
ensures your build uses Cargo.lock dependencies exactly, improving reproducibility and consistency. Watch out for environment constraints (e.g., ephemeral container runs) so that locked builds can still find all needed crates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
♻️ Duplicate comments (1)
zefiro-core/zefiro-cwl-parser/src/values/types.rs (1)
12-15
: 🛠️ Refactor suggestionReturn references instead of cloning strings.
As noted in the previous review, to avoid unnecessary allocations, consider returning
&str
instead ofString
in bothFile
andDirectory
implementations.- pub fn get_location(&self) -> String { - self.location.clone() + pub fn get_location(&self) -> &str { + &self.location }Also applies to: 26-29
🧹 Nitpick comments (11)
zefiro-core/zefiro-cwl-parser/src/values/types.rs (2)
6-9
: Consider extracting common fields into a trait or base struct.The
File
andDirectory
structs share identical fields. Consider reducing duplication by extracting common fields into a trait or base struct.Example approach using a trait:
pub trait CwlLocation { fn get_location(&self) -> &str; } #[derive(Clone, Debug, Deserialize, Serialize)] pub struct File { pub class: String, pub location: String, } impl CwlLocation for File { fn get_location(&self) -> &str { &self.location } } // Similar implementation for DirectoryAlso applies to: 20-23
31-43
: Add documentation for CwlValueType enum.The enum lacks documentation explaining its purpose and the meaning of each variant. Consider adding documentation similar to the
File
andDirectory
structs.Example:
/// Represents the various value types supported in CWL /// See: https://www.commonwl.org/v1.2/CommandLineTool.html#CWLType #[derive(Clone, Debug, Deserialize, Serialize)] #[serde(untagged)] pub enum CwlValueType { /// Boolean values (true/false) Boolean(bool), // ... document other variants }zefiro-core/zefiro-cwl-parser/src/values/document.rs (1)
81-85
: Enhance test coverage with error cases.The current test only verifies successful parsing. Consider adding tests for:
- Invalid YAML syntax
- Missing required fields
- Invalid file paths
Example:
#[test] fn test_parse_invalid_file_path() { let result = CwlValues::from_path("nonexistent.yml"); assert!(result.is_err()); assert!(result.unwrap_err().to_string().contains("Failed to open file")); }zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (2)
37-46
: Add missing optional fields for CommandInputParameter.Consider adding these optional fields from the CWL spec:
label
: For human-readable displaysecondaryFiles
: For associated filesstreamable
: For streaming optimizationformat
: For data format identificationloadContents
: For small file contents
53-60
: Add missing optional fields for CommandOutputParameter.Consider adding these optional fields from the CWL spec:
label
: For human-readable displaysecondaryFiles
: For associated filesstreamable
: For streaming optimizationformat
: For data format identificationzefiro-core/zefiro-cwl-parser/src/schema/document.rs (6)
11-11
: Consider future CWL version compatibility.The
SUPPORTED_VERSIONS
constant currently only supports CWL v1.2. Consider:
- Adding support for older versions (v1.0, v1.1) if backward compatibility is desired
- Documenting the version support policy
- Making the version check configurable for future updates
13-19
: Enhance enum documentation with more details.While the documentation is good, consider adding:
- Examples of when to use each variant
- Links to CWL specification for CommandLineTool and Workflow
- Notes about the untagged serialization behavior
31-34
: Consider using AsRef for more flexible path handling.The
from_path
method could be more flexible by accepting various path types.- pub fn from_path(path: &str) -> Result<Self> { - let reader = BufReader::new(File::open(path)?); + pub fn from_path<P: AsRef<std::path::Path>>(path: P) -> Result<Self> { + let reader = BufReader::new(File::open(path.as_ref())?);
37-53
: Enhance version validation and error handling.The version validation could be improved:
- Consider using a proper version comparison instead of string matching
- Add more context to error messages
- Consider logging the actual version when an unsupported version is encountered
pub fn from_yaml(value: Value) -> Result<Self> { let version = value .get("cwlVersion") .and_then(Value::as_str) - .ok_or_else(|| anyhow::anyhow!("Failed to determine CWL specification version."))?; + .ok_or_else(|| anyhow::anyhow!("Missing required field 'cwlVersion' in CWL document"))?; ensure!( SUPPORTED_VERSIONS.contains(&version), - "Unsupported CWL version: {version}" + "Unsupported CWL version: {}. Supported versions are: {:?}", + version, SUPPORTED_VERSIONS );
94-97
: Improve error handling specificity in from_string.The error handling could be more specific about what went wrong during parsing.
pub fn from_string(yaml_input: &str) -> Result<Self, Error> { - serde_yaml::from_str(yaml_input) - .map_err(|e| Error::msg(format!("Failed to parse CWL schema from string: {}", e))) + let value: Value = serde_yaml::from_str(yaml_input) + .map_err(|e| Error::msg(format!("Invalid YAML syntax: {}", e)))?; + Self::from_yaml(value)
110-116
: Add documentation for FromStr implementation.Consider adding documentation explaining:
- The purpose of implementing FromStr
- Example usage
- Error conditions
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
Cargo.toml
(1 hunks)zefiro-core/Cargo.toml
(1 hunks)zefiro-core/zefiro-cwl-parser/src/lib.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/mod.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/types.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/mod.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/types.rs
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- zefiro-core/zefiro-cwl-parser/src/values/mod.rs
- zefiro-core/zefiro-cwl-parser/src/lib.rs
- zefiro-core/zefiro-cwl-parser/src/schema/mod.rs
- zefiro-core/Cargo.toml
- zefiro-core/zefiro-cwl-parser/src/schema/types.rs
🔇 Additional comments (7)
Cargo.toml (1)
7-11
: LGTM! Workspace configuration looks correct.The workspace members are properly configured, including both the new
zefiro-cwl-parser
crate and its parentzefiro-core
.zefiro-core/zefiro-cwl-parser/src/values/document.rs (2)
14-17
: LGTM! Good use of serde flatten and HashMap.The structure is well-designed with appropriate use of serde attributes and HashMap for flexible key-value storage.
71-73
: Add test coverage for serialization methods.The
to_yaml
method lacks test coverage. Additionally, consider testing:
- Error cases for
from_path
andfrom_string
- The
to_string
methodWould you like help drafting comprehensive tests for these methods?
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (1)
11-30
:⚠️ Potential issueAdd missing required and optional fields per CWL spec.
The
CommandLineTool
struct is missing several fields defined in the CWL spec:
- Required:
baseCommand
(string or array of strings)- Optional but important:
arguments
,stdin
,stdout
,stderr
Additionally, consider adding validation for the
class
field to ensure it's "CommandLineTool".zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs (1)
35-37
: Add missing optional Docker requirement fields.Add these optional fields from the CWL spec:
dockerLoad
: Load from a filedockerFile
: Supply inline DockerfiledockerImport
: Import from a URLdockerImageId
: Use specific image IDdockerOutputDirectory
: Set working directoryThe empty
docker_pull
validation issue was previously reported.zefiro-core/zefiro-cwl-parser/src/schema/document.rs (2)
1-9
: Split the import statements into multiple lines for better readability.The import statements should be split for better readability and to comply with formatting guidelines.
118-129
: Enhance test coverage.The current test suite only covers successful parsing. Consider adding tests for:
- Error cases (invalid version, invalid class, malformed YAML)
- Serialization methods (to_string, to_yaml)
- FromStr implementation
- Edge cases (empty document, minimal valid documents)
Would you like me to generate additional test cases to improve coverage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
🧹 Nitpick comments (9)
zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs (3)
4-7
: Add documentation for resource requirement constants.Consider adding documentation comments for these constants to explain their purpose and units (MB).
+/// Default number of CPU cores const CPU_NUM_DEFAULT: u32 = 1; +/// Default RAM size in megabytes const RAM_SIZE_IN_MB_DEFAULT: u32 = 1024; +/// Default minimum temporary directory size in megabytes const TMPDIR_MIN_IN_MB_DEFAULT: u32 = 1024; +/// Default minimum output directory size in megabytes const OUTDIR_MIN_IN_MB_DEFAULT: u32 = 1024;
18-27
: Add documentation link and skip_serializing_none attribute.Consider these enhancements:
- Add a link to the CWL spec for CommandLineToolRequirement
- Add the
skip_serializing_none
attribute which might be needed for optional fields in the variants+/// See: https://www.commonwl.org/v1.2/CommandLineTool.html#Requirements /// Describes requirements for `CommandLineTool`. +#[skip_serializing_none] #[derive(Clone, Debug, Deserialize, Serialize)] #[serde(tag = "class")] pub enum CommandLineToolRequirement {
99-106
: Consider adding a default value for enable_reuse.It might be useful to provide a default value for
enable_reuse
to make the struct easier to use.pub struct WorkReuse { + #[serde(default)] pub enable_reuse: bool, } +impl Default for WorkReuse { + fn default() -> Self { + Self { + enable_reuse: true, // or false, depending on your use case + } + } +}.github/workflows/cd.yml (2)
28-40
: Enhance CI job reliability and performanceWhile the basic setup is good, consider these improvements:
- Add a timeout to prevent hung jobs
- Specify explicit Rust version
- Add cargo dependency caching
publish-crates-io: name: Publish on crates.io runs-on: ubuntu-22.04 + timeout-minutes: 10 needs: generate-changelog steps: - name: Checkout uses: actions/checkout@v4 + - name: Cache dependencies + uses: actions/cache@v3 + with: + path: | + ~/.cargo/registry + ~/.cargo/git + target + key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }} - name: Verify package run: cargo verify-project --manifest-path zefiro-core/zefiro-cwl-parser/Cargo.toml - name: Set the release version run: echo "RELEASE_VERSION=${GITHUB_REF:11}" >> "$GITHUB_ENV" - name: Install Rust toolchain uses: dtolnay/rust-toolchain@stable + with: + toolchain: 1.75.0 # or your minimum supported version
44-44
: Add newline at end of fileThe file is missing a newline character at the end, which is a POSIX requirement.
# cargo publish --manifest-path zefiro-core/Cargo.toml \ - # --locked --token ${{ secrets.CARGO_REGISTRY_TOKEN }} + # --locked --token ${{ secrets.CARGO_REGISTRY_TOKEN }} +🧰 Tools
🪛 yamllint (1.35.1)
[error] 44-44: no new line character at the end of file
(new-line-at-end-of-file)
zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs (3)
21-21
: Add documentation for the requirements field.Consider adding a doc comment explaining:
- The purpose of workflow requirements
- Common requirement types
- Impact on workflow execution
29-34
: Consider making id field required for WorkflowInputParameter.The
id
field is typically required in CWL to uniquely identify input parameters. Consider:
- Making
id
non-optional- Adding validation for unique IDs within a workflow
69-69
: Enhance type safety for scatter_method.Consider using an enum for
scatter_method
instead ofString
:#[derive(Clone, Debug, Deserialize, Serialize)] #[serde(rename_all = "snake_case")] pub enum ScatterMethod { DotProduct, NestedCrossProduct, FlatCrossProduct, }zefiro-core/zefiro-cwl-parser/src/values/types.rs (1)
1-43
: Consider adding custom error types for better error handling.The current implementation could benefit from a dedicated error type to handle various validation errors that might occur during struct creation or value conversion.
Consider adding:
#[derive(Debug, thiserror::Error)] pub enum CwlValueError { #[error("Invalid class: expected {expected}, got {actual}")] InvalidClass { expected: &'static str, actual: String, }, #[error("Invalid location: {0}")] InvalidLocation(String), #[error("Type conversion error: {0}")] TypeConversion(String), } // Then update the constructors to use this error type impl File { pub fn new(location: String) -> Result<Self, CwlValueError> { if location.is_empty() { return Err(CwlValueError::InvalidLocation( "location cannot be empty".to_string() )); } // ... rest of the implementation } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.github/workflows/cd.yml
(1 hunks)zefiro-core/zefiro-cwl-parser/README.md
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/types.rs
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- zefiro-core/zefiro-cwl-parser/README.md
🧰 Additional context used
📓 Learnings (2)
zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs (1)
Learnt from: nsyzrantsev
PR: zefiroproj/zefiro#4
File: zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs:61-70
Timestamp: 2025-01-04T16:22:22.394Z
Learning: In `zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs`, the user explicitly wants the `WorkflowStep::run` field to support only `CommandLineTool`.
zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs (1)
Learnt from: nsyzrantsev
PR: zefiroproj/zefiro#4
File: zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs:83-85
Timestamp: 2025-01-04T16:02:13.319Z
Learning: For the `ToolTimeLimit.timelimit` field in `zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs`, the user indicated that the type can be either a string or a u32.
🪛 yamllint (1.35.1)
.github/workflows/cd.yml
[error] 44-44: no new line character at the end of file
(new-line-at-end-of-file)
🔇 Additional comments (14)
zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs (6)
9-16
: LGTM! Well-structured workflow requirements enum.The enum is well-documented and correctly implements serialization with appropriate tagging.
34-36
: Empty docker_pull validation is neededBased on the code inspection, there's currently no validation for empty
docker_pull
strings in theDockerRequirement
struct.Consider implementing validation using a newtype pattern:
#[derive(Clone, Debug, Deserialize, Serialize)] pub struct DockerImage(String); impl DockerImage { pub fn new(image: String) -> Result<Self, &'static str> { if image.trim().is_empty() { Err("docker_pull cannot be empty") } else { Ok(Self(image)) } } } pub struct DockerRequirement { pub docker_pull: DockerImage, }
43-55
: Enhance ResourceRequirement with maximum limits.Consider adding maximum resource fields from the CWL spec:
coresMax
ramMax
tmpdirMax
outdirMax
72-76
: LGTM! Clear and concise JavaScript requirement implementation.The unit-like struct is well-documented and appropriately implements the required traits.
82-91
: LGTM! Correct implementation of timelimit as discussed.The implementation correctly handles both numeric seconds and string expressions using an untagged enum, as previously agreed upon.
93-97
: LGTM! Well-documented scatter feature requirement.The implementation is clear and follows the same pattern as other unit-like requirement structs.
.github/workflows/cd.yml (1)
26-26
: LGTM!The git-cliff command formatting change maintains the same functionality.
zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs (3)
1-6
: Well-organized imports with appropriate dependencies!The imports are properly organized, separating internal module imports from external crate dependencies. The use of
serde
andserde_with
is appropriate for CWL document parsing and serialization.
36-54
: Well-designed output parameter structure!Good implementation of:
- Flexible output source representation using untagged enum
- Clear documentation with spec reference
- Appropriate optional fields
84-90
: Clean and minimal implementation of workflow step output!The implementation is appropriately simple with just the required
id
field and clear documentation.zefiro-core/zefiro-cwl-parser/src/values/types.rs (2)
12-14
: Return a reference instead of cloning thelocation
.To avoid unnecessary allocations, consider returning
&str
instead ofString
, as the caller can clone if needed.- pub fn get_location(&self) -> String { - self.location.clone() + pub fn get_location(&self) -> &str { + &self.location }
31-43
: Enhance CwlValueType enum with documentation and helper methods.The enum could benefit from:
- Documentation for each variant explaining their CWL type mapping
- Helper methods for type checking and common conversions
- Consider potential serialization ambiguity with untagged enum (e.g., i32 vs i64)
Consider adding:
/// Represents all possible CWL value types #[derive(Clone, Debug, Deserialize, Serialize)] #[serde(untagged)] pub enum CwlValueType { /// Represents a boolean value Boolean(bool), /// Represents a 32-bit integer Int(i32), /// Represents a 64-bit integer Long(i64), // ... (document other variants) } impl CwlValueType { pub fn is_numeric(&self) -> bool { matches!(self, Self::Int(_) | Self::Long(_) | Self::Float(_) | Self::Double(_) ) } pub fn as_i64(&self) -> Option<i64> { match self { Self::Int(n) => Some(*n as i64), Self::Long(n) => Some(*n), _ => None, } } // Add more helper methods as needed }Let's verify the potential serialization ambiguity:
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (2)
1-5
: Well-structured imports and serialization setup!Good use of serde with appropriate attributes for JSON serialization/deserialization. The imports are clean and focused on the required functionality.
1-87
: Add unit tests for serialization/deserializationConsider adding unit tests to verify:
- Serialization/deserialization of all structs
- Default values and optional fields behavior
- Edge cases with different field combinations
Would you like me to generate example unit tests for these structs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (3)
zefiro-core/zefiro-cwl-parser/src/values/types.rs (3)
3-7
: 🛠️ Refactor suggestionAdd validation for File struct fields.
The struct should validate that the
location
is not empty and represents a valid path.impl File { pub fn new(location: String) -> Result<Self, &'static str> { if location.is_empty() { return Err("location cannot be empty"); } Ok(Self { location }) } }
9-13
: 🛠️ Refactor suggestionReturn a reference instead of cloning the
location
.To avoid unnecessary allocations, consider returning
&str
instead ofString
.- pub fn get_location(&self) -> String { - self.location.clone() + pub fn get_location(&self) -> &str { + &self.location }
15-19
: 🛠️ Refactor suggestionAdd validation for Directory struct fields.
The struct should validate that the
location
is not empty and represents a valid path.impl Directory { pub fn new(location: String) -> Result<Self, &'static str> { if location.is_empty() { return Err("location cannot be empty"); } Ok(Self { location }) } }
🧹 Nitpick comments (4)
zefiro-core/zefiro-cwl-parser/Cargo.toml (1)
19-23
: Consider using caret requirements for dependencies.For better dependency management, consider using caret requirements (
^
) instead of exact versions. This allows for compatible updates while maintaining SemVer guarantees.[dependencies] -anyhow = "1.0.95" -serde = { version = "1.0.216", features = ["derive"] } -serde_with = "3.12.0" -serde_yaml = "0.9.34" +anyhow = "^1.0.95" +serde = { version = "^1.0.216", features = ["derive"] } +serde_with = "^3.12.0" +serde_yaml = "^0.9.34"zefiro-core/zefiro-cwl-parser/src/values/document.rs (2)
72-87
: Improve doctest forto_yaml
.The current doctest doesn't verify the output. Consider adding assertions to validate the written content.
/// ``` /// use zefiro_cwl_parser::values::document::CwlValues; -/// use std::io::BufWriter; +/// use std::io::{BufWriter, Read}; /// /// let yaml_input = r#" /// in_file: /// class: File /// location: 's3://bucket/path/to/input.txt' /// out_file: 'output.txt' /// "#; /// /// let values = CwlValues::from_string(yaml_input).expect("Failed to deserialize CWL values document"); /// let mut tmpfile = tempfile::tempfile().unwrap(); -/// let mut writer = BufWriter::new(tmpfile); -/// values.to_yaml(writer); +/// let writer = BufWriter::new(&tmpfile); +/// values.to_yaml(writer).expect("Failed to write YAML"); +/// +/// // Verify written content +/// let mut content = String::new(); +/// tmpfile.seek(std::io::SeekFrom::Start(0)).unwrap(); +/// tmpfile.read_to_string(&mut content).unwrap(); +/// assert!(content.contains("s3://bucket/path/to/input.txt")); +/// assert!(content.contains("output.txt")); /// ```
35-43
: Improve error handling infrom_path
.Consider using
std::path::Path
for better path handling and validation.- pub fn from_path(path: &str) -> Result<Self, Error> { + pub fn from_path<P: AsRef<std::path::Path>>(path: P) -> Result<Self, Error> { + let path = path.as_ref(); let reader = BufReader::new( File::open(path) - .map_err(|e| Error::msg(format!("Failed to open file '{}': {}", path, e)))?, + .map_err(|e| Error::msg(format!("Failed to open file '{}': {}", path.display(), e)))?, ); serde_yaml::from_reader(reader) - .map_err(|e| Error::msg(format!("Failed to parse CWL values from '{}'; {}", path, e))) + .map_err(|e| Error::msg(format!("Failed to parse CWL values from '{}'; {}", path.display(), e)))zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (1)
11-22
: Consider makingid
field optional per CWL specThe
id
field is currently required, but according to the CWL spec, it should be optional.- pub id: String, + #[serde(skip_serializing_if = "Option::is_none")] + pub id: Option<String>,
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
zefiro-core/zefiro-cwl-parser/Cargo.toml
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/types.rs
(1 hunks)
🧰 Additional context used
📓 Learnings (1)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (1)
Learnt from: nsyzrantsev
PR: zefiroproj/zefiro#4
File: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs:52-59
Timestamp: 2025-01-04T17:12:06.568Z
Learning: The user indicated that fields like `label`, `secondaryFiles`, `streamable`, `doc`, `format`, `loadContents`, and `loadListing` are not required for the MVP version of the CommandOutputParameter struct.
🪛 GitHub Actions: Continuous Integration
zefiro-core/zefiro-cwl-parser/src/values/types.rs
[warning] 44-44: Code formatting issue: Missing newline at the end of the file
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
[warning] 35-37: Code formatting issue: Error message string needs to be wrapped across multiple lines according to rustfmt rules
🪛 GitHub Check: codecov/patch
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
[warning] 37-38: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs#L37-L38
Added lines #L37 - L38 were not covered by tests
[warning] 42-42: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs#L42
Added line #L42 was not covered by tests
zefiro-core/zefiro-cwl-parser/src/values/document.rs
[warning] 88-89: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L88-L89
Added lines #L88 - L89 were not covered by tests
🔇 Additional comments (3)
zefiro-core/zefiro-cwl-parser/src/values/document.rs (1)
88-90
: Add test coverage for theto_yaml
method.Static analysis indicates these lines lack coverage. Consider adding a test to verify the serialization functionality.
Would you like me to help draft a comprehensive test that exercises the
to_yaml
method?🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 88-89: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L88-L89
Added lines #L88 - L89 were not covered by testszefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (2)
46-74
: Implementation looks good!The parameter structs are well-documented and correctly implement the essential fields needed for the MVP version.
81-90
: 🛠️ Refactor suggestionUpdate position type to match CWL spec
The
position
field should bei32
instead ofu32
to match the CWL specification, which allows for negative positions.- pub position: Option<u32>, + pub position: Option<i32>,Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (2)
16-16
: Consider makingid
field optionalAccording to the CWL specification, the
id
field should be optional. Consider updating the field definition:- pub id: String, + #[serde(skip_serializing_if = "Option::is_none")] + pub id: Option<String>,
38-40
: Enhance error message with more contextThe error message could be more descriptive to help users understand why Docker requirement is mandatory.
- return Err(serde::ser::Error::custom( - "CommandLineTool must have DockerRequirement", - )); + return Err(serde::ser::Error::custom( + "CommandLineTool must have DockerRequirement in its requirements list. \ + This ensures proper container isolation and reproducibility.", + ));
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/types.rs
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- zefiro-core/zefiro-cwl-parser/src/values/types.rs
🧰 Additional context used
📓 Learnings (1)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (1)
Learnt from: nsyzrantsev
PR: zefiroproj/zefiro#4
File: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs:52-59
Timestamp: 2025-01-04T17:12:06.568Z
Learning: The user indicated that fields like `label`, `secondaryFiles`, `streamable`, `doc`, `format`, `loadContents`, and `loadListing` are not required for the MVP version of the CommandOutputParameter struct.
🪛 GitHub Check: codecov/patch
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
[warning] 37-39: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs#L37-L39
Added lines #L37 - L39 were not covered by tests
[warning] 44-44: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs#L44
Added line #L44 was not covered by tests
🔇 Additional comments (5)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (5)
53-62
: LGTM! Well-structured MVP implementationThe
CommandInputParameter
struct includes essential fields while maintaining simplicity for the MVP version.
69-76
: LGTM! Focused MVP implementationThe
CommandOutputParameter
struct maintains a clean, minimal interface with essential fields, aligning with the MVP goals.
83-92
: LGTM! Clean binding implementationThe
CommandLineBinding
struct provides essential command-line binding functionality needed for the MVP.
99-105
: LGTM! Concise output binding implementationThe
OutputBinding
struct effectively captures the essential output binding rules while maintaining simplicity for the MVP.
32-46
: Add test coverage for CommandLineTool serializationThe serialization implementation lacks test coverage, particularly for:
- Successful serialization with valid Docker requirement
- Failed serialization when Docker requirement is missing
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 37-39: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs#L37-L39
Added lines #L37 - L39 were not covered by tests
[warning] 44-44: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs#L44
Added line #L44 was not covered by tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
.github/workflows/cd.yml (2)
28-44
: Secure configuration approved with suggestions for robustness.The publishing job is well-configured with secure practices:
- Uses explicit manifest path
- Employs --locked flag to prevent dependency tampering
- Properly handles secrets
Consider these improvements for better robustness:
- name: Publish the zefiro-cwl-parser library run: | + # Retry logic for network issues + for i in 1 2 3; do cargo publish --manifest-path zefiro-core/zefiro-cwl-parser/Cargo.toml \ - --locked --token ${{ secrets.CARGO_REGISTRY_TOKEN }} + --locked --token ${{ secrets.CARGO_REGISTRY_TOKEN }} && break || \ + if [ $i -lt 3 ]; then sleep 10; else exit 1; fi + done + + # Verify successful publish + cargo search zefiro-cwl-parser --limit 1 | grep -q "zefiro-cwl-parser = \"${RELEASE_VERSION#v}\""Also add a timeout to the job:
publish-crates-io: name: Publish on crates.io runs-on: ubuntu-22.04 + timeout-minutes: 10 needs: generate-changelog
🧰 Tools
🪛 yamllint (1.35.1)
[error] 44-44: no new line character at the end of file
(new-line-at-end-of-file)
44-44
: Add newline at end of file.The YAML file is missing a newline at the end, which is a POSIX requirement.
Add a newline at the end of the file.
🧰 Tools
🪛 yamllint (1.35.1)
[error] 44-44: no new line character at the end of file
(new-line-at-end-of-file)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.github/workflows/cd.yml
(1 hunks)zefiro-core/zefiro-cwl-parser/Cargo.toml
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- zefiro-core/zefiro-cwl-parser/Cargo.toml
🧰 Additional context used
🪛 yamllint (1.35.1)
.github/workflows/cd.yml
[error] 44-44: no new line character at the end of file
(new-line-at-end-of-file)
🔇 Additional comments (1)
.github/workflows/cd.yml (1)
26-26
: LGTM! Command formatting improvement.The git-cliff command formatting change improves readability while maintaining the same functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
zefiro-core/zefiro-cwl-parser/src/values/document.rs (1)
74-76
: Fix typo in error messageThere's a typo in the error message: "dserialize" should be "serialize".
- .map_err(|e| Error::msg(format!("Failed to dserialize CWL values to string: {}", e))) + .map_err(|e| Error::msg(format!("Failed to serialize CWL values to string: {}", e)))zefiro-core/zefiro-cwl-parser/src/schema/document.rs (2)
11-11
: Consider making supported versions configurableThe
SUPPORTED_VERSIONS
constant could be made configurable to support future CWL versions without code changes.Consider using a configuration file or environment variable to specify supported versions.
36-53
: Enhance error messages with more contextThe error messages could be more descriptive by including the actual values encountered.
- .ok_or_else(|| anyhow::anyhow!("Failed to determine CWL specification version."))?; + .ok_or_else(|| anyhow::anyhow!("Failed to determine CWL specification version. Document must include 'cwlVersion' field."))?; - Some(class) => bail!("Unsupported CWL document class: {class}"), - None => bail!("Failed to determine CWL document class."), + Some(class) => bail!("Unsupported CWL document class: '{}'. Supported classes are 'CommandLineTool' and 'Workflow'.", class), + None => bail!("Failed to determine CWL document class. Document must include 'class' field."),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/document.rs
(1 hunks)
🧰 Additional context used
📓 Learnings (1)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (1)
Learnt from: nsyzrantsev
PR: zefiroproj/zefiro#4
File: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs:52-59
Timestamp: 2025-01-04T17:12:06.568Z
Learning: The user indicated that fields like `label`, `secondaryFiles`, `streamable`, `doc`, `format`, `loadContents`, and `loadListing` are not required for the MVP version of the CommandOutputParameter struct.
🪛 GitHub Check: codecov/patch
zefiro-core/zefiro-cwl-parser/src/values/document.rs
[warning] 96-97: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L96-L97
Added lines #L96 - L97 were not covered by tests
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
[warning] 116-117: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L116-L117
Added lines #L116 - L117 were not covered by tests
🔇 Additional comments (9)
zefiro-core/zefiro-cwl-parser/src/values/document.rs (1)
96-98
: Add test coverage for to_yaml methodThe
to_yaml
method lacks test coverage. Consider adding tests to verify successful serialization and error handling.Would you like me to help draft test cases for the
to_yaml
method?🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 96-97: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L96-L97
Added lines #L96 - L97 were not covered by testszefiro-core/zefiro-cwl-parser/src/schema/document.rs (1)
116-118
: Add test coverage for to_yaml methodThe
to_yaml
method lacks test coverage. Consider adding tests for both successful serialization and error cases.Would you like me to help draft test cases for the
to_yaml
method?🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 116-117: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L116-L117
Added lines #L116 - L117 were not covered by testszefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (7)
1-4
: LGTM! Clean and well-organized imports.The imports are minimal and well-structured, with good separation between internal crate dependencies and external ones.
29-38
: LGTM! Well-structured input parameter implementation.The struct contains all necessary fields for the MVP version, with proper use of raw identifiers and optional fields.
45-52
: LGTM! Clean output parameter implementation.The struct is well-designed for the MVP version, containing all necessary fields while maintaining simplicity.
75-81
: LGTM! Clear and concise output binding implementation.The struct is well-documented and implements the necessary output binding functionality for the MVP version.
59-68
:⚠️ Potential issueFix position type to match CWL spec
The
position
field should bei32
instead ofu32
to match the CWL specification, which allows for negative positions.Apply this change:
pub struct CommandLineBinding { #[serde(skip_serializing_if = "Option::is_none")] - pub position: Option<u32>, + pub position: Option<i32>,Likely invalid or redundant comment.
11-22
:⚠️ Potential issueAdd validation for the
class
field and makeid
optionalThe struct has some deviations from the CWL spec:
- The
class
field should be validated to only accept "CommandLineTool"- The
id
field should be optionalHere's the suggested implementation:
#[serde(rename_all = "camelCase")] pub struct CommandLineTool { pub cwl_version: String, - pub class: String, + #[serde(default = "default_class")] + pub class: String, #[serde(skip_serializing_if = "Option::is_none")] pub doc: Option<Documentation>, - pub id: String, + #[serde(skip_serializing_if = "Option::is_none")] + pub id: Option<String>, #[serde(skip_serializing_if = "Option::is_none")] pub label: Option<String>, pub inputs: Vec<CommandInputParameter>, pub outputs: Vec<CommandOutputParameter>, pub requirements: Vec<CommandLineToolRequirement>, } +fn default_class() -> String { + "CommandLineTool".to_string() +}Likely invalid or redundant comment.
21-21
:⚠️ Potential issueAdd Docker requirement validation
The struct is missing the Docker requirement validation that was present in previous versions. This is important for ensuring that all CommandLineTools specify a container image.
Add the following implementation:
+impl CommandLineTool { + fn has_docker_requirement(&self) -> bool { + self.requirements.iter().any(|req| matches!(req, CommandLineToolRequirement::DockerRequirement { .. })) + } +} + +impl Serialize for CommandLineTool { + fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> + where + S: serde::Serializer, + { + if !self.has_docker_requirement() { + return Err(serde::ser::Error::custom( + "CommandLineTool must have DockerRequirement in its requirements list. \ + Please add a DockerRequirement specifying the container image.", + )); + } + // Use the derived serialize implementation + #[derive(Serialize)] + struct Helper<'a>(#[serde(with = "self")] &'a CommandLineTool); + Helper(self).serialize(serializer) + } +}Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (5)
zefiro-core/zefiro-cwl-parser/src/values/document.rs (4)
12-17
: Enhance struct documentation with more details.While the current documentation provides a basic description, consider adding:
- Example usage
- Description of the flattened serialization behavior
- Common use cases or scenarios
/// Represents a collection of CWL input and output values as key-value pairs +/// +/// # Examples +/// +/// ``` +/// use zefiro_cwl_parser::values::document::CwlValues; +/// +/// let yaml_input = r#" +/// message: "Hello" +/// number: 42 +/// "#; +/// +/// let values = CwlValues::from_string(yaml_input).unwrap(); +/// assert!(values.contains_key("message")); +/// ``` +/// +/// The internal HashMap is flattened during serialization, meaning nested structures +/// are serialized at the root level of the YAML document. #[derive(Clone, Debug, Deserialize, Serialize)] pub struct CwlValues {
75-75
: Fix typo in error message.There's a typo in "dserialize".
- .map_err(|e| Error::msg(format!("Failed to dserialize CWL values to string: {}", e))) + .map_err(|e| Error::msg(format!("Failed to serialize CWL values to string: {}", e)))
91-95
: Improve example by checking the result.The example should demonstrate proper error handling.
let values = CwlValues::from_string(yaml_input).expect("Failed to serialize CWL values document"); let mut tmpfile = tempfile::tempfile().unwrap(); let mut writer = BufWriter::new(tmpfile); -values.to_yaml(writer); +values.to_yaml(writer).expect("Failed to write CWL values to file");
101-121
: Enhance test coverage with additional test cases.Consider adding:
- Negative test cases for invalid YAML
- Edge cases (empty document, large files)
- Verification of error messages
Example test cases:
#[test] fn test_from_string_invalid_yaml() { let result = CwlValues::from_string("foo: [bar: baz}"); assert!(result.is_err()); assert!(result.unwrap_err().to_string().contains("Failed to deserialize")); } #[test] fn test_from_path_nonexistent_file() { let result = CwlValues::from_path("nonexistent.yml"); assert!(result.is_err()); assert!(result.unwrap_err().to_string().contains("Failed to open file")); }zefiro-core/zefiro-cwl-parser/src/schema/document.rs (1)
31-34
: Consider adding error context for file operations.While the implementation is correct, consider adding context to file operation errors for better debugging.
- let reader = BufReader::new(File::open(path)?); + let file = File::open(path) + .map_err(|e| anyhow::anyhow!("Failed to open file '{}': {}", path, e))?; + let reader = BufReader::new(file);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/document.rs
(1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
zefiro-core/zefiro-cwl-parser/src/values/document.rs
[warning] 96-97: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L96-L97
Added lines #L96 - L97 were not covered by tests
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
[warning] 116-117: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L116-L117
Added lines #L116 - L117 were not covered by tests
🔇 Additional comments (7)
zefiro-core/zefiro-cwl-parser/src/values/document.rs (2)
19-25
: LGTM! Clean and idiomatic Deref implementation.The Deref implementation provides ergonomic access to the underlying HashMap while maintaining encapsulation.
96-98
: Add test coverage for theto_yaml
method.The current test in
test_save_schema_to_yaml
doesn't verify the written content.Consider adding a test that verifies the serialized output:
#[test] fn test_to_yaml_content() { let input = r#" message: "Hello" number: 42 "#; let values = CwlValues::from_string(input).unwrap(); let mut output = Vec::new(); values.to_yaml(&mut output).unwrap(); let output_str = String::from_utf8(output).unwrap(); assert!(output_str.contains("message: Hello")); assert!(output_str.contains("number: 42")); }🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 96-97: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L96-L97
Added lines #L96 - L97 were not covered by testszefiro-core/zefiro-cwl-parser/src/schema/document.rs (5)
1-9
: Split the import statements into multiple lines for better readability.The pipeline reports a formatting issue indicating that these chained imports need splitting.
13-19
: Well-structured enum definition with appropriate attributes!The
CwlSchema
enum is well-documented and correctly uses#[serde(untagged)]
for flexible deserialization of CWL documents.
37-53
: Robust implementation with good error handling!The
from_yaml
method includes thorough validation of CWL version and class, with clear error messages.
121-127
: Clean FromStr implementation!Good use of delegation to the existing
from_yaml
method.
11-11
: Consider supporting additional CWL versions.The
SUPPORTED_VERSIONS
constant only includes "v1.2". Consider whether support for other CWL versions (e.g., v1.0, v1.1) would be beneficial for broader compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (9)
zefiro-core/zefiro-cwl-parser/src/values/document.rs (5)
19-25
: Add documentation for the Deref implementation.While the implementation is correct, adding documentation explaining why
Deref
is implemented would help future maintainers understand the design decision.+/// Implements Deref to allow direct access to the underlying HashMap operations, +/// making CwlValues behave like a HashMap while maintaining type safety. impl Deref for CwlValues {
72-76
: Fix typo in error message.There's a typo in the error message: "dserialize" should be "serialize".
- .map_err(|e| Error::msg(format!("Failed to dserialize CWL values to string: {}", e))) + .map_err(|e| Error::msg(format!("Failed to serialize CWL values to string: {}", e)))
78-95
: Improve the example forto_yaml
.The current example doesn't handle the Result returned by
to_yaml
.let values = CwlValues::from_string(yaml_input).expect("Failed to serialize CWL values document"); let mut tmpfile = tempfile::tempfile().unwrap(); let mut writer = BufWriter::new(tmpfile); -values.to_yaml(writer); +values.to_yaml(writer).expect("Failed to write YAML to file");
101-152
: Consider adding negative test cases.The current tests cover the happy path well, but adding tests for error conditions would improve robustness:
- Invalid YAML syntax
- Missing required fields
- Invalid file paths
- I/O errors
Would you like help drafting these additional test cases?
1-152
: Consider architectural improvements for better error handling and construction.Two suggestions to enhance the codebase:
- Implement a custom error type instead of using
anyhow::Error
directly. This would provide more specific error handling and better documentation of possible failure modes.- Consider adding a builder pattern for constructing complex
CwlValues
instances programmatically.Would you like help implementing either of these suggestions?
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 96-97: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L96-L97
Added lines #L96 - L97 were not covered by testszefiro-core/zefiro-cwl-parser/src/schema/document.rs (4)
11-11
: Document version support strategy.Consider adding documentation about:
- Why only v1.2 is supported
- Plans for supporting other versions
- Migration path for users with different versions
31-34
: Enhance error handling in from_path.Consider wrapping the errors with more context:
pub fn from_path(path: &str) -> Result<Self> { - let reader = BufReader::new(File::open(path)?); - Self::from_yaml(serde_yaml::from_reader(reader)?) + let file = File::open(path) + .map_err(|e| anyhow::anyhow!("Failed to open file '{}': {}", path, e))?; + let reader = BufReader::new(file); + Self::from_yaml(serde_yaml::from_reader(reader) + .map_err(|e| anyhow::anyhow!("Failed to parse YAML from '{}': {}", path, e))?) }
121-127
: Enhance FromStr error handling.Consider adding more context to the error:
fn from_str(s: &str) -> Result<Self> { - Self::from_yaml(serde_yaml::from_str(s)?) + serde_yaml::from_str(s) + .map_err(|e| anyhow::anyhow!("Failed to parse string as YAML: {}", e)) + .and_then(Self::from_yaml) }
129-185
: Enhance test coverage with additional cases.Consider adding tests for:
- Invalid CWL versions
- Unsupported document classes
- Malformed YAML
- Empty files
- Files with missing required fields
Example test cases:
#[test] fn test_unsupported_version() { let yaml = r#" cwlVersion: v1.0 class: CommandLineTool id: step "#; assert!(matches!( CwlSchema::from_string(yaml), Err(e) if e.to_string().contains("Unsupported CWL version") )); } #[test] fn test_unsupported_class() { let yaml = r#" cwlVersion: v1.2 class: Expression id: step "#; assert!(matches!( CwlSchema::from_string(yaml), Err(e) if e.to_string().contains("Unsupported CWL document class") )); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/document.rs
(1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
zefiro-core/zefiro-cwl-parser/src/values/document.rs
[warning] 96-97: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L96-L97
Added lines #L96 - L97 were not covered by tests
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
[warning] 116-117: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L116-L117
Added lines #L116 - L117 were not covered by tests
🔇 Additional comments (4)
zefiro-core/zefiro-cwl-parser/src/values/document.rs (2)
1-17
: Well-structured imports and struct definition!The code demonstrates good practices with:
- Appropriate use of imports
- Clear documentation
- Correct use of derive macros
- Smart use of serde flattening for clean serialization
96-98
: Add test coverage for theto_yaml
method.While there is an integration test in
test_cwlvalues_to_yaml
, direct unit tests for edge cases would be valuable.Would you like help drafting unit tests that exercise error cases for
to_yaml
?🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 96-97: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L96-L97
Added lines #L96 - L97 were not covered by testszefiro-core/zefiro-cwl-parser/src/schema/document.rs (2)
116-118
: Improve test coverage for to_yaml method.The
to_yaml
method's error handling isn't covered by tests. Consider adding test cases for:
- Writing to a failing writer (e.g., a full disk simulation)
- Invalid YAML serialization cases
The codecov report confirms this gap in coverage.
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 116-117: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L116-L117
Added lines #L116 - L117 were not covered by tests
13-19
: Verify untagged enum serialization behavior.The
#[serde(untagged)]
attribute means the enum variant will be determined by attempting to match the fields rather than using a "type" field. This could lead to ambiguous cases ifCommandLineTool
andWorkflow
have similar fields.Run this script to check for potential field overlaps:
✅ Verification successful
Let me try a different search pattern since the previous one didn't yield results.
Let me check one more thing to verify how these types are distinguished in CWL.
Based on the analysis of the structs and error handling code, I can now provide the final response:
#[serde(untagged)]
usage is appropriate and safeThe untagged enum serialization is safe in this case because:
- Both types have a distinct
class
field that serves as a discriminator- The structs have different field combinations (CommandLineTool has CommandInputParameter/CommandOutputParameter while Workflow has WorkflowStep)
- The error handling code in document.rs shows there's validation of the
class
field to ensure proper type discrimination- The field structures are sufficiently different to prevent ambiguous matching
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for struct definitions to analyze field overlap echo "CommandLineTool fields:" ast-grep --pattern 'struct CommandLineTool { $$$ }' echo -e "\nWorkflow fields:" ast-grep --pattern 'struct Workflow { $$$ }'Length of output: 210
Script:
#!/bin/bash # Search for struct definitions in the entire codebase echo "Looking for CommandLineTool struct:" rg "struct CommandLineTool" -A 10 echo -e "\nLooking for Workflow struct:" rg "struct Workflow" -A 10Length of output: 6169
Script:
#!/bin/bash # Look for class field values to confirm type discrimination echo "Searching for class field assignments or validations:" rg "class.*=.*\".*\"" -A 1 rg "class.*=.*CommandLineTool" -A 1 rg "class.*=.*Workflow" -A 1Length of output: 635
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (7)
zefiro-core/zefiro-cwl-parser/src/values/document.rs (4)
12-17
: Enhance struct documentation with more details.While the current documentation is good, consider adding more details about:
- The expected format of the key-value pairs
- Common use cases
- Examples of valid CWL values
/// Represents a collection of CWL input and output values as key-value pairs +/// +/// # Examples +/// +/// ``` +/// # use zefiro_cwl_parser::values::document::CwlValues; +/// let yaml_input = r#" +/// message: "Hello World" +/// input_file: +/// class: File +/// location: "s3://bucket/input.txt" +/// "#; +/// let values = CwlValues::from_string(yaml_input).unwrap(); +/// ```
74-75
: Fix typo in error message.There's a typo in the error message: "dserialize" should be "deserialize".
- .map_err(|e| Error::msg(format!("Failed to dserialize CWL values to string: {}", e))) + .map_err(|e| Error::msg(format!("Failed to deserialize CWL values to string: {}", e)))
80-95
: Improve doc example for to_yaml method.The example code doesn't handle the Result returned by
to_yaml
, which could mask potential errors.let values = CwlValues::from_string(yaml_input).expect("Failed to serialize CWL values document"); let mut tmpfile = tempfile::tempfile().unwrap(); let mut writer = BufWriter::new(tmpfile); -values.to_yaml(writer); +values.to_yaml(writer).expect("Failed to write YAML");
101-131
: Enhance test coverage and maintainability.Consider the following improvements to the test suite:
- Add test cases for error scenarios (e.g., invalid YAML, file not found)
- Make test data paths configurable through environment variables or test constants
- Add tests for empty documents and edge cases
Here's an example of additional test cases:
#[test] fn test_cwlvalues_from_path_not_found() { let result = CwlValues::from_path("non_existent_file.yml"); assert!(result.is_err()); assert!(result.unwrap_err().to_string().contains("Failed to open file")); } #[test] fn test_cwlvalues_from_string_invalid_yaml() { let result = CwlValues::from_string("invalid: : yaml: :"); assert!(result.is_err()); } #[test] fn test_cwlvalues_empty_document() { let result = CwlValues::from_string("{}").unwrap(); assert_eq!(result.values.len(), 0); }zefiro-core/zefiro-cwl-parser/src/schema/document.rs (3)
11-11
: Consider using a more specific type for version constants.Instead of using a slice of string slices, consider using a const array for better type safety and zero-cost abstraction.
-const SUPPORTED_VERSIONS: &[&str] = &["v1.2"]; +const SUPPORTED_VERSIONS: [&str; 1] = ["v1.2"];
37-53
: Consider using const string comparison for version check.The version comparison could be more efficient using const string comparison.
-ensure!( - SUPPORTED_VERSIONS.contains(&version), - "Unsupported CWL version: {version}" -); +ensure!( + SUPPORTED_VERSIONS.iter().any(|&v| v.eq_ignore_ascii_case(version)), + "Unsupported CWL version: {version}" +);
129-161
: Add tests for error cases and edge cases.While the happy path is well tested, consider adding tests for:
- Invalid CWL versions
- Unsupported document classes
- Malformed YAML
- File not found errors
Example test cases:
#[test] fn test_invalid_cwl_version() { let yaml = r#" cwlVersion: v2.0 class: CommandLineTool id: step "#; assert!(matches!( CwlSchema::from_string(yaml).unwrap_err().to_string(), s if s.contains("Unsupported CWL version") )); } #[test] fn test_unsupported_class() { let yaml = r#" cwlVersion: v1.2 class: UnsupportedType id: step "#; assert!(matches!( CwlSchema::from_string(yaml).unwrap_err().to_string(), s if s.contains("Unsupported CWL document class") )); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/values/document.rs
(1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
zefiro-core/zefiro-cwl-parser/src/values/document.rs
[warning] 96-97: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L96-L97
Added lines #L96 - L97 were not covered by tests
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
[warning] 116-117: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L116-L117
Added lines #L116 - L117 were not covered by tests
🔇 Additional comments (6)
zefiro-core/zefiro-cwl-parser/src/values/document.rs (2)
19-25
: Clean Deref implementation!The Deref implementation provides ergonomic access to the underlying HashMap while maintaining encapsulation.
96-98
: Add test coverage for error cases in to_yaml.The current test only covers the success path. Consider adding tests for error cases, such as writing to a failed writer.
Would you like help drafting additional test cases for error scenarios?
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 96-97: zefiro-core/zefiro-cwl-parser/src/values/document.rs#L96-L97
Added lines #L96 - L97 were not covered by testszefiro-core/zefiro-cwl-parser/src/schema/document.rs (4)
1-9
: LGTM! Imports are properly organized.The imports are now correctly formatted and grouped according to Rust conventions.
13-19
: LGTM! Well-designed enum with appropriate derives.The enum is well-documented and correctly uses untagged serde representation for flexible deserialization of CWL documents.
27-27
: Update example paths to be relative to the crate root.The doc examples use incorrect relative paths. They should be relative to the crate root.
-let yaml_file = "examples/data/clt-step-schema.yml"; +let yaml_file = "zefiro-core/zefiro-cwl-parser/examples/data/clt-step-schema.yml";Also applies to: 110-110
121-127
: LGTM! Clean FromStr implementation.The implementation correctly delegates to
from_yaml
and follows Rust error handling patterns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs (1)
40-57
: 🛠️ Refactor suggestionAdd maximum resource fields from CWL spec
Consider adding maximum resource fields:
- coresMax
- ramMax
- tmpdirMax
- outdirMax
🧹 Nitpick comments (10)
zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs (3)
18-19
: Consider validation for theid
field.The
id
field uses#[serde(default)]
which defaults to an empty string. Consider making thisOption<String>
instead, as an empty string might not be a valid identifier in CWL.- #[serde(default)] - pub id: String, + pub id: Option<String>,
24-24
: Consider validating workflow requirements for duplicates.The
requirements
vector could potentially contain duplicate requirements. Consider adding validation during deserialization to ensure uniqueness.You could implement custom deserialization or add a validation method:
impl Workflow { pub fn validate(&self) -> Result<(), String> { let mut seen = std::collections::HashSet::new(); for req in &self.requirements { if !seen.insert(req.get_type()) { return Err(format!("Duplicate requirement type: {:?}", req.get_type())); } } Ok(()) } }
82-82
: Enhance type safety for scatter_method.Consider using an enum for
scatter_method
to restrict values to valid options: "dotproduct", "nested_crossproduct", "flat_crossproduct".#[derive(Clone, Debug, Deserialize, Serialize)] #[serde(rename_all = "snake_case")] pub enum ScatterMethod { DotProduct, NestedCrossProduct, FlatCrossProduct, }Then update the field:
- pub scatter_method: Option<String>, + pub scatter_method: Option<ScatterMethod>,zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs (4)
4-4
: Consider supporting multiple CWL versionsThe constant
SUPPORTED_CWL_VERSIONS
only includes v1.2. Consider adding support for earlier versions (v1.0, v1.1) to ensure broader compatibility with existing CWL workflows.
11-18
: Add documentation link and consider additional workflow requirements
- Add the CWL specification link in the documentation comment.
- Consider implementing additional workflow requirements from the CWL spec:
- SubworkflowFeatureRequirement
- MultipleInputFeatureRequirement
- StepInputExpressionRequirement
20-29
: Add documentation link and consider additional command line tool requirements
- Add the CWL specification link in the documentation comment.
- Consider implementing additional requirements from the CWL spec:
- InitialWorkDirRequirement
- EnvVarRequirement
- ShellCommandRequirement
101-108
: Consider adding validation for WorkReuseConsider implementing validation to ensure that when WorkReuse is specified, enable_reuse is true. This would prevent the redundant specification of WorkReuse with enable_reuse set to false.
Example implementation:
impl WorkReuse { pub fn validate(&self) -> Result<(), &'static str> { if !self.enable_reuse { return Err("WorkReuse specified but enable_reuse is false"); } Ok(()) } }zefiro-core/zefiro-cwl-parser/src/schema/document.rs (3)
94-97
: Enhance error message infrom_string
.The current error message wraps the underlying error but doesn't guide users on potential fixes. Consider adding common failure reasons to help users debug issues.
- .map_err(|e| Error::msg(format!("Failed to parse CWL schema from string: {}", e))) + .map_err(|e| Error::msg(format!( + "Failed to parse CWL schema from string. Ensure the string contains valid YAML and includes required 'cwlVersion' and 'class' fields. Error: {}", + e + )))
104-115
: Improve doc example to show error handling.The example in the documentation doesn't demonstrate error handling and uses
unwrap()
. Consider showing proper error handling practices./// ``` /// use zefiro_cwl_parser::schema::document::CwlSchema; /// use std::fs::File; /// use std::io::BufWriter; /// /// let yaml_file = "examples/data/clt-step-schema.yml"; -/// let schema = CwlSchema::from_path(yaml_file).expect("Failed to serialize CWL schema document"); -/// let mut tmpfile = tempfile::tempfile().unwrap(); -/// let mut writer = BufWriter::new(tmpfile); -/// schema.to_yaml(writer); +/// # fn main() -> anyhow::Result<()> { +/// let schema = CwlSchema::from_path(yaml_file)?; +/// let tmpfile = tempfile::tempfile()?; +/// let writer = BufWriter::new(tmpfile); +/// schema.to_yaml(writer)?; +/// # Ok(()) +/// # } /// ```
129-197
: Add negative test cases for validation.The test suite covers the happy path and write errors well, but could benefit from negative test cases for:
- Unsupported CWL versions
- Invalid class values
- Missing required fields
Here's a suggested test implementation:
#[test] fn test_validation_errors() { // Test unsupported version let yaml_str = r#" cwlVersion: v0.1 class: CommandLineTool id: step "#; assert!(CwlSchema::from_string(yaml_str).is_err()); // Test invalid class let yaml_str = r#" cwlVersion: v1.2 class: InvalidClass id: step "#; assert!(CwlSchema::from_string(yaml_str).is_err()); // Test missing required fields let yaml_str = r#" id: step "#; assert!(CwlSchema::from_string(yaml_str).is_err()); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/document.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs
(1 hunks)zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs
(1 hunks)
🧰 Additional context used
📓 Learnings (3)
zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs (1)
Learnt from: nsyzrantsev
PR: zefiroproj/zefiro#4
File: zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs:83-85
Timestamp: 2025-01-04T16:02:13.319Z
Learning: For the `ToolTimeLimit.timelimit` field in `zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs`, the user indicated that the type can be either a string or a u32.
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (2)
Learnt from: nsyzrantsev
PR: zefiroproj/zefiro#4
File: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs:52-59
Timestamp: 2025-01-04T17:12:06.568Z
Learning: The user indicated that fields like `label`, `secondaryFiles`, `streamable`, `doc`, `format`, `loadContents`, and `loadListing` are not required for the MVP version of the CommandOutputParameter struct.
Learnt from: nsyzrantsev
PR: zefiroproj/zefiro#4
File: zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs:1-81
Timestamp: 2025-01-04T19:10:32.963Z
Learning: Full validation of CWL files and related tests will be implemented in the `zefiro-cwl-validator` crate, rather than in the `zefiro-cwl-parser` crate.
zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs (1)
Learnt from: nsyzrantsev
PR: zefiroproj/zefiro#4
File: zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs:61-70
Timestamp: 2025-01-04T16:22:22.394Z
Learning: In `zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs`, the user explicitly wants the `WorkflowStep::run` field to support only `CommandLineTool`.
🪛 GitHub Check: codecov/patch
zefiro-core/zefiro-cwl-parser/src/schema/document.rs
[warning] 116-117: zefiro-core/zefiro-cwl-parser/src/schema/document.rs#L116-L117
Added lines #L116 - L117 were not covered by tests
🔇 Additional comments (16)
zefiro-core/zefiro-cwl-parser/src/schema/workflow.rs (3)
37-47
: LGTM! Well-structured input parameter implementation.The
WorkflowInputParameter
struct is well-documented and correctly implements the CWL specification.
49-67
: LGTM! Clean output parameter implementation with flexible output source handling.The
WorkflowOutputParameter
andWorkflowOutputParameterOutputSource
implementations are well-structured and provide good flexibility for handling both single and array output sources.
85-85
: Fix incorrect documentation comment.The comment incorrectly states it defines the input parameters of the workflow step (
out
section). It should reference thein
section instead.-/// Defines the input parameters of the workflow step (`out` section). +/// Defines the input parameters of the workflow step (`in` section).zefiro-core/zefiro-cwl-parser/src/schema/requirements.rs (4)
31-38
: LGTM!The implementation is clean and includes proper documentation with the spec link.
74-78
: LGTM!The implementation correctly follows the CWL spec and includes proper documentation.
80-93
: LGTM!The implementation correctly handles both numeric and expression-based time limits as per CWL spec, using an untagged enum for flexible parsing.
95-99
: LGTM!The implementation correctly follows the CWL spec and includes proper documentation.
zefiro-core/zefiro-cwl-parser/src/schema/command_line_tool.rs (7)
1-5
: LGTM!The imports are well-organized and correctly specify the required dependencies for CWL schema implementation.
6-28
: LGTM!The CommandLineTool struct is well-documented and correctly implements the CWL specification with appropriate derive macros and serde attributes.
40-54
: LGTM!The CommandInputParameter struct correctly implements the essential fields required for the MVP version.
56-68
: LGTM!The CommandOutputParameter struct correctly implements the essential fields required for the MVP version.
86-97
: LGTM!The OutputBinding struct correctly implements the essential fields required for the MVP version.
76-77
: 🛠️ Refactor suggestionChange position type to i32
According to the CWL specification, the position field should be an integer that can be negative. Currently, it's
u32
which only allows positive numbers.#[serde(skip_serializing_if = "Option::is_none")] - pub position: Option<u32>, + pub position: Option<i32>,Likely invalid or redundant comment.
18-19
: 🛠️ Refactor suggestionMake
id
field optionalAccording to the CWL specification, the
id
field is optional. Currently, it defaults to an empty string, which might not be the best representation of an optional field.- #[serde(default)] - pub id: String, + #[serde(skip_serializing_if = "Option::is_none")] + pub id: Option<String>,Likely invalid or redundant comment.
zefiro-core/zefiro-cwl-parser/src/schema/document.rs (2)
121-127
: LGTM! Clean and efficient implementation.The
FromStr
implementation correctly reuses thefrom_yaml
method, maintaining consistency in parsing logic.
13-19
: Consider potential deserialization ambiguity with untagged enum.The
#[serde(untagged)]
attribute means serde will try to deserialize into each variant until one succeeds. While theclass
field check infrom_yaml
prevents ambiguity, direct deserialization through serde (e.g.,from_string
) might be ambiguous if a document could match both variants.Let's verify if there are any cases where a document could be ambiguously parsed:
✅ Verification successful
The untagged enum deserialization is safe due to class field validation
After analyzing the code, I can confirm that the deserialization is safe. The
CwlSchema
enum uses two approaches for parsing:
- The primary
from_yaml
method explicitly checks theclass
field before attempting deserialization, ensuring unambiguous parsing:match value.get("class").and_then(Value::as_str) {
- The
from_string
andFromStr
implementations both ultimately usefrom_yaml
, inheriting its safety guarantees.Both
CommandLineTool
andWorkflow
structs have distinct required fields and structures, and the class field validation ensures correct variant selection, making the#[serde(untagged)]
attribute safe to use in this context.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for CommandLineTool and Workflow struct definitions to analyze field overlap ast-grep --pattern 'struct CommandLineTool {$$$}' ast-grep --pattern 'struct Workflow {$$$}'Length of output: 96
Script:
#!/bin/bash # Let's try a different approach to find the struct definitions rg "struct (CommandLineTool|Workflow)" -A 20Length of output: 9041
Script:
#!/bin/bash # Let's check how the class field is used in deserialization rg "class.*=.*\"(Workflow|CommandLineTool)\"" -B 2 -A 2Length of output: 55
Script:
#!/bin/bash # Let's check the implementation of from_yaml or similar parsing methods rg "fn from_yaml|fn from_str" -A 10Length of output: 3836
Summary by CodeRabbit
Release Notes
New Features
zefiro-cwl-parser
with version0.1.0
.Documentation
Chores