-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set max_cores and max_workers to None #478
Conversation
WalkthroughThe changes in this pull request primarily focus on modifying the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (3)
executorlib/interactive/executor.py (1)
150-152
: Consider documenting the default behavior.The function documentation should be updated to explain what happens when both
max_workers
andmax_cores
are None. This is important for users to understand the resource allocation behavior in such cases.executorlib/__init__.py (2)
107-110
: Update method documentation to reflect optional parameters.The docstring for
__new__
still describesmax_workers
andmax_cores
as(int)
without indicating they're optional. Please update the parameter descriptions to:
- Indicate they're optional
- Document the behavior when None is provided
- max_workers (int): for backwards compatibility with the standard library... + max_workers (Optional[int]): for backwards compatibility with the standard library... - max_cores (int): defines the number cores which can be used in parallel + max_cores (Optional[int]): defines the number cores which can be used in parallel. If None...
Line range hint
177-183
: Verify interaction between resource defaults and optional parameters.The default
resource_dict
setscores=1
, butmax_cores
andmax_workers
can now be None. Please clarify:
- The precedence between these parameters
- How None values interact with the default resource configuration
Consider documenting these interactions in the class docstring to help users understand the behavior when different combinations of parameters are provided.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
executorlib/__init__.py
(2 hunks)executorlib/interactive/executor.py
(1 hunks)executorlib/standalone/inputcheck.py
(2 hunks)
🔇 Additional comments (2)
executorlib/interactive/executor.py (1)
150-152
: Review resource allocation logic.
Making both parameters optional could impact how resources are allocated, especially in block allocation mode where max_cores
is used to calculate max_workers
(e.g., int(max_cores / cores_per_worker)
). Ensure there's proper fallback behavior when these values are None.
#!/bin/bash
# Search for places where max_cores is used in calculations
rg "max_cores.*cores_per_worker" -A 2
executorlib/__init__.py (1)
86-89
: Verify backward compatibility with existing code.
The change from required integers to Optional[int] with None defaults for max_workers
and max_cores
could impact existing code that expects these parameters to be integers. Please ensure that:
- Downstream code handles None values appropriately
- Default behavior when None is provided is well-defined
✅ Verification successful
Making parameters optional is safe and properly handled
The validation logic in validate_number_of_cores
properly handles None values for both max_workers
and max_cores
:
- Returns 1 if both are None (default case)
- Uses
max_workers
ifmax_cores
is None - Uses
max_cores
if provided check_max_workers_and_cores
validates these parameters for specific backends
The change to make these parameters optional is safe and well-implemented with appropriate fallback behavior.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for direct usage of max_workers/max_cores to verify handling of None values
rg -A 3 "max_(workers|cores)\s*[=:]" --type python
Length of output: 83
Script:
#!/bin/bash
# Search for direct usage of max_workers/max_cores without file type flag
rg -A 3 "max_(workers|cores)\s*[=:]"
Length of output: 31989
Script:
#!/bin/bash
# Search for validation and handling of None values
rg "def validate_number_of_cores|def check_max_workers_and_cores" -A 10
Length of output: 1599
executorlib/standalone/inputcheck.py
Outdated
if max_workers is None and max_cores is None: | ||
return 1 | ||
elif max_workers is not None and max_cores is None: | ||
return max_workers | ||
return max_cores | ||
else: | ||
return max_cores |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Function needs type safety improvements and better documentation.
Several issues need to be addressed:
- The function signature doesn't match the implementation (should use Optional[int])
- Missing null checks before using the parameters
- Docstring doesn't describe the new behavior with None values
Here's the suggested fix:
- def validate_number_of_cores(max_cores: int, max_workers: int) -> int:
+ def validate_number_of_cores(max_cores: Optional[int], max_workers: Optional[int]) -> int:
"""
Validate the number of cores and return the appropriate value.
+
+ Args:
+ max_cores: Optional maximum number of cores
+ max_workers: Optional maximum number of workers
+
+ Returns:
+ int: 1 if both parameters are None,
+ max_workers if max_cores is None,
+ max_cores otherwise
"""
if max_workers is None and max_cores is None:
return 1
elif max_workers is not None and max_cores is None:
return max_workers
- else:
+ elif max_cores is not None:
return max_cores
+ else:
+ raise ValueError("Invalid combination of max_cores and max_workers")
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if max_workers is None and max_cores is None: | |
return 1 | |
elif max_workers is not None and max_cores is None: | |
return max_workers | |
return max_cores | |
else: | |
return max_cores | |
def validate_number_of_cores(max_cores: Optional[int], max_workers: Optional[int]) -> int: | |
""" | |
Validate the number of cores and return the appropriate value. | |
Args: | |
max_cores: Optional maximum number of cores | |
max_workers: Optional maximum number of workers | |
Returns: | |
int: 1 if both parameters are None, | |
max_workers if max_cores is None, | |
max_cores otherwise | |
""" | |
if max_workers is None and max_cores is None: | |
return 1 | |
elif max_workers is not None and max_cores is None: | |
return max_workers | |
elif max_cores is not None: | |
return max_cores | |
else: | |
raise ValueError("Invalid combination of max_cores and max_workers") |
def check_max_workers_and_cores( | ||
max_workers: Optional[int], max_cores: Optional[int] | ||
) -> None: | ||
if max_workers is not None: | ||
raise ValueError( | ||
"The number of workers cannot be controlled with the pysqa based backend." | ||
) | ||
if max_cores != 1: | ||
if max_cores is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Review needs to be updated - test cases don't align with new optional parameters
The verification revealed an inconsistency between the changes and the test cases:
- The function
check_max_workers_and_cores
was modified to make parameters optional (Optional[int]) - However, the test cases in
test_check_max_workers_and_cores
still pass non-None values and expect ValueError - The actual usage in
create_file_executor
passes non-None values by default (max_workers=1, max_cores=1
)
This indicates that either:
- The function shouldn't have been made optional, or
- The test cases and usage need to be updated to reflect the new optional behavior
🔗 Analysis chain
LGTM! Type hints and validation logic updated correctly.
The changes properly implement optional parameters while maintaining the original validation purpose. The error messages remain clear and appropriate.
Let's verify the usage of this function across the codebase:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for all calls to check_max_workers_and_cores to ensure they handle None correctly
ast-grep --pattern 'check_max_workers_and_cores($$$)'
Length of output: 479
Script:
#!/bin/bash
# Let's check the test file to understand the test cases
cat tests/test_shared_input_check.py
# Also check the executor implementation to see how it handles these parameters
cat executorlib/cache/executor.py
Length of output: 8418
max_workers: Optional[int] = None, | ||
backend: str = "local", | ||
max_cores: int = 1, | ||
max_cores: Optional[int] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Type mismatch in validate_number_of_cores function signature
The function signature in executorlib/standalone/inputcheck.py
doesn't match the optional types:
def validate_number_of_cores(max_cores: int, max_workers: int) -> int:
While the implementation correctly handles None values, the type hints need to be updated to Optional[int]
to match the caller's signature.
executorlib/standalone/inputcheck.py
: Update parameter types toOptional[int]
🔗 Analysis chain
Verify validation handling for None values.
While making max_workers
and max_cores
optional aligns with the PR objectives, we should ensure that validate_number_of_cores
properly handles None values for both parameters.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for validate_number_of_cores implementation to verify None handling
ast-grep --pattern 'def validate_number_of_cores($_: Optional[int], $_: Optional[int])'
Length of output: 1023
Script:
#!/bin/bash
# Let's search for validate_number_of_cores implementation with a broader pattern
rg "def validate_number_of_cores" -A 10
Length of output: 750
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
tests/test_local_executor_future.py (1)
71-73
: LGTM! Consider documenting the max_workers=1 choice.The addition of
max_workers=1
improves test determinism by ensuring single-threaded execution. This is particularly important for this test which verifies future independence from executor lifecycle.Consider adding a comment explaining why
max_workers=1
is specifically chosen here to help future maintainers understand the test's constraints.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
tests/test_integration_pyiron_workflow.py
(6 hunks)tests/test_local_executor_future.py
(2 hunks)tests/test_shell_interactive.py
(1 hunks)
🔇 Additional comments (4)
tests/test_shell_interactive.py (1)
108-108
: LGTM! The explicit max_workers setting maintains test consistency.
The addition of max_workers=1
aligns with the test's existing sequential execution assumptions and matches the behavior in test_execute_single_task
which uses cores=1
.
Let's verify the sequential execution assumption is maintained across the codebase:
✅ Verification successful
The sequential execution assumption is consistently maintained across test files
The search results confirm that all test files consistently use max_workers=1
or max_cores=1
when testing sequential execution behavior. This includes:
test_shell_executor.py
: Usesmax_workers=1
for shell command execution teststest_local_executor.py
: Usesmax_workers=1
for MPI executor teststest_flux_executor.py
: Usesmax_workers=1
for Flux executor teststest_executor_backend_*.py
: Usesmax_cores=1
for various backend tests
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that other test files maintain consistent worker/core settings
# for sequential execution tests.
# Test: Search for similar test patterns that rely on sequential execution
rg -A 5 "test_.*executor.*" tests/ | rg -A 5 "max_workers.*=.*1|cores.*=.*1"
Length of output: 6774
tests/test_local_executor_future.py (1)
112-114
: LGTM! Verify test coverage for multi-worker scenarios.
The addition of max_workers=1
ensures reliable test execution by controlling concurrency.
Let's verify if we have test coverage for multi-worker scenarios elsewhere:
✅ Verification successful
Multi-worker scenarios are well covered across test suite
The codebase has comprehensive test coverage for both single and multi-worker configurations:
test_executor_backend_mpi.py
tests withmax_workers=2
test_shell_executor.py
includes tests with bothmax_workers=1
andmax_workers=2
test_shared_input_check.py
verifies various combinations of workers (1,2) and corestest_local_executor.py
andtest_flux_executor.py
both test withmax_workers=2
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for tests using InteractiveExecutor with different worker configurations
# Expect: Mix of single and multi-worker test cases
rg -l "InteractiveExecutor\(" tests/
rg "max_workers\s*=\s*\d+" tests/
Length of output: 3377
tests/test_integration_pyiron_workflow.py (2)
Line range hint 1-238
: Test implementation looks robust and well-structured.
The test class provides comprehensive coverage of executor behavior with dynamically defined objects, including:
- Clear test scenarios and edge cases
- Descriptive assertions with helpful messages
- Proper handling of timeouts and exceptions
77-77
: Consider if explicit max_workers=1 is necessary for these tests.
While the PR's objective is to make max_workers
optional (defaulting to None), these tests explicitly set max_workers=1
. This might be intentional for controlling concurrency in tests, but we should verify if it's necessary.
Let's analyze the test requirements:
Consider the following approaches:
- If sequential execution is required for these tests, document why
max_workers=1
is necessary - If parallel execution is safe, remove the explicit
max_workers
to test the default behavior - Consider splitting timing-sensitive tests from those that can run in parallel
Also applies to: 107-107, 138-138, 162-162, 192-192, 221-221
✅ Verification successful
Setting max_workers=1
is necessary for these tests
The explicit max_workers=1
setting is required for these tests because:
- Tests contain timing-sensitive operations with
sleep(0.1)
calls and explicit timeout checks - Tests verify state transitions through the
running
flag and callbacks - Tests assert specific execution states that require deterministic behavior:
test_callable
: Asserts that future is not done after submissiontest_callback
: Verifies state transitions from running to not runningtest_timeout
: Validates timeout behavior with precise timing
Removing the max_workers=1
constraint could lead to non-deterministic behavior and flaky tests due to race conditions in these timing-sensitive scenarios.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if these tests specifically need sequential execution
# Look for timing dependencies, race conditions, or order-specific assertions
# Search for sleep calls and timing-related assertions
echo "Checking timing dependencies:"
rg "sleep\(" tests/test_integration_pyiron_workflow.py -B 2 -A 2
# Search for assertions that might depend on execution order
echo "Checking order-dependent assertions:"
rg "assert.*running" tests/test_integration_pyiron_workflow.py -B 2 -A 2
Length of output: 907
Script:
#!/bin/bash
# Let's check the test cases and assertions to understand the execution requirements
# Check the test class and methods
echo "Test methods and their assertions:"
rg "def test_" tests/test_integration_pyiron_workflow.py -A 10
# Check for any parallel execution related code
echo "Parallel execution patterns:"
rg -i "parallel|concurrent|async|worker" tests/test_integration_pyiron_workflow.py
# Check the actual test implementations
echo "Full test implementations:"
cat tests/test_integration_pyiron_workflow.py
Length of output: 10211
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
tests/test_shared_input_check.py (1)
84-86
: Consider adding more test cases for parameter combinationsWhile the current tests verify error cases, consider adding test cases for valid scenarios:
- Both parameters being None (expected success case)
- Both parameters being valid integers where max_workers * max_cores is valid
Example addition:
def test_check_max_workers_and_cores(self): # Existing error cases with self.assertRaises(ValueError): check_max_workers_and_cores(max_workers=2, max_cores=None) with self.assertRaises(ValueError): check_max_workers_and_cores(max_workers=None, max_cores=2) with self.assertRaises(ValueError): check_max_workers_and_cores(max_workers=2, max_cores=2) # Add success cases check_max_workers_and_cores(max_workers=None, max_cores=None) # Should not raise check_max_workers_and_cores(max_workers=2, max_cores=4) # Should not raise when product is valid
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
executorlib/standalone/inputcheck.py
(3 hunks)tests/test_shared_input_check.py
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- executorlib/standalone/inputcheck.py
🔇 Additional comments (2)
tests/test_shared_input_check.py (2)
20-20
: LGTM: Import follows existing conventions
The new import is properly grouped with related input check functions and follows the alphabetical ordering.
Line range hint 1-109
: Verify test coverage for all related changes
The AI summary mentions changes to the Executor
class and create_executor
function. Let's verify that all these changes are properly tested.
def test_validate_number_of_cores(self): | ||
self.assertIsInstance( | ||
validate_number_of_cores(max_cores=None, max_workers=None), int | ||
) | ||
self.assertIsInstance( | ||
validate_number_of_cores(max_cores=1, max_workers=None), int | ||
) | ||
self.assertIsInstance( | ||
validate_number_of_cores(max_cores=None, max_workers=1), int | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance test coverage and documentation for validate_number_of_cores
While the test verifies return types, consider these improvements:
- Add a docstring explaining the test's purpose
- Add assertions for expected values, not just types
- Include error cases
- Test edge cases (e.g., negative values)
Example enhancement:
def test_validate_number_of_cores(self):
"""
Test validate_number_of_cores function handles various input combinations
and returns expected core counts.
"""
# Test return types and values
self.assertEqual(
validate_number_of_cores(max_cores=None, max_workers=None),
os.cpu_count(), # Expected default
)
self.assertEqual(
validate_number_of_cores(max_cores=2, max_workers=None),
2,
)
self.assertEqual(
validate_number_of_cores(max_cores=None, max_workers=2),
os.cpu_count() // 2, # Expected cores divided by workers
)
# Test error cases
with self.assertRaises(ValueError):
validate_number_of_cores(max_cores=-1, max_workers=None)
with self.assertRaises(ValueError):
validate_number_of_cores(max_cores=None, max_workers=-1)
Summary by CodeRabbit
New Features
Executor
class, allowing optional parameters formax_workers
,max_cores
, and other configuration options.ExecutorWithDependencies
class, including visual representation of task dependencies upon exiting the context manager.max_workers
, enhancing clarity in executor behavior during tests.Bug Fixes
Documentation