-
Notifications
You must be signed in to change notification settings - Fork 6.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: API Change: k_work #28794
Comments
This change will be extremally troublesome. I will not argue with the fact that documentation was clear about it but you plan on taking away a useful feature. If keeping and documenting existing behavior is not an option can you add an API call that would perform cancel&ubsubmit mentioning that it requires to be performed in a right context? |
Not sure if it belong to this PR, but the |
This doesn't make sense to me, but I think the misunderstanding is one I originally had too:
The whole point of the discussion in #22826 (review) is that |
Whether the work submit can fail or not was not the point I tried to make, it just would be practical for the caller to know if the work was submitted or not. The code path in the caller might be different according to this info. See example in |
Personally I like being able to use non-negative results to distinguish success states, so I'd be fine with having I don't think it'd help you though. A return value from net_tc_submit_to_tx_queue, is an example of the sort of potential race we have to deal with: Note: I'm not saying the net code is wrong. I'm saying it's not obvious that it's correct. In the general case you cannot determine that a work item has completed its task based on the return value of |
No longer relevant. The existing API will be retained with documentation explaining its limitations, and new API added that doesn't have those limitations. See #28579 (comment) |
On Hold
Pending maintainer response to #28579 (comment)
tl;dr This RFC addresses the requirement to notify the community in the case of changes to stable API that might require changes to existing code.
Problem Description
As identified in #27356 and the issues and comments it links to, Zephyr's
k_work
API has a variety of race conditions that prevent users from being able to use it to determine whether a work item is delayed, pending, in progress, or completed. This has resulted in a few bugs, and in a mistaken assumption that the results returned by API functions can be used to infer the work item state.They cannot: work items by their nature proceed asynchronously based on operations initiated from threads and interrupts of various priorities, and by the time a return value is made available to a caller the state may have changed.
The goal is to revise the API slightly to conform to its documented behavior and original architecture. As a consequence the several bugs have been fixed and some behavior and return values changed.
Proposed change
k_delayed_work_cancel()
will no longer remove an item submitted throughk_delayed_work_submit_to_queue()
if the item has progress to where it has been made pending (i.e. the delay has elapsed andk_work_submit_to_queue()
has been invoked).-EALREADY
is no longer an expected return value fromk_delayed_work_submit_to_queue()
.-EALREADY
is no longer an expected return value fromk_delayed_work_cancel()
.EINPROGRESS
has been added as a return value. The interpretation of-EINVAL
has changed.In this case return values from
k_delayed_work
functions have changed along with behavior related to canceling delayed work items.The goal is to make the implementation consistent with the API documentation, and to represent as accurately as possible the work item state, while reducing complexity and bugs.
Detailed RFC
The changes have been implemented in #28579.
The core behavior change is that
k_delayed_work_cancel()
will no longer remove an item submitted throughk_delayed_work_submit_to_queue()
if the item has progress to where it has been made pending (i.e. the delay has elapsed andk_work_submit_to_queue()
has been invoked).This path has been chosen because:
k_work_submit_to_queue()
, making this feature specific to delayed work items.Dependencies
Users of the non-delayed
k_work
API will see no change.Most users of the
k_delayed_work
API will see no change.Code that uses the return value of
k_delayed_work_cancel()
to infer whether a task had been scheduled but had not completed will need to rework the code to track task completion within the task state, including detecting that the task was completed externally prior to the work item being processed. #28579 demonstrates an alternative approach for the in-tree use of this idiom in the Bluetooth subsystem.Concerns and Unresolved Questions
The ability to remove an item from a work queue was motivated by cases where a task needed to be completed synchronously and immediately. However a task may be in a state where it can't be cancelled. The remediation necessary to handle that case also handles the case where it could have been cancelled, reducing the value of cancellation. That remediation requires that the code providing the work item handler track the state of the work.
This concern is being discussed at #28579 (review) and other comments.
Alternatives Considered
A more complex use of atomic flags could be used to trace the work item's progress through the states identified in #27356, but doing so would be very complex and would not change the fundamental issue that API return values are stale representations of work item state.
The text was updated successfully, but these errors were encountered: