Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(engine): support completion condition for multi-instance #8321

Merged
1 commit merged into from
Dec 20, 2021

Conversation

lzgabel
Copy link
Contributor

@lzgabel lzgabel commented Dec 7, 2021

Description

Support completion condition expression for multi-instance.

Multi-instance loop-characteristics with a <completionCondition> are now used by Zeebe.
Zeebe expects the completion condition to be a FEEL-expression that evaluates to a boolean.
Processes that contain an unparsable completion condition are rejected.

The completion condition expression is evaluated each time an instance of a multi-instance element completes.
The expression is expected to evaluate to a boolean, if it doesn't an incident is raised.
The expression is evaluated at the scope of the instance element, not at the scope of the multi-instance element.
If it evaluates to false, the normal completion behavior of the multi-instance body is used.
If it evaluates to true, any still active child instances of the multi-instance are terminated and the multi-instance element is completed.

Related issues

closes #5439

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • The changes are backwards compatibility with previous versions
  • If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/0.25) to the PR, in case that fails you need to create backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criterias of the issue
  • New tests are written to ensure backwards compatibility with further versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • New content is added to the release announcement

@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 7, 2021

each parent execution of the created executions for each instance will have following variables:

  • nrOfElementInstances: the total number of instances
  • nrOfCompletedElementInstances: the number of already completed instances.

@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 7, 2021

Hi @saig0 , I've implemented the first version. Can you review it and give me some suggestions? :)

@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from 7babee6 to 2e2745f Compare December 8, 2021 07:29
@saig0
Copy link
Member

saig0 commented Dec 8, 2021

@lzgabel awesome 🎉 Thank you for your contribution 🚀

I'm on vacation soon. @korthout do you have time to review the PR? Thank you 🍪

@korthout
Copy link
Member

korthout commented Dec 8, 2021

@saig0 Sure! I'll have a look. Do you have anything that you think I should be especially aware of regarding this topic?

@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from 2e2745f to 7853013 Compare December 9, 2021 09:04
Copy link
Member

@korthout korthout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lzgabel Thanks for this contribution. I really like it! 🚀

There is quite a lot to unwrap here because this is a big topic. So as a first step in this review, I'd like to ask you to break the changes down into separate pull requests. We can then discuss each of the topics individually, which will help make the reviews easier and allows you to get the important functionality merged faster.

Specifically, it seems you're adding 2 features at once:

  1. transformation and evaluation of the completionCondition expression to terminate other instances of the multi instance and complete the multi instance when the condition is met (AKA completion condition for multi-instance)
  2. introduction of nrOfCompletedElementInstances and nrOfElementInstances as new local variables of a multi instance body, so these can be used in expressions (i.e. mainly for the completion condition expression, but they are also available to others)

In my opinion, these features should be developed independently. Therefore, let's focus this review on the main feature. Please open a new issue for the introduction of nrOfCompletedElementInstances and nrOfElementInstances as new local variables of a multi instance body. We should discuss whether that is the best solution, since it might break existing processes of other Zeebe users.

First, please remove the changes related to the nrOfCompletedElementInstances and nrOfElementInstances variables from this pull request.

Second, please add some test cases to MultiInstanceActivityTest.java to show that the logic works. If you struggle with this, please reach out so I can help.

Third, please make sure that an incident is created when the completion condition fails to evaluate to a boolean.

If you have any questions, please reach out. I'm happy to help. There's great potential in the changes you've provided here. Thanks again for working on this 🙇

@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from 7853013 to 7430c0d Compare December 10, 2021 06:28
@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 10, 2021

Hi @korthout, I've removed the nrOfCompletedElementInstances and nrOfElementInstances variables and added test cases of completionCondition , can you check this out, please? :)

@lzgabel lzgabel requested a review from korthout December 10, 2021 06:35
Copy link
Member

@korthout korthout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! Thanks for the fast changes @lzgabel 🚀

When I run mvn test -pl :zeebe-workflow-engine, I see some failing tests. I think this is mainly because of the changes to the unrelated processors:

  • SubProcessProcessor
  • CallActivityProcessor
  • ProcessMessageSubscriptionCorrelateProcessor
  • JobCompleteProcessor

I think that these don't require any changes. Instead, I would move the logic to terminate the child instances entirely to the responsibility of the MultiInstanceBodyProcessor.

Currently, in the MultiInstanceBodyProcessor you check that the completion condition is met (in the afterExecutionPathCompleted method which is executed for each completed child instance). I agree with that, but when that happens, we should not immediately complete the multi-instance. Instead, it should start terminating all the active child instances (by writing the TERMINATE_ELEMENT commands for them), which will asynchronously start terminating those tasks (and as a consequence cancel their respective jobs). Note, this also means you don't have to introduce a new JobIntent.TERMINATED, we already use CANCELLED for this same concept.

For each child instance that has completed its termination, the onChildTerminated method of the MultiInstanceBodyProcessor will be called. There we should check, whether there are still any child instances active, or that this was the last one to terminate.

As soon as all have terminated we can either:

  • terminate the multi-instance, if the intent of the multi-instance body is TERMINATING (this is the case if we were currently trying to terminate the multi-instance, and so all of its children were actively trying to terminate, this is simply what the current implementation does)
  • complete the multi-instance, because the only other reason why the children could terminate is in response to the completion condition having met

Please give this strategy a try and let me know if you have any questions.

@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from 7430c0d to 4e77ea0 Compare December 11, 2021 11:29
@lzgabel lzgabel requested a review from korthout December 11, 2021 11:30
Copy link
Member

@korthout korthout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 I'm so impressed with how fast you managed to tackle my comments. This is a difficult topic to implement and you're doing amazing. Your code has also improved massively and it was a joy to read and review.

For this review round, please add new commits for the changes instead of rebasing/squashing/amending, so I can review the changes independently.

I have a couple remarks, but we're already nearing the end.

🔧 Please also add a validation to ZeebeRuntimeValidators for the completion condition expression. This also requires a test case in ZeebeRuntimeValidationTest

EDIT: Also, it's my weekend. I'll stop for now and have another look on Monday 👍

@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 13, 2021

Hi @korthout , I've fixed all review comments, please check this out. :)

@lzgabel lzgabel requested a review from korthout December 13, 2021 12:43
Copy link
Member

@korthout korthout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes @lzgabel! I think you're almost there 🏁

I only have a few remarks and changes to request. Please have another look.

@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 13, 2021

Hi @korthout , thank you very much for your patience in correcting my problems, i've fixed all comments, please check this out, thanks. :)

Copy link
Member

@korthout korthout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @lzgabel. It looks great! Just one last thing 🤞

🔧 In addition to my last comment, please squash your changes. Feel free to turn it into 1 commit or to split it up in to several if you feel that makes more sense. Goal should be that future readers of the code can find why specific changes were made, so please provide a clear commit message.

@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from 4146448 to c1bd0eb Compare December 13, 2021 17:54
@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 13, 2021

Hi @korthout , I've squashed this changes, please check it. :)

@lzgabel lzgabel requested a review from korthout December 13, 2021 17:59
@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from c1bd0eb to 4465982 Compare December 14, 2021 06:33
@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 14, 2021

Hi @korthout , I'm very sorry. I lost a commit during squash this changes. I'll deal with it later. 🙇

@korthout
Copy link
Member

korthout commented Dec 14, 2021

@lzgabel I think you can find your previous state here: c1bd0eb. Let me know when I can have another look. 🙂

@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from 4465982 to c1bd0eb Compare December 14, 2021 11:24
@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 14, 2021

@lzgabel I think you can find your previous state here: c1bd0eb. Let me know when I can have another look. 🙂

Hi @korthout, I'm sorry for the delay. I've fixed the problem. please check it. 🙇

@korthout
Copy link
Member

Hi @lzgabel Sorry for the long wait. 🙇 I'm currently running some manual tests and checking whether all my assumptions were correct. For example, Operate should be able to show the terminated children and completed multi-instance element.

However, during testing I ran into the following issue:

I created a process with a multi-instance element:

<bpmn:multiInstanceLoopCharacteristics isSequential="true">
  <bpmn:extensionElements>
    <zeebe:loopCharacteristics inputCollection="= [1,2,3,4,5]" inputElement="index" />
  </bpmn:extensionElements>
  <bpmn:completionCondition>= index = value</bpmn:completionCondition>
</bpmn:multiInstanceLoopCharacteristics>

It expects that the value variable exists in the process instance and uses the index variable (which is set by the inputElement). However, it leads to an incident:

failed to evaluate expression ' index = value': no variable found for name 'index'

❓ The reason for this is that we evaluate the expression at the scope of the multi-instance body, while the inputElement index is a local variable only available to the child instance that just completed. The question is thus: should the completion condition expression be evaluated at the scope of the multi-instance body or at the scope of the child instance? Please share if you have thoughts about this. Personally, I still need to think about this a bit longer.

🔧 In the meantime, I have another question for you: would you be willing to write the documentation for this topic? This topic should be described in some detail in the reference documentation (probably on this page). I've already opened an issue for it here. If you're willing, please have a look.

@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 14, 2021

The question is thus: should the completion condition expression be evaluated at the scope of the multi-instance body or at the scope of the child instance?

I will follow up and take a look at this problem.

would you be willing to write the documentation for this topic?

Yes, my pleasure!

@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 15, 2021

The question is thus: should the completion condition expression be evaluated at the scope of the multi-instance body or at the scope of the child instance?

Hi @korthout, I'm currently running some manual tests through the camunda-bpm-platform.
The conclusion is that the completion condition expression should be evaluated in the scope of the child instance, except that the element type is callActivity.

for callActivity: the expression evaluation context has only access to the multi instance execution's variable and its parent, it can't access variables to in the callActivity execution scope.


As required by the specification:

The completionCondition Expression is a boolean predicate that is evaluated every time an instance completes. When evaluated to true, the remaining instances are cancelled.

This brings another question:

The variables were also removed when the child instance completed( in the DbElementInstanceState.removeInstance #L139 method ), therefore an incident will be raised:

failed to evaluate expression 'x': no variable found for name 'x'

@korthout
Copy link
Member

korthout commented Dec 15, 2021

Thanks @lzgabel for looking into this. 👍 And sorry for the late reply. I've also dug into it and for now the topic is still under discussion by the team. We don't yet know what is the best solution. However, at this time we think that evaluating at the child instance is likely better, as it provides access to many more variables.

If you're open to continue working on this, please have a go at it. If not, please let me know so I can have a look. It's fine by me either way, I'm already so thankful for your contribution.🎖️

If you do want to try this, please create new commits so we can always return to the current implementation. The way to approach this change is to only evaluate the completion condition expression in the beforeExecutionPathCompleted method, evaluate it at the child instance scope, and make this method return the result. Then in BpmnStateTransitionBehavior.transitionToCompleted you'll need to retrieve that result and pass it into the call to afterExecutionPathCompleted as a parameter. You can then replace the code in MultiInstanceBodyProcessor.afterExecutionPathCompleted that currently evaluates the expression a second time, with the passed in parameter. We then no longer need to create this unresolvable incident as a bonus 🎉.

If that works, then you'll also have to do this for BpmnStateTransitionBehavior.onCalledProcessCompleted.

As always, please let me know if there's anything I can help with. 🙇

@korthout
Copy link
Member

@oleschoenburg Pointed me to a paragraph in the spec that seems to indicate that the completion condition expression has access to the inputDataItem and thus evaluates at the child instance scope. 💯

The completionCondition, the condition in the ComplexBehaviorDefinition, and the
DataInputAssociation of the Event in the ComplexBehaviorDefinition can refer to the MI Activity
instance attributes and the loopDataInput, loopDataOutput, inputDataItem, and outputDataItem that
are referenced from the MultiInstanceLoopCharacteristics.

@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 16, 2021

Hi @korthout , I've fixed the completion condition expression be evaluated at the scope of the child instance problem, please check this out. :)

Copy link
Member

@korthout korthout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done! 👏

Please have a look at my comments

@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 16, 2021

Hi @korthout , I've fixed all review comments, please check this out. 🙇

Copy link
Member

@korthout korthout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @lzgabel.

There's just some final clean up left I think.

@lzgabel lzgabel requested a review from korthout December 16, 2021 14:38
@korthout
Copy link
Member

Thanks @lzgabel.

I'm going to run the CI on this code and manually test it out a bit again. This may again take some time. Sorry about that. I do think this is it though🤞

@korthout
Copy link
Member

@lzgabel Can you please rebase/squash the commits again?

@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from 7fee938 to 09e6ecd Compare December 16, 2021 14:49
@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 16, 2021

@lzgabel Can you please rebase/squash the commits again?

Done.

Copy link
Member

@korthout korthout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lzgabel Seems there's a small problem leading to failing tests. Please have a look

@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from 09e6ecd to a58d20c Compare December 16, 2021 16:38
@korthout
Copy link
Member

@lzgabel Please rebase your changes on top of develop and resolve the conflicts.

@lzgabel lzgabel force-pushed the 5439-lz-completion-condition branch from a58d20c to 00d9cb3 Compare December 16, 2021 17:25
@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 16, 2021

Hi @korthout, Sorry for the long wait. I just resolved a code conflict.🙇

@korthout
Copy link
Member

korthout commented Dec 17, 2021

I've just finished manually testing the operability with Operate on Camunda Cloud SaaS and it looks okay with some minor mistakes:

  • evaluation failures correctly lead to incidents that can be resolved
  • Operate correctly shows completed instances
  • Operate correctly shows terminated instances
  • Operate correctly shows completed multi-instance (for sequential case)
  • Operate incorrectly shows multi-instance as terminated (for parallel case, if any instances were terminated)
  • Operate's instance history is not updated correctly (both for sequential and parallel cases)

Sequential multi-instance

Collection of 5 elements, completed after the second iteration

Note that the instance history view was not updated correctly.

Screen Shot 2021-12-17 at 10 26 43

Here's the same page after refresh:
Screen Shot 2021-12-17 at 10 15 23

Parallel multi-instance

Collection of 5 elements, completed after the first and second instances were completed.

Note that the instance history view was not updated correctly.

Screen Shot 2021-12-17 at 10 09 56

Here's the same page after refresh:

Note that the instance history view is now correct, but in the diagram the multi-instance element still is marked as terminated

Screen Shot 2021-12-17 at 10 10 16

@lzgabel There's not really much for you to about this. I'll discuss with the team to decide whether we can afford these minor bugs when this feature is used in combination with Operate.

Copy link
Member

@korthout korthout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a go from the team on this feature. So: LGTM 🎖️

Thanks again for this great contribution @lzgabel. This is a really cool feature and it was a pleasure to review this. 🙇

🏆 I recommend that you have a look at our Camunda Champions program and apply for it. I'd be happy to set up a personal call for you with our Community Manager to answer any questions you may have about it. Would you be interested in that?

Lastly, let's get this merged 🎉

bors merge

@ghost
Copy link

ghost commented Dec 20, 2021

Build succeeded:

@ghost ghost merged commit 9e70bb9 into camunda:develop Dec 20, 2021
@lzgabel
Copy link
Contributor Author

lzgabel commented Dec 20, 2021

Thanks again for this great contribution @lzgabel. This is a really cool feature and it was a pleasure to review this. 🙇

Hi @korthout , I'm honored to be able to contribute to the construction of the community.💪 Thank you for patiently reviewing my code and giving me many suggestions to deal with problems better.
Thanks again. 🙇

🏆 I recommend that you have a look at our Camunda Champions program and apply for it. I'd be happy to set up a personal call for you with our Community Manager to answer any questions you may have about it. Would you be interested in that?

Yes, I'm very interested. I'll apply to join later. ❤️

@korthout
Copy link
Member

I'll DM you on Slack to follow up on this.

@lzgabel lzgabel deleted the 5439-lz-completion-condition branch July 3, 2022 07:56
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support completion condition for multi-instance
3 participants