Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement multi tenancy in Flow Framework #980

Merged
merged 43 commits into from
Jan 25, 2025

Conversation

dbwiddis
Copy link
Member

@dbwiddis dbwiddis commented Dec 13, 2024

Description

Uses the multitenant remote metadata client developed in ML Commons feature/multi_tenancy branch to migrate Flow Framework system indices to use the same pattern.

Submitting in logical steps to ease maintainer review. Suggestions:

  1. To start, review each of the first several commits to understand how the tenant id is implemented and passed around
  2. Then visit the tenant aware integ test classes to understand what it is we're trying to allow/prohibit

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added the backport 2.x backport PRs to 2.x branch label Dec 13, 2024
@dbwiddis dbwiddis force-pushed the multi-tenancy branch 2 times, most recently from 60e881d to f611e41 Compare December 13, 2024 21:55
@dbwiddis dbwiddis force-pushed the multi-tenancy branch 10 times, most recently from 090727f to 8cd5b8f Compare December 27, 2024 21:39
Copy link

codecov bot commented Dec 27, 2024

Codecov Report

Attention: Patch coverage is 62.74256% with 288 lines in your changes missing coverage. Please review.

Project coverage is 75.51%. Comparing base (1c16b1b) to head (3550786).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...framework/indices/FlowFrameworkIndicesHandler.java 65.07% 82 Missing and 6 partials ⚠️
...ework/transport/CreateWorkflowTransportAction.java 66.05% 28 Missing and 9 partials ⚠️
.../opensearch/flowframework/util/EncryptorUtils.java 63.63% 26 Missing and 10 partials ⚠️
.../transport/ReprovisionWorkflowTransportAction.java 20.00% 14 Missing and 2 partials ⚠️
...ework/transport/DeleteWorkflowTransportAction.java 55.17% 10 Missing and 3 partials ⚠️
...flowframework/transport/handler/SearchHandler.java 45.83% 13 Missing ⚠️
.../org/opensearch/flowframework/util/ParseUtils.java 43.47% 9 Missing and 4 partials ⚠️
...h/flowframework/rest/RestCreateWorkflowAction.java 60.71% 6 Missing and 5 partials ⚠️
...rk/transport/ProvisionWorkflowTransportAction.java 47.05% 7 Missing and 2 partials ⚠️
.../opensearch/flowframework/FlowFrameworkPlugin.java 50.00% 7 Missing and 1 partial ⚠️
... and 13 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #980      +/-   ##
============================================
- Coverage     76.39%   75.51%   -0.89%     
- Complexity     1010     1050      +40     
============================================
  Files           100      101       +1     
  Lines          4881     5211     +330     
  Branches        455      498      +43     
============================================
+ Hits           3729     3935     +206     
- Misses          955     1038      +83     
- Partials        197      238      +41     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@owaiskazi19 owaiskazi19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are discrepancies at few places between INTERNAL_SERVER_ERROR and BAD_REQUEST rest status for the request calls

@dbwiddis dbwiddis force-pushed the multi-tenancy branch 2 times, most recently from b8a4ce8 to 5e63f56 Compare January 24, 2025 08:09
@dbwiddis
Copy link
Member Author

@owaiskazi19 @amitgalitz Can we try to get this merged today? I think I've addressed all your comments or at least answered them. All tests are passing.

  • Note multitenancy is not designed to be a dynamic setting or one that will be upgraded to for BWC, and even the storage option settings are set at startup. Adding a tenant_id field to every document means they are inaccessible to other tenants with the setting true, turning it off would expose all this data. And if a cluster had data on it before it was "toggled on" then none of that data would be visible to anyone.
    • When the setting is false then tenantId is set to null and the intent of most of this code is that there is no change to existing functionality in that case; while enabling per-tenant isolation in other environments; all while maintaining the same set of code.
  • SdkClient initialization in the plugin class is a one method call to another external method. It only looks complex because of a 6-element map declared inline. I can potentially move that map to the settings class but can do that in a later PR; there is nothing time critical about it.
  • The current thread pool default may be deleted in a future PR. Once I get this merged I will have time to work on doing that.
  • I've set our threadpools back to 4/8/4 which is double what we started with. The main impact here is all our uses of actionGet() which block threads and consume the pool; there is not a lot of extra CPU involved. These should probably be made settings and I may make that change shortly as in a multitenant environment they are way too small.
  • I've changed the "default tenant id" to an empty string.
  • The use of a latch for master key initialization is taken from ML Commons implementation (as are a lot of the multitenancy features which were developed on a feature branch there and thoroughly tested). If you have another way to do it, feel free to rewrite that code. I know what's there works.
  • @owaiskazi19 commented "There are discrepancies at few places between INTERNAL_SERVER_ERROR and BAD_REQUEST rest status for the request calls" but I'm not sure where these are. I've tried to change a couple, but many are pre-existing. Can we make a follow-up issue if you think there are any more?

Copy link
Member

@owaiskazi19 owaiskazi19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the comments. Not blocking the PR for the ones that can be addressed later

@dbwiddis dbwiddis merged commit d8a5a4b into opensearch-project:main Jan 25, 2025
22 of 23 checks passed
@dbwiddis dbwiddis deleted the multi-tenancy branch January 25, 2025 01:14
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/flow-framework/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/flow-framework/backport-2.x
# Create a new branch
git switch --create backport/backport-980-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d8a5a4bdbcb39bd45e7fb3b6aa9f800b59b1ddfb
# Push it to GitHub
git push --set-upstream origin backport/backport-980-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/flow-framework/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-980-to-2.x.

@opensearch-trigger-bot opensearch-trigger-bot bot added the backport-failed Applied to PRs when the automatic backport fails label Jan 25, 2025
dbwiddis added a commit to dbwiddis/flow-framework that referenced this pull request Jan 25, 2025
* Import SdkClient and inject it

Signed-off-by: Daniel Widdis <[email protected]>

* Pass sdkClient to IndicesHandler and EncryptorUtils classes

Signed-off-by: Daniel Widdis <[email protected]>

* Extract tenant id from REST header into RestAction

Signed-off-by: Daniel Widdis <[email protected]>

* Pass tenant id to transport actions in template

Signed-off-by: Daniel Widdis <[email protected]>

* Validate tenant id existence in workflow transport actions

Signed-off-by: Daniel Widdis <[email protected]>

* Pass SdkClient and tenant id to util used for access control checks

Signed-off-by: Daniel Widdis <[email protected]>

* Perform tenant id validation checks for workflow APIs

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate Update workflow get action to SdkCleint

Signed-off-by: Daniel Widdis <[email protected]>

* Pass tenantId to IndicesHandler and use in EncryptorUtils

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate EncryptorUtils getting master key from index

Signed-off-by: Daniel Widdis <[email protected]>

* Refactor fetching master key to permit reuse

Signed-off-by: Daniel Widdis <[email protected]>

* Refactor initializeMasterKey to use common code

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate indexing new key to config

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate template indexing to sdkClient

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate template deletion to sdkClient

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate get template to sdkClient

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate provision template to sdkClient

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate max workflow search to sdkClient

Signed-off-by: Daniel Widdis <[email protected]>

* Add tenantId to GetWorkflowStateRequest

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate GetWorkflowStateRequest to multitenant client

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate getProvisioningProgress to avoid repetition

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate canDeleteWorkflowStateDoc to avoid repetition

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate initial state document creation to metadata client

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate state document deletion to metadata client

Signed-off-by: Daniel Widdis <[email protected]>

* Add Tenant aware Rest Tests for Workflows

Signed-off-by: Daniel Widdis <[email protected]>

* Fix javadocs

Signed-off-by: Daniel Widdis <[email protected]>

* Add publishToMavenLocal for more CI

Signed-off-by: Daniel Widdis <[email protected]>

* Fix some CI

Signed-off-by: Daniel Widdis <[email protected]>

* Enable tenant aware search

Signed-off-by: Daniel Widdis <[email protected]>

* Refactor state index update method using multitenant client

Signed-off-by: Daniel Widdis <[email protected]>

* Get metadata client artifacts from Maven Snapshot

Signed-off-by: Daniel Widdis <[email protected]>

* Update tests for new update async code

Signed-off-by: Daniel Widdis <[email protected]>

* Switch SdkClient to use default generic thread executor

Signed-off-by: Daniel Widdis <[email protected]>

* Migrate last updates to sdkClient

Signed-off-by: Daniel Widdis <[email protected]>

* Revert (most) changes to unit tests based on async client changes

Signed-off-by: Daniel Widdis <[email protected]>

* Pass tenant id when updating state during provisioning

Signed-off-by: Daniel Widdis <[email protected]>

* Integrate tenantId with synchronous provisioning

Signed-off-by: Daniel Widdis <[email protected]>

* Fix failing integ tests after rebase, code review updates

Signed-off-by: Daniel Widdis <[email protected]>

* Replace fakeTenantId placeholders with actual tenant id

Signed-off-by: Daniel Widdis <[email protected]>

* Use version catalog for commons-lang3 and httpcore dependencies

Signed-off-by: Daniel Widdis <[email protected]>

* Exclude transitive httpclient dependency from metadata and rest client

Signed-off-by: Daniel Widdis <[email protected]>

* Fix more test errors and tweak dependencies

Signed-off-by: Daniel Widdis <[email protected]>

* More code review comments and refactoring

Signed-off-by: Daniel Widdis <[email protected]>

---------

Signed-off-by: Daniel Widdis <[email protected]>
@dbwiddis dbwiddis removed the backport-failed Applied to PRs when the automatic backport fails label Jan 25, 2025
dbwiddis added a commit that referenced this pull request Jan 25, 2025
Implement multi tenancy in Flow Framework (#980)

* Import SdkClient and inject it



* Pass sdkClient to IndicesHandler and EncryptorUtils classes



* Extract tenant id from REST header into RestAction



* Pass tenant id to transport actions in template



* Validate tenant id existence in workflow transport actions



* Pass SdkClient and tenant id to util used for access control checks



* Perform tenant id validation checks for workflow APIs



* Migrate Update workflow get action to SdkCleint



* Pass tenantId to IndicesHandler and use in EncryptorUtils



* Migrate EncryptorUtils getting master key from index



* Refactor fetching master key to permit reuse



* Refactor initializeMasterKey to use common code



* Migrate indexing new key to config



* Migrate template indexing to sdkClient



* Migrate template deletion to sdkClient



* Migrate get template to sdkClient



* Migrate provision template to sdkClient



* Migrate max workflow search to sdkClient



* Add tenantId to GetWorkflowStateRequest



* Migrate GetWorkflowStateRequest to multitenant client



* Migrate getProvisioningProgress to avoid repetition



* Migrate canDeleteWorkflowStateDoc to avoid repetition



* Migrate initial state document creation to metadata client



* Migrate state document deletion to metadata client



* Add Tenant aware Rest Tests for Workflows



* Fix javadocs



* Add publishToMavenLocal for more CI



* Fix some CI



* Enable tenant aware search



* Refactor state index update method using multitenant client



* Get metadata client artifacts from Maven Snapshot



* Update tests for new update async code



* Switch SdkClient to use default generic thread executor



* Migrate last updates to sdkClient



* Revert (most) changes to unit tests based on async client changes



* Pass tenant id when updating state during provisioning



* Integrate tenantId with synchronous provisioning



* Fix failing integ tests after rebase, code review updates



* Replace fakeTenantId placeholders with actual tenant id



* Use version catalog for commons-lang3 and httpcore dependencies



* Exclude transitive httpclient dependency from metadata and rest client



* Fix more test errors and tweak dependencies



* More code review comments and refactoring



---------

Signed-off-by: Daniel Widdis <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x backport PRs to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] RestStatus.CONFLICT when updating the state index with two steps concurrently
4 participants