Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CDAP-21096] Split Appfabric into stateless service and stateful processor #15773

Merged

Conversation

vsethi09
Copy link
Contributor

@vsethi09 vsethi09 commented Dec 16, 2024

Context

Make appfabric service stateless by splitting it into appfabric service to server HTTP requests and appfabric processor to run subscriber services and process message.

Change Description

  • Added new main class AppFabricProcessorServiceMain for appfabric processor.
  • Added AppFabricProcessorService which is used by AppFabricProcessorServiceMain.
  • Removed messaging subscriber services from AppFabricServer and moved them to AppFabricProcessorService.
  • HTTP handlers are hosted only in AppFabricServer and AppFabricProcessorService does not have any HTTP handlers.
  • Updated test classes to run Appfabric Processor along with Appfabric Server wherever needed.

Verification

  • Unit Tests:
    • Updated test classes to run Appfabric Processor along with Appfabric Server wherever needed.
    • Added UT for AppFabricProcessorServiceTest to test Start and Stop (similar to AppFabricServerTest).
    • Note: UT cannot be added for AppFabricProcessorServiceMain as it does not host and HTTP handlers and runs only subscriber services.
  • CDAP sandbox: Tested basic flows.
  • Deployed locally built image to a k8s cluster and verified by deployed CRD and CR. ([CDAP-21096] Split Appfabric into stateless service and stateful processor cdap-operator#123)

@vsethi09 vsethi09 added the build Triggers github actions build label Dec 16, 2024
@vsethi09 vsethi09 force-pushed the feature/CDAP-21096_split_appfabric_service_processor branch 7 times, most recently from 6874823 to 75ec19d Compare December 19, 2024 09:43
@vsethi09 vsethi09 marked this pull request as ready for review December 19, 2024 09:43
@vsethi09 vsethi09 changed the title [WIP][CDAP-21096] Split Appfabric into stateless service and stateful processor [CDAP-21096] Split Appfabric into stateless service and stateful processor Dec 19, 2024
@vsethi09 vsethi09 force-pushed the feature/CDAP-21096_split_appfabric_service_processor branch from 75ec19d to b3e3e17 Compare December 19, 2024 10:11
Comment on lines 102 to 103
CredentialProviderService credentialProviderService,
NamespaceCredentialProviderService namespaceCredentialProviderService,
Copy link
Member

@itsankit-google itsankit-google Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need CredentialProviderService in processor?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will it work in this case if the processor service does not host any handlers but GcpWorkloadIdentityHttpHandler runs in AppfabricService?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed. Removed.

@@ -0,0 +1,216 @@
/*
* Copyright © 2014-2020 Cask Data, Inc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 2024

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@vsethi09 vsethi09 force-pushed the feature/CDAP-21096_split_appfabric_service_processor branch 2 times, most recently from f88bc89 to ce8bda2 Compare December 23, 2024 11:06
@@ -0,0 +1,41 @@
/*
* Copyright © 2014-2018 Cask Data, Inc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 2025

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@@ -0,0 +1,195 @@
/*
* Copyright © 2014-2024 Cask Data, Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 2025

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@vsethi09 vsethi09 force-pushed the feature/CDAP-21096_split_appfabric_service_processor branch 6 times, most recently from 227670b to 510685e Compare January 6, 2025 17:28
@vsethi09 vsethi09 requested a review from tivv January 6, 2025 17:31
NamespaceCredentialProviderService namespaceCredentialProviderService,
ProvisioningService provisioningService,
BootstrapService bootstrapService,
SystemAppManagementService systemAppManagementService,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SystemAppManagementService#bootStrapSystemAppConfigDir is (a) slow (b) I am not sure what happens if we try to do it in parallel. Are there harm on keeping ApplicationLifecycleService/ProgramLifecycleService in the Processor as well?

private final ProgramStopSubscriberService programStopSubscriberService;
private final RunRecordCorrectorService runRecordCorrectorService;
private final RunDataTimeToLiveService runDataTimeToLiveService;
private final ProgramRunStatusMonitorService programRunStatusMonitorService;
private final RunRecordMonitorService runRecordCounterService;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This guy is not designed to run in multiple instances. It has internal tracker for "number of concurrent starts". Our counters will be wrong if we start it everywhere, e.g. removeRequest is mostly called by ProgramNotificationSubscriberServicem while new requests are added by the handler. We need to make this mechanism distributed. @masoud-io FYI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, in this PR I have moved the HTTP handlers for program, app, workflow to processors.

They can be moved back to the Appfabric server after the issue due to in-memory cache is fixed. Created Jira for follow up PR: CDAP-21112 Move HTTP handler from Appfabric processor to server after fixing ProgramRuntimeService and RunRecordMonitorService

@vsethi09 vsethi09 force-pushed the feature/CDAP-21096_split_appfabric_service_processor branch 13 times, most recently from aed827e to e3a85e1 Compare January 9, 2025 18:56
@vsethi09 vsethi09 requested a review from tivv January 9, 2025 19:26
new AuditModule(),
new AuthorizationModule(),
new AuthorizationEnforcementModule().getMasterModule(),
Modules.override(new AppFabricServiceRuntimeModule(cConf).getDistributedModules())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you using same runtime module for Service and Processor? How does it decide what to bring up where?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, same module is used.

AppFabricServer and AppFabricProcessorService decide what to bring up for their service.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it's a bit too easy to mess up. We define both things and then if anyone refers to a thing with a different name, we get all the handlers up... It's probably not the biggest deal as Processor handlers won't be able to come up without necessary services, but it also introduces a hidden assumption.
May I ask you to add some explicit thing, e.g. add a parameter to AppFabricServiceRuntimeModule constructor (PROCESSOR_HANDLERS/SERVER_HANDLERS/BOTH_HANDLERS) and bind only the handlers this module is constructed for? This would make it explicit and ensures it's hard to mess it up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this will touch multiple files. I'm addressing this comment in a separate PR: #15799

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG

@vsethi09 vsethi09 force-pushed the feature/CDAP-21096_split_appfabric_service_processor branch 2 times, most recently from 6af62c9 to eca1abb Compare January 9, 2025 20:45
Copy link
Contributor

@tivv tivv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one comment to address.

new AuditModule(),
new AuthorizationModule(),
new AuthorizationEnforcementModule().getMasterModule(),
Modules.override(new AppFabricServiceRuntimeModule(cConf).getDistributedModules())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it's a bit too easy to mess up. We define both things and then if anyone refers to a thing with a different name, we get all the handlers up... It's probably not the biggest deal as Processor handlers won't be able to come up without necessary services, but it also introduces a hidden assumption.
May I ask you to add some explicit thing, e.g. add a parameter to AppFabricServiceRuntimeModule constructor (PROCESSOR_HANDLERS/SERVER_HANDLERS/BOTH_HANDLERS) and bind only the handlers this module is constructed for? This would make it explicit and ensures it's hard to mess it up.

@vsethi09 vsethi09 force-pushed the feature/CDAP-21096_split_appfabric_service_processor branch 3 times, most recently from bae34bc to 0a2c5b8 Compare January 12, 2025 08:26
…essor

Fix RunRecordCorrectorService and AuditLogSubscriberService

Fix RunRecordCorrectorService and AuditLogSubscriberService

Add cdap system service for appfabric processor

Move AppLifecycle, ProgramLifecycle and Workflow HTTP handlers to appfabric server

Fix unit tests
@vsethi09 vsethi09 force-pushed the feature/CDAP-21096_split_appfabric_service_processor branch from 0a2c5b8 to 7a71a52 Compare January 12, 2025 13:24
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
16.0% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@vsethi09 vsethi09 merged commit 94d7671 into develop Jan 12, 2025
11 of 12 checks passed
@vsethi09 vsethi09 deleted the feature/CDAP-21096_split_appfabric_service_processor branch January 12, 2025 15:16
Copy link
Contributor

@tivv tivv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

new AuditModule(),
new AuthorizationModule(),
new AuthorizationEnforcementModule().getMasterModule(),
Modules.override(new AppFabricServiceRuntimeModule(cConf).getDistributedModules())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SG

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Triggers github actions build
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants