FeatureSets are delivered to Ingestion Job through Kafka #792

Merged: 2 commits, Jun 17, 2020

@pyalex pyalex commented Jun 11, 2020

What this PR does / why we need it:

SpecService and IngestionJob now communicate through Kafka topics, which makes job restarts on FeatureSet changes unnecessary. A job is now restarted only when the subscription configuration of its store changes.

Communication Flow:

  1. SpecService.applyFeatureSet publishes the FeatureSetSpec to specs-topic and sets the FeatureSet status to Pending.
  2. IngestionJob reads from this topic (the full history of changes plus new updates) and builds a materialized view of the specs in memory.
  3. IngestionJob sends an acknowledgment back to SpecService via specs-ack-topic.
  4. SpecService collects acknowledgments from all related jobs (see FeatureSetJobStatus); once all running jobs have acknowledged, the FeatureSet status is changed to Ready.
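
The four steps above can be illustrated with a minimal in-memory sketch. This is not the code from this PR: the `Topic` class is a hypothetical stand-in for a Kafka topic, all class, field, and method names other than `applyFeatureSet` are assumptions, and specs are plain strings rather than FeatureSetSpec protos. It also skips matching an acknowledgment to a specific spec version, which a real implementation would likely need.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SpecFlowSketch {
    enum Status { PENDING, READY }

    /** Hypothetical stand-in for a Kafka topic: an append-only log of {key, value} pairs. */
    static final class Topic {
        final List<String[]> log = new ArrayList<>();
        void publish(String key, String value) { log.add(new String[] {key, value}); }
    }

    static final class SpecService {
        final Topic specsTopic, ackTopic;
        final Set<String> runningJobs;
        final Map<String, Status> statuses = new HashMap<>();
        final Map<String, Set<String>> acks = new HashMap<>();
        int ackOffset = 0; // position in the ack log already consumed

        SpecService(Topic specsTopic, Topic ackTopic, Set<String> runningJobs) {
            this.specsTopic = specsTopic;
            this.ackTopic = ackTopic;
            this.runningJobs = runningJobs;
        }

        // Step 1: publish the spec and mark the feature set as Pending.
        void applyFeatureSet(String name, String spec) {
            specsTopic.publish(name, spec);
            statuses.put(name, Status.PENDING);
            acks.put(name, new HashSet<>());
        }

        // Step 4: consume new acks; flip to Ready once every running job has acknowledged.
        void collectAcks() {
            for (; ackOffset < ackTopic.log.size(); ackOffset++) {
                String[] ack = ackTopic.log.get(ackOffset); // {featureSetName, jobId}
                Set<String> seen = acks.get(ack[0]);
                if (seen == null) continue; // ack for a feature set this service never applied
                seen.add(ack[1]);
                if (seen.containsAll(runningJobs)) statuses.put(ack[0], Status.READY);
            }
        }
    }

    static final class IngestionJob {
        final String id;
        final Topic specsTopic, ackTopic;
        final Map<String, String> specs = new HashMap<>(); // materialized view: name -> latest spec
        int offset = 0; // starts at 0, so a new job replays the full history

        IngestionJob(String id, Topic specsTopic, Topic ackTopic) {
            this.id = id;
            this.specsTopic = specsTopic;
            this.ackTopic = ackTopic;
        }

        // Steps 2 and 3: apply each unseen record to the view, then acknowledge it.
        void poll() {
            for (; offset < specsTopic.log.size(); offset++) {
                String[] record = specsTopic.log.get(offset); // {featureSetName, spec}
                specs.put(record[0], record[1]);
                ackTopic.publish(record[0], id);
            }
        }
    }

    public static void main(String[] args) {
        Topic specs = new Topic(), acks = new Topic();
        Set<String> jobs = new HashSet<>(Arrays.asList("job-a", "job-b"));
        SpecService service = new SpecService(specs, acks, jobs);
        IngestionJob a = new IngestionJob("job-a", specs, acks);
        IngestionJob b = new IngestionJob("job-b", specs, acks);

        service.applyFeatureSet("driver_features", "spec-v1");
        a.poll();
        service.collectAcks();
        System.out.println(service.statuses.get("driver_features")); // only one of two jobs acked
        b.poll();
        service.collectAcks();
        System.out.println(service.statuses.get("driver_features")); // both jobs acked
    }
}
```

Because each job keeps its own offset and starts from the beginning of the log, a job that starts late still rebuilds the same view without a restart being triggered by the feature set change itself.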

Which issue(s) this PR fixes:

Fixes #761

Does this PR introduce a user-facing change?:
No


pyalex changed the title from "[WIP] FeatureSets are delivered to Ingestion Job through Kafka" to "FeatureSets are delivered to Ingestion Job through Kafka" on Jun 12, 2020
pyalex added the kind/feature (New feature or request) label on Jun 12, 2020

pyalex commented Jun 12, 2020

/test test-end-to-end-batch-dataflow


  @Autowired
  public SpecService(
      FeatureSetRepository featureSetRepository,
      StoreRepository storeRepository,
      ProjectRepository projectRepository,
-     Source defaultSource) {
+     Source defaultSource,
+     KafkaTemplate<String, FeatureSetProto.FeatureSetSpec> specPublisher) {
Member

I just want to flag that even though I agree with the approach we are taking with Spring and Kafka, I do consider it technical debt that we are taking on. Ideally the life cycle of jobs and the delivery of feature set updates to those jobs would be fully encapsulated in the job management layer, especially if we ever want to separate job management from core.

Collaborator Author

I would be more than happy to move this communication responsibility to JobService, but right now JobService is dependent on SpecService, and I need publishing to Kafka to be a synchronous part of applyFeatureSet. So yeah, currently it's tech debt.
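
For readers following the thread, the "synchronous part of applyFeatureSet" constraint can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `SpecPublisher` is a hypothetical interface standing in for an asynchronous Kafka publisher, specs are plain strings, the 5-second timeout is arbitrary, and the returned status string is a placeholder.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class SyncPublishSketch {
    /** Hypothetical async publisher: the future completes once the write is confirmed. */
    interface SpecPublisher {
        CompletableFuture<Void> send(String key, String spec);
    }

    /**
     * Blocks until the publish is confirmed, so the caller only sees "PENDING"
     * when the spec has actually reached the topic; otherwise the call fails.
     */
    static String applyFeatureSet(SpecPublisher publisher, String name, String spec) {
        try {
            publisher.send(name, spec).get(5, TimeUnit.SECONDS); // assumed timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while publishing " + name, e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("failed to publish " + name, e);
        }
        return "PENDING";
    }

    public static void main(String[] args) {
        SpecPublisher alwaysOk = (key, spec) -> CompletableFuture.completedFuture(null);
        System.out.println(applyFeatureSet(alwaysOk, "driver_features", "spec-v1"));
    }
}
```

Blocking on the send result keeps the status and the topic from drifting apart: if the publish fails, the caller gets an error instead of a feature set stuck in Pending that no job will ever acknowledge.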


pyalex commented Jun 16, 2020

/test test-end-to-end-batch-dataflow

…eatureSet version

switch to spring-kafka (configs)

specService send message to kafka & expect ack & update status accordingly

jobs runner to send source & specs config (source + ack)

ingestion job to read specs from kafka and send ack

return featureSets in ingestionJob

generate uniq topic name for each test run

prevent listJobs from failing when job failed on start

pyalex commented Jun 16, 2020

/test test-end-to-end-batch-dataflow


woop commented Jun 17, 2020

/lgtm

@feast-ci-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pyalex, woop


Successfully merging this pull request may close these issues.

Job Coordination Improvement Proposal