docs: remove Kafka Streams from documentation #6596

Merged · 1 commit · Dec 1, 2022
2 changes: 1 addition & 1 deletion docs/architecture/metadata-ingestion.md
@@ -25,7 +25,7 @@ As long as you can emit a [Metadata Change Proposal (MCP)] event to Kafka or mak

### Applying Metadata Change Proposals to DataHub Metadata Service (mce-consumer-job)

- DataHub comes with a Kafka Streams based job, [mce-consumer-job], which consumes the Metadata Change Proposals and writes them into the DataHub Metadata Service (datahub-gms) using the `/ingest` endpoint.
+ DataHub comes with a Spring job, [mce-consumer-job], which consumes the Metadata Change Proposals and writes them into the DataHub Metadata Service (datahub-gms) using the `/ingest` endpoint.

[Metadata Change Proposal (MCP)]: ../what/mxe.md#metadata-change-proposal-mcp
[Metadata Change Log (MCL)]: ../what/mxe.md#metadata-change-log-mcl
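To make the documented flow concrete, here is a minimal, hypothetical sketch of a Spring Kafka listener that forwards each serialized Metadata Change Proposal to the GMS `/ingest` endpoint. The topic name, class name, and GMS address are assumptions for illustration and do not mirror the actual mce-consumer-job code, which deserializes Avro payloads and is configured externally.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class McpForwarder {

  // Hypothetical GMS address; a real deployment would wire this via configuration.
  private static final String GMS_INGEST_URL = "http://datahub-gms:8080/ingest";

  private final RestTemplate restTemplate = new RestTemplate();

  // Hypothetical topic name for the MCP stream.
  @KafkaListener(topics = "MetadataChangeProposal_v1")
  public void onMetadataChangeProposal(ConsumerRecord<String, String> record) {
    // Each record carries one serialized MCP; hand it to the Metadata Service for persistence.
    restTemplate.postForEntity(GMS_INGEST_URL, record.value(), String.class);
  }
}
```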
6 changes: 3 additions & 3 deletions docs/architecture/metadata-serving.md
@@ -25,11 +25,11 @@ Note that not all MCP-s will result in an MCL, because the DataHub serving tier

### Metadata Index Applier (mae-consumer-job)

- [Metadata Change Logs]s are consumed by another Kafka Streams job, [mae-consumer-job], which applies the changes to the [graph] and [search index] accordingly.
+ [Metadata Change Logs]s are consumed by another Spring job, [mae-consumer-job], which applies the changes to the [graph] and [search index] accordingly.
The job is entity-agnostic and will execute corresponding graph & search index builders, which will be invoked by the job when a specific metadata aspect is changed.
The builder should instruct the job how to update the graph and search index based on the metadata change.
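A rough sketch of what such a builder contract could look like is shown below; the interface and method names are hypothetical and are not the classes actually used by mae-consumer-job.

```java
// Hypothetical contract for an aspect-specific index builder invoked by the job.
public interface IndexBuilder {

  /** Returns true if this builder knows how to handle the changed aspect. */
  boolean supports(String entityType, String aspectName);

  /** Translates one metadata change into graph-edge updates. */
  void updateGraphIndex(String entityUrn, String aspectName, String aspectJson);

  /** Translates the same change into search-document updates. */
  void updateSearchIndex(String entityUrn, String aspectName, String aspectJson);
}
```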

- To ensure that metadata changes are processed in the correct chronological order, MCLs are keyed by the entity [URN] — meaning all MAEs for a particular entity will be processed sequentially by a single Kafka streams thread.
+ To ensure that metadata changes are processed in the correct chronological order, MCLs are keyed by the entity [URN] — meaning all MAEs for a particular entity will be processed sequentially by a single thread.
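As an illustration of why keying matters, the sketch below publishes change logs keyed by entity URN: Kafka routes records with the same key to the same partition and preserves order within a partition, so a single consumer thread sees all changes for that entity in sequence. The topic name, serializers, and example URN are assumptions made for this sketch, not the actual producer code in DataHub.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedMclPublisher {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      String entityUrn = "urn:li:dataset:(urn:li:dataPlatform:hive,fct_users,PROD)";
      // Same key -> same partition -> strict per-entity ordering for the consumer side.
      producer.send(new ProducerRecord<>("MetadataChangeLog_Versioned_v1", entityUrn, "{...serialized MCL...}"));
      producer.send(new ProducerRecord<>("MetadataChangeLog_Versioned_v1", entityUrn, "{...a later MCL for the same entity...}"));
    }
  }
}
```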

### Metadata Query Serving

4 changes: 2 additions & 2 deletions metadata-jobs/README.md
@@ -1,10 +1,10 @@
# MXE Processing Jobs
DataHub uses Kafka as the pub-sub message queue in the backend. There are 2 Kafka topics used by DataHub which are
`MetadataChangeEvent` and `MetadataAuditEvent`.
* `MetadataChangeEvent:` This message is emitted by any data platform or crawler in which there is a change in the metadata.
* `MetadataAuditEvent:` This message is emitted by [DataHub GMS](../gms) to notify that metadata change is registered.

- To be able to consume from these two topics, there are two [Kafka Streams](https://kafka.apache.org/documentation/streams/)
+ To be able to consume from these two topics, there are two Spring
jobs DataHub uses:
* [MCE Consumer Job](mce-consumer-job): Writes to [DataHub GMS](../gms)
* [MAE Consumer Job](mae-consumer-job): Writes to [Elasticsearch](../docker/elasticsearch) & [Neo4j](../docker/neo4j)
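For readers unfamiliar with the pub-sub setup, the following throwaway sketch subscribes a plain Kafka consumer to the two topics named above and prints what arrives. The class name and group id are arbitrary, and the real jobs are Spring applications that deserialize Avro rather than strings.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class MxeTopicTail {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "mxe-tail-example");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      // The two topics named in this README.
      consumer.subscribe(List.of("MetadataChangeEvent", "MetadataAuditEvent"));
      while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
        for (ConsumerRecord<String, String> record : records) {
          System.out.printf("%s key=%s%n", record.topic(), record.key());
        }
      }
    }
  }
}
```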
8 changes: 4 additions & 4 deletions metadata-jobs/mae-consumer-job/README.md
@@ -4,7 +4,7 @@ title: "metadata-jobs:mae-consumer-job"

# Metadata Audit Event Consumer Job

- The Metadata Audit Event Consumer is a [Kafka Streams](https://kafka.apache.org/documentation/streams/) job which can be deployed by itself, or as part of the Metadata Service.
+ The Metadata Audit Event Consumer is a Spring job which can be deployed by itself, or as part of the Metadata Service.

Its main function is to listen to change log events emitted as a result of changes made to the Metadata Graph, converting changes in the metadata model into updates
against secondary search & graph indexes (among other things)
@@ -15,10 +15,10 @@ Today the job consumes from two important Kafka topics:
2. `MetadataChangeLog_Timeseries_v1`

> Where does the name **Metadata Audit Event** come from? Well, history. Previously, this job consumed
> a single `MetadataAuditEvent` topic which has been deprecated and removed from the critical path. Hence, the name!
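A hedged sketch of the consuming side follows: one Spring Kafka listener bound to the change-log topics, handing each record to downstream index updates. The class and method names, the pairing of the timeseries topic with a versioned counterpart, and the Avro value type are assumptions for illustration, not the job's actual implementation.

```java
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class ChangeLogListener {

  @KafkaListener(topics = {"MetadataChangeLog_Versioned_v1", "MetadataChangeLog_Timeseries_v1"})
  public void onChangeLog(ConsumerRecord<String, GenericRecord> record) {
    // The record key is the entity URN; the value is the Avro-encoded change log.
    String entityUrn = record.key();
    GenericRecord changeLog = record.value();

    // In the real job, this is where graph and search index updates are applied.
    applyToGraphIndex(entityUrn, changeLog);
    applyToSearchIndex(entityUrn, changeLog);
  }

  private void applyToGraphIndex(String urn, GenericRecord changeLog) { /* omitted */ }

  private void applyToSearchIndex(String urn, GenericRecord changeLog) { /* omitted */ }
}
```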

## Pre-requisites
* You need to have [JDK8](https://www.oracle.com/java/technologies/jdk8-downloads.html)
installed on your machine to be able to build `DataHub Metadata Service`.

## Build
@@ -46,7 +46,7 @@ the application directly from command line after a successful [build](#build):
```

## Endpoints
Spring boot actuator has been enabled for MAE Application.
`healthcheck`, `metrics` and `info` web endpoints are enabled by default.

`healthcheck` - http://localhost:9091/actuator/health
6 changes: 3 additions & 3 deletions metadata-jobs/mce-consumer-job/README.md
@@ -4,10 +4,10 @@ title: "metadata-jobs:mce-consumer-job"

# Metadata Change Event Consumer Job

- The Metadata Change Event Consumer is a [Kafka Streams](https://kafka.apache.org/documentation/streams/) job which can be deployed by itself, or as part of the Metadata Service.
+ The Metadata Change Event Consumer is a Spring job which can be deployed by itself, or as part of the Metadata Service.

Its main function is to listen to change proposal events emitted by clients of DataHub which request changes to the Metadata Graph. It then applies
these requests against DataHub's storage layer: the Metadata Service.

Today the job consumes from two topics:

@@ -62,7 +62,7 @@ listen on port 5005 for a remote debugger.
```

## Endpoints
Spring boot actuator has been enabled for MCE Application.
`healthcheck`, `metrics` and `info` web endpoints are enabled by default.

`healthcheck` - http://localhost:9090/actuator/health