
Add ingest/input/bytes metric and Kafka consumer metrics. #14582

Merged · 6 commits into apache:master · Jul 20, 2023

Conversation

@gianm gianm (Contributor) commented Jul 13, 2023

New metrics:

  1. ingest/input/bytes. Equivalent to processedBytes in the task reports.

  2. kafka/consumer/bytesConsumed: Equivalent to the Kafka consumer
    metric "bytes-consumed-total". Only emitted for Kafka tasks.

  3. kafka/consumer/recordsConsumed: Equivalent to the Kafka consumer
    metric "records-consumed-total". Only emitted for Kafka tasks.

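The Kafka consumer exposes `bytes-consumed-total` and `records-consumed-total` as cumulative counters, so a monitor that reports per-period values has to remember the last total it saw and emit only the increase. A minimal, self-contained sketch of that bookkeeping (the class and method names here are hypothetical stand-ins, not the Druid API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: tracks the previously seen total per counter so each
// poll reports only the delta, which is how a monitor can turn cumulative
// Kafka counters into per-emission-period metric values.
class DeltaTracker
{
  private final Map<String, Long> previousTotals = new HashMap<>();

  /** Returns the increase since the last call for this counter name. */
  long delta(final String counterName, final long currentTotal)
  {
    final long previous = previousTotals.getOrDefault(counterName, 0L);
    previousTotals.put(counterName, currentTotal);
    return currentTotal - previous;
  }
}
```

Emitting deltas keeps the new metrics consistent with the other `ingest/*` metrics, which are per emission period rather than lifetime totals.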
@clintropolis clintropolis (Member) left a comment

🤘

private static boolean isTopicMetric(final MetricName metricName)
{
  // Certain metrics are emitted both as grand totals and broken down by topic; we want to ignore the grand total and
  // only look at the per-topic metrics. See https://kafka.apache.org/documentation/#new_consumer_fetch_monitoring.
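The method body isn't shown in this hunk. Per the linked Kafka documentation, the per-topic variants of the fetch metrics carry a `topic` entry in the metric's tags while the grand totals do not, so a plausible check looks like the following sketch (using a plain `Map` in place of Kafka's `MetricName.tags()`, as an illustration only):

```java
import java.util.Map;

class TopicMetricCheck
{
  // Sketch: Kafka's MetricName exposes tags() as a Map<String, String>; the
  // per-topic variants of fetch metrics carry a "topic" tag, while the
  // grand-total variants of the same metrics do not.
  static boolean isTopicMetric(final Map<String, String> tags)
  {
    return tags.containsKey("topic");
  }
}
```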
Member

nit: this doc link didn't take me to any particular section, is this supposed to point to https://kafka.apache.org/documentation/#consumer_fetch_monitoring

since we only talk to one topic i assume there shouldn't be any difference between the grand total and per topic metrics?

gianm (Contributor Author)

Ah, yes, I'm not sure where I got the new_ from. Fixed.

since we only talk to one topic i assume there shouldn't be any difference between the grand total and per topic metrics?

I assume so too, although I did it this way to future-proof against a possible future scenario where we support reading from multiple topics. It's also nice to have topic as an attribute for the metric.

Comment on lines 422 to 428
if (toolbox.getMonitorScheduler() != null) {
  final Monitor monitor = recordSupplier.monitor();
  if (monitor != null) {
    toolbox.getMonitorScheduler().addMonitor(monitor);
  }
}

@imply-cheddar imply-cheddar (Contributor) commented Jul 14, 2023

Instead of registering it here, this should really be the job of the newTaskRecordSupplier() call. How about we change that method to take TaskToolbox and add a default implementation of that which delegates to the no-argument call.

gianm (Contributor Author)

Sounds fine. I'm assuming the goal of the default implementation would be to generally switch to newTaskRecordSupplier(TaskToolbox), but maintain compatibility with existing third-party extensions. So to do that, I updated the patch with these changes:

  • Deprecate newTaskRecordSupplier(), but keep it around to avoid breaking extensions.
  • Add newTaskRecordSupplier(TaskToolbox) and switch callers and builtin extensions to use that.

LMK if this matches what you had in mind.
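The compatibility pattern described above (a new overload whose default implementation delegates to the deprecated no-argument method, so third-party overrides keep working) can be sketched with simplified, hypothetical types — `Toolbox` and the `String` return type stand in for `TaskToolbox` and `RecordSupplier`:

```java
// Simplified sketch of the deprecation pattern: the new overload delegates to
// the old method by default, so existing extensions that override the old
// method still take effect while callers migrate to the new signature.
class TaskSketch
{
  /** Hypothetical stand-in for TaskToolbox. */
  static class Toolbox {}

  /** @deprecated kept so existing third-party overrides keep working. */
  @Deprecated
  protected String newTaskRecordSupplier()
  {
    return "legacy-supplier";
  }

  /** New callers pass the toolbox; the default implementation delegates to the old method. */
  protected String newTaskRecordSupplier(final Toolbox toolbox)
  {
    return newTaskRecordSupplier();
  }
}
```

An extension that only overrides the deprecated method still produces its supplier when core code calls the new overload, which is the point of delegating rather than leaving the new method abstract.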

@cheddar cheddar (Contributor) left a comment

requesting change with this account to see if it shows up?

for (final Map.Entry<MetricName, ? extends Metric> entry : consumer.metrics().entrySet()) {
  final MetricName metricName = entry.getKey();

  if (METRICS.containsKey(metricName.name()) && isTopicMetric(metricName)) {
Contributor

It will be useful to have task id as a dimension as well.

Contributor

taskId should already be on the serviceEmitter

gianm (Contributor Author)

It is indeed on the emitter itself.

*/
protected RecordSupplier<PartitionIdType, SequenceOffsetType, RecordType> newTaskRecordSupplier(final TaskToolbox toolbox)
{
  return newTaskRecordSupplier();

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation

Invoking [SeekableStreamIndexTask.newTaskRecordSupplier](1) should be avoided because it has been deprecated.
|`ingest/events/unparseable`|Number of events rejected because the events are unparseable.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|0|
|`ingest/events/thrownAway`|Number of events rejected because they are either null, or filtered by the transform spec, or outside the windowPeriod.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|0|
@ektravel ektravel (Contributor) commented Jul 18, 2023

Suggested change
|`ingest/events/thrownAway`|Number of events rejected because they are either null, or filtered by the transform spec, or outside the windowPeriod.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|0|
|`ingest/events/thrownAway`|Number of events rejected because they are either null, filtered by the transform spec, or outside the window period.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|0|

gianm (Contributor Author)

Replaced with the specific names of configuration parameters.

@ektravel ektravel (Contributor) commented Jul 19, 2023

If you prefer to keep the original style, windowPeriod should be in code font. For example:

Number of events rejected because they are either null, filtered by the transform spec, or outside the `windowPeriod` parameter.

|`ingest/events/duplicate`|Number of events rejected because the events are duplicated.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|0|
|`ingest/events/processed`|Number of events successfully processed per emission period.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|Equal to the number of events per emission period.|
|`ingest/input/bytes`|Number of bytes read from input sources, after decompression but prior to parsing. This covers all data read, including data that does not end up being fully processed and ingested. For example, this includes data that ends up being rejected for being unparseable or filtered out.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|Depends on amount of data read.|
Contributor

Suggested change
|`ingest/input/bytes`|Number of bytes read from input sources, after decompression but prior to parsing. This covers all data read, including data that does not end up being fully processed and ingested. For example, this includes data that ends up being rejected for being unparseable or filtered out.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|Depends on amount of data read.|
|`ingest/input/bytes`|Number of bytes read from input sources, after decompression but prior to parsing. This covers all data read, including data that does not end up being fully processed and ingested. For example, this includes data that ends up being rejected for being unparseable or filtered out.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|Depends on the amount of data read.|

gianm (Contributor Author)

We don't seem consistent about these: other entries use both "depends on X" and "depends on the X". Anyway, I'll change it, since "depends on the X" seems slightly more common.

Contributor

Once this PR is merged, I can go through metrics.md and update the rest of the document to make it consistent.

@ektravel ektravel (Contributor) left a comment

Reviewed the documentation portion of the PR. A couple of minor nits. Otherwise, looks good.

@asdf2014 asdf2014 (Member) left a comment

Just one minor suggestion, overall LGTM 👍

…blestream/SeekableStreamIndexTask.java

Co-authored-by: Benedict Jin <[email protected]>
@asdf2014 asdf2014 merged commit bac5ef3 into apache:master Jul 20, 2023
sergioferragut pushed a commit to sergioferragut/druid that referenced this pull request Jul 21, 2023
* Add ingest/input/bytes metric and Kafka consumer metrics.

New metrics:

1) ingest/input/bytes. Equivalent to processedBytes in the task reports.

2) kafka/consumer/bytesConsumed: Equivalent to the Kafka consumer
   metric "bytes-consumed-total". Only emitted for Kafka tasks.

3) kafka/consumer/recordsConsumed: Equivalent to the Kafka consumer
   metric "records-consumed-total". Only emitted for Kafka tasks.

* Fix anchor.

* Fix KafkaConsumerMonitor.

* Interface updates.

* Doc changes.

* Update indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTask.java

Co-authored-by: Benedict Jin <[email protected]>

---------

Co-authored-by: Benedict Jin <[email protected]>
@LakshSingla LakshSingla added this to the 28.0 milestone Oct 12, 2023
@kfaraz kfaraz mentioned this pull request Jan 15, 2025