engine: output: Add metrics for displaying the available capacity of chunks as percent #8063

Merged


cosmo0920
Contributor

Currently, Fluent Bit's metrics do not explicitly report how much chunk capacity is still available. This should be collected and exposed in a form that users who are not Fluent Bit experts can read. Note that the available chunk capacity, expressed as a percentage, should be collected per output plugin, because Fluent Bit allows a different total chunk size limit (storage.total_limit_size) to be configured for each plugin.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
[SERVICE]
    HTTP_Server On
    HTTP_Port 2021
    Hot_Reload On
    Log_Level debug

[INPUT]
    Name dummy
    rate 1000

[OUTPUT]
    Name http
    storage.total_limit_size 1MB
  • Debug log output from testing the change
Fluent Bit v2.2.0
* Copyright (C) 2015-2023 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/10/19 18:48:15] [ info] Configuration:
[2023/10/19 18:48:15] [ info]  flush time     | 1.000000 seconds
[2023/10/19 18:48:15] [ info]  grace          | 5 seconds
[2023/10/19 18:48:15] [ info]  daemon         | 0
[2023/10/19 18:48:15] [ info] ___________
[2023/10/19 18:48:15] [ info]  inputs:
[2023/10/19 18:48:15] [ info]      dummy
[2023/10/19 18:48:15] [ info] ___________
[2023/10/19 18:48:15] [ info]  filters:
[2023/10/19 18:48:15] [ info] ___________
[2023/10/19 18:48:15] [ info]  outputs:
[2023/10/19 18:48:15] [ info]      http.0
[2023/10/19 18:48:15] [ info] ___________
[2023/10/19 18:48:15] [ info]  collectors:
[2023/10/19 18:48:15] [ info] [fluent bit] version=2.2.0, commit=92b9053a54, pid=75696
[2023/10/19 18:48:15] [debug] [engine] coroutine stack size: 36864 bytes (36.0K)
[2023/10/19 18:48:15] [ info] [storage] ver=1.2.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/10/19 18:48:15] [ info] [cmetrics] version=0.6.3
[2023/10/19 18:48:15] [ info] [ctraces ] version=0.3.1
[2023/10/19 18:48:15] [ info] [input:dummy:dummy.0] initializing
[2023/10/19 18:48:15] [ info] [input:dummy:dummy.0] storage_strategy='memory' (memory only)
[2023/10/19 18:48:15] [debug] [dummy:dummy.0] created event channels: read=21 write=22
[2023/10/19 18:48:15] [debug] [http:http.0] created event channels: read=23 write=24
[2023/10/19 18:48:15] [ info] [output:http:http.0] worker #0 started
[2023/10/19 18:48:15] [ info] [output:http:http.0] worker #1 started
[2023/10/19 18:48:15] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2021
[2023/10/19 18:48:15] [ info] [sp] stream processor started
[2023/10/19 18:48:15] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
[2023/10/19 18:48:15] [debug] [input chunk] update output instances with new chunk size diff=36, records=1, input=dummy.0
<snip>

After Fluent Bit has been running for a while, the newly added metric (fluentbit_output_chunk_available_capacity_percent, last line) should be reported as follows:

% curl localhost:2021/api/v2/metrics
2023-10-19T09:41:49.595623923Z fluentbit_uptime{hostname="Hiroshi-no-MacBook-Air-M2.local"} = 7
2023-10-19T09:41:49.595350585Z fluentbit_input_bytes_total{name="dummy.0"} = 248724
2023-10-19T09:41:49.595350585Z fluentbit_input_records_total{name="dummy.0"} = 6909
2023-10-19T09:41:42.592963316Z fluentbit_output_proc_records_total{name="http.0"} = 0
2023-10-19T09:41:42.592963316Z fluentbit_output_proc_bytes_total{name="http.0"} = 0
2023-10-19T09:41:42.592963316Z fluentbit_output_errors_total{name="http.0"} = 0
2023-10-19T09:41:48.605878156Z fluentbit_output_retries_total{name="http.0"} = 6
2023-10-19T09:41:42.592963316Z fluentbit_output_retries_failed_total{name="http.0"} = 0
2023-10-19T09:41:42.592963316Z fluentbit_output_dropped_records_total{name="http.0"} = 0
2023-10-19T09:41:48.605878156Z fluentbit_output_retried_records_total{name="http.0"} = 5914
2023-10-19T09:41:49.595623923Z fluentbit_process_start_time_seconds{hostname="Hiroshi-no-MacBook-Air-M2.local"} = 1697708502
2023-10-19T09:41:49.595623923Z fluentbit_build_info{hostname="Hiroshi-no-MacBook-Air-M2.local",version="2.2.0",os="macos"} = 1697708502
2023-10-19T09:41:49.595623923Z fluentbit_hot_reloaded_times{hostname="Hiroshi-no-MacBook-Air-M2.local"} = 0
2023-10-19T09:41:49.595680758Z fluentbit_storage_chunks = 7
2023-10-19T09:41:49.595680758Z fluentbit_storage_mem_chunks = 7
2023-10-19T09:41:49.595680758Z fluentbit_storage_fs_chunks = 0
2023-10-19T09:41:49.595680758Z fluentbit_storage_fs_chunks_up = 0
2023-10-19T09:41:49.595680758Z fluentbit_storage_fs_chunks_down = 0
2023-10-19T09:41:42.592427681Z fluentbit_input_ingestion_paused{name="dummy.0"} = 0
2023-10-19T09:41:47.596394122Z fluentbit_input_storage_overlimit{name="dummy.0"} = 0
2023-10-19T09:41:47.596394122Z fluentbit_input_storage_memory_bytes{name="dummy.0"} = 177156
2023-10-19T09:41:47.596394122Z fluentbit_input_storage_chunks{name="dummy.0"} = 5
2023-10-19T09:41:47.596394122Z fluentbit_input_storage_chunks_up{name="dummy.0"} = 5
2023-10-19T09:41:47.596394122Z fluentbit_input_storage_chunks_down{name="dummy.0"} = 0
2023-10-19T09:41:47.596394122Z fluentbit_input_storage_chunks_busy{name="dummy.0"} = 4
2023-10-19T09:41:47.596394122Z fluentbit_input_storage_chunks_busy_bytes{name="dummy.0"} = 141120
2023-10-19T09:41:49.595606881Z fluentbit_output_upstream_total_connections{name="http.0"} = 1
2023-10-19T09:41:42.592963316Z fluentbit_output_upstream_busy_connections{name="http.0"} = 0
2023-10-19T09:41:48.605878156Z fluentbit_output_chunk_available_capacity_percent{name="http.0"} = 78.673599999999993
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@cosmo0920 cosmo0920 temporarily deployed to pr October 19, 2023 09:51 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 temporarily deployed to pr October 19, 2023 10:21 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 temporarily deployed to pr October 19, 2023 10:23 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 temporarily deployed to pr October 19, 2023 10:53 — with GitHub Actions Inactive
@cosmo0920 cosmo0920 marked this pull request as ready for review October 19, 2023 11:39
src/flb_engine.c Outdated
@@ -207,6 +207,16 @@ static inline int handle_input_event(flb_pipefd_t fd, uint64_t ts,
    return 0;
}

static inline double calculate_chunk_capacity_percent(struct flb_output_instance *ins)
{
    if (ins->total_limit_size == -1) {
Member

Is there a chance that total_limit_size == 0? (If so, it will break in the next part of the code.)

Contributor Author

@cosmo0920 cosmo0920 Dec 20, 2023

total_limit_size can never be 0, because these lines set the limit to -1 in that case:
https://github.com/fluent/fluent-bit/blob/master/src/flb_output.c#L907-L915
But your suggestion is really reasonable, so I changed the condition to check for zero or lower.
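
For readers following the thread, here is a minimal sketch of the kind of guard being discussed. It is not the code merged in this PR. The struct, its field names, and the choice to report 100% when no limit is set are all assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the two values the calculation needs.
 * These names are illustrative and are not taken from
 * struct flb_output_instance.
 */
struct output_limits {
    int64_t used_bytes;        /* bytes currently held in chunks for this output */
    int64_t total_limit_size;  /* configured limit in bytes; -1 means no limit */
};

/* Return the share of the limit that is still free, as a percentage. */
double available_capacity_percent(const struct output_limits *out)
{
    double used_ratio;

    /*
     * Checking "zero or lower" covers both the -1 (no limit) case and
     * a stray 0, so the division below can never divide by zero.
     * Reporting 100.0 here is an assumption of this sketch.
     */
    if (out->total_limit_size <= 0) {
        return 100.0;
    }

    used_ratio = (double) out->used_bytes / (double) out->total_limit_size;
    return 100.0 * (1.0 - used_ratio);
}
```

As a cross-check, calling this sketch with 213,264 bytes in use against a 1,000,000-byte limit gives roughly 78.6736, which is consistent with the value shown for http.0 in the sample output. Those two byte counts are inferred from the arithmetic and are not stated anywhere in the PR.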

@edsiper edsiper merged commit a1faa4f into master Dec 20, 2023
44 checks passed
@edsiper edsiper deleted the cosmo0920-add-chunk-usage-percent-metrics-on-output-plugins branch December 20, 2023 21:22
@edsiper
Member

edsiper commented Dec 20, 2023

@cosmo0920 thank you!
