Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vector panic on reload #13814

Closed
Sh4d1 opened this issue Aug 3, 2022 · 0 comments · Fixed by #14875
Closed

Vector panic on reload #13814

Sh4d1 opened this issue Aug 3, 2022 · 0 comments · Fixed by #14875
Assignees
Labels
domain: reload Anything related to reloading Vector (updating configuration) type: bug A code related bug.

Comments

@Sh4d1
Copy link
Contributor

Sh4d1 commented Aug 3, 2022

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

Hello, when I try to reload this conf below (see the vector.address that changed), I get the following logs:

Aug 03 11:44:29 host systemd[1]: Stopped Vector.
Aug 03 11:44:29 host systemd[1]: vector.service: Failed with result 'timeout'.
Aug 03 11:44:29 host systemd[1]: vector.service: Main process exited, code=killed, status=9/KILL
Aug 03 11:44:29 host systemd[1]: vector.service: Killing process 26826 (vector) with signal SIGKILL.
Aug 03 11:44:29 host systemd[1]: vector.service: State 'stop-sigterm' timed out. Killing.
Aug 03 11:43:53 host vector[26826]: 2022-08-03T11:43:53.521100Z  WARN sink{component_kind="sink" component_id=vector_out component_type=vector component_name=vector_out}:request{request_id=0}: vector::sinks::util::retries: Request timed out. If this happens often while the events are actually reaching their destination, try decreasing `batch.max_bytes` and/or using `compression` if applicable. Alternatively `request.timeout_secs` can be increased.
Aug 03 11:42:58 host systemd[1]: Stopping Vector...
Aug 03 11:42:41 host vector[26826]: 2022-08-03T11:42:41.021153Z ERROR source{component_kind="source" component_id=journald_in component_type=journald component_name=journald_in}: vector::sources::journald: Could not send journald log. error=Sender is closed.
Aug 03 11:42:40 host vector[26826]: 2022-08-03T11:42:40.871743Z ERROR source{component_kind="source" component_id=syslog_in component_type=syslog component_name=syslog_in}: vector::topology: An error occurred that Vector couldn't handle.
Aug 03 11:42:40 host vector[26826]: 2022-08-03T11:42:40.738450Z ERROR source{component_kind="source" component_id=syslog_in component_type=syslog component_name=syslog_in}: vector::sources::syslog: Error sending line. error=Sender is closed.
Aug 03 11:42:40 host vector[26826]: 2022-08-03T11:42:40.738342Z ERROR source{component_kind="source" component_id=journald_in component_type=journald component_name=journald_in}: vector::topology: An error occurred that Vector couldn't handle.
Aug 03 11:42:40 host vector[26826]: 2022-08-03T11:42:40.738285Z ERROR source{component_kind="source" component_id=syslog_in component_type=syslog component_name=syslog_in}: vector::topology: An error occurred that Vector couldn't handle.
Aug 03 11:42:40 host vector[26826]: thread 'vector-worker' panicked at 'join error: JoinError::Panic(Id(66), ...)', src/topology/builder.rs:212:30
Aug 03 11:42:40 host vector[26826]: thread 'vector-worker' panicked at 'join error: JoinError::Panic(Id(69), ...)', src/topology/builder.rs:212:30
Aug 03 11:42:40 host vector[26826]: thread 'vector-worker' panicked at 'unit error: ()', lib/vector-core/src/fanout.rs:369:42
Aug 03 11:42:40 host vector[26826]: 2022-08-03T11:42:40.737937Z ERROR transform{component_kind="transform" component_id=journald_filter component_type=filter component_name=journald_filter}: vector::topology: An error occurred that Vector couldn't handle.
Aug 03 11:42:40 host vector[26826]: 2022-08-03T11:42:40.737900Z ERROR transform{component_kind="transform" component_id=syslog_filter component_type=filter component_name=syslog_filter}: vector::topology: An error occurred that Vector couldn't handle.
Aug 03 11:42:40 host vector[26826]: :369:42
Aug 03 11:42:40 host vector[26826]: thread 'vector-worker' panicked at 'unit error: ()', thread 'lib/vector-core/src/fanout.rsvector-worker:' panicked at '369unit error: ():', 42lib/vector-core/src/fanout.rs
Aug 03 11:42:40 host vector[26826]: 2022-08-03T11:42:40.737464Z ERROR transform{component_kind="transform" component_id=journald_parser component_type=remap component_name=journald_parser}: vector::topology: An error occurred that Vector couldn't handle.
Aug 03 11:42:40 host vector[26826]: 2022-08-03T11:42:40.737412Z ERROR transform{component_kind="transform" component_id=syslog_parser component_type=remap component_name=syslog_parser}: vector::topology: An error occurred that Vector couldn't handle.
Aug 03 11:42:40 host vector[26826]: ' panicked at 'Pausing non-existent sink is not valid: vector_out', lib/vector-core/src/fanout.rs:315:17
Aug 03 11:42:40 host vector[26826]: vector-workernote: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Aug 03 11:42:40 host vector[26826]: thread 'vector-worker' panicked at 'Pausing non-existent sink is not valid: vector_out', lib/vector-core/src/fanout.rs:315:thread '17
Aug 03 11:42:40 host vector[26826]: 2022-08-03T11:42:40.736299Z  INFO vector::topology::running: Reloading running topology with new configuration.
Aug 03 11:42:40 host systemd[1]: Reloaded Vector.
Aug 03 11:42:40 host vector[28838]:                            Validated
Aug 03 11:42:40 host vector[28838]: ------------------------------------
Aug 03 11:42:40 host vector[28838]: √ Health check "metrics_out"
Aug 03 11:42:40 host vector[28838]: √ Health check "vector_out"
Aug 03 11:42:40 host vector[28838]: √ Component configuration
Aug 03 11:42:40 host vector[28838]: √ Loaded ["/etc/vector/vector.toml"]
Aug 03 11:42:40 host systemd[1]: Reloading Vector.
Aug 03 11:41:24 host vector[26826]: 2022-08-03T11:41:24.509124Z  WARN sink{component_kind="sink" component_id=vector_out component_type=vector component_name=vector_out}:request{request_id=0}: vector::sinks::util::retries: Request timed out. If this happens often while the events are actually reaching their destination, try decreasing `batch.max_bytes` and/or using `compression` if applicable. Alternatively `request.timeout_secs` can be increased.
Aug 03 11:39:29 host vector[26826]: 2022-08-03T11:39:29.503232Z  WARN sink{component_kind="sink" component_id=vector_out component_type=vector component_name=vector_out}:request{request_id=0}: vector::sinks::util::retries: Request timed out. If this happens often while the events are actually reaching their destination, try decreasing `batch.max_bytes` and/or using `compression` if applicable. Alternatively `request.timeout_secs` can be increased.
Aug 03 11:37:55 host vector[26826]: 2022-08-03T11:37:55.496935Z  WARN sink{component_kind="sink" component_id=vector_out component_type=vector component_name=vector_out}:request{request_id=0}: vector::sinks::util::retries: Request timed out. If this happens often while the events are actually reaching their destination, try decreasing `batch.max_bytes` and/or using `compression` if applicable. Alternatively `request.timeout_secs` can be increased.

I think it's linked to the fact that thevector_out sinks is having timeouts

cc @jszwedko I think it's the same I encountered in #13731

Configuration

[api]
enabled = true
address = "127.0.0.1:8686"

[sources.syslog_in]
type = "syslog"
address = "127.0.0.1:8699"
max_length = 102_400
mode = "udp"
# Keep only haproxy logs from syslog
[transforms.syslog_filter]
type = "filter"
inputs = [ "syslog_in" ]
condition = ".appname == \"haproxy\""
# Parse haproxy logs
[transforms.syslog_parser]
inputs = [ "syslog_filter" ]
type   = "remap"
source = '''
.log.ident = del(.appname)
.log.pid = to_string(del(.procid)) ?? ""
.log.pri = del(.severity)
.log.@timestamp = del(.timestamp)
.log.message = del(.message)
.log.host = del(.host)

. = .log
'''
# Read from journald
[sources.journald_in]
type = "journald"
since_now = true
[transforms.journald_filter]
type = "filter"
inputs = [ "journald_in" ]
condition = 'true && .SYSLOG_IDENTIFIER != "..." && .SYSLOG_IDENTIFIER != "..." && .SYSLOG_IDENTIFIER != "..." && .SYSLOG_IDENTIFIER != "..." && .SYSLOG_IDENTIFIER != "..." && .SYSLOG_IDENTIFIER != "..." && .SYSLOG_IDENTIFIER != "..." && .SYSLOG_IDENTIFIER != "..." && (!contains(.SYSLOG_IDENTIFIER, "...") ?? true) && (!contains(.SYSLOG_IDENTIFIER, "...") ?? true) && (!contains(.SYSLOG_IDENTIFIER, "...") ?? true)'
# Parse journald logs
[transforms.journald_parser]
inputs = [ "journald_filter" ]
type   = "remap"
source = '''
.log.ident = del(.SYSLOG_IDENTIFIER)
.log.pid = to_string(del(._PID)) ?? ""
.log.pri = to_syslog_level(to_int(del(.PRIORITY)) ?? 0) ?? ""
.log.@timestamp = del(.timestamp)
.log.message = del(.message)
.log.host = del(.host)

. = .log
'''

# Send all logs to vector server
[sinks.vector_out]
type = "vector"
inputs = [ "journald_parser", "syslog_parser"]
address = "x.x.x.x:8700" # <- this is what I changed/reloaded
version = "2"
compression = true
# Scrap internal metrics
[sources.metrics_in]
type = "internal_metrics"
scrape_interval_secs = 20
# Expose prometheus metrics
[sinks.metrics_out]
type = "prometheus"
inputs = [ "metrics_in" ]
address = "x.x.x.x:8800"
default_namespace = "service"

Version

0.23.0

Debug Output

No response

Example Data

No response

Additional Context

No response

References

No response

@Sh4d1 Sh4d1 added the type: bug A code related bug. label Aug 3, 2022
@jszwedko jszwedko added the domain: reload Anything related to reloading Vector (updating configuration) label Aug 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: reload Anything related to reloading Vector (updating configuration) type: bug A code related bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants