Filebeat stops processing log entries after "Read line error: file inactive" #1974

Closed
draenor opened this issue Jul 6, 2016 · 9 comments
@draenor

draenor commented Jul 6, 2016

Filebeat stops processing logs after a while in combination with log rolling; however, I'm not sure whether it is exclusively related to log rolling, as it seemed to happen unrelated to log rolling today (the other harvesters were still fine).

  • Version 1.2.3.
  • OS: Windows Server 2012 R2
  • Steps to reproduce
    • filebeat.exe -c filebeat.yml
    • Wait (until log rolling)

Issue description

  • The state file does not update and no logs are sent from Filebeat (i.e., it's halted indefinitely)
  • This occurred after log rolling (but I'm unsure if exclusively so)
  • Two instances of Filebeat are reading the same log file and both experience the exact same problem, i.e., both get stuck on the exact same offset.
  • When restarting one of the two Filebeat instances, the restarted instance continues to process log lines without problems

Filebeat log:

2016-07-05T15:43:41+02:00 INFO Harvester started for file: ..\logs\foo.log
2016-07-05T15:45:11+02:00 INFO Events sent: 1
2016-07-05T15:45:11+02:00 INFO Registry file updated. 2 states written.
...
2016-07-06T00:09:29+02:00 INFO Not resuming rotated file: ..\logs\foo.log
2016-07-06T00:09:29+02:00 INFO Harvester started for file: ..\logs\foo.log
2016-07-06T00:09:35+02:00 INFO Events sent: 1
2016-07-06T00:09:35+02:00 INFO Registry file updated. 5 states written.
2016-07-06T01:07:08+02:00 INFO Read line error: file inactive
2016-07-06T01:08:51+02:00 INFO Read line error: file inactive

filebeat.yml

filebeat:
  registry_file: .filebeat_registry
  prospectors:
    - paths:
        - "${LOGDIR}/foo.log"
        - "${LOGDIR}/bar.log"
      input_type: log
      document_type: mytype

output:
  logstash:
    hosts: ["host.com:5044"]

shipper:
logging:
  to_syslog: false
  to_files: true

  files:
    path: "${LOGDIR}/filebeat/"
    name: filebeat.log
    rotateeverybytes: 10485760 # = 10MB
    keepfiles: 7
  level: info

If it matters, the crawled logs are generated and rolled by the Java logback library.

@tsg
Contributor

tsg commented Jul 8, 2016

The file inactive error means that Filebeat thinks the file wasn't modified for a long time (1h by default) so it closes it. This discussion seems similar to what you are experiencing.

If it's indeed the same bug, there are two things that you could try:

  • setting a very large value for close_older
  • the 5.0.0-alpha3 release, which should fix this
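
For reference, close_older is configured per prospector; a minimal sketch based on the filebeat.yml above (the 24h value is only an example; the idea is to use something much larger than the default 1h):

filebeat:
  prospectors:
    - paths:
        - "${LOGDIR}/foo.log"
        - "${LOGDIR}/bar.log"
      input_type: log
      document_type: mytype
      close_older: 24h   # keep the file open well past the rotation interval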

@ruflin might have more insights on this.

@ruflin
Contributor

ruflin commented Jul 11, 2016

If I understand this right, Filebeat gets stuck either at the offset where the file got rotated or at the offset where close_older is reached? It seems like the internal state is not properly updated, meaning the harvester does not properly communicate that the crawling was finished.

The state part was completely rewritten in Filebeat 5.0.0-*. It would be really helpful if you could try the 5.0.0 snapshot to see if this solves the problem: https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/

@draenor
Author

draenor commented Jul 11, 2016

As we can see from the logs, there was at least one event sent after log rolling; however, I can try the latest 5.0 snapshot to see if the problem appears again within a couple of days.

@ruflin
Contributor

ruflin commented Jul 12, 2016

@draenor That would be great. Keep me posted.

@draenor
Author

draenor commented Jul 12, 2016

I've been running version 1.2.3 in parallel with the 5.0.0 snapshot. Looking at the registry file of each for the same prospector, there's this:

  • version 1.2.3: "offset":3141
  • version 5.0.0 snapshot: "offset":44309
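
For context, the 1.x registry file maps each source path to the last acknowledged state; a purely illustrative entry (field names and layout vary between versions and operating systems, the FileStateOS values here are made-up placeholders, and the 5.0 snapshot uses a different format) might look like:

{
  "..\\logs\\foo.log": {
    "source": "..\\logs\\foo.log",
    "offset": 3141,
    "FileStateOS": { "idxhi": 0, "idxlo": 0, "vol": 0 }
  }
}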

The logs from 1.2.3

...
2016-07-12T05:25:10+02:00 INFO Events sent: 2
2016-07-12T05:25:10+02:00 INFO Registry file updated. 5 states written.
2016-07-12T06:24:49+02:00 INFO Read line error: file inactive
2016-07-12T06:25:12+02:00 INFO Read line error: file inactive
2016-07-12T11:27:28+02:00 INFO Not resuming rotated file: ..\logs\foo.log
2016-07-12T11:27:28+02:00 INFO Harvester started for file: ..\logs\foo.log
2016-07-12T11:27:28+02:00 INFO Registry file updated. 5 states written.
2016-07-12T11:27:28+02:00 INFO Events sent: 1
2016-07-12T11:27:28+02:00 INFO Registry file updated. 5 states written.
2016-07-12T12:27:33+02:00 INFO Read line error: file inactive

And the last log entry from version 5.0.0 (which, judging by the timestamp, continues where 1.2.3 doesn't):

2016-07-12T13:08:46+02:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.write_bytes=1665 registrar.state_updates=4 libbeat.logstash.published_and_acked_events=1 libbeat.publisher.published_events=1 publish.events=2 libbeat.logstash.publish.read_bytes=6

Thoughts

  • Version 5 so far doesn't seem to suffer from this issue
  • Next, I'll try to set a higher value for close_older in the 1.2.3 version (24h) to see if that helps. I'll keep you posted on the results.
  • With 5 I get this: 2016-07-12T13:37:55+02:00 ERR Error: The service process could not connect to the service controller. It doesn't seem to have any impact, though.

@ruflin
Contributor

ruflin commented Jul 12, 2016

It could be that you hit the following issue which was fixed in 5.0: #1954

About the error above: I'm not 100% sure where it is coming from. Some googling revealed it could be Windows related. @andrewkroh, do you know more here?

Thanks for doing all the testing.

@draenor
Author

draenor commented Jul 15, 2016

With the setting close_older: 24h I still experience file inactive errors (even though logs are set to roll daily).

The 5.0.0 snapshot does not seem to suffer from the same issue.

@ruflin
Contributor

ruflin commented Jul 18, 2016

@draenor Thanks for testing. I suspect that the crawling hangs because of the return channel not receiving data: https://github.com/elastic/beats/blob/1.3/filebeat/crawler/prospector.go#L461 This code no longer exists in 5.0. I will try to create some test cases to reliably reproduce this locally and find a fix.
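
For illustration only, here is a minimal, hypothetical Go sketch (not the actual Filebeat code) of how a harvester goroutine can block forever on a return channel that nobody reads, which would leave its final offset uncollected and the registry stuck at the old value:

package main

import (
	"fmt"
	"time"
)

// harvest simulates a harvester that reports its final offset on a return
// channel when it stops (e.g. after "file inactive"). If nothing ever reads
// from the channel, the send blocks and the goroutine hangs.
func harvest(offset int64, ret chan int64) {
	// ... read lines until the file is considered inactive ...
	ret <- offset // blocks indefinitely if the receiver stopped listening
}

func main() {
	ret := make(chan int64) // unbuffered: every send needs a matching receive
	go harvest(3141, ret)

	// Simulate a prospector that has moved on and never receives: the
	// harvester above stays blocked on the send, so its final state is
	// never written to the registry.
	time.Sleep(100 * time.Millisecond)
	fmt.Println("prospector continues, but the harvester state was never collected")
}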

@ruflin
Contributor

ruflin commented Sep 2, 2016

@draenor I was playing around with this and tried to fix it in 1.3. Unfortunately, the fix introduced other issues, which is the reason we never merged #2274. To fix the issue completely, the state rewrite that was done for 5.0 is needed.

I'm closing this issue as I think it is fixed in 5.0, but please reopen if you discover it also in 5.0 or need more details.

Thanks a lot for investigating this in detail and providing all the data.
