[[inputs.tail]] Collect full data for each update #15718

Closed
ysj2018 opened this issue Aug 8, 2024 · 6 comments
Labels
bug unexpected problem or unintended behavior

Comments


ysj2018 commented Aug 8, 2024

Relevant telegraf.conf

I have a log file and want to collect only the newly added content each time it is updated, so I chose inputs.tail. However, every time I update the file, the full contents are collected again. Please tell me what the reason might be.

Logs from Telegraf

I did not find any errors in the log.

System info

telegraf 1.31 Linux TVVMDC0231 5.15.0-78-generic #85-Ubuntu SMP Fri Jul 7 15:25:09 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Docker

No response

Steps to reproduce

Here is my inputs configuration:
[agent]
interval = "3s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "3s"

[[inputs.tail]]
files = ["/telegraf/telegraf/conf/testtail.log"]
data_format = "grok"
grok_patterns = ["%{TIMESTAMP_ISO8601:default_time}#-#%{DATA:instemplateName}#-#%{GREEDYDATA:insSenarioId}"]

[[outputs.mongodb]]
dsn = "mongodb://ip:port"
timeout = "30s"
authentication = "SCRAM"
username = "mongodb"
password = "mongodb"
database = "telegraftail"
granularity = "seconds"
ttl = "360h"

I run the command: telegraf --config test.conf
Then I update one line of my log file.
As a result, all the content in the log file is collected into MongoDB again.

Expected behavior

I want to collect only the newly added content on each update.

Actual behavior

It actually collected all the contents of the log file.

Additional info

No response

ysj2018 added the bug label Aug 8, 2024
Author

ysj2018 commented Aug 8, 2024

2024-08-08T11:53:06Z I! Loading config: test.conf
2024-08-08T11:53:06Z I! Starting Telegraf 1.31.2 brought to you by InfluxData the makers of InfluxDB
2024-08-08T11:53:06Z I! Available plugins: 234 inputs, 9 aggregators, 32 processors, 26 parsers, 60 outputs, 6 secret-stores
2024-08-08T11:53:06Z I! Loaded inputs: tail
2024-08-08T11:53:06Z I! Loaded aggregators:
2024-08-08T11:53:06Z I! Loaded processors:
2024-08-08T11:53:06Z I! Loaded secretstores:
2024-08-08T11:53:06Z I! Loaded outputs: mongodb
2024-08-08T11:53:06Z I! Tags enabled: host=TVVMDC0231
2024-08-08T11:53:06Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"TVVMDC0231", Flush Interval:10s

This is the log.

Contributor

powersj commented Aug 8, 2024

Hi,

Actually, it collected all the contents of the log file

The tail plugin should collect all the contents of the log file. The plugin will start at the beginning and work its way through the file until it has read everything. Then it will read all new lines as they come in. From the plugin readme:

By default, the tail plugin acts like the following unix tail command:

tail -F --lines=0 myfile.log
  • -F means that it will follow the name of the given file, so
    that it will be compatible with log-rotated files, and that it will retry on
    inaccessible files.
  • --lines=0 means that it will start at the end of the file (unless
    the from_beginning option is set).

Now, if the plugin were reading the entire file every time, that would be a different issue. That should only happen if from_beginning = true, and the option is set to false by default.
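
The default described in the readme excerpt can be tried with plain tail. This is a minimal sketch, assuming a tail that supports -F and -n 0 (the short form of --lines=0); the scratch directory and file contents are hypothetical and not taken from the report.

```shell
#!/bin/sh
# Hypothetical scratch file, used only for this illustration.
dir=$(mktemp -d)
log="$dir/myfile.log"
printf 'old line\n' > "$log"

# Follow the file by name, starting at its current end.
tail -F -n 0 "$log" > "$dir/out.txt" 2>/dev/null &
tail_pid=$!
sleep 1

# Append one new line after tail has started.
printf 'new line\n' >> "$log"
sleep 2

kill "$tail_pid"
wait "$tail_pid" 2>/dev/null || true

# Only the appended line is expected here, not "old line".
cat "$dir/out.txt"
```

The sleeps only give tail time to start and to notice the append; they are not part of the behavior being shown.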

Not sure I see a bug here as this is the expected behavior.

powersj closed this as not planned Aug 8, 2024
Author

ysj2018 commented Aug 9, 2024

I understand that your description matches the official documentation. What I am actually seeing, however, is this: without setting the from_beginning option, every time I update a single log line, the entire contents of the log file appear to be collected again instead of just the one line I updated. I believe this is not the expected behavior.

Contributor

powersj commented Aug 9, 2024

Then it would be good to show that, and how we could fully reproduce it. Please:

  1. provide some sample log messages that we can use
  2. add outputs.file to your config to print the metrics that are generated
  3. run through an example where you add 1 metric, wait for it to be printed, then add a new line to the file and see whether 1 or 2 lines are printed
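
The three steps above could be set up roughly as follows. This is a sketch under assumptions: the scratch paths and sample log lines are hypothetical, shaped to fit the grok pattern from the report, and the outputs.file section simply prints metrics to stdout in influx line format.

```shell
#!/bin/sh
# Sketch only: paths and sample log lines are hypothetical.
workdir=$(mktemp -d)
log="$workdir/testtail.log"
conf="$workdir/test.conf"

# Step 1: one sample line shaped like the grok pattern from the report.
printf '%s\n' '2024-08-08T11:53:06Z#-#templateA#-#scenario1' > "$log"

# Step 2: tail input plus outputs.file printing metrics to stdout.
cat > "$conf" <<EOF
[[inputs.tail]]
  files = ["$log"]
  from_beginning = false
  data_format = "grok"
  grok_patterns = ["%{TIMESTAMP_ISO8601:default_time}#-#%{DATA:instemplateName}#-#%{GREEDYDATA:insSenarioId}"]

[[outputs.file]]
  files = ["stdout"]
  data_format = "influx"
EOF

# Step 3 (manual, requires Telegraf to be installed):
#   terminal A:  telegraf --config "$conf"
#   terminal B:  printf '%s\n' '2024-08-08T11:53:09Z#-#templateB#-#scenario2' >> "$log"
# Then count how many metrics are printed for each appended line.
echo "config: $conf"
echo "log:    $log"
```

Appending with >> in step 3 matters for the test: it adds a line to the existing file rather than replacing the file.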

Author

ysj2018 commented Aug 13, 2024

Thank you for your response. I have found the reason for the issue. When I used vi to add logs, the inode number was changed, which might have caused the program to consider it as a new file. When I used another method to add logs without changing the inode number, the result was as expected.
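
The difference can be checked from a shell. This is a minimal sketch using a hypothetical scratch file; the write-then-rename step only imitates one way an editor save can replace a file, and is not a claim about what vi does in every configuration.

```shell
#!/bin/sh
# Hypothetical scratch file, not the log from the report.
dir=$(mktemp -d)
log="$dir/testtail.log"
printf 'line 1\n' > "$log"
inode_start=$(ls -i "$log" | awk '{print $1}')

# Appending in place keeps the same inode.
printf 'line 2\n' >> "$log"
inode_append=$(ls -i "$log" | awk '{print $1}')

# Writing a new file and renaming it over the old name yields a new inode.
cp "$log" "$log.tmp"
printf 'line 3\n' >> "$log.tmp"
mv "$log.tmp" "$log"
inode_rename=$(ls -i "$log" | awk '{print $1}')

echo "start=$inode_start append=$inode_append rename=$inode_rename"
```

In this sketch the first two inode numbers match and the third differs, which is consistent with a name-following reader treating the renamed file as a new file.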

Contributor

powersj commented Aug 13, 2024

Ah thanks for following up!
