[Promtail] Support reading from compressed log files #5956
+1, this looks like an important thing to support and would make Promtail more robust.
We are also interested in this: enabling compression in multi-AZ architectures on AWS yields significant savings on inter-AZ data transfer costs. Anecdotally, enabling compression in Pulsar messaging saved 40% on inter-AZ data transfer. The ability to read compressed files in Promtail would have a similar impact.
Hey @frittentheke and @ecliptik, I was taking a look at this the last few days and I have a few questions for you:
I'm planning on working on it this month, but I'm still designing how it is going to work, so these questions would help me decide a few things. Thank you in advance.
Very sorry for failing to respond to your questions @DylanGuedes - thanks first of all for picking up the issue!
I am unsure if I understand you correctly. Positions matter first of all to avoid reading such files again, but also because reading from a compressed file can be interrupted just as easily as reading from an uncompressed one. Expecting a potentially huge file to be read and shipped successfully in one go, with no way to resume after an interruption, seems unnecessarily fragile.
Not really, but some thoughts on possible implementations of "auto-detection":
In any case I would expect a helpful error message when a format is not supported. Even with Golang supporting almost every sensible format, I believe it's quite seldom to have
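One common approach to such "auto-detection" is mapping the file extension to a format and failing loudly on anything unrecognized, as discussed above. The sketch below is illustrative only: the extension list and the `detectFormat` helper are assumptions for this example, not Promtail's actual mapping.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// detectFormat guesses a compression format from the file extension.
// Unknown extensions return "none" so the caller can either fall back
// to plain tailing or emit a helpful "format not supported" error.
func detectFormat(path string) string {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".gz":
		return "gzip"
	case ".bz2":
		return "bzip2"
	case ".z":
		return "zlib"
	default:
		return "none"
	}
}

func main() {
	for _, p := range []string{"app.log.gz", "app.log.bz2", "app.log"} {
		fmt.Printf("%s -> %s\n", p, detectFormat(p))
	}
}
```

An alternative would be sniffing magic bytes (e.g. `0x1f 0x8b` for gzip) instead of trusting the extension, at the cost of reading the file header first.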
Thanks again, and sorry again for the late reply.
Here's the thing: I think for most scenarios, users aren't appending more compressed data to a compressed file; what they normally want instead is to ingest compressed data a single time and that's it (similar to a batch job). Do you think this claim makes sense? If so, maybe we should:
What would make this "batch" type of ingestion any different from slurping in "regular" log files? You can also decompress on the fly before piping into Promtail. Promtail does it the UNIX way: it is easy to combine with other tools via pipes, and it is more than versatile enough to be used in ad-hoc scripts doing any sort of batch import. So in short, I do not believe there is a case for any new "way" of reading data.
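The decompress-on-the-fly workflow described above can be done today with a plain pipe into Promtail's stdin mode. The URL and config file names below are placeholders; check `promtail --help` for the flags available in your version.

```shell
#!/bin/sh
# Ad-hoc batch import: decompress on the fly and pipe into Promtail.
# "http://loki:3100/..." and promtail.yaml are example values, not defaults.
zcat /var/log/app.log.gz | promtail \
  --stdin \
  --client.url="http://loki:3100/loki/api/v1/push" \
  --config.file=promtail.yaml
```

This only covers one-shot imports; it does not help with a directory that continuously receives new compressed files, which is the case native support would address.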
@DylanGuedes I honestly believe we lost track of what my actual intention was: "Giving Promtail the ability to read from files which are compressed". In essence this is only about recognizing that a file is compressed and then running the incoming data stream through a suitable library before the rest of the log parsing and shipping happens. In the case of Golang this would likely be https://pkg.go.dev/compress, which can read compressed files. So to form a list of what I believe needs to be implemented / done:
Thanks for the clarification, it makes way more sense now. Also, in my previous message I overlooked the possibility of specifying folders that receive new compressed files.
@ecliptik can you explain a little bit more about your use case?
Does this mean you have Promtail in one availability zone reading a file in another AZ, and are therefore having to pay data transfer costs for that transaction? Is that right? I guess I was just thinking that Promtail is generally in the same AZ as the files it is reading, in which case the only data transfer cost is when Promtail sends to Loki (at which point we do already compress the data before sending).
**What this PR does / why we need it**: Adds to Promtail the ability to read compressed files. It works by:
1. Inferring which compression format to use based on the file extension
2. Uncompressing the file with the native `golang/compress` packages
3. Iterating over the uncompressed lines and sending them to Loki

Its usage is the same as our current file tailing.

**Which issue(s) this PR fixes**: Fixes #5956

Co-authored-by: Danny Kopping <[email protected]>
Is your feature request related to a problem? Please describe.
Currently Promtail can only read from files which are not compressed. At the same time, applying compression is quite common when logging and holding a few days' worth of logs on a machine.
Describe the solution you'd like
Either an option within the scrape config to allow compressed files to be considered, or simply the ability for e.g. a gzipped file to be read transparently. It's the same data as in an uncompressed file; all the other checks and limits still apply.
Describe alternatives you've considered
There really is no alternative. Either reading a log stream from compressed files is natively supported by Promtail, or some manual action is required by operations to either decompress the files again or pipe them into Promtail somehow.
Additional context
rsyslog writing compressed logs - https://www.rsyslog.com/doc/v8-stable/configuration/modules/omfile.html#ziplevel
syslog-ng piping logs to gzip -
destination d_gzip { program("gzip -c >> logfile.gz"); };
Elastic Filebeat feature request about reading compressed files - Add an new input type to backfill gzipped logs elastic/beats#637
Logrotate delaycompress option - https://github.com/logrotate/logrotate/blob/master/ChangeLog.md#25---1997-09-01
Java Log4j doing compression - https://logging.apache.org/log4j/2.x/manual/appenders.html
Golang logrus - https://github.com/sirupsen/logrus/blob/b53d94c8ada4ef48988cb955795bf5c22dcf4b25/README.md#rotation
Python logrotation and compression - https://python.readthedocs.io/en/latest/howto/logging-cookbook.html#using-a-rotator-and-namer-to-customize-log-rotation-processing
...