fix: issue with timestamp comparison #248

Open: Mantas2 wants to merge 1 commit into main

@Mantas2 Mantas2 commented May 19, 2023

Problem:
The logstash-input-s3 plugin has a known issue regarding object timestamp logic, causing problems when using S3-compatible storage solutions other than AWS S3. This issue has been discussed in the following GitHub pull request and issue:
Pull Request: Fix object timestamp logic
Issue: sincedb file not created, files from bucket not deleted

Proposed Solution:
To address both problems, a fix has been proposed in the pull request. It improves the timestamp handling logic so that the logstash-input-s3 plugin works with more S3-compatible backends.

Context:
It’s important to note that the logstash-input-s3 plugin was originally designed to work only with AWS S3 and does not officially support other S3-compatible storage solutions. However, implementing the proposed fix would make the plugin suitable for a significant number of alternative S3-compatible solutions, eliminating the need for unsupported forks.

Microseconds Comparison:
The core issue lies in comparing timestamps that carry microsecond precision, which causes two main problems: the sincedb file is not created, and files from the S3 bucket are read more than once. The issue is explained well in the Railsware blog post “Time comparison in Ruby”, which discusses the pitfalls and confusion around time comparison in Ruby. (Link)
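
To illustrate the comparison pitfall described above, here is a minimal Ruby sketch. It is not taken from the plugin; the helper name `round_trip_seconds` and the sample timestamp are hypothetical. It only shows that a `Time` value with a sub-second part compares as newer than the same value after a round trip through a whole-second string, even though both print identically.

```ruby
require "time"

# Hypothetical helper (not part of the plugin): serialize a Time with
# Time#to_s, which drops the sub-second part, then parse it back.
def round_trip_seconds(time)
  Time.parse(time.to_s)
end

# Hypothetical last-modified value with a sub-second component
# (the 7th argument of Time.utc is microseconds).
last_modified = Time.utc(2023, 5, 19, 12, 0, 0, 123_456)
stored = round_trip_seconds(last_modified)

puts last_modified.to_i == stored.to_i # same whole second
puts last_modified == stored           # not equal: sub-second part differs
puts last_modified > stored            # original still looks "newer"
```

If a backend only reports whole-second timestamps, the round trip loses nothing and the two values compare equal, which would be consistent with the problem not showing up on AWS S3.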

Root Cause Uncertainty:
The root cause of the microsecond difference between the timestamps in the bucket file listing and the last sincedb write is still uncertain. The issue does not occur when using the logstash-input-s3 plugin with AWS S3, only with other S3-compatible backends.

The issue is present in Cloudflare R2 and DigitalOcean Spaces, and the fix has been tested with both:
Cloudflare R2: An S3-compatible storage solution provided by Cloudflare. (Link)
DigitalOcean Spaces: An S3-compatible object storage service offered by DigitalOcean. (Link)

@cla-checker-service

❌ Author of the following commits did not sign a Contributor Agreement:
4ad48b6

Please read and sign the above-mentioned agreement if you want to contribute to this project.

@Mantas2
Author

Mantas2 commented May 25, 2023

I have an update from the R2 Engineering team: they have confirmed that the issue with the Logstash plugin is related to the difference in timestamp granularity between R2 and S3. While S3 uses timestamps with second granularity, R2 offers millisecond granularity. This also seems to be the case for other S3-compatible services such as DigitalOcean Spaces.
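
As a rough illustration of the granularity mismatch described above, the following Ruby sketch compares two timestamps at whole-second granularity. The helper `newer_by_second?` and the sample values are hypothetical, and this is not necessarily how the commit in this pull request implements the fix.

```ruby
# Hypothetical helper (an assumption, not the plugin's API): compare two
# Time values after truncating both to whole seconds via Time#to_i.
def newer_by_second?(candidate, reference)
  candidate.to_i > reference.to_i
end

# A whole-second timestamp, as described for S3.
second_granularity = Time.utc(2023, 5, 25, 9, 30, 0)
# The same second with a 250 ms fraction, as described for R2
# (the 7th argument of Time.utc is microseconds).
millisecond_granularity = Time.utc(2023, 5, 25, 9, 30, 0, 250_000)

# Plain comparison treats the fractional value as newer.
puts millisecond_granularity > second_granularity
# Truncated comparison treats both as the same second.
puts newer_by_second?(millisecond_granularity, second_granularity)
```

One trade-off of truncating is that two distinct objects modified within the same second are no longer distinguishable by timestamp alone.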

@Romuss

Romuss commented Jun 22, 2023

@Mantas2 please sign a contributor agreement. We have the same problem with the plugin. Thanks

@Mantas2
Author

Mantas2 commented Jun 22, 2023

> @Mantas2 please sign a contributor agreement. We have the same problem with the plugin. Thanks

I have tried to, multiple times. I guess I am still waiting for someone to verify it.

@Derekt2

Derekt2 commented Mar 21, 2024

Same issue here with MinIO S3 buckets.

@fabionitto

Will this code be merged? We are having the same problem here with MinIO buckets.

fabionitto added a commit to fabionitto/logstash-input-s3_timefix that referenced this pull request Apr 11, 2024
@yongkyun
Contributor

👍
