Delete log streams via API #577
Let's try to match the Prometheus API here: https://prometheus.io/docs/prometheus/latest/querying/api/#delete-series
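For reference, the Prometheus delete-series endpoint linked above looks roughly like the sketch below. The host/port, matcher, and time range are all assumptions, and the admin API is disabled unless Prometheus is started with `--web.enable-admin-api`.

```shell
# Build a delete-series request against the Prometheus admin API.
# Assumed: Prometheus at localhost:9090, started with --web.enable-admin-api.
PROM_URL="http://localhost:9090"
MATCHER='{job="mongod"}'   # hypothetical series selector

# Print the command rather than running it; remove the leading 'echo'
# to actually send the request.
echo curl -X POST "${PROM_URL}/api/v1/admin/tsdb/delete_series" \
  --data-urlencode "match[]=${MATCHER}" \
  --data-urlencode "start=2021-01-01T00:00:00Z" \
  --data-urlencode "end=2021-01-02T00:00:00Z"
```

`--data-urlencode` handles the braces and quotes in the selector, so the matcher can be passed verbatim.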
We should start with the GET side: https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers /cc @davkal
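The GET series-lookup endpoint referenced here can be sketched with curl as well; the host and the matcher are assumptions:

```shell
# List series matching a selector via the Prometheus query API.
PROM_URL="http://localhost:9090"   # assumed Prometheus address

# -G turns the --data-urlencode pairs into query-string parameters
# on a GET request. Printed here; drop 'echo' to execute.
echo curl -sG "${PROM_URL}/api/v1/series" \
  --data-urlencode 'match[]={app="mongod"}'
```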
Related to #113
This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
I've hacked together a rough prototype of the delete API after seeing cortexproject/cortex#2103 and blowing up my index with an erroneously parsed timestamp. Is anyone on the Loki team working on the delete API? I could finish up my prototype over the next 2-3 weeks if needed.
Not yet. I appreciate your efforts! We would like to get it working for Cortex and then reuse most of the code in Loki.
My issue was unrelated: I accidentally parsed the "time since boot" counter from the Linux kernel log as a label. The Cortex PR allowed me to use the new
Is there any way to manually go in and delete all instances of a label? We're running into a scenario where we started collecting the wrong logs under the wrong label name; the Promtail reconfig was nice and easy, but now we have the old labels hanging around with invalid logs. We'd like to clean up, and since the API doesn't support this like Prometheus does yet, we're hoping there's a more manual way. We're using two different setups: one with filesystem and boltdb for chunks/index, the other with S3 and Cassandra. Both have the same scenario happening.
We need this too. It would be great to have something like Prometheus has. link
Do we have any update on this? Is the delete API available in Loki? For example:
It's actively being worked on by @sandeepsukhani. Feel free to give him support, it's open source!
@cyriltovena Hi! Thanks for the info! 💪 But while it's not done, do you know of a workaround that can be used? Since we hold our logs for many days, is there currently any way I can manually delete these label values?
I have to delete all data to solve the "cardinality limit exceeded for logs ... more than limit of 100000" problem, and I have no idea which label has high cardinality. Is there a tool to see this? I just noticed the index size growing fast these days, but I can't find the cause.
We recently released a way to see that with logcli (the --analyze-labels flag), see:
It does exactly what you want ✌️
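A sketch of the logcli usage being discussed, assuming logcli is installed and `LOKI_ADDR` points at your Loki instance; the address below is an assumption, and flag behavior may differ between logcli versions:

```shell
# Summarize label cardinality instead of printing raw streams.
export LOKI_ADDR="http://localhost:3100"   # assumed Loki address

# '{}' matches all streams; --analyze-labels reports, per label, how many
# unique values it has, which exposes high-cardinality labels.
CMD="logcli series '{}' --analyze-labels"
echo "$CMD"   # printed here; run the command directly against a live Loki
```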
@cyriltovena Thanks for the info. Are you aware of a way to actually delete the labels? I couldn't find information on that in the PR or blog. Thanks!
Same problem here; we need to clear logs due to a GDPR request. It seems silly to wipe all logs when there are only a few labels we need to clear.
On the topic of log deletion, I think it would be even more useful if we could configure retention, e.g. keep everything for X days/months/years. Or even better: since logs are usually stored on S3 (or similar storage) and Loki is (mostly?) append-only, maybe we could simply configure our S3 bucket to delete all files older than X days. That would also work nicely with protecting logs, since we could use e.g. S3 Object Lock to prevent deletion of logs until X days have passed. If you do add support for deleting streams via the API, don't enable it by default. For security, it's great that log shippers (anyone sending data to Loki) can't delete it. For retention, I think most setups will want to handle it in Loki or on the underlying storage (e.g. S3 bucket deletion rules).
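The S3-side retention described above can be sketched as a bucket lifecycle rule. The bucket name, empty prefix, and 90-day window below are all assumptions:

```shell
# Write a lifecycle rule that expires all objects after 90 days.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-old-loki-data",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": { "Days": 90 }
    }
  ]
}
EOF

# Apply it with the AWS CLI (command printed, not executed here):
echo aws s3api put-bucket-lifecycle-configuration \
  --bucket my-loki-bucket --lifecycle-configuration file://lifecycle.json
```

Note that lifecycle expiration applies to whole objects (chunk files), not individual streams, so it complements rather than replaces a delete API.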
I need this too. It's crazy how easy it is to add a bad label and have it totally break your entire log database, to the point where you need to wipe out all of your log data to repair it. This should be a pretty high-priority fix, imo.
Version 2.3.0 has this feature now. https://grafana.com/docs/loki/latest/operations/storage/logs-deletion/
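For readers landing here, the released deletion API looks roughly like the sketch below. The endpoint path and parameters follow the linked docs but may vary between Loki versions; the host, selector, and timestamps are assumptions, and deletion must be enabled in Loki's config (it is served by the compactor).

```shell
# Request deletion of log entries matching a selector in a time range.
LOKI_URL="http://localhost:3100"    # assumed compactor address
QUERY='{app="mongod"}'              # hypothetical streams to delete
START="1609459200"                  # unix seconds, assumed range
END="1609545600"

# Print the command rather than running it; remove 'echo' to execute.
# Multi-tenant setups also need an X-Scope-OrgID header.
echo curl -X POST "${LOKI_URL}/loki/api/v1/delete" \
  --data-urlencode "query=${QUERY}" \
  --data-urlencode "start=${START}" \
  --data-urlencode "end=${END}"
```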
Closing this issue now that the log deletion API has been released.
The documentation for the log deletion API has been moved here:
Is your feature request related to a problem? Please describe.
When users want to tweak the Promtail scrape configuration (relabeling, pipeline, discovery) to work with their own deployments, they'll probably generate some unused or unwanted Loki streams.
It would be nice if we could search/filter through streams, order them by last chunk received, throughput, and total size, and then delete those we don't want anymore.
This could also help users create some sort of cron job to apply retention. @sandlis
Describe the solution you'd like
1 - List all streams ordered by last data seen and throughput over the last 5 minutes. We should be able to provide a label matcher
{app="mongod", cluster="us-east1"}
to narrow down the search; this query should be limited to 100 results by default.
2 - Delete a list of streams via their labelset, and maybe from a specific time.
I feel like it could be useful to have some sort of stream management UI for this in Grafana @davkal?