Make metricbeat sql module more versatile #22779
Pinging @elastic/integrations-services (Team:Services)
This usage (getting data from an SQL table in the form of logs) would be better in Filebeat than in Metricbeat, as the spirit is closer to how log collection operates (a single instance of each record) than to metrics (a collection of sensor readings at a certain point in time).
Thanks for your feedback, happy to see you like this module 🙂
Yeah, I tend to agree with this, the features requested here can be problematic in Metricbeat (especially the one about setting the Regarding the support for a unique identifier: something like this may be already possible by using the So I don't think this should be supported in the
@jsoriano I get your point wrt. the id not being unique beyond the index used, but that would be a concern for the specific implementation. For some use cases there would be no rollover, and in our case we would also push only X historical records (e.g. the past 6 hours), so that a restart of the agent would recover from this kind of downtime. But I agree with @halfa that Filebeat would be a better place to have such an SQL module. BTW, we use a different timestamp in the definition of the index pattern and that works fine as well.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi! We're labeling this issue as |
The metricbeat sql module is a lifesaver, we are very happy with this inclusion. \o/
But we think there are a few features that could make this even better.
Support a unique identifier for _id
We like to perform queries and update docs in Elastic. For this we would need to provide the _id value, so it would be nice if we could configure which field is to be used as the _id value. This would allow us to have a rolling window for updating documents (e.g. every hour, process the aggregated data for the previous X hours) and have it update those entries in Elastic.
It may be possible to relate this to the primary key of a table, but a configurable column would be more convenient IMO.
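To illustrate the request, here is a minimal sketch of what a configurable _id column would achieve. It is not how the Metricbeat sql module works today; `rows_to_bulk`, the `id_field` parameter, the index name and the column names are all hypothetical. It only relies on the Elasticsearch _bulk format, where an `index` action with an existing `_id` replaces the stored document, so re-sending the last X hours updates entries rather than duplicating them.

```python
import json


def rows_to_bulk(rows, index, id_field):
    """Build an Elasticsearch _bulk request body (NDJSON) from SQL result rows.

    `id_field` is a hypothetical setting naming the column whose value
    becomes the document _id. Re-indexing a row with the same _id
    overwrites the earlier document instead of creating a new one.
    """
    lines = []
    for row in rows:
        action = {"index": {"_index": index, "_id": str(row[id_field])}}
        lines.append(json.dumps(action))  # action/metadata line
        lines.append(json.dumps(row))     # document source line
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Example rows as an hourly aggregation query might return them (made-up data).
rows = [
    {"hour": "2020-11-26T10:00:00Z", "requests": 120},
    {"hour": "2020-11-26T11:00:00Z", "requests": 95},
]
body = rows_to_bulk(rows, index="sql-aggregates", id_field="hour")
```

As noted in the comments, an _id is only unique within one index, so this kind of update stops working across a rollover.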
Support an identifier for the @timestamp
We already use a timestamp provided in the query as the timeField in the index pattern, and this works very well already. But I think it could be useful to have the default @timestamp selected from the query as well, so this works out of the box.
For metrics at the time of the query this offers no value, but for time-based aggregations in tables, having a way to influence @timestamp without the need to have ingest pipelines would be very useful.
Storing and using the timestamp of the last successful ingest
We are processing aggregated data grouped by hour. The first time, we would like to get the historical information (all of it), but subsequent queries should only cover the most recent windows. If there were a variable holding the timestamp of the last queried/ingested time, we could use it to calculate the timeframe we need to process. Then, if we fail to run the queries in a timely fashion (e.g. the agent was down), it would pick up where it left off (just like Filebeat does when processing log files).
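A rough sketch of the behaviour described above, assuming the last successful ingest time is kept in a small JSON state file. Nothing here is an existing Beats feature: `load_cursor`, `save_cursor`, `run_once`, the state file layout and the `query`/`ship` callbacks are hypothetical names used only for illustration.

```python
import json
import os
from datetime import datetime, timezone

# On the very first run there is no cursor, so the window starts at the epoch
# and the query returns all historical data.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def load_cursor(path):
    """Return the timestamp of the last successful ingest, or EPOCH if none."""
    if not os.path.exists(path):
        return EPOCH
    with open(path) as f:
        return datetime.fromisoformat(json.load(f)["last_ingest"])


def save_cursor(path, ts):
    """Persist the cursor; write to a temp file first, then rename atomically."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"last_ingest": ts.isoformat()}, f)
    os.replace(tmp, path)


def run_once(path, query, ship, now=None):
    """Query the window [last ingest, now) and advance the cursor on success."""
    now = now or datetime.now(timezone.utc)
    since = load_cursor(path)
    rows = query(since, now)  # e.g. ... WHERE hour >= :since AND hour < :now
    ship(rows)                # if this raises, the cursor is not advanced
    save_cursor(path, now)
    return rows
```

Because the cursor only moves after `ship` returns, a run that was missed or failed is covered by the next one, which is the "pick up where it left off" behaviour requested here.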
cc @amandahla @jsoriano
Originally posted by @dagwieers in #13257 (comment)