[RollupV2] Rollups of Rollups over all ILM Tiers #82153

Closed
PascalSenn opened this issue Dec 30, 2021 · 3 comments
Labels
>enhancement :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@PascalSenn

I am following #42720, and I really like the new API and its integration into ILM.

What I have not seen (or maybe just not found) is the ability to build rollups from rollups across different tiers.

Basically the following scenario:

A high-frequency API ingests a large amount of time series data. The idea is to ingest the data and then quickly roll it up into smaller indices, keeping the raw data hot for less than an hour (<1h hot).

<index>
  |
  |
 hot (<1h)
  |\
  | \
  |  \
  |   \
  |    \
  |     \
  |      \
  |       \
  |   warm (< 1d) <rollup 5 minute> 
  |        |\
  |        | \
 delete    |  \
           |   \
           |    \
           |     \
           |      \
           |       \
           |   cold (< 6M) <rollup 12h> 
           |        |\
           |        | \
         delete     |  \
                    |   \
                    |    \
                    |     \
                    |      \
                    |       \
                    |      frozen (< 2y) <rollup 1d>
                    |        | 
                  delete     | 
                             | 
                           delete     

This would make it possible to store a lot of relevant data without the cost of excessive disk usage. Especially with tracing or network data, it's enough to have a rough idea of how the service or the network performed a few months back.
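To put rough numbers on the disk savings, here is a small Python sketch that counts the upper bound of rollup buckets a single time series would hold in each tier of the diagram. The tier table is read off the diagram; treating "6M" as 180 days and "2y" as 730 days is my own assumption, and `max_buckets` is a hypothetical helper, not part of any Elasticsearch API.

```python
from datetime import timedelta

# (tier, retention, rollup interval), read off the diagram above.
# Assumption: "6M" is approximated as 180 days and "2y" as 730 days.
TIERS = [
    ("warm", timedelta(days=1), timedelta(minutes=5)),
    ("cold", timedelta(days=180), timedelta(hours=12)),
    ("frozen", timedelta(days=730), timedelta(days=1)),
]


def max_buckets(retention: timedelta, interval: timedelta) -> int:
    """Upper bound on the rollup buckets one series holds in a tier."""
    return retention // interval


bucket_counts = {name: max_buckets(r, i) for name, r, i in TIERS}

for name, count in bucket_counts.items():
    print(f"{name}: at most {count} buckets per series")
```

Under these assumptions, even two years of history stays in the hundreds of buckets per series, regardless of how many raw points per second were ingested in the hot phase.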

To realize this, each phase would need to support the rollup API. It would also have to be possible to make rollups of rollups (I think #70534).
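As a minimal sketch of why rollups of rollups can work at all (this is not how Elasticsearch implements it): if every bucket stores mergeable statistics such as min, max, sum and count, a coarser rollup can be computed from a finer one and gives the same result as rolling up the raw data directly, provided the coarser interval is a multiple of the finer one. `Bucket`, `from_raw` and `rollup` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    start: int        # bucket start, in seconds
    minimum: float
    maximum: float
    total: float      # sum of all raw values in the bucket
    count: int        # number of raw values in the bucket


def from_raw(points):
    """Turn raw (timestamp_seconds, value) points into single-value buckets."""
    return [Bucket(ts, v, v, v, 1) for ts, v in points]


def rollup(buckets, interval_s):
    """Merge buckets into coarser buckets of interval_s seconds."""
    merged = {}
    for b in buckets:
        key = b.start - b.start % interval_s  # align to the interval start
        prev = merged.get(key)
        if prev is None:
            merged[key] = Bucket(key, b.minimum, b.maximum, b.total, b.count)
        else:
            merged[key] = Bucket(
                key,
                min(prev.minimum, b.minimum),
                max(prev.maximum, b.maximum),
                prev.total + b.total,
                prev.count + b.count,
            )
    return [merged[k] for k in sorted(merged)]


raw = [(0, 1.0), (60, 3.0), (300, 5.0), (43200, 7.0)]

five_min = rollup(from_raw(raw), 5 * 60)         # warm tier
twelve_h = rollup(five_min, 12 * 3600)           # cold tier, built from warm
direct = rollup(from_raw(raw), 12 * 3600)        # cold tier, built from raw

print(twelve_h == direct)
```

The average per bucket can still be derived as `total / count`. Statistics that cannot be merged this way, such as exact percentiles, would be lost after the first rollup.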

Is this something that you guys have on the radar?

@PascalSenn PascalSenn added >enhancement needs:triage Requires assignment of a team area label labels Dec 30, 2021
@not-napoleon not-napoleon added :StorageEngine/Rollup Turn fine-grained time-based data into coarser-grained data and removed needs:triage Requires assignment of a team area label labels Jan 6, 2022
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 6, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@csoulios
Contributor

@PascalSenn thank you for your feedback.

The use case you are describing is the main downsampling scenario we plan to support. In fact, we decided that we must optimize rollups for the downsampling use case.

Because we felt that the downsampling operation would be a better fit for our time-series database (TSDB) effort, we decided to rethink rollups as a TSDB operation. We think integrating TSDB and rollups (downsampling) would yield a far better user experience as well as more efficient code.

So we have made rollups (downsampling) an important milestone of TSDB (#74660).

@PascalSenn
Author

@csoulios Awesome, thanks for the feedback!

4 participants