With the new field matcher, queries such as __name__ matching .* are now faster, because the matcher is turned into a field search during index query execution.
A problem remains, however, for aggregate queries on very frequently appearing fields (e.g. __name__), where literally every metric (index document) has the field. Instead of returning the values by walking the FST values, we return the postings lists and walk each document to check whether that document's tag value is new or already exists in our aggregate result.
We can special-case this in index/block.go at the top of the block.Query method: for queries that consist of a single field matcher, we can walk the FST of each segment we have, add the values to the aggregate results, and return that.
This will significantly speed up single-field queries that match a very large number of documents.
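To make the difference concrete, here is a minimal sketch of the two aggregation strategies. It does not use the real m3ninx types: `segment`, `newSegment`, `aggregateViaPostings` and `aggregateViaTerms` are hypothetical names, and a plain map stands in for the FST term dictionary. The point is only that the first path does work proportional to the number of documents, while the second does work proportional to the number of distinct values.

```go
package main

import (
	"fmt"
	"sort"
)

// segment is a toy stand-in for an index segment (hypothetical, not m3ninx).
// terms maps field -> value -> postings list (document IDs); a real segment
// would keep the values in an FST rather than a Go map.
type segment struct {
	docs  []map[string]string
	terms map[string]map[string][]int
}

// newSegment indexes the given documents; the slice index is the document ID.
func newSegment(docs []map[string]string) segment {
	s := segment{docs: docs, terms: map[string]map[string][]int{}}
	for id, doc := range docs {
		for field, value := range doc {
			if s.terms[field] == nil {
				s.terms[field] = map[string][]int{}
			}
			s.terms[field][value] = append(s.terms[field][value], id)
		}
	}
	return s
}

// aggregateViaPostings mimics the slow path: it visits every document in
// every postings list and checks whether its value was already seen.
func aggregateViaPostings(segs []segment, field string) []string {
	seen := map[string]struct{}{}
	for _, s := range segs {
		for _, postings := range s.terms[field] {
			for _, id := range postings {
				seen[s.docs[id][field]] = struct{}{}
			}
		}
	}
	return sortedKeys(seen)
}

// aggregateViaTerms mimics the proposed fast path: it walks each segment's
// term dictionary once per distinct value and never touches the documents.
func aggregateViaTerms(segs []segment, field string) []string {
	seen := map[string]struct{}{}
	for _, s := range segs {
		for value := range s.terms[field] {
			seen[value] = struct{}{}
		}
	}
	return sortedKeys(seen)
}

// sortedKeys returns the set's members in sorted order.
func sortedKeys(set map[string]struct{}) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	segs := []segment{
		newSegment([]map[string]string{
			{"__name__": "cpu", "host": "a"},
			{"__name__": "cpu", "host": "b"},
		}),
		newSegment([]map[string]string{
			{"__name__": "mem", "host": "a"},
		}),
	}
	fmt.Println(aggregateViaPostings(segs, "__name__"))
	fmt.Println(aggregateViaTerms(segs, "__name__"))
}
```

Both calls return the same set of values. The shortcut is only valid when the query is a single field matcher that accepts every value, since then no document-level filtering is needed.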
I think we can do this in an easier, more generic way than special-casing the single-matcher case: we could add a "MatchQuery" type to the AggregationOptions passed to session.Aggregate(..), then edit execBlockAggregateQueryFn to take in the AggregationOptions TermFilter and reuse that existing path with additional filtering?
I’d prefer the second path too (adding the special-casing in the aggregate path), but it’s a little more involved than the TermFilter. We can further avoid iterating all the fields in the Field FST (because we have exactly one field). That will require an additional interface change in m3ninx to restrict the range when getting an iterator from Fields(), but it should make it even faster.
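As a rough illustration of the range restriction, assuming the fields are available in sorted byte order (as the keys of an FST are), a single field can be selected with the half-open range [field, field+"\x00") instead of scanning every field. `fieldsInRange` and `singleFieldRange` are hypothetical helpers written for this sketch, not part of the m3ninx interfaces.

```go
package main

import (
	"fmt"
	"sort"
)

// fieldsInRange returns the fields f with lo <= f < hi from a slice that is
// already sorted in byte order. It binary-searches for both bounds instead
// of iterating over every field.
func fieldsInRange(sorted []string, lo, hi string) []string {
	start := sort.SearchStrings(sorted, lo)
	end := sort.SearchStrings(sorted, hi)
	if end < start {
		return nil
	}
	return sorted[start:end]
}

// singleFieldRange returns bounds that select exactly one field:
// field+"\x00" is the smallest string that sorts strictly after field.
func singleFieldRange(field string) (lo, hi string) {
	return field, field + "\x00"
}

func main() {
	fields := []string{"__name__", "host", "region"} // sorted in byte order
	lo, hi := singleFieldRange("host")
	fmt.Println(fieldsInRange(fields, lo, hi))
}
```

A real implementation would pass the bounds down to the FST iterator rather than slicing an in-memory list, but the contract is the same: the caller seeks directly to the one field it needs.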