You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
As we already have BitSet indexes it seems to be pretty straightforward to implement algorithms that check that certain value belongs to given result BitSet.
Unfortunately, it's performance would be dramatically lower on databases with a large number of documents, and even worse when the number of values in the index is equal or almost large as the number of documents, up to O(n * n / 64) operations to find all of the distinct values for this result set.
The simplest idea is to store something like a bloom filter consists of all longs in a bitset bitwise-or'ed to a single (or a few) long values. That way we may calculate the same hash on the result set and then use indexed hash to exclude values where is definitely no documents matching.
It would work smoothly on a large diverged indexes where it's needed and will cause almost no interfering on indexes with few values where in most cases we may have a document for each value and thus must check for each value by hand (which is a few)
The text was updated successfully, but these errors were encountered:
This issue is a part of #42
As we already have BitSet indexes it seems to be pretty straightforward to implement algorithms that check that certain value belongs to given result BitSet.
Unfortunately, it's performance would be dramatically lower on databases with a large number of documents, and even worse when the number of values in the index is equal or almost large as the number of documents, up to O(n * n / 64) operations to find all of the distinct values for this result set.
The simplest idea is to store something like a bloom filter consists of all longs in a bitset bitwise-or'ed to a single (or a few) long values. That way we may calculate the same hash on the result set and then use indexed hash to exclude values where is definitely no documents matching.
It would work smoothly on a large diverged indexes where it's needed and will cause almost no interfering on indexes with few values where in most cases we may have a document for each value and thus must check for each value by hand (which is a few)
The text was updated successfully, but these errors were encountered: