Bug: Endless recursion in getDistinctIndices with certain parameters. #34

kblauhut · 2021-06-21T13:13:11Z

Explanation
The getDistinctIndicesBis function within getDistinctIndices will fail with a Maximum call stack size exceeded error when there are too few distinct indizes being returned while n < size.

How to reproduce

const filter = new BloomFilter(39, 28);
filter.add("da5e21f8a67c4163f1a53ef43515bd027967da305ecfc741b2c3f40f832b7f82");

This code will trigger the issue.

Why does this happen?

// This returns the same hashes after n is larger than size
const hashes = hashTwice(elem, true, seed! + (size % n));

// This return from the doubleHashing function will also return the same values when called with the same hashes and different n values
return Math.abs((hashA + n * hashB) % size);

This combined with the filtering of double indizes in getDistinctIndicesBis will lead to the issue.
Its important to note that this will only happen in certain cases with high hash count and a low bit count.

The text was updated successfully, but these errors were encountered:

danielaugustin · 2021-06-22T06:53:25Z

For every n >= size the function hashTwice does produce the same hashes (hashA and hashB) for every n.

If n >= size and hashB is dividable by size the function doubleHashing returns the same index for every n. This will always produce the same parameters for the recursive calls (getDistinctIndicesBis), resulting in a stack overflow.

Example (hashA=23, hashB=21, size=7 => ind always will become 23 % 7 = 2):

(23 + 7 * 21) mod 7 = 2
(23 + 8 * 21) mod 7 = 2
(23 + 9 * 21) mod 7 = 2
(23 + 10 * 21) mod 7 = 2
...

Possible fixes:

(dirty) Within getDistinctIndicesBis put a bracket around seed! + size when calling hashTwice. This delays the point where every recursive call produces the same hashes. This fix does only delay
Within doubleHashing replace n * hashB with n + hashB. The term (hashA + n + hashB) % size will always produce different indizes.
Before calling doubleHashing check if hashB is dividable by size. In this case add a number (!= size) to hashB.

I checked fix (2) and it seems to work. Because the link in the comment (http://citeseer.ist.psu.edu/viewdoc/download;jsessionid=4060353E67A356EF9528D2C57C064F5A?doi=10.1.1.152.579&rep=rep1&type=pdf) does not work I was not able to check if doubleHashing was implemented correctly .

Hope this helps.

folkvir · 2021-06-23T09:30:53Z

Thank you for sharing this, I will investigate this and post news as soon as possible.

folkvir · 2021-06-23T12:54:19Z

Back, Fix pushed in branch fix-34 https://github.com/Callidon/bloom-filters/tree/fix-34
Is it possible to test it with a production test?

The difference is the hashTwice(elem, true, seed! + (size % n)) which is now hashTwice(elem, true, seed! + size + n) and is now a classic while loop instead of a recursive function.

You have to know that generating distinct indexes using this technique is very time consuming.
In fact if you want to generate N distinct indexes in a set of N elements, the probability to find the latest will be infinitely small in a very large set. But for a BloomFilter, we get a number of indexes equal to the number of hash functions used, so practically this is very fast.

folkvir self-assigned this Jun 23, 2021

folkvir added the bug label Jun 23, 2021

folkvir mentioned this issue Jul 7, 2021

Fix 34 #36

Merged

Callidon closed this as completed in 20ac20b Jul 7, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug: Endless recursion in getDistinctIndices with certain parameters. #34

Bug: Endless recursion in getDistinctIndices with certain parameters. #34

kblauhut commented Jun 21, 2021

danielaugustin commented Jun 22, 2021

folkvir commented Jun 23, 2021

folkvir commented Jun 23, 2021

Bug: Endless recursion in getDistinctIndices with certain parameters. #34

Bug: Endless recursion in getDistinctIndices with certain parameters. #34

Comments

kblauhut commented Jun 21, 2021

danielaugustin commented Jun 22, 2021

folkvir commented Jun 23, 2021

folkvir commented Jun 23, 2021