Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Bloom Filter: wrong number of hash functions used #9843

Closed
bennetyee opened this issue Oct 31, 2018 · 5 comments
Closed

Bloom Filter: wrong number of hash functions used #9843

bennetyee opened this issue Oct 31, 2018 · 5 comments
Labels
A3-stale 🍃 Pull request did not receive any updates in a long time. No review needed at this stage. Close it. F3-annoyance 💩 The client behaves within expectations, however this “expected behaviour” itself is at issue. F8-enhancement 🎊 An additional feature request. M4-core ⛓ Core client code / Rust.
Milestone

Comments

@bennetyee
Copy link

  • Parity Ethereum version: HEAD
  • Operating system: all
  • Installation: built from source
  • Fully synchronized: NA
  • Network: NA
  • Restarted: NA

The number of bloom filter hash functions used is actually one less than intended.

Both https://github.com/paritytech/parity-ethereum/blob/master/util/bloom/src/lib.rs#L133 and https://github.com/paritytech/parity-ethereum/blob/master/util/bloom/src/lib.rs#L145 iterate k_i starting at 0 up to (and including) self.k_num - 1, for a total of self.k_num iterations. The intent is to get self.k_num statistically independent hash values to set bits or check for bits in self.bitmap, as is normal for bloom filter implementations. However, the code for bloom_hash at https://github.com/paritytech/parity-ethereum/blob/master/util/bloom/src/lib.rs#L180 returns the same hash value for k_i equal to either 0 or 1. This means that, in effect, we are getting self.k_num - 1 independent hashes, instead of the expected self.k_num hashes. Since the value of self.k_num is computed in new at https://github.com/paritytech/parity-ethereum/blob/master/util/bloom/src/lib.rs#L88 with optimum_k_num https://github.com/paritytech/parity-ethereum/blob/master/util/bloom/src/lib.rs#L164 the value is correct, but the actual number of independent hash function used is incorrect; it is actually one fewer.

However, hash_backward_compatibility_for_new https://github.com/paritytech/parity-ethereum/blob/master/util/bloom/src/lib.rs#L246 seem to indicate that there is a strong desire to retain backward compatibility. Either a fix -- including changing the backwards compatibility tests -- or comments explaining why we want the first and second hash functions to be identical would be nice, because future readers of this code will want to know.

@jam10o-new jam10o-new added F8-enhancement 🎊 An additional feature request. F3-annoyance 💩 The client behaves within expectations, however this “expected behaviour” itself is at issue. M4-core ⛓ Core client code / Rust. labels Oct 31, 2018
@jam10o-new
Copy link
Contributor

I'm really unsure how to describe this, what (difference in) effect would this have in practice?

@jam10o-new jam10o-new added this to the 2.3 milestone Oct 31, 2018
@niklasad1
Copy link
Collaborator

/cc @debris

@tomusdrw
Copy link
Collaborator

@bennetyee some background: #6404 (comment)

I think we should definitely fix this properly, I'd be in favour of using new way of hashing for fresh syncs, but keeping the old behaviour for old DBs to avoid the need of migration.

@5chdn 5chdn modified the milestones: 2.3, 2.4 Jan 10, 2019
@5chdn 5chdn modified the milestones: 2.4, 2.5 Feb 21, 2019
@soc1c soc1c modified the milestones: 2.5, 2.6 Apr 2, 2019
@ordian ordian modified the milestones: 2.6, 2.7 Jul 12, 2019
@adria0
Copy link

adria0 commented Jul 27, 2020

Closing issue due to its stale state.

@adria0 adria0 closed this as completed Jul 27, 2020
@adria0
Copy link

adria0 commented Jul 27, 2020

Closing issue due to its stale state.

@adria0 adria0 reopened this Jul 27, 2020
@adria0 adria0 added the A3-stale 🍃 Pull request did not receive any updates in a long time. No review needed at this stage. Close it. label Jul 27, 2020
@adria0 adria0 closed this as completed Jul 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A3-stale 🍃 Pull request did not receive any updates in a long time. No review needed at this stage. Close it. F3-annoyance 💩 The client behaves within expectations, however this “expected behaviour” itself is at issue. F8-enhancement 🎊 An additional feature request. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

No branches or pull requests

8 participants