2. Dev Learning
Bloom Filter
From Big Data notes, Bloom Filter:
- Keep an array of bits
- With hash functions, do
- For the membership test, if
- probability that is erroneously claimed to be in is:
→ typically is a small constant which depends on the desired false error rate → is proportional to
Insertion:
- feed the element to the hash functions to get array positions
- set positions bits to 1 Testing:
- feed the element to the hash functions to get array positions
- two cases
- any of the position is at 0 = element def not in the set
- otherwise = either:
- the element is in the set
- bits have by chance been set to 1 during insertion of other elements (no way to tell the difference)
⇒ This is the simple Bloom Filter.