False-Positive Probability and Compression Optimization for Tree-Structured Bloom Filters
Bloom filters are frequently used to to check the membership of an item in a set. However, Bloom filters face a dilemma: the transmission bandwidth and the accuracy cannot be optimized simultaneously. This dilemma is particularly severe for transmitting Bloom filters to remote nodes when the network bandwidth is limited. We propose a novel Bloom filter called BloomTree that consists of a tree-structured organization of smaller Bloom filters, each using a set of independent hash functions. BloomTree spreads items across levels that are compressed to reduce the transmission bandwidth need. We show how to find optimal configurations for BloomTree and investigate in detail by how much BloomTree outperforms the standard Bloom filter or the compressed Bloom filter. Finally, we use the intersection of BloomTrees to predict the set intersection, decreasing the false-positive probabilities by several orders of magnitude compared to both the compressed Bloom filter and the standard Bloom filter.