ozgrakkurt/filterz
Probabilistic filter implementations. Ribbon, bloom, xor filters.
Implementations of some probabilistic filter structures. Implemented with build once, use many times
use case in mind.
All filters export a Filter
interface that can be used like this:
const Filter = filterz.ribbon.Filter(u10);
var my_filter = try Filter.init(alloc, hashes);
defer my_filter.deinit();
for (hashes) |h| {
try std.testing.expect(filter.check(h));
}
Each filter also exports a lower level API that can be used to implement more advanced use cases like:
Requires Zig 0.14.0-dev release.
Speed optimized version of a bloom filter.
As described in https://github.com/apache/parquet-format/blob/master/BloomFilter.md
As described in https://arxiv.org/abs/2201.01174 Construction is a bit janky but constructed filters reach slightly higher space efficiency. This is better in cases where construction is one time and the filter is used for a much longer time.
As described in https://arxiv.org/abs/2103.02515
The implementation corresponds to the standard ribbon filter with "smash" as described in the paper.
make benchmark
NOTE: Cost estimate stat in the benchmark output is calculated by assuming every hit generates a disk read, which is priced at 50 microseconds.
Ribbon filter seems to be the best option when on a memory budget. Bloom filter is the king when there is no memory budget.