ribbonGo

A high-performance Ribbon filter implementation in Go — the space-efficient probabilistic data structure that is practically smaller than Bloom and Xor filters.

Based on the paper:

"Ribbon filter: practically smaller than Bloom and Xor" Peter C. Dillinger & Stefan Walzer, 2021 (arXiv:2103.02515)

What is a Ribbon Filter?

A Ribbon filter is a static, space-efficient probabilistic data structure for approximate set membership queries — the same problem Bloom filters solve, but using significantly less space.

Given a set of keys, a Ribbon filter can answer "is this key in the set?" with:

No false negatives — if a key is in the set, the filter always says yes.
Configurable false positive rate — non-member keys have a tunable probability of being incorrectly reported as members (FPR ≈ 2^−r for r result bits).
Near-optimal space — approaches the information-theoretic lower bound, using only ~1–5% more space than the minimum possible.

Ribbon vs Bloom vs Xor

Property	Bloom	Xor	Ribbon
Space overhead	~44% over optimal	~23% over optimal	~1–5% over optimal
Construction	Fast, incremental	Fast, static	Moderate, static
Supports deletion	No (without counting)	No	No
FPR at 1% target (bits/key)	9.6	9.8	≈ 8.0

Ribbon filters are ideal for read-heavy, write-once workloads — LSM-tree engines (RocksDB, LevelDB), static lookup tables, networking data planes, and any scenario where a set is built once and queried millions of times.

Quick Start

Installation

go get github.com/RibbonGo/ribbonGo

Requires Go 1.25+.

Usage

package main

import (
    "fmt"
    "log"

    "github.com/RibbonGo/ribbonGo"
)

func main() {
    // Create a filter with default settings (w=128, r=7, FPR ≈ 0.78%)
    f := ribbon.New()

    // Build the filter from a set of keys
    keys := []string{"apple", "banana", "cherry", "date", "elderberry"}
    if err := f.Build(keys); err != nil {
        log.Fatal(err)
    }

    // Query membership
    fmt.Println(f.Contains("banana"))    // true  (always correct for members)
    fmt.Println(f.Contains("fig"))       // false (probably — FPR ≈ 0.78%)
}

Custom Configuration

// Trade space for faster construction with a narrower ribbon width
f := ribbon.NewWithConfig(ribbon.Config{
    CoeffBits:           64,    // w=64: balanced speed/space
    ResultBits:          8,     // r=8: FPR ≈ 0.39%
    FirstCoeffAlwaysOne: true,  // deterministic pivot (faster)
    MaxSeeds:            256,   // hash seed retries before failure
})

if err := f.Build(keys); err != nil {
    // err is ribbon.ErrConstructionFailed if banding fails
    log.Fatal(err)
}

API Reference

The public API is intentionally minimal — a single type with four functions.

Types

Type	Description
`Ribbon`	The main filter type. Create → Build → Contains.
`Config`	Construction parameters (ribbon width, result bits, etc.).
`ErrConstructionFailed`	Sentinel error returned when banding fails for all seed retries.

Functions

Function	Signature	Description
`New`	`func New() *Ribbon`	Create a filter with defaults: w=128, r=7, fcao=true, maxSeeds=256
`NewWithConfig`	`func NewWithConfig(cfg Config) *Ribbon`	Create a filter with custom parameters. Panics on invalid config.
`Build`	`func (r *Ribbon) Build(keys []string) error`	Construct the filter from a set of unique keys. May be called multiple times.
`Contains`	`func (r *Ribbon) Contains(key string) bool`	Test membership. Zero allocations, safe for concurrent use after Build.

Config Fields

Field	Type	Default	Description
`CoeffBits`	`uint32`	128	Ribbon width w ∈ {32, 64, 128}. Larger → more compact, slower to build.
`ResultBits`	`uint`	7	Fingerprint bits r ∈ [1, 8]. FPR ≈ 2^−r.
`FirstCoeffAlwaysOne`	`bool`	true	Force bit 0 of coefficient rows to 1 for deterministic pivoting.
`MaxSeeds`	`uint32`	256	Maximum hash seed retries before returning `ErrConstructionFailed`.

Features

Paper-faithful implementation — every design decision traces to a specific section of Dillinger & Walzer (2021), with §N citations throughout the code
Configurable ribbon width — w ∈ {32, 64, 128} to trade construction speed for space efficiency
Configurable result bits — r bits per key for FPR ≈ 2^−r
Dynamic slot computation — overhead ratio grows logarithmically with n, ported from RocksDB's empirical tables (ribbon_config.cc)
firstCoeffAlwaysOne optimisation — deterministic pivot from paper §4, skipping leading-zero scan
SoA memory layout — Struct-of-Arrays for maximum cache utilisation (8 coefficients per cache line at w=64)
Width-specialised inner loops — pure uint64 path for w≤64, separate lo/hi ops for w=128
Software-pipelined prefetching — AddRange hides L2/L3 latency by prefetching the next key's slot
Zero allocations on hot paths — no heap escapes during construction or query
Two-phase hash pipeline — hash keys once, remix per seed attempt; uses XXH3 (64-bit)
Research-friendly — all paper parameters exposed; reference (slow*) implementations included for cross-validation

Benchmarks

Apple M3 Pro · Go 1.25 · ARM64 · r = 7 result bits · firstCoeffAlwaysOne = true

All benchmarks follow the methodology of Dillinger & Walzer (2021), testing at both n = 10⁶ and n = 10⁸ keys.

Build Performance

Construction throughput and space efficiency at scale.

n	Width	ns/key	bits/key	Overhead
10⁶	w=32	56.89	10.54	31.81%
10⁶	w=64	64.56	8.959	11.99%
10⁶	w=128	106.3	8.380	4.749%
10⁸	w=32	355.3	11.62	45.30%
10⁸	w=64	266.2	9.406	17.58%
10⁸	w=128	384.7	8.585	7.314%

At n = 10⁶, w=128 achieves 8.38 bits/key — only 4.7% above the information-theoretic minimum (7 bits for r=7). At n = 10⁸, it remains under 8.6 bits/key with just 7.3% overhead.

Query Performance

Lookup latency per key (n = 10⁶).

Width	Positive (ns/op)	Negative (ns/op)
w=32	37.25	36.85
w=64	53.70	52.49
w=128	88.66	84.73

Query time scales linearly with ribbon width, as expected — the inner loop performs a dot product over w bits. Positive and negative queries have nearly identical cost.

Space Efficiency

Bits per key at both scales, with packed (information-theoretic) comparison.

n	Width	bits/key	packed bits/key	Overhead
10⁶	w=32	10.54	9.227	31.81%
10⁶	w=64	8.959	7.839	11.99%
10⁶	w=128	8.380	7.332	4.749%
10⁸	w=32	11.62	10.17	45.30%
10⁸	w=64	9.406	8.231	17.58%
10⁸	w=128	8.585	7.512	7.314%

w=128 at n = 10⁶ uses only 8.38 bits/key — compare with Bloom filters at 9.6 bits/key for the same FPR. That's a 12.7% space saving over Bloom.

Run Benchmarks Yourself

# Full benchmark suite
go test -bench=. -benchmem -count=3 ./...

# Paper-aligned benchmarks at n=10⁶ and n=10⁸
go test -run=^$ -bench='BenchmarkRibbon' -benchtime=3s -count=1

# Specific benchmark
go test -run=^$ -bench='BenchmarkRibbonBuild/w=128/n=1000000' -benchtime=3s

Architecture

The implementation follows the paper's full algorithmic pipeline:

Key → [Hash] → [Bander] → [Solver] → [Filter/Query]
       §2         §2,§4       §2           §2

Pipeline Layers

Layer	File	Description
uint128	`uint128.go`	128-bit integer type for coefficient rows when w=128.
Hash	`hash.go`	Two-phase pipeline: hash each key once with XXH3, then cheaply remix per seed to derive `(start, coeffRow, result)` triples.
Bander	`bander.go`	On-the-fly Gaussian elimination over GF(2). Converts hashed equations into an upper-triangular banded matrix. Hottest code path during construction.
Solver	`solver.go`	Back-substitution: solves the upper-triangular system to produce the compact solution vector encoding the filter.
Filter	`filter.go`	Query evaluation: one hash, one dot product over GF(2). Also provides false-positive rate estimation.
Builder	`builder.go`	Orchestrates the full pipeline: hashing → banding → solving → filter construction. Includes RocksDB-style dynamic slot computation.
Public API	`ribbon.go`	Sole public surface: `Ribbon`, `Config`, `New()`, `NewWithConfig()`, `Build()`, `Contains()`.

Key Optimisations

SoA layout — Coefficient data is stored in parallel arrays (coeffLo []uint64, coeffHi []uint64, result []uint8) instead of an array of structs. For w≤64, coeffHi is nil — zero memory, zero operations. This doubles the number of coefficients per cache line compared to AoS layout.

Width specialisation — Add() dispatches to addW64() (pure uint64, ~10 ARM64 instructions per elimination step) or addW128() (separate lo/hi uint64 ops). The generic uint128.rsh() branch dispatch is avoided entirely.

Software-pipelined prefetching — AddRange() touches coeffLo[nextKey.start] while processing the current key, pulling the next cache line into L1. This mirrors RocksDB's BandingAddRange approach and yields 20–36% throughput improvement at scale.

Dynamic overhead ratio — The number of slots m is computed using RocksDB-style empirical tables where overhead grows logarithmically with n. This ensures reliable banding at all scales, unlike a fixed overhead formula that breaks down at large n.

Two-phase hashing — Keys are hashed once with XXH3. On seed retry, only a cheap remix is applied to derive new (start, coeff, result) triples — no re-hashing of the original key data.

How It Works

At a high level, a Ribbon filter encodes set membership as a system of linear equations over GF(2):

Hash each key to a triple: starting position s, a w-bit coefficient row c, and an r-bit result r.
Band the equations into an upper-triangular matrix via Gaussian elimination (the "banding" step).
Solve the triangular system by back-substitution, producing a compact solution vector Z.
Query a candidate key by computing its triple, then checking if c · Z[s..s+w] == r.

The "ribbon" name refers to the banded structure of the coefficient matrix — each equation touches only w consecutive columns starting at position s, forming a ribbon-like diagonal band.

Why is it smaller than Bloom?

A Bloom filter wastes space because its bit-setting positions are independent — you can't pack information as tightly. A Ribbon filter encodes key membership as equations, allowing the solver to pack nearly r bits of information per key into the solution vector. The information-theoretic minimum is r bits/key, and Ribbon achieves within 1–5% of this.

Testing

23 test functions across the public API, plus extensive internal tests — all passing.

Tests cover:

Correctness — single insertion, no false negatives, false-positive rate validation, collision chains, 128-bit boundary crossing, redundant/contradictory equations
All configurations — w ∈ {32, 64, 128} × firstCoeffAlwaysOne ∈ {true, false} × r ∈ {1, 4, 7, 8}
Scale — builds at n = 100,000 verified against expected FPR
Edge cases — empty input, single key, nil/empty filter, rebuild semantics, invalid config panics
Cross-validation — optimised Add() vs reference slowAdd(), AddRange() vs Add()-loop — all verified slot-by-slot

go test -v -count=1 ./...

Project Structure

ribbon/
├── ribbon.go              # Public API (Ribbon, Config, New, Build, Contains)
├── builder.go             # Pipeline orchestration, slot computation
├── bander.go              # Gaussian elimination (banding)
├── hash.go                # XXH3-based two-phase hash pipeline
├── solver.go              # Back-substitution solver
├── filter.go              # Query evaluation and FPR estimation
├── uint128.go             # 128-bit integer type
├── ribbon_test.go         # Public API tests (23 functions)
├── ribbon_bench_test.go   # Paper-aligned benchmarks
├── filter_test.go         # Internal filter/pipeline tests
├── filter_bench_test.go   # Internal benchmarks
├── bander_test.go         # Bander unit tests
├── bander_bench_test.go   # Bander benchmarks
├── hash_test.go           # Hash layer tests
├── hash_bench_test.go     # Hash benchmarks
├── solver_test.go         # Solver tests
├── solver_bench_test.go   # Solver benchmarks
├── uint128_test.go        # uint128 tests
├── go.mod
├── go.sum
├── LICENSE
└── README.md

Contributing

Contributions are welcome! This project follows the paper's design closely — please read the relevant paper section before modifying any algorithm.

Getting Started

git clone https://github.com/RibbonGo/ribbonGo.git
cd ribbonGo
go test ./...

Guidelines

Paper-first — every design decision should cite a specific section (§N) of Dillinger & Walzer (2021)
RocksDB cross-references — use [RocksDB: FunctionName in file.h] format for implementation parallels
All 6 configs — tests must cover w ∈ {32, 64, 128} × firstCoeffAlwaysOne ∈ {true, false}
Reference implementations — include slow* variants for cross-validation of optimised code
Zero allocations — hot paths must not escape to heap; verify with go test -bench=X -benchmem
Naming — unexported everything (standardBander, not StandardBander); constants use kCamelCase

References

Dillinger, P. C. & Walzer, S. (2021). Ribbon filter: practically smaller than Bloom and Xor. arXiv:2103.02515
RocksDB Ribbon filter: ribbon_impl.h, ribbon_alg.h, ribbon_config.cc
XXH3 hash function (Go): github.com/zeebo/xxh3

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
.github/workflows		.github/workflows
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
bander.go		bander.go
bander_bench_test.go		bander_bench_test.go
bander_test.go		bander_test.go
builder.go		builder.go
filter.go		filter.go
filter_bench_test.go		filter_bench_test.go
filter_test.go		filter_test.go
go.mod		go.mod
go.sum		go.sum
hash.go		hash.go
hash_bench_test.go		hash_bench_test.go
hash_test.go		hash_test.go
ribbon.go		ribbon.go
ribbon_bench_test.go		ribbon_bench_test.go
ribbon_test.go		ribbon_test.go
solver.go		solver.go
solver_bench_test.go		solver_bench_test.go
solver_test.go		solver_test.go
uint128.go		uint128.go
uint128_test.go		uint128_test.go

Folders and files

Latest commit

History

Repository files navigation

ribbonGo

What is a Ribbon Filter?

Ribbon vs Bloom vs Xor

Quick Start

Installation

Usage

Custom Configuration

API Reference

Types

Functions

Config Fields

Features

Benchmarks

Build Performance

Query Performance

Space Efficiency

Run Benchmarks Yourself

Architecture

Pipeline Layers

Key Optimisations

How It Works

Why is it smaller than Bloom?

Testing

Project Structure

Contributing

Getting Started

Guidelines

References

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages