
Convert indexer tables to TimescaleDB hypertables#486

Merged
aditya1702 merged 672 commits into main from timescale
Mar 6, 2026

Conversation

Contributor

@aditya1702 aditya1702 commented Feb 2, 2026

What

This PR converts the five indexer tables (transactions, transactions_accounts, operations, operations_accounts, state_changes) from regular PostgreSQL tables to TimescaleDB hypertables with columnstore compression. It also:

  • Adds runtime-configurable chunk interval, retention, and compression settings via CLI flags
  • Implements a progressive recompressor for historical backfill that compresses chunks incrementally as contiguous batches complete
  • Tunes the balance tables (trustline_balances, native_balances, sac_balances) with fillfactor=80 and aggressive autovacuum settings for HOT update optimization
  • Adds a reconcile_oldest_cursor scheduled job to keep the oldest_ingest_ledger cursor accurate after retention drops old chunks
  • Removes the unused TrackRPCServiceHealth method (was only called from integration tests, never from production entrypoints)
  • Consolidates insert paths by removing the BatchInsert methods, keeping only BatchCopy
  • Switches all environments (Docker Compose, CI, integration tests) from PostgreSQL to TimescaleDB 2.25.0

Why

The indexer tables are append-heavy, time-ordered, and grow continuously — a workload pattern that maps directly to TimescaleDB's hypertable model. Converting to hypertables with columnstore compression gives us:

  1. Significant storage reduction — columnstore compression on historical chunks dramatically reduces disk usage for immutable ledger data
  2. Automatic data lifecycle management — retention policies can drop old chunks without expensive DELETE operations, and the reconciliation job keeps cursors in sync
  3. Better query performance — chunk pruning on ledger_created_at avoids scanning irrelevant data, and sparse bloom indexes accelerate filtered lookups
  4. Incremental compression during backfill — the progressive recompressor avoids a "compress everything at the end" bottleneck by compressing chunks as safe contiguous windows advance
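The progressive-recompression idea in point 4 can be sketched as a watermark over completed backfill batches: batches finish out of order, and a chunk is only safe to compress once every batch below the watermark is done. The types below are an illustrative sketch, not the PR's actual implementation:

```go
package main

// contiguousWatermark tracks which backfill batches have completed and how
// far the contiguous completed prefix extends. Chunks covering batches below
// the watermark can be compressed immediately instead of in one large pass
// at the end of the backfill.
type contiguousWatermark struct {
	next int          // lowest batch index not yet completed
	done map[int]bool // completed batches at or above next
}

func newContiguousWatermark() *contiguousWatermark {
	return &contiguousWatermark{done: map[int]bool{}}
}

// Complete records batch i as finished and returns the new watermark: every
// batch with index below the returned value has completed, so the safe
// compression window advances to that point.
func (w *contiguousWatermark) Complete(i int) int {
	w.done[i] = true
	for w.done[w.next] {
		delete(w.done, w.next)
		w.next++
	}
	return w.next
}
```

For example, completing batches 0, 2, 1 in that order advances the watermark to 1, then holds at 1 (batch 1 is still missing), then jumps to 3.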

The balance tables remain regular PostgreSQL tables since they're UPSERT-heavy (not append-only), but get storage parameter tuning to maximize HOT (Heap-Only Tuple) updates where only non-indexed columns change.
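A minimal sketch of that tuning, with illustrative autovacuum values (the PR's exact numbers are not shown here): fillfactor=80 leaves roughly 20% free space per heap page so updates touching only non-indexed columns can land on the same page as HOT updates.

```go
package main

// tuneBalanceTableSQL renders the storage-parameter tuning described above
// for the UPSERT-heavy balance tables. The autovacuum scale factors are
// illustrative assumptions, not necessarily the PR's exact settings.
func tuneBalanceTableSQL(table string) string {
	return "ALTER TABLE " + table + " SET (" +
		"fillfactor = 80, " +
		"autovacuum_vacuum_scale_factor = 0.02, " +
		"autovacuum_analyze_scale_factor = 0.02);"
}
```

Lowering the autovacuum scale factors makes vacuum trigger on a smaller fraction of dead tuples, which keeps page free space available for future HOT updates.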

Known limitations

Closes #519

Base automatically changed from tokenid-bytea to opxdr-bytea-2 February 26, 2026 20:24
Base automatically changed from opxdr-bytea-2 to hash-bytea February 26, 2026 22:31
Base automatically changed from hash-bytea to main February 26, 2026 22:54
@aditya1702 aditya1702 self-assigned this Feb 27, 2026
Comment on lines +91 to +98
// Schedule the reconciliation job to run every 1 hour.
//
// The job checks whether retention has dropped chunks and advances the
// oldest ledger cursor if so. It is idempotent — a no-op when the cursor
// is already correct — and the query is microsecond-cheap (reads oldest
// chunk metadata + 1 row from ingest_store). Running on a fixed 1-hour
// interval keeps the cursor at most 1 hour stale after retention fires,
// with no coordination required with the retention job schedule.
Contributor

I'm concerned about the semantic shift from having the oldest ledger cursor updated atomically to having it updated out-of-band in a cron job. Specifically, how does this shift impact systems that rely on the oldest ledger cursor?

My understanding is that our backfilling job reads the oldest ledger value to determine which ranges to process. What happens when this value is stale? For example, if the job is to backfill ledgers 1000-5000 because the wallet backend operator sees the oldest ledger at 5000, but it's actually stale and the oldest ledger is really 10000, we'll create a gap in history between 5000 and 10000.

Is there a way to ensure the cursor is updated atomically with a chunk being dropped from the retention window? If not, should we have the job run more frequently?

Contributor Author
@aditya1702 aditya1702 Feb 27, 2026

Hmm, I think we can do the following together:

  1. Stop using the oldest ledger cursor when calculating backfill gaps, and instead always check the actual min ledger value stored in the table: basically the same query we use to reconcile the oldest ledger above. This is agnostic of when the retention policy runs, since we get the value directly from the DB.

  2. Run the reconciliation job that updates the oldest ledger more frequently: this value is currently shown in our Grafana dashboard to help us keep track of the current retention range, and a shorter interval would ensure it gets updated quickly.

I did explore running the retention policy and the cursor update atomically, but Timescale doesn't allow us to chain jobs.

Contributor

For option #1, which table are we going to query? If I understand correctly, the retention jobs could run at different times for each table, so for this option to be safe a job should use the oldest ledger across all of the tables it cares about.

One option that truly eliminates the potential gap would be to use a custom retention job instead of the built-in one. This has the downside of losing all of the built-in scheduling, lifecycle management, and monitoring, but it would allow us to drop chunks and advance the cursor atomically.
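The point about per-table retention timing can be captured in a small helper: since retention fires independently per table, the oldest ledger that is fully present in every table is the maximum of the per-table minimums. A hypothetical sketch:

```go
package main

// safeOldestLedger returns the oldest ledger guaranteed to be present in all
// of the given tables. Each table may have had chunks dropped at different
// times, so a backfill job must use the maximum of the per-table minimum
// ledgers; using any single table's minimum could schedule ranges that have
// already been dropped elsewhere. Names here are illustrative.
func safeOldestLedger(perTableMin map[string]uint32) uint32 {
	var oldest uint32
	for _, m := range perTableMin {
		if m > oldest {
			oldest = m
		}
	}
	return oldest
}
```

For instance, if transactions starts at ledger 1000 but operations has already retained away everything below 1200, the safe backfill floor is 1200.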

Contributor

If we were to use a table's actual oldest ledger during backfill jobs, what else is the oldest ledger value actually used for? Maybe we should just remove the oldest ledger cursor?

Contributor Author

So I already updated the code to ensure the backfill job doesn't use the cursor but instead gets the min ledger from the data: ab7985b

The only reason we still need the cursor is to know the currently stored range and to start another historical backfill. Other than that, it is not used anywhere.

@aristidesstaffieri
Copy link
Contributor

Code review

Found 1 issue:

  1. Tight busy-wait loop in WaitForRPCHealthAndRun — the default case in the select calls rpcService.GetHealth() with no sleep or backoff, creating an unbounded polling loop that hammers the RPC endpoint as fast as possible until it becomes healthy or the context times out. The old implementation used a channel-based heartbeat that naturally rate-limited polling. Adding a time.Sleep or time.Ticker in the loop would prevent thousands of unnecessary HTTP requests per second during integration test startup.

for {
	select {
	case <-ctx.Done():
		return fmt.Errorf("context canceled while waiting for RPC service to become healthy: %w", ctx.Err())
	case sig := <-signalChan:
		return fmt.Errorf("received signal %s while waiting for RPC service to become healthy", sig)
	default:
		healthRes, err := rpcService.GetHealth()
		if err == nil {
			if healthRes.Status == "healthy" {
				return nil
			}
		}
	}
}

🤖 Generated with Claude Code


@aditya1702 aditya1702 merged commit 081bcf7 into main Mar 6, 2026
9 checks passed
@aditya1702 aditya1702 deleted the timescale branch March 6, 2026 18:27


Successfully merging this pull request may close these issues.

Convert historical indexer tables to TimescaleDB hypertables and add Timescale scaffolding
