
Convert indexer tables to TimescaleDB hypertables#486

Merged
aditya1702 merged 672 commits into main from timescale
Mar 6, 2026

Conversation

Contributor

@aditya1702 aditya1702 commented Feb 2, 2026

What

This PR converts the five indexer tables (transactions, transactions_accounts, operations, operations_accounts, state_changes) from regular PostgreSQL tables to TimescaleDB hypertables with columnstore compression. It also:

  • Adds runtime-configurable chunk interval, retention, and compression settings via CLI flags
  • Implements a progressive recompressor for historical backfill that compresses chunks incrementally as contiguous batches complete
  • Tunes the balance tables (trustline_balances, native_balances, sac_balances) with fillfactor=80 and aggressive autovacuum settings for HOT update optimization
  • Adds a reconcile_oldest_cursor scheduled job to keep the oldest_ingest_ledger cursor accurate after retention drops old chunks
  • Removes the unused TrackRPCServiceHealth method (was only called from integration tests, never from production entrypoints)
  • Consolidates insert paths by removing the BatchInsert methods, keeping only BatchCopy
  • Switches all environments (Docker Compose, CI, integration tests) from PostgreSQL to TimescaleDB 2.25.0

Why

The indexer tables are append-heavy, time-ordered, and grow continuously — a workload pattern that maps directly to TimescaleDB's hypertable model. Converting to hypertables with columnstore compression gives us:

  1. Significant storage reduction — columnstore compression on historical chunks dramatically reduces disk usage for immutable ledger data
  2. Automatic data lifecycle management — retention policies can drop old chunks without expensive DELETE operations, and the reconciliation job keeps cursors in sync
  3. Better query performance — chunk pruning on ledger_created_at avoids scanning irrelevant data, and sparse bloom indexes accelerate filtered lookups
  4. Incremental compression during backfill — the progressive recompressor avoids a "compress everything at the end" bottleneck by compressing chunks as safe contiguous windows advance
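The progressive-recompression idea in point 4 can be sketched as a watermark over completed backfill batches: batches finish out of order, and a chunk is only safe to compress once every batch below the watermark is done. The types below are an illustrative sketch, not the PR's actual implementation:

```go
package main

// contiguousWatermark tracks which backfill batches have completed and how
// far the contiguous completed prefix extends. Chunks covering batches below
// the watermark can be compressed immediately instead of in one large pass
// at the end of the backfill.
type contiguousWatermark struct {
	next int          // lowest batch index not yet completed
	done map[int]bool // completed batches at or above next
}

func newContiguousWatermark() *contiguousWatermark {
	return &contiguousWatermark{done: map[int]bool{}}
}

// Complete records batch i as finished and returns the new watermark: every
// batch with index below the returned value has completed, so the safe
// compression window advances to that point.
func (w *contiguousWatermark) Complete(i int) int {
	w.done[i] = true
	for w.done[w.next] {
		delete(w.done, w.next)
		w.next++
	}
	return w.next
}
```

For example, completing batches 0, 2, 1 in that order advances the watermark to 1, then holds at 1 (batch 1 is still missing), then jumps to 3.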

The balance tables remain regular PostgreSQL tables since they're UPSERT-heavy (not append-only), but get storage parameter tuning to maximize HOT (Heap-Only Tuple) updates where only non-indexed columns change.
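A minimal sketch of that tuning, with illustrative autovacuum values (the PR's exact numbers are not shown here): fillfactor=80 leaves roughly 20% free space per heap page so updates touching only non-indexed columns can land on the same page as HOT updates.

```go
package main

// tuneBalanceTableSQL renders the storage-parameter tuning described above
// for the UPSERT-heavy balance tables. The autovacuum scale factors are
// illustrative assumptions, not necessarily the PR's exact settings.
func tuneBalanceTableSQL(table string) string {
	return "ALTER TABLE " + table + " SET (" +
		"fillfactor = 80, " +
		"autovacuum_vacuum_scale_factor = 0.02, " +
		"autovacuum_analyze_scale_factor = 0.02);"
}
```

Lowering the autovacuum scale factors makes vacuum trigger on a smaller fraction of dead tuples, which keeps page free space available for future HOT updates.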

Known limitations

Closes #519

Base automatically changed from tokenid-bytea to opxdr-bytea-2 February 26, 2026 20:24
Base automatically changed from opxdr-bytea-2 to hash-bytea February 26, 2026 22:31
Base automatically changed from hash-bytea to main February 26, 2026 22:54
@aditya1702 aditya1702 self-assigned this Feb 27, 2026
Comment on lines +91 to +98
// Schedule the reconciliation job to run every 1 hour.
//
// The job checks whether retention has dropped chunks and advances the
// oldest ledger cursor if so. It is idempotent — a no-op when the cursor
// is already correct — and the query is microsecond-cheap (reads oldest
// chunk metadata + 1 row from ingest_store). Running on a fixed 1-hour
// interval keeps the cursor at most 1 hour stale after retention fires,
// with no coordination required with the retention job schedule.
Contributor

I'm concerned about the semantic shift from having the oldest ledger cursor updated atomically to having it updated out-of-band in a cron job. Specifically, how does this shift impact systems that rely on the oldest ledger cursor?

My understanding is that our backfilling job reads the oldest ledger value to determine which ranges to process. What happens when this value is stale? For example, if the job is to backfill ledgers 1000-5000 because the wallet backend operator sees the oldest ledger at 5000, but it's actually stale and the oldest ledger is really 10000, we'll create a gap in history between 5000 and 10000.

Is there a way to ensure the cursor is updated atomically with a chunk being dropped from the retention window? If not, should we have the job run more frequently?

Contributor Author
@aditya1702 aditya1702 Feb 27, 2026

Hmm, I think we can do the following together:

  1. Stop using the oldest ledger cursor when calculating backfill gaps, and instead always check the actual min ledger value stored in the table: basically the same query we use to reconcile the oldest ledger above. This is agnostic of when the retention policy runs, since we get the value directly from the DB.

  2. Run the reconciliation job that updates the oldest ledger more frequently: this value is currently shown in our Grafana dashboard to help us keep track of the current retention range, and a shorter interval would ensure it gets updated quickly.

I did explore running the retention policy and the cursor update atomically, but Timescale doesn't allow us to chain jobs.

Contributor

For option #1, which table are we going to query? If I understand correctly, the retention jobs could run at different times for each table, so for this option to be safe a job should use the oldest ledger across all of the tables it cares about.

One option that truly eliminates the potential gap would be to use a custom retention job instead of the built-in one. This has the downside of losing all of the built-in scheduling, lifecycle management, and monitoring, but it would allow us to drop chunks and advance the cursor atomically.
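The point about per-table retention timing can be captured in a small helper: since retention fires independently per table, the oldest ledger that is fully present in every table is the maximum of the per-table minimums. A hypothetical sketch:

```go
package main

// safeOldestLedger returns the oldest ledger guaranteed to be present in all
// of the given tables. Each table may have had chunks dropped at different
// times, so a backfill job must use the maximum of the per-table minimum
// ledgers; using any single table's minimum could schedule ranges that have
// already been dropped elsewhere. Names here are illustrative.
func safeOldestLedger(perTableMin map[string]uint32) uint32 {
	var oldest uint32
	for _, m := range perTableMin {
		if m > oldest {
			oldest = m
		}
	}
	return oldest
}
```

For instance, if transactions starts at ledger 1000 but operations has already retained away everything below 1200, the safe backfill floor is 1200.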

Contributor

If we were to use a table's actual oldest ledger during backfill jobs, what else is the oldest ledger value actually used for? Maybe we should just remove the oldest ledger cursor?

Contributor Author

So I already updated the code to ensure the backfill job doesn't use the cursor but instead gets the min ledger from the data: ab7985b

The only reason we still need the cursor is to know the currently stored range and to start another historical backfill. Other than that, it is not used anywhere.

@aristidesstaffieri
Copy link
Contributor

Code review

Found 1 issue:

  1. Tight busy-wait loop in WaitForRPCHealthAndRun — the default case in the select calls rpcService.GetHealth() with no sleep or backoff, creating an unbounded polling loop that hammers the RPC endpoint as fast as possible until it becomes healthy or the context times out. The old implementation used a channel-based heartbeat that naturally rate-limited polling. Adding a time.Sleep or time.Ticker in the loop would prevent thousands of unnecessary HTTP requests per second during integration test startup.

for {
	select {
	case <-ctx.Done():
		return fmt.Errorf("context canceled while waiting for RPC service to become healthy: %w", ctx.Err())
	case sig := <-signalChan:
		return fmt.Errorf("received signal %s while waiting for RPC service to become healthy", sig)
	default:
		healthRes, err := rpcService.GetHealth()
		if err == nil {
			if healthRes.Status == "healthy" {
				return nil
			}
		}
	}
}

🤖 Generated with Claude Code


@aditya1702 aditya1702 merged commit 081bcf7 into main Mar 6, 2026
9 checks passed
@aditya1702 aditya1702 deleted the timescale branch March 6, 2026 18:27


Successfully merging this pull request may close these issues.

Convert historical indexer tables to TimescaleDB hypertables and add Timescale scaffolding
