Fix supervisor: report vault execution so stuck-scan order isn't fixed #187
Conversation
Just for info: the failing tests are due to rounding issues and should be fixed separately.
(Resolved review thread on cadence/scripts/flow-yield-vaults/get_pending_yield_vaults_paginated.cdc)
Failing tests need: #182
Kay-Zee
left a comment
LGTM, but you mentioned it still requires FlowActions changes? Or are those resolved?
if !(self.yieldVaultRegistry[yieldVaultID] ?? false) {
    return
}
if let index = self.stuckScanOrder.firstIndex(of: yieldVaultID) {
Not that there's a solution, but I wonder how large this array can get before we see problems, since we're just doing array ops here.
There is a "solution": in the most recent commit I used a linked list. I'm not really happy about implementing a linked list in Cadence, but it gets execution time down to O(1). If you prefer the old version without that added complexity, we can always revert that commit. For context, the array version's per-report cost is sketched below.
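A reconstruction of the array-based reordering from the diff snippet above, shown only to illustrate cost (not the merged code; the contract wrapper just makes it self-contained):

access(all) contract StuckScanArraySketch {

    // vault ids ordered from least to most recently executed
    access(self) var stuckScanOrder: [UInt64]

    init() {
        self.stuckScanOrder = []
    }

    access(account) fun reportExecution(yieldVaultID: UInt64) {
        // firstIndex(of:) scans the whole array: O(n)
        if let index = self.stuckScanOrder.firstIndex(of: yieldVaultID) {
            self.stuckScanOrder.remove(at: index)    // O(n): shifts trailing elements
            self.stuckScanOrder.append(yieldVaultID) // vault becomes most recently executed
        }
    }
}

The linked-list commit replaces both O(n) steps with O(1) pointer rewires keyed by vault id.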
added autobalancer callback to find potentially stuck vaults
Fix scheduler registry hygiene follow-ups
I do have this PR against @holyfuchs's actual branch: #207
/// Node in the simulated doubly-linked list used for O(1) stuck-scan ordering.
/// `prev` points toward the head (most recently executed); `next` points toward the tail (oldest/least recently executed).
access(all) struct ListNode {
Nice work with the doubly linked list!
Minor comment: I think we can extract it from this contract into a utility contract for cleaner code. I asked Claude; the decision is up to you:
The core list logic is already self-contained. The only coupling is in getStuckScanCandidates, which touches scheduleCaps and calls dequeuePending. The node operations themselves touch nothing external.
Here's the clean extraction boundary:
┌──────────────────────────────────────────────────────────────┐
│ NEW: UInt64LinkedList contract (or resource) │
│ │
│ struct ListNode { prev: UInt64?, next: UInt64? } │
│ var nodes: {UInt64: ListNode} │
│ var head: UInt64? │
│ var tail: UInt64? │
│ │
│ access(account) fun insertAtHead(id: UInt64) │
│ access(account) fun remove(id: UInt64): Bool │
│ access(all) view fun contains(id: UInt64): Bool │
│ access(all) view fun tailWalk(limit: UInt): [UInt64] │
└──────────────────────────────────────────────────────────────┘
▲ called by
┌──────────────────────────────────────────────────────────────┐
│ FlowYieldVaultsSchedulerRegistry (unchanged business API) │
│ │
│ register() → list.insertAtHead(id) if participates │
│ unregister() → list.remove(id) │
│ reportExecution() → list.remove(id); list.insertAtHead(id) │
│ getStuckScanCandidates() → list.tailWalk() + prune logic │
│ (coupling to scheduleCaps + dequeuePending stays here) │
└──────────────────────────────────────────────────────────────┘
As a resource (cleanest Cadence idiom):
access(all) contract UInt64LinkedList {

    access(all) struct ListNode {
        access(all) var prev: UInt64?
        access(all) var next: UInt64?
        // ... setters ...
    }

    access(all) resource List {
        access(all) var nodes: {UInt64: ListNode}
        access(all) var head: UInt64?
        access(all) var tail: UInt64?

        access(all) fun insertAtHead(id: UInt64) { ... } // O(1)
        access(all) fun remove(id: UInt64): Bool { ... } // O(1)
        access(all) view fun contains(id: UInt64): Bool { ... }
    }

    // `create` is a Cadence keyword, so the factory needs another name
    access(all) fun createList(): @List {
        return <- create List(...)
    }
}
SchedulerRegistry stores one @UInt64LinkedList.List resource and delegates node operations to it. Changes required in SchedulerRegistry: replace the three fields and two private functions with a single @UInt64LinkedList.List field — roughly a 40-line reduction. The getStuckScanCandidates signature and behavior are unchanged.
The only decision is whether to use a resource (idiomatic, but requires the registry to store it) or a separate contract with its own module-level state (simpler to call, but less composable). The resource approach is preferred since it makes the list lifetime explicit and lets you instantiate multiple lists per contract if needed later.
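To make the sketch concrete, here is a hedged, self-contained version with the O(1) operations filled in. It assumes immutable ListNode values that are rebuilt and written back on each rewire (instead of the setters elided above); names follow the sketch, and none of this is the code merged in this PR:

access(all) contract UInt64LinkedList {

    access(all) struct ListNode {
        access(all) let prev: UInt64?
        access(all) let next: UInt64?

        init(prev: UInt64?, next: UInt64?) {
            self.prev = prev
            self.next = next
        }
    }

    access(all) resource List {
        // id -> node; presence in the dictionary doubles as membership
        access(all) var nodes: {UInt64: ListNode}
        access(all) var head: UInt64?
        access(all) var tail: UInt64?

        init() {
            self.nodes = {}
            self.head = nil
            self.tail = nil
        }

        // O(1): the new node becomes the head (most recently executed)
        access(all) fun insertAtHead(id: UInt64) {
            pre {
                self.nodes[id] == nil: "id is already in the list"
            }
            if let oldHead = self.head {
                // rewire the old head's prev pointer to the new node
                let old = self.nodes[oldHead]!
                self.nodes[oldHead] = ListNode(prev: id, next: old.next)
            } else {
                // empty list: the new node is also the tail
                self.tail = id
            }
            self.nodes[id] = ListNode(prev: nil, next: self.head)
            self.head = id
        }

        // O(1): unlink a node by rewiring its two neighbors
        access(all) fun remove(id: UInt64): Bool {
            let node = self.nodes.remove(key: id)
            if node == nil {
                return false
            }
            if let prevID = node!.prev {
                let prevNode = self.nodes[prevID]!
                self.nodes[prevID] = ListNode(prev: prevNode.prev, next: node!.next)
            } else {
                self.head = node!.next
            }
            if let nextID = node!.next {
                let nextNode = self.nodes[nextID]!
                self.nodes[nextID] = ListNode(prev: node!.prev, next: nextNode.next)
            } else {
                self.tail = node!.prev
            }
            return true
        }

        access(all) view fun contains(id: UInt64): Bool {
            return self.nodes[id] != nil
        }

        // walk from the tail (least recently executed) toward the head
        access(all) view fun tailWalk(limit: UInt): [UInt64] {
            var result: [UInt64] = []
            var current = self.tail
            while current != nil && UInt(result.length) < limit {
                result.append(current!)
                current = self.nodes[current!]!.prev
            }
            return result
        }
    }

    access(all) fun createList(): @List {
        return <- create List()
    }
}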
Closes: #177
Description
This fixes the supervisor's "check the first N vaults" logic: vault executions are now reported to the registry, which keeps a list of vaults ordered from least to most recently executed. The supervisor then scans only the first N (e.g. 5) and recovers the ones that are actually stuck, instead of always checking the same fixed set.
What was implemented
Execution callback
Each AutoBalancer now has an execution callback that runs after a scheduled rebalance. The callback calls the registry with that vault’s id so the registry can update its internal order (remove id from the list, append to the end).
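A hedged sketch of the registry-side update, assuming the UInt64LinkedList sketch above and a hypothetical stuckScanList field (the merged code may differ):

// inside FlowYieldVaultsSchedulerRegistry (illustrative, not the merged code)
access(account) fun reportExecution(yieldVaultID: UInt64) {
    // ignore vaults that aren't registered for stuck-scan ordering
    if !(self.yieldVaultRegistry[yieldVaultID] ?? false) {
        return
    }
    // O(1) move-to-front: unlink wherever it is, re-insert as most recent
    self.stuckScanList.remove(id: yieldVaultID)
    self.stuckScanList.insertAtHead(id: yieldVaultID)
}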
Shared callback resource
In FlowYieldVaultsAutoBalancers, a single RegistryReportCallback resource per account implements DeFiActions.AutoBalancerExecutionCallback. Its onExecuted(balancerUUID) calls the registry so the vault that just ran is reported by id. Every new AutoBalancer gets a capability to this shared callback and passes it to setExecutionCallback(cap); a sketch of this wiring follows the context note below.

Context (from discussion)
The supervisor was limited to processing a small batch (e.g. the first 5 vaults) per run. The agreed short-term approach was to order the vault list by "last executed" so the supervisor always checks the oldest / least recently executed vaults first, since those are the most likely to be stuck.
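A hedged sketch of that shared callback, assuming onExecuted(balancerUUID) is the interface's only hook and that the registry's reportExecution is reachable via access(account) (i.e. both contracts live in the same account); the wrapper contract name is hypothetical:

import DeFiActions from "DeFiActions"
import FlowYieldVaultsSchedulerRegistry from "FlowYieldVaultsSchedulerRegistry"

access(all) contract AutoBalancerCallbackSketch {

    // One instance of this resource is stored per account; every new
    // AutoBalancer receives a capability to it via setExecutionCallback(cap).
    access(all) resource RegistryReportCallback: DeFiActions.AutoBalancerExecutionCallback {

        // runs after a scheduled rebalance executes
        access(all) fun onExecuted(balancerUUID: UInt64) {
            // report by id so the registry moves this vault to the
            // "most recently executed" end of the stuck-scan order
            FlowYieldVaultsSchedulerRegistry.reportExecution(yieldVaultID: balancerUUID)
        }
    }

    access(all) fun createCallback(): @RegistryReportCallback {
        return <- create RegistryReportCallback()
    }
}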