
[Util] Improve remote debugging tooling#8440

Open
turbolent wants to merge 14 commits into master from bastian/debugging-improvements

Conversation


@turbolent turbolent commented Feb 21, 2026

  • Add a new command compare-debug-tx which runs debug-tx on two branches and compares the results
  • Port improvements from feature/cadence-vm:
    • Improve debug-tx and debug-script commands:
      • Enable the AN Execution Data API by default. The EN only keeps data for a short period of time, the AN has registers for much longer
      • Add support for executing system transactions
    • Add support for an entropy provider. Provide a simple block-hash-based provider which at least allows transactions to execute
    • Add support for executing all transactions of a block, or multiple transactions
    • Add support for showing execution results (logs, events, read registers, updated registers)
      • Add support for capturing traces and filtering to only "interesting" Cadence traces
    • Improve tracing in FVM, add additional information

Summary by CodeRabbit

  • New Features

    • CLI command to compare transaction execution results and traces across branches
    • Block-oriented and remote debugging modes, and a Cadence-focused trace collector
  • Improvements

    • Unified remote data access with caching and capturing snapshots; safer tracing controls (sync vs batched export, guarded attributes)
    • Deterministic register sorting and clearer, human-readable execution result output
  • Tests

    • Persistence and cache tests for register storage

@turbolent turbolent requested review from a team February 21, 2026 00:30
@turbolent turbolent requested a review from a team as a code owner February 21, 2026 00:30

github-actions bot commented Feb 21, 2026

Dependency Review

✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.

Scanned Files

None


coderabbitai bot commented Feb 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


Adds a new CLI command to compare debug-tx outputs across git branches and refactors debugging to use remote clients, block-aware execution flows, enriched tracing, snapshot wrappers, register cache utilities, Cadence span exporting, and related tests and helpers.

Changes

Cohort / File(s) Summary
Compare CLI
cmd/util/cmd/compare-debug-tx/cmd.go, cmd/util/cmd/root.go
Adds compare-debug-tx Cobra command, repo/branch validation, runs debug-tx on two branches, captures results/traces to temp files and diffs them; registers command in root.
Debug CLI & flow
cmd/util/cmd/debug-tx/cmd.go, cmd/util/cmd/debug-script/cmd.go
Reworks debug-tx/debug-script to support remote clients, block-oriented execution (RunBlock/RunSingleTransaction), per-transaction tracing/entropy, block subscriptions, and public helper APIs.
Remote client & snapshots
utils/debug/remoteClient.go, utils/debug/remoteView.go, utils/debug/remoteDebugger.go
Adds RemoteClient interface and ExecutionNode/ExecutionData implementations; replaces local persistent cache with on-demand remote fetches; introduces CachingStorageSnapshot, CapturingStorageSnapshot, UpdatableStorageSnapshot, and consolidated Result type.
Register cache & utilities
utils/debug/registerCache.go, utils/debug/registers.go, utils/debug/registerCache_test.go
Implements composite register keys, streaming JSON FileRegisterCache (constructor returns error), deterministic sorting helpers, and tests for persistence/format.
Result formatting & Cadence tracing
utils/debug/result.go, utils/debug/cadence.go
Adds WriteResult for serializing execution results and InterestingCadenceSpanExporter to filter/collect Cadence-related OTLP spans with optional live logging.
Access API & subscriptions
utils/debug/api.go
Replaces Execution API header calls with Access API equivalents, maps block status, and adds access-based subscription helpers.
FVM tracing and guards
fvm/environment/program_logger.go, fvm/environment/programs.go, fvm/environment/system_contracts.go, fvm/environment/value_store.go, fvm/evm/handler/handler.go, fvm/tracing/tracer_span.go, fvm/transactionVerifier.go
Adds or guards OpenTelemetry attributes (location, owner/key, tx_counts, tx ID), adds spans around value store ops, renames isTraceable→IsTraceable.
Tracing exporter behavior
module/trace/trace.go
Adds NewTracerWithExporter(sync) to choose synchronous vs batched exporter behavior.
Storage snapshot API
fvm/storage/snapshot/execution_snapshot.go
Exposes ExecutionSnapshot.ReadRegisterSet() to retrieve the internal read-register set.
Minor CLI wiring & tests
cmd/util/cmd/root.go, utils/debug/registerCache_test.go
Registers new command in root and adds tests validating register cache persistence and format.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI as compare-debug-tx CLI
    participant Git as Git Repo
    participant DebugTx as debug-tx
    participant Diff as diff tool

    User->>CLI: run compare-debug-tx --branch1 A --branch2 B
    CLI->>Git: find repo root & verify clean working tree
    CLI->>Git: checkout branch A
    CLI->>DebugTx: invoke debug-tx -> resultA, traceA
    CLI->>Git: checkout branch B
    CLI->>DebugTx: invoke debug-tx -> resultB, traceB
    CLI->>Diff: diff resultA vs resultB (labeled)
    CLI->>Diff: diff traceA vs traceB (labeled)
    Diff-->>User: unified diffs
sequenceDiagram
    participant CLI as debug-tx CLI
    participant Flow as Access/Flow Client
    participant Remote as RemoteClient
    participant Snapshot as StorageSnapshot (Caching/Capturing)
    participant FVM as FVM Engine
    participant Exporter as Trace Exporter

    CLI->>Flow: fetch block header (if --block-id)
    Flow-->>CLI: block header
    CLI->>Flow: fetch block transactions
    Flow-->>CLI: tx list
    CLI->>Remote: NewRemoteClient(address, mode)
    Remote-->>CLI: client
    CLI->>Remote: StorageSnapshot(blockHeight, blockID)
    Remote-->>CLI: Snapshot
    CLI->>FVM: RunBlock/RunSingleTransaction(snapshot, header, txs, options)
    loop per transaction
      FVM->>Snapshot: Get/Set registers
      Snapshot-->>FVM: register values
      FVM->>Exporter: emit spans (if enabled)
      FVM-->>CLI: result
    end
    CLI->>CLI: WriteResult (if --show-result) and/or write traces

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested labels

Tech Debt

Suggested reviewers

  • fxamacker
  • janezpodhostnik
  • zhangchiqing

Poem

🐰 I hopped through branches, peered at every trace,
Cached a snapshot, tidy in its place,
Spans hummed Cadence, registers aligned,
Diffs showed secrets branches left behind,
A rabbit applauds this tidy chase!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage: ⚠️ Warning. Docstring coverage is 15.69%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
Title Check: ✅ Passed. The title '[Util] Improve remote debugging tooling' accurately summarizes the main objective of adding and improving debugging commands and infrastructure in the util package.




codecov-commenter commented Feb 21, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/util/cmd/debug-script/cmd.go (1)

1-1: ⚠️ Potential issue | 🟡 Minor

Package name debug_tx is incorrect for the debug-script command.

This file resides in cmd/util/cmd/debug-script/ but declares package debug_tx. This is a copy-paste error from the debug-tx command. The package should be package debug_script to match Go naming conventions where the package name aligns with the import path's final segment. While the alias in cmd/util/cmd/root.go (debug_script "github.com/onflow/flow-go/cmd/util/cmd/debug-script") currently masks this issue, it violates Go conventions and should be corrected to package debug_script.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-script/cmd.go` at line 1, The package declaration in
cmd/util/cmd/debug-script/cmd.go is incorrectly named `debug_tx`; change the
package line to `package debug_script` so the package name matches the import
path's final segment and Go conventions; update the package declaration in the
top of the file containing the `debug-script` command (the file with the current
`package debug_tx` line) — no other code changes should be necessary because
root.go imports it as `debug_script`, but this fix ensures the package name and
import path align.
🧹 Nitpick comments (13)
fvm/environment/programs.go (1)

95-101: Inconsistent nil-guard and attribute method compared to program_logger.go.

Three divergences from the companion changes in program_logger.go:

  1. span.Tracer != nil rather than span.IsTraceable() – prefer the exported helper for consistency.
  2. location.ID() vs location.String() used in program_logger.go – the two can return different strings for the same location; pick one form consistently across FVM tracing attributes.
  3. Missing location != nil guard: program_logger.go explicitly guards this; if a nil common.Location were ever passed here with a live tracer, location.ID() would panic.
♻️ Proposed refactor
 	span := programs.tracer.StartChildSpan(trace.FVMEnvGetOrLoadProgram)
-	if span.Tracer != nil {
+	if span.IsTraceable() && location != nil {
 		span.SetAttributes(
-			attribute.String("location", location.ID()),
+			attribute.String("location", location.String()),
 		)
 	}
 	defer span.End()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fvm/environment/programs.go` around lines 95 - 101, The span attribute
setting in programs.tracer.StartChildSpan should mirror program_logger.go:
replace the span.Tracer != nil check with span.IsTraceable(), add a nil guard
for the location (e.g., if location != nil && span.IsTraceable()), and use
location.String() instead of location.ID() so the attribute key/value format is
consistent across FVM tracing; update the attribute.String call accordingly
inside programs.tracer.StartChildSpan handling.
utils/debug/registers.go (1)

17-21: SortRegisterIDs can pass CompareRegisterIDs directly.

The anonymous wrapper is unnecessary; slices.SortFunc accepts a function with the same signature as CompareRegisterIDs.

♻️ Proposed simplification
 func SortRegisterIDs(registerIDs []flow.RegisterID) {
-	slices.SortFunc(registerIDs, func(a, b flow.RegisterID) int {
-		return CompareRegisterIDs(a, b)
-	})
+	slices.SortFunc(registerIDs, CompareRegisterIDs)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utils/debug/registers.go` around lines 17 - 21, The SortRegisterIDs function
uses an unnecessary anonymous wrapper when calling slices.SortFunc; replace the
inline func with the existing CompareRegisterIDs function by passing
CompareRegisterIDs directly to slices.SortFunc (i.e., call
slices.SortFunc(registerIDs, CompareRegisterIDs)) so SortRegisterIDs simply
delegates to CompareRegisterIDs without the extra wrapper.
fvm/evm/handler/handler.go (1)

187-189: Prefer span.IsTraceable() over span.Tracer != nil for consistency.

Same pattern as in system_contracts.go: IsTraceable() is the exported helper meant for this guard and keeps the codebase consistent.

♻️ Proposed refactor
-	if span.Tracer != nil {
+	if span.IsTraceable() {
 		span.SetAttributes(attribute.Int("tx_counts", len(rlpEncodedTxs)))
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fvm/evm/handler/handler.go` around lines 187 - 189, Replace the raw tracer
nil-check with the exported helper to keep consistency: instead of checking
span.Tracer != nil before calling span.SetAttributes(attribute.Int("tx_counts",
len(rlpEncodedTxs))), use span.IsTraceable() as the guard; locate the span usage
in the EVM handler (the block referencing rlpEncodedTxs and calling
span.SetAttributes) and change the condition to if span.IsTraceable() { ... } so
it matches the pattern used in system_contracts.go.
fvm/environment/system_contracts.go (1)

52-57: Prefer span.IsTraceable() over span.Tracer != nil for consistency.

program_logger.go (same package) uses tracer.IsTraceable() for the same guard, and IsTraceable() was just exported specifically for this purpose. Using the raw field check creates an unnecessary divergence.

♻️ Proposed refactor
-	if span.Tracer != nil {
+	if span.IsTraceable() {
 		span.SetAttributes(
 			attribute.String(
 				"transaction.ContractFunctionCall",
 				contractLocation.String()+"."+spec.FunctionName))
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fvm/environment/system_contracts.go` around lines 52 - 57, Replace the raw
nil check on span.Tracer with the exported helper to match package convention:
use span.IsTraceable() as the guard before calling span.SetAttributes(...).
Update the condition around
span.SetAttributes(attribute.String("transaction.ContractFunctionCall",
contractLocation.String()+"."+spec.FunctionName)) so it executes only when
span.IsTraceable() returns true, keeping the SetAttributes call and attribute
construction unchanged.
fvm/environment/value_store.go (1)

165-173: Tracing the full value in SetValue — consider performance implications.

When tracing is enabled, hex-encoding the entire value (which can be large for storage operations) on every SetValue call adds overhead. This is likely acceptable for debugging scenarios, but worth noting if tracing is ever enabled in production paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fvm/environment/value_store.go` around lines 165 - 173, The tracing currently
hex-encodes the entire value inside the SetValue path
(span.StartChildSpan(trace.FVMEnvSetValue) and span.SetAttributes(... "value",
hex.EncodeToString(value) ...)), which can be expensive for large buffers;
change the span attribute to avoid encoding the full payload—e.g. log only the
value length, a short truncated hex prefix, or a hash (sha256) of value instead
of hex.EncodeToString(value). Update the SetValue span.SetAttributes call to use
attribute.String("value_summary", ...) or attribute.Int("value_len", len(value))
(or attribute.String("value_hash", computedHash)) and remove the full-encoding
to prevent performance and memory overhead in production tracing.
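One way to bound the attribute size, as the prompt suggests, is to record only the length, a short hex prefix, and a digest. This is an illustrative sketch of that idea using only the standard library; the function name and format are assumptions, not the FVM's actual code:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// summarizeValue produces a bounded span-attribute string for an arbitrarily
// large register value: its length, a short hex prefix, and a truncated
// SHA-256 digest, instead of hex-encoding the whole payload.
func summarizeValue(value []byte) string {
	prefix := value
	if len(prefix) > 8 {
		prefix = prefix[:8]
	}
	sum := sha256.Sum256(value)
	return fmt.Sprintf("len=%d prefix=%s sha256=%s",
		len(value),
		hex.EncodeToString(prefix),
		hex.EncodeToString(sum[:8]))
}

func main() {
	large := make([]byte, 1<<20) // a 1 MiB value; only ~60 bytes reach the trace
	fmt.Println(summarizeValue(large))
}
```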
utils/debug/remoteDebugger.go (1)

60-77: Minor: WithAuthorizationChecksEnabled(false) is already set in the parent context.

Line 67 sets fvm.WithAuthorizationChecksEnabled(false) for system transactions, but NewRemoteDebugger (line 39) already sets this on the base context for all transactions. This is redundant but harmless. If the intention is that non-system transactions should eventually have auth checks enabled in the debugger, consider removing it from the base context instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utils/debug/remoteDebugger.go` around lines 60 - 77, The call to
fvm.WithAuthorizationChecksEnabled(false) is redundant in the
isSystemTransaction branch because NewRemoteDebugger already applies it to the
base context; decide intended behavior and remove the duplicate: if you want
non-system transactions to have auth checks enabled, remove
WithAuthorizationChecksEnabled(false) from the base context in NewRemoteDebugger
(so keep the call only in the isSystemTransaction block); otherwise remove the
entry from the isSystemTransaction append and leave NewRemoteDebugger unchanged;
update references in NewRemoteDebugger and this file (NewContextFromParent /
blockCtx / isSystemTransaction) accordingly to ensure only the intended path
disables authorization checks.
cmd/util/cmd/compare-debug-tx/cmd.go (1)

96-107: Repository is left on branch2 after execution — consider restoring the original branch.

After both debug-tx runs complete, the working tree remains checked out on flagBranch2. If the user was on a different branch before running the command, this is surprising. Consider saving and restoring the original branch (or HEAD) at the end (or via defer).

Proposed fix: save and restore original branch

Add a helper to capture the current ref before checkout, and defer restoration:

+func currentBranch(repoRoot string) string {
+	cmd := exec.Command("git", "rev-parse", "--abbrev-ref", "HEAD")
+	cmd.Dir = repoRoot
+	out, err := cmd.Output()
+	if err != nil {
+		log.Fatal().Err(err).Msg("failed to determine current branch")
+	}
+	return strings.TrimSpace(string(out))
+}

Then in run:

 func run(_ *cobra.Command, args []string) {
 	repoRoot := findRepoRoot()
 	checkCleanWorkingTree(repoRoot)
+	originalBranch := currentBranch(repoRoot)
+	defer checkoutBranch(repoRoot, originalBranch)
 
 	// ... rest of function
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/compare-debug-tx/cmd.go` around lines 96 - 107, Save the current
git ref/branch before the first checkout and ensure it is restored when the
command exits: call a helper (e.g., getCurrentRef or similar) to capture the
current HEAD, then immediately defer a restore using checkoutBranch(repoRoot,
originalRef) (or equivalent) so after calling checkoutBranch(repoRoot,
flagBranch1) and checkoutBranch(repoRoot, flagBranch2) the working tree is
returned to the original ref; place the capture and defer at the start of the
function that invokes checkoutBranch/runDebugTx/diffFiles to guarantee
restoration on both success and error.
utils/debug/cadence.go (1)

16-19: Exported mutable fields — consider whether Spans should be accessed directly.

Spans and Log are exported and mutable. This is fine for a debugging utility, but note that Spans is mutated by ExportSpans and read by WriteSpans. If there's any chance these are called concurrently (e.g., writing spans while still exporting), a mutex would be needed. For the current debugging use case (export → shutdown → write), this appears safe.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utils/debug/cadence.go` around lines 16 - 19, InterestingCadenceSpanExporter
exposes mutable fields Spans and Log which are mutated by ExportSpans and read
by WriteSpans; decide whether to prevent direct access by making Spans
unexported (rename to spans) and provide safe accessor methods (e.g., AddSpan,
GetSpans/WriteSpans) or, if concurrent ExportSpans/WriteSpans is possible, add a
sync.Mutex to InterestingCadenceSpanExporter and lock around mutations/reads of
Spans (and Log if concurrently accessed) to prevent data races; update
ExportSpans and WriteSpans to use the new methods/locks and keep Log access
consistent.
cmd/util/cmd/debug-tx/cmd.go (2)

522-531: BlockHashEntropyProvider provides deterministic entropy for reproducible debugging.

Using the block hash as the entropy source is a pragmatic choice — it enables transactions that require randomness to execute during debugging, while being deterministic. The value receiver is fine since Go's escape analysis will heap-allocate the receiver when the slice escapes.

Note that this doesn't match the actual entropy used in production (which uses a VRF-based beacon), so execution results may differ from mainnet for transactions using randomness. The comment could mention this caveat.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-tx/cmd.go` around lines 522 - 531, The comment on
BlockHashEntropyProvider should explicitly note that using the block hash
provides deterministic entropy for debugging but differs from production entropy
(which uses a VRF-based beacon), so transaction execution using randomness may
differ from mainnet; update the doc comment on the BlockHashEntropyProvider type
(and mention RandomSource) to add this caveat and clarify the
deterministic/debug-only nature of this provider.
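The determinism the comment describes can be sketched as follows. All names here are illustrative (the real provider lives in the PR's debug-tx command, and production Flow entropy comes from a VRF-based random beacon, so results can differ from mainnet):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blockHashEntropyProvider derives a deterministic random source from the
// block ID alone: every run over the same block yields the same entropy,
// which is what makes debugging reproducible. Debug-only sketch.
type blockHashEntropyProvider struct {
	blockID [32]byte
}

// RandomSource hashes the block ID to produce a fixed-length entropy seed.
func (p blockHashEntropyProvider) RandomSource() ([]byte, error) {
	sum := sha256.Sum256(p.blockID[:])
	return sum[:], nil
}

func main() {
	p := blockHashEntropyProvider{}
	src, _ := p.RandomSource()
	fmt.Println(hex.EncodeToString(src))
}
```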

91-102: Same log.Fatal without return pattern as in debug-script.

If flagUseExecutionDataAPI is false and flagExecutionAddress is empty, log.Fatal is called on Line 97, but execution could theoretically continue to Line 99 where err is uninitialized. Adding a return after log.Fatal would be more defensive. Same reasoning as the debug-script comment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-tx/cmd.go` around lines 91 - 102, The guard that fatals
when neither flagUseExecutionDataAPI nor flagExecutionAddress is set can allow
execution to continue and hit the subsequent err check; update the remote client
initialization in the block around remoteClient so execution cannot proceed
after the fatal: either add an explicit return immediately after the log.Fatal()
call in that else branch or restructure the if/else to set err (or a sentinel)
and handle it uniformly; ensure the code around NewExecutionDataRemoteClient and
NewExecutionNodeRemoteClient leaves err defined before the later if err != nil
check and that remoteClient.Close() is only deferred when remoteClient was
successfully created.
utils/debug/remoteView.go (2)

29-40: NewExecutionNodeStorageSnapshot returns error but never fails.

The constructor always returns nil error. Consider simplifying the signature to just return the struct, avoiding unnecessary error handling at every call site.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utils/debug/remoteView.go` around lines 29 - 40, The constructor
NewExecutionNodeStorageSnapshot currently returns
(*ExecutionNodeStorageSnapshot, error) but never returns a non-nil error; change
its signature to return only *ExecutionNodeStorageSnapshot (remove the error
return), update the function body to return just
&ExecutionNodeStorageSnapshot{Client: client, BlockID: blockID}, and update all
call sites that expect two return values (and any error checks) to handle the
single return value accordingly (search for NewExecutionNodeStorageSnapshot
usages to modify callers).

191-246: CapturingStorageSnapshot — consider using named struct types for Reads/Writes.

The anonymous struct types for Reads and Writes make it harder for callers to work with these slices (e.g., they can't declare variables of that type easily). A named type like RegisterEntry would improve ergonomics.

Example
type RegisterEntry struct {
	flow.RegisterID
	flow.RegisterValue
}

type CapturingStorageSnapshot struct {
	backing StorageSnapshot
	Reads   []RegisterEntry
	Writes  []RegisterEntry
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utils/debug/remoteView.go` around lines 191 - 246, Introduce a named struct
type (e.g., RegisterEntry) and replace the anonymous struct slices in
CapturingStorageSnapshot with []RegisterEntry for Reads and Writes; update the
NewCapturingStorageSnapshot, Get and Set methods to append RegisterEntry values
instead of anonymous structs and keep the existing logic that forwards Set to
the UpdatableStorageSnapshot backing when present (referencing
CapturingStorageSnapshot, Reads, Writes, RegisterEntry, Get, Set, and
UpdatableStorageSnapshot to locate and modify code).
cmd/util/cmd/debug-script/cmd.go (1)

99-110: Potential nil-dereference of err if log.Fatal behavior changes.

If flagUseExecutionDataAPI is false and flagExecutionAddress is empty, execution reaches log.Fatal() on Line 105, which exits the process. However, if it doesn't exit (e.g., in a test with a stubbed logger), the if err != nil check on Line 107 would evaluate an uninitialized err, and remoteClient would be nil — causing a nil dereference on Line 110 or 112.

This is a minor robustness concern for a CLI tool, but adding a return after the log.Fatal on Line 105 (or restructuring) would make the code more defensive.

Suggested fix
 	} else {
 		log.Fatal().Msg("Either --use-execution-data-api or --execution-address must be provided")
+		return
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-script/cmd.go` around lines 99 - 110, The current branch
where neither flagUseExecutionDataAPI nor flagExecutionAddress is set calls
log.Fatal(...) but then proceeds to use err and remoteClient; to make this
robust, ensure control flow cannot reach the subsequent err check and
remoteClient.Close() after the log.Fatal call—either add an explicit return
immediately after the log.Fatal(...) call or restructure the client-creation
logic so that err and remoteClient are always initialized before reaching the
later if err != nil and defer remoteClient.Close() lines (referencing
debug.NewExecutionDataRemoteClient, debug.NewExecutionNodeRemoteClient, the err
variable, remoteClient variable, and the log.Fatal call).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/util/cmd/debug-tx/cmd.go`:
- Line 100: The log message contains a typo ("Failed to remote client"); update
the log.Fatal().Err(err).Msg(...) call in cmd/util/cmd/debug-tx/cmd.go so the
message reads "Failed to create remote client" (i.e., locate the
log.Fatal().Err(err).Msg invocation and replace the string to include the
missing word "create").

In `@fvm/environment/value_store.go`:
- Around line 243-249: The span attribute "index" is being set using the named
return slabIndex before the allocation occurs, so it will always be zero; move
the span.SetAttributes call (or at least the attribute that uses
binary.BigEndian.Uint64(slabIndex[:])) to after the actual allocation call that
populates slabIndex (the allocation logic in the same function where slabIndex
is assigned), or alternatively add a span event after allocation that records
the allocated index; update the code around store.tracer.StartChildSpan /
span.SetAttributes (trace.FVMEnvAllocateSlabIndex and slabIndex usage) to record
the real allocated value instead of the zero-initialized one.

In `@utils/debug/registerCache.go`:
- Around line 109-119: The Set/Get owner encoding is inconsistent: Set stores
RegisterEntry.Key.Owner as a raw binary string (via AddressToRegisterOwner)
while Get expects a hex string; fix by making them use the same encoding —
update FileRegisterCache.Set to store the owner as hex by replacing
AddressToRegisterOwner(ownerAddr) with hex.EncodeToString(ownerAddr.Bytes())
(i.e., set Key.Owner to the hex-encoded address), so Get’s
hex.DecodeString(d.Key.Owner) will succeed; ensure the change is applied where
FileRegisterCache.Set constructs flow.RegisterEntry and no other code expects
the old binary-string format.
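The round-trip property the fix restores can be shown in isolation. The helper names below are illustrative, not the PR's actual functions; the point is simply that Set and Get must agree on the encoding:

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// encodeOwner is what Set should store (hex-encoded address bytes), and
// decodeOwner is what Get already does; using the same encoding on both
// sides makes the owner key round-trip cleanly.
func encodeOwner(ownerBytes []byte) string {
	return hex.EncodeToString(ownerBytes)
}

func decodeOwner(stored string) ([]byte, error) {
	return hex.DecodeString(stored)
}

func main() {
	owner := []byte{0x01, 0x02, 0xab} // stand-in for ownerAddr.Bytes()
	stored := encodeOwner(owner)
	decoded, err := decodeOwner(stored)
	fmt.Println(stored, err == nil, len(decoded) == len(owner))
}
```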
- Around line 54-98: NewFileRegisterCache currently returns an error when
os.Open(filePath) fails; change it to treat a missing file as an empty cache by
checking os.IsNotExist(err) and returning &FileRegisterCache{filePath: filePath,
data: make(map[string]flow.RegisterEntry)} nil in that case, while preserving
existing behavior for other errors; keep the rest of the parsing logic for when
the file opens successfully (ensure the defer f.Close() stays after successful
os.Open) and leave Persist() to create and write the file later.
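The missing-file-as-empty-cache behavior can be sketched like this. The type is a minimal stand-in for the PR's FileRegisterCache (the real constructor also decodes streamed JSON entries, omitted here):

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// fileRegisterCache is a minimal stand-in for the PR's FileRegisterCache.
type fileRegisterCache struct {
	filePath string
	data     map[string]string
}

// newFileRegisterCache treats a missing file as an empty cache, as the
// review suggests: only unexpected open errors are propagated, and
// Persist (not shown) creates the file on first write.
func newFileRegisterCache(filePath string) (*fileRegisterCache, error) {
	cache := &fileRegisterCache{
		filePath: filePath,
		data:     make(map[string]string),
	}
	f, err := os.Open(filePath)
	if err != nil {
		if errors.Is(err, os.ErrNotExist) {
			return cache, nil // first run: nothing persisted yet
		}
		return nil, err
	}
	defer f.Close()
	// ... decode entries from f into cache.data ...
	return cache, nil
}

func main() {
	cache, err := newFileRegisterCache("does-not-exist.json")
	fmt.Println(err == nil, len(cache.data))
}
```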

In `@utils/debug/remoteClient.go`:
- Around line 75-81: The StorageSnapshot method can underflow when computing
blockHeight-1; add a guard at the start of
ExecutionDataRemoteClient.StorageSnapshot to handle blockHeight == 0 (e.g.,
return a clear error instead of calling NewExecutionDataStorageSnapshot with
math.MaxUint64). Locate the StorageSnapshot method and the call to
NewExecutionDataStorageSnapshot and ensure you validate blockHeight before
creating executiondata.NewExecutionDataAPIClient or invoking
NewExecutionDataStorageSnapshot so an invalid 0 height is handled explicitly.
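The underflow guard amounts to checking the height before subtracting. A sketch under the assumption that the snapshot needs the parent height (the real method is on ExecutionDataRemoteClient; the helper name here is hypothetical):

```go
package main

import (
	"fmt"
)

// parentHeight guards the blockHeight-1 computation the review flags:
// at height 0 the subtraction would wrap to math.MaxUint64 on uint64,
// so we return an explicit error instead.
func parentHeight(blockHeight uint64) (uint64, error) {
	if blockHeight == 0 {
		return 0, fmt.Errorf("block height 0 has no parent; cannot build storage snapshot")
	}
	return blockHeight - 1, nil
}

func main() {
	if _, err := parentHeight(0); err != nil {
		fmt.Println("guarded:", err)
	}
	h, _ := parentHeight(100)
	fmt.Println(h)
}
```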

In `@utils/debug/remoteView.go`:
- Around line 143-144: The code indexes resp.Values[0] without a bounds check
which can panic; update the handler that assigns value = resp.Values[0] to first
verify len(resp.Values) > 0 and handle the empty-case safely (e.g., return an
error, set value to a zero/nil sentinel, or log and continue) so accessing
resp.Values is guarded; look for the assignment to value and any surrounding
function (the response-processing block that reads resp.Values) and add the
defensive check and appropriate error/handling path.

In `@utils/debug/result.go`:
- Around line 27-43: The code assumes result.Snapshot is non-nil and will panic
if Snapshot is nil; add a nil check on result.Snapshot before calling
ReadRegisterIDs/UpdatedRegisters and writing register sections. Specifically, in
the functions that call result.Snapshot.ReadRegisterIDs() and
result.Snapshot.UpdatedRegisters(), guard the whole "Read registers" and
"Updated registers" blocks with if result.Snapshot != nil { ... } (retain the
SortRegisterIDs, SortRegisterEntries, and fmt.Fprintf calls inside the guarded
blocks) and otherwise skip or print a brief message indicating no snapshot
available to avoid dereferencing a nil *snapshot.ExecutionSnapshot.

---

Outside diff comments:
In `@cmd/util/cmd/debug-script/cmd.go`:
- Line 1: The package declaration in cmd/util/cmd/debug-script/cmd.go is
incorrectly named `debug_tx`; change the package line to `package debug_script`
so the package name matches the import path's final segment and Go conventions;
update the package declaration in the top of the file containing the
`debug-script` command (the file with the current `package debug_tx` line) — no
other code changes should be necessary because root.go imports it as
`debug_script`, but this fix ensures the package name and import path align.

---

Nitpick comments:
In `@cmd/util/cmd/compare-debug-tx/cmd.go`:
- Around line 96-107: Save the current git ref/branch before the first checkout
and ensure it is restored when the command exits: call a helper (e.g.,
getCurrentRef or similar) to capture the current HEAD, then immediately defer a
restore using checkoutBranch(repoRoot, originalRef) (or equivalent) so after
calling checkoutBranch(repoRoot, flagBranch1) and checkoutBranch(repoRoot,
flagBranch2) the working tree is returned to the original ref; place the capture
and defer at the start of the function that invokes
checkoutBranch/runDebugTx/diffFiles to guarantee restoration on both success and
error.
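The capture-and-defer-restore pattern can be sketched with injected helpers so it is testable without a git repository (`withBranches`, `current`, and `checkout` are hypothetical; the real code would wrap `git rev-parse --abbrev-ref HEAD` and the existing `checkoutBranch`):

```go
package main

import "fmt"

// withBranches checks out each branch in turn and restores the original
// ref on return, on both success and error paths.
func withBranches(
	current func() (string, error),
	checkout func(ref string) error,
	branches []string,
	run func(branch string) error,
) (err error) {
	originalRef, err := current()
	if err != nil {
		return err
	}
	defer func() {
		// restore the original ref; keep the first error if one occurred
		if restoreErr := checkout(originalRef); err == nil {
			err = restoreErr
		}
	}()
	for _, branch := range branches {
		if err := checkout(branch); err != nil {
			return err
		}
		if err := run(branch); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	var order []string
	cur := "master"
	_ = withBranches(
		func() (string, error) { return cur, nil },
		func(ref string) error { cur = ref; order = append(order, ref); return nil },
		[]string{"branch1", "branch2"},
		func(string) error { return nil },
	)
	fmt.Println(order, cur)
}
```

The deferred restore is registered immediately after capturing the ref, so an error while running on either branch still returns the working tree to its original state.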

In `@cmd/util/cmd/debug-script/cmd.go`:
- Around line 99-110: The current branch where neither flagUseExecutionDataAPI
nor flagExecutionAddress is set calls log.Fatal(...) but then proceeds to use
err and remoteClient; to make this robust, ensure control flow cannot reach the
subsequent err check and remoteClient.Close() after the log.Fatal call—either
add an explicit return immediately after the log.Fatal(...) call or restructure
the client-creation logic so that err and remoteClient are always initialized
before reaching the later if err != nil and defer remoteClient.Close() lines
(referencing debug.NewExecutionDataRemoteClient,
debug.NewExecutionNodeRemoteClient, the err variable, remoteClient variable, and
the log.Fatal call).

In `@cmd/util/cmd/debug-tx/cmd.go`:
- Around line 522-531: The comment on BlockHashEntropyProvider should explicitly
note that using the block hash provides deterministic entropy for debugging but
differs from production entropy (which uses a VRF-based beacon), so transaction
execution using randomness may differ from mainnet; update the doc comment on
the BlockHashEntropyProvider type (and mention RandomSource) to add this caveat
and clarify the deterministic/debug-only nature of this provider.
- Around line 91-102: The guard that fatals when neither flagUseExecutionDataAPI
nor flagExecutionAddress is set can allow execution to continue and hit the
subsequent err check; update the remote client initialization in the block
around remoteClient so execution cannot proceed after the fatal: either add an
explicit return immediately after the log.Fatal() call in that else branch or
restructure the if/else to set err (or a sentinel) and handle it uniformly;
ensure the code around NewExecutionDataRemoteClient and
NewExecutionNodeRemoteClient leaves err defined before the later if err != nil
check and that remoteClient.Close() is only deferred when remoteClient was
successfully created.

In `@fvm/environment/programs.go`:
- Around line 95-101: The span attribute setting in
programs.tracer.StartChildSpan should mirror program_logger.go: replace the
span.Tracer != nil check with span.IsTraceable(), add a nil guard for the
location (e.g., if location != nil && span.IsTraceable()), and use
location.String() instead of location.ID() so the attribute key/value format is
consistent across FVM tracing; update the attribute.String call accordingly
inside programs.tracer.StartChildSpan handling.

In `@fvm/environment/system_contracts.go`:
- Around line 52-57: Replace the raw nil check on span.Tracer with the exported
helper to match package convention: use span.IsTraceable() as the guard before
calling span.SetAttributes(...). Update the condition around
span.SetAttributes(attribute.String("transaction.ContractFunctionCall",
contractLocation.String()+"."+spec.FunctionName)) so it executes only when
span.IsTraceable() returns true, keeping the SetAttributes call and attribute
construction unchanged.

In `@fvm/environment/value_store.go`:
- Around line 165-173: The tracing currently hex-encodes the entire value inside
the SetValue path (span.StartChildSpan(trace.FVMEnvSetValue) and
span.SetAttributes(... "value", hex.EncodeToString(value) ...)), which can be
expensive for large buffers; change the span attribute to avoid encoding the
full payload—e.g. log only the value length, a short truncated hex prefix, or a
hash (sha256) of value instead of hex.EncodeToString(value). Update the SetValue
span.SetAttributes call to use attribute.String("value_summary", ...) or
attribute.Int("value_len", len(value)) (or attribute.String("value_hash",
computedHash)) and remove the full-encoding to prevent performance and memory
overhead in production tracing.

In `@fvm/evm/handler/handler.go`:
- Around line 187-189: Replace the raw tracer nil-check with the exported helper
to keep consistency: instead of checking span.Tracer != nil before calling
span.SetAttributes(attribute.Int("tx_counts", len(rlpEncodedTxs))), use
span.IsTraceable() as the guard; locate the span usage in the EVM handler (the
block referencing rlpEncodedTxs and calling span.SetAttributes) and change the
condition to if span.IsTraceable() { ... } so it matches the pattern used in
system_contracts.go.

In `@utils/debug/cadence.go`:
- Around line 16-19: InterestingCadenceSpanExporter exposes mutable fields Spans
and Log which are mutated by ExportSpans and read by WriteSpans; decide whether
to prevent direct access by making Spans unexported (rename to spans) and
provide safe accessor methods (e.g., AddSpan, GetSpans/WriteSpans) or, if
concurrent ExportSpans/WriteSpans is possible, add a sync.Mutex to
InterestingCadenceSpanExporter and lock around mutations/reads of Spans (and Log
if concurrently accessed) to prevent data races; update ExportSpans and
WriteSpans to use the new methods/locks and keep Log access consistent.

In `@utils/debug/registers.go`:
- Around line 17-21: The SortRegisterIDs function uses an unnecessary anonymous
wrapper when calling slices.SortFunc; replace the inline func with the existing
CompareRegisterIDs function by passing CompareRegisterIDs directly to
slices.SortFunc (i.e., call slices.SortFunc(registerIDs, CompareRegisterIDs)) so
SortRegisterIDs simply delegates to CompareRegisterIDs without the extra
wrapper.
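Delegating directly works because a named comparison function already satisfies the `func(a, b E) int` parameter of `slices.SortFunc`; a self-contained sketch with a simplified `RegisterID` (the real type lives in the flow model):

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// RegisterID is a simplified stand-in for the real register ID type.
type RegisterID struct {
	Owner string
	Key   string
}

// CompareRegisterIDs orders register IDs by owner, then key.
func CompareRegisterIDs(a, b RegisterID) int {
	if c := strings.Compare(a.Owner, b.Owner); c != 0 {
		return c
	}
	return strings.Compare(a.Key, b.Key)
}

// SortRegisterIDs delegates directly — no anonymous wrapper needed.
func SortRegisterIDs(registerIDs []RegisterID) {
	slices.SortFunc(registerIDs, CompareRegisterIDs)
}

func main() {
	ids := []RegisterID{{"b", "y"}, {"a", "z"}, {"a", "x"}}
	SortRegisterIDs(ids)
	fmt.Println(ids)
}
```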

In `@utils/debug/remoteDebugger.go`:
- Around line 60-77: The call to fvm.WithAuthorizationChecksEnabled(false) is
redundant in the isSystemTransaction branch because NewRemoteDebugger already
applies it to the base context; decide intended behavior and remove the
duplicate: if you want non-system transactions to have auth checks enabled,
remove WithAuthorizationChecksEnabled(false) from the base context in
NewRemoteDebugger (so keep the call only in the isSystemTransaction block);
otherwise remove the entry from the isSystemTransaction append and leave
NewRemoteDebugger unchanged; update references in NewRemoteDebugger and this
file (NewContextFromParent / blockCtx / isSystemTransaction) accordingly to
ensure only the intended path disables authorization checks.

In `@utils/debug/remoteView.go`:
- Around line 29-40: The constructor NewExecutionNodeStorageSnapshot currently
returns (*ExecutionNodeStorageSnapshot, error) but never returns a non-nil
error; change its signature to return only *ExecutionNodeStorageSnapshot (remove
the error return), update the function body to return just
&ExecutionNodeStorageSnapshot{Client: client, BlockID: blockID}, and update all
call sites that expect two return values (and any error checks) to handle the
single return value accordingly (search for NewExecutionNodeStorageSnapshot
usages to modify callers).
- Around line 191-246: Introduce a named struct type (e.g., RegisterEntry) and
replace the anonymous struct slices in CapturingStorageSnapshot with
[]RegisterEntry for Reads and Writes; update the NewCapturingStorageSnapshot,
Get and Set methods to append RegisterEntry values instead of anonymous structs
and keep the existing logic that forwards Set to the UpdatableStorageSnapshot
backing when present (referencing CapturingStorageSnapshot, Reads, Writes,
RegisterEntry, Get, Set, and UpdatableStorageSnapshot to locate and modify
code).
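A sketch of the suggested named type (field shapes are assumptions; only the names `RegisterEntry`, `Reads`, `Writes`, `Get`, and `Set` come from the comment above):

```go
package main

import "fmt"

// RegisterEntry replaces the anonymous struct used for captured
// reads and writes.
type RegisterEntry struct {
	Owner string
	Key   string
	Value []byte
}

// CapturingStorageSnapshot records register accesses as named entries.
type CapturingStorageSnapshot struct {
	Reads  []RegisterEntry
	Writes []RegisterEntry
}

// Get records the read; the real method would also forward to the
// backing snapshot and return its value.
func (s *CapturingStorageSnapshot) Get(owner, key string) []byte {
	s.Reads = append(s.Reads, RegisterEntry{Owner: owner, Key: key})
	return nil
}

// Set records the write; the real method would also forward to the
// UpdatableStorageSnapshot backing when present.
func (s *CapturingStorageSnapshot) Set(owner, key string, value []byte) {
	s.Writes = append(s.Writes, RegisterEntry{Owner: owner, Key: key, Value: value})
}

func main() {
	s := &CapturingStorageSnapshot{}
	s.Get("0x1", "storage")
	s.Set("0x1", "storage", []byte{0x2a})
	fmt.Println(len(s.Reads), len(s.Writes))
}
```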

turbolent and others added 3 commits February 26, 2026 12:52
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/util/cmd/debug-tx/cmd.go`:
- Line 73: Validate the entropy-provider flag value instead of silently falling
back to "none": after reading flagEntropyProvider (set via
Cmd.Flags().StringVar) check it against the allowed set ("none", "block-hash")
and return a clear error (or fail the command via the command's RunE) if it's
not one of those values; update any other parsing sites (the code referenced
around lines handling entropy provider, e.g., the logic at 219-227) to perform
the same validation so unsupported values are rejected with an explanatory
message rather than ignored.
- Around line 139-143: The code currently allows continuing when neither
flagBlockID nor transaction IDs (args) are provided; update the logic in the cmd
handling (the block that checks flagBlockID and the later branch handling args
around the code referencing flagBlockID and args) to validate inputs and fail
fast: if flagBlockID == "" and len(args) == 0 return/log a fatal error with a
clear message like "must provide either --block-id or one or more transaction
IDs"; apply the same validation to the later branch (the code region referenced
at 188-213) so both code paths enforce the same required input check.
- Around line 69-72: Add runtime validation for the flag dependency so the
documented requirement for --log-cadence-traces is enforced: in the command
setup/Run logic for Cmd (where flags are read) check if flagLogTraces is true
and if so ensure both the global trace flag (flagTrace / --trace) and
flagOnlyTraceCadence are true; if either is missing return a clear error or
user-facing message and abort execution. Update the validation in the same spot
to cover the other related flags block (the range around the second group
115-137) so any combinations that violate the declared dependencies are rejected
early.
- Around line 121-126: Replace all usages of log.Fatal() in this file (including
the deferred span writer and the other affected locations) with the project's
irrecoverable error helper so deferred cleanup runs; import the irrecoverable
package and, where you currently do log.Fatal().Err(err).Msg("..."), call the
irrecoverable helper (passing the error and the same contextual message)
instead, e.g., in the deferred cadenceSpanExporter.WriteSpans block replace the
log.Fatal call with an irrecoverable invocation so remoteClient.Close(),
traceFile.Close(), and other defers still execute; do the same for the other
listed occurrences (lines that currently call log.Fatal for errors).
- Around line 242-243: The RPC calls currently use context.Background() (e.g.,
the GetTransactionResult call around txResult) which prevents
cancellation/timeout; update the command to use the CLI command context by
calling cmd.Context() and thread that context through all RPCs: change the
signatures of RunSingleTransaction, FetchBlockHeader, FetchBlockTransactions,
SubscribeBlockHeadersFromStartBlockID, and SubscribeBlockHeadersFromLatest to
accept a context.Context parameter, update their callers to pass cmd.Context(),
and replace any context.Background() uses inside those functions with the passed
context so all gRPC/SDK calls use the command context for cancellation and
timeouts.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0db99d and 67b401a.

📒 Files selected for processing (1)
  • cmd/util/cmd/debug-tx/cmd.go

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
fvm/environment/value_store.go (1)

165-173: Consider logging value length instead of the full value.

Storage values can be arbitrarily large (contract code, large data structures), and hex-encoding doubles the size. Logging the full value as a span attribute may:

  • Exceed tracing backend attribute size limits
  • Impact tracing system performance and storage costs
  • Potentially expose sensitive data in traces

Consider logging the value length instead, or truncating to a reasonable prefix.

Suggested approach
 	span := store.tracer.StartChildSpan(trace.FVMEnvSetValue)
 	if span.Tracer != nil {
 		span.SetAttributes(
 			attribute.String("owner", hex.EncodeToString(owner)),
 			attribute.String("key", hex.EncodeToString(keyBytes)),
-			attribute.String("value", hex.EncodeToString(value)),
+			attribute.Int("value_len", len(value)),
 		)
 	}
 	defer span.End()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@fvm/environment/value_store.go` around lines 165 - 173, The span attribute
currently logs the full hex-encoded value in the StartChildSpan created by
store.tracer.StartChildSpan(trace.FVMEnvSetValue); change the attribute for
"value" to record its length (e.g., len(value) or len(keyBytes) as appropriate)
or a truncated prefix instead of the full hex string to avoid huge trace
attributes; update the attribute.String call for "value" to use a stringified
length or a safe truncated hex prefix and keep the other attributes (owner, key)
intact while preserving the span.End() defer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@fvm/environment/value_store.go`:
- Around line 165-173: The span attribute currently logs the full hex-encoded
value in the StartChildSpan created by
store.tracer.StartChildSpan(trace.FVMEnvSetValue); change the attribute for
"value" to record its length (e.g., len(value) or len(keyBytes) as appropriate)
or a truncated prefix instead of the full hex string to avoid huge trace
attributes; update the attribute.String call for "value" to use a stringified
length or a safe truncated hex prefix and keep the other attributes (owner, key)
intact while preserving the span.End() defer.

ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between 67b401a and c1ea914.

📒 Files selected for processing (3)
  • fvm/environment/value_store.go
  • utils/debug/registerCache.go
  • utils/debug/registerCache_test.go

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (4)
cmd/util/cmd/debug-tx/cmd.go (4)

188-213: ⚠️ Potential issue | 🟡 Minor

Fail fast when neither --block-id nor transaction IDs are provided.

When flagBlockID is empty and args is empty, the function silently exits without doing any work. This can mask operator mistakes.

Suggested fix at the start of the else branch
 	} else {
 		// No block ID provided, proceed with transaction IDs from args
+
+		if len(args) == 0 {
+			log.Fatal().Msg("Provide --block-id or at least one transaction ID")
+		}
 
 		for _, rawTxID := range args {

As per coding guidelines: "Treat all inputs as potentially byzantine and classify errors in a context-dependent manner."


🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-tx/cmd.go` around lines 188 - 213, The else branch that
iterates transaction IDs must first validate inputs and fail fast if no
transaction IDs were provided: add a guard at the start of the else branch that
checks if len(args) == 0 (and flagBlockID is empty) and then log an error and
exit (use log.Fatal or equivalent) with a clear message like "no --block-id or
transaction IDs provided"; keep the rest of the loop (where
RunSingleTransaction, flow.HexStringToIdentifier, and debug.WriteResult are
used) unchanged so behavior stays identical when args are present.

69-71: ⚠️ Potential issue | 🟡 Minor

Enforce the declared flag dependencies at runtime.

The --log-cadence-traces flag is documented as requiring both --trace and --only-trace-cadence, but this constraint is not validated. Users providing invalid flag combinations will get unexpected behavior.

Suggested fix in the run function
 func run(_ *cobra.Command, args []string) {
+	if flagLogTraces && (flagTracePath == "" || !flagOnlyTraceCadence) {
+		log.Fatal().Msg("--log-cadence-traces requires both --trace and --only-trace-cadence")
+	}
+	if flagOnlyTraceCadence && flagTracePath == "" {
+		log.Fatal().Msg("--only-trace-cadence requires --trace")
+	}
+
 	chainID := flow.ChainID(flagChain)

As per coding guidelines: "Treat all inputs as potentially byzantine and classify errors in a context-dependent manner."


🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-tx/cmd.go` around lines 69 - 71, Add runtime validation in
the command's Run/Execute function to enforce the documented dependency: if
flagLogTraces is true then require both flagTrace and flagOnlyTraceCadence to be
true; if the check fails, return a user-facing error (or print usage) explaining
that --log-cadence-traces requires --trace and --only-trace-cadence. Locate the
check near the start of the existing Run function (where flags are read) and
reference the flag variables flagLogTraces, flagTrace and flagOnlyTraceCadence
to perform the boolean validation and produce a clear error message when the
combination is invalid.

238-250: ⚠️ Potential issue | 🟠 Major

Propagate context to RPC calls instead of context.Background().

Using context.Background() at line 248 prevents cancellation and timeout control. If the network stalls, the CLI will hang indefinitely. Consider accepting a context.Context parameter and passing it through from cmd.Context().

Suggested signature change
 func RunSingleTransaction(
+	ctx context.Context,
 	remoteClient debug.RemoteClient,
 	txID flow.Identifier,
 	flowClient *client.Client,
 	chain flow.Chain,
 	spanExporter otelTrace.SpanExporter,
 	computeLimit uint64,
 ) debug.Result {
 	log.Info().Msgf("Fetching transaction result for %s ...", txID)

-	txResult, err := flowClient.GetTransactionResult(context.Background(), sdk.Identifier(txID))
+	txResult, err := flowClient.GetTransactionResult(ctx, sdk.Identifier(txID))


🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-tx/cmd.go` around lines 238 - 250, The
RunSingleTransaction function currently uses context.Background() for RPCs (see
flowClient.GetTransactionResult), which prevents cancellation/timeouts; change
the RunSingleTransaction signature to accept a ctx context.Context parameter and
replace all uses of context.Background() inside (e.g.,
flowClient.GetTransactionResult, any other RPCs in RunSingleTransaction) with
that ctx; update callers to pass cmd.Context() (or the calling context) into
RunSingleTransaction so cancellations and deadlines propagate correctly.

121-126: ⚠️ Potential issue | 🟠 Major

log.Fatal() in defer prevents proper cleanup.

Using log.Fatal() inside a deferred function (line 124) causes immediate process exit, bypassing the LIFO execution of other defers like remoteClient.Close() (line 102) and traceFile.Close() (line 112). This can cause resource leaks.

Suggested fix
 		defer func() {
 			err = cadenceSpanExporter.WriteSpans(traceFile)
 			if err != nil {
-				log.Fatal().Err(err).Msg("Failed to write spans")
+				log.Error().Err(err).Msg("Failed to write spans")
+				// Note: Consider returning error from run() or using a package-level exit code
 			}
 		}()

As per coding guidelines: "Use the irrecoverable package for exception handling instead of fmt.Errorf; always explicitly handle errors."


🧹 Nitpick comments (2)
cmd/util/cmd/debug-tx/cmd.go (2)

528-537: Consider adding a security note about deterministic entropy.

The BlockHashEntropyProvider uses block hash as entropy, which is deterministic and predictable. This is appropriate for debugging/replay scenarios where reproducibility is desired, but the doc comment should clarify this is NOT suitable for production use requiring cryptographic randomness.

Suggested doc improvement
 // BlockHashEntropyProvider implements environment.EntropyProvider
 // which provides a source of entropy to fvm context (required for Cadence's randomness),
 // by using the given block hash.
+//
+// NOTE: This provider is intended for debugging/replay purposes only.
+// The entropy is deterministic and predictable, making it unsuitable for
+// production use cases requiring cryptographic randomness.
 type BlockHashEntropyProvider struct {
 	BlockHash flow.Identifier
 }

Based on learnings: "Ensure cryptographic operations are handled carefully per the crypto library documentation."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-tx/cmd.go` around lines 528 - 537, Update the doc comment
for BlockHashEntropyProvider to explicitly warn that it uses the deterministic
block hash as its entropy source and is only intended for debugging/replay and
not for production or any cryptographic use; mention the struct name
BlockHashEntropyProvider, the BlockHash field, and the RandomSource() method so
reviewers can find and update the comment to state that this source is
predictable and unsuitable for security-sensitive randomness.

489-490: Replace context.TODO() with a proper context parameter.

context.TODO() explicitly marks incomplete context handling. This function should accept a context.Context parameter to enable proper cancellation and tracing parent-child relationships.

 func RunTransaction(
+	ctx context.Context,
 	tx *sdk.Transaction,
 	snapshot debug.StorageSnapshot,
 	...
 ) debug.Result {
 	...
-	span, _ := tracer.StartTransactionSpan(context.TODO(), flow.Identifier(tx.ID()), "")
+	span, _ := tracer.StartTransactionSpan(ctx, flow.Identifier(tx.ID()), "")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-tx/cmd.go` around lines 489 - 490, The call to
tracer.StartTransactionSpan currently uses context.TODO(), which must be
replaced by a propagated context parameter; update the enclosing function (the
function containing the span creation) to accept a context.Context (e.g., ctx)
and use that ctx in tracer.StartTransactionSpan(ctx, flow.Identifier(tx.ID()),
""); propagate ctx through any callers (or add context creation at call sites)
so cancellation and trace parent-child relationships work correctly, and keep
the defer span.End() as-is.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@cmd/util/cmd/debug-tx/cmd.go`:
- Around line 188-213: The else branch that iterates transaction IDs must first
validate inputs and fail fast if no transaction IDs were provided: add a guard
at the start of the else branch that checks if len(args) == 0 (and flagBlockID
is empty) and then log an error and exit (use log.Fatal or equivalent) with a
clear message like "no --block-id or transaction IDs provided"; keep the rest of
the loop (where RunSingleTransaction, flow.HexStringToIdentifier, and
debug.WriteResult are used) unchanged so behavior stays identical when args are
present.
- Around line 69-71: Add runtime validation in the command's Run/Execute
function to enforce the documented dependency: if flagLogTraces is true then
require both flagTrace and flagOnlyTraceCadence to be true; if the check fails,
return a user-facing error (or print usage) explaining that --log-cadence-traces
requires --trace and --only-trace-cadence. Locate the check near the start of
the existing Run function (where flags are read) and reference the flag
variables flagLogTraces, flagTrace and flagOnlyTraceCadence to perform the
boolean validation and produce a clear error message when the combination is
invalid.
- Around line 238-250: The RunSingleTransaction function currently uses
context.Background() for RPCs (see flowClient.GetTransactionResult), which
prevents cancellation/timeouts; change the RunSingleTransaction signature to
accept a ctx context.Context parameter and replace all uses of
context.Background() inside (e.g., flowClient.GetTransactionResult, any other
RPCs in RunSingleTransaction) with that ctx; update callers to pass
cmd.Context() (or the calling context) into RunSingleTransaction so
cancellations and deadlines propagate correctly.

---

Nitpick comments:
In `@cmd/util/cmd/debug-tx/cmd.go`:
- Around line 528-537: Update the doc comment for BlockHashEntropyProvider to
explicitly warn that it uses the deterministic block hash as its entropy source
and is only intended for debugging/replay and not for production or any
cryptographic use; mention the struct name BlockHashEntropyProvider, the
BlockHash field, and the RandomSource() method so reviewers can find and update
the comment to state that this source is predictable and unsuitable for
security-sensitive randomness.
- Around line 489-490: The call to tracer.StartTransactionSpan currently uses
context.TODO(), which must be replaced by a propagated context parameter; update
the enclosing function (the function containing the span creation) to accept a
context.Context (e.g., ctx) and use that ctx in tracer.StartTransactionSpan(ctx,
flow.Identifier(tx.ID()), ""); propagate ctx through any callers (or add context
creation at call sites) so cancellation and trace parent-child relationships
work correctly, and keep the defer span.End() as-is.

ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between c1ea914 and 8186987.

📒 Files selected for processing (1)
  • cmd/util/cmd/debug-tx/cmd.go

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (2)
cmd/util/cmd/debug-tx/cmd.go (2)

122-126: ⚠️ Potential issue | 🟠 Major

Avoid log.Fatal in deferred and helper paths; return/propagate errors instead.

At Line 125, log.Fatal inside a defer can abort before later defers (e.g., traceFile.Close, remoteClient.Close) run. The same pattern across exported helpers also hard-exits the process instead of letting callers manage failures and cleanup.

As per coding guidelines: "Use the irrecoverable package for exception handling instead of fmt.Errorf; always explicitly handle errors and never log and continue on a best-effort basis".

Also applies to: 236-239, 257-258, 304-306, 316-317, 332-333, 360-361, 382-383, 401-402, 497-498, 526-527

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-tx/cmd.go` around lines 122 - 126, The deferred block
calling cadenceSpanExporter.WriteSpans(traceFile) must not call log.Fatal;
instead return or propagate the error so callers can clean up other defers
(e.g., traceFile.Close, remoteClient.Close) and handle failures. Change the
defer to capture the write error and assign/append it to the function's error
return (or send it on an error channel) so the outer function can return that
error; do the same for other helper/deferred writes mentioned (lines around the
cadenceSpanExporter.WriteSpans call and the similar occurrences at the other
listed locations). Locate uses of cadenceSpanExporter.WriteSpans, traceFile, and
any helper functions that currently call log.Fatal and replace those fatal logs
with error propagation to the caller.

69-72: ⚠️ Potential issue | 🟡 Minor

Enforce trace flag dependencies at runtime.

--log-cadence-traces / --only-trace-cadence can be set without --trace, and the command then silently skips trace setup. Please fail fast on invalid combinations.

Suggested fix
+	if flagLogTraces && (flagTracePath == "" || !flagOnlyTraceCadence) {
+		log.Fatal().Msg("--log-cadence-traces requires both --trace and --only-trace-cadence")
+	}
+	if flagOnlyTraceCadence && flagTracePath == "" {
+		log.Fatal().Msg("--only-trace-cadence requires --trace")
+	}

Based on learnings: "Applies to **/*.go : Treat all inputs as potentially byzantine and classify errors in a context-dependent manner; no code path is safe unless explicitly proven and documented".

Also applies to: 116-139

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/util/cmd/debug-tx/cmd.go` around lines 69 - 72, The two flags
flagLogTraces and flagOnlyTraceCadence can be set while tracing is disabled,
causing silent no-ops; add a runtime validation in the command startup (e.g., in
Cmd.PreRunE or the RunE handler for Cmd) that checks the boolean flagTrace (the
main --trace flag) and returns a user-facing error if either flagLogTraces or
flagOnlyTraceCadence is true while flagTrace is false; use a clear message like
"must pass --trace when using --log-cadence-traces or --only-trace-cadence" and
ensure the check runs before any trace setup logic (references: Cmd,
flagLogTraces, flagOnlyTraceCadence, flagTrace).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@cmd/util/cmd/debug-tx/cmd.go`:
- Around line 122-126: The deferred block calling
cadenceSpanExporter.WriteSpans(traceFile) must not call log.Fatal; instead
return or propagate the error so callers can clean up other defers (e.g.,
traceFile.Close, remoteClient.Close) and handle failures. Change the defer to
capture the write error and assign/append it to the function's error return (or
send it on an error channel) so the outer function can return that error; do the
same for other helper/deferred writes mentioned (lines around the
cadenceSpanExporter.WriteSpans call and the similar occurrences at the other
listed locations). Locate uses of cadenceSpanExporter.WriteSpans, traceFile, and
any helper functions that currently call log.Fatal and replace those fatal logs
with error propagation to the caller.
- Around line 69-72: The two flags flagLogTraces and flagOnlyTraceCadence can be
set while tracing is disabled, causing silent no-ops; add a runtime validation
in the command startup (e.g., in Cmd.PreRunE or the RunE handler for Cmd) that
checks the boolean flagTrace (the main --trace flag) and returns a user-facing
error if either flagLogTraces or flagOnlyTraceCadence is true while flagTrace is
false; use a clear message like "must pass --trace when using
--log-cadence-traces or --only-trace-cadence" and ensure the check runs before
any trace setup logic (references: Cmd, flagLogTraces, flagOnlyTraceCadence,
flagTrace).

ℹ️ Review info


📥 Commits

Reviewing files that changed from the base of the PR and between 8186987 and 82c5ccf.

📒 Files selected for processing (2)
  • cmd/util/cmd/debug-script/cmd.go
  • cmd/util/cmd/debug-tx/cmd.go
