feat(rfc): sharded dag with virtual blocks RFC by hannahhoward · Pull Request #66 · storacha/RFC

hannahhoward · 2025-09-08T03:45:16Z

alanshaw

Yes, this sounds great ❤️

Something I realised is that we could remove the caching of these blocks that we do in freeway and any cost associated with that!

alanshaw · 2025-09-08T10:13:50Z

rfc/sharded-virtual-dag.md

+type ShardedDagIndex_0_2 struct {
+  content Link
+  shards [Link]
+  blocks [Link]


Array of multihash digests rather than links?

I wonder if we should model/phrase this as a "merkle proof" to prevent arbitrary blocks from being attached?

i.e. `blocks` MUST include `content` as the first block. Specify that other blocks MUST be linked from the root or one of its children.

We should specify that blocks MUST NOT include DAG leaf blocks. i.e. no IPLD raw blocks (a reason to leave it as CID not hash).

Consider renaming blocks to proofs?
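The rules suggested above could be checked mechanically. The following is only a sketch under stated assumptions: `Block`, `validate_index_blocks`, and the in-memory model (a CID string, a codec number, and a list of child CIDs) are hypothetical and not part of the RFC, and "linked from the root or one of its children" is read here as "reachable from the root through other included blocks".

```python
from dataclasses import dataclass, field

RAW_CODEC = 0x55     # multicodec code for IPLD raw blocks
DAG_PB_CODEC = 0x70  # multicodec code for dag-pb (UnixFS nodes)


@dataclass
class Block:
    """Hypothetical, simplified view of a block listed in the index."""
    cid: str
    codec: int
    links: list[str] = field(default_factory=list)


def validate_index_blocks(content: str, blocks: list[Block]) -> list[str]:
    """Return a list of rule violations; an empty list means the checks pass."""
    if not blocks or blocks[0].cid != content:
        return ["first block must be the content root"]

    by_cid = {b.cid: b for b in blocks}

    # Walk from the root, following only links that resolve to included blocks.
    reachable: set[str] = set()
    stack = [content]
    while stack:
        cid = stack.pop()
        if cid in reachable or cid not in by_cid:
            continue
        reachable.add(cid)
        stack.extend(by_cid[cid].links)

    errors = []
    for b in blocks:
        if b.codec == RAW_CODEC:
            errors.append(f"{b.cid}: raw leaf blocks are not allowed")
        if b.cid not in reachable:
            errors.append(f"{b.cid}: not linked from the root or its descendants")
    return errors
```

Keeping the entries as CIDs rather than bare multihash digests is what makes the raw-codec check possible here, since the codec is not recoverable from a digest alone.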

alanshaw · 2025-09-08T10:14:24Z

rfc/sharded-virtual-dag.md

+
+The `blocks` attribute in a sharded DAG index v0.2 block is simply a collection of links to blocks that are included in this index file. Since a Sharded DAG Index is a CAR file, these blocks are simply inserted into the CAR file.
+
+**IMPORTANT**: While small, this represents a breaking change for retrieval clients. Because the blocks included in the index are no longer in the underlying shard file, a retrieval client MUST be able to read the additional blocks out of the Sharded DAG Index CAR file in order to perform the retrieval. This is why we change the version of the Sharded DAG Index and list the included blocks in the Sharded DAG Index v0.2 root block.
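To illustrate the breaking change for retrieval clients, here is a minimal sketch of a block lookup that consults the index CAR before falling back to a shard. Everything in it is an assumption for illustration: `get_block`, the `index_blocks` and `shard_positions` dictionaries, and the `fetch_range` callback are hypothetical names, and real clients would work with CIDs and multihashes rather than plain strings.

```python
from typing import Callable, Dict, Tuple


def get_block(
    cid: str,
    index_blocks: Dict[str, bytes],
    shard_positions: Dict[str, Tuple[str, int, int]],
    fetch_range: Callable[[str, int, int], bytes],
) -> bytes:
    """Resolve a block, preferring blocks carried inside the index CAR."""
    # v0.2 behaviour: blocks listed in the index root are read straight
    # from the index CAR, so no request to a shard is needed.
    if cid in index_blocks:
        return index_blocks[cid]

    # Otherwise fall back to a range request into the shard that holds it.
    if cid in shard_positions:
        shard, offset, length = shard_positions[cid]
        return fetch_range(shard, offset, length)

    raise KeyError(f"block {cid} is in neither the index nor any shard")
```

A client that only implements the second branch would fail on any DAG whose intermediate blocks live solely in the index, which is why the version bump matters.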


alanshaw · 2025-09-08T10:17:40Z

rfc/sharded-virtual-dag.md

+
+## Benefits
+
+Adding the ability to put blocks directly into the Sharded DAG Index would enable us to store files as blobs in their original format (up to the shard size), while maintaining full UnixFS compatibility.


You could also go the other way and store a small file in the index and only upload 1 blob... 🐢

alanshaw · 2025-09-08T10:18:19Z

rfc/sharded-virtual-dag.md

+
+This would provide much faster RTT when using w3s.link -- the block could be returned directly from the location claim, and with no range request. Because the blob hash is sha256, it could be verified directly by the browser's various data integrity tools.
+
+This approach could enable other optimizations as well. For many UnixFS directories, the entire directory structure could live in the sharded DAG Index. This would enable deep pathing in only two roundtrips -- one to fetch the sharded dag index, and one to fetch the underlying file.
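As a rough illustration of the integrity point: if a file is stored as a single blob addressed by its sha256 digest, checking it needs nothing more than a standard hash over the whole response. The sketch below assumes the expected digest is already known from the location claim; `verify_blob` and `sri_sha256` are hypothetical helper names, and Subresource Integrity is just one example of a browser integrity tool that consumes a `sha256-<base64>` string.

```python
import base64
import hashlib


def verify_blob(data: bytes, expected_digest: bytes) -> bool:
    """Check a whole blob against a raw sha256 digest."""
    return hashlib.sha256(data).digest() == expected_digest


def sri_sha256(data: bytes) -> str:
    """Format the sha256 digest as a Subresource Integrity style string."""
    digest = hashlib.sha256(data).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")
```

No CAR parsing or per-block verification is involved, which is the property the paragraph above relies on.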


👏 👏 👏 OMG fast directory listings for HAMT

alanshaw · 2025-09-08T10:37:30Z

rfc/sharded-virtual-dag.md

+type ShardedDagIndex_0_2 struct {
+  content Link
+  shards [Link]
+  blocks [Link]


Another idea: structure the block links:

e.g.

{
  content: { "/": "bafyroot" },
  proof: [
    { "/": "bafyroot" },
    [ /** children of bafyroot */
      [{ "/": "bafyblock0" }, [/** more intermediate children */]],
      [{ "/": "bafyblock1" }, [/** more intermediate children */]]
    ]
  ]
}

i.e. proof does the same thing as blocks by specifying the included blocks, but also encodes the DAG structure? I'm not sure if useful though and seems a little complicated...

yea I originally imagined various more complicated listings, then decided just listing the blocks and including them is probably the right move.
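For comparison, the nested form carries the same set of links as the flat list; a pre-order walk recovers it. This is a sketch only, assuming each node is a `[link, children]` pair as in the example above; `flatten_proof` is a hypothetical helper, not something proposed in the RFC.

```python
def flatten_proof(node) -> list:
    """Flatten a nested [link, children] proof into a flat list of CID strings."""
    link, children = node
    flat = [link["/"]]  # parent first, so the root ends up at index 0
    for child in children:
        flat.extend(flatten_proof(child))
    return flat
```

Because the walk is pre-order, the root always comes first, so the flat list still satisfies a "content is the first block" rule.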

alanshaw · 2025-09-08T11:42:36Z

rfc/sharded-virtual-dag.md

+
+```
+
+The `blocks` attribute in a sharded DAG index v0.2 block is simply a collection of links to blocks that are included in this index file. Since a Sharded DAG Index is a CAR file, these blocks are simply inserted into the CAR file.


Worth mentioning that these blocks MUST/SHOULD/MAY(?) NOT be included in any shard the index references?
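Whichever keyword is chosen, a writer could detect the overlap before publishing an index. A minimal sketch, assuming the index blocks and each shard's contents are available as sets of identifiers; `overlapping_blocks` and both parameter names are hypothetical.

```python
def overlapping_blocks(index_blocks: set, shard_contents: dict) -> dict:
    """Map each shard to the index-embedded blocks it also contains."""
    overlaps = {}
    for shard, hashes in shard_contents.items():
        dupes = index_blocks & hashes
        if dupes:
            overlaps[shard] = dupes
    return overlaps
```

An empty result means no block is stored both in the index and in a referenced shard.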

feat(rfc): sharded dag with virtual blocks RFC

b1f01da

alanshaw approved these changes Sep 8, 2025


alanshaw reviewed Sep 8, 2025


hannahhoward mentioned this pull request Jan 5, 2026

rfc: Guppy Retrieval Strategy #77

Open

hannahhoward wants to merge 1 commit into ash/rfc/filepack from rfc/virtual-dag
