rfc: on-demand service authorization #68

Open
alanshaw wants to merge 2 commits into main from ash/rfc/service-authz

Member

alanshaw commented Oct 1, 2025

📖 Preview

Authorizing services to talk to each other is happening more and more often as we build out the network. Our current approach is to pre-authorize services, storing long lived delegations for use when needed.

This is becoming onerous, and this PR proposes one solution - on-demand authorization.

alanshaw requested a review from a team October 1, 2025 16:17
Member

Peeja commented Oct 1, 2025

> For example, space/index/add causes the Indexing Service to fetch an index and add the hashes to its IPNI chain (for internal network resolution). It is not authorized to fetch the index, so it cannot fetch it!

Can we just send an additional delegation along with this invocation? The issuer of the space/index/add is authorized to retrieve (right?), so they can delegate that to the Indexing Service.

Separately, maybe: is the retrieval an effect? It appears to be part of the process of space/index/add, just as blob/allocate is part of space/blob/add. The difference is that blob/allocate happens to have a different resource (the storage provider) than its parent, and the issuer (Storacha) is definitely already authorized before the process even begins, whereas here the resource is the space. I have no idea whether this thought actually has any bearing on passing along the authority, but it seems like it might?

Member Author

alanshaw commented Oct 1, 2025

> > For example, space/index/add causes the Indexing Service to fetch an index and add the hashes to its IPNI chain (for internal network resolution). It is not authorized to fetch the index, so it cannot fetch it!

> Can we just send an additional delegation along with this invocation? The issuer of the space/index/add is authorized to retrieve (right?), so they can delegate that to the Indexing Service.

Yes in this case we can do that assuming we're ok with charging the user for egress.

It's a bunch of work to add to two clients, the upload service and the indexing service though...

> Separately, maybe: is the retrieval an effect? It appears to be part of the process of space/index/add, just as blob/allocate is part of space/blob/add. The difference is that blob/allocate happens to have a different resource (the storage provider) than its parent, and the issuer (Storacha) is definitely already authorized before the process even begins, whereas here the resource is the space. I have no idea whether this thought actually has any bearing on passing along the authority, but it seems like it might?

Um yeah I guess it is an effect, but it's not async - it'll have been completed by the time the response is received. I don't think this has bearing on passing authority like you say.

Member

frrist commented Oct 1, 2025

Just want to check my understanding here.

The idea is that services can request capabilities on-demand from other services using access/authorize, and the recipient decides whether to issue a delegation based on their allow-list of trusted DIDs, and other data in the request?

Aim here is to replace pre-configuring long-lived delegations, like what we do in piri for the indexing, upload, and egress service?


travis commented Oct 1, 2025

I'm not really following how this will work - could you add an example of the full flow somewhere to really spell it out? I think the client will need to send an access/authorize invocation to the service. Will they then receive the requested delegations and need to subsequently send them back, or is the idea that the service will hold onto the delegations and use them as needed?

Member

Peeja commented Oct 1, 2025

> Yes in this case we can do that assuming we're ok with charging the user for egress.

Ah, and there's the rub. Are we? Are we not? Is there something so special about this principal retrieving the index data that it shouldn't incur a charge? Or is there something special about index data such that it should be freely egress-able? Or is the amount of egress here so minimal that it's not worth making exceptions?

> It's a bunch of work to add to two clients, the upload service and the indexing service though...

It certainly is, but this sounds like a lot of work too. I'm not sure there's a way around that.

Member

frrist commented Oct 1, 2025

> Yes in this case we can do that assuming we're ok with charging the user for egress.

Or we can filter out receipts/invocation during egress consolidation that are related to these sorts of events.

Member Author

alanshaw commented Oct 2, 2025

> > Yes in this case we can do that assuming we're ok with charging the user for egress.

> Or we can filter out receipts/invocation during egress consolidation that are related to these sorts of events.

I think it's possible, but not easy to distinguish.

Member

hannahhoward left a comment

Broadly I agree with all of this.

One thing we don't get into here are session lengths -- IOW, if I call access/authorize, how long lived is my delegation? I believe for the original protocol access/authorize on the upload service issues a delegation that is quite long lived. I would think there's not a huge issue in continuing to re-authorize periodically.

Overall, very reasonable direction I concur.


### Add `blob/retrieve` "service" capability

I'm proposing `blob/retrieve` as a new capability that allows retrieving a blob in full. It will be used for internal operations like replication, repair, indexing and filecoin onboarding - egress MUST NOT be recorded for these invocations.
Member

The current replication protocol contains a blob/replica/transfer invocation, which is issued as a fork in the receipt for blob/replica/allocate. Was the intent there to have that actually be sent from the node that will receive the replica to the source node in order to replicate? If so, maybe for at least that case no new capability is needed. I never was quite sure what blob/replica/transfer was for (other than maybe it was to simply track the copying).

Member Author

It is for tracking the copying.

The storage node has no authority to invoke blob/replica/transfer on the node hosting the data, so we still need to get a delegation for doing so.

Member Author

alanshaw commented Oct 2, 2025

> > Yes in this case we can do that assuming we're ok with charging the user for egress.

> Ah, and there's the rub. Are we? Are we not?

In this case (from the review here) Hannah is ok with the customer paying egress.

The point is more that there are instances where it "is intended" to be free, or where the customer isn't around to provide a new delegation - like a new filecoin SP retrieving data because a deal has expired and the previous SP is not taking deals anymore (i.e. you can't "re-use" the data).

> Is there something so special about this principal retrieving the index data that it shouldn't incur a charge? Or is there something special about index data such that it should be freely egress-able? Or is the amount of egress here so minimal that it's not worth making exceptions?

Not more so than, for example, it being free because you asked for it to be replicated. i.e. this would be - it was free because you asked for it to be indexed.

It seems a little arbitrary - personally I would always charge the customer for egress. However there does seem to be a need to authorize access to data that does not originate from the client. Or rather, a mechanism that isn't a long lived delegation from the customer made in advance. Determining whether we charge for the invocation is not the goal of this RFC.

Holding a long lived delegation ties us in knots - we have to keep it safe forever ($$$) and when we need to change that delegation (add a new capability for example) we can't bring everyone with us, because we can't simply "get a new one" from all the customers.

The idea here is that we're leveraging authority over data by virtue of holding (hosting) it. This is different from the customers' authority over the data. 🤷‍♂️ perhaps it's a bad idea, but the proposal to mitigate concerns is that delegations would be issued only when the context is known and verified i.e. the cause should be a valid invocation that provides context for the delegation.
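
To make "issued only when the context is known and verified" a bit more concrete, here is a minimal sketch of the check a storage node might run before issuing a delegation. Everything in it is an assumption for illustration: the `Invocation` and `GrantRequest` shapes, the `justifies` policy table and the `resolve` lookup are hypothetical names, and signature verification of the cause is left out entirely.

```typescript
// Hypothetical shapes for illustration; not the RFC's wire format.
interface Invocation {
  cid: string    // CID of the invocation
  issuer: string // DID of whoever issued it
  can: string    // ability invoked, e.g. "assert/index"
}

interface GrantRequest {
  requester: string // DID asking for a delegation
  can: string       // ability requested, e.g. "blob/retrieve"
  cause?: string    // CID of the invocation that motivates the request
}

type Decision = { ok: true } | { ok: false; reason: string }

// Assumed policy table: which cause abilities justify which requested abilities.
const justifies: Record<string, string[]> = {
  "assert/index": ["blob/retrieve"],
}

// `resolve` is an assumed lookup that returns an already verified invocation
// for a CID (or undefined). Signature checks are out of scope here.
function shouldGrant(
  req: GrantRequest,
  resolve: (cid: string) => Invocation | undefined
): Decision {
  if (req.cause === undefined) return { ok: false, reason: "missing cause" }
  const cause = resolve(req.cause)
  if (cause === undefined) return { ok: false, reason: "unknown cause" }
  const allowed = justifies[cause.can] || []
  if (allowed.indexOf(req.can) === -1) {
    return { ok: false, reason: "cause does not justify requested ability" }
  }
  return { ok: true }
}
```

The point of the table is that a request without a cause, or with a cause that has nothing to do with the requested ability, is refused rather than granted by default.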

> > It's a bunch of work to add to two clients, the upload service and the indexing service though...

> It certainly is, but this sounds like a lot of work too. I'm not sure there's a way around that.

Yes I anticipate around the same amount of work either way but I'm betting on doing this saving us work, maintenance and problems in the future.

Member Author

alanshaw commented Oct 2, 2025

> I'm not really following how this will work - could you add an example of the full flow somewhere to really spell it out? I think the client will need to send an access/authorize invocation to the service. Will they then receive the requested delegations and need to subsequently send them back, or is the idea that the service will hold onto the delegations and use them as needed?

Typically you'd use the delegation you receive as proof in an invocation. So yes, you'd be "sending them back", but you'd also hold on to them so you can use them in subsequent invocations.

Here's how it might work if the indexing service were to request an authorization from a storage node for accessing an index for caching and adding to IPNI:

sequenceDiagram
    participant customer
    participant uploadService
    participant indexingService
    participant storageNode
    customer->>uploadService: space/index/add
    uploadService->>indexingService: assert/index
    indexingService->>storageNode: access/authorize { can: "blob/retrieve", cause: cid(assert/index) }
    storageNode->>indexingService: { ok: Delegation(blob/retrieve) }
    indexingService->>storageNode: blob/retrieve { blob: zQm... }
    storageNode->>indexingService: bytes(sharded DAG index)
    indexingService->>uploadService: { ok: {} }
    uploadService->>customer: { ok: {} }

Subsequent invocations from the indexing service to the same storage node can use the same delegation.
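
As a rough illustration of holding on to the delegation, the indexing service could keep one delegation per storage node and only go back to access/authorize on a miss. This is a sketch under assumptions: `Delegation`, `RequestGrant` and `makeGrantCache` are hypothetical names, `requestGrant` stands in for the whole access/authorize round trip, and expiry is ignored.

```typescript
// Hypothetical shape; a real delegation is a signed UCAN.
interface Delegation {
  issuer: string   // storage node DID
  audience: string // indexing service DID
  can: string      // e.g. "blob/retrieve"
}

// Stand-in for the access/authorize round trip to a storage node.
type RequestGrant = (nodeDID: string, cause: string) => Delegation

// Returns a function that yields a delegation for a storage node,
// calling `requestGrant` only the first time each node is seen.
function makeGrantCache(requestGrant: RequestGrant) {
  const byNode = new Map<string, Delegation>()
  return function grantFor(nodeDID: string, cause: string): Delegation {
    const cached = byNode.get(nodeDID)
    if (cached !== undefined) return cached
    const fresh = requestGrant(nodeDID, cause)
    byNode.set(nodeDID, fresh)
    return fresh
  }
}
```

With this shape, the first `assert/index` that touches a node pays for the extra round trip and later ones reuse the cached delegation as proof.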

Member Author

alanshaw commented Oct 2, 2025

> IOW, if I call access/authorize, how long lived is my delegation? I believe for the original protocol access/authorize on the upload service issues a delegation that is quite long lived. I would think there's not a huge issue in continuing to re-authorize periodically.

Yes, good question. IIRC the eventual attestation is valid for 1 year. For this I was thinking to delegate for a much shorter time - 1 hour seems reasonable.
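
A small sketch of what a short lifetime could mean in practice, assuming the delegation carries a UCAN-style `exp` in Unix seconds. The one hour value is only the figure floated here, and `Grant`, `issue`, `needsRenewal` and the 60 second margin are hypothetical.

```typescript
// One hour in seconds: the lifetime suggested in this thread, not a decided value.
const ONE_HOUR = 60 * 60

// Hypothetical, trimmed-down view of a delegation.
interface Grant {
  can: string
  exp: number // expiry as Unix time in seconds
}

// Issue a grant that expires `ttl` seconds after `now`.
function issue(can: string, now: number, ttl: number = ONE_HOUR): Grant {
  return { can, exp: now + ttl }
}

// Renew slightly early so a delegation does not expire mid-request.
function needsRenewal(grant: Grant, now: number, margin: number = 60): boolean {
  return now >= grant.exp - margin
}
```

A holder would call `needsRenewal` before each invocation and only repeat the authorization round trip when it returns true, so most requests pay no extra latency.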

Member Author

alanshaw commented Oct 2, 2025

> Just want to check my understanding here.

> The idea is that services can request capabilities on-demand from other services using access/authorize, and the recipient decides whether to issue a delegation based on their allow-list of trusted DIDs, and other data in the request?

Yes correct. I actually thought of this because of your comment here #67 (comment)

> Aim here is to replace pre-configuring long-lived delegations, like what we do in piri for the indexing, upload, and egress service?

I would say it's a welcome side effect.
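
For the allow-list part of that understanding, the recipient's decision could be as simple as the sketch below. The DID, the `allowList` contents and the `isAllowed` helper are made up for illustration, and a real check would also look at the other data in the request.

```typescript
// Hypothetical allow-list: trusted service DID -> abilities it may be granted.
const allowList: Record<string, string[]> = {
  "did:web:indexer.example": ["blob/retrieve"],
}

interface AccessRequest {
  issuer: string // DID of the requesting service
  can: string    // ability being requested
}

// True only when the issuer is trusted and the ability is one it may hold.
function isAllowed(
  req: AccessRequest,
  list: Record<string, string[]> = allowList
): boolean {
  const abilities = list[req.issuer]
  return abilities !== undefined && abilities.indexOf(req.can) !== -1
}
```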

alanshaw changed the title from "feat: authorizing access to services" to "feat: on-demand service authorization" Oct 2, 2025
Member

volmedo commented Oct 2, 2025

I like this proposal, thank you for putting it together.

A couple ideas that come up:

  • allow-listing principals means trust. I know this is how things work today anyway and that working around it is not trivial. It's still better than having trust and the configuration being such a messy process.
  • I like how it works when the thing invoking access/authorize is a known actor, e.g. the indexing-service. It might not be that easy the other way. For instance, nodes will need authorization to invoke space/egress/track on the billing server, e.g.
{
  iss: 'did:key:zStorageNode',
  aud: 'did:web:billing.storacha.network',
  att: [{
    can: 'access/authorize',
    with: 'did:key:zStorageNode',
    nb: {
      // requested capabilities
      // Note: `with` is implied (did:web:billing.storacha.network)
      att: [{
        can: 'space/egress/track',
        nb: {
          // can be specific or open ended
        }
      }]
    }
  }]
}

I see the billing service querying the registrar service to know whether the node is properly registered in the network before authorizing egress tracking.
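
A minimal sketch of that registrar check on the billing side. The `Registrar` interface and `authorizeEgressTracking` are hypothetical, the lookup is synchronous only to keep the example short, and the real registrar API may look nothing like this.

```typescript
// Hypothetical registrar interface; the real service API may differ.
interface Registrar {
  isRegistered(nodeDID: string): boolean
}

type GrantResult =
  | { ok: true; audience: string; can: string }
  | { ok: false; reason: string }

// Billing-side handler sketch: grant space/egress/track only to
// nodes the registrar knows about.
function authorizeEgressTracking(
  nodeDID: string,
  can: string,
  registrar: Registrar
): GrantResult {
  if (can !== "space/egress/track") {
    return { ok: false, reason: "unsupported ability" }
  }
  if (!registrar.isRegistered(nodeDID)) {
    return { ok: false, reason: "node not registered" }
  }
  return { ok: true, audience: nodeDID, can }
}
```

This keeps the billing service free of a static allow-list of nodes: membership is whatever the registrar says at the time of the request.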

Member

frrist left a comment

Love this.

Maintaining long-lived delegations in Piri's config has been a pain, so I'm really happy to move away from this pattern. We might even be able to remove the delegator/registrar entirely, or at least strip out the piece where it generates and receives delegations from nodes.

On session length, this seems like our main lever for improving request latency. I'd rather not re-authorize on every retrieval request since that could add hundreds of milliseconds or even full seconds, which feels like a non-starter.

Member

Peeja commented Oct 2, 2025

Ohhhh. I misread the RFC. I understand now: blob/retrieve is on the storage node, not on the space. We're not circumventing the semantics of UCAN here. Unlike access/authorize, this flow will not involve any attestations: the storage node will have the natural authority to blob/retrieve itself, and be able to delegate that to whomever it deems necessary, as normal in UCAN. I'm on board.

In fact, I'll note that in 1.0 this whole flow becomes unnecessary. In 1.0, the proof chain doesn't need to be (and isn't) included within the invocation itself. That means that rather than fetch a delegation and then send it back with the invocation, the invoking service can just send the invocation and assume that the storage node has a delegation to cover it (or is willing to sign one on the fly). The only reason we can't do that in 0.9 is that the invoking service needs the delegation as part of what it signs. (E: @alanshaw has pointed out that I forgot how this works. Invocations do need a (flat) list of the delegations that make up their proof chain; delegations don't need to include the delegations that they're re-delegating. That is, the proof chain is only validated at the end during the invocation, but you do need to have everything present and signed in that invocation, not just available to you somewhere.)

As for the expiration date, I think we could arguably make them non-expiring, or at least expiring far in the future. The only principal who cares is the storage node itself. If it decides to stop trusting the invoking service at some point and wants to revoke its delegation, it won't have any trouble telling itself about the revocation. That should make this extra loop negligible in performance cost.

Member

Peeja commented Oct 2, 2025

Oh, also notable: you don't actually need to include the delegation, just the CID. That means the invoking service doesn't need to store or send the whole delegation, if we can expect the service nodes to retain them themselves. Technically, they don't even need to send the whole delegation over in the first place, not that it's many more bytes for something that happens very occasionally.

alanshaw mentioned this pull request Oct 2, 2025
alanshaw changed the title from "feat: on-demand service authorization" to "rfc: on-demand service authorization" Oct 2, 2025
alanshaw added a commit to storacha/go-libstoracha that referenced this pull request Oct 7, 2025
refs storacha/RFC#68

Since `access/authorize` is a capability that already exists and has
semantics that vary slightly from what is proposed in the RFC I have
named the capability `access/grant` which is roughly similar but
different enough that we won't confuse the two.

---------

Co-authored-by: Vicente Olmedo <vicente@storacha.network>
alanshaw added a commit to storacha/go-libstoracha that referenced this pull request Oct 7, 2025
Adds the `blob/retrieve` capability as proposed here
storacha/RFC#68
alanshaw added a commit to storacha/piri that referenced this pull request Oct 16, 2025
This PR adds a `blob/retrieve` capability handler for service retrievals
(see storacha/RFC#68). It is very similar to the
`space/content/retrieve` handler except it does accept byte range
requests, only allows invocations where the resource is the service
itself and is not associated with any egress billing.
alanshaw added a commit to storacha/indexing-service that referenced this pull request Oct 21, 2025
This PR adds server side support for authorized retrievals. This involves
2 different authorized retrieval types:

1. Users can send a delegation for retrieving data from a space (per
#249) and it will now
be used by the indexer to make a UCAN authorized retrieval
(`space/content/retrieve`) from the relevant node (if not cached).
2. When an upload service invokes `assert/index`, the indexer will now
obtain a service delegation from the node that is storing the index
(using `access/grant`) before making a UCAN authorized retrieval
(`blob/retrieve`) to obtain the index. See
storacha/RFC#68 for details of on-demand service
retrievals.

---------

Co-authored-by: Vicente Olmedo <vicente@storacha.network>
alanshaw added a commit to storacha/indexing-service that referenced this pull request Oct 21, 2025
This PR adds support to the client library for authorized retrievals.

Essentially it allows one or more delegations to be attached to the query; they are sent in an HTTP header `X-Agent-Message`.

This PR also adds server side support for authorized retrievals. This involves 2 different authorized retrieval types:

1. Users can send a delegation for retrieving data from a space (per #249) and it will now be used by the indexer to make a UCAN authorized retrieval (`space/content/retrieve`) from the relevant node (if not cached).
2. When an upload service invokes `assert/index`, the indexer will now obtain a service delegation from the node that is storing the index (using `access/grant`) before making a UCAN authorized retrieval (`blob/retrieve`) to obtain the index. See storacha/RFC#68 for details of on-demand service retrievals.

---------

Co-authored-by: Vicente Olmedo <vicente@storacha.network>
Member

frrist commented Oct 27, 2025

Anything blocking this from merging? I'd like to start referencing the RFC directly in convo's 🙏

6 participants