
On CASE resumption, previous resumption state is not deleted #18433

Open
tcarmelveilleux opened this issue May 13, 2022 · 6 comments

@tcarmelveilleux (Contributor) commented May 13, 2022

Problem

Found during internal security review at Google.

In either of these cases, a resumption has successfully occurred:

  • When the Initiator resumed and successfully validated the received Sigma2_Resume
  • When the Responder received a Sigma1 with valid resumption data and subsequently received a successful session establishment status report for the resumption

The spec says:

=== Validate Sigma2_Resume

...

. The initiator SHALL set the `Resumption ID` in the <<ref_SecureSessionContext, Session Context>> to the value `Resume2Msg.resumptionID`.

The spec doesn't say what to do with the previous resumption state, but careful consideration of the side-effects implies an answer: since the resumption ID set by the responder and processed by the initiator is updated in the "session context", and a given "session" is what gets resumed, the previous resumption state for that peer, if any, should be deleted before the new resumption state is saved. Otherwise a session could be resumed multiple times, or resumption contexts that are now logically expired could be reused.

The current implementation of DefaultSessionResumptionStorage never removes a prior entry for a given peer, except when fabric removal is called.

Proposed Solution

Either:

  • Delete(ScopedNodeId) should always be called before Save(...)
  • Save(...) should automatically call Delete (and ignore "failure/missing") for the ScopedNodeId of the peer

Furthermore,

  • Delete(ScopedNodeId) must support deleting any duplicates within the index, if any exist (i.e. make sure no stale resumption contexts remain); see the sketch below. This makes sense since there should only be one resumption state between a given pair of peers.
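
A minimal sketch of this Delete behavior, assuming simplified stand-in types (the real interface lives in the SDK's SessionResumptionStorage / DefaultSessionResumptionStorage, whose keys and types differ; all names below are illustrative):

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Hypothetical stand-ins for the SDK types; names are illustrative only.
    struct ScopedNodeId
    {
        uint64_t fabricIndex;
        uint64_t nodeId;
        bool operator==(const ScopedNodeId & other) const
        {
            return fabricIndex == other.fabricIndex && nodeId == other.nodeId;
        }
    };

    struct ResumptionEntry
    {
        ScopedNodeId peer;
        // resumption ID, shared secret, and CATs elided for brevity
    };

    class SessionResumptionIndex
    {
    public:
        // Remove *every* entry for the peer, so no stale resumption context
        // survives even if earlier bugs left duplicates behind. Removing
        // zero entries ("missing") is not treated as an error.
        void Delete(const ScopedNodeId & peer)
        {
            mEntries.erase(std::remove_if(mEntries.begin(), mEntries.end(),
                                          [&](const ResumptionEntry & e) { return e.peer == peer; }),
                           mEntries.end());
        }

        const ResumptionEntry * Find(const ScopedNodeId & peer) const
        {
            for (const auto & e : mEntries)
                if (e.peer == peer)
                    return &e;
            return nullptr;
        }

        void Insert(const ResumptionEntry & entry) { mEntries.push_back(entry); }

    private:
        std::vector<ResumptionEntry> mEntries;
    };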
@balducci-apple (Contributor)

I think this is the right thing to do. I know the spec previously had prose saying that resumption information should be deleted if resumption fails (but it no longer appears to state this). In any case, both success and failure should lead to deletion of the previous resumption state.

@msandstedt (Contributor) commented May 13, 2022

Found during internal security review at Google.

What security problem would we hope to fix by making this change? In all cases, after session establishment, we are trusting peers not to continue using a session if their credentials have changed. This is true during ongoing session communication and is equally true for session resumption. Disallowing multiple session resumption contexts for a given peer does not remove the need to trust the peer because this does not address the case where a peer holds a single resumption context and continues using this after its credentials have changed.

Edit: discussed offline. I misunderstood what was being reported here, which is not really a security problem. The SDK's resumption storage implementation just has a bug. This API implies and requires that the resumption storage delegate store only a single entry per peer, but the implementation stores more than one:

virtual CHIP_ERROR FindByScopedNodeId(const ScopedNodeId & node, ResumptionIdStorage & resumptionId,
                                          Crypto::P256ECDHDerivedSecret & sharedSecret, CATValues & peerCATs) = 0;

However, as an amendment to the proposed solution, should we just make this happen automatically in the save() method? Surely it should not be permissible for save() to break the underlying storage. So it seems that save() must either check first for existing entries and error out in that case, or accept the duplicate and silently discard existing entries.
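
Both options are straightforward to express against the hypothetical SessionResumptionIndex sketched earlier in this thread; which variant the SDK should adopt is the open question here:

    // Variant A: refuse to overwrite; the caller must Delete() explicitly first.
    bool SaveStrict(SessionResumptionIndex & index, const ResumptionEntry & entry)
    {
        if (index.Find(entry.peer) != nullptr)
            return false; // existing entry: report an error instead of overwriting
        index.Insert(entry);
        return true;
    }

    // Variant B: silently discard any existing entry, so the invariant of at
    // most one entry per peer holds no matter what callers do.
    void SaveOverwrite(SessionResumptionIndex & index, const ResumptionEntry & entry)
    {
        index.Delete(entry.peer); // "missing" is not an error here
        index.Insert(entry);
    }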

@kghost self-assigned this May 16, 2022
@kghost (Contributor) commented May 16, 2022

I don't quite understand the role of ResumptionId. If there can be only one resumption entry per node, why not use a stable tuple of <CompressedFabricId, NodeId> as the ResumptionId?

That way, we wouldn't need to store the relationship between the ResumptionId and the ScopedNodeId. I don't think it would expose any security flaws, since both the CompressedFabricId and the NodeId are public, the shared secret is still kept secret, and the correct key is needed to generate a correct initiatorResumeMIC, which can be verified by the peer.
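
For concreteness, a rough sketch of the packing this would imply, assuming the spec's 16-byte resumption ID (an 8-byte CompressedFabricId plus an 8-byte NodeId fit exactly); the function name and encoding below are hypothetical:

    #include <array>
    #include <cstdint>

    using ResumptionId = std::array<uint8_t, 16>; // assumes the spec's 16-byte field

    // Pack the public <CompressedFabricId, NodeId> tuple into the resumption ID,
    // so no separate ResumptionId -> ScopedNodeId mapping needs to be persisted.
    ResumptionId MakeStableResumptionId(uint64_t compressedFabricId, uint64_t nodeId)
    {
        ResumptionId id;
        // Fixed little-endian encoding so the ID is stable across platforms.
        for (int i = 0; i < 8; ++i)
        {
            id[i]     = static_cast<uint8_t>(compressedFabricId >> (8 * i));
            id[8 + i] = static_cast<uint8_t>(nodeId >> (8 * i));
        }
        return id;
    }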

@tcarmelveilleux @turon @bzbarsky-apple @msandstedt

@bzbarsky-apple (Contributor)

That sounds like a question for @balducci-apple

@balducci-apple (Contributor)

Without the ResumptionID we don't know the NodeId of the peer that is sending us a Sigma1 with resumption.

@franck-apple added the p1 (priority 1 work) label Oct 24, 2022
@msandstedt (Contributor)

The spec doesn't say what to do with the previous resumption state, but careful consideration of the side-effects implies an answer: since the resumption ID set by the responder and processed by the initiator is updated in the "session context", and a given "session" is what gets resumed, the previous resumption state for that peer, if any, should be deleted before the new resumption state is saved. Otherwise a session could be resumed multiple times, or resumption contexts that are now logically expired could be reused.

The current implementation of DefaultSessionResumptionStorage never removes a prior entry for a given peer, except when fabric removal is called.

Proposed Solution

Either:

  • Delete(ScopedNodeId) should always be called before Save(...)
  • Save(...) should automatically call Delete (and ignore "failure/missing") for the ScopedNodeId of the peer

#23062 fixes this, albeit not quite with the solution proposed.

Furthermore,

  • Delete(ScopedNodeId) must support deleting any duplicates within the index, if any exist (i.e. make sure no stale resumption contexts remain). This makes sense since there should only be one resumption state between a given pair of peers.

#23062 does not do this: there is no proactive cleaning for orphaned or duplicate entries. And it is true that, because the default implementation stores each record across three tables, it is impossible to save atomically, even if the backing kvstore is atomic for each individual key/value write.

Individual implementations can achieve proactive cleaning by overriding the default session resumption storage implementation. But if we want this in the default implementation itself, code needs to be added or changed.
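
For illustration, one possible shape for such a cleanup pass, using hypothetical stand-ins for the three tables (the real key formats and value types in DefaultSessionResumptionStorage differ):

    #include <cstdint>
    #include <iterator>
    #include <map>
    #include <set>
    #include <vector>

    // Hypothetical stand-ins for the three tables a record is spread across:
    // a peer index ("link" table), per-peer state, and a reverse map from
    // resumption ID to peer. Real names, keys, and value types differ.
    using PeerKey       = uint64_t; // stand-in for a ScopedNodeId key
    using ResumptionKey = uint64_t; // stand-in for the 16-byte resumption ID

    struct ResumptionTables
    {
        std::vector<PeerKey> index;
        std::map<PeerKey, ResumptionKey> stateByPeer;
        std::map<ResumptionKey, PeerKey> peerByResumption;
    };

    // Dedupe the index, then drop any row with no live counterpart in the
    // other tables. Because a crash can interrupt a multi-key save, such
    // orphans are exactly what a non-atomic kvstore layout can leave behind.
    void CleanupOrphans(ResumptionTables & t)
    {
        // 1. Deduplicate the index, keeping first occurrences.
        std::set<PeerKey> seen;
        std::vector<PeerKey> deduped;
        for (PeerKey p : t.index)
            if (seen.insert(p).second)
                deduped.push_back(p);
        t.index = std::move(deduped);

        // 2. Drop state rows whose peer is no longer in the index.
        for (auto it = t.stateByPeer.begin(); it != t.stateByPeer.end();)
            it = seen.count(it->first) ? std::next(it) : t.stateByPeer.erase(it);

        // 3. Drop reverse-map rows that no longer point at live state.
        for (auto it = t.peerByResumption.begin(); it != t.peerByResumption.end();)
        {
            auto s    = t.stateByPeer.find(it->second);
            bool live = (s != t.stateByPeer.end() && s->second == it->first);
            it        = live ? std::next(it) : t.peerByResumption.erase(it);
        }
    }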
