[Thread] Changing/reverting operational dataset leaves stale SRP services #21101
Comments
@bluebin14 @tcarmelveilleux @jwhui @abtink @LuDuda @jmartinez-silabs I'm not sure what the best way forward is, so your comments are welcome :)
Once the Thread network has been joined successfully, failsafe expiry should not delete the Thread network credentials. That would be consistent with removing the last peer, which doesn't delete them either. Then you don't need an "expiry in progress" state or other workarounds that may introduce new bugs. There is a removeNetwork command; the controller can use it if needed.
That would be against the spec, which mandates that failsafe expiry have the following effect (among others):
Leaving a dangling SRP advertisement is much worse from a usability perspective than adhering to a spec requirement that is inconsistent with the removeFabric behavior for the last peer. The spec requirement even implies resetting the network configuration when there are other accessing peers, so it should be removed from the spec.
Could you elaborate on how exactly it impacts usability? What are your assumptions? Do you assume that the commissioner keeps using the same node ID on each commissioning attempt?
It impacts SRP server operation when the situation repeats, with registrations eventually being refused (as seen in the UART log). Physically resetting a border router such as a HomePod to clear the entries is a majorly disruptive operation. It is better to have a software solution that tears things down cleanly.
For what it's worth, failsafe cleanup is in fact async from the failsafe's point of view. It has a "not armed, but not fully disarmed; disarming in progress" state. What's missing is a way to keep it in that state for a bit while async disarm work happens. We could add such a way...
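To make the "disarming in progress" idea concrete, here is a minimal toy model of a failsafe that stays in an intermediate state until outstanding async cleanup work reports completion. It is only a sketch: the class, state, and method names are hypothetical and are not the SDK's actual failsafe API.

```python
from enum import Enum, auto


class FailSafeState(Enum):
    DISARMED = auto()
    ARMED = auto()
    DISARMING = auto()  # not armed, but async cleanup is still running


class FailSafe:
    """Toy model of a failsafe whose disarm step may finish asynchronously."""

    def __init__(self):
        self.state = FailSafeState.DISARMED
        self._pending = set()  # names of async cleanup tasks still running

    def arm(self):
        # Refuse to arm while a previous expiry is still cleaning up.
        if self.state is FailSafeState.DISARMING:
            return False
        self.state = FailSafeState.ARMED
        return True

    def expire(self, async_tasks=()):
        """Handle expiry; async_tasks are cleanup jobs that complete later."""
        if self.state is not FailSafeState.ARMED:
            return
        self._pending = set(async_tasks)
        self.state = FailSafeState.DISARMING
        self._finish_if_idle()

    def task_done(self, name):
        """Called when one async cleanup job (e.g. SRP removal) completes."""
        self._pending.discard(name)
        self._finish_if_idle()

    def _finish_if_idle(self):
        # Leave the intermediate state only once all cleanup has finished.
        if self.state is FailSafeState.DISARMING and not self._pending:
            self.state = FailSafeState.DISARMED
```

In this model a new arm request is rejected until every pending task has called `task_done`, which is the "keep it in that state for a bit" behavior described above.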
If the failsafe expires after the device's Thread attachment has completed, I feel we should add a mechanism to unregister the services. Once removeSrpService has been sent, you can delete the Thread dataset. How would new commissioning attempts start before the Thread dataset is cleared? I believe you wouldn't be advertising at that point. I'm unsure how much complexity this adds to failsafe expiry; I am not familiar with that part.
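The ordering suggested here (unregister the SRP services first, then erase the dataset) could be sketched roughly as follows. The `srp_client` and `dataset_store` objects and their methods are hypothetical stand-ins, and the bounded wait is an assumption of this sketch rather than something proposed in the thread.

```python
def teardown_thread_network(srp_client, dataset_store, timeout_s=5.0):
    """Unregister SRP services first, then erase the Thread dataset.

    Both arguments are hypothetical interfaces used for illustration:
      srp_client.remove_all_services(timeout_s) -> bool (True if acknowledged)
      dataset_store.erase()
    """
    # Ask the SRP server to drop our services while we are still attached.
    acked = srp_client.remove_all_services(timeout_s)
    # Erase the dataset even if the server never answered, so that an
    # unreachable border router cannot block failsafe cleanup forever.
    dataset_store.erase()
    return acked
```

The return value lets the caller log whether the removal was acknowledged; if it was not, a stale entry may remain on the server until its lease expires.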
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Problem
Consider the following scenario:
It was raised by @bluebin14 in the following Slack discussion: https://csamembers.slack.com/archives/G014G30SVV0/p1657552482903829?thread_ts=1657202586.638369&cid=G014G30SVV0
Proposed Solution
The CM key of the service is switched to 0, so the commissioner shouldn't use the stale service to initiate commissioning. But we could also delay registering the commissionable node service if there's a scenario in which that could improve the user experience.
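As an illustration of the commissioner-side behavior this relies on, a discovery filter might skip any service whose CM key is 0. This is a rough sketch: the dictionary shape of a discovered service is hypothetical, and treating a missing CM key as 0 is an assumption of the example.

```python
def commissionable_candidates(services):
    """Return discovered services that advertise an open commissioning window.

    Each service is modelled as a dict with a "txt" mapping of TXT record
    keys to string values; this shape is an assumption for illustration.
    A missing CM key is treated the same as CM=0 in this sketch.
    """
    return [s for s in services if s.get("txt", {}).get("CM", "0") != "0"]
```

With such a filter, a stale entry left on the SRP server with CM set to 0 would still appear in browse results but would not be picked as a commissioning target.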