Align Shelly sleeping devices timeout with non-sleeping #118969

thecode · 2024-06-06T11:42:00Z

Proposed change

Currently we allowed a Shelly sleeping device to miss a single update due to the long update period the device use.
For non sleeping devices we allowed 2 missing updates (and even after that we don't mark them offline unless we can't reach them).
This makes sleeping devices to be marked as offline very easy, Gen1 devices are UDP based and can easily miss some updates.
Even under near perfect conditions UDP has an expected loss factor, adding Wi-Fi means it will be sometime more frequent.

I did a test using a Shelly Motion Gen1:
I left the device in a closed box so it doesn't detect any events, I put it near an AP so it was around -50dBm the whole week and it was not marked as unavailable, however moving it to a far point which reduced the RSSI to -70dBm made it being marked unavailable 2 times that week. -70dBm is considered to be excellent to good but it still lost some messages, probably due to collisions.

This change align all devices to use the same timeout.

Type of change

Dependency upgrade
Bugfix (non-breaking change which fixes an issue)
New integration (thank you!)
New feature (which adds functionality to an existing integration)
Deprecation (breaking change to happen in the future)
Breaking change (fix/feature causing existing functionality to break)
Code quality improvements to existing code or addition of tests

Additional information

This PR fixes or closes issue: fixes Shelly Motion sensors unavailable while CoAP messages are coming in #117734 ,fixes Shelly Door/Window 2 (battery operated, gen1) goes unavailable and stops working until integration reload #116948, fixes All Shelly Motion 2's become unavailable on a regular basis until the service "Home Assistant Core: Reload Config Entry" is called #119002
This PR is related to issue:
Link to documentation pull request:

Checklist

The code change is tested and works locally.
Local tests pass. Your PR cannot be merged unless tests pass
There is no commented out code in this PR.
I have followed the development checklist
I have followed the perfect PR recommendations
The code has been formatted using Ruff (ruff format homeassistant tests)
Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

Documentation added/updated for www.home-assistant.io

If the code communicates with devices, web services, or third-party tools:

The manifest file has all fields filled out correctly.
Updated and included derived files by running: python3 -m script.hassfest.
New or updated dependencies have been added to requirements_all.txt.
Updated by running python3 -m script.gen_requirements_all.
For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
Untested files have been added to .coveragerc.

To help with the load of incoming pull requests:

I have reviewed two other open pull requests in this repository.

home-assistant · 2024-06-06T11:42:10Z

Hey there @balloob, @bieniu, @chemelli74, @bdraco, mind taking a look at this pull request as it has been labeled with an integration (shelly) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of shelly can trigger bot actions by commenting:

@home-assistant close Closes the pull request.
@home-assistant rename Awesome new title Renames the pull request.
@home-assistant reopen Reopen the pull request.
@home-assistant unassign shelly Removes the current integration label and assignees on the pull request, add the integration domain after the command.
@home-assistant add-label needs-more-information Add a label (needs-more-information, problem in dependency, problem in custom component) to the pull request.
@home-assistant remove-label needs-more-information Remove a label (needs-more-information, problem in dependency, problem in custom component) on the pull request.

chemelli74 · 2024-06-06T12:18:46Z

The only draw back with this change si that it may take way too long to mark a device as unavailable.
If it's something important, linked to security ( motion, door, flood ), this change may compromise the result.

bdraco · 2024-06-07T22:28:49Z

It's definitely expected that UDP-based protocol will drop packets over Wi-Fi. I expect that is one of the reasons Shelly moved away from COAP with Gen2/Gen3 devices.

This a difficult problem as its never going to be as stable as we want it to be as there are no guarantees with UDP. All the potential options here are compromises.

It's reasonable never to fix this if the consensus is that the device is more important to be shown as unavailable after a single packet drop instead of reflecting an old state. It's always hard to know the correct action here because there will be opinions on both sides of this.

As an alternative, we could clearly state in our documentation that it's normal for Gen1 sleepy devices to experience periodic unavailability under standard conditions (nobody will ever have a perfect network), particularly when updates are occasionally dropped. That would allow us to close the linked issues and point them to the docs instead. i.e. wontfix

KruseLuds · 2024-06-08T00:50:15Z

I have to reiterate this was listed above as a fix for

#119002

...and that 119002 is referring to a GEN 2 device, however the above statement seems to imply this (118969) may be closed without any change being made. Please note that the above link I have supplied is to an issue that is NEW which came up due to a change to the Shelly integration and needs to be fixed as these Shelly Motion 2 units are used for an alarm system as well as a smart home for disabled older folks that require them to work properly instead of just becoming "unavailable" on a constant basis (which never used to happen).

So if this (118969) is closed without making any changes, 119002 which is referenced above as being fixed by this - and was posted with logs - should stay open and 119002 still needs to be worked upon as soon as possible.

Thank you -

smarthomefamilyverrips · 2024-06-09T06:44:14Z

It's definitely expected that UDP-based protocol will drop packets over Wi-Fi. I expect that is one of the reasons Shelly moved away from COAP with Gen2/Gen3 devices.

This a difficult problem as its never going to be as stable as we want it to be as there are no guarantees with UDP. All the potential options here are compromises.

It's reasonable never to fix this if the consensus is that the device is more important to be shown as unavailable after a single packet drop instead of reflecting an old state. It's always hard to know the correct action here because there will be opinions on both sides of this.

As an alternative, we could clearly state in our documentation that it's normal for Gen1 sleepy devices to experience periodic unavailability under standard conditions (nobody will ever have a perfect network), particularly when updates are occasionally dropped. That would allow us to close the linked issues and point them to the docs instead. i.e. wontfix

I not understand this is already going on since 2024.5 #116948 and mentioned in few other issues as well with #119002 as the latest it already several times is implied that it are connecting problems, and even now is suggested to simple ignore this and write it of as won't fix. This issue NOT WAS THERE AND NEVER HAVE BEEN THERE BEFORE 2024.5 so something did change to make this happen, and the focus should be on finding out what this is to my opinion. And yes I understand most of you do this out of hobby and for free, what is highly appreciated so I apologize for my frustrated reaction, but I hope it also is understandable if all the time is assumed this issue is not caused by the integration (or other changes within 2024.5 causing the shelly intgration to have this behavior) while it not existed before update 2024.5

EDIT: also it not is logical that this "becoming unavailable" only happen after a restart of HA and that ONE time reloading the device within the Shelly integration solves this for as long there is no other HA restart. If it was network related it also would occur on a more regular basis and not only at a HA restart.

I am no code writer and my knowledge is far from expert level, but I would say, make the Shelly integration reload all it's devices by itself for example 1 minute after restart of HA is completed would solve this issue also. Just thinking out loud as this is what I am doing manually now each time after I did do a HA restart.

Align Shelly sleeping devices timeout with non-sleeping

3a3761f

thecode added this to the 2024.6.1 milestone Jun 6, 2024

thecode requested review from balloob, bieniu, chemelli74 and bdraco as code owners June 6, 2024 11:42

home-assistant bot added bugfix cla-signed has-tests integration: shelly small-pr PRs with less than 30 lines. labels Jun 6, 2024

home-assistant bot assigned balloob, bieniu, chemelli74 and bdraco Jun 6, 2024

home-assistant bot added by-code-owner Quality Scale: platinum labels Jun 6, 2024

thecode removed this from the 2024.6.1 milestone Jun 7, 2024

thecode marked this pull request as draft June 7, 2024 10:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Align Shelly sleeping devices timeout with non-sleeping #118969

Align Shelly sleeping devices timeout with non-sleeping #118969

thecode commented Jun 6, 2024 •

edited

home-assistant bot commented Jun 6, 2024

chemelli74 commented Jun 6, 2024

bdraco commented Jun 7, 2024 •

edited

KruseLuds commented Jun 8, 2024

smarthomefamilyverrips commented Jun 9, 2024 •

edited

Align Shelly sleeping devices timeout with non-sleeping #118969

Are you sure you want to change the base?

Align Shelly sleeping devices timeout with non-sleeping #118969

Conversation

thecode commented Jun 6, 2024 • edited

Proposed change

Type of change

Additional information

Checklist

home-assistant bot commented Jun 6, 2024

chemelli74 commented Jun 6, 2024

bdraco commented Jun 7, 2024 • edited

KruseLuds commented Jun 8, 2024

smarthomefamilyverrips commented Jun 9, 2024 • edited

thecode commented Jun 6, 2024 •

edited

bdraco commented Jun 7, 2024 •

edited

smarthomefamilyverrips commented Jun 9, 2024 •

edited