Deconz doesn't recognize anymore when a router is offline (ikea bulbs in this case) #6711

Closed
LeoeLeoeL opened this issue Jan 30, 2023 · 12 comments

Comments

@LeoeLeoeL

Describe the bug

If a bulb is switched off by a wall switch, deCONZ continues to report it as online.

Steps to reproduce the behavior

Switch off a bulb with a physical wall switch and wait.

Expected behavior

I expect to see the device offline as before

Screenshots

image
Luce Bagno -1 was shut down 12 hours ago.
Luce Cucina D1 & D1 were shut down 4 hours ago.

Environment

  • Host system: Raspberry Pi
  • Running method: Raspbian
  • Firmware version: 0x26780700
  • deCONZ version: 2.20.1
  • Device: ConBee II
  • Do you use an USB extension cable: yes
  • Is there any other USB or serial devices connected to the host system? If so: Which? APC Smart-UPS 1500

deCONZ Logs

Additional context

@Mimiix
Collaborator

Mimiix commented Jan 30, 2023

What types of lights are they (brand)?

@LeoeLeoeL
Author

IKEA, but Sonoff ZBMini devices show the same behaviour when I use them for testing purposes and then put them back in the drawer.

@Mimiix
Collaborator

Mimiix commented Jan 30, 2023

It's funny, I'm testing now. I see a blue dot when I make any changes to it. After a while (now) it goes red but remains available.

  • Testing now the INNR SP120 plug. That one seems to go "offline" after 2 minutes.
  • Hue bulb took 2 minutes (ish)
  • Osram bulbs took 3 minutes (ish)

Seems to be isolated to IKEA lights at the moment.

@Mimiix
Collaborator

Mimiix commented Jan 30, 2023

Can you check the DDF and see if it's gold? (for the bulbs?)

Mine was "bronze"; I changed it to gold, did a hot reload, and now it seems instantaneous.

@LeoeLeoeL
Author

Status is "draft" for both.

@Mimiix
Collaborator

Mimiix commented Jan 30, 2023

That's probably the issue. I've asked the devs to check, but it seems to be isolated to bulbs not having a DDF present.

@Mimiix Mimiix changed the title from "Deconz doesn't recognize anymore when a device is offline" to "Deconz doesn't recognize anymore when a router is offline (ikea bulbs in this case)" Jan 30, 2023
@ebaauw
Collaborator

ebaauw commented Jan 30, 2023

Brand wouldn’t be a determining factor here. Would need to know:

  • Whether light is exposed through DDF or legacy code;
  • Whether the light supports attribute reporting.

Unless you try and control a light over the API, the plugin wouldn’t mark reachable false until it has failed to poll the light. For lights with (periodic) attribute reporting, polling would only occur after a periodic report has been missed. I’m not quite sure if the legacy code handles this correctly, but the code for DDF should handle this. With typical periodic reporting every 5 minutes, I would expect up to 6 minutes delay after cutting power to a light. For lights being polled, it would depend on the number of lights in your network.
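The timing described above can be put into a rough Python sketch. The function name and the `poll_delay_s` parameter are assumptions made for illustration, not deCONZ internals. The sketch assumes that a poll is only triggered once a periodic report is overdue, and that reachable is cleared when that poll fails; the one minute allowed for the failed poll is inferred from the "5 minutes reporting, up to 6 minutes delay" estimate.

```python
def worst_case_detection_delay(report_interval_s: float, poll_delay_s: float) -> float:
    """Upper bound on the seconds between power loss and reachable going false.

    Assumes the light lost power right after sending a report, so a full
    reporting interval passes before the next report is noticed as missing,
    and a single failed poll (taking poll_delay_s) then clears reachable.
    """
    return report_interval_s + poll_delay_s


# 5-minute periodic reporting, plus an assumed minute for the failed poll.
delay = worst_case_detection_delay(report_interval_s=300, poll_delay_s=60)
print(f"{delay / 60:.0f} minutes")
```

For lights that are polled rather than reporting, the first argument would be the polling round-trip time for the whole network, which grows with the number of lights.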

If memory serves, IKEA lights would be set up using attribute reporting in legacy code. ZHA Hue lights don’t support attribute reporting, and deCONZ only configures attribute reporting (through DDF) for ZB3 Hue lights since v2.20. That would be consistent with the observations above. It would also mean that this will get solved as we move to DDFs and clean up the legacy code.

Note: using 20th century wall switches for your Zigbee lights is a bad idea. Any logic depending on reachable is a bad idea. See also #2590.

I’m almost scared to suggest this, but if you (or rather your spouse) insist on using wall switches, you might try and increase the rate of periodic reporting, at least for state/on. I think we use a 5s refresh interval for state/on in most DDFs, but the code handling the DDFs will poll less frequently. Depending on the number of lights, setting periodic reporting every 5 seconds could be a bit much.
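To get a feel for why a 5 second reporting interval "could be a bit much", here is a back-of-the-envelope Python sketch. The helper name and the network size of 40 lights are made up for illustration, and the sketch counts only one reported attribute (state/on) per light, ignoring acknowledgements, retries and routing overhead.

```python
def reports_per_minute(num_lights: int, report_interval_s: float) -> float:
    """Periodic reports arriving at the coordinator per minute, one attribute per light."""
    return num_lights * 60.0 / report_interval_s


# Hypothetical network of 40 lights: 5-minute versus 5-second reporting.
for interval_s in (300, 5):
    print(interval_s, reports_per_minute(40, interval_s))
```

The load scales linearly with both the number of lights and the reporting rate, so shortening the interval from 5 minutes to 5 seconds multiplies the report traffic by 60.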

@Mimiix
Collaborator

Mimiix commented Jan 30, 2023

> • Whether light is exposed through DDF or legacy code;

In the case of the topic starter: via legacy code, as the DDF was in draft status.

In my case: after changing Bronze to Gold (so it used the DDF), it was "fixed". So it looks like this affects legacy code only.

> Note: using 20th century wall switches for your Zigbee lights is a bad idea. Any logic depending on reachable is a bad idea. See also #2590.

I'm also not in favor of "deprecating", as that breaks things. In my experience so far, there is good reasoning to use it as it is, mainly because I haven't seen proper and affordable replacements for wall switches in the Netherlands, nor proper documentation for group usage. Nevertheless, I believe that discussion is not related to the issue at hand. Apparently we have 2 different "ways" of deciding when a device is marked "unreachable".

@LeoeLeoeL
Author

The situation now.
image
Many connections are gone.

@ebaauw
Collaborator

ebaauw commented Jan 30, 2023

> Many connections are gone.

Please remember that Zigbee doesn't do connections (if it did, it would be easy to clear reachable). The lines represent neighbour table entries of adjacent Zigbee routers. As the device is no longer powered, the routers will expire the entries, and the GUI removes the line when it next queries the neighbour table.

> Also not a favor of "deprecating" as that breaks issues.

Deprecating doesn't break anything, as it's purely documentation. Removing reachable would indeed break stuff.

> there is good reasoning to use it as it is.

I only linked the issue, because it explains how reachable doesn't reflect whether the device is actually reachable. "As it is" means you'll have to wait several minutes to hours for reachable to be cleared after powering down a device.

> Apparently we have 2 different "ways" on when a device is marked "unreachable".

Afaik, there is only one way, see linked issue. The variations in DDF vs legacy and reporting vs polling merely change how quickly deCONZ sends a unicast message to the device and can notice the missing response.
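For anyone who wants to see what the gateway currently reports, here is a minimal Python sketch that lists lights whose `state.reachable` is false in a deCONZ REST API `/lights` response. The host, port and API key are placeholders, the helper names are hypothetical, and the sample payload is invented and trimmed to the fields used here. As explained above, the value only tells you whether the last unicast exchange failed, so it can lag behind reality by minutes to hours.

```python
import json
from urllib.request import urlopen


def unreachable_lights(lights: dict) -> list[str]:
    """Return the sorted names of lights whose state.reachable is false."""
    return sorted(
        light.get("name", light_id)
        for light_id, light in lights.items()
        if not light.get("state", {}).get("reachable", False)
    )


def fetch_lights(host: str, api_key: str, port: int = 80) -> dict:
    """Fetch all lights from the gateway; host, port and api_key are placeholders."""
    with urlopen(f"http://{host}:{port}/api/{api_key}/lights", timeout=10) as resp:
        return json.load(resp)


# Invented sample shaped like a /lights response, so the sketch runs offline.
sample = {
    "1": {"name": "Luce Bagno -1", "state": {"on": False, "reachable": True}},
    "2": {"name": "Luce Cucina D1", "state": {"on": False, "reachable": False}},
}
print(unreachable_lights(sample))
```

Against a real gateway you would call `unreachable_lights(fetch_lights(host, api_key))` with your own values.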

@easybeat

easybeat commented Nov 30, 2023

> I only linked the issue, because it explains how reachable doesn't reflect whether the device is actually reachable. "As it is" means you'll have to wait several minutes to hours for reachable to be cleared after powering down a device.

Hi

I think this is a massive issue for "normal" users, because I expect the Phoscon App also uses "reachable" to decide whether a device is shown greyed out. But in fact it does not show the truth at all, because the device still works, for example if you use the API to turn it on.

I have been using deCONZ for years now, and the most annoying thing is that there is no way for a "normal" user to find out or see whether a device is currently connected.

Maybe I'm completely wrong about this, but it really is annoying to see devices greyed out in the Phoscon App while they are working absolutely fine! See screenshot:

image

I hope this will be solved at some point and that there is an explanation for the reason.

Kind regards
Beat

@manup
Member

manup commented Jul 10, 2024

Going through some older bug reports for cleanup.

I've just tested how the IKEA GU10, which uses a DDF, behaves when the device is physically powered off.

  • All routers are periodically queried for neighbor table entries (DDF and legacy)
  • The GU10 DDF polls attributes if no reports come in within 30 minutes as per reporting configuration

The reachable attribute was set to false after roughly 30 minutes, and both the node in deCONZ and the light in the Phoscon App are shown as offline.


Another test, with a Philips LWB004 E27 light running on legacy code without a DDF, behaved the same way. The difference is that attribute polling is controlled by legacy code, which focuses on the on/off attribute with a hard-coded interval. Detecting that the light was offline also took roughly 30 minutes.

Regardless of DDF or legacy code, there is an important "lazy" ramp-up to detect whether devices are reachable. If the responses to neighbor table requests aren't received, other commands will automatically be sent with APS ACKs enabled for better detection. "Lazy" here means that this takes time, since the periodic neighbor table requests may take 10, 20, or 30 minutes. This is mainly to avoid being tricked by temporary network hiccups and to work well in larger networks.

Long story short, the detection is in place but can take a while.


The DDFs provide the best option to control the intervals. While we can tweak the "consider offline after x failed responses" threshold down for quicker detection, this always carries the danger of false positives, especially in larger networks.
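The "consider offline after x failed responses" trade-off can be illustrated with a small Python sketch. This is not the deCONZ implementation; the class name, the method names and the default threshold of 3 are assumptions chosen only to show how a higher threshold rides out temporary hiccups at the cost of slower detection.

```python
class ReachabilityTracker:
    """Marks a device unreachable only after several consecutive missed responses."""

    def __init__(self, max_failures: int = 3) -> None:
        self.max_failures = max_failures  # hypothetical threshold, the "x" above
        self.failures = 0
        self.reachable = True

    def on_response(self) -> None:
        # Any response resets the counter and restores reachability.
        self.failures = 0
        self.reachable = True

    def on_timeout(self) -> None:
        # A missed response only counts against the device; it is marked
        # unreachable once the threshold of consecutive misses is reached.
        self.failures += 1
        if self.failures >= self.max_failures:
            self.reachable = False


tracker = ReachabilityTracker(max_failures=3)
tracker.on_timeout()    # a single network hiccup
tracker.on_response()   # the device answers again, so the counter resets
print(tracker.reachable)

for _ in range(3):      # the device has really lost power
    tracker.on_timeout()
print(tracker.reachable)
```

With requests spaced many minutes apart, each extra unit of threshold adds a full request interval to the detection time, which is why lowering it speeds up detection but makes false positives more likely.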

Closing the issue for now since I consider the lazy detection the best trade-off for a broad range of networks.

@manup manup closed this as completed Jul 10, 2024