Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hacs shows disconnected after (up to) 3 hours, have to constantly "reconfigure" #187

Closed
MithUK1 opened this issue Dec 10, 2022 · 88 comments
Closed

Comments

@MithUK1
Copy link

MithUK1 commented Dec 10, 2022

Hi, this has been an issue fo rme since i styarted using this integration quite some time ago.
I leave this hoping the next version will fix my issue but it never does.

Eseentially at some random time interval after configuring the integration it will stop communicating with alarm.com and will show as needing reconfiguration:

image
Until that point, my motion sensors and alarm system are seen and function correctly

For note, I can still log in to alarm.com with the same username and password (and authenticator code) as normal.

However in the activity log you can see it does a web login over and again (to scrape the sensor activity is my guess) then at 11:01 it simply stops and the integration displays as recongure.

image

I'm unsure of what log or where to get it to help troubleshoot this issue further, but if you let me know i'll help from my end however i can.
Thanks, M

@MithUK1 MithUK1 changed the title Hacs shows disconnected after 3 hours, have to constantly "reconfigure" Hacs shows disconnected after (up to) 3 hours, have to constantly "reconfigure" Dec 10, 2022
@elahd
Copy link
Collaborator

elahd commented Dec 14, 2022

Can you enable debug logging for the integration and post what shows up in Home Assistant's logs?

Add the following to configuration.yaml, then reboot:

logger:
  default: warn
  logs:
    custom_components.alarmdotcom: debug
    pyalarmdotcomajax: debug

@elahd elahd closed this as not planned Won't fix, can't repro, duplicate, stale Jan 6, 2023
@catellie
Copy link

I just installed this today and experience the same problem. After turning on the above, restarting and then re-authenticting I just see this in my log file:

2023-02-12 17:17:10.828 ERROR (MainThread) [pyalarmdotcomajax] Trying to refresh auth tokens by logging in again.
2023-02-12 17:17:15.525 ERROR (MainThread) [pyalarmdotcomajax] Error fetching data from Alarm.com.
2023-02-12 17:17:15.526 ERROR (MainThread) [pyalarmdotcomajax] Trying to refresh auth tokens by logging in again.
2023-02-12 17:17:19.342 ERROR (MainThread) [pyalarmdotcomajax] Error fetching data from Alarm.com.
2023-02-12 17:17:19.754 ERROR (MainThread) [pyalarmdotcomajax] Error fetching data from Alarm.com.
2023-02-12 17:17:19.755 ERROR (MainThread) [pyalarmdotcomajax] Trying to refresh auth tokens by logging in again.
2023-02-12 17:17:25.691 ERROR (MainThread) [pyalarmdotcomajax] Error fetching data from Alarm.com.
2023-02-12 17:17:25.692 ERROR (MainThread) [pyalarmdotcomajax] Trying to refresh auth tokens by logging in again.

Home Assistant appears to be happy at this stage though. 🤷 In case it matters, I'm located in Sweden and got this though the Garda reseller - which shows up as expected in the UI. The imported list of devices seems fairly reasonable, except for one device that the was just "unknown".

@catellie
Copy link

As I dig around more I also found this in the log:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 414, in async_add_entities
    tasks = [
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 414, in <listcomp>
    tasks = [
  File "/config/custom_components/alarmdotcom/alarm_control_panel.py", line 83, in <genexpr>
    AlarmControlPanel(
  File "/config/custom_components/alarmdotcom/alarm_control_panel.py", line 116, in __init__
    self._attr_supported_features = AlarmControlPanelEntityFeature(
  File "/usr/local/lib/python3.10/enum.py", line 385, in __call__
    return cls.__new__(cls, value)
  File "/usr/local/lib/python3.10/enum.py", line 710, in __new__
    raise ve_exc
ValueError: 7 is not a valid AlarmControlPanelEntityFeature

and later:

2023-02-12 17:19:19.415 ERROR (MainThread) [custom_components.alarmdotcom.alarmhub] Authentication failed while fetching Garda Alarm:REDACTED data: Invalid account credentials.

@cmbuckley
Copy link

Just to add to this, I see the following in debug logs:

2023-02-09 14:20:25.241 DEBUG (MainThread) [pyalarmdotcomajax] Logged in to Alarm.com.
2023-02-09 14:20:25.618 DEBUG (MainThread) [pyalarmdotcomajax] {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}
2023-02-09 14:20:25.619 ERROR (MainThread) [custom_components.alarmdotcom.alarmhub] Authentication failed while fetching ADT Smart Services:<redacted> data: Invalid account credentials.

There's no changes to the credentials to cause this. I tried reducing the sync frequency to see if it was a weak attempt at rate limiting, but that made no difference.

Somewhat related, because the underlying library makes opportunistic attempts to re-establish login when unsupported device types are found, this doubles the number of calls to the login endpoint.

@elahd
Copy link
Collaborator

elahd commented Feb 13, 2023

Tagging @roooodcastro and @dkedinger from #207.

Can anyone running into this problem help inspect web browser cookies when logged into Alarm.com? To do this in Chrome:

  1. Log in to alarm.com.
  2. Right click anywhere on the page and select "Inspect".
  3. In the inspector window, click on "Application" on top, then "Cookies" ->"https://www.alarm.com" in the left nav.
  4. Send a screenshot of just the column circled in green.

image

@cmbuckley The add-in attempts to log in so frequently because, in some instances, Alarm.com returns the same response for expired credentials and lack of permissions. There's definitely a more intelligent way to go about this, just haven't gotten around to implementing it.

@cmbuckley
Copy link

cmbuckley commented Feb 13, 2023

Here's mine:

I appreciate why there's the repeated log-in, but as you say it'd be nice to have a better way to deal with this, such as being able to restrict the device types that are checked.

@roooodcastro
Copy link

Mine are similar:

Screenshot_20230213_185701

However, I must add that this is not just an annoyance. The integration functionality is affected too. I can arm/disarm the alarm just fine, sometimes it takes a couple of minutes, however all sensors, including the arm/disarm state sensor, have stale or simply wrong readings, due to them only updating sporadically.

There's definitely a more intelligent way to go about this, just haven't gotten around to implementing it.

Do you know what a solution for this looks like, or have you not looked into this at all yet? I could have a go at implementing it if you have a plan.

@elahd
Copy link
Collaborator

elahd commented Feb 15, 2023

Thanks. The "afg" cookie is the AJAX cookie that the integration is complaining about missing. I thought that some providers may just not use that cookie, but looks like thats not the case since you all have it. I need to do a deeper dive and, also, release code that provides more detailed errors when the cookie goes missing. It's possible that the missing cookie is just a side effect of an upstream error.

@catellie The "7 is not a valid AlarmControlPanelEntityFeature" error should be fixed in the latest release.

@roooodcastro My plan was to have the integration look at the response object from either the system or partition endpoints. These objects contain a list of IDs for all sensors/devices in the system. If any device category has no devices listed, the library would skip over the endpoints for those devices on every subsequent refresh. This check would be done, say, once every 5-10 minutes so that the library could pick up new devices as they're added. This would need to be implemented in pyalarmdotcomajax, not in this Home Assistant integration.

@cmbuckley
Copy link

cmbuckley commented Feb 15, 2023

@elahd just to be clear here - the integration configures and works absolutely fine for a few hours, or a couple of days even, and then sporadically gets a bad response from the login endpoint, which causes the reconfigure notification.

As @roooodcastro mentioned, once that happens the sensors no longer perform their regular sync, even though the integration itself continues to work from the perspective of being able to set/unset the alarm.

Your suggestion of remembering empty device responses and skipping them from subsequent requests in the underlying library sounds spot on - but it feels like there's a separate need for an improved retry mechanism around that failed log-in, which is unfortunately coupled to the device detection because of the 403 responses when there's none of that device type.

@catellie
Copy link

@elahd : I updated earlier today, and as I got the red warning again now (hours later, but not when I upgraded) I re-authenticated and it appears to not make much of a difference:

This error originated from a custom integration.

Logger: homeassistant
Source: custom_components/alarmdotcom/alarm_control_panel.py:116
Integration: Alarm.com (documentation, issues)
First occurred: 10:23:29 PM (1 occurrences)
Last logged: 10:23:29 PM

Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 414, in async_add_entities
    tasks = [
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 414, in <listcomp>
    tasks = [
  File "/config/custom_components/alarmdotcom/alarm_control_panel.py", line 83, in <genexpr>
    AlarmControlPanel(
  File "/config/custom_components/alarmdotcom/alarm_control_panel.py", line 116, in __init__
    self._attr_supported_features = AlarmControlPanelEntityFeature(
  File "/usr/local/lib/python3.10/enum.py", line 385, in __call__
    return cls.__new__(cls, value)
  File "/usr/local/lib/python3.10/enum.py", line 710, in __new__
    raise ve_exc
ValueError: 7 is not a valid AlarmControlPanelEntityFeature

I also got some Unable to extract ajax key from Alarm.com. REDACTED_FOR_POSTING messages.
As well as custom_components.alarmdotcom.alarmhub: Error logging in:

Let me know if there is more info I can provide.

@elahd
Copy link
Collaborator

elahd commented Feb 15, 2023

Thanks, @catellie. I'll release something later tonight in line with the last part of this comment to help with the multiple login attempts.

@catellie
Copy link

catellie commented Feb 15, 2023

@elahd Either you were super fast or my previous update didn't fully "bite", as I found another which gives a different error:

Logger: homeassistant
Source: helpers/selector.py:72
First occurred: 10:45:37 PM (1 occurrences)
Last logged: 10:45:37 PM

Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/data_entry_flow.py", line 225, in async_init
    flow, result = await task
  File "/usr/src/homeassistant/homeassistant/data_entry_flow.py", line 252, in _async_init
    result = await self._async_handle_step(flow, flow.init_step, data, init_done)
  File "/usr/src/homeassistant/homeassistant/data_entry_flow.py", line 367, in _async_handle_step
    result: FlowResult = await getattr(flow, method)(user_input)
  File "/config/custom_components/alarmdotcom/config_flow.py", line 291, in async_step_reauth
    return await self.async_step_reauth_confirm(user_input)
  File "/config/custom_components/alarmdotcom/config_flow.py", line 302, in async_step_reauth_confirm
    return await self.async_step_user()
  File "/config/custom_components/alarmdotcom/config_flow.py", line 127, in async_step_user
    vol.Required(CONF_USERNAME): TextSelector(
  File "/usr/src/homeassistant/homeassistant/helpers/selector.py", line 912, in __init__
    super().__init__(config)
  File "/usr/src/homeassistant/homeassistant/helpers/selector.py", line 72, in __init__
    self.config = self.CONFIG_SCHEMA(config)
  File "/usr/local/lib/python3.10/site-packages/voluptuous/schema_builder.py", line 272, in __call__
    return self._compiled([], data)
  File "/usr/local/lib/python3.10/site-packages/voluptuous/schema_builder.py", line 595, in validate_dict
    return base_validate(path, iteritems(data), out)
  File "/usr/local/lib/python3.10/site-packages/voluptuous/schema_builder.py", line 433, in validate_mapping
    raise er.MultipleInvalid(errors)
voluptuous.error.MultipleInvalid: extra keys not allowed @ data['autocomplete']

I hope that's more helpful? Using autocomplete for a login field seems like a rather bad idea... 🤦

@elahd
Copy link
Collaborator

elahd commented Feb 16, 2023

@catellie Which version of Home Assistant are you using? Also, autocomplete use is in line with Home Assistant standards: https://developers.home-assistant.io/docs/data_entry_flow_index#enabling-browser-autofill. These autocomplete hints help password managers locate username, password, and OTP fields.

@elahd
Copy link
Collaborator

elahd commented Feb 16, 2023

Also, I had a fix for the "reconfigure" issue ready to go but accidentally purged the Codespace before pushing the code 🤦🏻‍♂️. Currently waiting on a GitHub support ticket to recover the data.

@catellie
Copy link

@elahd I run Home Assistant 2022.12.0.dev20221114 Frontend 20221114.0.dev - latest in docker on Ubuntu. The actual docker has not been restarted for 5+ weeks so I guess "latest" might not be super accurate.

@catellie
Copy link

For the record: I had reason to restart my HA and noticed that I no longer get the complaining message and even though I have not re-authenticated since the above message (5 days ago), I could still toggle a light switch that came with my alarm. So I guess something has mended? How/why is unclear, but thanks anyway!

@elahd
Copy link
Collaborator

elahd commented Feb 22, 2023

@catellie Thanks for the update. I think that some of the "extra" errors you were getting were related to your version of Home Assistant, although I'm not able to replicate them on my end in 2022.12.0.

@cmbuckley I implemented a smarter device update scheme that only hits endpoints for device types that are present in a user's account. The integration is also smarter about attempting to log back in -- it'll only log back in after verifying that it has been logged out. Can you re-install from the master branch to verify that everything works? @roooodcastro? Thanks!

@elahd
Copy link
Collaborator

elahd commented Feb 24, 2023

Has anyone been able to test?

@roooodcastro
Copy link

Hi @elahd, sorry for the delay, I was away from home for a few days. I've re-downloaded it now from master a couple hours ago, and so far it's not prompting me to re-configure the integration anymore.

I do see however some errors that I don't remember seeing before (or I might just not have noticed either):

Error fetching data from Alarm.com. Got 403 status when fetching data for device type DeviceType.SYSTEM. User is already logged in. Giving up on device type.

I get these for a few device types, both from devices types I don't have, and these two SYSTEM, and SENSOR, which looks just generic. I'm getting a lot of these errors starting when I restarted HA after updating to master, and they keep showing up (last logged is 1 minute before I copied the text):

First occurred: 10:45:22 (722 occurrences)
Last logged: 12:02:47

The sensors also don't seem to be updating, I've triggered a motion sensor, some window/door sensors, and armed/disarmed the alarm, and the integration doesn't seem to pick up any of the events.

I'll update tomorrow how's it looking. Using the latest HA 2023.2.5 now too.

@cmbuckley
Copy link

I've updated today, but since the errors were sporadic I will give it a couple of days before reporting back with any logs.

@elahd
Copy link
Collaborator

elahd commented Feb 24, 2023

Thanks, guys.

@roooodcastro Can you enable debugging for the integration (instructions)? I'd like to see the actual server response for that system data fetch error.

Also, which provider are you using?

@catellie
Copy link

catellie commented Feb 25, 2023 via email

@elahd
Copy link
Collaborator

elahd commented Feb 25, 2023

Thanks, @catellie. With the issues you had before the update, were the sensors updating as you expected them to? Also, can you enable debug mode and share the errors?

It seems like this issue is caused by some sort of intermittent connection issue or provider-specific rate limit. The debug logs should show the details.

@catellie
Copy link

Hi again @elahd ,

I'm not 100% sure about the sensors stability before since it wasn't really fully working. I believe I've seen my camera sensors occasionally report motion and the mentioned kitchen lights have at least HAPPENED to show the right state.

Still, I've had the debug mode on since day one (due to this issue) and scanning them I can see a couple of things (some of the them somewhat odd:

  • The login seems correct (there is a large ajax dump) and my provider is correctly named - presumably my rather long userid is OK and it says 2FA is not required.
  • An update listener is registered and then an update is requested, from which there is a trouble condition where it mentions that it cant get data from GARAGE_DOOR (status 423) along with messages in my local language - this odd mostly because I don't have any garage door sensor - so perhaps a hardcode?
  • Then a long list of my devices/sensors seems correctly initialized. (5.5 seconds runtime)
  • There is an arm config entry with update interval of 30 (seconds?) along with a long list of my devices again
  • About 30 s later there is an update request which fails (403) with: ```
    [pyalarmdotcomajax] _async_get_items_and_subordinates(): Failed to get data for device type DeviceType.CAMERA. Response: [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]. Errors: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}.
  • That appears to trigger things to start over with login, GARAGE_DOOR complaint etc
  • After another dump of all my devices there is the pretty bad stacktrace with the extra "autocomplete" I mentioned earlier. (this is on the 22nd of Feb)

And then nothing more in the logs at all related to alarmdotcom (a fair amount about a broken ledstrip I'm aware of, though so logging certainly works in general).

Does this help? The garage door issue can presumably be removed although it appears harmless, but perhaps the autocomplete exception is what causes the updates to fail?

I can see the "garage_door" being imported in cover.py, but I've not decoded exactly where it fails to realize that it shouldn't bother with them...

As a side: the installer of my system named devices/sensors in my local language, which includes some non-ASCII characters, but at a glance that appears to correctly handled.

Best / Jonas

@roooodcastro
Copy link

Thanks, guys.

@roooodcastro Can you enable debugging for the integration (instructions)? I'd like to see the actual server response for that system data fetch error.

Also, which provider are you using?

Hi, I'm using Smartzone, which is an Irish provider for Alarm.com.

Since my last reply, nothing has really changed in the system. I'm not getting re-prompted over and over for login anymore, but sensors are not being updated at all.

Here's the full debug log of a single "update cycle":

2023-02-26 01:19:48.073 DEBUG (MainThread) [custom_components.alarmdotcom.alarmhub] custom_components.alarmdotcom.alarmhub: Requesting update from Alarm.com.
2023-02-26 01:19:48.073 DEBUG (MainThread) [pyalarmdotcomajax] Calling update on Alarm.com
2023-02-26 01:19:48.461 DEBUG (MainThread) [pyalarmdotcomajax] Trouble condition response: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}
2023-02-26 01:19:48.462 DEBUG (MainThread) [pyalarmdotcomajax] Refreshing data for device types: [<DeviceType.SYSTEM: 'systems'>, <DeviceType.CAMERA: 'cameras'>, <DeviceType.GARAGE_DOOR: 'garageDoors'>, <DeviceType.GATE: 'gates'>, <DeviceType.IMAGE_SENSOR: 'imageSensors'>, <DeviceType.LIGHT: 'lights'>, <DeviceType.LOCK: 'locks'>, <DeviceType.PARTITION: 'partitions'>, <DeviceType.SENSOR: 'sensors'>, <DeviceType.THERMOSTAT: 'thermostats'>, <DeviceType.WATER_SENSOR: 'waterSensors'>]
2023-02-26 01:19:48.881 DEBUG (MainThread) [pyalarmdotcomajax] _async_build_device_list(): Failed to get data for device type DeviceType.SYSTEM. Response: [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]. Errors: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}.
2023-02-26 01:19:48.881 ERROR (MainThread) [pyalarmdotcomajax] Error fetching data from Alarm.com. Got 403 status when fetching data for device type DeviceType.SYSTEM. User is already logged in. Giving up on device type.
2023-02-26 01:19:48.886 DEBUG (MainThread) [pyalarmdotcomajax] Server response: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}
2023-02-26 01:19:49.267 DEBUG (MainThread) [pyalarmdotcomajax] _async_build_device_list(): Failed to get data for device type DeviceType.CAMERA. Response: [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]. Errors: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}.
2023-02-26 01:19:49.267 ERROR (MainThread) [pyalarmdotcomajax] Error fetching data from Alarm.com. Got 403 status when fetching data for device type DeviceType.CAMERA. User is already logged in. Giving up on device type.
2023-02-26 01:19:49.272 DEBUG (MainThread) [pyalarmdotcomajax] Server response: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}
2023-02-26 01:19:49.273 DEBUG (MainThread) [pyalarmdotcomajax] Skipping GARAGE_DOOR. Not installed in user environment.
2023-02-26 01:19:49.274 DEBUG (MainThread) [pyalarmdotcomajax] Skipping GATE. Not installed in user environment.
2023-02-26 01:19:49.657 DEBUG (MainThread) [pyalarmdotcomajax] _async_build_device_list(): Failed to get data for device type DeviceType.IMAGE_SENSOR. Response: [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]. Errors: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}.
2023-02-26 01:19:49.657 ERROR (MainThread) [pyalarmdotcomajax] Error fetching data from Alarm.com. Got 403 status when fetching data for device type DeviceType.IMAGE_SENSOR. User is already logged in. Giving up on device type.
2023-02-26 01:19:49.662 DEBUG (MainThread) [pyalarmdotcomajax] Server response: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}
2023-02-26 01:19:49.662 DEBUG (MainThread) [pyalarmdotcomajax] Skipping LIGHT. Not installed in user environment.
2023-02-26 01:19:49.663 DEBUG (MainThread) [pyalarmdotcomajax] Skipping LOCK. Not installed in user environment.
2023-02-26 01:19:50.082 DEBUG (MainThread) [pyalarmdotcomajax] _async_build_device_list(): Failed to get data for device type DeviceType.PARTITION. Response: [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]. Errors: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}.
2023-02-26 01:19:50.083 ERROR (MainThread) [pyalarmdotcomajax] Error fetching data from Alarm.com. Got 403 status when fetching data for device type DeviceType.PARTITION. User is already logged in. Giving up on device type.
2023-02-26 01:19:50.086 DEBUG (MainThread) [pyalarmdotcomajax] Server response: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}
2023-02-26 01:19:50.492 DEBUG (MainThread) [pyalarmdotcomajax] _async_build_device_list(): Failed to get data for device type DeviceType.SENSOR. Response: [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]. Errors: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}.
2023-02-26 01:19:50.493 ERROR (MainThread) [pyalarmdotcomajax] Error fetching data from Alarm.com. Got 403 status when fetching data for device type DeviceType.SENSOR. User is already logged in. Giving up on device type.
2023-02-26 01:19:50.495 DEBUG (MainThread) [pyalarmdotcomajax] Server response: {'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}
2023-02-26 01:19:50.496 DEBUG (MainThread) [pyalarmdotcomajax] Skipping THERMOSTAT. Not installed in user environment.
2023-02-26 01:19:50.496 DEBUG (MainThread) [pyalarmdotcomajax] Skipping WATER_SENSOR. Not installed in user environment.
2023-02-26 01:19:50.497 DEBUG (MainThread) [custom_components.alarmdotcom.alarmhub] Finished fetching Smartzone:<MY USERNAME> data in 2.424 seconds (success: True)
2023-02-26 01:19:50.497 DEBUG (MainThread) [custom_components.alarmdotcom.alarm_control_panel] Processing state DeviceState.DISARMED for My Home
2023-02-26 01:19:50.498 DEBUG (MainThread) [custom_components.alarmdotcom.binary_sensor] Processing state DeviceState.CLOSED for Front Door
2023-02-26 01:19:50.499 DEBUG (MainThread) [custom_components.alarmdotcom.binary_sensor] Processing state DeviceState.ACTIVE for Hallway Motion Detector
2023-02-26 01:19:50.499 DEBUG (MainThread) [custom_components.alarmdotcom.binary_sensor] Processing state DeviceState.CLOSED for Kitchen Back Door
2023-02-26 01:19:50.499 DEBUG (MainThread) [custom_components.alarmdotcom.binary_sensor] Processing state DeviceState.CLOSED for Living Room Window Left
2023-02-26 01:19:50.500 DEBUG (MainThread) [custom_components.alarmdotcom.binary_sensor] Processing state DeviceState.CLOSED for Living Room Window Right
2023-02-26 01:19:50.501 DEBUG (MainThread) [custom_components.alarmdotcom.binary_sensor] Processing state DeviceState.CLOSED for Panel Glass Break
2023-02-26 01:19:50.502 DEBUG (MainThread) [custom_components.alarmdotcom.binary_sensor] Processing state DeviceState.CLOSED for Panel Motion
2023-02-26 01:19:50.502 DEBUG (MainThread) [custom_components.alarmdotcom.binary_sensor] Processing state DeviceState.ACTIVE for Upstairs Motion Detector

While all doors and windows are indeed closed, I know for a fact that at least one of those motion sensors are not detecting any motion, and that my alarm is armed, not disarmed, so I know that this data is not correct.

Thanks

@elahd
Copy link
Collaborator

elahd commented May 15, 2023

I've released an update to the underlying pyalarmdotcomajax library that powers this HA integration. This release fixes issues with 2FA, introduces a new method for refreshing device data, and supports real time notifications. My ability to test is limited -- it would be great if anyone here could help run some basic tests. See #265 for testing details. Thanks!

@elahd
Copy link
Collaborator

elahd commented May 20, 2023

I just released v3.0.0-beta of the integration with support for real time "push" updates and a fix for 2FA.

If you don't see this version, go to HACS, click on integrations, click on Alarm.com, click on the 3 dot menu at the top right of the screen, click on redownload, and turn "show beta versions". v3 will appear shortly after turning this setting on.

@catellie
Copy link

catellie commented May 21, 2023 via email

@catellie
Copy link

Hi again,

I've had it running since I mentioned it above, and status reporting is indeed very fast for a fair amount of time.
However, there still seems to be a long term issue: me Levine the house this morning was correctly registered, but I noticed that manual Disarm on the panel today around 13:15 did not show up, nor was I able to press Disarm in the HA app. (The Alarm.com worked fine.) So at a glance, for me it only listens/works correctly for about 24 hours. Speed-wise this is an improvement. Stability wise I guess it's about as before in my case.

Best / Jonas

@elahd
Copy link
Collaborator

elahd commented May 22, 2023

Thanks for testing. Can you try again with v3.0.0-beta.2?

@catellie
Copy link

catellie commented May 22, 2023 via email

@elahd
Copy link
Collaborator

elahd commented May 22, 2023

@catellie Is this with v3.0.0-beta or v3.0.0-beta.2?

@catellie
Copy link

catellie commented May 22, 2023

The above was with the first beta. I've just applied the beta.2. So far I can see a neat looking websocket message block and no tracebacks, but I see some errors like this one:

2023-05-22 23:21:51.212 ERROR (MainThread) [homeassistant.components.ipp.coordinator] Error fetching ipp data: Invalid response from API: ('HTTP 500', {'content-type': None, 'message': '', 'status-code': 500})

As far as I can tell this version has no serious problems that turns into tracebacks. I'll let it run and see what happens...

🤞 / Jonas

@catellie
Copy link

Have now studied a fair number of iterations and as far as I can tell there are just two messages that are on the odd side:

(MainThread) [pyalarmdotcomajax] Requires two-factor authentication. Enabled methods are []

I assume this indicates that I DON'T have 2FA?

(MainThread) [pyalarmdotcomajax.websockets.handler.partition] Support for event EventType.UserLoggedIn (55) not yet implemented by Partition.

Presumably this is not harmful?

Best / Jonas

@catellie
Copy link

Ooops, there was just a problem:

2023-05-22 23:35:15.056 ERROR (MainThread) [custom_components.alarmdotcom.controller] Unexpected error fetching Garda Alarm:REDACTED data: 
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/pyalarmdotcomajax/__init__.py", line 891, in _async_get_system
    return [json_rsp["data"]]
KeyError: 'data'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 258, in _async_refresh
    self.data = await self._async_update_data()
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 217, in _async_update_data
    return await self.update_method()
  File "/config/custom_components/alarmdotcom/controller.py", line 142, in async_update
    await self.api.async_update()
  File "/usr/local/lib/python3.10/site-packages/pyalarmdotcomajax/__init__.py", line 243, in async_update
    raw_devices: list[dict] = await self._async_get_system(self._active_system_id)
  File "/usr/local/lib/python3.10/site-packages/pyalarmdotcomajax/__init__.py", line 895, in _async_get_system
    raise DataFetchFailed from err
pyalarmdotcomajax.errors.DataFetchFailed

@elahd
Copy link
Collaborator

elahd commented May 22, 2023 via email

@catellie
Copy link

catellie commented May 22, 2023 via email

@catellie
Copy link

Well, it appears to still break after some time. Here are some log snippets:

2023-05-24 06:59:52.110 ERROR (MainThread) [custom_components.alarmdotcom.controller] Unexpected error fetching Garda Alarm:XXX data:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/pyalarmdotcomajax/__init__.py", line 891, in _async_get_system
    return [json_rsp["data"]]
KeyError: 'data'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 258, in _async_refresh
    self.data = await self._async_update_data()
  File "/usr/src/homeassistant/homeassistant/helpers/update_coordinator.py", line 217, in _async_update_data
    return await self.update_method()
  File "/config/custom_components/alarmdotcom/controller.py", line 142, in async_update
    await self.api.async_update()
  File "/usr/local/lib/python3.10/site-packages/pyalarmdotcomajax/__init__.py", line 243, in async_update
    raw_devices: list[dict] = await self._async_get_system(self._active_system_id)
  File "/usr/local/lib/python3.10/site-packages/pyalarmdotcomajax/__init__.py", line 895, in _async_get_system
    raise DataFetchFailed from err
pyalarmdotcomajax.errors.DataFetchFailed

and

Server Response:
{'errors': [{'status': '403', 'detail': 'NotAllowed', 'code': 403}]}

As far as I can tell, the websocket message blob looks fine (the device type is -1 though). And my energy meter still has occasional glitches.

@elahd
Copy link
Collaborator

elahd commented May 28, 2023

Can you try again with v3.0.1?

@catellie
Copy link

I installed it some 30 mins ago, but it still has not succeeded to fetch anything. From the log:

2023-05-28 20:59:59.229 ERROR (MainThread) [pyalarmdotcomajax] Failed to get AJAX key from Alarm.com.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/pyalarmdotcomajax/__init__.py", line 488, in async_login
    self._ajax_headers["ajaxrequestuniquekey"] = resp.cookies["afg"].value
KeyError: 'afg'
2023-05-28 20:59:59.230 ERROR (MainThread) [homeassistant.config_entries] Error setting up entry Garda Alarm:REDACTED for alarmdotcom
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/pyalarmdotcomajax/__init__.py", line 488, in async_login
    self._ajax_headers["ajaxrequestuniquekey"] = resp.cookies["afg"].value
KeyError: 'afg'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/config/custom_components/alarmdotcom/controller.py", line 114, in initialize_lite
    await self.api.async_login()
  File "/usr/local/lib/python3.10/site-packages/pyalarmdotcomajax/__init__.py", line 492, in async_login
    raise UnexpectedResponse from err
pyalarmdotcomajax.exceptions.UnexpectedResponse

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/config_entries.py", line 387, in async_setup
    result = await component.async_setup_entry(hass, self)
  File "/config/custom_components/alarmdotcom/__init__.py", line 76, in async_setup_entry
    await controller.initialize()
  File "/config/custom_components/alarmdotcom/controller.py", line 60, in initialize
    await self.initialize_lite(
  File "/config/custom_components/alarmdotcom/controller.py", line 122, in initialize_lite
    raise UpdateFailed from err
homeassistant.helpers.update_coordinator.UpdateFailed

@catellie
Copy link

catellie commented May 28, 2023

Hmm, downgrading appears not to help. Did you perhaps adjust something in the library, @elahd ?

For the record, I'm now on:
Home Assistant 2023.5.4
Frontend 20230503.3 - latest

@elahd
Copy link
Collaborator

elahd commented May 31, 2023

@catellie Alarm.com changed their API, so lower versions will still break for some accounts. Try downloading v3.0.2. That fixes the last error you posted, but I'm not sure of whether something new will pop up now that this is fixed.

@catellie
Copy link

Yo, @elahd - I installed 3.0.2 late last night and it appears to work as before - for better and for worse:
It logs in fine again and works for a while - it detected my manually arming when I left this morning (7:30-ish), but appears to have got off track again before I came back this evening (18:00-ish) as it still claims "Armed away" as status and the only available action is "DISARM".

My log file keeps repeating blocks like this:

2023-05-31 21:39:54.262 DEBUG (MainThread) [pyalarmdotcomajax] Sending keep alive signal. Time until session context refresh: ~ 12 minutes.
2023-05-31 21:39:54.692 DEBUG (MainThread) [pyalarmdotcomajax] Session expired.
2023-05-31 21:39:54.693 INFO (MainThread) [pyalarmdotcomajax] User session expired. Logging back in.
2023-05-31 21:39:54.693 DEBUG (MainThread) [pyalarmdotcomajax] Attempting to log in to Alarm.com
2023-05-31 21:39:55.502 DEBUG (MainThread) [pyalarmdotcomajax] Response status from Alarm.com: 200
2023-05-31 21:39:57.170 DEBUG (MainThread) [pyalarmdotcomajax] Logged in to Alarm.com.
2023-05-31 21:39:58.020 DEBUG (MainThread) [pyalarmdotcomajax] *** START IDENTITY INFO ***
2023-05-31 21:39:58.021 DEBUG (MainThread) [pyalarmdotcomajax] Provider: REDACTED
2023-05-31 21:39:58.021 DEBUG (MainThread) [pyalarmdotcomajax] User: REDACTED
2023-05-31 21:39:58.021 DEBUG (MainThread) [pyalarmdotcomajax] Keep Alive Interval: 780000
2023-05-31 21:39:58.021 DEBUG (MainThread) [pyalarmdotcomajax] Keep Alive URL: /web/KeepAlive.aspx
2023-05-31 21:39:58.021 DEBUG (MainThread) [pyalarmdotcomajax] *** END IDENTITY INFO ***
2023-05-31 21:39:58.508 DEBUG (MainThread) [pyalarmdotcomajax]
==============================
Server Response:
{"data": {"id": REDACTED, "type": "twoFactorAuthentication/twoFactorAuthentication", "attributes": {"smsMobileNumber": {"country": "44"}, "email": "REDACTED", "currentDeviceName": "Other", "isCurrentDeviceTrusted": false, "selectedTypeOf2FA": 0, "enabledTwoFactorTypes": 0, "valid2FAPermissions": [1, 4], "canReset2FA": false, "showSuggestedSetup": false, "canSkipSuggestedSetup": false}}, "meta": {"transformer_version": "1.1"}}
==============================

Looking for a pattern, I see that after an initial OK handshake I started to get this about a minute after installation (on every call):

Server Response:
{"errors": [{"status": "403", "detail": "NotAllowed", "code": 403}]}

Then about an hour after I left, it starts alternating between the two types of responses - not in any clear pattern, just a fair amount of one and then a fair amount of the other and so on.

I can see ONE fairly clear pattern, though: Now and then there seems to be a Getting all devices in
My log indicates these are fine at 03:32:51, 10:47:56, 13:32:57 and 13:47:58, but at 15:02:59 this ends up as a 403 and after that it has not recovered - as in no sensor states or similar have been updated from this plugin.

Does this help you see the culprit? / Jonas

@catellie
Copy link

In my mind: 403 should indicate that the request is reasonably authenticated (hence no 401) as such, BUT still tries to fetch info that is not allowed for some reason (as opposed to data that is missing -> 404). That is of course assuming the http codes are used according to my interpretation of the RFC... 🤷

@catellie
Copy link

Another thought about the numbers: is the 780000 related to the ~12 minutes - if so, it seems the be exactly 13 minutes.

@catellie
Copy link

catellie commented Jun 1, 2023

So, I just left it running and 24 hours later, there has still not even been a TRY to do the Getting all devices call, but plenty of the Getting system data calls resulting in 403 (seemingly roughly 2 tries every 15 minutes).

I can't say I fully understand the protocol, but just looking at the results, I'd say there are two issues, that are possibly related:

  • The fundamental issue is that 403 happens at all - presumably some argument gets broken over time? Is this the listener that is supposed to instantly see when things change?
  • The separate issue seems to be that Getting all devices needs some kind of retry mechanism. Presumably this is NOT the listener (I'd expect the listener to only get changes, not the full list)

Thoughts @elahd ? / Jonas

@elahd
Copy link
Collaborator

elahd commented Jun 1, 2023

My understanding of what an Alarm.com 403 response means comes from these 3 files:

https://www.alarm.com/web/system/assets/addon-tree-output/@ember-data/adapter/error.js

  /**
    A `ForbiddenError` equates to a HTTP `403 Forbidden` response status.
    It is used by an adapter to signal that a request to the external API was
    valid but the server is refusing to respond to it. If authorization was
    provided and is valid, then the authenticated user does not have the
    necessary permissions for the request.
  
    @class ForbiddenError
    @extends AdapterError
  */
  var ForbiddenError = extend(AdapterError, 'The adapter operation is forbidden');
  _exports.ForbiddenError = ForbiddenError;
  ForbiddenError.prototype.code = 'ForbiddenError';

https://www.alarm.com/web/system/assets/customer-site/services/user-activity.js

        // If its a 403, the server received our request but the session already timed out.
        if (error.code === 403) {
          return this.contextManager.handleSessionExpiration();
        }

https://www.alarm.com/web/system/assets/addon-tree-output/@adc/ajax/enums/AjaxResponseHttpCode.js

  const InvalidAntiForgeryToken = 403;
  _exports.InvalidAntiForgeryToken = InvalidAntiForgeryToken;

There is a retry mechanism—that's why you see two requests every 15 minutes and not one—but that doesn't seem to be working. Can you capture traffic via your browser's devtools while you log into Alarm.com? You should see a request to https://www.alarm.com/web/api/identities. Post the contents of the response, specifically data→0→attributes→applicationSessionProperties. You should see something like this:

{
    "shouldTimeout": true,
    "keepAliveUrl": "/web/KeepAlive.aspx",
    "enableKeepAlive": true,
    "logoutTimeoutMs": 900000,
    "inactivityWarningTimeoutMs": 780000
}

The integration uses the exact values pulled from this object to keep the session alive, so you probably have 780000, as well, if you're seeing the 13 minute keep alive interval.

@catellie
Copy link

catellie commented Jun 3, 2023

Hmm, the first one makes perfect sense and I guess the session expiration is fair - assuming this is indeed the correct time out, so it smells a lot like the AntiForgery is triggered (which I guess is what you have been saying really).

I've just traced the session properties and they are indeed identical to what you listed above.
Skimming the various other attributes, the seem to make reasonable sense (Madrid TZ is a bit odd, but technically no different). I seem to have "Smart arming/disarm" which sounds good... 10000+ lines of json 🤦

@elahd
Copy link
Collaborator

elahd commented Jun 8, 2023

@catellie Test out v3.0.4-beta.1. It uses a dedicated web session instead of sharing with other HA integrations.

@catellie
Copy link

catellie commented Jun 8, 2023

I installed it instantly and about 1.5 hours in it is looking 👍!

@catellie
Copy link

catellie commented Jun 8, 2023

@elahd it is still going strong, detecting my manual Disarm when I arrived back home today. I'd say this is already more stable than I've seen in recent months.
I've not yet examined the logs in detail (probably will not have time before mid weekend), but my gut feeling is so far very positive! 🤞

@catellie
Copy link

catellie commented Jun 9, 2023 via email

@elahd
Copy link
Collaborator

elahd commented Jun 9, 2023

Which version were you running when you posted "it is still going strong" earlier?

What do you see in the debug log after the system was armed?

Also -- what is your polling interval set to? Did the manual arming register eventually?

@elahd
Copy link
Collaborator

elahd commented Jun 9, 2023

@catellie It looks like the session / reconfigure issue is resolved, so I'm going to close this ticket. Can you follow up on my question above in #288?

@elahd elahd closed this as completed Jun 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants