Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Type annotations in synapse.databases.main.devices #13025

Merged
merged 22 commits into from
Jun 15, 2022
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 4 additions & 4 deletions synapse/storage/databases/main/devices.py
Original file line number Diff line number Diff line change
Expand Up @@ -1809,7 +1809,7 @@ def _add_device_outbound_room_poke_txn(

async def get_uncoverted_outbound_room_pokes(
self, limit: int = 10
) -> List[Tuple[str, str, str, int, Optional[Dict[str, str]]]]:
) -> List[Tuple[str, str, str, int, Dict[str, str]]]:
"""Get device list changes by room that have not yet been handled and
written to `device_lists_outbound_pokes`.

Expand All @@ -1827,7 +1827,7 @@ async def get_uncoverted_outbound_room_pokes(

def get_uncoverted_outbound_room_pokes_txn(
txn: LoggingTransaction,
) -> List[Tuple[str, str, str, int, Optional[Dict[str, str]]]]:
) -> List[Tuple[str, str, str, int, Dict[str, str]]]:
txn.execute(sql, (limit,))

return [
Expand All @@ -1836,7 +1836,7 @@ def get_uncoverted_outbound_room_pokes_txn(
device_id,
room_id,
stream_id,
db_to_json(opentracing_context),
Copy link
Contributor Author

@DMRobertson DMRobertson Jun 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure I liked this one. The original complaint was

synapse/storage/databases/main/devices.py:1860: error: Argument 2 to "runInteraction" of "DatabasePool" has incompatible type "Callable[[LoggingTransaction], List[Tuple[str, str, str, int, Optional[Dict[str, str]]]]]"; expected "Callable[..., List[Tuple[str, str, str, int, Dict[str, str]]]]" [arg-type]

Which corresponds to the context argument of this function:

def _add_device_outbound_poke_to_stream_txn(
self,
txn: LoggingTransaction,
user_id: str,
device_ids: Iterable[str],
hosts: Collection[str],
stream_ids: List[int],
context: Dict[str, str],
) -> None:
for host in hosts:
txn.call_after(
self._device_list_federation_stream_cache.entity_has_changed,
host,
stream_ids[-1],
)
now = self._clock.time_msec()
stream_id_iterator = iter(stream_ids)
encoded_context = json_encoder.encode(context)
self.db_pool.simple_insert_many_txn(
txn,
table="device_lists_outbound_pokes",
keys=(
"destination",
"stream_id",
"user_id",
"device_id",
"sent",
"ts",
"opentracing_context",
),
values=[
(
destination,
next(stream_id_iterator),
user_id,
device_id,
not self.hs.is_mine_id(
user_id
), # We only need to send out update for *our* users
now,
encoded_context if whitelisted_homeserver(destination) else "{}",
)
for destination in hosts
for device_id in device_ids
],
)

If we pass context=None, we'll first transform it to a JSON null and store that blob of json as the four-codepoint string null. I don't think that's what we want!!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that opentracing_context is a nullable text field in the DB:

matrix=> \d device_lists_changes_in_room
Did not find any relation named "device_lists_changes_in_room".
matrix=> \d matrix.device_lists_changes_in_room
             Table "matrix.device_lists_changes_in_room"
          Column           |  Type   | Collation | Nullable | Default 
---------------------------+---------+-----------+----------+---------
 user_id                   | text    |           | not null | 
 device_id                 | text    |           | not null | 
 room_id                   | text    |           | not null | 
 stream_id                 | bigint  |           | not null | 
 converted_to_destinations | boolean |           | not null | 
 opentracing_context       | text    |           |          | 
Indexes:
    "device_lists_changes_in_stream_id" UNIQUE, btree (stream_id, room_id)
    "device_lists_changes_in_stream_id_unconverted" btree (stream_id) WHERE NOT converted_to_destinations

Copy link
Member

@clokep clokep Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if we have any "null" data already stored though? It is very unclear to me if it is safe to change this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not on matrix.org:

matrix=> select * from matrix.device_lists_changes_in_room where opentracing_context='null' limit 1;
(0 rows)

Time: 11322.423 ms (00:11.322)
matrix=> select * from matrix.device_lists_changes_in_room where opentracing_context='"null"' limit 1;
(0 rows)

Time: 10651.809 ms (00:10.652)
matrix=> select * from matrix.device_lists_changes_in_room where opentracing_context is NULL limit 1;
(0 rows)

Time: 3886.793 ms (00:03.887)

I suppose a safer approach would be to write a migration which makes the column non-nullable.

Maybe this is better dropped though; ideally type hints wouldn't kick off invasive changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1569b14 reverts this and 134d3da presents an alternative.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I'd be curious what @erikjohnston thinks here since this was was just added in #12321 (and maybe #13045). Is there a long term plan here we're unsure about?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused as to why we can't let it be None if I'm honest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mainly just trying to keep the existing type hints happy. Some of them say Dict[] and some of them say Optional[Dict].

Maybe we just use Optional[Dict] everywhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd go with Optional[Dict] if the DB has a nullable column?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8395995 and 60e2ce0 do this.

db_to_json(opentracing_context) or {},
)
for user_id, device_id, room_id, stream_id, opentracing_context in txn
]
Expand All @@ -1852,7 +1852,7 @@ async def add_device_list_outbound_pokes(
room_id: str,
stream_id: int,
hosts: Collection[str],
context: Optional[Dict[str, str]],
context: Dict[str, str],
) -> None:
"""Queue the device update to be sent to the given set of hosts,
calculated from the room ID.
Expand Down