Session Token Management APIs #36971

tvaron3 · 2024-08-21T05:59:50Z

Problem Statement

Customers who maintain their own session tokens need a way to get the most up-to-date session token. For example, a customer who uses multiple clients and keeps session tokens in a shared cache can hit race conditions when updating that cache. A customer with a high cardinality of logical partition keys would also have to store many session tokens. Addresses #36286.

Changes

The changes would be part of a preview package.
Added a new method that returns the most up-to-date session token, for customers who want to keep track of their own session tokens.
Added a new API for converting a logical partition key to a feed range.
Added a new API for checking whether a feed range is a subset of another feed range.
Fixed a bug with overlapping ranges and added test coverage.
In the future, we could add the ability to get artificial feed ranges to do operations.

APIs

Container.py
def get_updated_session_token(feed_ranges_to_session_tokens: List, target_feed_range: str) -> str - Requires no metadata calls.
def feed_range_for_logical_partition(pk: PartitionKey) -> FeedRange - May make metadata calls for the collection properties, but the result is cached.
def is_feed_range_subset(parent_feed_range: str, child_feed_range: str) -> bool - Requires no metadata calls.
def read_feed_ranges(num_of_ranges: int) -> List - Out of scope for this PR. This would require metadata calls to set up the pkrange cache.
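
To make the subset check concrete, here is a minimal sketch. It assumes a feed range can be modeled as a half-open interval [min, max) of effective partition key strings that compare lexicographically. `EpkRange`, `contains`, and `overlaps` are hypothetical names used only for illustration; they are not the SDK's FeedRange type or its methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpkRange:
    """Hypothetical half-open range [min_inclusive, max_exclusive) of effective partition keys."""
    min_inclusive: str
    max_exclusive: str

    def contains(self, child: "EpkRange") -> bool:
        # A child is a subset when both of its bounds fall inside the parent.
        return (self.min_inclusive <= child.min_inclusive
                and child.max_exclusive <= self.max_exclusive)

    def overlaps(self, other: "EpkRange") -> bool:
        # Two half-open ranges overlap when each one starts before the other ends.
        return (self.min_inclusive < other.max_exclusive
                and other.min_inclusive < self.max_exclusive)


parent = EpkRange("AA", "DD")
print(parent.contains(EpkRange("AA", "BB")))  # child lies fully inside the parent
print(parent.contains(EpkRange("CC", "EE")))  # child extends past the parent's upper bound
print(parent.overlaps(EpkRange("CC", "EE")))  # the ranges share [CC, DD)
```

Because the check only compares range boundaries, it needs no call to the service, which matches the "no metadata calls" note on is_feed_range_subset.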

Samples

# This would be happening through different clients
# Using physical partition model for read operations
cache = {}
session_token = ""
feed_range = container.feed_range_for_logical_partition(logical_pk)
# cache maps feed range -> session token, so iterate over its items
for stored_feed_range, stored_session_token in cache.items():
    if container.is_feed_range_subset(stored_feed_range, feed_range):
        session_token = stored_session_token
read_item = container.read_item(doc_to_read, logical_pk, session_token=session_token)


logical_pk_feed_range = container.feed_range_for_logical_partition(logical_pk)
session_token = container.client_connection.last_response_headers["x-ms-session-token"]
feed_ranges_and_session_tokens = []

# Get feed ranges for physical partitions
container_feed_ranges = container.read_feed_ranges()
target_feed_range = ""

# which feed range maps to the logical pk from the operation
for feed_range in container_feed_ranges:
    if container.is_feed_range_subset(feed_range, logical_pk_feed_range):
        target_feed_range = feed_range
        break 
# cache maps feed range -> session token, so iterate over its items
for cached_feed_range, cached_session_token in cache.items():
    feed_ranges_and_session_tokens.append((cached_feed_range, cached_session_token))
# Add the target feed range and session token from the operation
feed_ranges_and_session_tokens.append((target_feed_range, session_token))
# Key the cache by the matched target feed range, not the loop variable
cache[target_feed_range] = container.get_updated_session_token(feed_ranges_and_session_tokens, target_feed_range)



# Different ways of storing the session token and how to get most updated session token

# ---------------------1. using logical partition key ---------------------------------------------------
# could also use the one stored from the responses headers
target_feed_range = container.feed_range_for_logical_partition(logical_pk)
updated_session_token = container.get_updated_session_token(feed_ranges_and_session_tokens, target_feed_range)
# ---------------------2. using artificial feed range ----------------------------------------------------
# Get four artificial feed ranges
container_feed_ranges = container.read_feed_ranges(4)

pk_feed_range = container.feed_range_for_logical_partition(logical_pk)
target_feed_range = ""
# which feed range maps to the logical pk from the operation
for feed_range in container_feed_ranges:
    if container.is_feed_range_subset(feed_range, pk_feed_range):
        target_feed_range = feed_range
        break 

updated_session_token = container.get_updated_session_token(feed_ranges_and_session_tokens, target_feed_range)
# ---------------------3. using physical partitions -----------------------------------------------------
# Get feed ranges for physical partitions
container_feed_ranges = container.read_feed_ranges()

pk_feed_range = container.feed_range_for_logical_partition(logical_pk)
target_feed_range = ""
# which feed range maps to the logical pk from the operation
for feed_range in container_feed_ranges:
    if container.is_feed_range_subset(feed_range, pk_feed_range):
        target_feed_range = feed_range
        break 

updated_session_token = container.get_updated_session_token(feed_ranges_and_session_tokens, target_feed_range)
# ------------------------------------------------------------------------------------------------------

Tradeoffs to Storing Session Token by Logical Partition Key vs Physical Partition vs Artificial Feed Ranges

Storing session tokens by logical partition key has the benefit of requiring fewer updates. This minimizes the number of concurrent updates, which can be advantageous for performance. The availability impact during a failover is also smaller because there are fewer updates to the session tokens. For example, if Region A fails over and the client holds a session token with a global LSN of 42, the next request goes to Region B, where the LSN on the replicas might be 32 because of replication lag. This discrepancy triggers the 404 / 1002 exception (Read Session Not Available) for any request that uses this session token.

On the other hand, using physical partitions or artificial feed ranges involves an optimistic get from the cache, because the number of concurrent updates increases significantly. The benefit is that the cardinality of the stored session tokens is significantly lower, which can simplify management and reduce overhead. The drawback is a bigger blast radius during a failover, because the scenario described above becomes more common.

Implementation

Glossary

Session Token Format: PKRangeId:VersionNumber#GlobalLSN#RegionId1=LocalLSN1#RegionId2=LocalLSN2...
Compound session token: Comma separated session tokens
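
As an illustration of the two formats above, the following sketch parses a single session token and splits a compound one. `ParsedSessionToken`, `parse_session_token`, and `split_compound` are hypothetical helpers written for this description, not part of the azure-cosmos public surface, and the sketch assumes every field is well formed.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ParsedSessionToken:
    pk_range_id: str
    version: int
    global_lsn: int
    local_lsns: Dict[int, int]  # region id -> local LSN


def parse_session_token(token: str) -> ParsedSessionToken:
    # "PKRangeId:VersionNumber#GlobalLSN#RegionId1=LocalLSN1#..."
    pk_range_id, vector = token.strip().split(":", 1)
    segments = vector.split("#")
    if len(segments) < 2:
        raise ValueError(f"Malformed session token: {token!r}")
    version, global_lsn = int(segments[0]), int(segments[1])
    local_lsns = {}
    for segment in segments[2:]:
        region_id, local_lsn = segment.split("=")
        local_lsns[int(region_id)] = int(local_lsn)
    return ParsedSessionToken(pk_range_id, version, global_lsn, local_lsns)


def split_compound(compound: str) -> List[ParsedSessionToken]:
    # A compound session token is a comma-separated list of session tokens.
    return [parse_session_token(part) for part in compound.split(",")]


print(parse_session_token("0:1#54#3=50"))
print(split_compound("2:1#54#3=52, 1:1#55#3=52"))
```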

Some Scenarios

Scenario	Input	Output
Normal Case	[("AA-BB", "0:1#54#3=50"), ("AA-BB","0:1#51#3=52")], "AA-BB"	"0:1#54#3=52"
Physical Partition Split with Both Children	[("AA-DD", "0:1#51#3=52"), ("AA-BB","1:1#55#3=52"), ("BB-DD","2:1#54#3=52")], "AA-DD"	"1:1#55#3=52, 2:1#54#3=52"
Physical Partition Split with One Child	[("AA-DD", "0:1#51#3=52"), ("AA-BB","1:1#55#3=52")], "AA-DD"	"0:1#51#3=52, 1:1#55#3=52"
Physical Partition Merge	[("AA-DD", "0:1#55#3=52"), ("AA-BB","1:1#51#3=52")], "AA-DD"	"0:1#55#3=52"
Compound Session Token	[("AA-DD", "2:1#54#3=52, 1:1#55#3=52"), ("AA-BB","0:1#51#3=52")], "AA-BB"	"2:1#54#3=52, 1:1#55#3=52, 0:1#51#3=52"
Several Compound Session Token	[("AA-DD", "2:1#57#3=52, 1:1#57#3=52"), ("AA-DD","2:1#56#3=52, 1:1#58#3=52")], "AA-DD"	"2:1#57#3=52, 1:1#58#3=52"
Overlapping Ranges	[("AA-CC", "0:1#54#3=52"), ("BB-FF","2:1#51#3=52")], "AA-EE"	"0:1#54#3=52,2:1#51#3=52"
No Relevant Feed Ranges	[("CC-DD", "0:1#54#3=52"), ("EE-FF","0:1#51")], "AA-BB"	throw illegal argument exception

Flow

For the is_feed_range_subset(parent_feed_range: str, child_feed_range: str) -> bool API, the implementation will follow the .NET implementation in this PR: https://github.com/Azure/azure-cosmos-dotnet-v3/pull/4566/files. Session tokens will be merged the same way the session container merges them: the merged token takes the higher version number, the higher global LSN, and the higher local LSN for each region.
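
A minimal sketch of that merge rule for two tokens of the same partition key range is shown below. `merge_tokens` is a hypothetical helper, and taking a plain maximum of every field is a simplification: the session container may treat tokens with different version numbers differently.

```python
def merge_tokens(a: str, b: str) -> str:
    """Merge two session tokens that share a pkrangeid by taking the higher value of each field."""
    pk_a, vector_a = a.strip().split(":", 1)
    pk_b, vector_b = b.strip().split(":", 1)
    if pk_a != pk_b:
        raise ValueError("tokens belong to different partition key ranges")

    version_a, global_a, *regions_a = vector_a.split("#")
    version_b, global_b, *regions_b = vector_b.split("#")

    # Keep the highest local LSN seen for each region id.
    local_lsns = {}
    for entry in regions_a + regions_b:
        region_id, lsn = entry.split("=")
        local_lsns[region_id] = max(local_lsns.get(region_id, 0), int(lsn))

    version = max(int(version_a), int(version_b))
    global_lsn = max(int(global_a), int(global_b))
    parts = [str(version), str(global_lsn)]
    parts += [f"{region_id}={lsn}" for region_id, lsn in local_lsns.items()]
    return f"{pk_a}:" + "#".join(parts)


# The "Normal Case" row from the scenarios table
print(merge_tokens("0:1#54#3=50", "0:1#51#3=52"))  # 0:1#54#3=52
```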

---
title: Merge Session Tokens
---
flowchart TB
A["[(feed_range, session_token), ...],  target_feed_range"] --> B[filter all tuples with feed_range overlapping with target_feed_range]
B --> C{'Is there a feed_range that is a superset of some of the other feed_ranges excluding tuples with compound session tokens?}
C -- Yes and Superset Feed Range has Higher LSN --> F["merge and take the pkrangeid(s) of the higher session token(s)"]
C -- Yes and Superset has Lower LSN --> I{Are there feed_ranges that can be combined to be equal or larger than the super set feed range?}
I -- yes --> F
I -- No --> Z[compound the session tokens]
Z --> C
F --> C
C-- no --> H[compound the session tokens]
H --> E[Merge any session tokens with same pkrangeids]
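
The last step of the flow, merging any session tokens that share a pkrangeid, can be sketched as follows. `merge_same_pk_range_ids` and its helpers are hypothetical names, the per-field maximum is a simplified merge, and the earlier steps of the flow (filtering by feed range and resolving supersets) are deliberately left out.

```python
from typing import Dict, List, Tuple

# (version, global LSN, local LSNs keyed by region id)
Vector = Tuple[int, int, Dict[str, int]]


def _parse(token: str) -> Tuple[str, Vector]:
    pk_range_id, vector = token.strip().split(":", 1)
    version, global_lsn, *regions = vector.split("#")
    local = {r: int(l) for r, l in (entry.split("=") for entry in regions)}
    return pk_range_id, (int(version), int(global_lsn), local)


def _merge(a: Vector, b: Vector) -> Vector:
    # Simplified merge: take the higher value of every field.
    regions = dict(a[2])
    for region, lsn in b[2].items():
        regions[region] = max(regions.get(region, 0), lsn)
    return max(a[0], b[0]), max(a[1], b[1]), regions


def merge_same_pk_range_ids(compound_tokens: List[str]) -> str:
    merged: Dict[str, Vector] = {}
    for compound in compound_tokens:
        for token in compound.split(","):
            pk_range_id, vector = _parse(token)
            if pk_range_id in merged:
                vector = _merge(merged[pk_range_id], vector)
            merged[pk_range_id] = vector
    # Re-serialize as a compound session token, one entry per pkrangeid.
    return ", ".join(
        f"{pk}:{version}#{lsn}" + "".join(f"#{r}={l}" for r, l in local.items())
        for pk, (version, lsn, local) in merged.items()
    )


# The "Several Compound Session Token" row from the scenarios table
print(merge_same_pk_range_ids(["2:1#57#3=52, 1:1#57#3=52", "2:1#56#3=52, 1:1#58#3=52"]))
# 2:1#57#3=52, 1:1#58#3=52
```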

azure-sdk · 2024-08-21T06:31:20Z

API change check

APIView has identified API level changes in this PR and created following API reviews.

azure-cosmos

tvaron3 · 2024-10-08T01:44:39Z

/azp run python - cosmos - tests

azure-pipelines · 2024-10-08T01:44:58Z

Azure Pipelines successfully started running 1 pipeline(s).

sdk/cosmos/azure-cosmos/azure/cosmos/_routing/routing_range.py

sdk/cosmos/azure-cosmos/azure/cosmos/_vector_session_token.py

sdk/cosmos/azure-cosmos/azure/cosmos/_session_token_helpers.py

sdk/cosmos/azure-cosmos/azure/cosmos/_cosmos_client_connection.py

sdk/cosmos/azure-cosmos/azure/cosmos/aio/_container.py

annatisch · 2024-10-09T18:21:35Z

sdk/cosmos/azure-cosmos/azure/cosmos/aio/_container.py

+        """
+        return FeedRangeEpk(await self._get_epk_range_for_partition_key(partition_key))
+
+    async def is_feed_range_subset(self, parent_feed_range: FeedRange, child_feed_range: FeedRange) -> bool:


It looks like this doesn't need to be on the client at all? Can this not simply be a utility on the FeedRange object itself?
As a Python developer, it would be cool if I could just do something like:
if child_feed_range in parent_feed_range:
Or failing that, even something like:
if parent_feed_range.contains(child_feed_range):
if child_feed_range.is_subset(parent_feed_range):

Added container information to the feed ranges and a check to make sure the feed ranges inputted correspond to the correct container. Feed ranges are scoped to a container.

annatisch · 2024-10-09T18:25:05Z

sdk/cosmos/azure-cosmos/azure/cosmos/aio/_container.py

@@ -1319,3 +1332,40 @@ async def read_feed_ranges(

        return [FeedRangeEpk(Range.PartitionKeyRangeToRange(partitionKeyRange))
                for partitionKeyRange in partition_key_ranges]
+
+    async def get_updated_session_token(self,


This doesn't look like it needs to be on the client at all, nor does it look like it should be async?
This could also probably be a utility on the FeedRange object:
target_feed_range.get_session_tokens(feed_ranges_to_session_tokens)

Added container information to the feed ranges and a check to make sure the feed ranges inputted correspond to the correct container. Feed ranges and session tokens are scoped to a container.

tvaron3 · 2024-10-11T00:03:27Z

/azp run python - cosmos - tests

azure-pipelines · 2024-10-11T00:03:46Z

Azure Pipelines successfully started running 1 pipeline(s).

tvaron3 · 2024-10-11T18:17:09Z

/azp run python - cosmos - tests

azure-pipelines · 2024-10-11T18:17:29Z

Azure Pipelines successfully started running 1 pipeline(s).

annie-mac and others added 9 commits August 16, 2024 11:13

merge from main and resolve conflicts

2950e20

remove async keyword from changeFeed query in aio package

7a1a1eb

refactor

b6c53fb

refactor

5f16b14

fix pylint

36990ef

added public surface methods

3c569e8

pylint fix

7479b0c

fix

2e76620

added functionality for merging session tokens from logical pk

56bbb9e

github-actions bot added the Cosmos label Aug 21, 2024

annie-mac and others added 19 commits August 21, 2024 09:41

fix mypy

8c0aa46

added tests for basic merge and split

28394b9

resolve comments

25c3363

resolve comments

cecdfa5

resolve comments

65ed132

resolve comments

4bb30d2

fix pylint

5addcdc

fix mypy

59814d7

merge feed range changes

ec79b94

fix tests

66c3f7b

merged with feed range branch

1e7a268

Merge branch 'main' of https://github.com/Azure/azure-sdk-for-python …

997b6b0

…into tvaron3/sessionTokenHelper

Merge branch 'main' into addFeedRangeSupportInChangeFeed

7eda72f

add tests

3a2e4e1

fix pylint

0883dac

Merge branch 'addFeedRangeSupportInChangeFeed' of https://github.com/…

b7d1210

…xinlian12/azure-sdk-for-python into tvaron3/sessionTokenHelper

fix and resolve comments

195c47c

fix and resolve comments

246b1be

Added isSubsetFeedRange logic

10fe387

tvaron3 added 7 commits October 4, 2024 18:01

Added more tests

ad3ae4f

merge with main

8f466a1

Changed tests to use new public feed range and more test coverage for…

5249d0a

… request context

Added more tests

40523f5

Fix tests and add changelog

9f88b4e

fix spell checks

7c23e87

Merge branch 'main' of https://github.com/Azure/azure-sdk-for-python …

4d0b058

…into tvaron3/sessionTokenHelper

tvaron3 added 2 commits October 7, 2024 23:33

Added tests and pushed request context to client level

d7c598e

Added async methods and removed feed range from request context

8698098

tvaron3 marked this pull request as ready for review October 8, 2024 18:24

tvaron3 requested review from annatisch and a team as code owners October 8, 2024 18:24

tvaron3 added 2 commits October 8, 2024 18:07

fix tests

c252d88

fix tests and pylint

51e721b

tvaron3 changed the title ~~Session Token Merge~~ Session Token Management APIs Oct 9, 2024

Merge branch 'main' of https://github.com/Azure/azure-sdk-for-python …

923055b

…into tvaron3/sessionTokenHelper

annatisch requested changes Oct 9, 2024

tvaron3 added 4 commits October 9, 2024 23:17

Reacting to comments

104e341

Reacting to comments

5552912

pylint and added hpk tests

1bbbd0f

reacting to comments

a9299ab

fix tests and mypy

2155016

tvaron3 added 2 commits October 11, 2024 12:54

fix mypy

0436355

fix mypy

103eb41
