Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Result level cache key collisions from utf8 encoding #16552

Closed
gianm opened this issue Jun 5, 2024 · 0 comments · Fixed by #16569
Closed

Result level cache key collisions from utf8 encoding #16552

gianm opened this issue Jun 5, 2024 · 0 comments · Fixed by #16569
Labels

Comments

@gianm
Copy link
Contributor

gianm commented Jun 5, 2024

The ResultLevelCachingQueryRunner does this when it computes the cache key for a query:

final String cacheKeyStr = StringUtils.fromUtf8(strategy.computeResultLevelCacheKey(query));

This is bad because not all byte arrays can be represented as Java Strings. For example, the byte arrays [63, -48, 0, 0, 0, 0, 0, 0] and [63, -24, 0, 0, 0, 0, 0, 0] have the same representation when converted to Strings. Both have an invalid second character, which both get mapped to the replacement character. These byte arrays are the representations of doubles 0.25 and 0.75 respectively, which is how this was originally noticed (two queries had a cache key collision when they differed only in that one used 0.25 and one used 0.75 as a quantile parameter).

To fix this we need to always use byte[], never String, when dealing with cache keys.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant