-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add deep storage segment metric #16072
Conversation
After this patch, will such segments be reported as available or unavailable? |
@@ -252,6 +252,25 @@ public Object2IntMap<String> getDatasourceToUnavailableSegmentCount() | |||
return datasourceToUnavailableSegments; | |||
} | |||
|
|||
public Object2IntMap<String> getDatasourceToDeepStorageQueryOnlySegmentCount() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Some of the code in this method may be commoned out with getDatasourceToUnavailableSegmentCount
.
docs/operations/metrics.md
Outdated
@@ -332,6 +332,7 @@ These metrics are for the Druid Coordinator and are reset each time the Coordina | |||
|`segment/unneededEternityTombstone/count`|Number of non-overshadowed eternity tombstones marked as unused.| |Varies| | |||
|`segment/unavailable/count`|Number of unique segments left to load until all used segments are available for queries.|`dataSource`|0| | |||
|`segment/underReplicated/count`|Number of segments, including replicas, left to load until all used segments are available for queries.|`tier`, `dataSource`|0| | |||
|`segment/deepStorageOnly/count`|Number of unique segments that are only available for querying directly from deep storage.|`dataSource`|Varies| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about the metric name deepStorageQueryOnly
or better yet availableDeepStorageOnly
as it aligns better with unavailable/count
?
that should say "not be reported as unavailable", my bad. i think they should not be reported as unavailable b/c that metric is supposed to be 0 during steady state and enabling query from deep storage shouldn't change this |
What is the purpose of this metric? The doc says it's typical value is I can imagine tracking the size of cold data as a useful metric, but I'm not sure about the use of seeing how the count of cold segments changes over time will be helpful. Maybe some doc updates about that would help clarify the use of the metric. |
Thanks, @georgew5656 , the steady state reasoning makes sense to me. @suneet-s , one use of this metric would be to find out if the number of segments that are not available for querying using the native engine. For example, if there are some segments that are available for querying from deep storage using MSQ, they would not be counted towards the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
As @suneet-s requested, you may add more info in the docs. But the current definition of the metric also seems sufficient to me.
Addressing some followup comments from the change in the metric made in #16020
Description
If a datasource has a replicaCount=0 rule that applies to some segments that are not covered by a replicaCount>0 rule, those segments will not be reported as unavailable in segment/unavailable/count. Since they are not queryable by native queries (only using the query from deep storage feature), we want a separate metric that lets us check how many segments fit these criteria.
Fixed the bug ...
Renamed the class ...
Added a forbidden-apis entry ...
Add metric segment/deepStorage/count that counts the number of segments that are only queryable through the query from deep storage metric.
Release note
New metric to support query from deep storage feature.
Key changed/added classes in this PR
DruidCoordinator
Stats
This PR has: