Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/gate: Prefix gate metrics for selects #3154

Merged

Conversation

simonpasquier
Copy link
Contributor

@simonpasquier simonpasquier commented Sep 11, 2020

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

The metrics on the gates for concurrent selects were exposed without a prefix (e.g. gate_queries_in_flight and gate_duration_seconds) unlike the gate metrics for the query API. The first commit 60950ff adds the thanos_query_concurrent_selects_ prefix.
After more thinking, I felt that exposing the number of in-flight queries doesn't make much sense for concurrent selects because each query creates a new gate. Thus we can have (max number of concurrent selects per query x max number of concurrent queries) selects in flight and it doesn't tell whether a query is blocked on too many selects or not. I've added another commit
722181c that removes the thanos_query_concurrent_selects_gate_queries_in_flight metric.

The following metrics have also been added to record the maximum number of concurrent requests per gate:

  • thanos_query_gate_queries_max
  • thanos_bucket_store_series_gate_queries_max, previously known as thanos_bucket_store_queries_concurrent_max.
  • thanos_memcached_getmulti_gate_queries_max

Verification

Tested manually.

@kakkoyun kakkoyun changed the title Prefix gate metrics for selects pkg/gate: Prefix gate metrics for selects Sep 11, 2020
Copy link
Member

@kakkoyun kakkoyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contributions. Looking good.
I have a couple of comments.

engine = promql.NewEngine(
queryableCreator = query.NewQueryableCreator(
logger,
extprom.WrapRegistererWithPrefix("thanos_query_concurrent_selects_", reg),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, this registry only used for concurrent selects. However, it could be used for others as well. I think we should wrap the registry this early.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll think about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

pkg/gate/gate.go Show resolved Hide resolved
@simonpasquier simonpasquier force-pushed the prefix-gate-metrics-for-selects branch 2 times, most recently from fff2500 to 430b3ed Compare September 11, 2020 15:31
Thanos query exposed `gate_queries_in_flight` and
`gate_duration_seconds_bucket` metrics for concurrent selects. These
metrics are now prefixed by `thanos_query_concurrent_selects_`.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
This change deprecates the gate.(*Keeper) struct. When Keeper is used to
create several gates, the metric tracking the number of in-flight metric
isn't meaningful because it is hard to say whether requests are being
blocked or not.

As such the `thanos_query_concurrent_selects_gate_queries_in_flight` is
removed.

The following metrics have been added to record the maximum number of
concurrent requests per gate:
* `thanos_query_gate_queries_max`
* `thanos_bucket_store_series_gate_queries_max`, previously known as
  `thanos_bucket_store_queries_concurrent_max.`
* `thanos_memcached_getmulti_gate_queries_max`

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
// gate.(*Keeper).NewGate only once, it is recommended to use gate.New()
// instead. Otherwise it is recommended to use the
// github.com/prometheus/prometheus/pkg/gate package directly and wrap the
// returned gate with gate.InstrumentGateDuration().
type Keeper struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still used anywhere in the codebase? If it's not the case, why not remove it directly? I don't think we have any downstream dependents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and given that Thanos imports Cortex too, it creates a circular dependency hence the need to keep Keeper :)

pkg/query/querier.go Outdated Show resolved Hide resolved
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Copy link
Member

@kakkoyun kakkoyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇 Thanks a lot for addressing everything 💜

@kakkoyun kakkoyun merged commit b1fdb63 into thanos-io:master Sep 14, 2020
@simonpasquier simonpasquier deleted the prefix-gate-metrics-for-selects branch September 14, 2020 13:34
simonpasquier added a commit to simonpasquier/thanos that referenced this pull request Sep 15, 2020
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
kakkoyun pushed a commit that referenced this pull request Sep 15, 2020
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants