Add some basic metric to track metric latency and count #37

Open
wants to merge 2 commits into
base: v0.39

@davidyuanfs davidyuanfs commented Sep 25, 2024

Improve visibility into the Vector metric path:

  • Add a counter metric to track metric types, including the cases where a warning is logged because the metric type is wrong.
  • Add a latency metric for metric writes from obs-agent to understand the delay; it can be broken down by shardName.

count type:
Screenshot 2024-09-25 at 12 21 44 PM
latency:
Screenshot 2024-09-25 at 12 21 52 PM
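
As a rough illustration of the two metrics described above, here is a minimal standard-library sketch. It is not the code in this PR: `MetricPathStats` and its methods are hypothetical names, and it assumes latency means the gap between a sample's timestamp and the time it was received.

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

/// Hypothetical in-memory stand-in for the two metrics in this PR.
#[derive(Debug, Default)]
pub struct MetricPathStats {
    /// Number of metrics seen, keyed by metric type (e.g. "counter").
    pub count_by_type: HashMap<String, u64>,
    /// (sum of latencies, number of samples), keyed by shard name.
    pub latency_by_shard: HashMap<String, (Duration, u32)>,
}

impl MetricPathStats {
    /// Increments the per-type counter.
    pub fn record_type(&mut self, metric_type: &str) {
        *self.count_by_type.entry(metric_type.to_string()).or_insert(0) += 1;
    }

    /// Records the delay between the sample timestamp and the receive time.
    /// Assumption: a sample timestamped in the future (clock skew) counts as zero delay.
    pub fn record_latency(
        &mut self,
        shard_name: &str,
        sample_time: SystemTime,
        received_at: SystemTime,
    ) {
        let latency = received_at
            .duration_since(sample_time)
            .unwrap_or(Duration::ZERO);
        let entry = self
            .latency_by_shard
            .entry(shard_name.to_string())
            .or_insert((Duration::ZERO, 0));
        entry.0 += latency;
        entry.1 += 1;
    }

    /// Mean latency for a shard, or `None` if nothing was recorded for it.
    pub fn mean_latency(&self, shard_name: &str) -> Option<Duration> {
        let (sum, samples) = self.latency_by_shard.get(shard_name)?;
        // An entry only exists after at least one sample, so `samples` is never zero.
        Some(*sum / *samples)
    }
}
```

Keying the latency by shard name is what lets a dashboard show the delay per input shard rather than a single aggregate.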

@@ -19,3 +19,8 @@ vector-common = { path = "../vector-common", features = ["btreemap"] }

[build-dependencies]
prost-build = "0.12"

[dependencies.tracing]
Collaborator

Is this also used in other modules?

Author

Yes, tracing is used in multiple sources, for example:

tracing = { version = "0.1", default-features = false }

@@ -78,6 +84,10 @@ fn reparse_groups(
for (key, metric) in metrics {
let tags = combine_tags(key.labels, tag_overrides.clone());

let shard_name = tags.get("shardName").unwrap_or("unknown").to_string();
Collaborator

Is it possible to make this shardName configurable at the source level?

Author

@davidyuanfs davidyuanfs Sep 30, 2024

By "configurable at the source level", do you mean configurable from this metric's configuration? The metric should already carry a shardName such as "xxx--general_1" from scraping. It is the label for the input source's shardName: we want to understand the load volume from each input shard, for example that nephos_1 is sending x series/second.

Collaborator

I mean adding a new option to https://vector.dev/docs/reference/configuration/sources/prometheus_scrape/, since shardName is very Databricks-specific. It could be called latency_label and set to "shardName".
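
A minimal sketch of what such an option might look like, using only the standard library. `ScrapeLatencyConfig` and `latency_group` are hypothetical names, `latency_label` is the option name proposed in this comment rather than an existing Vector setting, and the tag map is modeled as a plain `BTreeMap` instead of Vector's own tag type.

```rust
use std::collections::BTreeMap;

/// Hypothetical source-level settings. `latency_label` is the option
/// proposed in the review, not an existing Vector option.
#[derive(Debug, Clone, Default)]
pub struct ScrapeLatencyConfig {
    /// Name of the tag used to group latency measurements, e.g. "shardName".
    pub latency_label: Option<String>,
}

/// Returns the group to attribute a metric's latency to.
/// - `None` when no label is configured (latency tracking is off).
/// - "unknown" when the label is configured but the metric lacks that tag,
///   mirroring the `unwrap_or("unknown")` fallback in the diff above.
pub fn latency_group(
    config: &ScrapeLatencyConfig,
    tags: &BTreeMap<String, String>,
) -> Option<String> {
    let label = config.latency_label.as_deref()?;
    let value = tags.get(label).map(String::as_str).unwrap_or("unknown");
    Some(value.to_string())
}
```

With this shape, a Databricks deployment would set the option to "shardName", while other users could leave it unset or point it at a different label.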

Collaborator

@flaviofcruz flaviofcruz left a comment

You also need to update https://github.com/databricks/vector/blob/v0.39/README.databricks.md

Also note that the code should be merged into v0.39-custom.
