AggregatorRegistry Metrics Error - Operation Timed out #501

Open
austonpramodh opened this issue Apr 22, 2022 · 5 comments

@austonpramodh

I am trying to integrate prom-client into a cluster-enabled project. Unfortunately, I don't get any response from the aggregator endpoint. I do get responses from "/metrics" on the worker nodes, but when I hit "/cluster_metrics" I get an error saying "Operation Timed out".

I would like to understand how the aggregator gets the metrics from the workers. This might help me figure out whether I am making any mistakes.

Master Server

image

Worker Server - The worker already runs an HTTP server, so I created a handler.

image

Error

Screen Shot 2022-04-21 at 8 13 22 PM

@austonpramodh
Author

I found the solution for this. Looking at the ClusterMetrics file, it uses an event emitter to collect the metrics from the workers. The AggregatorRegistry class therefore needs to be instantiated in the workers as well, which I wasn't doing; I was only instantiating it in the master process.

image
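
For readers without the screenshot, here is a minimal sketch of why a missing listener on the worker side shows up as a timeout in the master. It is a simplified, in-process model, not prom-client's actual code: each `EventEmitter` stands in for the IPC channel to one worker, and the names `GET_METRICS_REQ`, `installWorkerListener`, `collectFromWorkers`, and `demo` are hypothetical. The assumption, taken from the explanation above, is that instantiating the registry in a worker is what installs the listener that answers the master's request.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical message type for this model; not a prom-client identifier.
const GET_METRICS_REQ = 'model:getMetricsReq';

// Stand-in for what instantiating the registry inside a worker is assumed to do:
// register the listener that answers the master's metrics requests.
function installWorkerListener(channel, getLocalMetrics) {
  channel.on(GET_METRICS_REQ, (reply) => reply(getLocalMetrics()));
}

// Master side: ask every worker, resolve once all have replied,
// and reject if any worker stays silent past the timeout.
function collectFromWorkers(channels, timeoutMs) {
  return new Promise((resolve, reject) => {
    const replies = [];
    if (channels.length === 0) {
      resolve(replies);
      return;
    }
    const timer = setTimeout(
      () => reject(new Error('Operation timed out.')),
      timeoutMs,
    );
    for (const channel of channels) {
      channel.emit(GET_METRICS_REQ, (metrics) => {
        replies.push(metrics);
        if (replies.length === channels.length) {
          clearTimeout(timer);
          resolve(replies);
        }
      });
    }
  });
}

async function demo() {
  // Worker that installed its listener (the "registry instantiated" case).
  const listening = new EventEmitter();
  installWorkerListener(listening, () => ({ http_requests_total: 3 }));

  // Worker that never installed a listener (the case I originally had).
  const silent = new EventEmitter();

  const ok = await collectFromWorkers([listening], 50);
  const failure = await collectFromWorkers([listening, silent], 50).catch(
    (err) => err.message,
  );
  return { ok, failure };
}

demo().then((result) => console.log(result));
```

In this model a single silent worker is enough to make the whole aggregation fail, which matches the symptom of "/metrics" working on each worker while "/cluster_metrics" times out.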

Thanks. Closing this issue.

@zbjornson
Collaborator

This was actually a mistake in a recent release, see #464. I haven't had any time to work on prom-client to address it, but you found the workaround.

@SpComb

SpComb commented Aug 23, 2022

We ran into the same issue as a regression, where cluster metrics stopped working after an upgrade from 13.1.0 -> 14.0.1.

Perhaps this issue should be re-opened, as #464 has not been merged, and there isn't yet any released version that fixes this issue?

@zbjornson zbjornson reopened this Aug 25, 2022
@zbjornson
Collaborator

Since so much time has passed and since a few releases have been made since the regression, I'm going to try to find a non-breaking fix that allows either usage pattern.

@morgaan

morgaan commented Feb 6, 2024

I hoped that the latest major version (15) would address this. It doesn't. It feels like I'm stuck with 13.1.0 for a while longer.
I thought using cluster was more or less the norm. If it is, then the breaking change introduced in 13.2.0 affects a lot of people and may deserve a bit of attention 🤷. I believe the fix lives in this PR: #464
