remotecfg components don't work after pull request #1501 #1688

Open
thiennn-neji opened this issue Sep 16, 2024 · 3 comments

Labels
bug Something isn't working

Comments


thiennn-neji commented Sep 16, 2024

What's wrong?

The Grafana Alloy remotecfg feature was fixed in pull request #1372, and I tested it with Grafana Alloy Revision 800739f, where it worked smoothly. However, starting from pull request #1501, it seems to no longer work.

It appears that the componentID for remotecfg components is in the format remotecfg/example.component.label rather than just example.component.label. However, line 183 of pull request #1501 (see here) assumes that the componentID is in the format example.component.label.

In my opinion, s.componentHandler should reference s.componentHttpPathPrefixRemotecfg instead of s.componentHttpPathPrefix.
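
For illustration only, here is a minimal toy sketch of the prefix mismatch I mean (this is not the actual Alloy source; the prefix constants are just stand-ins for the fields mentioned above): if only the generic component prefix is stripped, the remaining path still starts with a remotecfg/ segment instead of the expected example.component.label shape.

package main

import (
	"fmt"
	"strings"
)

// Stand-in prefixes, loosely named after the fields mentioned above
// (hypothetical values, not the real Alloy constants).
const (
	componentHttpPathPrefix          = "/api/v0/component/"
	componentHttpPathPrefixRemotecfg = "/api/v0/component/remotecfg/"
)

func main() {
	path := "/api/v0/component/remotecfg/prometheus.exporter.self.default/metrics"

	// Stripping only the generic prefix leaves "remotecfg/" in front, so the
	// first path segment is "remotecfg" rather than the component ID.
	rest := strings.TrimPrefix(path, componentHttpPathPrefix)
	first, _, _ := strings.Cut(rest, "/")
	fmt.Println(first) // remotecfg

	// Stripping the remotecfg-specific prefix yields the expected shape.
	rest = strings.TrimPrefix(path, componentHttpPathPrefixRemotecfg)
	first, _, _ = strings.Cut(rest, "/")
	fmt.Println(first) // prometheus.exporter.self.default
}

The real fix would of course have to live in the service code itself; the sketch is only meant to show where the path shape diverges.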

Steps to reproduce

Step 0: An alloy-remote-config-server is serving gRPC at 127.0.0.1:8888 with the following template (also specified in the Configuration section):

prometheus.exporter.self "default" { }

Step 1: Use two Docker images for two Grafana Alloy revisions: (Revision 9e290c6, v1.4.0-rc.0) and (Revision 800739f). Run both with the same configuration:

// test.alloy
logging {
	level  = "debug"
	format = "logfmt"
}

remotecfg {
	url            = "http://127.0.0.1:8888"
	id             = constants.hostname
	poll_frequency = "10s"
	attributes     = {
		"template_name"  = "test-remotecfg",
	}
}

with the command line options:

alloy run --stability.level=experimental --storage.path=/tmp/alloy /test.alloy

Step 2: In each Grafana Alloy instance, use cURL to fetch the metrics:

curl http://localhost:12345/api/v0/component/remotecfg/prometheus.exporter.self.default/metrics

Step 3: Check the output of the cURL command and the Grafana Alloy logs:

  • Revision 800739f: The remotecfg component works as expected.

Alloy log

ts=2024-09-16T04:35:43.717199572Z level=info "boringcrypto enabled"=false
ts=2024-09-16T04:35:43.716113131Z level=info source=/go/pkg/mod/github.com/!kim!machine!gun/automemlimit@v0.6.0/memlimit/memlimit.go:170 msg="memory is not limited, skipping: %v" package=github.com/KimMachineGun/automemlimit/memlimit !BADKEY="memory is not limited"
ts=2024-09-16T04:35:43.717236944Z level=info msg="no peer discovery configured: both join and discover peers are empty" service=cluster
ts=2024-09-16T04:35:43.717240511Z level=info msg="starting complete graph evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab
ts=2024-09-16T04:35:43.717245918Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab node_id=remotecfg duration=638.347µs
ts=2024-09-16T04:35:43.717250881Z level=info msg="applying non-TLS config to HTTP server" service=http
ts=2024-09-16T04:35:43.717254029Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab node_id=http duration=6.381µs
ts=2024-09-16T04:35:43.717258166Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab node_id=cluster duration=345ns
ts=2024-09-16T04:35:43.717262171Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab node_id=otel duration=288ns
ts=2024-09-16T04:35:43.717265709Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab node_id=labelstore duration=1.76µs
ts=2024-09-16T04:35:43.717270911Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab node_id=tracing duration=2.865µs
ts=2024-09-16T04:35:43.717275657Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab node_id=logging duration=89.255µs
ts=2024-09-16T04:35:43.717288179Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab node_id=livedebugging duration=3.757µs
ts=2024-09-16T04:35:43.717295397Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab node_id=ui duration=382ns
ts=2024-09-16T04:35:43.717300411Z level=info msg="finished complete graph evaluation" controller_path=/ controller_id="" trace_id=314425bf3e5d0b1540153436fba928ab duration=821.508µs
ts=2024-09-16T04:35:43.717310333Z level=debug msg="changing node state" service=cluster from=viewer to=participant
ts=2024-09-16T04:35:43.717317467Z level=debug msg="REDACTED @1: participant" service=cluster
ts=2024-09-16T04:35:43.717382118Z level=info msg="scheduling loaded components and services"
ts=2024-09-16T04:35:43.71749769Z level=info msg="starting cluster node" service=cluster peers_count=0 peers="" advertise_addr=127.0.0.1:12345
ts=2024-09-16T04:35:43.717529278Z level=debug msg="REDACTED @3: participant" service=cluster
ts=2024-09-16T04:35:43.717764506Z level=info msg="peers changed" service=cluster peers_count=1 peers=REDACTED
ts=2024-09-16T04:35:43.717874201Z level=info msg="now listening for http traffic" service=http addr=127.0.0.1:12345
ts=2024-09-16T04:35:43.718252927Z level=info msg="starting complete graph evaluation" controller_path=/ controller_id=remotecfg trace_id=23892e91445f51aa15ae2d765e367e18
ts=2024-09-16T04:35:43.718297571Z level=info msg="finished node evaluation" controller_path=/ controller_id=remotecfg trace_id=23892e91445f51aa15ae2d765e367e18 node_id=prometheus.exporter.self.default duration=15.482µs
ts=2024-09-16T04:35:43.718309658Z level=info msg="finished complete graph evaluation" controller_path=/ controller_id=remotecfg trace_id=23892e91445f51aa15ae2d765e367e18 duration=80.709µs
ts=2024-09-16T04:35:43.718652439Z level=info msg="scheduling loaded components and services"
ts=2024-09-16T04:35:53.770017569Z level=debug msg="skipping over API response since it contained the same hash" service=remotecfg

cURL log

curl http://localhost:12345/api/v0/component/remotecfg/prometheus.exporter.self.default/metrics

# Output
# HELP alloy_build_info A metric with a constant '1' value labeled by version, revision, branch, goversion from which alloy was built, and the goos and goarch for the build.
# TYPE alloy_build_info gauge
alloy_build_info{branch="main",goarch="amd64",goos="linux",goversion="go1.22.5",revision="800739fab",tags="netgo,builtinassets,promtail_journal_enabled",version="v1.4.0-devel+800739fab"} 1

  • Revision 9e290c6 (v1.4.0-rc.0): The remotecfg component does not work.

Alloy log

ts=2024-09-16T04:39:52.849428699Z level=info "boringcrypto enabled"=false
ts=2024-09-16T04:39:52.83708336Z level=info source=/go/pkg/mod/github.com/!kim!machine!gun/automemlimit@v0.6.0/memlimit/memlimit.go:170 msg="memory is not limited, skipping: %v" package=github.com/KimMachineGun/automemlimit/memlimit !BADKEY="memory is not limited"
ts=2024-09-16T04:39:52.851069694Z level=info msg="no peer discovery configured: both join and discover peers are empty" service=cluster
ts=2024-09-16T04:39:52.851097278Z level=info msg="starting complete graph evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f
ts=2024-09-16T04:39:52.851130211Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f node_id=tracing duration=6.094µs
ts=2024-09-16T04:39:52.851156684Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f node_id=remotecfg duration=10.43222ms
ts=2024-09-16T04:39:52.851178036Z level=info msg="applying non-TLS config to HTTP server" service=http
ts=2024-09-16T04:39:52.851190826Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f node_id=http duration=38.89µs
ts=2024-09-16T04:39:52.851236824Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f node_id=cluster duration=1.883µs
ts=2024-09-16T04:39:52.851260117Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f node_id=otel duration=1.433µs
ts=2024-09-16T04:39:52.851282544Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f node_id=livedebugging duration=13.309µs
ts=2024-09-16T04:39:52.851302812Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f node_id=ui duration=2.026µs
ts=2024-09-16T04:39:52.851332402Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f node_id=logging duration=1.947391ms
ts=2024-09-16T04:39:52.851397706Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f node_id=labelstore duration=21.187µs
ts=2024-09-16T04:39:52.851431635Z level=info msg="finished complete graph evaluation" controller_path=/ controller_id="" trace_id=63e74be47a1d3930e0fb462f78acb05f duration=12.710918ms
ts=2024-09-16T04:39:52.851487935Z level=debug msg="changing node state" service=cluster from=viewer to=participant
ts=2024-09-16T04:39:52.851526946Z level=debug msg="cpu12681 @1: participant" service=cluster
ts=2024-09-16T04:39:52.851812792Z level=info msg="scheduling loaded components and services"
ts=2024-09-16T04:39:52.853168428Z level=info msg="starting cluster node" service=cluster peers_count=0 peers="" advertise_addr=127.0.0.1:12345
ts=2024-09-16T04:39:52.853311651Z level=debug msg="cpu12681 @3: participant" service=cluster
ts=2024-09-16T04:39:52.853737901Z level=info msg="peers changed" service=cluster peers_count=1 peers=cpu12681
ts=2024-09-16T04:39:52.855116828Z level=info msg="now listening for http traffic" service=http addr=127.0.0.1:12345
ts=2024-09-16T04:39:52.858859086Z level=info msg="starting complete graph evaluation" controller_path=/ controller_id=remotecfg trace_id=167b9941097215f5ddb8421535365edc
ts=2024-09-16T04:39:52.859086486Z level=info msg="finished node evaluation" controller_path=/ controller_id=remotecfg trace_id=167b9941097215f5ddb8421535365edc node_id=prometheus.exporter.self.default duration=82.917µs
ts=2024-09-16T04:39:52.859136085Z level=info msg="finished complete graph evaluation" controller_path=/ controller_id=remotecfg trace_id=167b9941097215f5ddb8421535365edc duration=361.008µs
ts=2024-09-16T04:39:52.860527774Z level=info msg="scheduling loaded components and services"
2024/09/16 04:39:58 http: superfluous response.WriteHeader call from go.opentelemetry.io/contrib/instrumentation/github.com/gorilla/mux/otelmux.getRRW.func2.1 (mux.go:114)

cURL log

curl http://localhost:12345/api/v0/component/remotecfg/prometheus.exporter.self.default/metrics

# Output
failed to parse URL path "/api/v0/component/remotecfg/prometheus.exporter.self.default/metrics": invalid path

System information

Linux 6.10.6 x86_64 (Ubuntu 24.04.1 LTS)

Software version

Grafana Alloy (Revision 9e290c6, v1.4.0-rc.0) and Grafana Alloy (Revision 800739f)

Configuration

// Alloy remote config server template
prometheus.exporter.self "default" { }

// Grafana Alloy config (Both revision 9e290c693 and 800739fab)
remotecfg {
	url            = "http://127.0.0.1:8888"
	id             = constants.hostname
	poll_frequency = "10s"
	attributes     = {
		"template_name"  = "test-remotecfg",
	}
}

Logs

No response

thiennn-neji added the bug label on Sep 16, 2024
wildum (Contributor) commented Sep 17, 2024

@tpaschalis

tpaschalis (Member) commented:

Hey there @thiennn-neji 👋

Just for context, I'd like to understand what you're trying to achieve and what the expected result was.

So if I understand correctly, the following file is what is passed to alloy run as /test.alloy:

// Grafana Alloy config (Both revision 9e290c693 and 800739fab)
remotecfg {
	url            = "http://127.0.0.1:8888"
	id             = constants.hostname
	poll_frequency = "10s"
	attributes     = {
		"template_name"  = "test-remotecfg",
	}
}

And the following is what the remotecfg server returns:

prometheus.exporter.self "default" { }

Is that correct? Are there any other pieces of configuration on either side (either the 'local' or the 'remote' configuration)?

Could you also please provide:

  • The contents of the remotecfg directory inside of the storage path (in your example, /tmp/alloy/remotecfg)
  • What the Alloy UI shows on the /remotecfg page, likely http://localhost:12345/remotecfg/ and further sub-pages. (If you're building from main, you might have to run make generate-ui and run commands from the repo root.)

tpaschalis (Member) commented:

Ok, I was able to get a little closer to the root cause.

I had my test remotecfg server return the following configuration, with a root-level exporter and one wrapped inside a module.

prometheus.exporter.self "default" { }

prometheus.scrape "default" {
        targets    = prometheus.exporter.self.default.targets
        forward_to = []
}

declare "mymodule" {
        prometheus.exporter.self "inner" { }

        prometheus.scrape "inner" {
                targets    = prometheus.exporter.self.inner.targets
                forward_to = []
        }
}

mymodule "default" { }

The curl command fails on the first one, but works on the second one:

$ curl http://localhost:12345/api/v0/component/remotecfg/prometheus.exporter.self.default/metrics

failed to parse URL path "/api/v0/component/remotecfg/prometheus.exporter.self.default/metrics": invalid path

$ curl http://localhost:12345/api/v0/component/remotecfg/mymodule.default/prometheus.exporter.self.inner/metrics
# HELP alloy_build_info A metric with a constant '1' value labeled by version, revision, branch, goversion from which alloy was built, and the goos and goarch for the build.
# TYPE alloy_build_info gauge
alloy_build_info{branch="main",goarch="amd64",goos="linux",goversion="go1.22.7",revision="9e1b6e827",tags="unknown",version="v1.5.0-devel+wip"} 1
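
For reference, here is a quick throwaway decomposition of the two request paths (toy code, not Alloy's actual URL parsing): once the shared /api/v0/component/ prefix is stripped, both paths start with a remotecfg segment, so the only structural difference between the working and failing request is the extra module level, which lines up with the prefix-handling hypothesis from the issue description.

package main

import (
	"fmt"
	"strings"
)

// Print the path segments that remain after the shared prefix is stripped,
// to make the structural difference between the two requests easier to see.
func main() {
	const prefix = "/api/v0/component/"
	paths := []string{
		"/api/v0/component/remotecfg/prometheus.exporter.self.default/metrics",
		"/api/v0/component/remotecfg/mymodule.default/prometheus.exporter.self.inner/metrics",
	}
	for _, p := range paths {
		segments := strings.Split(strings.TrimPrefix(p, prefix), "/")
		fmt.Println(segments)
	}
	// Output:
	// [remotecfg prometheus.exporter.self.default metrics]
	// [remotecfg mymodule.default prometheus.exporter.self.inner metrics]
}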

I hadn't come across this as I always wrap my configuration in a module.

Did you have anything specific pointing towards the components not working at all, or is it just that they're not reachable via api/v0 here?

As far as I can tell, the scrape components are scheduled and try to scrape, but the first one fails because of the error you pointed out.
