Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: adding an example dashboard #9

Merged
merged 2 commits into from
Mar 27, 2023

Conversation

ts-mini
Copy link
Contributor

@ts-mini ts-mini commented Jan 30, 2023

This PR adds an example dashboard for grafana

Its a slightly modified dashboard from kubecost/opencost to work with the service monitor changes (and has a variablized datasource for things like mimir/cortex)

@ts-mini ts-mini changed the title tsmini.example dashboard feat: adding an example dashboard Jan 30, 2023
@ts-mini ts-mini force-pushed the tsmini.example-dashboard branch 2 times, most recently from fc25360 to b44862d Compare February 1, 2023 17:46
@mattray
Copy link
Collaborator

mattray commented Mar 10, 2023

@Pokom @logyball @toscott @sntxrr if any of you have the cycles to vet this, I think it's worth putting an example Grafana dashboard in.

@Pokom Pokom self-requested a review March 10, 2023 02:53
@Pokom
Copy link
Collaborator

Pokom commented Mar 10, 2023

Hey @mattray, I added myself as a reviewer and will try to get to this in the next week or so. This request keeps coming up and I think having a foundation of dashboards would be excellent to have.

@ts-mini would you mind rebasing your changes with main?

@ts-mini
Copy link
Contributor Author

ts-mini commented Mar 10, 2023

Hey @mattray, I added myself as a reviewer and will try to get to this in the next week or so. This request keeps coming up and I think having a foundation of dashboards would be excellent to have.

@ts-mini would you mind rebasing your changes with main?

yup! i might have to get to this monday tho, video games are calling

@Pokom
Copy link
Collaborator

Pokom commented Mar 11, 2023

Just initial thoughts/feedback:

The dashboard looks lovely and was usable right out of the box! The breakdown of Cluster Wide vs Namespace vs App is great. The one that didn't quite fit in(IMO) was the breakdown for PVC's and for two reasons:

  1. PVC's should be associated with workloads(apps, namespaces) to give better context
  2. When you have 1000's of PVC's, it takes minutes for this to load

The one thing I couldn't quite get working is App Dropdown. For me it's just forever "none", but this may be more to do with my internal setup then the dashboard itself.

Something that we should consider: A lot of these queries are fairly expensive to run as the datasets get large(100's to 1000's of node clusters). It would be valuable to set some of the queries up as recording rules that we ship alongside the dashboards so the dashboards are more performant.

Great contribution @ts-mini! We can iterate on this a bit next week so we can both get some video game time in 😉

@sntxrr sntxrr added the enhancement New feature or request label Mar 13, 2023
@boniek83
Copy link
Contributor

boniek83 commented Mar 15, 2023

Don't just use kube_node_labels, use kube_node_labels{job="opencost"} instead. If you have prometheus-operator deployed in cluster already using naked kube_node_labels causes all sorts of issues (this is applicable to all metrics exported by kube-state-metrics not just kube_node_labels).
This is more for opencost maintainers: consider adding honorLabels: true in ServiceMonitor in helm chart so we won't have to use exported_* labels. That's just ugly ;)
I see that in some panels you aggregate just by container and namespace. This is a bad idea because container names are not unique! You can easily have the same container names in multiple namespaces. You should aggregate by namespace/pod/container at the same time.

@mattray
Copy link
Collaborator

mattray commented Mar 22, 2023

@Pokom is this good to merge? We can always continue to iterate over it.

Copy link
Collaborator

@toscott toscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this against a couple of my clusters. Ran into issues with duplicate metrics from kube-state-metrics and opencost as a result of the kube_*_labels queries not explicitly referencing job="opencost" as mentioned by boniek83.

Was able to disable kube-state-metrics emitter from opencost by adding environment variable: {"EMIT_KSM_V1_METRICS": "false"}. Once the opencost replicated metrics fell off, most graphs started working (There are a couple that look back 2 weeks that still failed). The original kube cost documentation recommended to not do this. I'm not sure if that carries over to opencost thought.

The App dropdown was working for me for a few public charts with the correct labels, though I had some issues finding them if I added the namespace filter first.

Since this is just an example dashboard and contains many useful queries; my vote would be we merge this as is, then address additional concerns in new issues/pull requests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants