Is your feature request related to a problem? Please describe.
The monitoring stack (Prometheus, Grafana, Loki, etc.) has enough resources to start, but often struggles when scaled beyond a single node or with higher-volume workloads. We should consider updating the default values and providing clear guidance on suggested overrides for various deployment scales/sizes.
Additional context
Prometheus in particular seems to struggle even with relatively small workloads.
This may be a good reason to evaluate scalable/HA Grafana and Prometheus. For reference, on DUBBD we previously had tickets for HPAs on those two and noted the necessary external dependencies:
…713)
## Description
Adds a document covering resource/HA overrides across core packages.
Also ~doubles Prometheus' limits, but does not adjust the requests. This
should ensure that Prometheus still schedules without requiring
significant resources, while allowing it to consume more memory without
hitting OOM errors.
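
As an illustration of the requests/limits split described above, here is a minimal sketch of a Helm values override. It assumes the upstream kube-prometheus-stack chart layout (`prometheus.prometheusSpec.resources`), and every number is a placeholder rather than a value changed in this PR.

```yaml
# Hypothetical values for illustration only; not the defaults set by this PR.
prometheus:
  prometheusSpec:
    resources:
      requests:          # left unchanged, so the pod still schedules on small nodes
        cpu: 100m
        memory: 512Mi
      limits:            # raised, giving Prometheus more headroom before an OOM kill
        cpu: 500m
        memory: 2Gi
```

The scheduler places pods based on requests only, so raising just the limits does not make the pod harder to schedule. The trade-off is that the node may become overcommitted on memory if Prometheus grows toward its limit.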
## Related Issue
Related to #551
## Type of change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Other (security config, docs update, etc)
## Checklist before merging
- [x] Test, docs, adr added or updated as needed
- [x] [Contributor Guide](https://github.com/defenseunicorns/uds-template-capability/blob/main/CONTRIBUTING.md) followed