Add migration logic to v2 upgrade script for prometheus remote write regexes #1256

Merged
pmalek-sumo merged 1 commit into main from v2-migration-prometheus-remote-writes
Dec 21, 2020

@pmalek-sumo pmalek-sumo commented Dec 18, 2020

Description

This PR handles two changes to the remote write relabel regexes between 1.3 and 2.0 (#1200 and #1030). Both regexes live under the kube-prometheus-stack.prometheus.prometheusSpec.remoteWrite array:

  • for http://$(CHART).$(NAMESPACE).svc.cluster.local:9888/prometheus.metrics.operator.rule

    from

            - url: http://$(CHART).$(NAMESPACE).svc.cluster.local:9888/prometheus.metrics.operator.rule
              writeRelabelConfigs:
                - action: keep
                  regex: 'cluster_quantile:apiserver_request_latencies:histogram_quantile|instance:node_filesystem_usage:sum|instance:node_network_receive_bytes:rate:sum|cluster_quantile:scheduler_e2e_scheduling_latency:histogram_quantile|cluster_quantile:scheduler_scheduling_algorithm_latency:histogram_quantile|cluster_quantile:scheduler_binding_latency:histogram_quantile|node_namespace_pod:kube_pod_info:|:kube_pod_info_node_count:|node:node_num_cpu:sum|:node_cpu_utilisation:avg1m|node:node_cpu_utilisation:avg1m|node:cluster_cpu_utilisation:ratio|:node_cpu_saturation_load1:|node:node_cpu_saturation_load1:|:node_memory_utilisation:|node:node_memory_bytes_total:sum|node:node_memory_utilisation:ratio|node:cluster_memory_utilisation:ratio|:node_memory_swap_io_bytes:sum_rate|node:node_memory_utilisation:|node:node_memory_utilisation_2:|node:node_memory_swap_io_bytes:sum_rate|:node_disk_utilisation:avg_irate|node:node_disk_utilisation:avg_irate|:node_disk_saturation:avg_irate|node:node_disk_saturation:avg_irate|node:node_filesystem_usage:|node:node_filesystem_avail:|:node_net_utilisation:sum_irate|node:node_net_utilisation:sum_irate|:node_net_saturation:sum_irate|node:node_net_saturation:sum_irate|node:node_inodes_total:|node:node_inodes_free:'
                  sourceLabels: [__name__]

    into

            - url: http://$(CHART).$(NAMESPACE).svc.cluster.local:9888/prometheus.metrics.operator.rule
              writeRelabelConfigs:
                - action: keep
                  regex: 'cluster_quantile:apiserver_request_duration_seconds:histogram_quantile|instance:node_filesystem_usage:sum|instance:node_network_receive_bytes:rate:sum|cluster_quantile:scheduler_e2e_scheduling_duration_seconds:histogram_quantile|cluster_quantile:scheduler_scheduling_algorithm_duration_seconds:histogram_quantile|cluster_quantile:scheduler_binding_duration_seconds:histogram_quantile|node_namespace_pod:kube_pod_info:|:kube_pod_info_node_count:|node:node_num_cpu:sum|:node_cpu_utilisation:avg1m|node:node_cpu_utilisation:avg1m|node:cluster_cpu_utilisation:ratio|:node_cpu_saturation_load1:|node:node_cpu_saturation_load1:|:node_memory_utilisation:|node:node_memory_bytes_total:sum|node:node_memory_utilisation:ratio|node:cluster_memory_utilisation:ratio|:node_memory_swap_io_bytes:sum_rate|node:node_memory_utilisation:|node:node_memory_utilisation_2:|node:node_memory_swap_io_bytes:sum_rate|:node_disk_utilisation:avg_irate|node:node_disk_utilisation:avg_irate|:node_disk_saturation:avg_irate|node:node_disk_saturation:avg_irate|node:node_filesystem_usage:|node:node_filesystem_avail:|:node_net_utilisation:sum_irate|node:node_net_utilisation:sum_irate|:node_net_saturation:sum_irate|node:node_net_saturation:sum_irate|node:node_inodes_total:|node:node_inodes_free:'
                  sourceLabels: [__name__]
  • for http://$(CHART).$(NAMESPACE).svc.cluster.local:9888/prometheus.metrics.control-plane.coredns

    from

            - url: http://$(CHART).$(NAMESPACE).svc.cluster.local:9888/prometheus.metrics.control-plane.coredns
              writeRelabelConfigs:
                - action: keep
                  regex: coredns;(?:coredns_cache_(size|(hits|misses)_total)|coredns_dns_request_duration_seconds_(count|sum)|coredns_(dns_request|dns_response_rcode|forward_request)_count_total|process_(cpu_seconds_total|open_fds|resident_memory_bytes))
                  sourceLabels: [job, __name__]

    into

            - url: http://$(CHART).$(NAMESPACE).svc.cluster.local:9888/prometheus.metrics.control-plane.coredns
              writeRelabelConfigs:
                - action: keep
                  regex: 'coredns;(?:coredns_cache_(size|entries|(hits|misses)_total)|coredns_dns_request_duration_seconds_(count|sum)|coredns_(dns_request|dns_response_rcode|forward_request)_count_total|coredns_(forward_requests|dns_requests|dns_responses)_total|process_(cpu_seconds_total|open_fds|resident_memory_bytes))'
                  sourceLabels: [job, __name__]
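To make the effect of the coredns change concrete: Prometheus joins the sourceLabels values with ";" (the default separator) and requires the regex to match the whole joined string. The sketch below approximates that with Python's re.fullmatch. Prometheus itself uses Go's RE2 engine, but these patterns only use syntax common to both. The helper name kept and the sample metric names are illustrative and not part of this PR.

```python
import re

# Regexes copied from the "from" and "into" coredns snippets above.
OLD_COREDNS = (
    r"coredns;(?:coredns_cache_(size|(hits|misses)_total)"
    r"|coredns_dns_request_duration_seconds_(count|sum)"
    r"|coredns_(dns_request|dns_response_rcode|forward_request)_count_total"
    r"|process_(cpu_seconds_total|open_fds|resident_memory_bytes))"
)
NEW_COREDNS = (
    r"coredns;(?:coredns_cache_(size|entries|(hits|misses)_total)"
    r"|coredns_dns_request_duration_seconds_(count|sum)"
    r"|coredns_(dns_request|dns_response_rcode|forward_request)_count_total"
    r"|coredns_(forward_requests|dns_requests|dns_responses)_total"
    r"|process_(cpu_seconds_total|open_fds|resident_memory_bytes))"
)


def kept(regex: str, job: str, name: str) -> bool:
    """Approximate a 'keep' rule with sourceLabels [job, __name__]."""
    joined = ";".join([job, name])  # ';' is the default relabel separator
    return re.fullmatch(regex, joined) is not None


# Print whether the old and the new regex keep each sample series.
for metric in (
    "coredns_cache_size",
    "coredns_cache_entries",
    "coredns_dns_requests_total",
):
    print(
        metric,
        kept(OLD_COREDNS, "coredns", metric),
        kept(NEW_COREDNS, "coredns", metric),
    )
```

Under these assumptions, the new regex keeps everything the old one kept and additionally lets through coredns_cache_entries and the coredns_(forward_requests|dns_requests|dns_responses)_total series.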

Attaching the optical diff for those that are brave enough:

image
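For readers who prefer text over the screenshot, a rough equivalent of the diff can be computed by splitting the operator.rule regex on "|", which works here because that regex is a flat alternation with no nested groups. This is only a sketch: the function names are made up for illustration, and the two strings are excerpts (the first six branches of each regex above), not the full values.

```python
from __future__ import annotations


def alternatives(regex: str) -> list[str]:
    """Split a flat alternation (no nested groups) into its branches."""
    return regex.split("|")


def diff_alternatives(old: str, new: str) -> tuple[list[str], list[str]]:
    """Return (removed, added) branches, preserving their original order."""
    old_set, new_set = set(alternatives(old)), set(alternatives(new))
    removed = [a for a in alternatives(old) if a not in new_set]
    added = [a for a in alternatives(new) if a not in old_set]
    return removed, added


# Excerpts only: the first six branches of each operator.rule regex.
OLD_EXCERPT = (
    "cluster_quantile:apiserver_request_latencies:histogram_quantile"
    "|instance:node_filesystem_usage:sum"
    "|instance:node_network_receive_bytes:rate:sum"
    "|cluster_quantile:scheduler_e2e_scheduling_latency:histogram_quantile"
    "|cluster_quantile:scheduler_scheduling_algorithm_latency:histogram_quantile"
    "|cluster_quantile:scheduler_binding_latency:histogram_quantile"
)
NEW_EXCERPT = (
    "cluster_quantile:apiserver_request_duration_seconds:histogram_quantile"
    "|instance:node_filesystem_usage:sum"
    "|instance:node_network_receive_bytes:rate:sum"
    "|cluster_quantile:scheduler_e2e_scheduling_duration_seconds:histogram_quantile"
    "|cluster_quantile:scheduler_scheduling_algorithm_duration_seconds:histogram_quantile"
    "|cluster_quantile:scheduler_binding_duration_seconds:histogram_quantile"
)

removed, added = diff_alternatives(OLD_EXCERPT, NEW_EXCERPT)
for branch in removed:
    print("-", branch)
for branch in added:
    print("+", branch)
```

On these excerpts the only differences are the four cluster_quantile recording rules, whose latency/latencies suffixes become duration_seconds.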

Testing performed
  • ci/build.sh
  • Redeploy fluentd and fluentd-events pods
  • Confirm events, logs, and metrics are coming in

@pmalek-sumo pmalek-sumo requested a review from a team December 18, 2020 15:52
@pmalek-sumo pmalek-sumo self-assigned this Dec 18, 2020
@pmalek-sumo pmalek-sumo force-pushed the v2-migration-prometheus-remote-writes branch 2 times, most recently from 4c6456e to 94908b9 Compare December 18, 2020 15:58
@sumo-drosiek
Contributor

As for upgrade-1.0.0, we should warn customer if we detect changes to regex which we want to update

@pmalek-sumo pmalek-sumo force-pushed the v2-migration-prometheus-remote-writes branch from 94908b9 to b35036d Compare December 21, 2020 11:41
@pmalek-sumo pmalek-sumo force-pushed the v2-migration-prometheus-remote-writes branch from b35036d to 4513607 Compare December 21, 2020 11:42
@pmalek-sumo
Contributor Author

As for upgrade-1.0.0, we should warn customer if we detect changes to regex which we want to update

I think you meant 2.0.0, right? Added
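The behaviour discussed here (migrate the known default, warn when the regex was changed by the customer) could look roughly like the sketch below. It is not the upgrade script from this PR: the function name, the dict-based representation of the remoteWrite array, and the placeholder regex strings are all assumptions made for illustration.

```python
def migrate_regex(remote_write, url_suffix, old_regex, new_regex):
    """Swap old_regex for new_regex on entries whose URL ends with url_suffix.

    Returns a list of warnings for entries whose regex matches neither the
    old nor the new default, i.e. entries that were probably customized.
    """
    warnings = []
    for entry in remote_write:
        if not entry.get("url", "").endswith(url_suffix):
            continue
        for rule in entry.get("writeRelabelConfigs", []):
            current = rule.get("regex")
            if current is None:
                continue  # nothing to migrate on this rule
            if current == old_regex:
                rule["regex"] = new_regex  # untouched default: safe to swap
            elif current != new_regex:
                warnings.append(
                    f"{entry['url']}: regex differs from the 1.3 default, "
                    "not migrated; please review it manually"
                )
    return warnings


# Placeholder regex strings stand in for the real (much longer) values.
remote_write = [
    {
        "url": "http://a.b.svc.cluster.local:9888/prometheus.metrics.control-plane.coredns",
        "writeRelabelConfigs": [{"action": "keep", "regex": "OLD"}],
    },
    {
        "url": "http://c.d.svc.cluster.local:9888/prometheus.metrics.control-plane.coredns",
        "writeRelabelConfigs": [{"action": "keep", "regex": "CUSTOM"}],
    },
]
warnings = migrate_regex(
    remote_write, "/prometheus.metrics.control-plane.coredns", "OLD", "NEW"
)
for message in warnings:
    print("WARNING:", message)
```

Comparing against the exact old default keeps the migration conservative: a customized regex is never overwritten, only reported.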

@pmalek-sumo pmalek-sumo added this to the v2.0 milestone Dec 21, 2020
Contributor

@sumo-drosiek sumo-drosiek left a comment

Thanks!

@pmalek-sumo pmalek-sumo merged commit 188f2c7 into main Dec 21, 2020
@pmalek-sumo pmalek-sumo deleted the v2-migration-prometheus-remote-writes branch December 21, 2020 11:55
3 participants