Add sync metrics #560

Merged
merged 28 commits into from
Apr 23, 2020

Conversation

sozercan
Member

@sozercan sozercan commented Apr 16, 2020

Adds the following metrics for sync:

  • sync
    • Number of resources of each kind being cached
    • Tags:
      • kind [pod, namespace, ...] (kinds specified in config object)
      • status [active, error]
    • Aggregation: LastValue
  • sync_duration_seconds
    • Latency of sync operation in seconds
    • Tags: none
    • Aggregation: Distribution
  • sync_last_run_time
    • Timestamp of last sync operation
    • Tags: none
    • Aggregation: LastValue
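
To make the two aggregation types above concrete, here is a minimal plain-Go sketch of how they behave. It is an illustration only, not the code in this PR and not the API of any metrics library: `Tags`, `LastValue`, `Distribution`, `SyncReporter`, and the bucket bounds are all hypothetical names and values chosen for the example. A LastValue keeps only the most recent measurement per tag combination, while a Distribution counts measurements into buckets.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Tags mirrors the two tag keys listed for the sync metric (hypothetical type).
type Tags struct {
	Kind   string // e.g. "pod", "namespace"
	Status string // "active" or "error"
}

// LastValue keeps only the most recent measurement per tag combination.
type LastValue struct{ values map[Tags]float64 }

func NewLastValue() *LastValue { return &LastValue{values: map[Tags]float64{}} }

func (l *LastValue) Record(t Tags, v float64) { l.values[t] = v }
func (l *LastValue) Get(t Tags) float64       { return l.values[t] }

// Distribution counts measurements into buckets split at sorted bounds.
// In this sketch, bucket i holds values v with bounds[i-1] <= v < bounds[i].
type Distribution struct {
	bounds []float64
	counts []int64 // len(bounds)+1 entries; the last one is the overflow bucket
	count  int64
	sum    float64
}

func NewDistribution(bounds ...float64) *Distribution {
	b := append([]float64(nil), bounds...)
	sort.Float64s(b)
	return &Distribution{bounds: b, counts: make([]int64, len(b)+1)}
}

func (d *Distribution) Record(v float64) {
	// Index of the first bound strictly greater than v.
	i := sort.Search(len(d.bounds), func(i int) bool { return d.bounds[i] > v })
	d.counts[i]++
	d.count++
	d.sum += v
}

// SyncReporter groups the three metrics described above (hypothetical type).
type SyncReporter struct {
	sync        *LastValue    // sync: resources cached per kind and status
	duration    *Distribution // sync_duration_seconds
	lastRunTime float64       // sync_last_run_time, in Unix seconds
}

func NewSyncReporter(bounds ...float64) *SyncReporter {
	return &SyncReporter{sync: NewLastValue(), duration: NewDistribution(bounds...)}
}

func (r *SyncReporter) ReportSync(t Tags, n int)          { r.sync.Record(t, float64(n)) }
func (r *SyncReporter) ReportSyncDuration(d time.Duration) { r.duration.Record(d.Seconds()) }
func (r *SyncReporter) ReportLastSync(now time.Time) {
	r.lastRunTime = float64(now.UnixNano()) / 1e9
}

func main() {
	r := NewSyncReporter(0.5, 1, 5) // bucket bounds in seconds, chosen arbitrarily
	start := time.Now()
	r.ReportSync(Tags{Kind: "pod", Status: "active"}, 12)
	r.ReportSync(Tags{Kind: "namespace", Status: "error"}, 1)
	r.ReportSyncDuration(time.Since(start))
	r.ReportLastSync(time.Now())
	fmt.Println("pods active:", r.sync.Get(Tags{Kind: "pod", Status: "active"}))
	fmt.Println("duration buckets:", r.duration.counts)
	fmt.Println("last run (unix seconds):", r.lastRunTime)
}
```

Because `sync` is a LastValue, reporting a kind twice overwrites the earlier count rather than adding to it, which is what you want for a gauge-like "how many are cached right now" metric.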

Fixes #487

@sozercan sozercan force-pushed the metrics-sync branch 2 times, most recently from cac49d0 to bfd4aff Compare April 16, 2020 16:57
sozercan and others added 20 commits April 21, 2020 14:34
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
…#532)

* Add by-pod status design doc and docs from shomron

Signed-off-by: Max Smythe <smythe@google.com>

* Fix typo

Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
…nt#547)

Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
…licy-agent#544)

* Add example of using a deny-all template to view request obj

Signed-off-by: Max Smythe <smythe@google.com>

* Fix bugs in deny all example

Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Rita Zhang <rita.z.zhang@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* Fix PSP sysctls rego

Run forbidden check only when sysctls list in Pod has items

Signed-off-by: Philip Laine <philip.laine@gmail.com>

* Add test case

Signed-off-by: Philip Laine <philip.laine@gmail.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Since the controller manager will no longer restart when certRotator
returns, the return after a cert rotation in certRotator.Start() makes
certRotator stop working once the first cert rotation is done, even
though the stop channel has not been closed.

This commit keeps the certRotator blocked and rotating certs
periodically until the stop channel is closed. There's no need to
trigger a restart for webhooks to pick up the new certs because
CertWatcher can renew the webhook certs on the fly.
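
The "keep blocked and rotate periodically until the stop channel is closed" behaviour described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual certRotator code: `rotateLoop`, its parameters, and the 10 ms interval are hypothetical. The tick channel is generic so that either a `time.Ticker` or a hand-driven channel can feed the loop.

```go
package main

import (
	"fmt"
	"time"
)

// rotateLoop blocks until stop is closed, calling rotate once per tick.
// It returns the number of rotation checks it ran. A failed rotation is
// logged and the loop keeps going rather than returning.
func rotateLoop[T any](ticks <-chan T, rotate func() error, stop <-chan struct{}) int {
	runs := 0
	for {
		select {
		case <-ticks:
			if err := rotate(); err != nil {
				fmt.Println("rotation check failed:", err)
			}
			runs++
		case <-stop:
			return runs
		}
	}
}

func main() {
	ticker := time.NewTicker(10 * time.Millisecond) // stand-in for the rotation check frequency
	defer ticker.Stop()

	stop := make(chan struct{})
	done := make(chan int)
	go func() {
		done <- rotateLoop(ticker.C, func() error { return nil }, stop)
	}()

	time.Sleep(55 * time.Millisecond)
	close(stop) // only closing stop ends the loop
	fmt.Println("rotation checks before stop:", <-done)
}
```

The key difference from the earlier behaviour is that a successful rotation no longer causes a return; the only exit path is the stop channel.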

Tested on a GKE cluster:
I set rotationCheckFrequency to 1 minute, certValidityDuration to 10
minutes and lookaheadTime() to 6 minutes to test the behaviour of a cert
rotation and how webhooks react.

There would always be 1 manager restart as expected since the webhook
couldn't find the cert files in the first start.

I also observed a lag of about 20s before CertWatcher updates the
webhook certs after each cert rotation. During these 20s, there are bad
certificate TLS handshake errors. After CertWatcher updates the webhook
certs, the webhook works fine for the remaining 4 minutes.

Considering the actual certValidityDuration is 10 years, a 20-second
webhook downtime that occurs only after a cert rotation is probably okay.

Signed-off-by: Yiqi Gao <yiqigao@google.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
…icy-agent#548)

* update Dockerfile to include dynamic build args for platform and architecture

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* add new docker-buildx func for building cross platform multi architecture images

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* add EOL at EOF

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* Adding extra comment with docker buildx links

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* Removing clone depth and version build arg

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* update .dockerignore and COPY rather than clone repository for local development

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* Update .dockerignore

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* COPY explicit references

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* revert individual COPY cmd in Dockerfile

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* revert image to use golang 1.13

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* remove GO111MODULE

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>

* update Dockerfile

Signed-off-by: Michael Fornaro <20387402+xUnholy@users.noreply.github.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
maxsmythe and others added 5 commits April 21, 2020 14:34
* Run audit in a deployment

Audit should still be run as a deployment so that
we can control update behavior if necessary and
get whatever other app management features
come with deployments in the future.

Signed-off-by: Max Smythe <smythe@google.com>

* Update helm chart generation

Signed-off-by: Max Smythe <smythe@google.com>

* Fix indentation on audit manager pull policy

Signed-off-by: Max Smythe <smythe@google.com>

* Fix audit pod/deployment labels

Signed-off-by: Max Smythe <smythe@google.com>

* Manager image patch for audit should reference a deployment

Signed-off-by: Max Smythe <smythe@google.com>

* Forgot to escape a newline

Signed-off-by: Max Smythe <smythe@google.com>

* Fix wrong groupVersion for deployment patch

Signed-off-by: Max Smythe <smythe@google.com>

* Regenerate manifests

Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Add new fields in CertRotator and ReconcileVWH structs to store previous
"Gatekeeper" constants and get rid of referencing global constants in the
code.

Refactor AddRotator() to take CertRotator and vwhName as parameters.
Pass all "Gatekeeper" constants from the caller instead.

Tested on a GKE cluster with "make test-e2e".

Signed-off-by: Yiqi Gao <yiqigao@google.com>

Co-authored-by: Max Smythe <smythe@google.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
* Allow for 'm' memory unit in examples

Signed-off-by: Robert Sheehy <rob.sheehy@workiva.com>

* Use millibyte as base unit for mem parsing example

Signed-off-by: Robert Sheehy <rob.sheehy@workiva.com>

* Ensure numbers are parsed as millibytes

Signed-off-by: Robert Sheehy <rob.sheehy@workiva.com>

Co-authored-by: Max Smythe <smythe@google.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
This avoids issues like infinitely appending fields
to manager_image_patch.yaml and people's build files
failing due to an outdated patch.

Signed-off-by: Max Smythe <smythe@google.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Signed-off-by: Robert Sheehy <rob.sheehy@workiva.com>
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
Contributor

@maxsmythe maxsmythe left a comment

LGTM

@sozercan sozercan merged commit 9c74d2e into open-policy-agent:master Apr 23, 2020
@sozercan sozercan deleted the metrics-sync branch April 23, 2020 02:53
Development

Successfully merging this pull request may close these issues.

Metrics for Syncing Resources
7 participants