Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Add gatekeeper-webhook post install hook to Helm chart #2052

Merged

Conversation

joaoubaldo
Copy link
Contributor

Signed-off-by: Joao Ubaldo me@joaoubaldo.com

What this PR does / why we need it:
This PR adds a postInstall Job/hook to probe gatekeeper-webhook API availability

Which issue(s) this PR fixes:
Fixes #1156

Special notes for your reviewer:
curlimages repository was used but if this is not reasonable I can add a custom Dockerfile and corresponding CI steps to this PR as well.

@sozercan
Copy link
Member

@joaoubaldo thanks for the PR! can you run make manifests and commit the changes?

Signed-off-by: Joao Ubaldo <me@joaoubaldo.com>
@joaoubaldo joaoubaldo force-pushed the 1156-helm-webhook-post-install branch from baf1d35 to 918cc3c Compare May 18, 2022 08:47
@joaoubaldo
Copy link
Contributor Author

@sozercan done. I also renamed the hook and its values so that it is clearer what they are about.

@codecov-commenter
Copy link

codecov-commenter commented May 20, 2022

Codecov Report

Merging #2052 (c5a62d5) into master (206bbe9) will decrease coverage by 0.13%.
The diff coverage is 40.00%.

@@            Coverage Diff             @@
##           master    #2052      +/-   ##
==========================================
- Coverage   54.50%   54.37%   -0.14%     
==========================================
  Files         111      111              
  Lines        9470     9470              
==========================================
- Hits         5162     5149      -13     
- Misses       3921     3931      +10     
- Partials      387      390       +3     
Flag Coverage Δ
unittests 54.37% <40.00%> (-0.14%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...onstrainttemplate/constrainttemplate_controller.go 54.91% <33.33%> (-3.12%) ⬇️
pkg/webhook/policy.go 37.53% <50.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9056cbb...c5a62d5. Read the comment docs.

@ritazh
Copy link
Member

ritazh commented May 25, 2022

Thanks for the PR @joaoubaldo! This PR currently ensures the helm install is not marked complete until the webhook is available. It however, does not ensure the namespace label container only runs when the webhook is available. So we will continue to get those errors. What do you think about adding this curl check as an initContainer to the gatekeeper-update-namespace-label job such that the namespace label container only runs after we confirm the webhook is available?

@joaoubaldo
Copy link
Contributor Author

Thanks for the review @ritazh. That is a good point. I wasn't considering that the label added by the gatekeeper-update-namespace-label Job was causing the webhook to be called. I will test the initContainer approach and update the PR after.

@joaoubaldo
Copy link
Contributor Author

joaoubaldo commented May 26, 2022

@ritazh I have updated the PR to make the webhook probe process be part of the gatekeeper-update-namespace-label Job as an initContainer. Since .Values.postInstall.labelNamespace can be explicitly disabled, I think it still makes sense to run the gatekeeper-probe-webhook-post-install Job to ensure the deployment only succeeds when the webhook is ready - is this reasonable to you? Essentially gatekeeper-update-namespace-label and gatekeeper-probe-webhook-post-install Jobs are mutually exclusive.

Below are the manifests for both scenarios.

$ helm template opa-gatekeeper -n gatekeeper-system --set postInstall.probeWebhook.enabled=true --set postInstall.labelNamespace.enabled=true ...:

---
# Source: gatekeeper/templates/namespace-post-install.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gatekeeper-update-namespace-label
  labels:
    release: opa-gatekeeper
    heritage: Helm
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation
---
# Source: gatekeeper/templates/namespace-post-install.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gatekeeper-update-namespace-label
  labels:
    release: opa-gatekeeper
    heritage: Helm
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation
rules:
  - apiGroups:
      - ""
    resources:
      - namespaces
    verbs:
      - get
      - update
      - patch
    resourceNames:
      - gatekeeper-system
---
# Source: gatekeeper/templates/namespace-post-install.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: gatekeeper-update-namespace-label
  labels:
    release: opa-gatekeeper
    heritage: Helm
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: gatekeeper-update-namespace-label
subjects:
  - kind: ServiceAccount
    name: gatekeeper-update-namespace-label
    namespace: "gatekeeper-system"
---
# Source: gatekeeper/templates/namespace-post-install.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gatekeeper-update-namespace-label
  labels:
    app: 'gatekeeper'
    chart: 'gatekeeper'
    gatekeeper.sh/system: "yes"
    heritage: 'Helm'
    release: 'opa-gatekeeper'
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation
spec:
  template:
    metadata:
      labels:
        app: 'gatekeeper'
        release: 'opa-gatekeeper'
    spec:
      restartPolicy: OnFailure
      serviceAccount: gatekeeper-update-namespace-label
      nodeSelector:
        kubernetes.io/os: linux
      volumes:
        - name: cert
          secret:
            secretName: gatekeeper-webhook-server-cert
      initContainers:
        - name: webhook-probe-post
          image: "curlimages/curl:7.83.1"
          imagePullPolicy: IfNotPresent
          args:
            - "--retry"
            - "99999"
            - "--retry-max-time"
            - "60"
            - "--retry-delay"
            - "1"
            - "--max-time"
            - "2"
            - "--cacert"
            - /certs/ca.crt
            - "-v"
            - "https://gatekeeper-webhook-service.gatekeeper-system.svc/v1/admitlabel?timeout=2s"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - all
            readOnlyRootFilesystem: true
            runAsGroup: 999
            runAsNonRoot: true
            runAsUser: 1000
          volumeMounts:
          - mountPath: /certs
            name: cert
            readOnly: true
      containers:
        - name: kubectl-label
          image: "openpolicyagent/gatekeeper-crds:v3.9.0-beta.0"
          imagePullPolicy: IfNotPresent
          args:
            - label
            - ns
            - gatekeeper-system
            - admission.gatekeeper.sh/ignore=no-self-managing
            - --overwrite
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - all
            readOnlyRootFilesystem: true
            runAsGroup: 999
            runAsNonRoot: true
            runAsUser: 1000

$ helm template opa-gatekeeper -n gatekeeper-system --set postInstall.probeWebhook.enabled=true --set postInstall.labelNamespace.enabled=false ...:

---
# Source: gatekeeper/templates/probe-webhook-post-install.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gatekeeper-probe-webhook-post-install
  labels:
    app: 'gatekeeper'
    chart: 'gatekeeper'
    gatekeeper.sh/system: "yes"
    heritage: 'Helm'
    release: 'opa-gatekeeper'
  annotations:
    "helm.sh/hook": post-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation
spec:
  template:
    metadata:
      labels:
        app: 'gatekeeper'
        release: 'opa-gatekeeper'
    spec:
      restartPolicy: Never
      nodeSelector:
        kubernetes.io/os: linux
      volumes:
        - name: cert
          secret:
            secretName: gatekeeper-webhook-server-cert
      containers:
        - name: webhook-probe-post
          image: "curlimages/curl:7.83.1"
          imagePullPolicy: IfNotPresent
          args:
            - "--retry"
            - "99999"
            - "--retry-max-time"
            - "60"
            - "--retry-delay"
            - "1"
            - "--max-time"
            - "2"
            - "--cacert"
            - /certs/ca.crt
            - "-v"
            - "https://gatekeeper-webhook-service.gatekeeper-system.svc/v1/admitlabel?timeout=2s"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - all
            readOnlyRootFilesystem: true
            runAsGroup: 999
            runAsNonRoot: true
            runAsUser: 1000
          volumeMounts:
          - mountPath: /certs
            name: cert
            readOnly: true

Copy link
Member

@ritazh ritazh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ritazh ritazh requested a review from a team May 31, 2022 15:22
Copy link
Contributor

@maxsmythe maxsmythe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ritazh ritazh merged commit c311365 into open-policy-agent:master Jun 1, 2022
davis-haba pushed a commit to davis-haba/gatekeeper that referenced this pull request Jul 19, 2022
…icy-agent#2052)

* (Fix 1156) Helm: Add gatekeeper-probe-webhook post install probe

Signed-off-by: Joao Ubaldo <me@joaoubaldo.com>

* (Fix 1156) Launch webhook probe as part of labelNamespace postInstall job

Signed-off-by: Joao Ubaldo <me@joaoubaldo.com>

Co-authored-by: Rita Zhang <rita.z.zhang@gmail.com>
Co-authored-by: Sertaç Özercan <852750+sozercan@users.noreply.github.com>
Signed-off-by: davis-haba <davishaba@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"check-ignore-label.gatekeeper.sh" webhook not ready to accept connection just after installation
5 participants