Missing Kubernetes deployment/service causes Gloo to silently fail - breaks routing #5022

Closed
kevin-shelaga opened this issue Jul 14, 2021 · 1 comment · Fixed by #5307
Labels: Priority: High (required in next 3 months to make progress, bugs that affect multiple users, or very bad UX); Size: M (3 - 5 days); Type: Bug (something isn't working)

kevin-shelaga commented Jul 14, 2021

Describe the bug
A missing Kubernetes deployment/service causes Gloo to fail silently and breaks routing. It appears that routing fails for all VirtualServices (VSs) when this error occurs, even for those that do not reference the affected RouteTable (RT) or Upstream (US).

To Reproduce
Steps to reproduce the behavior:

helm install gloo glooe/gloo-ee --namespace gloo-system --create-namespace --set-string license_key=$KEY --version v1.8.0
cat <<EOF | kubectl apply -f -
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: default-petstore-8080
  namespace: gloo-system
spec:
  discoveryMetadata:
    labels:
      service: petstore
  kube:
    selector:
      app: petstore
    serviceName: petstore
    serviceNamespace: default
    servicePort: 8080
---  
apiVersion: gateway.solo.io/v1
kind: RouteTable
metadata:
  name: "my-rt"
  namespace: "gloo-system"
spec:
  routes:
    - matchers:
        - prefix: /
      routeAction:
        single:
          upstream:
            name: default-petstore-8080
            namespace: gloo-system
---
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: my-vs
  namespace: gloo-system
spec:
  virtualHost:
    domains:
      - "*"  
    routes:
      - matchers:
        - prefix: /
        options:
          extauth:
            disable: true
        delegateAction:
          ref:
            name: "my-rt"
            namespace: "gloo-system"
EOF
kubectl logs deploy/gloo -n gloo-system

{"level":"warn","ts":1626295142.0915785,"logger":"gloo-ee.v1.event_loop.setup.v1.event_loop.envoyTranslatorSyncer","caller":"syncer/envoy_translator_syncer.go:140","msg":"proxy gloo-system.gateway-proxy was rejected due to invalid config: 2 errors occurred:\n\t* invalid resource gloo-system.default-petstore-8080\n\t* Upstream name:\"default-petstore-8080\"  namespace:\"gloo-system\" references the service \"petstore\" which does not exist in namespace \"default\"\n\n\nAttempting to update only EDS information","version":"1.8.0"}
glooctl check

Checking deployments... OK
Checking pods... OK
Checking upstreams... OK
Checking upstream groups... OK
Checking auth configs... OK
Checking rate limit configs... OK
Checking VirtualHostOptions... OK
Checking RouteOptions... OK
Checking secrets... OK
Checking virtual services... OK
Checking gateways... OK
Checking proxies... OK
No problems detected.
I07 60206 request.go:645] Throttling request took 1.019111389s, request: GET:https://1.1.1.1/apis/acme.cert-manager.io/v1alpha3?timeout=32s
Detected Gloo Federation!
glooctl get us

...
| default-petstore-8080                                | Kubernetes | Accepted | svc name:      petstore             |
|                                                      |            |          | svc namespace: default              |
|                                                      |            |          | port:          8080                 |
|                                                      |            |          |                                     |
...
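
The warning in the gloo logs above is one JSON object per line, so it can be picked out mechanically rather than by eye. A minimal sketch in Python, assuming the log format shown in this report (`level` and `msg` fields) and matching on the "was rejected due to invalid config" text from that one message. The function name `rejected_proxy_warnings` is made up for illustration and is not part of glooctl or Gloo:

```python
import json


def rejected_proxy_warnings(log_text):
    """Return the parsed log entries that report a rejected proxy.

    Assumes one JSON object per line, as in the gloo log excerpt above.
    Lines that are not valid JSON objects (e.g. klog-style output) are skipped.
    """
    hits = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if "was rejected due to invalid config" in str(entry.get("msg", "")):
            hits.append(entry)
    return hits


# The log line from the report, shortened to the fields used here.
sample = (
    '{"level":"warn","ts":1626295142.0915785,'
    '"msg":"proxy gloo-system.gateway-proxy was rejected due to invalid config: '
    '2 errors occurred:\\n\\t* invalid resource gloo-system.default-petstore-8080"}'
)

for entry in rejected_proxy_warnings(sample):
    # Print the log level and the first line of the message.
    print(entry["level"], "-", entry["msg"].splitlines()[0])
```

Feeding the output of `kubectl logs deploy/gloo -n gloo-system` to a script like this would surface the rejection even while `glooctl check` reports no problems and the upstream is listed as Accepted.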

Expected behavior

  1. The warning logged by the gloo deployment should be an error.
  2. glooctl should report an error.
  3. The upstream with the missing K8s deployment should NOT be accepted.
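
The third expectation amounts to a reference check: a kube Upstream should only be accepted if the Service it names exists. A minimal sketch of that rule in Python, which is not Gloo's actual validation code. The `KubeUpstream` type, the `upstream_status` function, and the `existing_services` set are all hypothetical stand-ins for what the control plane would read from the cluster:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KubeUpstream:
    # Mirrors the metadata and spec.kube fields of the Upstream in the report.
    name: str
    namespace: str
    service_name: str
    service_namespace: str
    service_port: int


def upstream_status(upstream, existing_services):
    """Return ("Accepted", None) or ("Rejected", reason).

    existing_services is a set of (namespace, name) pairs standing in
    for the Services present in the cluster (an assumption of this sketch).
    """
    key = (upstream.service_namespace, upstream.service_name)
    if key not in existing_services:
        reason = (
            f'Upstream name:"{upstream.name}" namespace:"{upstream.namespace}" '
            f'references the service "{upstream.service_name}" which does not '
            f'exist in namespace "{upstream.service_namespace}"'
        )
        return "Rejected", reason
    return "Accepted", None


petstore = KubeUpstream(
    name="default-petstore-8080",
    namespace="gloo-system",
    service_name="petstore",
    service_namespace="default",
    service_port=8080,
)

# No petstore Service exists, as in the reproduction steps.
print(upstream_status(petstore, existing_services=set()))
# Once the Service exists, the same upstream passes the check.
print(upstream_status(petstore, existing_services={("default", "petstore")}))
```

Under this rule the upstream from the reproduction steps would show as Rejected instead of Accepted, and only that upstream would carry the error.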

Additional context

  • Gloo Edge version - v1.8.0
  • Kubernetes version - 1.20

kevin-shelaga added the Type: Bug label Jul 14, 2021
chrisgaun added the Priority: High label Jul 16, 2021
chrisgaun added this to the OpenShift milestone Jul 16, 2021
chrisgaun removed the oncall label Sep 3, 2021