
[kibana] initContainer: configure-kibana-token: Back-off restarting failed container: #1714

Closed
mdnfiras opened this issue Oct 27, 2022 · 9 comments · Fixed by #1720
Labels: bug (Something isn't working), kibana

Comments

@mdnfiras

mdnfiras commented Oct 27, 2022

Chart version:
8.4.1 (from the main branch at this point in history)

Kubernetes version:
1.21

Kubernetes provider:
GKE (Google Kubernetes Engine)

Describe the bug:
Kibana's configure-kibana-token initContainer keeps crashing and restarting indefinitely.

Steps to reproduce:

  1. Deploy the Elasticsearch 8 Helm chart and enable security features (username/password + SSL)
  2. Deploy the Kibana 8 Helm chart and reference the appropriate Elasticsearch credentials and certificate secrets
  3. After the Kibana pod runs successfully for the first time, delete it
  4. The new Kibana pod's configure-kibana-token initContainer crashes and restarts forever.

Expected behavior:
The new Kibana pod's configure-kibana-token initContainer completes successfully.

Provide logs and/or server output (if relevant):
configure-kibana-token initContainer logs before crashing:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (22) The requested URL returned error: 409 Conflict

This init container creates a token for Kibana's service account and saves it for Kibana's actual container.
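
For context, here is a rough sketch of what that init container logic boils down to (the token name, Elasticsearch URL, output path, and JSON parsing below are assumptions for illustration, not the chart's actual template):

  # Rough sketch of the configure-kibana-token init container (illustrative only).
  TOKEN_NAME="mykibana8-kibana"
  ES_URL="https://elasticsearch-master:9200"

  # Ask Elasticsearch to create a service account token for elastic/kibana.
  # --fail makes curl exit non-zero on HTTP errors, which is why a 409
  # ("token already exists") crashes the init container when the pod is recreated.
  RESPONSE=$(curl --fail -k -u "$ELASTIC_USERNAME:$ELASTIC_PASSWORD" \
    -X POST "$ES_URL/_security/service/elastic/kibana/credential/token/$TOKEN_NAME")

  # Extract the token value and hand it to the main Kibana container via a shared volume.
  printf '%s' "$RESPONSE" | sed -n 's/.*"value" *: *"\([^"]*\)".*/\1/p' \
    > /usr/share/kibana/config/tokens/elasticsearch-service-token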

If I run a similar command from within the Elasticsearch pods:

  curl -k -u $ELASTIC_USERNAME:$ELASTIC_PASSWORD -XPOST https://localhost:9200/_security/service/elastic/kibana/credential/token/mykibana8-kibana?pretty

I get the following response:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "version_conflict_engine_exception",
        "reason" : "[service_account_token-elastic/kibana/mykibana8-kibana]: version conflict, document already exists (current version [1])",
        "index_uuid" : "zwuGuJcSS2OUs1ClsFdThB",
        "shard" : "0",
        "index" : ".security-7"
      }
    ],
    "type" : "version_conflict_engine_exception",
    "reason" : "[service_account_token-elastic/kibana/mykibana8-kibana]: version conflict, document already exists (current version [1])",
    "index_uuid" : "zwuGuJcSS2OUs1ClsFdThB",
    "shard" : "0",
    "index" : ".security-7"
  },
  "status" : 409
}

If I manually delete that token:

  curl -k -u $ELASTIC_USERNAME:$ELASTIC_PASSWORD -XDELETE https://localhost:9200/_security/service/elastic/kibana/credential/token/mykibana8-kibana?pretty

I get:

{
  "found" : true
}

Then the pod can start. But again, if that pod dies, the next one gets stuck the same way.

@mdnfiras
Author

Fix proposal: #1715

@jmlrt jmlrt self-assigned this Nov 3, 2022
@jmlrt jmlrt added the bug and kibana labels Nov 3, 2022
@jmlrt
Member

jmlrt commented Nov 3, 2022

Relates to #1679 (comment)

@jmlrt
Member

jmlrt commented Nov 3, 2022

Thanks, @mdnfiras for submitting this issue and PR 👍🏻

Indeed, it seems I forgot to handle the case where a pod is destroyed in #1679.

@mdnfiras
Author

mdnfiras commented Nov 3, 2022

> Thanks, @mdnfiras for submitting this issue and PR 👍🏻
>
> Indeed, it seems I forgot to handle the case where a pod is destroyed in #1679.

and my fix proposal doesn't handle the case of having multiple replicas.

@jmlrt
Member

jmlrt commented Nov 3, 2022

> and my fix proposal doesn't handle the case of having multiple replicas.

Yes, that's what I'm trying to balance. From memory, we already had some issues with multiple replicas for Kibana, and it wasn't really supported. However, I can't find any reference to that in the old GitHub tickets, so I'm not sure whether we can stick to that and merge your PR, or whether we need to find a way to address multiple replicas.

@jmlrt
Member

jmlrt commented Nov 3, 2022

If we need to handle multiple replicas/pods, there are different ways to do it:

  • Creating a specific Kibana token for each replica/pod (see the sketch after this list)
    • Pros:
      • This can be done by reusing the current init container
    • Cons:
      • Need to add a suffix to the token name that relates it to the pod where it will be used
      • Need to find a way to clean up the orphaned tokens regularly
  • Creating a single Kibana token that will be shared by all replicas/pods
    • Pros:
      • No need to manage orphaned-token cleanup, as the single token is still cleaned up during the post-delete hook
    • Cons:
      • Need to create the token in a pre-install hook and find a way to pass it to the different pods (using a volume created in the pre-install hook as well?)
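
A minimal sketch of the first option, assuming the pod name is injected into the init container via the downward API (all names below are illustrative):

  # Sketch of option 1: a per-pod token name derived from the pod name.
  # Assumes POD_NAME is injected via the downward API
  # (valueFrom: fieldRef: fieldPath: metadata.name).
  TOKEN_NAME="mykibana8-kibana-${POD_NAME}"
  ES_URL="https://elasticsearch-master:9200"

  # Delete any leftover token for this pod name (ignore a 404), then create a fresh one.
  # Tokens belonging to pods that no longer exist would still need a separate, periodic cleanup.
  curl -k -u "$ELASTIC_USERNAME:$ELASTIC_PASSWORD" \
    -X DELETE "$ES_URL/_security/service/elastic/kibana/credential/token/$TOKEN_NAME" || true
  curl --fail -k -u "$ELASTIC_USERNAME:$ELASTIC_PASSWORD" \
    -X POST "$ES_URL/_security/service/elastic/kibana/credential/token/$TOKEN_NAME"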

@mdnfiras
Author

mdnfiras commented Nov 3, 2022

Or we can run a job on helm install to:

  1. try to delete any existing token
  2. create a new token and save it as a secret (this will need to talk to the Kubernetes API)

Then we can mount the secret into the Kibana container directly.
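
As a rough sketch of such a job (assuming the job image ships curl and kubectl, and with the release and Secret names below as placeholders):

  # Sketch of a pre-install job: recreate the service token and store it in a Secret.
  set -euo pipefail
  TOKEN_NAME="${RELEASE_NAME}-kibana"
  ES_URL="https://elasticsearch-master:9200"

  # 1. Try to delete any existing token (ignore the error if it does not exist).
  curl -k -u "$ELASTIC_USERNAME:$ELASTIC_PASSWORD" \
    -X DELETE "$ES_URL/_security/service/elastic/kibana/credential/token/$TOKEN_NAME" || true

  # 2. Create a new token and extract its value from the JSON response.
  TOKEN_VALUE=$(curl --fail -k -u "$ELASTIC_USERNAME:$ELASTIC_PASSWORD" \
    -X POST "$ES_URL/_security/service/elastic/kibana/credential/token/$TOKEN_NAME" \
    | sed -n 's/.*"value" *: *"\([^"]*\)".*/\1/p')

  # 3. Save it as a Secret; kubectl talks to the Kubernetes API and base64-encodes the data.
  kubectl create secret generic "${RELEASE_NAME}-kibana-es-token" \
    --from-literal=token="$TOKEN_VALUE" \
    --dry-run=client -o yaml | kubectl apply -f -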

@jmlrt
Member

jmlrt commented Nov 4, 2022

Indeed, I think the best solution is to use a pre-install job that:

  1. deletes the existing token matching the Helm chart release name
  2. creates a new token matching the Helm chart release name
  3. encodes the token in base64
  4. calls the K8S API to create a secret with the base64-encoded token

Then mount the secret into all Kibana pods and finally remove the token + secret in a post-delete job.

I was already able to test the secret creation from a pod using the K8S API.
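
For reference, that in-pod Secret creation boils down to a single call against the Kubernetes API using the pod's service account credentials. A minimal sketch (assuming the service account has RBAC permission to create Secrets, and with an illustrative Secret name):

  # Sketch: create a Secret from inside a pod via the Kubernetes REST API.
  # Assumes TOKEN_VALUE holds the Kibana service token created earlier.
  APISERVER="https://kubernetes.default.svc"
  SA_DIR="/var/run/secrets/kubernetes.io/serviceaccount"
  NAMESPACE=$(cat "$SA_DIR/namespace")

  # Secret data must be base64-encoded.
  B64_TOKEN=$(printf '%s' "$TOKEN_VALUE" | base64 | tr -d '\n')

  curl --fail --cacert "$SA_DIR/ca.crt" \
    -H "Authorization: Bearer $(cat "$SA_DIR/token")" \
    -H "Content-Type: application/json" \
    -X POST "$APISERVER/api/v1/namespaces/$NAMESPACE/secrets" \
    -d "{\"apiVersion\": \"v1\", \"kind\": \"Secret\", \"metadata\": {\"name\": \"mykibana8-kibana-es-token\"}, \"data\": {\"token\": \"$B64_TOKEN\"}}"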

Now, I'm trying to write a Node.js script to do all the pre-install steps.
I'm struggling a bit since that's my first time looking at Node and JavaScript 😉.
This would have been a lot easier in Python, for example; however, Node is the only language interpreter installed in the Kibana Docker image and there is no jq to parse JSON, so using bash is not an option :(

@jmlrt
Member

jmlrt commented Nov 4, 2022

Indeed, I think the best solution is to use a pre-install job that:

  1. delete the existing token matching the Helm chart release name
  2. create a new token matching the Helm chart release name
  3. encode the token in base64
  4. call the K8S api to create a secret with the base64 encoded token

Then mount the secret into all kibana pods and finaly remove the token + secret in a post-delete job.

I was already able to test the secret creation from a pod using k8s api.

Now, I'm trying to write a node JS script to do all the pre-install steps. I'm struggling a bit since that's my first time looking at node and javascript 😉. This would have been a lot easier in Python for example, however, node is the only language interpreter installed into the Kibana Docker image and there is no jq to parse json so using bash is not an option :(

PR in progress => #1720 (still a few things to fix 🤞🏻)
