Skip to content
This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

aws,packet,baremetal: Remove rkt #946

Merged
merged 6 commits into from
Oct 8, 2020
Merged

aws,packet,baremetal: Remove rkt #946

merged 6 commits into from
Oct 8, 2020

Conversation

surajssd
Copy link
Member

@surajssd surajssd commented Sep 11, 2020

This commit removes rkt and uses docker to start services.

etcd(for controllers only):

  • Move the env var to a separate env var file.
  • Add etcd service file.
  • Change the name of service from etcd-member to etcd.

Bootkube(for controllers only):

  • Use docker run instead of rkt.

Kubelet(for controllers and workers):

  • Remove some of the folder creation in ExecStartPre, these are
    automatically created by docker, when mounted using -v flag.

delete-node(for workers):

  • Use docker run instead of rkt.

Fixes #720
Fixes #917

Release Notes

  • Change from etcd-member.service to etcd.service. Old mechanism of running etcd(using etcd-wrapper) entailed we use the Flatcar shipped etcd-member.service.
  • New etcd env vars file: /etc/kubernetes/etcd.env on controller hosts.
  • Remove all dependency on rkt.

@surajssd surajssd marked this pull request as draft September 11, 2020 08:08
@surajssd surajssd force-pushed the surajssd/remove-rkt branch 2 times, most recently from 180f662 to 02acf75 Compare September 11, 2020 09:37
@surajssd surajssd marked this pull request as ready for review September 11, 2020 10:39
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#449 would be nice 😢

Overall LGTM, left some questions.

invidian
invidian previously approved these changes Sep 14, 2020
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Comment on lines 19 to 25
ExecStart=sh -c "docker run --network=host \
-u $(id -u \"$${ETCD_USER}\"):$(id -u \"$${ETCD_USER}\") \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we remove the sh -c?

I guess it is used for the id -u part. But docker takes a string for -u param. Do we really need to run that? Even if we do, can't we add a variable for that in the env file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it should be possible to remove it. Good point @rata

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tl;dr; string user names don't work on flatcar.


On Flatcar when I do this I get the user id:

# id -u etcd
232

But there is no entry for that user in /etc/passwd:

# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
core:x:500:500:Flatcar Admin:/home/core:/bin/bash
systemd-timesync:x:997:997:systemd Time Synchronization:/:/sbin/nologin
systemd-coredump:x:996:996:systemd Core Dumper:/:/sbin/nologin

So docker fails:

# docker run -u etcd fedora bash
/run/torcx/bin/docker: Error response from daemon: linux spec user: unable to find user etcd: no matching entries in passwd file.
ERRO[0000] error waiting for container: context canceled

There is a workaround you can use which is echo 'etcd:x:232:232::/dev/null:/sbin/nologin' | sudo tee -a /etc/passwd but this does not work as well.

So there is another passwd file at play here: https://github.com/flatcar-linux/baselayout/blob/flatcar-master/baselayout/passwd, this is where we get the user id of etcd from.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a workaround you can use which is echo 'etcd:x:232:232::/dev/null:/sbin/nologin' | sudo tee -a /etc/passwd but this does not work as well.

Can't we create the user using ignition too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added following line, but there was no effect:

diff --git a/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl b/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controll
er.yaml.tmpl
index fd50be82a..e911708ee 100644
--- a/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
+++ b/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
@@ -252,3 +252,4 @@ passwd:
   users:
     - name: core
       ssh_authorized_keys: ${ssh_keys}
+    - name: etcd

on the controller host:

# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
core:x:500:500:Flatcar Admin:/home/core:/bin/bash
systemd-timesync:x:997:997:systemd Time Synchronization:/:/sbin/nologin
systemd-coredump:x:996:996:systemd Core Dumper:/:/sbin/nologin

Also if I manually try to add the user it does not work, because the user already exists:

# useradd etcd
useradd: user 'etcd' already exists

The user is already there in:

# cat /usr/share/baselayout/passwd | grep etcd
etcd:x:232:232::/dev/null:/sbin/nologin

But docker does not identify this, hence the workaround.

I had a chat with Thilo and this is what he had to add:

Baselayout provides /etc/passwd for the initrd, and includes the etcd user. The /etc/passwd is the runtime configuration, which is generated from the systemd build recipe.

For a quick patch, please try and work around this by issuing

echo 'etcd:x:232:232::/dev/null:/sbin/nologin' | sudo tee -a /etc/passwd

In the long term we should really only have one authoritative source for passwd.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to investigate it later, I think it is cleaner to create a variable in the env-file with the UID and use that variable here, instead of running a shell to run id -u.

That's a lot of indirection then to have separate unit to write env file etc. IMO the current way is simpler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, if we need a new unit for this (I thought we had one to create those things already) then this way is not so bad :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so can we resolve this one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @rata suggested to either investigate deeper why we can't properly use etcd user on Flatcar OR create an issue for it to investigate later (though honestly, this seems like an issue we will never go back to, but I don't mind having it created).

And as creating new unit to create env file seems complicated, I have a feeling that currently approach with sh is okay'ish.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docker takes the user ID from the image itself. If the image specifies an etcd user, this has probably not the same user ID as the one Flatcar uses, therefore, the current way makes sense as long as you want to depend on the basic Flatcar etcd setup.

-v /etc/kubernetes:/etc/kubernetes:ro \
-v /etc/machine-id:/etc/machine-id \
-v /lib/modules:/lib/modules \
-v /run:/run:rw \
Copy link
Member

@rata rata Sep 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not rw just to a specific path only for the kubelet files?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It uses unspecified files from here and a lock file:

# lsof | grep '/run' | grep kubelet                                                                                                           
kubelet   10931                   root    5uW     REG               0,21         0      30352 /run/lock/kubelet.lock        
kubelet   10931                   root   10u     unix 0xffff9fd3b65b6400       0t0     311644 /var/run/661904757 type=STREAM
kubelet   10931                   root   14u     unix 0xffff9fd3b65b4c00       0t0     311194 /var/run/661904757 type=STREAM
kubelet   10931                   root   20u     unix 0xffff9fd3b65b5400       0t0     311195 /var/run/661904757 type=STREAM
kubelet   10931                   root   31u     unix 0xffff9fd2b4d93800       0t0     311748 /var/run/661904757 type=STREAM
kubelet   10931 10948             root    5uW     REG               0,21         0      30352 /run/lock/kubelet.lock        
kubelet   10931 10948             root   10u     unix 0xffff9fd3b65b6400       0t0     311644 /var/run/661904757 type=STREAM
kubelet   10931 10948             root   14u     unix 0xffff9fd3b65b4c00       0t0     311194 /var/run/661904757 type=STREAM
kubelet   10931 10948             root   20u     unix 0xffff9fd3b65b5400       0t0     311195 /var/run/661904757 type=STREAM
kubelet   10931 10948             root   31u     unix 0xffff9fd2b4d93800       0t0     311748 /var/run/661904757 type=STREAM
...

Copy link
Member

@rata rata Sep 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and the path can't be configured to use another one? If we can lock it down more, IMHO seems better. But we were running like this before with rkt, so no biggie :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, /run/lock must be shared via host path between host kubelet and self-hosted one. But /var/run is separate and only in the container, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Between /var and /var/run, one is symlink of other

Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits, which might not be relevant for now. Overall the PR looks good to me 👍

I'd also like to have opened conversations resolved before we merge it.

@surajssd surajssd force-pushed the surajssd/remove-rkt branch 2 times, most recently from fc76aa7 to a02cf73 Compare September 23, 2020 09:10
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits regarding newly added comments.

Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a reference, that this PR also addresses #917.

invidian
invidian previously approved these changes Sep 24, 2020
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overall LGTM (some small nit-picking comments here&there), but there is one unknown that worries me a little bit: https://github.com/kinvolk/lokomotive/pull/946/files#r494871543

TimeoutStartSec=0
LimitNOFILE=40000
EnvironmentFile=/etc/kubernetes/etcd.env
ExecStart=sh -c "docker run --network=host \
Copy link
Member

@rata rata Sep 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this might refer to the same problem the link from systemd-docker alban posted mention, but don't we want to exec docker ... instead of just running it? I'm not sure if systemd will handle correctly a child for this process (without exec, the shell forks and executes).

I ignore about systemd to know if exec or not will be better. But seems something that can be relevant. Did you chose this for some reason?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In current implementation now, no matter who kills kubelet systemd will always restart the process again.

invidian
invidian previously approved these changes Oct 7, 2020
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one Q, otherwise LGTM.

invidian
invidian previously approved these changes Oct 8, 2020
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This commit removes rkt and uses docker to start services.

etcd(for controllers only):

  - Move the env var to a separate env var file.
  - Add etcd service file.
  - Change the name of service from etcd-member to etcd.

Bootkube(for controllers only):

  Use `docker run` instead of rkt.

Kubelet(for controllers and workers):

  Remove some of the folder creation in `ExecStartPre`, these are
  automatically created by docker, when mounted using `-v` flag.

delete-node(for workers):

  Use `docker run` instead of rkt.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
This commit removes rkt and uses docker to start services.

etcd(for controllers only):

  - Move the env var to a separate env var file.
  - Add etcd service file.
  - Change the name of service from etcd-member to etcd.

Bootkube(for controllers only):

  Use `docker run` instead of rkt.

Kubelet(for controllers and workers):

  Remove some of the folder creation in `ExecStartPre`, these are
  automatically created by docker, when mounted using `-v` flag.

delete-node(for workers):

  Use `docker run` instead of rkt.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
Fix the env var file with change in removal of rkt.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
This commit removes rkt and uses docker to start services.

etcd(for controllers only):

  - Move the env var to a separate env var file.
  - Add etcd service file.
  - Change the name of service from etcd-member to etcd.

Bootkube(for controllers only):

  Use `docker run` instead of rkt.

Kubelet(for controllers and workers):

  Remove some of the folder creation in `ExecStartPre`, these are
  automatically created by docker, when mounted using `-v` flag.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
@surajssd
Copy link
Member Author

surajssd commented Oct 8, 2020

Thanks everyone for your reviews 🎉

@surajssd surajssd merged commit 0cb8e9b into master Oct 8, 2020
@surajssd surajssd deleted the surajssd/remove-rkt branch October 8, 2020 10:21
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

delete-node.service cannot access kubeconfig when using tls bootstrap Remove all usages of rkt
5 participants