aws,packet,baremetal: Remove rkt #946

surajssd · 2020-09-11T08:08:07Z

This commit removes rkt and uses docker to start services.

etcd (for controllers only):

- Move the env vars to a separate env var file.
- Add an etcd service file.
- Rename the service from etcd-member to etcd.

Bootkube (for controllers only):

- Use `docker run` instead of rkt.

Kubelet (for controllers and workers):

- Remove some of the directory creation in `ExecStartPre`; docker creates these directories automatically when they are mounted with the `-v` flag.

delete-node (for workers):

- Use `docker run` instead of rkt.

Fixes #720
Fixes #917

Release Notes

- The etcd unit changes from etcd-member.service to etcd.service. The old mechanism of running etcd (using etcd-wrapper) required the Flatcar-shipped etcd-member.service.
- New etcd env vars file on controller hosts: /etc/kubernetes/etcd.env.
- All dependencies on rkt are removed.

assets/terraform-modules/bare-metal/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

invidian

#449 would be nice 😢

Overall LGTM, left some questions.

assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

assets/terraform-modules/packet/flatcar-linux/kubernetes/workers/cl/worker.yaml.tmpl

invidian

OK

rata · 2020-09-16T10:06:10Z

assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

+        ExecStart=sh -c "docker run --network=host \
+          -u $(id -u \"$${ETCD_USER}\"):$(id -u \"$${ETCD_USER}\") \


Can't we remove the sh -c?

I guess it is used for the id -u part. But docker takes a string for -u param. Do we really need to run that? Even if we do, can't we add a variable for that in the env file?

Yeah, I think it should be possible to remove it. Good point @rata

tl;dr: string user names don't work on Flatcar.

On Flatcar when I do this I get the user id:

# id -u etcd
232

But there is no entry for that user in /etc/passwd:

# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
core:x:500:500:Flatcar Admin:/home/core:/bin/bash
systemd-timesync:x:997:997:systemd Time Synchronization:/:/sbin/nologin
systemd-coredump:x:996:996:systemd Core Dumper:/:/sbin/nologin

So docker fails:

# docker run -u etcd fedora bash
/run/torcx/bin/docker: Error response from daemon: linux spec user: unable to find user etcd: no matching entries in passwd file.
ERRO[0000] error waiting for container: context canceled

There is a workaround you can use which is echo 'etcd:x:232:232::/dev/null:/sbin/nologin' | sudo tee -a /etc/passwd but this does not work as well.

So there is another passwd file at play here: https://github.com/flatcar-linux/baselayout/blob/flatcar-master/baselayout/passwd, this is where we get the user id of etcd from.

> There is a workaround you can use which is echo 'etcd:x:232:232::/dev/null:/sbin/nologin' | sudo tee -a /etc/passwd but this does not work as well.

Can't we create the user using ignition too?

I added the following line, but it had no effect:

diff --git a/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl b/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
index fd50be82a..e911708ee 100644
--- a/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
+++ b/assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl
@@ -252,3 +252,4 @@ passwd:
   users:
     - name: core
       ssh_authorized_keys: ${ssh_keys}
+    - name: etcd

On the controller host:

# cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
core:x:500:500:Flatcar Admin:/home/core:/bin/bash
systemd-timesync:x:997:997:systemd Time Synchronization:/:/sbin/nologin
systemd-coredump:x:996:996:systemd Core Dumper:/:/sbin/nologin

Also if I manually try to add the user it does not work, because the user already exists:

# useradd etcd
useradd: user 'etcd' already exists

The user is already there in:

# cat /usr/share/baselayout/passwd | grep etcd
etcd:x:232:232::/dev/null:/sbin/nologin

But docker does not recognize this user, hence the workaround.
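The mismatch shown above (the user resolves through `id`, yet has no line in /etc/passwd) can be checked with a plain file lookup. This is only a sketch: `has_passwd_entry` is a hypothetical helper that is not part of this PR, and it only inspects the file it is given.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): succeed if USER has an entry in the
# given passwd-format FILE. This is a plain file lookup; no NSS is involved.
has_passwd_entry() {
    user=$1
    file=$2
    # Compare the user name against the first colon-separated field exactly.
    awk -F: -v u="$user" '$1 == u { found = 1 } END { exit !found }' "$file"
}

# Usage on a Flatcar host (both paths are taken from the thread above):
#   has_passwd_entry etcd /etc/passwd                   # no entry
#   has_passwd_entry etcd /usr/share/baselayout/passwd  # entry present
```

An exact match on the first field avoids false positives that a bare `grep etcd` could produce for names that merely contain the string.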

I had a chat with Thilo and this is what he had to add:

Baselayout provides /etc/passwd for the initrd, and includes the etcd user. The /etc/passwd is the runtime configuration, which is generated from the systemd build recipe.

baselayout (w/ etcd user: https://github.com/flatcar-linux/baselayout/blob/flatcar-master/baselayout/passwd).

systemd build recipe in coreos-overlay: https://github.com/flatcar-linux/coreos-overlay/blob/main/sys-apps/systemd/systemd-9999.ebuild#L534-L542

For a quick patch, please try and work around this by issuing

echo 'etcd:x:232:232::/dev/null:/sbin/nologin' | sudo tee -a /etc/passwd

In the long term we should really only have one authoritative source for passwd.

If we want to investigate it later, I think it is cleaner to create a variable in the env-file with the UID and use that variable here, instead of running a shell to run id -u.

That's a lot of indirection, though: we would need a separate unit to write the env file, etc. IMO the current way is simpler.

Oh, right, if we need a new unit for this (I thought we had one to create those things already) then this way is not so bad :)
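For reference, the env-file alternative discussed here could look roughly like the sketch below. It is not what this PR implements: `write_uid_env` and the `ETCD_UID` variable name are assumptions made for illustration, and something (for example a separate unit) would still have to run it before etcd.service starts.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): append NAME=<numeric uid of USER> to
# an env file, so that a unit could reference the ID without running a shell.
write_uid_env() {
    name=$1
    user=$2
    file=$3
    # Resolve the UID on the host; fail without writing if the user is unknown.
    uid=$(id -u "$user") || return 1
    printf '%s=%s\n' "$name" "$uid" >> "$file"
}

# Usage (ETCD_UID is an assumed variable name):
#   write_uid_env ETCD_UID etcd /etc/kubernetes/etcd.env
```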

so can we resolve this one?

I think @rata suggested to either investigate deeper why we can't properly use etcd user on Flatcar OR create an issue for it to investigate later (though honestly, this seems like an issue we will never go back to, but I don't mind having it created).

And as creating a new unit to write the env file seems complicated, I have a feeling that the current approach with sh is okay-ish.

Docker takes the user ID from the image itself. If the image defines an etcd user, it probably does not have the same user ID as the one Flatcar uses. Therefore, the current way makes sense as long as you want to depend on the basic Flatcar etcd setup.
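The approach the thread settles on, resolving the ID on the host and handing docker a number, can be sketched as a small function. `numeric_user` is a hypothetical name used only for illustration. Unlike the `ExecStart` line under review, which calls `id -u` for both halves, this sketch uses `id -g` for the group; for the Flatcar etcd entry quoted earlier (232:232) both give the same value.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): print "UID:GID" for a host user,
# resolved on the host so docker never looks the name up inside the image.
numeric_user() {
    uid=$(id -u "$1") || return 1
    gid=$(id -g "$1") || return 1
    printf '%s:%s\n' "$uid" "$gid"
}

# Usage, mirroring the reviewed ExecStart line (remaining arguments elided):
#   docker run -u "$(numeric_user etcd)" ...
```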

assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

rata · 2020-09-16T10:09:21Z

assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

+          -v /etc/kubernetes:/etc/kubernetes:ro \
+          -v /etc/machine-id:/etc/machine-id \
+          -v /lib/modules:/lib/modules \
+          -v /run:/run:rw \


Why not mount only a specific path rw, just for the kubelet files?

It uses unspecified files from here and a lock file:

# lsof | grep '/run' | grep kubelet
kubelet 10931 root 5uW REG 0,21 0 30352 /run/lock/kubelet.lock
kubelet 10931 root 10u unix 0xffff9fd3b65b6400 0t0 311644 /var/run/661904757 type=STREAM
kubelet 10931 root 14u unix 0xffff9fd3b65b4c00 0t0 311194 /var/run/661904757 type=STREAM
kubelet 10931 root 20u unix 0xffff9fd3b65b5400 0t0 311195 /var/run/661904757 type=STREAM
kubelet 10931 root 31u unix 0xffff9fd2b4d93800 0t0 311748 /var/run/661904757 type=STREAM
kubelet 10931 10948 root 5uW REG 0,21 0 30352 /run/lock/kubelet.lock
kubelet 10931 10948 root 10u unix 0xffff9fd3b65b6400 0t0 311644 /var/run/661904757 type=STREAM
kubelet 10931 10948 root 14u unix 0xffff9fd3b65b4c00 0t0 311194 /var/run/661904757 type=STREAM
kubelet 10931 10948 root 20u unix 0xffff9fd3b65b5400 0t0 311195 /var/run/661904757 type=STREAM
kubelet 10931 10948 root 31u unix 0xffff9fd2b4d93800 0t0 311748 /var/run/661904757 type=STREAM
...

Oh, and the path can't be configured to use another one? If we can lock it down more, IMHO seems better. But we were running like this before with rkt, so no biggie :)

Also, /run/lock must be shared via host path between host kubelet and self-hosted one. But /var/run is separate and only in the container, right?

Between /run and /var/run, one is a symlink to the other.
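Whether two paths such as /run and /var/run end up at the same directory on a given host can be verified directly. A minimal sketch, assuming a POSIX shell; `same_dir` is a hypothetical helper, not something from this PR:

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): succeed if both paths resolve to the
# same physical directory once symlinks are followed.
same_dir() {
    # cd -P and pwd -P resolve symlinks; the command substitution runs in a
    # subshell, so the caller's working directory is left untouched.
    a=$(cd -P -- "$1" 2>/dev/null && pwd -P) || return 1
    b=$(cd -P -- "$2" 2>/dev/null && pwd -P) || return 1
    [ "$a" = "$b" ]
}

# Usage:
#   same_dir /run /var/run && echo "same directory"
```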

assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

invidian

Some nits, which might not be relevant for now. Overall the PR looks good to me 👍

I'd also like to have the open conversations resolved before we merge it.

assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

invidian

Some nits regarding newly added comments.

assets/charts/control-plane/kubelet/templates/kubelet-ds.yaml

invidian

Please add a reference noting that this PR also addresses #917.

invidian

LGTM

rata

This overall LGTM (some small nit-picking comments here&there), but there is one unknown that worries me a little bit: https://github.com/kinvolk/lokomotive/pull/946/files#r494871543

rata · 2020-09-25T15:24:43Z

assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

+        TimeoutStartSec=0
+        LimitNOFILE=40000
+        EnvironmentFile=/etc/kubernetes/etcd.env
+        ExecStart=sh -c "docker run --network=host \


I guess this might be the same problem that the systemd-docker link alban posted mentions, but don't we want to exec docker ... instead of just running it? I'm not sure systemd will correctly handle a child of this process (without exec, the shell forks and then executes).

I don't know enough about systemd to say whether using exec would be better, but it seems like something that could be relevant. Did you choose this for some reason?

In the current implementation, no matter who kills the kubelet, systemd always restarts the process.
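The fork-versus-exec difference raised in this thread can be observed outside of systemd. The sketch below only demonstrates shell behaviour and makes no claim about how the unit in this PR behaves under systemd; both function names are hypothetical. With `exec`, the inner command takes over the wrapper shell's PID; without it, the inner command runs as a child with a new PID.

```shell
#!/bin/sh
# Each function prints two lines: the PID of the wrapper shell, then the PID
# of the command that the wrapper starts.

# Without exec. The trailing ":" keeps the inner command from being the last
# one, so the wrapper has to fork instead of possibly replacing itself.
pids_without_exec() {
    sh -c 'echo $$; sh -c "echo \$\$"; :'
}

# With exec. The wrapper shell is replaced, so both lines show the same PID.
pids_with_exec() {
    sh -c 'echo $$; exec sh -c "echo \$\$"'
}

# Usage:
#   pids_without_exec   # two different PIDs
#   pids_with_exec      # the same PID twice
```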

assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

invidian

Just one Q, otherwise LGTM.

assets/terraform-modules/aws/flatcar-linux/kubernetes/cl/controller.yaml.tmpl

invidian

LGTM

surajssd · 2020-10-08T10:21:10Z

Thanks everyone for your reviews 🎉

surajssd marked this pull request as draft September 11, 2020 08:08

surajssd commented Sep 11, 2020

surajssd force-pushed the surajssd/remove-rkt branch 2 times, most recently from 180f662 to 02acf75 Compare September 11, 2020 09:37

surajssd marked this pull request as ready for review September 11, 2020 10:39

surajssd requested review from invidian and rata September 11, 2020 10:39

invidian reviewed Sep 14, 2020

surajssd force-pushed the surajssd/remove-rkt branch from 02acf75 to a360007 Compare September 14, 2020 07:44

invidian previously approved these changes Sep 14, 2020

rata reviewed Sep 16, 2020

rata reviewed Sep 16, 2020

rata reviewed Sep 16, 2020

rata reviewed Sep 16, 2020

surajssd dismissed invidian’s stale review via a10f0d7 September 17, 2020 07:46

surajssd force-pushed the surajssd/remove-rkt branch 2 times, most recently from a10f0d7 to f760a3c Compare September 22, 2020 14:22

invidian reviewed Sep 22, 2020

surajssd force-pushed the surajssd/remove-rkt branch 2 times, most recently from fc76aa7 to a02cf73 Compare September 23, 2020 09:10

invidian suggested changes Sep 23, 2020

surajssd force-pushed the surajssd/remove-rkt branch from a02cf73 to 283e863 Compare September 23, 2020 10:34

invidian mentioned this pull request Sep 23, 2020

delete-node.service cannot access kubeconfig when using tls bootstrap #917

Closed

invidian suggested changes Sep 24, 2020

surajssd force-pushed the surajssd/remove-rkt branch from 283e863 to a9224ed Compare September 24, 2020 06:32

surajssd requested review from invidian and rata September 24, 2020 06:32

invidian previously approved these changes Sep 24, 2020

invidian mentioned this pull request Sep 24, 2020

Add encryption to in-cluster pod traffic #911

Merged

rata reviewed Sep 25, 2020

surajssd dismissed invidian’s stale review via 9427c7b October 1, 2020 14:20

surajssd force-pushed the surajssd/remove-rkt branch from a9224ed to 9427c7b Compare October 1, 2020 14:20

invidian reviewed Oct 5, 2020

surajssd force-pushed the surajssd/remove-rkt branch from 9427c7b to ee89965 Compare October 7, 2020 07:37

surajssd requested review from invidian and iaguis October 7, 2020 07:39

invidian previously approved these changes Oct 7, 2020

surajssd dismissed invidian’s stale review via a253eb3 October 7, 2020 10:15

surajssd force-pushed the surajssd/remove-rkt branch 3 times, most recently from 7f0f2ff to f6cf4ca Compare October 8, 2020 06:44

surajssd requested a review from invidian October 8, 2020 06:44

invidian previously approved these changes Oct 8, 2020

surajssd added 6 commits October 8, 2020 14:55

docs: Upgrade etcd

2b9e157

Fix the env var file with change in removal of rkt. Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>

kubelet chart: Sync with the volume mounts with bootstrap kubelet

ab6f136

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>

update render-manifest

b45468f

surajssd dismissed invidian’s stale review via b45468f October 8, 2020 09:25

surajssd force-pushed the surajssd/remove-rkt branch from f6cf4ca to b45468f Compare October 8, 2020 09:25

invidian approved these changes Oct 8, 2020

surajssd merged commit 0cb8e9b into master Oct 8, 2020

surajssd deleted the surajssd/remove-rkt branch October 8, 2020 10:21

surajssd commented Sep 11, 2020 •

edited by invidian

invidian left a comment

invidian left a comment

rata Sep 16, 2020

invidian Sep 16, 2020

surajssd Sep 16, 2020

invidian Sep 16, 2020

surajssd Sep 17, 2020

invidian Sep 25, 2020

rata Sep 25, 2020

surajssd Sep 28, 2020

invidian Sep 28, 2020

pothos Sep 28, 2020

rata Sep 16, 2020 •

edited

surajssd Sep 17, 2020

rata Sep 25, 2020 •

edited

invidian Sep 28, 2020

surajssd Oct 7, 2020

invidian left a comment

invidian left a comment

invidian left a comment

invidian left a comment

rata left a comment

rata Sep 25, 2020 •

edited

surajssd Oct 7, 2020

invidian left a comment

invidian left a comment

surajssd commented Oct 8, 2020
