Run tests and enable CI for qemu-runtime-rs implementation #9804

Open
4 of 5 tasks
wainersm opened this issue Jun 10, 2024 · 5 comments · Fixed by #9807 or #9833
Labels: area/ci (Issues affecting the continuous integration), enhancement (Improvement to an existing feature), needs-review (Needs to be assessed by the team).

Comments

wainersm commented Jun 10, 2024

Which feature do you think can be improved?

We should run tests on runtime-rs' qemu implementation (a.k.a. qemu-runtime-rs) to ensure:

  • it's on par with the dragonball and cloud hypervisor crates
  • it's on par with the qemu implementation in the golang runtime
  • when it cannot be on par, at least we know the limitations better

First we'd like to run one-off tests to assess the current status. Then we go through a fix-and-enable-on-CI cycle until qemu-runtime-rs is on par with the other implementations and we understand the current limitations (if any).

How can it be improved?

DRAFT!

wainersm added the enhancement, needs-review and area/ci labels Jun 10, 2024
wainersm self-assigned this Jun 10, 2024
wainersm added a commit to wainersm/kata-containers that referenced this issue Jun 10, 2024
Allow kata-deploy to install and configure the qemu-runtime-rs runtimeClass
which ties to qemu hypervisor implementation in rust for the runtime-rs.

Fixes: kata-containers#9804
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
@katacontainersbot katacontainersbot moved this from To do to In progress in Issue backlog Jun 10, 2024
@wainersm wainersm reopened this Jun 12, 2024
Issue backlog automation moved this from In progress to To do Jun 12, 2024
datadog-compute-robot pushed a commit to DataDog/kata-containers that referenced this issue Jun 17, 2024
Allow kata-deploy to install and configure the qemu-runtime-rs runtimeClass
which ties to qemu hypervisor implementation in rust for the runtime-rs.

Fixes: kata-containers#9804
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
@wainersm wainersm reopened this Jun 19, 2024
ldoktor added a commit to ldoktor/kata-containers that referenced this issue Jun 20, 2024
we do encourage people to set the KATA_RUNTIME, but it is only used by
the webhook. Let's define it in the main `test.sh` and use it in the
smoke test to ensure the user-defined runtime is smoke-tested rather
than hard-coded kata-qemu one.

Related to: kata-containers#9804

Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
wainersm commented

Hi @pmores @BbolroC !

* [ ]  Run [k8s integration tests](https://github.com/kata-containers/kata-containers/tree/main/tests/integration/kubernetes) for  `qemu-runtime-rs` on [kubernetes tests, using CRI-O](https://github.com/kata-containers/kata-containers/blob/main/.github/workflows/run-k8s-tests-with-crio-on-garm.yaml) workflow

At first I was thinking of running ^^^. However, this week I talked with @ldoktor about actually running the openshift-tests on openshift-ci just like we do for qemu (with the go runtime), but with qemu-runtime-rs (and the rust runtime) instead. Because OpenShift uses CRI-O, we will end up exercising CRI-O anyway, so perhaps we can drop that test scenario from Kata CI?

ldoktor commented Jun 21, 2024

Yeah, I gave that a preliminary try. Deployment seems fine, but the smoke test `KATA_RUNTIME=kata-qemu-runtime-rs ./run_smoke_test.sh` fails to create a simple pod using qemu-runtime-rs with:

Events:
  Type     Reason                  Age                 From               Message
  ----     ------                  ----                ----               -------
  Normal   Scheduled               11m                 default-scheduler  Successfully assigned default/http-server to ci-ln-0dijlkt-1d09d-5xf62-worker-eastus2-t4zsf
  Normal   AddedInterface          58s (x37 over 11m)  multus             Add eth0 [10.129.2.16/23] from ovn-kubernetes
  Warning  FailedCreatePodSandBox  54s (x37 over 10m)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = CreateContainer failed: Others("failed to handle message create container\n\nCaused by:\n    0: create\n    1: agent create container\n    2: rpc status: Status { code: INTERNAL, message: \"EINVAL: Invalid argument\", details: [], special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }"): unknown

When I ran the smoke test with kata-qemu on the same deployment via `KATA_RUNTIME=kata-qemu ./run_smoke_test.sh`, it passed... I'll try with 4.15 or 4.14.
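The way `KATA_RUNTIME` selects the runtime class under test can be sketched roughly as follows. This is only an illustration under assumptions, not the contents of `test.sh` or `run_smoke_test.sh`: the function name `render_smoke_pod`, the `SMOKE_IMAGE` variable and the placeholder image are hypothetical, and only the `kata-qemu` default and the `http-server` pod name come from this thread.

```shell
# Hypothetical sketch: fall back to kata-qemu unless the caller set KATA_RUNTIME.
KATA_RUNTIME="${KATA_RUNTIME:-kata-qemu}"

# Print a minimal pod manifest that uses the selected runtime class.
# $1 optionally overrides KATA_RUNTIME; the image is a placeholder (assumption).
render_smoke_pod() {
    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: http-server
spec:
  runtimeClassName: ${1:-$KATA_RUNTIME}
  containers:
  - name: http-server
    image: ${SMOKE_IMAGE:-registry.example/http-server:latest}
EOF
}
```

Something like `render_smoke_pod kata-qemu-runtime-rs | kubectl apply -f -` would then request the rust runtime, while a call without arguments uses whatever `KATA_RUNTIME` resolved to.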

ldoktor commented Jun 21, 2024

The same failure occurs with 4.14. Unless someone knows why off the top of their head, I'll try to talk to some developers next week about it...
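When repeating the run on another version, a small filter can make it quicker to see whether the same sandbox failure shows up again. The sketch below assumes the Events table format shown earlier in this thread (Type in the first column, Reason in the second); the function name `failed_sandbox_events` is hypothetical.

```shell
# Hypothetical helper: read an Events table on stdin and keep only the
# Warning rows whose Reason column is FailedCreatePodSandBox.
failed_sandbox_events() {
    awk '$1 == "Warning" && $2 == "FailedCreatePodSandBox"'
}
```

For example, `kubectl describe pod http-server | failed_sandbox_events` would print only the failing rows, if any.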

wainersm commented

Hi @ldoktor !

Hmmm... it seems that kata-deploy has installed all the runtimeClasses, and this won't work. On Kata CI, we deploy one runtimeClass in each job, and kata-deploy has a special handler for when the runtimeClass is either dragonball, cloud-hypervisor or qemu-runtime-rs.

That special handler basically symlinks containerd-shim-kata-v2 to /opt/kata/runtime-rs/bin/containerd-shim-kata-v2 (the rust runtime).

See https://github.com/kata-containers/kata-containers/blob/main/tools/packaging/kata-deploy/scripts/kata-deploy.sh#L128 and https://github.com/kata-containers/kata-containers/blob/main/tools/packaging/kata-deploy/scripts/kata-deploy.sh#L345
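The special handling described above can be pictured with the sketch below. It is not the code from kata-deploy.sh: the function names are hypothetical, the link location is left as a parameter, and the non-runtime-rs path `bin/containerd-shim-kata-v2` under the install prefix is an assumption. Only the three shim names and the `/opt/kata/runtime-rs/bin/containerd-shim-kata-v2` target come from this thread.

```shell
# Hypothetical sketch of the handler; the real kata-deploy.sh may differ.

# Succeed when the shim is served by the rust runtime (runtime-rs).
is_runtime_rs_shim() {
    case "$1" in
        dragonball|cloud-hypervisor|qemu-runtime-rs) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the shim binary a runtimeClass should use.
# $1: shim name, $2: install prefix (defaults to /opt/kata).
shim_binary_for() {
    if is_runtime_rs_shim "$1"; then
        echo "${2:-/opt/kata}/runtime-rs/bin/containerd-shim-kata-v2"
    else
        # Assumed location of the go runtime shim.
        echo "${2:-/opt/kata}/bin/containerd-shim-kata-v2"
    fi
}

# Point a symlink at the binary chosen for the shim.
# $1: shim name, $2: path of the link to create, $3: install prefix (optional).
link_shim() {
    ln -sf "$(shim_binary_for "$1" "${3:-/opt/kata}")" "$2"
}
```

Because a single link can point at only one of the two binaries, this kind of handler fits a deployment that installs one runtimeClass at a time, which matches the one-runtimeClass-per-job setup on Kata CI.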

You will need to change the https://github.com/kata-containers/kata-containers/blob/main/tools/packaging/kata-deploy/kata-deploy/base/kata-deploy.yaml to:

For an example of how Kata CI changes the file, see: https://github.com/kata-containers/kata-containers/blob/main/tests/functional/kata-deploy/kata-deploy.bats#L31

Projects
Issue backlog
  
To do
2 participants