Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime-rs: add base qmp framework #9772

Merged
merged 3 commits into from
Jun 18, 2024

Conversation

pmores
Copy link
Contributor

@pmores pmores commented Jun 4, 2024

This PR creates a base framework for QMP functionality. The idea of the new Qmp object is similar to the one of QemuCmdLine - encapsulate implementation details of QMP handling while offering QemuInner an interface that matches its needs.

The qapi-rs library is used to handle the QMP low-level details. Its documentation isn't great but the library itself is close enough to QMP to be actually quite easy to use (so far at least ;-)).

We also add support for hotplugging vCPUs, basically as a test case for the newly added infrastructure. :-)

@katacontainersbot katacontainersbot added the size/large Task of significant size label Jun 4, 2024
Copy link
Member

@lifupan lifupan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pmores , LGTM!

Copy link
Contributor

@Apokleos Apokleos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Pavel @pmores

src/runtime-rs/crates/hypervisor/Cargo.toml Outdated Show resolved Hide resolved
@Apokleos
Copy link
Contributor

Apokleos commented Jun 5, 2024

Discussion: A notable difference compared to the kata runtime with golang is that re-executing qmpSetup() before each qmp command, such looks like called in setupVirtioMem
Why kata-runtime do repeated qmpSetup, if we don't do such action in qemu-rs, is that OK ?
@gkurz @pmores FYI !

@pmores
Copy link
Contributor Author

pmores commented Jun 5, 2024

Discussion: A notable difference compared to the kata runtime with golang is that re-executing qmpSetup() before each qmp command, such looks like called in setupVirtioMem Why kata-runtime do repeated qmpSetup, if we don't do such action in qemu-rs, is that OK ? @gkurz @pmores FYI !

Yes, I noticed that as well. My explanation is that in govmm, functions implementing QMP functionality are members of the qemu struct so they can be called anytime once qemu is constructed, possibly even before qemu is even launched. Thus govmm lacks a single code path that would guarantee proper operation sequencing (i.e. first initialisation then usage) therefore each QMP function has to ensure individually that QMP is up and ready to go. This is also hinted at by the fact that qmpSetup() is idempotent - it only does anything on its first invocation, subsequent invocations do nothing.

The structure of qemu-rs is different. Qmp is only even instantiated after qemu has been launched, after your last PR ensured that a QMP socket is properly set up on qemu launch. Qmp's constructor ensures that QMP connection is set up properly, otherwise it fails, Qmp is never even constructed, an error is returned to runtime-rs proper which aborts the launch. So by the time a Qmp member function that uses the QMP connection can even be called, it is guaranteed that QMP is good to go, therefore Qmp members don't need to check individually.

@Apokleos
Copy link
Contributor

Apokleos commented Jun 5, 2024

Discussion: A notable difference compared to the kata runtime with golang is that re-executing qmpSetup() before each qmp command, such looks like called in setupVirtioMem Why kata-runtime do repeated qmpSetup, if we don't do such action in qemu-rs, is that OK ? @gkurz @pmores FYI !

Yes, I noticed that as well. My explanation is that in govmm, functions implementing QMP functionality are members of the qemu struct so they can be called anytime once qemu is constructed, possibly even before qemu is even launched. Thus govmm lacks a single code path that would guarantee proper operation sequencing (i.e. first initialisation then usage) therefore each QMP function has to ensure individually that QMP is up and ready to go. This is also hinted at by the fact that qmpSetup() is idempotent - it only does anything on its first invocation, subsequent invocations do nothing.

The structure of qemu-rs is different. Qmp is only even instantiated after qemu has been launched, after your last PR ensured that a QMP socket is properly set up on qemu launch. Qmp's constructor ensures that QMP connection is set up properly, otherwise it fails, Qmp is never even constructed, an error is returned to runtime-rs proper which aborts the launch. So by the time a Qmp member function that uses the QMP connection can even be called, it is guaranteed that QMP is good to go, therefore Qmp members don't need to check individually.

Yeah, Cool. Nice and clear explanation for me. I re-check the qmpSetup and yes it's as you said.

@Apokleos
Copy link
Contributor

Apokleos commented Jun 6, 2024

/test

@@ -149,6 +152,14 @@ impl QemuInner {
tokio::spawn(log_qemu_stderr(qemu_process.stderr.take().unwrap()));
}

match Qmp::new(QMP_SOCKET_FILE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can legitimately fail because you have no guarantee that QEMU has called listen() yet.

Two options to address that :

  1. pre-connect as the go runtime does (see https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/qemu.go#L964)
  2. loop for some time until connection succeeds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think just 2 is an option then since we don't do the connecting ourselves (qapi-rs does).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT qapi-rs does the QMP handshake but the connection is done by qemu-rs at e26028c#diff-d564620b2876afa7b1db8ade06afda6d28c187cb887d27880af3fd46866d1362R16. If this gets moved out to the caller, it is possible to implement 1.

Copy link
Contributor Author

@pmores pmores Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I might be misunderstanding then what exactly it takes to implement the pre-connect option. Clearly that would be the preferable one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it turns out this PR already does the pre-connect like the go runtime linked above. The go code basically 1) Listen()s on the qmp socket path, 2) duplicates that socket, 3) Dial()s the qmp socket path, and 4) returns the connection from 3, stores the duplicate from 2 and lets the original from 1 die.

To see that we're doing all that already, please note that UnixListener::bind() (executed in QmpSocket::new()) does include a listen() call so we have an equivalent of 1. Then 2 is not essential for the desired functionality and is apparently just a side effect of golang's UnixListener::File() implementation, we do 3 in this PR in Qmp::new() and we do our equivalents of 4, too.

So while I'll readily admit that this just by pure luck, I submit that this PR complies already. :-)

Copy link
Member

@BbolroC BbolroC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than @gkurz's comment, I am fine with it. I will defer to him for approval. Thanks @pmores !

@pmores
Copy link
Contributor Author

pmores commented Jun 11, 2024

Pushed a new version that sets a read timeout on the QMP connection. This is to protect runtime-rs from blocking indefinitely on QMP in case qemu doesn't launch, and is done by runtime-go as well.

impl Qmp {
pub fn new(qmp_sock_path: &str) -> Result<Self> {
let stream = UnixStream::connect(qmp_sock_path)?;
stream.set_read_timeout(Some(Duration::from_millis(250)))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is timeout hardcode 250 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! The value is admittedly a matter of judgment but there's some thought behind it. Setting the timeout is to protect runtime-rs from blocking forever trying to set up QMP connection if qemu fails to launch.

The upper bound is the value of container runtime's task creation timeout (2 seconds for containerd by default) after which the container runtime aborts container launch and terminates the shim anyway.

The lower bound is the maximum delay that can reasonably be expected when things run fine. I don't know what that could be but since this is a local unix-domain connection I suspect it should be fairly low. However, it could be noticeable on a system under heavy load, and we don't want to interfere with a slightly delayed but otherwise successful hypervisor launch.

I thought a value between between 100 ms and 250 ms could be good. I briefly mentioned that to @gkurz and we agreed to put 250 ms.

I'm not insisting that it must be 250 ms, if you have a reason to think that another value would be better I'm open to changing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, It would be better add a brief comments about it to explain why setting 250 as the value of timeout for developers to check when they're confused.
Anyway, I think the PR reaches maturity. I will approve it.
Thx Pavel @pmores

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment with explanation as per your request - please take a look. Thanks @Apokleos !

The constructor handles QMP connection initialisation, too, so there can
be non-functional Qmp instance.

Signed-off-by: Pavel Mores <pmores@redhat.com>
The QMP_SOCKET_FILE constant in cmdline_generator.rs is made public to make
it accessible from QemuInner.  This is fine for now however if the constant
needs to be accessed from additional places in the future we could consider
moving it to somewhere more visible.

The Debug impl for Qmp is empty since first, we don't actually want it,
it's only forced by Hypervisor trait bounds, and second, it doesn't have
anything to display anyway.  If Qmp gets any members in the future that
can be meaningfully displayed they should be handled by Qmp's Debug::fmt().

Signed-off-by: Pavel Mores <pmores@redhat.com>
We take advantage of the Inner pattern to enable QemuInner::resize_vcpu()
take `&mut self` which we need to call non-const functions on Qmp.

This runs on Intel architecture but will need to be verified and ported
(if necessary) to other architectures in the future.

Signed-off-by: Pavel Mores <pmores@redhat.com>
Copy link
Contributor

@Apokleos Apokleos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx Pavel @pmores LGTM

@Apokleos Apokleos merged commit 388cd7d into kata-containers:main Jun 18, 2024
261 of 300 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test size/large Task of significant size
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants