-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime-rs: add base qmp framework #9772
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pmores , LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Pavel @pmores
Discussion: A notable difference compared to the kata runtime with golang is that re-executing |
02e6d41
to
f3362ae
Compare
Yes, I noticed that as well. My explanation is that in govmm, functions implementing QMP functionality are members of the The structure of qemu-rs is different. |
Yeah, Cool. Nice and clear explanation for me. I re-check the |
/test |
@@ -149,6 +152,14 @@ impl QemuInner { | |||
tokio::spawn(log_qemu_stderr(qemu_process.stderr.take().unwrap())); | |||
} | |||
|
|||
match Qmp::new(QMP_SOCKET_FILE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can legitimately fail because you have no guarantee that QEMU has called listen()
yet.
Two options to address that :
- pre-connect as the go runtime does (see https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/qemu.go#L964)
- loop for some time until connection succeeds
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think just 2 is an option then since we don't do the connecting ourselves (qapi-rs does).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT qapi-rs does the QMP handshake but the connection is done by qemu-rs at e26028c#diff-d564620b2876afa7b1db8ade06afda6d28c187cb887d27880af3fd46866d1362R16. If this gets moved out to the caller, it is possible to implement 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I might be misunderstanding then what exactly it takes to implement the pre-connect option. Clearly that would be the preferable one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it turns out this PR already does the pre-connect like the go runtime linked above. The go code basically 1) Listen()
s on the qmp socket path, 2) duplicates that socket, 3) Dial()
s the qmp socket path, and 4) returns the connection from 3, stores the duplicate from 2 and lets the original from 1 die.
To see that we're doing all that already, please note that UnixListener::bind()
(executed in QmpSocket::new()
) does include a listen()
call so we have an equivalent of 1. Then 2 is not essential for the desired functionality and is apparently just a side effect of golang's UnixListener::File()
implementation, we do 3 in this PR in Qmp::new()
and we do our equivalents of 4, too.
So while I'll readily admit that this just by pure luck, I submit that this PR complies already. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f3362ae
to
5fa7567
Compare
Pushed a new version that sets a read timeout on the QMP connection. This is to protect runtime-rs from blocking indefinitely on QMP in case qemu doesn't launch, and is done by runtime-go as well. |
impl Qmp { | ||
pub fn new(qmp_sock_path: &str) -> Result<Self> { | ||
let stream = UnixStream::connect(qmp_sock_path)?; | ||
stream.set_read_timeout(Some(Duration::from_millis(250)))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is timeout hardcode 250 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! The value is admittedly a matter of judgment but there's some thought behind it. Setting the timeout is to protect runtime-rs from blocking forever trying to set up QMP connection if qemu fails to launch.
The upper bound is the value of container runtime's task creation timeout (2 seconds for containerd by default) after which the container runtime aborts container launch and terminates the shim anyway.
The lower bound is the maximum delay that can reasonably be expected when things run fine. I don't know what that could be but since this is a local unix-domain connection I suspect it should be fairly low. However, it could be noticeable on a system under heavy load, and we don't want to interfere with a slightly delayed but otherwise successful hypervisor launch.
I thought a value between between 100 ms and 250 ms could be good. I briefly mentioned that to @gkurz and we agreed to put 250 ms.
I'm not insisting that it must be 250 ms, if you have a reason to think that another value would be better I'm open to changing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, It would be better add a brief comments about it to explain why setting 250 as the value of timeout for developers to check when they're confused.
Anyway, I think the PR reaches maturity. I will approve it.
Thx Pavel @pmores
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment with explanation as per your request - please take a look. Thanks @Apokleos !
The constructor handles QMP connection initialisation, too, so there can be non-functional Qmp instance. Signed-off-by: Pavel Mores <pmores@redhat.com>
The QMP_SOCKET_FILE constant in cmdline_generator.rs is made public to make it accessible from QemuInner. This is fine for now however if the constant needs to be accessed from additional places in the future we could consider moving it to somewhere more visible. The Debug impl for Qmp is empty since first, we don't actually want it, it's only forced by Hypervisor trait bounds, and second, it doesn't have anything to display anyway. If Qmp gets any members in the future that can be meaningfully displayed they should be handled by Qmp's Debug::fmt(). Signed-off-by: Pavel Mores <pmores@redhat.com>
We take advantage of the Inner pattern to enable QemuInner::resize_vcpu() take `&mut self` which we need to call non-const functions on Qmp. This runs on Intel architecture but will need to be verified and ported (if necessary) to other architectures in the future. Signed-off-by: Pavel Mores <pmores@redhat.com>
5fa7567
to
380f8ad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx Pavel @pmores LGTM
This PR creates a base framework for QMP functionality. The idea of the new
Qmp
object is similar to the one ofQemuCmdLine
- encapsulate implementation details of QMP handling while offeringQemuInner
an interface that matches its needs.The
qapi-rs
library is used to handle the QMP low-level details. Its documentation isn't great but the library itself is close enough to QMP to be actually quite easy to use (so far at least ;-)).We also add support for hotplugging vCPUs, basically as a test case for the newly added infrastructure. :-)