Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime-rs: firecracker hypervisor backend #8070

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Pyrromanis
Copy link

Add a basic runtime-rs Hypervisor trait implementation for Firecracker (FC).

  • implement fc-api requests to setup and start an FC sandbox
  • account for jailer directory hacks
  • add separate config for runtime-rs (might need to change as there are only minor diffs with the go config)

Notes:

Fixes: #5268

@katacontainersbot katacontainersbot added the size/huge Largest and most complex task (probably needs breaking into small pieces) label Sep 26, 2023
@katacontainersbot
Copy link
Contributor

Can one of the admins verify this patch?

@ananos ananos added no-backport-needed area/firecracker Issues specific to the Firecracker hypervisor lang/rust Rust code runtime-rs size/large Task of significant size rust Pull requests that update Rust code labels Sep 26, 2023
@ananos
Copy link
Member

ananos commented Sep 26, 2023

/test

@ananos
Copy link
Member

ananos commented Sep 27, 2023

/test

@ananos
Copy link
Member

ananos commented Oct 4, 2023

/test

@ananos
Copy link
Member

ananos commented Oct 6, 2023

/test

@@ -570,9 +572,31 @@ impl Persist for VirtSandbox {
let sandbox_state = crate::sandbox_persist::SandboxState {
sandbox_type: VIRTCONTAINER.to_string(),
resource: Some(self.resource_manager.save().await?),
hypervisor: Some(self.hypervisor.save_state().await?),
//hypervisor: Some(self.hypervisor.save_state().await?),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line could probably be removed, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, will clean up once we get the correct implementation on this snippet.

hypervisor: match self.hypervisor.save_state().await?.hypervisor_type.as_str() {
// TODO support other hypervisors
HYPERVISOR_DRAGONBALL => Ok(Some(self.hypervisor.save_state().await?)),
HYPERVISOR_FIRECRACKER => Ok(Some(self.hypervisor.save_state().await?)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks odd to me, why is save_state() called again? If the purpose of the first call (in the match expression above) is only to get the hypervisor type, isn't there a more direct way to do that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly! maybe we should call save state earlier and keep it as a variable to pass it on there. Not sure, what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a closer look, the Hypervisor interface doesn't seem to make it possible to get hypervisor_type directly, interestingly enough. So short of adding a Hypervisor::hypervisor_type() function to the interface, I can't think of a better solution at the moment than what you suggest - call save_state() ahead of the match expression, then match on the state and return it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, thanks!

@@ -597,3 +601,18 @@ fn update_component_log_level(config: &TomlConfig) {
updated_inner
});
}

fn determine_jailed(config: &TomlConfig) -> Result<String> {
let Some(hypervisor_config) = config.hypervisor.get(&config.runtime.hypervisor_name) else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this compile? It kind of looks like if let was meant but the following code seems to contradict that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed that addition to the language, thanks! :-)


fn determine_jailed(config: &TomlConfig) -> Result<String> {
let Some(hypervisor_config) = config.hypervisor.get(&config.runtime.hypervisor_name) else {
todo!()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be uncertainty how to handle this, however (assuming I'm guessing the intent of the previous line correctly) this block will only execute if hypervisor_name points to an unconfigured hypervisor, right? If so then that would clearly be a major misconfiguration, warranting a return with an Err. Or am I misunderstanding completely?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct -- AFAIU todo!() will raise an error. If not, we should definitely raise an error there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe returning an Err here (probably with an empty string but that doesn't seem to matter much) would be better here than todo!() since todo!() basically just panics whereas with an Err return we can hope for a nice error stack message.

If this block gets executed it means that user pointed runtime.hypervisor_name to a hypervisor they haven't configured at all. There's no possible recovery from this kind of a gross configuration error as far as I can see, and failing with a nice informative error message in the log is all we can hope for here IMHO.

// jailed_path = h.<>
// and somehow store the sandbox state into the jail:
// persist::to_disk(&sandbox_state, &self.sid, jailed_path)?;
// Issue is, how to handle restore.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I get the issue right that it's the chicken-egg problem where in order to restore a hypervisor you need to get the corresponding jailed path, for which you need the very same hypervisor you're trying to restore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, correct -- and we don't like the way we've implemented that (at all...) so we're open to suggestions about what to do here ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I don't know if there can even be any other way apart from either 1) not saving the state to the jailed path, or 2) store the jailed path also somewhere outside of the state (which would require building something of a registry of sandbox state paths...).

}
if firecracker.boot_info.image.is_empty() {
return Err(eother!(
"Both guest boot image and initrd for firecracker are empty"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed firecracker didn't support initrd, am I right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually it does, although we haven't added that in kata just yet ;-)


if (firecracker.cpu_info.default_vcpus > 0
&& firecracker.cpu_info.default_vcpus as u32 > default::MAX_FIRECRACKER_VCPUS)
|| firecracker.cpu_info.default_maxvcpus > default::MAX_FIRECRACKER_VCPUS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

firecracker didn't support cpu hotplug, thus there's no need to check the default_maxvcpus.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the rationale here was to make sure that no vCPU setting is above the maximum number of vCPUs supported by the hypervisor. But I agree, let's just ignore the default_maxvcpus setting, as it is not used.

Ok(format!(
"unix://{}",
kata_run_path
let p = match jailed_path {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the uds is used for managing the shim, instead of the hypervisor, so why should we put the shim's management uds in the jail path?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's true -- we thought it would be better if all container-related files are inside the jail, but, indeed, we can keep the monitor shim at the generic (non-jailed) dir.

@@ -164,8 +170,11 @@ DEFMSIZE9P := 8192
DEFVFIOMODE := guest-kernel
##VAR DEFSANDBOXCGROUPONLY=<bool> Default cgroup model
DEFSANDBOXCGROUPONLY ?= false
DEFSANDBOXCGROUPONLY_DB ?= true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sandbox cgroup only isn't hypervisor specific value, why should we define dg/fc specific var here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue is that if we want a separate default option for hypervisors, we should have a per-hypervisor variable.

@@ -45,6 +45,9 @@ tempdir = "0.3.7"

[target.'cfg(not(target_arch = "s390x"))'.dependencies]
dragonball = { path = "../../../dragonball", features = ["atomic-guest-memory", "virtio-vsock", "hotplug", "virtio-blk", "virtio-net", "virtio-fs", "vhost-net", "dbs-upcall", "virtio-mem", "virtio-balloon", "vhost-user-net", "host-device"] }
dbs-utils = { path = "../../../dragonball/src/dbs_utils" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the dragonball utils are useful to other hypervisors, maybe we can split them out into the hypervisor utils?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only use the MacAddr from dbs_utils -- maybe we could use another crate for that.

impl FcInner {
pub fn new() -> FcInner {
let mut capabilities = Capabilities::new();
capabilities.set(CapabilityBits::BlockDeviceSupport);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the FC need to set HybridVsockSupport support specifically?

let exit_status = child.wait().await?;
error!(
sl!(),
"[Firecracker] Process exited, status: {:?}", exit_status
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can redefine a sl! macro including the Firecracker sub system instead of writing them in the log message every time?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack!

@lifupan
Copy link
Member

lifupan commented Jun 19, 2024

Hi @Pyrromanis

Thanks for the FC supporting effort, most of the patch LGTM except some small comments.

@ananos
Copy link
Member

ananos commented Jun 19, 2024

Hi @Pyrromanis

Thanks for the FC supporting effort, most of the patch LGTM except some small comments.

thanks @lifupan for the comments -- all look reasonable and prob. easy to address. We'll report back once we get to it!

@ananos ananos force-pushed the feat_add-fc-runtime-rs branch 4 times, most recently from 74729cb to e8757d8 Compare June 20, 2024 14:52
@ananos ananos force-pushed the feat_add-fc-runtime-rs branch 3 times, most recently from f9ab685 to d2f75b2 Compare June 26, 2024 10:19
Add a basic runtime-rs `Hypervisor` trait implementation for
AWS Firecracker

- Add basic hypervisor operations (setup / start / stop / add_device)
- Implement AWS Firecracker API on a separate file `fc_api.rs`
- Add support for running jailed (include all sandbox-related content)
- Add initial device support (limited as hotplug is not supported)
- Add separate config for runtime-rs (FC)

Notes:
- devmapper is the only snapshotter supported
- to account for no sharefs support, we copy files in the sandbox (as
  in the GO runtime)
- nerdctl spawn is broken (TODO: kata-containers#7703)

Fixes: kata-containers#5268

Signed-off-by: George Pyrros <gpyrros@nubificus.co.uk>
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
Signed-off-by: Charalampos Mainas <cmainas@nubificus.co.uk>
Signed-off-by: George Ntoutsos <gntouts@nubificus.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/firecracker Issues specific to the Firecracker hypervisor lang/rust Rust code ok-to-test runtime-rs rust Pull requests that update Rust code size/huge Largest and most complex task (probably needs breaking into small pieces)
Projects
Development

Successfully merging this pull request may close these issues.

Implement a runtime-rs hypervisor plugin for Firecracker
8 participants