nixos-modules: add closure size + startup time optimization #146

Merged on Oct 1, 2023 (1 commit)

lf- (Contributor) commented Oct 1, 2023

This is based on my work for MapleCTF, where I ran microvm.nix inside a
Docker container (incidentally, an awesome microvm.nix use case) and needed
the closure size not to cause substantial issues.

Overall this saves about 700MB of closure size for a naive no-op VM
configuration, at practically the sole cost of eating a qemu compile.

co/microvm.nix » nix path-info -sSh ./result1
/nix/store/ligbkxkl1hnz2pvj8d9dfic991zfc0s0-microvm-qemu-nixos     1.7K  756.3M

co/microvm.nix » nix path-info -sSh ./result-without
/nix/store/h218db586pc627ai96j5cq2gssbvaxk8-microvm-qemu-nixos     1.7K    1.4G
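
For anyone who wants to check the "about 700MB" figure, here is a rough sketch that parses the two `nix path-info -sSh` lines above and computes the difference. It assumes the `K`/`M`/`G` suffixes are binary units (1024-based) and that the last column is the closure size. `parse_size` and `closure_size` are hypothetical helper names for illustration, not part of any Nix tooling.

```python
# Assumption: the -h suffixes are binary units (KiB, MiB, GiB).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str) -> float:
    """Convert a human-readable size such as '756.3M' to bytes."""
    text = text.strip()
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def closure_size(line: str) -> float:
    """Take the last column of a `nix path-info -sSh` line as the closure size."""
    return parse_size(line.split()[-1])


with_opt = "/nix/store/ligbkxkl1hnz2pvj8d9dfic991zfc0s0-microvm-qemu-nixos     1.7K  756.3M"
without_opt = "/nix/store/h218db586pc627ai96j5cq2gssbvaxk8-microvm-qemu-nixos     1.7K    1.4G"

saved_mib = (closure_size(without_opt) - closure_size(with_opt)) / 1024**2
print(f"saved roughly {saved_mib:.0f} MiB")
```

Since `1.4G` is itself rounded to one decimal place, the result is only good to within a few tens of MiB either way, which is consistent with "about 700MB".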

Without:

[root@nixos:~]# systemd-analyze time
Startup finished in 1.402s (kernel) + 8.830s (userspace) = 10.233s
multi-user.target reached after 8.830s in userspace.

With:

[root@nixos:~]# systemd-analyze time
Startup finished in 295ms (kernel) + 2.066s (initrd) + 2.471s (userspace) = 4.834s
multi-user.target reached after 2.466s in userspace.
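
The two totals can be compared with a small sketch like the one below. It is only an illustration: it reads the figure after the `=` on each `Startup finished in ...` line, and the regex handles just the `ms` and `s` forms seen in this output (not longer forms such as minutes). `total_seconds` is a hypothetical helper name.

```python
import re

# Matches durations such as "295ms" or "1.402s" as they appear in the output above.
DURATION = re.compile(r"([\d.]+)(ms|s)\b")


def total_seconds(line: str) -> float:
    """Return the total after '=' on a 'Startup finished in ...' line, in seconds."""
    total = line.rsplit("=", 1)[1]
    value, unit = DURATION.search(total).groups()
    return float(value) / 1000 if unit == "ms" else float(value)


without = "Startup finished in 1.402s (kernel) + 8.830s (userspace) = 10.233s"
with_opt = "Startup finished in 295ms (kernel) + 2.066s (initrd) + 2.471s (userspace) = 4.834s"

before, after = total_seconds(without), total_seconds(with_opt)
reduction = 1 - after / before
print(f"{before:.3f}s -> {after:.3f}s ({reduction:.0%} less boot time)")
```

By these numbers the optimized configuration spends roughly half as long booting as the baseline.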

This looks impressive, and perhaps it is, but there is still a lot of
speed left to gain. The dominant factors in each system startup are:

Old:

  • dhcpcd.service (7.706s)
  • firewall.service (917ms)
  • sshd.service (518ms)

New:

This group (in total, 3.1s ish in initrd; I have no idea why they are
this bad!):

  • sys-devices-platform-serial8250-tty-ttyS1.device (3.087s)
  • dev-ttyS1.device (3.087s)
  • sys-devices-platform-serial8250-tty-ttyS2.device (3.085s)
  • dev-ttyS2.device (3.085s)
  • dev-ttyS0.device (3.084s)
  • sys-devices-pnp0-00:02-tty-ttyS0.device (3.084s)
  • sys-devices-platform-serial8250-tty-ttyS3.device (3.082s)
  • dev-ttyS3.device (3.082s)
  • dev-disk-by\x2dpath-pci\x2d0000:00:02.0.device (3.181s)
  • sys-devices-pci0000:00-0000:00:02.0-virtio1-block-vda.device (3.181s)
  • dev-disk-by\x2duuid-0d9f5dc5\x2d0d09\x2d4a74\x2d88a5\x2da0a5be825cc5.device (3.181s)
  • dev-vda.device (3.181s)
  • dev-disk-by\x2ddiskseq-1.device (3.180s)
  • dev-disk-by\x2dpath-virtio\x2dpci\x2d0000:00:02.0.device (3.180s)
  • systemd-fsck@dev-vda.service (12ms)
  • sys-module-fuse.device (2.968s)

Then:

  • systemd-udev-trigger.service (261ms)
  • firewall.service (735ms) [!!!!!]
  • systemd-networkd.service (273ms)
  • sshd.service (511ms)

TL;DR: overall a huge improvement but it could have another 50% shaved
off, and I really have no idea why all that hardware init takes 3
seconds!

lf- commented Oct 1, 2023

You can get the systemd-analyze plot outputs here: https://gist.github.com/lf-/961f8f6be2de4f08f33a0cae505fd4c8

lf- commented Oct 1, 2023

(systemd bug this works around: systemd/systemd#29388)

lf- commented Oct 1, 2023

Oh, the nonsense device times are basically just the duration of the entire initrd: systemd/systemd#29010

astro (Owner) left a comment

I really like the microvm.optimize idea a lot.

Could you drop the passthru commit? Instead, just tell me what you need, and I'll add it, together with docs, within a few days...

Two outdated, resolved review threads on lib/runners/cloud-hypervisor.nix.

lf- commented Oct 1, 2023

Surplus commit has been dropped.

astro merged commit cfafd9b into astro:main on Oct 1, 2023
1 of 111 checks passed