
podman system reset speedup: Imply --time 0 or provide option for it #21874

Closed
martinpitt opened this issue Feb 29, 2024 · 3 comments · Fixed by #21906

@martinpitt (Contributor)

Feature request description

I'm currently trying to clean up cockpit-podman's test setup/cleanup, after a quick inquiry. In some ideal world, calling `podman system reset --force` would clean up everything. It doesn't yet, but I'd like to gradually get rid of our extra hacks.

An easy one is to make it faster:

```
podman run -d docker.io/busybox sleep infinity
time podman system reset --force
```

shows:

```
WARN[0010] StopSignal SIGTERM failed to stop container lucid_jones in 10 seconds, resorting to SIGKILL

real	0m10.324s
user	0m0.134s
sys	0m0.152s
```

This is annoying for many tests, as it quickly piles up, is often slower than the actual test, and is completely unnecessary.

Suggest potential solution

Given the hammer size and intent of `system reset`, it would make sense to me if it didn't bother with the 10s timeout and just immediately killed everything.

Alternatively, it could grow a `--time 0` option like `podman rm` has.

Have you considered any alternatives?

We currently do an extra `podman rm --force --time 0 --all; podman pod rm --force --time 0 --all` (behind a version check, as e.g. Ubuntu 22.04's podman doesn't yet understand the `--time` option). This works, but it's one more step to keep track of.
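
For illustration, a minimal sketch of that kind of guarded cleanup; the `--help` probe is just one possible way to detect `--time` support, and the actual test code may differ:

```
# Hypothetical guard: use --time 0 only where podman rm supports it
# (e.g. not on Ubuntu 22.04's older podman).
if podman rm --help | grep -q -- '--time'; then
    podman rm --force --time 0 --all
    podman pod rm --force --time 0 --all
else
    podman rm --force --all
    podman pod rm --force --all
fi
```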

Additional context

This is on current Fedora 40: `podman-5.0.0~rc1-3.fc40.x86_64`

martinpitt added the kind/feature label Feb 29, 2024
@Luap99 (Member) commented Feb 29, 2024

For `system reset` it should just default to 0 IMO; the data is gone anyway after that, so there is no reason to wait for a graceful container shutdown.

Anyhow, you can run `podman stop --all -t0` even on older Ubuntu versions to work around that for now.
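
In other words, the interim workaround amounts to something like this sketch (`-t0` is short for `--time 0`):

```
# Stop everything immediately first; with no containers left running,
# the reset no longer has to wait out the 10 s stop timeout.
podman stop --all -t0
podman system reset --force
```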

martinpitt added a commit to martinpitt/cockpit-podman that referenced this issue Mar 1, 2024
`podman system reset` defaults to waiting for 10s for containers to shut
down, which unnecessarily slows down tests (see
containers/podman#21874). To work around this,
force-stop all containers first with a zero timeout.

This approach also works on Ubuntu 22.04, so remove the special case.
martinpitt added a commit to martinpitt/cockpit-podman that referenced this issue Mar 1, 2024
The `restore_dir()` for podman's data directory is highly problematic: it interferes with btrfs subvolumes and overlayfs mounts, and often causes `cp` failures like

```
cp: cannot stat '/home/admin/.local/share/containers/storage/overlay/compat3876082856': No such file or directory
```

So move to `podman system reset`, and restore the test images
with `podman load` for each test.

Unfortunately `podman system reset` defaults to the 10 s wait timeout
(containers/podman#21874), so we still need
the separate `rm --time 0` hack. But conceptually that can go away once
that bug is fixed.

This approach would also be nice on the system podman side, but it is super
hard to get right there, especially on CoreOS: there we simultaneously want a
thorough cleanup, but also rely on the running cockpit/ws container. It also
collides with the "force unmount everything below /var/lib/containers" hack
that we unfortunately still need for some OSes. But doing it for the user at
least solves half of the problem. The observed failures in the field
all occurred on the user directory, anyway.

Fixes cockpit-project#1591
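
As a rough illustration of the per-test pattern this commit message describes (the image archive path here is hypothetical):

```
# Wipe all podman state, then restore pre-built test images with
# `podman load` instead of copying the storage directory around.
podman rm --force --time 0 --all      # the interim `rm --time 0` hack
podman system reset --force
podman load --input /var/tmp/test-images.tar   # hypothetical path
```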
baude self-assigned this Mar 1, 2024
baude added a commit to baude/podman that referenced this issue Mar 1, 2024
when performing a system reset with containers that run something where
a soft kill won't work (like sleep), containers will wait 10 seconds
before terminating with a SIGKILL.  But for a forceful action like
system reset, we should outright set no timeout so containers stop
quickly instead of waiting out the timeout

Fixes containers#21874

Signed-off-by: Brent Baude <bbaude@redhat.com>
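
With a build containing that change, re-running the reproducer from the report should confirm the speedup, since the reset no longer waits for a graceful stop:

```
podman run -d docker.io/busybox sleep infinity
time podman system reset --force   # should now return almost immediately
```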
@baude (Member) commented Mar 1, 2024

@martinpitt thanks for the nicely done report.

@martinpitt (Contributor, Author)

Wow, thanks @baude for the fast fix!

stale-locking-app bot added the locked - please file new issue/PR label May 31, 2024
stale-locking-app bot locked as resolved and limited conversation to collaborators May 31, 2024