Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/doc: Add Developing the Test Driver #216660

Merged
merged 1 commit into from
Feb 17, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 43 additions & 0 deletions nixos/doc/manual/development/developing-the-test-driver.chapter.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@

# Developing the NixOS Test Driver {#chap-developing-the-test-driver}

The NixOS test framework is a project of its own.

It consists of roughly the following components:

- `nixos/lib/test-driver`: The Python framework that sets up the test and runs the [`testScript`](#test-opt-testScript)
- `nixos/lib/testing`: The Nix code responsible for the wiring, written using the (NixOS) Module System.

These components are exposed publicly through:

- `nixos/lib/default.nix`: The public interface that exposes the `nixos/lib/testing` entrypoint.
- `flake.nix`: Exposes the `lib.nixos`, including the public test interface.

Beyond the test driver itself, its integration into NixOS and Nixpkgs is important.

- `pkgs/top-level/all-packages.nix`: Defines the `nixosTests` attribute, used
by the package `tests` attributes and OfBorg.
- `nixos/release.nix`: Defines the `tests` attribute built by Hydra, independently, but analogous to `nixosTests`
- `nixos/release-combined.nix`: Defines which tests are channel blockers.

Finally, we have legacy entrypoints that users should move away from, but are cared for on a best effort basis.
These include `pkgs.nixosTest`, `testing-python.nix` and `make-test-python.nix`.

## Testing changes to the test framework {#sec-test-the-test-framework}

When making significant changes to the test framework, we run the tests on Hydra, to avoid disrupting the larger NixOS project.

For this, we use the `python-test-refactoring` branch in the `NixOS/nixpkgs` repository, and its [corresponding Hydra jobset](https://hydra.nixos.org/jobset/nixos/python-test-refactoring).
This branch is used as a pointer, and not as a feature branch.

1. Rebase the PR onto a recent, good evaluation of `nixos-unstable`
2. Create a baseline evaluation by force-pushing this revision of `nixos-unstable` to `python-test-refactoring`.
3. Note the evaluation number (we'll call it `<previous>`)
4. Push the PR to `python-test-refactoring` and evaluate the PR on Hydra
5. Create a comparison URL by navigating to the latest build of the PR and adding to the URL `?compare=<previous>`. This is not necessary for the evaluation that comes right after the baseline.

Review the removed tests and newly failed tests using the constructed URL; otherwise you will accidentally compare iterations of the PR instead of changes to the PR base.

As we currently have some flaky tests, newly failing tests are expected, but should be reviewed to make sure that
- The number of failures did not increase significantly.
- All failures that do occur can reasonably be assumed to fail for a different reason than the changes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The document itself is good.

At the same time, what do you think about rigorously dropping everything what's flaky? Flaky tests are a nuisance to everyone who invests good work to get changes through and have the side-effect of wearing down motivation to painstaking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as we don't mistake flaky Hydra for flaky tests, this would absolutely be a win.

We could perhaps move them to a flaky attrset, so that they can be fixed and moved back when they have a record of being reliable again.

Reasons to prefer dropping over moving:

  • less is more
  • satisfying

Reasons to prefer moving:

  • the state of the test remains visible, so that someone can pick it up and debug it
  • some tests need to evolve to a non-flaky state, as race conditions are discovered
  • no loss of coverage
  • Hermit

Hermit emulates the scheduler deterministically. This should let us find debuggable counterexamples, I think. Might be too new/experimental, and might not support KVM. We'll want to run it only on the flaky tests, because it is slower by an order of magnitude, but NixOS isn't poor so that's not a reason to avoid it.

So yeah, I think moving those tests would be great!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having different sets of which only one is really mandatory for contributors to satisfy sounds like a great improvement, yes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #216828 for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, @JulienMalka wanted to explore Hermit usage in NixOS tests, I don't know if he has something to share, but just mentioning him as it can be interesting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed I think it'd be very interesting to deep a little further into Hermit, see what improvements in test reproducibility we can get and at what cost. I'll certainly work on that subject soon, available to chat more about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What parts of the nixos test driver stack would be touched by using hermit? My gut feel says that this would be "just" configuring the guest vm's nixos config with kernel patch + config properly, is that about right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally Hermit supports KVM, so that we only have to wrap the top level invocation in nixos/lib/testing/run.nix. Otherwise, we have a qemu invocation in nixos/modules/virtualisation/qemu-vm.nix, which is included (a bit indirectly) from nixos/lib/testing/nodes.nix. Configuring the kernel works the way it normally does in NixOS, I think.

1 change: 1 addition & 0 deletions nixos/doc/manual/development/development.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,5 +10,6 @@ bootspec.chapter.md
what-happens-during-a-system-switch.chapter.md
writing-documentation.chapter.md
nixos-tests.chapter.md
developing-the-test-driver.chapter.md
testing-installer.chapter.md
```