Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I propose adding Landlock support to Bubblewrap #519

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ChrysoliteAzalea
Copy link

Hello everyone!

I propose adding Landlock support to Bubblewrap. Landlock is a Linux security module officially introduced in Linux 5.13 kernel version that allows unprivileged processes to impose filesystem self-restrictions. It can be used in Bubblewrap in order to prevent sandbox escape.

I've created an "experimental" branch in a forked repository with experimental patches adding Landlock support. If Landlock self-restriction is enabled, experimental patched version of Bubblewrap will create and apply a ruleset which only allows access to mounted resources, preventing container escape.

This experimental patch hasn't been fully tested, and may cause bugs in some cases (needs further testing before merging), which is why the experimental branch allows edits by those who have write access to this repository. Moreover, this patch relies on using shared libraries that I use for experimenting with Landlock, which have to be compiled before compiling the experimental branch, which is why I had problems with Meson build system (had to use Autoconf instead).

@ChrysoliteAzalea ChrysoliteAzalea force-pushed the experimental branch 2 times, most recently from e3f9438 to 0aa4132 Compare July 8, 2022 12:32
@Maryse47
Copy link

Maryse47 commented Jul 8, 2022

If Landlock self-restriction is enabled, experimental patched version of Bubblewrap will create and apply a ruleset which only allows access to mounted resources, preventing container escape.

Could you describe how it compares to non-landlock Bubblewrap? Can you escape its container?

@ChrysoliteAzalea
Copy link
Author

If Landlock self-restriction is enabled, experimental patched version of Bubblewrap will create and apply a ruleset which only allows access to mounted resources, preventing container escape.

Could you describe how it compares to non-landlock Bubblewrap? Can you escape its container?

The difference is, that with my patch, if Landlock protection is enabled, the Bubblewrap process creates ruleset and adds all mounted directories (binds, tmpfs, proc, dev) to it, and all access to resources not included would be blocked by Landlock (during my tests, it blocked the ls / command).

Currently, I haven't found any way to escape the container. However, I think that additional hardening (such as seccomp, Landlock, SELinux, AppArmor) is beneficial for the project as it can help mitigate any possible vulnerabilities should they happen (for example, I use additional AppArmor restriction for Flatpak apps using profile stacking feature).

Signed-off-by: Азалия Смарагдова <charming.flurry@yandex.ru>
Signed-off-by: Азалия Смарагдова <charming.flurry@yandex.ru>
@ChrysoliteAzalea
Copy link
Author

Force-pushed the experimental branch in order to combine and sign commits. No changes to the code itself have been made.

@ChrysoliteAzalea ChrysoliteAzalea marked this pull request as ready for review July 14, 2022 08:14
@ChrysoliteAzalea
Copy link
Author

I've tested this a bit, and it seems to be working correctly for me. Seems to be ready for review. What has to be done:

  • Adding the Landlock syscall wrappers and ruleset helper function to the Bubblewrap code (currently, it's done through shared libraries that are compiled separately)
  • Adding support for Meson build system (this patch has not been tested with it)
  • Removing the experimental patch warning

Copy link
Collaborator

@smcv smcv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Landlock is a Linux security module officially introduced in Linux 5.13 kernel version that allows unprivileged processes to impose filesystem self-restrictions. It can be used in Bubblewrap in order to prevent sandbox escape.

What benefit does this give us? Each bwrap option has a relatively well-defined meaning in terms of kernel-level namespace and filesystem operations (unshare namespaces, bind-mount, and so on), and the semantics are that the sandboxed process may access anything in the resulting filesystem.

If the Landlock ruleset doesn't agree with the filesystem that has been built by the namespace operations, then either the Landlock ruleset is too loose (harmless, but no benefit), or the Landlock ruleset is too strict (which would be an incompatible change, because it's disallowing things that would previously have been allowed).

If the Landlock ruleset does agree with the filesystem that has been built by the namespace operations, then I'm not sure that I see what the benefit is. If the sandboxed process can access a filesystem outside the ones we have given it access to, wouldn't that be a kernel bug? Conversely, if there's a kernel bug that lets it access filesystems beyond those, how can we say that there isn't also a kernel bug that bypasses Landlock for those filesystems?

I think to consider this, we really need a use-case that motivates it - we are not going to take responsibility for maintaining new code just out of a sense of completeness, we would need a sandboxing operation that would benefit from it.

The justification for Landlock in its own documentation seems to be that its restrictions are finer-grained than the restrictions on a mount point: mount points start from "allow everything" and can optionally be read-only (which we implement) or noexec (which we currently don't implement), but Landlock can have finer-grained restrictions like "can create regular files, but no fifos".

If you described a use-case for those finer-grained restrictions - preferably with a reasonable user-accessible configuration or CLI syntax in some higher-level project like Flatpak, calling into bwrap for its implementation - then that would seem like a more useful use of Landlock. However, at the moment that doesn't seem to be present here.

I wonder whether it would make more sense to treat Landlock as a layer that is orthogonal to the mount namespace, more like the way we handle seccomp, instead of interleaving it into how the mount namespace is set up? If the implementation opened the /newroot as an O_PATH fd and granted access to that, what would be the semantics of doing that? Would that grant access to the entire filesystem namespace that bwrap's child process is going to be trapped in (including any sub-mounts), or would that break the child process by granting access to only the root tmpfs but not any sub-mounts like /dev?

if (mount ("proc", arg1, "proc", MS_NOSUID | MS_NOEXEC | MS_NODEV, NULL) != 0)
die_with_error ("Can't mount proc on %s", arg1);
break;
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need re-indenting into a block, and it would be more obviously correct without the re-indent.

Comment on lines +1099 to +1103
int exposed_fd = open(arg1,O_PATH | O_CLOEXEC);
add_read_access_rule(opt_landlock_ruleset_fd,exposed_fd);
add_write_access_rule(opt_landlock_ruleset_fd,exposed_fd,1);
add_execute_rule(opt_landlock_ruleset_fd,exposed_fd);
close(exposed_fd);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like something that would benefit from being factored out into a helper function.


if (opt_landlock_ruleset_fd>-1)
{
int exposed_fd = open(arg1,O_PATH | O_CLOEXEC);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bubblewrap is in a GNUish coding style, the same as GLib and Flatpak (I'm not going to repeat this everywhere, but the same principles apply everywhere):

Suggested change
int exposed_fd = open(arg1,O_PATH | O_CLOEXEC);
int exposed_fd = open (arg1, O_PATH | O_CLOEXEC);

Comment on lines 3280 to 3281
/* Should be the last thing before execve() so that filters don't
* need to handle anything above */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should stay attached to seccomp_programs_apply: one of the things this comment means is that we don't want to require users' seccomp programs to allow landlock_restrict_self().

Comment on lines +3284 to +3285
if (landlock_restrict_self(opt_landlock_ruleset_fd,0)!=0)
warn("An error has occured while enabling Landlock restrictions. The sandbox will start without Landlock protection.\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want Landlock to be a security restriction, then it should fail closed. We can have a separate --landlock-try option for "try to do Landlock, but fall back to not", if you want to also support it as an optional hardening thing - but the default should be fail-closed.

@@ -16,4 +16,4 @@ mkdir -p m4
autoreconf --force --install --verbose

cd "$olddir"
test -n "$NOCONFIGURE" || "$srcdir/configure" "$@"
test -n "$NOCONFIGURE" || LIBS="-L$1 -lllwrapper -laddrule" "$srcdir/configure" "$@"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because bubblewrap is sometimes setuid root, we have to be very careful about external dependencies: every external dependency has to be written in a way that makes it safe to call into with elevated privileges, and maintained in a way that treats misplaced trust in the execution environment as a security vulnerability. I'm reasonably confident that glibc, libcap and libselinux are like that, but you'll notice we don't depend on anything larger, like GLib.

Is your Landlock library intended to be setuid-safe?

Is this a personal project, or does it have a bus factor of more than 1?

bubblewrap is already short of maintainer bandwidth (as you can see from the time it took to get this reviewed), so it is not really viable for us to end up in a situation where the bubblewrap maintainers also become responsible for maintaining and securing a Landlock helper library.

If this is going to be an external dependency, then it would have to be integrated into the build system properly, and build-time optional, like the libselinux code already is. --landlock would fail with an error message if the external dependency was not enabled. If we had a new option like --landlock-try, it could carry on regardless.

If you can guarantee that a Landlock library will be safe for setuid processes to link against (no constructors/initializers that e.g. trust environment variables), then another possibility would be to link to the library, but refuse to do anything with it (make --landlock fail with an error) if we find that we're setuid, similar to what's done for --userns.

Alternatively, if what you're doing with Landlock is sufficiently simple, then bwrap could open-code the syscalls in its own code. We'd have to look at a proposed implementation to make a judgement on whether it's small enough that we can take responsibility for its security, and we'd also have to think about cost vs. benefit in terms of what new hardening or new features Landlock gives us.

@smcv
Copy link
Collaborator

smcv commented Oct 26, 2022

all access to resources not included would be blocked by Landlock (during my tests, it blocked the ls / command)

That's an incompatible change. The initial contents of bwrap's filesystem namespace (before you specify any --bind, --dev and so on) is a tmpfs, and the intended semantics are that if you have not done a --bind $some_dir / or similar, then the child process is allowed to access that tmpfs directory (and in particular, it can list the top-level directories, which will typically include /dev, /usr and so on, bind-mounted in from elsewhere). Why would that ever not be desirable?

What the sandboxed process is not allowed to access is the filesystem outside the directory that starts as /newroot - but we enforce that by using pivot_root() to turn /newroot into the root filesystem, and unmounting the old root. That means there is no path that the sandboxed process can use to refer to a file it is not allowed to access: they simply do not exist in its view of the filesystem namespace.

If there are any filesystem sandbox escapes remaining after doing that, then they would be around kernel subtleties like the interpretation of "magic symlinks" in /proc. My understanding is that those would generally be treated as kernel security vulnerabilities, and fixed like any other serious kernel bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants