Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

specs-go/config: add Landlock LSM support #1111

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kailun-qin
Copy link
Contributor

Linux kernel 5.13 adds support for Landlock Linux Security Module (LSM).
This allows unprivileged processes to create safe security sandboxes
that can securely restrict the ambient rights (e.g. global filesystem
access) for themselves.

#1110

Signed-off-by: Kailun Qin kailun.qin@intel.com

config.md Outdated
Comment on lines 215 to 221
* **`landlock`** (object, OPTIONAL) specifies the Landlock unprivileged access control settings for the container process.
For more information about Landlock, see [Landlock documentation][landlock].
`landlock` contains the following properties:

* **`ruleset`** (object, OPTIONAL) the `ruleset` field identifies a set of rules (i.e., actions on objects) that need to be handled (i.e., restricted).
* **`rules`** (array of objects, OPTIONAL) the `rules` field specifies the security policies (i.e., actions allowed on objects) to be added to an existing ruleset
* **`abi`** (object, OPTIONAL) the `abi` field defines the specific Landlock ABI version.
Copy link
Member

@cyphar cyphar Aug 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm of two minds about how much of the Landlock ruleset language we want to embed in the runtime-spec. I haven't looked too deeply into the version of Landlock that was merged -- is there any serialisation format they describe? Or is it like libseccomp where it's a bunch of function calls and we need to create our own way of describing it?

I don't really have a conclusion with my below points, I'm more wondering what other maintainers think about it.

Pros

Given that adding bits to runtime-spec configuration is a somewhat supported thing by some runtimes, making things configurable through the runtime-spec directly without needing any host setup (such as is the case with AppArmor and SELinux configuration) means that higher-level users can support this feature transparently by just passing whatever ruleset the user specified. This is one of the (on paper) benefits of our seccomp configuration, though in practice Docker has its own custom ruleset and crun is toying with an even more custom system. Though seccomp is a lot more granular than landlock so maybe this problem won't come up in practice.

Cons

  • We embed some fairly hairy bits of the kernel ABI into our format, meaning that if a new kernel feature is added, runtimes won't gain support for it until after the spec is updated and all of the other runtimes update. If there is a liblandlock with their own serialisation format, that would be preferable. This is especially true when it comes to the LANDLOCK_* constants being defined -- in seccomp these are a bit of a pain point because we need to manually translate them several times, meaning that we end up with multiple definitions of the constant mapping (none of which are in the spec).

  • The fact there's an ABI version field tells me that they might want to have new ABIs in the future. This is going to cause a problem with Go in particular, because a new ABI format that changes the structure of the landlock config object is going to make supporting multiple ABI versions quite difficult (in Rust you could do it with unions but we don't have that option in Go and we'd probably have to use some kind of awful reflection or creating a partial struct to just extract the version number).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any serialisation format they describe? Or is it like libseccomp where it's a bunch of function calls and we need to create our own way of describing it?

There is no "language" serialization done by the kernel. It is similar to openat* syscalls: a path rule is defined by a file descriptor to identify the path/hierarchy to be restricted with a set of rights as a bitmask.

Landlock is designed to handle layers of policies. This is similar to seccomp, but with less overhead, and focusing on semantic, which should avoid the seccomp issues. It then makes sense for runtime-spec to support a Landlock configuration. Other tools, either outside the container or inside, could still use Landlock the way it suits them.

We embed some fairly hairy bits of the kernel ABI into our format, meaning that if a new kernel feature is added, runtimes won't gain support for it until after the spec is updated and all of the other runtimes update.

This is correct but like I said above, nothing stops another component to use a newer or older set of Landlock features, independently of runtime-spec. I can help you keep an up-to-date spec as soon as a new features are in mainline.

If there is a liblandlock with their own serialisation format, that would be preferable. This is especially true when it comes to the LANDLOCK_* constants being defined -- in seccomp these are a bit of a pain point because we need to manually translate them several times, meaning that we end up with multiple definitions of the constant mapping (none of which are in the spec).

About the constants, you can use golandlock developped by @gnoack (package renaming is ongoing though).

The fact there's an ABI version field tells me that they might want to have new ABIs in the future. This is going to cause a problem with Go in particular, because a new ABI format that changes the structure of the landlock config object is going to make supporting multiple ABI versions quite difficult (in Rust you could do it with unions but we don't have that option in Go and we'd probably have to use some kind of awful reflection or creating a partial struct to just extract the version number).

Landlock will gain new features over time, but the old ones will (of course) still be supported (i.e. no compatibility breakage). The ABI version is there to probe the kernel and know if it supports a set of features. This version ABI should not be feared, nothing will break when a new one will be supported by the kernel. For more details, there is an issue for golandlock about that: landlock-lsm/go-landlock#4

// LandlockRule represents the security policies (i.e., actions allowed on objects) .
type LandlockRule struct {
// Type is the Landlock rule type pointing to the rules to be added to an existing ruleset.
Type LandlockRuleType `json:"type,omitempty" platform:"linux"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need multiple LandlockRules of the same type?

For example, we can do instead:

"rules": {
  "path_beneath": [
    {
      "allowedAccess": ["xxx"],
      "paths": ["xxx"]
    },
    {
      "allowedAccess": ["xxx"],
      "paths": ["xxx"]
    }
  ]
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it would make the configuration lighter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedbacks! Use an object with rule-type keys instead to make it lighter.

@h-vetinari
Copy link
Contributor

So I'm of two minds about how much of the Landlock ruleset language we want to embed in the runtime-spec. I haven't looked too deeply into the version of Landlock that was merged -- is there any serialisation format they describe? Or is it like libseccomp where it's a bunch of function calls and we need to create our own way of describing it?

[...]

CC @l0kod

@BoardzMaster
Copy link

BoardzMaster commented Aug 4, 2021

Having this format of Landlock rules I think is enough (like @kailun-qin suggested)
image

  1. Go LibLandlock is already implemented by Günther Noack https://github.com/gnoack/golandlock. It can be used in runc.
  2. Another issue is the future support for containerd/k8s. Here there must be serialization format like @cyphar addressed in his comment ( grpc mechanisms in particular)

Mickaël Salaün, please give your opinion CC @l0kod

@AkihiroSuda
Copy link
Member

@kailun-qin
Do you have a POC implementation of runc/crun to verify this design?

@l0kod
Copy link

l0kod commented Aug 22, 2021

Sorry for the delay, I missed the GitHub notification. Thanks for working on this, I'm happy to help! However, I'm not familiar with runtime-spec, my suggestions may not be accurate.

To be sure everyone is on the same page, I want to say that Landlock may not currently be used in a generic container because of the reparenting limitation. I plan to address this in the following months.
However, I think it is a good idea to start the integration now and that the current status can still be useful to users.

The goal of Landlock is to give (for this use case) the ability to container managers to handle access rights independently to the parent system configuration (i.e. it doesn't require admin rights). It seems that options such as maskedPaths and readonlyPaths have a similar purpose (but they use the namespace features underneath). However, because runtime-spec is close to the underlying kernel/OS, it seems that it makes sense to have a specific "landlock" object for Linux.

FYI, to be able to use Landlock, the noNewPrivileges option must be enabled. It may be a good idea to hint the user and error out if Landlock is configured without this option to avoid false sense of security.

Because a kernel may not support Landlock (because the kernel is too old, or Landlock is not configured, or not enabled at boot time), I would advise to follow a best-effort security approach, which is to just drop a warning and continue interpreting the configuration by default. However, it may be wise to let users choose to enforce Landlock restrictions for specific use cases (e.g. when we create a configuration for our own systems that we know should support Landlock). I don't know how this is currently handled for seccomp though, but I guess the behavior should be the same.

How do you plan to use the "abi" field?

If you want to follow Landlock development, you can subscribe to the appropriate mailing list: https://subspace.kernel.org/lists.linux.dev.html. I'll send news soon.

config.md Outdated
@@ -253,6 +260,65 @@ _Note: symbolic name for uid and gid, such as uname and gname respectively, are
],
"apparmorProfile": "acme_secure_profile",
"selinuxLabel": "system_u:system_r:svirt_lxc_net_t:s0:c124,c675",
"landlock": {
"ruleset": {
"handledAcessFS": [
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handledAccessFS (a "c" is missing)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. Fixed, thanks!

config.md Outdated

* **`ruleset`** (object, OPTIONAL) the `ruleset` field identifies a set of rules (i.e., actions on objects) that need to be handled (i.e., restricted).
* **`rules`** (array of objects, OPTIONAL) the `rules` field specifies the security policies (i.e., actions allowed on objects) to be added to an existing ruleset
* **`abi`** (object, OPTIONAL) the `abi` field defines the specific Landlock ABI version.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the "abi" field, it is not explained how this will be used. Is it a requirement to only enforce the Landlock rules with a kernel supporting this version? Is it a requirement for a minimal version?

The ABI version should be used by the container runtime to know if a set of rules is supported by the kernel, which should then help to only enforce those that are indeed supported and ignores unsupported features (i.e. best-effort security).

I'm not sure it makes sense to expose this in the configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally agree w/ the best-effort security approach and I'm currently implementing this way (i.e., if not supported, just drop a warning and continue interpreting the configuration).
Added with more elaborations on the usage of this abi field. IMO, it's fine to expose this in the config. Feedback is welcome.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the abi and handledAccessFS settings trying to solve the same problem here? In the golandlock module, picking an ABI version is just a shortcut for asking for the maximum set of handledAccessFS possible under that ABI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnoack Thanks! Great question.

Yes, from the current golandlock implementation, the abi maps to the max set of handledAccessFS possible under it. I.e., they may try to solve the same problem at this moment.

For the new abis in the future, should they normally point to sets of features to be supported by the kernel and those might go beyond handledAccessFS? I'm not sure. I'll leave the question here to @l0kod.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ABI version is not only linked to handledAccessFS, the version may increase for it or for other features (independently). See my comment just bellow.

config.md Outdated
"landlock": {
"ruleset": {
"handledAcessFS": [
"LANDLOCK_ACCESS_FS_EXECUTE",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These access rights could be truncated and simplified because they are scoped and typed by the "handledAccessFS" (e.g. "execute" instead of "LANDLOCK_ACCESS_FS_EXECUTE").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense to me. Updated, thanks!

// LandlockRule represents the security policies (i.e., actions allowed on objects) .
type LandlockRule struct {
// Type is the Landlock rule type pointing to the rules to be added to an existing ruleset.
Type LandlockRuleType `json:"type,omitempty" platform:"linux"`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it would make the configuration lighter.

@BoardzMaster
Copy link

BoardzMaster commented Aug 23, 2021

I suggest using a simplified version of a file/dir access rules like RW, RO. In this case, we could use RODirs()...RwFiles() restrict path options from the golandlock library https://github.com/gnoack/golandlock and a spec config.json file would be like this:
image

It will be more convenient for a user.

Mickaël Salaün, please give your opinion CC @l0kod

@l0kod
Copy link

l0kod commented Aug 23, 2021

I suggest using a simplified version of a file/dir access rules like RW, RO. In this case, we could use RODirs()...RwFiles()

RW and RO are misleading because there is not yet full support for all read and write operations on the filesystem. Moreover, runtime-spec seems to deal with low-level system config (e.g. mount options). I think that maintaining a high-level and evolving configuration will not fit with the purpose of runtime-spec.

However, it is still possible to create variables containing a pre-defined set of access rights.

@gnoack
Copy link

gnoack commented Aug 23, 2021

I suggest using a simplified version of a file/dir access rules like RW, RO. In this case, we could use RODirs()...RwFiles()

RW and RO are misleading because there is not yet full support for all read and write operations on the filesystem. Moreover, runtime-spec seems to deal with low-level system config (e.g. mount options). I think that maintaining a high-level and evolving configuration will not fit with the purpose of runtime-spec.

However, it is still possible to create variables containing a pre-defined set of access rights.

+1

The intent with ROFiles() and friends is to provide a way to coarsely permit operations and have the permissions "grow" along when new Landlock ABI versions become available. I'm hoping that in most cases it will only be needed to increment the used Landlock ABI version from time to time, and leaving the ROFiles() and similar rules untouched. But my compatibility assumptions are difficult to test, and @l0kod has a better hunch what the future might bring.

Please open a feature request on golandlock if you want more direct access for specifying ruleset.handledAccessFS. (Right now, it's implicitly set to the "best possible" value in the golandlock.V1 constant. The only reason this is not exposed so far was because I could not imagine a use case for it...)

@kailun-qin
Copy link
Contributor Author

Thank you for all your inputs and comments! @l0kod @cyphar @BoardzMaster @h-vetinari @AkihiroSuda @gnoack

I've updated with a new version mainly w/ below changes based on the feedbacks:

  • Use an object with rule-type keys instead of an array of rules with specific types to make configuration lighter.
  • Simplify the access rights consts as they are scoped and typed by the LandlockFSAction.
  • Update some field comments and add elaborations on the usage of abi etc.
  • Misc minor fixes.

@AkihiroSuda Regarding POC implementation, we're working on it based on runc and golandlock. We'll submit PRs for review once it's ready.

Please kindly take another look. Let me know if any further suggestions. Thanks!

config.md Outdated

* **`ruleset`** (object, OPTIONAL) the `ruleset` field identifies a set of rules (i.e., actions on objects) that need to be handled (i.e., restricted).
* **`rules`** (array of objects, OPTIONAL) the `rules` field specifies the security policies (i.e., actions allowed on objects) to be added to an existing ruleset
* **`abi`** (object, OPTIONAL) the `abi` field defines the specific Landlock ABI version.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ABI version is not only linked to handledAccessFS, the version may increase for it or for other features (independently). See my comment just bellow.

config.md Outdated
* **`abi`** (object, OPTIONAL) the `abi` field defines the specific Landlock ABI version.
This should be used by the runtime to check if the kernel supports the specified sets of Landlock features and then enforce those following a best-effort security approach.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I didn't designed the ABI version to be used this way. The idea is to know which features are supported before asking the kernel to build a ruleset with them (and not more), not the other way around. This helps make sure that no error should occur because of unknown features.

What you are doing here is to use this version to check that the container configuration was tested with a kernel supporting at least this ABI version, which are implicit default configurations (not defined in this spec anyway). I think it is safer to not set and use this version in a configuration but instead have an explicit configuration (i.e. explicit access rights). This would also avoid the case where different runtimes define their own default values according to a version which is not yet in the specification.

Later, we could add groups of access rights that could be used for handledAccessFS and allowedAccess, but the content of these groups should not change over time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@l0kod Thanks for the comments! Make sense to me.

Removing the abi field from the config and will leverage the explicit access rights specified.

In this case, I think we need direct access to ruleset.handledAccessFS in go-landlock? @gnoack

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I'll expose it. Tracking it in landlock-lsm/go-landlock#12

config.md Outdated
* **`abi`** (object, OPTIONAL) the `abi` field defines the specific Landlock ABI version.
This should be used by the runtime to check if the kernel supports the specified sets of Landlock features and then enforce those following a best-effort security approach.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@l0kod Thanks for the comments! Make sense to me.

Removing the abi field from the config and will leverage the explicit access rights specified.

In this case, I think we need direct access to ruleset.handledAccessFS in go-landlock? @gnoack

// This is for conditions when the Landlock access rights explicitly configured by the container are not
// supported or available in the running kernel.
// Default is false, i.e., following a best-effort security approach.
DisableBestEffort bool `json:"disableBestEffort,omitempty" platform:"linux"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also adding a best-effort control here, see if it makes sense to you @l0kod @gnoack. Thanks!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about that, I think it would be more flexible to replace this global field with two per feature block: to be able to have a soft requirement (by default) or a hard requirement. This would enable for instance to require that the kernel supports a specific feature that would otherwise break the container (for instance, supporting file reparenting, which would probably be a new kind of option for the ruleset).
This would be complementary to a soft error (by default, best-effort: only log a warning) or a hard error (don't launch the container) if the feature is not supported. Being able to have no warning at all (not by default) could be useful too.

To sum up, that may look like this:

"ruleset": {
      "handledAccessFS": [
          "execute",
          "write_file",
          "read_file",
          "read_dir",
          "remove_dir",
          "remove_file",
          "make_char",
          "make_dir",
          "make_reg",
          "make_sock",
          "make_fifo",
          "make_block",
          "make_sym"
      ],

    // optional field, default value (all handledAccessFS elements may not be known to the kernel; take into account as much as possible).
    // If "required" is true and the kernel doesn't support all the handledAccessFS rights, then ignores the whole "ruleset" block and the associated "rules".
    "required": false,

    // optional field, default value (if an handledAccessFS element is unknown to the kernel, then log a warning). Other values could be "silent" and "error".
    "unsupported": "warning"
},
"rules": {
    "pathBeneath": [
        {
            "allowedAccess": [
                "execute",
                "read_file",
                "read_dir"
            ],
            "paths": [
                "/usr",
                "/bin"
            ],

            // optional field, default value (all allowedAccess elements may not be known to the kernel; take into account as much as possible)
            // If "required" is true and the kernel doesn't support all the allowedAccess rights, then ignores this rule.
            "required": false,

            // optional field, default value (if an allowedAccess element is unknown to the kernel, then log a warning)
            "unsupported": "warning"
        },

Copy link
Contributor Author

@kailun-qin kailun-qin Sep 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @l0kod for the comments!

I do agree this adds more flexibility. Just wondering whether this adds more complexity as well? If supported, the flag check and intersect functions in go-landlock may better be exposed and the logic will be handled in runc. Beyond that, a RestrictPathsNoCheck API may need to be added as the checks in the current implementations are redundant in go-landlock.

I'm currently leveraging the best effort mode provided directly in go-landlock (which is a single control point). Any thoughts/comments? @gnoack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding soft/hard requirements for what the kernel supports:

I've been playing with the thought of adding that to go-landlock, but was not convinced whether it would be actually needed. The way I considered implementing it would have been to have one AccessFSSet that is ideal and one AccessFSSet that is the minimum. So instead of .BestEffort(), you'd then call something like .BestEffortButAtLeast(AccessFSSet(...)). (or maybe I can find a nicer name for that...)

It would be reasonably easy to add to go-landlock, required changes would be: (a) to add a minHandledAccessFS field to the Config struct, (b) expose a suitable method to set it, and then (c) in restrictPaths(), just before the RulesetAttr{} is created, return an error if the minimal requirement is not fulfilled. I was just not convinced whether there is actually a need for it and it felt like speculative generality... do you think this is a use case that you'd need?

Regarding file reparenting:

If I understand this correctly, file reparenting is currently forbidden in a Landlock-restricted process in kernel 5.13, and with kernel 5.X you want to permit this in Landlock-restricted processes if they ask for it at enforcement time. Do I understand that correctly?

If that is the case, then it sounds like it should maybe not be part of the Config (RulesetAttr), but might logically fit more as an argument to RestrictPaths() (invocation of landlock_add_rule), like this:

err := landlock.V99.BestEffort().RestrictPaths(
  variousPaths...,
  landlock.EnableFileReparenting(params...),
)

The thinking is:

  • the Config (landlock.V99, RulesetAttr) describes what should be forbidden (and with abi→∞, thats everything :))
  • the arguments to RestrictPaths() (landlock_add_rule) describe the exceptions to what is forbidden (the operations that the process intends to do)

So yes, the above invocation is not compatible with kernel 5.13. The go-landlock library's BestEffort() mode should restrict the strongest rules it can, but it might have to fall back to not enforcing anything at all, if file reparenting is something that the program needs, and if that is not compatible with the current kernel.

(I feel like I'm making a few assumptions here, there is a good chance that I'm misunderstanding this... :) let me know)

Regarding evolution flexibility of the config:

If there will be more fields added to RulesetAttr in the future, it might be easier to think of this as a "permits-more-or-the-same-as" relationship on RulesetAttr structs rather than as a "subset" relationship on AccessFSSets as it's implemented now in go-landlock. So maybe it would be better to have a "ruleset" and "min_ruleset" rather than having a "handled_access_fs" and "min_handled_access_fs" in the ruleset? WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about that, I think it would be more flexible to replace this global field with two per feature block: to be able to have a soft requirement (by default) or a hard requirement. This would enable for instance to require that the kernel supports a specific feature that would otherwise break the container (for instance, supporting file reparenting, which would probably be a new kind of option for the ruleset).

If we really want to support a soft requirement (by default) or a hard requirement in ruleset to specify at least some features shall be supported, the format would be something similar to what @gnoack suggested - ruleset and min_ruleset (where one ruleset not be enough).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simplified version of the proposition would be to only add required and unsupported (names may not be clear enough) to the ruleset block. I don't know about the evolution flexibility of this specification though.

This also makes me wonder about rules not being nested into ruleset. That would make more sense. It would be lighter to merge landlock and ruleset into landlockRuleset (the same way there is apparmorProfile and selinuxLabel). What do you think?

For rules not being nested into ruleset, they are currenlty being errored out (and should be considered as misconfigured maybe?). In this case, I think it's fine to only control in the ruleset block (simplified version).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding soft/hard requirements for what the kernel supports:

I've been playing with the thought of adding that to go-landlock, but was not convinced whether it would be actually needed. The way I considered implementing it would have been to have one AccessFSSet that is ideal and one AccessFSSet that is the minimum. So instead of .BestEffort(), you'd then call something like .BestEffortButAtLeast(AccessFSSet(...)). (or maybe I can find a nicer name for that...)

It would be reasonably easy to add to go-landlock, required changes would be: (a) to add a minHandledAccessFS field to the Config struct, (b) expose a suitable method to set it, and then (c) in restrictPaths(), just before the RulesetAttr{} is created, return an error if the minimal requirement is not fulfilled. I was just not convinced whether there is actually a need for it and it felt like speculative generality... do you think this is a use case that you'd need?

If this needs to go to rules as well (as @l0kod suggested in the above with sample), then this would not be enough.

Regarding file reparenting:

If I understand this correctly, file reparenting is currently forbidden in a Landlock-restricted process in kernel 5.13, and with kernel 5.X you want to permit this in Landlock-restricted processes if they ask for it at enforcement time. Do I understand that correctly?

If that is the case, then it sounds like it should maybe not be part of the Config (RulesetAttr), but might logically fit more as an argument to RestrictPaths() (invocation of landlock_add_rule), like this:

err := landlock.V99.BestEffort().RestrictPaths(
  variousPaths...,
  landlock.EnableFileReparenting(params...),
)

The thinking is:

  • the Config (landlock.V99, RulesetAttr) describes what should be forbidden (and with abi→∞, thats everything :))
  • the arguments to RestrictPaths() (landlock_add_rule) describe the exceptions to what is forbidden (the operations that the process intends to do)

So yes, the above invocation is not compatible with kernel 5.13. The go-landlock library's BestEffort() mode should restrict the strongest rules it can, but it might have to fall back to not enforcing anything at all, if file reparenting is something that the program needs, and if that is not compatible with the current kernel.

(I feel like I'm making a few assumptions here, there is a good chance that I'm misunderstanding this... :) let me know)

I have the same understanding...

Regarding evolution flexibility of the config:

If there will be more fields added to RulesetAttr in the future, it might be easier to think of this as a "permits-more-or-the-same-as" relationship on RulesetAttr structs rather than as a "subset" relationship on AccessFSSets as it's implemented now in go-landlock. So maybe it would be better to have a "ruleset" and "min_ruleset" rather than having a "handled_access_fs" and "min_handled_access_fs" in the ruleset? WDYT?

Yes, it makes sense to me.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So instead of .BestEffort(), you'd then call something like .BestEffortButAtLeast(AccessFSSet(...)). (or maybe I can find a nicer name for that...)

In the Rust library, I use ErrorThreshold and set_error_threshold() to be able to require or ignore specific features. By default, it is a best-effort approach.

If I understand this correctly, file reparenting is currently forbidden in a Landlock-restricted process in kernel 5.13, and with kernel 5.X you want to permit this in Landlock-restricted processes if they ask for it at enforcement time. Do I understand that correctly?

Yes, I think it will be set in a new bitfield in the ruleset attribute struct. An additional access-right might come as well.

  • the Config (landlock.V99, RulesetAttr) describes what should be forbidden (and with abi→∞, thats everything :))

potentially everything, according to the ruleset.

So yes, the above invocation is not compatible with kernel 5.13. The go-landlock library's BestEffort() mode should restrict the strongest rules it can, but it might have to fall back to not enforcing anything at all, if file reparenting is something that the program needs, and if that is not compatible with the current kernel.

(I feel like I'm making a few assumptions here, there is a good chance that I'm misunderstanding this... :) let me know)

That is correct. :)

If there will be more fields added to RulesetAttr in the future, it might be easier to think of this as a "permits-more-or-the-same-as" relationship on RulesetAttr structs rather than as a "subset" relationship on AccessFSSets as it's implemented now in go-landlock. So maybe it would be better to have a "ruleset" and "min_ruleset" rather than having a "handled_access_fs" and "min_handled_access_fs" in the ruleset? WDYT?

Yes, good idea, it would make sense to have an optional "min_ruleset" or something similar (complementary to "ruleset").

kailun-qin added a commit to kailun-qin/runc that referenced this pull request Sep 1, 2021
This patch introduces Landlock Linux Security Module (LSM) support in
runc, which was landed in Linux kernel 5.13.

This allows unprivileged processes to create safe security sandboxes
that can securely restrict the ambient rights (e.g. global filesystem
access) for themselves.

runtime-spec: opencontainers/runtime-spec#1111

Fixes opencontainers#2859

Signed-off-by: Kailun Qin <kailun.qin@intel.com>
kailun-qin added a commit to kailun-qin/runc that referenced this pull request Sep 9, 2021
This patch introduces Landlock Linux Security Module (LSM) support in
runc, which was landed in Linux kernel 5.13.

This allows unprivileged processes to create safe security sandboxes
that can securely restrict the ambient rights (e.g. global filesystem
access) for themselves.

runtime-spec: opencontainers/runtime-spec#1111

Fixes opencontainers#2859

Signed-off-by: Kailun Qin <kailun.qin@intel.com>
@AkihiroSuda
Copy link
Member

What do we need to move this forward?

@vbatts
Copy link
Member

vbatts commented Apr 20, 2022

@AkihiroSuda the discussion looks good and healthy. And it looks like implementations are iterating to find consensus and polish out issues found. I haven't seen any feedback that should stop this issue from a final review.

Copy link
Member

@vbatts vbatts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it SGTM apart from minimizing top-level variable names.


// Define actions on files and directories that Landlock can restrict a sandboxed process to.
const (
FSActExecute LandlockFSAction = "execute"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all these FSAct* will be top-level variables, can we prefix with LL for LandLock? I'm just thinking that FSActions may overlap with another LSM or something like fanotify

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @kailun-qin 🙏

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks for the suggestions!

Copy link
Contributor Author

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Landlock has recently introduced v2 ABI in Linux 5.19: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cb44e4f061e16be65b8a16505e121490c66d30d0.

Shall we support it also in this PR or thru a follow-up one?

Copy link
Member

@giuseppe giuseppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config schema must be updated as well

@kailun-qin
Copy link
Contributor Author

config schema must be updated as well

Updated, thanks!

@AkihiroSuda
Copy link
Member

Could you consider squashing commits?

@kailun-qin
Copy link
Contributor Author

Could you consider squashing commits?

Squashed and rebased, PTAL, thanks!

Linux kernel 5.13 adds support for Landlock Linux Security Module (LSM).
This allows unprivileged processes to create safe security sandboxes
that can securely restrict the ambient rights (e.g. global filesystem
access) for themselves.

opencontainers#1110

Signed-off-by: Kailun Qin <kailun.qin@intel.com>
Entries in the array contain the following properties:
* **`allowedAccess`** (array of strings, OPTIONAL) is an array of FS typed actions that are allowed by a rule.
* **`paths`** (array of strings, OPTIONAL) is an array of files or parent directories of the file hierarchies to restrict.
* **`disableBestEffort`** (bool, OPTIONAL) the `disableBestEffort` field disables the best-effort security approach for Landlock access rights.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this warning something that the kernel/landlock generates? or is the runtime expected to map this?
I'm reading through the kernel docs and not seeing this best-effort parameter.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "best effort" feature is implemented in user space based on the Landlock ABI level exposed by the kernel. The idea is to use Landlock if it's available and to the extent that it is available, but to fall back to a lower level of protection if the kernel is older.

At the moment, the support matrix for file system access rights is:

  • ABI v1 (kernel 5.13) supports all rights up to MAKE_SYM
  • ABI v2 (kernel 5.19) supports additionally the REFER right

And for yet-unreleased kernels:

  • ABI v3 (in upcoming kernel 6.2, if everything goes well) will support additionally the TRUNCATE right

So the kernel interface to this is:

  • use a special flag for landlock_create_ruleset(2) to figure out the ABI level
  • then use the Landlock API with the right access rights that are available at that level

Also see https://docs.google.com/document/d/1SkFpl_Xxyl4E6G2uYIlzL0gY2PFo-Nl8ikblLvnpvlU/edit for a bit more background and a visualization of how the API evolves

@AkihiroSuda AkihiroSuda added this to the v1.2.0 milestone Mar 28, 2023
@AkihiroSuda
Copy link
Member

@kailun-qin What's current status of this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants