.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: March 2021

The goal of Landlock is to restrict ambient rights (e.g. global filesystem
access) for a set of processes.  Because Landlock is a stackable LSM, it makes
it possible to create safe security sandboxes as new security layers in
addition to the existing system-wide access controls.  This kind of sandbox is
expected to help mitigate the security impact of bugs or unexpected/malicious
behaviors in user space applications.  Landlock empowers any process, including
unprivileged ones, to securely restrict itself.

Landlock rules
==============

A Landlock rule describes an action on an object.  An object is currently a
file hierarchy, and the related filesystem actions are defined with `access
rights`_.  A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

Defining and enforcing a security policy
----------------------------------------

We first need to create the ruleset that will contain our rules.  For this
example, the ruleset will contain rules that only allow read actions, while
write actions will be denied.  The ruleset then needs to handle both of these
kinds of actions.

.. code-block:: c

    int ruleset_fd;
    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM,
    };

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

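The ``landlock_*()`` functions used in these examples are system calls, and the
C library may not provide wrappers for them.  The following is a minimal
sketch of such wrappers, modeled on `samples/landlock/sandboxer.c`_; it assumes
that the installed kernel headers provide ``linux/landlock.h`` and the
``__NR_landlock_*`` syscall numbers.

.. code-block:: c

    #define _GNU_SOURCE
    #include <linux/landlock.h>
    #include <stddef.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* The guards avoid clashing with a C library that defines these as macros. */
    #ifndef landlock_create_ruleset
    static inline int landlock_create_ruleset(
            const struct landlock_ruleset_attr *const attr,
            const size_t size, const __u32 flags)
    {
        /* Returns a ruleset file descriptor, or -1 with errno set. */
        return syscall(__NR_landlock_create_ruleset, attr, size, flags);
    }
    #endif

    #ifndef landlock_add_rule
    static inline int landlock_add_rule(const int ruleset_fd,
            const enum landlock_rule_type rule_type,
            const void *const rule_attr, const __u32 flags)
    {
        /* Returns 0 on success, or -1 with errno set. */
        return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                       rule_attr, flags);
    }
    #endif

    #ifndef landlock_restrict_self
    static inline int landlock_restrict_self(const int ruleset_fd,
            const __u32 flags)
    {
        /* Returns 0 on success, or -1 with errno set. */
        return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
    }
    #endif
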
We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset.  The rule will only allow reading the
file hierarchy ``/usr``.  Without another rule, write actions would then be
denied by the ruleset.  To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

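When several hierarchies need a rule, the open/add/close sequence above can be
factored into a helper.  The following is only a sketch:
``allow_path_beneath()`` is a hypothetical name (not a kernel or C library
API), and the raw :manpage:`syscall(2)` is used on the assumption that no
wrapper is available.

.. code-block:: c

    #define _GNU_SOURCE /* for O_PATH */
    #include <errno.h>
    #include <fcntl.h>
    #include <linux/landlock.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /*
     * Hypothetical helper: allows @allowed_access on the file hierarchy
     * rooted at @path.  Returns 0 on success, or -1 with errno set.
     */
    static int allow_path_beneath(const int ruleset_fd, const char *const path,
                                  const __u64 allowed_access)
    {
        struct landlock_path_beneath_attr path_beneath = {
            .allowed_access = allowed_access,
        };
        int err, saved_errno;

        path_beneath.parent_fd = open(path, O_PATH | O_CLOEXEC);
        if (path_beneath.parent_fd < 0)
            return -1;
        err = syscall(__NR_landlock_add_rule, ruleset_fd,
                      LANDLOCK_RULE_PATH_BENEATH, &path_beneath, 0);
        /* The descriptor is not needed anymore; keep errno across close(). */
        saved_errno = errno;
        close(path_beneath.parent_fd);
        errno = saved_errno;
        return err ? -1 : 0;
    }

A caller could then write, for instance, ``allow_path_beneath(ruleset_fd,
"/usr", LANDLOCK_ACCESS_FS_READ_FILE | LANDLOCK_ACCESS_FS_READ_DIR)`` and
report the error with ``perror()`` if it returns -1.
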
We now have a ruleset with one rule allowing read access to ``/usr`` while
denying all other handled accesses for the filesystem.  The next step is to
restrict the current thread from gaining more privileges (e.g. thanks to a SUID
binary).

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the `landlock_restrict_self` system call succeeds, the current thread is now
restricted and this policy will be enforced on all its subsequently created
children as well.  Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed.  These threads are
now in a new Landlock domain, which is the merge of their parent's domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy.  This complementary policy is stacked on top of any
other rulesets already restricting this thread.  A sandboxed thread can then
safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access.  A sandboxed thread can only access
a file path if all its enforced policy layers grant the access, as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).

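The decision described above can be pictured with a small user space model.
This is an illustration only, not the kernel implementation: the ``model_*``
names and the two access bits are hypothetical, every layer is assumed to
handle the requested access, and the other system access controls are ignored.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical access bits, standing in for LANDLOCK_ACCESS_FS_*. */
    enum { MODEL_READ = 1 << 0, MODEL_WRITE = 1 << 1 };

    /* One policy layer, reduced to the rules encountered on a given path. */
    struct model_layer {
        const uint64_t *rules; /* allowed access of each rule on the path */
        size_t num_rules;
    };

    /* A layer grants the request if at least one of its rules grants it. */
    static bool model_layer_grants(const struct model_layer *const layer,
                                   const uint64_t request)
    {
        for (size_t i = 0; i < layer->num_rules; i++) {
            if ((layer->rules[i] & request) == request)
                return true;
        }
        return false;
    }

    /* The domain grants the request only if every layer grants it. */
    static bool model_domain_grants(const struct model_layer *const layers,
                                    const size_t num_layers,
                                    const uint64_t request)
    {
        for (size_t i = 0; i < num_layers; i++) {
            if (!model_layer_grants(&layers[i], request))
                return false;
        }
        return true;
    }

In this model, if a first layer has a rule allowing ``MODEL_READ |
MODEL_WRITE`` on the path and a second layer only has a rule allowing
``MODEL_READ``, then a read request is granted while a write request is denied:
stacking a layer can only remove access, never add it.
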
Bind mounts and OverlayFS
-------------------------

Landlock enables restricting access to file hierarchies, which means that these
access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination.  The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path.  These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers.  These layers are
combined in a merge directory, which is the result of the mount point.  This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected on the upper
layer.  From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts.  A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa.  Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent.  This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
a task's :manpage:`credentials(7)`.  For instance, one thread of a process may
apply Landlock rules to itself, but they will not be automatically applied to
other sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants.  This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.

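The sub-domain condition can be sketched as a walk up a chain of domains.  This
is a hypothetical user space model, not kernel code: ``struct model_domain``
and ``model_may_ptrace()`` are illustrative names, and the sketch assumes that
a tracee in the same domain as the tracer also qualifies, since a set of rules
is a subset of itself.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Each enforced ruleset creates a domain nested in the previous one. */
    struct model_domain {
        const struct model_domain *parent; /* NULL for a first-level domain */
    };

    /*
     * Returns true if @tracee is @tracer's domain or one of its sub-domains.
     * A NULL domain stands for a thread that is not sandboxed.
     */
    static bool model_may_ptrace(const struct model_domain *const tracer,
                                 const struct model_domain *tracee)
    {
        /* A non-sandboxed tracer gets no additional Landlock restriction. */
        if (!tracer)
            return true;
        /* Walk up from the tracee's domain, looking for the tracer's one. */
        for (; tracee; tracee = tracee->parent) {
            if (tracee == tracer)
                return true;
        }
        return false;
    }

With an ``outer`` domain and an ``inner`` domain nested in it, a tracer in
``outer`` may trace a process in ``inner``, but a tracer in ``inner`` may not
trace a process in ``outer`` nor a process that is not sandboxed.
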
Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

File renaming and linking
-------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle the composition of rules.  Such a property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheriting the ruleset restrictions from
a parent to its hierarchy.  Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagating the hierarchy constraints.  To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock currently limits linking and renaming to the same directory.  Future
Landlock evolutions will enable more flexibility for renaming and linking, with
dedicated ruleset flags.

Filesystem topology modification
--------------------------------

As for file renaming and linking, a sandboxed thread cannot modify its
filesystem topology, whether via :manpage:`mount(2)` or
:manpage:`pivot_root(2)`.  However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset.  However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted.  Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted.  However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies.  Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 64 layers of stacked rulesets.  This can be an issue for a
task willing to enforce a new ruleset in complement to its 64 inherited
rulesets.  Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG.  It is therefore strongly suggested to carefully build rulesets once in
the life of a thread, especially for applications able to launch other
applications that may also want to sandbox themselves (e.g. shells, container
managers, etc.).

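An application that may run inside an already deeply nested sandbox can tell
this case apart from other failures by checking ``errno``.  A minimal sketch,
in which ``restrict_self_checked()`` is a hypothetical helper name and the raw
:manpage:`syscall(2)` is used on the assumption that the C library has no
wrapper:

.. code-block:: c

    #define _GNU_SOURCE
    #include <errno.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Hypothetical helper: returns 0 on success, or -1 with errno set. */
    static int restrict_self_checked(const int ruleset_fd)
    {
        int saved_errno;

        if (syscall(__NR_landlock_restrict_self, ruleset_fd, 0) == 0)
            return 0;
        saved_errno = errno;
        if (saved_errno == E2BIG)
            /* The limit of stacked rulesets is already reached. */
            fprintf(stderr, "Landlock: too many stacked rulesets\n");
        else
            fprintf(stderr, "Failed to enforce ruleset (errno %d)\n",
                    saved_errno);
        /* Restore errno, which fprintf() may have changed. */
        errno = saved_errno;
        return -1;
    }

Whether to abort or to continue with the inherited restrictions only is a
policy decision left to the application.
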
Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using a user space process to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for access
control and thus lack useful features for such a use case (e.g. no
fine-grained restrictions).  Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c
