From patchwork Fri Feb 15 16:07:23 2019
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 1042990
Subject: [RFC PATCH 01/27] containers: Rename linux/container.h to
 linux/container_dev.h
From: David Howells
To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com,
 sfrench@samba.org
Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com,
 dhowells@redhat.com, linux-kernel@vger.kernel.org
Date: Fri, 15 Feb 2019 16:07:23 +0000
Message-ID: <155024684311.21651.6261046862181321227.stgit@warthog.procyon.org.uk>
In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version

Rename linux/container.h to linux/container_dev.h so that linux/container.h
can be used for containers.

Signed-off-by: David Howells
---

 drivers/acpi/container.c      |    2 +-
 drivers/base/container.c      |    2 +-
 include/linux/container.h     |   25 -------------------------
 include/linux/container_dev.h |   25 +++++++++++++++++++++++++
 4 files changed, 27 insertions(+), 27 deletions(-)
 delete mode 100644 include/linux/container.h
 create mode 100644 include/linux/container_dev.h

diff --git a/drivers/acpi/container.c b/drivers/acpi/container.c
index 12c240903c18..435db0694405 100644
--- a/drivers/acpi/container.c
+++ b/drivers/acpi/container.c
@@ -23,7 +23,7 @@
  * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  */
 #include
-#include
+#include

 #include "internal.h"

diff --git a/drivers/base/container.c b/drivers/base/container.c
index 1ba42d2d3532..1ff01ead2b2a 100644
--- a/drivers/base/container.c
+++ b/drivers/base/container.c
@@ -6,7 +6,7 @@
  * Author: Rafael J. Wysocki
  */

-#include
+#include

 #include "base.h"

diff --git a/include/linux/container.h b/include/linux/container.h
deleted file mode 100644
index 3c03e6fd2035..000000000000
--- a/include/linux/container.h
+++ /dev/null
@@ -1,25 +0,0 @@
-/*
- * Definitions for container bus type.
- *
- * Copyright (C) 2013, Intel Corporation
- * Author: Rafael J. Wysocki
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include
-
-/* drivers/base/power/container.c */
-extern struct bus_type container_subsys;
-
-struct container_dev {
-	struct device dev;
-	int (*offline)(struct container_dev *cdev);
-};
-
-static inline struct container_dev *to_container_dev(struct device *dev)
-{
-	return container_of(dev, struct container_dev, dev);
-}
diff --git a/include/linux/container_dev.h b/include/linux/container_dev.h
new file mode 100644
index 000000000000..3c03e6fd2035
--- /dev/null
+++ b/include/linux/container_dev.h
@@ -0,0 +1,25 @@
+/*
+ * Definitions for container bus type.
+ *
+ * Copyright (C) 2013, Intel Corporation
+ * Author: Rafael J. Wysocki
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include
+
+/* drivers/base/power/container.c */
+extern struct bus_type container_subsys;
+
+struct container_dev {
+	struct device dev;
+	int (*offline)(struct container_dev *cdev);
+};
+
+static inline struct container_dev *to_container_dev(struct device *dev)
+{
+	return container_of(dev, struct container_dev, dev);
+}

From patchwork Fri Feb 15 16:07:33 2019
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 1042991
Subject: [RFC PATCH 02/27] containers: Implement containers as kernel objects
From: David Howells
Date: Fri, 15 Feb 2019 16:07:33 +0000
Message-ID: <155024685321.21651.1504201877881622756.stgit@warthog.procyon.org.uk>
In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version

Implement a kernel container object such that it contains the following
things:

 (1) Namespaces.

 (2) A root directory.

 (3) A set of processes, including one designated as the 'init' process.

A container is created and attached to a file descriptor by:

	int cfd = container_create(const char *name, unsigned int flags);

This inherits all the namespaces of the parent container unless the flags
mask calls for new namespaces:

	CONTAINER_NEW_FS_NS
	CONTAINER_NEW_EMPTY_FS_NS
	CONTAINER_NEW_CGROUP_NS [root only]
	CONTAINER_NEW_UTS_NS
	CONTAINER_NEW_IPC_NS
	CONTAINER_NEW_USER_NS
	CONTAINER_NEW_PID_NS
	CONTAINER_NEW_NET_NS

Other flags include:

	CONTAINER_KILL_ON_CLOSE
	CONTAINER_FD_CLOEXEC

Note that I've added a pointer to the current container to task_struct.
This doesn't make the nsproxy pointer redundant as you can still make new
namespaces with clone().

I've also added a list_head to task_struct to form a list in the container
of its member processes.  This is convenient, but redundant since the code
could iterate over all the tasks looking for ones that have a matching
task->container.

It might make sense to use fsconfig() to configure the container:

	fsconfig(cfd, FSCONFIG_SET_NAMESPACE, "user", NULL, userns_fd);
	fsconfig(cfd, FSCONFIG_SET_NAMESPACE, "mnt", NULL, mntns_fd);
	fsconfig(cfd, FSCONFIG_SET_FD, "rootfs", NULL, root_fd);
	fsconfig(cfd, FSCONFIG_CMD_CREATE_CONTAINER, NULL, NULL, 0);

Nacked-by: "Eric W. Biederman"

==================
FUTURE DEVELOPMENT
==================

 (1) Setting up the container.

     A container would be created with, say:

	int cfd = container_create("fred", CONTAINER_NEW_EMPTY_FS_NS);

     Once created, it should then be possible for the supervising process
     to modify the new container.

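As a rough userspace sketch of the creation call above, assuming a kernel
with this series applied: the flag values and the two argument checks are
copied from include/uapi/linux/container.h and sys_container_create() in
this patch, and 344 is the x86_64 number this patch adds to syscall_64.tbl
(a kernel without the series would be expected to return ENOSYS).  The
names container_flags_valid(), container_create() as a C wrapper and
NR_CONTAINER_CREATE_RFC are hypothetical; no libc provides them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Flag values copied from include/uapi/linux/container.h in this patch. */
#define CONTAINER_NEW_FS_NS		0x00000001
#define CONTAINER_NEW_EMPTY_FS_NS	0x00000002
#define CONTAINER_NEW_CGROUP_NS		0x00000004
#define CONTAINER_NEW_UTS_NS		0x00000008
#define CONTAINER_NEW_IPC_NS		0x00000010
#define CONTAINER_NEW_USER_NS		0x00000020
#define CONTAINER_NEW_PID_NS		0x00000040
#define CONTAINER_NEW_NET_NS		0x00000080
#define CONTAINER_KILL_ON_CLOSE		0x00000100
#define CONTAINER_FD_CLOEXEC		0x00000200
#define CONTAINER__FLAG_MASK		0x000003ff

/* Assumption: x86_64 number from this RFC's syscall_64.tbl, not mainline. */
#define NR_CONTAINER_CREATE_RFC		344

/* The mask should cover exactly the ten flag bits defined above. */
static_assert((CONTAINER_FD_CLOEXEC | (CONTAINER_FD_CLOEXEC - 1)) ==
	      CONTAINER__FLAG_MASK, "flag mask does not match flag bits");

/*
 * Hypothetical helper mirroring the flag checks at the top of
 * sys_container_create(): unknown bits are rejected, and the two fs
 * namespace flags are mutually exclusive.  Returns 0 or -EINVAL.
 */
int container_flags_valid(unsigned int flags)
{
	const unsigned int both_fs =
		CONTAINER_NEW_FS_NS | CONTAINER_NEW_EMPTY_FS_NS;

	if (flags & ~CONTAINER__FLAG_MASK)
		return -EINVAL;
	if ((flags & both_fs) == both_fs)
		return -EINVAL;
	return 0;
}

/*
 * Hypothetical wrapper: returns a container fd, or -1 with errno set.
 * The three spare arguments must be zero, as the syscall requires.
 */
int container_create(const char *name, unsigned int flags)
{
	if (!name || container_flags_valid(flags) < 0) {
		errno = EINVAL;
		return -1;
	}
	return (int)syscall(NR_CONTAINER_CREATE_RFC, name, flags,
			    0UL, 0UL, 0UL);
}
```

With this in place, the example above would read
container_create("fred", CONTAINER_NEW_EMPTY_FS_NS), and obviously bad
flag combinations are caught before entering the kernel.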
     Mounts can be created inside of the container's namespaces:

	fsfd = fsopen("ext4", 0);
	fsconfig(fsfd, FSCONFIG_SET_CONTAINER, NULL, NULL, cfd);
	fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/sda3", 0);
	fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
	fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
	mfd = fsmount(fsfd, 0, 0);

     and then mounted into the namespace:

	move_mount(mfd, "", cfd, "/",
		   MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_CONTAINER_ROOT);

     Further mounts can be added by:

	move_mount(mfd, "", cfd, "proc", MOVE_MOUNT_F_EMPTY_PATH);

     Files and devices can be created by supplying the container fd as the
     dirfd argument:

	mkdirat(int cfd, const char *path, mode_t mode);
	mknodat(int cfd, const char *path, mode_t mode, dev_t dev);
	int fd = openat(int cfd, const char *path,
			unsigned int flags, mode_t mode);

     [*] Note that when using cfd as dirfd, the path must not contain a '/'
	 at the front.

     Sockets, such as netlink, can be opened inside of the container's
     namespaces:

	int fd = container_socket(int cfd,
				  int domain, int type, int protocol);

     This should allow management of the container's network namespace from
     outside.

 (2) Starting the container.

     Once all modifications are complete, the container's 'init' process
     can be started by:

	fork_into_container(int cfd);

     This precludes further external modification of the mount tree within
     the container.  Before this point, the container is simply destroyed
     if the container fd is closed.

 (3) Waiting for the container to complete.

     The container fd can then be polled to wait for init process therein
     to complete and the exit code collected by:

	container_wait(int container_fd, int *_wstatus, unsigned int wait,
		       struct rusage *rusage);

     The container and everything in it can be terminated or killed off:

	container_kill(int container_fd, int initonly, int signal);

     If 'init' dies, all other processes in the container are preemptively
     SIGKILL'd by the kernel.

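The polling side of (3) can be sketched with plain poll(2).  This is only
an illustration under the assumption that the container fd behaves as
container_poll() in this patch describes, reporting POLLHUP once 'init' has
died.  wait_for_container_exit() and demo_on_pipe() are hypothetical names;
since container fds only exist with this series applied, the demo stands in
a pipe whose write end has been closed, which also reports POLLHUP.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until the fd reports POLLHUP.
 * Returns 1 on hang-up, 0 on timeout, -1 on error with errno set.
 * Note that an EINTR restart reuses the full timeout.
 */
int wait_for_container_exit(int cfd, int timeout_ms)
{
	struct pollfd pfd = { .fd = cfd, .events = 0 };
	int n;

	do {
		n = poll(&pfd, 1, timeout_ms);
	} while (n < 0 && errno == EINTR);

	if (n < 0)
		return -1;
	if (n == 0)
		return 0;		/* timed out, nothing reported */
	if (pfd.revents & POLLNVAL) {
		errno = EBADF;		/* fd was not open */
		return -1;
	}
	return (pfd.revents & POLLHUP) ? 1 : 0;
}

/*
 * Stand-in demonstration: the read end of a pipe with no writers left
 * reports POLLHUP, much as a container fd would after 'init' exits.
 */
int demo_on_pipe(void)
{
	int p[2], ret;

	if (pipe(p) < 0)
		return -1;
	close(p[1]);
	ret = wait_for_container_exit(p[0], 1000);
	close(p[0]);
	return ret;
}
```

A supervisor would call wait_for_container_exit(cfd, -1) to block
indefinitely and then collect the exit code with the proposed
container_wait().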
     By default, if the container is active and its fd is closed, the
     container is left running and will be cleaned up when its 'init'
     exits.  The default can be changed with the CONTAINER_KILL_ON_CLOSE
     flag.

 (4) Supervising the container.

     Given that we have an fd attached to the container, we could make it
     such that the supervising process could monitor and override EPERM
     returns for mount and other privileged operations within the
     container.

 (5) Per-container keyring.

     Each container can point to a per-container keyring for the holding of
     integrity keys and filesystem keys for use inside the container.  This
     would be attached:

	keyctl(KEYCTL_SET_CONTAINER_KEYRING, cfd, keyring)

     This keyring would be searched by request_key() after it has searched
     the thread, process and session keyrings.

 (6) Running different LSM policies by container.

     This might particularly make sense with something like AppArmor, where
     the path-based rules required inside a container might differ from
     those required in the parent.

Signed-off-by: David Howells
---

 arch/x86/entry/syscalls/syscall_32.tbl |    1 
 arch/x86/entry/syscalls/syscall_64.tbl |    1 
 fs/namespace.c                         |    5 
 include/linux/container.h              |   86 ++++++++
 include/linux/init_task.h              |    1 
 include/linux/lsm_hooks.h              |   20 ++
 include/linux/sched.h                  |    3 
 include/linux/security.h               |   15 +
 include/linux/syscalls.h               |    3 
 include/uapi/linux/container.h         |   28 +++
 init/Kconfig                           |    7 +
 init/init_task.c                       |    3 
 kernel/Makefile                        |    2 
 kernel/container.c                     |  348 ++++++++++++++++++++++++++++++++
 kernel/exit.c                          |    1 
 kernel/fork.c                          |    7 +
 kernel/namespaces.h                    |   15 +
 kernel/nsproxy.c                       |   23 +-
 kernel/sys_ni.c                        |    3 
 security/security.c                    |   12 +
 20 files changed, 571 insertions(+), 13 deletions(-)
 create mode 100644 include/linux/container.h
 create mode 100644 include/uapi/linux/container.h
 create mode 100644 kernel/container.c
 create mode 100644 kernel/namespaces.h

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index c9db9d51a7df..3564814a5d21 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -407,3 +407,4 @@
 393	i386	fsinfo			sys_fsinfo			__ia32_sys_fsinfo
 394	i386	mount_notify		sys_mount_notify		__ia32_sys_mount_notify
 395	i386	sb_notify		sys_sb_notify			__ia32_sys_sb_notify
+396	i386	container_create	sys_container_create		__ia32_sys_container_create
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 17869bf7788a..aa6cccbe5271 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -352,6 +352,7 @@
 341	common	fsinfo			__x64_sys_fsinfo
 342	common	mount_notify		__x64_sys_mount_notify
 343	common	sb_notify		__x64_sys_sb_notify
+344	common	container_create	__x64_sys_container_create

 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/fs/namespace.c b/fs/namespace.c
index f378cfc63043..ea005f55ec4c 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include

 #include "pnode.h"
 #include "internal.h"
@@ -3742,6 +3743,10 @@ static void __init init_mount_tree(void)

 	set_fs_pwd(current->fs, &root);
 	set_fs_root(current->fs, &root);
+#ifdef CONFIG_CONTAINERS
+	path_get(&root);
+	init_container.root = root;
+#endif
 }

 void __init mnt_init(void)
diff --git a/include/linux/container.h b/include/linux/container.h
new file mode 100644
index 000000000000..0a8918435097
--- /dev/null
+++ b/include/linux/container.h
@@ -0,0 +1,86 @@
+/* Container objects
+ *
+ * Copyright (C) 2017 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#ifndef _LINUX_CONTAINER_H
+#define _LINUX_CONTAINER_H
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+struct fs_struct;
+struct nsproxy;
+struct task_struct;
+
+/*
+ * The container object.
+ */
+struct container {
+	char			name[24];
+	u64			id;		/* Container ID */
+	refcount_t		usage;
+	int			exit_code;	/* The exit code of 'init' */
+	const struct cred	*cred;		/* Creds for this container, including userns */
+	struct nsproxy		*ns;		/* This container's namespaces */
+	struct path		root;		/* The root of the container's fs namespace */
+	struct task_struct	*init;		/* The 'init' task for this container */
+	struct container	*parent;	/* Parent of this container. */
+	void			*security;	/* LSM data */
+	struct list_head	members;	/* Member processes, guarded with ->lock */
+	struct list_head	child_link;	/* Link in parent->children */
+	struct list_head	children;	/* Child containers */
+	wait_queue_head_t	waitq;		/* Someone waiting for init to exit waits here */
+	unsigned long		flags;
+#define CONTAINER_FLAG_INIT_STARTED	0	/* Init is started - certain ops now prohibited */
+#define CONTAINER_FLAG_DEAD		1	/* Init has died */
+#define CONTAINER_FLAG_KILL_ON_CLOSE	2	/* Kill init if container handle closed */
+	spinlock_t		lock;
+	seqcount_t		seq;		/* Track changes in ->root */
+};
+
+extern struct container init_container;
+
+#ifdef CONFIG_CONTAINERS
+extern const struct file_operations container_fops;
+
+extern int copy_container(unsigned long flags, struct task_struct *tsk,
+			  struct container *container);
+extern void exit_container(struct task_struct *tsk);
+extern void put_container(struct container *c);
+
+static inline struct container *get_container(struct container *c)
+{
+	refcount_inc(&c->usage);
+	return c;
+}
+
+static inline bool is_container_file(struct file *file)
+{
+	return file->f_op == &container_fops;
+}
+
+#else
+
+static inline int copy_container(unsigned long flags, struct task_struct *tsk,
+				 struct container *container)
+{ return 0; }
+static inline void exit_container(struct task_struct *tsk) { }
+static inline void put_container(struct container *c) {}
+static inline struct container *get_container(struct container *c) { return NULL; }
+static inline bool is_container_file(struct file *file) { return false; }
+
+#endif /* CONFIG_CONTAINERS */
+
+#endif /* _LINUX_CONTAINER_H */
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index a7083a45a26c..f016cadece24 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 52d0f3f4c786..0f310d911815 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1460,6 +1460,16 @@
  * @bpf_prog_free_security:
  *	Clean up the security information stored inside bpf prog.
  *
+ * Security hooks for containers:
+ *
+ * @container_alloc:
+ *	Permit creation of a new container and assign security data.
+ *	@container: The new container.
+ *
+ * @container_free:
+ *	Free security data attached to a container.
+ *	@container: The container.
+ *
+ */
 union security_list_options {
 	int (*binder_set_context_mgr)(struct task_struct *mgr);
@@ -1825,6 +1835,12 @@ union security_list_options {
 	int (*bpf_prog_alloc_security)(struct bpf_prog_aux *aux);
 	void (*bpf_prog_free_security)(struct bpf_prog_aux *aux);
 #endif /* CONFIG_BPF_SYSCALL */
+
+	/* Container management security hooks */
+#ifdef CONFIG_CONTAINERS
+	int (*container_alloc)(struct container *container, unsigned int flags);
+	void (*container_free)(struct container *container);
+#endif
 };

 struct security_hook_heads {
@@ -2069,6 +2085,10 @@ struct security_hook_heads {
 	struct hlist_head bpf_prog_alloc_security;
 	struct hlist_head bpf_prog_free_security;
 #endif /* CONFIG_BPF_SYSCALL */
+#ifdef CONFIG_CONTAINERS
+	struct hlist_head container_alloc;
+	struct hlist_head container_free;
+#endif /* CONFIG_CONTAINERS */
 } __randomize_layout;

 /*
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d2f90fa92468..073a3a930514 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -36,6 +36,7 @@ struct backing_dev_info;
 struct bio_list;
 struct blk_plug;
 struct cfs_rq;
+struct container;
 struct fs_struct;
 struct futex_pi_state;
 struct io_context;
@@ -870,6 +871,8 @@ struct task_struct {

 	/* Namespaces: */
 	struct nsproxy			*nsproxy;
+	struct container		*container;
+	struct list_head		container_link;

 	/* Signal handlers: */
 	struct signal_struct		*signal;
diff --git a/include/linux/security.h b/include/linux/security.h
index da538c06766f..acd0c14c6e95 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -70,6 +70,7 @@ struct ctl_table;
 struct audit_krule;
 struct user_namespace;
 struct timezone;
+struct container;

 enum lsm_event {
 	LSM_POLICY_CHANGE,
@@ -1751,6 +1752,20 @@ static inline void security_audit_rule_free(void *lsmrule)
 #endif /* CONFIG_SECURITY */
 #endif /* CONFIG_AUDIT */

+#ifdef CONFIG_CONTAINERS
+#ifdef CONFIG_SECURITY
+int security_container_alloc(struct container *container, unsigned int flags);
+void security_container_free(struct container *container);
+#else
+static inline int security_container_alloc(struct container *container,
+					   unsigned int flags)
+{
+	return 0;
+}
+static inline void security_container_free(struct container *container) {}
+#endif
+#endif /* CONFIG_CONTAINERS */
+
 #ifdef CONFIG_SECURITYFS

 extern struct dentry *securityfs_create_file(const char *name, umode_t mode,
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 10127b1d923b..dac42098c2dd 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -943,6 +943,9 @@ asmlinkage long sys_mount_notify(int dfd, const char __user *path,
 				 unsigned int at_flags, int watch_fd, int watch_id);
 asmlinkage long sys_sb_notify(int dfd, const char __user *path,
 			      unsigned int at_flags, int watch_fd, int watch_id);
+asmlinkage long sys_container_create(const char __user *name, unsigned int flags,
+				     unsigned long spare3, unsigned long spare4,
+				     unsigned long spare5);

 /*
  * Architecture-specific system calls
diff --git a/include/uapi/linux/container.h b/include/uapi/linux/container.h
new file mode 100644
index 000000000000..43748099b28d
--- /dev/null
+++ b/include/uapi/linux/container.h
@@ -0,0 +1,28 @@
+/* Container UAPI
+ *
+ * Copyright (C) 2017 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#ifndef _UAPI_LINUX_CONTAINER_H
+#define _UAPI_LINUX_CONTAINER_H
+
+
+#define CONTAINER_NEW_FS_NS		0x00000001 /* Dup current fs namespace */
+#define CONTAINER_NEW_EMPTY_FS_NS	0x00000002 /* Provide new empty fs namespace */
+#define CONTAINER_NEW_CGROUP_NS		0x00000004 /* Dup current cgroup namespace */
+#define CONTAINER_NEW_UTS_NS		0x00000008 /* Dup current uts namespace */
+#define CONTAINER_NEW_IPC_NS		0x00000010 /* Dup current ipc namespace */
+#define CONTAINER_NEW_USER_NS		0x00000020 /* Dup current user namespace */
+#define CONTAINER_NEW_PID_NS		0x00000040 /* Dup current pid namespace */
+#define CONTAINER_NEW_NET_NS		0x00000080 /* Dup current net namespace */
+#define CONTAINER_KILL_ON_CLOSE		0x00000100 /* Kill all member processes when fd closed */
+#define CONTAINER_FD_CLOEXEC		0x00000200 /* Close the fd on exec */
+#define CONTAINER__FLAG_MASK		0x000003ff
+
+#endif /* _UAPI_LINUX_CONTAINER_H */
diff --git a/init/Kconfig b/init/Kconfig
index 5984dd7f2156..ab37c3a55aa1 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -992,6 +992,13 @@ config NET_NS
 	  Allow user space to create what appear to be multiple instances
 	  of the network stack.

+config CONTAINERS
+	bool "Container support"
+	default y
+	help
+	  Allow userspace to create and manipulate containers as objects that
+	  have namespaces and hold a set of processes.
+
 endif # NAMESPACES

 config CHECKPOINT_RESTORE
diff --git a/init/init_task.c b/init/init_task.c
index 5aebe3be4d7c..90c7439a195b 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -108,6 +108,9 @@ struct task_struct init_task
 	.signal		= &init_signals,
 	.sighand	= &init_sighand,
 	.nsproxy	= &init_nsproxy,
+	.container	= &init_container,
+	.container_link.next = &init_container.members,
+	.container_link.prev = &init_container.members,
 	.pending	= {
 		.list = LIST_HEAD_INIT(init_task.pending.list),
 		.signal = {{0}}
diff --git a/kernel/Makefile b/kernel/Makefile
index 6aa7543bcdb2..98cdd18cecef 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -8,7 +8,7 @@ obj-y     = fork.o exec_domain.o panic.o \
 	    sysctl.o sysctl_binary.o capability.o ptrace.o user.o \
 	    signal.o sys.o umh.o workqueue.o pid.o task_work.o \
 	    extable.o params.o \
-	    kthread.o sys_ni.o nsproxy.o \
+	    kthread.o sys_ni.o nsproxy.o container.o \
 	    notifier.o ksysfs.o cred.o reboot.o \
 	    async.o range.o smpboot.o ucount.o

diff --git a/kernel/container.c b/kernel/container.c
new file mode 100644
index 000000000000..ca4012632cfa
--- /dev/null
+++ b/kernel/container.c
@@ -0,0 +1,348 @@
+/* Implement container objects.
+ *
+ * Copyright (C) 2018 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "namespaces.h"
+
+struct container init_container = {
+	.name		= ".init",
+	.id		= 1,
+	.usage		= REFCOUNT_INIT(2),
+	.cred		= &init_cred,
+	.ns		= &init_nsproxy,
+	.init		= &init_task,
+	.members.next	= &init_task.container_link,
+	.members.prev	= &init_task.container_link,
+	.children	= LIST_HEAD_INIT(init_container.children),
+	.flags		= (1 << CONTAINER_FLAG_INIT_STARTED),
+	.lock		= __SPIN_LOCK_UNLOCKED(init_container.lock),
+	.seq		= SEQCNT_ZERO(init_fs.seq),
+};
+
+#ifdef CONFIG_CONTAINERS
+
+static atomic64_t container_id_counter = ATOMIC_INIT(1);
+
+/*
+ * Drop a ref on a container and clear it if no longer in use.
+ */
+void put_container(struct container *c)
+{
+	struct container *parent;
+
+	while (c && refcount_dec_and_test(&c->usage)) {
+		BUG_ON(!list_empty(&c->members));
+		if (c->ns)
+			put_nsproxy(c->ns);
+		path_put(&c->root);
+
+		parent = c->parent;
+		if (parent) {
+			spin_lock(&parent->lock);
+			list_del(&c->child_link);
+			spin_unlock(&parent->lock);
+		}
+
+		if (c->cred)
+			put_cred(c->cred);
+		security_container_free(c);
+		kfree(c);
+		c = parent;
+	}
+}
+
+/*
+ * Allow the user to poll for the container dying.
+ */
+static unsigned int container_poll(struct file *file, poll_table *wait)
+{
+	struct container *container = file->private_data;
+	unsigned int mask = 0;
+
+	poll_wait(file, &container->waitq, wait);
+
+	if (test_bit(CONTAINER_FLAG_DEAD, &container->flags))
+		mask |= POLLHUP;
+
+	return mask;
+}
+
+static int container_release(struct inode *inode, struct file *file)
+{
+	struct container *container = file->private_data;
+
+	put_container(container);
+	return 0;
+}
+
+const struct file_operations container_fops = {
+	.poll		= container_poll,
+	.release	= container_release,
+};
+
+/*
+ * Handle fork/clone.
+ *
+ * A process inherits its parent's container.  The first process into the
+ * container is its 'init' process and the life of everything else in there is
+ * dependent upon that.
+ */
+int copy_container(unsigned long flags, struct task_struct *tsk,
+		   struct container *container)
+{
+	struct container *c = container ?: tsk->container;
+	int ret = -ECANCELED;
+
+	spin_lock(&c->lock);
+
+	if (!test_bit(CONTAINER_FLAG_DEAD, &c->flags)) {
+		list_add_tail(&tsk->container_link, &c->members);
+		get_container(c);
+		tsk->container = c;
+		if (!c->init) {
+			set_bit(CONTAINER_FLAG_INIT_STARTED, &c->flags);
+			c->init = tsk;
+		}
+		ret = 0;
+	}
+
+	spin_unlock(&c->lock);
+	return ret;
+}
+
+/*
+ * Remove a dead process from a container.
+ *
+ * If the 'init' process in a container dies, we kill off all the other
+ * processes in the container.
+ */
+void exit_container(struct task_struct *tsk)
+{
+	struct task_struct *p;
+	struct container *c = tsk->container;
+	struct kernel_siginfo si = {
+		.si_signo = SIGKILL,
+		.si_code  = SI_KERNEL,
+	};
+
+	spin_lock(&c->lock);
+
+	list_del(&tsk->container_link);
+
+	if (c->init == tsk) {
+		c->init = NULL;
+		c->exit_code = tsk->exit_code;
+		smp_wmb(); /* Order exit_code vs CONTAINER_DEAD. */
+		set_bit(CONTAINER_FLAG_DEAD, &c->flags);
+		wake_up_bit(&c->flags, CONTAINER_FLAG_DEAD);
+
+		list_for_each_entry(p, &c->members, container_link) {
+			si.si_pid = task_tgid_vnr(p);
+			send_sig_info(SIGKILL, &si, p);
+		}
+	}
+
+	spin_unlock(&c->lock);
+	put_container(c);
+}
+
+/*
+ * Allocate a container.
+ */
+static struct container *alloc_container(const char __user *name)
+{
+	struct container *c;
+	long len;
+	int ret;
+
+	c = kzalloc(sizeof(struct container), GFP_KERNEL);
+	if (!c)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&c->members);
+	INIT_LIST_HEAD(&c->children);
+	init_waitqueue_head(&c->waitq);
+	spin_lock_init(&c->lock);
+	refcount_set(&c->usage, 1);
+
+	ret = -EFAULT;
+	len = strncpy_from_user(c->name, name, sizeof(c->name));
+	if (len < 0)
+		goto err;
+	ret = -ENAMETOOLONG;
+	if (len >= sizeof(c->name))
+		goto err;
+	ret = -EINVAL;
+	if (strchr(c->name, '/'))
+		goto err;
+
+	c->name[len] = 0;
+	return c;
+
+err:
+	kfree(c);
+	return ERR_PTR(ret);
+}
+
+/*
+ * Create some creds for the container.  We don't want to pin things we don't
+ * have to, so drop all keyrings from the new cred.  The LSM gets to audit the
+ * cred struct when security_container_alloc() is invoked.
+ */
+static const struct cred *create_container_creds(unsigned int flags)
+{
+	struct cred *new;
+	int ret;
+
+	new = prepare_creds();
+	if (!new)
+		return ERR_PTR(-ENOMEM);
+
+#ifdef CONFIG_KEYS
+	key_put(new->thread_keyring);
+	new->thread_keyring = NULL;
+	key_put(new->process_keyring);
+	new->process_keyring = NULL;
+	key_put(new->session_keyring);
+	new->session_keyring = NULL;
+	key_put(new->request_key_auth);
+	new->request_key_auth = NULL;
+#endif
+
+	if (flags & CONTAINER_NEW_USER_NS) {
+		ret = create_user_ns(new);
+		if (ret < 0)
+			goto err;
+		new->euid = new->user_ns->owner;
+		new->egid = new->user_ns->group;
+	}
+
+	new->fsuid = new->suid = new->uid = new->euid;
+	new->fsgid = new->sgid = new->gid = new->egid;
+	return new;
+
+err:
+	abort_creds(new);
+	return ERR_PTR(ret);
+}
+
+/*
+ * Create a new container.
+ */
+static struct container *create_container(const char __user *name, unsigned int flags)
+{
+	struct container *parent, *c;
+	struct fs_struct *fs;
+	struct nsproxy *ns;
+	const struct cred *cred;
+	int ret;
+
+	c = alloc_container(name);
+	if (IS_ERR(c))
+		return c;
+
+	if (flags & CONTAINER_KILL_ON_CLOSE)
+		__set_bit(CONTAINER_FLAG_KILL_ON_CLOSE, &c->flags);
+
+	cred = create_container_creds(flags);
+	if (IS_ERR(cred)) {
+		ret = PTR_ERR(cred);
+		goto err_cont;
+	}
+	c->cred = cred;
+
+	ret = -ENOMEM;
+	fs = copy_fs_struct(current->fs);
+	if (!fs)
+		goto err_cont;
+
+	ns = create_new_namespaces(
+		(flags & CONTAINER_NEW_FS_NS	 ? CLONE_NEWNS : 0) |
+		(flags & CONTAINER_NEW_CGROUP_NS ? CLONE_NEWCGROUP : 0) |
+		(flags & CONTAINER_NEW_UTS_NS	 ? CLONE_NEWUTS : 0) |
+		(flags & CONTAINER_NEW_IPC_NS	 ? CLONE_NEWIPC : 0) |
+		(flags & CONTAINER_NEW_PID_NS	 ? CLONE_NEWPID : 0) |
+		(flags & CONTAINER_NEW_NET_NS	 ? CLONE_NEWNET : 0),
+		current->nsproxy, cred->user_ns, fs);
+	if (IS_ERR(ns)) {
+		ret = PTR_ERR(ns);
+		goto err_fs;
+	}
+
+	c->ns = ns;
+	c->root = fs->root;
+	c->seq = fs->seq;
+	fs->root.mnt = NULL;
+	fs->root.dentry = NULL;
+
+	ret = security_container_alloc(c, flags);
+	if (ret < 0)
+		goto err_fs;
+
+	parent = current->container;
+	get_container(parent);
+	c->parent = parent;
+	c->id = atomic64_inc_return(&container_id_counter);
+	spin_lock(&parent->lock);
+	list_add_tail(&c->child_link, &parent->children);
+	spin_unlock(&parent->lock);
+	return c;
+
+err_fs:
+	free_fs_struct(fs);
+err_cont:
+	put_container(c);
+	return ERR_PTR(ret);
+}
+
+/*
+ * Create a new container object.
+ */ +SYSCALL_DEFINE5(container_create, + const char __user *, name, + unsigned int, flags, + unsigned long, spare3, + unsigned long, spare4, + unsigned long, spare5) +{ + struct container *c; + int fd; + + if (!name || + flags & ~CONTAINER__FLAG_MASK || + spare3 != 0 || spare4 != 0 || spare5 != 0) + return -EINVAL; + if ((flags & (CONTAINER_NEW_FS_NS | CONTAINER_NEW_EMPTY_FS_NS)) == + (CONTAINER_NEW_FS_NS | CONTAINER_NEW_EMPTY_FS_NS)) + return -EINVAL; + + c = create_container(name, flags); + if (IS_ERR(c)) + return PTR_ERR(c); + + fd = anon_inode_getfd("container", &container_fops, c, + O_RDWR | (flags & CONTAINER_FD_CLOEXEC ? O_CLOEXEC : 0)); + if (fd < 0) + put_container(c); + return fd; +} + +#endif /* CONFIG_CONTAINERS */ diff --git a/kernel/exit.c b/kernel/exit.c index 284f2fe9a293..78f6065ad799 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -864,6 +864,7 @@ void __noreturn do_exit(long code) if (group_dead) disassociate_ctty(1); exit_task_namespaces(tsk); + exit_container(tsk); exit_task_work(tsk); exit_thread(tsk); exit_umh(tsk); diff --git a/kernel/fork.c b/kernel/fork.c index b69248e6f0e0..009cf7e63894 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1920,9 +1920,12 @@ static __latent_entropy struct task_struct *copy_process( retval = copy_namespaces(clone_flags, p); if (retval) goto bad_fork_cleanup_mm; - retval = copy_io(clone_flags, p); + retval = copy_container(clone_flags, p, NULL); if (retval) goto bad_fork_cleanup_namespaces; + retval = copy_io(clone_flags, p); + if (retval) + goto bad_fork_cleanup_container; retval = copy_thread_tls(clone_flags, stack_start, stack_size, p, tls); if (retval) goto bad_fork_cleanup_io; @@ -2121,6 +2124,8 @@ static __latent_entropy struct task_struct *copy_process( bad_fork_cleanup_io: if (p->io_context) exit_io_context(p); +bad_fork_cleanup_container: + exit_container(p); bad_fork_cleanup_namespaces: exit_task_namespaces(p); bad_fork_cleanup_mm: diff --git a/kernel/namespaces.h b/kernel/namespaces.h new file 
mode 100644 index 000000000000..c44e3cf0e254 --- /dev/null +++ b/kernel/namespaces.h @@ -0,0 +1,15 @@ +/* Local namespaces defs + * + * Copyright (C) 2017 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. + */ + +extern struct nsproxy *create_new_namespaces(unsigned long flags, + struct nsproxy *nsproxy, + struct user_namespace *user_ns, + struct fs_struct *new_fs); diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c index f6c5d330059a..4bb5184b3a80 100644 --- a/kernel/nsproxy.c +++ b/kernel/nsproxy.c @@ -27,6 +27,7 @@ #include #include #include +#include "namespaces.h" static struct kmem_cache *nsproxy_cachep; @@ -61,8 +62,8 @@ static inline struct nsproxy *create_nsproxy(void) * Return the newly created nsproxy. Do not attach this to the task, * leave it to the caller to do proper locking and attach it to task. 
*/ -static struct nsproxy *create_new_namespaces(unsigned long flags, - struct task_struct *tsk, struct user_namespace *user_ns, +struct nsproxy *create_new_namespaces(unsigned long flags, + struct nsproxy *nsproxy, struct user_namespace *user_ns, struct fs_struct *new_fs) { struct nsproxy *new_nsp; @@ -72,39 +73,39 @@ static struct nsproxy *create_new_namespaces(unsigned long flags, if (!new_nsp) return ERR_PTR(-ENOMEM); - new_nsp->mnt_ns = copy_mnt_ns(flags, tsk->nsproxy->mnt_ns, user_ns, new_fs); + new_nsp->mnt_ns = copy_mnt_ns(flags, nsproxy->mnt_ns, user_ns, new_fs); if (IS_ERR(new_nsp->mnt_ns)) { err = PTR_ERR(new_nsp->mnt_ns); goto out_ns; } - new_nsp->uts_ns = copy_utsname(flags, user_ns, tsk->nsproxy->uts_ns); + new_nsp->uts_ns = copy_utsname(flags, user_ns, nsproxy->uts_ns); if (IS_ERR(new_nsp->uts_ns)) { err = PTR_ERR(new_nsp->uts_ns); goto out_uts; } - new_nsp->ipc_ns = copy_ipcs(flags, user_ns, tsk->nsproxy->ipc_ns); + new_nsp->ipc_ns = copy_ipcs(flags, user_ns, nsproxy->ipc_ns); if (IS_ERR(new_nsp->ipc_ns)) { err = PTR_ERR(new_nsp->ipc_ns); goto out_ipc; } new_nsp->pid_ns_for_children = - copy_pid_ns(flags, user_ns, tsk->nsproxy->pid_ns_for_children); + copy_pid_ns(flags, user_ns, nsproxy->pid_ns_for_children); if (IS_ERR(new_nsp->pid_ns_for_children)) { err = PTR_ERR(new_nsp->pid_ns_for_children); goto out_pid; } new_nsp->cgroup_ns = copy_cgroup_ns(flags, user_ns, - tsk->nsproxy->cgroup_ns); + nsproxy->cgroup_ns); if (IS_ERR(new_nsp->cgroup_ns)) { err = PTR_ERR(new_nsp->cgroup_ns); goto out_cgroup; } - new_nsp->net_ns = copy_net_ns(flags, user_ns, tsk->nsproxy->net_ns); + new_nsp->net_ns = copy_net_ns(flags, user_ns, nsproxy->net_ns); if (IS_ERR(new_nsp->net_ns)) { err = PTR_ERR(new_nsp->net_ns); goto out_net; @@ -162,7 +163,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk) (CLONE_NEWIPC | CLONE_SYSVSEM)) return -EINVAL; - new_ns = create_new_namespaces(flags, tsk, user_ns, tsk->fs); + new_ns = create_new_namespaces(flags, 
tsk->nsproxy, user_ns, tsk->fs); if (IS_ERR(new_ns)) return PTR_ERR(new_ns); @@ -203,7 +204,7 @@ int unshare_nsproxy_namespaces(unsigned long unshare_flags, if (!ns_capable(user_ns, CAP_SYS_ADMIN)) return -EPERM; - *new_nsp = create_new_namespaces(unshare_flags, current, user_ns, + *new_nsp = create_new_namespaces(unshare_flags, current->nsproxy, user_ns, new_fs ? new_fs : current->fs); if (IS_ERR(*new_nsp)) { err = PTR_ERR(*new_nsp); @@ -251,7 +252,7 @@ SYSCALL_DEFINE2(setns, int, fd, int, nstype) if (nstype && (ns->ops->type != nstype)) goto out; - new_nsproxy = create_new_namespaces(0, tsk, current_user_ns(), tsk->fs); + new_nsproxy = create_new_namespaces(0, tsk->nsproxy, current_user_ns(), tsk->fs); if (IS_ERR(new_nsproxy)) { err = PTR_ERR(new_nsproxy); goto out; diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index a4e7131b2509..f0455cbb91cf 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -136,6 +136,9 @@ COND_SYSCALL(acct); COND_SYSCALL(capget); COND_SYSCALL(capset); +/* kernel/container.c */ +COND_SYSCALL(container_create); + /* kernel/exec_domain.c */ /* kernel/exit.c */ diff --git a/security/security.c b/security/security.c index b49732c02e21..259be9a1746c 100644 --- a/security/security.c +++ b/security/security.c @@ -1864,3 +1864,15 @@ void security_bpf_prog_free(struct bpf_prog_aux *aux) call_void_hook(bpf_prog_free_security, aux); } #endif /* CONFIG_BPF_SYSCALL */ + +#ifdef CONFIG_CONTAINERS +int security_container_alloc(struct container *container, unsigned int flags) +{ + return call_int_hook(container_alloc, 0, container, flags); +} + +void security_container_free(struct container *container) +{ + call_void_hook(container_free, container); +} +#endif /* CONFIG_CONTAINERS */ From patchwork Fri Feb 15 16:07:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1042992 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J7r541Hz9s7T for ; Sat, 16 Feb 2019 03:07:52 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388366AbfBOQHp (ORCPT ); Fri, 15 Feb 2019 11:07:45 -0500 Received: from mx1.redhat.com ([209.132.183.28]:39694 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2391310AbfBOQHo (ORCPT ); Fri, 15 Feb 2019 11:07:44 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 76075C0ADB53; Fri, 15 Feb 2019 16:07:44 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8BC5E5D70D; Fri, 15 Feb 2019 16:07:42 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903
Subject: [RFC PATCH 03/27] containers: Provide /proc/containers
From: David Howells
To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org
Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org
Date: Fri, 15 Feb 2019 16:07:41 +0000
Message-ID: <155024686175.21651.6141317051029384847.stgit@warthog.procyon.org.uk>
In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Fri, 15 Feb 2019 16:07:44 +0000 (UTC)
Sender: linux-cifs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-cifs@vger.kernel.org

Provide /proc/containers to view the current container and all the
containers created within it:

    # ./foo-container
    NAME                     USE FL OWNER GROUP
                             141 01     0     0
    foo-test                   1 04     0     0

I'm not sure whether this is really desirable, though.

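For illustration only, a userspace reader for this file might look something like the sketch below. It assumes a kernel with this RFC series applied and relies on the row format that container_proc_show() emits in the patch ("%-24s %12llu %3u %02lx %5d %5d": name, id, usage count, flags in hex, owner, group). The struct container_row type, the 64-byte name buffer and both function names are hypothetical and not part of the series.

```c
#include <stdio.h>
#include <string.h>

/*
 * One row of /proc/containers. The field set follows the seq_printf()
 * format in container_proc_show(); the struct itself and the 64-byte
 * name buffer are assumptions made for this sketch.
 */
struct container_row {
	char name[64];
	unsigned long long id;
	unsigned int usage;
	unsigned long flags;	/* the kernel prints this in hex (%02lx) */
	int owner;
	int group;
};

/*
 * Parse one line of /proc/containers.
 * Returns 1 if a row was parsed, 0 for the banner line, -1 otherwise.
 */
int parse_container_line(const char *line, struct container_row *row)
{
	/* Try a full six-field row first, so a container that happens to be
	 * called "NAME" is not mistaken for the banner. */
	memset(row, 0, sizeof(*row));
	if (sscanf(line, "%63s %llu %u %lx %d %d", row->name, &row->id,
		   &row->usage, &row->flags, &row->owner, &row->group) == 6)
		return 1;

	if (strncmp(line, "NAME", 4) == 0)
		return 0;

	/* Accept a row whose NAME column is blank: five fields, empty name. */
	memset(row, 0, sizeof(*row));
	if (sscanf(line, " %llu %u %lx %d %d", &row->id, &row->usage,
		   &row->flags, &row->owner, &row->group) == 5)
		return 1;

	return -1;
}

/* Print every parsed row from 'in' to 'out'; returns the number of rows. */
int dump_containers(FILE *in, FILE *out)
{
	char line[256];
	struct container_row row;
	int rows = 0;

	while (fgets(line, sizeof(line), in)) {
		if (parse_container_line(line, &row) != 1)
			continue;
		fprintf(out, "id=%llu name=\"%s\" usage=%u flags=%#lx uid=%d gid=%d\n",
			row.id, row.name, row.usage, row.flags,
			row.owner, row.group);
		rows++;
	}
	return rows;
}
```

A caller would open the file with fopen("/proc/containers", "r") and pass the stream to dump_containers(). Whitespace-delimited parsing is a simplification: alloc_container() only rejects '/' in a name, so a name containing spaces would not parse this way.
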
Signed-off-by: David Howells --- kernel/container.c | 110 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 110 insertions(+) diff --git a/kernel/container.c b/kernel/container.c index ca4012632cfa..1d2cb1c1e9b1 100644 --- a/kernel/container.c +++ b/kernel/container.c @@ -20,6 +20,7 @@ #include #include #include +#include #include "namespaces.h" struct container init_container = { @@ -69,6 +70,108 @@ void put_container(struct container *c) } } +static void *container_proc_start(struct seq_file *m, loff_t *_pos) +{ + struct container *c = m->private; + struct list_head *p; + loff_t pos = *_pos; + + spin_lock(&c->lock); + + if (pos <= 1) { + *_pos = 1; + return (void *)1UL; /* Banner on first line */ + } + + if (pos == 2) + return m->private; /* Current container on second line */ + + /* Subordinate containers thereafter */ + p = c->children.next; + pos--; + for (pos--; pos > 0 && p != &c->children; pos--) { + p = p->next; + } + + if (p == &c->children) + return NULL; + return container_of(p, struct container, child_link); +} + +static void *container_proc_next(struct seq_file *m, void *v, loff_t *_pos) +{ + struct container *c = m->private, *vc = v; + struct list_head *p; + loff_t pos = *_pos; + + pos++; + *_pos = pos; + if (pos == 2) + return c; /* Current container on second line */ + + if (pos == 3) + p = &c->children; + else + p = &vc->child_link; + p = p->next; + if (p == &c->children) + return NULL; + return container_of(p, struct container, child_link); +} + +static void container_proc_stop(struct seq_file *m, void *v) +{ + struct container *c = m->private; + + spin_unlock(&c->lock); +} + +static int container_proc_show(struct seq_file *m, void *v) +{ + struct user_namespace *uns = current_user_ns(); + struct container *c = v; + const char *name; + + if (v == (void *)1UL) { + seq_puts(m, "NAME ID USE FL OWNER GROUP\n"); + return 0; + } + + name = (c == m->private) ? 
"" : c->name; + seq_printf(m, "%-24s %12llu %3u %02lx %5d %5d\n", + name, c->id, refcount_read(&c->usage), c->flags, + from_kuid_munged(uns, c->cred->uid), + from_kgid_munged(uns, c->cred->gid)); + + return 0; +} + +static const struct seq_operations container_proc_ops = { + .start = container_proc_start, + .next = container_proc_next, + .stop = container_proc_stop, + .show = container_proc_show, +}; + +static int container_proc_open(struct inode *inode, struct file *file) +{ + struct seq_file *m; + int ret = seq_open(file, &container_proc_ops); + + if (ret == 0) { + m = file->private_data; + m->private = current->container; + } + return ret; +} + +static const struct file_operations container_proc_fops = { + .open = container_proc_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + /* * Allow the user to poll for the container dying. */ @@ -345,4 +448,11 @@ SYSCALL_DEFINE5(container_create, return fd; } +static int __init init_container_fs(void) +{ + proc_create("containers", 0, NULL, &container_proc_fops); + return 0; +} +fs_initcall(init_container_fs); + #endif /* CONFIG_CONTAINERS */ From patchwork Fri Feb 15 16:07:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1042993 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J826SBFz9s4Z for ; Sat, 16 Feb 2019 03:08:02 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726277AbfBOQHy (ORCPT ); Fri, 15 Feb 
2019 11:07:54 -0500 Received: from mx1.redhat.com ([209.132.183.28]:43292 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388125AbfBOQHy (ORCPT ); Fri, 15 Feb 2019 11:07:54 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C9913C610C; Fri, 15 Feb 2019 16:07:52 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7605D5D70D; Fri, 15 Feb 2019 16:07:50 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 04/27] containers: Allow a process to be forked into a container From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:07:49 +0000 Message-ID: <155024686966.21651.5963892339360034863.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Fri, 15 Feb 2019 16:07:52 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Allow a single process to be forked directly into a container using a new syscall, 
thereby 'booting' the container: pid_t pid = fork_into_container(int container_fd); This process will be the 'init' process of the container. Further attempts to fork into the container will be rejected. Signed-off-by: David Howells Nacked-by: "Eric W. Biederman" --- arch/x86/entry/syscalls/syscall_32.tbl | 1 arch/x86/entry/syscalls/syscall_64.tbl | 1 arch/x86/ia32/sys_ia32.c | 2 - include/linux/cred.h | 3 + include/linux/nsproxy.h | 7 ++ include/linux/sched/task.h | 3 + include/linux/syscalls.h | 1 kernel/cred.c | 45 +++++++++++++ kernel/fork.c | 110 ++++++++++++++++++++++++++------ kernel/nsproxy.c | 11 +++ kernel/sys_ni.c | 1 11 files changed, 157 insertions(+), 28 deletions(-) diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index 3564814a5d21..8666693510f9 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -408,3 +408,4 @@ 394 i386 mount_notify sys_mount_notify __ia32_sys_mount_notify 395 i386 sb_notify sys_sb_notify __ia32_sys_sb_notify 396 i386 container_create sys_container_create __ia32_sys_container_create +397 i386 fork_into_container sys_fork_into_container __ia32_sys_fork_into_container diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index aa6cccbe5271..d40d4790fcb2 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -353,6 +353,7 @@ 342 common mount_notify __x64_sys_mount_notify 343 common sb_notify __x64_sys_sb_notify 344 common container_create __x64_sys_container_create +345 common fork_into_container __x64_sys_fork_into_container # # x32-specific system call numbers start at 512 to avoid cache impact diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c index a43212036257..080d9e21b697 100644 --- a/arch/x86/ia32/sys_ia32.c +++ b/arch/x86/ia32/sys_ia32.c @@ -238,5 +238,5 @@ COMPAT_SYSCALL_DEFINE5(x86_clone, unsigned long, clone_flags, unsigned long, 
tls_val, int __user *, child_tidptr) { return _do_fork(clone_flags, newsp, 0, parent_tidptr, child_tidptr, - tls_val); + tls_val, NULL); } diff --git a/include/linux/cred.h b/include/linux/cred.h index 4907c9df86b3..357e743d5d4a 100644 --- a/include/linux/cred.h +++ b/include/linux/cred.h @@ -23,6 +23,7 @@ struct cred; struct inode; +struct container; /* * COW Supplementary groups list @@ -155,7 +156,7 @@ struct cred { extern void __put_cred(struct cred *); extern void exit_creds(struct task_struct *); -extern int copy_creds(struct task_struct *, unsigned long); +extern int copy_creds(struct task_struct *, unsigned long, struct container *); extern const struct cred *get_task_cred(struct task_struct *); extern struct cred *cred_alloc_blank(void); extern struct cred *prepare_creds(void); diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h index 2ae1b1a4d84d..81838ae24a92 100644 --- a/include/linux/nsproxy.h +++ b/include/linux/nsproxy.h @@ -11,6 +11,7 @@ struct ipc_namespace; struct pid_namespace; struct cgroup_namespace; struct fs_struct; +struct container; /* * A structure to contain pointers to all per-process @@ -63,9 +64,13 @@ extern struct nsproxy init_nsproxy; * * / * task_unlock(task); * + * 4. Container namespaces are set at container creation and cannot be + * changed. 
+ * */ -int copy_namespaces(unsigned long flags, struct task_struct *tsk); +int copy_namespaces(unsigned long flags, struct task_struct *tsk, + struct container *dest_container); void exit_task_namespaces(struct task_struct *tsk); void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new); void free_nsproxy(struct nsproxy *ns); diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h index 44c6f15800ff..bdff71b0fb66 100644 --- a/include/linux/sched/task.h +++ b/include/linux/sched/task.h @@ -73,7 +73,8 @@ extern void do_group_exit(int); extern void exit_files(struct task_struct *); extern void exit_itimers(struct signal_struct *); -extern long _do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *, unsigned long); +extern long _do_fork(unsigned long, unsigned long, unsigned long, int __user *, + int __user *, unsigned long, struct container *); extern long do_fork(unsigned long, unsigned long, unsigned long, int __user *, int __user *); struct task_struct *fork_idle(int); extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags); diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index dac42098c2dd..15e5cc704df3 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -946,6 +946,7 @@ asmlinkage long sys_sb_notify(int dfd, const char __user *path, asmlinkage long sys_container_create(const char __user *name, unsigned int flags, unsigned long spare3, unsigned long spare4, unsigned long spare5); +asmlinkage long sys_fork_into_container(int containerfd); /* * Architecture-specific system calls diff --git a/kernel/cred.c b/kernel/cred.c index 21f4a97085b4..f0ee5cec533d 100644 --- a/kernel/cred.c +++ b/kernel/cred.c @@ -313,6 +313,43 @@ struct cred *prepare_exec_creds(void) return new; } +/* + * Handle forking a process into a container. 
+ */ +static struct cred *copy_container_creds(struct container *dest_container) +{ + struct cred *new; + + validate_process_creds(); + + new = kmem_cache_alloc(cred_jar, GFP_KERNEL); + if (!new) + return NULL; + + kdebug("prepare_creds() alloc %p", new); + + memcpy(new, dest_container->cred, sizeof(struct cred)); + + atomic_set(&new->usage, 1); + set_cred_subscribers(new, 0); + get_group_info(new->group_info); + get_uid(new->user); + get_user_ns(new->user_ns); + +#ifdef CONFIG_SECURITY + new->security = NULL; +#endif + + if (security_prepare_creds(new, dest_container->cred, GFP_KERNEL) < 0) + goto error; + validate_creds(new); + return new; + +error: + abort_creds(new); + return NULL; +} + /* * Copy credentials for the new process created by fork() * @@ -322,7 +359,8 @@ struct cred *prepare_exec_creds(void) * The new process gets the current process's subjective credentials as its * objective and subjective credentials */ -int copy_creds(struct task_struct *p, unsigned long clone_flags) +int copy_creds(struct task_struct *p, unsigned long clone_flags, + struct container *dest_container) { struct cred *new; int ret; @@ -343,7 +381,10 @@ int copy_creds(struct task_struct *p, unsigned long clone_flags) return 0; } - new = prepare_creds(); + if (dest_container) + new = copy_container_creds(dest_container); + else + new = prepare_creds(); if (!new) return -ENOMEM; diff --git a/kernel/fork.c b/kernel/fork.c index 009cf7e63894..71401deb4434 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1385,9 +1385,33 @@ static int copy_mm(unsigned long clone_flags, struct task_struct *tsk) return retval; } -static int copy_fs(unsigned long clone_flags, struct task_struct *tsk) +static int copy_fs(unsigned long clone_flags, struct task_struct *tsk, + struct container *dest_container) { struct fs_struct *fs = current->fs; + +#ifdef CONFIG_CONTAINERS + if (dest_container) { + fs = kmem_cache_alloc(fs_cachep, GFP_KERNEL); + if (!fs) + return -ENOMEM; + + fs->users = 1; + fs->in_exec = 
0; + spin_lock_init(&fs->lock); + seqcount_init(&fs->seq); + fs->umask = 0022; + + spin_lock(&dest_container->lock); + fs->pwd = fs->root = dest_container->root; + path_get(&fs->root); + path_get(&fs->pwd); + spin_unlock(&dest_container->lock); + tsk->fs = fs; + return 0; + } +#endif + if (clone_flags & CLONE_FS) { /* tsk->fs is already what we want */ spin_lock(&fs->lock); @@ -1679,7 +1703,8 @@ static __latent_entropy struct task_struct *copy_process( struct pid *pid, int trace, unsigned long tls, - int node) + int node, + struct container *dest_container) { int retval; struct task_struct *p; @@ -1783,7 +1808,7 @@ static __latent_entropy struct task_struct *copy_process( } current->flags &= ~PF_NPROC_EXCEEDED; - retval = copy_creds(p, clone_flags); + retval = copy_creds(p, clone_flags, dest_container); if (retval < 0) goto bad_fork_free; @@ -1905,7 +1930,7 @@ static __latent_entropy struct task_struct *copy_process( retval = copy_files(clone_flags, p); if (retval) goto bad_fork_cleanup_semundo; - retval = copy_fs(clone_flags, p); + retval = copy_fs(clone_flags, p, dest_container); if (retval) goto bad_fork_cleanup_files; retval = copy_sighand(clone_flags, p); @@ -1917,15 +1942,15 @@ static __latent_entropy struct task_struct *copy_process( retval = copy_mm(clone_flags, p); if (retval) goto bad_fork_cleanup_signal; - retval = copy_namespaces(clone_flags, p); + retval = copy_container(clone_flags, p, dest_container); if (retval) goto bad_fork_cleanup_mm; - retval = copy_container(clone_flags, p, NULL); + retval = copy_namespaces(clone_flags, p, dest_container); if (retval) - goto bad_fork_cleanup_namespaces; + goto bad_fork_cleanup_container; retval = copy_io(clone_flags, p); if (retval) - goto bad_fork_cleanup_container; + goto bad_fork_cleanup_namespaces; retval = copy_thread_tls(clone_flags, stack_start, stack_size, p, tls); if (retval) goto bad_fork_cleanup_io; @@ -2124,10 +2149,10 @@ static __latent_entropy struct task_struct *copy_process( bad_fork_cleanup_io: 
if (p->io_context) exit_io_context(p); -bad_fork_cleanup_container: - exit_container(p); bad_fork_cleanup_namespaces: exit_task_namespaces(p); +bad_fork_cleanup_container: + exit_container(p); bad_fork_cleanup_mm: if (p->mm) mmput(p->mm); @@ -2183,7 +2208,7 @@ struct task_struct *fork_idle(int cpu) { struct task_struct *task; task = copy_process(CLONE_VM, 0, 0, NULL, &init_struct_pid, 0, 0, - cpu_to_node(cpu)); + cpu_to_node(cpu), NULL); if (!IS_ERR(task)) { init_idle_pids(task); init_idle(task, cpu); @@ -2195,15 +2220,16 @@ struct task_struct *fork_idle(int cpu) /* * Ok, this is the main fork-routine. * - * It copies the process, and if successful kick-starts - * it and waits for it to finish using the VM if required. + * It copies the process into the specified container, and if successful + * kick-starts it and waits for it to finish using the VM if required. */ long _do_fork(unsigned long clone_flags, unsigned long stack_start, unsigned long stack_size, int __user *parent_tidptr, int __user *child_tidptr, - unsigned long tls) + unsigned long tls, + struct container *dest_container) { struct completion vfork; struct pid *pid; @@ -2229,8 +2255,32 @@ long _do_fork(unsigned long clone_flags, trace = 0; } + if (dest_container) { + /* A process spawned into a container doesn't share anything + * with the parent other than namespaces. + */ + if (clone_flags & (CLONE_CHILD_CLEARTID | + CLONE_CHILD_SETTID | + CLONE_FILES | + CLONE_FS | + CLONE_IO | + CLONE_PARENT | + CLONE_PARENT_SETTID | + CLONE_PTRACE | + CLONE_SETTLS | + CLONE_SIGHAND | + CLONE_SYSVSEM | + CLONE_THREAD)) + return -EINVAL; + + /* However, we do have to let kernel threads borrow a VM. 
*/ + if ((clone_flags & CLONE_VM) && current->mm) + return -EINVAL; + } + p = copy_process(clone_flags, stack_start, stack_size, - child_tidptr, NULL, trace, tls, NUMA_NO_NODE); + child_tidptr, NULL, trace, tls, NUMA_NO_NODE, + dest_container); add_latent_entropy(); if (IS_ERR(p)) @@ -2279,7 +2329,7 @@ long do_fork(unsigned long clone_flags, int __user *child_tidptr) { return _do_fork(clone_flags, stack_start, stack_size, - parent_tidptr, child_tidptr, 0); + parent_tidptr, child_tidptr, 0, NULL); } #endif @@ -2289,14 +2339,14 @@ long do_fork(unsigned long clone_flags, pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags) { return _do_fork(flags|CLONE_VM|CLONE_UNTRACED, (unsigned long)fn, - (unsigned long)arg, NULL, NULL, 0); + (unsigned long)arg, NULL, NULL, 0, NULL); } #ifdef __ARCH_WANT_SYS_FORK SYSCALL_DEFINE0(fork) { #ifdef CONFIG_MMU - return _do_fork(SIGCHLD, 0, 0, NULL, NULL, 0); + return _do_fork(SIGCHLD, 0, 0, NULL, NULL, 0, NULL); #else /* can not support in nommu mode */ return -EINVAL; @@ -2308,7 +2358,26 @@ SYSCALL_DEFINE0(fork) SYSCALL_DEFINE0(vfork) { return _do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, 0, - 0, NULL, NULL, 0); + 0, NULL, NULL, 0, NULL); +} +#endif + +#ifdef CONFIG_CONTAINERS +SYSCALL_DEFINE1(fork_into_container, int, containerfd) +{ + struct fd f = fdget(containerfd); + int ret; + + if (!f.file) + return -EBADF; + ret = -EINVAL; + if (is_container_file(f.file)) { + struct container *dest_container = f.file->private_data; + + ret = _do_fork(SIGCHLD, 0, 0, NULL, NULL, 0, dest_container); + } + fdput(f); + return ret; } #endif @@ -2336,7 +2405,8 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp, unsigned long, tls) #endif { - return _do_fork(clone_flags, newsp, 0, parent_tidptr, child_tidptr, tls); + return _do_fork(clone_flags, newsp, 0, parent_tidptr, child_tidptr, tls, + NULL); } #endif diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c index 4bb5184b3a80..4031075300a4 100644 --- 
a/kernel/nsproxy.c +++ b/kernel/nsproxy.c @@ -136,12 +136,19 @@ struct nsproxy *create_new_namespaces(unsigned long flags, * called from clone. This now handles copy for nsproxy and all * namespaces therein. */ -int copy_namespaces(unsigned long flags, struct task_struct *tsk) +int copy_namespaces(unsigned long flags, struct task_struct *tsk, + struct container *dest_container) { struct nsproxy *old_ns = tsk->nsproxy; struct user_namespace *user_ns = task_cred_xxx(tsk, user_ns); struct nsproxy *new_ns; + if (dest_container) { + get_nsproxy(dest_container->ns); + tsk->nsproxy = dest_container->ns; + return 0; + } + if (likely(!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWPID | CLONE_NEWNET | CLONE_NEWCGROUP)))) { @@ -163,7 +170,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk) (CLONE_NEWIPC | CLONE_SYSVSEM)) return -EINVAL; - new_ns = create_new_namespaces(flags, tsk->nsproxy, user_ns, tsk->fs); + new_ns = create_new_namespaces(flags, old_ns, user_ns, tsk->fs); if (IS_ERR(new_ns)) return PTR_ERR(new_ns); diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index f0455cbb91cf..a23ad529d548 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -144,6 +144,7 @@ COND_SYSCALL(container_create); /* kernel/exit.c */ /* kernel/fork.c */ +COND_SYSCALL(fork_into_container); /* kernel/futex.c */ COND_SYSCALL(futex); From patchwork Fri Feb 15 16:07:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1042994 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org 
[209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J8C2p7Mz9s4Z for ; Sat, 16 Feb 2019 03:08:11 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729055AbfBOQID (ORCPT ); Fri, 15 Feb 2019 11:08:03 -0500 Received: from mx1.redhat.com ([209.132.183.28]:58296 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388125AbfBOQIB (ORCPT ); Fri, 15 Feb 2019 11:08:01 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id F1D80BF9DD; Fri, 15 Feb 2019 16:08:00 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id CF843101E841; Fri, 15 Feb 2019 16:07:58 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903
Subject: [RFC PATCH 05/27] containers: Open a socket inside a container
From: David Howells
To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org
Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org
Date: Fri, 15 Feb 2019 16:07:58 +0000
Message-ID: <155024687804.21651.13220990774688382294.stgit@warthog.procyon.org.uk>
In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 15 Feb 2019 16:08:01 +0000 (UTC)
Sender: linux-cifs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-cifs@vger.kernel.org

Provide a system call to open a socket inside of a container, using that
container's network namespace. This allows netlink to be used to manage
the container.

    fd = container_socket(int container_fd, int domain, int type, int protocol);

Signed-off-by: David Howells
Nacked-by: "Eric W.
Biederman" --- arch/x86/entry/syscalls/syscall_32.tbl | 1 + arch/x86/entry/syscalls/syscall_64.tbl | 1 + include/linux/socket.h | 3 ++- include/linux/syscalls.h | 2 ++ kernel/sys_ni.c | 1 + net/compat.c | 2 +- net/socket.c | 34 +++++++++++++++++++++++++++----- 7 files changed, 37 insertions(+), 7 deletions(-) diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index 8666693510f9..f4c9beff77a6 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -409,3 +409,4 @@ 395 i386 sb_notify sys_sb_notify __ia32_sys_sb_notify 396 i386 container_create sys_container_create __ia32_sys_container_create 397 i386 fork_into_container sys_fork_into_container __ia32_sys_fork_into_container +398 i386 container_socket sys_container_socket __ia32_sys_container_socket diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl index d40d4790fcb2..e20cdf7b5527 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -354,6 +354,7 @@ 343 common sb_notify __x64_sys_sb_notify 344 common container_create __x64_sys_container_create 345 common fork_into_container __x64_sys_fork_into_container +346 common container_socket __x64_sys_container_socket # # x32-specific system call numbers start at 512 to avoid cache impact diff --git a/include/linux/socket.h b/include/linux/socket.h index ab2041a00e01..154ac900a8a5 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -10,6 +10,7 @@ #include /* __user */ #include +struct net; struct pid; struct cred; @@ -376,7 +377,7 @@ extern int __sys_sendto(int fd, void __user *buff, size_t len, int addr_len); extern int __sys_accept4(int fd, struct sockaddr __user *upeer_sockaddr, int __user *upeer_addrlen, int flags); -extern int __sys_socket(int family, int type, int protocol); +extern int __sys_socket(struct net *net, int family, int type, int protocol); extern int __sys_bind(int fd, 
struct sockaddr __user *umyaddr, int addrlen); extern int __sys_connect(int fd, struct sockaddr __user *uservaddr, int addrlen); diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index 15e5cc704df3..547334c6ffc2 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -947,6 +947,8 @@ asmlinkage long sys_container_create(const char __user *name, unsigned int flags unsigned long spare3, unsigned long spare4, unsigned long spare5); asmlinkage long sys_fork_into_container(int containerfd); +asmlinkage long sys_container_socket(int containerfd, + int domain, int type, int protocol); /* * Architecture-specific system calls diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index a23ad529d548..ce9c5bb30e7f 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -236,6 +236,7 @@ COND_SYSCALL(shmdt); /* net/socket.c */ COND_SYSCALL(socket); COND_SYSCALL(socketpair); +COND_SYSCALL(container_socket); COND_SYSCALL(bind); COND_SYSCALL(listen); COND_SYSCALL(accept); diff --git a/net/compat.c b/net/compat.c index 959d1c51826d..1b2db740fd33 100644 --- a/net/compat.c +++ b/net/compat.c @@ -856,7 +856,7 @@ COMPAT_SYSCALL_DEFINE2(socketcall, int, call, u32 __user *, args) switch (call) { case SYS_SOCKET: - ret = __sys_socket(a0, a1, a[2]); + ret = __sys_socket(current->nsproxy->net_ns, a0, a1, a[2]); break; case SYS_BIND: ret = __sys_bind(a0, compat_ptr(a1), a[2]); diff --git a/net/socket.c b/net/socket.c index 7d271a1d0c7e..7406580598b9 100644 --- a/net/socket.c +++ b/net/socket.c @@ -80,6 +80,7 @@ #include #include #include +#include #include #include #include @@ -1326,9 +1327,9 @@ int sock_create_kern(struct net *net, int family, int type, int protocol, struct } EXPORT_SYMBOL(sock_create_kern); -int __sys_socket(int family, int type, int protocol) +int __sys_socket(struct net *net, int family, int type, int protocol) { - int retval; + long retval; struct socket *sock; int flags; @@ -1346,7 +1347,7 @@ int __sys_socket(int family, int type, int protocol) if 
(SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK)) flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK; - retval = sock_create(family, type, protocol, &sock); + retval = __sock_create(net, family, type, protocol, &sock, 0); if (retval < 0) return retval; @@ -1355,9 +1356,32 @@ int __sys_socket(int family, int type, int protocol) SYSCALL_DEFINE3(socket, int, family, int, type, int, protocol) { - return __sys_socket(family, type, protocol); + return __sys_socket(current->nsproxy->net_ns, family, type, protocol); } +/* + * Create a socket inside a container. + */ +#ifdef CONFIG_CONTAINERS +SYSCALL_DEFINE4(container_socket, + int, containerfd, int, family, int, type, int, protocol) +{ + struct fd f = fdget(containerfd); + long ret; + + if (!f.file) + return -EBADF; + ret = -EINVAL; + if (is_container_file(f.file)) { + struct container *c = f.file->private_data; + + ret = __sys_socket(c->ns->net_ns, family, type, protocol); + } + fdput(f); + return ret; +} +#endif + /* * Create a pair of connected sockets. 
*/ @@ -2555,7 +2579,7 @@ SYSCALL_DEFINE2(socketcall, int, call, unsigned long __user *, args) switch (call) { case SYS_SOCKET: - err = __sys_socket(a0, a1, a[2]); + err = __sys_socket(current->nsproxy->net_ns, a0, a1, a[2]); break; case SYS_BIND: err = __sys_bind(a0, (struct sockaddr __user *)a1, a[2]); From patchwork Fri Feb 15 16:08:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1042995 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J8R6sB0z9rxp for ; Sat, 16 Feb 2019 03:08:23 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391561AbfBOQIL (ORCPT ); Fri, 15 Feb 2019 11:08:11 -0500 Received: from mx1.redhat.com ([209.132.183.28]:32880 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728911AbfBOQIL (ORCPT ); Fri, 15 Feb 2019 11:08:11 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 2807C7AEBE; Fri, 15 Feb 2019 16:08:11 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id E19F7600C5; Fri, 15 Feb 2019 16:08:06 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. 
Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 06/27] containers, vfs: Allow syscall dirfd arguments to take a container fd From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:08:06 +0000 Message-ID: <155024688620.21651.16013251077091180213.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 15 Feb 2019 16:08:11 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Some filesystem system calls, such as mkdirat(), take a 'directory fd' to specify the pathwalk origin. This takes either AT_FDCWD or a file descriptor that refers to an open directory. Make it possible to supply a container fd, as obtained from container_create(), instead thereby specifying the container's root as the origin. This performs the filesystem operation into the container's mount namespace. For example: int cfd = container_create("fred", CONTAINER_NEW_MNT_NS, 0); mkdirat(cfd, "/fred", 0755); A better way to do this might be to temporarily override current->fs and current->nsproxy, but this requires splitting those fields so that procfs doesn't see the override. A sequence number and lock are available to protect the root pointer in case container_chroot() and/or container_pivot_root() are implemented. Signed-off-by: David Howells Nacked-by: "Eric W. 
Biederman" --- fs/namei.c | 45 ++++++++++++++++++++++++++++++++++----------- 1 file changed, 34 insertions(+), 11 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index a85deb55d0c9..4932b5467285 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -2232,20 +2232,43 @@ static const char *path_init(struct nameidata *nd, unsigned flags) if (!f.file) return ERR_PTR(-EBADF); - dentry = f.file->f_path.dentry; + if (is_container_file(f.file)) { + struct container *c = f.file->private_data; + unsigned seq; - if (*s && unlikely(!d_can_lookup(dentry))) { - fdput(f); - return ERR_PTR(-ENOTDIR); - } + if (!*s) + return ERR_PTR(-EINVAL); - nd->path = f.file->f_path; - if (flags & LOOKUP_RCU) { - nd->inode = nd->path.dentry->d_inode; - nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq); + if (flags & LOOKUP_RCU) { + do { + seq = read_seqcount_begin(&c->seq); + nd->path = c->root; + nd->inode = nd->path.dentry->d_inode; + nd->seq = __read_seqcount_begin(&nd->path.dentry->d_seq); + } while (read_seqcount_retry(&c->seq, seq)); + } else { + spin_lock(&c->lock); + nd->path = c->root; + path_get(&nd->path); + spin_unlock(&c->lock); + nd->inode = nd->path.dentry->d_inode; + } } else { - path_get(&nd->path); - nd->inode = nd->path.dentry->d_inode; + dentry = f.file->f_path.dentry; + + if (*s && unlikely(!d_can_lookup(dentry))) { + fdput(f); + return ERR_PTR(-ENOTDIR); + } + + nd->path = f.file->f_path; + if (flags & LOOKUP_RCU) { + nd->inode = nd->path.dentry->d_inode; + nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq); + } else { + path_get(&nd->path); + nd->inode = nd->path.dentry->d_inode; + } } fdput(f); return s; From patchwork Fri Feb 15 16:08:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1042996 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) 
smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J8W2D6Nz9s9G for ; Sat, 16 Feb 2019 03:08:27 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729064AbfBOQI0 (ORCPT ); Fri, 15 Feb 2019 11:08:26 -0500 Received: from mx1.redhat.com ([209.132.183.28]:56792 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728911AbfBOQIZ (ORCPT ); Fri, 15 Feb 2019 11:08:25 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6DA80561CC; Fri, 15 Feb 2019 16:08:24 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 021C95D717; Fri, 15 Feb 2019 16:08:17 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [RFC PATCH 07/27] containers: Make fsopen() able to create a superblock in a container From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:08:16 +0000 Message-ID: <155024689635.21651.15943029551519736259.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Fri, 15 Feb 2019 16:08:24 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Make it possible for fsopen() to create a superblock in a specified container, using the namespaces associated with that container to cover UID translation, networking and filesystem content. This involves adding a new fsconfig command to specify the container. 
For example: cfd = container_create("fred", CONTAINER_NEW_FS_NS); fsfd = fsopen("ext4", 0); fsconfig(fsfd, FSCONFIG_SET_CONTAINER, NULL, NULL, cfd); fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/sda3", 0); fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0); fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); mfd = fsmount(fsfd, 0, MOUNT_ATTR_RDONLY); move_mount(mfd, "", cfd, "/", MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_CONTAINER_ROOT); Signed-off-by: David Howells --- fs/fs_context.c | 19 +++++++++++++++ fs/fsopen.c | 54 +++++++++++++++++++++++++++++++++++++------- fs/namespace.c | 19 +++++++++++---- fs/proc/root.c | 11 +++++++-- include/linux/container.h | 1 + include/linux/fs_context.h | 3 ++ include/linux/pid.h | 5 +++- include/linux/proc_ns.h | 6 +++-- include/uapi/linux/mount.h | 1 + kernel/container.c | 4 +++ kernel/fork.c | 2 +- kernel/pid.c | 4 ++- 12 files changed, 108 insertions(+), 21 deletions(-) diff --git a/fs/fs_context.c b/fs/fs_context.c index a47ccd5a4a78..fc76ac02d618 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -169,6 +170,21 @@ int vfs_parse_fs_param(struct fs_context *fc, struct fs_parameter *param) } EXPORT_SYMBOL(vfs_parse_fs_param); +/* + * Specify a container in which a superblock will exist. + */ +void vfs_set_container(struct fs_context *fc, struct container *container) +{ + if (container) { + put_user_ns(fc->user_ns); + put_net(fc->net_ns); + + fc->container = get_container(container); + fc->user_ns = get_user_ns(container->cred->user_ns); + fc->net_ns = get_net(container->ns->net_ns); + } +} + /** * vfs_parse_fs_string - Convenience function to just parse a string. 
*/ @@ -364,6 +380,8 @@ struct fs_context *vfs_dup_fs_context(struct fs_context *src_fc) fc->source = NULL; fc->security = NULL; get_filesystem(fc->fs_type); + if (fc->container) + get_container(fc->container); get_net(fc->net_ns); get_user_ns(fc->user_ns); get_cred(fc->cred); @@ -510,6 +528,7 @@ void put_fs_context(struct fs_context *fc) put_net(fc->net_ns); put_user_ns(fc->user_ns); put_cred(fc->cred); + put_container(fc->container); kfree(fc->subtype); put_fc_log(fc); put_filesystem(fc->fs_type); diff --git a/fs/fsopen.c b/fs/fsopen.c index 3bb9c0c8cbcc..d0fe9e563ebb 100644 --- a/fs/fsopen.c +++ b/fs/fsopen.c @@ -17,11 +17,33 @@ #include #include #include +#include #include #include #include "internal.h" #include "mount.h" +/* + * Configure the destination container on a filesystem context. This must be + * done before any other parameters are offered. Containers are presented as + * fds attached to such objects given by the auxiliary parameter. + * + * For example: + * + * fsconfig(fsfd, FSCONFIG_SET_CONTAINER, NULL, NULL, container_fd); + */ +static int fsconfig_set_container(struct fs_context *fc, struct fs_parameter *param) +{ + struct container *c; + + if (!is_container_file(param->file)) + return -EINVAL; + + c = param->file->private_data; + vfs_set_container(fc, c); + return 0; +} + /* * Allow the user to read back any error, warning or informational messages. */ @@ -111,10 +133,6 @@ static int fscontext_alloc_log(struct fs_context *fc) /* * Open a filesystem by name so that it can be configured for mounting. - * - * We are allowed to specify a container in which the filesystem will be - * opened, thereby indicating which namespaces will be used (notably, which - * network namespace will be used for network filesystems). 
*/ SYSCALL_DEFINE2(fsopen, const char __user *, _fs_name, unsigned int, flags) { @@ -143,7 +161,7 @@ SYSCALL_DEFINE2(fsopen, const char __user *, _fs_name, unsigned int, flags) if (IS_ERR(fc)) return PTR_ERR(fc); - fc->phase = FS_CONTEXT_CREATE_PARAMS; + fc->phase = FS_CONTEXT_CREATE_NS; ret = fscontext_alloc_log(fc); if (ret < 0) @@ -228,7 +246,8 @@ static int vfs_fsconfig_locked(struct fs_context *fc, int cmd, return ret; switch (cmd) { case FSCONFIG_CMD_CREATE: - if (fc->phase != FS_CONTEXT_CREATE_PARAMS) + if (fc->phase != FS_CONTEXT_CREATE_NS && + fc->phase != FS_CONTEXT_CREATE_PARAMS) return -EBUSY; fc->phase = FS_CONTEXT_CREATING; ret = vfs_get_tree(fc); @@ -259,9 +278,17 @@ static int vfs_fsconfig_locked(struct fs_context *fc, int cmd, break; vfs_clean_context(fc); return 0; + + case FSCONFIG_SET_CONTAINER: + if (fc->phase != FS_CONTEXT_CREATE_NS) + return -EBUSY; + return fsconfig_set_container(fc, param); + default: - if (fc->phase != FS_CONTEXT_CREATE_PARAMS && - fc->phase != FS_CONTEXT_RECONF_PARAMS) + if (fc->phase == FS_CONTEXT_CREATE_NS) + fc->phase = FS_CONTEXT_CREATE_PARAMS; + else if (fc->phase != FS_CONTEXT_CREATE_PARAMS && + fc->phase != FS_CONTEXT_RECONF_PARAMS) return -EBUSY; return vfs_parse_fs_param(fc, param); @@ -353,6 +380,10 @@ SYSCALL_DEFINE5(fsconfig, if (!_key || _value || aux < 0) return -EINVAL; break; + case FSCONFIG_SET_CONTAINER: + if (_key || _value || aux < 0) + return -EINVAL; + break; case FSCONFIG_CMD_CREATE: case FSCONFIG_CMD_RECONFIGURE: if (_key || _value || aux) @@ -438,6 +469,12 @@ SYSCALL_DEFINE5(fsconfig, if (!param.file) goto out_key; break; + case FSCONFIG_SET_CONTAINER: + ret = -EBADF; + param.file = fget(aux); + if (!param.file) + goto out_key; + break; default: break; } @@ -463,6 +500,7 @@ SYSCALL_DEFINE5(fsconfig, putname(param.name); break; case FSCONFIG_SET_FD: + case FSCONFIG_SET_CONTAINER: if (param.file) fput(param.file); break; diff --git a/fs/namespace.c b/fs/namespace.c index ea005f55ec4c..cc5d56f7ae29 
100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -781,9 +781,16 @@ static void put_mountpoint(struct mountpoint *mp) } } +static inline int __check_mnt(struct mount *mnt, struct mnt_namespace *mnt_ns) +{ + if (!mnt_ns) + mnt_ns = current->nsproxy->mnt_ns; + return mnt->mnt_ns == mnt_ns; +} + static inline int check_mnt(struct mount *mnt) { - return mnt->mnt_ns == current->nsproxy->mnt_ns; + return __check_mnt(mnt, NULL); } /* @@ -2696,7 +2703,8 @@ static int do_move_mount_old(struct path *path, const char *old_name) /* * add a mount into a namespace's mount tree */ -static int do_add_mount(struct mount *newmnt, struct path *path, int mnt_flags) +static int do_add_mount(struct mount *newmnt, struct path *path, int mnt_flags, + struct mnt_namespace *mnt_ns) { struct mountpoint *mp; struct mount *parent; @@ -2710,7 +2718,7 @@ static int do_add_mount(struct mount *newmnt, struct path *path, int mnt_flags) parent = real_mount(path->mnt); err = -EINVAL; - if (unlikely(!check_mnt(parent))) { + if (unlikely(!__check_mnt(parent, mnt_ns))) { /* that's acceptable only for automounts done in private ns */ if (!(mnt_flags & MNT_SHRINKABLE)) goto unlock; @@ -2765,7 +2773,8 @@ static int do_new_mount_fc(struct fs_context *fc, struct path *mountpoint, if (IS_ERR(mnt)) return PTR_ERR(mnt); - error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags); + error = do_add_mount(real_mount(mnt), mountpoint, mnt_flags, + fc->container ? 
fc->container->ns->mnt_ns : NULL); if (error < 0) mntput(mnt); return error; @@ -2839,7 +2848,7 @@ int finish_automount(struct vfsmount *m, struct path *path) goto fail; } - err = do_add_mount(mnt, path, path->mnt->mnt_flags | MNT_SHRINKABLE); + err = do_add_mount(mnt, path, path->mnt->mnt_flags | MNT_SHRINKABLE, NULL); if (!err) return 0; fail: diff --git a/fs/proc/root.c b/fs/proc/root.c index 6927b29ece76..aa802006d855 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include @@ -186,8 +187,12 @@ static int proc_init_fs_context(struct fs_context *fc) ctx = kzalloc(sizeof(struct proc_fs_context), GFP_KERNEL); if (!ctx) return -ENOMEM; + + if (fc->container) + ctx->pid_ns = get_pid_ns(fc->container->pid_ns); + else + ctx->pid_ns = get_pid_ns(task_active_pid_ns(current)); - ctx->pid_ns = get_pid_ns(task_active_pid_ns(current)); fc->fs_private = ctx; fc->ops = &proc_fs_context_ops; return 0; @@ -300,7 +305,7 @@ struct proc_dir_entry proc_root = { .name = "/proc", }; -int pid_ns_prepare_proc(struct pid_namespace *ns) +int pid_ns_prepare_proc(struct pid_namespace *ns, struct container *container) { struct proc_fs_context *ctx; struct fs_context *fc; @@ -315,6 +320,8 @@ int pid_ns_prepare_proc(struct pid_namespace *ns) fc->user_ns = get_user_ns(ns->user_ns); } + vfs_set_container(fc, container); + ctx = fc->fs_private; if (ctx->pid_ns != ns) { put_pid_ns(ctx->pid_ns); diff --git a/include/linux/container.h b/include/linux/container.h index 0a8918435097..087aa1885ef7 100644 --- a/include/linux/container.h +++ b/include/linux/container.h @@ -37,6 +37,7 @@ struct container { struct path root; /* The root of the container's fs namespace */ struct task_struct *init; /* The 'init' task for this container */ struct container *parent; /* Parent of this container. 
*/ + struct pid_namespace *pid_ns; /* The process ID namespace for this container */ void *security; /* LSM data */ struct list_head members; /* Member processes, guarded with ->lock */ struct list_head child_link; /* Link in parent->children */ diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index dc8c9fcba341..45486080eb84 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -40,6 +40,7 @@ enum fs_context_purpose { * Userspace usage phase for fsopen/fspick. */ enum fs_context_phase { + FS_CONTEXT_CREATE_NS, /* Set namespaces for sb creation */ FS_CONTEXT_CREATE_PARAMS, /* Loading params for sb creation */ FS_CONTEXT_CREATING, /* A superblock is being created */ FS_CONTEXT_AWAITING_MOUNT, /* Superblock created, awaiting fsmount() */ @@ -93,6 +94,7 @@ struct fs_context { struct file_system_type *fs_type; void *fs_private; /* The filesystem's context */ struct dentry *root; /* The root and superblock */ + struct container *container; /* The container in which the mount will exist */ struct user_namespace *user_ns; /* The user namespace for this mount */ struct net *net_ns; /* The network namespace for this mount */ const struct cred *cred; /* The mounter's credentials */ @@ -136,6 +138,7 @@ extern int vfs_parse_fs_param(struct fs_context *fc, struct fs_parameter *param) extern int vfs_parse_fs_string(struct fs_context *fc, const char *key, const char *value, size_t v_size); extern int generic_parse_monolithic(struct fs_context *fc, void *data); +extern void vfs_set_container(struct fs_context *fc, struct container *container); extern int vfs_get_tree(struct fs_context *fc); extern void put_fs_context(struct fs_context *fc); diff --git a/include/linux/pid.h b/include/linux/pid.h index 14a9a39da9c7..16dc152ceef1 100644 --- a/include/linux/pid.h +++ b/include/linux/pid.h @@ -73,6 +73,8 @@ static inline struct pid *get_pid(struct pid *pid) return pid; } +struct container; + extern void put_pid(struct pid *pid); extern struct 
task_struct *pid_task(struct pid *pid, enum pid_type); extern struct task_struct *get_pid_task(struct pid *pid, enum pid_type); @@ -111,7 +113,8 @@ extern struct pid *find_get_pid(int nr); extern struct pid *find_ge_pid(int nr, struct pid_namespace *); int next_pidmap(struct pid_namespace *pid_ns, unsigned int last); -extern struct pid *alloc_pid(struct pid_namespace *ns); +extern struct pid *alloc_pid(struct pid_namespace *ns, + struct container *container); extern void free_pid(struct pid *pid); extern void disable_pid_allocation(struct pid_namespace *ns); diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h index d31cb6215905..dee0881eca5c 100644 --- a/include/linux/proc_ns.h +++ b/include/linux/proc_ns.h @@ -47,14 +47,16 @@ enum { #ifdef CONFIG_PROC_FS -extern int pid_ns_prepare_proc(struct pid_namespace *ns); +extern int pid_ns_prepare_proc(struct pid_namespace *ns, + struct container *container); extern void pid_ns_release_proc(struct pid_namespace *ns); extern int proc_alloc_inum(unsigned int *pino); extern void proc_free_inum(unsigned int inum); #else /* CONFIG_PROC_FS */ -static inline int pid_ns_prepare_proc(struct pid_namespace *ns) { return 0; } +static inline int pid_ns_prepare_proc(struct pid_namespace *ns, struct container *container) +{ return 0; } static inline void pid_ns_release_proc(struct pid_namespace *ns) {} static inline int proc_alloc_inum(unsigned int *inum) diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h index 96a0240f23fe..f60bbe6f4099 100644 --- a/include/uapi/linux/mount.h +++ b/include/uapi/linux/mount.h @@ -97,6 +97,7 @@ enum fsconfig_command { FSCONFIG_SET_FD = 5, /* Set parameter, supplying an object by fd */ FSCONFIG_CMD_CREATE = 6, /* Invoke superblock creation */ FSCONFIG_CMD_RECONFIGURE = 7, /* Invoke superblock reconfiguration */ + FSCONFIG_SET_CONTAINER = 8, /* Set a container, supplied by fd */ }; /* diff --git a/kernel/container.c b/kernel/container.c index 1d2cb1c1e9b1..fd3b2a6849a1 100644 
--- a/kernel/container.c +++ b/kernel/container.c @@ -30,6 +30,7 @@ struct container init_container = { .cred = &init_cred, .ns = &init_nsproxy, .init = &init_task, + .pid_ns = &init_pid_ns, .members.next = &init_task.container_link, .members.prev = &init_task.container_link, .children = LIST_HEAD_INIT(init_container.children), @@ -51,6 +52,8 @@ void put_container(struct container *c) while (c && refcount_dec_and_test(&c->usage)) { BUG_ON(!list_empty(&c->members)); + if (c->pid_ns) + put_pid_ns(c->pid_ns); if (c->ns) put_nsproxy(c->ns); path_put(&c->root); @@ -391,6 +394,7 @@ static struct container *create_container(const char __user *name, unsigned int } c->ns = ns; + c->pid_ns = get_pid_ns(c->ns->pid_ns_for_children); c->root = fs->root; c->seq = fs->seq; fs->root.mnt = NULL; diff --git a/kernel/fork.c b/kernel/fork.c index 71401deb4434..09de5f35d312 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1958,7 +1958,7 @@ static __latent_entropy struct task_struct *copy_process( stackleak_task_init(p); if (pid != &init_struct_pid) { - pid = alloc_pid(p->nsproxy->pid_ns_for_children); + pid = alloc_pid(p->nsproxy->pid_ns_for_children, dest_container); if (IS_ERR(pid)) { retval = PTR_ERR(pid); goto bad_fork_cleanup_thread; diff --git a/kernel/pid.c b/kernel/pid.c index 20881598bdfa..6528a75e6c0d 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -156,7 +156,7 @@ void free_pid(struct pid *pid) call_rcu(&pid->rcu, delayed_put_pid); } -struct pid *alloc_pid(struct pid_namespace *ns) +struct pid *alloc_pid(struct pid_namespace *ns, struct container *container) { struct pid *pid; enum pid_type type; @@ -205,7 +205,7 @@ struct pid *alloc_pid(struct pid_namespace *ns) } if (unlikely(is_child_reaper(pid))) { - if (pid_ns_prepare_proc(ns)) + if (pid_ns_prepare_proc(ns, container)) goto out_free; } From patchwork Fri Feb 15 16:08:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 
1042997 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J8y0ZDnz9rxp for ; Sat, 16 Feb 2019 03:08:50 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389887AbfBOQIm (ORCPT ); Fri, 15 Feb 2019 11:08:42 -0500 Received: from mx1.redhat.com ([209.132.183.28]:32834 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727084AbfBOQIm (ORCPT ); Fri, 15 Feb 2019 11:08:42 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 607DFC0C2349; Fri, 15 Feb 2019 16:08:41 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5B966608C4; Fri, 15 Feb 2019 16:08:30 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [RFC PATCH 08/27] containers, vfs: Honour CONTAINER_NEW_EMPTY_FS_NS From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:08:29 +0000 Message-ID: <155024690964.21651.13823458384398366556.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Fri, 15 Feb 2019 16:08:41 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Allow a container to be created with an empty mount namespace, as specified by passing CONTAINER_NEW_EMPTY_FS_NS to container_create(), and allow a root filesystem to be mounted into the container: cfd = container_create("foo", CONTAINER_NEW_EMPTY_FS_NS); fsfd = fsopen("ext3", 0); fsconfig(fsfd, FSCONFIG_SET_CONTAINER, NULL, NULL, cfd); fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/sda3", 0); fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0); fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); ... rfd = fsmount(fsfd, 0, 0); move_mount(rfd, "", cfd, "/", MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_CONTAINER_ROOT); pfd = fsopen("proc", 0); write(pfd, "n c="); ... 
procfd = fsmount(pfd, 0, 0); move_mount(procfd, "", cfd, "proc", MOVE_MOUNT_F_EMPTY_PATH); Signed-off-by: David Howells --- fs/namespace.c | 95 +++++++++++++++++++++++++++++++++++++++----- include/uapi/linux/mount.h | 3 + kernel/container.c | 6 +++ kernel/fork.c | 6 ++- 4 files changed, 97 insertions(+), 13 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index cc5d56f7ae29..22cf4a8f8065 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3513,6 +3513,63 @@ SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags, return ret; } +/* + * Create a mount namespace for a container and set the root mount in it. + */ +static int set_container_root(struct path *path, int fd) +{ + struct mnt_namespace *mnt_ns; + struct container *container; + struct mount *mnt; + struct fd f; + int ret; + + f = fdget(fd); + if (!f.file) + return -EBADF; + ret = -EINVAL; + if (!is_container_file(f.file)) + goto out_fd; + + ret = -EBUSY; + container = f.file->private_data; + if (container->ns->mnt_ns) + goto out_fd; + + mnt_ns = alloc_mnt_ns(container->cred->user_ns, false); + if (IS_ERR(mnt_ns)) { + ret = PTR_ERR(mnt_ns); + goto out_fd; + } + + mnt = real_mount(path->mnt); + mnt_add_count(mnt, 1); + mnt->mnt_ns = mnt_ns; + mnt_ns->root = mnt; + mnt_ns->mounts++; + list_add(&mnt->mnt_list, &mnt_ns->list); + + ret = -EBUSY; + spin_lock(&container->lock); + if (!container->ns->mnt_ns) { + container->ns->mnt_ns = mnt_ns; + write_seqcount_begin(&container->seq); + container->root.mnt = path->mnt; + container->root.dentry = path->dentry; + write_seqcount_end(&container->seq); + path_get(&container->root); + mnt_ns = NULL; + ret = 0; + } + spin_unlock(&container->lock); + + if (ret < 0) + put_mnt_ns(mnt_ns); +out_fd: + fdput(f); + return ret; +} + /* * Move a mount from one place to another. 
In combination with * fsopen()/fsmount() this is used to install a new mount and in combination @@ -3528,6 +3585,7 @@ SYSCALL_DEFINE5(move_mount, { struct path from_path, to_path; unsigned int lflags; + char buf[2]; int ret = 0; if (!may_mount()) @@ -3536,6 +3594,17 @@ SYSCALL_DEFINE5(move_mount, if (flags & ~MOVE_MOUNT__MASK) return -EINVAL; + if (flags & MOVE_MOUNT_T_CONTAINER_ROOT) { + if (flags & (MOVE_MOUNT_T_SYMLINKS | + MOVE_MOUNT_T_AUTOMOUNTS | + MOVE_MOUNT_T_EMPTY_PATH)) + return -EINVAL; + if (strncpy_from_user(buf, to_pathname, 2) < 0) + return -EFAULT; + if (buf[0] != '/' || buf[1] != '\0') + return -EINVAL; + } + /* If someone gives a pathname, they aren't permitted to move * from an fd that requires unmount as we can't get at the flag * to clear it afterwards. @@ -3549,20 +3618,24 @@ SYSCALL_DEFINE5(move_mount, if (ret < 0) return ret; - lflags = 0; - if (flags & MOVE_MOUNT_T_SYMLINKS) lflags |= LOOKUP_FOLLOW; - if (flags & MOVE_MOUNT_T_AUTOMOUNTS) lflags |= LOOKUP_AUTOMOUNT; - if (flags & MOVE_MOUNT_T_EMPTY_PATH) lflags |= LOOKUP_EMPTY; + if (flags & MOVE_MOUNT_T_CONTAINER_ROOT) { + ret = set_container_root(&from_path, to_dfd); + } else { + lflags = 0; + if (flags & MOVE_MOUNT_T_SYMLINKS) lflags |= LOOKUP_FOLLOW; + if (flags & MOVE_MOUNT_T_AUTOMOUNTS) lflags |= LOOKUP_AUTOMOUNT; + if (flags & MOVE_MOUNT_T_EMPTY_PATH) lflags |= LOOKUP_EMPTY; - ret = user_path_at(to_dfd, to_pathname, lflags, &to_path); - if (ret < 0) - goto out_from; + ret = user_path_at(to_dfd, to_pathname, lflags, &to_path); + if (ret < 0) + goto out_from; - ret = security_move_mount(&from_path, &to_path); - if (ret < 0) - goto out_to; + ret = security_move_mount(&from_path, &to_path); + if (ret < 0) + goto out_to; - ret = do_move_mount(&from_path, &to_path); + ret = do_move_mount(&from_path, &to_path); + } out_to: path_put(&to_path); diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h index f60bbe6f4099..cfaa75fa0594 100644 --- a/include/uapi/linux/mount.h +++ 
b/include/uapi/linux/mount.h @@ -70,7 +70,8 @@ #define MOVE_MOUNT_T_SYMLINKS 0x00000010 /* Follow symlinks on to path */ #define MOVE_MOUNT_T_AUTOMOUNTS 0x00000020 /* Follow automounts on to path */ #define MOVE_MOUNT_T_EMPTY_PATH 0x00000040 /* Empty to path permitted */ -#define MOVE_MOUNT__MASK 0x00000077 +#define MOVE_MOUNT_T_CONTAINER_ROOT 0x00000080 /* Set as container root */ +#define MOVE_MOUNT__MASK 0x000000f7 /* * fsopen() flags. diff --git a/kernel/container.c b/kernel/container.c index fd3b2a6849a1..360284db959b 100644 --- a/kernel/container.c +++ b/kernel/container.c @@ -21,6 +21,7 @@ #include #include #include +#include #include "namespaces.h" struct container init_container = { @@ -400,6 +401,11 @@ static struct container *create_container(const char __user *name, unsigned int fs->root.mnt = NULL; fs->root.dentry = NULL; + if (flags & CONTAINER_NEW_EMPTY_FS_NS) { + put_mnt_ns(ns->mnt_ns); + ns->mnt_ns = NULL; + } + ret = security_container_alloc(c, flags); if (ret < 0) goto err_fs; diff --git a/kernel/fork.c b/kernel/fork.c index 09de5f35d312..6ec507a5f739 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2374,7 +2374,11 @@ SYSCALL_DEFINE1(fork_into_container, int, containerfd) if (is_container_file(f.file)) { struct container *dest_container = f.file->private_data; - ret = _do_fork(SIGCHLD, 0, 0, NULL, NULL, 0, dest_container); + if (!dest_container->ns->mnt_ns) + ret = -ENOENT; + else + ret = _do_fork(SIGCHLD, 0, 0, NULL, NULL, 0, + dest_container); } fdput(f); return ret; From patchwork Fri Feb 15 16:08:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1042998 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; 
receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J916G7xz9rxp for ; Sat, 16 Feb 2019 03:08:53 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391799AbfBOQIw (ORCPT ); Fri, 15 Feb 2019 11:08:52 -0500 Received: from mx1.redhat.com ([209.132.183.28]:59314 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388365AbfBOQIw (ORCPT ); Fri, 15 Feb 2019 11:08:52 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 43832C7966; Fri, 15 Feb 2019 16:08:52 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 603C660C80; Fri, 15 Feb 2019 16:08:47 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [RFC PATCH 09/27] vfs: Allow mounting to other namespaces From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:08:46 +0000 Message-ID: <155024692658.21651.7276705643207668882.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 15 Feb 2019 16:08:52 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Currently sys_move_mount() and sys_mount(MS_MOVE) prevent the caller from moving a mount into a namespace that is not their own. Relax this such that any mount can be mounted onto any given mountpoint provided that the source mount is either detached or in the same namespace as the destination. This permits container namespaces to be built from the outside rather than from the inside. Signed-off-by: David Howells --- fs/namespace.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 22cf4a8f8065..804601b6297c 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2627,12 +2627,10 @@ static int do_move_mount(struct path *old_path, struct path *new_path) ns = old->mnt_ns; err = -EINVAL; - /* The mountpoint must be in our namespace. */ - if (!check_mnt(p)) - goto out; - - /* The thing moved should be either ours or completely unattached.
*/ - if (attached && !check_mnt(old)) + /* The new mount must be either unattached or in the same namespace as + * the mountpoint. + */ + if (attached && old->mnt_ns != p->mnt_ns) goto out; if (!attached && !is_anon_ns(ns)) From patchwork Fri Feb 15 16:08:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1042999 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J9K1mklz9rxp for ; Sat, 16 Feb 2019 03:09:09 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391892AbfBOQJB (ORCPT ); Fri, 15 Feb 2019 11:09:01 -0500 Received: from mx1.redhat.com ([209.132.183.28]:44798 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729453AbfBOQJB (ORCPT ); Fri, 15 Feb 2019 11:09:01 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3ACCC4FCDB; Fri, 15 Feb 2019 16:09:00 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3E992600C7; Fri, 15 Feb 2019 16:08:58 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [RFC PATCH 10/27] containers: Provide fs_context op for container setting From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:08:57 +0000 Message-ID: <155024693750.21651.5133054585005541648.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Fri, 15 Feb 2019 16:09:00 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Provide an fs_context op to notify a filesystem that a container has been set. The filesystem should do whatever cleanup it needs, then call do_set_container() and then re-set its container/namespace dependent stuff. This allows the following: (1) proc and mqueue mounts to set the correct pid and ipc namespaces respectively. (2) afs to discard the old default cell before the net namespace is changed (ie. while it is still pinned), after which it can get the new default cell. 
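The ordering described above (clean up, switch namespaces, re-derive dependent state) can be illustrated with a small userspace model. This is only a sketch, not kernel code: every type and helper prefixed with mock_ is a hypothetical stand-in invented for illustration, and the reference counts are plain integers rather than real kernel refcounts.

```c
/* Userspace model of the ->set_container() ordering; NOT kernel code.
 * All mock_* names are hypothetical stand-ins for illustration only.
 */
#include <assert.h>
#include <stddef.h>

struct mock_ns { int refs; };			/* stands in for a namespace */
struct mock_cell {				/* namespace-dependent object */
	int refs;
	struct mock_ns *ns;
};

struct mock_fc {
	struct mock_ns *net_ns;		/* namespace the context pins now */
	struct mock_ns *container_ns;	/* namespace offered by the container */
	struct mock_cell *cell;		/* fs-private state tied to net_ns */
};

static struct mock_ns *mock_get_ns(struct mock_ns *ns)
{
	ns->refs++;
	return ns;
}

static void mock_put_ns(struct mock_ns *ns)
{
	ns->refs--;
}

/* Generic step: move the context over to the container's namespace. */
static void mock_do_set_container(struct mock_fc *fc)
{
	mock_put_ns(fc->net_ns);
	fc->net_ns = mock_get_ns(fc->container_ns);
}

/* Filesystem op: clean up, switch, then re-derive dependent state. */
static void mock_fs_set_container(struct mock_fc *fc, struct mock_cell *new_cell)
{
	/* 1. Drop state tied to the old namespace while it is still pinned. */
	if (fc->cell) {
		assert(fc->cell->ns == fc->net_ns);
		fc->cell->refs--;
		fc->cell = NULL;
	}

	/* 2. Switch namespaces. */
	mock_do_set_container(fc);

	/* 3. Take state that belongs to the new namespace, if there is any. */
	if (new_cell && new_cell->ns == fc->net_ns) {
		new_cell->refs++;
		fc->cell = new_cell;
	}
}
```

The point of the model is the order of the three steps: the old dependent object is released before the old namespace reference is dropped, so the cleanup never touches a namespace that is no longer pinned.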
Signed-off-by: David Howells --- fs/afs/super.c | 18 ++++++++++++++++++ fs/fs_context.c | 32 ++++++++++++++++++++++++++------ fs/proc/root.c | 9 +++++++++ include/linux/fs_context.h | 2 ++ ipc/mqueue.c | 10 ++++++++++ 5 files changed, 65 insertions(+), 6 deletions(-) diff --git a/fs/afs/super.c b/fs/afs/super.c index 4e33a7038bc5..a349e213bdc8 100644 --- a/fs/afs/super.c +++ b/fs/afs/super.c @@ -569,6 +569,23 @@ static int afs_get_tree(struct fs_context *fc) return ret; } +static void afs_set_container(struct fs_context *fc) +{ + struct afs_fs_context *ctx = fc->fs_private; + struct afs_cell *cell; + + afs_put_cell(ctx->net, ctx->cell); + do_set_container(fc); + + /* Default to the workstation cell. */ + rcu_read_lock(); + cell = afs_lookup_cell_rcu(ctx->net, NULL, 0); + rcu_read_unlock(); + if (IS_ERR(cell)) + cell = NULL; + ctx->cell = cell; +} + static void afs_free_fc(struct fs_context *fc) { struct afs_fs_context *ctx = fc->fs_private; @@ -583,6 +600,7 @@ static void afs_free_fc(struct fs_context *fc) static const struct fs_context_operations afs_context_ops = { .free = afs_free_fc, .parse_param = afs_parse_param, + .set_container = afs_set_container, .get_tree = afs_get_tree, }; diff --git a/fs/fs_context.c b/fs/fs_context.c index fc76ac02d618..c0f333cc0e16 100644 --- a/fs/fs_context.c +++ b/fs/fs_context.c @@ -170,18 +170,38 @@ int vfs_parse_fs_param(struct fs_context *fc, struct fs_parameter *param) } EXPORT_SYMBOL(vfs_parse_fs_param); +/** + * do_set_container - Helper to set container + * @fc: The fs_context to adjust + * + * This is called to effect the change of namespaces associated with the + * container. The reason that this isn't rolled into vfs_set_container() is + * that the filesystem may need to do some cleanup on the old namespaces (which + * are currently pinned by the container) before calling this. + * + * The user namespace is not changed as that is used for security checks. 
+ */ +void do_set_container(struct fs_context *fc) +{ + put_net(fc->net_ns); + fc->net_ns = get_net(fc->container->ns->net_ns); +} +EXPORT_SYMBOL(do_set_container); + /* - * Specify a container in which a superblock will exist. + * Specify a container in which a superblock will exist. This should be called + * before calling vfs_parse_fs_param. If ->set_container() is supplied by the + * filesystem, it should call do_set_container(). */ void vfs_set_container(struct fs_context *fc, struct container *container) { if (container) { - put_user_ns(fc->user_ns); - put_net(fc->net_ns); - + put_container(fc->container); fc->container = get_container(container); - fc->user_ns = get_user_ns(container->cred->user_ns); - fc->net_ns = get_net(container->ns->net_ns); + if (fc->ops->set_container) + fc->ops->set_container(fc); + else + do_set_container(fc); } } diff --git a/fs/proc/root.c b/fs/proc/root.c index aa802006d855..f8e124ce0888 100644 --- a/fs/proc/root.c +++ b/fs/proc/root.c @@ -164,6 +164,14 @@ static int proc_get_tree(struct fs_context *fc) return vfs_get_super(fc, vfs_get_keyed_super, proc_fill_super); } +static void proc_set_container(struct fs_context *fc) +{ + struct proc_fs_context *ctx = fc->fs_private; + + put_pid_ns(ctx->pid_ns); + ctx->pid_ns = get_pid_ns(fc->container->pid_ns); +} + static void proc_fs_context_free(struct fs_context *fc) { struct proc_fs_context *ctx = fc->fs_private; @@ -176,6 +184,7 @@ static void proc_fs_context_free(struct fs_context *fc) static const struct fs_context_operations proc_fs_context_ops = { .free = proc_fs_context_free, .parse_param = proc_parse_param, + .set_container = proc_set_container, .get_tree = proc_get_tree, .reconfigure = proc_reconfigure, }; diff --git a/include/linux/fs_context.h b/include/linux/fs_context.h index 45486080eb84..086e4f24705a 100644 --- a/include/linux/fs_context.h +++ b/include/linux/fs_context.h @@ -118,6 +118,7 @@ struct fs_context_operations { int (*dup)(struct fs_context *fc, struct 
fs_context *src_fc); int (*parse_param)(struct fs_context *fc, struct fs_parameter *param); int (*parse_monolithic)(struct fs_context *fc, void *data); + void (*set_container)(struct fs_context *fc); int (*get_tree)(struct fs_context *fc); int (*reconfigure)(struct fs_context *fc); }; @@ -138,6 +139,7 @@ extern int vfs_parse_fs_param(struct fs_context *fc, struct fs_parameter *param) extern int vfs_parse_fs_string(struct fs_context *fc, const char *key, const char *value, size_t v_size); extern int generic_parse_monolithic(struct fs_context *fc, void *data); +extern void do_set_container(struct fs_context *fc); extern void vfs_set_container(struct fs_context *fc, struct container *container); extern int vfs_get_tree(struct fs_context *fc); extern void put_fs_context(struct fs_context *fc); diff --git a/ipc/mqueue.c b/ipc/mqueue.c index 2a9a8be49f5b..821fb227800f 100644 --- a/ipc/mqueue.c +++ b/ipc/mqueue.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include #include @@ -329,6 +330,14 @@ static struct inode *mqueue_get_inode(struct super_block *sb, return ERR_PTR(ret); } +static void mqueue_set_container(struct fs_context *fc) +{ + struct mqueue_fs_context *ctx = fc->fs_private; + + put_ipc_ns(ctx->ipc_ns); + ctx->ipc_ns = get_ipc_ns(fc->container->ns->ipc_ns); +} + static int mqueue_fill_super(struct super_block *sb, struct fs_context *fc) { struct inode *inode; @@ -1569,6 +1578,7 @@ static const struct super_operations mqueue_super_ops = { static const struct fs_context_operations mqueue_fs_context_ops = { .free = mqueue_fs_context_free, + .set_container = mqueue_set_container, .get_tree = mqueue_get_tree, }; From patchwork Fri Feb 15 16:09:05 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043001 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; 
spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J9k1ZKyz9s4Z for ; Sat, 16 Feb 2019 03:09:30 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726480AbfBOQJM (ORCPT ); Fri, 15 Feb 2019 11:09:12 -0500 Received: from mx1.redhat.com ([209.132.183.28]:34254 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729453AbfBOQJL (ORCPT ); Fri, 15 Feb 2019 11:09:11 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D19E98E71E; Fri, 15 Feb 2019 16:09:10 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 35F77608C4; Fri, 15 Feb 2019 16:09:06 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [RFC PATCH 11/27] containers: Sample program for driving container objects From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:09:05 +0000 Message-ID: <155024694546.21651.828651822893643197.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 15 Feb 2019 16:09:11 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Add a sample program to demonstrate driving a container object. It is called something like: ./samples/vfs/test-container /dev/sda3 where /dev/sda3 holds an ext4 filesystem that has appropriate /etc, /bin, /usr, /lib, /proc directories emplaced such that procfs can be mounted and then /bin/bash can be executed within the container. 
Signed-off-by: David Howells --- samples/vfs/Makefile | 5 + samples/vfs/test-container.c | 279 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 283 insertions(+), 1 deletion(-) create mode 100644 samples/vfs/test-container.c diff --git a/samples/vfs/Makefile b/samples/vfs/Makefile index b88655cb2f1d..25420919ee40 100644 --- a/samples/vfs/Makefile +++ b/samples/vfs/Makefile @@ -4,7 +4,8 @@ hostprogs-$(CONFIG_SAMPLE_VFS) := \ test-fs-query \ test-fsmount \ test-mntinfo \ - test-statx + test-statx \ + test-container # Tell kbuild to always build the programs always := $(hostprogs-y) @@ -17,3 +18,5 @@ HOSTLDLIBS_test-mntinfo += -lm HOSTCFLAGS_test-fs-query.o += -I$(objtree)/usr/include HOSTCFLAGS_test-fsmount.o += -I$(objtree)/usr/include HOSTCFLAGS_test-statx.o += -I$(objtree)/usr/include +HOSTCFLAGS_test-container.o += -I$(objtree)/usr/include +HOSTLDLIBS_test-container += -lkeyutils diff --git a/samples/vfs/test-container.c b/samples/vfs/test-container.c new file mode 100644 index 000000000000..44ff57afb5a4 --- /dev/null +++ b/samples/vfs/test-container.c @@ -0,0 +1,279 @@ +/* Container test. + * + * Copyright (C) 2019 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* Hope -1 isn't a syscall */ +#ifndef __NR_fsopen +#define __NR_fsopen -1 +#endif +#ifndef __NR_fsmount +#define __NR_fsmount -1 +#endif +#ifndef __NR_fsconfig +#define __NR_fsconfig -1 +#endif +#ifndef __NR_move_mount +#define __NR_move_mount -1 +#endif + + +#define E(x) do { if ((x) == -1) { perror(#x); exit(1); } } while(0) + +static void check_messages(int fd) +{ + char buf[4096]; + int err, n; + + err = errno; + + for (;;) { + n = read(fd, buf, sizeof(buf)); + if (n < 0) + break; + n -= 2; + + switch (buf[0]) { + case 'e': + fprintf(stderr, "Error: %*.*s\n", n, n, buf + 2); + break; + case 'w': + fprintf(stderr, "Warning: %*.*s\n", n, n, buf + 2); + break; + case 'i': + fprintf(stderr, "Info: %*.*s\n", n, n, buf + 2); + break; + } + } + + errno = err; +} + +static __attribute__((noreturn)) +void mount_error(int fd, const char *s) +{ + check_messages(fd); + fprintf(stderr, "%s: %m\n", s); + exit(1); +} + +#define CONTAINER_NEW_FS_NS 0x00000001 /* Dup current fs namespace */ +#define CONTAINER_NEW_EMPTY_FS_NS 0x00000002 /* Provide new empty fs namespace */ +#define CONTAINER_NEW_CGROUP_NS 0x00000004 /* Dup current cgroup namespace [priv] */ +#define CONTAINER_NEW_UTS_NS 0x00000008 /* Dup current uts namespace */ +#define CONTAINER_NEW_IPC_NS 0x00000010 /* Dup current ipc namespace */ +#define CONTAINER_NEW_USER_NS 0x00000020 /* Dup current user namespace */ +#define CONTAINER_NEW_PID_NS 0x00000040 /* Dup current pid namespace */ +#define CONTAINER_NEW_NET_NS 0x00000080 /* Dup current net namespace */ +#define CONTAINER_KILL_ON_CLOSE 0x00000100 /* Kill all member processes when fd closed */ +#define CONTAINER_FD_CLOEXEC 0x00000200 /* Close the fd on exec */ +#define CONTAINER__FLAG_MASK 0x000003ff + +static inline int fsopen(const char *fs_name, unsigned int flags) +{ + return syscall(__NR_fsopen, fs_name, flags); +} + +static inline int fsconfig(int 
fsfd, unsigned int cmd, + const char *key, const void *val, int aux) +{ + return syscall(__NR_fsconfig, fsfd, cmd, key, val, aux); +} + +static inline int fsmount(int fsfd, unsigned int flags, unsigned int attr_flags) +{ + return syscall(__NR_fsmount, fsfd, flags, attr_flags); +} + +static inline int move_mount(int from_dfd, const char *from_pathname, + int to_dfd, const char *to_pathname, + unsigned int flags) +{ + return syscall(__NR_move_mount, + from_dfd, from_pathname, + to_dfd, to_pathname, flags); +} + +static inline int container_create(const char *name, unsigned int mask) +{ + return syscall(__NR_container_create, name, mask, 0, 0, 0); +} + +static inline int fork_into_container(int containerfd) +{ + return syscall(__NR_fork_into_container, containerfd); +} + +#define E_fsconfig(fd, cmd, key, val, aux) \ + do { \ + if (fsconfig(fd, cmd, key, val, aux) == -1) \ + mount_error(fd, key ?: "create"); \ + } while (0) + +/* + * The container init process. + */ +static __attribute__((noreturn)) +void container_init(void) +{ + if (0) { + /* Do a bit of debugging on the container. */ + struct dirent **dlist; + struct stat st; + char buf[4096]; + int n, i; + + printf("hello!\n"); + n = scandir("/", &dlist, NULL, alphasort); + if (n == -1) { + perror("scandir"); + exit(1); + } + + for (i = 0; i < n; i++) { + struct dirent *p = dlist[i]; + + if (p) + printf("- %u %s\n", p->d_type, p->d_name); + } + + n = readlink("/bin", buf, sizeof(buf) - 1); + if (n == -1) { + perror("readlink"); + exit(1); + } + + buf[n] = 0; + printf("/bin -> %s\n", buf); + + if (stat("/lib64/ld-linux-x86-64.so.2", &st) == -1) { + perror("stat"); + exit(1); + } + + printf("mode %o\n", st.st_mode); + } + + if (keyctl_join_session_keyring(NULL) == -1) { + perror("keyctl/join"); + exit(1); + } + + setenv("PS1", "container>", 1); + execl("/bin/bash", "bash", NULL); + perror("execl"); + exit(1); +} + +/* + * The container manager process. 
+ */ +int main(int argc, char *argv[]) +{ + pid_t pid; + int fsfd, mfd, cfd, ws; + + if (argc != 2) { + fprintf(stderr, "Format: test-container \n"); + exit(2); + } + + cfd = container_create("foo-test", + CONTAINER_NEW_EMPTY_FS_NS | + //CONTAINER_NEW_UTS_NS | + //CONTAINER_NEW_IPC_NS | + //CONTAINER_NEW_USER_NS | + CONTAINER_NEW_PID_NS | + CONTAINER_KILL_ON_CLOSE | + CONTAINER_FD_CLOEXEC); + if (cfd == -1) { + perror("container_create"); + exit(1); + } + + system("cat /proc/containers"); + + /* Open the filesystem that's going to form the container root. */ + printf("Creating root...\n"); + fsfd = fsopen("ext4", 0); + if (fsfd == -1) { + perror("fsopen/root"); + exit(1); + } + + E_fsconfig(fsfd, FSCONFIG_SET_CONTAINER, NULL, NULL, cfd); + E_fsconfig(fsfd, FSCONFIG_SET_STRING, "source", argv[1], 0); + E_fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0); + E_fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); + + /* Mount the container root */ + printf("Mounting root...\n"); + mfd = fsmount(fsfd, 0, 0); + if (mfd < 0) + mount_error(fsfd, "fsmount/root"); + + if (move_mount(mfd, "", cfd, "/", + MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_CONTAINER_ROOT) < 0) { + perror("move_mount/root"); + exit(1); + } + E(close(fsfd)); + E(close(mfd)); + + /* Mount procfs within the container */ + printf("Creating procfs...\n"); + fsfd = fsopen("proc", 0); + if (fsfd == -1) { + perror("fsopen/proc"); + exit(1); + } + + E_fsconfig(fsfd, FSCONFIG_SET_CONTAINER, NULL, NULL, cfd); + E_fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); + + printf("Mounting procfs...\n"); + mfd = fsmount(fsfd, 0, 0); + if (mfd < 0) + mount_error(fsfd, "fsmount/proc"); + if (move_mount(mfd, "", cfd, "proc", MOVE_MOUNT_F_EMPTY_PATH) < 0) { + perror("move_mount/proc"); + exit(1); + } + E(close(fsfd)); + E(close(mfd)); + + /* Start the 'init' process. 
*/ + printf("Forking...\n"); + switch ((pid = fork_into_container(cfd))) { + case -1: + perror("fork_into_container"); + exit(1); + case 0: + close(cfd); + container_init(); + default: + if (waitpid(pid, &ws, 0) < 0) { + perror("waitpid"); + exit(1); + } + } + E(close(cfd)); + exit(0); +} From patchwork Fri Feb 15 16:09:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043000 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J9h2WqHz9rxp for ; Sat, 16 Feb 2019 03:09:28 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391998AbfBOQJU (ORCPT ); Fri, 15 Feb 2019 11:09:20 -0500 Received: from mx1.redhat.com ([209.132.183.28]:60136 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729649AbfBOQJT (ORCPT ); Fri, 15 Feb 2019 11:09:19 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id DD88CBF9FC; Fri, 15 Feb 2019 16:09:18 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id B6D721024951; Fri, 15 Feb 2019 16:09:16 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. 
Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 12/27] containers: Allow a daemon to intercept request_key upcalls in a container From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:09:16 +0000 Message-ID: <155024695603.21651.8408384800355761335.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 15 Feb 2019 16:09:19 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Provide a mechanism by which a running daemon can intercept request_key upcalls, filtered by namespace and key type, and service them. The list of active services is per-container. Intercepts for a specific {key_type, namespace} can be installed on a container with: keyctl(KEYCTL_CONTAINER_INTERCEPT, int containerfd, const char *type_name, unsigned int ns_id, key_serial_t dest_keyring); The authentication token keys for intercepted keys are linked into the destination keyring.
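As a rough illustration of the per-container filter described above, the following userspace model keeps a small table of intercepts keyed by {key type, namespace id} and looks up the destination keyring for an upcall. It is a sketch under stated assumptions, not the patch's implementation: all mock_ names are hypothetical, the table size is arbitrary, and exact matching on both fields is an assumption made for the example rather than something the description above specifies.

```c
/* Userspace model of a per-container intercept list; NOT kernel code.
 * All mock_* names are hypothetical, and exact matching on both the
 * type name and the namespace id is an assumption of this sketch.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_MAX_TRAPS 8

struct mock_intercept {
	const char *type_name;	/* key type to match, e.g. "user" */
	unsigned int ns_id;	/* namespace identifier to match */
	int dest_keyring;	/* keyring that would receive the auth key */
};

struct mock_container {
	struct mock_intercept traps[MOCK_MAX_TRAPS];
	size_t nr_traps;
};

/* Install an intercept; returns 0 on success, -1 if the table is full. */
static int mock_add_intercept(struct mock_container *c, const char *type_name,
			      unsigned int ns_id, int dest_keyring)
{
	assert(c && type_name);
	if (c->nr_traps == MOCK_MAX_TRAPS)
		return -1;
	c->traps[c->nr_traps++] =
		(struct mock_intercept){ type_name, ns_id, dest_keyring };
	return 0;
}

/* Return the destination keyring for a matching intercept, or 0 if none. */
static int mock_find_intercept(const struct mock_container *c,
			       const char *type_name, unsigned int ns_id)
{
	size_t i;

	for (i = 0; i < c->nr_traps; i++) {
		const struct mock_intercept *t = &c->traps[i];

		if (t->ns_id == ns_id && strcmp(t->type_name, type_name) == 0)
			return t->dest_keyring;
	}
	return 0;
}
```

In this model an upcall that finds no matching entry returns 0, which a caller would treat as "not intercepted" and handle by the normal upcall path.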
Signed-off-by: David Howells --- include/linux/container.h | 2 include/linux/key-type.h | 2 include/uapi/linux/keyctl.h | 1 kernel/container.c | 4 + security/keys/Makefile | 2 security/keys/compat.c | 5 + security/keys/container.c | 227 ++++++++++++++++++++++++++++++++++++++ security/keys/internal.h | 10 ++ security/keys/keyctl.c | 14 ++ security/keys/request_key.c | 18 ++- security/keys/request_key_auth.c | 6 + 11 files changed, 278 insertions(+), 13 deletions(-) create mode 100644 security/keys/container.c diff --git a/include/linux/container.h b/include/linux/container.h index 087aa1885ef7..a8cac800ce75 100644 --- a/include/linux/container.h +++ b/include/linux/container.h @@ -42,6 +42,7 @@ struct container { struct list_head members; /* Member processes, guarded with ->lock */ struct list_head child_link; /* Link in parent->children */ struct list_head children; /* Child containers */ + struct list_head req_key_traps; /* Traps for request-key upcalls */ wait_queue_head_t waitq; /* Someone waiting for init to exit waits here */ unsigned long flags; #define CONTAINER_FLAG_INIT_STARTED 0 /* Init is started - certain ops now prohibited */ @@ -60,6 +61,7 @@ extern int copy_container(unsigned long flags, struct task_struct *tsk, struct container *container); extern void exit_container(struct task_struct *tsk); extern void put_container(struct container *c); +extern long key_del_intercept(struct container *c, const char *type); static inline struct container *get_container(struct container *c) { diff --git a/include/linux/key-type.h b/include/linux/key-type.h index 2148a6bf58f1..0e09dac53245 100644 --- a/include/linux/key-type.h +++ b/include/linux/key-type.h @@ -66,7 +66,7 @@ struct key_match_data { */ struct key_type { /* name of the type */ - const char *name; + const char name[24]; /* default payload length for quota precalculation (optional) * - this can be used instead of calling key_payload_reserve(), that diff --git a/include/uapi/linux/keyctl.h 
b/include/uapi/linux/keyctl.h index e9e7da849619..85e8fef89bba 100644 --- a/include/uapi/linux/keyctl.h +++ b/include/uapi/linux/keyctl.h @@ -68,6 +68,7 @@ #define KEYCTL_PKEY_VERIFY 28 /* Verify a public key signature */ #define KEYCTL_RESTRICT_KEYRING 29 /* Restrict keys allowed to link to a keyring */ #define KEYCTL_WATCH_KEY 30 /* Watch a key or ring of keys for changes */ +#define KEYCTL_CONTAINER_INTERCEPT 31 /* Intercept upcalls inside a container */ /* keyctl structures */ struct keyctl_dh_params { diff --git a/kernel/container.c b/kernel/container.c index 360284db959b..33e41fe5050b 100644 --- a/kernel/container.c +++ b/kernel/container.c @@ -35,6 +35,7 @@ struct container init_container = { .members.next = &init_task.container_link, .members.prev = &init_task.container_link, .children = LIST_HEAD_INIT(init_container.children), + .req_key_traps = LIST_HEAD_INIT(init_container.req_key_traps), .flags = (1 << CONTAINER_FLAG_INIT_STARTED), .lock = __SPIN_LOCK_UNLOCKED(init_container.lock), .seq = SEQCNT_ZERO(init_fs.seq), @@ -53,6 +54,8 @@ void put_container(struct container *c) while (c && refcount_dec_and_test(&c->usage)) { BUG_ON(!list_empty(&c->members)); + if (!list_empty(&c->req_key_traps)) + key_del_intercept(c, NULL); if (c->pid_ns) put_pid_ns(c->pid_ns); if (c->ns) @@ -286,6 +289,7 @@ static struct container *alloc_container(const char __user *name) INIT_LIST_HEAD(&c->members); INIT_LIST_HEAD(&c->children); + INIT_LIST_HEAD(&c->req_key_traps); init_waitqueue_head(&c->waitq); spin_lock_init(&c->lock); refcount_set(&c->usage, 1); diff --git a/security/keys/Makefile b/security/keys/Makefile index 9cef54064f60..24f5df27b1c2 100644 --- a/security/keys/Makefile +++ b/security/keys/Makefile @@ -16,6 +16,7 @@ obj-y := \ request_key.o \ request_key_auth.o \ user_defined.o + compat-obj-$(CONFIG_KEY_DH_OPERATIONS) += compat_dh.o obj-$(CONFIG_KEYS_COMPAT) += compat.o $(compat-obj-y) obj-$(CONFIG_PROC_FS) += proc.o @@ -23,6 +24,7 @@ obj-$(CONFIG_SYSCTL) += sysctl.o 
obj-$(CONFIG_PERSISTENT_KEYRINGS) += persistent.o obj-$(CONFIG_KEY_DH_OPERATIONS) += dh.o obj-$(CONFIG_ASYMMETRIC_KEY_TYPE) += keyctl_pkey.o +obj-$(CONFIG_CONTAINERS) += container.o # # Key types diff --git a/security/keys/compat.c b/security/keys/compat.c index 021d8e1c9233..6420881e5ce7 100644 --- a/security/keys/compat.c +++ b/security/keys/compat.c @@ -161,6 +161,11 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option, case KEYCTL_WATCH_KEY: return keyctl_watch_key(arg2, arg3, arg4); +#ifdef CONFIG_CONTAINERS + case KEYCTL_CONTAINER_INTERCEPT: + return keyctl_container_intercept(arg2, compat_ptr(arg3), arg4, arg5); +#endif + default: return -EOPNOTSUPP; } diff --git a/security/keys/container.c b/security/keys/container.c new file mode 100644 index 000000000000..c61c43658f3b --- /dev/null +++ b/security/keys/container.c @@ -0,0 +1,227 @@ +/* Container intercept interface + * + * Copyright (C) 2019 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include "internal.h" + +struct request_key_intercept { + char type[32]; /* The type of key to be trapped */ + struct list_head link; /* Link in containers->req_key_traps */ + struct key *dest_keyring; /* Where to place the trapped auth keys */ + struct ns_common *ns; /* Namespace the key must match */ +}; + +/* + * Add an intercept filter to a container. 
+ */ +static long key_add_intercept(struct container *c, struct request_key_intercept *rki) +{ + struct request_key_intercept *p; + + kenter("%p,{%s,%d}", c, rki->type, key_serial(rki->dest_keyring)); + + spin_lock(&c->lock); + list_for_each_entry(p, &c->req_key_traps, link) { + if (strcmp(rki->type, p->type) == 0) { + spin_unlock(&c->lock); + return -EEXIST; + } + } + + /* We put all-matching rules at the back so they're checked after the + * more specific rules. + */ + if (rki->type[0] == '*' && !rki->type[1]) + list_add_tail(&rki->link, &c->req_key_traps); + else + list_add(&rki->link, &c->req_key_traps); + + spin_unlock(&c->lock); + kleave(" = 0"); + return 0; +} + +/* + * Remove one or more intercept filters from a container. Returns the number + * of entries removed. + */ +long key_del_intercept(struct container *c, const char *type) +{ + struct request_key_intercept *p, *q; + long count; + LIST_HEAD(graveyard); + + kenter("%p,%s", c, type); + + spin_lock(&c->lock); + list_for_each_entry_safe(p, q, &c->req_key_traps, link) { + if (!type || strcmp(p->type, type) == 0) { + kdebug("- match %d", key_serial(p->dest_keyring)); + list_move(&p->link, &graveyard); + } + } + spin_unlock(&c->lock); + + count = 0; + while (!list_empty(&graveyard)) { + p = list_entry(graveyard.next, struct request_key_intercept, link); + list_del(&p->link); + count++; + + key_put(p->dest_keyring); + kfree(p); + } + + kleave(" = %ld", count); + return count; +} + +/* + * Create an intercept filter and add it to a container. 
+ */ +static long key_create_intercept(struct container *c, const char *type, + key_serial_t dest_ring_id) +{ + struct request_key_intercept *rki; + key_ref_t dest_ref; + long ret = -ENOMEM; + + dest_ref = lookup_user_key(dest_ring_id, KEY_LOOKUP_CREATE, + KEY_NEED_WRITE); + if (IS_ERR(dest_ref)) + return PTR_ERR(dest_ref); + + rki = kzalloc(sizeof(*rki), GFP_KERNEL); + if (!rki) + goto out_dest; + + memcpy(rki->type, type, sizeof(rki->type)); + rki->dest_keyring = key_ref_to_ptr(dest_ref); + /* TODO: set rki->ns */ + + ret = key_add_intercept(c, rki); + if (ret < 0) + goto out_rki; + return ret; + +out_rki: + kfree(rki); +out_dest: + key_ref_put(dest_ref); + return ret; +} + +/* + * Add or remove (if dest_keyring==0) a request_key upcall intercept trap upon + * a container. If _type points to a string of "*" that matches all types. + */ +long keyctl_container_intercept(int containerfd, + const char *_type, + unsigned int ns_id, + key_serial_t dest_ring_id) +{ + struct container *c; + struct fd f; + char type[32] = ""; + long ret; + + if (containerfd < 0 || ns_id < 0) + return -EINVAL; + if (dest_ring_id && !_type) + return -EINVAL; + + f = fdget(containerfd); + if (!f.file) + return -EBADF; + ret = -EINVAL; + if (!is_container_file(f.file)) + goto out_fd; + + c = f.file->private_data; + + /* Find out what type we're dealing with (can be NULL to make removal + * remove everything). + */ + if (_type) { + ret = key_get_type_from_user(type, _type, sizeof(type)); + if (ret < 0) + goto out_fd; + } + + /* TODO: Get the namespace to filter on */ + + /* We add a filter if a destination keyring has been specified. */ + if (dest_ring_id) { + ret = key_create_intercept(c, type, dest_ring_id); + } else { + ret = key_del_intercept(c, _type ? type : NULL); + } + +out_fd: + fdput(f); + return ret; +} + +/* + * Queue a construction record if we can find a handler. 
+ * + * Returns 0 if we found a handler, in which case the authentication key has + * been linked into the handler's service keyring; -EAGAIN if no intercept + * matched, so that the caller can fall back; or an error from key_link(). + */ +int queue_request_key(struct key *authkey) +{ + struct container *c = current->container; + struct request_key_intercept *rki; + struct request_key_auth *rka = get_request_key_auth(authkey); + struct key *service_keyring; + struct key *key = rka->target_key; + int ret; + + kenter("%p,%d,%d", c, key_serial(authkey), key_serial(key)); + + if (list_empty(&c->req_key_traps)) { + kleave(" = -EAGAIN [e]"); + return -EAGAIN; + } + + spin_lock(&c->lock); + + list_for_each_entry(rki, &c->req_key_traps, link) { + if (strcmp(rki->type, "*") == 0 || + strcmp(rki->type, key->type->name) == 0) + goto found_match; + } + + spin_unlock(&c->lock); + kleave(" = -EAGAIN [n]"); + return -EAGAIN; + +found_match: + service_keyring = key_get(rki->dest_keyring); + kdebug("- match %d", key_serial(service_keyring)); + spin_unlock(&c->lock); + + /* We add the authentication key to the keyring for the service daemon + * to collect. This can be detected by means of a watch on the service + * keyring. 
+ */ + ret = key_link(service_keyring, authkey); + key_put(service_keyring); + kleave(" = %d", ret); + return ret; +} diff --git a/security/keys/internal.h b/security/keys/internal.h index 14c5b8ad5bd6..e98fca465146 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -93,6 +93,7 @@ extern wait_queue_head_t request_key_conswq; extern void key_set_index_key(struct keyring_index_key *index_key); extern struct key_type *key_type_lookup(const char *type); extern void key_type_put(struct key_type *ktype); +extern int key_get_type_from_user(char *, const char __user *, unsigned); extern int __key_link_begin(struct key *keyring, const struct keyring_index_key *index_key, @@ -180,6 +181,11 @@ extern void key_gc_keytype(struct key_type *ktype); extern int key_task_permission(const key_ref_t key_ref, const struct cred *cred, key_perm_t perm); +#ifdef CONFIG_CONTAINERS +extern int queue_request_key(struct key *); +#else +static inline int queue_request_key(struct key *authkey) { return -EAGAIN; } +#endif static inline void notify_key(struct key *key, enum key_notification_subtype subtype, u32 aux) @@ -354,6 +360,10 @@ static inline long keyctl_watch_key(key_serial_t key_id, int watch_fd, int watch } #endif +#ifdef CONFIG_CONTAINERS +extern long keyctl_container_intercept(int, const char __user *, unsigned int, key_serial_t); +#endif + /* * Debugging key validation */ diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index 94b99a52b4e5..38ff33431f33 100644 --- a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -30,9 +30,9 @@ #define KEY_MAX_DESC_SIZE 4096 -static int key_get_type_from_user(char *type, - const char __user *_type, - unsigned len) +int key_get_type_from_user(char *type, + const char __user *_type, + unsigned len) { int ret; @@ -1857,6 +1857,14 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3, case KEYCTL_WATCH_KEY: return keyctl_watch_key((key_serial_t)arg2, (int)arg3, (int)arg4); +#ifdef 
CONFIG_CONTAINERS + case KEYCTL_CONTAINER_INTERCEPT: + return keyctl_container_intercept((int)arg2, + (const char __user *)arg3, + (unsigned int)arg4, + (key_serial_t)arg5); +#endif + default: return -EOPNOTSUPP; } diff --git a/security/keys/request_key.c b/security/keys/request_key.c index edfabf20bdbb..078767564283 100644 --- a/security/keys/request_key.c +++ b/security/keys/request_key.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include "internal.h" #include @@ -91,11 +92,11 @@ static int call_usermodehelper_keys(const char *path, char **argv, char **envp, * Request userspace finish the construction of a key * - execute "/sbin/request-key " */ -static int call_sbin_request_key(struct key *authkey, void *aux) +static int call_sbin_request_key(struct key *authkey) { static char const request_key[] = "/sbin/request-key"; struct request_key_auth *rka = get_request_key_auth(authkey); - const struct cred *cred = current_cred(); + const struct cred *cred = rka->cred; key_serial_t prkey, sskey; struct key *key = rka->target_key, *keyring, *session; char *argv[9], *envp[3], uid_str[12], gid_str[12]; @@ -203,7 +204,6 @@ static int construct_key(struct key *key, const void *callout_info, size_t callout_len, void *aux, struct key *dest_keyring) { - request_key_actor_t actor; struct key *authkey; int ret; @@ -216,11 +216,13 @@ static int construct_key(struct key *key, const void *callout_info, return PTR_ERR(authkey); /* Make the call */ - actor = call_sbin_request_key; - if (key->type->request_key) - actor = key->type->request_key; - - ret = actor(authkey, aux); + if (key->type->request_key) { + ret = key->type->request_key(authkey, aux); + } else { + ret = queue_request_key(authkey); + if (ret == -EAGAIN) + ret = call_sbin_request_key(authkey); + } /* check that the actor called complete_request_key() prior to * returning an error */ diff --git a/security/keys/request_key_auth.c b/security/keys/request_key_auth.c index afc304e8b61e..cd75173cadad 
100644 --- a/security/keys/request_key_auth.c +++ b/security/keys/request_key_auth.c @@ -123,6 +123,10 @@ static void free_request_key_auth(struct request_key_auth *rka) { if (!rka) return; + + if (rka->target_key->state == KEY_IS_UNINSTANTIATED) + key_reject_and_link(rka->target_key, 0, -ENOKEY, NULL, NULL); + key_put(rka->target_key); key_put(rka->dest_keyring); if (rka->cred) @@ -184,7 +188,7 @@ struct key *request_key_auth_new(struct key *target, const char *op, goto error_free_rka; } - irka = cred->request_key_auth->payload.data[0]; + irka = get_request_key_auth(cred->request_key_auth); rka->cred = get_cred(irka->cred); rka->pid = irka->pid; From patchwork Fri Feb 15 16:09:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043004 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JBF60mnz9s7h for ; Sat, 16 Feb 2019 03:09:57 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388138AbfBOQJ3 (ORCPT ); Fri, 15 Feb 2019 11:09:29 -0500 Received: from mx1.redhat.com ([209.132.183.28]:60490 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388008AbfBOQJ2 (ORCPT ); Fri, 15 Feb 2019 11:09:28 -0500 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id AA08AC7870; Fri, 15 Feb 2019 16:09:27 
+0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id A0B60600D7; Fri, 15 Feb 2019 16:09:24 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 13/27] keys: Provide a keyctl to query a request_key authentication key From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:09:24 +0000 Message-ID: <155024696409.21651.3488621563034826227.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 15 Feb 2019 16:09:28 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Provide a keyctl to query a request_key authentication key for situations where this information isn't passed on the command line (such as where the authentication key is placed in a queue instead of /sbin/request-key being invoked): struct keyctl_query_request_key_auth { char operation[32]; uid_t fsuid; gid_t fsgid; key_serial_t target_key; key_serial_t thread_keyring; key_serial_t process_keyring; key_serial_t session_keyring; __u64 spare[1]; }; keyctl(KEYCTL_QUERY_REQUEST_KEY_AUTH, key_serial_t key, struct keyctl_query_request_key_auth *data); 
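As an illustration of how a service daemon might consume the query result, the following userspace sketch mirrors the layout described above and formats it for logging. It makes several assumptions: the struct is redeclared locally under the hypothetical name query_request_key_auth with fixed-width fields (the uid and gid fields are assumed to be 32 bits wide, as on current Linux ABIs), describe_auth() is a made-up helper, and no keyctl() call is issued, since the new keyctl only exists with this RFC series applied.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Local mirror of the proposed query structure (assumed layout). */
struct query_request_key_auth {
	char operation[32];		/* operation name, typically "create" */
	uint32_t fsuid;			/* UID of requester */
	uint32_t fsgid;			/* GID of requester */
	uint32_t target_key;		/* the key being instantiated */
	uint32_t thread_keyring;	/* requester's thread keyring */
	uint32_t process_keyring;	/* requester's process keyring */
	uint32_t session_keyring;	/* requester's session keyring */
	uint64_t spare[1];
};

/* 32 + 6 * 4 + 8 bytes, with no padding needed before 'spare'. */
static_assert(sizeof(struct query_request_key_auth) == 64,
	      "unexpected padding in the query structure");

/* Format the fields a daemon would typically log; returns snprintf()'s result. */
static int describe_auth(const struct query_request_key_auth *q,
			 char *buf, size_t len)
{
	return snprintf(buf, len,
			"op=%.32s uid=%" PRIu32 " gid=%" PRIu32
			" key=%" PRIu32 " session=%" PRIu32,
			q->operation, q->fsuid, q->fsgid,
			q->target_key, q->session_keyring);
}
```

The "%.32s" conversion bounds the read of operation[] so the helper stays safe even if the name is not NUL-terminated.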
Signed-off-by: David Howells --- include/uapi/linux/keyctl.h | 12 ++++++++++++ security/keys/compat.c | 2 ++ security/keys/container.c | 42 ++++++++++++++++++++++++++++++++++++++++++ security/keys/internal.h | 2 ++ security/keys/keyctl.c | 4 ++++ 5 files changed, 62 insertions(+) diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h index 85e8fef89bba..bb075ad1827d 100644 --- a/include/uapi/linux/keyctl.h +++ b/include/uapi/linux/keyctl.h @@ -69,6 +69,7 @@ #define KEYCTL_RESTRICT_KEYRING 29 /* Restrict keys allowed to link to a keyring */ #define KEYCTL_WATCH_KEY 30 /* Watch a key or ring of keys for changes */ #define KEYCTL_CONTAINER_INTERCEPT 31 /* Intercept upcalls inside a container */ +#define KEYCTL_QUERY_REQUEST_KEY_AUTH 32 /* Query a request_key_auth key */ /* keyctl structures */ struct keyctl_dh_params { @@ -114,4 +115,15 @@ struct keyctl_pkey_params { __u32 __spare[7]; }; +struct keyctl_query_request_key_auth { + char operation[32]; /* Operation name, typically "create" */ + uid_t fsuid; /* UID of requester */ + gid_t fsgid; /* GID of requester */ + __u32 target_key; /* The key being instantiated */ + __u32 thread_keyring; /* The requester's thread keyring */ + __u32 process_keyring; /* The requester's process keyring */ + __u32 session_keyring; /* The requester's session keyring */ + __u64 spare[1]; +}; + #endif /* _LINUX_KEYCTL_H */ diff --git a/security/keys/compat.c b/security/keys/compat.c index 6420881e5ce7..30055fc2b629 100644 --- a/security/keys/compat.c +++ b/security/keys/compat.c @@ -164,6 +164,8 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option, #ifdef CONFIG_CONTAINERS case KEYCTL_CONTAINER_INTERCEPT: return keyctl_container_intercept(arg2, compat_ptr(arg3), arg4, arg5); + case KEYCTL_QUERY_REQUEST_KEY_AUTH: + return keyctl_query_request_key_auth(arg2, compat_ptr(arg3)); #endif default: diff --git a/security/keys/container.c b/security/keys/container.c index c61c43658f3b..115998e867cd 100644 --- a/security/keys/container.c +++ 
b/security/keys/container.c @@ -225,3 +225,45 @@ int queue_request_key(struct key *authkey) kleave(" = %d", ret); return ret; } + +/* + * Query information about a request_key_auth key. + */ +long keyctl_query_request_key_auth(key_serial_t auth_id, + struct keyctl_query_request_key_auth __user *_data) +{ + struct keyctl_query_request_key_auth data; + struct request_key_auth *rka; + struct key *session; + key_ref_t authkey_ref; + + if (auth_id <= 0 || !_data) + return -EINVAL; + + authkey_ref = lookup_user_key(auth_id, 0, KEY_NEED_SEARCH); + if (IS_ERR(authkey_ref)) + return PTR_ERR(authkey_ref); + rka = get_request_key_auth(key_ref_to_ptr(authkey_ref)); + + memset(&data, 0, sizeof(data)); + strlcpy(data.operation, rka->op, sizeof(data.operation)); + data.fsuid = from_kuid(current_user_ns(), rka->cred->fsuid); + data.fsgid = from_kgid(current_user_ns(), rka->cred->fsgid); + data.target_key = rka->target_key->serial; + data.thread_keyring = key_serial(rka->cred->thread_keyring); + data.process_keyring = key_serial(rka->cred->process_keyring); + + rcu_read_lock(); + session = rcu_dereference(rka->cred->session_keyring); + if (!session) + session = rka->cred->user->session_keyring; + data.session_keyring = key_serial(session); + rcu_read_unlock(); + + key_ref_put(authkey_ref); + + if (copy_to_user(_data, &data, sizeof(data))) + return -EFAULT; + + return 0; +} diff --git a/security/keys/internal.h b/security/keys/internal.h index e98fca465146..9f2a6ce67d15 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -362,6 +362,8 @@ static inline long keyctl_watch_key(key_serial_t key_id, int watch_fd, int watch #ifdef CONFIG_CONTAINERS extern long keyctl_container_intercept(int, const char __user *, unsigned int, key_serial_t); +extern long keyctl_query_request_key_auth(key_serial_t, + struct keyctl_query_request_key_auth __user *); #endif /* diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index 38ff33431f33..a19efc60944d 100644 --- 
a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -1863,6 +1863,10 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3, (const char __user *)arg3, (unsigned int)arg4, (key_serial_t)arg5); + case KEYCTL_QUERY_REQUEST_KEY_AUTH: + return keyctl_query_request_key_auth( + (key_serial_t)arg2, + (struct keyctl_query_request_key_auth __user *)arg3); #endif default: From patchwork Fri Feb 15 16:09:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043002 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441J9z41nxz9rxp for ; Sat, 16 Feb 2019 03:09:43 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404115AbfBOQJg (ORCPT ); Fri, 15 Feb 2019 11:09:36 -0500 Received: from mx1.redhat.com ([209.132.183.28]:9999 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2403968AbfBOQJf (ORCPT ); Fri, 15 Feb 2019 11:09:35 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 8C2D2A12AA; Fri, 15 Feb 2019 16:09:35 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8FD201024932; Fri, 15 Feb 2019 16:09:33 +0000 (UTC) Organization: Red Hat UK Ltd. 
Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 14/27] keys: Break bits out of key_unlink() From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:09:32 +0000 Message-ID: <155024697274.21651.2339284853609462143.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Fri, 15 Feb 2019 16:09:35 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Break bits out of key_unlink() into helper functions so that they can be used in implementing key_move(). Signed-off-by: David Howells --- security/keys/keyring.c | 89 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 66 insertions(+), 23 deletions(-) diff --git a/security/keys/keyring.c b/security/keys/keyring.c index 062cad635edf..431094c6cd74 100644 --- a/security/keys/keyring.c +++ b/security/keys/keyring.c @@ -1409,6 +1409,66 @@ int key_link(struct key *keyring, struct key *key) } EXPORT_SYMBOL(key_link); +/* + * Begin the process of unlinking a key from a keyring. 
+ */ +static int __key_unlink_begin(struct key *keyring, unsigned int lock_nesting, + struct key *key, struct assoc_array_edit **_edit) + __acquires(&keyring->sem) +{ + struct assoc_array_edit *edit; + int ret; + + if (keyring->type != &key_type_keyring) + return -ENOTDIR; + + down_write_nested(&keyring->sem, lock_nesting); + + edit = assoc_array_delete(&keyring->keys, &keyring_assoc_array_ops, + &key->index_key); + if (IS_ERR(edit)) { + ret = PTR_ERR(edit); + goto error; + } + + if (!edit) { + ret = -ENOENT; + goto error; + } + + *_edit = edit; + return 0; + +error: + up_write(&keyring->sem); + return ret; +} + +/* + * Apply an unlink change. + */ +static void __key_unlink(struct key *keyring, struct key *key, + struct assoc_array_edit **_edit) +{ + assoc_array_apply_edit(*_edit); + *_edit = NULL; + notify_key(keyring, NOTIFY_KEY_UNLINKED, key_serial(key)); + key_payload_reserve(keyring, keyring->datalen - KEYQUOTA_LINK_BYTES); +} + +/* + * Finish unlinking a key from a keyring. + */ +static void __key_unlink_end(struct key *keyring, + struct key *key, + struct assoc_array_edit *edit) + __releases(&keyring->sem) +{ + if (edit) + assoc_array_cancel_edit(edit); + up_write(&keyring->sem); +} + /** * key_unlink - Unlink the first link to a key from a keyring. * @keyring: The keyring to remove the link from. 
@@ -1429,35 +1489,18 @@ EXPORT_SYMBOL(key_link); int key_unlink(struct key *keyring, struct key *key) { struct assoc_array_edit *edit; - key_serial_t target = key_serial(key); int ret; key_check(keyring); key_check(key); - if (keyring->type != &key_type_keyring) - return -ENOTDIR; - - down_write(&keyring->sem); - - edit = assoc_array_delete(&keyring->keys, &keyring_assoc_array_ops, - &key->index_key); - if (IS_ERR(edit)) { - ret = PTR_ERR(edit); - goto error; - } - ret = -ENOENT; - if (edit == NULL) - goto error; - - assoc_array_apply_edit(edit); - notify_key(keyring, NOTIFY_KEY_UNLINKED, target); - key_payload_reserve(keyring, keyring->datalen - KEYQUOTA_LINK_BYTES); - ret = 0; + ret = __key_unlink_begin(keyring, 0, key, &edit); + if (ret < 0) + return ret; -error: - up_write(&keyring->sem); - return ret; + __key_unlink(keyring, key, &edit); + __key_unlink_end(keyring, key, edit); + return 0; } EXPORT_SYMBOL(key_unlink); From patchwork Fri Feb 15 16:09:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043003 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JB32cN3z9s7h for ; Sat, 16 Feb 2019 03:09:47 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404154AbfBOQJq (ORCPT ); Fri, 15 Feb 2019 11:09:46 -0500 Received: from mx1.redhat.com ([209.132.183.28]:35260 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2403968AbfBOQJp (ORCPT ); Fri, 15 Feb 2019 
11:09:45 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 79338C0C5A4A; Fri, 15 Feb 2019 16:09:44 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7D641101E841; Fri, 15 Feb 2019 16:09:41 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 15/27] keys: Make __key_link_begin() handle lockdep nesting From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:09:40 +0000 Message-ID: <155024698079.21651.18304462489603588280.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Fri, 15 Feb 2019 16:09:44 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Make __key_link_begin() handle lockdep nesting for the implementation of key_move() where we have to lock two keyrings. 
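To illustrate why the nesting argument matters when two keyring semaphores are held at once, here is a toy userspace model of the check involved. It is not how lockdep is implemented. It only captures the rule that taking a second lock of the same class with the same subclass is reported as possible recursive locking, whereas a different subclass (as passed to down_write_nested()) is accepted. struct toy_lockdep, toy_acquire() and toy_release() are hypothetical names, and class 1 stands in for the keyring semaphore class.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELD 8	/* arbitrary bound for this model only */

struct held_lock {
	int class_id;		/* lock class, e.g. "keyring->sem" */
	unsigned int subclass;	/* nesting level given by the caller */
};

struct toy_lockdep {
	struct held_lock held[MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition.  Returns 0 if it is acceptable, or -1 if a lock of
 * the same class and subclass is already held (the case a lock validator
 * would flag as possible recursive locking).
 */
static int toy_acquire(struct toy_lockdep *ld, int class_id,
		       unsigned int subclass)
{
	size_t i;

	assert(ld->depth < MAX_HELD);

	for (i = 0; i < ld->depth; i++)
		if (ld->held[i].class_id == class_id &&
		    ld->held[i].subclass == subclass)
			return -1;

	ld->held[ld->depth].class_id = class_id;
	ld->held[ld->depth].subclass = subclass;
	ld->depth++;
	return 0;
}

/* Drop the most recently recorded acquisition. */
static void toy_release(struct toy_lockdep *ld)
{
	assert(ld->depth > 0);
	ld->depth--;
}
```

With one keyring already held at nesting level 0, a second keyring taken at level 0 is rejected by the model, while the same acquisition at level 1 is accepted. That is the situation the added lock_nesting parameter is meant to cover for key_move().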
Signed-off-by: David Howells --- security/keys/internal.h | 2 +- security/keys/key.c | 6 +++--- security/keys/keyring.c | 6 +++--- security/keys/request_key.c | 2 +- 4 files changed, 8 insertions(+), 8 deletions(-) diff --git a/security/keys/internal.h b/security/keys/internal.h index 9f2a6ce67d15..40846657aebd 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -95,7 +95,7 @@ extern struct key_type *key_type_lookup(const char *type); extern void key_type_put(struct key_type *ktype); extern int key_get_type_from_user(char *, const char __user *, unsigned); -extern int __key_link_begin(struct key *keyring, +extern int __key_link_begin(struct key *keyring, unsigned int lock_nesting, const struct keyring_index_key *index_key, struct assoc_array_edit **_edit); extern int __key_link_check_live_key(struct key *keyring, struct key *key); diff --git a/security/keys/key.c b/security/keys/key.c index 2c60d6bcf8a3..63513ffcf2e8 100644 --- a/security/keys/key.c +++ b/security/keys/key.c @@ -518,7 +518,7 @@ int key_instantiate_and_link(struct key *key, } if (keyring) { - ret = __key_link_begin(keyring, &key->index_key, &edit); + ret = __key_link_begin(keyring, 0, &key->index_key, &edit); if (ret < 0) goto error; @@ -586,7 +586,7 @@ int key_reject_and_link(struct key *key, if (keyring->restrict_link) return -EPERM; - link_ret = __key_link_begin(keyring, &key->index_key, &edit); + link_ret = __key_link_begin(keyring, 0, &key->index_key, &edit); } mutex_lock(&key_construction_mutex); @@ -866,7 +866,7 @@ key_ref_t key_create_or_update(key_ref_t keyring_ref, index_key.desc_len = strlen(index_key.description); key_set_index_key(&index_key); - ret = __key_link_begin(keyring, &index_key, &edit); + ret = __key_link_begin(keyring, 0, &index_key, &edit); if (ret < 0) { key_ref = ERR_PTR(ret); goto error_free_prep; diff --git a/security/keys/keyring.c b/security/keys/keyring.c index 431094c6cd74..1334ed97e530 100644 --- a/security/keys/keyring.c +++ 
b/security/keys/keyring.c @@ -1227,7 +1227,7 @@ static int keyring_detect_cycle(struct key *A, struct key *B) /* * Preallocate memory so that a key can be linked into to a keyring. */ -int __key_link_begin(struct key *keyring, +int __key_link_begin(struct key *keyring, unsigned int lock_nesting, const struct keyring_index_key *index_key, struct assoc_array_edit **_edit) __acquires(&keyring->sem) @@ -1244,7 +1244,7 @@ int __key_link_begin(struct key *keyring, if (keyring->type != &key_type_keyring) return -ENOTDIR; - down_write(&keyring->sem); + down_write_nested(&keyring->sem, lock_nesting); ret = -EKEYREVOKED; if (test_bit(KEY_FLAG_REVOKED, &keyring->flags)) @@ -1393,7 +1393,7 @@ int key_link(struct key *keyring, struct key *key) key_check(keyring); key_check(key); - ret = __key_link_begin(keyring, &key->index_key, &edit); + ret = __key_link_begin(keyring, 0, &key->index_key, &edit); if (ret == 0) { kdebug("begun {%d,%d}", keyring->serial, refcount_read(&keyring->usage)); ret = __key_link_check_restriction(keyring, key); diff --git a/security/keys/request_key.c b/security/keys/request_key.c index 078767564283..ab1f6de9e623 100644 --- a/security/keys/request_key.c +++ b/security/keys/request_key.c @@ -375,7 +375,7 @@ static int construct_alloc_key(struct keyring_search_context *ctx, set_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags); if (dest_keyring) { - ret = __key_link_begin(dest_keyring, &ctx->index_key, &edit); + ret = __key_link_begin(dest_keyring, 0, &ctx->index_key, &edit); if (ret < 0) goto link_prealloc_failed; } From patchwork Fri Feb 15 16:09:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043005 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; 
envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JBK0xj5z9rxp for ; Sat, 16 Feb 2019 03:10:01 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729256AbfBOQKA (ORCPT ); Fri, 15 Feb 2019 11:10:00 -0500 Received: from mx1.redhat.com ([209.132.183.28]:15175 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726335AbfBOQKA (ORCPT ); Fri, 15 Feb 2019 11:10:00 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C5CD531A10D; Fri, 15 Feb 2019 16:09:59 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 33A415DD6B; Fri, 15 Feb 2019 16:09:51 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [RFC PATCH 16/27] keys: Grant Link permission to possessers of request_key auth keys From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:09:50 +0000 Message-ID: <155024699041.21651.17284583580026798362.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Fri, 15 Feb 2019 16:09:59 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Grant Link permission to the possessors of request_key authentication keys, thereby allowing a daemon that is servicing upcalls to arrange things such that only the necessary auth key is passed to the actual service program and not all of the daemon's pending auth keys.
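As a rough illustration of what this change does to the permission mask, the sketch below recomputes the mask handed to a new auth key with and without the extra bit. The EX_-prefixed constants are local stand-ins that mirror the KEY_POS_* and KEY_USR_* bit values in include/linux/key.h, and both helper functions are hypothetical names used only for this example; none of this is kernel code.

```c
#include <stdint.h>

/* Local stand-ins mirroring the permission bits in include/linux/key.h. */
#define EX_KEY_POS_VIEW   0x01000000u	/* possessor can view attributes */
#define EX_KEY_POS_READ   0x02000000u	/* possessor can read the payload */
#define EX_KEY_POS_SEARCH 0x08000000u	/* possessor can find the key by search */
#define EX_KEY_POS_LINK   0x10000000u	/* possessor can link the key into a keyring */
#define EX_KEY_USR_VIEW   0x00010000u	/* owner can view attributes */

/* Hypothetical helper: build the auth key permission mask, optionally
 * including the Link bit that this patch adds. */
static uint32_t ex_authkey_perm(int with_patch)
{
	uint32_t perm = EX_KEY_POS_VIEW | EX_KEY_POS_READ |
			EX_KEY_POS_SEARCH | EX_KEY_USR_VIEW;

	if (with_patch)
		perm |= EX_KEY_POS_LINK;
	return perm;
}

/* Hypothetical helper: may a possessor link (and hence move) this key? */
static int ex_possessor_may_link(uint32_t perm)
{
	return (perm & EX_KEY_POS_LINK) != 0;
}
```

Without the Link bit a possessor of the auth key cannot place it in another keyring, which is what a daemon needs to do in order to hand a single auth key to a service program.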
Signed-off-by: David Howells --- security/keys/request_key_auth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/security/keys/request_key_auth.c b/security/keys/request_key_auth.c index cd75173cadad..726555a0639c 100644 --- a/security/keys/request_key_auth.c +++ b/security/keys/request_key_auth.c @@ -208,7 +208,7 @@ struct key *request_key_auth_new(struct key *target, const char *op, authkey = key_alloc(&key_type_request_key_auth, desc, cred->fsuid, cred->fsgid, cred, - KEY_POS_VIEW | KEY_POS_READ | KEY_POS_SEARCH | + KEY_POS_VIEW | KEY_POS_READ | KEY_POS_SEARCH | KEY_POS_LINK | KEY_USR_VIEW, KEY_ALLOC_NOT_IN_QUOTA, NULL); if (IS_ERR(authkey)) { ret = PTR_ERR(authkey); From patchwork Fri Feb 15 16:10:05 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043006 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JBc0RbGz9rxp for ; Sat, 16 Feb 2019 03:10:16 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729662AbfBOQKJ (ORCPT ); Fri, 15 Feb 2019 11:10:09 -0500 Received: from mx1.redhat.com ([209.132.183.28]:36662 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2389248AbfBOQKI (ORCPT ); Fri, 15 Feb 2019 11:10:08 -0500 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS 
id E77EEAD886; Fri, 15 Feb 2019 16:10:07 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id C60E81024955; Fri, 15 Feb 2019 16:10:05 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 17/27] keys: Add a keyctl to move a key between keyrings From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:10:05 +0000 Message-ID: <155024700503.21651.11352044662949476132.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Fri, 15 Feb 2019 16:10:08 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Add a keyctl to atomically move a link to a key from one keyring to another. The key must be linked in the "from" keyring, and a flag can be given to cause the operation to fail if there's already a matching key in the "to" keyring. This can be done with: keyctl(KEYCTL_MOVE, key_serial_t key, key_serial_t from_keyring, key_serial_t to_keyring, unsigned int flags); The key being moved must grant Link permission and both keyrings must grant Write permission.
flags should be 0 or KEYCTL_MOVE_EXCL, with the latter preventing displacement of a matching key from the "to" keyring. Signed-off-by: David Howells --- include/linux/key.h | 5 ++ include/uapi/linux/keyctl.h | 3 + security/keys/compat.c | 3 + security/keys/internal.h | 1 security/keys/keyctl.c | 55 +++++++++++++++++++++++++++ security/keys/keyring.c | 88 +++++++++++++++++++++++++++++++++++++++++++ 6 files changed, 155 insertions(+) diff --git a/include/linux/key.h b/include/linux/key.h index 82eb1b8d6336..165f842ec042 100644 --- a/include/linux/key.h +++ b/include/linux/key.h @@ -335,6 +335,11 @@ extern int key_update(key_ref_t key, extern int key_link(struct key *keyring, struct key *key); +extern int key_move(struct key *key, + struct key *from_keyring, + struct key *to_keyring, + unsigned int flags); + extern int key_unlink(struct key *keyring, struct key *key); diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h index bb075ad1827d..425bbd9612c4 100644 --- a/include/uapi/linux/keyctl.h +++ b/include/uapi/linux/keyctl.h @@ -70,6 +70,7 @@ #define KEYCTL_WATCH_KEY 30 /* Watch a key or ring of keys for changes */ #define KEYCTL_CONTAINER_INTERCEPT 31 /* Intercept upcalls inside a container */ #define KEYCTL_QUERY_REQUEST_KEY_AUTH 32 /* Query a request_key_auth key */ +#define KEYCTL_MOVE 33 /* Move keys between keyrings */ /* keyctl structures */ struct keyctl_dh_params { @@ -126,4 +127,6 @@ struct keyctl_query_request_key_auth { __u64 spare[1]; }; +#define KEYCTL_MOVE_EXCL 0x00000001 /* Do not displace from the to-keyring */ + #endif /* _LINUX_KEYCTL_H */ diff --git a/security/keys/compat.c b/security/keys/compat.c index 30055fc2b629..ed36efa13c48 100644 --- a/security/keys/compat.c +++ b/security/keys/compat.c @@ -168,6 +168,9 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option, return keyctl_query_request_key_auth(arg2, compat_ptr(arg3)); #endif + case KEYCTL_MOVE: + return keyctl_keyring_move(arg2, arg3, arg4, arg5); + default: return -EOPNOTSUPP; } 
diff --git a/security/keys/internal.h b/security/keys/internal.h index 40846657aebd..bad4a8038a99 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -242,6 +242,7 @@ extern long keyctl_update_key(key_serial_t, const void __user *, size_t); extern long keyctl_revoke_key(key_serial_t); extern long keyctl_keyring_clear(key_serial_t); extern long keyctl_keyring_link(key_serial_t, key_serial_t); +extern long keyctl_keyring_move(key_serial_t, key_serial_t, key_serial_t, unsigned int); extern long keyctl_keyring_unlink(key_serial_t, key_serial_t); extern long keyctl_describe_key(key_serial_t, char __user *, size_t); extern long keyctl_keyring_search(key_serial_t, const char __user *, diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index a19efc60944d..6057b810c611 100644 --- a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -572,6 +572,55 @@ long keyctl_keyring_unlink(key_serial_t id, key_serial_t ringid) return ret; } +/* + * Move a link to a key from one keyring to another, displacing any matching + * key from the destination keyring. + * + * The key must grant the caller Link permission and both keyrings must grant + * the caller Write permission. There must also be a link in the from keyring + * to the key. If both keyrings are the same, nothing is done. + * + * If successful, 0 will be returned. 
+ */ +long keyctl_keyring_move(key_serial_t id, key_serial_t from_ringid, + key_serial_t to_ringid, unsigned int flags) +{ + key_ref_t key_ref, from_ref, to_ref; + long ret; + + if (flags & ~KEYCTL_MOVE_EXCL) + return -EINVAL; + + key_ref = lookup_user_key(id, KEY_LOOKUP_CREATE, KEY_NEED_LINK); + if (IS_ERR(key_ref)) { + ret = PTR_ERR(key_ref); + goto error; + } + + from_ref = lookup_user_key(from_ringid, 0, KEY_NEED_WRITE); + if (IS_ERR(from_ref)) { + ret = PTR_ERR(from_ref); + goto error2; + } + + to_ref = lookup_user_key(to_ringid, KEY_LOOKUP_CREATE, KEY_NEED_WRITE); + if (IS_ERR(to_ref)) { + ret = PTR_ERR(to_ref); + goto error3; + } + + ret = key_move(key_ref_to_ptr(key_ref), key_ref_to_ptr(from_ref), + key_ref_to_ptr(to_ref), flags); + + key_ref_put(to_ref); +error3: + key_ref_put(from_ref); +error2: + key_ref_put(key_ref); +error: + return ret; +} + /* * Return a description of a key to userspace. * @@ -1869,6 +1918,12 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3, (struct keyctl_query_request_key_auth __user *)arg3); #endif + case KEYCTL_MOVE: + return keyctl_keyring_move((key_serial_t)arg2, + (key_serial_t)arg3, + (key_serial_t)arg4, + (unsigned int)arg5); + default: return -EOPNOTSUPP; } diff --git a/security/keys/keyring.c b/security/keys/keyring.c index 1334ed97e530..14df79814ea0 100644 --- a/security/keys/keyring.c +++ b/security/keys/keyring.c @@ -1504,6 +1504,94 @@ int key_unlink(struct key *keyring, struct key *key) } EXPORT_SYMBOL(key_unlink); +/** + * key_move - Move a key from one keyring to another + * @key: The key to move + * @from_keyring: The keyring to remove the link from. + * @to_keyring: The keyring to make the link in. + * @flags: Qualifying flags, such as KEYCTL_MOVE_EXCL. 
+ * + * Make a link in @to_keyring to a key, such that the keyring holds a reference + * on that key and the key can potentially be found by searching that keyring + * whilst simultaneously removing a link to the key from @from_keyring. + * + * This function will write-lock both keyrings' semaphores and will consume + * some of the user's key data quota to hold the link on @to_keyring. + * + * Returns 0 if successful, -ENOTDIR if either keyring isn't a keyring, + * -EKEYREVOKED if either keyring has been revoked, -ENFILE if the second + * keyring is full, -EDQUOT if there is insufficient key data quota remaining + * to add another link or -ENOMEM if there's insufficient memory. If + * KEYCTL_MOVE_EXCL is set, then -EEXIST will be returned if there's already a + * matching key in @to_keyring. + * + * It is assumed that the caller has checked that it is permitted for a link to + * be made (both keyrings should grant Write permission and the key Link + * permission). + */ +int key_move(struct key *key, + struct key *from_keyring, + struct key *to_keyring, + unsigned int flags) +{ + struct assoc_array_edit *from_edit, *to_edit; + int ret; + + kenter("%d,%d,%d", key->serial, from_keyring->serial, to_keyring->serial); + + if (from_keyring == to_keyring) + return 0; + + key_check(key); + key_check(from_keyring); + key_check(to_keyring); + + /* We have to be very careful here to take the keyring locks in the + * right order, lest we open ourselves to deadlocking against another + * move operation.
+ */ + if (from_keyring < to_keyring) { + ret = __key_unlink_begin(from_keyring, 0, key, &from_edit); + if (ret < 0) + goto out; + ret = __key_link_begin(to_keyring, 1, &key->index_key, &to_edit); + if (ret < 0) { + assoc_array_cancel_edit(from_edit); + goto out; + } + } else { + ret = __key_link_begin(to_keyring, 0, &key->index_key, &to_edit); + if (ret < 0) + goto out; + ret = __key_unlink_begin(from_keyring, 1, key, &from_edit); + if (ret < 0) { + __key_link_end(to_keyring, &key->index_key, to_edit); + goto out; + } + } + + ret = -EEXIST; + if (to_edit->dead_leaf && (flags & KEYCTL_MOVE_EXCL)) + goto error; + + ret = __key_link_check_restriction(to_keyring, key); + if (ret < 0) + goto error; + ret = __key_link_check_live_key(to_keyring, key); + if (ret < 0) + goto error; + + __key_unlink(from_keyring, key, &from_edit); + __key_link(to_keyring, key, &to_edit); +error: + __key_unlink_end(from_keyring, key, from_edit); + __key_link_end(to_keyring, &key->index_key, to_edit); +out: + kleave(" = %d", ret); + return ret; +} +EXPORT_SYMBOL(key_move); + /** * keyring_clear - Clear a keyring * @keyring: The keyring to clear. 
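The lock-ordering rule that key_move() relies on can be sketched in userspace as follows. This is only a minimal model, assuming that "lower address first" is the whole rule: it decides which of two keyrings would be locked first (nesting level 0) and which second (nesting level 1), so that two concurrent moves in opposite directions always agree on the order. struct ex_lock_plan and ex_plan_locks() are hypothetical names and do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the order in which two keyrings get locked. */
struct ex_lock_plan {
	const void *first;	/* would be locked with nesting level 0 */
	const void *second;	/* would be locked with nesting level 1 */
};

/* Choose a lock order that depends only on the two addresses and not on
 * which keyring is the source and which is the destination. */
static struct ex_lock_plan ex_plan_locks(const void *from, const void *to)
{
	struct ex_lock_plan plan;

	/* The same-keyring case is handled by the caller before locking. */
	assert(from != NULL && to != NULL && from != to);

	if ((uintptr_t)from < (uintptr_t)to) {
		plan.first = from;
		plan.second = to;
	} else {
		plan.first = to;
		plan.second = from;
	}
	return plan;
}
```

Because the plan is symmetric in its arguments, a move from A to B and a simultaneous move from B to A both try to take the same lock first, so neither can hold one lock while waiting for the other.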
From patchwork Fri Feb 15 16:10:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043007 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JBy1GcJz9s7T for ; Sat, 16 Feb 2019 03:10:34 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730105AbfBOQK0 (ORCPT ); Fri, 15 Feb 2019 11:10:26 -0500 Received: from mx1.redhat.com ([209.132.183.28]:3145 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728146AbfBOQKZ (ORCPT ); Fri, 15 Feb 2019 11:10:25 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D44FC315FCA; Fri, 15 Feb 2019 16:10:24 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id EE3BA5D9CD; Fri, 15 Feb 2019 16:10:18 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 18/27] keys: Find the least-recently used unseen key in a keyring. 
From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:10:13 +0000 Message-ID: <155024701313.21651.15123621736164077230.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Fri, 15 Feb 2019 16:10:25 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Provide a keyctl by which the oldest "unseen" key in a keyring can be found. The "unseenness" is controlled by a flag on the key, so is shared across all keyrings that might link to a key. The flag is only set by this keyctl. The keyctl looks like: key = keyctl_find_lru(key_serial_t keyring, const char *type_name) It searches the nominated keyring subtree for a valid key of the specified type and returns its serial number or -ENOKEY if no valid, unseen keys are found. This is primarily intended for use with ".request_key_auth"-type keys in container upcall management. Ordinarily, it should be possible to just pick the serial numbers out of the notification records from when an auth key gets added to the upcall keyring, but if the buffer gets overrun, then some other means must be employed. [!] I'm not sure I need to do the "unseen" check at all. This call is only really needed if there's a notification buffer overrun. 
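The selection rule described above can be modelled in plain C as a linear scan. This is a sketch under simplifying assumptions, not the kernel implementation: the keyring is a flat array, there is no type matching, validation or retry loop, and -1 stands in for whatever error code the real call returns. struct ex_key and ex_find_lru() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of a key that matter here. */
struct ex_key {
	int serial;		/* key serial number */
	int64_t last_used_at;	/* timestamp of last use */
	bool seen;		/* models the per-key "seen" flag */
};

/* Return the serial of the least-recently used unseen key and mark it seen,
 * or -1 if every key has already been seen. */
static int ex_find_lru(struct ex_key *keys, size_t nr)
{
	struct ex_key *candidate = NULL;
	int64_t oldest = INT64_MAX;
	size_t i;

	assert(keys != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (!keys[i].seen && keys[i].last_used_at < oldest) {
			oldest = keys[i].last_used_at;
			candidate = &keys[i];
		}
	}

	if (!candidate)
		return -1;

	/* The flag lives on the key, so every keyring that links to this key
	 * will now treat it as seen. */
	candidate->seen = true;
	return candidate->serial;
}
```

Calling the function repeatedly walks the unseen keys from oldest to newest, which is the behaviour a daemon would want when recovering from a notification buffer overrun.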
Signed-off-by: David Howells --- include/linux/key.h | 1 include/uapi/linux/keyctl.h | 1 security/keys/compat.c | 2 + security/keys/container.c | 106 +++++++++++++++++++++++++++++++++++++++++++ security/keys/internal.h | 1 security/keys/keyctl.c | 3 + 6 files changed, 114 insertions(+) diff --git a/include/linux/key.h b/include/linux/key.h index 165f842ec042..de190036512b 100644 --- a/include/linux/key.h +++ b/include/linux/key.h @@ -219,6 +219,7 @@ struct key { #define KEY_FLAG_KEEP 8 /* set if key should not be removed */ #define KEY_FLAG_UID_KEYRING 9 /* set if key is a user or user session keyring */ #define KEY_FLAG_SET_WATCH_PROXY 10 /* Set if watch_proxy should be set on added keys */ +#define KEY_FLAG_SEEN 11 /* Set if returned by keyctl_find_lru() */ /* the key type and key description string * - the desc is used to match a key against search criteria diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h index 425bbd9612c4..5b792303a05b 100644 --- a/include/uapi/linux/keyctl.h +++ b/include/uapi/linux/keyctl.h @@ -71,6 +71,7 @@ #define KEYCTL_CONTAINER_INTERCEPT 31 /* Intercept upcalls inside a container */ #define KEYCTL_QUERY_REQUEST_KEY_AUTH 32 /* Query a request_key_auth key */ #define KEYCTL_MOVE 33 /* Move keys between keyrings */ +#define KEYCTL_FIND_LRU 34 /* Find the least-recently used key in a keyring */ /* keyctl structures */ struct keyctl_dh_params { diff --git a/security/keys/compat.c b/security/keys/compat.c index ed36efa13c48..160fb7b37352 100644 --- a/security/keys/compat.c +++ b/security/keys/compat.c @@ -166,6 +166,8 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option, return keyctl_container_intercept(arg2, compat_ptr(arg3), arg4, arg5); case KEYCTL_QUERY_REQUEST_KEY_AUTH: return keyctl_query_request_key_auth(arg2, compat_ptr(arg3)); + case KEYCTL_FIND_LRU: + return keyctl_find_lru(arg2, compat_ptr(arg3)); #endif case KEYCTL_MOVE: diff --git a/security/keys/container.c b/security/keys/container.c index 
115998e867cd..8e6b3c8710e2 100644 --- a/security/keys/container.c +++ b/security/keys/container.c @@ -267,3 +267,109 @@ long keyctl_query_request_key_auth(key_serial_t auth_id, return 0; } + +struct key_lru_search_state { + struct key *candidate; + time64_t oldest; +}; + +/* + * Iterate over all the keys in the keyring looking for the one with the oldest + * timestamp. + */ +static bool cmp_lru(const struct key *key, + const struct key_match_data *match_data) +{ + struct key_lru_search_state *state = (void *)match_data->raw_data; + time64_t t; + + t = READ_ONCE(key->last_used_at); + if (state->oldest > t && !test_bit(KEY_FLAG_SEEN, &key->flags)) { + state->oldest = t; + state->candidate = (struct key *)key; + } + + return false; +} + +/* + * Find the oldest key in a keyring of a particular type. + */ +long keyctl_find_lru(key_serial_t _keyring, const char __user *type_name) +{ + struct key_lru_search_state state; + struct keyring_search_context ctx = { + .index_key.description = NULL, + .cred = current_cred(), + .match_data.cmp = cmp_lru, + .match_data.raw_data = &state, + .match_data.lookup_type = KEYRING_SEARCH_LOOKUP_ITERATE, + .flags = KEYRING_SEARCH_DO_STATE_CHECK, + }; + struct key_type *ktype; + struct key *key; + key_ref_t keyring_ref, ref; + char type[32]; + int ret, max_iter = 10; + + if (!_keyring || !type_name) + return -EINVAL; + + /* We want to allow special types, such as ".request_key_auth" */ + ret = strncpy_from_user(type, type_name, sizeof(type)); + if (ret < 0) + return ret; + if (ret == 0 || ret >= sizeof(type)) + return -EINVAL; + type[ret] = '\0'; + + keyring_ref = lookup_user_key(_keyring, 0, KEY_NEED_SEARCH); + if (IS_ERR(keyring_ref)) + return PTR_ERR(keyring_ref); + + if (strcmp(type, key_type_request_key_auth.name) == 0) { + ktype = &key_type_request_key_auth; + } else { + ktype = key_type_lookup(type); + if (IS_ERR(ktype)) { + ret = PTR_ERR(ktype); + goto error_ring; + } + } + + ctx.index_key.type = ktype; + + do { + state.oldest = 
TIME64_MAX; + state.candidate = NULL; + + rcu_read_lock(); + + /* Scan the keyring. We expect this to end in -EAGAIN as we + * can't generate a result until the entire scan is completed. + */ + ret = -EAGAIN; + ref = keyring_search_aux(keyring_ref, &ctx); + + key = state.candidate; + if (key && + !test_and_set_bit(KEY_FLAG_SEEN, &key->flags) && + key_validate(key) == 0) { + ret = key->serial; + goto error_unlock; + } + + + rcu_read_unlock(); + } while (--max_iter > 0); + goto error_type; + +error_unlock: + rcu_read_unlock(); +error_type: + if (ktype != &key_type_request_key_auth) + key_type_put(ktype); +error_ring: + key_ref_put(keyring_ref); + return ret; +} diff --git a/security/keys/internal.h b/security/keys/internal.h index bad4a8038a99..fe4a4da1ff17 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -365,6 +365,7 @@ static inline long keyctl_watch_key(key_serial_t key_id, int watch_fd, int watch extern long keyctl_container_intercept(int, const char __user *, unsigned int, key_serial_t); extern long keyctl_query_request_key_auth(key_serial_t, struct keyctl_query_request_key_auth __user *); +extern long keyctl_find_lru(key_serial_t, const char __user *); #endif /* diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index 6057b810c611..1446bc52e369 100644 --- a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -1916,6 +1916,9 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3, return keyctl_query_request_key_auth( (key_serial_t)arg2, (struct keyctl_query_request_key_auth __user *)arg3); + case KEYCTL_FIND_LRU: + return keyctl_find_lru((key_serial_t)arg2, + (const char __user *)arg3); #endif case KEYCTL_MOVE: From patchwork Fri Feb 15 16:10:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043008 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JC62Fqyz9rxp for ; Sat, 16 Feb 2019 03:10:42 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2389957AbfBOQKl (ORCPT ); Fri, 15 Feb 2019 11:10:41 -0500 Received: from mx1.redhat.com ([209.132.183.28]:53120 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729225AbfBOQKl (ORCPT ); Fri, 15 Feb 2019 11:10:41 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7456F1244CF; Fri, 15 Feb 2019 16:10:40 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id C36E95D70D; Fri, 15 Feb 2019 16:10:30 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903 Subject: [RFC PATCH 19/27] containers: Sample: request_key upcall handling From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:10:30 +0000 Message-ID: <155024703003.21651.3499235528404179500.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 15 Feb 2019 16:10:40 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Implement sample upcall handling. Firstly, the test-container sample is modified to (a) create a staging keyring and (b) intercept request_key calls for user-type keys inside the container, placing the authentication keys into that keyring rather than invoking /sbin/request-key. Secondly, a test-upcall sample is added that monitors the keyring for notifications and spawns an /sbin/request-key instance for each key added. This is run as: ./test-upcall to find a keyring called "upcall" in the session keyring (as created by the ./test-container program) and listen for additions to that, or it can be run with a keyring ID as its single argument to listen on a specific keyring. Note that the test-upcall sample is designed to be run separately from test-container so that its stdout can be observed.
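The monitoring side of test-upcall boils down to draining a single-consumer ring buffer shared with a producer. The sketch below models that loop in isolation. The layout is an assumption made for illustration only: struct ex_ring, struct ex_note, EX_LEN_MASK and ex_drain() are hypothetical and are not the watch_queue ABI, and the record length is taken to be stored, in slots, in the low bits of the info word. The __atomic builtins are the GCC/Clang ones that the sample itself uses.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_NR_SLOTS 8u		/* must be a power of two */
#define EX_LEN_MASK 0x7fu	/* hypothetical: record length in slots */

struct ex_note {
	uint32_t type;
	uint32_t info;		/* low bits hold the record length */
};

struct ex_ring {
	uint32_t head;		/* advanced by the producer */
	uint32_t tail;		/* advanced by the consumer */
	uint32_t mask;		/* EX_NR_SLOTS - 1 */
	struct ex_note slots[EX_NR_SLOTS];
};

/* Consume every record currently in the ring, calling handle() on each if it
 * is non-NULL. Returns the number of records consumed. */
static unsigned int ex_drain(struct ex_ring *r,
			     void (*handle)(const struct ex_note *))
{
	unsigned int consumed = 0;
	uint32_t head, tail = r->tail;

	/* The acquire load pairs with the producer's release store of head,
	 * so the slot contents are visible before they are read. */
	while (head = __atomic_load_n(&r->head, __ATOMIC_ACQUIRE),
	       tail != head) {
		const struct ex_note *n = &r->slots[tail & r->mask];
		uint32_t len = n->info & EX_LEN_MASK;

		if (len == 0)
			break;	/* malformed record: stop rather than spin */

		if (handle)
			handle(n);
		consumed++;

		/* Publish the new tail only after the record has been used. */
		tail += len;
		__atomic_store_n(&r->tail, tail, __ATOMIC_RELEASE);
	}
	return consumed;
}
```

In the sample, the handler for a key-linked record is what forks and runs /sbin/request-key; here it is left as a callback so that the draining logic can be followed on its own.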
Signed-off-by: David Howells --- samples/vfs/Makefile | 6 + samples/vfs/test-container.c | 16 +++ samples/vfs/test-upcall.c | 243 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 264 insertions(+), 1 deletion(-) create mode 100644 samples/vfs/test-upcall.c diff --git a/samples/vfs/Makefile b/samples/vfs/Makefile index 25420919ee40..a8e9e1142ae3 100644 --- a/samples/vfs/Makefile +++ b/samples/vfs/Makefile @@ -5,7 +5,8 @@ hostprogs-$(CONFIG_SAMPLE_VFS) := \ test-fsmount \ test-mntinfo \ test-statx \ - test-container + test-container \ + test-upcall # Tell kbuild to always build the programs always := $(hostprogs-y) @@ -18,5 +19,8 @@ HOSTLDLIBS_test-mntinfo += -lm HOSTCFLAGS_test-fs-query.o += -I$(objtree)/usr/include HOSTCFLAGS_test-fsmount.o += -I$(objtree)/usr/include HOSTCFLAGS_test-statx.o += -I$(objtree)/usr/include + HOSTCFLAGS_test-container.o += -I$(objtree)/usr/include HOSTLDLIBS_test-container += -lkeyutils +HOSTCFLAGS_test-upcall.o += -I$(objtree)/usr/include +HOSTLDLIBS_test-upcall += -lkeyutils diff --git a/samples/vfs/test-container.c b/samples/vfs/test-container.c index 44ff57afb5a4..7dc9071399b2 100644 --- a/samples/vfs/test-container.c +++ b/samples/vfs/test-container.c @@ -20,6 +20,8 @@ #include #include +#define KEYCTL_CONTAINER_INTERCEPT 31 /* Intercept upcalls inside a container */ + /* Hope -1 isn't a syscall */ #ifndef __NR_fsopen #define __NR_fsopen -1 @@ -187,6 +189,7 @@ void container_init(void) */ int main(int argc, char *argv[]) { + key_serial_t keyring; pid_t pid; int fsfd, mfd, cfd, ws; @@ -259,6 +262,19 @@ int main(int argc, char *argv[]) E(close(fsfd)); E(close(mfd)); + /* Create a keyring to catch upcalls. 
*/ + printf("Intercepting...\n"); + keyring = add_key("keyring", "upcall", NULL, 0, KEY_SPEC_SESSION_KEYRING); + if (keyring == -1) { + perror("add_key/u"); + exit(1); + } + + if (keyctl(KEYCTL_CONTAINER_INTERCEPT, cfd, "user", 0, keyring) < 0) { + perror("keyctl_container_intercept"); + exit(1); + } + /* Start the 'init' process. */ printf("Forking...\n"); switch ((pid = fork_into_container(cfd))) { diff --git a/samples/vfs/test-upcall.c b/samples/vfs/test-upcall.c new file mode 100644 index 000000000000..225fa0325d1b --- /dev/null +++ b/samples/vfs/test-upcall.c @@ -0,0 +1,243 @@ +/* Container keyring upcall management test. + * + * Copyright (C) 2019 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define KEYCTL_WATCH_KEY 30 /* Watch a key or ring of keys for changes */ +#define KEYCTL_QUERY_REQUEST_KEY_AUTH 32 /* Query a request_key_auth key */ +#define KEYCTL_MOVE 33 /* Move keys between keyrings */ +#define KEYCTL_FIND_LRU 34 /* Find the least-recently used key in a keyring */ + +struct keyctl_query_request_key_auth { + char operation[32]; /* Operation name, typically "create" */ + uid_t fsuid; /* UID of requester */ + gid_t fsgid; /* GID of requester */ + key_serial_t target_key; /* The key being instantiated */ + key_serial_t thread_keyring; /* The requester's thread keyring */ + key_serial_t process_keyring; /* The requester's process keyring */ + key_serial_t session_keyring; /* The requester's session keyring */ + long long spare[1]; +}; + +static void process_request(key_serial_t keyring, key_serial_t key) +{ + struct keyctl_query_request_key_auth info; + char target[32], uid[32], gid[32], thread[32], process[32], session[32]; + void *callout; + long len; + +#if 0 + key = keyctl(KEYCTL_FIND_LRU, keyring, ".request_key_auth"); + if (key == -1) { + perror("keyctl/find"); + exit(1); + } +#endif + + if (keyctl(KEYCTL_QUERY_REQUEST_KEY_AUTH, key, &info) == -1) { + perror("keyctl/query"); + exit(1); + } + + len = keyctl_read_alloc(key, &callout); + if (len == -1) { + perror("keyctl/read"); + exit(1); + } + + sprintf(target, "%d", info.target_key); + sprintf(uid, "%d", info.fsuid); + sprintf(gid, "%d", info.fsgid); + sprintf(thread, "%d", info.thread_keyring); + sprintf(process, "%d", info.process_keyring); + sprintf(session, "%d", info.session_keyring); + + printf("Authentication key %d\n", key); + printf("- %s %s\n", info.operation, target); + printf("- uid=%s gid=%s\n", uid, gid); + printf("- rings=%s,%s,%s\n", thread, process, session); + printf("- callout='%s'\n", (char *)callout); + + switch (fork()) { + case 0: + /* Only pass the auth token 
of interest onto /sbin/request-key */ + if (keyctl(KEYCTL_MOVE, key, keyring, KEY_SPEC_THREAD_KEYRING, 0) < 0) { + perror("keyctl_move/1"); + exit(1); + } + + if (keyctl_join_session_keyring(NULL) < 0) { + perror("keyctl_join"); + exit(1); + } + + if (keyctl(KEYCTL_MOVE, key, + KEY_SPEC_THREAD_KEYRING, KEY_SPEC_SESSION_KEYRING, 0) < 0) { + perror("keyctl_move/2"); + exit(1); + } + + execl("/sbin/request-key", + "request-key", info.operation, target, uid, gid, thread, process, session, + NULL); + perror("execve"); + exit(1); + + case -1: + perror("fork"); + exit(1); + + default: + return; + } +} + +/* + * We saw a change on the keyring. + */ +static void saw_key_change(struct watch_notification *n) +{ + struct key_notification *k = (struct key_notification *)n; + unsigned int len = n->info & WATCH_INFO_LENGTH; + + if (len != sizeof(struct key_notification)) + return; + + printf("KEY %d change=%u aux=%d\n", k->key_id, n->subtype, k->aux); + + process_request(k->key_id, k->aux); +} + +/* + * Consume and display events. 
+ */ +static int consumer(int fd, struct watch_queue_buffer *buf) +{ + struct watch_notification *n; + struct pollfd p[1]; + unsigned int head, tail, mask = buf->meta.mask; + + for (;;) { + p[0].fd = fd; + p[0].events = POLLIN | POLLERR; + p[0].revents = 0; + + if (poll(p, 1, -1) == -1) { + perror("poll"); + break; + } + + printf("ptrs h=%x t=%x m=%x\n", + buf->meta.head, buf->meta.tail, buf->meta.mask); + + while (head = __atomic_load_n(&buf->meta.head, __ATOMIC_ACQUIRE), + tail = buf->meta.tail, + tail != head + ) { + n = &buf->slots[tail & mask]; + printf("NOTIFY[%08x-%08x] ty=%04x sy=%04x i=%08x\n", + head, tail, n->type, n->subtype, n->info); + if ((n->info & WATCH_INFO_LENGTH) == 0) + goto out; + + switch (n->type) { + case WATCH_TYPE_META: + if (n->subtype == WATCH_META_REMOVAL_NOTIFICATION) + printf("REMOVAL of watchpoint %08x\n", + n->info & WATCH_INFO_ID); + break; + case WATCH_TYPE_KEY_NOTIFY: + saw_key_change(n); + break; + } + + tail += (n->info & WATCH_INFO_LENGTH) >> WATCH_LENGTH_SHIFT; + __atomic_store_n(&buf->meta.tail, tail, __ATOMIC_RELEASE); + } + } + +out: + return 0; +} + +/* + * We're only interested in key insertion events. + */ +static struct watch_notification_filter filter = { + .nr_filters = 1, + .filters = { + [0] = { + .type = WATCH_TYPE_KEY_NOTIFY, + .subtype_filter[0] = (1 << NOTIFY_KEY_LINKED), + }, + } +}; + +int main(int argc, char *argv[]) +{ + struct watch_queue_buffer *buf; + key_serial_t keyring; + size_t page_size = sysconf(_SC_PAGESIZE); + int fd; + + if (argc == 1) { + keyring = keyctl_search(KEY_SPEC_SESSION_KEYRING, "keyring", + "upcall", 0); + if (keyring == -1) { + perror("keyctl_search"); + exit(1); + } + } else if (argc == 2) { + keyring = strtoul(argv[1], NULL, 0); + } else { + fprintf(stderr, "Format: test-upcall []\n"); + exit(2); + } + + /* Create a watch on the keyring to detect the addition of keys. 
*/ + fd = open("/dev/watch_queue", O_RDWR | O_CLOEXEC); + if (fd == -1) { + perror("/dev/watch_queue"); + exit(1); + } + + if (ioctl(fd, IOC_WATCH_QUEUE_SET_SIZE, 1) == -1) { + perror("/dev/watch_queue(size)"); + exit(1); + } + + if (ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, &filter) == -1) { + perror("/dev/watch_queue(filter)"); + exit(1); + } + + buf = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (buf == MAP_FAILED) { + perror("mmap"); + exit(1); + } + + if (keyctl(KEYCTL_WATCH_KEY, keyring, fd, 0x01) == -1) { + perror("keyctl"); + exit(1); + } + + return consumer(fd, buf); +} From patchwork Fri Feb 15 16:10:45 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043009 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JCy1zYpz9rxp for ; Sat, 16 Feb 2019 03:11:26 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388465AbfBOQLS (ORCPT ); Fri, 15 Feb 2019 11:11:18 -0500 Received: from mx1.redhat.com ([209.132.183.28]:48710 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727273AbfBOQLS (ORCPT ); Fri, 15 Feb 2019 11:11:18 -0500 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 63187C610D; Fri, 15 Feb 2019 16:11:17 +0000 (UTC) Received: from 
warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 92C014115; Fri, 15 Feb 2019 16:10:51 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 20/27] container, keys: Add a container keyring From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:10:45 +0000 Message-ID: <155024704568.21651.12664692449080180818.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Fri, 15 Feb 2019 16:11:17 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Allow a container manager to attach keyrings to a container such that the keys contained therein are searched by request_key() in addition to a process's normal keyrings. This allows the manager to install keys to support filesystem decryption and authentication for superblocks inside the container without requiring any active role being played by processes inside of the container. So, for example, a container could be created, a keyring added and then an rxrpc-type key added to the keyring such that a container's root filesystem and data filesystems can be brought in from secure AFS volumes. 
It would also be possible to put filesystem crypto keys in there such that Ext4 encrypted files could be decrypted, without the need to share the key with other containers or let the key leak into the container.

Because the container manager retains control of the keyring, it can update the contained keys as necessary to prevent expiration.

Note that the keyring and keys in the keyring must grant Search permission directly to the container object.

[!] Note that NFS, CIFS and other filesystems wishing to make use of this would have to get the token to use by calling request_key() on entry to their VFS methods and retain it in their file struct.

[!] Note that request_key() called from userspace does not look in the container keyring.

[!] Note that keys are now tagged with a tag that identifies the network namespace (or other domain of operation).  This allows a single keyring to hold keys that grant the same thing in different network namespaces.

The keyring should be created by the container manager and then set using:

	keyctl(KEYCTL_SET_CONTAINER_KEYRING, int containerfd,
	       key_serial_t keyring);

With this, request_key() inside the kernel searches:

	thread-keyring, process-keyring, session-keyring, container-keyring

[!] It may be worth setting a flag on a mountpoint to indicate whether to search the container keyring first or last.

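The search order given above, together with the error-priority rule that the process_keys.c hunk of this patch applies to the container keyring, can be modelled in a few lines of userspace C.  This is only a sketch under stated assumptions: struct toy_keyring and toy_request_key() are hypothetical names, each keyring is reduced to the result a search of it would yield, and nothing here calls the real kernel or libkeyutils API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Toy stand-in for a keyring (hypothetical, not a kernel structure): it only
 * records what searching it yields.  result > 0 is the serial of a found
 * key; result < 0 is a negative errno value.
 */
struct toy_keyring {
	const char *name;
	int result;
};

/*
 * Search the keyrings in the order given (thread, process, session,
 * container) and return the first key found.  If nothing is found, choose
 * the error the way the switch statement in the patch does: -ENOKEY (a
 * negative key was found) beats -EAGAIN (no key), which beats any other
 * error.
 */
int toy_request_key(const struct toy_keyring *rings, size_t nr)
{
	int ret = 0;		/* "no key" style error seen so far, if any */
	int err = -EAGAIN;	/* last other error; defaults to "no key" */
	size_t i;

	for (i = 0; i < nr; i++) {
		int r = rings[i].result;

		assert(r != 0);	/* 0 is neither a serial nor an error */
		if (r > 0)
			return r;	/* found: later keyrings are not searched */

		switch (r) {
		case -EAGAIN:		/* no key */
			if (!ret)
				ret = r;
			break;
		case -ENOKEY:		/* negative key */
			ret = r;
			break;
		default:
			err = r;
			break;
		}
	}

	/* no key: decide on the error to report */
	return ret ? ret : err;
}
```

With an array ordered thread, process, session, container, a key held only in the container entry is still returned, and a key in an earlier keyring is returned without the container entry being consulted.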
Signed-off-by: David Howells --- include/linux/container.h | 1 + include/uapi/linux/keyctl.h | 1 + kernel/container.c | 1 + samples/vfs/test-container.c | 14 +++++++++++++ security/keys/compat.c | 2 ++ security/keys/container.c | 44 ++++++++++++++++++++++++++++++++++++++++++ security/keys/internal.h | 1 + security/keys/keyctl.c | 2 ++ security/keys/process_keys.c | 23 ++++++++++++++++++++++ 9 files changed, 89 insertions(+) diff --git a/include/linux/container.h b/include/linux/container.h index a8cac800ce75..7424f7fb5560 100644 --- a/include/linux/container.h +++ b/include/linux/container.h @@ -33,6 +33,7 @@ struct container { refcount_t usage; int exit_code; /* The exit code of 'init' */ const struct cred *cred; /* Creds for this container, including userns */ + struct key *keyring; /* Externally managed container keyring */ struct nsproxy *ns; /* This container's namespaces */ struct path root; /* The root of the container's fs namespace */ struct task_struct *init; /* The 'init' task for this container */ diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h index 5b792303a05b..a2afb4512f34 100644 --- a/include/uapi/linux/keyctl.h +++ b/include/uapi/linux/keyctl.h @@ -72,6 +72,7 @@ #define KEYCTL_QUERY_REQUEST_KEY_AUTH 32 /* Query a request_key_auth key */ #define KEYCTL_MOVE 33 /* Move keys between keyrings */ #define KEYCTL_FIND_LRU 34 /* Find the least-recently used key in a keyring */ +#define KEYCTL_SET_CONTAINER_KEYRING 35 /* Attach a keyring to a container */ /* keyctl structures */ struct keyctl_dh_params { diff --git a/kernel/container.c b/kernel/container.c index 33e41fe5050b..f2706a45f364 100644 --- a/kernel/container.c +++ b/kernel/container.c @@ -71,6 +71,7 @@ void put_container(struct container *c) if (c->cred) put_cred(c->cred); + key_put(c->keyring); security_container_free(c); kfree(c); c = parent; diff --git a/samples/vfs/test-container.c b/samples/vfs/test-container.c index 7dc9071399b2..e24048fdbe33 100644 --- 
a/samples/vfs/test-container.c +++ b/samples/vfs/test-container.c @@ -21,6 +21,7 @@ #include #define KEYCTL_CONTAINER_INTERCEPT 31 /* Intercept upcalls inside a container */ +#define KEYCTL_SET_CONTAINER_KEYRING 35 /* Attach a keyring to a container */ /* Hope -1 isn't a syscall */ #ifndef __NR_fsopen @@ -262,6 +263,19 @@ int main(int argc, char *argv[]) E(close(fsfd)); E(close(mfd)); + /* Create a container keyring. */ + printf("Container keyring...\n"); + keyring = add_key("keyring", "_container", NULL, 0, KEY_SPEC_SESSION_KEYRING); + if (keyring == -1) { + perror("add_key/c"); + exit(1); + } + + if (keyctl(KEYCTL_SET_CONTAINER_KEYRING, cfd, keyring) < 0) { + perror("keyctl_set_container_keyring"); + exit(1); + } + /* Create a keyring to catch upcalls. */ printf("Intercepting...\n"); keyring = add_key("keyring", "upcall", NULL, 0, KEY_SPEC_SESSION_KEYRING); diff --git a/security/keys/compat.c b/security/keys/compat.c index 160fb7b37352..7990ec026237 100644 --- a/security/keys/compat.c +++ b/security/keys/compat.c @@ -168,6 +168,8 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option, return keyctl_query_request_key_auth(arg2, compat_ptr(arg3)); case KEYCTL_FIND_LRU: return keyctl_find_lru(arg2, compat_ptr(arg3)); + case KEYCTL_SET_CONTAINER_KEYRING: + return keyctl_set_container_keyring(arg2, arg3); #endif case KEYCTL_MOVE: diff --git a/security/keys/container.c b/security/keys/container.c index 8e6b3c8710e2..720600f6a318 100644 --- a/security/keys/container.c +++ b/security/keys/container.c @@ -373,3 +373,47 @@ long keyctl_find_lru(key_serial_t _keyring, const char __user *type_name) key_ref_put(keyring_ref); return ret; } + +/* + * Attach a keyring to a container as the container key, to be searched by + * request_key() after thread, process and session keyrings. This is only + * permitted once per container. 
+ */ +long keyctl_set_container_keyring(int containerfd, key_serial_t _keyring) +{ + struct container *c; + struct fd f; + key_ref_t keyring_ref = NULL; + long ret; + + if (containerfd < 0 || _keyring <= 0) + return -EINVAL; + + f = fdget(containerfd); + if (!f.file) + return -EBADF; + ret = -EINVAL; + if (!is_container_file(f.file)) + goto out_fd; + + c = f.file->private_data; + + keyring_ref = lookup_user_key(_keyring, 0, KEY_NEED_SEARCH); + if (IS_ERR(keyring_ref)) { + ret = PTR_ERR(keyring_ref); + goto out_fd; + } + + ret = -EBUSY; + spin_lock(&c->lock); + if (!c->keyring) { + c->keyring = key_get(key_ref_to_ptr(keyring_ref)); + ret = 0; + } + spin_unlock(&c->lock); + + key_ref_put(keyring_ref); +out_fd: + fdput(f); + return ret; +} diff --git a/security/keys/internal.h b/security/keys/internal.h index fe4a4da1ff17..6be76caee874 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -366,6 +366,7 @@ extern long keyctl_container_intercept(int, const char __user *, unsigned int, k extern long keyctl_query_request_key_auth(key_serial_t, struct keyctl_query_request_key_auth __user *); extern long keyctl_find_lru(key_serial_t, const char __user *); +extern long keyctl_set_container_keyring(int, key_serial_t); #endif /* diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index 1446bc52e369..a25799249b8a 100644 --- a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -1919,6 +1919,8 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3, case KEYCTL_FIND_LRU: return keyctl_find_lru((key_serial_t)arg2, (const char __user *)arg3); + case KEYCTL_SET_CONTAINER_KEYRING: + return keyctl_set_container_keyring((int)arg2, (key_serial_t)arg3); #endif case KEYCTL_MOVE: diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c index 0e0b9ccad2f8..39d3cbac920c 100644 --- a/security/keys/process_keys.c +++ b/security/keys/process_keys.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include 
#include @@ -433,6 +434,28 @@ key_ref_t search_my_process_keyrings(struct keyring_search_context *ctx) } } + /* Search any container keyring on the end. */ +#ifdef CONFIG_CONTAINERS + if (current->container->keyring) { + key_ref = keyring_search_aux( + make_key_ref(current->container->keyring, 1), ctx); + if (!IS_ERR(key_ref)) + goto found; + + switch (PTR_ERR(key_ref)) { + case -EAGAIN: /* no key */ + if (ret) + break; + case -ENOKEY: /* negative key */ + ret = key_ref; + break; + default: + err = key_ref; + break; + } + } +#endif + /* no key - decide on the error we're going to go for */ key_ref = ret ? ret : err; From patchwork Fri Feb 15 16:11:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043011 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JD71Zt6z9rxp for ; Sat, 16 Feb 2019 03:11:35 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729225AbfBOQL0 (ORCPT ); Fri, 15 Feb 2019 11:11:26 -0500 Received: from mx1.redhat.com ([209.132.183.28]:45160 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2392282AbfBOQLZ (ORCPT ); Fri, 15 Feb 2019 11:11:25 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 23CFBC0B2A2F; Fri, 15 Feb 2019 16:11:25 +0000 (UTC) Received: from 
warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 509715BBA4; Fri, 15 Feb 2019 16:11:23 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 21/27] keys: Fix request_key() lack of Link perm check on found key From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:11:22 +0000 Message-ID: <155024708261.21651.15380024848711404052.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Fri, 15 Feb 2019 16:11:25 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org The request_key() syscall allows a process to gain access to the 'possessor' permits of any key that grants it Search permission by virtue of request_key() not checking whether a key it finds grants Link permission to the caller. 
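The hole described above can be illustrated with a small userspace model.  This is a sketch only: struct toy_key, the TOY_NEED_* bits and toy_request_and_link() are hypothetical names invented for the illustration, and the model covers just the step after a successful search, where the found key may be linked into a destination keyring.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Permission bits for this sketch only; the values are invented. */
#define TOY_NEED_SEARCH	0x08
#define TOY_NEED_LINK	0x10

/* Hypothetical stand-in for a key found by the search. */
struct toy_key {
	unsigned int caller_perms;	/* what the key grants the caller */
	bool linked;			/* set once linked into the dest keyring */
};

/*
 * Model of what happens after the keyring search succeeds: if the caller
 * supplied a destination keyring, refuse to link a key that does not grant
 * Link permission to the caller.  Without this check, Search permission
 * alone would be enough to get the key linked.
 */
int toy_request_and_link(struct toy_key *found, bool have_dest_keyring)
{
	/* The search itself only returns keys that grant Search. */
	assert(found->caller_perms & TOY_NEED_SEARCH);

	if (have_dest_keyring) {
		if (!(found->caller_perms & TOY_NEED_LINK))
			return -EACCES;	/* the check this patch adds */
		found->linked = true;
	}
	return 0;
}
```

In this model a key granting only Search is still returned when no destination keyring is given, but asking for it to be linked fails with -EACCES.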
Signed-off-by: David Howells --- security/keys/request_key.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/security/keys/request_key.c b/security/keys/request_key.c index ab1f6de9e623..10244b6fbf5d 100644 --- a/security/keys/request_key.c +++ b/security/keys/request_key.c @@ -564,6 +564,16 @@ struct key *request_key_and_link(struct key_type *type, key_ref = search_process_keyrings(&ctx); if (!IS_ERR(key_ref)) { + if (dest_keyring) { + ret = key_task_permission(key_ref, current_cred(), + KEY_NEED_LINK); + if (ret < 0) { + key_ref_put(key_ref); + key = ERR_PTR(ret); + goto error_free; + } + } + key = key_ref_to_ptr(key_ref); if (dest_keyring) { ret = key_link(dest_keyring, key); From patchwork Fri Feb 15 16:11:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043017 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JFq0zlJz9rxp for ; Sat, 16 Feb 2019 03:13:02 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730142AbfBOQLf (ORCPT ); Fri, 15 Feb 2019 11:11:35 -0500 Received: from mx1.redhat.com ([209.132.183.28]:1413 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726148AbfBOQLf (ORCPT ); Fri, 15 Feb 2019 11:11:35 -0500 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with 
	ESMTPS id C777D2203B; Fri, 15 Feb 2019 16:11:33 +0000 (UTC)
Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129])
	by smtp.corp.redhat.com (Postfix) with ESMTP id 00EFB60C62;
	Fri, 15 Feb 2019 16:11:30 +0000 (UTC)
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
Subject: [RFC PATCH 22/27] KEYS: Replace uid/gid/perm permissions checking with an ACL
From: David Howells
To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org
Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org
Date: Fri, 15 Feb 2019 16:11:30 +0000
Message-ID: <155024709026.21651.7275876165845045967.stgit@warthog.procyon.org.uk>
In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Fri, 15 Feb 2019 16:11:34 +0000 (UTC)
Sender: linux-cifs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-cifs@vger.kernel.org

Replace the uid/gid/perm permissions checking on a key with an ACL to allow the SETATTR and SEARCH permissions to be split.  This will also allow a greater range of subjects to be represented.

============
WHY DO THIS?
============

The problem is that SETATTR and SEARCH cover a slew of actions, not all of which should be grouped together.

For SETATTR, this includes actions that are about controlling access to a key:

 (1) Changing a key's ownership.

 (2) Changing a key's security information.

 (3) Setting a keyring's restriction.

And actions that are about managing a key's lifetime:

 (4) Setting an expiry time.

 (5) Revoking a key.

and (proposed) managing a key as part of a cache:

 (6) Invalidating a key.

Managing a key's lifetime doesn't really have anything to do with controlling access to that key.

Expiry time is awkward since it's more about the lifetime of the content and so, in some ways, goes better with WRITE permission.  It can, however, be set unconditionally by a process with an appropriate authorisation token for instantiating a key, and can also be set by the key type driver when a key is instantiated, so lumping it with the access-controlling actions is probably okay.

As for SEARCH permission, that currently covers:

 (1) Finding keys in a keyring tree during a search.

 (2) Permitting keyrings to be joined.

 (3) Invalidation.

But these don't really belong together either, since these actions really need to be controlled separately.

Finally, there are a number of special cases to do with granting the administrator special rights to invalidate or clear keys that I would like to handle with the ACL rather than key flags and special checks.

===============
WHAT IS CHANGED
===============

The SETATTR permission is split to create two new permissions:

 (1) SET_SECURITY - which allows the key's owner, group and ACL to be changed and a restriction to be placed on a keyring.

 (2) REVOKE - which allows a key to be revoked.

The SEARCH permission is split to create:

 (1) SEARCH - which allows a keyring to be searched and a key to be found.

 (2) JOIN - which allows a keyring to be joined as a session keyring.

 (3) INVAL - which allows a key to be invalidated.

The WRITE permission is also split to create:

 (1) WRITE - which allows a key's content to be altered and links to be added, removed and replaced in a keyring.

 (2) CLEAR - which allows a keyring to be cleared completely.  This is split out to make it possible to give just this to an administrator.

 (3) REVOKE - see above.

Keys acquire ACLs which consist of a series of ACEs, and all that apply are unioned together.  An ACE specifies a subject, such as:

 (*) Possessor - permitted to anyone who 'possesses' a key
 (*) Owner - permitted to the key owner
 (*) Group - permitted to the key group
 (*) Everyone - permitted to everyone

Note that 'Other' has been replaced with 'Everyone' on the assumption that you wouldn't grant a permit to 'Other' that you wouldn't also grant to everyone else.

Further subjects may be made available by later patches.

The ACE also specifies a permissions mask.  The set of permissions is now:

	VIEW            Can view the key metadata
	READ            Can read the key content
	WRITE           Can update/modify the key content
	SEARCH          Can find the key by searching/requesting
	LINK            Can make a link to the key
	SET_SECURITY    Can change owner, ACL, expiry
	INVAL           Can invalidate
	REVOKE          Can revoke
	JOIN            Can join this keyring
	CLEAR           Can clear this keyring

The KEYCTL_SETPERM function is then deprecated.

The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set, or if the caller has a valid instantiation auth token.

The KEYCTL_INVALIDATE function then requires INVAL.

The KEYCTL_REVOKE function then requires REVOKE.

The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an existing keyring.  The JOIN permission is enabled by default for session keyrings and manually created keyrings only.

======================
BACKWARD COMPATIBILITY
======================

To maintain backward compatibility, KEYCTL_SETPERM will translate the permissions mask it is given into a new ACL for a key - unless KEYCTL_SET_ACL has been called on that key, in which case an error will be returned.

It will convert possessor, owner, group and other permissions into separate ACEs, if each portion of the mask is non-zero.

SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY.  WRITE permission turns on WRITE, REVOKE and, if a keyring, CLEAR.  JOIN is turned on if a keyring is being altered.

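The KEYCTL_SETPERM translation described above can be sketched in userspace C as follows.  This is an illustration under stated assumptions, not the code from this patch: the ACE_* bit values, struct toy_ace and both functions are hypothetical, and the sketch reads "JOIN is turned on if a keyring is being altered" as adding JOIN to every non-empty ACE of a keyring, which is only one possible interpretation.  The legacy layout (six permission bits per subject, with possessor, owner, group and other in successive bytes from the top) follows the existing KEY_POS_*/KEY_USR_*/KEY_GRP_*/KEY_OTH_* masks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Legacy per-subject permission bits (low six bits of each mask byte). */
#define OLD_VIEW	0x01
#define OLD_READ	0x02
#define OLD_WRITE	0x04
#define OLD_SEARCH	0x08
#define OLD_LINK	0x10
#define OLD_SETATTR	0x20

/* New ACE permission bits; these values are invented for the sketch. */
enum {
	ACE_VIEW   = 1 << 0, ACE_READ   = 1 << 1, ACE_WRITE        = 1 << 2,
	ACE_SEARCH = 1 << 3, ACE_LINK   = 1 << 4, ACE_SET_SECURITY = 1 << 5,
	ACE_INVAL  = 1 << 6, ACE_REVOKE = 1 << 7, ACE_JOIN         = 1 << 8,
	ACE_CLEAR  = 1 << 9,
};

enum toy_subject { SUBJ_POSSESSOR, SUBJ_OWNER, SUBJ_GROUP, SUBJ_EVERYONE };

struct toy_ace {
	enum toy_subject subject;
	unsigned int perm;
};

/* Translate one non-zero six-bit legacy portion into new-style permissions. */
unsigned int toy_old_to_ace_perm(unsigned int old, bool is_keyring)
{
	unsigned int perm = 0;

	if (old & OLD_VIEW)
		perm |= ACE_VIEW;
	if (old & OLD_READ)
		perm |= ACE_READ;
	if (old & OLD_SEARCH)
		perm |= ACE_SEARCH;
	if (old & OLD_LINK)
		perm |= ACE_LINK;
	if (old & OLD_WRITE) {
		perm |= ACE_WRITE | ACE_REVOKE;
		if (is_keyring)
			perm |= ACE_CLEAR;
	}
	if (old & OLD_SETATTR)
		perm |= ACE_INVAL | ACE_REVOKE | ACE_SET_SECURITY;
	if (is_keyring)
		perm |= ACE_JOIN;	/* assumption: applied to every ACE */
	return perm;
}

/*
 * Split a legacy mask (possessor << 24 | owner << 16 | group << 8 | other)
 * into at most four ACEs, skipping any subject whose portion is zero.
 * Returns the number of ACEs written to aces[], which must hold four.
 */
size_t toy_setperm_to_acl(unsigned int mask, bool is_keyring,
			  struct toy_ace aces[4])
{
	static const int shift[4] = { 24, 16, 8, 0 };
	size_t i, n = 0;

	assert(aces != NULL);
	for (i = 0; i < 4; i++) {
		unsigned int old = (mask >> shift[i]) & 0x3f;

		if (!old)
			continue;
		aces[n].subject = (enum toy_subject)i;
		aces[n].perm = toy_old_to_ace_perm(old, is_keyring);
		n++;
	}
	return n;
}
```

For a non-keyring key, a legacy mask of 0x3f010000 (possessor: everything, owner: view) yields two ACEs in this sketch, while a mask of zero yields none.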
The KEYCTL_DESCRIBE function translates the ACL back into a permissions mask to return depending on possessor, owner, group and everyone ACEs.

It will make the following mappings:

 (1) INVAL, JOIN -> SEARCH

 (2) SET_SECURITY -> SETATTR

 (3) REVOKE -> WRITE if SETATTR isn't already set

 (4) CLEAR -> WRITE

Note that the value subsequently returned by KEYCTL_DESCRIBE may not match the value set with KEYCTL_SETPERM.

=======
TESTING
=======

This passes the keyutils testsuite for all but a couple of tests:

 (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed if the type doesn't have ->read().  You still can't actually read the key.

 (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't work as Other has been replaced with Everyone in the ACL.

Signed-off-by: David Howells
---
 certs/blacklist.c | 7 -
 certs/system_keyring.c | 12 -
 drivers/md/dm-crypt.c | 2
 drivers/nvdimm/security.c | 2
 fs/afs/security.c | 2
 fs/cifs/cifs_spnego.c | 25 ++
 fs/cifs/cifsacl.c | 28 ++
 fs/cifs/connect.c | 4
 fs/crypto/keyinfo.c | 2
 fs/ecryptfs/ecryptfs_kernel.h | 2
 fs/ecryptfs/keystore.c | 2
 fs/fscache/object-list.c | 2
 fs/nfs/nfs4idmap.c | 29 ++
 fs/ubifs/auth.c | 2
 include/linux/key.h | 113 +++++----
 include/uapi/linux/keyctl.h | 63 +++++
 lib/digsig.c | 2
 net/ceph/ceph_common.c | 2
 net/dns_resolver/dns_key.c | 12 +
 net/dns_resolver/dns_query.c | 15 +
 net/rxrpc/key.c | 16 +
 security/integrity/digsig.c | 31 +--
 security/integrity/digsig_asymmetric.c | 2
 security/integrity/evm/evm_crypto.c | 2
 security/integrity/ima/ima_mok.c | 13 +
 security/integrity/integrity.h | 4
 .../integrity/platform_certs/platform_keyring.c | 13 +
 security/keys/encrypted-keys/encrypted.c | 2
 security/keys/encrypted-keys/masterkey_trusted.c | 2
 security/keys/gc.c | 2
 security/keys/internal.h | 12 +
 security/keys/key.c | 29 +-
 security/keys/keyctl.c | 93 +++++---
 security/keys/keyring.c | 27 ++
 security/keys/permission.c | 238
+++++++++++++++++--- security/keys/persistent.c | 27 ++ security/keys/proc.c | 17 + security/keys/process_keys.c | 72 +++++- security/keys/request_key.c | 40 ++- security/keys/request_key_auth.c | 15 + security/selinux/hooks.c | 16 + security/smack/smack_lsm.c | 3 42 files changed, 726 insertions(+), 278 deletions(-) diff --git a/certs/blacklist.c b/certs/blacklist.c index 3a507b9e2568..7677c3b0a147 100644 --- a/certs/blacklist.c +++ b/certs/blacklist.c @@ -93,8 +93,7 @@ int mark_hash_blacklisted(const char *hash) hash, NULL, 0, - ((KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW), + &internal_key_acl, KEY_ALLOC_NOT_IN_QUOTA | KEY_ALLOC_BUILT_IN); if (IS_ERR(key)) { @@ -153,9 +152,7 @@ static int __init blacklist_init(void) keyring_alloc(".blacklist", KUIDT_INIT(0), KGIDT_INIT(0), current_cred(), - (KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ | - KEY_USR_SEARCH, + &internal_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA | KEY_FLAG_KEEP, NULL, NULL); diff --git a/certs/system_keyring.c b/certs/system_keyring.c index 81728717523d..7b775d6028e1 100644 --- a/certs/system_keyring.c +++ b/certs/system_keyring.c @@ -100,9 +100,7 @@ static __init int system_trusted_keyring_init(void) builtin_trusted_keys = keyring_alloc(".builtin_trusted_keys", KUIDT_INIT(0), KGIDT_INIT(0), current_cred(), - ((KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ | KEY_USR_SEARCH), - KEY_ALLOC_NOT_IN_QUOTA, + &internal_key_acl, KEY_ALLOC_NOT_IN_QUOTA, NULL, NULL); if (IS_ERR(builtin_trusted_keys)) panic("Can't allocate builtin trusted keyring\n"); @@ -111,10 +109,7 @@ static __init int system_trusted_keyring_init(void) secondary_trusted_keys = keyring_alloc(".secondary_trusted_keys", KUIDT_INIT(0), KGIDT_INIT(0), current_cred(), - ((KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ | KEY_USR_SEARCH | - KEY_USR_WRITE), - KEY_ALLOC_NOT_IN_QUOTA, + &internal_writable_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA, get_builtin_and_secondary_restriction(), NULL); if 
(IS_ERR(secondary_trusted_keys)) @@ -164,8 +159,7 @@ static __init int load_system_certificate_list(void) NULL, p, plen, - ((KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ), + &internal_key_acl, KEY_ALLOC_NOT_IN_QUOTA | KEY_ALLOC_BUILT_IN | KEY_ALLOC_BYPASS_RESTRICTION); diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 0ff22159a0ca..7f37616cd21a 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -2034,7 +2034,7 @@ static int crypt_set_keyring_key(struct crypt_config *cc, const char *key_string return -ENOMEM; key = request_key(key_string[0] == 'l' ? &key_type_logon : &key_type_user, - key_desc + 1, NULL); + key_desc + 1, NULL, NULL); if (IS_ERR(key)) { kzfree(new_key_string); return PTR_ERR(key); diff --git a/drivers/nvdimm/security.c b/drivers/nvdimm/security.c index f8bb746a549f..db5cfd934ec8 100644 --- a/drivers/nvdimm/security.c +++ b/drivers/nvdimm/security.c @@ -53,7 +53,7 @@ static struct key *nvdimm_request_key(struct nvdimm *nvdimm) struct device *dev = &nvdimm->dev; sprintf(desc, "%s%s", NVDIMM_PREFIX, nvdimm->dimm_id); - key = request_key(&key_type_encrypted, desc, ""); + key = request_key(&key_type_encrypted, desc, "", NULL); if (IS_ERR(key)) { if (PTR_ERR(key) == -ENOKEY) dev_dbg(dev, "request_key() found no key\n"); diff --git a/fs/afs/security.c b/fs/afs/security.c index 5f58a9a17e69..184274ce41e1 100644 --- a/fs/afs/security.c +++ b/fs/afs/security.c @@ -32,7 +32,7 @@ struct key *afs_request_key(struct afs_cell *cell) _debug("key %s", cell->anonymous_key->description); key = request_key(&key_type_rxrpc, cell->anonymous_key->description, - NULL); + NULL, NULL); if (IS_ERR(key)) { if (PTR_ERR(key) != -ENOKEY) { _leave(" = %ld", PTR_ERR(key)); diff --git a/fs/cifs/cifs_spnego.c b/fs/cifs/cifs_spnego.c index 7f01c6e60791..d1b439ad0f1a 100644 --- a/fs/cifs/cifs_spnego.c +++ b/fs/cifs/cifs_spnego.c @@ -32,6 +32,25 @@ #include "cifsproto.h" static const struct cred *spnego_cred; +static struct key_acl 
cifs_spnego_key_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_VIEW | KEY_ACE_SEARCH | KEY_ACE_READ), + KEY_OWNER_ACE(KEY_ACE_VIEW), + } +}; + +static struct key_acl cifs_spnego_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_WRITE), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_CLEAR), + } +}; + /* create a new cifs key */ static int cifs_spnego_key_instantiate(struct key *key, struct key_preparsed_payload *prep) @@ -170,7 +189,8 @@ cifs_get_spnego_key(struct cifs_ses *sesInfo) cifs_dbg(FYI, "key description = %s\n", description); saved_cred = override_creds(spnego_cred); - spnego_key = request_key(&cifs_spnego_key_type, description, ""); + spnego_key = request_key(&cifs_spnego_key_type, description, "", + &cifs_spnego_key_acl); revert_creds(saved_cred); #ifdef CONFIG_CIFS_DEBUG2 @@ -207,8 +227,7 @@ init_cifs_spnego(void) keyring = keyring_alloc(".cifs_spnego", GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, cred, - (KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ, + &cifs_spnego_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA, NULL, NULL); if (IS_ERR(keyring)) { ret = PTR_ERR(keyring); diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c index 1d377b7f2860..78eed72f3af0 100644 --- a/fs/cifs/cifsacl.c +++ b/fs/cifs/cifsacl.c @@ -33,6 +33,25 @@ #include "cifsproto.h" #include "cifs_debug.h" +static struct key_acl cifs_idmap_key_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_VIEW | KEY_ACE_SEARCH | KEY_ACE_READ), + KEY_OWNER_ACE(KEY_ACE_VIEW), + } +}; + +static struct key_acl cifs_idmap_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_WRITE), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ), + } +}; + /* security id for everyone/world system group */ static const struct cifs_sid sid_everyone 
= { 1, 1, {0, 0, 0, 0, 0, 1}, {0} }; @@ -298,7 +317,8 @@ id_to_sid(unsigned int cid, uint sidtype, struct cifs_sid *ssid) rc = 0; saved_cred = override_creds(root_cred); - sidkey = request_key(&cifs_idmap_key_type, desc, ""); + sidkey = request_key(&cifs_idmap_key_type, desc, "", + &cifs_idmap_key_acl); if (IS_ERR(sidkey)) { rc = -EINVAL; cifs_dbg(FYI, "%s: Can't map %cid %u to a SID\n", @@ -403,7 +423,8 @@ sid_to_id(struct cifs_sb_info *cifs_sb, struct cifs_sid *psid, return -ENOMEM; saved_cred = override_creds(root_cred); - sidkey = request_key(&cifs_idmap_key_type, sidstr, ""); + sidkey = request_key(&cifs_idmap_key_type, sidstr, "", + &cifs_idmap_key_acl); if (IS_ERR(sidkey)) { rc = -EINVAL; cifs_dbg(FYI, "%s: Can't map SID %s to a %cid\n", @@ -481,8 +502,7 @@ init_cifs_idmap(void) keyring = keyring_alloc(".cifs_idmap", GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, cred, - (KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ, + &cifs_idmap_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA, NULL, NULL); if (IS_ERR(keyring)) { ret = PTR_ERR(keyring); diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 683310f26171..3b946fcf025c 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -2903,7 +2903,7 @@ cifs_set_cifscreds(struct smb_vol *vol, struct cifs_ses *ses) } cifs_dbg(FYI, "%s: desc=%s\n", __func__, desc); - key = request_key(&key_type_logon, desc, ""); + key = request_key(&key_type_logon, desc, "", NULL); if (IS_ERR(key)) { if (!ses->domainName) { cifs_dbg(FYI, "domainName is NULL\n"); @@ -2914,7 +2914,7 @@ cifs_set_cifscreds(struct smb_vol *vol, struct cifs_ses *ses) /* didn't work, try to find a domain key */ sprintf(desc, "cifs:d:%s", ses->domainName); cifs_dbg(FYI, "%s: desc=%s\n", __func__, desc); - key = request_key(&key_type_logon, desc, ""); + key = request_key(&key_type_logon, desc, "", NULL); if (IS_ERR(key)) { rc = PTR_ERR(key); goto out_err; diff --git a/fs/crypto/keyinfo.c b/fs/crypto/keyinfo.c index 1e11a683f63d..201e8715302b 100644 --- 
a/fs/crypto/keyinfo.c +++ b/fs/crypto/keyinfo.c @@ -92,7 +92,7 @@ find_and_lock_process_key(const char *prefix, if (!description) return ERR_PTR(-ENOMEM); - key = request_key(&key_type_logon, description, NULL); + key = request_key(&key_type_logon, description, NULL, NULL); kfree(description); if (IS_ERR(key)) return key; diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h index e74cb2a0b299..6460bd2a4e9d 100644 --- a/fs/ecryptfs/ecryptfs_kernel.h +++ b/fs/ecryptfs/ecryptfs_kernel.h @@ -105,7 +105,7 @@ ecryptfs_get_encrypted_key_payload_data(struct key *key) static inline struct key *ecryptfs_get_encrypted_key(char *sig) { - return request_key(&key_type_encrypted, sig, NULL); + return request_key(&key_type_encrypted, sig, NULL, NULL); } #else diff --git a/fs/ecryptfs/keystore.c b/fs/ecryptfs/keystore.c index e74fe84d0886..38f4e30ed730 100644 --- a/fs/ecryptfs/keystore.c +++ b/fs/ecryptfs/keystore.c @@ -1625,7 +1625,7 @@ int ecryptfs_keyring_auth_tok_for_sig(struct key **auth_tok_key, { int rc = 0; - (*auth_tok_key) = request_key(&key_type_user, sig, NULL); + (*auth_tok_key) = request_key(&key_type_user, sig, NULL, NULL); if (!(*auth_tok_key) || IS_ERR(*auth_tok_key)) { (*auth_tok_key) = ecryptfs_get_encrypted_key(sig); if (!(*auth_tok_key) || IS_ERR(*auth_tok_key)) { diff --git a/fs/fscache/object-list.c b/fs/fscache/object-list.c index 43e6e28c164f..6a672289e5ec 100644 --- a/fs/fscache/object-list.c +++ b/fs/fscache/object-list.c @@ -321,7 +321,7 @@ static void fscache_objlist_config(struct fscache_objlist_data *data) const char *buf; int len; - key = request_key(&key_type_user, "fscache:objlist", NULL); + key = request_key(&key_type_user, "fscache:objlist", NULL, NULL); if (IS_ERR(key)) goto no_config; diff --git a/fs/nfs/nfs4idmap.c b/fs/nfs/nfs4idmap.c index bf34ddaa2ad7..25f3f2f97ce9 100644 --- a/fs/nfs/nfs4idmap.c +++ b/fs/nfs/nfs4idmap.c @@ -71,6 +71,25 @@ struct idmap { struct mutex idmap_mutex; }; +static struct key_acl 
nfs_idmap_key_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_VIEW | KEY_ACE_SEARCH | KEY_ACE_READ), + KEY_OWNER_ACE(KEY_ACE_VIEW), + } +}; + +static struct key_acl nfs_idmap_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_WRITE), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ), + } +}; + /** * nfs_fattr_init_names - initialise the nfs_fattr owner_name/group_name fields * @fattr: fully initialised struct nfs_fattr @@ -200,8 +219,7 @@ int nfs_idmap_init(void) keyring = keyring_alloc(".id_resolver", GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, cred, - (KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ, + &nfs_idmap_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA, NULL, NULL); if (IS_ERR(keyring)) { ret = PTR_ERR(keyring); @@ -278,11 +296,12 @@ static struct key *nfs_idmap_request_key(const char *name, size_t namelen, if (ret < 0) return ERR_PTR(ret); - rkey = request_key(&key_type_id_resolver, desc, ""); + rkey = request_key(&key_type_id_resolver, desc, "", &nfs_idmap_key_acl); if (IS_ERR(rkey)) { mutex_lock(&idmap->idmap_mutex); rkey = request_key_with_auxdata(&key_type_id_resolver_legacy, - desc, "", 0, idmap); + desc, "", 0, idmap, + &nfs_idmap_key_acl); mutex_unlock(&idmap->idmap_mutex); } if (!IS_ERR(rkey)) @@ -311,8 +330,6 @@ static ssize_t nfs_idmap_get_key(const char *name, size_t namelen, } rcu_read_lock(); - rkey->perm |= KEY_USR_VIEW; - ret = key_validate(rkey); if (ret < 0) goto out_up; diff --git a/fs/ubifs/auth.c b/fs/ubifs/auth.c index 5bf5fd08879e..38bae9737166 100644 --- a/fs/ubifs/auth.c +++ b/fs/ubifs/auth.c @@ -246,7 +246,7 @@ int ubifs_init_authentication(struct ubifs_info *c) snprintf(hmac_name, CRYPTO_MAX_ALG_NAME, "hmac(%s)", c->auth_hash_name); - keyring_key = request_key(&key_type_logon, c->auth_key_name, NULL); + keyring_key = request_key(&key_type_logon, c->auth_key_name, NULL, NULL); if (IS_ERR(keyring_key)) { 
ubifs_err(c, "Failed to request key: %ld", diff --git a/include/linux/key.h b/include/linux/key.h index de190036512b..a38b89bd414c 100644 --- a/include/linux/key.h +++ b/include/linux/key.h @@ -32,49 +32,14 @@ /* key handle serial number */ typedef int32_t key_serial_t; -/* key handle permissions mask */ -typedef uint32_t key_perm_t; - struct key; struct net; #ifdef CONFIG_KEYS -#undef KEY_DEBUGGING +#include -#define KEY_POS_VIEW 0x01000000 /* possessor can view a key's attributes */ -#define KEY_POS_READ 0x02000000 /* possessor can read key payload / view keyring */ -#define KEY_POS_WRITE 0x04000000 /* possessor can update key payload / add link to keyring */ -#define KEY_POS_SEARCH 0x08000000 /* possessor can find a key in search / search a keyring */ -#define KEY_POS_LINK 0x10000000 /* possessor can create a link to a key/keyring */ -#define KEY_POS_SETATTR 0x20000000 /* possessor can set key attributes */ -#define KEY_POS_ALL 0x3f000000 - -#define KEY_USR_VIEW 0x00010000 /* user permissions... */ -#define KEY_USR_READ 0x00020000 -#define KEY_USR_WRITE 0x00040000 -#define KEY_USR_SEARCH 0x00080000 -#define KEY_USR_LINK 0x00100000 -#define KEY_USR_SETATTR 0x00200000 -#define KEY_USR_ALL 0x003f0000 - -#define KEY_GRP_VIEW 0x00000100 /* group permissions... */ -#define KEY_GRP_READ 0x00000200 -#define KEY_GRP_WRITE 0x00000400 -#define KEY_GRP_SEARCH 0x00000800 -#define KEY_GRP_LINK 0x00001000 -#define KEY_GRP_SETATTR 0x00002000 -#define KEY_GRP_ALL 0x00003f00 - -#define KEY_OTH_VIEW 0x00000001 /* third party permissions... 
*/ -#define KEY_OTH_READ 0x00000002 -#define KEY_OTH_WRITE 0x00000004 -#define KEY_OTH_SEARCH 0x00000008 -#define KEY_OTH_LINK 0x00000010 -#define KEY_OTH_SETATTR 0x00000020 -#define KEY_OTH_ALL 0x0000003f - -#define KEY_PERM_UNDEF 0xffffffff +#undef KEY_DEBUGGING struct seq_file; struct user_struct; @@ -118,6 +83,36 @@ union key_payload { void *data[4]; }; +struct key_ace { + unsigned int type; + unsigned int perm; + union { + kuid_t uid; + kgid_t gid; + unsigned int subject_id; + }; +}; + +struct key_acl { + refcount_t usage; + unsigned short nr_ace; + bool possessor_viewable; + struct rcu_head rcu; + struct key_ace aces[]; +}; + +#define KEY_POSSESSOR_ACE(perms) { \ + .type = KEY_ACE_SUBJ_STANDARD, \ + .perm = perms, \ + .subject_id = KEY_ACE_POSSESSOR \ + } + +#define KEY_OWNER_ACE(perms) { \ + .type = KEY_ACE_SUBJ_STANDARD, \ + .perm = perms, \ + .subject_id = KEY_ACE_OWNER \ + } + /*****************************************************************************/ /* * key reference with possession attribute handling @@ -187,6 +182,7 @@ struct key { struct rw_semaphore sem; /* change vs change sem */ struct key_user *user; /* owner of this key */ void *security; /* security data for this key */ + struct key_acl __rcu *acl; union { time64_t expiry; /* time at which key expires (or 0) */ time64_t revoked_at; /* time at which key was revoked */ @@ -194,7 +190,6 @@ struct key { time64_t last_used_at; /* last time used for LRU keyring discard */ kuid_t uid; kgid_t gid; - key_perm_t perm; /* access permissions */ unsigned short quotalen; /* length added to quota */ unsigned short datalen; /* payload data length * - may not match RCU dereferenced payload @@ -220,6 +215,7 @@ struct key { #define KEY_FLAG_UID_KEYRING 9 /* set if key is a user or user session keyring */ #define KEY_FLAG_SET_WATCH_PROXY 10 /* Set if watch_proxy should be set on added keys */ #define KEY_FLAG_SEEN 11 /* Set if returned by keyctl_find_oldest_key() */ +#define KEY_FLAG_HAS_ACL 12 /* Set if 
KEYCTL_SETACL called on key */ /* the key type and key description string * - the desc is used to match a key against search criteria @@ -268,7 +264,7 @@ extern struct key *key_alloc(struct key_type *type, const char *desc, kuid_t uid, kgid_t gid, const struct cred *cred, - key_perm_t perm, + struct key_acl *acl, unsigned long flags, struct key_restriction *restrict_link); @@ -304,18 +300,21 @@ static inline void key_ref_put(key_ref_t key_ref) extern struct key *request_key(struct key_type *type, const char *description, - const char *callout_info); + const char *callout_info, + struct key_acl *acl); extern struct key *request_key_with_auxdata(struct key_type *type, const char *description, const void *callout_info, size_t callout_len, - void *aux); + void *aux, + struct key_acl *acl); extern struct key *request_key_net(struct key_type *type, const char *description, struct net *net, - const char *callout_info); + const char *callout_info, + struct key_acl *acl); extern int wait_for_key_construction(struct key *key, bool intr); @@ -326,7 +325,7 @@ extern key_ref_t key_create_or_update(key_ref_t keyring, const char *description, const void *payload, size_t plen, - key_perm_t perm, + struct key_acl *acl, unsigned long flags); extern int key_update(key_ref_t key, @@ -346,7 +345,7 @@ extern int key_unlink(struct key *keyring, extern struct key *keyring_alloc(const char *description, kuid_t uid, kgid_t gid, const struct cred *cred, - key_perm_t perm, + struct key_acl *acl, unsigned long flags, struct key_restriction *restrict_link, struct key *dest); @@ -378,19 +377,29 @@ static inline key_serial_t key_serial(const struct key *key) extern void key_set_timeout(struct key *, unsigned); extern key_ref_t lookup_user_key(key_serial_t id, unsigned long flags, - key_perm_t perm); + u32 desired_perm); extern void key_free_user_ns(struct user_namespace *); /* * The permissions required on a key that we're looking up. 
*/ -#define KEY_NEED_VIEW 0x01 /* Require permission to view attributes */ -#define KEY_NEED_READ 0x02 /* Require permission to read content */ -#define KEY_NEED_WRITE 0x04 /* Require permission to update / modify */ -#define KEY_NEED_SEARCH 0x08 /* Require permission to search (keyring) or find (key) */ -#define KEY_NEED_LINK 0x10 /* Require permission to link */ -#define KEY_NEED_SETATTR 0x20 /* Require permission to change attributes */ -#define KEY_NEED_ALL 0x3f /* All the above permissions */ +#define KEY_NEED_VIEW 0x001 /* Require permission to view attributes */ +#define KEY_NEED_READ 0x002 /* Require permission to read content */ +#define KEY_NEED_WRITE 0x004 /* Require permission to update / modify */ +#define KEY_NEED_SEARCH 0x008 /* Require permission to search (keyring) or find (key) */ +#define KEY_NEED_LINK 0x010 /* Require permission to link */ +#define KEY_NEED_SETSEC 0x020 /* Require permission to set owner, group, ACL */ +#define KEY_NEED_INVAL 0x040 /* Require permission to invalidate key */ +#define KEY_NEED_REVOKE 0x080 /* Require permission to revoke key */ +#define KEY_NEED_JOIN 0x100 /* Require permission to join keyring as session */ +#define KEY_NEED_CLEAR 0x200 /* Require permission to clear a keyring */ +#define KEY_NEED_ALL 0x3ff + +#define OLD_KEY_NEED_SETATTR 0x20 /* Used to be Require permission to change attributes */ + +extern struct key_acl internal_key_acl; +extern struct key_acl internal_keyring_acl; +extern struct key_acl internal_writable_keyring_acl; static inline short key_read_state(const struct key *key) { diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h index a2afb4512f34..50d7b6ca82ab 100644 --- a/include/uapi/linux/keyctl.h +++ b/include/uapi/linux/keyctl.h @@ -15,6 +15,69 @@ #include +/* + * Keyring permission grant definitions + */ +enum key_ace_subject_type { + KEY_ACE_SUBJ_STANDARD = 0, /* subject is one of key_ace_standard_subject */ + nr__key_ace_subject_type +}; + +enum 
key_ace_standard_subject { + KEY_ACE_EVERYONE = 0, /* Everyone, including owner and group */ + KEY_ACE_GROUP = 1, /* The key's group */ + KEY_ACE_OWNER = 2, /* The owner of the key */ + KEY_ACE_POSSESSOR = 3, /* Any process that possesses the key */ + nr__key_ace_standard_subject +}; + +#define KEY_ACE_VIEW 0x00000001 /* Can describe the key */ +#define KEY_ACE_READ 0x00000002 /* Can read the key content */ +#define KEY_ACE_WRITE 0x00000004 /* Can update/modify the key content */ +#define KEY_ACE_SEARCH 0x00000008 /* Can find the key by search */ +#define KEY_ACE_LINK 0x00000010 /* Can make a link to the key */ +#define KEY_ACE_SET_SECURITY 0x00000020 /* Can set owner, group, ACL */ +#define KEY_ACE_INVAL 0x00000040 /* Can invalidate the key */ +#define KEY_ACE_REVOKE 0x00000080 /* Can revoke the key */ +#define KEY_ACE_JOIN 0x00000100 /* Can join keyring */ +#define KEY_ACE_CLEAR 0x00000200 /* Can clear keyring */ +#define KEY_ACE__PERMS 0xffffffff + +/* + * Old-style permissions mask, deprecated in favour of ACL. + */ +#define KEY_POS_VIEW 0x01000000 /* possessor can view a key's attributes */ +#define KEY_POS_READ 0x02000000 /* possessor can read key payload / view keyring */ +#define KEY_POS_WRITE 0x04000000 /* possessor can update key payload / add link to keyring */ +#define KEY_POS_SEARCH 0x08000000 /* possessor can find a key in search / search a keyring */ +#define KEY_POS_LINK 0x10000000 /* possessor can create a link to a key/keyring */ +#define KEY_POS_SETATTR 0x20000000 /* possessor can set key attributes */ +#define KEY_POS_ALL 0x3f000000 + +#define KEY_USR_VIEW 0x00010000 /* user permissions... */ +#define KEY_USR_READ 0x00020000 +#define KEY_USR_WRITE 0x00040000 +#define KEY_USR_SEARCH 0x00080000 +#define KEY_USR_LINK 0x00100000 +#define KEY_USR_SETATTR 0x00200000 +#define KEY_USR_ALL 0x003f0000 + +#define KEY_GRP_VIEW 0x00000100 /* group permissions... 
*/ +#define KEY_GRP_READ 0x00000200 +#define KEY_GRP_WRITE 0x00000400 +#define KEY_GRP_SEARCH 0x00000800 +#define KEY_GRP_LINK 0x00001000 +#define KEY_GRP_SETATTR 0x00002000 +#define KEY_GRP_ALL 0x00003f00 + +#define KEY_OTH_VIEW 0x00000001 /* third party permissions... */ +#define KEY_OTH_READ 0x00000002 +#define KEY_OTH_WRITE 0x00000004 +#define KEY_OTH_SEARCH 0x00000008 +#define KEY_OTH_LINK 0x00000010 +#define KEY_OTH_SETATTR 0x00000020 +#define KEY_OTH_ALL 0x0000003f + /* special process keyring shortcut IDs */ #define KEY_SPEC_THREAD_KEYRING -1 /* - key ID for thread-specific keyring */ #define KEY_SPEC_PROCESS_KEYRING -2 /* - key ID for process-specific keyring */ diff --git a/lib/digsig.c b/lib/digsig.c index 6ba6fcd92dd1..8cfa53585267 100644 --- a/lib/digsig.c +++ b/lib/digsig.c @@ -227,7 +227,7 @@ int digsig_verify(struct key *keyring, const char *sig, int siglen, else key = key_ref_to_ptr(kref); } else { - key = request_key(&key_type_user, name, NULL); + key = request_key(&key_type_user, name, NULL, NULL); } if (IS_ERR(key)) { pr_err("key not found, id: %s\n", name); diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c index 9cab80207ced..c6efe800392e 100644 --- a/net/ceph/ceph_common.c +++ b/net/ceph/ceph_common.c @@ -305,7 +305,7 @@ static int get_secret(struct ceph_crypto_key *dst, const char *name) { int err = 0; struct ceph_crypto_key *ckey; - ukey = request_key(&key_type_ceph, name, NULL); + ukey = request_key(&key_type_ceph, name, NULL, NULL); if (IS_ERR(ukey)) { /* request_key errors don't map nicely to mount(2) errors; don't even try, but still printk */ diff --git a/net/dns_resolver/dns_key.c b/net/dns_resolver/dns_key.c index 3e1a90669006..6b201531b165 100644 --- a/net/dns_resolver/dns_key.c +++ b/net/dns_resolver/dns_key.c @@ -46,6 +46,15 @@ const struct cred *dns_resolver_cache; #define DNS_ERRORNO_OPTION "dnserror" +static struct key_acl dns_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + 
KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_WRITE), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_CLEAR), + } +}; + /* * Preparse instantiation data for a dns_resolver key. * @@ -343,8 +352,7 @@ static int __init init_dns_resolver(void) keyring = keyring_alloc(".dns_resolver", GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, cred, - (KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ, + &dns_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA, NULL, NULL); if (IS_ERR(keyring)) { ret = PTR_ERR(keyring); diff --git a/net/dns_resolver/dns_query.c b/net/dns_resolver/dns_query.c index d88ea98da63e..3a6436a7931a 100644 --- a/net/dns_resolver/dns_query.c +++ b/net/dns_resolver/dns_query.c @@ -46,6 +46,16 @@ #include "internal.h" +static struct key_acl dns_key_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_VIEW | KEY_ACE_SEARCH | KEY_ACE_READ), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_INVAL), + } +}; + /** * dns_query - Query the DNS * @net: The network namespace to operate in. 
@@ -124,7 +134,8 @@ int dns_query(struct net *net, * add_key() to preinstall malicious redirections */ saved_cred = override_creds(dns_resolver_cache); - rkey = request_key_net(&key_type_dns_resolver, desc, net, options); + rkey = request_key_net(&key_type_dns_resolver, desc, net, options, + &dns_key_acl); revert_creds(saved_cred); kfree(desc); if (IS_ERR(rkey)) { @@ -134,8 +145,6 @@ int dns_query(struct net *net, down_read(&rkey->sem); set_bit(KEY_FLAG_ROOT_CAN_INVAL, &rkey->flags); - rkey->perm |= KEY_USR_VIEW; - ret = key_validate(rkey); if (ret < 0) goto put; diff --git a/net/rxrpc/key.c b/net/rxrpc/key.c index 1cc6b0c6cc42..284d7a025fbc 100644 --- a/net/rxrpc/key.c +++ b/net/rxrpc/key.c @@ -27,6 +27,14 @@ #include #include "ar-internal.h" +static struct key_acl rxrpc_null_key_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 1, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_READ), + } +}; + static int rxrpc_vet_description_s(const char *); static int rxrpc_preparse(struct key_preparsed_payload *); static int rxrpc_preparse_s(struct key_preparsed_payload *); @@ -914,7 +922,8 @@ int rxrpc_request_key(struct rxrpc_sock *rx, char __user *optval, int optlen) if (IS_ERR(description)) return PTR_ERR(description); - key = request_key_net(&key_type_rxrpc, description, sock_net(&rx->sk), NULL); + key = request_key_net(&key_type_rxrpc, description, sock_net(&rx->sk), + NULL, NULL); if (IS_ERR(key)) { kfree(description); _leave(" = %ld", PTR_ERR(key)); @@ -945,7 +954,8 @@ int rxrpc_server_keyring(struct rxrpc_sock *rx, char __user *optval, if (IS_ERR(description)) return PTR_ERR(description); - key = request_key_net(&key_type_keyring, description, sock_net(&rx->sk), NULL); + key = request_key_net(&key_type_keyring, description, sock_net(&rx->sk), + NULL, NULL); if (IS_ERR(key)) { kfree(description); _leave(" = %ld", PTR_ERR(key)); @@ -1026,7 +1036,7 @@ struct key *rxrpc_get_null_key(const char *keyname) key = key_alloc(&key_type_rxrpc, keyname, GLOBAL_ROOT_UID, 
GLOBAL_ROOT_GID, cred, - KEY_POS_SEARCH, KEY_ALLOC_NOT_IN_QUOTA, NULL); + &rxrpc_null_key_acl, KEY_ALLOC_NOT_IN_QUOTA, NULL); if (IS_ERR(key)) return key; diff --git a/security/integrity/digsig.c b/security/integrity/digsig.c index f45d6edecf99..c666dc72006a 100644 --- a/security/integrity/digsig.c +++ b/security/integrity/digsig.c @@ -51,7 +51,8 @@ int integrity_digsig_verify(const unsigned int id, const char *sig, int siglen, if (!keyring[id]) { keyring[id] = - request_key(&key_type_keyring, keyring_name[id], NULL); + request_key(&key_type_keyring, keyring_name[id], + NULL, NULL); if (IS_ERR(keyring[id])) { int err = PTR_ERR(keyring[id]); pr_err("no %s keyring: %d\n", keyring_name[id], err); @@ -73,14 +74,14 @@ int integrity_digsig_verify(const unsigned int id, const char *sig, int siglen, return -EOPNOTSUPP; } -static int __integrity_init_keyring(const unsigned int id, key_perm_t perm, +static int __integrity_init_keyring(const unsigned int id, struct key_acl *acl, struct key_restriction *restriction) { const struct cred *cred = current_cred(); int err = 0; keyring[id] = keyring_alloc(keyring_name[id], KUIDT_INIT(0), - KGIDT_INIT(0), cred, perm, + KGIDT_INIT(0), cred, acl, KEY_ALLOC_NOT_IN_QUOTA, restriction, NULL); if (IS_ERR(keyring[id])) { err = PTR_ERR(keyring[id]); @@ -95,10 +96,7 @@ static int __integrity_init_keyring(const unsigned int id, key_perm_t perm, int __init integrity_init_keyring(const unsigned int id) { struct key_restriction *restriction; - key_perm_t perm; - - perm = (KEY_POS_ALL & ~KEY_POS_SETATTR) | KEY_USR_VIEW - | KEY_USR_READ | KEY_USR_SEARCH; + struct key_acl *acl = &internal_keyring_acl; if (id == INTEGRITY_KEYRING_PLATFORM) { restriction = NULL; @@ -113,14 +111,14 @@ int __init integrity_init_keyring(const unsigned int id) return -ENOMEM; restriction->check = restrict_link_to_ima; - perm |= KEY_USR_WRITE; + acl = &internal_writable_keyring_acl; out: - return __integrity_init_keyring(id, perm, restriction); + return 
__integrity_init_keyring(id, acl, restriction); } -int __init integrity_add_key(const unsigned int id, const void *data, - off_t size, key_perm_t perm) +static int __init integrity_add_key(const unsigned int id, const void *data, + off_t size, struct key_acl *acl) { key_ref_t key; int rc = 0; @@ -129,7 +127,7 @@ int __init integrity_add_key(const unsigned int id, const void *data, return -EINVAL; key = key_create_or_update(make_key_ref(keyring[id], 1), "asymmetric", - NULL, data, size, perm, + NULL, data, size, acl ?: &internal_key_acl, KEY_ALLOC_NOT_IN_QUOTA); if (IS_ERR(key)) { rc = PTR_ERR(key); @@ -149,7 +147,6 @@ int __init integrity_load_x509(const unsigned int id, const char *path) void *data; loff_t size; int rc; - key_perm_t perm; rc = kernel_read_file_from_path(path, &data, &size, 0, READING_X509_CERTIFICATE); @@ -158,21 +155,19 @@ int __init integrity_load_x509(const unsigned int id, const char *path) return rc; } - perm = (KEY_POS_ALL & ~KEY_POS_SETATTR) | KEY_USR_VIEW | KEY_USR_READ; - pr_info("Loading X.509 certificate: %s\n", path); - rc = integrity_add_key(id, (const void *)data, size, perm); + rc = integrity_add_key(id, data, size, NULL); vfree(data); return rc; } int __init integrity_load_cert(const unsigned int id, const char *source, - const void *data, size_t len, key_perm_t perm) + const void *data, size_t len, struct key_acl *acl) { if (!data) return -EINVAL; pr_info("Loading X.509 certificate: %s\n", source); - return integrity_add_key(id, data, len, perm); + return integrity_add_key(id, data, len, acl); } diff --git a/security/integrity/digsig_asymmetric.c b/security/integrity/digsig_asymmetric.c index d775e03fbbcc..017cb6db521d 100644 --- a/security/integrity/digsig_asymmetric.c +++ b/security/integrity/digsig_asymmetric.c @@ -57,7 +57,7 @@ static struct key *request_asymmetric_key(struct key *keyring, uint32_t keyid) else key = key_ref_to_ptr(kref); } else { - key = request_key(&key_type_asymmetric, name, NULL); + key = 
request_key(&key_type_asymmetric, name, NULL, NULL); } if (IS_ERR(key)) { diff --git a/security/integrity/evm/evm_crypto.c b/security/integrity/evm/evm_crypto.c index 43e2dc3a60d0..945f42b762e4 100644 --- a/security/integrity/evm/evm_crypto.c +++ b/security/integrity/evm/evm_crypto.c @@ -358,7 +358,7 @@ int evm_init_key(void) struct encrypted_key_payload *ekp; int rc; - evm_key = request_key(&key_type_encrypted, EVMKEY, NULL); + evm_key = request_key(&key_type_encrypted, EVMKEY, NULL, NULL); if (IS_ERR(evm_key)) return -ENOENT; diff --git a/security/integrity/ima/ima_mok.c b/security/integrity/ima/ima_mok.c index 073ddc9bce5b..ce48303cfacc 100644 --- a/security/integrity/ima/ima_mok.c +++ b/security/integrity/ima/ima_mok.c @@ -21,6 +21,15 @@ #include +static struct key_acl integrity_blacklist_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_WRITE), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_WRITE | KEY_ACE_SEARCH), + } +}; + struct key *ima_blacklist_keyring; /* @@ -40,9 +49,7 @@ __init int ima_mok_init(void) ima_blacklist_keyring = keyring_alloc(".ima_blacklist", KUIDT_INIT(0), KGIDT_INIT(0), current_cred(), - (KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ | - KEY_USR_WRITE | KEY_USR_SEARCH, + &integrity_blacklist_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA, restriction, NULL); diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h index 7de59f44cba3..fbc1264af55f 100644 --- a/security/integrity/integrity.h +++ b/security/integrity/integrity.h @@ -154,7 +154,7 @@ int integrity_digsig_verify(const unsigned int id, const char *sig, int siglen, int __init integrity_init_keyring(const unsigned int id); int __init integrity_load_x509(const unsigned int id, const char *path); int __init integrity_load_cert(const unsigned int id, const char *source, - const void *data, size_t len, key_perm_t perm); + const void *data, size_t len, struct key_acl *acl); #else 
static inline int integrity_digsig_verify(const unsigned int id, @@ -172,7 +172,7 @@ static inline int integrity_init_keyring(const unsigned int id) static inline int __init integrity_load_cert(const unsigned int id, const char *source, const void *data, size_t len, - key_perm_t perm) + struct key_acl *acl) { return 0; } diff --git a/security/integrity/platform_certs/platform_keyring.c b/security/integrity/platform_certs/platform_keyring.c index bcafd7387729..80bb6f750045 100644 --- a/security/integrity/platform_certs/platform_keyring.c +++ b/security/integrity/platform_certs/platform_keyring.c @@ -14,6 +14,15 @@ #include #include "../integrity.h" +static struct key_acl platform_key_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_READ), + KEY_OWNER_ACE(KEY_ACE_VIEW), + } +}; + /** * add_to_platform_keyring - Add to platform keyring without validation. * @source: Source of key @@ -29,10 +38,7 @@ void __init add_to_platform_keyring(const char *source, const void *data, - key_perm_t perm; int rc; - perm = (KEY_POS_ALL & ~KEY_POS_SETATTR) | KEY_USR_VIEW; - rc = integrity_load_cert(INTEGRITY_KEYRING_PLATFORM, source, data, len, - perm); + &platform_key_acl); if (rc) pr_info("Error adding keys to platform keyring %s\n", source); } diff --git a/security/keys/encrypted-keys/encrypted.c b/security/keys/encrypted-keys/encrypted.c index 389a298274d3..376068ec5a4e 100644 --- a/security/keys/encrypted-keys/encrypted.c +++ b/security/keys/encrypted-keys/encrypted.c @@ -307,7 +307,7 @@ static struct key *request_user_key(const char *master_desc, const u8 **master_k const struct user_key_payload *upayload; struct key *ukey; - ukey = request_key(&key_type_user, master_desc, NULL); + ukey = request_key(&key_type_user, master_desc, NULL, NULL); if (IS_ERR(ukey)) goto error; diff --git a/security/keys/encrypted-keys/masterkey_trusted.c b/security/keys/encrypted-keys/masterkey_trusted.c index dc3d18cae642..3322e7eeafce 100644 
--- a/security/keys/encrypted-keys/masterkey_trusted.c +++ b/security/keys/encrypted-keys/masterkey_trusted.c @@ -33,7 +33,7 @@ struct key *request_trusted_key(const char *trusted_desc, struct trusted_key_payload *tpayload; struct key *tkey; - tkey = request_key(&key_type_trusted, trusted_desc, NULL); + tkey = request_key(&key_type_trusted, trusted_desc, NULL, NULL); if (IS_ERR(tkey)) goto error; diff --git a/security/keys/gc.c b/security/keys/gc.c index c39721163d43..cb667becf224 100644 --- a/security/keys/gc.c +++ b/security/keys/gc.c @@ -160,6 +160,7 @@ static noinline void key_gc_unused_keys(struct list_head *keys) key_user_put(key->user); key_put_tag(key->domain_tag); + key_put_acl(key->acl); kfree(key->description); memzero_explicit(key, sizeof(*key)); @@ -229,7 +230,6 @@ static void key_garbage_collector(struct work_struct *work) if (key->type == key_gc_dead_keytype) { gc_state |= KEY_GC_FOUND_DEAD_KEY; set_bit(KEY_FLAG_DEAD, &key->flags); - key->perm = 0; goto skip_dead_key; } else if (key->type == &key_type_keyring && key->restrict_link) { diff --git a/security/keys/internal.h b/security/keys/internal.h index 6be76caee874..9f9ecc1810c9 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -89,8 +89,11 @@ extern struct rb_root key_serial_tree; extern spinlock_t key_serial_lock; extern struct mutex key_construction_mutex; extern wait_queue_head_t request_key_conswq; +extern struct key_acl default_key_acl; +extern struct key_acl joinable_keyring_acl; extern void key_set_index_key(struct keyring_index_key *index_key); + extern struct key_type *key_type_lookup(const char *type); extern void key_type_put(struct key_type *ktype); extern int key_get_type_from_user(char *, const char __user *, unsigned); @@ -157,6 +160,7 @@ extern struct key *request_key_and_link(struct key_type *type, const void *callout_info, size_t callout_len, void *aux, + struct key_acl *acl, struct key *dest_keyring, unsigned long flags); @@ -180,7 +184,11 @@ extern void 
key_gc_keytype(struct key_type *ktype); extern int key_task_permission(const key_ref_t key_ref, const struct cred *cred, - key_perm_t perm); + u32 desired_perm); +extern unsigned int key_acl_to_perm(const struct key_acl *acl); +extern long key_set_acl(struct key *key, struct key_acl *acl); +extern void key_put_acl(struct key_acl *acl); + #ifdef CONFIG_CONTAINERS extern int queue_request_key(struct key *); #else @@ -249,7 +257,7 @@ extern long keyctl_keyring_search(key_serial_t, const char __user *, const char __user *, key_serial_t); extern long keyctl_read_key(key_serial_t, char __user *, size_t); extern long keyctl_chown_key(key_serial_t, uid_t, gid_t); -extern long keyctl_setperm_key(key_serial_t, key_perm_t); +extern long keyctl_setperm_key(key_serial_t, unsigned int); extern long keyctl_instantiate_key(key_serial_t, const void __user *, size_t, key_serial_t); extern long keyctl_negate_key(key_serial_t, unsigned, key_serial_t); diff --git a/security/keys/key.c b/security/keys/key.c index 63513ffcf2e8..bca9d01c05fa 100644 --- a/security/keys/key.c +++ b/security/keys/key.c @@ -199,7 +199,7 @@ static inline void key_alloc_serial(struct key *key) * @uid: The owner of the new key. * @gid: The group ID for the new key's group permissions. * @cred: The credentials specifying UID namespace. - * @perm: The permissions mask of the new key. + * @acl: The ACL to attach to the new key. * @flags: Flags specifying quota properties. * @restrict_link: Optional link restriction for new keyrings. 
* @@ -227,7 +227,7 @@ static inline void key_alloc_serial(struct key *key) */ struct key *key_alloc(struct key_type *type, const char *desc, kuid_t uid, kgid_t gid, const struct cred *cred, - key_perm_t perm, unsigned long flags, + struct key_acl *acl, unsigned long flags, struct key_restriction *restrict_link) { struct key_user *user = NULL; @@ -250,6 +250,9 @@ struct key *key_alloc(struct key_type *type, const char *desc, desclen = strlen(desc); quotalen = desclen + 1 + type->def_datalen; + if (!acl) + acl = &default_key_acl; + /* get hold of the key tracking for this user */ user = key_user_lookup(uid); if (!user) @@ -296,7 +299,8 @@ struct key *key_alloc(struct key_type *type, const char *desc, key->datalen = type->def_datalen; key->uid = uid; key->gid = gid; - key->perm = perm; + refcount_inc(&acl->usage); + rcu_assign_pointer(key->acl, acl); key->restrict_link = restrict_link; key->last_used_at = ktime_get_real_seconds(); @@ -785,7 +789,7 @@ static inline key_ref_t __key_update(key_ref_t key_ref, * @description: The searchable description for the key. * @payload: The data to use to instantiate or update the key. * @plen: The length of @payload. - * @perm: The permissions mask for a new key. + * @acl: The ACL to attach if a key is created. * @flags: The quota flags for a new key. 
* * Search the destination keyring for a key of the same description and if one @@ -808,7 +812,7 @@ key_ref_t key_create_or_update(key_ref_t keyring_ref, const char *description, const void *payload, size_t plen, - key_perm_t perm, + struct key_acl *acl, unsigned long flags) { struct keyring_index_key index_key = { @@ -899,22 +903,9 @@ key_ref_t key_create_or_update(key_ref_t keyring_ref, goto found_matching_key; } - /* if the client doesn't provide, decide on the permissions we want */ - if (perm == KEY_PERM_UNDEF) { - perm = KEY_POS_VIEW | KEY_POS_SEARCH | KEY_POS_LINK | KEY_POS_SETATTR; - perm |= KEY_USR_VIEW; - - if (index_key.type->read) - perm |= KEY_POS_READ; - - if (index_key.type == &key_type_keyring || - index_key.type->update) - perm |= KEY_POS_WRITE; - } - /* allocate a new key */ key = key_alloc(index_key.type, index_key.description, - cred->fsuid, cred->fsgid, cred, perm, flags, NULL); + cred->fsuid, cred->fsgid, cred, acl, flags, NULL); if (IS_ERR(key)) { key_ref = ERR_CAST(key); goto error_link_end; diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index a25799249b8a..2df896bfb8e4 100644 --- a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -120,8 +120,7 @@ SYSCALL_DEFINE5(add_key, const char __user *, _type, /* create or update the requested key and add it to the target * keyring */ key_ref = key_create_or_update(keyring_ref, type, description, - payload, plen, KEY_PERM_UNDEF, - KEY_ALLOC_IN_QUOTA); + payload, plen, NULL, KEY_ALLOC_IN_QUOTA); if (!IS_ERR(key_ref)) { ret = key_ref_to_ptr(key_ref)->serial; key_ref_put(key_ref); @@ -211,7 +210,8 @@ SYSCALL_DEFINE4(request_key, const char __user *, _type, /* do the search */ key = request_key_and_link(ktype, description, NULL, callout_info, - callout_len, NULL, key_ref_to_ptr(dest_ref), + callout_len, NULL, NULL, + key_ref_to_ptr(dest_ref), KEY_ALLOC_IN_QUOTA); if (IS_ERR(key)) { ret = PTR_ERR(key); @@ -373,16 +373,10 @@ long keyctl_revoke_key(key_serial_t id) struct key *key; long 
ret; - key_ref = lookup_user_key(id, 0, KEY_NEED_WRITE); + key_ref = lookup_user_key(id, 0, KEY_NEED_REVOKE); if (IS_ERR(key_ref)) { ret = PTR_ERR(key_ref); - if (ret != -EACCES) - goto error; - key_ref = lookup_user_key(id, 0, KEY_NEED_SETATTR); - if (IS_ERR(key_ref)) { - ret = PTR_ERR(key_ref); - goto error; - } + goto error; } key = key_ref_to_ptr(key_ref); @@ -416,7 +410,7 @@ long keyctl_invalidate_key(key_serial_t id) kenter("%d", id); - key_ref = lookup_user_key(id, 0, KEY_NEED_SEARCH); + key_ref = lookup_user_key(id, 0, KEY_NEED_INVAL); if (IS_ERR(key_ref)) { ret = PTR_ERR(key_ref); @@ -461,7 +455,7 @@ long keyctl_keyring_clear(key_serial_t ringid) struct key *keyring; long ret; - keyring_ref = lookup_user_key(ringid, KEY_LOOKUP_CREATE, KEY_NEED_WRITE); + keyring_ref = lookup_user_key(ringid, KEY_LOOKUP_CREATE, KEY_NEED_CLEAR); if (IS_ERR(keyring_ref)) { ret = PTR_ERR(keyring_ref); @@ -639,6 +633,7 @@ long keyctl_describe_key(key_serial_t keyid, size_t buflen) { struct key *key, *instkey; + unsigned int perm; key_ref_t key_ref; char *infobuf; long ret; @@ -668,6 +663,10 @@ long keyctl_describe_key(key_serial_t keyid, key = key_ref_to_ptr(key_ref); desclen = strlen(key->description); + rcu_read_lock(); + perm = key_acl_to_perm(rcu_dereference(key->acl)); + rcu_read_unlock(); + /* calculate how much information we're going to return */ ret = -ENOMEM; infobuf = kasprintf(GFP_KERNEL, @@ -675,7 +674,7 @@ long keyctl_describe_key(key_serial_t keyid, key->type->name, from_kuid_munged(current_user_ns(), key->uid), from_kgid_munged(current_user_ns(), key->gid), - key->perm); + perm); if (!infobuf) goto error2; infolen = strlen(infobuf); @@ -892,7 +891,7 @@ long keyctl_chown_key(key_serial_t id, uid_t user, gid_t group) goto error; key_ref = lookup_user_key(id, KEY_LOOKUP_CREATE | KEY_LOOKUP_PARTIAL, - KEY_NEED_SETATTR); + KEY_NEED_SETSEC); if (IS_ERR(key_ref)) { ret = PTR_ERR(key_ref); goto error; @@ -988,18 +987,25 @@ long keyctl_chown_key(key_serial_t id, uid_t 
user, gid_t group) * the key need not be fully instantiated yet. If the caller does not have * sysadmin capability, it may only change the permission on keys that it owns. */ -long keyctl_setperm_key(key_serial_t id, key_perm_t perm) +long keyctl_setperm_key(key_serial_t id, unsigned int perm) { + struct key_acl *acl; struct key *key; key_ref_t key_ref; long ret; + int nr, i, j; - ret = -EINVAL; if (perm & ~(KEY_POS_ALL | KEY_USR_ALL | KEY_GRP_ALL | KEY_OTH_ALL)) - goto error; + return -EINVAL; + + nr = 0; + if (perm & KEY_POS_ALL) nr++; + if (perm & KEY_USR_ALL) nr++; + if (perm & KEY_GRP_ALL) nr++; + if (perm & KEY_OTH_ALL) nr++; key_ref = lookup_user_key(id, KEY_LOOKUP_CREATE | KEY_LOOKUP_PARTIAL, - KEY_NEED_SETATTR); + KEY_NEED_SETSEC); if (IS_ERR(key_ref)) { ret = PTR_ERR(key_ref); goto error; @@ -1007,18 +1013,45 @@ long keyctl_setperm_key(key_serial_t id, key_perm_t perm) key = key_ref_to_ptr(key_ref); - /* make the changes with the locks held to prevent chown/chmod races */ - ret = -EACCES; - down_write(&key->sem); + ret = -EOPNOTSUPP; + if (test_bit(KEY_FLAG_HAS_ACL, &key->flags)) + goto error_key; - /* if we're not the sysadmin, we can only change a key that we own */ - if (capable(CAP_SYS_ADMIN) || uid_eq(key->uid, current_fsuid())) { - key->perm = perm; - notify_key(key, NOTIFY_KEY_SETATTR, 0); - ret = 0; + ret = -ENOMEM; + acl = kzalloc(struct_size(acl, aces, nr), GFP_KERNEL); + if (!acl) + goto error_key; + + refcount_set(&acl->usage, 1); + acl->nr_ace = nr; + j = 0; + for (i = 0; i < 4; i++) { + struct key_ace *ace = &acl->aces[j]; + unsigned int subset = (perm >> (i * 8)) & KEY_OTH_ALL; + + if (!subset) + continue; + ace->type = KEY_ACE_SUBJ_STANDARD; + ace->subject_id = KEY_ACE_EVERYONE + i; + ace->perm = subset; + if (subset & (KEY_OTH_WRITE | KEY_OTH_SETATTR)) + ace->perm |= KEY_ACE_REVOKE; + if (subset & KEY_OTH_SEARCH) + ace->perm |= KEY_ACE_INVAL; + if (key->type == &key_type_keyring) { + if (subset & KEY_OTH_SEARCH) + ace->perm |= 
KEY_ACE_JOIN; + if (subset & KEY_OTH_WRITE) + ace->perm |= KEY_ACE_CLEAR; + } + j++; } + /* make the changes with the locks held to prevent chown/chmod races */ + down_write(&key->sem); + ret = key_set_acl(key, acl); up_write(&key->sem); +error_key: key_put(key); error: return ret; @@ -1383,7 +1416,7 @@ long keyctl_set_timeout(key_serial_t id, unsigned timeout) long ret; key_ref = lookup_user_key(id, KEY_LOOKUP_CREATE | KEY_LOOKUP_PARTIAL, - KEY_NEED_SETATTR); + KEY_NEED_SETSEC); if (IS_ERR(key_ref)) { /* setting the timeout on a key under construction is permitted * if we have the authorisation token handy */ @@ -1654,7 +1687,7 @@ long keyctl_restrict_keyring(key_serial_t id, const char __user *_type, char *restriction = NULL; long ret; - key_ref = lookup_user_key(id, 0, KEY_NEED_SETATTR); + key_ref = lookup_user_key(id, 0, KEY_NEED_SETSEC); if (IS_ERR(key_ref)) return PTR_ERR(key_ref); @@ -1819,7 +1852,7 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3, case KEYCTL_SETPERM: return keyctl_setperm_key((key_serial_t) arg2, - (key_perm_t) arg3); + (unsigned int)arg3); case KEYCTL_INSTANTIATE: return keyctl_instantiate_key((key_serial_t) arg2, diff --git a/security/keys/keyring.c b/security/keys/keyring.c index 14df79814ea0..64f590632891 100644 --- a/security/keys/keyring.c +++ b/security/keys/keyring.c @@ -518,11 +518,19 @@ static long keyring_read(const struct key *keyring, return ret; } -/* - * Allocate a keyring and link into the destination keyring. +/** + * keyring_alloc - Allocate a keyring and link into the destination + * @description: The key description to allow the key to be searched out. + * @uid: The owner of the new key. + * @gid: The group ID for the new key's group permissions. + * @cred: The credentials specifying UID namespace. + * @acl: The ACL to attach to the new key. + * @flags: Flags specifying quota properties. + * @restrict_link: Optional link restriction for new keyrings. + * @dest: Destination keyring. 
*/ struct key *keyring_alloc(const char *description, kuid_t uid, kgid_t gid, - const struct cred *cred, key_perm_t perm, + const struct cred *cred, struct key_acl *acl, unsigned long flags, struct key_restriction *restrict_link, struct key *dest) @@ -531,7 +539,7 @@ struct key *keyring_alloc(const char *description, kuid_t uid, kgid_t gid, int ret; keyring = key_alloc(&key_type_keyring, description, - uid, gid, cred, perm, flags, restrict_link); + uid, gid, cred, acl, flags, restrict_link); if (!IS_ERR(keyring)) { ret = key_instantiate_and_link(keyring, NULL, 0, dest, NULL); if (ret < 0) { @@ -1125,10 +1133,11 @@ key_ref_t find_key_to_update(key_ref_t keyring_ref, /* * Find a keyring with the specified name. * - * Only keyrings that have nonzero refcount, are not revoked, and are owned by a - * user in the current user namespace are considered. If @uid_keyring is %true, - * the keyring additionally must have been allocated as a user or user session - * keyring; otherwise, it must grant Search permission directly to the caller. + * Only keyrings that have nonzero refcount, are not revoked, and are owned by + * a user in the current user namespace are considered. If @uid_keyring is + * %true, the keyring additionally must have been allocated as a user or user + * session keyring; otherwise, it must grant JOIN permission directly to the + * caller (ie. not through possession). * * Returns a pointer to the keyring with the keyring's refcount having being * incremented on success. -ENOKEY is returned if a key could not be found. 
@@ -1162,7 +1171,7 @@ struct key *find_keyring_by_name(const char *name, bool uid_keyring) continue; } else { if (key_permission(make_key_ref(keyring, 0), - KEY_NEED_SEARCH) < 0) + KEY_NEED_JOIN) < 0) continue; } diff --git a/security/keys/permission.c b/security/keys/permission.c index 06df9d5e7572..8dc6e80f6fd0 100644 --- a/security/keys/permission.c +++ b/security/keys/permission.c @@ -11,13 +11,62 @@ #include #include +#include +#include #include "internal.h" +struct key_acl default_key_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE__PERMS & ~KEY_ACE_JOIN), + KEY_OWNER_ACE(KEY_ACE_VIEW), + } +}; + +struct key_acl joinable_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE__PERMS & ~KEY_ACE_JOIN), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_LINK | KEY_ACE_JOIN), + } +}; + +struct key_acl internal_key_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_SEARCH), + } +}; + +struct key_acl internal_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_SEARCH), + } +}; + +struct key_acl internal_writable_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_WRITE), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_WRITE | KEY_ACE_SEARCH), + } +}; + /** * key_task_permission - Check a key can be used * @key_ref: The key to check. * @cred: The credentials to use. - * @perm: The permissions to check for. + * @desired_perm: The permission to check for. * * Check to see whether permission is granted to use a key in the desired way, * but permit the security modules to override. @@ -28,53 +77,73 @@ * permissions bits or the LSM check. 
*/ int key_task_permission(const key_ref_t key_ref, const struct cred *cred, - unsigned perm) + unsigned int desired_perm) { - struct key *key; - key_perm_t kperm; - int ret; + const struct key_acl *acl; + const struct key *key; + unsigned int allow = 0; + int i; + + BUILD_BUG_ON(KEY_NEED_VIEW != KEY_ACE_VIEW || + KEY_NEED_READ != KEY_ACE_READ || + KEY_NEED_WRITE != KEY_ACE_WRITE || + KEY_NEED_SEARCH != KEY_ACE_SEARCH || + KEY_NEED_LINK != KEY_ACE_LINK || + KEY_NEED_SETSEC != KEY_ACE_SET_SECURITY || + KEY_NEED_INVAL != KEY_ACE_INVAL || + KEY_NEED_REVOKE != KEY_ACE_REVOKE || + KEY_NEED_JOIN != KEY_ACE_JOIN || + KEY_NEED_CLEAR != KEY_ACE_CLEAR); key = key_ref_to_ptr(key_ref); - /* use the second 8-bits of permissions for keys the caller owns */ - if (uid_eq(key->uid, cred->fsuid)) { - kperm = key->perm >> 16; - goto use_these_perms; - } + rcu_read_lock(); - /* use the third 8-bits of permissions for keys the caller has a group - * membership in common with */ - if (gid_valid(key->gid) && key->perm & KEY_GRP_ALL) { - if (gid_eq(key->gid, cred->fsgid)) { - kperm = key->perm >> 8; - goto use_these_perms; - } + acl = rcu_dereference(key->acl); + if (!acl || acl->nr_ace == 0) + goto no_access_rcu; + + for (i = 0; i < acl->nr_ace; i++) { + const struct key_ace *ace = &acl->aces[i]; - ret = groups_search(cred->group_info, key->gid); - if (ret) { - kperm = key->perm >> 8; - goto use_these_perms; + switch (ace->type) { + case KEY_ACE_SUBJ_STANDARD: + switch (ace->subject_id) { + case KEY_ACE_POSSESSOR: + if (is_key_possessed(key_ref)) + allow |= ace->perm; + break; + case KEY_ACE_OWNER: + if (uid_eq(key->uid, cred->fsuid)) + allow |= ace->perm; + break; + case KEY_ACE_GROUP: + if (gid_valid(key->gid)) { + if (gid_eq(key->gid, cred->fsgid)) + allow |= ace->perm; + else if (groups_search(cred->group_info, key->gid)) + allow |= ace->perm; + } + break; + case KEY_ACE_EVERYONE: + allow |= ace->perm; + break; + } + break; } } - /* otherwise use the least-significant 8-bits */ - 
kperm = key->perm; - -use_these_perms: + rcu_read_unlock(); - /* use the top 8-bits of permissions for keys the caller possesses - * - possessor permissions are additive with other permissions - */ - if (is_key_possessed(key_ref)) - kperm |= key->perm >> 24; + if (!(allow & desired_perm)) + goto no_access; - kperm = kperm & perm & KEY_NEED_ALL; + return security_key_permission(key_ref, cred, desired_perm); - if (kperm != perm) - return -EACCES; - - /* let LSM be the final arbiter */ - return security_key_permission(key_ref, cred, perm); +no_access_rcu: + rcu_read_unlock(); +no_access: + return -EACCES; } EXPORT_SYMBOL(key_task_permission); @@ -108,3 +177,100 @@ int key_validate(const struct key *key) return 0; } EXPORT_SYMBOL(key_validate); + +/* + * Roughly render an ACL to an old-style permissions mask. We cannot + * accurately render what the ACL, particularly if it has ACEs that represent + * subjects outside of { poss, user, group, other }. + */ +unsigned int key_acl_to_perm(const struct key_acl *acl) +{ + unsigned int perm = 0, tperm; + int i; + + BUILD_BUG_ON(KEY_OTH_VIEW != KEY_ACE_VIEW || + KEY_OTH_READ != KEY_ACE_READ || + KEY_OTH_WRITE != KEY_ACE_WRITE || + KEY_OTH_SEARCH != KEY_ACE_SEARCH || + KEY_OTH_LINK != KEY_ACE_LINK || + KEY_OTH_SETATTR != KEY_ACE_SET_SECURITY); + + if (!acl || acl->nr_ace == 0) + return 0; + + for (i = 0; i < acl->nr_ace; i++) { + const struct key_ace *ace = &acl->aces[i]; + + switch (ace->type) { + case KEY_ACE_SUBJ_STANDARD: + tperm = ace->perm & KEY_OTH_ALL; + + /* Invalidation and joining were allowed by SEARCH */ + if (ace->perm & (KEY_ACE_INVAL | KEY_ACE_JOIN)) + tperm |= KEY_OTH_SEARCH; + + /* Revocation was allowed by either SETATTR or WRITE */ + if ((ace->perm & KEY_ACE_REVOKE) && !(tperm & KEY_OTH_SETATTR)) + tperm |= KEY_OTH_WRITE; + + /* Clearing was allowed by WRITE */ + if (ace->perm & KEY_ACE_CLEAR) + tperm |= KEY_OTH_WRITE; + + switch (ace->subject_id) { + case KEY_ACE_POSSESSOR: + perm |= tperm << 24; + break; + 
case KEY_ACE_OWNER: + perm |= tperm << 16; + break; + case KEY_ACE_GROUP: + perm |= tperm << 8; + break; + case KEY_ACE_EVERYONE: + perm |= tperm << 0; + break; + } + } + } + + return perm; +} + +/* + * Destroy a key's ACL. + */ +void key_put_acl(struct key_acl *acl) +{ + if (acl && refcount_dec_and_test(&acl->usage)) + kfree_rcu(acl, rcu); +} + +/* + * Try to set the ACL. This either attaches or discards the proposed ACL. + */ +long key_set_acl(struct key *key, struct key_acl *acl) +{ + int i; + + /* If we're not the sysadmin, we can only change a key that we own. */ + if (!capable(CAP_SYS_ADMIN) && !uid_eq(key->uid, current_fsuid())) { + key_put_acl(acl); + return -EACCES; + } + + for (i = 0; i < acl->nr_ace; i++) { + const struct key_ace *ace = &acl->aces[i]; + if (ace->type == KEY_ACE_SUBJ_STANDARD && + ace->subject_id == KEY_ACE_POSSESSOR) { + if (ace->perm & KEY_ACE_VIEW) + acl->possessor_viewable = true; + break; + } + } + + rcu_swap_protected(key->acl, acl, lockdep_is_held(&key->sem)); + notify_key(key, NOTIFY_KEY_SETATTR, 0); + key_put_acl(acl); + return 0; +} diff --git a/security/keys/persistent.c b/security/keys/persistent.c index c9fbe63adc58..0a115cc543df 100644 --- a/security/keys/persistent.c +++ b/security/keys/persistent.c @@ -16,6 +16,27 @@ unsigned persistent_keyring_expiry = 3 * 24 * 3600; /* Expire after 3 days of non-use */ +static struct key_acl persistent_register_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_SEARCH | KEY_ACE_WRITE), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ), + } +}; + +static struct key_acl persistent_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_WRITE | + KEY_ACE_SEARCH | KEY_ACE_LINK | + KEY_ACE_CLEAR | KEY_ACE_INVAL), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ), + } +}; + /* * Create the persistent keyring register for the current user namespace. 
* @@ -26,8 +47,7 @@ static int key_create_persistent_register(struct user_namespace *ns) struct key *reg = keyring_alloc(".persistent_register", KUIDT_INIT(0), KGIDT_INIT(0), current_cred(), - ((KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ), + &persistent_register_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA, NULL, NULL); if (IS_ERR(reg)) return PTR_ERR(reg); @@ -60,8 +80,7 @@ static key_ref_t key_create_persistent(struct user_namespace *ns, kuid_t uid, persistent = keyring_alloc(index_key->description, uid, INVALID_GID, current_cred(), - ((KEY_POS_ALL & ~KEY_POS_SETATTR) | - KEY_USR_VIEW | KEY_USR_READ), + &persistent_keyring_acl, KEY_ALLOC_NOT_IN_QUOTA, NULL, ns->persistent_keyring_register); if (IS_ERR(persistent)) diff --git a/security/keys/proc.c b/security/keys/proc.c index d2b802072693..d697a2e95217 100644 --- a/security/keys/proc.c +++ b/security/keys/proc.c @@ -154,6 +154,7 @@ static void proc_keys_stop(struct seq_file *p, void *v) static int proc_keys_show(struct seq_file *m, void *v) { + const struct key_acl *acl; struct rb_node *_p = v; struct key *key = rb_entry(_p, struct key, serial_node); unsigned long flags; @@ -161,6 +162,7 @@ static int proc_keys_show(struct seq_file *m, void *v) time64_t now, expiry; char xbuf[16]; short state; + bool check_pos; u64 timo; int rc; @@ -174,12 +176,16 @@ static int proc_keys_show(struct seq_file *m, void *v) .flags = KEYRING_SEARCH_NO_STATE_CHECK, }; - key_ref = make_key_ref(key, 0); + rcu_read_lock(); + + acl = rcu_dereference(key->acl); + check_pos = acl->possessor_viewable; /* determine if the key is possessed by this process (a test we can * skip if the key does not indicate the possessor can view it */ - if (key->perm & KEY_POS_VIEW) { + key_ref = make_key_ref(key, 0); + if (check_pos) { skey_ref = search_my_process_keyrings(&ctx); if (!IS_ERR(skey_ref)) { key_ref_put(skey_ref); @@ -190,12 +196,10 @@ static int proc_keys_show(struct seq_file *m, void *v) /* check whether the current task is allowed 
to view the key */ rc = key_task_permission(key_ref, ctx.cred, KEY_NEED_VIEW); if (rc < 0) - return 0; + goto out; now = ktime_get_real_seconds(); - rcu_read_lock(); - /* come up with a suitable timeout value */ expiry = READ_ONCE(key->expiry); if (expiry == 0) { @@ -234,7 +238,7 @@ static int proc_keys_show(struct seq_file *m, void *v) showflag(flags, 'i', KEY_FLAG_INVALIDATED), refcount_read(&key->usage), xbuf, - key->perm, + key_acl_to_perm(acl), from_kuid_munged(seq_user_ns(m), key->uid), from_kgid_munged(seq_user_ns(m), key->gid), key->type->name); @@ -245,6 +249,7 @@ static int proc_keys_show(struct seq_file *m, void *v) key->type->describe(key, m); seq_putc(m, '\n'); +out: rcu_read_unlock(); return 0; } diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c index 39d3cbac920c..0a231ede4d2b 100644 --- a/security/keys/process_keys.c +++ b/security/keys/process_keys.c @@ -39,6 +39,37 @@ struct key_user root_key_user = { .uid = GLOBAL_ROOT_UID, }; +static struct key_acl user_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_WRITE | + KEY_ACE_SEARCH | KEY_ACE_LINK), + KEY_OWNER_ACE(KEY_ACE__PERMS & ~(KEY_ACE_JOIN | KEY_ACE_SET_SECURITY)), + } +}; + +static struct key_acl session_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE__PERMS & ~KEY_ACE_JOIN), + KEY_OWNER_ACE(KEY_ACE_VIEW | KEY_ACE_READ), + } +}; + +static struct key_acl thread_and_process_keyring_acl = { + .usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE__PERMS & ~(KEY_ACE_JOIN | KEY_ACE_SET_SECURITY)), + KEY_OWNER_ACE(KEY_ACE_VIEW), + } +}; + /* * Install the user and user session keyrings for the current process's UID. 
*/ @@ -47,12 +78,10 @@ int install_user_keyrings(void) struct user_struct *user; const struct cred *cred; struct key *uid_keyring, *session_keyring; - key_perm_t user_keyring_perm; char buf[20]; int ret; uid_t uid; - user_keyring_perm = (KEY_POS_ALL & ~KEY_POS_SETATTR) | KEY_USR_ALL; cred = current_cred(); user = cred->user; uid = from_kuid(cred->user_ns, user->uid); @@ -77,9 +106,9 @@ int install_user_keyrings(void) uid_keyring = find_keyring_by_name(buf, true); if (IS_ERR(uid_keyring)) { uid_keyring = keyring_alloc(buf, user->uid, INVALID_GID, - cred, user_keyring_perm, + cred, &user_keyring_acl, KEY_ALLOC_UID_KEYRING | - KEY_ALLOC_IN_QUOTA, + KEY_ALLOC_IN_QUOTA, NULL, NULL); if (IS_ERR(uid_keyring)) { ret = PTR_ERR(uid_keyring); @@ -95,9 +124,9 @@ int install_user_keyrings(void) if (IS_ERR(session_keyring)) { session_keyring = keyring_alloc(buf, user->uid, INVALID_GID, - cred, user_keyring_perm, + cred, &user_keyring_acl, KEY_ALLOC_UID_KEYRING | - KEY_ALLOC_IN_QUOTA, + KEY_ALLOC_IN_QUOTA, NULL, NULL); if (IS_ERR(session_keyring)) { ret = PTR_ERR(session_keyring); @@ -144,7 +173,7 @@ int install_thread_keyring_to_cred(struct cred *new) return 0; keyring = keyring_alloc("_tid", new->uid, new->gid, new, - KEY_POS_ALL | KEY_USR_VIEW, + &thread_and_process_keyring_acl, KEY_ALLOC_QUOTA_OVERRUN, NULL, NULL); if (IS_ERR(keyring)) @@ -191,7 +220,7 @@ int install_process_keyring_to_cred(struct cred *new) return 0; keyring = keyring_alloc("_pid", new->uid, new->gid, new, - KEY_POS_ALL | KEY_USR_VIEW, + &thread_and_process_keyring_acl, KEY_ALLOC_QUOTA_OVERRUN, NULL, NULL); if (IS_ERR(keyring)) @@ -245,8 +274,7 @@ int install_session_keyring_to_cred(struct cred *cred, struct key *keyring) flags = KEY_ALLOC_IN_QUOTA; keyring = keyring_alloc("_ses", cred->uid, cred->gid, cred, - KEY_POS_ALL | KEY_USR_VIEW | KEY_USR_READ, - flags, NULL, NULL); + &session_keyring_acl, flags, NULL, NULL); if (IS_ERR(keyring)) return PTR_ERR(keyring); } else { @@ -554,7 +582,7 @@ bool 
lookup_user_key_possessed(const struct key *key, * returned key reference. */ key_ref_t lookup_user_key(key_serial_t id, unsigned long lflags, - key_perm_t perm) + unsigned int desired_perm) { struct keyring_search_context ctx = { .match_data.cmp = lookup_user_key_possessed, @@ -740,12 +768,12 @@ key_ref_t lookup_user_key(key_serial_t id, unsigned long lflags, case -ERESTARTSYS: goto invalid_key; default: - if (perm) + if (desired_perm) goto invalid_key; case 0: break; } - } else if (perm) { + } else if (desired_perm) { ret = key_validate(key); if (ret < 0) goto invalid_key; @@ -757,9 +785,11 @@ key_ref_t lookup_user_key(key_serial_t id, unsigned long lflags, goto invalid_key; /* check the permissions */ - ret = key_task_permission(key_ref, ctx.cred, perm); - if (ret < 0) - goto invalid_key; + if (desired_perm) { + ret = key_task_permission(key_ref, ctx.cred, desired_perm); + if (ret < 0) + goto invalid_key; + } key->last_used_at = ktime_get_real_seconds(); @@ -824,13 +854,13 @@ long join_session_keyring(const char *name) if (PTR_ERR(keyring) == -ENOKEY) { /* not found - try and create a new one */ keyring = keyring_alloc( - name, old->uid, old->gid, old, - KEY_POS_ALL | KEY_USR_VIEW | KEY_USR_READ | KEY_USR_LINK, + name, old->uid, old->gid, old, &joinable_keyring_acl, KEY_ALLOC_IN_QUOTA, NULL, NULL); if (IS_ERR(keyring)) { ret = PTR_ERR(keyring); goto error2; } + goto no_perm_test; } else if (IS_ERR(keyring)) { ret = PTR_ERR(keyring); goto error2; @@ -839,6 +869,12 @@ long join_session_keyring(const char *name) goto error3; } + ret = key_task_permission(make_key_ref(keyring, false), old, + KEY_NEED_JOIN); + if (ret < 0) + goto error3; + +no_perm_test: /* we've got a keyring - now to install it */ ret = install_session_keyring_to_cred(new, keyring); if (ret < 0) diff --git a/security/keys/request_key.c b/security/keys/request_key.c index 10244b6fbf5d..0d609c1efece 100644 --- a/security/keys/request_key.c +++ b/security/keys/request_key.c @@ -115,8 +115,7 @@ static 
int call_sbin_request_key(struct key *authkey) cred = get_current_cred(); keyring = keyring_alloc(desc, cred->fsuid, cred->fsgid, cred, - KEY_POS_ALL | KEY_USR_VIEW | KEY_USR_READ, - KEY_ALLOC_QUOTA_OVERRUN, NULL, NULL); + NULL, KEY_ALLOC_QUOTA_OVERRUN, NULL, NULL); put_cred(cred); if (IS_ERR(keyring)) { ret = PTR_ERR(keyring); @@ -344,11 +343,11 @@ static int construct_alloc_key(struct keyring_search_context *ctx, struct key *dest_keyring, unsigned long flags, struct key_user *user, + struct key_acl *acl, struct key **_key) { struct assoc_array_edit *edit; struct key *key; - key_perm_t perm; key_ref_t key_ref; int ret; @@ -358,17 +357,9 @@ static int construct_alloc_key(struct keyring_search_context *ctx, *_key = NULL; mutex_lock(&user->cons_lock); - perm = KEY_POS_VIEW | KEY_POS_SEARCH | KEY_POS_LINK | KEY_POS_SETATTR; - perm |= KEY_USR_VIEW; - if (ctx->index_key.type->read) - perm |= KEY_POS_READ; - if (ctx->index_key.type == &key_type_keyring || - ctx->index_key.type->update) - perm |= KEY_POS_WRITE; - key = key_alloc(ctx->index_key.type, ctx->index_key.description, ctx->cred->fsuid, ctx->cred->fsgid, ctx->cred, - perm, flags, NULL); + acl, flags, NULL); if (IS_ERR(key)) goto alloc_failed; @@ -444,6 +435,7 @@ static struct key *construct_key_and_link(struct keyring_search_context *ctx, const char *callout_info, size_t callout_len, void *aux, + struct key_acl *acl, struct key *dest_keyring, unsigned long flags) { @@ -466,7 +458,7 @@ static struct key *construct_key_and_link(struct keyring_search_context *ctx, goto error_put_dest_keyring; } - ret = construct_alloc_key(ctx, dest_keyring, flags, user, &key); + ret = construct_alloc_key(ctx, dest_keyring, flags, user, acl, &key); key_user_put(user); if (ret == 0) { @@ -504,6 +496,7 @@ static struct key *construct_key_and_link(struct keyring_search_context *ctx, * @callout_info: The data to pass to the instantiation upcall (or NULL). * @callout_len: The length of callout_info. * @aux: Auxiliary data for the upcall. 
+ * @acl: The ACL to attach if a new key is created. * @dest_keyring: Where to cache the key. * @flags: Flags to key_alloc(). * @@ -531,6 +524,7 @@ struct key *request_key_and_link(struct key_type *type, const void *callout_info, size_t callout_len, void *aux, + struct key_acl *acl, struct key *dest_keyring, unsigned long flags) { @@ -593,7 +587,7 @@ struct key *request_key_and_link(struct key_type *type, goto error_free; key = construct_key_and_link(&ctx, callout_info, callout_len, - aux, dest_keyring, flags); + aux, acl, dest_keyring, flags); } error_free: @@ -635,6 +629,7 @@ EXPORT_SYMBOL(wait_for_key_construction); * @type: Type of key. * @description: The searchable description of the key. * @callout_info: The data to pass to the instantiation upcall (or NULL). + * @acl: The ACL to attach if a new key is created. * * As for request_key_and_link() except that it does not add the returned key * to a keyring if found, new keys are always allocated in the user's quota, @@ -646,7 +641,8 @@ EXPORT_SYMBOL(wait_for_key_construction); */ struct key *request_key(struct key_type *type, const char *description, - const char *callout_info) + const char *callout_info, + struct key_acl *acl) { struct key *key; size_t callout_len = 0; @@ -656,7 +652,7 @@ struct key *request_key(struct key_type *type, callout_len = strlen(callout_info); key = request_key_and_link(type, description, NULL, callout_info, callout_len, - NULL, NULL, KEY_ALLOC_IN_QUOTA); + NULL, acl, NULL, KEY_ALLOC_IN_QUOTA); if (!IS_ERR(key)) { ret = wait_for_key_construction(key, false); if (ret < 0) { @@ -675,6 +671,7 @@ EXPORT_SYMBOL(request_key); * @callout_info: The data to pass to the instantiation upcall (or NULL). * @callout_len: The length of callout_info. * @aux: Auxiliary data for the upcall. + * @acl: The ACL to attach if a new key is created. * * As for request_key_and_link() except that it does not add the returned key * to a keyring if found and new keys are always allocated in the user's quota. 
@@ -686,14 +683,15 @@ struct key *request_key_with_auxdata(struct key_type *type, const char *description, const void *callout_info, size_t callout_len, - void *aux) + void *aux, + struct key_acl *acl) { struct key *key; int ret; key = request_key_and_link(type, description, NULL, callout_info, callout_len, - aux, NULL, KEY_ALLOC_IN_QUOTA); + aux, acl, NULL, KEY_ALLOC_IN_QUOTA); if (!IS_ERR(key)) { ret = wait_for_key_construction(key, false); if (ret < 0) { @@ -711,6 +709,7 @@ EXPORT_SYMBOL(request_key_with_auxdata); * @description: The searchable description of the key. * @net: The network namespace that is the key's domain of operation. * @callout_info: The data to pass to the instantiation upcall (or NULL). + * @acl: The ACL to attach if a new key is created. * * As for request_key() except that it does not add the returned key to a * keyring if found, new keys are always allocated in the user's quota, the @@ -723,7 +722,8 @@ EXPORT_SYMBOL(request_key_with_auxdata); struct key *request_key_net(struct key_type *type, const char *description, struct net *net, - const char *callout_info) + const char *callout_info, + struct key_acl *acl) { struct key *key; size_t callout_len = 0; @@ -733,7 +733,7 @@ struct key *request_key_net(struct key_type *type, callout_len = strlen(callout_info); key = request_key_and_link(type, description, net->key_domain, callout_info, callout_len, - NULL, NULL, KEY_ALLOC_IN_QUOTA); + NULL, acl, NULL, KEY_ALLOC_IN_QUOTA); if (!IS_ERR(key)) { ret = wait_for_key_construction(key, false); if (ret < 0) { diff --git a/security/keys/request_key_auth.c b/security/keys/request_key_auth.c index 726555a0639c..790c809844ac 100644 --- a/security/keys/request_key_auth.c +++ b/security/keys/request_key_auth.c @@ -28,6 +28,17 @@ static void request_key_auth_revoke(struct key *); static void request_key_auth_destroy(struct key *); static long request_key_auth_read(const struct key *, char __user *, size_t); +static struct key_acl request_key_auth_acl = { + 
.usage = REFCOUNT_INIT(1), + .nr_ace = 2, + .possessor_viewable = true, + .aces = { + KEY_POSSESSOR_ACE(KEY_ACE_VIEW | KEY_ACE_READ | KEY_ACE_SEARCH | + KEY_ACE_LINK), + KEY_OWNER_ACE(KEY_ACE_VIEW), + } +}; + /* * The request-key authorisation key type definition. */ @@ -208,8 +219,8 @@ struct key *request_key_auth_new(struct key *target, const char *op, authkey = key_alloc(&key_type_request_key_auth, desc, cred->fsuid, cred->fsgid, cred, - KEY_POS_VIEW | KEY_POS_READ | KEY_POS_SEARCH | KEY_POS_LINK | - KEY_USR_VIEW, KEY_ALLOC_NOT_IN_QUOTA, NULL); + &request_key_auth_acl, + KEY_ALLOC_NOT_IN_QUOTA, NULL); if (IS_ERR(authkey)) { ret = PTR_ERR(authkey); goto error_free_rka; diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index fd845063b692..616b7c292eb6 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -6560,6 +6560,7 @@ static int selinux_key_permission(key_ref_t key_ref, { struct key *key; struct key_security_struct *ksec; + unsigned oldstyle_perm; u32 sid; /* if no specific permissions are requested, we skip the @@ -6568,13 +6569,26 @@ static int selinux_key_permission(key_ref_t key_ref, if (perm == 0) return 0; + oldstyle_perm = perm & (KEY_NEED_VIEW | KEY_NEED_READ | KEY_NEED_WRITE | + KEY_NEED_SEARCH | KEY_NEED_LINK); + if (perm & KEY_NEED_SETSEC) + oldstyle_perm |= OLD_KEY_NEED_SETATTR; + if (perm & KEY_NEED_INVAL) + oldstyle_perm |= KEY_NEED_SEARCH; + if (perm & KEY_NEED_REVOKE && !(perm & OLD_KEY_NEED_SETATTR)) + oldstyle_perm |= KEY_NEED_WRITE; + if (perm & KEY_NEED_JOIN) + oldstyle_perm |= KEY_NEED_SEARCH; + if (perm & KEY_NEED_CLEAR) + oldstyle_perm |= KEY_NEED_WRITE; + sid = cred_sid(cred); key = key_ref_to_ptr(key_ref); ksec = key->security; return avc_has_perm(&selinux_state, - sid, ksec->sid, SECCLASS_KEY, perm, NULL); + sid, ksec->sid, SECCLASS_KEY, oldstyle_perm, NULL); } static int selinux_key_getsecurity(struct key *key, char **_buffer) diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index 
feaace1c24a2..c09133115769 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -4407,7 +4407,8 @@ static int smack_key_permission(key_ref_t key_ref, #endif if (perm & (KEY_NEED_READ | KEY_NEED_SEARCH | KEY_NEED_VIEW)) request |= MAY_READ; - if (perm & (KEY_NEED_WRITE | KEY_NEED_LINK | KEY_NEED_SETATTR)) + if (perm & (KEY_NEED_WRITE | KEY_NEED_LINK | KEY_NEED_SETSEC | + KEY_NEED_INVAL | KEY_NEED_REVOKE | KEY_NEED_CLEAR)) request |= MAY_WRITE; rc = smk_access(tkp, keyp->security, request, &ad); rc = smk_bu_note("key access", tkp, keyp->security, request, rc); From patchwork Fri Feb 15 16:11:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043016 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JFl4HkRz9s7T for ; Sat, 16 Feb 2019 03:12:59 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387453AbfBOQLs (ORCPT ); Fri, 15 Feb 2019 11:11:48 -0500 Received: from mx1.redhat.com ([209.132.183.28]:55000 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730404AbfBOQLr (ORCPT ); Fri, 15 Feb 2019 11:11:47 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A1C711244DB; Fri, 15 Feb 2019 16:11:45 +0000 (UTC) Received: from warthog.procyon.org.uk 
(ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 627CC5D70E; Fri, 15 Feb 2019 16:11:39 +0000 (UTC)
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
Subject: [RFC PATCH 23/27] KEYS: Provide KEYCTL_GRANT_PERMISSION
From: David Howells
To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org
Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org
Date: Fri, 15 Feb 2019 16:11:39 +0000
Message-ID: <155024709900.21651.6425389370223480666.stgit@warthog.procyon.org.uk>
In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 15 Feb 2019 16:11:45 +0000 (UTC)
Sender: linux-cifs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-cifs@vger.kernel.org

Provide a keyctl() operation to grant/remove permissions.  The grant
operation, wrapped by libkeyutils, looks like:

	int ret = keyctl_grant_permission(key_serial_t key,
					  enum key_ace_subject_type type,
					  unsigned int subject,
					  unsigned int perm);

Where key is the key to be modified, type and subject represent the subject
to which permission is to be granted (or removed) and perm is the set of
permissions to be granted.  0 is returned on success.  SET_SECURITY
permission is required for this.

For the moment, the subject type must be KEY_ACE_SUBJ_STANDARD (other
subject types will come along later).
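Where libkeyutils does not yet carry the wrapper, the same request could be
issued through the raw keyctl() system call, which is how the sample program
later in this series does it.  The following is only a minimal sketch under
stated assumptions: the numeric values are copied from the headers proposed
in this series (they are not in mainline headers), and grant_permission()
and grant_owner_view_read() are hypothetical helper names, not part of any
library.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed values, copied from the uapi changes proposed in this series. */
#define KEYCTL_GRANT_PERMISSION	36
#define KEY_ACE_SUBJ_STANDARD	0
#define KEY_ACE_OWNER		2
#define KEY_ACE_VIEW		0x00000001
#define KEY_ACE_READ		0x00000002

/*
 * Hypothetical helper: issue KEYCTL_GRANT_PERMISSION directly.
 * Returns 0 on success, or -1 with errno set.  A kernel without this
 * series is expected to reject the unknown operation.
 */
static long grant_permission(int key, unsigned int type,
			     unsigned int subject, unsigned int perm)
{
	return syscall(__NR_keyctl, (unsigned long)KEYCTL_GRANT_PERMISSION,
		       (unsigned long)key, (unsigned long)type,
		       (unsigned long)subject, (unsigned long)perm);
}

/* Let the key's owner view and read the key, reporting any failure. */
static int grant_owner_view_read(int key)
{
	if (grant_permission(key, KEY_ACE_SUBJ_STANDARD, KEY_ACE_OWNER,
			     KEY_ACE_VIEW | KEY_ACE_READ) < 0) {
		fprintf(stderr, "KEYCTL_GRANT_PERMISSION: %s\n",
			strerror(errno));
		return -1;
	}
	return 0;
}
```

A caller would pass a key serial number obtained from add_key() or
request_key(); passing perm as 0 instead asks for the subject's ACE to be
removed.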
For subject type KEY_ACE_SUBJ_STANDARD, the following subject values are
available:

	KEY_ACE_POSSESSOR	The possessor of the key
	KEY_ACE_OWNER		The owner of the key
	KEY_ACE_GROUP		The key's group
	KEY_ACE_EVERYONE	Everyone

perm lists the permissions to be granted:

	KEY_ACE_VIEW		Can view the key metadata
	KEY_ACE_READ		Can read the key content
	KEY_ACE_WRITE		Can update/modify the key content
	KEY_ACE_SEARCH		Can find the key by searching/requesting
	KEY_ACE_LINK		Can make a link to the key
	KEY_ACE_SET_SECURITY	Can set security
	KEY_ACE_INVAL		Can invalidate
	KEY_ACE_REVOKE		Can revoke
	KEY_ACE_JOIN		Can join this keyring
	KEY_ACE_CLEAR		Can clear this keyring

If an ACE already exists for the subject, then the permissions mask will be
overwritten; if perm is 0, it will be deleted.

Currently, the internal ACL is limited to a maximum of 16 entries.

For example:

	int ret = keyctl_grant_permission(key,
					  KEY_ACE_SUBJ_STANDARD,
					  KEY_ACE_OWNER,
					  KEY_ACE_VIEW | KEY_ACE_READ);

Signed-off-by: David Howells
---

 include/uapi/linux/keyctl.h |    1
 security/keys/compat.c      |    2 +
 security/keys/internal.h    |    5 ++
 security/keys/keyctl.c      |    5 ++
 security/keys/permission.c  |  119 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 132 insertions(+)

diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h
index 50d7b6ca82ab..045dcbb6bb8d 100644
--- a/include/uapi/linux/keyctl.h
+++ b/include/uapi/linux/keyctl.h
@@ -136,6 +136,7 @@ enum key_ace_standard_subject {
 #define KEYCTL_MOVE			33	/* Move keys between keyrings */
 #define KEYCTL_FIND_LRU			34	/* Find the least-recently used key in a keyring */
 #define KEYCTL_SET_CONTAINER_KEYRING	35	/* Attach a keyring to a container */
+#define KEYCTL_GRANT_PERMISSION		36	/* Grant a permit to a key */
 
 /* keyctl structures */
 struct keyctl_dh_params {
diff --git a/security/keys/compat.c b/security/keys/compat.c
index 7990ec026237..953156f94320 100644
--- a/security/keys/compat.c
+++ b/security/keys/compat.c
@@ -174,6 +174,8 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32,
option, case KEYCTL_MOVE: return keyctl_keyring_move(arg2, arg3, arg4, arg5); + case KEYCTL_GRANT_PERMISSION: + return keyctl_grant_permission(arg2, arg3, arg4, arg5); default: return -EOPNOTSUPP; diff --git a/security/keys/internal.h b/security/keys/internal.h index 9f9ecc1810c9..6cd7b5c17298 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -377,6 +377,11 @@ extern long keyctl_find_lru(key_serial_t, const char __user *); extern long keyctl_set_container_keyring(int, key_serial_t); #endif +extern long keyctl_grant_permission(key_serial_t keyid, + enum key_ace_subject_type type, + unsigned int subject, + unsigned int perm); + /* * Debugging key validation */ diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index 2df896bfb8e4..02bd73d5a05a 100644 --- a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -1961,6 +1961,11 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3, (key_serial_t)arg3, (key_serial_t)arg4, (unsigned int)arg5); + case KEYCTL_GRANT_PERMISSION: + return keyctl_grant_permission((key_serial_t)arg2, + (enum key_ace_subject_type)arg3, + (unsigned int)arg4, + (unsigned int)arg5); default: return -EOPNOTSUPP; diff --git a/security/keys/permission.c b/security/keys/permission.c index 8dc6e80f6fd0..cb1359f6c668 100644 --- a/security/keys/permission.c +++ b/security/keys/permission.c @@ -274,3 +274,122 @@ long key_set_acl(struct key *key, struct key_acl *acl) key_put_acl(acl); return 0; } + +/* + * Allocate a new ACL with an extra ACE slot. 
+ */ +static struct key_acl *key_alloc_acl(const struct key_acl *old_acl, int nr, int skip) +{ + struct key_acl *acl; + int nr_ace, i, j = 0; + + nr_ace = old_acl->nr_ace + nr; + if (nr_ace > 16) + return ERR_PTR(-EINVAL); + + acl = kzalloc(struct_size(acl, aces, nr_ace), GFP_KERNEL); + if (!acl) + return ERR_PTR(-ENOMEM); + + refcount_set(&acl->usage, 1); + acl->nr_ace = nr_ace; + for (i = 0; i < old_acl->nr_ace; i++) { + if (i == skip) + continue; + acl->aces[j] = old_acl->aces[i]; + j++; + } + return acl; +} + +/* + * Generate the revised ACL. + */ +static long key_change_acl(struct key *key, struct key_ace *new_ace) +{ + struct key_acl *acl, *old; + int i; + + old = rcu_dereference_protected(key->acl, lockdep_is_held(&key->sem)); + + for (i = 0; i < old->nr_ace; i++) + if (old->aces[i].type == new_ace->type && + old->aces[i].subject_id == new_ace->subject_id) + goto found_match; + + if (new_ace->perm == 0) + return 0; /* No permissions to remove. Add deny record? */ + + acl = key_alloc_acl(old, 1, -1); + if (IS_ERR(acl)) + return PTR_ERR(acl); + acl->aces[i] = *new_ace; + goto change; + +found_match: + if (new_ace->perm == 0) + goto delete_ace; + if (new_ace->perm == old->aces[i].perm) + return 0; + acl = key_alloc_acl(old, 0, -1); + if (IS_ERR(acl)) + return PTR_ERR(acl); + acl->aces[i].perm = new_ace->perm; + goto change; + +delete_ace: + acl = key_alloc_acl(old, -1, i); + if (IS_ERR(acl)) + return PTR_ERR(acl); + goto change; + +change: + return key_set_acl(key, acl); +} + +/* + * Add, alter or remove (if perm == 0) an ACE in a key's ACL. 
+ */ +long keyctl_grant_permission(key_serial_t keyid, + enum key_ace_subject_type type, + unsigned int subject, + unsigned int perm) +{ + struct key_ace new_ace; + struct key *key; + key_ref_t key_ref; + long ret; + + new_ace.type = type; + new_ace.perm = perm; + + switch (type) { + case KEY_ACE_SUBJ_STANDARD: + if (subject >= nr__key_ace_standard_subject) + return -ENOENT; + new_ace.subject_id = subject; + break; + + default: + return -ENOENT; + } + + key_ref = lookup_user_key(keyid, KEY_LOOKUP_PARTIAL, KEY_NEED_SETSEC); + if (IS_ERR(key_ref)) { + ret = PTR_ERR(key_ref); + goto error; + } + + key = key_ref_to_ptr(key_ref); + + down_write(&key->sem); + + /* If we're not the sysadmin, we can only change a key that we own */ + ret = -EACCES; + if (capable(CAP_SYS_ADMIN) || uid_eq(key->uid, current_fsuid())) + ret = key_change_acl(key, &new_ace); + up_write(&key->sem); + key_put(key); +error: + return ret; +} From patchwork Fri Feb 15 16:11:50 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043012 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JDj2k6rz9s7h for ; Sat, 16 Feb 2019 03:12:05 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404327AbfBOQLz (ORCPT ); Fri, 15 Feb 2019 11:11:55 -0500 Received: from mx1.redhat.com ([209.132.183.28]:47363 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2404036AbfBOQLy (ORCPT ); Fri, 15 Feb 2019 11:11:54 -0500 
Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B7707C0AD406; Fri, 15 Feb 2019 16:11:53 +0000 (UTC)
Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 899475D6A9; Fri, 15 Feb 2019 16:11:51 +0000 (UTC)
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
Subject: [RFC PATCH 24/27] keys: Allow a container to be specified as a subject in a key's ACL
From: David Howells
To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org
Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org
Date: Fri, 15 Feb 2019 16:11:50 +0000
Message-ID: <155024711079.21651.7490587069155188367.stgit@warthog.procyon.org.uk>
In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Fri, 15 Feb 2019 16:11:54 +0000 (UTC)
Sender: linux-cifs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-cifs@vger.kernel.org

Allow the ACL attached to a key to grant permissions to the denizens of a
container object when request_key() is called.  This allows permissions to
be granted separately from those in the possessor set.
	int cfd = container_create("foo", 0);
	int ret = keyctl_grant_permission(key,
					  KEY_ACE_SUBJ_CONTAINER,
					  cfd,
					  KEY_ACE_SEARCH);

To allow request_key() to find a key, KEY_ACE_SEARCH must be included in the
ACE.  This will allow filesystems and network protocols (e.g. AFS and
AF_RXRPC) to use the key.

For the request_key() system call to be able to find a key for a process
inside the container, KEY_ACE_LINK must also be granted.

Keys on the container keyring (and the container keyring itself) can be
accessed directly by ID from inside the container if other KEY_ACE_* permits
are granted.

Signed-off-by: David Howells
---

 include/linux/container.h    |    6 ++-
 include/linux/key.h          |    3 +
 include/uapi/linux/keyctl.h  |    1
 kernel/container.c           |   41 ++++++++++++++++++-
 samples/vfs/test-container.c |   60 ++++++++++++++++++++++++++++
 security/keys/permission.c   |   90 ++++++++++++++++++++++++++++++++++++++----
 security/keys/process_keys.c |    2 -
 7 files changed, 188 insertions(+), 15 deletions(-)

diff --git a/include/linux/container.h b/include/linux/container.h
index 7424f7fb5560..cd82074c26a3 100644
--- a/include/linux/container.h
+++ b/include/linux/container.h
@@ -33,7 +33,11 @@ struct container {
 	refcount_t		usage;
 	int			exit_code;	/* The exit code of 'init' */
 	const struct cred	*cred;		/* Creds for this container, including userns */
+#ifdef CONFIG_KEYS
 	struct key		*keyring;	/* Externally managed container keyring */
+	struct key_tag		*tag;		/* Container ID for key ACL */
+	struct list_head	req_key_traps;	/* Traps for request-key upcalls */
+#endif
 	struct nsproxy		*ns;		/* This container's namespaces */
 	struct path		root;		/* The root of the container's fs namespace */
 	struct task_struct	*init;		/* The 'init' task for this container */
@@ -43,7 +47,6 @@ struct container {
 	struct list_head	members;	/* Member processes, guarded with ->lock */
 	struct list_head	child_link;	/* Link in parent->children */
 	struct list_head	children;	/* Child containers */
-	struct list_head	req_key_traps;	/* Traps for
request-key upcalls */ wait_queue_head_t waitq; /* Someone waiting for init to exit waits here */ unsigned long flags; #define CONTAINER_FLAG_INIT_STARTED 0 /* Init is started - certain ops now prohibited */ @@ -63,6 +66,7 @@ extern int copy_container(unsigned long flags, struct task_struct *tsk, extern void exit_container(struct task_struct *tsk); extern void put_container(struct container *c); extern long key_del_intercept(struct container *c, const char *type); +extern struct container *fd_to_container(int fd); static inline struct container *get_container(struct container *c) { diff --git a/include/linux/key.h b/include/linux/key.h index a38b89bd414c..01bccaa40047 100644 --- a/include/linux/key.h +++ b/include/linux/key.h @@ -90,6 +90,9 @@ struct key_ace { kuid_t uid; kgid_t gid; unsigned int subject_id; +#ifdef CONFIG_CONTAINERS + struct key_tag __rcu *container_tag; +#endif }; }; diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h index 045dcbb6bb8d..7136d14dd4d7 100644 --- a/include/uapi/linux/keyctl.h +++ b/include/uapi/linux/keyctl.h @@ -20,6 +20,7 @@ */ enum key_ace_subject_type { KEY_ACE_SUBJ_STANDARD = 0, /* subject is one of key_ace_standard_subject */ + KEY_ACE_SUBJ_CONTAINER = 1, /* subject is a container fd */ nr__key_ace_subject_type }; diff --git a/kernel/container.c b/kernel/container.c index f2706a45f364..81be4ed915c2 100644 --- a/kernel/container.c +++ b/kernel/container.c @@ -35,7 +35,9 @@ struct container init_container = { .members.next = &init_task.container_link, .members.prev = &init_task.container_link, .children = LIST_HEAD_INIT(init_container.children), +#ifdef CONFIG_KEYS .req_key_traps = LIST_HEAD_INIT(init_container.req_key_traps), +#endif .flags = (1 << CONTAINER_FLAG_INIT_STARTED), .lock = __SPIN_LOCK_UNLOCKED(init_container.lock), .seq = SEQCNT_ZERO(init_fs.seq), @@ -54,8 +56,6 @@ void put_container(struct container *c) while (c && refcount_dec_and_test(&c->usage)) { BUG_ON(!list_empty(&c->members)); - if 
(!list_empty(&c->req_key_traps)) - key_del_intercept(c, NULL); if (c->pid_ns) put_pid_ns(c->pid_ns); if (c->ns) @@ -71,7 +71,15 @@ void put_container(struct container *c) if (c->cred) put_cred(c->cred); +#ifdef CONFIG_KEYS + if (!list_empty(&c->req_key_traps)) + key_del_intercept(c, NULL); + if (c->tag) { + c->tag->removed = true; + key_put_tag(c->tag); + } key_put(c->keyring); +#endif security_container_free(c); kfree(c); c = parent; @@ -209,6 +217,24 @@ const struct file_operations container_fops = { .release = container_release, }; +/** + * fd_to_container - Get the container attached to an fd. + */ +struct container *fd_to_container(int fd) +{ + struct container *c = ERR_PTR(-EINVAL); + struct fd f = fdget(fd); + + if (!f.file) + return ERR_PTR(-EBADF); + + if (is_container_file(f.file)) + c = get_container(f.file->private_data); + + fdput(f); + return c; +} + /* * Handle fork/clone. * @@ -290,7 +316,9 @@ static struct container *alloc_container(const char __user *name) INIT_LIST_HEAD(&c->members); INIT_LIST_HEAD(&c->children); +#ifdef CONFIG_KEYS INIT_LIST_HEAD(&c->req_key_traps); +#endif init_waitqueue_head(&c->waitq); spin_lock_init(&c->lock); refcount_set(&c->usage, 1); @@ -305,8 +333,15 @@ static struct container *alloc_container(const char __user *name) ret = -EINVAL; if (strchr(c->name, '/')) goto err; - c->name[len] = 0; + +#ifdef CONFIG_KEYS + ret = -ENOMEM; + c->tag = kzalloc(sizeof(*c->tag), GFP_KERNEL); + if (!c->tag) + goto err; + refcount_set(&c->tag->usage, 1); +#endif return c; err: diff --git a/samples/vfs/test-container.c b/samples/vfs/test-container.c index e24048fdbe33..7b2081693fce 100644 --- a/samples/vfs/test-container.c +++ b/samples/vfs/test-container.c @@ -22,6 +22,30 @@ #define KEYCTL_CONTAINER_INTERCEPT 31 /* Intercept upcalls inside a container */ #define KEYCTL_SET_CONTAINER_KEYRING 35 /* Attach a keyring to a container */ +#define KEYCTL_GRANT_PERMISSION 36 /* Grant a permit to a key */ + +enum key_ace_subject_type { + 
KEY_ACE_SUBJ_STANDARD = 0, /* subject is one of key_ace_standard_subject */ + KEY_ACE_SUBJ_CONTAINER = 1, /* subject is a container fd */ +}; + +enum key_ace_standard_subject { + KEY_ACE_EVERYONE = 0, /* Everyone, including owner and group */ + KEY_ACE_GROUP = 1, /* The key's group */ + KEY_ACE_OWNER = 2, /* The owner of the key */ + KEY_ACE_POSSESSOR = 3, /* Any process that possesses of the key */ +}; + +#define KEY_ACE_VIEW 0x00000001 /* Can describe the key */ +#define KEY_ACE_READ 0x00000002 /* Can read the key content */ +#define KEY_ACE_WRITE 0x00000004 /* Can update/modify the key content */ +#define KEY_ACE_SEARCH 0x00000008 /* Can find the key by search */ +#define KEY_ACE_LINK 0x00000010 /* Can make a link to the key */ +#define KEY_ACE_SET_SECURITY 0x00000020 /* Can set owner, group, ACL */ +#define KEY_ACE_INVAL 0x00000040 /* Can invalidate the key */ +#define KEY_ACE_REVOKE 0x00000080 /* Can revoke the key */ +#define KEY_ACE_JOIN 0x00000100 /* Can join keyring */ +#define KEY_ACE_CLEAR 0x00000200 /* Can clear keyring */ /* Hope -1 isn't a syscall */ #ifndef __NR_fsopen @@ -190,7 +214,7 @@ void container_init(void) */ int main(int argc, char *argv[]) { - key_serial_t keyring; + key_serial_t keyring, key; pid_t pid; int fsfd, mfd, cfd, ws; @@ -271,11 +295,45 @@ int main(int argc, char *argv[]) exit(1); } + /* We need to grant the container permission to search for keys in the + * container keyring. 
+ */ + if (keyctl(KEYCTL_GRANT_PERMISSION, keyring, KEY_ACE_SUBJ_CONTAINER, cfd, + KEY_ACE_SEARCH) < 0) { + perror("keyctl_grant/s"); + exit(1); + } + + if (keyctl(KEYCTL_GRANT_PERMISSION, keyring, + KEY_ACE_SUBJ_STANDARD, KEY_ACE_OWNER, 0) < 0) { + perror("keyctl_grant/s"); + exit(1); + } + if (keyctl(KEYCTL_SET_CONTAINER_KEYRING, cfd, keyring) < 0) { perror("keyctl_set_container_keyring"); exit(1); } + /* Create a key that can be accessed from within the container */ + printf("Sample key...\n"); + key = add_key("user", "foobar", "wibble", 6, keyring); + if (key == -1) { + perror("add_key/s"); + exit(1); + } + + if (keyctl(KEYCTL_GRANT_PERMISSION, key, KEY_ACE_SUBJ_CONTAINER, cfd, + KEY_ACE_VIEW | KEY_ACE_SEARCH | KEY_ACE_READ | KEY_ACE_LINK) < 0) { + perror("keyctl_grant/s"); + exit(1); + } + + if (keyctl_link(key, keyring) < 0) { + perror("keyctl_link"); + exit(1); + } + /* Create a keyring to catch upcalls. */ printf("Intercepting...\n"); keyring = add_key("keyring", "upcall", NULL, 0, KEY_SPEC_SESSION_KEYRING); diff --git a/security/keys/permission.c b/security/keys/permission.c index cb1359f6c668..f16d1665885f 100644 --- a/security/keys/permission.c +++ b/security/keys/permission.c @@ -13,6 +13,7 @@ #include #include #include +#include #include "internal.h" struct key_acl default_key_acl = { @@ -130,6 +131,15 @@ int key_task_permission(const key_ref_t key_ref, const struct cred *cred, break; } break; +#ifdef CONFIG_CONTAINERS + case KEY_ACE_SUBJ_CONTAINER: { + const struct key_tag *tag = rcu_dereference(ace->container_tag); + + if (!tag->removed && current->container->tag == tag) + allow |= ace->perm; + break; + } +#endif } } @@ -185,8 +195,7 @@ EXPORT_SYMBOL(key_validate); */ unsigned int key_acl_to_perm(const struct key_acl *acl) { - unsigned int perm = 0, tperm; - int i; + unsigned int perm = 0, tperm, i; BUILD_BUG_ON(KEY_OTH_VIEW != KEY_ACE_VIEW || KEY_OTH_READ != KEY_ACE_READ || @@ -237,13 +246,37 @@ unsigned int key_acl_to_perm(const struct key_acl 
*acl) return perm; } +/* + * Clean up an ACL. + */ +static void key_free_acl(struct rcu_head *rcu) +{ + struct key_acl *acl = container_of(rcu, struct key_acl, rcu); +#ifdef CONFIG_CONTAINERS + struct key_tag *tag; + unsigned int i; + + for (i = 0; i < acl->nr_ace; i++) { + const struct key_ace *ace = &acl->aces[i]; + switch (ace->type) { + case KEY_ACE_SUBJ_CONTAINER: + tag = rcu_access_pointer(ace->container_tag); + key_put_tag(ace->container_tag); + break; + } + } +#endif + + kfree(acl); +} + /* * Destroy a key's ACL. */ void key_put_acl(struct key_acl *acl) { if (acl && refcount_dec_and_test(&acl->usage)) - kfree_rcu(acl, rcu); + call_rcu(&acl->rcu, key_free_acl); } /* @@ -297,6 +330,10 @@ static struct key_acl *key_alloc_acl(const struct key_acl *old_acl, int nr, int if (i == skip) continue; acl->aces[j] = old_acl->aces[i]; +#ifdef CONFIG_CONTAINERS + if (acl->aces[j].type == KEY_ACE_SUBJ_CONTAINER) + refcount_inc(&acl->aces[j].container_tag->usage); +#endif j++; } return acl; @@ -312,21 +349,39 @@ static long key_change_acl(struct key *key, struct key_ace *new_ace) old = rcu_dereference_protected(key->acl, lockdep_is_held(&key->sem)); - for (i = 0; i < old->nr_ace; i++) - if (old->aces[i].type == new_ace->type && - old->aces[i].subject_id == new_ace->subject_id) - goto found_match; + for (i = 0; i < old->nr_ace; i++) { + if (old->aces[i].type != new_ace->type) + continue; + switch (old->aces[i].type) { + case KEY_ACE_SUBJ_STANDARD: + if (old->aces[i].subject_id == new_ace->subject_id) + goto replace_ace; + break; +#ifdef CONFIG_CONTAINERS + case KEY_ACE_SUBJ_CONTAINER: + if (old->aces[i].container_tag == new_ace->container_tag) + goto replace_ace; + break; +#endif + default: + break; + } + } if (new_ace->perm == 0) - return 0; /* No permissions to remove. Add deny record? */ + return 0; /* No permissions to cancel. Add deny record? 
*/ acl = key_alloc_acl(old, 1, -1); if (IS_ERR(acl)) return PTR_ERR(acl); acl->aces[i] = *new_ace; +#ifdef CONFIG_CONTAINERS + if (acl->aces[i].type == KEY_ACE_SUBJ_CONTAINER) + refcount_inc(&acl->aces[i].container_tag->usage); +#endif goto change; -found_match: +replace_ace: if (new_ace->perm == 0) goto delete_ace; if (new_ace->perm == old->aces[i].perm) @@ -360,6 +415,7 @@ long keyctl_grant_permission(key_serial_t keyid, key_ref_t key_ref; long ret; + memset(&new_ace, 0, sizeof(new_ace)); new_ace.type = type; new_ace.perm = perm; @@ -370,6 +426,18 @@ long keyctl_grant_permission(key_serial_t keyid, new_ace.subject_id = subject; break; +#ifdef CONFIG_CONTAINERS + case KEY_ACE_SUBJ_CONTAINER: { + struct container *c = fd_to_container(subject); + if (IS_ERR(c)) + return -EINVAL; + refcount_inc(&c->tag->usage); + new_ace.container_tag = c->tag; + put_container(c); + break; + } +#endif + default: return -ENOENT; } @@ -391,5 +459,9 @@ long keyctl_grant_permission(key_serial_t keyid, up_write(&key->sem); key_put(key); error: +#ifdef CONFIG_CONTAINERS + if (new_ace.type == KEY_ACE_SUBJ_CONTAINER && new_ace.container_tag) + key_put_tag(new_ace.container_tag); +#endif return ret; } diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c index 0a231ede4d2b..f296a1cc979a 100644 --- a/security/keys/process_keys.c +++ b/security/keys/process_keys.c @@ -466,7 +466,7 @@ key_ref_t search_my_process_keyrings(struct keyring_search_context *ctx) #ifdef CONFIG_CONTAINERS if (current->container->keyring) { key_ref = keyring_search_aux( - make_key_ref(current->container->keyring, 1), ctx); + make_key_ref(current->container->keyring, false), ctx); if (!IS_ERR(key_ref)) goto found; From patchwork Fri Feb 15 16:11:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043015 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JFV64qxz9rxp for ; Sat, 16 Feb 2019 03:12:46 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2392427AbfBOQMF (ORCPT ); Fri, 15 Feb 2019 11:12:05 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46436 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388701AbfBOQME (ORCPT ); Fri, 15 Feb 2019 11:12:04 -0500 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 2E58EC0AE662; Fri, 15 Feb 2019 16:12:04 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0617D5D6A9; Fri, 15 Feb 2019 16:12:01 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 
3798903
Subject: [RFC PATCH 25/27] keys: Provide a way to ask for the container keyring
From: David Howells
To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org
Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org
Date: Fri, 15 Feb 2019 16:11:58 +0000
Message-ID: <155024711895.21651.13875852239032339316.stgit@warthog.procyon.org.uk>
In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk>
User-Agent: StGit/unknown-version
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Fri, 15 Feb 2019 16:12:04 +0000 (UTC)
Sender: linux-cifs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-cifs@vger.kernel.org

Provide a constant that can be used in place of a key ID to indicate the
keyring belonging to the current process's container.  Used as:

	key_serial_t container_keyring =
		keyctl_get_keyring_ID(KEY_SPEC_CONTAINER_KEYRING, 0);

Note that this is merely a 'macro' for the ID of the keyring.  Actually doing
anything with it requires the keyring to grant appropriate permissions to
the denizens of the container.
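
As a sketch of how a process inside the container might resolve the constant
without libkeyutils, the lookup could also go through the raw keyctl()
system call.  This rests on stated assumptions: the value -9 is the one
proposed by this patch, KEYCTL_GET_KEYRING_ID is the existing keyctl
operation behind keyctl_get_keyring_ID(), and container_keyring_id() and
report_container_keyring() are hypothetical helper names.  A kernel without
this series is expected to reject the special ID.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define KEYCTL_GET_KEYRING_ID		0	/* existing keyctl operation */
#define KEY_SPEC_CONTAINER_KEYRING	-9	/* assumed: value proposed by this patch */

/*
 * Hypothetical helper: ask the kernel for the serial number of the
 * current container's keyring.  Returns the keyring ID on success, or
 * -1 with errno set (for instance if no container keyring is attached
 * or the kernel does not know the special ID).
 */
static long container_keyring_id(void)
{
	return syscall(__NR_keyctl, (unsigned long)KEYCTL_GET_KEYRING_ID,
		       (unsigned long)KEY_SPEC_CONTAINER_KEYRING, 0UL);
}

/* Print the container keyring ID, or the reason it could not be found. */
static void report_container_keyring(void)
{
	long id = container_keyring_id();

	if (id < 0)
		fprintf(stderr, "container keyring: %s\n", strerror(errno));
	else
		printf("Container keyring %ld\n", id);
}
```

Obtaining the ID this way does not by itself confer access; the keyring's
ACL still has to grant the container the relevant KEY_ACE_* permits.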
Signed-off-by: David Howells --- include/uapi/linux/keyctl.h | 1 + samples/vfs/test-container.c | 15 +++++++++++++++ security/keys/process_keys.c | 7 +++++++ 3 files changed, 23 insertions(+) diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h index 7136d14dd4d7..89ab609f774c 100644 --- a/include/uapi/linux/keyctl.h +++ b/include/uapi/linux/keyctl.h @@ -88,6 +88,7 @@ enum key_ace_standard_subject { #define KEY_SPEC_GROUP_KEYRING -6 /* - key ID for GID-specific keyring */ #define KEY_SPEC_REQKEY_AUTH_KEY -7 /* - key ID for assumed request_key auth key */ #define KEY_SPEC_REQUESTOR_KEYRING -8 /* - key ID for request_key() dest keyring */ +#define KEY_SPEC_CONTAINER_KEYRING -9 /* - key ID for current->container's keyring */ /* request-key default keyrings */ #define KEY_REQKEY_DEFL_NO_CHANGE -1 diff --git a/samples/vfs/test-container.c b/samples/vfs/test-container.c index 7b2081693fce..4716dd50b696 100644 --- a/samples/vfs/test-container.c +++ b/samples/vfs/test-container.c @@ -20,6 +20,7 @@ #include #include +#define KEY_SPEC_CONTAINER_KEYRING -9 /* - key ID for current->container's keyring */ #define KEYCTL_CONTAINER_INTERCEPT 31 /* Intercept upcalls inside a container */ #define KEYCTL_SET_CONTAINER_KEYRING 35 /* Attach a keyring to a container */ #define KEYCTL_GRANT_PERMISSION 36 /* Grant a permit to a key */ @@ -160,6 +161,8 @@ static inline int fork_into_container(int containerfd) static __attribute__((noreturn)) void container_init(void) { + key_serial_t ckey; + if (0) { /* Do a bit of debugging on the container. 
*/ struct dirent **dlist; @@ -203,6 +206,12 @@ void container_init(void) exit(1); } + ckey = keyctl_get_keyring_ID(KEY_SPEC_CONTAINER_KEYRING, 0); + if (ckey == -1) + perror("keyctl_get_keyring_ID"); + else + printf("Container keyring %d\n", ckey); + setenv("PS1", "container>", 1); execl("/bin/bash", "bash", NULL); perror("execl"); @@ -310,6 +319,12 @@ int main(int argc, char *argv[]) exit(1); } + if (keyctl(KEYCTL_GRANT_PERMISSION, keyring, + KEY_ACE_SUBJ_STANDARD, KEY_ACE_OWNER, 0) < 0) { + perror("keyctl_grant/s"); + exit(1); + } + if (keyctl(KEYCTL_SET_CONTAINER_KEYRING, cfd, keyring) < 0) { perror("keyctl_set_container_keyring"); exit(1); diff --git a/security/keys/process_keys.c b/security/keys/process_keys.c index f296a1cc979a..f8f580a760c9 100644 --- a/security/keys/process_keys.c +++ b/security/keys/process_keys.c @@ -725,6 +725,13 @@ key_ref_t lookup_user_key(key_serial_t id, unsigned long lflags, key_ref = make_key_ref(key, 1); break; + case KEY_SPEC_CONTAINER_KEYRING: + key = current->container->keyring; + if (!key) + goto error; + key_ref = make_key_ref(key, 0); + goto error; + default: key_ref = ERR_PTR(-EINVAL); if (id < 1) From patchwork Fri Feb 15 16:12:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043013 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JF73FN8z9s4Z for ; Sat, 16 Feb 2019 03:12:27 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2392542AbfBOQMT 
(ORCPT ); Fri, 15 Feb 2019 11:12:19 -0500 Received: from mx1.redhat.com ([209.132.183.28]:36438 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727559AbfBOQMN (ORCPT ); Fri, 15 Feb 2019 11:12:13 -0500 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5B056C2E26; Fri, 15 Feb 2019 16:12:12 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 32CEA60933; Fri, 15 Feb 2019 16:12:10 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 26/27] keys: Allow containers to be included in key ACLs by name From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:12:09 +0000 Message-ID: <155024712939.21651.14094051420890992278.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 15 Feb 2019 16:12:12 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Allow a container to be specified to KEYCTL_GRANT_PERMISSION 
by name. This allows processes that don't have access to the container fd to grant permission on a key to a container. This is restricted to the containers that are children of the current container. This can be effected with something like: keyctl(KEYCTL_GRANT_PERMISSION, key, KEY_ACE_SUBJ_CONTAINER_NAME, "foo-test", KEY_ACE_SEARCH); Signed-off-by: David Howells --- include/linux/container.h | 1 + include/uapi/linux/keyctl.h | 1 + kernel/container.c | 24 ++++++++++++++++++++++++ security/keys/compat.c | 4 ++++ security/keys/internal.h | 2 +- security/keys/keyctl.c | 2 +- security/keys/permission.c | 19 ++++++++++++++++++- 7 files changed, 50 insertions(+), 3 deletions(-) diff --git a/include/linux/container.h b/include/linux/container.h index cd82074c26a3..fd49ce23467d 100644 --- a/include/linux/container.h +++ b/include/linux/container.h @@ -61,6 +61,7 @@ extern struct container init_container; #ifdef CONFIG_CONTAINERS extern const struct file_operations container_fops; +extern struct container *find_container(const char *name); extern int copy_container(unsigned long flags, struct task_struct *tsk, struct container *container); extern void exit_container(struct task_struct *tsk); diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h index 89ab609f774c..31520da17f37 100644 --- a/include/uapi/linux/keyctl.h +++ b/include/uapi/linux/keyctl.h @@ -21,6 +21,7 @@ enum key_ace_subject_type { KEY_ACE_SUBJ_STANDARD = 0, /* subject is one of key_ace_standard_subject */ KEY_ACE_SUBJ_CONTAINER = 1, /* subject is a container fd */ + KEY_ACE_SUBJ_CONTAINER_NAME = 2, /* subject is a container name pointer */ nr__key_ace_subject_type }; diff --git a/kernel/container.c b/kernel/container.c index 81be4ed915c2..c164c16328d6 100644 --- a/kernel/container.c +++ b/kernel/container.c @@ -235,6 +235,30 @@ struct container *fd_to_container(int fd) return c; } +/** + * find_container - Find a child container by name. + * @name: The name of the container to find. 
+ * + * Find a child of the current container by name. + */ +struct container *find_container(const char *name) +{ + struct container *c = current->container, *p; + + spin_lock(&c->lock); + list_for_each_entry(p, &c->children, child_link) { + if (strcmp(p->name, name) == 0) { + get_container(p); + goto found; + } + } + + p = NULL; +found: + spin_unlock(&c->lock); + return p; +} + /* * Handle fork/clone. * diff --git a/security/keys/compat.c b/security/keys/compat.c index 953156f94320..78c6c0e0eb59 100644 --- a/security/keys/compat.c +++ b/security/keys/compat.c @@ -175,6 +175,10 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option, case KEYCTL_MOVE: return keyctl_keyring_move(arg2, arg3, arg4, arg5); case KEYCTL_GRANT_PERMISSION: + if (arg3 == KEY_ACE_SUBJ_CONTAINER_NAME) + return keyctl_grant_permission(arg2, arg3, + (unsigned long)compat_ptr(arg4), + arg5); return keyctl_grant_permission(arg2, arg3, arg4, arg5); default: diff --git a/security/keys/internal.h b/security/keys/internal.h index 6cd7b5c17298..aa4ad9c8002e 100644 --- a/security/keys/internal.h +++ b/security/keys/internal.h @@ -379,7 +379,7 @@ extern long keyctl_set_container_keyring(int, key_serial_t); extern long keyctl_grant_permission(key_serial_t keyid, enum key_ace_subject_type type, - unsigned int subject, + unsigned long subject, unsigned int perm); /* diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c index 02bd73d5a05a..978c9008c3b2 100644 --- a/security/keys/keyctl.c +++ b/security/keys/keyctl.c @@ -1964,7 +1964,7 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, unsigned long, arg3, case KEYCTL_GRANT_PERMISSION: return keyctl_grant_permission((key_serial_t)arg2, (enum key_ace_subject_type)arg3, - (unsigned int)arg4, + (unsigned long)arg4, (unsigned int)arg5); default: diff --git a/security/keys/permission.c b/security/keys/permission.c index f16d1665885f..b0e94ccc4635 100644 --- a/security/keys/permission.c +++ b/security/keys/permission.c @@ -407,7 +407,7 @@ static long 
key_change_acl(struct key *key, struct key_ace *new_ace) */ long keyctl_grant_permission(key_serial_t keyid, enum key_ace_subject_type type, - unsigned int subject, + unsigned long subject, unsigned int perm) { struct key_ace new_ace; @@ -436,6 +436,23 @@ long keyctl_grant_permission(key_serial_t keyid, put_container(c); break; } + case KEY_ACE_SUBJ_CONTAINER_NAME: { + struct container *c; + char *name; + + name = strndup_user((const char __user *)subject, 23); + if (IS_ERR(name)) + return PTR_ERR(name); + c = find_container(name); + kfree(name); + if (!c) + return -EINVAL; + new_ace.type = KEY_ACE_SUBJ_CONTAINER; + refcount_inc(&c->tag->usage); + new_ace.container_tag = c->tag; + put_container(c); + break; + } #endif default: From patchwork Fri Feb 15 16:12:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 1043014 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-cifs-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 441JFN5Sddz9rxp for ; Sat, 16 Feb 2019 03:12:40 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732023AbfBOQM2 (ORCPT ); Fri, 15 Feb 2019 11:12:28 -0500 Received: from mx1.redhat.com ([209.132.183.28]:36804 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727183AbfBOQM1 (ORCPT ); Fri, 15 Feb 2019 11:12:27 -0500 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) 
by mx1.redhat.com (Postfix) with ESMTPS id 823E8C7943; Fri, 15 Feb 2019 16:12:25 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-129.rdu2.redhat.com [10.10.121.129]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4BEFE60C62; Fri, 15 Feb 2019 16:12:19 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH 27/27] containers: Sample to grant access to a key in a container From: David Howells To: keyrings@vger.kernel.org, trond.myklebust@hammerspace.com, sfrench@samba.org Cc: linux-security-module@vger.kernel.org, linux-nfs@vger.kernel.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, rgb@redhat.com, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Fri, 15 Feb 2019 16:12:17 +0000 Message-ID: <155024713756.21651.13272811997083735868.stgit@warthog.procyon.org.uk> In-Reply-To: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> References: <155024683432.21651.14153938339749694146.stgit@warthog.procyon.org.uk> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 15 Feb 2019 16:12:26 +0000 (UTC) Sender: linux-cifs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-cifs@vger.kernel.org Provide a sample program that will grant access to the specified key for a container named "foo-test" (as created by the test-container sample) and then link the key into the container keyring (either given on the command line or searches for a keyring called "_container" in the session keyring as placed there by the test-container sample). 
So, for example, this could be used to place an rxrpc key in the container
keyring for kAFS inside the container to use:

 (1) Poke kerberos to get a ticket for accessing AFS.

	# kinit
	# aklog-kafs redhat.com

 (2) Find the rxrpc key ID:

	# keyctl show
	Session Keyring
	1071328996 --alswrv      0     0  keyring: _ses
	 574060623 ---lswrv      0 65534   \_ keyring: _uid.0
	1004048468 --alswrv      0     0   \_ rxrpc: afs@redhat.com
	 918328787 --alswrv      0     0   \_ keyring: upcall
	 996275498 --alswrv      0     0   \_ keyring: _container
	 785497401 --alswrv      0     0       \_ user: foobar

     which would be 1004048468 in this example.

 (3) Invoke the sample:

	# test-cont-grant 1004048468

     The rxrpc key can now be seen in the container keyring:

	# keyctl show
	Session Keyring
	1071328996 --alswrv      0     0  keyring: _ses
	 574060623 ---lswrv      0 65534   \_ keyring: _uid.0
	1004048468 --alswrv      0     0   \_ rxrpc: afs@redhat.com
	 918328787 --alswrv      0     0   \_ keyring: upcall
	 996275498 --alswrv      0     0   \_ keyring: _container
	 785497401 --alswrv      0     0       \_ user: foobar
	1004048468 --alswrv      0     0       \_ rxrpc: afs@redhat.com

 (4) Mount the kAFS filesystem inside the container:

	> mount -t afs "%redhat.com:root.cell" /mnt

The contents of /mnt can then be used from inside the container using the
key placed into the container keyring.
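The grant in step (3) names the container "foo-test" instead of passing a
container fd, and patch 26/27 in this series resolves that name against the
children of the caller's container only.  As a rough userspace model of that
lookup (a sketch, not the kernel code: struct model_container and
model_find_child() are hypothetical names, the child list is a plain singly
linked list, and the locking the kernel function performs is omitted):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct container. */
struct model_container {
	const char *name;			/* container name */
	int usage;				/* reference count */
	struct model_container *next_sibling;	/* next child of the same parent */
	struct model_container *children;	/* first direct child, or NULL */
};

/*
 * Look up a direct child of @parent by name.  On a match, take a
 * reference (usage++) and return the child; otherwise return NULL.
 * Grandchildren are deliberately not searched.
 */
static struct model_container *
model_find_child(struct model_container *parent, const char *name)
{
	struct model_container *p;

	assert(parent != NULL && name != NULL);

	for (p = parent->children; p; p = p->next_sibling) {
		if (strcmp(p->name, name) == 0) {
			p->usage++;
			return p;
		}
	}
	return NULL;
}
```

In this model a caller would drop the reference again (usage--) once it has
copied whatever it needs from the child, which mirrors the get/put pairing
used around the lookup in the kernel patch.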
Signed-off-by: David Howells --- samples/vfs/Makefile | 3 + samples/vfs/test-cont-grant.c | 84 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 87 insertions(+) create mode 100644 samples/vfs/test-cont-grant.c diff --git a/samples/vfs/Makefile b/samples/vfs/Makefile index a8e9e1142ae3..c8eea193a856 100644 --- a/samples/vfs/Makefile +++ b/samples/vfs/Makefile @@ -6,6 +6,7 @@ hostprogs-$(CONFIG_SAMPLE_VFS) := \ test-mntinfo \ test-statx \ test-container \ + test-cont-grant \ test-upcall # Tell kbuild to always build the programs @@ -22,5 +23,7 @@ HOSTCFLAGS_test-statx.o += -I$(objtree)/usr/include HOSTCFLAGS_test-container.o += -I$(objtree)/usr/include HOSTLDLIBS_test-container += -lkeyutils +HOSTCFLAGS_test-cont-grant.o += -I$(objtree)/usr/include +HOSTLDLIBS_test-cont-grant += -lkeyutils HOSTCFLAGS_test-upcall.o += -I$(objtree)/usr/include HOSTLDLIBS_test-upcall += -lkeyutils diff --git a/samples/vfs/test-cont-grant.c b/samples/vfs/test-cont-grant.c new file mode 100644 index 000000000000..da4a60bc71fa --- /dev/null +++ b/samples/vfs/test-cont-grant.c @@ -0,0 +1,84 @@ +/* Link a key into a container keyring and grant perms to the container. + * + * Copyright (C) 2019 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public Licence + * as published by the Free Software Foundation; either version + * 2 of the Licence, or (at your option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define KEYCTL_GRANT_PERMISSION 36 /* Grant a permit to a key */ + +enum key_ace_subject_type { + KEY_ACE_SUBJ_STANDARD = 0, /* subject is one of key_ace_standard_subject */ + KEY_ACE_SUBJ_CONTAINER = 1, /* subject is a container fd */ + KEY_ACE_SUBJ_CONTAINER_NAME = 2, /* subject is a container name pointer */ +}; + +enum key_ace_standard_subject { + KEY_ACE_EVERYONE = 0, /* Everyone, including owner and group */ + KEY_ACE_GROUP = 1, /* The key's group */ + KEY_ACE_OWNER = 2, /* The owner of the key */ + KEY_ACE_POSSESSOR = 3, /* Any process that possesses the key */ +}; + +#define KEY_ACE_VIEW 0x00000001 /* Can describe the key */ +#define KEY_ACE_READ 0x00000002 /* Can read the key content */ +#define KEY_ACE_WRITE 0x00000004 /* Can update/modify the key content */ +#define KEY_ACE_SEARCH 0x00000008 /* Can find the key by search */ +#define KEY_ACE_LINK 0x00000010 /* Can make a link to the key */ +#define KEY_ACE_SET_SECURITY 0x00000020 /* Can set owner, group, ACL */ +#define KEY_ACE_INVAL 0x00000040 /* Can invalidate the key */ +#define KEY_ACE_REVOKE 0x00000080 /* Can revoke the key */ +#define KEY_ACE_JOIN 0x00000100 /* Can join keyring */ +#define KEY_ACE_CLEAR 0x00000200 /* Can clear keyring */ + +int main(int argc, char *argv[]) +{ + key_serial_t key, keyring; + + if (argc == 2) { + printf("Find keyring '_container'...\n"); + keyring = keyctl_search(KEY_SPEC_SESSION_KEYRING, "keyring", "_container", 0); + if (keyring == -1) { + perror("keyctl_search"); + exit(1); + } + + key = atoi(argv[1]); + } else if (argc == 3) { + printf("Use specified keyring...\n"); + keyring = atoi(argv[2]); + key = atoi(argv[1]); + } else { + fprintf(stderr, "Format: test-cont-grant []\n"); + exit(2); + } + + if (keyctl(KEYCTL_GRANT_PERMISSION, key, + KEY_ACE_SUBJ_CONTAINER_NAME, "foo-test", + KEY_ACE_SEARCH) < 0) { + perror("keyctl_grant/s"); + exit(1); + } + + if 
(keyctl_link(key, keyring) < 0) { + perror("keyctl_link"); + exit(1); + } + + exit(0); +}
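
The sample above converts its key and keyring arguments with atoi(), which
cannot tell a malformed ID from a valid one.  One possible hardening, shown
only as a sketch: parse_key_id() is a hypothetical helper that is not part of
the patch, and it assumes key serial numbers fit in a signed 32-bit integer.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of the patch: parse a decimal key serial
 * number from a command-line argument.
 *
 * Returns 0 and stores the value in *out on success.  Returns -1 if the
 * argument is NULL or empty, has trailing characters, or is outside the
 * signed 32-bit range.
 */
static int parse_key_id(const char *arg, int32_t *out)
{
	char *end;
	long v;

	assert(out != NULL);

	if (arg == NULL || *arg == '\0')
		return -1;

	errno = 0;
	v = strtol(arg, &end, 10);
	if (errno != 0 || end == arg || *end != '\0')
		return -1;
	if (v < INT32_MIN || v > INT32_MAX)
		return -1;

	*out = (int32_t)v;
	return 0;
}
```

With a helper like this, the sample could report a usage error and exit
instead of passing a bogus ID on to keyctl().  Negative values are still
accepted so that special keyring IDs remain usable.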