[PATCH OpenHarmony-5.10 00/18] sync 5.10 cve bugfix 01-24

From: Yu Changchun <yuchangchun1@huawei.com> These patches are related with the following CVEs: CVE-2021-39685 CVE-2021-4083 CVE-2021-4155 CVE-2021-4197 CVE-2021-4202 CVE-2021-44733 CVE-2021-45095 CVE-2021-45469 CVE-2021-45480 --------------------------------------------------- Bongsu Jeon (1): net: nfc: nci: Change the NCI close sequence Chao Yu (1): f2fs: fix to do sanity check on last xattr entry in __f2fs_setxattr() Darrick J. Wong (1): xfs: map unwritten blocks in XFS_IOC_{ALLOC,FREE}SP just like fallocate Greg Kroah-Hartman (2): USB: gadget: detect too-big endpoint 0 requests USB: gadget: bRequestType is a bitfield, not a enum Hangyu Hua (2): phonet: refcount leak in pep_sock_accep rds: memory leak in __rds_conn_create() Hui Su (1): cgroup/cgroup.c: replace 'of->kn->priv' with of_cft() Jens Wiklander (1): tee: handle lookup of shm with reference count 0 Lin Ma (4): NFC: reorganize the functions in nci_request NFC: reorder the logic in nfc_{un,}register_device NFC: add NCI_UNREG flag to eliminate the race NFC: add necessary privilege flags in netlink layer Linus Torvalds (1): fget: check that the fd still exists after getting a ref to it Michal Koutný (1): cgroup: cgroup.{procs,threads} factor out common parts Tejun Heo (3): cgroup: Use open-time credentials for process migraton perm checks cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv cgroup: Use open-time cgroup namespace for process migration perm checks drivers/tee/tee_shm.c | 171 ++++++++++++------------------ drivers/usb/gadget/composite.c | 12 +++ drivers/usb/gadget/legacy/dbgp.c | 13 +++ drivers/usb/gadget/legacy/inode.c | 16 ++- fs/f2fs/xattr.c | 11 +- fs/file.c | 4 + fs/xfs/xfs_ioctl.c | 3 +- include/linux/tee_drv.h | 4 +- include/net/nfc/nci_core.h | 1 + kernel/cgroup/cgroup-internal.h | 19 ++++ kernel/cgroup/cgroup-v1.c | 33 +++--- kernel/cgroup/cgroup.c | 147 ++++++++++++------------- net/nfc/core.c | 32 +++--- net/nfc/nci/core.c | 34 ++++-- net/nfc/netlink.c | 15 +++ net/phonet/pep.c | 1 + net/rds/connection.c | 1 + 17 files changed, 299 insertions(+), 218 deletions(-) -- 2.25.1

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> stable inclusion form stable-v5.10.85 commit 7193ad3e50e596ac2192531c58ba83b9e6d2444b issue: #I4RVJ4 CVE: CVE-2021-39685 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- Sometimes USB hosts can ask for buffers that are too large from endpoint 0, which should not be allowed. If this happens for OUT requests, stall the endpoint, but for IN requests, trim the request size to the endpoint buffer size. Co-developed-by: Szymon Heidrich <szymon.heidrich@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- drivers/usb/gadget/composite.c | 12 ++++++++++++ drivers/usb/gadget/legacy/dbgp.c | 13 +++++++++++++ drivers/usb/gadget/legacy/inode.c | 16 +++++++++++++++- 3 files changed, 40 insertions(+), 1 deletion(-) diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c index a8704e6498ab..426132988512 100644 --- a/drivers/usb/gadget/composite.c +++ b/drivers/usb/gadget/composite.c @@ -1648,6 +1648,18 @@ composite_setup(struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl) struct usb_function *f = NULL; u8 endp; + if (w_length > USB_COMP_EP0_BUFSIZ) { + if (ctrl->bRequestType == USB_DIR_OUT) { + goto done; + } else { + /* Cast away the const, we are going to overwrite on purpose. */ + __le16 *temp = (__le16 *)&ctrl->wLength; + + *temp = cpu_to_le16(USB_COMP_EP0_BUFSIZ); + w_length = USB_COMP_EP0_BUFSIZ; + } + } + /* partial re-init of the response message; the function or the * gadget might need to intercept e.g. a control-OUT completion * when we delegate to it. diff --git a/drivers/usb/gadget/legacy/dbgp.c b/drivers/usb/gadget/legacy/dbgp.c index e1d566c9918a..e567afcb2794 100644 --- a/drivers/usb/gadget/legacy/dbgp.c +++ b/drivers/usb/gadget/legacy/dbgp.c @@ -345,6 +345,19 @@ static int dbgp_setup(struct usb_gadget *gadget, void *data = NULL; u16 len = 0; + if (length > DBGP_REQ_LEN) { + if (ctrl->bRequestType == USB_DIR_OUT) { + return err; + } else { + /* Cast away the const, we are going to overwrite on purpose. */ + __le16 *temp = (__le16 *)&ctrl->wLength; + + *temp = cpu_to_le16(DBGP_REQ_LEN); + length = DBGP_REQ_LEN; + } + } + + if (request == USB_REQ_GET_DESCRIPTOR) { switch (value>>8) { case USB_DT_DEVICE: diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c index 71e7d10dd76b..04b9c4f5f129 100644 --- a/drivers/usb/gadget/legacy/inode.c +++ b/drivers/usb/gadget/legacy/inode.c @@ -110,6 +110,8 @@ enum ep0_state { /* enough for the whole queue: most events invalidate others */ #define N_EVENT 5 +#define RBUF_SIZE 256 + struct dev_data { spinlock_t lock; refcount_t count; @@ -144,7 +146,7 @@ struct dev_data { struct dentry *dentry; /* except this scratch i/o buffer for ep0 */ - u8 rbuf [256]; + u8 rbuf[RBUF_SIZE]; }; static inline void get_dev (struct dev_data *data) @@ -1333,6 +1335,18 @@ gadgetfs_setup (struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl) u16 w_value = le16_to_cpu(ctrl->wValue); u16 w_length = le16_to_cpu(ctrl->wLength); + if (w_length > RBUF_SIZE) { + if (ctrl->bRequestType == USB_DIR_OUT) { + return value; + } else { + /* Cast away the const, we are going to overwrite on purpose. */ + __le16 *temp = (__le16 *)&ctrl->wLength; + + *temp = cpu_to_le16(RBUF_SIZE); + w_length = RBUF_SIZE; + } + } + spin_lock (&dev->lock); dev->setup_abort = 0; if (dev->state == STATE_DEV_UNCONNECTED) { -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
stable inclusion form stable-v5.10.85 commit 7193ad3e50e596ac2192531c58ba83b9e6d2444b issue: #I4RVJ4 CVE: CVE-2021-39685
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
Sometimes USB hosts can ask for buffers that are too large from endpoint 0, which should not be allowed. If this happens for OUT requests, stall the endpoint, but for IN requests, trim the request size to the endpoint buffer size.
Co-developed-by: Szymon Heidrich <szymon.heidrich@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> mainline inclusion form mainline-v5.16-rc6 commit f08adf5add9a071160c68bb2a61d697f39ab0758 issue: #I4RVJ4 CVE: CVE-2021-39685 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- Szymon rightly pointed out that the previous check for the endpoint direction in bRequestType was not looking at only the bit involved, but rather the whole value. Normally this is ok, but for some request types, bits other than bit 8 could be set and the check for the endpoint length could not stall correctly. Fix that up by only checking the single bit. Fixes: 153a2d7e3350 ("USB: gadget: detect too-big endpoint 0 requests") Cc: Felipe Balbi <balbi@kernel.org> Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com> Link: https://lore.kernel.org/r/20211214184621.385828-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- drivers/usb/gadget/composite.c | 6 +++--- drivers/usb/gadget/legacy/dbgp.c | 6 +++--- drivers/usb/gadget/legacy/inode.c | 6 +++--- 3 files changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c index 426132988512..8bec0cbf844e 100644 --- a/drivers/usb/gadget/composite.c +++ b/drivers/usb/gadget/composite.c @@ -1649,14 +1649,14 @@ composite_setup(struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl) u8 endp; if (w_length > USB_COMP_EP0_BUFSIZ) { - if (ctrl->bRequestType == USB_DIR_OUT) { - goto done; - } else { + if (ctrl->bRequestType & USB_DIR_IN) { /* Cast away the const, we are going to overwrite on purpose. */ __le16 *temp = (__le16 *)&ctrl->wLength; *temp = cpu_to_le16(USB_COMP_EP0_BUFSIZ); w_length = USB_COMP_EP0_BUFSIZ; + } else { + goto done; } } diff --git a/drivers/usb/gadget/legacy/dbgp.c b/drivers/usb/gadget/legacy/dbgp.c index e567afcb2794..ffe58d44b2cb 100644 --- a/drivers/usb/gadget/legacy/dbgp.c +++ b/drivers/usb/gadget/legacy/dbgp.c @@ -346,14 +346,14 @@ static int dbgp_setup(struct usb_gadget *gadget, u16 len = 0; if (length > DBGP_REQ_LEN) { - if (ctrl->bRequestType == USB_DIR_OUT) { - return err; - } else { + if (ctrl->bRequestType & USB_DIR_IN) { /* Cast away the const, we are going to overwrite on purpose. */ __le16 *temp = (__le16 *)&ctrl->wLength; *temp = cpu_to_le16(DBGP_REQ_LEN); length = DBGP_REQ_LEN; + } else { + return err; } } diff --git a/drivers/usb/gadget/legacy/inode.c b/drivers/usb/gadget/legacy/inode.c index 04b9c4f5f129..217d2b66fa51 100644 --- a/drivers/usb/gadget/legacy/inode.c +++ b/drivers/usb/gadget/legacy/inode.c @@ -1336,14 +1336,14 @@ gadgetfs_setup (struct usb_gadget *gadget, const struct usb_ctrlrequest *ctrl) u16 w_length = le16_to_cpu(ctrl->wLength); if (w_length > RBUF_SIZE) { - if (ctrl->bRequestType == USB_DIR_OUT) { - return value; - } else { + if (ctrl->bRequestType & USB_DIR_IN) { /* Cast away the const, we are going to overwrite on purpose. */ __le16 *temp = (__le16 *)&ctrl->wLength; *temp = cpu_to_le16(RBUF_SIZE); w_length = RBUF_SIZE; + } else { + return value; } } -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mainline inclusion form mainline-v5.16-rc6 commit f08adf5add9a071160c68bb2a61d697f39ab0758 issue: #I4RVJ4 CVE: CVE-2021-39685
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
Szymon rightly pointed out that the previous check for the endpoint direction in bRequestType was not looking at only the bit involved, but rather the whole value. Normally this is ok, but for some request types, bits other than bit 8 could be set and the check for the endpoint length could not stall correctly.
Fix that up by only checking the single bit.
Fixes: 153a2d7e3350 ("USB: gadget: detect too-big endpoint 0 requests") Cc: Felipe Balbi <balbi@kernel.org> Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com> Link: https://lore.kernel.org/r/20211214184621.385828-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> ---
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Linus Torvalds <torvalds@linux-foundation.org> stable inclusion from stable-5.10.84 commit 4baba6ba56eb91a735a027f783cc4b9276b48d5b category: bugfix issue: #I4RVJ4 CVE: CVE-2021-4083 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.10.y&id=4baba6ba56eb91a735a027f783cc4b9276b48d5b Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> ------------------------------------------------- commit 054aa8d439b9185d4f5eb9a90282d1ce74772969 upstream. Jann Horn points out that there is another possible race wrt Unix domain socket garbage collection, somewhat reminiscent of the one fixed in commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK"). See the extended comment about the garbage collection requirements added to unix_peek_fds() by that commit for details. The race comes from how we can locklessly look up a file descriptor just as it is in the process of being closed, and with the right artificial timing (Jann added a few strategic 'mdelay(500)' calls to do that), the Unix domain socket garbage collector could see the reference count decrement of the close() happen before fget() took its reference to the file and the file was attached onto a new file descriptor. This is all (intentionally) correct on the 'struct file *' side, with RCU lookups and lockless reference counting very much part of the design. Getting that reference count out of order isn't a problem per se. But the garbage collector can get confused by seeing this situation of having seen a file not having any remaining external references and then seeing it being attached to an fd. In commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK") the fix was to serialize the file descriptor install with the garbage collector by taking and releasing the unix_gc_lock. That's not really an option here, but since this all happens when we are in the process of looking up a file descriptor, we can instead simply just re-check that the file hasn't been closed in the meantime, and just re-do the lookup if we raced with a concurrent close() of the same file descriptor. Reported-and-tested-by: Jann Horn <jannh@google.com> Acked-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- fs/file.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/file.c b/fs/file.c index 21c0893f2f1d..9d02352fa18c 100644 --- a/fs/file.c +++ b/fs/file.c @@ -834,6 +834,10 @@ static struct file *__fget_files(struct files_struct *files, unsigned int fd, file = NULL; else if (!get_file_rcu_many(file, refs)) goto loop; + else if (__fcheck_files(files, fd) != file) { + fput_many(file, refs); + goto loop; + } } rcu_read_unlock(); -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Linus Torvalds <torvalds@linux-foundation.org>
stable inclusion from stable-5.10.84 commit 4baba6ba56eb91a735a027f783cc4b9276b48d5b category: bugfix issue: #I4RVJ4 CVE: CVE-2021-4083
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------------------------
commit 054aa8d439b9185d4f5eb9a90282d1ce74772969 upstream.
Jann Horn points out that there is another possible race wrt Unix domain socket garbage collection, somewhat reminiscent of the one fixed in commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK").
See the extended comment about the garbage collection requirements added to unix_peek_fds() by that commit for details.
The race comes from how we can locklessly look up a file descriptor just as it is in the process of being closed, and with the right artificial timing (Jann added a few strategic 'mdelay(500)' calls to do that), the Unix domain socket garbage collector could see the reference count decrement of the close() happen before fget() took its reference to the file and the file was attached onto a new file descriptor.
This is all (intentionally) correct on the 'struct file *' side, with RCU lookups and lockless reference counting very much part of the design. Getting that reference count out of order isn't a problem per se.
But the garbage collector can get confused by seeing this situation of having seen a file not having any remaining external references and then seeing it being attached to an fd.
In commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK") the fix was to serialize the file descriptor install with the garbage collector by taking and releasing the unix_gc_lock.
That's not really an option here, but since this all happens when we are in the process of looking up a file descriptor, we can instead simply just re-check that the file hasn't been closed in the meantime, and just re-do the lookup if we raced with a concurrent close() of the same file descriptor.
Reported-and-tested-by: Jann Horn <jannh@google.com> Acked-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- fs/file.c | 4 ++++ 1 file changed, 4 insertions(+)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Hangyu Hua <hbh25y@gmail.com> mainline inclusion from mainline-v5.16-rc6 commit bcd0f93353326954817a4f9fa55ec57fb38acbb0 category: bugfix issue: #I4RVJ4 CVE: CVE-2021-45095 Reference: https://patchwork.kernel.org/project/netdevbpf/patch/20211209082839.33985-1-... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> ------------------------------------------------------------------- sock_hold(sk) is invoked in pep_sock_accept(), but __sock_put(sk) is not invoked in subsequent failure branches(pep_accept_conn() != 0). Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Link: https://lore.kernel.org/r/20211209082839.33985-1-hbh25y@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Huang Guobin <huangguobin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/phonet/pep.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/phonet/pep.c b/net/phonet/pep.c index a1525916885a..b4f90afb0638 100644 --- a/net/phonet/pep.c +++ b/net/phonet/pep.c @@ -868,6 +868,7 @@ static struct sock *pep_sock_accept(struct sock *sk, int flags, int *errp, err = pep_accept_conn(newsk, skb); if (err) { + __sock_put(sk); sock_put(newsk); newsk = NULL; goto drop; -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Hangyu Hua <hbh25y@gmail.com>
mainline inclusion from mainline-v5.16-rc6 commit bcd0f93353326954817a4f9fa55ec57fb38acbb0 category: bugfix issue: #I4RVJ4 CVE: CVE-2021-45095
Reference: https://patchwork.kernel.org/project/netdevbpf/patch/20211209082839.33985-1-...
should be : https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------------------------------------------
sock_hold(sk) is invoked in pep_sock_accept(), but __sock_put(sk) is not invoked in subsequent failure branches(pep_accept_conn() != 0).
Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Link: https://lore.kernel.org/r/20211209082839.33985-1-hbh25y@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Huang Guobin <huangguobin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/phonet/pep.c | 1 + 1 file changed, 1 insertion(+)
Looks good to me

From: Jens Wiklander <jens.wiklander@linaro.org> stable inclusion from stable-v5.10.89 commit c05d8f66ec3470e5212c4d08c46d6cb5738d600d issue: #I4RVJ4 CVE: CVE-2021-44733 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- commit dfd0743f1d9ea76931510ed150334d571fbab49d upstream. Since the tee subsystem does not keep a strong reference to its idle shared memory buffers, it races with other threads that try to destroy a shared memory through a close of its dma-buf fd or by unmapping the memory. In tee_shm_get_from_id() when a lookup in teedev->idr has been successful, it is possible that the tee_shm is in the dma-buf teardown path, but that path is blocked by the teedev mutex. Since we don't have an API to tell if the tee_shm is in the dma-buf teardown path or not we must find another way of detecting this condition. Fix this by doing the reference counting directly on the tee_shm using a new refcount_t refcount field. dma-buf is replaced by using anon_inode_getfd() instead, this separates the life-cycle of the underlying file from the tee_shm. tee_shm_put() is updated to hold the mutex when decreasing the refcount to 0 and then remove the tee_shm from teedev->idr before releasing the mutex. This means that the tee_shm can never be found unless it has a refcount larger than 0. Fixes: 967c9cca2cc5 ("tee: generic TEE subsystem") Cc: stable@vger.kernel.org Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Lars Persson <larper@axis.com> Reviewed-by: Sumit Garg <sumit.garg@linaro.org> Reported-by: Patrik Lantz <patrik.lantz@axis.com> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- drivers/tee/tee_shm.c | 171 ++++++++++++++++------------------------ include/linux/tee_drv.h | 4 +- 2 files changed, 68 insertions(+), 107 deletions(-) diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c index 8a9384a64f3e..499fccba3d74 100644 --- a/drivers/tee/tee_shm.c +++ b/drivers/tee/tee_shm.c @@ -1,11 +1,11 @@ // SPDX-License-Identifier: GPL-2.0-only /* - * Copyright (c) 2015-2016, Linaro Limited + * Copyright (c) 2015-2017, 2019-2021 Linaro Limited */ +#include <linux/anon_inodes.h> #include <linux/device.h> -#include <linux/dma-buf.h> -#include <linux/fdtable.h> #include <linux/idr.h> +#include <linux/mm.h> #include <linux/sched.h> #include <linux/slab.h> #include <linux/tee_drv.h> @@ -28,16 +28,8 @@ static void release_registered_pages(struct tee_shm *shm) } } -static void tee_shm_release(struct tee_shm *shm) +static void tee_shm_release(struct tee_device *teedev, struct tee_shm *shm) { - struct tee_device *teedev = shm->ctx->teedev; - - if (shm->flags & TEE_SHM_DMA_BUF) { - mutex_lock(&teedev->mutex); - idr_remove(&teedev->idr, shm->id); - mutex_unlock(&teedev->mutex); - } - if (shm->flags & TEE_SHM_POOL) { struct tee_shm_pool_mgr *poolm; @@ -64,45 +56,6 @@ static void tee_shm_release(struct tee_shm *shm) tee_device_put(teedev); } -static struct sg_table *tee_shm_op_map_dma_buf(struct dma_buf_attachment - *attach, enum dma_data_direction dir) -{ - return NULL; -} - -static void tee_shm_op_unmap_dma_buf(struct dma_buf_attachment *attach, - struct sg_table *table, - enum dma_data_direction dir) -{ -} - -static void tee_shm_op_release(struct dma_buf *dmabuf) -{ - struct tee_shm *shm = dmabuf->priv; - - tee_shm_release(shm); -} - -static int tee_shm_op_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma) -{ - struct tee_shm *shm = dmabuf->priv; - size_t size = vma->vm_end - vma->vm_start; - - /* Refuse sharing shared memory provided by application */ - if (shm->flags & TEE_SHM_USER_MAPPED) - return -EINVAL; - - return remap_pfn_range(vma, vma->vm_start, shm->paddr >> PAGE_SHIFT, - size, vma->vm_page_prot); -} - -static const struct dma_buf_ops tee_shm_dma_buf_ops = { - .map_dma_buf = tee_shm_op_map_dma_buf, - .unmap_dma_buf = tee_shm_op_unmap_dma_buf, - .release = tee_shm_op_release, - .mmap = tee_shm_op_mmap, -}; - struct tee_shm *tee_shm_alloc(struct tee_context *ctx, size_t size, u32 flags) { struct tee_device *teedev = ctx->teedev; @@ -137,6 +90,7 @@ struct tee_shm *tee_shm_alloc(struct tee_context *ctx, size_t size, u32 flags) goto err_dev_put; } + refcount_set(&shm->refcount, 1); shm->flags = flags | TEE_SHM_POOL; shm->ctx = ctx; if (flags & TEE_SHM_DMA_BUF) @@ -150,10 +104,7 @@ struct tee_shm *tee_shm_alloc(struct tee_context *ctx, size_t size, u32 flags) goto err_kfree; } - if (flags & TEE_SHM_DMA_BUF) { - DEFINE_DMA_BUF_EXPORT_INFO(exp_info); - mutex_lock(&teedev->mutex); shm->id = idr_alloc(&teedev->idr, shm, 1, 0, GFP_KERNEL); mutex_unlock(&teedev->mutex); @@ -161,28 +112,11 @@ struct tee_shm *tee_shm_alloc(struct tee_context *ctx, size_t size, u32 flags) ret = ERR_PTR(shm->id); goto err_pool_free; } - - exp_info.ops = &tee_shm_dma_buf_ops; - exp_info.size = shm->size; - exp_info.flags = O_RDWR; - exp_info.priv = shm; - - shm->dmabuf = dma_buf_export(&exp_info); - if (IS_ERR(shm->dmabuf)) { - ret = ERR_CAST(shm->dmabuf); - goto err_rem; - } } teedev_ctx_get(ctx); return shm; -err_rem: - if (flags & TEE_SHM_DMA_BUF) { - mutex_lock(&teedev->mutex); - idr_remove(&teedev->idr, shm->id); - mutex_unlock(&teedev->mutex); - } err_pool_free: poolm->ops->free(poolm, shm); err_kfree: @@ -243,6 +177,7 @@ struct tee_shm *tee_shm_register(struct tee_context *ctx, unsigned long addr, goto err; } + refcount_set(&shm->refcount, 1); shm->flags = flags | TEE_SHM_REGISTER; shm->ctx = ctx; shm->id = -1; @@ -303,22 +238,6 @@ struct tee_shm *tee_shm_register(struct tee_context *ctx, unsigned long addr, goto err; } - if (flags & TEE_SHM_DMA_BUF) { - DEFINE_DMA_BUF_EXPORT_INFO(exp_info); - - exp_info.ops = &tee_shm_dma_buf_ops; - exp_info.size = shm->size; - exp_info.flags = O_RDWR; - exp_info.priv = shm; - - shm->dmabuf = dma_buf_export(&exp_info); - if (IS_ERR(shm->dmabuf)) { - ret = ERR_CAST(shm->dmabuf); - teedev->desc->ops->shm_unregister(ctx, shm); - goto err; - } - } - return shm; err: if (shm) { @@ -336,6 +255,35 @@ struct tee_shm *tee_shm_register(struct tee_context *ctx, unsigned long addr, } EXPORT_SYMBOL_GPL(tee_shm_register); +static int tee_shm_fop_release(struct inode *inode, struct file *filp) +{ + tee_shm_put(filp->private_data); + return 0; +} + +static int tee_shm_fop_mmap(struct file *filp, struct vm_area_struct *vma) +{ + struct tee_shm *shm = filp->private_data; + size_t size = vma->vm_end - vma->vm_start; + + /* Refuse sharing shared memory provided by application */ + if (shm->flags & TEE_SHM_USER_MAPPED) + return -EINVAL; + + /* check for overflowing the buffer's size */ + if (vma->vm_pgoff + vma_pages(vma) > shm->size >> PAGE_SHIFT) + return -EINVAL; + + return remap_pfn_range(vma, vma->vm_start, shm->paddr >> PAGE_SHIFT, + size, vma->vm_page_prot); +} + +static const struct file_operations tee_shm_fops = { + .owner = THIS_MODULE, + .release = tee_shm_fop_release, + .mmap = tee_shm_fop_mmap, +}; + /** * tee_shm_get_fd() - Increase reference count and return file descriptor * @shm: Shared memory handle @@ -348,10 +296,11 @@ int tee_shm_get_fd(struct tee_shm *shm) if (!(shm->flags & TEE_SHM_DMA_BUF)) return -EINVAL; - get_dma_buf(shm->dmabuf); - fd = dma_buf_fd(shm->dmabuf, O_CLOEXEC); + /* matched by tee_shm_put() in tee_shm_op_release() */ + refcount_inc(&shm->refcount); + fd = anon_inode_getfd("tee_shm", &tee_shm_fops, shm, O_RDWR); if (fd < 0) - dma_buf_put(shm->dmabuf); + tee_shm_put(shm); return fd; } @@ -361,17 +310,7 @@ int tee_shm_get_fd(struct tee_shm *shm) */ void tee_shm_free(struct tee_shm *shm) { - /* - * dma_buf_put() decreases the dmabuf reference counter and will - * call tee_shm_release() when the last reference is gone. - * - * In the case of driver private memory we call tee_shm_release - * directly instead as it doesn't have a reference counter. - */ - if (shm->flags & TEE_SHM_DMA_BUF) - dma_buf_put(shm->dmabuf); - else - tee_shm_release(shm); + tee_shm_put(shm); } EXPORT_SYMBOL_GPL(tee_shm_free); @@ -478,10 +417,15 @@ struct tee_shm *tee_shm_get_from_id(struct tee_context *ctx, int id) teedev = ctx->teedev; mutex_lock(&teedev->mutex); shm = idr_find(&teedev->idr, id); + /* + * If the tee_shm was found in the IDR it must have a refcount + * larger than 0 due to the guarantee in tee_shm_put() below. So + * it's safe to use refcount_inc(). + */ if (!shm || shm->ctx != ctx) shm = ERR_PTR(-EINVAL); - else if (shm->flags & TEE_SHM_DMA_BUF) - get_dma_buf(shm->dmabuf); + else + refcount_inc(&shm->refcount); mutex_unlock(&teedev->mutex); return shm; } @@ -493,7 +437,24 @@ EXPORT_SYMBOL_GPL(tee_shm_get_from_id); */ void tee_shm_put(struct tee_shm *shm) { - if (shm->flags & TEE_SHM_DMA_BUF) - dma_buf_put(shm->dmabuf); + struct tee_device *teedev = shm->ctx->teedev; + bool do_release = false; + + mutex_lock(&teedev->mutex); + if (refcount_dec_and_test(&shm->refcount)) { + /* + * refcount has reached 0, we must now remove it from the + * IDR before releasing the mutex. This will guarantee that + * the refcount_inc() in tee_shm_get_from_id() never starts + * from 0. + */ + if (shm->flags & TEE_SHM_DMA_BUF) + idr_remove(&teedev->idr, shm->id); + do_release = true; + } + mutex_unlock(&teedev->mutex); + + if (do_release) + tee_shm_release(teedev, shm); } EXPORT_SYMBOL_GPL(tee_shm_put); diff --git a/include/linux/tee_drv.h b/include/linux/tee_drv.h index 459e9a76d7e6..0c6c1de6f3b7 100644 --- a/include/linux/tee_drv.h +++ b/include/linux/tee_drv.h @@ -195,7 +195,7 @@ int tee_session_calc_client_uuid(uuid_t *uuid, u32 connection_method, * @offset: offset of buffer in user space * @pages: locked pages from userspace * @num_pages: number of locked pages - * @dmabuf: dmabuf used to for exporting to user space + * @refcount: reference counter * @flags: defined by TEE_SHM_* in tee_drv.h * @id: unique id of a shared memory object on this device * @@ -210,7 +210,7 @@ struct tee_shm { unsigned int offset; struct page **pages; size_t num_pages; - struct dma_buf *dmabuf; + refcount_t refcount; u32 flags; int id; }; -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Jens Wiklander <jens.wiklander@linaro.org>
stable inclusion from stable-v5.10.89 commit c05d8f66ec3470e5212c4d08c46d6cb5738d600d issue: #I4RVJ4 CVE: CVE-2021-44733
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
commit dfd0743f1d9ea76931510ed150334d571fbab49d upstream.
Since the tee subsystem does not keep a strong reference to its idle shared memory buffers, it races with other threads that try to destroy a shared memory through a close of its dma-buf fd or by unmapping the memory.
In tee_shm_get_from_id() when a lookup in teedev->idr has been successful, it is possible that the tee_shm is in the dma-buf teardown path, but that path is blocked by the teedev mutex. Since we don't have an API to tell if the tee_shm is in the dma-buf teardown path or not we must find another way of detecting this condition.
Fix this by doing the reference counting directly on the tee_shm using a new refcount_t refcount field. dma-buf is replaced by using anon_inode_getfd() instead, this separates the life-cycle of the underlying file from the tee_shm. tee_shm_put() is updated to hold the mutex when decreasing the refcount to 0 and then remove the tee_shm from teedev->idr before releasing the mutex. This means that the tee_shm can never be found unless it has a refcount larger than 0.
Fixes: 967c9cca2cc5 ("tee: generic TEE subsystem") Cc: stable@vger.kernel.org Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Lars Persson <larper@axis.com> Reviewed-by: Sumit Garg <sumit.garg@linaro.org> Reported-by: Patrik Lantz <patrik.lantz@axis.com> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- drivers/tee/tee_shm.c | 171 ++++++++++++++++------------------------ include/linux/tee_drv.h | 4 +- 2 files changed, 68 insertions(+), 107 deletions(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Chao Yu <chao@kernel.org> maillist inclusion category: bugfix issue: #I4RVJ4 CVE: CVE-2021-45469 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/commit/?h=dev&id=5598b24efaf4892741c798b425d543e4bed357a1 Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- As Wenqing Liu reported in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215235 - Overview page fault in f2fs_setxattr() when mount and operate on corrupted image - Reproduce tested on kernel 5.16-rc3, 5.15.X under root 1. unzip tmp7.zip 2. ./single.sh f2fs 7 Sometimes need to run the script several times - Kernel dump loop0: detected capacity change from 0 to 131072 F2FS-fs (loop0): Found nat_bits in checkpoint F2FS-fs (loop0): Mounted with checkpoint version = 7548c2ee BUG: unable to handle page fault for address: ffffe47bc7123f48 RIP: 0010:kfree+0x66/0x320 Call Trace: __f2fs_setxattr+0x2aa/0xc00 [f2fs] f2fs_setxattr+0xfa/0x480 [f2fs] __f2fs_set_acl+0x19b/0x330 [f2fs] __vfs_removexattr+0x52/0x70 __vfs_removexattr_locked+0xb1/0x140 vfs_removexattr+0x56/0x100 removexattr+0x57/0x80 path_removexattr+0xa3/0xc0 __x64_sys_removexattr+0x17/0x20 do_syscall_64+0x37/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae The root cause is in __f2fs_setxattr(), we missed to do sanity check on last xattr entry, result in out-of-bound memory access during updating inconsistent xattr data of target inode. After the fix, it can detect such xattr inconsistency as below: F2FS-fs (loop11): inode (7) has invalid last xattr entry, entry_size: 60676 F2FS-fs (loop11): inode (8) has corrupted xattr F2FS-fs (loop11): inode (8) has corrupted xattr F2FS-fs (loop11): inode (8) has invalid last xattr entry, entry_size: 47736 Cc: stable@vger.kernel.org Reported-by: Wenqing Liu <wenqingliu0120@gmail.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Guo Xuenan <guoxuenan@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- fs/f2fs/xattr.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/f2fs/xattr.c b/fs/f2fs/xattr.c index 65afcc3cc68a..f44c60114379 100644 --- a/fs/f2fs/xattr.c +++ b/fs/f2fs/xattr.c @@ -680,8 +680,17 @@ static int __f2fs_setxattr(struct inode *inode, int index, } last = here; - while (!IS_XATTR_LAST_ENTRY(last)) + while (!IS_XATTR_LAST_ENTRY(last)) { + if ((void *)(last) + sizeof(__u32) > last_base_addr || + (void *)XATTR_NEXT_ENTRY(last) > last_base_addr) { + f2fs_err(F2FS_I_SB(inode), "inode (%lu) has invalid last xattr entry, entry_size: %zu", + inode->i_ino, ENTRY_SIZE(last)); + set_sbi_flag(F2FS_I_SB(inode), SBI_NEED_FSCK); + error = -EFSCORRUPTED; + goto exit; + } last = XATTR_NEXT_ENTRY(last); + } newsize = XATTR_ALIGN(sizeof(struct f2fs_xattr_entry) + len + size); -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Chao Yu <chao@kernel.org>
maillist inclusion category: bugfix issue: #I4RVJ4 CVE: CVE-2021-45469
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
As Wenqing Liu reported in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215235
- Overview page fault in f2fs_setxattr() when mount and operate on corrupted image
- Reproduce tested on kernel 5.16-rc3, 5.15.X under root
1. unzip tmp7.zip 2. ./single.sh f2fs 7
Sometimes need to run the script several times
- Kernel dump loop0: detected capacity change from 0 to 131072 F2FS-fs (loop0): Found nat_bits in checkpoint F2FS-fs (loop0): Mounted with checkpoint version = 7548c2ee BUG: unable to handle page fault for address: ffffe47bc7123f48 RIP: 0010:kfree+0x66/0x320 Call Trace: __f2fs_setxattr+0x2aa/0xc00 [f2fs] f2fs_setxattr+0xfa/0x480 [f2fs] __f2fs_set_acl+0x19b/0x330 [f2fs] __vfs_removexattr+0x52/0x70 __vfs_removexattr_locked+0xb1/0x140 vfs_removexattr+0x56/0x100 removexattr+0x57/0x80 path_removexattr+0xa3/0xc0 __x64_sys_removexattr+0x17/0x20 do_syscall_64+0x37/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae
The root cause is in __f2fs_setxattr(), we missed to do sanity check on last xattr entry, result in out-of-bound memory access during updating inconsistent xattr data of target inode.
After the fix, it can detect such xattr inconsistency as below:
F2FS-fs (loop11): inode (7) has invalid last xattr entry, entry_size: 60676 F2FS-fs (loop11): inode (8) has corrupted xattr F2FS-fs (loop11): inode (8) has corrupted xattr F2FS-fs (loop11): inode (8) has invalid last xattr entry, entry_size: 47736
Cc: stable@vger.kernel.org Reported-by: Wenqing Liu <wenqingliu0120@gmail.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Guo Xuenan <guoxuenan@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- fs/f2fs/xattr.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Michal Koutný <mkoutny@suse.com> mainline inclusion from mainline-v5.12-rc1 commit da70862efe0065bada33d67a903270cdbbaf07d9 issue: #I4RVJ4 CVE: CVE-2021-4197 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- The functions cgroup_threads_write and cgroup_procs_write are almost identical. In order to reduce duplication, factor out the common code in similar fashion we already do for other threadgroup/task functions. No functional changes are intended. Suggested-by: Hao Lee <haolee.swjtu@gmail.com> Signed-off-by: Michal Koutný <mkoutny@suse.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup.c | 55 +++++++++++------------------------------- 1 file changed, 14 insertions(+), 41 deletions(-) diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 350297ad63f1..3b9b9a18e901 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -4749,8 +4749,8 @@ static int cgroup_attach_permissions(struct cgroup *src_cgrp, return ret; } -static ssize_t cgroup_procs_write(struct kernfs_open_file *of, - char *buf, size_t nbytes, loff_t off) +static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf, + bool threadgroup) { struct cgroup *src_cgrp, *dst_cgrp; struct task_struct *task; @@ -4761,7 +4761,7 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of, if (!dst_cgrp) return -ENODEV; - task = cgroup_procs_write_start(buf, true, &locked); + task = cgroup_procs_write_start(buf, threadgroup, &locked); ret = PTR_ERR_OR_ZERO(task); if (ret) goto out_unlock; @@ -4771,19 +4771,26 @@ static ssize_t cgroup_procs_write(struct kernfs_open_file *of, src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root); spin_unlock_irq(&css_set_lock); + /* process and thread migrations follow same delegation rule */ ret = cgroup_attach_permissions(src_cgrp, dst_cgrp, - of->file->f_path.dentry->d_sb, true); + of->file->f_path.dentry->d_sb, threadgroup); if (ret) goto out_finish; - ret = cgroup_attach_task(dst_cgrp, task, true); + ret = cgroup_attach_task(dst_cgrp, task, threadgroup); out_finish: cgroup_procs_write_finish(task, locked); out_unlock: cgroup_kn_unlock(of->kn); - return ret ?: nbytes; + return ret; +} + +static ssize_t cgroup_procs_write(struct kernfs_open_file *of, + char *buf, size_t nbytes, loff_t off) +{ + return __cgroup_procs_write(of, buf, true) ?: nbytes; } static void *cgroup_threads_start(struct seq_file *s, loff_t *pos) @@ -4794,41 +4801,7 @@ static void *cgroup_threads_start(struct seq_file *s, loff_t *pos) static ssize_t cgroup_threads_write(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off) { - struct cgroup *src_cgrp, *dst_cgrp; - struct task_struct *task; - ssize_t ret; - bool locked; - - buf = strstrip(buf); - - dst_cgrp = cgroup_kn_lock_live(of->kn, false); - if (!dst_cgrp) - return -ENODEV; - - task = cgroup_procs_write_start(buf, false, &locked); - ret = PTR_ERR_OR_ZERO(task); - if (ret) - goto out_unlock; - - /* find the source cgroup */ - spin_lock_irq(&css_set_lock); - src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root); - spin_unlock_irq(&css_set_lock); - - /* thread migrations follow the cgroup.procs delegation rule */ - ret = cgroup_attach_permissions(src_cgrp, dst_cgrp, - of->file->f_path.dentry->d_sb, false); - if (ret) - goto out_finish; - - ret = cgroup_attach_task(dst_cgrp, task, false); - -out_finish: - cgroup_procs_write_finish(task, locked); -out_unlock: - cgroup_kn_unlock(of->kn); - - return ret ?: nbytes; + return __cgroup_procs_write(of, buf, false) ?: nbytes; } /* cgroup core interface files for the default hierarchy */ -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Michal Koutný <mkoutny@suse.com>
mainline inclusion from mainline-v5.12-rc1 commit da70862efe0065bada33d67a903270cdbbaf07d9 issue: #I4RVJ4 CVE: CVE-2021-4197
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
The functions cgroup_threads_write and cgroup_procs_write are almost identical. In order to reduce duplication, factor out the common code in similar fashion we already do for other threadgroup/task functions. No functional changes are intended.
Suggested-by: Hao Lee <haolee.swjtu@gmail.com> Signed-off-by: Michal Koutný <mkoutny@suse.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup.c | 55 +++++++++++------------------------------- 1 file changed, 14 insertions(+), 41 deletions(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Hui Su <sh_def@163.com> mainline inclusion from mainline-v5.11-rc2 commit 5a7b5f32c5aa628841502d19a813c633ff6ecbe4 issue: #I4RVJ4 CVE: CVE-2021-4197 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- we have supplied the inline function: of_cft() in cgroup.h. So replace the direct use 'of->kn->priv' with inline func of_cft(), which is more readable. Signed-off-by: Hui Su <sh_def@163.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 3b9b9a18e901..cd573334ce8d 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -3677,7 +3677,7 @@ static ssize_t cgroup_freeze_write(struct kernfs_open_file *of, static int cgroup_file_open(struct kernfs_open_file *of) { - struct cftype *cft = of->kn->priv; + struct cftype *cft = of_cft(of); if (cft->open) return cft->open(of); @@ -3686,7 +3686,7 @@ static int cgroup_file_open(struct kernfs_open_file *of) static void cgroup_file_release(struct kernfs_open_file *of) { - struct cftype *cft = of->kn->priv; + struct cftype *cft = of_cft(of); if (cft->release) cft->release(of); @@ -3697,7 +3697,7 @@ static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf, { struct cgroup_namespace *ns = current->nsproxy->cgroup_ns; struct cgroup *cgrp = of->kn->parent->priv; - struct cftype *cft = of->kn->priv; + struct cftype *cft = of_cft(of); struct cgroup_subsys_state *css; int ret; @@ -3747,7 +3747,7 @@ static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf, static __poll_t cgroup_file_poll(struct kernfs_open_file *of, poll_table *pt) { - struct cftype *cft = of->kn->priv; + struct cftype *cft = of_cft(of); if (cft->poll) return cft->poll(of, pt); -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Hui Su <sh_def@163.com>
mainline inclusion from mainline-v5.11-rc2 commit 5a7b5f32c5aa628841502d19a813c633ff6ecbe4 issue: #I4RVJ4 CVE: CVE-2021-4197
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
we have supplied the inline function: of_cft() in cgroup.h.
So replace the direct use 'of->kn->priv' with inline func of_cft(), which is more readable.
Signed-off-by: Hui Su <sh_def@163.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Tejun Heo <tj@kernel.org> mainline inclusion from mainline commit 1756d7994ad85c2479af6ae5a9750b92324685af issue: #I4RVJ4 CVE: CVE-2021-4197 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- cgroup process migration permission checks are performed at write time as whether a given operation is allowed or not is dependent on the content of the write - the PID. This currently uses current's credentials which is a potential security weakness as it may allow scenarios where a less privileged process tricks a more privileged one into writing into a fd that it created. This patch makes both cgroup2 and cgroup1 process migration interfaces to use the credentials saved at the time of open (file->f_cred) instead of current's. Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Fixes: 187fe84067bd ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy") Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> Conflict: kernel/cgroup/cgroup-v1.c Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup-v1.c | 7 ++++--- kernel/cgroup/cgroup.c | 9 ++++++++- 2 files changed, 12 insertions(+), 4 deletions(-) diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c index 88b72815aa9d..2b0dc44e2447 100644 --- a/kernel/cgroup/cgroup-v1.c +++ b/kernel/cgroup/cgroup-v1.c @@ -503,10 +503,11 @@ static ssize_t __cgroup1_procs_write(struct kernfs_open_file *of, goto out_unlock; /* - * Even if we're attaching all tasks in the thread group, we only - * need to check permissions on one of them. + * Even if we're attaching all tasks in the thread group, we only need + * to check permissions on one of them. Check permissions using the + * credentials from file open to protect against inherited fd attacks. */ - cred = current_cred(); + cred = of->file->f_cred; tcred = get_task_cred(task); if (!uid_eq(cred->euid, GLOBAL_ROOT_UID) && !uid_eq(cred->euid, tcred->uid) && diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index cd573334ce8d..44d58023904b 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -4754,6 +4754,7 @@ static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf, { struct cgroup *src_cgrp, *dst_cgrp; struct task_struct *task; + const struct cred *saved_cred; ssize_t ret; bool locked; @@ -4771,9 +4772,15 @@ static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf, src_cgrp = task_cgroup_from_root(task, &cgrp_dfl_root); spin_unlock_irq(&css_set_lock); - /* process and thread migrations follow same delegation rule */ + /* + * Process and thread migrations follow same delegation rule. Check + * permissions using the credentials from file open to protect against + * inherited fd attacks. + */ + saved_cred = override_creds(of->file->f_cred); ret = cgroup_attach_permissions(src_cgrp, dst_cgrp, of->file->f_path.dentry->d_sb, threadgroup); + revert_creds(saved_cred); if (ret) goto out_finish; -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Tejun Heo <tj@kernel.org>
mainline inclusion from mainline commit 1756d7994ad85c2479af6ae5a9750b92324685af issue: #I4RVJ4 CVE: CVE-2021-4197
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
cgroup process migration permission checks are performed at write time as whether a given operation is allowed or not is dependent on the content of the write - the PID. This currently uses current's credentials which is a potential security weakness as it may allow scenarios where a less privileged process tricks a more privileged one into writing into a fd that it created.
This patch makes both cgroup2 and cgroup1 process migration interfaces to use the credentials saved at the time of open (file->f_cred) instead of current's.
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Fixes: 187fe84067bd ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy") Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> Conflict: kernel/cgroup/cgroup-v1.c Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup-v1.c | 7 ++++--- kernel/cgroup/cgroup.c | 9 ++++++++- 2 files changed, 12 insertions(+), 4 deletions(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Tejun Heo <tj@kernel.org> mainline inclusion from mainline commit 0d2b5955b36250a9428c832664f2079cbf723bec issue: #I4RVJ4 CVE: CVE-2021-4197 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- of->priv is currently used by each interface file implementation to store private information. This patch collects the current two private data usages into struct cgroup_file_ctx which is allocated and freed by the common path. This allows generic private data which applies to multiple files, which will be used to in the following patch. Note that cgroup_procs iterator is now embedded as procs.iter in the new cgroup_file_ctx so that it doesn't need to be allocated and freed separately. v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in cgroup_file_ctx as suggested by Linus. v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too. Converted. Didn't change to embedded allocation as cgroup1 pidlists get stored for caching. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Michal Koutný <mkoutny@suse.com> Conflict: kernel/cgroup/cgroup.c Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup-internal.h | 17 +++++++++++ kernel/cgroup/cgroup-v1.c | 26 ++++++++-------- kernel/cgroup/cgroup.c | 53 +++++++++++++++++++++------------ 3 files changed, 65 insertions(+), 31 deletions(-) diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h index bfbeabc17a9d..cf637bc4ab45 100644 --- a/kernel/cgroup/cgroup-internal.h +++ b/kernel/cgroup/cgroup-internal.h @@ -65,6 +65,23 @@ static inline struct cgroup_fs_context *cgroup_fc2context(struct fs_context *fc) return container_of(kfc, struct cgroup_fs_context, kfc); } +struct cgroup_pidlist; + +struct cgroup_file_ctx { + struct { + void *trigger; + } psi; + + struct { + bool started; + struct css_task_iter iter; + } procs; + + struct { + struct cgroup_pidlist *pidlist; + } procs1; +}; + /* * A cgroup can be associated with multiple css_sets as different tasks may * belong to different cgroups on different hierarchies. In the other diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c index 2b0dc44e2447..1805c682ccc3 100644 --- a/kernel/cgroup/cgroup-v1.c +++ b/kernel/cgroup/cgroup-v1.c @@ -393,6 +393,7 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos) * next pid to display, if any */ struct kernfs_open_file *of = s->private; + struct cgroup_file_ctx *ctx = of->priv; struct cgroup *cgrp = seq_css(s)->cgroup; struct cgroup_pidlist *l; enum cgroup_filetype type = seq_cft(s)->private; @@ -402,25 +403,24 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos) mutex_lock(&cgrp->pidlist_mutex); /* - * !NULL @of->priv indicates that this isn't the first start() - * after open. If the matching pidlist is around, we can use that. - * Look for it. Note that @of->priv can't be used directly. It - * could already have been destroyed. + * !NULL @ctx->procs1.pidlist indicates that this isn't the first + * start() after open. If the matching pidlist is around, we can use + * that. Look for it. Note that @ctx->procs1.pidlist can't be used + * directly. It could already have been destroyed. */ - if (of->priv) - of->priv = cgroup_pidlist_find(cgrp, type); + if (ctx->procs1.pidlist) + ctx->procs1.pidlist = cgroup_pidlist_find(cgrp, type); /* * Either this is the first start() after open or the matching * pidlist has been destroyed inbetween. Create a new one. */ - if (!of->priv) { - ret = pidlist_array_load(cgrp, type, - (struct cgroup_pidlist **)&of->priv); + if (!ctx->procs1.pidlist) { + ret = pidlist_array_load(cgrp, type, &ctx->procs1.pidlist); if (ret) return ERR_PTR(ret); } - l = of->priv; + l = ctx->procs1.pidlist; if (pid) { int end = l->length; @@ -448,7 +448,8 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos) static void cgroup_pidlist_stop(struct seq_file *s, void *v) { struct kernfs_open_file *of = s->private; - struct cgroup_pidlist *l = of->priv; + struct cgroup_file_ctx *ctx = of->priv; + struct cgroup_pidlist *l = ctx->procs1.pidlist; if (l) mod_delayed_work(cgroup_pidlist_destroy_wq, &l->destroy_dwork, @@ -459,7 +460,8 @@ static void cgroup_pidlist_stop(struct seq_file *s, void *v) static void *cgroup_pidlist_next(struct seq_file *s, void *v, loff_t *pos) { struct kernfs_open_file *of = s->private; - struct cgroup_pidlist *l = of->priv; + struct cgroup_file_ctx *ctx = of->priv; + struct cgroup_pidlist *l = ctx->procs1.pidlist; pid_t *p = v; pid_t *end = l->list + l->length; /* diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 44d58023904b..6f3501869a16 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -3583,6 +3583,7 @@ static int cgroup_cpu_pressure_show(struct seq_file *seq, void *v) static ssize_t cgroup_pressure_write(struct kernfs_open_file *of, char *buf, size_t nbytes, enum psi_res res) { + struct cgroup_file_ctx *ctx = of->priv; struct psi_trigger *new; struct cgroup *cgrp; struct psi_group *psi; @@ -3601,7 +3602,7 @@ static ssize_t cgroup_pressure_write(struct kernfs_open_file *of, char *buf, return PTR_ERR(new); } - psi_trigger_replace(&of->priv, new); + psi_trigger_replace(&ctx->psi.trigger, new); cgroup_put(cgrp); @@ -3632,12 +3633,16 @@ static ssize_t cgroup_cpu_pressure_write(struct kernfs_open_file *of, static __poll_t cgroup_pressure_poll(struct kernfs_open_file *of, poll_table *pt) { - return psi_trigger_poll(&of->priv, of->file, pt); + struct cgroup_file_ctx *ctx = of->priv; + + return psi_trigger_poll(&ctx->psi.trigger, of->file, pt); } static void cgroup_pressure_release(struct kernfs_open_file *of) { - psi_trigger_replace(&of->priv, NULL); + struct cgroup_file_ctx *ctx = of->priv; + + psi_trigger_replace(&ctx->psi.trigger, NULL); } #endif /* CONFIG_PSI */ @@ -3678,18 +3683,31 @@ static ssize_t cgroup_freeze_write(struct kernfs_open_file *of, static int cgroup_file_open(struct kernfs_open_file *of) { struct cftype *cft = of_cft(of); + struct cgroup_file_ctx *ctx; + int ret; - if (cft->open) - return cft->open(of); - return 0; + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + of->priv = ctx; + + if (!cft->open) + return 0; + + ret = cft->open(of); + if (ret) + kfree(ctx); + return ret; } static void cgroup_file_release(struct kernfs_open_file *of) { struct cftype *cft = of_cft(of); + struct cgroup_file_ctx *ctx = of->priv; if (cft->release) cft->release(of); + kfree(ctx); } static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf, @@ -4613,21 +4631,21 @@ void css_task_iter_end(struct css_task_iter *it) static void cgroup_procs_release(struct kernfs_open_file *of) { - if (of->priv) { - css_task_iter_end(of->priv); - kfree(of->priv); - } + struct cgroup_file_ctx *ctx = of->priv; + + if (ctx->procs.started) + css_task_iter_end(&ctx->procs.iter); } static void *cgroup_procs_next(struct seq_file *s, void *v, loff_t *pos) { struct kernfs_open_file *of = s->private; - struct css_task_iter *it = of->priv; + struct cgroup_file_ctx *ctx = of->priv; if (pos) (*pos)++; - return css_task_iter_next(it); + return css_task_iter_next(&ctx->procs.iter); } static void *__cgroup_procs_start(struct seq_file *s, loff_t *pos, @@ -4635,21 +4653,18 @@ static void *__cgroup_procs_start(struct seq_file *s, loff_t *pos, { struct kernfs_open_file *of = s->private; struct cgroup *cgrp = seq_css(s)->cgroup; - struct css_task_iter *it = of->priv; + struct cgroup_file_ctx *ctx = of->priv; + struct css_task_iter *it = &ctx->procs.iter; /* * When a seq_file is seeked, it's always traversed sequentially * from position 0, so we can simply keep iterating on !0 *pos. */ - if (!it) { + if (!ctx->procs.started) { if (WARN_ON_ONCE((*pos))) return ERR_PTR(-EINVAL); - - it = kzalloc(sizeof(*it), GFP_KERNEL); - if (!it) - return ERR_PTR(-ENOMEM); - of->priv = it; css_task_iter_start(&cgrp->self, iter_flags, it); + ctx->procs.started = true; } else if (!(*pos)) { css_task_iter_end(it); css_task_iter_start(&cgrp->self, iter_flags, it); -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Tejun Heo <tj@kernel.org>
mainline inclusion from mainline commit 0d2b5955b36250a9428c832664f2079cbf723bec issue: #I4RVJ4 CVE: CVE-2021-4197
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
of->priv is currently used by each interface file implementation to store private information. This patch collects the current two private data usages into struct cgroup_file_ctx which is allocated and freed by the common path. This allows generic private data which applies to multiple files, which will be used to in the following patch.
Note that cgroup_procs iterator is now embedded as procs.iter in the new cgroup_file_ctx so that it doesn't need to be allocated and freed separately.
v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in cgroup_file_ctx as suggested by Linus.
v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too. Converted. Didn't change to embedded allocation as cgroup1 pidlists get stored for caching.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Michal Koutný <mkoutny@suse.com> Conflict: kernel/cgroup/cgroup.c Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup-internal.h | 17 +++++++++++ kernel/cgroup/cgroup-v1.c | 26 ++++++++-------- kernel/cgroup/cgroup.c | 53 +++++++++++++++++++++------------ 3 files changed, 65 insertions(+), 31 deletions(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Tejun Heo <tj@kernel.org> mainline inclusion from mainline commit e57457641613fef0d147ede8bd6a3047df588b95 issue: #I4RVJ4 CVE: CVE-2021-4197 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- cgroup process migration permission checks are performed at write time as whether a given operation is allowed or not is dependent on the content of the write - the PID. This currently uses current's cgroup namespace which is a potential security weakness as it may allow scenarios where a less privileged process tricks a more privileged one into writing into a fd that it created. This patch makes cgroup remember the cgroup namespace at the time of open and uses it for migration permission checks instad of current's. Note that this only applies to cgroup2 as cgroup1 doesn't have namespace support. This also fixes a use-after-free bug on cgroupns reported in https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com Note that backporting this fix also requires the preceding patch. Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Reported-by: syzbot+50f5cf33a284ce738b62@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com Fixes: 5136f6365ce3 ("cgroup: implement "nsdelegate" mount option") Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup-internal.h | 2 ++ kernel/cgroup/cgroup.c | 28 +++++++++++++++++++--------- 2 files changed, 21 insertions(+), 9 deletions(-) diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h index cf637bc4ab45..6e36e854b512 100644 --- a/kernel/cgroup/cgroup-internal.h +++ b/kernel/cgroup/cgroup-internal.h @@ -68,6 +68,8 @@ static inline struct cgroup_fs_context *cgroup_fc2context(struct fs_context *fc) struct cgroup_pidlist; struct cgroup_file_ctx { + struct cgroup_namespace *ns; + struct { void *trigger; } psi; diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 6f3501869a16..c2d4e00d21d7 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -3689,14 +3689,19 @@ static int cgroup_file_open(struct kernfs_open_file *of) ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); if (!ctx) return -ENOMEM; + + ctx->ns = current->nsproxy->cgroup_ns; + get_cgroup_ns(ctx->ns); of->priv = ctx; if (!cft->open) return 0; ret = cft->open(of); - if (ret) + if (ret) { + put_cgroup_ns(ctx->ns); kfree(ctx); + } return ret; } @@ -3707,13 +3712,14 @@ static void cgroup_file_release(struct kernfs_open_file *of) if (cft->release) cft->release(of); + put_cgroup_ns(ctx->ns); kfree(ctx); } static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off) { - struct cgroup_namespace *ns = current->nsproxy->cgroup_ns; + struct cgroup_file_ctx *ctx = of->priv; struct cgroup *cgrp = of->kn->parent->priv; struct cftype *cft = of_cft(of); struct cgroup_subsys_state *css; @@ -3730,7 +3736,7 @@ static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf, */ if ((cgrp->root->flags & CGRP_ROOT_NS_DELEGATE) && !(cft->flags & CFTYPE_NS_DELEGATABLE) && - ns != &init_cgroup_ns && ns->root_cset->dfl_cgrp == cgrp) + ctx->ns != &init_cgroup_ns && ctx->ns->root_cset->dfl_cgrp == cgrp) return -EPERM; if (cft->write) @@ -4715,9 +4721,9 @@ static int cgroup_may_write(const struct cgroup *cgrp, struct super_block *sb) static int cgroup_procs_write_permission(struct cgroup *src_cgrp, struct cgroup *dst_cgrp, - struct super_block *sb) + struct super_block *sb, + struct cgroup_namespace *ns) { - struct cgroup_namespace *ns = current->nsproxy->cgroup_ns; struct cgroup *com_cgrp = src_cgrp; int ret; @@ -4746,11 +4752,12 @@ static int cgroup_procs_write_permission(struct cgroup *src_cgrp, static int cgroup_attach_permissions(struct cgroup *src_cgrp, struct cgroup *dst_cgrp, - struct super_block *sb, bool threadgroup) + struct super_block *sb, bool threadgroup, + struct cgroup_namespace *ns) { int ret = 0; - ret = cgroup_procs_write_permission(src_cgrp, dst_cgrp, sb); + ret = cgroup_procs_write_permission(src_cgrp, dst_cgrp, sb, ns); if (ret) return ret; @@ -4767,6 +4774,7 @@ static int cgroup_attach_permissions(struct cgroup *src_cgrp, static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf, bool threadgroup) { + struct cgroup_file_ctx *ctx = of->priv; struct cgroup *src_cgrp, *dst_cgrp; struct task_struct *task; const struct cred *saved_cred; @@ -4794,7 +4802,8 @@ static ssize_t __cgroup_procs_write(struct kernfs_open_file *of, char *buf, */ saved_cred = override_creds(of->file->f_cred); ret = cgroup_attach_permissions(src_cgrp, dst_cgrp, - of->file->f_path.dentry->d_sb, threadgroup); + of->file->f_path.dentry->d_sb, + threadgroup, ctx->ns); revert_creds(saved_cred); if (ret) goto out_finish; @@ -5994,7 +6003,8 @@ static int cgroup_css_set_fork(struct kernel_clone_args *kargs) goto err; ret = cgroup_attach_permissions(cset->dfl_cgrp, dst_cgrp, sb, - !(kargs->flags & CLONE_THREAD)); + !(kargs->flags & CLONE_THREAD), + current->nsproxy->cgroup_ns); if (ret) goto err; -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Tejun Heo <tj@kernel.org>
mainline inclusion from mainline commit e57457641613fef0d147ede8bd6a3047df588b95 issue: #I4RVJ4 CVE: CVE-2021-4197
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
cgroup process migration permission checks are performed at write time as whether a given operation is allowed or not is dependent on the content of the write - the PID. This currently uses current's cgroup namespace which is a potential security weakness as it may allow scenarios where a less privileged process tricks a more privileged one into writing into a fd that it created.
This patch makes cgroup remember the cgroup namespace at the time of open and uses it for migration permission checks instad of current's. Note that this only applies to cgroup2 as cgroup1 doesn't have namespace support.
This also fixes a use-after-free bug on cgroupns reported in
https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com
Note that backporting this fix also requires the preceding patch.
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Reported-by: syzbot+50f5cf33a284ce738b62@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com Fixes: 5136f6365ce3 ("cgroup: implement "nsdelegate" mount option") Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Lu Jialin <lujialin4@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- kernel/cgroup/cgroup-internal.h | 2 ++ kernel/cgroup/cgroup.c | 28 +++++++++++++++++++--------- 2 files changed, 21 insertions(+), 9 deletions(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Hangyu Hua <hbh25y@gmail.com> mainline inclusion from mainline-v5.16-rc6 commit 5f9562ebe710c307adc5f666bf1a2162ee7977c0 issue: #I4RVJ4 CVE: CVE-2021-45480 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- __rds_conn_create() did not release conn->c_path when loop_trans != 0 and trans->t_prefer_loopback != 0 and is_outgoing == 0. Fixes: aced3ce57cd3 ("RDS tcp loopback connection can hang") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Sharath Srinivasan <sharath.srinivasan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Baisong Zhong zhongbaisong@huawei.com Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/rds/connection.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/rds/connection.c b/net/rds/connection.c index a3bc4b54d491..b4cc699c5fad 100644 --- a/net/rds/connection.c +++ b/net/rds/connection.c @@ -253,6 +253,7 @@ static struct rds_connection *__rds_conn_create(struct net *net, * should end up here, but if it * does, reset/destroy the connection. */ + kfree(conn->c_path); kmem_cache_free(rds_conn_slab, conn); conn = ERR_PTR(-EOPNOTSUPP); goto out; -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Hangyu Hua <hbh25y@gmail.com>
mainline inclusion from mainline-v5.16-rc6 commit 5f9562ebe710c307adc5f666bf1a2162ee7977c0 issue: #I4RVJ4 CVE: CVE-2021-45480
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
__rds_conn_create() did not release conn->c_path when loop_trans != 0 and trans->t_prefer_loopback != 0 and is_outgoing == 0.
Fixes: aced3ce57cd3 ("RDS tcp loopback connection can hang") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Sharath Srinivasan <sharath.srinivasan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Baisong Zhong zhongbaisong@huawei.com Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Lin Ma <linma@zju.edu.cn> stable inclusion form stable-v5.10.82 commit cb14b196d991c864ed2d1b6e79d68a7ce38e6538 issue: #I4RVJ4 CVE: CVE-2021-4202 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- [ Upstream commit 86cdf8e38792545161dbe3350a7eced558ba4d15 ] There is a possible data race as shown below: thread-A in nci_request() | thread-B in nci_close_device() | mutex_lock(&ndev->req_lock); test_bit(NCI_UP, &ndev->flags); | ... | test_and_clear_bit(NCI_UP, &ndev->flags) mutex_lock(&ndev->req_lock); | | This race will allow __nci_request() to be awaked while the device is getting removed. Similar to commit e2cb6b891ad2 ("bluetooth: eliminate the potential race condition when removing the HCI controller"). this patch alters the function sequence in nci_request() to prevent the data races between the nci_close_device(). Signed-off-by: Lin Ma <linma@zju.edu.cn> Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Link: https://lore.kernel.org/r/20211115145600.8320-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/nfc/nci/core.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c index 32e8154363ca..5e55cb6c087a 100644 --- a/net/nfc/nci/core.c +++ b/net/nfc/nci/core.c @@ -144,12 +144,15 @@ inline int nci_request(struct nci_dev *ndev, { int rc; - if (!test_bit(NCI_UP, &ndev->flags)) - return -ENETDOWN; - /* Serialize all requests */ mutex_lock(&ndev->req_lock); - rc = __nci_request(ndev, req, opt, timeout); + /* check the state after obtaing the lock against any races + * from nci_close_device when the device gets removed. + */ + if (test_bit(NCI_UP, &ndev->flags)) + rc = __nci_request(ndev, req, opt, timeout); + else + rc = -ENETDOWN; mutex_unlock(&ndev->req_lock); return rc; -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Lin Ma <linma@zju.edu.cn>
stable inclusion form stable-v5.10.82 commit cb14b196d991c864ed2d1b6e79d68a7ce38e6538 issue: #I4RVJ4 CVE: CVE-2021-4202
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
[ Upstream commit 86cdf8e38792545161dbe3350a7eced558ba4d15 ]
There is a possible data race as shown below:
thread-A in nci_request() | thread-B in nci_close_device() | mutex_lock(&ndev->req_lock); test_bit(NCI_UP, &ndev->flags); | ... | test_and_clear_bit(NCI_UP, &ndev->flags) mutex_lock(&ndev->req_lock); | |
This race will allow __nci_request() to be awaked while the device is getting removed.
Similar to commit e2cb6b891ad2 ("bluetooth: eliminate the potential race condition when removing the HCI controller"). this patch alters the function sequence in nci_request() to prevent the data races between the nci_close_device().
Signed-off-by: Lin Ma <linma@zju.edu.cn> Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Link: https://lore.kernel.org/r/20211115145600.8320-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> ---
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Lin Ma <linma@zju.edu.cn> stable inclusion form stable-v5.10.82 commit 73a0d12114b4bc1a9def79a623264754b9df698e issue: #I4RVJ4 CVE: CVE-2021-4202 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- [ Upstream commit 3e3b5dfcd16a3e254aab61bd1e8c417dd4503102 ] There is a potential UAF between the unregistration routine and the NFC netlink operations. The race that cause that UAF can be shown as below: (FREE) | (USE) nfcmrvl_nci_unregister_dev | nfc_genl_dev_up nci_close_device | nci_unregister_device | nfc_get_device nfc_unregister_device | nfc_dev_up rfkill_destory | device_del | rfkill_blocked ... | ... The root cause for this race is concluded below: 1. The rfkill_blocked (USE) in nfc_dev_up is supposed to be placed after the device_is_registered check. 2. Since the netlink operations are possible just after the device_add in nfc_register_device, the nfc_dev_up() can happen anywhere during the rfkill creation process, which leads to data race. This patch reorder these actions to permit 1. Once device_del is finished, the nfc_dev_up cannot dereference the rfkill object. 2. The rfkill_register need to be placed after the device_add of nfc_dev because the parent device need to be created first. So this patch keeps the order but inject device_lock to prevent the data race. Signed-off-by: Lin Ma <linma@zju.edu.cn> Fixes: be055b2f89b5 ("NFC: RFKILL support") Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211116152652.19217-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/nfc/core.c | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/net/nfc/core.c b/net/nfc/core.c index eb377f87bcae..6800470dd6df 100644 --- a/net/nfc/core.c +++ b/net/nfc/core.c @@ -94,13 +94,13 @@ int nfc_dev_up(struct nfc_dev *dev) device_lock(&dev->dev); - if (dev->rfkill && rfkill_blocked(dev->rfkill)) { - rc = -ERFKILL; + if (!device_is_registered(&dev->dev)) { + rc = -ENODEV; goto error; } - if (!device_is_registered(&dev->dev)) { - rc = -ENODEV; + if (dev->rfkill && rfkill_blocked(dev->rfkill)) { + rc = -ERFKILL; goto error; } @@ -1117,11 +1117,7 @@ int nfc_register_device(struct nfc_dev *dev) if (rc) pr_err("Could not register llcp device\n"); - rc = nfc_genl_device_added(dev); - if (rc) - pr_debug("The userspace won't be notified that the device %s was added\n", - dev_name(&dev->dev)); - + device_lock(&dev->dev); dev->rfkill = rfkill_alloc(dev_name(&dev->dev), &dev->dev, RFKILL_TYPE_NFC, &nfc_rfkill_ops, dev); if (dev->rfkill) { @@ -1130,6 +1126,12 @@ int nfc_register_device(struct nfc_dev *dev) dev->rfkill = NULL; } } + device_unlock(&dev->dev); + + rc = nfc_genl_device_added(dev); + if (rc) + pr_debug("The userspace won't be notified that the device %s was added\n", + dev_name(&dev->dev)); return 0; } @@ -1146,10 +1148,17 @@ void nfc_unregister_device(struct nfc_dev *dev) pr_debug("dev_name=%s\n", dev_name(&dev->dev)); + rc = nfc_genl_device_removed(dev); + if (rc) + pr_debug("The userspace won't be notified that the device %s " + "was removed\n", dev_name(&dev->dev)); + + device_lock(&dev->dev); if (dev->rfkill) { rfkill_unregister(dev->rfkill); rfkill_destroy(dev->rfkill); } + device_unlock(&dev->dev); if (dev->ops->check_presence) { device_lock(&dev->dev); @@ -1159,11 +1168,6 @@ void nfc_unregister_device(struct nfc_dev *dev) cancel_work_sync(&dev->check_pres_work); } - rc = nfc_genl_device_removed(dev); - if (rc) - pr_debug("The userspace won't be notified that the device %s " - "was removed\n", dev_name(&dev->dev)); - nfc_llcp_unregister_device(dev); mutex_lock(&nfc_devlist_mutex); -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Lin Ma <linma@zju.edu.cn>
stable inclusion form stable-v5.10.82 commit 73a0d12114b4bc1a9def79a623264754b9df698e issue: #I4RVJ4 CVE: CVE-2021-4202
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
[ Upstream commit 3e3b5dfcd16a3e254aab61bd1e8c417dd4503102 ]
There is a potential UAF between the unregistration routine and the NFC netlink operations.
The race that cause that UAF can be shown as below:
(FREE) | (USE) nfcmrvl_nci_unregister_dev | nfc_genl_dev_up nci_close_device | nci_unregister_device | nfc_get_device nfc_unregister_device | nfc_dev_up rfkill_destory | device_del | rfkill_blocked ... | ...
The root cause for this race is concluded below: 1. The rfkill_blocked (USE) in nfc_dev_up is supposed to be placed after the device_is_registered check. 2. Since the netlink operations are possible just after the device_add in nfc_register_device, the nfc_dev_up() can happen anywhere during the rfkill creation process, which leads to data race.
This patch reorder these actions to permit 1. Once device_del is finished, the nfc_dev_up cannot dereference the rfkill object. 2. The rfkill_register need to be placed after the device_add of nfc_dev because the parent device need to be created first. So this patch keeps the order but inject device_lock to prevent the data race.
Signed-off-by: Lin Ma <linma@zju.edu.cn> Fixes: be055b2f89b5 ("NFC: RFKILL support") Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211116152652.19217-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/nfc/core.c | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Bongsu Jeon <bongsu.jeon@samsung.com> stable inclusion form stable-v5.10.82 commit b2a60b4a0195ba918ce924ba0616048ce09a3cc5 issue: #I4RVJ4 CVE: NA Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- [ Upstream commit f011539e723c737b74876ac47345e40270a3c384 ] If there is a NCI command in work queue after closing the NCI device at nci_unregister_device, The NCI command timer starts at flush_workqueue function and then NCI command timeout handler would be called 5 second after flushing the NCI command work queue and destroying the queue. At that time, the timeout handler would try to use NCI command work queue that is destroyed already. it will causes the problem. To avoid this abnormal situation, change the sequence to prevent the NCI command timeout handler from being called after destroying the NCI command work queue. Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/nfc/nci/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c index 5e55cb6c087a..4d3ab0f44c9f 100644 --- a/net/nfc/nci/core.c +++ b/net/nfc/nci/core.c @@ -568,11 +568,11 @@ static int nci_close_device(struct nci_dev *ndev) clear_bit(NCI_INIT, &ndev->flags); - del_timer_sync(&ndev->cmd_timer); - /* Flush cmd wq */ flush_workqueue(ndev->cmd_wq); + del_timer_sync(&ndev->cmd_timer); + /* Clear flags */ ndev->flags = 0; -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Bongsu Jeon <bongsu.jeon@samsung.com>
stable inclusion form stable-v5.10.82 commit b2a60b4a0195ba918ce924ba0616048ce09a3cc5 issue: #I4RVJ4 CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
[ Upstream commit f011539e723c737b74876ac47345e40270a3c384 ]
If there is a NCI command in work queue after closing the NCI device at nci_unregister_device, The NCI command timer starts at flush_workqueue function and then NCI command timeout handler would be called 5 second after flushing the NCI command work queue and destroying the queue. At that time, the timeout handler would try to use NCI command work queue that is destroyed already. it will causes the problem. To avoid this abnormal situation, change the sequence to prevent the NCI command timeout handler from being called after destroying the NCI command work queue.
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/nfc/nci/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Lin Ma <linma@zju.edu.cn> stable inclusion form stable-v5.10.82 commit 34e54703fb0fdbfc0a3cfc065d71e9a8353d3ac9 issue: #I4RVJ4 CVE: CVE-2021-4202 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- [ Upstream commit 48b71a9e66c2eab60564b1b1c85f4928ed04e406 ] There are two sites that calls queue_work() after the destroy_workqueue() and lead to possible UAF. The first site is nci_send_cmd(), which can happen after the nci_close_device as below nfcmrvl_nci_unregister_dev | nfc_genl_dev_up nci_close_device | flush_workqueue | del_timer_sync | nci_unregister_device | nfc_get_device destroy_workqueue | nfc_dev_up nfc_unregister_device | nci_dev_up device_del | nci_open_device | __nci_request | nci_send_cmd | queue_work !!! Another site is nci_cmd_timer, awaked by the nci_cmd_work from the nci_send_cmd. ... | ... nci_unregister_device | queue_work destroy_workqueue | nfc_unregister_device | ... device_del | nci_cmd_work | mod_timer | ... | nci_cmd_timer | queue_work !!! For the above two UAF, the root cause is that the nfc_dev_up can race between the nci_unregister_device routine. Therefore, this patch introduce NCI_UNREG flag to easily eliminate the possible race. In addition, the mutex_lock in nci_close_device can act as a barrier. Signed-off-by: Lin Ma <linma@zju.edu.cn> Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211116152732.19238-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- include/net/nfc/nci_core.h | 1 + net/nfc/nci/core.c | 19 +++++++++++++++++-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/include/net/nfc/nci_core.h b/include/net/nfc/nci_core.h index 33979017b782..004e49f74841 100644 --- a/include/net/nfc/nci_core.h +++ b/include/net/nfc/nci_core.h @@ -30,6 +30,7 @@ enum nci_flag { NCI_UP, NCI_DATA_EXCHANGE, NCI_DATA_EXCHANGE_TO, + NCI_UNREG, }; /* NCI device states */ diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c index 4d3ab0f44c9f..e38719e2ee58 100644 --- a/net/nfc/nci/core.c +++ b/net/nfc/nci/core.c @@ -473,6 +473,11 @@ static int nci_open_device(struct nci_dev *ndev) mutex_lock(&ndev->req_lock); + if (test_bit(NCI_UNREG, &ndev->flags)) { + rc = -ENODEV; + goto done; + } + if (test_bit(NCI_UP, &ndev->flags)) { rc = -EALREADY; goto done; @@ -536,6 +541,10 @@ static int nci_open_device(struct nci_dev *ndev) static int nci_close_device(struct nci_dev *ndev) { nci_req_cancel(ndev, ENODEV); + + /* This mutex needs to be held as a barrier for + * caller nci_unregister_device + */ mutex_lock(&ndev->req_lock); if (!test_and_clear_bit(NCI_UP, &ndev->flags)) { @@ -573,8 +582,8 @@ static int nci_close_device(struct nci_dev *ndev) del_timer_sync(&ndev->cmd_timer); - /* Clear flags */ - ndev->flags = 0; + /* Clear flags except NCI_UNREG */ + ndev->flags &= BIT(NCI_UNREG); mutex_unlock(&ndev->req_lock); @@ -1259,6 +1268,12 @@ void nci_unregister_device(struct nci_dev *ndev) { struct nci_conn_info *conn_info, *n; + /* This set_bit is not protected with specialized barrier, + * However, it is fine because the mutex_lock(&ndev->req_lock); + * in nci_close_device() will help to emit one. + */ + set_bit(NCI_UNREG, &ndev->flags); + nci_close_device(ndev); destroy_workqueue(ndev->cmd_wq); -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Lin Ma <linma@zju.edu.cn>
stable inclusion form stable-v5.10.82 commit 34e54703fb0fdbfc0a3cfc065d71e9a8353d3ac9 issue: #I4RVJ4 CVE: CVE-2021-4202
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
[ Upstream commit 48b71a9e66c2eab60564b1b1c85f4928ed04e406 ]
There are two sites that calls queue_work() after the destroy_workqueue() and lead to possible UAF.
The first site is nci_send_cmd(), which can happen after the nci_close_device as below
nfcmrvl_nci_unregister_dev | nfc_genl_dev_up nci_close_device | flush_workqueue | del_timer_sync | nci_unregister_device | nfc_get_device destroy_workqueue | nfc_dev_up nfc_unregister_device | nci_dev_up device_del | nci_open_device | __nci_request | nci_send_cmd | queue_work !!!
Another site is nci_cmd_timer, awaked by the nci_cmd_work from the nci_send_cmd.
... | ... nci_unregister_device | queue_work destroy_workqueue | nfc_unregister_device | ... device_del | nci_cmd_work | mod_timer | ... | nci_cmd_timer | queue_work !!!
For the above two UAF, the root cause is that the nfc_dev_up can race between the nci_unregister_device routine. Therefore, this patch introduce NCI_UNREG flag to easily eliminate the possible race. In addition, the mutex_lock in nci_close_device can act as a barrier.
Signed-off-by: Lin Ma <linma@zju.edu.cn> Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211116152732.19238-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Chen Jun <chenjun102@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- include/net/nfc/nci_core.h | 1 + net/nfc/nci/core.c | 19 +++++++++++++++++-- 2 files changed, 18 insertions(+), 2 deletions(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: Lin Ma <linma@zju.edu.cn> mainline inclusion from mainline-v5.16-rc1 commit aedddb4e45b34426cfbfa84454b6f203712733c5 category: bugfix issue: #I4RVJ4 CVE: CVE-2021-4202 Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> ---------------------------------- The CAP_NET_ADMIN checks are needed to prevent attackers faking a device under NCIUARTSETDRIVER and exploit privileged commands. This patch add GENL_ADMIN_PERM flags in genl_ops to fulfill the check. Except for commands like NFC_CMD_GET_DEVICE, NFC_CMD_GET_TARGET, NFC_CMD_LLC_GET_PARAMS, and NFC_CMD_GET_SE, which are mainly information- read operations. Signed-off-by: Lin Ma <linma@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/nfc/netlink.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/net/nfc/netlink.c b/net/nfc/netlink.c index bec7847f8eaa..123f473b163d 100644 --- a/net/nfc/netlink.c +++ b/net/nfc/netlink.c @@ -1664,31 +1664,37 @@ static const struct genl_ops nfc_genl_ops[] = { .cmd = NFC_CMD_DEV_UP, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_dev_up, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_DEV_DOWN, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_dev_down, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_START_POLL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_start_poll, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_STOP_POLL, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_stop_poll, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_DEP_LINK_UP, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_dep_link_up, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_DEP_LINK_DOWN, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_dep_link_down, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_GET_TARGET, @@ -1706,26 +1712,31 @@ static const struct genl_ops nfc_genl_ops[] = { .cmd = NFC_CMD_LLC_SET_PARAMS, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_llc_set_params, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_LLC_SDREQ, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_llc_sdreq, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_FW_DOWNLOAD, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_fw_download, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_ENABLE_SE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_enable_se, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_DISABLE_SE, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_disable_se, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_GET_SE, @@ -1737,21 +1748,25 @@ static const struct genl_ops nfc_genl_ops[] = { .cmd = NFC_CMD_SE_IO, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_se_io, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_ACTIVATE_TARGET, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_activate_target, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_VENDOR, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_vendor_cmd, + .flags = GENL_ADMIN_PERM, }, { .cmd = NFC_CMD_DEACTIVATE_TARGET, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = nfc_genl_deactivate_target, + .flags = GENL_ADMIN_PERM, }, }; -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: Lin Ma <linma@zju.edu.cn>
mainline inclusion from mainline-v5.16-rc1 commit aedddb4e45b34426cfbfa84454b6f203712733c5 category: bugfix issue: #I4RVJ4 CVE: CVE-2021-4202
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> ----------------------------------
The CAP_NET_ADMIN checks are needed to prevent attackers faking a device under NCIUARTSETDRIVER and exploit privileged commands.
This patch add GENL_ADMIN_PERM flags in genl_ops to fulfill the check. Except for commands like NFC_CMD_GET_DEVICE, NFC_CMD_GET_TARGET, NFC_CMD_LLC_GET_PARAMS, and NFC_CMD_GET_SE, which are mainly information- read operations.
Signed-off-by: Lin Ma <linma@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- net/nfc/netlink.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>

From: "Darrick J. Wong" <djwong@kernel.org> mainline inclusion from mainline-v5.16-rc5 commit 983d8e60f50806f90534cc5373d0ce867e5aaf79 category: bugfix issue: #I4RVJ4 CVE: CVE-2021-4155 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> -------------------------------- The old ALLOCSP/FREESP ioctls in XFS can be used to preallocate space at the end of files, just like fallocate and RESVSP. Make the behavior consistent with the other ioctls. Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Guo Xuenan <guoxuenan@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- fs/xfs/xfs_ioctl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 3fbd98f61ea5..646735aad45d 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -686,7 +686,8 @@ xfs_ioc_space( if (bf->l_start > XFS_ISIZE(ip)) { error = xfs_alloc_file_space(ip, XFS_ISIZE(ip), - bf->l_start - XFS_ISIZE(ip), 0); + bf->l_start - XFS_ISIZE(ip), + XFS_BMAPI_PREALLOC); if (error) goto out_unlock; } -- 2.25.1

在 2022/1/24 9:03, yiyuchangchun@126.com 写道:
From: "Darrick J. Wong" <djwong@kernel.org>
mainline inclusion from mainline-v5.16-rc5 commit 983d8e60f50806f90534cc5373d0ce867e5aaf79 category: bugfix issue: #I4RVJ4 CVE: CVE-2021-4155
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --------------------------------
The old ALLOCSP/FREESP ioctls in XFS can be used to preallocate space at the end of files, just like fallocate and RESVSP. Make the behavior consistent with the other ioctls.
Reported-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Guo Xuenan <guoxuenan@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> --- fs/xfs/xfs_ioctl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
participants (2)
-
weiyongjun (A)
-
yiyuchangchun@126.com