From patchwork Thu Jan 4 16:56:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Bethany Jamison X-Patchwork-Id: 1882551 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=185.125.189.65; helo=lists.ubuntu.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=patchwork.ozlabs.org) Received: from lists.ubuntu.com (lists.ubuntu.com [185.125.189.65]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4T5XmN4p0Wz1ydd for ; Fri, 5 Jan 2024 03:56:19 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=lists.ubuntu.com) by lists.ubuntu.com with esmtp (Exim 4.86_2) (envelope-from ) id 1rLR0s-00074F-84; Thu, 04 Jan 2024 16:56:10 +0000 Received: from smtp-relay-internal-1.internal ([10.131.114.114] helo=smtp-relay-internal-1.canonical.com) by lists.ubuntu.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1rLR0o-00073u-Gf for kernel-team@lists.ubuntu.com; Thu, 04 Jan 2024 16:56:06 +0000 Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id 5CDE13F154 for ; Thu, 4 Jan 2024 16:56:06 +0000 (UTC) Received: by mail-qt1-f200.google.com with SMTP id d75a77b69052e-427884c2eccso10851251cf.1 for ; Thu, 04 Jan 2024 08:56:06 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1704387365; x=1704992165; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nNj/LoKS/hvw4K7kcmONJaoJ0JJL9KVPbUxCLG8i4TI=; b=ohN4hwGexnEvPbqJli5YcB6RS8IsyVtlDs7qruIGhA7p0eeSn6UwH6+ggfEskOeArl yR7keHsyVvV/eRIVuagXfjZerryqV5P/Pg+kVc+h9MhFaMNLIkHpi9wfZuEZ9+oKN5xx iqyV8Izznq0mMjDi/FzHI//8nQr/SNIhKb8kW9CqhubaAvQmafDKOvsZHu5F1AXjHBCB Mm+qi22ukX990uwGZPIg672H0whaabNNVFeTe/ZPlTIvYmqSJpZ4/OjaxKPtuQLZo2Kj TfUPZNJaxEshcUUOjdvn6Ny1xeMgwVfkqgpW7tA2RHcdRcnshNmOENcdiF4r1NoEbNGx 0kZQ== X-Gm-Message-State: AOJu0YyvW6oFJVJu1XjU7C7nzU/sH/Kyf36iqgo5YA2WGO6KfF1V1Jsp g0HDSAENobT23feX+Vhkpfmh0tAUhFZpHQqpVcTikVGB1U3wVetHIVXeiKPuWD+hGQVzBcCD4gz EeALf1I9G33KMdKPXV6DaeA01Q4BvRO3VZlQSBFxEh5RvXPlB+DsHZy7G X-Received: by 2002:a05:622a:10d:b0:427:9c78:a2bd with SMTP id u13-20020a05622a010d00b004279c78a2bdmr1042181qtw.22.1704387364993; Thu, 04 Jan 2024 08:56:04 -0800 (PST) X-Google-Smtp-Source: AGHT+IEV8OWXNVYIDlJgAnNRk7SVA5U21NGyoo/HvcOqOSiTevh3/YzqvZEL08/ktD32p3xs/3YioQ== X-Received: by 2002:a05:622a:10d:b0:427:9c78:a2bd with SMTP id u13-20020a05622a010d00b004279c78a2bdmr1042158qtw.22.1704387364656; Thu, 04 Jan 2024 08:56:04 -0800 (PST) Received: from smtp.gmail.com (104-218-69-19.dynamic.lnk.ne.allofiber.net. [104.218.69.19]) by smtp.gmail.com with ESMTPSA id fp20-20020a05622a509400b00425442a2f32sm15234026qtb.16.2024.01.04.08.56.04 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 04 Jan 2024 08:56:04 -0800 (PST) From: Bethany Jamison To: kernel-team@lists.ubuntu.com Subject: [SRU][Mantic][PATCH 1/1] xen/events: replace evtchn_rwlock with RCU Date: Thu, 4 Jan 2024 10:56:02 -0600 Message-Id: <20240104165602.13396-2-bethany.jamison@canonical.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240104165602.13396-1-bethany.jamison@canonical.com> References: <20240104165602.13396-1-bethany.jamison@canonical.com> MIME-Version: 1.0 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Juergen Gross In unprivileged Xen guests event handling can cause a deadlock with Xen console handling. The evtchn_rwlock and the hvc_lock are taken in opposite sequence in __hvc_poll() and in Xen console IRQ handling. Normally this is no problem, as the evtchn_rwlock is taken as a reader in both paths, but as soon as an event channel is being closed, the lock will be taken as a writer, which will cause read_lock() to block: CPU0 CPU1 CPU2 (IRQ handling) (__hvc_poll()) (closing event channel) read_lock(evtchn_rwlock) spin_lock(hvc_lock) write_lock(evtchn_rwlock) [blocks] spin_lock(hvc_lock) [blocks] read_lock(evtchn_rwlock) [blocks due to writer waiting, and not in_interrupt()] This issue can be avoided by replacing evtchn_rwlock with RCU in xen_free_irq(). Note that RCU is used only to delay freeing of the irq_info memory. There is no RCU based dereferencing or replacement of pointers involved. In order to avoid potential races between removing the irq_info reference and handling of interrupts, set the irq_info pointer to NULL only when freeing its memory. The IRQ itself must be freed at that time, too, as otherwise the same IRQ number could be allocated again before handling of the old instance would have been finished. This is XSA-441 / CVE-2023-34324. Fixes: 54c9de89895e ("xen/events: add a new "late EOI" evtchn framework") Reported-by: Marek Marczykowski-Górecki Signed-off-by: Juergen Gross Reviewed-by: Julien Grall Signed-off-by: Juergen Gross (cherry picked from commit 87797fad6cce28ec9be3c13f031776ff4f104cfc) CVE-2023-34324 Signed-off-by: Bethany Jamison --- drivers/xen/events/events_base.c | 87 +++++++++++++++++--------------- 1 file changed, 46 insertions(+), 41 deletions(-) diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 3bdd5b59661d..c803714d0f0d 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include #include @@ -96,6 +97,7 @@ enum xen_irq_type { struct irq_info { struct list_head list; struct list_head eoi_list; + struct rcu_work rwork; short refcnt; u8 spurious_cnt; u8 is_accounted; @@ -146,23 +148,13 @@ const struct evtchn_ops *evtchn_ops; */ static DEFINE_MUTEX(irq_mapping_update_lock); -/* - * Lock protecting event handling loop against removing event channels. - * Adding of event channels is no issue as the associated IRQ becomes active - * only after everything is setup (before request_[threaded_]irq() the handler - * can't be entered for an event, as the event channel will be unmasked only - * then). - */ -static DEFINE_RWLOCK(evtchn_rwlock); - /* * Lock hierarchy: * * irq_mapping_update_lock - * evtchn_rwlock - * IRQ-desc lock - * percpu eoi_list_lock - * irq_info->lock + * IRQ-desc lock + * percpu eoi_list_lock + * irq_info->lock */ static LIST_HEAD(xen_irq_list_head); @@ -306,6 +298,22 @@ static void channels_on_cpu_inc(struct irq_info *info) info->is_accounted = 1; } +static void delayed_free_irq(struct work_struct *work) +{ + struct irq_info *info = container_of(to_rcu_work(work), struct irq_info, + rwork); + unsigned int irq = info->irq; + + /* Remove the info pointer only now, with no potential users left. */ + set_info_for_irq(irq, NULL); + + kfree(info); + + /* Legacy IRQ descriptors are managed by the arch. */ + if (irq >= nr_legacy_irqs()) + irq_free_desc(irq); +} + /* Constructors for packed IRQ information. */ static int xen_irq_info_common_setup(struct irq_info *info, unsigned irq, @@ -668,33 +676,36 @@ static void xen_irq_lateeoi_worker(struct work_struct *work) eoi = container_of(to_delayed_work(work), struct lateeoi_work, delayed); - read_lock_irqsave(&evtchn_rwlock, flags); + rcu_read_lock(); while (true) { - spin_lock(&eoi->eoi_list_lock); + spin_lock_irqsave(&eoi->eoi_list_lock, flags); info = list_first_entry_or_null(&eoi->eoi_list, struct irq_info, eoi_list); - if (info == NULL || now < info->eoi_time) { - spin_unlock(&eoi->eoi_list_lock); + if (info == NULL) + break; + + if (now < info->eoi_time) { + mod_delayed_work_on(info->eoi_cpu, system_wq, + &eoi->delayed, + info->eoi_time - now); break; } list_del_init(&info->eoi_list); - spin_unlock(&eoi->eoi_list_lock); + spin_unlock_irqrestore(&eoi->eoi_list_lock, flags); info->eoi_time = 0; xen_irq_lateeoi_locked(info, false); } - if (info) - mod_delayed_work_on(info->eoi_cpu, system_wq, - &eoi->delayed, info->eoi_time - now); + spin_unlock_irqrestore(&eoi->eoi_list_lock, flags); - read_unlock_irqrestore(&evtchn_rwlock, flags); + rcu_read_unlock(); } static void xen_cpu_init_eoi(unsigned int cpu) @@ -709,16 +720,15 @@ static void xen_cpu_init_eoi(unsigned int cpu) void xen_irq_lateeoi(unsigned int irq, unsigned int eoi_flags) { struct irq_info *info; - unsigned long flags; - read_lock_irqsave(&evtchn_rwlock, flags); + rcu_read_lock(); info = info_for_irq(irq); if (info) xen_irq_lateeoi_locked(info, eoi_flags & XEN_EOI_FLAG_SPURIOUS); - read_unlock_irqrestore(&evtchn_rwlock, flags); + rcu_read_unlock(); } EXPORT_SYMBOL_GPL(xen_irq_lateeoi); @@ -732,6 +742,7 @@ static void xen_irq_init(unsigned irq) info->type = IRQT_UNBOUND; info->refcnt = -1; + INIT_RCU_WORK(&info->rwork, delayed_free_irq); set_info_for_irq(irq, info); /* @@ -789,31 +800,18 @@ static int __must_check xen_allocate_irq_gsi(unsigned gsi) static void xen_free_irq(unsigned irq) { struct irq_info *info = info_for_irq(irq); - unsigned long flags; if (WARN_ON(!info)) return; - write_lock_irqsave(&evtchn_rwlock, flags); - if (!list_empty(&info->eoi_list)) lateeoi_list_del(info); list_del(&info->list); - set_info_for_irq(irq, NULL); - WARN_ON(info->refcnt > 0); - write_unlock_irqrestore(&evtchn_rwlock, flags); - - kfree(info); - - /* Legacy IRQ descriptors are managed by the arch. */ - if (irq < nr_legacy_irqs()) - return; - - irq_free_desc(irq); + queue_rcu_work(system_wq, &info->rwork); } /* Not called for lateeoi events. */ @@ -1711,7 +1709,14 @@ static int __xen_evtchn_do_upcall(void) int cpu = smp_processor_id(); struct evtchn_loop_ctrl ctrl = { 0 }; - read_lock(&evtchn_rwlock); + /* + * When closing an event channel the associated IRQ must not be freed + * until all cpus have left the event handling loop. This is ensured + * by taking the rcu_read_lock() while handling events, as freeing of + * the IRQ is handled via queue_rcu_work() _after_ closing the event + * channel. + */ + rcu_read_lock(); do { vcpu_info->evtchn_upcall_pending = 0; @@ -1724,7 +1729,7 @@ static int __xen_evtchn_do_upcall(void) } while (vcpu_info->evtchn_upcall_pending); - read_unlock(&evtchn_rwlock); + rcu_read_unlock(); /* * Increment irq_epoch only now to defer EOIs only for