From patchwork Tue Apr 24 15:33:19 2012
X-Patchwork-Submitter: "Michael S. Tsirkin"
X-Patchwork-Id: 154720
Date: Tue, 24 Apr 2012 18:33:19 +0300
From: "Michael S. Tsirkin" <mst@redhat.com>
To: qemu-devel@nongnu.org
Cc: Anthony Liguori, Juan Quintela, Alexey Kardashevskiy, stefanha@gmail.com,
 Jason Wang, Eric Sunshine, Amit Shah, David Gibson
Message-ID: <5f860186390d1286efa9318388aaef22d3bc3e05.1335281438.git.mst@redhat.com>
Subject: [Qemu-devel] [PATCHv3 3/3] virtio: order index/descriptor reads

virtio has the equivalent of:

    if (vq->last_avail_index != vring_avail_idx(vq)) {
        read descriptor head at vq->last_avail_index;
    }

In theory, the processor can reorder the descriptor head read so that it
happens speculatively, before the index read. That would trigger the
following race:

    host descriptor head read  <- reads invalid head from ring
    guest writes valid descriptor head
    guest writes avail index
    host avail index read      <- observes valid index

As a result, the host would use an invalid head value.

I have not observed this in the field, but after the experience with the
previous two races I think it is prudent to address this theoretical race
condition.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
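To make the ordering requirement concrete, here is a minimal stand-alone
sketch of the consumer-side pattern the patch enforces, assuming the
smp_rmb() this patch adds to qemu-barrier.h. The struct and helper names
below are invented for illustration only; QEMU's real code is
virtqueue_num_heads()/virtqueue_get_head() in hw/virtio.c:

    #include <stdint.h>

    /* Hypothetical ring, modelled on the virtio avail ring layout. */
    struct ring {
        volatile uint16_t avail_idx;  /* written by the guest (producer) */
        uint16_t head[256];           /* descriptor heads, guest-written */
        uint16_t last_avail_idx;      /* host-private consumption cursor */
    };

    /* Consumer (host) side: without the smp_rmb(), the CPU may read
     * head[] speculatively before it reads avail_idx, observing a stale
     * head together with a fresh index -- exactly the race above. */
    static int ring_pop_head(struct ring *r, uint16_t *out)
    {
        if (r->last_avail_idx == r->avail_idx) {
            return 0;                 /* ring empty, nothing to do */
        }
        smp_rmb();                    /* index read before descriptor read */
        *out = r->head[r->last_avail_idx++ % 256];
        return 1;
    }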
---
 hw/virtio.c    |    5 +++++
 qemu-barrier.h |   12 +++++++++++-
 2 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/hw/virtio.c b/hw/virtio.c
index def0bf1..c081e1b 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -287,6 +287,11 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
                      idx, vring_avail_idx(vq));
         exit(1);
     }
+    /* On success, callers read a descriptor at vq->last_avail_idx.
+     * Make sure descriptor read does not bypass avail index read. */
+    if (num_heads) {
+        smp_rmb();
+    }
 
     return num_heads;
 }
diff --git a/qemu-barrier.h b/qemu-barrier.h
index f0b842e..c89d312 100644
--- a/qemu-barrier.h
+++ b/qemu-barrier.h
@@ -24,10 +24,13 @@
 #define smp_mb() asm volatile("lock; addl $0,0(%%esp) " ::: "memory")
 #endif
 
+#define smp_rmb() smp_mb()
+
 #elif defined(__x86_64__)
 
 #define smp_wmb() barrier()
 #define smp_mb() asm volatile("mfence" ::: "memory")
+#define smp_rmb() asm volatile("lfence" ::: "memory")
 
 #elif defined(_ARCH_PPC)
 
@@ -39,16 +42,23 @@
 #define smp_wmb() asm volatile("eieio" ::: "memory")
 #define smp_mb() asm volatile("sync" ::: "memory")
 
+#if defined(__powerpc64__)
+#define smp_rmb() asm volatile("lwsync" ::: "memory")
+#else
+#define smp_rmb() asm volatile("sync" ::: "memory")
+#endif
+
 #else
 
 /*
  * For (host) platforms we don't have explicit barrier definitions
  * for, we use the gcc __sync_synchronize() primitive to generate a
  * full barrier.  This should be safe on all platforms, though it may
- * be overkill for wmb().
+ * be overkill for wmb() and rmb().
  */
 #define smp_wmb() __sync_synchronize()
 #define smp_mb() __sync_synchronize()
+#define smp_rmb() __sync_synchronize()
 
 #endif
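A note on pairing, for completeness: the consumer-side smp_rmb() only
helps because the producer orders its two writes with a matching write
barrier -- virtio drivers write the descriptor head first, then the avail
index, with a wmb() in between. Schematically, reusing the invented
struct from the sketch above (illustration, not guest driver code):

    /* Producer (guest) side: the head must be globally visible
     * before the index that advertises it. */
    static void ring_push_head(struct ring *r, uint16_t head)
    {
        r->head[r->avail_idx % 256] = head; /* 1: publish descriptor head */
        smp_wmb();                          /* write 1 visible before write 2 */
        r->avail_idx++;                     /* 2: expose it via the index */
    }

On x86, ordinary loads are already ordered against other loads, so the
lfence-based smp_rmb() above is on the conservative side; on PPC it is
the lwsync/sync instruction that actually enforces the ordering.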