From patchwork Mon Apr 23 13:19:53 2012
Subject: [Qemu-devel] [PATCHv2 3/3] virtio: order index/descriptor reads
From: "Michael S. Tsirkin"
To: qemu-devel@nongnu.org
Cc: Anthony Liguori, Juan Quintela, Alexey Kardashevskiy, stefanha@gmail.com,
    Jason Wang, Eric Sunshine, Amit Shah, David Gibson
Date: Mon, 23 Apr 2012 16:19:53 +0300
Message-ID: <6d0b21bf59b3998868b7b88bb7a39a47e9660e18.1335186822.git.mst@redhat.com>

virtio has the equivalent of:

    if (vq->last_avail_index != vring_avail_idx(vq)) {
        read descriptor head at vq->last_avail_index;
    }

In theory, the processor can reorder the descriptor head read so that it
happens speculatively, before the index read. This would trigger the
following race:

    host descriptor head read   <- reads invalid head from ring
            guest writes valid descriptor head
            guest writes avail index
    host avail index read       <- observes valid index

As a result, the host will use an invalid head value.

I have not observed this in the field, but after the experience with the
previous two races I think it is prudent to address this theoretical race
condition.

Signed-off-by: Michael S. Tsirkin
---
 hw/virtio.c    |    5 +++++
 qemu-barrier.h |    7 ++++++-
 2 files changed, 11 insertions(+), 1 deletions(-)
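Not part of the patch, just an illustration of the ordering it enforces: a
minimal, self-contained sketch of the producer/consumer handshake, using
GCC's __sync_synchronize() builtin in place of QEMU's smp_wmb()/smp_rmb().
The toy_ring, guest_publish and host_consume names are invented for this
example; the real consumer-side check lives in virtqueue_num_heads() and
its callers.

#include <stdint.h>
#include <stdio.h>

#define RING_SIZE 4

struct toy_ring {
    uint16_t avail_idx;          /* written by "guest", read by "host" */
    uint16_t ring[RING_SIZE];    /* descriptor heads */
};

static uint16_t last_avail_idx;  /* host-side consumer position */

/* Producer: write the head first, then publish the new index. */
static void guest_publish(struct toy_ring *r, uint16_t head)
{
    r->ring[r->avail_idx % RING_SIZE] = head;
    __sync_synchronize();        /* stands in for smp_wmb(): head before index */
    r->avail_idx++;
}

/* Consumer: check the index, then read the head.  Without the read
 * barrier, the CPU may speculatively load the head before it observes
 * the updated index, which is the race described above. */
static int host_consume(struct toy_ring *r, uint16_t *head)
{
    if (last_avail_idx == r->avail_idx) {
        return 0;                /* nothing new */
    }
    __sync_synchronize();        /* stands in for smp_rmb(): index before head */
    *head = r->ring[last_avail_idx++ % RING_SIZE];
    return 1;
}

int main(void)
{
    struct toy_ring r = { 0 };
    uint16_t head;

    guest_publish(&r, 7);
    if (host_consume(&r, &head)) {
        printf("got head %u\n", head);
    }
    return 0;
}

Here both sides run in one thread purely to exercise the calling sequence;
the barriers only matter when guest and host code run on different CPUs.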
diff --git a/hw/virtio.c b/hw/virtio.c
index def0bf1..c081e1b 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -287,6 +287,11 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
                      idx, vring_avail_idx(vq));
         exit(1);
     }
+    /* On success, callers read a descriptor at vq->last_avail_idx.
+     * Make sure descriptor read does not bypass avail index read. */
+    if (num_heads) {
+        smp_rmb();
+    }
 
     return num_heads;
 }
diff --git a/qemu-barrier.h b/qemu-barrier.h
index f6722a8..39aa0b0 100644
--- a/qemu-barrier.h
+++ b/qemu-barrier.h
@@ -24,10 +24,13 @@
 #define smp_mb() asm volatile("lock; addl $0,0(%%esp) " ::: "memory")
 #endif
 
+#define smp_rmb() smp_mb()
+
 #elif defined(__x86_64__)
 
 #define smp_wmb() barrier()
 #define smp_mb() asm volatile("mfence" ::: "memory")
+#define smp_rmb() asm volatile("lfence" ::: "memory")
 
 #elif defined(_ARCH_PPC)
 
@@ -38,6 +41,7 @@
  */
 #define smp_wmb() asm volatile("eieio" ::: "memory")
 #define smp_mb() asm volatile("eieio" ::: "memory")
+#define smp_rmb() asm volatile("eieio" ::: "memory")
 
 #else
 
@@ -45,10 +49,11 @@
  * For (host) platforms we don't have explicit barrier definitions
  * for, we use the gcc __sync_synchronize() primitive to generate a
  * full barrier.  This should be safe on all platforms, though it may
- * be overkill for wmb().
+ * be overkill for wmb() and rmb().
  */
 #define smp_wmb() __sync_synchronize()
 #define smp_mb() __sync_synchronize()
+#define smp_rmb() __sync_synchronize()
 
 #endif
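For anyone who wants to double-check what the new macro expands to on their
host, the definitions can be copied into a standalone file and inspected
with "gcc -O2 -S". barrier_check.c below is a hypothetical helper written
for this purpose, not part of qemu-barrier.h.

/* barrier_check.c: standalone copy of the barrier definitions touched by
 * the patch, for inspecting the generated code with "gcc -O2 -S".
 * Illustrative only. */

#define barrier()   asm volatile("" ::: "memory")

#if defined(__x86_64__)
#define smp_wmb()   barrier()
#define smp_mb()    asm volatile("mfence" ::: "memory")
#define smp_rmb()   asm volatile("lfence" ::: "memory")
#else
/* Fallback, as in the patch: a full barrier on unknown hosts. */
#define smp_wmb()   __sync_synchronize()
#define smp_mb()    __sync_synchronize()
#define smp_rmb()   __sync_synchronize()
#endif

void read_barrier(void)  { smp_rmb(); }  /* expect lfence on x86-64 */
void write_barrier(void) { smp_wmb(); }  /* compiler barrier only, no fence */
void full_barrier(void)  { smp_mb(); }   /* expect mfence on x86-64 */

On an x86-64 host the generated assembly should show lfence in
read_barrier() and mfence in full_barrier(), while write_barrier()
compiles down to a plain return.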