From patchwork Fri Jan 31 17:34:39 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Antonios Motakis X-Patchwork-Id: 315799 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 61BFA2C00AF for ; Sat, 1 Feb 2014 04:37:54 +1100 (EST) Received: from localhost ([::1]:57296 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W9I2J-0004ez-Q9 for incoming@patchwork.ozlabs.org; Fri, 31 Jan 2014 12:37:51 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:32951) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W9I02-0001He-T3 for qemu-devel@nongnu.org; Fri, 31 Jan 2014 12:35:37 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1W9Hzw-0003ui-Ol for qemu-devel@nongnu.org; Fri, 31 Jan 2014 12:35:30 -0500 Received: from mail-wg0-f48.google.com ([74.125.82.48]:65114) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1W9Hzw-0003tH-G5 for qemu-devel@nongnu.org; Fri, 31 Jan 2014 12:35:24 -0500 Received: by mail-wg0-f48.google.com with SMTP id x13so9279404wgg.15 for ; Fri, 31 Jan 2014 09:35:23 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=vS4kBwbEijRTQIqH9HQhCBxfIoC2/6K15B0EXttl0G4=; b=cU+0gCmnbx4MUWJKxfwwwaBhAKwvxKh9rpJOXlTj2u4SMlpnUFJzvoWOCVCqGbJeT5 TSWJRdiUOmkFTeWJu9eFqPZAUsbTaVPCyKESc+NNqeP3hO7vwPaWopC9iTsdORdXQKLx DDbBT9n4DQkgZIHPEdL2i8i+8RSMUT+XoY9OgzfOaFo5YCPCRCJUG9FHHLMgRKvQxtw+ o3TbkBZDTDWraK1LixvhoNugKKMxAEmn4xVJOCjOcY80MWeszCWoVPSB4CKKy1wzN2LI zta6XDLCThaomieh+R9J6eVZBmbetiydaNmxxqh5wM68umsnsvgb7b/87RR/QdDg19gh kgzQ== X-Gm-Message-State: ALoCoQkTBxjOUp86WsesU4sEauqvCQrxxITYHuOLDtVnCnGD0oPxebs+lXvziUUm8YgIBRyZK0YU X-Received: by 10.180.77.129 with SMTP id s1mr14555102wiw.56.1391189723760; Fri, 31 Jan 2014 09:35:23 -0800 (PST) Received: from localhost.localdomain (home.tvelocity.eu. [82.67.68.96]) by mx.google.com with ESMTPSA id cm5sm21309470wid.5.2014.01.31.09.35.21 for (version=TLSv1.1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 31 Jan 2014 09:35:22 -0800 (PST) From: Antonios Motakis To: qemu-devel@nongnu.org, snabb-devel@googlegroups.com Date: Fri, 31 Jan 2014 18:34:39 +0100 Message-Id: <1391189683-1602-11-git-send-email-a.motakis@virtualopensystems.com> X-Mailer: git-send-email 1.8.3.2 In-Reply-To: <1391189683-1602-1-git-send-email-a.motakis@virtualopensystems.com> References: <1391189683-1602-1-git-send-email-a.motakis@virtualopensystems.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 74.125.82.48 Cc: Peter Maydell , mst@redhat.com, n.nikolaev@virtualopensystems.com, Paolo Bonzini , lukego@gmail.com, Antonios Motakis , tech@virtualopensystems.com, KONRAD Frederic Subject: [Qemu-devel] [PATCH v7 10/13] Add vhost-user as a vhost backend. X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org The initialization takes a chardev backed by a unix domain socket. It should implement qemu_fe_set_msgfds in order to be able to pass file descriptors to the remote process. Each ioctl request of vhost-kernel has a vhost-user message equivalent, which is sent over the control socket. The general approach is to copy the data from the supplied argument pointer to a designated field in the message. If a file descriptor is to be passed it will be placed in the fds array for inclusion in the sendmsg control header. VHOST_SET_MEM_TABLE ignores the supplied vhost_memory structure and scans the global ram_list for ram blocks with a valid fd field set. This would be set when the -mem-path option with shared=on property is used. Signed-off-by: Antonios Motakis Signed-off-by: Nikolay Nikolaev --- hw/virtio/Makefile.objs | 2 +- hw/virtio/vhost-backend.c | 5 + hw/virtio/vhost-user.c | 331 ++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 337 insertions(+), 1 deletion(-) create mode 100644 hw/virtio/vhost-user.c diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs index 51e5bdb..ec9e855 100644 --- a/hw/virtio/Makefile.objs +++ b/hw/virtio/Makefile.objs @@ -5,4 +5,4 @@ common-obj-y += virtio-mmio.o common-obj-$(CONFIG_VIRTIO_BLK_DATA_PLANE) += dataplane/ obj-y += virtio.o virtio-balloon.o -obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o +obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c index 509e103..35316c4 100644 --- a/hw/virtio/vhost-backend.c +++ b/hw/virtio/vhost-backend.c @@ -14,6 +14,8 @@ #include +extern const VhostOps user_ops; + static int vhost_kernel_call(struct vhost_dev *dev, unsigned long int request, void *arg) { @@ -57,6 +59,9 @@ int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType backend_type) case VHOST_BACKEND_TYPE_KERNEL: dev->vhost_ops = &kernel_ops; break; + case VHOST_BACKEND_TYPE_USER: + dev->vhost_ops = &user_ops; + break; default: error_report("Unknown vhost backend type\n"); r = -1; diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c new file mode 100644 index 0000000..1483647 --- /dev/null +++ b/hw/virtio/vhost-user.c @@ -0,0 +1,331 @@ +/* + * vhost-user + * + * Copyright (c) 2013 Virtual Open Systems Sarl. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "hw/virtio/vhost.h" +#include "hw/virtio/vhost-backend.h" +#include "sysemu/char.h" +#include "qemu/error-report.h" +#include "qemu/sockets.h" + +#include +#include +#include +#include +#include +#include + +#define VHOST_MEMORY_MAX_NREGIONS 8 + +typedef enum VhostUserRequest { + VHOST_USER_NONE = 0, + VHOST_USER_GET_FEATURES = 1, + VHOST_USER_SET_FEATURES = 2, + VHOST_USER_SET_OWNER = 3, + VHOST_USER_RESET_OWNER = 4, + VHOST_USER_SET_MEM_TABLE = 5, + VHOST_USER_SET_LOG_BASE = 6, + VHOST_USER_SET_LOG_FD = 7, + VHOST_USER_SET_VRING_NUM = 8, + VHOST_USER_SET_VRING_ADDR = 9, + VHOST_USER_SET_VRING_BASE = 10, + VHOST_USER_GET_VRING_BASE = 11, + VHOST_USER_SET_VRING_KICK = 12, + VHOST_USER_SET_VRING_CALL = 13, + VHOST_USER_SET_VRING_ERR = 14, + VHOST_USER_MAX +} VhostUserRequest; + +typedef struct VhostUserMemoryRegion { + uint64_t guest_phys_addr; + uint64_t memory_size; + uint64_t userspace_addr; +} VhostUserMemoryRegion; + +typedef struct VhostUserMemory { + uint32_t nregions; + uint32_t padding; + VhostUserMemoryRegion regions[VHOST_MEMORY_MAX_NREGIONS]; +} VhostUserMemory; + +typedef struct VhostUserMsg { + VhostUserRequest request; + +#define VHOST_USER_VERSION_MASK (0x3) +#define VHOST_USER_REPLY_MASK (0x1<<2) + uint32_t flags; + uint32_t size; /* the following payload size */ + union { + uint64_t u64; + struct vhost_vring_state state; + struct vhost_vring_addr addr; + VhostUserMemory memory; + }; +} QEMU_PACKED VhostUserMsg; + +static VhostUserMsg m __attribute__ ((unused)); +#define VHOST_USER_HDR_SIZE (sizeof(m.request) \ + + sizeof(m.flags) \ + + sizeof(m.size)) + +#define VHOST_USER_PAYLOAD_SIZE (sizeof(m) - VHOST_USER_HDR_SIZE) + +/* The version of the protocol we support */ +#define VHOST_USER_VERSION (0x1) + +static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = { + -1, /* VHOST_USER_NONE */ + VHOST_GET_FEATURES, /* VHOST_USER_GET_FEATURES */ + VHOST_SET_FEATURES, /* VHOST_USER_SET_FEATURES */ + VHOST_SET_OWNER, /* VHOST_USER_SET_OWNER */ + VHOST_RESET_OWNER, /* VHOST_USER_RESET_OWNER */ + VHOST_SET_MEM_TABLE, /* VHOST_USER_SET_MEM_TABLE */ + VHOST_SET_LOG_BASE, /* VHOST_USER_SET_LOG_BASE */ + VHOST_SET_LOG_FD, /* VHOST_USER_SET_LOG_FD */ + VHOST_SET_VRING_NUM, /* VHOST_USER_SET_VRING_NUM */ + VHOST_SET_VRING_ADDR, /* VHOST_USER_SET_VRING_ADDR */ + VHOST_SET_VRING_BASE, /* VHOST_USER_SET_VRING_BASE */ + VHOST_GET_VRING_BASE, /* VHOST_USER_GET_VRING_BASE */ + VHOST_SET_VRING_KICK, /* VHOST_USER_SET_VRING_KICK */ + VHOST_SET_VRING_CALL, /* VHOST_USER_SET_VRING_CALL */ + VHOST_SET_VRING_ERR /* VHOST_USER_SET_VRING_ERR */ +}; + +static VhostUserRequest vhost_user_request_translate(unsigned long int request) +{ + VhostUserRequest idx; + + for (idx = 0; idx < VHOST_USER_MAX; idx++) { + if (ioctl_to_vhost_user_request[idx] == request) { + break; + } + } + + return (idx == VHOST_USER_MAX) ? VHOST_USER_NONE : idx; +} + +static int vhost_user_read(struct vhost_dev *dev, VhostUserMsg *msg) +{ + CharDriverState *chr = dev->opaque; + uint8_t *p = (uint8_t *) msg; + int r, size = VHOST_USER_HDR_SIZE; + + r = qemu_chr_fe_read_all(chr, p, size); + if (r != size) { + error_report("Failed to read msg header. Read %d instead of %d.\n", r, + size); + goto fail; + } + + /* validate received flags */ + if (msg->flags != (VHOST_USER_REPLY_MASK | VHOST_USER_VERSION)) { + error_report("Failed to read msg header." + " Flags 0x%x instead of 0x%x.\n", msg->flags, + VHOST_USER_REPLY_MASK | VHOST_USER_VERSION); + goto fail; + } + + /* validate message size is sane */ + if (msg->size > VHOST_USER_PAYLOAD_SIZE) { + error_report("Failed to read msg header." + " Size %d exceeds the maximum %zu.\n", msg->size, + VHOST_USER_PAYLOAD_SIZE); + goto fail; + } + + if (msg->size) { + p += VHOST_USER_HDR_SIZE; + size = msg->size; + r = qemu_chr_fe_read_all(chr, p, size); + if (r != size) { + error_report("Failed to read msg payload." + " Read %d instead of %d.\n", r, msg->size); + goto fail; + } + } + + return 0; + +fail: + return -1; +} + +static int vhost_user_write(struct vhost_dev *dev, VhostUserMsg *msg, + int *fds, int fd_num) +{ + CharDriverState *chr = dev->opaque; + int size = VHOST_USER_HDR_SIZE + msg->size; + + if (fd_num) { + qemu_chr_fe_set_msgfds(chr, fds, fd_num); + } + + return qemu_chr_fe_write_all(chr, (const uint8_t *) msg, size) == size ? + 0 : -1; +} + +static int vhost_user_call(struct vhost_dev *dev, unsigned long int request, + void *arg) +{ + VhostUserMsg msg; + VhostUserRequest msg_request; + RAMBlock *block = 0; + struct vhost_vring_file *file = 0; + int need_reply = 0; + int fds[VHOST_MEMORY_MAX_NREGIONS]; + size_t fd_num = 0; + + assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER); + + msg_request = vhost_user_request_translate(request); + msg.request = msg_request; + msg.flags = VHOST_USER_VERSION; + msg.size = 0; + + switch (request) { + case VHOST_GET_FEATURES: + need_reply = 1; + break; + + case VHOST_SET_FEATURES: + case VHOST_SET_LOG_BASE: + msg.u64 = *((__u64 *) arg); + msg.size = sizeof(m.u64); + break; + + case VHOST_SET_OWNER: + case VHOST_RESET_OWNER: + break; + + case VHOST_SET_MEM_TABLE: + QTAILQ_FOREACH(block, &ram_list.blocks, next) + { + if (block->fd > 0) { + msg.memory.regions[fd_num].userspace_addr = (__u64) block->host; + msg.memory.regions[fd_num].memory_size = block->length; + msg.memory.regions[fd_num].guest_phys_addr = block->offset; + fds[fd_num++] = block->fd; + } + } + + msg.memory.nregions = fd_num; + + if (!fd_num) { + error_report("Failed initializing vhost-user memory map\n" + "consider using -mem-path option\n"); + return -1; + } + + msg.size = sizeof(m.memory.nregions); + msg.size += sizeof(m.memory.padding); + msg.size += fd_num * sizeof(VhostUserMemoryRegion); + + break; + + case VHOST_SET_LOG_FD: + fds[fd_num++] = *((int *) arg); + break; + + case VHOST_SET_VRING_NUM: + case VHOST_SET_VRING_BASE: + memcpy(&msg.state, arg, sizeof(struct vhost_vring_state)); + msg.size = sizeof(m.state); + break; + + case VHOST_GET_VRING_BASE: + memcpy(&msg.state, arg, sizeof(struct vhost_vring_state)); + msg.size = sizeof(m.state); + need_reply = 1; + break; + + case VHOST_SET_VRING_ADDR: + memcpy(&msg.addr, arg, sizeof(struct vhost_vring_addr)); + msg.size = sizeof(m.addr); + break; + + case VHOST_SET_VRING_KICK: + case VHOST_SET_VRING_CALL: + case VHOST_SET_VRING_ERR: + file = arg; + msg.u64 = file->index; + msg.size = sizeof(m.u64); + if (file->fd > 0) { + fds[fd_num++] = file->fd; + } + break; + default: + error_report("vhost-user trying to send unhandled ioctl\n"); + return -1; + break; + } + + if (vhost_user_write(dev, &msg, fds, fd_num) < 0) { + return 0; + } + + if (need_reply) { + if (vhost_user_read(dev, &msg) < 0) { + return 0; + } + + if (msg_request != msg.request) { + error_report("Received unexpected msg type." + " Expected %d received %d\n", msg_request, msg.request); + return -1; + } + + switch (msg_request) { + case VHOST_USER_GET_FEATURES: + if (msg.size != sizeof(m.u64)) { + error_report("Received bad msg size.\n"); + return -1; + } + *((__u64 *) arg) = msg.u64; + break; + case VHOST_USER_GET_VRING_BASE: + if (msg.size != sizeof(m.state)) { + error_report("Received bad msg size.\n"); + return -1; + } + memcpy(arg, &msg.state, sizeof(struct vhost_vring_state)); + break; + default: + error_report("Received unexpected msg type.\n"); + return -1; + break; + } + } + + return 0; +} + +static int vhost_user_init(struct vhost_dev *dev, void *opaque) +{ + assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER); + + dev->opaque = opaque; + + return 0; +} + +static int vhost_user_cleanup(struct vhost_dev *dev) +{ + assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER); + + dev->opaque = 0; + + return 0; +} + +const VhostOps user_ops = { + .backend_type = VHOST_BACKEND_TYPE_USER, + .vhost_call = vhost_user_call, + .vhost_backend_init = vhost_user_init, + .vhost_backend_cleanup = vhost_user_cleanup + };