From patchwork Thu Feb 13 12:03:26 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Antonios Motakis X-Patchwork-Id: 319996 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 610762C007C for ; Thu, 13 Feb 2014 23:11:05 +1100 (EST) Received: from localhost ([::1]:45778 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WDv8B-0001TE-5A for incoming@patchwork.ozlabs.org; Thu, 13 Feb 2014 07:11:03 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39899) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WDv1w-0008K9-TX for qemu-devel@nongnu.org; Thu, 13 Feb 2014 07:04:42 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WDv1q-0001Bu-7b for qemu-devel@nongnu.org; Thu, 13 Feb 2014 07:04:36 -0500 Received: from mail-wi0-f174.google.com ([209.85.212.174]:63876) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WDv1p-0001Be-VG for qemu-devel@nongnu.org; Thu, 13 Feb 2014 07:04:30 -0500 Received: by mail-wi0-f174.google.com with SMTP id f8so8468741wiw.7 for ; Thu, 13 Feb 2014 04:04:29 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=s6lhsXPAEpopElsk+9V3QH7oNfPmWVUcQk12WYPLgaU=; b=KD2aVa2sL1LkZ8+yEdHyZAv0z5p2+C9pUOpKRZAEb+DoTa+XvdR3MrQCIt2xopSMdA rCLKoQHlZIKnT8tn6LaJTYejlcet434W4KZ6+5jdoZ9+4CqAge6bbh+HOudOg6aXzJsh p/7zZVVIxC0hUSQFTkhkl5tSck2CPKxENvabD/BRAmtCI9sNSBbKuHLRBZKmE32ngJ3Y tLhekYU1ch+Zt3GI6PEwqP5q+tR1C9lODzCaTRqxcmFeJzMHLcvtLBFgXinr7gRkA5ym 6pUR7fp17U/rg6MUm19rMnNhZhysiNV1UYxvAa/LKkFTud9AiVBB6z3j6FVxHlUd1ROF enGA== X-Gm-Message-State: ALoCoQk13zU44RuwtE36achgCzvuGOlaoNJr5YlySflIWJCEzFgv00beaPOg8c13jvS+V7JZuwd8 X-Received: by 10.180.13.33 with SMTP id e1mr6189166wic.38.1392293069271; Thu, 13 Feb 2014 04:04:29 -0800 (PST) Received: from localhost.localdomain (home.tvelocity.eu. [82.67.68.96]) by mx.google.com with ESMTPSA id bj3sm4039419wjb.14.2014.02.13.04.04.27 for (version=TLSv1.1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 13 Feb 2014 04:04:28 -0800 (PST) From: Antonios Motakis To: qemu-devel@nongnu.org, snabb-devel@googlegroups.com Date: Thu, 13 Feb 2014 13:03:26 +0100 Message-Id: <1392293009-13812-16-git-send-email-a.motakis@virtualopensystems.com> X-Mailer: git-send-email 1.8.3.2 In-Reply-To: <1392293009-13812-1-git-send-email-a.motakis@virtualopensystems.com> References: <1392293009-13812-1-git-send-email-a.motakis@virtualopensystems.com> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.85.212.174 Cc: lukego@gmail.com, Antonios Motakis , tech@virtualopensystems.com, n.nikolaev@virtualopensystems.com, mst@redhat.com Subject: [Qemu-devel] [PATCH v8 15/17] Add vhost-user protocol documentation X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org This document describes the basic message format used by vhost-user for communication over a unix domain socket. The protocol is based on the existing ioctl interface used for the kernel version of vhost. Signed-off-by: Antonios Motakis Signed-off-by: Nikolay Nikolaev --- docs/specs/vhost-user.txt | 249 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 249 insertions(+) create mode 100644 docs/specs/vhost-user.txt diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt new file mode 100644 index 0000000..236ec45 --- /dev/null +++ b/docs/specs/vhost-user.txt @@ -0,0 +1,249 @@ +Vhost-user Protocol +=================== + +This protocol is aiming to complement the ioctl interface used to control the +vhost implementation in the Linux kernel. It implements the control plane needed +to establish virtqueue sharing with a user space process on the same host. It +uses communication over a Unix domain socket to share file descriptors in the +ancillary data of the message. + +The protocol defines 2 sides of the communication, master and slave. Master is +the application that shares it's virtqueues, in our case QEMU. Slave is the +consumer of the virtqueues. + +In the current implementation QEMU is the Master, and the Slave is intended to +be a software ethernet switch running in user space, such as Snabbswitch. + +Master and slave can be either a client (i.e. connecting) or server (listening) +in the socket communication. + +Message Specification +--------------------- + +Note that all numbers are in the machine native byte order. A vhost-user message +consists of 3 header fields and a payload: + +------------------------------------ +| request | flags | size | payload | +------------------------------------ + + * Request: 32-bit type of the request + * Flags: 32-bit bit field: + - Lower 2 bits are the version (currently 0x01) + - Bit 2 is the reply flag - needs to be sent on each reply from the slave + * Size - 32-bit size of the payload + + +Depending on the request type, payload can be: + + * A single 64-bit integer + ------- + | u64 | + ------- + + u64: a 64-bit unsigned integer + + * A vring state description + --------------- + | index | num | + --------------- + + Index: a 32-bit index + Num: a 32-bit number + + * A vring address description + -------------------------------------------------------------- + | index | flags | size | descriptor | used | available | log | + -------------------------------------------------------------- + + Index: a 32-bit vring index + Flags: a 32-bit vring flags + Descriptor: a 64-bit user address of the vring descriptor table + Used: a 64-bit user address of the vring used ring + Available: a 64-bit user address of the vring available ring + Log: a 64-bit guest address for logging + + * Memory regions description + --------------------------------------------------- + | num regions | padding | region0 | ... | region7 | + --------------------------------------------------- + + Num regions: a 32-bit number of regions + Padding: 32-bit + + A region is: + --------------------------------------- + | guest address | size | user address | + --------------------------------------- + + Guest address: a 64-bit guest address of the region + Size: a 64-bit size + User address: a 64-bit user address + + +In QEMU the vhost-user message is implemented with the following struct: + +typedef struct VhostUserMsg { + VhostUserRequest request; + uint32_t flags; + uint32_t size; + union { + uint64_t u64; + struct vhost_vring_state state; + struct vhost_vring_addr addr; + VhostUserMemory memory; + }; +} QEMU_PACKED VhostUserMsg; + +Communication +------------- + +The protocol for vhost-user is based on the existing implementation of vhost +for the Linux Kernel. Most messages that can be send via the Unix domain socket +implementing vhost-user have an equivalent ioctl to the kernel implementation. + +The communication consists of master sending message requests and slave sending +message replies. Most of the requests don't require replies. Here is a list of +the ones that do: + + * VHOST_GET_FEATURES + * VHOST_GET_VRING_BASE + +There are several messages that the master sends with file descriptors passed +in the ancillary data: + + * VHOST_SET_MEM_TABLE + * VHOST_SET_LOG_FD + * VHOST_SET_VRING_KICK + * VHOST_SET_VRING_CALL + * VHOST_SET_VRING_ERR + +If Master is unable to send the full message or receives a wrong reply it will +close the connection. An optional reconnection mechanism can be implemented. + +Message types +------------- + + * VHOST_USER_GET_FEATURES + + Id: 2 + Equivalent ioctl: VHOST_GET_FEATURES + Master payload: N/A + Slave payload: u64 + + Get from the underlying vhost implementation the features bitmask. + + * VHOST_USER_SET_FEATURES + + Id: 3 + Ioctl: VHOST_SET_FEATURES + Master payload: u64 + + Enable features in the underlying vhost implementation using a bitmask. + + * VHOST_USER_SET_OWNER + + Id: 4 + Equivalent ioctl: VHOST_SET_OWNER + Master payload: N/A + + Issued when a new connection is established. It sets the current Master + as an owner of the session. This can be used on the Slave as a + "session start" flag. + + * VHOST_USER_RESET_OWNER + + Id: 5 + Equivalent ioctl: VHOST_RESET_OWNER + Master payload: N/A + + Issued when a new connection is about to be closed. The Master will no + longer own this connection (and will usually close it). + + * VHOST_USER_SET_MEM_TABLE + + Id: 6 + Equivalent ioctl: VHOST_SET_MEM_TABLE + Master payload: memory regions description + + Sets the memory map regions on the slave so it can translate the vring + addresses. In the ancillary data there is an array of file descriptors + for each memory mapped region. The size and ordering of the fds matches + the number and ordering of memory regions. + + * VHOST_USER_SET_LOG_BASE + + Id: 7 + Equivalent ioctl: VHOST_SET_LOG_BASE + Master payload: u64 + + Sets the logging base address. + + * VHOST_USER_SET_LOG_FD + + Id: 8 + Equivalent ioctl: VHOST_SET_LOG_FD + Master payload: N/A + + Sets the logging file descriptor, which is passed as ancillary data. + + * VHOST_USER_SET_VRING_NUM + + Id: 9 + Equivalent ioctl: VHOST_SET_VRING_NUM + Master payload: vring state description + + Sets the number of vrings for this owner. + + * VHOST_USER_SET_VRING_ADDR + + Id: 10 + Equivalent ioctl: VHOST_SET_VRING_ADDR + Master payload: vring address description + Slave payload: N/A + + Sets the addresses of the different aspects of the vring. + + * VHOST_USER_SET_VRING_BASE + + Id: 11 + Equivalent ioctl: VHOST_SET_VRING_BASE + Master payload: vring state description + + Sets the base address where the available descriptors are. + + * VHOST_USER_GET_VRING_BASE + + Id: 12 + Equivalent ioctl: VHOST_USER_GET_VRING_BASE + Master payload: vring state description + Slave payload: vring state description + + Get the vring base address. + + * VHOST_USER_SET_VRING_KICK + + Id: 13 + Equivalent ioctl: VHOST_SET_VRING_KICK + Master payload: N/A + + Set the event file descriptor for adding buffers to the vring. It + is passed in the ancillary data. + + * VHOST_USER_SET_VRING_CALL + + Id: 14 + Equivalent ioctl: VHOST_SET_VRING_CALL + Master payload: N/A + + Set the event file descriptor to signal when buffers are used. It + is passed in the ancillary data. + + * VHOST_USER_SET_VRING_ERR + + Id: 15 + Equivalent ioctl: VHOST_SET_VRING_ERR + Master payload: N/A + + Set the event file descriptor to signal when error occurs. It + is passed in the ancillary data.