[v2] vsock.7: document VSOCK socket address family

Message ID 20171205105618.30049-1-stefanha@redhat.com
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Stefan Hajnoczi Dec. 5, 2017, 10:56 a.m. UTC
The AF_VSOCK address family has been available since Linux 3.9 without a
corresponding man page.

This patch adds vsock.7 and describes its use along the same lines as
existing ip.7, unix.7, and netlink.7 man pages.

CC: Jorgen Hansen <jhansen@vmware.com>
CC: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 man7/vsock.7 | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 180 insertions(+)
 create mode 100644 man7/vsock.7

Comments

Jorgen Hansen Dec. 6, 2017, 2:06 p.m. UTC | #1
Looks great to me. Thanks for doing this. I don’t have anything to add.

Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Michael Kerrisk (man-pages) Dec. 11, 2017, 7:32 p.m. UTC | #2
Hello Stefan,

Thanks for this page!

I have applied your patch, and made a few tweaks, but
I have some minor questions. Please see below.

On 12/05/2017 11:56 AM, Stefan Hajnoczi wrote:
> The AF_VSOCK address family has been available since Linux 3.9 without a
> corresponding man page.
> 
> This patch adds vsock.7 and describes its use along the same lines as
> existing ip.7, unix.7, and netlink.7 man pages.
> 
> CC: Jorgen Hansen <jhansen@vmware.com>
> CC: Dexuan Cui <decui@microsoft.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  man7/vsock.7 | 180 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 180 insertions(+)
>  create mode 100644 man7/vsock.7
> 
> diff --git a/man7/vsock.7 b/man7/vsock.7
> new file mode 100644
> index 000000000..46dc561f5
> --- /dev/null
> +++ b/man7/vsock.7
> @@ -0,0 +1,180 @@
> +.TH VSOCK 7 2017-11-30 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +vsock \- Linux VSOCK address family
> +.SH SYNOPSIS
> +.B #include <sys/socket.h>
> +.br
> +.B #include <linux/vm_sockets.h>
> +.PP
> +.IB stream_socket " = socket(AF_VSOCK, SOCK_STREAM, 0);"
> +.br
> +.IB datagram_socket " = socket(AF_VSOCK, SOCK_DGRAM, 0);"
> +.SH DESCRIPTION
> +The VSOCK address family facilitates communication between virtual machines and
> +the host they are running on.  This address family is used by guest agents and
> +hypervisor services that need a communications channel that is independent of
> +virtual machine network configuration.
> +.PP
> +Valid socket types are
> +.B SOCK_STREAM
> +and
> +.BR SOCK_DGRAM .
> +.B SOCK_STREAM
> +provides connection-oriented byte streams with guaranteed, in-order delivery.
> +.B SOCK_DGRAM
> +provides a connectionless datagram packet service with best-effort delivery and
> +best-effort ordering.  Availability of these socket types is dependent on the
> +underlying hypervisor.
> +.PP
> +A new socket is created with
> +.PP
> +    socket(AF_VSOCK, socket_type, 0);
> +.PP
> +When a process wants to establish a connection it calls
> +.BR connect (2)
> +with a given destination socket address.  The socket is automatically bound to
> +a free port if unbound.
> +.PP
> +A process can listen for incoming connections by first binding to a socket
> +address using
> +.BR bind (2)
> +and then calling
> +.BR listen (2).
> +.PP
> +Data is transferred using the usual
> +.BR send (2)
> +and
> +.BR recv (2)

Or equally, write(2) and read(2), right? By failing to mention those, the
text subtly implies that send(2) and recv(2) are preferred, but I don't
suppose that is true.

> +family of socket system calls.
> +.SS Address format
> +A socket address is defined as a combination of a 32-bit Context Identifier
> +(CID) and a 32-bit port number.  The CID identifies the source or destination,
> +which is either a virtual machine or the host.  The port number differentiates
> +between multiple services running on a single machine.
> +.PP
> +.in +4n
> +.EX
> +struct sockaddr_vm {
> +    sa_family_t     svm_family;     /* address family: AF_VSOCK */
> +    unsigned short  svm_reserved1;
> +    unsigned int    svm_port;       /* port in native byte order */
> +    unsigned int    svm_cid;        /* address in native byte order */
> +};
> +.EE
> +.in
> +.PP
> +.I svm_family
> +is always set to
> +.BR AF_VSOCK .
> +.I svm_reserved1
> +is always set to 0.
> +.I svm_port
> +contains the port in native byte order.
> +The port numbers below 1024 are called
> +.IR "privileged ports" .
> +Only a process with
> +.B CAP_NET_BIND_SERVICE
> +capability may
> +.BR bind (2)
> +to these port numbers.
> +.PP
> +There are several special addresses:
> +.B VMADDR_CID_ANY
> +(-1U)
> +means any address for binding;
> +.B VMADDR_CID_HYPERVISOR
> +(0) is reserved for services built into the hypervisor;
> +.B VMADDR_CID_RESERVED
> +(1) must not be used;
> +.B VMADDR_CID_HOST
> +(2)
> +is the well-known address of the host.
> +.PP
> +The special constant
> +.B VMADDR_PORT_ANY
> +(-1U)
> +means any port number for binding.
> +.SS Live migration
> +Sockets are affected by live migration of virtual machines.  Connected
> +.B SOCK_STREAM
> +sockets become disconnected when the virtual machine migrates to a new host.
> +Applications must reconnect when this happens.
> +.PP
> +The local CID may change across live migration if the old CID is not available
> +on the new host.  Bound sockets are automatically updated to the new CID.
> +.SS Ioctls
> +.TP
> +.B IOCTL_VM_SOCKETS_GET_LOCAL_CID
> +Get the CID of the local machine.  The argument is a pointer to an unsigned int.
> +.IP
> +.in +4n
> +.EX
> +.IB error " = ioctl(" socket ", " IOCTL_VM_SOCKETS_GET_LOCAL_CID ", " &cid ");"
> +.EE
> +.in
> +.IP
> +Consider using
> +.B VMADDR_CID_ANY
> +when binding instead of getting the local CID with
> +.BR IOCTL_VM_SOCKETS_GET_LOCAL_CID .
> +.SH ERRORS
> +.TP
> +.B EACCES
> +Unable to bind to a privileged port without the
> +.B CAP_NET_BIND_SERVICE
> +capability.
> +.TP
> +.B EINVAL
> +Invalid parameters.  This includes:
> +attempting to bind a socket that is already bound, providing an invalid struct
> +.BR sockaddr_vm ,
> +and other input validation errors.
> +.TP
> +.B EOPNOTSUPP
> +Operation not supported.  This includes:
> +the
> +.B MSG_OOB
> +flag that is not implemented for
> +.BR sendmsg (2)
> +and
> +.B MSG_PEEK
> +for
> +.BR recvmsg (2).

So these errors might also occur for send() and recv(), right?

> +.TP
> +.B EADDRINUSE
> +Unable to bind to a port that is already in use.
> +.TP
> +.B EADDRNOTAVAIL
> +Unable to find a free port for binding or unable to bind to a non-local CID.
> +.TP
> +.B ENOTCONN
> +Unable to perform operation on an unconnected socket.
> +.TP
> +.B ENOPROTOOPT
> +Invalid socket option in
> +.BR setsockopt (2)
> +or
> +.BR getsockopt (2).
> +.TP
> +.B EPROTONOSUPPORT
> +Invalid socket protocol number.  Protocol should always be 0.
> +.TP
> +.B ESOCKTNOSUPPORT
> +Unsupported socket type in
> +.BR socket (2).
> +Only
> +.B SOCK_STREAM
> +and
> +.B SOCK_DGRAM
> +are valid.
> +.SH VERSIONS
> +Support for VMware (VMCI) has been available since Linux 3.9.  KVM (virtio) is
> +supported since Linux 4.8.  Hyper-V is supported since 4.14.
> +.SH SEE ALSO
> +.BR socket (2),
> +.BR bind (2),
> +.BR connect (2),
> +.BR listen (2),
> +.BR send (2),
> +.BR recv (2),
> +.BR capabilities (7)

Cheers,

Michael
Michael Kerrisk (man-pages) Dec. 11, 2017, 7:32 p.m. UTC | #3
On 12/06/2017 03:06 PM, Jorgen S. Hansen wrote:
> Looks great to me. Thanks for doing this. I don’t have anything to add.
> 
> Reviewed-by: Jorgen Hansen <jhansen@vmware.com>

Thanks, Jorgen!

Cheers,

Michael
Stefan Hajnoczi Dec. 12, 2017, 9:23 a.m. UTC | #4
On Mon, Dec 11, 2017 at 08:32:20PM +0100, Michael Kerrisk (man-pages) wrote:
> On 12/05/2017 11:56 AM, Stefan Hajnoczi wrote:
> > +Data is transferred using the usual
> > +.BR send (2)
> > +and
> > +.BR recv (2)
> 
> Or equally, write(2) and read(2), right? By failing to mention those, the
> text subtly implies that send(2) and recv(2) are preferred, but I don't
> suppose that is true.
> 
> > +family of socket system calls.

Yes, this file descriptor is a socket so write(2) and read(2) work.

I said "family of socket system calls" to avoid listing all the
variations of send(2), sendmsg(2), sendfile(2), sendmmsg(2), etc but I
guess that doesn't include the read(2)/write(2) family of syscalls
(readv(2)/writev(2)).

Will send a follow-up patch to clarify this.
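
To illustrate the point about read(2) and write(2): the helpers below are a
minimal sketch of plain file-descriptor I/O that should work unchanged on a
connected SOCK_STREAM vsock socket. The names write_all(), read_all() and
roundtrip_demo() are made up for this example, and the demo uses an AF_UNIX
socketpair as a stand-in so that it runs without a hypervisor.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly len bytes to fd, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error with errno set by write(2). */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes from fd, retrying on short reads and EINTR.
 * Returns 0 on success, -1 on error or if the peer closed early. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;          /* end of stream before len bytes arrived */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Stand-in demo (assumption: an AF_UNIX stream pair replaces a connected
 * AF_VSOCK socket). Returns 0 if four bytes make the round trip intact. */
static int roundtrip_demo(void)
{
    int sv[2];
    char buf[4];
    int ok;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    ok = write_all(sv[0], "ping", 4) == 0 &&
         read_all(sv[1], buf, sizeof(buf)) == 0 &&
         buf[0] == 'p' && buf[1] == 'i' && buf[2] == 'n' && buf[3] == 'g';
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

On a real vsock connection the same two helpers would simply be passed the
descriptor returned by socket(AF_VSOCK, SOCK_STREAM, 0) after connect(2).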

> > +.B EOPNOTSUPP
> > +Operation not supported.  This includes:
> > +the
> > +.B MSG_OOB
> > +flag that is not implemented for
> > +.BR sendmsg (2)
> > +and
> > +.B MSG_PEEK
> > +for
> > +.BR recvmsg (2).
> 
> So these errors might also occur for send() and recv(), right?

Yes, I'll change this to "the send(2) family of syscalls" and "recv(2)
family of syscalls", respectively.
Patch

diff --git a/man7/vsock.7 b/man7/vsock.7
new file mode 100644
index 000000000..46dc561f5
--- /dev/null
+++ b/man7/vsock.7
@@ -0,0 +1,180 @@ 
+.TH VSOCK 7 2017-11-30 "Linux" "Linux Programmer's Manual"
+.SH NAME
+vsock \- Linux VSOCK address family
+.SH SYNOPSIS
+.B #include <sys/socket.h>
+.br
+.B #include <linux/vm_sockets.h>
+.PP
+.IB stream_socket " = socket(AF_VSOCK, SOCK_STREAM, 0);"
+.br
+.IB datagram_socket " = socket(AF_VSOCK, SOCK_DGRAM, 0);"
+.SH DESCRIPTION
+The VSOCK address family facilitates communication between virtual machines and
+the host they are running on.  This address family is used by guest agents and
+hypervisor services that need a communications channel that is independent of
+virtual machine network configuration.
+.PP
+Valid socket types are
+.B SOCK_STREAM
+and
+.BR SOCK_DGRAM .
+.B SOCK_STREAM
+provides connection-oriented byte streams with guaranteed, in-order delivery.
+.B SOCK_DGRAM
+provides a connectionless datagram packet service with best-effort delivery and
+best-effort ordering.  Availability of these socket types is dependent on the
+underlying hypervisor.
+.PP
+A new socket is created with
+.PP
+    socket(AF_VSOCK, socket_type, 0);
+.PP
+When a process wants to establish a connection it calls
+.BR connect (2)
+with a given destination socket address.  The socket is automatically bound to
+a free port if unbound.
+.PP
+A process can listen for incoming connections by first binding to a socket
+address using
+.BR bind (2)
+and then calling
+.BR listen (2).
+.PP
+Data is transferred using the usual
+.BR send (2)
+and
+.BR recv (2)
+family of socket system calls.
+.SS Address format
+A socket address is defined as a combination of a 32-bit Context Identifier
+(CID) and a 32-bit port number.  The CID identifies the source or destination,
+which is either a virtual machine or the host.  The port number differentiates
+between multiple services running on a single machine.
+.PP
+.in +4n
+.EX
+struct sockaddr_vm {
+    sa_family_t     svm_family;     /* address family: AF_VSOCK */
+    unsigned short  svm_reserved1;
+    unsigned int    svm_port;       /* port in native byte order */
+    unsigned int    svm_cid;        /* address in native byte order */
+};
+.EE
+.in
+.PP
+.I svm_family
+is always set to
+.BR AF_VSOCK .
+.I svm_reserved1
+is always set to 0.
+.I svm_port
+contains the port in native byte order.
+The port numbers below 1024 are called
+.IR "privileged ports" .
+Only a process with
+.B CAP_NET_BIND_SERVICE
+capability may
+.BR bind (2)
+to these port numbers.
+.PP
+There are several special addresses:
+.B VMADDR_CID_ANY
+(-1U)
+means any address for binding;
+.B VMADDR_CID_HYPERVISOR
+(0) is reserved for services built into the hypervisor;
+.B VMADDR_CID_RESERVED
+(1) must not be used;
+.B VMADDR_CID_HOST
+(2)
+is the well-known address of the host.
+.PP
+The special constant
+.B VMADDR_PORT_ANY
+(-1U)
+means any port number for binding.
+.SS Live migration
+Sockets are affected by live migration of virtual machines.  Connected
+.B SOCK_STREAM
+sockets become disconnected when the virtual machine migrates to a new host.
+Applications must reconnect when this happens.
+.PP
+The local CID may change across live migration if the old CID is not available
+on the new host.  Bound sockets are automatically updated to the new CID.
+.SS Ioctls
+.TP
+.B IOCTL_VM_SOCKETS_GET_LOCAL_CID
+Get the CID of the local machine.  The argument is a pointer to an unsigned int.
+.IP
+.in +4n
+.EX
+.IB error " = ioctl(" socket ", " IOCTL_VM_SOCKETS_GET_LOCAL_CID ", " &cid ");"
+.EE
+.in
+.IP
+Consider using
+.B VMADDR_CID_ANY
+when binding instead of getting the local CID with
+.BR IOCTL_VM_SOCKETS_GET_LOCAL_CID .
+.SH ERRORS
+.TP
+.B EACCES
+Unable to bind to a privileged port without the
+.B CAP_NET_BIND_SERVICE
+capability.
+.TP
+.B EINVAL
+Invalid parameters.  This includes:
+attempting to bind a socket that is already bound, providing an invalid struct
+.BR sockaddr_vm ,
+and other input validation errors.
+.TP
+.B EOPNOTSUPP
+Operation not supported.  This includes:
+the
+.B MSG_OOB
+flag that is not implemented for
+.BR sendmsg (2)
+and
+.B MSG_PEEK
+for
+.BR recvmsg (2).
+.TP
+.B EADDRINUSE
+Unable to bind to a port that is already in use.
+.TP
+.B EADDRNOTAVAIL
+Unable to find a free port for binding or unable to bind to a non-local CID.
+.TP
+.B ENOTCONN
+Unable to perform operation on an unconnected socket.
+.TP
+.B ENOPROTOOPT
+Invalid socket option in
+.BR setsockopt (2)
+or
+.BR getsockopt (2).
+.TP
+.B EPROTONOSUPPORT
+Invalid socket protocol number.  Protocol should always be 0.
+.TP
+.B ESOCKTNOSUPPORT
+Unsupported socket type in
+.BR socket (2).
+Only
+.B SOCK_STREAM
+and
+.B SOCK_DGRAM
+are valid.
+.SH VERSIONS
+Support for VMware (VMCI) has been available since Linux 3.9.  KVM (virtio) is
+supported since Linux 4.8.  Hyper-V is supported since Linux 4.14.
+.SH SEE ALSO
+.BR socket (2),
+.BR bind (2),
+.BR connect (2),
+.BR listen (2),
+.BR send (2),
+.BR recv (2),
+.BR capabilities (7)
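
As a rough illustration of the address format and the bind(2)/listen(2)
sequence the page describes, here is a minimal sketch, not part of the patch.
The helper names vsock_addr_init() and vsock_listen() are hypothetical; the
structure, constants and header are the ones named in the page. Whether the
socket can actually be created depends on the kernel and hypervisor, so the
sketch only reports failure through errno.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/vm_sockets.h>

/* Fill in a vsock address. CID and port are stored in native byte order,
 * so no htonl()-style conversion is applied. */
static void vsock_addr_init(struct sockaddr_vm *addr,
                            unsigned int cid, unsigned int port)
{
    memset(addr, 0, sizeof(*addr));     /* also zeroes svm_reserved1 */
    addr->svm_family = AF_VSOCK;
    addr->svm_cid = cid;
    addr->svm_port = port;
}

/* Create a stream socket listening on the given port on any local CID.
 * Pass VMADDR_PORT_ANY to let the kernel pick a free port.
 * Returns the file descriptor, or -1 with errno set. */
static int vsock_listen(unsigned int port, int backlog)
{
    struct sockaddr_vm addr;
    int fd, saved_errno;

    fd = socket(AF_VSOCK, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;      /* for example, no vsock support in this kernel */

    vsock_addr_init(&addr, VMADDR_CID_ANY, port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        saved_errno = errno;
        close(fd);
        errno = saved_errno;    /* keep the error from bind/listen */
        return -1;
    }
    return fd;
}
```

Binding to VMADDR_CID_ANY rather than a CID obtained through
IOCTL_VM_SOCKETS_GET_LOCAL_CID follows the recommendation in the Ioctls
section of the page.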