Message ID | 1360622997-26904-4-git-send-email-mrhines@linux.vnet.ibm.com |
---|---|
State | New |
On 11/02/2013 23:49, Michael R. Hines wrote:
> From: "Michael R. Hines" <mrhines@us.ibm.com>
>
>
> Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
> ---
>  exec.c |   27 +++++++++++++++++++++++++++
>  vl.c   |   13 +++++++++++++
>  2 files changed, 40 insertions(+)
>
> diff --git a/exec.c b/exec.c
> index b85508b..b7ac6fa 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -25,6 +25,8 @@
>  #endif
>
>  #include "qemu-common.h"
> +#include "qemu/rdma.h"
> +#include "monitor/monitor.h"
>  #include "cpu.h"
>  #include "tcg.h"
>  #include "hw/hw.h"
> @@ -104,6 +106,31 @@ static MemoryRegion io_mem_watch;
>
>  #if !defined(CONFIG_USER_ONLY)
>
> +/*
> + * Memory regions need to be registered with the device and queue pairs setup
> + * in advanced before the migration starts. This tells us where the RAM blocks
> + * are so that we can register them individually.
> + */
> +int rdma_init_ram_blocks(struct rdma_ram_blocks *rdma_ram_blocks)
> +{
> +    RAMBlock *block;
> +    int num_blocks = 0;
> +
> +    memset(rdma_ram_blocks, 0, sizeof *rdma_ram_blocks);
> +    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
> +        if (num_blocks >= RDMA_MAX_RAM_BLOCKS) {
> +            return -1;
> +        }
> +        rdma_ram_blocks->block[num_blocks].local_host_addr = block->host;
> +        rdma_ram_blocks->block[num_blocks].offset = (uint64_t)block->offset;
> +        rdma_ram_blocks->block[num_blocks].length = (uint64_t)block->length;
> +        num_blocks++;
> +    }
> +    rdma_ram_blocks->num_blocks = num_blocks;
> +
> +    return 0;
> +}

Memory regions are not static data, so you have to do this at the time
migration starts.

For the RDMA-impaired among us, why do you need a separate host+port?
Can it be the same by default, and if it is different you can then
specify it like

    rdma://host:port/?rdmahost=HOST&rdmaport=PORT

Paolo
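
The single-URI form Paolo proposes could be parsed roughly as follows. This is only a sketch of the idea, not code from the patch: `struct rdma_uri`, `parse_port()` and `rdma_uri_parse()` are hypothetical names, IPv6 literals are not handled, and the RDMA endpoint simply defaults to the TCP host and port when the query parameters are absent.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parsed endpoints; not a QEMU type. */
struct rdma_uri {
    char host[64];
    int  port;
    char rdmahost[64];
    int  rdmaport;
};

/* Parse a decimal port in 1..65535 from s[0..len); -1 on malformed input. */
static int parse_port(const char *s, size_t len)
{
    long v = 0;
    size_t i;

    if (len == 0) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        if (s[i] < '0' || s[i] > '9') {
            return -1;
        }
        v = v * 10 + (s[i] - '0');
        if (v > 65535) {
            return -1;
        }
    }
    return v >= 1 ? (int)v : -1;
}

/*
 * Parse "rdma://host:port[/][?rdmahost=HOST&rdmaport=PORT]".
 * Returns 0 on success, -1 on error. Unknown query keys are ignored.
 */
static int rdma_uri_parse(const char *uri, struct rdma_uri *out)
{
    static const char scheme[] = "rdma://";
    const char *p, *end, *colon, *q;

    assert(uri != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    if (strncmp(uri, scheme, sizeof scheme - 1) != 0) {
        return -1;
    }
    p = uri + sizeof scheme - 1;
    end = p + strcspn(p, "/?");               /* end of the host:port part */
    colon = memchr(p, ':', (size_t)(end - p));
    if (colon == NULL || colon == p ||
        (size_t)(colon - p) >= sizeof out->host) {
        return -1;
    }
    memcpy(out->host, p, (size_t)(colon - p));
    out->port = parse_port(colon + 1, (size_t)(end - (colon + 1)));
    if (out->port < 0) {
        return -1;
    }

    /* Default: RDMA uses the same endpoint as the TCP connection. */
    memcpy(out->rdmahost, out->host, sizeof out->rdmahost);
    out->rdmaport = out->port;

    q = strchr(end, '?');
    if (q == NULL) {
        return 0;
    }
    q++;
    while (*q != '\0') {
        size_t len = strcspn(q, "&");         /* length of one key=value */

        if (len > 9 && strncmp(q, "rdmahost=", 9) == 0) {
            size_t vlen = len - 9;
            if (vlen >= sizeof out->rdmahost) {
                return -1;
            }
            memcpy(out->rdmahost, q + 9, vlen);
            out->rdmahost[vlen] = '\0';
        } else if (len > 9 && strncmp(q, "rdmaport=", 9) == 0) {
            out->rdmaport = parse_port(q + 9, len - 9);
            if (out->rdmaport < 0) {
                return -1;
            }
        }
        q += len;
        if (*q == '&') {
            q++;
        }
    }
    return 0;
}
```

With this shape, `rdma://host:4444` alone would be enough when both kinds of traffic share an interface, and the two query parameters are only needed when they differ.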
Yes, this is done at migration time (see the functions
"rdma_client_init()" and "rdma_server_prepare()").

To explain the host and port:

The separate host and port are used by the library "librdmacm". This
library translates between the IP address, a unique InfiniBand
user-level port number, and the physical interface that has the RDMA
capabilities. It requires an IP address and port bound specifically to
the requested RDMA interface in order to work.

The patch does not assume that the network interface used for TCP
traffic will necessarily be the same as the interface used for RDMA
traffic.

Alternatively, this host and port could be specified using the QMP
"migrate" command, but that command already has its URI reserved for
the TCP side of things.

If you guys like, we could specify a *second* URI on the QMP command
line - we don't really have a preference.

Either way is fine, whatever the consensus is.

- Michael

On 02/18/2013 05:37 AM, Paolo Bonzini wrote:
> On 11/02/2013 23:49, Michael R. Hines wrote:
>> +/*
>> + * Memory regions need to be registered with the device and queue pairs setup
>> + * in advanced before the migration starts. This tells us where the RAM blocks
>> + * are so that we can register them individually.
>> + */
>> +int rdma_init_ram_blocks(struct rdma_ram_blocks *rdma_ram_blocks)
>> +{
>> +    RAMBlock *block;
>> +    int num_blocks = 0;
>> +
>> +    memset(rdma_ram_blocks, 0, sizeof *rdma_ram_blocks);
>> +    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>> +        if (num_blocks >= RDMA_MAX_RAM_BLOCKS) {
>> +            return -1;
>> +        }
>> +        rdma_ram_blocks->block[num_blocks].local_host_addr = block->host;
>> +        rdma_ram_blocks->block[num_blocks].offset = (uint64_t)block->offset;
>> +        rdma_ram_blocks->block[num_blocks].length = (uint64_t)block->length;
>> +        num_blocks++;
>> +    }
>> +    rdma_ram_blocks->num_blocks = num_blocks;
>> +
>> +    return 0;
>> +}
> Memory regions are not static data, so you have to do this at the time
> migration starts.
>
> For the RDMA-impaired among us, why do you need a separate host+port?
> Can it be the same by default, and if it is different you can then
> specify it like
>
> rdma://host:port/?rdmahost=HOST&rdmaport=PORT
>
> Paolo
>
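
The point both sides agree on here, that the block list has to be captured when migration starts rather than at startup, can be illustrated with a self-contained model of `rdma_init_ram_blocks()`. Everything below is a simplified stand-in: `struct ram_block`, `struct block_table`, `MAX_BLOCKS` and `snapshot_ram_blocks()` are hypothetical names, and a plain singly linked list replaces QEMU's `QTAILQ` so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_BLOCKS 4    /* hypothetical stand-in for RDMA_MAX_RAM_BLOCKS */

/* Simplified stand-in for QEMU's RAMBlock (assumption, not the real type). */
struct ram_block {
    uint8_t *host;
    uint64_t offset;
    uint64_t length;
    struct ram_block *next;
};

/* One entry per block that would later be registered with the device. */
struct block_desc {
    uint8_t *local_host_addr;
    uint64_t offset;
    uint64_t length;
};

struct block_table {
    int num_blocks;
    struct block_desc block[MAX_BLOCKS];
};

/*
 * Walk the current block list and record each block. Call this when
 * migration starts: blocks can be added or removed at runtime, so a
 * table built earlier may be stale. Returns 0 on success, or -1 if the
 * list holds more than MAX_BLOCKS entries (num_blocks is then left at 0).
 */
static int snapshot_ram_blocks(const struct ram_block *head,
                               struct block_table *out)
{
    const struct ram_block *b;
    int n = 0;

    assert(out != NULL);
    memset(out, 0, sizeof *out);
    for (b = head; b != NULL; b = b->next) {
        if (n >= MAX_BLOCKS) {
            return -1;
        }
        out->block[n].local_host_addr = b->host;
        out->block[n].offset = b->offset;
        out->block[n].length = b->length;
        n++;
    }
    out->num_blocks = n;
    return 0;
}
```

A table taken before a block is hot-added would miss that block, which is why the snapshot belongs in the migration setup path rather than in early initialization.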
On 19/02/2013 07:00, Michael R. Hines wrote:
> Yes, this is done at migration time (see functions "rdma_client_init"
> and "rdma_server_prepare()")
>
> To explain the host and port:
>
> The separate host and port are used by the library "librdmacm". This
> library performs a network translation between the IP address and a
> unique infiniband user-level Port number and the physical interface that
> has the RDMA capabilities. This library requires an IP address and port
> bound specifically to the requested RDMA interface to work.
>
> The patch does not assume that the network interface used for TCP
> traffic will necessarily be the same as the interface used for RDMA
> traffic.

Of course the best thing to do would be to have all traffic on the RDMA
interface... :)

Paolo

> Alternatively, this host and port could be specified using the QMP
> "migrate" command, but this command already has the URI for the TCP side
> of things reserved.
>
> If you guys like, we could specify a *second* URI on the QMP command
> line - we don't really have a preference.
>
> Either way is fine........ whatever the consensus is.
>
> - Michael
On Tue, Feb 19, 2013 at 09:42:45AM +0100, Paolo Bonzini wrote:
> On 19/02/2013 07:00, Michael R. Hines wrote:
> > Yes, this is done at migration time (see functions "rdma_client_init"
> > and "rdma_server_prepare()")
> >
> > To explain the host and port:
> >
> > The separate host and port are used by the library "librdmacm". This
> > library performs a network translation between the IP address and a
> > unique infiniband user-level Port number and the physical interface that
> > has the RDMA capabilities. This library requires an IP address and port
> > bound specifically to the requested RDMA interface to work.
> >
> > The patch does not assume that the network interface used for TCP
> > traffic will necessarily be the same as the interface used for RDMA
> > traffic.
>
> Of course the best thing to do would be to have all traffic on the RDMA
> interface... :)
>
> Paolo

You can't do this with InfiniBand: RDMA is only possible once the
connection is established.

> > Alternatively, this host and port could be specified using the QMP
> > "migrate" command, but this command already has the URI for the TCP side
> > of things reserved.
> >
> > If you guys like, we could specify a *second* URI on the QMP command
> > line - we don't really have a preference.
> >
> > Either way is fine........ whatever the consensus is.
> >
> > - Michael
>
> On Tue, Feb 19, 2013 at 09:42:45AM +0100, Paolo Bonzini wrote:
> > On 19/02/2013 07:00, Michael R. Hines wrote:
> > > Yes, this is done at migration time (see functions
> > > "rdma_client_init" and "rdma_server_prepare()")
> > >
> > > To explain the host and port:
> > >
> > > The separate host and port are used by the library "librdmacm". This
> > > library performs a network translation between the IP address and a
> > > unique infiniband user-level Port number and the physical
> > > interface that has the RDMA capabilities. This library requires an
> > > IP address and port bound specifically to the requested RDMA interface
> > > to work.
> > >
> > > The patch does not assume that the network interface used for TCP
> > > traffic will necessarily be the same as the interface used for
> > > RDMA traffic.
> >
> > Of course the best thing to do would be to have all traffic on the
> > RDMA interface... :)
>
> You can't do this with infiniband, RDMA is only possible once the
> connection is established.

Sorry, I meant on the InfiniBand interface. Right now Michael (Hines)'s
code needs two sockets, one for TCP and one for RDMA. If I understand
correctly, the rdmacm library does not need a separate address to set
up the connection; that is just an artifact of the implementation.
Whatever goes on in the TCP socket can be done over RDMA after
establishing the connection, or can be done with SEND.

Paolo

> > > Alternatively, this host and port could be specified using the
> > > QMP "migrate" command, but this command already has the URI for the
> > > TCP side of things reserved.
> > >
> > > If you guys like, we could specify a *second* URI on the QMP
> > > command line - we don't really have a preference.
> > >
> > > Either way is fine........ whatever the consensus is.
> > >
> > > - Michael
> >
diff --git a/exec.c b/exec.c
index b85508b..b7ac6fa 100644
--- a/exec.c
+++ b/exec.c
@@ -25,6 +25,8 @@
 #endif
 
 #include "qemu-common.h"
+#include "qemu/rdma.h"
+#include "monitor/monitor.h"
 #include "cpu.h"
 #include "tcg.h"
 #include "hw/hw.h"
@@ -104,6 +106,31 @@ static MemoryRegion io_mem_watch;
 
 #if !defined(CONFIG_USER_ONLY)
 
+/*
+ * Memory regions need to be registered with the device and queue pairs setup
+ * in advanced before the migration starts. This tells us where the RAM blocks
+ * are so that we can register them individually.
+ */
+int rdma_init_ram_blocks(struct rdma_ram_blocks *rdma_ram_blocks)
+{
+    RAMBlock *block;
+    int num_blocks = 0;
+
+    memset(rdma_ram_blocks, 0, sizeof *rdma_ram_blocks);
+    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
+        if (num_blocks >= RDMA_MAX_RAM_BLOCKS) {
+            return -1;
+        }
+        rdma_ram_blocks->block[num_blocks].local_host_addr = block->host;
+        rdma_ram_blocks->block[num_blocks].offset = (uint64_t)block->offset;
+        rdma_ram_blocks->block[num_blocks].length = (uint64_t)block->length;
+        num_blocks++;
+    }
+    rdma_ram_blocks->num_blocks = num_blocks;
+
+    return 0;
+}
+
 static void phys_map_node_reserve(unsigned nodes)
 {
     if (phys_map_nodes_nb + nodes > phys_map_nodes_nb_alloc) {
diff --git a/vl.c b/vl.c
index 7aab73b..170d209 100644
--- a/vl.c
+++ b/vl.c
@@ -29,6 +29,7 @@
 #include <sys/time.h>
 #include <zlib.h>
 #include "qemu/bitmap.h"
+#include "qemu/rdma.h"
 
 /* Needed early for CONFIG_BSD etc. */
 #include "config-host.h"
@@ -233,6 +234,9 @@ int boot_menu;
 uint8_t *boot_splash_filedata;
 size_t boot_splash_filedata_size;
 uint8_t qemu_extra_params_fw[2];
+int rdmaport = -1;
+char rdmahost[64] = "";
+struct rdma_data rdma_mdata;
 
 typedef struct FWBootEntry FWBootEntry;
 
@@ -3622,6 +3626,13 @@ int main(int argc, char **argv, char **envp)
                 default_sdcard = 0;
                 default_vga = 0;
                 break;
+            case QEMU_OPTION_rdmaport:
+                rdmaport = atoi(optarg);
+                break;
+            case QEMU_OPTION_rdmahost:
+                strncpy(rdmahost, optarg, 64);
+                rdmahost[63] = '\0';
+                break;
             case QEMU_OPTION_xen_domid:
                 if (!(xen_available())) {
                     printf("Option %s not supported for this target\n", popt->name);
@@ -3725,6 +3736,8 @@ int main(int argc, char **argv, char **envp)
     }
     loc_set_none();
 
+    rdma_data_init(&rdma_mdata);
+
     if (qemu_init_main_loop()) {
         fprintf(stderr, "qemu_init_main_loop failed\n");
         exit(1);
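
As a side note on the option handling in the vl.c hunk: `atoi()` gives no way to detect a malformed port (it yields 0 when no digits can be converted), and `strncpy()` followed by forced termination silently truncates an overlong host name. A stricter variant might look like the sketch below. The helper names `parse_rdma_port()` and `copy_rdma_host()` are hypothetical and not part of the patch; the 1..65535 range is an assumption about what a valid port is.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Convert arg to a port number in 1..65535.
 * Returns 0 and stores the port on success, -1 on any malformed input
 * (empty string, trailing garbage, out-of-range value).
 */
static int parse_rdma_port(const char *arg, int *port)
{
    char *end;
    long v;

    assert(arg != NULL && port != NULL);
    errno = 0;
    v = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0' || v < 1 || v > 65535) {
        return -1;
    }
    *port = (int)v;
    return 0;
}

/*
 * Copy arg into dst, rejecting (rather than truncating) names that do
 * not fit, so a typo cannot silently resolve to a different host.
 * Returns 0 on success, -1 if arg is empty or too long for dst.
 */
static int copy_rdma_host(char *dst, size_t dst_size, const char *arg)
{
    size_t len;

    assert(dst != NULL && arg != NULL);
    len = strlen(arg);
    if (len == 0 || len >= dst_size) {
        return -1;
    }
    memcpy(dst, arg, len + 1);    /* includes the terminating NUL */
    return 0;
}
```

In the option switch, each helper's return value could then be checked and an error reported in the same way the neighbouring cases do, instead of continuing with a port of 0 or a truncated host name.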