Message ID | 20190720012850.14369-3-aik@ozlabs.ru |
---|---|
State | New |
Series | spapr: Kexec style boot |
On Sat, Jul 20, 2019 at 11:28:48AM +1000, Alexey Kardashevskiy wrote:
> The pseries kernel can do either usual prom-init boot or kexec style boot.
> We always did the prom-init which relies on the completeness of
> the device tree (for example, PCI BARs have to be assigned beforehand) and
> the client interface; the system firmware (SLOF) implements this.
>
> However we can use the kexec style boot as well. To do that, we can skip
> loading SLOF and jump straight to the kernel. GPR5==0 (the client
> interface entry point, SLOF passes a valid pointer there) tells Linux to
> do the kexec boot rather than prom_init so it can proceed to the initramfs.
> With few PCI fixes in the guest kernel, it can boot from PCI (via
> petitboot, for example).
>
> This adds a "bios" machine option which controls whether QEMU loads SLOF
> or jumps directly to the kernel. When bios==off, this does not copy SLOF
> and RTAS into the guest RAM and sets RTAS properties to 0 to bypass
> the kexec user space tool which checks for their presence (not for
> the values though).

BIOS is sometimes used to refer to any machine's firmware, but it's
also used to refer specifically to PC style BIOS. I think it would be
clearer to be explicit here and call the options "slof" rather than
"bios".

>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  include/hw/ppc/spapr.h |  1 +
>  hw/ppc/spapr.c         | 58 ++++++++++++++++++++++++++++++++----------
>  2 files changed, 45 insertions(+), 14 deletions(-)
>
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index ff82bb8554e1..7f5d7a70d27e 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -160,6 +160,7 @@ struct SpaprMachineState {
>      long kernel_size;
>      bool kernel_le;
>      uint64_t kernel_addr;
> +    bool bios_enabled;
>      uint32_t initrd_base;
>      long initrd_size;
>      uint64_t rtc_offset; /* Now used only during incoming migration */
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 6d13d65d8996..81ad6a6f28de 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1116,6 +1116,10 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt)
>      _FDT(fdt_setprop(fdt, rtas, "ibm,lrdr-capacity",
>                       lrdr_capacity, sizeof(lrdr_capacity)));
>
> +    /* These are to make kexec-lite happy */
> +    _FDT(fdt_setprop_cell(fdt, rtas, "linux,rtas-base", 0));
> +    _FDT(fdt_setprop_cell(fdt, rtas, "rtas-size", 0));

What exactly is kexec-lite and what does it need here?

>      spapr_dt_rtas_tokens(fdt, rtas);
>  }
>
> @@ -1814,7 +1818,11 @@ static void spapr_machine_reset(MachineState *machine)
>      spapr->fdt_blob = fdt;
>
>      /* Set up the entry state */
> -    spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
> +    if (!spapr->bios_enabled) {
> +        spapr_cpu_set_entry_state(first_ppc_cpu, spapr->kernel_addr, fdt_addr);
> +    } else {
> +        spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
> +    }
>      first_ppc_cpu->env.gpr[5] = 0;
>
>      spapr->cas_reboot = false;
> @@ -3031,20 +3039,22 @@ static void spapr_machine_init(MachineState *machine)
>          }
>      }
>
> -    if (bios_name == NULL) {
> -        bios_name = FW_FILE_NAME;
> +    if (spapr->bios_enabled) {
> +        if (bios_name == NULL) {
> +            bios_name = FW_FILE_NAME;
> +        }
> +        filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> +        if (!filename) {
> +            error_report("Could not find LPAR firmware '%s'", bios_name);
> +            exit(1);
> +        }
> +        fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
> +        if (fw_size <= 0) {
> +            error_report("Could not load LPAR firmware '%s'", filename);
> +            exit(1);
> +        }
> +        g_free(filename);
>      }
> -    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> -    if (!filename) {
> -        error_report("Could not find LPAR firmware '%s'", bios_name);
> -        exit(1);
> -    }
> -    fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
> -    if (fw_size <= 0) {
> -        error_report("Could not load LPAR firmware '%s'", filename);
> -        exit(1);
> -    }
> -    g_free(filename);
>
>      /* FIXME: Should register things through the MachineState's qdev
>       * interface, this is a legacy from the sPAPREnvironment structure
> @@ -3266,6 +3276,20 @@ static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
>      visit_type_uint64(v, name, (uint64_t *)opaque, errp);
>  }
>
> +static bool spapr_get_bios_enabled(Object *obj, Error **errp)
> +{
> +    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
> +
> +    return spapr->bios_enabled;
> +}
> +
> +static void spapr_set_bios_enabled(Object *obj, bool value, Error **errp)
> +{
> +    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
> +
> +    spapr->bios_enabled = value;
> +}
> +
>  static char *spapr_get_ic_mode(Object *obj, Error **errp)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(obj);
> @@ -3379,6 +3403,12 @@ static void spapr_instance_init(Object *obj)
>                                      " for -kernel is the default",
>                                      NULL);
>      spapr->kernel_addr = KERNEL_LOAD_ADDR;
> +    object_property_add_bool(obj, "bios", spapr_get_bios_enabled,
> +                             spapr_set_bios_enabled, NULL);
> +    object_property_set_description(obj, "bios", "Conrols whether to load bios",
> +                                    NULL);
> +    spapr->bios_enabled = true;
> +
>      /* The machine class defines the default interrupt controller mode */
>      spapr->irq = smc->irq;
>      object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
On 23/07/2019 13:52, David Gibson wrote:
> On Sat, Jul 20, 2019 at 11:28:48AM +1000, Alexey Kardashevskiy wrote:
>> The pseries kernel can do either usual prom-init boot or kexec style boot.
>> We always did the prom-init which relies on the completeness of
>> the device tree (for example, PCI BARs have to be assigned beforehand) and
>> the client interface; the system firmware (SLOF) implements this.
>>
>> However we can use the kexec style boot as well. To do that, we can skip
>> loading SLOF and jump straight to the kernel. GPR5==0 (the client
>> interface entry point, SLOF passes a valid pointer there) tells Linux to
>> do the kexec boot rather than prom_init so it can proceed to the initramfs.
>> With few PCI fixes in the guest kernel, it can boot from PCI (via
>> petitboot, for example).
>>
>> This adds a "bios" machine option which controls whether QEMU loads SLOF
>> or jumps directly to the kernel. When bios==off, this does not copy SLOF
>> and RTAS into the guest RAM and sets RTAS properties to 0 to bypass
>> the kexec user space tool which checks for their presence (not for
>> the values though).
>
> BIOS is sometimes used to refer to any machine's firmware, but it's
> also used to refer specifically to PC style BIOS. I think it would be
> clearer to be explicit here and call the options "slof" rather than
> "bios".

This is a machine option of the "pseries" machine so it did not sound
like PC bios to me, and slof itself lives in pc-bios so it seemed
aligned name for a property like this.

>
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>  include/hw/ppc/spapr.h |  1 +
>>  hw/ppc/spapr.c         | 58 ++++++++++++++++++++++++++++++++----------
>>  2 files changed, 45 insertions(+), 14 deletions(-)
>>
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index ff82bb8554e1..7f5d7a70d27e 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -160,6 +160,7 @@ struct SpaprMachineState {
>>      long kernel_size;
>>      bool kernel_le;
>>      uint64_t kernel_addr;
>> +    bool bios_enabled;
>>      uint32_t initrd_base;
>>      long initrd_size;
>>      uint64_t rtc_offset; /* Now used only during incoming migration */
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 6d13d65d8996..81ad6a6f28de 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -1116,6 +1116,10 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt)
>>      _FDT(fdt_setprop(fdt, rtas, "ibm,lrdr-capacity",
>>                       lrdr_capacity, sizeof(lrdr_capacity)));
>>
>> +    /* These are to make kexec-lite happy */
>> +    _FDT(fdt_setprop_cell(fdt, rtas, "linux,rtas-base", 0));
>> +    _FDT(fdt_setprop_cell(fdt, rtas, "rtas-size", 0));
>
> What exactly is kexec-lite and what does it need here?

This is a leftover which did not help (I think, need to double check).

It is a small kexec used in openpower builds, maintained by Anton:
https://github.com/open-power/kexec-lite

This is a part of the petitboot initramdisk used on bare metal powernv
machines and if the tool detects /rtas in the DT, it insists on these 2
properties, otherwise it does not proceed to reboot("kexec"):

https://github.com/open-power/kexec-lite/blob/master/kexec_memory_map.c#L272

I ended up patching it (hi Anton, please review my "[PATCH kexec]
memory_map: Allow RTAS-less setup") and rebuilding the initramdisk.

>
>>      spapr_dt_rtas_tokens(fdt, rtas);
>>  }
>>
>> @@ -1814,7 +1818,11 @@ static void spapr_machine_reset(MachineState *machine)
>>      spapr->fdt_blob = fdt;
>>
>>      /* Set up the entry state */
>> -    spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
>> +    if (!spapr->bios_enabled) {
>> +        spapr_cpu_set_entry_state(first_ppc_cpu, spapr->kernel_addr, fdt_addr);
>> +    } else {
>> +        spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
>> +    }
>>      first_ppc_cpu->env.gpr[5] = 0;
>>
>>      spapr->cas_reboot = false;
>> @@ -3031,20 +3039,22 @@ static void spapr_machine_init(MachineState *machine)
>>          }
>>      }
>>
>> -    if (bios_name == NULL) {
>> -        bios_name = FW_FILE_NAME;
>> +    if (spapr->bios_enabled) {
>> +        if (bios_name == NULL) {
>> +            bios_name = FW_FILE_NAME;
>> +        }
>> +        filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
>> +        if (!filename) {
>> +            error_report("Could not find LPAR firmware '%s'", bios_name);
>> +            exit(1);
>> +        }
>> +        fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
>> +        if (fw_size <= 0) {
>> +            error_report("Could not load LPAR firmware '%s'", filename);
>> +            exit(1);
>> +        }
>> +        g_free(filename);
>>      }
>> -    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
>> -    if (!filename) {
>> -        error_report("Could not find LPAR firmware '%s'", bios_name);
>> -        exit(1);
>> -    }
>> -    fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
>> -    if (fw_size <= 0) {
>> -        error_report("Could not load LPAR firmware '%s'", filename);
>> -        exit(1);
>> -    }
>> -    g_free(filename);
>>
>>      /* FIXME: Should register things through the MachineState's qdev
>>       * interface, this is a legacy from the sPAPREnvironment structure
>> @@ -3266,6 +3276,20 @@ static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
>>      visit_type_uint64(v, name, (uint64_t *)opaque, errp);
>>  }
>>
>> +static bool spapr_get_bios_enabled(Object *obj, Error **errp)
>> +{
>> +    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
>> +
>> +    return spapr->bios_enabled;
>> +}
>> +
>> +static void spapr_set_bios_enabled(Object *obj, bool value, Error **errp)
>> +{
>> +    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
>> +
>> +    spapr->bios_enabled = value;
>> +}
>> +
>>  static char *spapr_get_ic_mode(Object *obj, Error **errp)
>>  {
>>      SpaprMachineState *spapr = SPAPR_MACHINE(obj);
>> @@ -3379,6 +3403,12 @@ static void spapr_instance_init(Object *obj)
>>                                      " for -kernel is the default",
>>                                      NULL);
>>      spapr->kernel_addr = KERNEL_LOAD_ADDR;
>> +    object_property_add_bool(obj, "bios", spapr_get_bios_enabled,
>> +                             spapr_set_bios_enabled, NULL);
>> +    object_property_set_description(obj, "bios", "Conrols whether to load bios",
>> +                                    NULL);
>> +    spapr->bios_enabled = true;
>> +
>>      /* The machine class defines the default interrupt controller mode */
>>      spapr->irq = smc->irq;
>>      object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
>
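The check described in the reply above (if /rtas is present, both "linux,rtas-base" and "rtas-size" must exist, and their values are not inspected) can be sketched roughly as follows. This is only an illustration of that rule, not the kexec-lite code: `struct dt_node`, `has_prop()` and `rtas_props_ok()` are hypothetical names, and the node is modelled as a plain list of property names rather than a real flattened device tree.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a device tree node: property names only. */
struct dt_node {
    const char *const *prop_names;
    size_t nprops;
};

/* Return true if the node carries a property with the given name. */
static bool has_prop(const struct dt_node *node, const char *name)
{
    for (size_t i = 0; i < node->nprops; i++) {
        if (strcmp(node->prop_names[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Return true if a kexec could proceed under the rule from the thread:
 * no /rtas node at all (rtas == NULL) is fine; if the node exists, both
 * properties must be present. Only presence is checked, not the values.
 */
static bool rtas_props_ok(const struct dt_node *rtas)
{
    if (rtas == NULL) {
        return true;
    }
    return has_prop(rtas, "linux,rtas-base") && has_prop(rtas, "rtas-size");
}
```

Under this model, setting both properties to 0 (as the patch does) satisfies the check just as well as real values would, which matches the "not for the values though" remark in the commit message.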
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index ff82bb8554e1..7f5d7a70d27e 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -160,6 +160,7 @@ struct SpaprMachineState {
     long kernel_size;
     bool kernel_le;
     uint64_t kernel_addr;
+    bool bios_enabled;
     uint32_t initrd_base;
     long initrd_size;
     uint64_t rtc_offset; /* Now used only during incoming migration */
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6d13d65d8996..81ad6a6f28de 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1116,6 +1116,10 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt)
     _FDT(fdt_setprop(fdt, rtas, "ibm,lrdr-capacity",
                      lrdr_capacity, sizeof(lrdr_capacity)));
 
+    /* These are to make kexec-lite happy */
+    _FDT(fdt_setprop_cell(fdt, rtas, "linux,rtas-base", 0));
+    _FDT(fdt_setprop_cell(fdt, rtas, "rtas-size", 0));
+
     spapr_dt_rtas_tokens(fdt, rtas);
 }
 
@@ -1814,7 +1818,11 @@ static void spapr_machine_reset(MachineState *machine)
     spapr->fdt_blob = fdt;
 
     /* Set up the entry state */
-    spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
+    if (!spapr->bios_enabled) {
+        spapr_cpu_set_entry_state(first_ppc_cpu, spapr->kernel_addr, fdt_addr);
+    } else {
+        spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
+    }
     first_ppc_cpu->env.gpr[5] = 0;
 
     spapr->cas_reboot = false;
@@ -3031,20 +3039,22 @@ static void spapr_machine_init(MachineState *machine)
         }
     }
 
-    if (bios_name == NULL) {
-        bios_name = FW_FILE_NAME;
+    if (spapr->bios_enabled) {
+        if (bios_name == NULL) {
+            bios_name = FW_FILE_NAME;
+        }
+        filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
+        if (!filename) {
+            error_report("Could not find LPAR firmware '%s'", bios_name);
+            exit(1);
+        }
+        fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
+        if (fw_size <= 0) {
+            error_report("Could not load LPAR firmware '%s'", filename);
+            exit(1);
+        }
+        g_free(filename);
     }
-    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
-    if (!filename) {
-        error_report("Could not find LPAR firmware '%s'", bios_name);
-        exit(1);
-    }
-    fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
-    if (fw_size <= 0) {
-        error_report("Could not load LPAR firmware '%s'", filename);
-        exit(1);
-    }
-    g_free(filename);
 
     /* FIXME: Should register things through the MachineState's qdev
      * interface, this is a legacy from the sPAPREnvironment structure
@@ -3266,6 +3276,20 @@ static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
     visit_type_uint64(v, name, (uint64_t *)opaque, errp);
 }
 
+static bool spapr_get_bios_enabled(Object *obj, Error **errp)
+{
+    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
+
+    return spapr->bios_enabled;
+}
+
+static void spapr_set_bios_enabled(Object *obj, bool value, Error **errp)
+{
+    SpaprMachineState *spapr = SPAPR_MACHINE(obj);
+
+    spapr->bios_enabled = value;
+}
+
 static char *spapr_get_ic_mode(Object *obj, Error **errp)
 {
     SpaprMachineState *spapr = SPAPR_MACHINE(obj);
@@ -3379,6 +3403,12 @@ static void spapr_instance_init(Object *obj)
                                     " for -kernel is the default",
                                     NULL);
     spapr->kernel_addr = KERNEL_LOAD_ADDR;
+    object_property_add_bool(obj, "bios", spapr_get_bios_enabled,
+                             spapr_set_bios_enabled, NULL);
+    object_property_set_description(obj, "bios", "Conrols whether to load bios",
+                                    NULL);
+    spapr->bios_enabled = true;
+
     /* The machine class defines the default interrupt controller mode */
     spapr->irq = smc->irq;
     object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,
The pseries kernel can do either usual prom-init boot or kexec style boot.
We always did the prom-init which relies on the completeness of
the device tree (for example, PCI BARs have to be assigned beforehand) and
the client interface; the system firmware (SLOF) implements this.

However we can use the kexec style boot as well. To do that, we can skip
loading SLOF and jump straight to the kernel. GPR5==0 (the client
interface entry point, SLOF passes a valid pointer there) tells Linux to
do the kexec boot rather than prom_init so it can proceed to the initramfs.
With a few PCI fixes in the guest kernel, it can boot from PCI (via
petitboot, for example).

This adds a "bios" machine option which controls whether QEMU loads SLOF
or jumps directly to the kernel. When bios==off, this does not copy SLOF
and RTAS into the guest RAM and sets RTAS properties to 0 to bypass
the kexec user space tool which checks for their presence (not for
the values though).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 include/hw/ppc/spapr.h |  1 +
 hw/ppc/spapr.c         | 58 ++++++++++++++++++++++++++++++++----------
 2 files changed, 45 insertions(+), 14 deletions(-)
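
The boot-path selection described in the commit message can be sketched as a small standalone model. It is an illustration only, not QEMU or kernel code: `struct entry_state`, `choose_entry()` and `kernel_boot_path()` are hypothetical names, and the two `EXAMPLE_*` addresses are placeholder values standing in for SPAPR_ENTRY_POINT and the kernel load address. The first function mirrors the `spapr_machine_reset()` hunk (entry point depends on the "bios" option, GPR5 is zeroed); the second models the guest-side convention that a non-zero GPR5 means a client interface is available.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder addresses for illustration; not taken from the patch. */
#define EXAMPLE_FIRMWARE_ENTRY 0x100ULL
#define EXAMPLE_KERNEL_ADDR    0x400000ULL

enum boot_path { BOOT_PROM_INIT, BOOT_KEXEC_STYLE };

/* Hypothetical summary of the state the first CPU starts with. */
struct entry_state {
    uint64_t nip;      /* address where execution starts */
    uint64_t fdt_addr; /* device tree address handed to the entry point */
    uint64_t gpr5;     /* client interface entry point, 0 if there is none */
};

/* Machine side: start in the firmware if it was loaded, else in the kernel. */
static struct entry_state choose_entry(bool bios_enabled, uint64_t kernel_addr,
                                       uint64_t fdt_addr)
{
    struct entry_state st;

    st.nip = bios_enabled ? EXAMPLE_FIRMWARE_ENTRY : kernel_addr;
    st.fdt_addr = fdt_addr;
    st.gpr5 = 0; /* the machine itself never provides a client interface */
    return st;
}

/* Guest side: a valid client interface pointer selects the prom_init path. */
static enum boot_path kernel_boot_path(uint64_t gpr5)
{
    return gpr5 != 0 ? BOOT_PROM_INIT : BOOT_KEXEC_STYLE;
}
```

With the firmware disabled, the kernel is entered directly with GPR5 still 0, so under this model it takes the kexec-style path; with the firmware enabled, it is the firmware that later passes a non-zero pointer to the kernel.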