Message ID | 20100803161914.15514.59304.stgit@localhost6.localdomain6 |
---|---|
State | New |
Headers | show |
On Tue, Aug 03, 2010 at 10:19:47AM -0600, Alex Williamson wrote: > Several devices rely on their reset() function being called to > initialize device state, e1000 and rtl8139 in particular. When > the device is hot added, the reset doesn't occur, often leaving > the device in an unusable state. Adding a call to reset() after > init() for hotplugged devices puts the device in the expected > state for the guest. > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> I like this one better.
Alex Williamson <alex.williamson@redhat.com> writes: > Several devices rely on their reset() function being called to > initialize device state, e1000 and rtl8139 in particular. When > the device is hot added, the reset doesn't occur, often leaving > the device in an unusable state. Adding a call to reset() after > init() for hotplugged devices puts the device in the expected > state for the guest. > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > --- > > 0.13 candidate? > > hw/qdev.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/hw/qdev.c b/hw/qdev.c > index e99c73f..b156272 100644 > --- a/hw/qdev.c > +++ b/hw/qdev.c > @@ -278,6 +278,9 @@ int qdev_init(DeviceState *dev) > qdev_free(dev); > return rc; > } > + if (dev->hotplugged) { > + qdev_reset(dev); > + } > qemu_register_reset(qdev_reset, dev); > if (dev->info->vmsd) { > vmstate_register_with_alias_id(dev, -1, dev->info->vmsd, dev, qdev_reset() isn't necessary when !dev->hotplugged, because then qemu_system_reset() will run shortly, which will call qdev_reset(). Correct?
On Fri, 2010-08-20 at 11:00 +0200, Markus Armbruster wrote: > Alex Williamson <alex.williamson@redhat.com> writes: > > > Several devices rely on their reset() function being called to > > initialize device state, e1000 and rtl8139 in particular. When > > the device is hot added, the reset doesn't occur, often leaving > > the device in an unusable state. Adding a call to reset() after > > init() for hotplugged devices puts the device in the expected > > state for the guest. > > > > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> > > --- > > > > 0.13 candidate? > > > > hw/qdev.c | 3 +++ > > 1 files changed, 3 insertions(+), 0 deletions(-) > > > > diff --git a/hw/qdev.c b/hw/qdev.c > > index e99c73f..b156272 100644 > > --- a/hw/qdev.c > > +++ b/hw/qdev.c > > @@ -278,6 +278,9 @@ int qdev_init(DeviceState *dev) > > qdev_free(dev); > > return rc; > > } > > + if (dev->hotplugged) { > > + qdev_reset(dev); > > + } > > qemu_register_reset(qdev_reset, dev); > > if (dev->info->vmsd) { > > vmstate_register_with_alias_id(dev, -1, dev->info->vmsd, dev, > > qdev_reset() isn't necessary when !dev->hotplugged, because then > qemu_system_reset() will run shortly, which will call qdev_reset(). > Correct? Yes, exactly.
Alex Williamson <alex.williamson@redhat.com> writes: > On Fri, 2010-08-20 at 11:00 +0200, Markus Armbruster wrote: >> Alex Williamson <alex.williamson@redhat.com> writes: >> >> > Several devices rely on their reset() function being called to >> > initialize device state, e1000 and rtl8139 in particular. When >> > the device is hot added, the reset doesn't occur, often leaving >> > the device in an unusable state. Adding a call to reset() after >> > init() for hotplugged devices puts the device in the expected >> > state for the guest. >> > >> > Signed-off-by: Alex Williamson <alex.williamson@redhat.com> >> > --- >> > >> > 0.13 candidate? >> > >> > hw/qdev.c | 3 +++ >> > 1 files changed, 3 insertions(+), 0 deletions(-) >> > >> > diff --git a/hw/qdev.c b/hw/qdev.c >> > index e99c73f..b156272 100644 >> > --- a/hw/qdev.c >> > +++ b/hw/qdev.c >> > @@ -278,6 +278,9 @@ int qdev_init(DeviceState *dev) >> > qdev_free(dev); >> > return rc; >> > } >> > + if (dev->hotplugged) { >> > + qdev_reset(dev); >> > + } >> > qemu_register_reset(qdev_reset, dev); >> > if (dev->info->vmsd) { >> > vmstate_register_with_alias_id(dev, -1, dev->info->vmsd, dev, >> >> qdev_reset() isn't necessary when !dev->hotplugged, because then >> qemu_system_reset() will run shortly, which will call qdev_reset(). >> Correct? > > Yes, exactly. Hmm. "qdev_init() automatically calls qdev_reset() if hotplug" feels needlessly complicated. I'd prefer qdev_init() to call it always or never. If "always", we reset twice for cold-plug. Is that bad? If "never", we need to reset somewhere else for hot-plug. What about do_device_add()?
On 08/20/2010 10:47 AM, Markus Armbruster wrote: > Alex Williamson<alex.williamson@redhat.com> writes: > > >> On Fri, 2010-08-20 at 11:00 +0200, Markus Armbruster wrote: >> >>> Alex Williamson<alex.williamson@redhat.com> writes: >>> >>> >>>> Several devices rely on their reset() function being called to >>>> initialize device state, e1000 and rtl8139 in particular. When >>>> the device is hot added, the reset doesn't occur, often leaving >>>> the device in an unusable state. Adding a call to reset() after >>>> init() for hotplugged devices puts the device in the expected >>>> state for the guest. >>>> >>>> Signed-off-by: Alex Williamson<alex.williamson@redhat.com> >>>> --- >>>> >>>> 0.13 candidate? >>>> >>>> hw/qdev.c | 3 +++ >>>> 1 files changed, 3 insertions(+), 0 deletions(-) >>>> >>>> diff --git a/hw/qdev.c b/hw/qdev.c >>>> index e99c73f..b156272 100644 >>>> --- a/hw/qdev.c >>>> +++ b/hw/qdev.c >>>> @@ -278,6 +278,9 @@ int qdev_init(DeviceState *dev) >>>> qdev_free(dev); >>>> return rc; >>>> } >>>> + if (dev->hotplugged) { >>>> + qdev_reset(dev); >>>> + } >>>> qemu_register_reset(qdev_reset, dev); >>>> if (dev->info->vmsd) { >>>> vmstate_register_with_alias_id(dev, -1, dev->info->vmsd, dev, >>>> >>> qdev_reset() isn't necessary when !dev->hotplugged, because then >>> qemu_system_reset() will run shortly, which will call qdev_reset(). >>> Correct? >>> >> Yes, exactly. >> > Hmm. "qdev_init() automatically calls qdev_reset() if hotplug" feels > needlessly complicated. I'd prefer qdev_init() to call it always or > never. > > If "always", we reset twice for cold-plug. Is that bad? > > If "never", we need to reset somewhere else for hot-plug. What about > do_device_add()? > The real problem is how we do reset. We shouldn't register a reset handler for every qdev device but rather register a single reset handler that walks the device tree and calls reset on every reachable device. Then we can always call reset in init() and there's no need to have a dev->hotplugged check. The qdev device tree reset handler should not be registered until *after* we call qemu_system_reset() after creating the device model which will ensure that we don't do a double reset. Regards, Anthony Liguori
Anthony Liguori <anthony@codemonkey.ws> writes: > On 08/20/2010 10:47 AM, Markus Armbruster wrote: >> Alex Williamson<alex.williamson@redhat.com> writes: >> >> >>> On Fri, 2010-08-20 at 11:00 +0200, Markus Armbruster wrote: >>> >>>> Alex Williamson<alex.williamson@redhat.com> writes: >>>> >>>> >>>>> Several devices rely on their reset() function being called to >>>>> initialize device state, e1000 and rtl8139 in particular. When >>>>> the device is hot added, the reset doesn't occur, often leaving >>>>> the device in an unusable state. Adding a call to reset() after >>>>> init() for hotplugged devices puts the device in the expected >>>>> state for the guest. >>>>> >>>>> Signed-off-by: Alex Williamson<alex.williamson@redhat.com> >>>>> --- >>>>> >>>>> 0.13 candidate? >>>>> >>>>> hw/qdev.c | 3 +++ >>>>> 1 files changed, 3 insertions(+), 0 deletions(-) >>>>> >>>>> diff --git a/hw/qdev.c b/hw/qdev.c >>>>> index e99c73f..b156272 100644 >>>>> --- a/hw/qdev.c >>>>> +++ b/hw/qdev.c >>>>> @@ -278,6 +278,9 @@ int qdev_init(DeviceState *dev) >>>>> qdev_free(dev); >>>>> return rc; >>>>> } >>>>> + if (dev->hotplugged) { >>>>> + qdev_reset(dev); >>>>> + } >>>>> qemu_register_reset(qdev_reset, dev); >>>>> if (dev->info->vmsd) { >>>>> vmstate_register_with_alias_id(dev, -1, dev->info->vmsd, dev, >>>>> >>>> qdev_reset() isn't necessary when !dev->hotplugged, because then >>>> qemu_system_reset() will run shortly, which will call qdev_reset(). >>>> Correct? >>>> >>> Yes, exactly. >>> >> Hmm. "qdev_init() automatically calls qdev_reset() if hotplug" feels >> needlessly complicated. I'd prefer qdev_init() to call it always or >> never. >> >> If "always", we reset twice for cold-plug. Is that bad? >> >> If "never", we need to reset somewhere else for hot-plug. What about >> do_device_add()? >> > > The real problem is how we do reset. We shouldn't register a reset > handler for every qdev device but rather register a single reset > handler that walks the device tree and calls reset on every reachable > device. > > Then we can always call reset in init() and there's no need to have a > dev->hotplugged check. The qdev device tree reset handler should not > be registered until *after* we call qemu_system_reset() after creating > the device model which will ensure that we don't do a double reset. Fine with me. But we need to merge something short term (pre 0.13) to fix hot plug of e1000 et al. Use Alex's patch as such a stop-gap?
On 08/20/2010 12:00 PM, Markus Armbruster wrote: > Alex Williamson<alex.williamson@redhat.com> writes: > >> Several devices rely on their reset() function being called to >> initialize device state, e1000 and rtl8139 in particular. When >> the device is hot added, the reset doesn't occur, often leaving >> the device in an unusable state. Adding a call to reset() after >> init() for hotplugged devices puts the device in the expected >> state for the guest. >> >> Signed-off-by: Alex Williamson<alex.williamson@redhat.com> >> --- >> >> 0.13 candidate? >> >> hw/qdev.c | 3 +++ >> 1 files changed, 3 insertions(+), 0 deletions(-) >> >> diff --git a/hw/qdev.c b/hw/qdev.c >> index e99c73f..b156272 100644 >> --- a/hw/qdev.c >> +++ b/hw/qdev.c >> @@ -278,6 +278,9 @@ int qdev_init(DeviceState *dev) >> qdev_free(dev); >> return rc; >> } >> + if (dev->hotplugged) { >> + qdev_reset(dev); >> + } >> qemu_register_reset(qdev_reset, dev); >> if (dev->info->vmsd) { >> vmstate_register_with_alias_id(dev, -1, dev->info->vmsd, dev, > qdev_reset() isn't necessary when !dev->hotplugged, because then > qemu_system_reset() will run shortly, which will call qdev_reset(). > Correct? Off-topic, but what's the reason for dev->hotplugged's existence? A device is either plugged or not, it is either hotpluggable or not, but is there a way to tell, from looking at a plugged device, whether it has been hotplugged in the past? AFAICT it is redundant state and should be removed.
On 08/23/2010 07:00 AM, Avi Kivity wrote: > Off-topic, but what's the reason for dev->hotplugged's existence? A > device is either plugged or not, it is either hotpluggable or not, but > is there a way to tell, from looking at a plugged device, whether it > has been hotplugged in the past? > > AFAICT it is redundant state and should be removed. I've got that in my qdev tree. http://repo.or.cz/w/qemu/aliguori.git/commit/672720c5c0315d2a5b43f58fd75822cc4a390ae9 Regards, Anthony Liguori
On Fri, Aug 20, 2010 at 10:56:57AM -0500, Anthony Liguori wrote: > The real problem is how we do reset. We shouldn't register a reset > handler for every qdev device but rather register a single reset handler > that walks the device tree and calls reset on every reachable device. > > Then we can always call reset in init() and there's no need to have a > dev->hotplugged check. The qdev device tree reset handler should not be > registered until *after* we call qemu_system_reset() after creating the > device model which will ensure that we don't do a double reset. I don't see why you re-invent the patch. Please see the patches I sent out on August 5. http://lists.nongnu.org/archive/html/qemu-devel/2010-08/msg00377.html http://lists.nongnu.org/archive/html/qemu-devel/2010-08/msg00383.html Maybe we can merge the patches. As for your patch, I have some comment. - bus itself may want its own handler. At lease pci bus needs it. And propagating reset signal to children is up to the bus controller. - child devices should be reset before parent. This doesn't matter much though.
On 08/24/2010 10:07 PM, Isaku Yamahata wrote: > On Fri, Aug 20, 2010 at 10:56:57AM -0500, Anthony Liguori wrote: > >> The real problem is how we do reset. We shouldn't register a reset >> handler for every qdev device but rather register a single reset handler >> that walks the device tree and calls reset on every reachable device. >> >> Then we can always call reset in init() and there's no need to have a >> dev->hotplugged check. The qdev device tree reset handler should not be >> registered until *after* we call qemu_system_reset() after creating the >> device model which will ensure that we don't do a double reset. >> > I don't see why you re-invent the patch. > Please see the patches I sent out on August 5. > http://lists.nongnu.org/archive/html/qemu-devel/2010-08/msg00377.html > http://lists.nongnu.org/archive/html/qemu-devel/2010-08/msg00383.html > I honestly didn't connect the two but it's now obvious to me that they overlap significantly. > Maybe we can merge the patches. > As for your patch, I have some comment. > - bus itself may want its own handler. At lease pci bus needs it. > And propagating reset signal to children is up to the bus controller. > I disagree. Reset should be equivalent to power off + init and it's not something that can be selectively propagated. There are different types of bus level resets and it may make sense to model that in the PCI layer but the qdev reset is a power cycle semantically. I think using a walker pattern is a stronger design than having each reset function do it's own transversal because the later has the potential to introduce bugs. Regards, Anthony Liguori > - child devices should be reset before parent. > This doesn't matter much though. > >
On Wed, Aug 25, 2010 at 07:55:27AM -0500, Anthony Liguori wrote: >> Maybe we can merge the patches. >> As for your patch, I have some comment. >> - bus itself may want its own handler. At lease pci bus needs it. >> And propagating reset signal to children is up to the bus controller. >> > > I disagree. Reset should be equivalent to power off + init and it's not > something that can be selectively propagated. > > There are different types of bus level resets and it may make sense to > model that in the PCI layer but the qdev reset is a power cycle > semantically. > > I think using a walker pattern is a stronger design than having each > reset function do it's own transversal because the later has the > potential to introduce bugs. Warm reset is different from cold reset, so reset isn't equivalent to a power cycle. Typically registers which report errors must retain the contents across warm reset, but doesn't across a power cycle. I can see the reset call back is called when both power on and warm reset, but I don't see why qdev_reset() is a power cycle semantically. Although your current patch might work for me at the moment, the claim that qdev_reset() == a power cycle worries me. How about adding a argument to distinguish reset type? Something like QEMU_RESET_COLD, QEMU_RESET_WARM.
On 08/25/2010 10:17 AM, Isaku Yamahata wrote: > On Wed, Aug 25, 2010 at 07:55:27AM -0500, Anthony Liguori wrote: > >>> Maybe we can merge the patches. >>> As for your patch, I have some comment. >>> - bus itself may want its own handler. At lease pci bus needs it. >>> And propagating reset signal to children is up to the bus controller. >>> >>> >> I disagree. Reset should be equivalent to power off + init and it's not >> something that can be selectively propagated. >> >> There are different types of bus level resets and it may make sense to >> model that in the PCI layer but the qdev reset is a power cycle >> semantically. >> >> I think using a walker pattern is a stronger design than having each >> reset function do it's own transversal because the later has the >> potential to introduce bugs. >> > Warm reset is different from cold reset, so reset isn't equivalent to > a power cycle. But qdev reset is not warm reset. It's cold reset. IOW, it's called when the device is first powered up. > Typically registers which report errors must retain the > contents across warm reset, but doesn't across a power cycle. > I can see the reset call back is called when both power on and warm reset, > but I don't see why qdev_reset() is a power cycle semantically. > > Although your current patch might work for me at the moment, > the claim that qdev_reset() == a power cycle worries me. > > How about adding a argument to distinguish reset type? > Something like QEMU_RESET_COLD, QEMU_RESET_WARM. > Before approaching this, I think we ought to better understand exactly what type of reset semantics we would need to support in the future. I think that starts by understanding exactly what's guaranteed and understanding what the use cases are for it. Regards, Anthony Liguori
On Wed, Aug 25, 2010 at 11:49:19AM -0500, Anthony Liguori wrote: > On 08/25/2010 10:17 AM, Isaku Yamahata wrote: >> On Wed, Aug 25, 2010 at 07:55:27AM -0500, Anthony Liguori wrote: >> >>>> Maybe we can merge the patches. >>>> As for your patch, I have some comment. >>>> - bus itself may want its own handler. At lease pci bus needs it. >>>> And propagating reset signal to children is up to the bus controller. >>>> >>>> >>> I disagree. Reset should be equivalent to power off + init and it's not >>> something that can be selectively propagated. >>> >>> There are different types of bus level resets and it may make sense to >>> model that in the PCI layer but the qdev reset is a power cycle >>> semantically. >>> >>> I think using a walker pattern is a stronger design than having each >>> reset function do it's own transversal because the later has the >>> potential to introduce bugs. >>> >> Warm reset is different from cold reset, so reset isn't equivalent to >> a power cycle. > > But qdev reset is not warm reset. It's cold reset. IOW, it's called > when the device is first powered up. No. It's not true. Surely qemu_system_reset() is called on qemu startup from main(). But it is also called from main_loop() as warm reset. qemu_system_reset_request() is called when warm reset is requested. That is, qemu_system_reset() doesn't distinguish warm reset from cold reset. Thus qdev_reset() doesn't. >> Typically registers which report errors must retain the >> contents across warm reset, but doesn't across a power cycle. >> I can see the reset call back is called when both power on and warm reset, >> but I don't see why qdev_reset() is a power cycle semantically. >> >> Although your current patch might work for me at the moment, >> the claim that qdev_reset() == a power cycle worries me. >> >> How about adding a argument to distinguish reset type? >> Something like QEMU_RESET_COLD, QEMU_RESET_WARM. >> > > Before approaching this, I think we ought to better understand exactly > what type of reset semantics we would need to support in the future. > > I think that starts by understanding exactly what's guaranteed and > understanding what the use cases are for it. Fair enough. How about the followings? This is just a starting point. I borrowed terminology pci/pcie spec. reset Bring the state of hardware state to consistent state. (some state might be left unknown.) system reset a hardware mechanism for setting or returning all hardware states to the initial conditions. Use case: In qemu, system_system_reset(). cold reset(power on reset) system reset following the application of power. Use case: In qemu, system_reset() in main(). We might want to use this as a power cycle. When a device is hot plugged, the device should be cold reset too. This is your motivation. QEMU_RESET_COLD Guarantee: The internal status must be same to qdev_init() + qdev_reset() warm reset system reset without cycling the supplied power. Use case: In qemu, system_reset() in main_loop(). There are many places which calls qemu_system_reset_request(). Some state are retained across warm reset. Like PCIe AER, error reporting registers need to keep its contents across warm reset as OS would examine them and report it when hardware errors caused warm reset. QEMU_RESET_WARM bus reset Reset bus and devices on the bus. Bus reset is usually triggered when cold reset, warm reset and commanding the bus controller to reset the child bus. When bus reset is triggered as command to bus controller, the effect is usually same to warm reset on devices on the bus. Typically on parallel bus, bus reset is started by asserting a designated signal. Example: PCI RST#, ATA RESET-, SCSI RST Use case: bus reset as result of programming bus controller. Qemu is currently missing it which I'd like to fill for pci bus. ATA and SCSI could benefit from this. QEMU_RESET_WARM with bus. Guarantee: device state under the bus is same as warm reset. device/function reset: Reset triggered by sending reset command to a device. This is bus/device specific. There might be many reset commands whose effects are different. Example: PCI FLR, ATA DEVICE RESET command, scsi bus device reset message. This reset is bus specific, so it wouldn't be suitable for qdev frame work and could be handled by each bus level. hot reset: I just put it here for completeness because pcie defines hot reset. A reset propagated in-band across a Link using a Physical Layer mechanism. Qemu doesn't emulate physical layer, so we don't care it. From software point of view, hot reset has same effect to warm reset.
On 08/26/2010 03:38 AM, Isaku Yamahata wrote: > >> I think that starts by understanding exactly what's guaranteed and >> understanding what the use cases are for it. >> > Fair enough. How about the followings? > Thanks for enumerating. > This is just a starting point. I borrowed terminology pci/pcie spec. > > > reset > Bring the state of hardware state to consistent state. > (some state might be left unknown.) > > > system reset > a hardware mechanism for setting or returning all hardware states > to the initial conditions. > Use case: > In qemu, system_system_reset(). > > > cold reset(power on reset) > system reset following the application of power. > Use case: > In qemu, system_reset() in main(). > We might want to use this as a power cycle. > When a device is hot plugged, the device should be cold reset too. > This is your motivation. > QEMU_RESET_COLD > Guarantee: > The internal status must be same to qdev_init() + qdev_reset() > This is what we do today in QEMU and from a functional perspective it covers the type of function we need today. > > warm reset > system reset without cycling the supplied power. > Use case: > In qemu, system_reset() in main_loop(). There are many places > which calls qemu_system_reset_request(). > Some state are retained across warm reset. Like PCIe AER, error > reporting registers need to keep its contents across warm reset > as OS would examine them and report it when hardware errors caused > warm reset. > QEMU_RESET_WARM > With AER, I can't imagine that this matters that much unless we're doing PCI passthrough, right? So maybe the way we should frame this discussion is, what's the type of reset semantics that we need to support for PCI passthrough? The next question after that is how do we achieve the different types of reset for passthrough devices? BTW, if you could transfer some of this discussion to a wiki page on qemu.org, I think that would be extremely valuable. Regards, Anthony Liguori > bus reset > Reset bus and devices on the bus. > Bus reset is usually triggered when cold reset, warm reset and > commanding the bus controller to reset the child bus. > When bus reset is triggered as command to bus controller, > the effect is usually same to warm reset on devices on the bus. > > Typically on parallel bus, bus reset is started by asserting > a designated signal. > Example: PCI RST#, ATA RESET-, SCSI RST > > Use case: > bus reset as result of programming bus controller. > Qemu is currently missing it which I'd like to fill for pci bus. > ATA and SCSI could benefit from this. > QEMU_RESET_WARM with bus. > Guarantee: > device state under the bus is same as warm reset. > > > device/function reset: > Reset triggered by sending reset command to a device. > This is bus/device specific. > There might be many reset commands whose effects are different. > Example: PCI FLR, ATA DEVICE RESET command, > scsi bus device reset message. > > This reset is bus specific, so it wouldn't be suitable for qdev > frame work and could be handled by each bus level. > > > hot reset: > I just put it here for completeness because pcie defines hot reset. > A reset propagated in-band across a Link using a Physical Layer > mechanism. > Qemu doesn't emulate physical layer, so we don't care it. > From software point of view, hot reset has same effect to warm reset. > >
On 08/26/2010 03:38 AM, Isaku Yamahata wrote: > > QEMU_RESET_COLD > BTW, just from a implementation perspective, I'd rather have multiple reset callbacks in qdev instead of having a single callback with a type flag. A type flag implies that every callback has to handle all cases whereas with separate callbacks, if a device doesn't implement warm_reset we can easily default it to reset (which is a cold reset). Regards, Anthony Liguori > Guarantee: > The internal status must be same to qdev_init() + qdev_reset() > > > warm reset > system reset without cycling the supplied power. > Use case: > In qemu, system_reset() in main_loop(). There are many places > which calls qemu_system_reset_request(). > Some state are retained across warm reset. Like PCIe AER, error > reporting registers need to keep its contents across warm reset > as OS would examine them and report it when hardware errors caused > warm reset. > QEMU_RESET_WARM > > > bus reset > Reset bus and devices on the bus. > Bus reset is usually triggered when cold reset, warm reset and > commanding the bus controller to reset the child bus. > When bus reset is triggered as command to bus controller, > the effect is usually same to warm reset on devices on the bus. > > Typically on parallel bus, bus reset is started by asserting > a designated signal. > Example: PCI RST#, ATA RESET-, SCSI RST > > Use case: > bus reset as result of programming bus controller. > Qemu is currently missing it which I'd like to fill for pci bus. > ATA and SCSI could benefit from this. > QEMU_RESET_WARM with bus. > Guarantee: > device state under the bus is same as warm reset. > > > device/function reset: > Reset triggered by sending reset command to a device. > This is bus/device specific. > There might be many reset commands whose effects are different. > Example: PCI FLR, ATA DEVICE RESET command, > scsi bus device reset message. > > This reset is bus specific, so it wouldn't be suitable for qdev > frame work and could be handled by each bus level. > > > hot reset: > I just put it here for completeness because pcie defines hot reset. > A reset propagated in-band across a Link using a Physical Layer > mechanism. > Qemu doesn't emulate physical layer, so we don't care it. > From software point of view, hot reset has same effect to warm reset. > >
On 08/25/2010 03:55 PM, Anthony Liguori wrote: > >> Maybe we can merge the patches. >> As for your patch, I have some comment. >> - bus itself may want its own handler. At lease pci bus needs it. >> And propagating reset signal to children is up to the bus controller. > > I disagree. Reset should be equivalent to power off + init and it's > not something that can be selectively propagated. Not all busses propagate reset - SCSI is an example (I think).
On 08/26/2010 08:15 AM, Avi Kivity wrote: > On 08/25/2010 03:55 PM, Anthony Liguori wrote: >> >>> Maybe we can merge the patches. >>> As for your patch, I have some comment. >>> - bus itself may want its own handler. At lease pci bus needs it. >>> And propagating reset signal to children is up to the bus >>> controller. >> >> I disagree. Reset should be equivalent to power off + init and it's >> not something that can be selectively propagated. > > Not all busses propagate reset - SCSI is an example (I think). > We're talking about cold reset vs. warm reset. In the absence of passthrough, I'm struggling to see a useful use-case with warm reset. However, there are many useful things we can do assuming a cold reset (like MADV_DONTNEED memory on reboot). That's not saying we shouldn't do a warm reset, but I'd like to see that as an incremental addition to what we have today (like introducing a propagating warm reset callback) and thinking through what the actual behavior should and shouldn't be. Regards, Anthony Liguori
On 08/26/2010 04:25 PM, Anthony Liguori wrote: > On 08/26/2010 08:15 AM, Avi Kivity wrote: >> On 08/25/2010 03:55 PM, Anthony Liguori wrote: >>> >>>> Maybe we can merge the patches. >>>> As for your patch, I have some comment. >>>> - bus itself may want its own handler. At lease pci bus needs it. >>>> And propagating reset signal to children is up to the bus >>>> controller. >>> >>> I disagree. Reset should be equivalent to power off + init and it's >>> not something that can be selectively propagated. >> >> Not all busses propagate reset - SCSI is an example (I think). >> > > We're talking about cold reset vs. warm reset. > > In the absence of passthrough, I'm struggling to see a useful use-case > with warm reset. However, there are many useful things we can do > assuming a cold reset (like MADV_DONTNEED memory on reboot). > > That's not saying we shouldn't do a warm reset, but I'd like to see > that as an incremental addition to what we have today (like > introducing a propagating warm reset callback) and thinking through > what the actual behavior should and shouldn't be. Pressing the reset button is a warm reset on real machines, therefore it should be a warm reset in qemu.
On Thu, Aug 26, 2010 at 2:29 PM, Avi Kivity <avi@redhat.com> wrote: > On 08/26/2010 04:25 PM, Anthony Liguori wrote: >> >> On 08/26/2010 08:15 AM, Avi Kivity wrote: >>> >>> On 08/25/2010 03:55 PM, Anthony Liguori wrote: >>>> >>>>> Maybe we can merge the patches. >>>>> As for your patch, I have some comment. >>>>> - bus itself may want its own handler. At lease pci bus needs it. >>>>> And propagating reset signal to children is up to the bus controller. >>>> >>>> I disagree. Reset should be equivalent to power off + init and it's not >>>> something that can be selectively propagated. >>> >>> Not all busses propagate reset - SCSI is an example (I think). >>> >> >> We're talking about cold reset vs. warm reset. >> >> In the absence of passthrough, I'm struggling to see a useful use-case >> with warm reset. However, there are many useful things we can do assuming a >> cold reset (like MADV_DONTNEED memory on reboot). >> >> That's not saying we shouldn't do a warm reset, but I'd like to see that >> as an incremental addition to what we have today (like introducing a >> propagating warm reset callback) and thinking through what the actual >> behavior should and shouldn't be. > > Pressing the reset button is a warm reset on real machines, therefore it > should be a warm reset in qemu. I agree. For example Sparc64 CPU uses different reset vector for warm and cold reset, so system_reset acts like a reset button. But most real life chips have only one reset signal pin, CPUs may be the only cases where multiple reset signals are used. I think current system_reset matches the need well. For most devices, system_reset initiates the same procedure for cold or warm reset. Those devices which care about warm reset can count the number of the resets. The only problem I see is that there is no way to signal cold reset or power cycle. Then there are devices which don't have reset signals at all, like RAM. Their behavior at reset is different compared to power cycling. Stateless hardware like ROMs do not care about either.
I added CC for those who might be interested in this discussion. On Thu, Aug 26, 2010 at 08:02:38AM -0500, Anthony Liguori wrote: > On 08/26/2010 03:38 AM, Isaku Yamahata wrote: >> >>> I think that starts by understanding exactly what's guaranteed and >>> understanding what the use cases are for it. >>> >> Fair enough. How about the followings? >> > > Thanks for enumerating. > >> This is just a starting point. I borrowed terminology pci/pcie spec. >> >> >> reset >> Bring the state of hardware state to consistent state. >> (some state might be left unknown.) >> >> >> system reset >> a hardware mechanism for setting or returning all hardware states >> to the initial conditions. >> Use case: >> In qemu, system_system_reset(). >> >> >> cold reset(power on reset) >> system reset following the application of power. >> Use case: >> In qemu, system_reset() in main(). >> We might want to use this as a power cycle. >> When a device is hot plugged, the device should be cold reset too. >> This is your motivation. >> QEMU_RESET_COLD >> Guarantee: >> The internal status must be same to qdev_init() + qdev_reset() >> > > This is what we do today in QEMU and from a functional perspective it > covers the type of function we need today. > >> >> warm reset >> system reset without cycling the supplied power. >> Use case: >> In qemu, system_reset() in main_loop(). There are many places >> which calls qemu_system_reset_request(). >> Some state are retained across warm reset. Like PCIe AER, error >> reporting registers need to keep its contents across warm reset >> as OS would examine them and report it when hardware errors caused >> warm reset. >> QEMU_RESET_WARM >> > > With AER, I can't imagine that this matters that much unless we're doing > PCI passthrough, right? Even without PCI passthrough, AER errors can be injected into emulated pci/pcie devices in a virtual pcie bus hierarchy from qemu command line. This is useful to test guest OS AER handler. > So maybe the way we should frame this discussion is, what's the type of > reset semantics that we need to support for PCI passthrough? The next > question after that is how do we achieve the different types of reset > for passthrough devices? What I want is hot reset in pcie terminology on a virtual bus as PCI_BRIDGE_CTL_BUS_RESET emulation and propagated reset on devices/child buses which might be a directly assigned. In direct assigned device case, device-assignment code would be notified the reset. As hot reset has same effect to warm reset in functionality sense (the difference is the way to signal it in physical/signal layer which qemu doesn't care), I'd like to implement pci bus reset as triggering warm reset on a (virtual) bus by utilizing qdev frame work. This would be applicable to ata, scsi, I suppose. It's another story how to virtualize hot reset on a given directly assigned pci function or a pcie bus hierarchy. For example, as PCI device assignment is done per function basis, co-existing functions in the same card shouldn't be reset. Another example is, virtual pci bus hierarchy might be reset, but it would be difficult problem how to map the virtual bus topology to host bus topology. thanks, > BTW, if you could transfer some of this discussion to a wiki page on > qemu.org, I think that would be extremely valuable. > > Regards, > > Anthony Liguori > >> bus reset >> Reset bus and devices on the bus. >> Bus reset is usually triggered when cold reset, warm reset and >> commanding the bus controller to reset the child bus. >> When bus reset is triggered as command to bus controller, >> the effect is usually same to warm reset on devices on the bus. >> >> Typically on parallel bus, bus reset is started by asserting >> a designated signal. >> Example: PCI RST#, ATA RESET-, SCSI RST >> >> Use case: >> bus reset as result of programming bus controller. >> Qemu is currently missing it which I'd like to fill for pci bus. >> ATA and SCSI could benefit from this. >> QEMU_RESET_WARM with bus. >> Guarantee: >> device state under the bus is same as warm reset. >> >> >> device/function reset: >> Reset triggered by sending reset command to a device. >> This is bus/device specific. >> There might be many reset commands whose effects are different. >> Example: PCI FLR, ATA DEVICE RESET command, >> scsi bus device reset message. >> >> This reset is bus specific, so it wouldn't be suitable for qdev >> frame work and could be handled by each bus level. >> >> >> hot reset: >> I just put it here for completeness because pcie defines hot reset. >> A reset propagated in-band across a Link using a Physical Layer >> mechanism. >> Qemu doesn't emulate physical layer, so we don't care it. >> From software point of view, hot reset has same effect to warm reset. >> >> >
On Thu, Aug 26, 2010 at 08:02:38AM -0500, Anthony Liguori wrote: > BTW, if you could transfer some of this discussion to a wiki page on > qemu.org, I think that would be extremely valuable. http://wiki.qemu.org/Features/ResetAPI
Isaku and Anthony: This is excellent discussion! Thanks for forwarding. Wei Xu wexu2@cisco.com On 8/26/10 8:52 PM, "Isaku Yamahata" <yamahata@valinux.co.jp> wrote: > I added CC for those who might be interested in this discussion. > > On Thu, Aug 26, 2010 at 08:02:38AM -0500, Anthony Liguori wrote: >> On 08/26/2010 03:38 AM, Isaku Yamahata wrote: >>> >>>> I think that starts by understanding exactly what's guaranteed and >>>> understanding what the use cases are for it. >>>> >>> Fair enough. How about the followings? >>> >> >> Thanks for enumerating. >> >>> This is just a starting point. I borrowed terminology pci/pcie spec. >>> >>> >>> reset >>> Bring the state of hardware state to consistent state. >>> (some state might be left unknown.) >>> >>> >>> system reset >>> a hardware mechanism for setting or returning all hardware states >>> to the initial conditions. >>> Use case: >>> In qemu, system_system_reset(). >>> >>> >>> cold reset(power on reset) >>> system reset following the application of power. >>> Use case: >>> In qemu, system_reset() in main(). >>> We might want to use this as a power cycle. >>> When a device is hot plugged, the device should be cold reset too. >>> This is your motivation. >>> QEMU_RESET_COLD >>> Guarantee: >>> The internal status must be same to qdev_init() + qdev_reset() >>> >> >> This is what we do today in QEMU and from a functional perspective it >> covers the type of function we need today. >> >>> >>> warm reset >>> system reset without cycling the supplied power. >>> Use case: >>> In qemu, system_reset() in main_loop(). There are many places >>> which calls qemu_system_reset_request(). >>> Some state are retained across warm reset. Like PCIe AER, error >>> reporting registers need to keep its contents across warm reset >>> as OS would examine them and report it when hardware errors caused >>> warm reset. >>> QEMU_RESET_WARM >>> >> >> With AER, I can't imagine that this matters that much unless we're doing >> PCI passthrough, right? > > Even without PCI passthrough, AER errors can be injected into emulated > pci/pcie > devices in a virtual pcie bus hierarchy from qemu command line. > This is useful to test guest OS AER handler. > > >> So maybe the way we should frame this discussion is, what's the type of >> reset semantics that we need to support for PCI passthrough? The next >> question after that is how do we achieve the different types of reset >> for passthrough devices? > > What I want is hot reset in pcie terminology on a virtual bus as > PCI_BRIDGE_CTL_BUS_RESET emulation and propagated reset on devices/child > buses which might be a directly assigned. > In direct assigned device case, device-assignment code would be notified the > reset. > > As hot reset has same effect to warm reset in functionality sense > (the difference is the way to signal it in physical/signal layer which > qemu doesn't care), > I'd like to implement pci bus reset as triggering warm reset on a > (virtual) bus by utilizing qdev frame work. > This would be applicable to ata, scsi, I suppose. > > > It's another story how to virtualize hot reset on a given directly assigned > pci function or a pcie bus hierarchy. For example, as PCI device assignment > is done per function basis, co-existing functions in the same card shouldn't > be reset. > Another example is, virtual pci bus hierarchy might be reset, but it would > be difficult problem how to map the virtual bus topology to host bus topology. > > thanks, > >> BTW, if you could transfer some of this discussion to a wiki page on >> qemu.org, I think that would be extremely valuable. >> >> Regards, >> >> Anthony Liguori >> >>> bus reset >>> Reset bus and devices on the bus. >>> Bus reset is usually triggered when cold reset, warm reset and >>> commanding the bus controller to reset the child bus. >>> When bus reset is triggered as command to bus controller, >>> the effect is usually same to warm reset on devices on the bus. >>> >>> Typically on parallel bus, bus reset is started by asserting >>> a designated signal. >>> Example: PCI RST#, ATA RESET-, SCSI RST >>> >>> Use case: >>> bus reset as result of programming bus controller. >>> Qemu is currently missing it which I'd like to fill for pci bus. >>> ATA and SCSI could benefit from this. >>> QEMU_RESET_WARM with bus. >>> Guarantee: >>> device state under the bus is same as warm reset. >>> >>> >>> device/function reset: >>> Reset triggered by sending reset command to a device. >>> This is bus/device specific. >>> There might be many reset commands whose effects are different. >>> Example: PCI FLR, ATA DEVICE RESET command, >>> scsi bus device reset message. >>> >>> This reset is bus specific, so it wouldn't be suitable for qdev >>> frame work and could be handled by each bus level. >>> >>> >>> hot reset: >>> I just put it here for completeness because pcie defines hot reset. >>> A reset propagated in-band across a Link using a Physical Layer >>> mechanism. >>> Qemu doesn't emulate physical layer, so we don't care it. >>> From software point of view, hot reset has same effect to warm reset. >>> >>> >>
diff --git a/hw/qdev.c b/hw/qdev.c index e99c73f..b156272 100644 --- a/hw/qdev.c +++ b/hw/qdev.c @@ -278,6 +278,9 @@ int qdev_init(DeviceState *dev) qdev_free(dev); return rc; } + if (dev->hotplugged) { + qdev_reset(dev); + } qemu_register_reset(qdev_reset, dev); if (dev->info->vmsd) { vmstate_register_with_alias_id(dev, -1, dev->info->vmsd, dev,
Several devices rely on their reset() function being called to initialize device state, e1000 and rtl8139 in particular. When the device is hot added, the reset doesn't occur, often leaving the device in an unusable state. Adding a call to reset() after init() for hotplugged devices puts the device in the expected state for the guest. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> --- 0.13 candidate? hw/qdev.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-)