Message ID | 1406121438-23083-3-git-send-email-lcapitulino@redhat.com |
---|---|
State | New |
Headers | show |
On 07/23/2014 07:17 AM, Luiz Capitulino wrote: > This event has the same characteristics of the other rate-limited > events, mainly we can emit dozens of it. Rate limit it then. > > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> > --- > monitor.c | 1 + > 1 file changed, 1 insertion(+) It's not guest-triggered, per se, but can indeed flood management. Reviewed-by: Eric Blake <eblake@redhat.com> > > diff --git a/monitor.c b/monitor.c > index 5bc70a6..33abe6c 100644 > --- a/monitor.c > +++ b/monitor.c > @@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void) > monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000); > monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000); > monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000); > + monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000); > > qmp_event_set_func_emit(monitor_qapi_event_queue); > } >
On Wed, Jul 23, 2014 at 09:17:17AM -0400, Luiz Capitulino wrote: > This event has the same characteristics of the other rate-limited > events, mainly we can emit dozens of it. Rate limit it then. > > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> > --- > monitor.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/monitor.c b/monitor.c > index 5bc70a6..33abe6c 100644 > --- a/monitor.c > +++ b/monitor.c > @@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void) > monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000); > monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000); > monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000); > + monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000); The rate limiting code only rate limits at the granularity of individual event types. If there is context sensitive data associated with events then the rate limiting will cause problems for applications tracking the events. eg consider with the simpler RTC CHANGE events if we get QAPI_EVENT_RTC_CHANGE offset=30 QAPI_EVENT_RTC_CHANGE offset=700 QAPI_EVENT_RTC_CHANGE offset=340 then rate limiting will mean that the application only receives QAPI_EVENT_RTC_CHANGE offset=340 This is fine because the application will always end up with a correct view of the current system state. For the BLOCK IO ERROR events this does not work because the events are device and operation specific. QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=read action=stop QAPI_EVENT_BLOCK_IO_ERROR dev=scsi1-hd2 op=write action=stop QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop with throttling the app wll only receive QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop which means it will have an *incorrect* view of the system state because the info about scsi1-hd2 is irretrievably lost, likewise info about the read operation of ide0-hd1. If you want to throttle BLOCK IO ERROR events, then you need to make the monitor throttling more intelligent, so that it hashes on all the contextual state. In this case you'd have to throttle based on (event, dev, op) to get correct application behaviour. Regards, Daniel
"Daniel P. Berrange" <berrange@redhat.com> writes: > On Wed, Jul 23, 2014 at 09:17:17AM -0400, Luiz Capitulino wrote: >> This event has the same characteristics of the other rate-limited >> events, mainly we can emit dozens of it. Rate limit it then. >> >> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> >> --- >> monitor.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/monitor.c b/monitor.c >> index 5bc70a6..33abe6c 100644 >> --- a/monitor.c >> +++ b/monitor.c >> @@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void) >> monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000); >> monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000); >> monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000); >> + monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000); > > > The rate limiting code only rate limits at the granularity of > individual event types. If there is context sensitive data associated > with events then the rate limiting will cause problems for applications > tracking the events. > > eg consider with the simpler RTC CHANGE events if we get > > QAPI_EVENT_RTC_CHANGE offset=30 > QAPI_EVENT_RTC_CHANGE offset=700 > QAPI_EVENT_RTC_CHANGE offset=340 > > then rate limiting will mean that the application only receives > > QAPI_EVENT_RTC_CHANGE offset=340 > > This is fine because the application will always end up with a correct > view of the current system state. ... since the intermediate states no longer matter. > For the BLOCK IO ERROR events this does not work because the events are > device and operation specific. > > QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=read action=stop > QAPI_EVENT_BLOCK_IO_ERROR dev=scsi1-hd2 op=write action=stop > QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop > > with throttling the app wll only receive > > QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop > > which means it will have an *incorrect* view of the system state because > the info about scsi1-hd2 is irretrievably lost, likewise info about the > read operation of ide0-hd1. Even when the event is lost, the information should not be lost. There should be a way to poll for it (libvirt needs that anyway, to cope with possible event loss during a libvirt restart). > If you want to throttle BLOCK IO ERROR events, then you need to make the > monitor throttling more intelligent, so that it hashes on all the contextual > state. In this case you'd have to throttle based on (event, dev, op) to get > correct application behaviour. I think there's more than one to skin this cat: 1. Don't throttle. Client can rely on events as long as it keeps the QMP connection alive. Client should poll after establishing the QMP connection. 2. Throttle more smartly, so that events only get dropped when they're semantically superseded. I figure that's what you proposed in your last paragraph. 3. Throttle, but accumulate the information carried by the event, i.e. any dropped events' data is sent with the next non-dropped event. 4. Throttle without smarts or accumulation. a. The event's additional information may be incomplete, thus worthless. Client needs too poll after getting an event. b. Add a flag "throttling has dropped some events". The additional information is incomplete when the flag is set. Client needs to poll then. Backward compatibility considerations may narrow our choice.
On Mon, Aug 11, 2014 at 01:07:51PM +0200, Markus Armbruster wrote: > "Daniel P. Berrange" <berrange@redhat.com> writes: > > For the BLOCK IO ERROR events this does not work because the events are > > device and operation specific. > > > > QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=read action=stop > > QAPI_EVENT_BLOCK_IO_ERROR dev=scsi1-hd2 op=write action=stop > > QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop > > > > with throttling the app wll only receive > > > > QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop > > > > which means it will have an *incorrect* view of the system state because > > the info about scsi1-hd2 is irretrievably lost, likewise info about the > > read operation of ide0-hd1. > > Even when the event is lost, the information should not be lost. There > should be a way to poll for it (libvirt needs that anyway, to cope with > possible event loss during a libvirt restart). Yes, that's true. > > If you want to throttle BLOCK IO ERROR events, then you need to make the > > monitor throttling more intelligent, so that it hashes on all the contextual > > state. In this case you'd have to throttle based on (event, dev, op) to get > > correct application behaviour. > > I think there's more than one to skin this cat: > > 1. Don't throttle. Client can rely on events as long as it keeps the > QMP connection alive. Client should poll after establishing the QMP > connection. A malicious guest OS can flood libvirt with events in this way. Of course even if we throttle, a compromised QEMU can still flood libvirt. The only fail-safe protection is for libvirt to detect flooding and throttle the rate at which it talks to the (malicious) QEMU. > 2. Throttle more smartly, so that events only get dropped when they're > semantically superseded. I figure that's what you proposed in your > last paragraph. Yep, that's what I was suggesting. > 3. Throttle, but accumulate the information carried by the event, i.e. > any dropped events' data is sent with the next non-dropped event. I fear this could get rather ugly - fields which are currently scalar quantities would need to become lists or hashes. > 4. Throttle without smarts or accumulation. > > a. The event's additional information may be incomplete, thus > worthless. Client needs too poll after getting an event. Might as well just not bother sending any additionl info in events if we took this path. > > b. Add a flag "throttling has dropped some events". The additional > information is incomplete when the flag is set. Client needs to > poll then. This is a reasonable idea too. > Backward compatibility considerations may narrow our choice. I think 1, 2 or 4b are viable from a general design POV, but only 1 or 2 are viable from a back-compat POV, unless there was an explicit command that client apps issued to turn on the throttling in 4b instead of it being on by default. Regards, Daniel
On Mon, 11 Aug 2014 09:17:19 +0100 "Daniel P. Berrange" <berrange@redhat.com> wrote: > On Wed, Jul 23, 2014 at 09:17:17AM -0400, Luiz Capitulino wrote: > > This event has the same characteristics of the other rate-limited > > events, mainly we can emit dozens of it. Rate limit it then. > > > > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> > > --- > > monitor.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/monitor.c b/monitor.c > > index 5bc70a6..33abe6c 100644 > > --- a/monitor.c > > +++ b/monitor.c > > @@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void) > > monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000); > > monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000); > > monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000); > > + monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000); > > > The rate limiting code only rate limits at the granularity of > individual event types. If there is context sensitive data associated > with events then the rate limiting will cause problems for applications > tracking the events. > > eg consider with the simpler RTC CHANGE events if we get > > QAPI_EVENT_RTC_CHANGE offset=30 > QAPI_EVENT_RTC_CHANGE offset=700 > QAPI_EVENT_RTC_CHANGE offset=340 > > then rate limiting will mean that the application only receives > > QAPI_EVENT_RTC_CHANGE offset=340 > > This is fine because the application will always end up with a correct > view of the current system state. > > > For the BLOCK IO ERROR events this does not work because the events are > device and operation specific. > > QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=read action=stop > QAPI_EVENT_BLOCK_IO_ERROR dev=scsi1-hd2 op=write action=stop > QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop > > with throttling the app wll only receive > > QAPI_EVENT_BLOCK_IO_ERROR dev=ide0-hd1 op=write action=stop > > which means it will have an *incorrect* view of the system state because > the info about scsi1-hd2 is irretrievably lost, likewise info about the > read operation of ide0-hd1. You're completely right, of course. Thanks for reviewing! I think I'll just drop this patch for now. > > If you want to throttle BLOCK IO ERROR events, then you need to make the > monitor throttling more intelligent, so that it hashes on all the contextual > state. In this case you'd have to throttle based on (event, dev, op) to get > correct application behaviour. > > Regards, > Daniel
Il 11/08/2014 13:15, Daniel P. Berrange ha scritto: >> > 1. Don't throttle. Client can rely on events as long as it keeps the >> > QMP connection alive. Client should poll after establishing the QMP >> > connection. > A malicious guest OS can flood libvirt with events in this way. Of course > even if we throttle, a compromised QEMU can still flood libvirt. The only > fail-safe protection is for libvirt to detect flooding and throttle the > rate at which it talks to the (malicious) QEMU. > If you use rerror=stop,werror=stop, only a limited error can be passed down to libvirt before libvirt invokes the "cont" command and there's no need to do any throttling. Paolo
diff --git a/monitor.c b/monitor.c index 5bc70a6..33abe6c 100644 --- a/monitor.c +++ b/monitor.c @@ -589,6 +589,7 @@ static void monitor_qapi_event_init(void) monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_REPORT_BAD, 1000); monitor_qapi_event_throttle(QAPI_EVENT_QUORUM_FAILURE, 1000); monitor_qapi_event_throttle(QAPI_EVENT_VSERPORT_CHANGE, 1000); + monitor_qapi_event_throttle(QAPI_EVENT_BLOCK_IO_ERROR, 1000); qmp_event_set_func_emit(monitor_qapi_event_queue); }
This event has the same characteristics of the other rate-limited events, mainly we can emit dozens of it. Rate limit it then. Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> --- monitor.c | 1 + 1 file changed, 1 insertion(+)