Message ID | 1416996685-15115-3-git-send-email-den@openvz.org |
---|---|
State | New |
Headers | show |
On Wed, Nov 26, 2014 at 01:11:25PM +0300, Denis V. Lunev wrote: > From: Raushaniya Maksudova <rmaksudova@parallels.com> > > Excessive virtio_balloon inflation can cause invocation of OOM-killer, > when Linux is under severe memory pressure. Various mechanisms are > responsible for correct virtio_balloon memory management. Nevertheless it > is often the case that these control tools does not have enough time to > react on fast changing memory load. As a result OS runs out of memory and > invokes OOM-killer. The balancing of memory by use of the virtio balloon > should not cause the termination of processes while there are pages in the > balloon. Now there is no way for virtio balloon driver to free memory at > the last moment before some process get killed by OOM-killer. > > This does not provide a security breach as balloon itself is running > inside Guest OS and is working in the cooperation with the host. Thus > some improvements from Guest side should be considered as normal. > > To solve the problem, introduce a virtio_balloon callback which is > expected to be called from the oom notifier call chain in out_of_memory() > function. If virtio balloon could release some memory, it will make the > system to return and retry the allocation that forced the out of memory > killer to run. > > This behavior should be enabled if and only if appropriate feature bit > is set on the device. It is off by default. > > This functionality was recently merged into vanilla Linux (actually in > linux-next at the moment) > > commit 5a10b7dbf904bfe01bb9fcc6298f7df09eed77d5 > Author: Raushaniya Maksudova <rmaksudova@parallels.com> > Date: Mon Nov 10 09:36:29 2014 +1030 > > This patch adds respective control bits into QEMU. It introduces > deflate-on-oom option for baloon device which do the trick. > > Signed-off-by: Raushaniya Maksudova <rmaksudova@parallels.com> > Signed-off-by: Denis V. Lunev <den@openvz.org> > CC: Anthony Liguori <aliguori@amazon.com> > CC: Michael S. Tsirkin <mst@redhat.com> > --- > hw/virtio/virtio-balloon.c | 7 +++++++ > include/hw/virtio/virtio-balloon.h | 2 ++ > qemu-options.hx | 6 +++++- > 3 files changed, 14 insertions(+), 1 deletion(-) > > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c > index 7bfbb75..9d145fa 100644 > --- a/hw/virtio/virtio-balloon.c > +++ b/hw/virtio/virtio-balloon.c > @@ -305,7 +305,12 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, > > static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f) > { > + VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > f |= (1 << VIRTIO_BALLOON_F_STATS_VQ); > + if (dev->deflate_on_oom) { > + f |= (1 << VIRTIO_BALLOON_F_DEFLATE_ON_OOM); > + } > + > return f; > } > > @@ -409,6 +414,7 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp) > } > > static Property virtio_balloon_properties[] = { > + DEFINE_PROP_BOOL("deflate-on-oom", VirtIOBalloon, deflate_on_oom, false), > DEFINE_PROP_END_OF_LIST(), > }; > > diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h > index f863bfe..45cc55a 100644 > --- a/include/hw/virtio/virtio-balloon.h > +++ b/include/hw/virtio/virtio-balloon.h > @@ -30,6 +30,7 @@ > /* The feature bitmap for virtio balloon */ > #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */ > #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory stats virtqueue */ > +#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ > > /* Size of a PFN in the balloon interface. */ > #define VIRTIO_BALLOON_PFN_SHIFT 12 > @@ -67,6 +68,7 @@ typedef struct VirtIOBalloon { > QEMUTimer *stats_timer; > int64_t stats_last_update; > int64_t stats_poll_interval; > + bool deflate_on_oom; > } VirtIOBalloon; > > #endif You don't need an extra bool, and open-coding. Do it same as we do for other features please, set bit in feature mask directly. > diff --git a/qemu-options.hx b/qemu-options.hx > index da9851d..14ede0b 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -324,7 +324,8 @@ ETEXI > DEF("balloon", HAS_ARG, QEMU_OPTION_balloon, > "-balloon none disable balloon device\n" > "-balloon virtio[,addr=str]\n" > - " enable virtio balloon device (default)\n", QEMU_ARCH_ALL) > + " enable virtio balloon device (default)\n" > + " [,deflate-on-oom=on|off]\n", QEMU_ARCH_ALL) > STEXI > @item -balloon none > @findex -balloon > @@ -332,6 +333,9 @@ Disable balloon device. > @item -balloon virtio[,addr=@var{addr}] > Enable virtio balloon device (default), optionally with PCI address > @var{addr}. > +@item -balloon virtio[,deflate-on-oom=@var{deflate-on-oom}] > +@var{deflate-on-oom} is "on" or "off" and enables whether to let Guest OS > +to deflate virtio balloon on OOM. Default is off. > ETEXI > > DEF("device", HAS_ARG, QEMU_OPTION_device, Please don't add stuff to legacy -balloon. New -device is enough, you don't need to touch qemu-options.hx for it. > -- > 1.9.1
On 26/11/14 14:16, Michael S. Tsirkin wrote: > On Wed, Nov 26, 2014 at 01:11:25PM +0300, Denis V. Lunev wrote: >> From: Raushaniya Maksudova <rmaksudova@parallels.com> >> >> Excessive virtio_balloon inflation can cause invocation of OOM-killer, >> when Linux is under severe memory pressure. Various mechanisms are >> responsible for correct virtio_balloon memory management. Nevertheless it >> is often the case that these control tools does not have enough time to >> react on fast changing memory load. As a result OS runs out of memory and >> invokes OOM-killer. The balancing of memory by use of the virtio balloon >> should not cause the termination of processes while there are pages in the >> balloon. Now there is no way for virtio balloon driver to free memory at >> the last moment before some process get killed by OOM-killer. >> >> This does not provide a security breach as balloon itself is running >> inside Guest OS and is working in the cooperation with the host. Thus >> some improvements from Guest side should be considered as normal. >> >> To solve the problem, introduce a virtio_balloon callback which is >> expected to be called from the oom notifier call chain in out_of_memory() >> function. If virtio balloon could release some memory, it will make the >> system to return and retry the allocation that forced the out of memory >> killer to run. >> >> This behavior should be enabled if and only if appropriate feature bit >> is set on the device. It is off by default. >> >> This functionality was recently merged into vanilla Linux (actually in >> linux-next at the moment) >> >> commit 5a10b7dbf904bfe01bb9fcc6298f7df09eed77d5 >> Author: Raushaniya Maksudova <rmaksudova@parallels.com> >> Date: Mon Nov 10 09:36:29 2014 +1030 >> >> This patch adds respective control bits into QEMU. It introduces >> deflate-on-oom option for baloon device which do the trick. >> >> Signed-off-by: Raushaniya Maksudova <rmaksudova@parallels.com> >> Signed-off-by: Denis V. Lunev <den@openvz.org> >> CC: Anthony Liguori <aliguori@amazon.com> >> CC: Michael S. Tsirkin <mst@redhat.com> >> --- >> hw/virtio/virtio-balloon.c | 7 +++++++ >> include/hw/virtio/virtio-balloon.h | 2 ++ >> qemu-options.hx | 6 +++++- >> 3 files changed, 14 insertions(+), 1 deletion(-) >> >> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c >> index 7bfbb75..9d145fa 100644 >> --- a/hw/virtio/virtio-balloon.c >> +++ b/hw/virtio/virtio-balloon.c >> @@ -305,7 +305,12 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, >> >> static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f) >> { >> + VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); >> f |= (1 << VIRTIO_BALLOON_F_STATS_VQ); >> + if (dev->deflate_on_oom) { >> + f |= (1 << VIRTIO_BALLOON_F_DEFLATE_ON_OOM); >> + } >> + >> return f; >> } >> >> @@ -409,6 +414,7 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp) >> } >> >> static Property virtio_balloon_properties[] = { >> + DEFINE_PROP_BOOL("deflate-on-oom", VirtIOBalloon, deflate_on_oom, false), >> DEFINE_PROP_END_OF_LIST(), >> }; >> >> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h >> index f863bfe..45cc55a 100644 >> --- a/include/hw/virtio/virtio-balloon.h >> +++ b/include/hw/virtio/virtio-balloon.h >> @@ -30,6 +30,7 @@ >> /* The feature bitmap for virtio balloon */ >> #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */ >> #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory stats virtqueue */ >> +#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ >> >> /* Size of a PFN in the balloon interface. */ >> #define VIRTIO_BALLOON_PFN_SHIFT 12 >> @@ -67,6 +68,7 @@ typedef struct VirtIOBalloon { >> QEMUTimer *stats_timer; >> int64_t stats_last_update; >> int64_t stats_poll_interval; >> + bool deflate_on_oom; >> } VirtIOBalloon; >> >> #endif > You don't need an extra bool, and open-coding. > Do it same as we do for other features please, > set bit in feature mask directly. there is no host_features placeholder at this level. We could add it and propagate proper bit in get_features callback like return f | dev->host_features or we have to move this stuff a little bit up into CCW & PCI code like done in virtscsi with #define DEFINE_VIRTIO_SCSI_FEATURES(_state, _feature_field) The first approach keeps bits in the same place, the second follows current approach. If you prefer unification way, I think that other devices could be re-written this way. Any opinion? >> diff --git a/qemu-options.hx b/qemu-options.hx >> index da9851d..14ede0b 100644 >> --- a/qemu-options.hx >> +++ b/qemu-options.hx >> @@ -324,7 +324,8 @@ ETEXI >> DEF("balloon", HAS_ARG, QEMU_OPTION_balloon, >> "-balloon none disable balloon device\n" >> "-balloon virtio[,addr=str]\n" >> - " enable virtio balloon device (default)\n", QEMU_ARCH_ALL) >> + " enable virtio balloon device (default)\n" >> + " [,deflate-on-oom=on|off]\n", QEMU_ARCH_ALL) >> STEXI >> @item -balloon none >> @findex -balloon >> @@ -332,6 +333,9 @@ Disable balloon device. >> @item -balloon virtio[,addr=@var{addr}] >> Enable virtio balloon device (default), optionally with PCI address >> @var{addr}. >> +@item -balloon virtio[,deflate-on-oom=@var{deflate-on-oom}] >> +@var{deflate-on-oom} is "on" or "off" and enables whether to let Guest OS >> +to deflate virtio balloon on OOM. Default is off. >> ETEXI >> >> DEF("device", HAS_ARG, QEMU_OPTION_device, > Please don't add stuff to legacy -balloon. > New -device is enough, you don't need to touch qemu-options.hx > for it. > sure
On Thu, Nov 27, 2014 at 02:04:55PM +0300, Denis V. Lunev wrote: > On 26/11/14 14:16, Michael S. Tsirkin wrote: > >On Wed, Nov 26, 2014 at 01:11:25PM +0300, Denis V. Lunev wrote: > >>From: Raushaniya Maksudova <rmaksudova@parallels.com> > >> > >>Excessive virtio_balloon inflation can cause invocation of OOM-killer, > >>when Linux is under severe memory pressure. Various mechanisms are > >>responsible for correct virtio_balloon memory management. Nevertheless it > >>is often the case that these control tools does not have enough time to > >>react on fast changing memory load. As a result OS runs out of memory and > >>invokes OOM-killer. The balancing of memory by use of the virtio balloon > >>should not cause the termination of processes while there are pages in the > >>balloon. Now there is no way for virtio balloon driver to free memory at > >>the last moment before some process get killed by OOM-killer. > >> > >>This does not provide a security breach as balloon itself is running > >>inside Guest OS and is working in the cooperation with the host. Thus > >>some improvements from Guest side should be considered as normal. > >> > >>To solve the problem, introduce a virtio_balloon callback which is > >>expected to be called from the oom notifier call chain in out_of_memory() > >>function. If virtio balloon could release some memory, it will make the > >>system to return and retry the allocation that forced the out of memory > >>killer to run. > >> > >>This behavior should be enabled if and only if appropriate feature bit > >>is set on the device. It is off by default. > >> > >>This functionality was recently merged into vanilla Linux (actually in > >>linux-next at the moment) > >> > >> commit 5a10b7dbf904bfe01bb9fcc6298f7df09eed77d5 > >> Author: Raushaniya Maksudova <rmaksudova@parallels.com> > >> Date: Mon Nov 10 09:36:29 2014 +1030 > >> > >>This patch adds respective control bits into QEMU. It introduces > >>deflate-on-oom option for baloon device which do the trick. > >> > >>Signed-off-by: Raushaniya Maksudova <rmaksudova@parallels.com> > >>Signed-off-by: Denis V. Lunev <den@openvz.org> > >>CC: Anthony Liguori <aliguori@amazon.com> > >>CC: Michael S. Tsirkin <mst@redhat.com> > >>--- > >> hw/virtio/virtio-balloon.c | 7 +++++++ > >> include/hw/virtio/virtio-balloon.h | 2 ++ > >> qemu-options.hx | 6 +++++- > >> 3 files changed, 14 insertions(+), 1 deletion(-) > >> > >>diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c > >>index 7bfbb75..9d145fa 100644 > >>--- a/hw/virtio/virtio-balloon.c > >>+++ b/hw/virtio/virtio-balloon.c > >>@@ -305,7 +305,12 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, > >> static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f) > >> { > >>+ VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > >> f |= (1 << VIRTIO_BALLOON_F_STATS_VQ); > >>+ if (dev->deflate_on_oom) { > >>+ f |= (1 << VIRTIO_BALLOON_F_DEFLATE_ON_OOM); > >>+ } > >>+ > >> return f; > >> } > >>@@ -409,6 +414,7 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp) > >> } > >> static Property virtio_balloon_properties[] = { > >>+ DEFINE_PROP_BOOL("deflate-on-oom", VirtIOBalloon, deflate_on_oom, false), > >> DEFINE_PROP_END_OF_LIST(), > >> }; > >>diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h > >>index f863bfe..45cc55a 100644 > >>--- a/include/hw/virtio/virtio-balloon.h > >>+++ b/include/hw/virtio/virtio-balloon.h > >>@@ -30,6 +30,7 @@ > >> /* The feature bitmap for virtio balloon */ > >> #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */ > >> #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory stats virtqueue */ > >>+#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ > >> /* Size of a PFN in the balloon interface. */ > >> #define VIRTIO_BALLOON_PFN_SHIFT 12 > >>@@ -67,6 +68,7 @@ typedef struct VirtIOBalloon { > >> QEMUTimer *stats_timer; > >> int64_t stats_last_update; > >> int64_t stats_poll_interval; > >>+ bool deflate_on_oom; > >> } VirtIOBalloon; > >> #endif > >You don't need an extra bool, and open-coding. > >Do it same as we do for other features please, > >set bit in feature mask directly. > there is no host_features placeholder at this level. > > We could add it and propagate proper bit in get_features > callback like > return f | dev->host_features > or we have to move this stuff a little bit up into > CCW & PCI code like done in virtscsi with > #define DEFINE_VIRTIO_SCSI_FEATURES(_state, _feature_field) > > The first approach keeps bits in the same place, > the second follows current approach. If you prefer unification way, > I think that other devices could be re-written this way. > > Any opinion? Balloon is the weird one out here. Look at how feature bits are defined for e.g. DEFINE_PROP_BIT("any_layout", _state, _field, VIRTIO_F_ANY_LAYOUT, true), you should do something similar. > >>diff --git a/qemu-options.hx b/qemu-options.hx > >>index da9851d..14ede0b 100644 > >>--- a/qemu-options.hx > >>+++ b/qemu-options.hx > >>@@ -324,7 +324,8 @@ ETEXI > >> DEF("balloon", HAS_ARG, QEMU_OPTION_balloon, > >> "-balloon none disable balloon device\n" > >> "-balloon virtio[,addr=str]\n" > >>- " enable virtio balloon device (default)\n", QEMU_ARCH_ALL) > >>+ " enable virtio balloon device (default)\n" > >>+ " [,deflate-on-oom=on|off]\n", QEMU_ARCH_ALL) > >> STEXI > >> @item -balloon none > >> @findex -balloon > >>@@ -332,6 +333,9 @@ Disable balloon device. > >> @item -balloon virtio[,addr=@var{addr}] > >> Enable virtio balloon device (default), optionally with PCI address > >> @var{addr}. > >>+@item -balloon virtio[,deflate-on-oom=@var{deflate-on-oom}] > >>+@var{deflate-on-oom} is "on" or "off" and enables whether to let Guest OS > >>+to deflate virtio balloon on OOM. Default is off. > >> ETEXI > >> DEF("device", HAS_ARG, QEMU_OPTION_device, > >Please don't add stuff to legacy -balloon. > >New -device is enough, you don't need to touch qemu-options.hx > >for it. > > > sure
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c index 7bfbb75..9d145fa 100644 --- a/hw/virtio/virtio-balloon.c +++ b/hw/virtio/virtio-balloon.c @@ -305,7 +305,12 @@ static void virtio_balloon_set_config(VirtIODevice *vdev, static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f) { + VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); f |= (1 << VIRTIO_BALLOON_F_STATS_VQ); + if (dev->deflate_on_oom) { + f |= (1 << VIRTIO_BALLOON_F_DEFLATE_ON_OOM); + } + return f; } @@ -409,6 +414,7 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp) } static Property virtio_balloon_properties[] = { + DEFINE_PROP_BOOL("deflate-on-oom", VirtIOBalloon, deflate_on_oom, false), DEFINE_PROP_END_OF_LIST(), }; diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h index f863bfe..45cc55a 100644 --- a/include/hw/virtio/virtio-balloon.h +++ b/include/hw/virtio/virtio-balloon.h @@ -30,6 +30,7 @@ /* The feature bitmap for virtio balloon */ #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */ #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory stats virtqueue */ +#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ /* Size of a PFN in the balloon interface. */ #define VIRTIO_BALLOON_PFN_SHIFT 12 @@ -67,6 +68,7 @@ typedef struct VirtIOBalloon { QEMUTimer *stats_timer; int64_t stats_last_update; int64_t stats_poll_interval; + bool deflate_on_oom; } VirtIOBalloon; #endif diff --git a/qemu-options.hx b/qemu-options.hx index da9851d..14ede0b 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -324,7 +324,8 @@ ETEXI DEF("balloon", HAS_ARG, QEMU_OPTION_balloon, "-balloon none disable balloon device\n" "-balloon virtio[,addr=str]\n" - " enable virtio balloon device (default)\n", QEMU_ARCH_ALL) + " enable virtio balloon device (default)\n" + " [,deflate-on-oom=on|off]\n", QEMU_ARCH_ALL) STEXI @item -balloon none @findex -balloon @@ -332,6 +333,9 @@ Disable balloon device. @item -balloon virtio[,addr=@var{addr}] Enable virtio balloon device (default), optionally with PCI address @var{addr}. +@item -balloon virtio[,deflate-on-oom=@var{deflate-on-oom}] +@var{deflate-on-oom} is "on" or "off" and enables whether to let Guest OS +to deflate virtio balloon on OOM. Default is off. ETEXI DEF("device", HAS_ARG, QEMU_OPTION_device,