Message ID | 20210326183024.3214527-3-blp@ovn.org |
---|---|
State | Rejected |
Headers | show |
Series | [ovs-dev,1/3] ovs-lldp: Get rid of pointless null pointer check. | expand |
On 3/26/21 7:30 PM, Ben Pfaff wrote: > From: James Page <james.page@ubuntu.com> > > Normally when OVS runs on a server the effective system limit for > locked memory is unlimited as the ovs-vswitchd runs as the root > user in the root namespace of the server. > > When OVS is run in a non-priviledged container a system limit will > apply and its possible that this limit will be reached during > normal operation. > > If this occurs unlock all memory and re-attempt (re)allocation of > memory rather than fail and abort. > > If this subsequent attempt also fails abort the process as before. > > Signed-off-by: James Page <james.page@ubuntu.com> > [blp@ovn.org adapted to cover more cases] > Signed-off-by: Ben Pfaff <blp@ovn.org> > --- Hi, Ben and James. I'll repeat here what I said in a comment on github. DPDK uses memory locking for mempool pages and it relies on these pages not being moved or unloaded, because it uses physical addresses to configure hardware NICs. Also, some DPDK drivers are using memory locking to lock in memory pages that are used for driver-specific device I/O. Unlocking any of these pages in runtime will likely crash the system by NMI if such a page will be moved or swapped without notifying the device and driver. Blindly unlocking all the memory is very dangerous. I'd say it's much more safe to just run OVS without passing --mlockall. Or we need to itroduce a special cmdline flag to allow dynamic unlocking and also forbid DPDK initialization if this flag enabled. And all this should be sufficiently documented. Best regards, Ilya Maximets.
Bleep bloop. Greetings Ben Pfaff, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Unexpected sign-offs from developers who are not authors or co-authors or committers: Ben Pfaff <blp@ovn.org> ERROR: C99 style comment #88 FILE: lib/util.c:132: #endif // HAVE_MLOCKALL ERROR: Use xrealloc() in place of realloc() #119 FILE: lib/util.c:167: void *rp = realloc(p, size ? size : 1); Lines checked: 156, Warnings: 1, Errors: 2 Please check this out. If you feel there has been an error, please email aconole@redhat.com Thanks, 0-day Robot
On Fri, Mar 26, 2021 at 07:52:18PM +0100, Ilya Maximets wrote: > On 3/26/21 7:30 PM, Ben Pfaff wrote: > > From: James Page <james.page@ubuntu.com> > > > > Normally when OVS runs on a server the effective system limit for > > locked memory is unlimited as the ovs-vswitchd runs as the root > > user in the root namespace of the server. > > > > When OVS is run in a non-priviledged container a system limit will > > apply and its possible that this limit will be reached during > > normal operation. > > > > If this occurs unlock all memory and re-attempt (re)allocation of > > memory rather than fail and abort. > > > > If this subsequent attempt also fails abort the process as before. > > > > Signed-off-by: James Page <james.page@ubuntu.com> > > [blp@ovn.org adapted to cover more cases] > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > --- > > Hi, Ben and James. > > I'll repeat here what I said in a comment on github. > > DPDK uses memory locking for mempool pages and it relies on these > pages not being moved or unloaded, because it uses physical addresses > to configure hardware NICs. Also, some DPDK drivers are using memory > locking to lock in memory pages that are used for driver-specific > device I/O. Unlocking any of these pages in runtime will likely crash > the system by NMI if such a page will be moved or swapped without > notifying the device and driver. > > Blindly unlocking all the memory is very dangerous. > > I'd say it's much more safe to just run OVS without passing --mlockall. > Or we need to itroduce a special cmdline flag to allow dynamic unlocking > and also forbid DPDK initialization if this flag enabled. > And all this should be sufficiently documented. I don't have a dog in this fight. I'll leave it to you and James to figure out the preferred behavior. Patches 1 and 2 may be of interest either way.
On 3/26/21 9:18 PM, Ben Pfaff wrote: > On Fri, Mar 26, 2021 at 07:52:18PM +0100, Ilya Maximets wrote: >> On 3/26/21 7:30 PM, Ben Pfaff wrote: >>> From: James Page <james.page@ubuntu.com> >>> >>> Normally when OVS runs on a server the effective system limit for >>> locked memory is unlimited as the ovs-vswitchd runs as the root >>> user in the root namespace of the server. >>> >>> When OVS is run in a non-priviledged container a system limit will >>> apply and its possible that this limit will be reached during >>> normal operation. >>> >>> If this occurs unlock all memory and re-attempt (re)allocation of >>> memory rather than fail and abort. >>> >>> If this subsequent attempt also fails abort the process as before. >>> >>> Signed-off-by: James Page <james.page@ubuntu.com> >>> [blp@ovn.org adapted to cover more cases] >>> Signed-off-by: Ben Pfaff <blp@ovn.org> >>> --- >> >> Hi, Ben and James. >> >> I'll repeat here what I said in a comment on github. >> >> DPDK uses memory locking for mempool pages and it relies on these >> pages not being moved or unloaded, because it uses physical addresses >> to configure hardware NICs. Also, some DPDK drivers are using memory >> locking to lock in memory pages that are used for driver-specific >> device I/O. Unlocking any of these pages in runtime will likely crash >> the system by NMI if such a page will be moved or swapped without >> notifying the device and driver. >> >> Blindly unlocking all the memory is very dangerous. >> >> I'd say it's much more safe to just run OVS without passing --mlockall. >> Or we need to itroduce a special cmdline flag to allow dynamic unlocking >> and also forbid DPDK initialization if this flag enabled. >> And all this should be sufficiently documented. > > I don't have a dog in this fight. I'll leave it to you and James to > figure out the preferred behavior. Sure. AFAIK, container management systems like kubernetes are tracking memory consumption and doesn't schedule/create new containers if system (kubelet daemon on the node) reports memory pressure. So, it should not be a big concern to run OVS without memory locking. In its current form change is not acceptable as it will cause problems for DPDK users (and, probably, afxdp, but I'm not sure about that). > > Patches 1 and 2 may be of interest either way. > Replied to these patches. Best regards, Ilya Maximets.
On Wed, Apr 07, 2021 at 01:19:09PM +0200, Ilya Maximets wrote: > On 3/26/21 9:18 PM, Ben Pfaff wrote: > > On Fri, Mar 26, 2021 at 07:52:18PM +0100, Ilya Maximets wrote: > >> On 3/26/21 7:30 PM, Ben Pfaff wrote: > >>> From: James Page <james.page@ubuntu.com> > >>> > >>> Normally when OVS runs on a server the effective system limit for > >>> locked memory is unlimited as the ovs-vswitchd runs as the root > >>> user in the root namespace of the server. > >>> > >>> When OVS is run in a non-priviledged container a system limit will > >>> apply and its possible that this limit will be reached during > >>> normal operation. > >>> > >>> If this occurs unlock all memory and re-attempt (re)allocation of > >>> memory rather than fail and abort. > >>> > >>> If this subsequent attempt also fails abort the process as before. > >>> > >>> Signed-off-by: James Page <james.page@ubuntu.com> > >>> [blp@ovn.org adapted to cover more cases] > >>> Signed-off-by: Ben Pfaff <blp@ovn.org> > >>> --- > >> > >> Hi, Ben and James. > >> > >> I'll repeat here what I said in a comment on github. > >> > >> DPDK uses memory locking for mempool pages and it relies on these > >> pages not being moved or unloaded, because it uses physical addresses > >> to configure hardware NICs. Also, some DPDK drivers are using memory > >> locking to lock in memory pages that are used for driver-specific > >> device I/O. Unlocking any of these pages in runtime will likely crash > >> the system by NMI if such a page will be moved or swapped without > >> notifying the device and driver. > >> > >> Blindly unlocking all the memory is very dangerous. > >> > >> I'd say it's much more safe to just run OVS without passing --mlockall. > >> Or we need to itroduce a special cmdline flag to allow dynamic unlocking > >> and also forbid DPDK initialization if this flag enabled. > >> And all this should be sufficiently documented. > > > > I don't have a dog in this fight. I'll leave it to you and James to > > figure out the preferred behavior. > > Sure. AFAIK, container management systems like kubernetes are tracking > memory consumption and doesn't schedule/create new containers if system > (kubelet daemon on the node) reports memory pressure. So, it should not > be a big concern to run OVS without memory locking. > > In its current form change is not acceptable as it will cause problems > for DPDK users (and, probably, afxdp, but I'm not sure about that). Consider this dropped.
On Wed, Apr 7, 2021 at 5:45 PM Ben Pfaff <blp@ovn.org> wrote: > On Wed, Apr 07, 2021 at 01:19:09PM +0200, Ilya Maximets wrote: > > On 3/26/21 9:18 PM, Ben Pfaff wrote: > > > On Fri, Mar 26, 2021 at 07:52:18PM +0100, Ilya Maximets wrote: > > >> On 3/26/21 7:30 PM, Ben Pfaff wrote: > > >>> From: James Page <james.page@ubuntu.com> > > >>> > > >>> Normally when OVS runs on a server the effective system limit for > > >>> locked memory is unlimited as the ovs-vswitchd runs as the root > > >>> user in the root namespace of the server. > > >>> > > >>> When OVS is run in a non-priviledged container a system limit will > > >>> apply and its possible that this limit will be reached during > > >>> normal operation. > > >>> > > >>> If this occurs unlock all memory and re-attempt (re)allocation of > > >>> memory rather than fail and abort. > > >>> > > >>> If this subsequent attempt also fails abort the process as before. > > >>> > > >>> Signed-off-by: James Page <james.page@ubuntu.com> > > >>> [blp@ovn.org adapted to cover more cases] > > >>> Signed-off-by: Ben Pfaff <blp@ovn.org> > > >>> --- > > >> > > >> Hi, Ben and James. > > >> > > >> I'll repeat here what I said in a comment on github. > > >> > > >> DPDK uses memory locking for mempool pages and it relies on these > > >> pages not being moved or unloaded, because it uses physical addresses > > >> to configure hardware NICs. Also, some DPDK drivers are using memory > > >> locking to lock in memory pages that are used for driver-specific > > >> device I/O. Unlocking any of these pages in runtime will likely crash > > >> the system by NMI if such a page will be moved or swapped without > > >> notifying the device and driver. > > >> > > >> Blindly unlocking all the memory is very dangerous. > > >> > > >> I'd say it's much more safe to just run OVS without passing > --mlockall. > > >> Or we need to itroduce a special cmdline flag to allow dynamic > unlocking > > >> and also forbid DPDK initialization if this flag enabled. > > >> And all this should be sufficiently documented. > > > > > > I don't have a dog in this fight. I'll leave it to you and James to > > > figure out the preferred behavior. > > > > Sure. AFAIK, container management systems like kubernetes are tracking > > memory consumption and doesn't schedule/create new containers if system > > (kubelet daemon on the node) reports memory pressure. So, it should not > > be a big concern to run OVS without memory locking. > > > > In its current form change is not acceptable as it will cause problems > > for DPDK users (and, probably, afxdp, but I'm not sure about that). > > Consider this dropped. > Apologies for the laggy response - email woes now resolved. I'll raise a review to simply detect execution in a container and stop passing mlockall in this combo
diff --git a/lib/stream-ssl.c b/lib/stream-ssl.c index 078fcbc3aa4a..cd906e9e7b1d 100644 --- a/lib/stream-ssl.c +++ b/lib/stream-ssl.c @@ -1096,10 +1096,10 @@ tmp_dh_callback(SSL *ssl OVS_UNUSED, int is_export OVS_UNUSED, int keylength) for (dh = dh_table; dh < &dh_table[ARRAY_SIZE(dh_table)]; dh++) { if (dh->keylength == keylength) { - if (!dh->dh) { + while (!dh->dh) { dh->dh = dh->constructor(); if (!dh->dh) { - out_of_memory(); + munlockall_or_out_of_memory(); } } return dh->dh; diff --git a/lib/util.c b/lib/util.c index 1195c7982118..e9c5c40094bc 100644 --- a/lib/util.c +++ b/lib/util.c @@ -41,6 +41,9 @@ #ifdef _WIN32 #include <shlwapi.h> #endif +#ifdef HAVE_MLOCKALL +#include <sys/mman.h> +#endif VLOG_DEFINE_THIS_MODULE(util); @@ -103,6 +106,12 @@ set_memory_locked(void) is_memory_locked = true; } +void +set_memory_unlocked(void) +{ + is_memory_locked = false; +} + bool memory_locked(void) { @@ -110,8 +119,17 @@ memory_locked(void) } void -out_of_memory(void) -{ +munlockall_or_out_of_memory(void) +{ +#ifdef HAVE_MLOCKALL + if (errno == EAGAIN && memory_locked()) { + /* Locked memory allocation failed + * unlock and try again */ + munlockall(); + set_memory_unlocked(); + return; + } +#endif // HAVE_MLOCKALL ovs_abort(0, "virtual memory exhausted"); } @@ -120,7 +138,8 @@ xcalloc__(size_t count, size_t size) { void *p = count && size ? calloc(count, size) : malloc(1); if (p == NULL) { - out_of_memory(); + munlockall_or_out_of_memory(); + return xcalloc__(count, size); } return p; } @@ -136,7 +155,8 @@ xmalloc__(size_t size) { void *p = malloc(size ? size : 1); if (p == NULL) { - out_of_memory(); + munlockall_or_out_of_memory(); + return xmalloc__(size); } return p; } @@ -144,11 +164,12 @@ xmalloc__(size_t size) void * xrealloc__(void *p, size_t size) { - p = realloc(p, size ? size : 1); - if (p == NULL) { - out_of_memory(); + void *rp = realloc(p, size ? size : 1); + if (rp == NULL) { + munlockall_or_out_of_memory(); + return xrealloc__(p, size); } - return p; + return rp; } void * @@ -253,7 +274,8 @@ xmalloc_size_align(size_t size, size_t alignment) COVERAGE_INC(util_xalloc); error = posix_memalign(&p, alignment, size ? size : 1); if (error != 0) { - out_of_memory(); + munlockall_or_out_of_memory(); + return xmalloc_size_align(size, alignment); } return p; #else diff --git a/lib/util.h b/lib/util.h index 9c2b14fae304..a1958fcf93b4 100644 --- a/lib/util.h +++ b/lib/util.h @@ -144,8 +144,9 @@ void ctl_timeout_setup(unsigned int secs); void ovs_print_version(uint8_t min_ofp, uint8_t max_ofp); void set_memory_locked(void); +void set_memory_unlocked(void); bool memory_locked(void); -OVS_NO_RETURN void out_of_memory(void); +void munlockall_or_out_of_memory(void); /* Allocation wrappers that abort if memory is exhausted. */ void *xmalloc(size_t) MALLOC_LIKE;