Message ID | 20200907144435.43165-1-dust.li@linux.alibaba.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | net/sock: don't drop udp packets if udp_mem[2] not reached | expand |
Hi, On Mon, 2020-09-07 at 22:44 +0800, Dust Li wrote: > We encoutered udp packets drop under a pretty low pressure > with net.ipv4.udp_mem[0] set to a small value (4096). > > After some tracing and debugging, we found that for udp > protocol, __sk_mem_raise_allocated() will possiblly drop > packets if: > udp_mem[0] < udp_prot.memory_allocated < udp_mem[2] > > That's because __sk_mem_raise_allocated() didn't handle > the above condition for protocols like udp who doesn't > have sk_has_memory_pressure() > > We can reproduce this with the following condition > 1. udp_mem[0] is relateive small, > 2. net.core.rmem_default/max > udp_mem[0] * 4K This looks like something that could/should be addressed at configuration level ?!? udp_mem[0] should accomodate confortably at least a socket. Cheers, Paolo
On Mon, Sep 07, 2020 at 07:18:48PM +0200, Paolo Abeni wrote: >Hi, > >On Mon, 2020-09-07 at 22:44 +0800, Dust Li wrote: >> We encoutered udp packets drop under a pretty low pressure >> with net.ipv4.udp_mem[0] set to a small value (4096). >> >> After some tracing and debugging, we found that for udp >> protocol, __sk_mem_raise_allocated() will possiblly drop >> packets if: >> udp_mem[0] < udp_prot.memory_allocated < udp_mem[2] >> >> That's because __sk_mem_raise_allocated() didn't handle >> the above condition for protocols like udp who doesn't >> have sk_has_memory_pressure() >> >> We can reproduce this with the following condition >> 1. udp_mem[0] is relateive small, >> 2. net.core.rmem_default/max > udp_mem[0] * 4K > >This looks like something that could/should be addressed at >configuration level ?!? Thanks a lot for the review ! Sorry, maybe I haven't make it clear enough The real problem is the scability with the sockets number. Since the udp_mem is for all UDP sockets, with the number of udp sockets grows, soon or later, udp_prot.memory_allocated will exceed udp_mem[0], and __sk_mem_raise_allocated() will cause the packets drop here. But the total udp memory allocated may still far under udp_mem[1] or udp_mem[2] > >udp_mem[0] should accomodate confortably at least a socket. Yeah, I agree udp_mem[0] should be large enough for at least a socket. Here I use 4096 just for simple and reproduce what we met before. I changed my test program a bit: - with 16 server sockets - with 1 client sending 3000 messages(size: 4096bytes) to each of those 8 server sockets - set net.core.rmem_default/max to (2*4096*4096) - and keep udp_mem unset, which by default on my 4GB VM is 'net.ipv4.udp_mem = 91944 122592 183888' https://github.com/dust-li/kernel-test/blob/master/exceed_udp_mem_min_drop/multi_sockets_test.sh Actually, with more udp sockets, I can always make it large enough to exceed udp_mem[0], and drop packets before udp_mem[1] and udp_mem[2]. > >Cheers, > >Paolo
Hi, On Tue, 2020-09-08 at 11:15 +0800, dust.li wrote: > Actually, with more udp sockets, I can always make it large > enough to exceed udp_mem[0], and drop packets before udp_mem[1] > and udp_mem[2]. Sure, but with enough sockets you can also exceeeds any limits ;). > diff --git a/net/core/sock.c b/net/core/sock.c > index 6c5c6b18eff4..fed8211d8dbe 100644 > --- a/net/core/sock.c > +++ b/net/core/sock.c > @@ -2648,6 +2648,12 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) > atomic_read(&sk->sk_rmem_alloc) + > sk->sk_forward_alloc)) > return 1; > + } else { > + /* for prots without memory_pressure callbacks, we should not > + * drop until hard limit reached > + */ > + if (allocated <= sk_prot_mem_limits(sk, 2)) > + return 1; At this point, the above condition is always true, due to an earlier check. Additionally, accepting any value below udp_mem[2] would make the previous checks to allow a minimum per socket memory useless. You can obtain the same result setting udp_mem[0] = udp_mem[2], without any kernel change. But with this change applied you can't guarantee anymore a minimum per socket amount of memory. I think you are possibly mislead by your own comment: the point is that we should never allow allocation above the hard limit, but the protocol is allowed to drop as soon as the memory allocated raises above the lower limit. Note that the current behavior is correctly documented in Documentation/networking/ip-sysctl.rst. Your problem must be solved in another way e.g. raising udp_mem[0] - and keeping udp_mem[2] above that value. Cheers, Paolo
On Tue, Sep 08, 2020 at 10:46:13AM +0200, Paolo Abeni wrote: >Hi, > >On Tue, 2020-09-08 at 11:15 +0800, dust.li wrote: >> Actually, with more udp sockets, I can always make it large >> enough to exceed udp_mem[0], and drop packets before udp_mem[1] >> and udp_mem[2]. > >Sure, but with enough sockets you can also exceeeds any limits ;). > >> diff --git a/net/core/sock.c b/net/core/sock.c >> index 6c5c6b18eff4..fed8211d8dbe 100644 >> --- a/net/core/sock.c >> +++ b/net/core/sock.c >> @@ -2648,6 +2648,12 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) >> atomic_read(&sk->sk_rmem_alloc) + >> sk->sk_forward_alloc)) >> return 1; >> + } else { >> + /* for prots without memory_pressure callbacks, we should not >> + * drop until hard limit reached >> + */ >> + if (allocated <= sk_prot_mem_limits(sk, 2)) >> + return 1; > >At this point, the above condition is always true, due to an earlier >check. Additionally, accepting any value below udp_mem[2] would make >the previous checks to allow a minimum per socket memory useless. Thanks for your reply ! Yeah, this is wrong :O > >You can obtain the same result setting udp_mem[0] = udp_mem[2], without >any kernel change. > >But with this change applied you can't guarantee anymore a minimum per >socket amount of memory. Yeah, understood. > >I think you are possibly mislead by your own comment: the point is that >we should never allow allocation above the hard limit, but the protocol >is allowed to drop as soon as the memory allocated raises above the >lower limit. Agree. > >Note that the current behavior is correctly documented >in Documentation/networking/ip-sysctl.rst. > >Your problem must be solved in another way e.g. raising udp_mem[0] - >and keeping udp_mem[2] above that value. I think it's somehow not quite fit what Documentation/networking/ip-sysctl.rst says: min: Below this number of pages UDP is not bothered about its memory appetite. When amount of memory allocated by UDP exceeds this number, UDP starts to moderate memory usage. It says UDP starts to moderate memory usage when exceeds min, but here it is actually dropping all the packets unless the socket is under minimum , which means udp_mem[0] is almost equal to the hard limit, and udp_mem[1]/udp_mem[2] is useless. Do you think it's usefull if we move the per socket memory check to the common path for both with and without memory_pressure protocols ? So only those sockets who comsume most of the memory will got dropped before hard limit. Something like this: if (sk_has_memory_pressure(sk)) { if (!sk_under_memory_pressure(sk)) return 1; } alloc = sk_sockets_allocated_read_positive(sk); if (sk_prot_mem_limits(sk, 2) > alloc * sk_mem_pages(sk->sk_wmem_queued + atomic_read(&sk->sk_rmem_alloc) + sk->sk_forward_alloc)) return 1; Sorry for bothering again :) > >Cheers, > >Paolo
diff --git a/net/core/sock.c b/net/core/sock.c index 6c5c6b18eff4..fed8211d8dbe 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2648,6 +2648,12 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) atomic_read(&sk->sk_rmem_alloc) + sk->sk_forward_alloc)) return 1; + } else { + /* for prots without memory_pressure callbacks, we should not + * drop until hard limit reached + */ + if (allocated <= sk_prot_mem_limits(sk, 2)) + return 1; } suppress_allocation:
We encoutered udp packets drop under a pretty low pressure with net.ipv4.udp_mem[0] set to a small value (4096). After some tracing and debugging, we found that for udp protocol, __sk_mem_raise_allocated() will possiblly drop packets if: udp_mem[0] < udp_prot.memory_allocated < udp_mem[2] That's because __sk_mem_raise_allocated() didn't handle the above condition for protocols like udp who doesn't have sk_has_memory_pressure() We can reproduce this with the following condition 1. udp_mem[0] is relateive small, 2. net.core.rmem_default/max > udp_mem[0] * 4K 3. The udp server receive slowly, causing the udp_prot->memory_allocated exceed udp_mem[0], but still under udp_mem[2] I wrote a test script to reproduce this: https://github.com/dust-li/kernel-test/blob/master/exceed_udp_mem_min_drop/exceed_udp_mem_min_drop.sh Obviously, we should not drop packets when udp_prot.memory_allocated just exceed udp_mem[0] but still under hard limit. For protocols with memory_pressure callbacks (like TCP), this is not a problem, because there is an extra check: ``` if (sk_has_memory_pressure(sk)) { u64 alloc; if (!sk_under_memory_pressure(sk)) return 1; alloc = sk_sockets_allocated_read_positive(sk); if (sk_prot_mem_limits(sk, 2) > alloc * sk_mem_pages(sk->sk_wmem_queued + atomic_read(&sk->sk_rmem_alloc) + sk->sk_forward_alloc)) return 1; } ``` But UDP didn't check this, so I add an extra check here to make sure UDP packets are not dropped until the hard limit is reached. Signed-off-by: Dust Li <dust.li@linux.alibaba.com> --- net/core/sock.c | 6 ++++++ 1 file changed, 6 insertions(+)