Message ID | e6cb92be01af6190350b9b3765bee6e3@nuclearcat.com |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
On 2016-08-01 23:59, Guillaume Nault wrote:
> Do you still have the vmlinux file with debug symbols that generated
> this panic?
Sorry for the delay, I didn't have the same image on all servers. I have probably found the cause of the panic, but I am still testing on several servers.
If I remove the SFQ qdisc from the PPP shapers, the servers no longer reboot.
But I still need around 2 days to make sure that's the reason.
On Mon, Aug 08, 2016 at 02:25:00PM +0300, Denys Fedoryshchenko wrote:
> On 2016-08-01 23:59, Guillaume Nault wrote:
> > Do you still have the vmlinux file with debug symbols that generated
> > this panic?
> Sorry for the delay, I didn't have the same image on all servers. I have
> probably found the cause of the panic, but I am still testing on several servers.
> If I remove the SFQ qdisc from the PPP shapers, the servers no longer reboot.
>
Thanks for the feedback. I wonder which interactions between SFQ and
PPP can lead to this problem. I'll take a look.

> But I still need around 2 days to make sure that's the reason.
>
Okay, just let me know if you can confirm that removing SFQ really
solves the problem.
On 2016-08-09 00:05, Guillaume Nault wrote:
> On Mon, Aug 08, 2016 at 02:25:00PM +0300, Denys Fedoryshchenko wrote:
>> On 2016-08-01 23:59, Guillaume Nault wrote:
>> > Do you still have the vmlinux file with debug symbols that generated
>> > this panic?
>> Sorry for the delay, I didn't have the same image on all servers. I have
>> probably found the cause of the panic, but I am still testing on several servers.
>> If I remove the SFQ qdisc from the PPP shapers, the servers no longer reboot.
>>
> Thanks for the feedback. I wonder which interactions between SFQ and
> PPP can lead to this problem. I'll take a look.
>
>> But I still need around 2 days to make sure that's the reason.
>>
> Okay, just let me know if you can confirm that removing SFQ really
> solves the problem.
After long testing, I can confirm that removing SFQ from the rules greatly reduced the panic reboots; this was tested on many different servers.
Today I will try some stress tests: applying SFQ qdiscs on a live system at night, then removing them. Then I will also try disconnecting all users that have SFQ qdiscs attached. I'm not sure it will help reproduce the bug, but it is worth trying.
I am still hitting a different conntrack bug about once per week, and that's why I was confused: I was clearly getting panics in conntrack and then something else, and I was not sure whether these were different bugs, a hardware glitch, or something else.
--- linux/net/sched/sch_htb.c	2016-06-08 01:23:53.000000000 +0000
+++ linux-new/net/sched/sch_htb.c	2016-06-21 14:03:08.398486593 +0000
@@ -1495,10 +1495,10 @@
 				cl->common.classid);
 			cl->quantum = 1000;
 		}
-		if (!hopt->quantum && cl->quantum > 200000) {
+		if (!hopt->quantum && cl->quantum > 2000000) {
 			pr_warn("HTB: quantum of class %X is big. Consider r2q change.\n",
 				cl->common.classid);
-			cl->quantum = 200000;
+			cl->quantum = 2000000;
 		}
 		if (hopt->quantum)
 			cl->quantum = hopt->quantum;
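To make the effect of the changed constant easier to see, here is a minimal userspace sketch of the clamp the hunk touches. It is not the kernel code: the function name `htb_auto_quantum`, its parameters, and the macro names are hypothetical, and the `rate / r2q` derivation is not shown in the hunk itself but is assumed from the "Consider r2q change" warning and the usual HTB behaviour. Only the bounds 1000, 200000 and 2000000 and the "explicit quantum wins" rule come from the diff.

```c
#include <stdint.h>

/* Bounds taken from the diff; the macro names are illustrative only. */
#define QUANTUM_MIN      1000u
#define QUANTUM_MAX_OLD  200000u
#define QUANTUM_MAX_NEW  2000000u

/*
 * Hypothetical helper mirroring the logic around the patched lines.
 *   rate_bytes_ps : class rate in bytes per second (assumed input)
 *   r2q           : rate-to-quantum divisor (assumed input)
 *   user_quantum  : quantum set explicitly by the user, 0 if unset
 *   max_quantum   : upper bound, 200000 before the patch, 2000000 after
 */
static uint32_t htb_auto_quantum(uint64_t rate_bytes_ps, uint32_t r2q,
                                 uint32_t user_quantum, uint32_t max_quantum)
{
    uint64_t quantum;

    /* An explicit quantum bypasses the clamp, as in the last context lines. */
    if (user_quantum)
        return user_quantum;

    /* Guard against division by zero; this guard is an assumption of the sketch. */
    if (r2q == 0)
        r2q = 1;

    quantum = rate_bytes_ps / r2q;

    if (quantum < QUANTUM_MIN)
        quantum = QUANTUM_MIN;   /* the "quantum is small" branch */
    if (quantum > max_quantum)
        quantum = max_quantum;   /* the "quantum is big" branch changed by the patch */

    return (uint32_t)quantum;
}
```

Under these assumptions, a 1 Gbit/s class (125000000 bytes/s) with r2q 10 is capped at 200000 with the old bound and at 2000000 with the new one, while a 10 Mbit/s class (1250000 bytes/s) gets 125000 either way.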