From patchwork Thu Sep 2 14:19:46 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean Delvare X-Patchwork-Id: 63487 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 389F1B7176 for ; Fri, 3 Sep 2010 00:20:03 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755326Ab0IBOTv (ORCPT ); Thu, 2 Sep 2010 10:19:51 -0400 Received: from cantor.suse.de ([195.135.220.2]:49498 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755221Ab0IBOTu (ORCPT ); Thu, 2 Sep 2010 10:19:50 -0400 Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.221.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id 6F4CF8FEA2; Thu, 2 Sep 2010 16:19:49 +0200 (CEST) From: Jean Delvare Organization: Suse Linux To: Jay Vosburgh Subject: [PATCH] bonding: Fix jiffies overflow problems (again) Date: Thu, 2 Sep 2010 16:19:46 +0200 User-Agent: KMail/1.12.4 (Linux/2.6.32.13-0.5-pae; KDE/4.3.5; i686; ; ) Cc: bonding-devel@lists.sourceforge.net, netdev@vger.kernel.org, Jiri Bohac MIME-Version: 1.0 Message-Id: <201009021619.46206.jdelvare@suse.de> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Jiri Bohac The time_before_eq()/time_after_eq() functions operate on unsigned long and only work if the difference between the two compared values is smaller than half the range of unsigned long (31 bits on i386). Some of the variables (slave->jiffies, dev->trans_start, dev->last_rx) used by bonding store a copy of jiffies and may not be updated for a long time. With HZ=1000, time_before_eq()/time_after_eq() will start giving bad results after ~25 days. jiffies will never be before slave->jiffies, dev->trans_start, dev->last_rx by more than possibly a couple ticks caused by preemption of this code. This allows us to detect/prevent these overflows by replacing time_before_eq()/time_after_eq() with time_in_range(). Signed-off-by: Jiri Bohac Signed-off-by: Jean Delvare --- drivers/net/bonding/bond_main.c | 48 +++++++++++++++++++++++++++------------- 1 file changed, 33 insertions(+), 15 deletions(-) --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -2798,8 +2798,12 @@ void bond_loadbalance_arp_mon(struct wor */ bond_for_each_slave(bond, slave, i) { if (slave->link != BOND_LINK_UP) { - if (time_before_eq(jiffies, dev_trans_start(slave->dev) + delta_in_ticks) && - time_before_eq(jiffies, slave->dev->last_rx + delta_in_ticks)) { + if (time_in_range(jiffies, + dev_trans_start(slave->dev) - delta_in_ticks, + dev_trans_start(slave->dev) + delta_in_ticks) && + time_in_range(jiffies, + slave->dev->last_rx - delta_in_ticks, + slave->dev->last_rx + delta_in_ticks)) { slave->link = BOND_LINK_UP; slave->state = BOND_STATE_ACTIVE; @@ -2827,8 +2831,12 @@ void bond_loadbalance_arp_mon(struct wor * when the source ip is 0, so don't take the link down * if we don't know our ip yet */ - if (time_after_eq(jiffies, dev_trans_start(slave->dev) + 2*delta_in_ticks) || - (time_after_eq(jiffies, slave->dev->last_rx + 2*delta_in_ticks))) { + if (!time_in_range(jiffies, + dev_trans_start(slave->dev) - delta_in_ticks, + dev_trans_start(slave->dev) + 2*delta_in_ticks) || + (!time_in_range(jiffies, + slave->dev->last_rx - delta_in_ticks, + slave->dev->last_rx + 2*delta_in_ticks))) { slave->link = BOND_LINK_DOWN; slave->state = BOND_STATE_BACKUP; @@ -2888,8 +2896,10 @@ static int bond_ab_arp_inspect(struct bo slave->new_link = BOND_LINK_NOCHANGE; if (slave->link != BOND_LINK_UP) { - if (time_before_eq(jiffies, slave_last_rx(bond, slave) + - delta_in_ticks)) { + if (time_in_range(jiffies, + slave_last_rx(bond, slave) - delta_in_ticks, + slave_last_rx(bond, slave) + delta_in_ticks)) { + slave->new_link = BOND_LINK_UP; commit++; } @@ -2902,8 +2912,9 @@ static int bond_ab_arp_inspect(struct bo * active. This avoids bouncing, as the last receive * times need a full ARP monitor cycle to be updated. */ - if (!time_after_eq(jiffies, slave->jiffies + - 2 * delta_in_ticks)) + if (time_in_range(jiffies, + slave->jiffies - delta_in_ticks, + slave->jiffies + 2 * delta_in_ticks)) continue; /* @@ -2921,8 +2932,10 @@ static int bond_ab_arp_inspect(struct bo */ if (slave->state == BOND_STATE_BACKUP && !bond->current_arp_slave && - time_after(jiffies, slave_last_rx(bond, slave) + - 3 * delta_in_ticks)) { + !time_in_range(jiffies, + slave_last_rx(bond, slave) - delta_in_ticks, + slave_last_rx(bond, slave) + 3 * delta_in_ticks)) { + slave->new_link = BOND_LINK_DOWN; commit++; } @@ -2934,10 +2947,13 @@ static int bond_ab_arp_inspect(struct bo * the bond has an IP address) */ if ((slave->state == BOND_STATE_ACTIVE) && - (time_after_eq(jiffies, dev_trans_start(slave->dev) + - 2 * delta_in_ticks) || - (time_after_eq(jiffies, slave_last_rx(bond, slave) - + 2 * delta_in_ticks)))) { + (!time_in_range(jiffies, + dev_trans_start(slave->dev) - delta_in_ticks, + dev_trans_start(slave->dev) + 2 * delta_in_ticks) || + (!time_in_range(jiffies, + slave_last_rx(bond, slave) - delta_in_ticks, + slave_last_rx(bond, slave) + 2 * delta_in_ticks)))) { + slave->new_link = BOND_LINK_DOWN; commit++; } @@ -2964,7 +2980,9 @@ static void bond_ab_arp_commit(struct bo case BOND_LINK_UP: if ((!bond->curr_active_slave && - time_before_eq(jiffies, + time_in_range(jiffies, + dev_trans_start(slave->dev) - + delta_in_ticks, dev_trans_start(slave->dev) + delta_in_ticks)) || bond->curr_active_slave != slave) {