From patchwork Tue Aug 25 12:13:21 2020
X-Patchwork-Submitter: Björn Töpel
X-Patchwork-Id: 1351043
X-Patchwork-Delegate: davem@davemloft.net
From: Björn Töpel
To: jeffrey.t.kirsher@intel.com, intel-wired-lan@lists.osuosl.org
Cc: Björn Töpel, magnus.karlsson@intel.com, magnus.karlsson@gmail.com,
    netdev@vger.kernel.org, maciej.fijalkowski@intel.com,
    piotr.raczynski@intel.com, maciej.machnikowski@intel.com,
    lirongqing@baidu.com
Subject: [PATCH net v2 1/3] i40e: avoid premature Rx buffer reuse
Date: Tue, 25 Aug 2020 14:13:21 +0200
Message-Id: <20200825121323.20239-2-bjorn.topel@gmail.com>
In-Reply-To: <20200825121323.20239-1-bjorn.topel@gmail.com>
References: <20200825121323.20239-1-bjorn.topel@gmail.com>

From: Björn Töpel

The page recycle code incorrectly relied on the assumption that a page
fragment cannot be freed inside xdp_do_redirect(). Because of this
assumption, page fragments that are still in use by the stack or by an
XDP redirect target can be reused and overwritten.

To avoid this, store the page count prior to invoking
xdp_do_redirect().

Longer explanation:

Intel NICs have a recycle mechanism. The main idea is that a page is
split into two parts. One part is owned by the driver, one part might
be owned by someone else, such as the stack.

t0: Page is allocated, and put on the Rx ring

                  +---------------
    used by NIC ->| upper buffer (rx_buffer)
                  +---------------
                  | lower buffer
                  +---------------

    page count               == USHRT_MAX
    rx_buffer->pagecnt_bias  == USHRT_MAX

t1: Buffer is received, and passed to the stack (e.g.)

                  +---------------
                  | upper buff (skb)
                  +---------------
    used by NIC ->| lower buffer (rx_buffer)
                  +---------------

    page count               == USHRT_MAX
    rx_buffer->pagecnt_bias  == USHRT_MAX - 1

t2: Buffer is received, and redirected

                  +---------------
                  | upper buff (skb)
                  +---------------
    used by NIC ->| lower buffer (rx_buffer)
                  +---------------

    Now, prior to calling xdp_do_redirect():

    page count               == USHRT_MAX
    rx_buffer->pagecnt_bias  == USHRT_MAX - 2

This means that the buffer *cannot* be flipped/reused, because the skb
is still using it.

The problem arises when xdp_do_redirect() actually frees the segment.
Then we get:

    page count               == USHRT_MAX - 1
    rx_buffer->pagecnt_bias  == USHRT_MAX - 2

From a recycle perspective, the buffer can now be flipped and reused,
which means that the skb data area is passed to the Rx HW ring!

To work around this, the page count is stored prior to calling
xdp_do_redirect(). Note that this is not optimal, since the NIC could
actually reuse the "lower buffer" again. However, that would require
tracking whether XDP_REDIRECT consumed the buffer or not.

Fixes: d9314c474d4f ("i40e: add support for XDP_REDIRECT")
Reported-and-analyzed-by: Li RongQing
Signed-off-by: Björn Töpel
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 24 +++++++++++++++------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 3e5c566ceb01..07d8f8a684b3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1873,7 +1873,8 @@ static inline bool i40e_page_is_reusable(struct page *page)
  *
  * In either case, if the page is reusable its refcount is increased.
  **/
-static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer)
+static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer,
+				   int rx_buffer_pgcnt)
 {
 	unsigned int pagecnt_bias = rx_buffer->pagecnt_bias;
 	struct page *page = rx_buffer->page;
@@ -1884,7 +1885,7 @@ static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer)
 
 #if (PAGE_SIZE < 8192)
 	/* if we are only owner of page we can reuse it */
-	if (unlikely((page_count(page) - pagecnt_bias) > 1))
+	if (unlikely((rx_buffer_pgcnt - pagecnt_bias) > 1))
 		return false;
 #else
 #define I40E_LAST_OFFSET \
@@ -1948,11 +1949,18 @@ static void i40e_add_rx_frag(struct i40e_ring *rx_ring,
  * for use by the CPU.
  */
 static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring,
-						 const unsigned int size)
+						 const unsigned int size,
+						 int *rx_buffer_pgcnt)
 {
 	struct i40e_rx_buffer *rx_buffer;
 
 	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
+	*rx_buffer_pgcnt =
+#if (PAGE_SIZE < 8192)
+		page_count(rx_buffer->page);
+#else
+		0;
+#endif
 	prefetchw(rx_buffer->page);
 
 	/* we are reusing so sync this buffer for CPU use */
@@ -2112,9 +2120,10 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
  * either recycle the buffer or unmap it and free the associated resources.
  */
 static void i40e_put_rx_buffer(struct i40e_ring *rx_ring,
-			       struct i40e_rx_buffer *rx_buffer)
+			       struct i40e_rx_buffer *rx_buffer,
+			       int rx_buffer_pgcnt)
 {
-	if (i40e_can_reuse_rx_page(rx_buffer)) {
+	if (i40e_can_reuse_rx_page(rx_buffer, rx_buffer_pgcnt)) {
 		/* hand second half of page back to the ring */
 		i40e_reuse_rx_page(rx_ring, rx_buffer);
 	} else {
@@ -2328,6 +2337,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 	while (likely(total_rx_packets < (unsigned int)budget)) {
 		struct i40e_rx_buffer *rx_buffer;
 		union i40e_rx_desc *rx_desc;
+		int rx_buffer_pgcnt;
 		unsigned int size;
 		u64 qword;
 
@@ -2370,7 +2380,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 			break;
 
 		i40e_trace(clean_rx_irq, rx_ring, rx_desc, skb);
-		rx_buffer = i40e_get_rx_buffer(rx_ring, size);
+		rx_buffer = i40e_get_rx_buffer(rx_ring, size, &rx_buffer_pgcnt);
 
 		/* retrieve a buffer from the ring */
 		if (!skb) {
@@ -2413,7 +2423,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 			break;
 		}
 
-		i40e_put_rx_buffer(rx_ring, rx_buffer);
+		i40e_put_rx_buffer(rx_ring, rx_buffer, rx_buffer_pgcnt);
 		cleaned_count++;
 
 		if (i40e_is_non_eop(rx_ring, rx_desc, skb))
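
To make the bookkeeping above easier to follow, here is a driver-agnostic
sketch of the reuse check before and after this series. It is an
illustration only, not code from the patches; every identifier prefixed
with demo_ is hypothetical, while struct page and page_count() are the
real kernel primitives the drivers use:

  struct demo_rx_buffer {
  	struct page *page;
  	unsigned short pagecnt_bias;	/* references held by the driver */
  };

  /* Pre-series logic: page_count() is read after the frame has been
   * handed to the stack/XDP. If xdp_do_redirect() already freed the
   * fragment, the count has dropped, so a page the skb still uses can
   * look exclusively owned and gets recycled and overwritten.
   */
  static bool demo_can_reuse_page_buggy(const struct demo_rx_buffer *rx_buffer)
  {
  	return (page_count(rx_buffer->page) - rx_buffer->pagecnt_bias) <= 1;
  }

  /* Post-series logic: the caller snapshots page_count() before
   * xdp_do_redirect() can run, so a reference dropped inside the
   * redirect path cannot make the page look reusable too early.
   */
  static bool demo_can_reuse_page(const struct demo_rx_buffer *rx_buffer,
  				int pgcnt_before_redirect)
  {
  	return (pgcnt_before_redirect - rx_buffer->pagecnt_bias) <= 1;
  }

With the numbers from the commit message: before the redirect,
USHRT_MAX - (USHRT_MAX - 2) = 2 > 1, so the page is correctly kept.
After xdp_do_redirect() frees the fragment, the buggy variant computes
(USHRT_MAX - 1) - (USHRT_MAX - 2) = 1 and wrongly recycles the page,
while the fixed variant still sees 2 and keeps it.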
From patchwork Tue Aug 25 12:13:22 2020
X-Patchwork-Submitter: Björn Töpel
X-Patchwork-Id: 1351047
X-Patchwork-Delegate: davem@davemloft.net
From: Björn Töpel
To: jeffrey.t.kirsher@intel.com, intel-wired-lan@lists.osuosl.org
Cc: Björn Töpel, magnus.karlsson@intel.com, magnus.karlsson@gmail.com,
    netdev@vger.kernel.org, maciej.fijalkowski@intel.com,
    piotr.raczynski@intel.com, maciej.machnikowski@intel.com,
    lirongqing@baidu.com
Subject: [PATCH net v2 2/3] ixgbe: avoid premature Rx buffer reuse
Date: Tue, 25 Aug 2020 14:13:22 +0200
Message-Id: <20200825121323.20239-3-bjorn.topel@gmail.com>
In-Reply-To: <20200825121323.20239-1-bjorn.topel@gmail.com>
References: <20200825121323.20239-1-bjorn.topel@gmail.com>

From: Björn Töpel

The page recycle code incorrectly relied on the assumption that a page
fragment cannot be freed inside xdp_do_redirect(). Because of this
assumption, page fragments that are still in use by the stack or by an
XDP redirect target can be reused and overwritten.

To avoid this, store the page count prior to invoking
xdp_do_redirect().
Fixes: 6453073987ba ("ixgbe: add initial support for xdp redirect")
Reported-and-analyzed-by: Li RongQing
Signed-off-by: Björn Töpel
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 24 +++++++++++++------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 2f8a4cfc5fa1..824c776a3abc 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1945,7 +1945,8 @@ static inline bool ixgbe_page_is_reserved(struct page *page)
 	return (page_to_nid(page) != numa_mem_id()) || page_is_pfmemalloc(page);
 }
 
-static bool ixgbe_can_reuse_rx_page(struct ixgbe_rx_buffer *rx_buffer)
+static bool ixgbe_can_reuse_rx_page(struct ixgbe_rx_buffer *rx_buffer,
+				    int rx_buffer_pgcnt)
 {
 	unsigned int pagecnt_bias = rx_buffer->pagecnt_bias;
 	struct page *page = rx_buffer->page;
@@ -1956,7 +1957,7 @@ static bool ixgbe_can_reuse_rx_page(struct ixgbe_rx_buffer *rx_buffer)
 
 #if (PAGE_SIZE < 8192)
 	/* if we are only owner of page we can reuse it */
-	if (unlikely((page_ref_count(page) - pagecnt_bias) > 1))
+	if (unlikely((rx_buffer_pgcnt - pagecnt_bias) > 1))
 		return false;
 #else
 	/* The last offset is a bit aggressive in that we assume the
@@ -2021,11 +2022,18 @@ static void ixgbe_add_rx_frag(struct ixgbe_ring *rx_ring,
 static struct ixgbe_rx_buffer *ixgbe_get_rx_buffer(struct ixgbe_ring *rx_ring,
 						   union ixgbe_adv_rx_desc *rx_desc,
 						   struct sk_buff **skb,
-						   const unsigned int size)
+						   const unsigned int size,
+						   int *rx_buffer_pgcnt)
 {
 	struct ixgbe_rx_buffer *rx_buffer;
 
 	rx_buffer = &rx_ring->rx_buffer_info[rx_ring->next_to_clean];
+	*rx_buffer_pgcnt =
+#if (PAGE_SIZE < 8192)
+		page_count(rx_buffer->page);
+#else
+		0;
+#endif
 	prefetchw(rx_buffer->page);
 	*skb = rx_buffer->skb;
 
@@ -2055,9 +2063,10 @@ static struct ixgbe_rx_buffer *ixgbe_get_rx_buffer(struct ixgbe_ring *rx_ring,
 
 static void ixgbe_put_rx_buffer(struct ixgbe_ring *rx_ring,
 				struct ixgbe_rx_buffer *rx_buffer,
-				struct sk_buff *skb)
+				struct sk_buff *skb,
+				int rx_buffer_pgcnt)
 {
-	if (ixgbe_can_reuse_rx_page(rx_buffer)) {
+	if (ixgbe_can_reuse_rx_page(rx_buffer, rx_buffer_pgcnt)) {
 		/* hand second half of page back to the ring */
 		ixgbe_reuse_rx_page(rx_ring, rx_buffer);
 	} else {
@@ -2308,6 +2317,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 		union ixgbe_adv_rx_desc *rx_desc;
 		struct ixgbe_rx_buffer *rx_buffer;
 		struct sk_buff *skb;
+		int rx_buffer_pgcnt;
 		unsigned int size;
 
 		/* return some buffers to hardware, one at a time is too slow */
@@ -2327,7 +2337,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 		 */
 		dma_rmb();
 
-		rx_buffer = ixgbe_get_rx_buffer(rx_ring, rx_desc, &skb, size);
+		rx_buffer = ixgbe_get_rx_buffer(rx_ring, rx_desc, &skb, size, &rx_buffer_pgcnt);
 
 		/* retrieve a buffer from the ring */
 		if (!skb) {
@@ -2372,7 +2382,7 @@ static int ixgbe_clean_rx_irq(struct ixgbe_q_vector *q_vector,
 			break;
 		}
 
-		ixgbe_put_rx_buffer(rx_ring, rx_buffer, skb);
+		ixgbe_put_rx_buffer(rx_ring, rx_buffer, skb, rx_buffer_pgcnt);
 		cleaned_count++;
 
 		/* place incomplete frames back on ring for completion */
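
One detail that is easy to miss in all three diffs: the snapshot is only
taken when PAGE_SIZE < 8192. On larger pages the reuse check in these
drivers relies on how far the buffer offset has advanced (the
*_LAST_OFFSET branches visible in the hunks above), not on the page
refcount, so the out-parameter is simply set to 0 and never compared
against pagecnt_bias. The snippet below is an editorial condensation
with a hypothetical demo_ helper, not driver code:

  /* Hypothetical condensation of what each get_rx_buffer() now does. */
  static int demo_rx_pgcnt_snapshot(struct page *page)
  {
  #if (PAGE_SIZE < 8192)
  	return page_count(page);	/* later compared against pagecnt_bias */
  #else
  	return 0;			/* unused: reuse decided by page offset */
  #endif
  }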
From patchwork Tue Aug 25 12:13:23 2020
X-Patchwork-Submitter: Björn Töpel
X-Patchwork-Id: 1351048
X-Patchwork-Delegate: davem@davemloft.net
From: Björn Töpel
To: jeffrey.t.kirsher@intel.com, intel-wired-lan@lists.osuosl.org
Cc: Björn Töpel, magnus.karlsson@intel.com, magnus.karlsson@gmail.com,
    netdev@vger.kernel.org, maciej.fijalkowski@intel.com,
    piotr.raczynski@intel.com, maciej.machnikowski@intel.com,
    lirongqing@baidu.com
Subject: [PATCH net v2 3/3] ice: avoid premature Rx buffer reuse
Date: Tue, 25 Aug 2020 14:13:23 +0200
Message-Id: <20200825121323.20239-4-bjorn.topel@gmail.com>
In-Reply-To: <20200825121323.20239-1-bjorn.topel@gmail.com>
References: <20200825121323.20239-1-bjorn.topel@gmail.com>

From: Björn Töpel

The page recycle code incorrectly relied on the assumption that a page
fragment cannot be freed inside xdp_do_redirect(). Because of this
assumption, page fragments that are still in use by the stack or by an
XDP redirect target can be reused and overwritten.

To avoid this, store the page count prior to invoking
xdp_do_redirect().

Fixes: efc2214b6047 ("ice: Add support for XDP")
Reported-and-analyzed-by: Li RongQing
Signed-off-by: Björn Töpel
---
 drivers/net/ethernet/intel/ice/ice_txrx.c | 27 +++++++++++++++--------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 9d0d6b0025cf..924d34ad9fa4 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -768,7 +768,8 @@ ice_rx_buf_adjust_pg_offset(struct ice_rx_buf *rx_buf, unsigned int size)
  * pointing to; otherwise, the DMA mapping needs to be destroyed and
  * page freed
  */
-static bool ice_can_reuse_rx_page(struct ice_rx_buf *rx_buf)
+static bool ice_can_reuse_rx_page(struct ice_rx_buf *rx_buf,
+				  int rx_buf_pgcnt)
 {
 	unsigned int pagecnt_bias = rx_buf->pagecnt_bias;
 	struct page *page = rx_buf->page;
@@ -779,7 +780,7 @@ static bool ice_can_reuse_rx_page(struct ice_rx_buf *rx_buf)
 
 #if (PAGE_SIZE < 8192)
 	/* if we are only owner of page we can reuse it */
-	if (unlikely((page_count(page) - pagecnt_bias) > 1))
+	if (unlikely((rx_buf_pgcnt - pagecnt_bias) > 1))
 		return false;
 #else
 #define ICE_LAST_OFFSET \
@@ -870,11 +871,18 @@ ice_reuse_rx_page(struct ice_ring *rx_ring, struct ice_rx_buf *old_buf)
  */
 static struct ice_rx_buf *
 ice_get_rx_buf(struct ice_ring *rx_ring, struct sk_buff **skb,
-	       const unsigned int size)
+	       const unsigned int size,
+	       int *rx_buf_pgcnt)
 {
 	struct ice_rx_buf *rx_buf;
 
 	rx_buf = &rx_ring->rx_buf[rx_ring->next_to_clean];
+	*rx_buf_pgcnt =
+#if (PAGE_SIZE < 8192)
+		page_count(rx_buf->page);
+#else
+		0;
+#endif
 	prefetchw(rx_buf->page);
 	*skb = rx_buf->skb;
 
@@ -1017,7 +1025,7 @@ ice_construct_skb(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf,
  * of the rx_buf. It will either recycle the buffer or unmap it and free
  * the associated resources.
  */
-static void ice_put_rx_buf(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf)
+static void ice_put_rx_buf(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf, int rx_buf_pgcnt)
 {
 	u16 ntc = rx_ring->next_to_clean + 1;
 
@@ -1028,7 +1036,7 @@ static void ice_put_rx_buf(struct ice_ring *rx_ring, struct ice_rx_buf *rx_buf)
 	if (!rx_buf)
 		return;
 
-	if (ice_can_reuse_rx_page(rx_buf)) {
+	if (ice_can_reuse_rx_page(rx_buf, rx_buf_pgcnt)) {
 		/* hand second half of page back to the ring */
 		ice_reuse_rx_page(rx_ring, rx_buf);
 	} else {
@@ -1103,6 +1111,7 @@ int ice_clean_rx_irq(struct ice_ring *rx_ring, int budget)
 		struct sk_buff *skb;
 		unsigned int size;
 		u16 stat_err_bits;
+		int rx_buf_pgcnt;
 		u16 vlan_tag = 0;
 		u8 rx_ptype;
 
@@ -1125,7 +1134,7 @@ int ice_clean_rx_irq(struct ice_ring *rx_ring, int budget)
 		dma_rmb();
 
 		if (rx_desc->wb.rxdid == FDIR_DESC_RXDID || !rx_ring->netdev) {
-			ice_put_rx_buf(rx_ring, NULL);
+			ice_put_rx_buf(rx_ring, NULL, 0);
 			cleaned_count++;
 			continue;
 		}
@@ -1134,7 +1143,7 @@ int ice_clean_rx_irq(struct ice_ring *rx_ring, int budget)
 			ICE_RX_FLX_DESC_PKT_LEN_M;
 
 		/* retrieve a buffer from the ring */
-		rx_buf = ice_get_rx_buf(rx_ring, &skb, size);
+		rx_buf = ice_get_rx_buf(rx_ring, &skb, size, &rx_buf_pgcnt);
 
 		if (!size) {
 			xdp.data = NULL;
@@ -1174,7 +1183,7 @@ int ice_clean_rx_irq(struct ice_ring *rx_ring, int budget)
 		total_rx_pkts++;
 
 		cleaned_count++;
-		ice_put_rx_buf(rx_ring, rx_buf);
+		ice_put_rx_buf(rx_ring, rx_buf, rx_buf_pgcnt);
 		continue;
 construct_skb:
 		if (skb) {
@@ -1193,7 +1202,7 @@ int ice_clean_rx_irq(struct ice_ring *rx_ring, int budget)
 			break;
 		}
 
-		ice_put_rx_buf(rx_ring, rx_buf);
+		ice_put_rx_buf(rx_ring, rx_buf, rx_buf_pgcnt);
 		cleaned_count++;
 
 		/* skip if it is NOP desc */
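
Taken together, the three patches follow the same receive-path pattern.
The sketch below summarizes it with hypothetical demo_* names (the real
drivers use the i40e_/ixgbe_/ice_ equivalents shown above); it is a
simplified illustration under those naming assumptions, not actual
driver code:

  struct demo_ring;
  struct demo_rx_buffer;

  struct demo_rx_buffer *demo_get_rx_buffer(struct demo_ring *ring,
  					    int *rx_buffer_pgcnt);
  void demo_run_xdp_or_build_skb(struct demo_ring *ring,
  				 struct demo_rx_buffer *rx_buffer);
  void demo_put_rx_buffer(struct demo_ring *ring,
  			  struct demo_rx_buffer *rx_buffer,
  			  int rx_buffer_pgcnt);

  static void demo_clean_rx_irq(struct demo_ring *ring)
  {
  	struct demo_rx_buffer *rx_buffer;
  	int rx_buffer_pgcnt;

  	/* 1. Snapshot page_count() while the driver still owns the frame. */
  	rx_buffer = demo_get_rx_buffer(ring, &rx_buffer_pgcnt);

  	/* 2. Hand the frame to XDP/the stack; xdp_do_redirect() may free
  	 *    the fragment here and silently drop a page reference.
  	 */
  	demo_run_xdp_or_build_skb(ring, rx_buffer);

  	/* 3. Decide reuse from the snapshot, not the current count, so a
  	 *    reference dropped in step 2 cannot make a still-referenced
  	 *    page look exclusively owned by the driver.
  	 */
  	demo_put_rx_buffer(ring, rx_buffer, rx_buffer_pgcnt);
  }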