From patchwork Mon Jan 27 00:14:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John Fastabend X-Patchwork-Id: 1229475 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=KzIA6lKl; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 485Vc90yTkz9sR1 for ; Mon, 27 Jan 2020 11:14:33 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727479AbgA0AOc (ORCPT ); Sun, 26 Jan 2020 19:14:32 -0500 Received: from mail-pf1-f195.google.com ([209.85.210.195]:37924 "EHLO mail-pf1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727250AbgA0AOc (ORCPT ); Sun, 26 Jan 2020 19:14:32 -0500 Received: by mail-pf1-f195.google.com with SMTP id x185so4076080pfc.5; Sun, 26 Jan 2020 16:14:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=jwKhSkxejmyMPyM+NVjUzzmt99OqXL3Od8mcxBL1EyU=; b=KzIA6lKlm9r14alo9oUUhjDXA3ZfYQgq3ShYMZl3kZGwnBzvqUpk95mvY65KsQ6z8I qrTpJo7DgBQT475zg9jgxi3VxFJ/nzN/SpsDILQN4uKjaPuVlN9IKvUyZq+p7gE45dTA 0xmwZHIMXcycWcf/Ex8jOU1nR23LPDy1sM9ypayxyrDAo8JxVKLMe4QeSbfd/VcEkstV z89DWZSwozkr9PWLn/JDbtvw7wrK9+8orMOvnnzPh8sYQwuV1fQOJ0Ys/v94tsIsij0S u8SHWMspiyltDcN0N3E19PE6sC80wUvHDfAsUvisGTg+D8Qs5HmUZ7wcAj9KW6rTEPiI IlGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=jwKhSkxejmyMPyM+NVjUzzmt99OqXL3Od8mcxBL1EyU=; b=KCLu9D/W/RyiJ3xDIWM8BMsIuNINANlfJrTf5jOZ2ObF87s5ThLtagBZiF+YfgH0y1 iW1AZ5tsWYfhbHEpQul2cxUWWZzNdtgZmyf86+y6q4qK3x4oMYdC8iv+54eEsXeyZhEy Um5dsiFTxmu95oKYowADh2Wt0SJ+kWr5PrmU7CqtlI3DR2+PIgmcR7g6GZklZcFH0PKU k9An5wKA3ppxJi8qklgYWPNrAH3J5Z3VDRwNCKQjmNmsqr4FdzJDalvhHTiwlOKz8dLy sS0KzmjHSf8kdwfMavWc05nfQuelHSeiAR1N+NNL7CVN7uCSkjDyPP+ld815zKHW83Dd 2YbA== X-Gm-Message-State: APjAAAUV4jAjGsdEzKHUuQ+vpNOgtRcl3aQfpVP4u0l6dRx+TLFetGFo fdS3uy9fD7lbYCJrdl7/i7VhCtgS X-Google-Smtp-Source: APXvYqwUKVDer0FpGdFYXA6N3SPce7fkiWTORnJcD7OhGLqI+0LLvgzE/TObc301N6Vg1mYOfsL/mA== X-Received: by 2002:a63:1346:: with SMTP id 6mr16735605pgt.111.1580084071242; Sun, 26 Jan 2020 16:14:31 -0800 (PST) Received: from localhost.localdomain ([184.63.162.180]) by smtp.gmail.com with ESMTPSA id i23sm13326949pfo.11.2020.01.26.16.14.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sun, 26 Jan 2020 16:14:30 -0800 (PST) From: John Fastabend To: bpf@vger.kernel.org Cc: bjorn.topel@intel.com, songliubraving@fb.com, john.fastabend@gmail.com, ast@kernel.org, daniel@iogearbox.net, toke@redhat.com, maciej.fijalkowski@intel.com, netdev@vger.kernel.org Subject: [PATCH bpf-next v3 1/3] bpf: xdp, update devmap comments to reflect napi/rcu usage Date: Sun, 26 Jan 2020 16:14:00 -0800 Message-Id: <1580084042-11598-2-git-send-email-john.fastabend@gmail.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1580084042-11598-1-git-send-email-john.fastabend@gmail.com> References: <1580084042-11598-1-git-send-email-john.fastabend@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Now that we rely on synchronize_rcu and call_rcu waiting to exit perempt-disable regions (NAPI) lets update the comments to reflect this. Fixes: 0536b85239b84 ("xdp: Simplify devmap cleanup") Acked-by: Björn Töpel Acked-by: Song Liu Signed-off-by: John Fastabend --- kernel/bpf/devmap.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index da9c832..707583f 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -193,10 +193,12 @@ static void dev_map_free(struct bpf_map *map) /* At this point bpf_prog->aux->refcnt == 0 and this map->refcnt == 0, * so the programs (can be more than one that used this map) were - * disconnected from events. Wait for outstanding critical sections in - * these programs to complete. The rcu critical section only guarantees - * no further reads against netdev_map. It does __not__ ensure pending - * flush operations (if any) are complete. + * disconnected from events. The following synchronize_rcu() guarantees + * both rcu read critical sections complete and waits for + * preempt-disable regions (NAPI being the relevant context here) so we + * are certain there will be no further reads against the netdev_map and + * all flush operations are complete. Flush operations can only be done + * from NAPI context for this reason. */ spin_lock(&dev_map_lock); @@ -498,12 +500,11 @@ static int dev_map_delete_elem(struct bpf_map *map, void *key) return -EINVAL; /* Use call_rcu() here to ensure any rcu critical sections have - * completed, but this does not guarantee a flush has happened - * yet. Because driver side rcu_read_lock/unlock only protects the - * running XDP program. However, for pending flush operations the - * dev and ctx are stored in another per cpu map. And additionally, - * the driver tear down ensures all soft irqs are complete before - * removing the net device in the case of dev_put equals zero. + * completed as well as any flush operations because call_rcu + * will wait for preempt-disable region to complete, NAPI in this + * context. And additionally, the driver tear down ensures all + * soft irqs are complete before removing the net device in the + * case of dev_put equals zero. */ old_dev = xchg(&dtab->netdev_map[k], NULL); if (old_dev) From patchwork Mon Jan 27 00:14:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Fastabend X-Patchwork-Id: 1229478 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=b2/IUtdu; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 485VcM0wgSz9sRR for ; Mon, 27 Jan 2020 11:14:42 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727250AbgA0AOm (ORCPT ); Sun, 26 Jan 2020 19:14:42 -0500 Received: from mail-pj1-f66.google.com ([209.85.216.66]:50519 "EHLO mail-pj1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726444AbgA0AOl (ORCPT ); Sun, 26 Jan 2020 19:14:41 -0500 Received: by mail-pj1-f66.google.com with SMTP id r67so2271747pjb.0; Sun, 26 Jan 2020 16:14:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=g9j0lMDO8NoNU4KTbS42+2WezLD8kVVyGGPJ2HPGp/M=; b=b2/IUtdu18/t4bYGGOQ7x5MWjXe0RvXpuaSe36Y2GK8n+4PhxfH23p2ePlTepBHC9q S8WwMTyHRkmfSV2OdzyW3aFwb+4DtzOHYElM/CD1yDEf0LsPZx5tGnVFtQ1dcYGvHcvY mkW33Ggd1dL5n5llbrjzKwY/XwY4vp614NKRj8GEAdYgdk6AdmzRjpw+bnm+x2kjGiJ2 egbuub+FkmL4vIA4kFXkT8zFuJGpOtTJV++TOTFCDBw3/YgOg41GcfikeSQcARLRKikC k1ySwv6tKO2uiFp37lJvVdTeT8WmHBjTX4R8p5uZw/0JFUsK+7pW5YIetVQ7eiY7hXTE vf4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=g9j0lMDO8NoNU4KTbS42+2WezLD8kVVyGGPJ2HPGp/M=; b=BTJbTO04uLtwUN2QK8R2ecEBBXBTif3dW/yFbIHloo3fdGtFgxe1N1IZ8oCJBXHhi2 lKDkjciR2ca0ZKHRGCm9uYrYIQDW5PzwhJ8kvocb67tYHaub+vLufvQ4XI6bOHmhMYMk TTSCavNLYB7KLJdJ3r5Z7IRwfgsPfxFPFpb7EPQ8hu8gzNQG4yl0fKLDen1AxjcawZrh hhT2NfnFQhD0Klk8IG+ukexhEd26IwIey29++2DewDFTgrNZYgE7XZmMrq+NFiQMRE+9 lcIUXvXWQ+dCRvz0k1nYcZpcBccUA4dPkj/9vSVq3x4ZlCmouv03XdHt6ba65McRjDYg TRtQ== X-Gm-Message-State: APjAAAUKnZaGZOZn/VVSQWS6VULJfl46SJPn8C1k+zQaZ8+JrmOEdl4+ bpJEgjvD+zIbST+KegQdUBsjnpGA X-Google-Smtp-Source: APXvYqztDq2yJzJ1c3OzJhrSUTP0qhBn1h/Gxq5DXts3xNuqVcULO20YQ5lUODekPeFyzBao8+iGUA== X-Received: by 2002:a17:902:8f8a:: with SMTP id z10mr15532830plo.169.1580084080980; Sun, 26 Jan 2020 16:14:40 -0800 (PST) Received: from localhost.localdomain ([184.63.162.180]) by smtp.gmail.com with ESMTPSA id i23sm13326949pfo.11.2020.01.26.16.14.31 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sun, 26 Jan 2020 16:14:40 -0800 (PST) From: John Fastabend To: bpf@vger.kernel.org Cc: bjorn.topel@intel.com, songliubraving@fb.com, john.fastabend@gmail.com, ast@kernel.org, daniel@iogearbox.net, toke@redhat.com, maciej.fijalkowski@intel.com, netdev@vger.kernel.org Subject: [PATCH bpf-next v3 2/3] bpf: xdp, virtio_net use access ptr macro for xdp enable check Date: Sun, 26 Jan 2020 16:14:01 -0800 Message-Id: <1580084042-11598-3-git-send-email-john.fastabend@gmail.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1580084042-11598-1-git-send-email-john.fastabend@gmail.com> References: <1580084042-11598-1-git-send-email-john.fastabend@gmail.com> Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org virtio_net currently relies on rcu critical section to access the xdp program in its xdp_xmit handler. However, the pointer to the xdp program is only used to do a NULL pointer comparison to determine if xdp is enabled or not. Use rcu_access_pointer() instead of rcu_dereference() to reflect this. Then later when we drop rcu_read critical section virtio_net will not need in special handling. Acked-by: Jesper Dangaard Brouer Signed-off-by: John Fastabend Acked-by: Maciej Fijalkowski --- drivers/net/virtio_net.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 4d7d5434..945eabc 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -501,7 +501,7 @@ static int virtnet_xdp_xmit(struct net_device *dev, /* Only allow ndo_xdp_xmit if XDP is loaded on dev, as this * indicate XDP resources have been successfully allocated. */ - xdp_prog = rcu_dereference(rq->xdp_prog); + xdp_prog = rcu_access_pointer(rq->xdp_prog); if (!xdp_prog) return -ENXIO; From patchwork Mon Jan 27 00:14:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Fastabend X-Patchwork-Id: 1229480 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=Oacr/vOD; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 485VcX6PBpz9sR1 for ; Mon, 27 Jan 2020 11:14:52 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728236AbgA0AOv (ORCPT ); Sun, 26 Jan 2020 19:14:51 -0500 Received: from mail-pf1-f196.google.com ([209.85.210.196]:46719 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726444AbgA0AOv (ORCPT ); Sun, 26 Jan 2020 19:14:51 -0500 Received: by mail-pf1-f196.google.com with SMTP id k29so1566134pfp.13; Sun, 26 Jan 2020 16:14:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=p84g0qKiuqEEntAsj7EHhSdJwxP3dtEjqrEnC5w/2AI=; b=Oacr/vODgYV7TIyH2HUpaYeIk/9HDS4DdfHNXvKCMX0O+GEB5JYVeL0hoghFcv0lUk 0HDvEQcAgvBUZv9cIWmyzp+879xuJ/JL20hboLKtOhFpxKyaL+XLupQ1+AVAZX0GfEPu 5JxyRXhloU0a8+Pvv/fWOLXKE1J6QRsTPsNbSa0mzZnvzxH2AZ431OmQ/EYf2ThjNxaF 7Bu/UKkUoOaaLGU92Nx22AyaBnnK7iVn++SpRLT1aVhvlEWmkCsFgFKSMp3ltHMU+pkQ P/OQNE8eG9bIfkgvag6arFXhT7aGTPnunKi9bjseLeBaQNY0BsFLQDSZzzqajH1ZuHp7 8F6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=p84g0qKiuqEEntAsj7EHhSdJwxP3dtEjqrEnC5w/2AI=; b=j9oToHBDhFzJC6lemC7J+DhN3SaOQFmNzeW46Ewj3pwvgCoRULmupyoHcz1LswfZyw Ibngblaq3Rry7KAOXeL4Ra+d1lf+EaBVsMwIsvitUcWqruyVxjd4ufcFAZNpz9hZerxy L/KsT51KpADpEL6oCi5gpifl6pO5sOytrxojXoGU7fi4wMPPOAuVFAXNE9mdmsHcpeGA GyTngw81LabSskoi2GliQgTIuVbXIAl37deGEPanl9CJcL6uXsGcC/WKl94+ye8iiVMG +1eucDTkXK3yTMA/32miNcub9FZI7ShtFDwuxZvoSLD9ZZtO0Zmmz3+QWfcsVeFsWCWm m+3Q== X-Gm-Message-State: APjAAAWN0FNs5cf/BFeZuytmCrH0ZxVsRC0h4qSOop8fP2NgWPiqXFqk 6v2aW+9OJEUd4QoQnbms1lTx/m9i X-Google-Smtp-Source: APXvYqwZvBOZMavlMQd3vOkL1+oPCz1EkiblVDRhZxAbl0HVcjXH8SKuzHEQBEeLu764j6tIf4JVcg== X-Received: by 2002:a62:486:: with SMTP id 128mr6804130pfe.236.1580084090722; Sun, 26 Jan 2020 16:14:50 -0800 (PST) Received: from localhost.localdomain ([184.63.162.180]) by smtp.gmail.com with ESMTPSA id i23sm13326949pfo.11.2020.01.26.16.14.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sun, 26 Jan 2020 16:14:50 -0800 (PST) From: John Fastabend To: bpf@vger.kernel.org Cc: bjorn.topel@intel.com, songliubraving@fb.com, john.fastabend@gmail.com, ast@kernel.org, daniel@iogearbox.net, toke@redhat.com, maciej.fijalkowski@intel.com, netdev@vger.kernel.org Subject: [PATCH bpf-next v3 3/3] bpf: xdp, remove no longer required rcu_read_{un}lock() Date: Sun, 26 Jan 2020 16:14:02 -0800 Message-Id: <1580084042-11598-4-git-send-email-john.fastabend@gmail.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1580084042-11598-1-git-send-email-john.fastabend@gmail.com> References: <1580084042-11598-1-git-send-email-john.fastabend@gmail.com> Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Now that we depend on rcu_call() and synchronize_rcu() to also wait for preempt_disabled region to complete the rcu read critical section in __dev_map_flush() is no longer required. Except in a few special cases in drivers that need it for other reasons. These originally ensured the map reference was safe while a map was also being free'd. And additionally that bpf program updates via ndo_bpf did not happen while flush updates were in flight. But flush by new rules can only be called from preempt-disabled NAPI context. The synchronize_rcu from the map free path and the rcu_call from the delete path will ensure the reference there is safe. So lets remove the rcu_read_lock and rcu_read_unlock pair to avoid any confusion around how this is being protected. If the rcu_read_lock was required it would mean errors in the above logic and the original patch would also be wrong. Now that we have done above we put the rcu_read_lock in the driver code where it is needed in a driver dependent way. I think this helps readability of the code so we know where and why we are taking read locks. Most drivers will not need rcu_read_locks here and further XDP drivers already have rcu_read_locks in their code paths for reading xdp programs on RX side so this makes it symmetric where we don't have half of rcu critical sections define in driver and the other half in devmap. Acked-by: Jesper Dangaard Brouer Signed-off-by: John Fastabend --- drivers/net/veth.c | 6 +++++- kernel/bpf/devmap.c | 5 +++-- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index a552df3..184e1b4 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -377,6 +377,7 @@ static int veth_xdp_xmit(struct net_device *dev, int n, unsigned int max_len; struct veth_rq *rq; + rcu_read_lock(); if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK)) { ret = -EINVAL; goto drop; @@ -418,11 +419,14 @@ static int veth_xdp_xmit(struct net_device *dev, int n, if (flags & XDP_XMIT_FLUSH) __veth_xdp_flush(rq); - if (likely(!drops)) + if (likely(!drops)) { + rcu_read_unlock(); return n; + } ret = n - drops; drop: + rcu_read_unlock(); atomic64_add(drops, &priv->dropped); return ret; diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 707583f..2b83c56 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -372,16 +372,17 @@ static int bq_xmit_all(struct xdp_bulk_queue *bq, u32 flags) * from NET_RX_SOFTIRQ. Either way the poll routine must complete before the * net device can be torn down. On devmap tear down we ensure the flush list * is empty before completing to ensure all flush operations have completed. + * When drivers update the bpf program they may need to ensure any flush ops + * are also complete. Using synchronize_rcu or call_rcu will suffice for this + * because both wait for napi context to exit. */ void __dev_map_flush(void) { struct list_head *flush_list = this_cpu_ptr(&dev_map_flush_list); struct xdp_bulk_queue *bq, *tmp; - rcu_read_lock(); list_for_each_entry_safe(bq, tmp, flush_list, flush_node) bq_xmit_all(bq, XDP_XMIT_FLUSH); - rcu_read_unlock(); } /* rcu_read_lock (from syscall and BPF contexts) ensures that if a delete and/or