From patchwork Mon Nov 20 22:26:07 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shannon Nelson X-Patchwork-Id: 839794 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3ygjxK0rz5z9sNx for ; Tue, 21 Nov 2017 09:26:29 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751275AbdKTW00 (ORCPT ); Mon, 20 Nov 2017 17:26:26 -0500 Received: from userp1040.oracle.com ([156.151.31.81]:36947 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751146AbdKTW0Z (ORCPT ); Mon, 20 Nov 2017 17:26:25 -0500 Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id vAKMQLxE006201 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 20 Nov 2017 22:26:22 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id vAKMQKUu015902 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 20 Nov 2017 22:26:20 GMT Received: from abhmp0012.oracle.com (abhmp0012.oracle.com [141.146.116.18]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id vAKMQJr4024964; Mon, 20 Nov 2017 22:26:20 GMT Received: from slnelson-mint18.us.oracle.com (/10.159.141.221) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 20 Nov 2017 14:26:19 -0800 From: Shannon Nelson To: steffen.klassert@secunet.com Cc: netdev@vger.kernel.org Subject: [PATCH ipsec] xfrm: add documentation for xfrm device offload api Date: Mon, 20 Nov 2017 14:26:07 -0800 Message-Id: <1511216767-21320-1-git-send-email-shannon.nelson@oracle.com> X-Mailer: git-send-email 2.7.4 X-Source-IP: aserv0022.oracle.com [141.146.126.234] Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Add a writeup on how to use the XFRM device offload API, and mention this new file in the index. Signed-off-by: Shannon Nelson --- Documentation/networking/00-INDEX | 2 + Documentation/networking/xfrm_device.txt | 132 +++++++++++++++++++++++++++++++ 2 files changed, 134 insertions(+) create mode 100644 Documentation/networking/xfrm_device.txt diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index 7a79b35..f5d642c 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX @@ -228,6 +228,8 @@ x25.txt - general info on X.25 development. x25-iface.txt - description of the X.25 Packet Layer to LAPB device interface. +xfrm_device.txt + - description of XFRM offload API xfrm_proc.txt - description of the statistics package for XFRM. xfrm_sync.txt diff --git a/Documentation/networking/xfrm_device.txt b/Documentation/networking/xfrm_device.txt new file mode 100644 index 0000000..2d9d588c --- /dev/null +++ b/Documentation/networking/xfrm_device.txt @@ -0,0 +1,132 @@ + +=============================================== +XFRM device - offloading the IPsec computations +=============================================== +Shannon Nelson + + +Overview +======== + +IPsec is a useful feature for securing network traffic, but the +computational cost is high: a 10Gbps link can easily be brought down +to under 1Gbps, depending on the traffic and link configuration. +Luckily, there are NICs that offer a hardware based IPsec offload which +can radically increase throughput and decrease CPU utilization. The XFRM +Device interface allows NIC drivers to offer to the stack access to the +hardware offload. + +Userland access to the offload is typically through a system such as +libreswan or KAME/raccoon, but the iproute2 'ip xfrm' command set can +be handy when experimenting. An example command might look something +like this: + + ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \ + reqid 0x07 replay-window 32 \ + aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \ + sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp \ + offload dev eth4 dir in + +Yes, that's ugly, but that's what shell scripts and/or libreswan are for. + + + +Callbacks to implement +====================== + +/* from include/linux/netdevice.h */ +struct xfrmdev_ops { + int (*xdo_dev_state_add) (struct xfrm_state *x); + void (*xdo_dev_state_delete) (struct xfrm_state *x); + void (*xdo_dev_state_free) (struct xfrm_state *x); + bool (*xdo_dev_offload_ok) (struct sk_buff *skb, + struct xfrm_state *x); +}; + +The NIC driver offering ipsec offload will need to implement these +callbacks to make the offload available to the network stack's +XFRM subsytem. Additionally, the feature bits NETIF_F_HW_ESP and +NETIF_F_HW_ESP_TX_CSUM will signal the availability of the offload. + + + +Flow +==== + +At probe time and before the call to register_netdev(), the driver should +set up local data structures and XFRM callbacks, and set the feature bits. +The XFRM code's listener will finish the setup on NETDEV_REGISTER. + + adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops; + adapter->netdev->features |= NETIF_F_HW_ESP; + adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP; + +When new SAs are set up with a request for "offload" feature, the +driver's xdo_dev_state_add() will be given the new SA to be offloaded +and an indication of whether it is for Rx or Tx. The driver should + - verify the algorithm is supported for offloads + - store the SA information (key, salt, target-ip, protocol, etc) + - enable the HW offload of the SA + +The driver can also set an offload_handle in the SA, an opaque void pointer +that can be used to convey context into the fast-path offload requests. + + xs->xso.offload_handle = context; + + +When the network stack is preparing an IPsec packet for an SA that has +been setup for offload, it first calls into xdo_dev_offload_ok() with +the skb and the intended offload state to ask the driver if the offload +will serviceable. This can check the packet information to be sure the +offload can be supported (e.g. IPv4 or IPv6, no IPv4 options, etc) and +return true of false to signify its support. + +When ready to send, the driver needs to inspect the Tx packet for the +offload information, including the opaque context, and set up the packet +send accordingly. + + xs = xfrm_input_state(skb); + context = xs->xso.offload_handle; + set up HW for send + +The stack has already inserted the appropriate IPsec headers in the +packet data, the offload just needs to do the encryption and fix up the +header values. + + +When a packet is received and the HW has indicated that it offloaded a +decryption, the driver needs to add a reference to the decoded SA into +the packet's skb. At this point the data should be decrypted but the +IPsec headers are still in the packet data; they are removed later up +the stack in xfrm_input(). + + find and hold the SA that was used to the Rx skb + get spi, protocol, and destination IP from packet headers + xs = find xs from (spi, protocol, dest_IP) + xfrm_state_hold(xs); + + store the state information into the skb + skb->sp = secpath_dup(skb->sp); + skb->sp->xvec[skb->sp->len++] = xs; + skb->sp->olen++; + + indicate the success and/or error status of the offload + xo = xfrm_offload(skb); + xo->flags = CRYPTO_DONE; + xo->status = crypto_status; + + hand the packet to napi_gro_receive() as usual + + +When the SA is removed by the user, the driver's xdo_dev_state_delete() +is asked to disable the offload. Later, xdo_dev_state_free() is called +from a garbage collection routine after all reference counts to the state +have been removed and any remaining resources can be cleared for the +offload state. How these are used by the driver will depend on specific +hardware needs. + +As a netdev is set to DOWN the XFRM stack's netdev listener will call +xdo_dev_state_delete() and xdo_dev_state_free() on any remaining offloaded +states. + +