[net-next,v9,08/19] zinc: Poly1305 x86_64 implementation

These x86_64 vectorized implementations are based on Andy Polyakov's
implementation and support AVX, AVX-2, and AVX-512F. The AVX-512F
implementation is disabled on Skylake due to throttling.

The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
the unfortunate situation of using AVX and then having to go back to scalar
-- because the user is silly and has called the update function from two
separate contexts -- then we need to convert back to the original base before
proceeding. It is possible to reason that the initial reduction below is
sufficient given the implementation invariants. However, for the avoidance of
doubt and because this is not performance critical, we do the full reduction
anyway. This conversion is found in the glue code, and a proof of
correctness may be easily obtained from Z3: <https://xn--4db.cc/ltPtHCKN/py>.
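
To illustrate the base conversion described above, here is a minimal
standalone sketch in C. It is not the glue code from this patch: the function
name, the separate input and output arrays, and the use of 64-bit temporaries
are assumptions made for illustration. It propagates carries between the five
26-bit limbs, repacks them into three 64-bit words, and folds any bits at or
above 2^130 back in using 2^130 = 5 (mod 2^130 - 5).

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not the patch's glue code).
 * h26 holds the accumulator as five limbs in base 2^26, least significant
 * first; limbs may carry a few excess bits. h64 receives the same value,
 * modulo 2^130 - 5, as three words in base 2^64.
 */
static void poly1305_base26_to_base64(const uint32_t h26[5], uint64_t h64[3])
{
	const uint64_t mask26 = 0x3ffffff;
	uint64_t t0 = h26[0], t1 = h26[1], t2 = h26[2], t3 = h26[3], t4 = h26[4];
	uint64_t c;

	/* Propagate carries so that t0..t3 each fit in 26 bits. */
	t1 += t0 >> 26; t0 &= mask26;
	t2 += t1 >> 26; t1 &= mask26;
	t3 += t2 >> 26; t2 &= mask26;
	t4 += t3 >> 26; t3 &= mask26;

	/* Repack: value = t0 + t1*2^26 + t2*2^52 + t3*2^78 + t4*2^104. */
	h64[0] = t0 | (t1 << 26) | (t2 << 52);
	h64[1] = (t2 >> 12) | (t3 << 14) | (t4 << 40);
	h64[2] = t4 >> 24;

	/*
	 * Fold bits at or above 2^130: with q = h64[2] >> 2, add 5*q,
	 * computed as q + 4*q, then keep only the low two bits of h64[2].
	 */
	c = (h64[2] >> 2) + (h64[2] & ~(uint64_t)3);
	h64[2] &= 3;

	/* Add with carry across the three words. */
	h64[0] += c;
	c = h64[0] < c;
	h64[1] += c;
	h64[2] += h64[1] < c;
}
```

For example, the input limbs { 0, 0, 0, 0, 0x4000000 } encode 2^130, and the
sketch returns { 5, 0, 0 }. The result is congruent to the input modulo
2^130 - 5 but is not necessarily the canonical residue; the final reduction
is left to the code that emits the tag.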

On the left are cycle counts on a Core i7 6700HQ using the AVX-2 codepath,
comparing this implementation ("new") to the implementation in the current
crypto API ("old"). On the right are benchmarks on a Xeon Gold 5120 using
the AVX-512 codepath. The difference is so stark because the current
crypto API's implementation does not support AVX-512 at all.

        AVX-2                  AVX-512
      ---------              -----------

size	old	new      size	old	new
----	----	----     ----	----	----
0	70	68       0	74	70
16	92	90       16	96	92
32	134	104      32	136	106
48	172	120      48	184	124
64	218	136      64	218	138
80	254	158      80	260	160
96	298	174      96	300	176
112	342	192      112	342	194
128	388	212      128	384	212
144	428	228      144	420	226
160	466	246      160	464	248
176	510	264      176	504	264
192	550	282      192	544	282
208	594	302      208	582	300
224	628	316      224	624	318
240	676	334      240	662	338
256	716	354      256	708	358
272	764	374      272	748	372
288	802	352      288	788	358
304	420	366      304	422	370
320	428	360      320	432	364
336	484	378      336	486	380
352	426	384      352	434	390
368	478	400      368	480	408
384	488	394      384	490	398
400	542	408      400	542	412
416	486	416      416	492	426
432	534	430      432	538	436
448	544	422      448	546	432
464	600	438      464	600	448
480	540	448      480	548	456
496	594	464      496	594	476
512	602	456      512	606	470
528	656	476      528	656	480
544	600	480      544	606	498
560	650	494      560	652	512
576	664	490      576	662	508
592	714	508      592	716	522
608	656	514      608	664	538
624	708	532      624	710	552
640	716	524      640	720	516
656	770	536      656	772	526
672	716	548      672	722	544
688	770	562      688	768	556
704	774	552      704	778	556
720	826	568      720	832	568
736	768	574      736	780	584
752	822	592      752	826	600
768	830	584      768	836	560
784	884	602      784	888	572
800	828	610      800	838	588
816	884	628      816	884	604
832	888	618      832	894	598
848	942	632      848	946	612
864	884	644      864	896	628
880	936	660      880	942	644
896	948	652      896	952	608
912	1000	664      912	1004	616
928	942	676      928	954	634
944	994	690      944	1000	646
960	1002	680      960	1008	646
976	1054	694      976	1062	658
992	1002	706      992	1012	674
1008	1052	720      1008	1058	690

While this is CRYPTOGAMS code, the originating code for this happens to
be derived from OpenSSL's commit 4dfe4310c31c4483705991d9a798ce9be1ed1c68.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Samuel Neves <sneves@dei.uc.pt>
Co-developed-by: Andy Polyakov <appro@openssl.org>
Cc: Andy Polyakov <appro@openssl.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: x86@kernel.org
Cc: Samuel Neves <sneves@dei.uc.pt>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-crypto@vger.kernel.org
---
 lib/zinc/Makefile                        |    2 +
 lib/zinc/poly1305/poly1305-x86_64-glue.c |  154 +
 lib/zinc/poly1305/poly1305-x86_64.pl     | 4266 ++++++++++++++++++++++
 lib/zinc/poly1305/poly1305.c             |    4 +
 4 files changed, 4426 insertions(+)
 create mode 100644 lib/zinc/poly1305/poly1305-x86_64-glue.c
 create mode 100644 lib/zinc/poly1305/poly1305-x86_64.pl

Message ID	20190322071122.6677-9-Jason@zx2c4.com
State	Changes Requested
Delegated to:	David Miller
Headers
	Return-Path: <netdev-owner@vger.kernel.org>
	From: "Jason A. Donenfeld" <Jason@zx2c4.com>
	To: linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
	Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>, Samuel Neves <sneves@dei.uc.pt>, Andy Polyakov <appro@openssl.org>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, x86@kernel.org, Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>, Andy Lutomirski <luto@kernel.org>, Greg KH <gregkh@linuxfoundation.org>, Andrew Morton <akpm@linux-foundation.org>, Linus Torvalds <torvalds@linux-foundation.org>, kernel-hardening@lists.openwall.com
	Subject: [PATCH net-next v9 08/19] zinc: Poly1305 x86_64 implementation
	Date: Fri, 22 Mar 2019 01:11:11 -0600
	Message-Id: <20190322071122.6677-9-Jason@zx2c4.com>
	In-Reply-To: <20190322071122.6677-1-Jason@zx2c4.com>
	References: <20190322071122.6677-1-Jason@zx2c4.com>
	MIME-Version: 1.0
	Content-Transfer-Encoding: 8bit
	Sender: netdev-owner@vger.kernel.org
	Precedence: bulk
Series	WireGuard: Secure Network Tunnel
	[net-next,v9,00/19] WireGuard: Secure Network Tunnel
	[net-next,v9,01/19] asm: simd context helper API
	[net-next,v9,02/19] zinc: introduce minimal cryptography library
	[net-next,v9,03/19] zinc: ChaCha20 generic C implementation and selftest
	[net-next,v9,04/19] zinc: ChaCha20 x86_64 implementation
	[net-next,v9,05/19] zinc: ChaCha20 ARM and ARM64 implementations
	[net-next,v9,06/19] zinc: ChaCha20 MIPS32r2 implementation
	[net-next,v9,07/19] zinc: Poly1305 generic C implementations and selftest
	[net-next,v9,08/19] zinc: Poly1305 x86_64 implementation
	[net-next,v9,09/19] zinc: Poly1305 ARM and ARM64 implementations
	[net-next,v9,10/19] zinc: Poly1305 MIPS64 and MIPS32r2 implementations
	[net-next,v9,11/19] zinc: ChaCha20Poly1305 construction and selftest
	[net-next,v9,12/19] zinc: BLAKE2s generic C implementation and selftest
	[net-next,v9,13/19] zinc: BLAKE2s x86_64 implementation
	[net-next,v9,14/19] zinc: Curve25519 generic C implementations and selftest
	[net-next,v9,15/19] zinc: Curve25519 x86_64 implementation
	[net-next,v9,16/19] zinc: import Bernstein and Schwabe's Curve25519 ARM implementation
	[net-next,v9,17/19] zinc: Curve25519 ARM implementation
	[net-next,v9,18/19] security/keys: rewrite big_key crypto to use Zinc
	[net-next,v9,19/19] net: WireGuard secure network tunnel