Replace ChaCha20Poly1305 implementation #221
Improve AEAD speed with slightly faster poly1305 implementation.
Avoid memory allocations whenever possible. (AEAD)
But currently missing AVX2 support.
BenchmarkSeal64B-8      1561 ns/op     40.97 MB/s
BenchmarkSeal1K-8       5570 ns/op    183.82 MB/s
BenchmarkSeal64K-8    161271 ns/op    406.37 MB/s
BenchmarkOpen64B-8      1747 ns/op     45.79 MB/s
BenchmarkOpen1K-8       5741 ns/op    181.14 MB/s
BenchmarkOpen64K-8    157116 ns/op    417.22 MB/s
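As an illustration of the allocation-avoidance idea in this description, here is a minimal sketch of sealing and opening in place through Go's standard `cipher.AEAD` interface. It is not the code from this PR: AES-GCM from the standard library stands in for the ChaCha20-Poly1305 AEAD, the all-zero key and nonce are for demonstration only, and the helper names `sealInPlace`, `openInPlace` and `newDemoAEAD` are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// sealInPlace encrypts buf[:n] and writes ciphertext plus tag back into buf.
// No allocation happens as long as cap(buf) >= n + aead.Overhead().
// (Hypothetical helper, not part of the PR.)
func sealInPlace(aead cipher.AEAD, nonce, buf []byte, n int) []byte {
	// Passing buf[:0] as dst makes Seal reuse the plaintext's storage.
	return aead.Seal(buf[:0], nonce, buf[:n], nil)
}

// openInPlace decrypts ct into its own storage and returns the plaintext.
func openInPlace(aead cipher.AEAD, nonce, ct []byte) ([]byte, error) {
	return aead.Open(ct[:0], nonce, ct, nil)
}

// newDemoAEAD builds an AES-128-GCM AEAD with an all-zero key.
// Assumption: a stand-in for the ChaCha20-Poly1305 AEAD discussed here.
func newDemoAEAD() cipher.AEAD {
	block, err := aes.NewCipher(make([]byte, 16))
	if err != nil {
		panic(err)
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		panic(err)
	}
	return aead
}

func main() {
	aead := newDemoAEAD()
	nonce := make([]byte, aead.NonceSize()) // demo only: never reuse a nonce
	msg := []byte("hello quic")

	// Reserve room for the authentication tag up front.
	buf := make([]byte, len(msg), len(msg)+aead.Overhead())
	copy(buf, msg)

	ct := sealInPlace(aead, nonce, buf, len(msg))
	pt, err := openInPlace(aead, nonce, ct)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(pt), len(ct))
}
```

The caller owns the buffer and sizes it once, so a hot path that seals many packets can keep reusing the same backing array.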
Current coverage is 86.81%
@@             master   #221   diff @@
==========================================
  Files          67     67
  Lines        4129   4132     +3
  Methods         0      0
  Messages        0      0
  Branches        0      0
==========================================
- Hits         3588   3587     -1
- Misses        312    314     +2
- Partials      229    231     +2
|
The Travis-CI build seems to fail because of a network error...
|
Thanks for this PR! I restarted the Travis build and it worked, so I'm going ahead with merging this now. I've had a quick look at your implementation, and it indeed looks better with regard to allocations. It also allows us to use in-place encryption, which will probably be done in our current milestone (see #217).
However, I'm not sure if you're aware that we're only using AES-GCM right now. We switched because it is around an order of magnitude faster (with the current implementations, see below). Due to a lack of time I haven't built proper negotiation for chacha yet (the client gets to decide, see #201), but I will do this soon-ish. Just pasting my benchmark results for future reference (of course they don't cover GC pressure):
|
Thanks for merging! It's quite hard to beat AES-GCM on amd64 with AES-NI. Maybe AVX2 has a chance (but both poly1305 and chacha20 must use it). Thanks for the benchmark reference, it's quite helpful! |
Significant performance improvement for poly1305 through amd64 assembly. BenchmarkSeal64B-8 57.89 MB/s |
Very cool work! 👍 On my machine it's around 420, up from 370.
Once QUIC gets adopted by mobile browsers (who like choosing chacha over AES) this will become really useful :) |
Thanks 😄 The main problem for you now is compilation. If I add the AVX2 code to the repo (as soon as the release is available), only the 1.7 builds will pass (<= 1.6.3 will fail). As far as I know, Go does not support anything like "version-dependent compilation", so I cannot fix this... For now I think I'll create an AVX2 branch for chacha20 and, as far as possible (without breaking others' code), make it the master... Some benchmarks for the AVX2 code on a Skylake i7-6500U:
With AVX2:
|
I would like to keep support for 1.6 for the near future. However, I think it's possible to use Go's build tags for that purpose, see https://golang.org/pkg/go/build/#hdr-Build_Constraints. In short, if you put |
Thanks for the info - perfect 👍 |
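For reference, the build-tag approach suggested above can be sketched roughly as follows. Every Go release from 1.7 onward defines the `go1.7` build tag, so a file constrained on it is skipped by older toolchains. The file names and the `backend` function are hypothetical and not taken from aead/chacha20; at the time of this thread only the `// +build` form existed, while Go 1.17 and later prefer the `//go:build` line shown first.

```go
//go:build go1.7
// +build go1.7

// Hypothetical file name: chacha_go17.go
//
// This file is compiled only by Go 1.7 and later. A sibling file (say,
// chacha_pre17.go) starting with "// +build !go1.7" would define the same
// function for older toolchains, so exactly one definition is built.
package main

import "fmt"

// backend names the implementation selected at compile time.
// (Illustrative only; the real package would select assembly routines.)
func backend() string {
	return "avx2-capable (go1.7+)"
}

func main() {
	fmt.Println("selected backend:", backend())
}
```

Because the selection happens at compile time, callers see one function name regardless of the toolchain, and 1.6 builds never parse the newer assembly.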
SSSE3 support available: aead/chacha20@b2542ac |