
src: enable avx512 optimization for base64 encoding #43717

Closed

lucshi wants to merge 2 commits

Conversation

@lucshi (Contributor) commented Jul 7, 2022

Optimized Base64 encoding using AVX512VL and AVX512VBMI instructions. The benchmark/buffers/buffer-base64-encoding measurement on an AWS EC2 m6i.large instance shows 2.4x performance (a +140% gain). This optimization currently applies only to Linux x86_64.

The algorithm is based on the open-source project https://github.com/WojciechMula/base64-avx512, with some modifications to fit the Node.js code base. I have included the original BSD-3 license in the patch.

@nodejs-github-bot added the buffer, c++, and needs-ci labels Jul 7, 2022
src/base64-inl.h Outdated
@@ -182,6 +211,96 @@ inline size_t base64_encode(const char* src,
return dlen;
}


#if (defined(__x86_64) || defined(__x86_64__)) && \
(defined(__linux) || defined(__linux__))
Member commented:

I don't see why this operation wouldn't apply outside of Linux. Should this be a check for gcc/clang instead?

@lucshi (Contributor, Author) replied:

The original code is for Linux GCC, and I do not have a Windows machine to test on. I will try removing line 216, adding a gcc/clang check instead, and letting the bot test it. Thanks!


#if (defined(__x86_64) || defined(__x86_64__)) && \
(defined(__linux) || defined(__linux__))
#pragma GCC target("avx512vl", "avx512vbmi")
Member commented:

Would it be a bit cleaner to use __attribute__((target(...))) on the function itself? Or is this necessary for the #include to work?

@lucshi (Contributor, Author) replied:

I chose #pragma so that I can explicitly reset it afterwards and avoid polluting other code.

@mscdex (Contributor) commented Jul 7, 2022

FWIW it might be better to go with something that is not so exclusive platform/CPU-feature-wise. A while back I submitted #39775, which uses https://github.com/aklomp/base64 for broader platform/CPU support. I'll try to work on bringing that PR closer to being landable.

Optimized Base64 encoding using AVX512VL and AVX512VBMI instructions. The
benchmark/buffers/buffer-base64-encoding measurement on an AWS EC2 m6i.large
instance shows 2.4x performance (+140% gain). This optimization currently
applies only to Linux x86_64.
@lucshi (Contributor, Author) commented Jul 8, 2022

> FWIW it might be better to go with something that is not so exclusive platform/CPU-feature-wise. A while back I submitted #39775, which uses https://github.com/aklomp/base64 for broader platform/CPU support. I'll try to work on bringing that PR closer to being landable.

The considerations were: AVX512VL is not a new technology (it has shipped since Cannon Lake in 2018), and the main compute-intensive AWS EC2 instances are based on Ice Lake. The users who care about the performance difference of Base64 encoding will mostly be working on newer architectures and want top Node.js performance. So the purpose of this optimization is to provide the fastest option for Base64 encoding, with the least code change, for those users' requirements.

AFAIK WojciechMula's AVX512VL solution is the fastest algorithm, faster than the AVX512F and AVX512VBMI variants. aklomp's solution is good at supporting a variety of platforms but lacks the top-performance option, so I did not borrow that solution for this optimization.

@lucshi closed this Jul 8, 2022
@lucshi (Contributor, Author) commented Jul 8, 2022

Resubmitted to amend the first commit message.

@lucshi deleted the my-branch branch July 8, 2022 09:40
Optimized Base64 encoding using AVX512VL instructions. The purpose of this
optimization is not to provide a generic solution for a variety of
platforms, but to target modern CPU architectures after Cannon Lake, which
have been widely deployed by AWS and other CSPs, and to serve users who
want the best known Node.js performance. The buffer-base64-encoding
measurement on an AWS EC2 m6i.large instance shows 2.4x performance
(+140% gain). This optimization currently applies only to x86_64 with
GNU compilers.