Do the same intrinsics exist for Clang/GCC? The closest I found are _mulx_u32() and _mulx_u64(), mentioned in Intel's Guide. But they produce the mulx instruction, which needs BMI2 support, while MSVC's intrinsics produce the regular mul instruction. Also, _mulx_u32() is not available in -m64 mode, while __emulu() and _umul128() both exist in the 32-bit and 64-bit modes of MSVC.
Of course, for 32-bit one may write return uint64_t(a) * uint64_t(b); hoping that the compiler will guess correctly and optimize it to a u32*u32->u64 multiplication instead of u64*u64->u64. But is there a way to be sure about this, without relying on the compiler's deduction that both arguments are 32-bit (i.e. that the upper half of each uint64_t is zero)? Is there some intrinsic like __emulu() that makes you sure about the generated code?
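To make the 32-bit case concrete, here is a minimal sketch of the cast-based approach described above. The function name mul32x32 is made up for illustration. The C++ result is always the full 64-bit product; which instruction the compiler emits for it is still up to the optimizer.

```cpp
#include <cstdint>

// Widening 32x32 -> 64 multiply (hypothetical helper name).
// Both operands are zero-extended from uint32_t before the multiply,
// so the compiler can see that their upper 32 bits are zero.
inline std::uint64_t mul32x32(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint64_t>(a) * static_cast<std::uint64_t>(b);
}
```

Taking uint32_t parameters, rather than uint64_t values that merely happen to be small, is what gives the compiler the information it needs to pick a single widening multiply.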
For 64-bit there is __int128 in GCC/Clang, but again we have to rely on the compiler's guess that we are actually multiplying 64-bit numbers (i.e. that the upper half of each __int128 is zero). Is there a way to be sure, without the compiler guessing, by using some intrinsic?
In GCC/Clang, uint64_t (for 32-bit) and __int128 (for 64-bit) produce the correct mul instruction instead of mulx. But again, we have to rely on the compiler guessing correctly that the upper half of the __int128 is zero.
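For reference, the __int128 variant could look roughly like the sketch below. It assumes a 64-bit GCC or Clang target where unsigned __int128 is available. The U128 struct and the name mul64x64 are invented for this example and are not part of any library.

```cpp
#include <cstdint>

// Hypothetical result type: low and high 64-bit halves of the product.
struct U128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// Widening 64x64 -> 128 multiply via the GCC/Clang __int128 extension.
// The operand a is zero-extended from uint64_t; b is converted implicitly.
inline U128 mul64x64(std::uint64_t a, std::uint64_t b) {
    unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    return { static_cast<std::uint64_t>(p),        // low 64 bits
             static_cast<std::uint64_t>(p >> 64) }; // high 64 bits
}
```

The arithmetic result is well defined either way. What remains unspecified is only whether the compiler lowers it to a single mul or to something more expensive.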
Of course, I can look at the assembly that GCC/Clang generated and check that they optimized and guessed correctly, but looking at the assembly once doesn't guarantee that the same will always happen in all circumstances. And I don't know of a way in C++ to statically assert that the compiler made the correct choice of instructions.
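One possible way to stop depending on the optimizer, sketched here only as an illustration, is GCC/Clang extended inline assembly, which pins the instruction explicitly. This assumes an x86-64 target. The names MulResult and mul64_forced are hypothetical, and the fallback branch simply reuses unsigned __int128 on other targets that support it.

```cpp
#include <cstdint>

// Hypothetical result type: low and high halves of the 128-bit product.
struct MulResult {
    std::uint64_t lo;
    std::uint64_t hi;
};

inline MulResult mul64_forced(std::uint64_t a, std::uint64_t b) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    std::uint64_t lo, hi;
    // One-operand MUL: RDX:RAX = RAX * operand.
    // "a" places the first factor in RAX; the outputs come back in RAX and RDX.
    __asm__("mulq %[b]"
            : "=a"(lo), "=d"(hi)
            : "a"(a), [b] "rm"(b)
            : "cc");
    return {lo, hi};
#else
    // Fallback for other targets: leaves instruction selection to the compiler.
    unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
    return {static_cast<std::uint64_t>(p), static_cast<std::uint64_t>(p >> 64)};
#endif
}
```

The trade-off is that the compiler treats the asm statement as opaque: it cannot constant-fold it or use it in constexpr code, and the function stops being portable beyond the guarded target.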
Source: Windows Questions C++