Fastest way to sum an array of float values


I am doing DSP coding using Visual Studio and C++.

I have an array of floats (only 8 right now, but that number may go up or down later) that I need to sum into a single float variable and then average.

I would like to use intrinsic instructions, which I have no experience with, and that is why I am asking here.

All that is required is that the code is faster than what I have below, and that it works on Intel and AMD processors from, say, the past 5 years.

Note that all the float values in the array are between -1 and 1, and speed is more important than precision.

float sum = (sampleValue[0] + sampleValue[1] + sampleValue[2] + sampleValue[3] +
             sampleValue[4] + sampleValue[5] + sampleValue[6] + sampleValue[7]) / 8;
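
To make the question concrete, here is a minimal sketch of what an SSE-intrinsics version of the expression above could look like. It assumes exactly 8 floats and an x86/x64 target with SSE (which covers any Intel or AMD processor from the past 5 years). The helper name average8_sse is made up for illustration, and the sketch has not been benchmarked, so it may or may not beat what the compiler already generates for the scalar code.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, ...

// Hypothetical helper (name chosen for illustration): averages exactly 8 floats.
// Assumption: sampleValue points to at least 8 readable floats.
static float average8_sse(const float* sampleValue)
{
    // Unaligned loads, so the array does not need 16-byte alignment.
    __m128 lo = _mm_loadu_ps(sampleValue);      // elements 0..3
    __m128 hi = _mm_loadu_ps(sampleValue + 4);  // elements 4..7
    __m128 v  = _mm_add_ps(lo, hi);             // 4 partial sums, one per lane

    // Horizontal sum of the 4 lanes.
    // Step 1: fold the upper two lanes onto the lower two.
    __m128 shuf = _mm_movehl_ps(v, v);          // lanes {v2, v3, v2, v3}
    __m128 sums = _mm_add_ps(v, shuf);          // lane0 = v0+v2, lane1 = v1+v3

    // Step 2: add lane 1 into lane 0.
    shuf = _mm_shuffle_ps(sums, sums, _MM_SHUFFLE(1, 1, 1, 1));
    sums = _mm_add_ss(sums, shuf);              // lane0 = v0+v1+v2+v3

    // Multiply by 1/8 instead of dividing by 8 (exact, since 8 is a power of two).
    return _mm_cvtss_f32(sums) * 0.125f;
}
```

It would be called as float avg = average8_sse(sampleValue);. Note that this adds the values in a different order than the scalar expression, so the result can differ in the last bits; that should be acceptable here since speed matters more than precision.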

I apologize if this question has already been answered; if so, please direct me to the answer. Thanks.

Also, if somebody can point me to an "Intrinsic functions for dummies" style online article or tutorial, it would be much appreciated. Thanks!

Source: Visual Studio Questions