Example: Vector Addition

Vanilla Version

void add_vectors(float *a, float *b, float* c, int n) {
    for(int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }  
}

Using Intrinsics

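#include <assert.h>    // for assert
#include <nmmintrin.h> // for SSE4.2 (the intrinsics below only need SSE)

// a, b and c must be 16-byte aligned; n must be a multiple of 4.
void add_vectors(float *a, float *b, float* c, int n) {
    assert(n%4 == 0);
    for(int i = 0; i < n/4; i++) {
        __m128 av = _mm_load_ps(&a[4*i]); // load 4 floats from a
        __m128 bv = _mm_load_ps(&b[4*i]); // load 4 floats from b
        __m128 cv = _mm_add_ps(av, bv);   // cv = av .+ bv (element-wise)
        _mm_store_ps(&c[4*i], cv);        // store 4 floats to c
    }
}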
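
Handling Arbitrary Lengths and Alignment

The intrinsics version above assumes that n is a multiple of 4 and, because it uses _mm_load_ps and _mm_store_ps, that all three pointers are 16-byte aligned. The following is a minimal sketch of one way to relax both assumptions: it swaps in the unaligned variants _mm_loadu_ps and _mm_storeu_ps and finishes the leftover elements with a scalar loop. The function name add_vectors_any is hypothetical and not part of the original example.

```c
#include <assert.h>
#include <xmmintrin.h> // SSE: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps

// Hypothetical variant: accepts any n >= 0 and any pointer alignment.
void add_vectors_any(const float *a, const float *b, float *c, int n) {
    assert(n >= 0);
    int i = 0;
    // Vector part: 4 floats per iteration, using unaligned loads and stores.
    for (; i + 4 <= n; i += 4) {
        __m128 av = _mm_loadu_ps(&a[i]);
        __m128 bv = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&c[i], _mm_add_ps(av, bv));
    }
    // Scalar tail: the remaining n % 4 elements.
    for (; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

If the buffers are known to be 16-byte aligned (for example, allocated with _mm_malloc(size, 16) or C11 aligned_alloc), the aligned loads and stores of the original version can be kept and only the scalar tail added.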

References

[1] Practical SIMD Programming, Supplemental tutorial for INFOB3CC, INFOMOV & INFOMAGR, Jacco Bikker, 2017
