The programming model used in UME::SIMD is very simple. Instead of using scalar variables, use vector variables. A simple vector declaration can look like:
UME::SIMD::SIMDVec<float, 8> x_vec;
In the above declaration two template parameters have to be passed: the fundamental type used to represent each element (float) and the number of elements packed in the vector (8). The fundamental types supported are:
- unsigned integer: uint8_t (8b), uint16_t (16b), uint32_t (32b) and uint64_t (64b);
- signed integer: int8_t (8b), int16_t (16b), int32_t (32b) and int64_t (64b);
- floating point: float (32b) and double (64b).
For the vector length two rules apply:
- the vector length is a power of 2, starting with ‘1’,
- the total size of a vector is not larger than 1024b.
But what is the reason for these rules? Both limitations come from hardware constraints. At the hardware level it only makes sense for register lengths to be powers of 2, since supporting arbitrary vector sizes would require additional die area. At the same time, the hardware limit applies to the number of bits rather than the number of elements: a vector of 32 64-bit elements would occupy a 2048-bit register, while a vector of 32 8-bit elements would use only a 256-bit register. Unfortunately, this means that in the current model we can operate on uint8_t vectors of up to 128 elements, but on uint64_t vectors of at most 16 elements.
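The two length rules can be written down as a small compile-time check. This is only a sketch to make the arithmetic concrete: `is_valid_simd_length` is a hypothetical helper that is not part of UME::SIMD, and the 1024-bit limit is taken from the rule stated above.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t
#include <cstdint>   // fixed-width integer types

// Hypothetical helper, NOT part of UME::SIMD.
// Returns true when a vector of n elements of type T satisfies both rules:
//   1. n is a power of 2 (1, 2, 4, ...),
//   2. n * (bits per element) does not exceed 1024 bits.
template <typename T>
constexpr bool is_valid_simd_length(std::size_t n) {
    return n != 0
        && (n & (n - 1)) == 0                     // power-of-2 test
        && n <= 1024 / (sizeof(T) * CHAR_BIT);    // total width <= 1024b
}

static_assert(is_valid_simd_length<float>(8), "8 x 32b = 256b");
static_assert(is_valid_simd_length<std::uint8_t>(128), "128 x 8b = 1024b");
static_assert(is_valid_simd_length<std::uint64_t>(16), "16 x 64b = 1024b");
static_assert(!is_valid_simd_length<std::uint64_t>(32), "32 x 64b = 2048b is too wide");
static_assert(!is_valid_simd_length<float>(6), "6 is not a power of 2");
```

The limit is expressed as a division rather than a multiplication so that very large values of `n` cannot overflow the check.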
Once a vector is declared, it can be used in much the same way as any fundamental type… with some exceptions. The first problem is: how do we put actual data into this vector type?
Initialization from scalars
All vector elements can be initialized with the same value, taken from a scalar variable or a constant. To initialize the vector this way, it is possible to simply write something like:
UME::SIMD::SIMDVec<float, 8> x_vec1(3.14f);
UME::SIMD::SIMDVec<float, 8> x_vec2=3.14f;
float y=2.71f;
UME::SIMD::SIMDVec<float, 8> y_vec1(y);
UME::SIMD::SIMDVec<float, 8> y_vec2=y;
If the initialization has to take place after the vector declaration, it is enough to use the assignment operator (‘=’):
UME::SIMD::SIMDVec<float, 8> x_vec1, x_vec2;
x_vec1=3.14f;
float x=2.71f;
x_vec2=x;
Broadcast initialization is not always sufficient, as we might want a different initial value in every vector cell. You can initialize a vector using multiple scalar variables in the following way:
float x1=1.0f, x2=2.0f, x3=3.0f, x4=4.0f;
UME::SIMD::SIMDVec<float, 4> x_vec(x1, x2, x3, x4);
Initialization from memory
In many situations, we don’t want to simply propagate a single value to a vector register. If the initialization data is stored in an array in memory, it is possible to load the values into the vector register:
float x[4]={1.0f, 2.0f, 3.0f, 4.0f};
// Load-initialize with 'x':
UME::SIMD::SIMDVec<float, 4> x_vec1(x);
// Load from 'x' after vector declaration
UME::SIMD::SIMDVec<float, 4> x_vec2;
x_vec2.load(x);
Loading is important because it lets us initialize a vector through a memory pointer instead of individual scalar variables. This initialization mode matters especially for long vectors (imagine initializing a long SIMDVec using scalars…).
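To make the difference between broadcasting and loading concrete, here is a minimal sketch of what the two modes do to the individual vector cells. `ToyVec` is a hypothetical stand-in written only for this illustration: it is not the UME::SIMD implementation, it uses a plain array, and it performs no real SIMD operations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for illustration only, NOT the UME::SIMD type.
template <typename T, std::size_t N>
struct ToyVec {
    std::array<T, N> lanes{};   // one entry per vector cell

    ToyVec() = default;

    // Broadcast: every cell receives the same scalar value.
    explicit ToyVec(T scalar) { lanes.fill(scalar); }

    // Load-initialize: cell i receives ptr[i].
    explicit ToyVec(const T* ptr) { load(ptr); }

    // Load after declaration: copies N consecutive elements starting at ptr.
    // The caller must make sure that at least N elements are readable.
    ToyVec& load(const T* ptr) {
        assert(ptr != nullptr);
        for (std::size_t i = 0; i < N; ++i) {
            lanes[i] = ptr[i];
        }
        return *this;
    }
};
```

With this stand-in, `ToyVec<float, 4> b(3.14f)` puts 3.14f in all four cells, while `ToyVec<float, 4> v(x + 4)` on an 8-element array `x` fills the cells with `x[4]` through `x[7]`. Because a load only needs a pointer, the same single line works no matter how long the vector is.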
In the next tutorial we will look into some basic computations that can be performed using vector primitives.