
UME::SIMD Tutorial 0: Installation

The library is provided in header-only form, which makes its installation trivial. We will present installation procedures for both Linux and Windows operating systems. Since no build system is perfectly portable, we limit ourselves to the most common tools.

This procedure shows only the steps required to install UME::SIMD as a standalone library, but we make some small adjustments so that switching to the full UME framework remains possible in the future.

Linux installation procedure

The library in its primitive form doesn’t require any build system. The only configuration needed is passing the proper compilation flags to the compiler. The specific flags depend on the hardware, the compiler and the operating system. We will discuss specific configurations in future tutorials in this series.

This installation procedure should work on most Linux-based systems and requires a Git client.

    1. Navigate to the directory where you want to store the downloaded library and create a directory for the framework:
$ mkdir ume
$ cd ume 
    1. Clone the repository:
$ git clone 


    1. You might want to checkout a tagged version to keep track of library updates:


$ cd umesimd
$ git checkout tags/v0.8.1
$ cd ..
    1. Export the library directory:
$ pwd
    1. Include the library code into your C++ project:

Now you should be able to include the library in your program using the following directive:

 #include <umesimd/UMESimd.h>

That’s it!

Windows installation procedure

The Windows installation procedure is similar, but requires additional steps to configure the specific IDE. We will discuss this configuration for MS Visual Studio 2015 only, but similar rules apply to other IDEs.

    1. Navigate to the directory where you want to store the downloaded library and create a directory for the framework:
c:\> mkdir ume
c:\> cd ume 
    1. Clone the repository:
c:\ume> git clone 


    1. You might want to checkout a tagged version to keep track of library updates:


c:\ume> cd umesimd
c:\ume\umesimd> git checkout tags/v0.8.1
c:\ume\umesimd> cd ..
    1. Create a new project (or open an existing one):

[Ctrl+Shift+N] or File->New->Project…


    1. Add new source file to the project:

[Ctrl+Shift+A] or Project->Add New Item…


    1. Fill the main.cpp with:
#include <umesimd/UMESimd.h>

int main()
{
    return 0;
}
    1. Open the project properties and add the path to the ‘ume’ directory created earlier to the include directories (C/C++ -> General -> Additional Include Directories):

[Alt+Enter] or Project->Properties


There you go!

Vectorization #1: the difficulty of notation, part 2

In the previous post we showed a simple example of a vectorization process. One thing we did there was to assume a certain behaviour of the \cdot operation. Before getting to a more complicated vectorization example, we have to discuss that omission. It will be critical to understanding the more complex example.

We will also discuss the problem of ambiguous operation symbols and why we actually have to use multiple operations with different symbols.

Operations definition (aka. operator overloading)

Before we try applying a second round of vectorization to our previous equations, we have to discuss one more thing: the definition of operations. (Remember my comment about the \circ vs. \cdot symbols in the previous post?) Here’s the explanation.

If we have an equation of the following form:

a = b_1 \cdot c_1 + b_2 \cdot c_2 + ... + b_N \cdot c_N

and we want to represent it in a shorter, vector form, we could write:

\textbf{b} = (b_1, b_2, ..., b_N)
\textbf{c} = (c_1, c_2, ... , c_N)

a = \textbf{b} * \textbf{c}

We know this simplification as the vector dot product. In mathematics it is usually denoted with the ‘\cdot’ symbol.

On the other hand, we might face a set of equations, such as:

a_1 = b_1 \cdot c_1
a_2 = b_2 \cdot c_2
...
a_N = b_N \cdot c_N

In which case our best possible choice could be to re-write it as:

\textbf{a} = (a_1, a_2, ..., a_N)
\textbf{b} = (b_1, b_2, ..., b_N)
\textbf{c} = (c_1, c_2, ..., c_N)

\textbf{a} = \textbf{b} * \textbf{c}

This element-wise multiplication is known as the Hadamard product and, in commonly accepted mathematical notation, should be written as:

\textbf{a} = \textbf{b} \circ \textbf{c}
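In code, the two products are naturally written as two separate functions. The sketch below is only an illustration of the definitions above: the names `Vec`, `dot` and `hadamard` are our own choices, not part of any library discussed in this series.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // hypothetical vector type for this sketch

// Dot product: a = b_1*c_1 + b_2*c_2 + ... + b_N*c_N (scalar result).
double dot(const Vec& b, const Vec& c)
{
    assert(b.size() == c.size());
    double a = 0.0;
    for (std::size_t i = 0; i < b.size(); ++i) {
        a += b[i] * c[i];
    }
    return a;
}

// Hadamard product: a_i = b_i * c_i for every i (vector result).
Vec hadamard(const Vec& b, const Vec& c)
{
    assert(b.size() == c.size());
    Vec a(b.size());
    for (std::size_t i = 0; i < b.size(); ++i) {
        a[i] = b[i] * c[i];
    }
    return a;
}
```

For b = (1, 2, 3) and c = (4, 5, 6), `dot(b, c)` gives the scalar 32, while `hadamard(b, c)` gives the vector (4, 10, 18).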

On one hand, we want simpler notation, which boils down to a smaller number of different symbols and variable types. On the other hand, we would like the notation to be as readable as possible, so we increase the number of types and symbols. In programming languages we are free to define as many ‘functions’ as we like to describe the behavior of the operations we perform. But having too many functions requires programmers to memorize many of them to navigate the code efficiently.

What, then, is the practical limitation that prevents us from re-using operation symbols? If we used the same symbol for both operations, the only way to tell one operation from the other would be to look at the type of the result: in the first case it is a scalar, in the second a vector. This is possible, but with more complex equations it can make our notation considerably more complicated.
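C++ does not let us overload a function on its return type alone, but one way to imitate "look at the type of the result" is to let ‘*’ return a small proxy object that converts to either a scalar or a vector on request. The sketch below is only one possible way to do it; `Vec`, `Values` and `Product` are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Values = std::vector<double>;

// Hypothetical vector type for this sketch.
struct Vec {
    Values v;
};

// Result of b * c whose meaning has not been decided yet.
struct Product {
    const Vec& b;
    const Vec& c;

    // Requested as a scalar: behave like the dot product.
    operator double() const
    {
        double sum = 0.0;
        for (std::size_t i = 0; i < b.v.size(); ++i) {
            sum += b.v[i] * c.v[i];
        }
        return sum;
    }

    // Requested as a vector: behave like the Hadamard product.
    operator Vec() const
    {
        Vec r;
        r.v.reserve(b.v.size());
        for (std::size_t i = 0; i < b.v.size(); ++i) {
            r.v.push_back(b.v[i] * c.v[i]);
        }
        return r;
    }
};

// The same symbol for both products; the destination type picks the meaning.
Product operator*(const Vec& b, const Vec& c)
{
    assert(b.v.size() == c.v.size());
    return Product{b, c};
}
```

With this in place, `double s = b * c;` evaluates the dot product and `Vec h = b * c;` evaluates the Hadamard product. The trick only looks at the immediate destination type, so it says nothing yet about larger expressions in which several such products are combined.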

First, let’s define the ‘+’ operation in the following way:

\textbf{a} + \textbf{b} = (a_1 + b_1, a_2 + b_2, ..., a_N + b_N)
a + \textbf{b} = a + b_1 + b_2 + ... + b_N
\textbf{a} + b = a_1 + a_2 + ... + a_N + b
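These three definitions map directly onto three overloads of ‘+’, distinguished by the types of the operands. A minimal sketch, assuming a hypothetical wrapper type `Vec` around `std::vector<double>`:

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical vector type for this sketch.
struct Vec {
    std::vector<double> v;
};

// vector + vector -> vector: (a_1 + b_1, a_2 + b_2, ..., a_N + b_N)
Vec operator+(const Vec& a, const Vec& b)
{
    assert(a.v.size() == b.v.size());
    Vec r;
    r.v.reserve(a.v.size());
    for (std::size_t i = 0; i < a.v.size(); ++i) {
        r.v.push_back(a.v[i] + b.v[i]);
    }
    return r;
}

// scalar + vector -> scalar: a + b_1 + b_2 + ... + b_N
double operator+(double a, const Vec& b)
{
    return std::accumulate(b.v.begin(), b.v.end(), a);
}

// vector + scalar -> scalar: a_1 + a_2 + ... + a_N + b
double operator+(const Vec& a, double b)
{
    return std::accumulate(a.v.begin(), a.v.end(), 0.0) + b;
}
```

Here the compiler can always pick the right overload, because the operand types alone decide which definition applies. The difficulty discussed in this post starts when the operand types themselves are not yet known.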

Example of an expression to be evaluated:

\textbf{a} = \textbf{b} * \textbf{c} + \textbf{d} * \textbf{e}

We now have to work backwards, starting from the type of the result:

  1. \textbf{a} is a vector, so according to regular precedence, the result of the ‘+’ operation has to be a vector.
  2. The result of \textbf{b} * \textbf{c} + \textbf{d} * \textbf{e} is a vector only if both \textbf{b} * \textbf{c} and \textbf{d} * \textbf{e} are vectors.
  3. Both ‘*’ operations result in a vector only when the requested operation is a Hadamard product.

So in the example above the meaning of * would be of mathematical \circ .

What about the following expression, in which the only difference is that a is a scalar:

a = \textbf{b} * \textbf{c} + \textbf{d} * \textbf{e}

Again working it backwards:

  1. a is a scalar in multiple situations:
    1. The result of \textbf{b} * \textbf{c} is a scalar and the result of \textbf{d} * \textbf{e} is a vector
    2. The result of \textbf{b} * \textbf{c} is a vector and the result of \textbf{d} * \textbf{e} is a scalar
    3. The results of both \textbf{b} * \textbf{c} and \textbf{d} * \textbf{e} are scalars
  2. Since \textbf{b} * \textbf{c} and \textbf{d} * \textbf{e} cannot both be vectors at the same time, the result of \textbf{b} * \textbf{c} is a vector only when the result of \textbf{d} * \textbf{e} is a scalar.
  3. Since \textbf{b} * \textbf{c} and \textbf{d} * \textbf{e} cannot both be vectors at the same time, the result of \textbf{d} * \textbf{e} is a vector only when the result of \textbf{b} * \textbf{c} is a scalar.

At this point our deduction reveals a phenomenon known in programming as symbol ambiguity, a nightmare for both programming language and compiler designers. Points 2 and 3 are circular: we cannot deduce anything about the type of one operand’s result without knowing the type of the other’s. Without an additional rule saying how such a conflict should be resolved, we simply cannot deduce anything about the ‘*’ operations! We could modify our rules for +, saying for example: ‘if a + operator is supposed to result in a scalar, assume both of its operands result in vectors’. If such a rule held, then we could immediately say that the meaning of * would be that of \circ . The definition of:

\textbf{a} + \textbf{b} = (a_1 + b_1, a_2 + b_2, ..., a_N + b_N)

would immediately become:

\textbf{a} + \textbf{b}  -> vector: \textbf{a} + \textbf{b} = (a_1 + b_1, a_2 + b_2, ..., a_N + b_N)
\textbf{a} + \textbf{b}  -> scalar: \textbf{a} + \textbf{b} = a_1 + a_2 + ... + a_N + b_1 + b_2 + ... + b_N

Multiplying such rules is not very welcome, as it can lead to an ambiguity avalanche and, as a result, an explosion in the description of our notation. When an ambiguity arises, the simplest solution is to use a separate symbol with its own definition.

The continuation in: Vectorization #1: the difficulty of notation, part 3

Vectorization #1: the difficulty of notation, part 1

In this series of posts, I will look at different aspects of vectorization. We will start today with scalar/vector/matrix notations and try to explain the benefits and complications of using them. I won’t be talking much about performance or programming in the strict sense; rather, I will try to explain why finding a fair notation is a problem. The mathematics presented here is simplified so that non-mathematicians can also follow the thought process without (too) much effort, although some knowledge of real vector/matrix algebra might be of use.

Vectorizing scalar equations

The term ‘vector’ originates from mathematics. Say you have a group of equations, like:

a_1 = b_1 + c_1 \cdot d_1
a_2 = b_2 + c_2 \cdot d_2
...
a_N = b_N + c_N \cdot d_N

All of the variables: a_1, b_1, c_1, d_1, a_2, b_2, c_2, d_2, ..., a_N, b_N, c_N, d_N are used in equations of exactly the same form:

\alpha = \beta + \gamma \cdot \delta

In order to simplify the first notation, you can first group the variables that occupy the same position in each equation, forming vectors:

\textbf{a} = (a_1, a_2, ..., a_N)
\textbf{b} = (b_1, b_2, ..., b_N)
\textbf{c} = (c_1, c_2, ..., c_N)
\textbf{d} = (d_1, d_2, ..., d_N)

Now you can replace the original set of equations with a vector form (a cautious mathematician can now point out that ‘\cdot’ is not a correct vector operation symbol here, and that we should use \circ instead. This is a part of the thought experiment. Please bear with me in this insanity for a while.):

\textbf{a} = \textbf{b} + \textbf{c} \cdot \textbf{d}

An entity representing a singular value, such as a_1 is called a scalar. A list (or more precisely ‘an ordered-set’) of elements, such as \textbf{a} is called a vector.

By vectorizing our equations, we replaced an unknown number of equations (N) in a known number of variables (a_1, b_1, c_1, d_1) with a known number of equations (1) in an unknown number of packed variables (a_1, ..., a_N, b_1, ..., b_N, c_1, ..., c_N, d_1, ..., d_N).
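For readers who think in code, the single vector equation corresponds to one function that hides the N scalar equations inside a loop. This is just a sketch of the idea; the names `Vec` and `add_mul` are made up for this post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // hypothetical vector type for this sketch

// Evaluates a_i = b_i + c_i * d_i for every i, i.e. all N scalar equations.
Vec add_mul(const Vec& b, const Vec& c, const Vec& d)
{
    assert(b.size() == c.size() && c.size() == d.size());
    Vec a(b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = b[i] + c[i] * d[i];
    }
    return a;
}
```

The caller writes one line, `Vec a = add_mul(b, c, d);`, regardless of N. The number of elements is carried by the packed variables rather than by the number of statements.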

The obvious gain from vector notation is the simplification of mathematical proofs. Instead of repeating each time that a certain property applies to the equations for a_1, a_2, ..., a_N, we can simply state that the property applies to all elements of \textbf{a}. We still have a more complicated definition of the types we manipulate, but we usually have to deal with that only at the beginning and possibly at the end of a proof.

Now imagine the following set of equations:

a_{11} = 3 \cdot b_{11} + 4 \cdot c_{11} + 5 \cdot d_{11}
a_{12} = 3 \cdot b_{12} + 4 \cdot c_{12} + 5 \cdot d_{12}
...
a_{1M} = 3 \cdot b_{1M} + 4 \cdot c_{1M} + 5 \cdot d_{1M}
a_{21} = 1 \cdot b_{21} + 5 \cdot c_{21} + 4 \cdot d_{21}
a_{22} = 1 \cdot b_{22} + 5 \cdot c_{22} + 4 \cdot d_{22}
...
a_{2M} = 1 \cdot b_{2M} + 5 \cdot c_{2M} + 4 \cdot d_{2M}
...
a_{N1} = 7 \cdot b_{N1} + 3 \cdot c_{N1} + 9 \cdot d_{N1}
a_{N2} = 7 \cdot b_{N2} + 3 \cdot c_{N2} + 9 \cdot d_{N2}
...
a_{NM} = 7 \cdot b_{NM} + 3 \cdot c_{NM} + 9 \cdot d_{NM}

If we apply our vector notation then we have:

\textbf{a}_1 = (a_{11}, a_{12}, ..., a_{1M})
\textbf{a}_2 = (a_{21}, a_{22}, ..., a_{2M})

\textbf{a}_N = (a_{N1}, a_{N2}, ..., a_{NM})

\textbf{b}_1 = (b_{11}, b_{12}, ..., b_{1M})
\textbf{b}_2 = (b_{21}, b_{22}, ..., b_{2M})

\textbf{b}_N = (b_{N1}, b_{N2}, ..., b_{NM})

\textbf{c}_1 = (c_{11}, c_{12}, ..., c_{1M})
\textbf{c}_2 = (c_{21}, c_{22}, ..., c_{2M})

\textbf{c}_N = (c_{N1}, c_{N2}, ..., c_{NM})

\textbf{d}_1 = (d_{11}, d_{12}, ..., d_{1M})
\textbf{d}_2 = (d_{21}, d_{22}, ..., d_{2M})

\textbf{d}_N = (d_{N1}, d_{N2}, ..., d_{NM})

And the re-written equations:

\textbf{a}_1 = 3 \cdot \textbf{b}_1 + 4 \cdot \textbf{c}_1 + 5 \cdot \textbf{d}_1
\textbf{a}_2 = 1 \cdot \textbf{b}_2 + 5 \cdot \textbf{c}_2 + 4 \cdot \textbf{d}_2

\textbf{a}_N = 7 \cdot \textbf{b}_N + 3 \cdot \textbf{c}_N + 9 \cdot \textbf{d}_N

So we replaced a long set of equations, with a shorter set of equations. But again if we start performing proofs on such equations, we would end up repeating everything as in the first example.
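Each of the re-written equations combines three vectors of length M with three scalar coefficients. A small sketch of one such row in code, with hypothetical helper names (`scale`, `add`, `row`) chosen only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // hypothetical vector type for this sketch

// Scalar times vector: every element of v is multiplied by k.
Vec scale(double k, const Vec& v)
{
    Vec r(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        r[i] = k * v[i];
    }
    return r;
}

// Element-wise sum of two vectors of equal length.
Vec add(const Vec& x, const Vec& y)
{
    assert(x.size() == y.size());
    Vec r(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        r[i] = x[i] + y[i];
    }
    return r;
}

// One row of the system: a_i = kb * b_i + kc * c_i + kd * d_i.
Vec row(double kb, const Vec& b, double kc, const Vec& c, double kd, const Vec& d)
{
    return add(add(scale(kb, b), scale(kc, c)), scale(kd, d));
}
```

The first row would then read `row(3, b1, 4, c1, 5, d1)`, the second `row(1, b2, 5, c2, 4, d2)`, and so on. We still need one call per row, which mirrors the repetition left in the notation.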

We can also think about another way of representing the same set of equations. Instead of grouping all a’s and all b’s together, we could use the following grouping:

\textbf{p}_1 = (3, 4, 5)
\textbf{p}_2 = (1, 5, 4)
...
\textbf{p}_N = (7, 3, 9)

\textbf{q}_{11} = (b_{11}, c_{11}, d_{11})
\textbf{q}_{12} = (b_{12}, c_{12}, d_{12})
...
\textbf{q}_{NM} = (b_{NM}, c_{NM}, d_{NM})

Our second vector form then becomes:

a_{11} = \textbf{p}_1 \cdot \textbf{q}_{11}
a_{12} = \textbf{p}_1 \cdot \textbf{q}_{12}
...
a_{1M} = \textbf{p}_1 \cdot \textbf{q}_{1M}
a_{21} = \textbf{p}_2 \cdot \textbf{q}_{21}
a_{22} = \textbf{p}_2 \cdot \textbf{q}_{22}
...
a_{2M} = \textbf{p}_2 \cdot \textbf{q}_{2M}
...
a_{N1} = \textbf{p}_N \cdot \textbf{q}_{N1}
a_{N2} = \textbf{p}_N \cdot \textbf{q}_{N2}
...
a_{NM} = \textbf{p}_N \cdot \textbf{q}_{NM}

While in this case we didn’t decrease the number of equations, we made each of them much simpler. As with the previous transformation, we could vectorize again and get an even simpler notation for our set of equations. We will do that in a moment. But first we need to clarify something about the ‘\cdot’ symbol we’ve been using carelessly so far.
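In this second grouping, every scalar a_{ij} is a sum of three products of matching elements of \textbf{p}_i and \textbf{q}_{ij}. A rough sketch in code, where `Vec`, `dot` and `row` are again names invented for this post:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // hypothetical vector type for this sketch

// Sum of products of matching elements: p_1*q_1 + p_2*q_2 + ... + p_K*q_K.
double dot(const Vec& p, const Vec& q)
{
    assert(p.size() == q.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < p.size(); ++k) {
        sum += p[k] * q[k];
    }
    return sum;
}

// Evaluates a_{i1}, ..., a_{iM} for one coefficient vector p_i,
// where q holds q_{i1}, ..., q_{iM}, each of the form (b_{ij}, c_{ij}, d_{ij}).
Vec row(const Vec& p, const std::vector<Vec>& q)
{
    Vec a;
    a.reserve(q.size());
    for (const Vec& qij : q) {
        a.push_back(dot(p, qij));
    }
    return a;
}
```

With \textbf{p}_1 = (3, 4, 5) and \textbf{q}_{11} = (1, 1, 0), `dot` yields 3 + 4 + 0 = 7, the same value the original scalar equation for a_{11} would give for b_{11} = 1, c_{11} = 1, d_{11} = 0.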

The continuation in: Vectorization #1: the difficulty of notation, part 2