Lab 9

The core idea in vector processing is single instruction, multiple data (SIMD): one instruction operates on many data elements at once. Instead of programming at the assembly level, we are going to work with vector operations at the C level using intrinsics.

Intrinsic (definition): Intrinsics are special functions and types that we have implemented. You can use them within a C program to invoke vector operations and vector registers without dealing with assembly-level syntax or the fixed limit of 32 registers.

  • With intrinsics you can create an arbitrary number of vector registers in your program and invoke any of the operations defined in our vector library.

Vector types

  • __cs295_vec_int, __cs295_vec_float: The vector register types, holding int or float elements respectively. Each vector register holds VLEN elements. These registers can be passed as parameters to the vector functions.

{: .table-striped .table-bordered}

| Type | Description |
|---|---|
| `__cs295_vec_int` | Vector of type int |
| `__cs295_vec_float` | Vector of type float |
| `__cs295_mask` | Declares a mask vector of type bool |

e.g.,

__cs295_vec_int x_v, y_v;
  • __cs295_mask: Masks are bool vectors of width VLEN that specify whether a particular lane (index) is active in an operation. We illustrate with vmult below.

If the mask bit for a lane is 1, the lane performs the operation; if it is 0, the lane is inactive and its result is not modified.

for i = 0 to VLEN
   if mask[i] == 1 then
     result[i] = x_v[i] * y_v[i]
   else
     // DO NOTHING
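
The masked multiply above can be modelled in plain C as a sketch. This is not the lab's library: `emu_vmult` is a hypothetical name, vector registers are stood in for by ordinary arrays, and VLEN is assumed to be 8 purely for illustration.

```c
#include <stdbool.h>

#define VLEN 8 /* assumed vector width for this sketch only */

/* Hypothetical scalar model of a masked multiply.
   Active lanes (mask[i] == true) are overwritten with x[i] * y[i];
   inactive lanes keep whatever value result[i] already held. */
void emu_vmult(int result[VLEN], const int x[VLEN], const int y[VLEN],
               const bool mask[VLEN]) {
    for (int i = 0; i < VLEN; i++) {
        if (mask[i]) {
            result[i] = x[i] * y[i];
        }
    }
}
```

The point to notice is that an inactive lane is not zeroed: if `result` held a sentinel before the call, the sentinel is still there afterwards in every lane whose mask bit is 0.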

Mask instructions

{: .table-striped .table-bordered}

| Return value | Function | Description |
|---|---|---|
| `__cs295_mask` | `_cs295_init_ones(width)` | Return a mask initialized to 1 in the first width lanes and 0 in the others |
| `int` | `_cs295_cntbits(mask)` | Count the number of 1s in mask |

_cs295_init_ones
for i = 0 to VLEN
     if (i < width)
        mask[i] = 1
     else
        mask[i] = 0
return mask

_cs295_cntbits
ones = 0
for i = 0 to VLEN
    if (mask[i] == 1)
       ones = ones + 1
return ones
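
As a rough illustration of the two mask helpers, here is a scalar model in C. The `emu_` names are hypothetical stand-ins, a mask is represented as a plain `bool` array, and VLEN = 8 is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define VLEN 8 /* assumed vector width for this sketch only */

/* Hypothetical model of _cs295_init_ones: the first `width` lanes
   become 1 and every remaining lane becomes 0. */
void emu_init_ones(bool mask[VLEN], int width) {
    assert(width >= 0 && width <= VLEN);
    for (int i = 0; i < VLEN; i++) {
        mask[i] = (i < width);
    }
}

/* Hypothetical model of _cs295_cntbits: number of lanes set to 1. */
int emu_cntbits(const bool mask[VLEN]) {
    int ones = 0;
    for (int i = 0; i < VLEN; i++) {
        if (mask[i]) {
            ones = ones + 1;
        }
    }
    return ones;
}
```

A mask built with width 3 has lanes 0 to 2 active, so counting its bits gives 3; a width of VLEN gives the all-ones mask.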

Mask logical operations

{: .table-striped .table-bordered}

| Return value | Function | Description |
|---|---|---|
| `__cs295_mask` | `_cs295_mask_not(maskA)` | Return the inverse of maskA |
| `__cs295_mask` | `_cs295_mask_and(maskA, maskB)` | Return (maskA & maskB) |
| `__cs295_mask` | `_cs295_mask_or(maskA, maskB)` | Return (maskA or maskB) |

_cs295_mask_and(maskA, maskB)
for i = 0 to VLEN
     mask[i] = maskA[i] & maskB[i]
return mask
# mask is 1 only in the lanes where both maskA and maskB are 1
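
One common use of the logical mask operations is to split the active lanes into a "then" group and an "else" group. The sketch below models that in plain C; all `emu_` names are hypothetical, masks are `bool` arrays, and VLEN = 8 is assumed.

```c
#include <stdbool.h>

#define VLEN 8 /* assumed vector width for this sketch only */

/* Hypothetical lane-wise models of the mask logical operations. */
void emu_mask_not(bool out[VLEN], const bool a[VLEN]) {
    for (int i = 0; i < VLEN; i++) out[i] = !a[i];
}

void emu_mask_and(bool out[VLEN], const bool a[VLEN], const bool b[VLEN]) {
    for (int i = 0; i < VLEN; i++) out[i] = a[i] && b[i];
}

void emu_mask_or(bool out[VLEN], const bool a[VLEN], const bool b[VLEN]) {
    for (int i = 0; i < VLEN; i++) out[i] = a[i] || b[i];
}

/* Split the active lanes by a condition:
   then_m = active AND cond, else_m = active AND (NOT cond). */
void emu_split_lanes(bool then_m[VLEN], bool else_m[VLEN],
                     const bool active[VLEN], const bool cond[VLEN]) {
    bool not_cond[VLEN];
    emu_mask_not(not_cond, cond);
    emu_mask_and(then_m, active, cond);
    emu_mask_and(else_m, active, not_cond);
}
```

Every active lane ends up in exactly one of the two output masks, so OR-ing them back together reproduces the original active mask.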

Vector compute instructions

In the tables below, vec refers to __cs295_vec_int and mask refers to __cs295_mask. We use this shorthand for readability.

To use the float operations, replace the int suffix with float, e.g., _cs295_vadd_float. You also need to use the float vector registers, __cs295_vec_float.

_cs295_vset_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vset_int(int value, mask m)` | For the user's convenience, returns a vector register with every active lane set to value; inactive lanes keep their old value |

for i = 0 to VLEN
   if (m[i] == 1)
     v[i] = value;

_cs295_vmove_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vmove_int(vec dest, vec src, mask m)` | Copies src into dest in every active lane; inactive lanes of dest keep their old value |

for i = 0 to VLEN
   if (m[i] == 1)
     dest[i] = src[i];

_cs295_vadd_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vadd_int(vec &res, vec &x_v, vec &y_v, mask m)` | Computes (x_v + y_v) into res in every active lane; inactive lanes of res are not modified |

for i = 0 to VLEN
   if (m[i] == 1)
     res[i] = x_v[i] + y_v[i];

_cs295_vsub_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vsub_int(vec &res, vec &x_v, vec &y_v, mask m)` | Computes (x_v - y_v) into res in every active lane; inactive lanes of res are not modified |

for i = 0 to VLEN
   if (m[i] == 1)
     res[i] = x_v[i] - y_v[i];

_cs295_vmult_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vmult_int(vec &res, vec &x_v, vec &y_v, mask m)` | Computes (x_v * y_v) into res in every active lane; inactive lanes of res are not modified |

for i = 0 to VLEN
   if (m[i] == 1)
     res[i] = x_v[i] * y_v[i];

_cs295_vdiv_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vdiv_int(vec &res, vec &x_v, vec &y_v, mask m)` | Computes (x_v / y_v) into res in every active lane; inactive lanes of res are not modified |

for i = 0 to VLEN
   if (m[i] == 1)
     res[i] = x_v[i] / y_v[i];

_cs295_vshiftright_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vshiftright_int(vec &res, vec &x_v, vec &y_v, mask m)` | Computes (x_v >> y_v) into res in every active lane; inactive lanes of res are not modified |

for i = 0 to VLEN
   if (m[i] == 1)
     res[i] = x_v[i] >> y_v[i];

_cs295_vbitand_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vbitand_int(vec &res, vec &x_v, vec &y_v, mask m)` | Computes (x_v & y_v) into res in every active lane; inactive lanes of res are not modified |

for i = 0 to VLEN
   if (m[i] == 1)
     res[i] = x_v[i] & y_v[i];

_cs295_vabs_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vabs_int(vec &res, vec &x_v, mask m)` | Computes abs(x_v) into res in every active lane; inactive lanes of res are not modified |

for i = 0 to VLEN
   if (m[i] == 1)
     res[i] = abs(x_v[i]);

Comparison instructions

This set of operations returns a mask of true/false values based on a lane-by-lane comparison between two vectors. Note that there are two masks in these operations:

  • The mask operand, which controls which lanes are active.
  • The result mask, which holds the result of the comparisons.

_cs295_vgt_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vgt_int(mask &resmask, vec &x_v, vec &y_v, mask &m)` | Sets resmask to (x_v > y_v) in every active lane; inactive lanes of resmask keep their old value |

for i = 0 to VLEN
   # This mask controls whether the lane is active
   if (m[i] == 1)
     # The result of the comparison sets the result mask
     if (x_v[i] > y_v[i])
        resmask[i] = 1
     else
        resmask[i] = 0

_cs295_vlt_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vlt_int(mask &resmask, vec &x_v, vec &y_v, mask &m)` | Sets resmask to (x_v < y_v) in every active lane; inactive lanes of resmask keep their old value |

for i = 0 to VLEN
   # This mask controls whether the lane is active
   if (m[i] == 1)
     # The result of the comparison sets the result mask
     if (x_v[i] < y_v[i])
        resmask[i] = 1
     else
        resmask[i] = 0

_cs295_veq_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_veq_int(mask &resmask, vec &x_v, vec &y_v, mask &m)` | Sets resmask to (x_v == y_v) in every active lane; inactive lanes of resmask keep their old value |

for i = 0 to VLEN
   # This mask controls whether the lane is active
   if (m[i] == 1)
     # The result of the comparison sets the result mask
     if (x_v[i] == y_v[i])
        resmask[i] = 1
     else
        resmask[i] = 0
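
A comparison mask is typically fed straight into a later masked operation, which is how an `if` inside a loop body gets vectorized. The sketch below clamps every active lane to an upper limit. It is a scalar model only: the `emu_` functions are hypothetical stand-ins that use arrays in place of vector registers, and VLEN = 8 is assumed.

```c
#include <stdbool.h>

#define VLEN 8 /* assumed vector width for this sketch only */

/* Hypothetical model of a masked set: active lanes take `value`. */
static void emu_vset(int v[VLEN], int value, const bool m[VLEN]) {
    for (int i = 0; i < VLEN; i++) {
        if (m[i]) v[i] = value;
    }
}

/* Hypothetical model of a masked greater-than comparison:
   res[i] is written only in lanes where m[i] is set. */
static void emu_vgt(bool res[VLEN], const int x[VLEN], const int y[VLEN],
                    const bool m[VLEN]) {
    for (int i = 0; i < VLEN; i++) {
        if (m[i]) res[i] = (x[i] > y[i]);
    }
}

/* Scalar equivalent: if (active[i] && v[i] > limit) v[i] = limit; */
void emu_clamp_max(int v[VLEN], int limit, const bool active[VLEN]) {
    int limit_v[VLEN] = {0};
    bool too_big[VLEN] = {false};         /* starts all-zero on purpose */
    emu_vset(limit_v, limit, active);     /* broadcast the limit */
    emu_vgt(too_big, v, limit_v, active); /* which active lanes exceed it? */
    emu_vset(v, limit, too_big);          /* overwrite only those lanes */
}
```

Because the comparison leaves inactive lanes of the result mask untouched, the result mask is initialized to all zeros first; otherwise stale bits could make the final set touch lanes that were never compared.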

Vector memory instructions

_cs295_vmove_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vmove_int(vec &dest, vec &src, mask &m)` | Copies src to dest in every active lane |

for i = 0 to VLEN
   # This mask controls whether the lane is active
   if (m[i] == 1)
     dest[i] = src[i]

_cs295_vload_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vload_int(vec &x_v, int* src, mask &m)` | Loads values from array src into vector register x_v in every active lane |

for i = 0 to VLEN
   # This mask controls whether the lane is active
   if (m[i] == 1)
     x_v[i] = src[i]

_cs295_vstore_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vstore_int(int* dst, vec &x_v, mask &m)` | Stores values from vector register x_v to array dst in every active lane |

WARNING: x_v is a register; dst is an array in memory.

for i = 0 to VLEN
   # This mask controls whether the lane is active
   if (m[i] == 1)
     dst[i] = x_v[i]
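
Load, compute, and store are usually combined in a loop that walks an array VLEN elements at a time, with a shortened mask for the final partial chunk. The following is a scalar model of that pattern, not code against the lab's library: the `emu_` helpers are hypothetical, registers are plain arrays, and VLEN = 8 is assumed.

```c
#include <stdbool.h>

#define VLEN 8 /* assumed vector width for this sketch only */

static void emu_init_ones(bool m[VLEN], int width) {
    for (int i = 0; i < VLEN; i++) m[i] = (i < width);
}

static void emu_vload(int v[VLEN], const int *src, const bool m[VLEN]) {
    for (int i = 0; i < VLEN; i++) {
        if (m[i]) v[i] = src[i];   /* inactive lanes never touch memory */
    }
}

static void emu_vadd(int res[VLEN], const int x[VLEN], const int y[VLEN],
                     const bool m[VLEN]) {
    for (int i = 0; i < VLEN; i++) {
        if (m[i]) res[i] = x[i] + y[i];
    }
}

static void emu_vstore(int *dst, const int v[VLEN], const bool m[VLEN]) {
    for (int i = 0; i < VLEN; i++) {
        if (m[i]) dst[i] = v[i];
    }
}

/* out[i] = a[i] + b[i] for 0 <= i < n, where n need not be a
   multiple of VLEN. The last chunk uses a mask with fewer ones. */
void emu_array_add(int *out, const int *a, const int *b, int n) {
    for (int i = 0; i < n; i += VLEN) {
        int remaining = n - i;
        int width = (remaining < VLEN) ? remaining : VLEN;
        bool m[VLEN];
        int a_v[VLEN], b_v[VLEN], sum_v[VLEN];
        emu_init_ones(m, width);
        emu_vload(a_v, a + i, m);
        emu_vload(b_v, b + i, m);
        emu_vadd(sum_v, a_v, b_v, m);
        emu_vstore(out + i, sum_v, m);
    }
}
```

With n = 11 the loop runs twice: once with all 8 lanes active and once with only the first 3, so nothing past the end of the arrays is read or written.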

_cs295_vload_seg_int

Converts array-of-structs to multiple arrays.

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_vload_seg_int(vec dst[], int* src, int fields)` | Loads a vector of tuples of elements from memory such that each component of the tuple is loaded into a different vector. This operation is useful for converting a memory representation of Array-of-Structures into a register representation of Structure-of-Arrays. |

Note that we are returning an array of vector registers.

  for i = 0 to VLEN
    for f = 0 to fields
      if (m[i] == 1)
        dst[f][i] = *src
        # Pointer scaled by int
        src = src + 1
<iframe allowfullscreen frameborder="0" style="width:640px; height:480px" src="https://lucid.app/documents/embeddedchart/4a7d6ac4-4017-4a9f-a8f3-df5efd9fff41" id="ihha85gDSYDP"></iframe>
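
To make the Array-of-Structures to Structure-of-Arrays conversion concrete, here is a small scalar model. It assumes all lanes are active, uses the hypothetical name `emu_vload_seg`, represents the array of vector registers as a 2-D array, and fixes VLEN at 8 for illustration.

```c
#define VLEN 8 /* assumed vector width for this sketch only */

/* Hypothetical model of a segmented load with every lane active.
   Memory holds VLEN structs of `fields` ints each, back to back.
   dst[f][i] receives field f of struct i. */
void emu_vload_seg(int dst[][VLEN], const int *src, int fields) {
    for (int i = 0; i < VLEN; i++) {
        for (int f = 0; f < fields; f++) {
            dst[f][i] = *src;
            src = src + 1;   /* pointer arithmetic is scaled by sizeof(int) */
        }
    }
}
```

For an array of (x, y) points stored as x0 y0 x1 y1 ..., calling this with fields = 2 leaves all the x values in `dst[0]` and all the y values in `dst[1]`, ready for lane-wise arithmetic.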

Vector index instructions

_cs295_firstbit

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `int _cs295_firstbit(__cs295_mask &maska)` | Finds the index of the first non-zero lane, scanning from the left |

for i = 0 to VLEN:
  if (maska[i] != 0):
     return i
return 0
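
Following the pseudocode above, an all-zero mask and a mask whose first set lane is lane 0 both yield 0. A caller that needs to tell the two apart can count the bits first. The sketch below models this in plain C; the `emu_` names and the -1 convention are hypothetical, and VLEN = 8 is assumed.

```c
#include <stdbool.h>

#define VLEN 8 /* assumed vector width for this sketch only */

/* Hypothetical model of _cs295_firstbit, following the pseudocode:
   returns 0 when no lane is set. */
static int emu_firstbit(const bool mask[VLEN]) {
    for (int i = 0; i < VLEN; i++) {
        if (mask[i]) return i;
    }
    return 0;
}

/* Hypothetical model of _cs295_cntbits. */
static int emu_cntbits(const bool mask[VLEN]) {
    int ones = 0;
    for (int i = 0; i < VLEN; i++) {
        if (mask[i]) ones = ones + 1;
    }
    return ones;
}

/* Index of the first active lane, or -1 if the mask is empty.
   The -1 convention belongs to this sketch, not to the library. */
int emu_first_active(const bool mask[VLEN]) {
    if (emu_cntbits(mask) == 0) {
        return -1;
    }
    return emu_firstbit(mask);
}
```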

Vector permutation instructions

_cs295_hadd_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_hadd_int(vec result, vec x_v)` | Adds up adjacent pairs of elements, so [0 1 2 3] -> [0+1 0+1 2+3 2+3] |

for i = 0 to VLEN
   # Even lanes compute the pair sum; odd lanes copy it from the lane to their left
   if (i % 2 == 0)
     result[i] = x_v[i] + x_v[i+1]
   else
     result[i] = result[i-1]
<iframe allowfullscreen frameborder="0" style="width:640px; height:480px" src="https://lucid.app/documents/embeddedchart/49506859-3930-4169-b2b9-6fd6834fd8bd" id="Loha5RfnanTV"></iframe>
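
A scalar model of the horizontal add may help when tracing it by hand. `emu_hadd` is a hypothetical name, registers are plain arrays, and VLEN = 8 (an even width) is assumed.

```c
#define VLEN 8 /* assumed (even) vector width for this sketch only */

/* Hypothetical model of a horizontal add: each adjacent pair
   (lane 2k, lane 2k+1) is summed and the sum is written to both lanes. */
void emu_hadd(int result[VLEN], const int x[VLEN]) {
    for (int i = 0; i < VLEN; i += 2) {
        int pair_sum = x[i] + x[i + 1];
        result[i] = pair_sum;
        result[i + 1] = pair_sum;
    }
}
```

With the input [0 1 2 3 4 5 6 7] this produces [1 1 5 5 9 9 13 13].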

_cs295_interleave_int

{: .table-striped .table-bordered}

| Function | Description |
|---|---|
| `_cs295_interleave_int(vec result, vec x_v)` | Performs an even-odd interleaving where all even-indexed elements move to the front half of the vector and all odd-indexed elements to the back half, so [0 1 2 3 4 5 6 7] -> [0 2 4 6 1 3 5 7] |
<iframe allowfullscreen frameborder="0" style="width:640px; height:480px" src="https://lucid.app/documents/embeddedchart/52b8a1d6-3f4a-4409-a21b-be29b44c8f6f" id="CwhaJXzX_4F-"></iframe>
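
One way the two permutation operations can work together is a sum over all lanes: alternating a horizontal add with an interleave halves the number of distinct partial sums each round. The sketch below is a scalar model under stated assumptions: the `emu_` names are hypothetical, VLEN = 8 is assumed, and the loop relies on VLEN being a power of two.

```c
#define VLEN 8 /* assumed vector width; must be a power of two here */

/* Hypothetical model of the horizontal add. */
static void emu_hadd(int result[VLEN], const int x[VLEN]) {
    for (int i = 0; i < VLEN; i += 2) {
        int pair_sum = x[i] + x[i + 1];
        result[i] = pair_sum;
        result[i + 1] = pair_sum;
    }
}

/* Hypothetical model of the even-odd interleave: even-indexed elements
   go to the front half, odd-indexed elements to the back half.
   result and x must not be the same array. */
static void emu_interleave(int result[VLEN], const int x[VLEN]) {
    for (int i = 0; i < VLEN; i++) {
        if (i % 2 == 0) {
            result[i / 2] = x[i];
        } else {
            result[VLEN / 2 + i / 2] = x[i];
        }
    }
}

/* Sum of all lanes using log2(VLEN) rounds of hadd + interleave.
   After the last round every lane holds the total; lane 0 is returned. */
int emu_reduce_sum(const int x[VLEN]) {
    int cur[VLEN], tmp[VLEN];
    for (int i = 0; i < VLEN; i++) cur[i] = x[i];
    for (int span = 1; span < VLEN; span *= 2) {
        emu_hadd(tmp, cur);
        emu_interleave(cur, tmp);
    }
    return cur[0];
}
```

Tracing [1 2 3 4 5 6 7 8]: the first round gives [3 7 11 15 3 7 11 15], the second [10 26 10 26 10 26 10 26], and the third leaves 36 in every lane.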