This lab will help you get acquainted with the routines needed for assignment 3. However, note that the file formats and reads/writes are different from those in assignment 3.
This lab has been modified by your CMPT 295 instructor.
wget "https://www2.cs.sfu.ca/~ashriram/Courses/CS295/assets/distrib/Venus/jvm/venus.jar"
We will be testing all of your code on RISC-V calling conventions, as described in lecture/lab/discussion. All functions that overwrite registers that are preserved by convention must have a prologue and epilogue where they save those register values to the stack at the start of the function and restore them at the end.
Follow the calling conventions. It is extremely important for this project, as you’ll be writing functions that call your other functions, and maintaining the abstraction barrier provided by the conventions will make your life a lot easier.
We’ve provided # Prologue and # Epilogue comments in each function as a reminder. Note that depending on your implementation, some functions won’t end up needing a prologue and epilogue. In these cases, feel free to delete or ignore the comments we’ve provided.
For a closer look at RISC-V calling conventions, refer here.
addi sp, sp, -8
and addi sp, sp, 8
?sw s0, 4 (sp)
Source | Description |
---|---|
test_dot.s, dot.s | Driver for dot product, implement dot product |
test_read_vector.s, read_vector.s | Driver for reading matrix, read the matrix |
main.s | Main starting file for end-to-end run |
In the test_files subdirectory, you’ll find several RISC-V files to test your code with. There is a test file corresponding to every function you’ll have to write, except for part 2.
DO NOT MODIFY THE INPUTS WHEN COMMITTING THE FILES TO GIT. MODIFIED INPUTS WILL FAIL THE REFERENCE CHECKS.
In this project, vectors are stored in row-major order. We can think of vectors as one-dimensional matrices with all values flattened out.
In this part, you will implement functions to read matrices from the binary files. Then, you’ll write a main function putting together all of the functions you’ve written so far into an MNIST classifier, and run it using pre-trained weight matrices that we’ve provided.
Our vector files are provided in binary format. We recommend the xxd command to open the binary file (DO NOT USE YOUR EDITOR). You can find its man page here, but its default functionality is to output the raw bits of the file in a hex representation.
The first 4 bytes of the binary file represent one 4-byte integer. This integer is the number of elements in the vector. Every following 4 bytes represent an integer that is an element of the matrix, in row-major order. In this case, each 4-byte group represents a pixel value. There are no gaps between elements. It is important to note that the bytes are in little-endian order. This means the least significant byte is placed at the lowest memory address. For files, the start of the file is considered the lower address. This relates to how we read files into memory, and the fact that the start/first element of an array is usually at the lowest memory address.
$ xxd ./inputs/m0.bin | more
# hit q to exit the viewer
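The layout described above can be modelled outside of RISC-V to check your understanding of what xxd prints. The following is a minimal sketch, not part of the starter code: the helper names encode_vector and decode_vector are hypothetical, and it assumes elements are signed 32-bit integers.

```python
import struct

def encode_vector(values):
    """Pack ints as a 4-byte little-endian count followed by each element."""
    header = struct.pack("<i", len(values))
    body = struct.pack(f"<{len(values)}i", *values)
    return header + body

def decode_vector(data):
    """Parse the raw bytes of a vector file back into a list of ints."""
    if len(data) < 4:
        raise ValueError("file too short to hold the element count")
    (count,) = struct.unpack_from("<i", data, 0)
    # The count tells us exactly how many bytes must follow the header.
    if count < 0 or len(data) != 4 + 4 * count:
        raise ValueError("unexpected number of bytes")
    return list(struct.unpack_from(f"<{count}i", data, 4))

raw = encode_vector([1, 2, 3])
# Group the hex dump in 4-byte words, similar to what a hex viewer shows.
print(raw.hex(" ", 4))
print(decode_vector(raw))
```

Note how the count 3 appears as the bytes 03 00 00 00: the least significant byte comes first, which is what little-endian order means.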
The stride of a vector is the number of memory locations between consecutive accesses to the vector. If our stride is n, then consecutive accesses access vector[i] and vector[i+n]. If the address of vector[i] is address, then the memory address of vector[i+n] = address + n * sizeof(element).
So far, all the arrays/vectors we’ve worked with have had stride 1, meaning there is no gap between consecutive elements. For a closer look at strides in vectors/arrays, see this Wikipedia page.
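The address formula above can be written out as a short sketch. The function name element_address and the base address used here are hypothetical, and the element size of 4 bytes is an assumption that matches the 4-byte integers used in this lab.

```python
ELEMENT_SIZE = 4  # assumed: each element is a 4-byte integer

def element_address(base, access, stride=1, element_size=ELEMENT_SIZE):
    """Byte address touched by the given access number (0-based) for a stride."""
    return base + access * stride * element_size

base = 0x10000000  # hypothetical address of vector[0]
for k in range(3):
    # With stride 2, accesses land on vector[0], vector[2], vector[4].
    print(hex(element_address(base, k, stride=2)))
```

In assembly, the multiplication by the element size is usually done with a shift, since multiplying by 4 is the same as shifting left by 2.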
In read_vector.s, implement the read_vector function, which uses the file operations we described above to read a binary matrix file into memory. If any file operation fails or doesn’t return the expected number of bytes, exit the program with exit code 1. The code to do this has been provided for you; simply jump to the eof_or_error label at the end of the file.
Recall that the first 4 bytes indicate the size of the vector, which will tell you how many bytes to read from the rest of the file.
You’ll need to allocate memory for the matrix in this function as well. This will require calls to malloc, which is in util.s and also described in the background section above.
Finally, note that RISC-V only allows for a0 and a1 to be return registers, and our function needs to return three values: the pointer to the matrix in memory, the number of rows, and the number of columns. We get around this by having two int pointers passed in as arguments. We set these integers to the number of rows and columns, and return just the pointer to the matrix.
$ java -jar venus.jar ./test_files/test_read_vector.s
## See the screen output
## To validate
$ java -jar venus.jar ./test_files/test_read_vector.s > ./out/test_read_vector.out
$ diff ./out/test_read_vector.out ./ref/test_read_vector.out
# If diff reports any lines, then check your output
In dot.s, implement the dot function to compute the dot product of two integer vectors. The dot product of two vectors a and b is defined as dot(a, b) = \sum_{i=0}^{n-1} a_i b_i = a_0 * b_0 + a_1 * b_1 + \cdots + a_{n-1} * b_{n-1}, where a_i is the ith element of a.
Notice that this function takes in a stride for each of the two vectors; make sure you consider this when calculating your memory addresses. We’ve described strides in more detail in the background section above, which also contains a detailed example on how stride affects memory addresses for vector elements.
Also note that we do not expect you to handle overflow when multiplying. This means you won’t need to use the mulh instruction.
For a closer look at dot products, see this Wikipedia page.
This time, you’ll need to fill out test_dot.s, using the starter code and comments we’ve provided. Overall, this test should call your dot product on two vectors in static memory, and print the result (285 for the sample input).
$ java -jar venus.jar ./test_files/test_dot.s
## See the screen output
## To validate
$ java -jar venus.jar ./test_files/test_dot.s > ./out/test_dot.out
$ diff ./out/test_dot.out ./ref/test_dot.out
# If diff reports any lines, then check your output
By default, in the starter code we’ve provided, v0 and v1 point to the start of an array of the integers 1 to 9, contiguous in memory. Let’s assume we set the length and stride of both vectors to 9 and 1 respectively. We should get the following:
v0 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
v1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
dot(v0, v1) = 1 * 1 + 2 * 2 + ... + 9 * 9 = 285
What if we changed the length to 3 and the stride of the second vector v1 to 2, without changing the values in static memory? Now, the vectors contain the following:
v0 = [1, 2, 3]
v1 = [1, 3, 5]
dot(v0, v1) = 1 * 1 + 2 * 3 + 3 * 5 = 22
Note that v1 now has stride 2, so we skip over elements in memory when calculating the dot product. However, the pointer v1 still points to the same place as before: the start of the sequence of integers 1 to 9 in memory.
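The two cases above can be reproduced with a small reference model, which may help when checking the output of your RISC-V dot function. This is a sketch only: the argument order (v0, v1, length, stride0, stride1) is an assumption made for illustration and is not taken from the starter code, and a Python list stands in for the integers in static memory.

```python
def dot(v0, v1, length, stride0, stride1):
    """Strided dot product over two flat sequences of integers."""
    total = 0
    for i in range(length):
        # The i-th logical element sits i * stride slots from the start.
        total += v0[i * stride0] * v1[i * stride1]
    return total

memory = list(range(1, 10))  # the integers 1 to 9, as in static memory

print(dot(memory, memory, 9, 1, 1))  # length 9, both strides 1 -> 285
print(dot(memory, memory, 3, 1, 2))  # length 3, v1 stride 2 -> 22
```

Both calls pass the same starting sequence for v0 and v1; only the length and stride arguments change which elements take part in the sum.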
In main.s, implement the main function. This will bring together everything you’ve written so far, and create a basic sequence of functions that will allow you to compute the dot product of two vectors and display the result on the screen. You may need to malloc space when reading in the vectors and computing the output of the dot product.
Note that for THIS PROJECT/FUNCTION ONLY, we will NOT require you to follow RISC-V calling convention by preserving saved registers in the main function. This is to make testing the main function easier, and to reduce its length. Normally, main functions do follow convention with a prologue and epilogue.
The file paths for m0 and m1 will be passed in on the command line. RISC-V handles command line arguments in the same way as C: at the start of the main function, a0 and a1 will be set to argc and argv respectively.
We will call main.s in the following way:
java -jar venus.jar ./test_files/main.s ./inputs/m0.bin ./inputs/m1.bin
Note that this means the pointer to the string M0_PATH will be located at index 1 of argv, M1_PATH at index 2, and so on.
If the number of command line arguments is different from what is expected, your code should exit with exit code 3. This will require a call to a helper function in utils.s. Take a look at the starter code for matmul, read_vector, and write_matrix for hints on how to do this.
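The argument handling described above can be sketched as follows. This is an illustration under stated assumptions, not the starter code: the function name parse_args is hypothetical, and the expected argument count of 3 (program name, M0_PATH, M1_PATH) is inferred from the sample command line rather than stated in the lab.

```python
import sys

EXPECTED_ARGC = 3  # assumed: argv[0] plus M0_PATH and M1_PATH

def parse_args(argv):
    """Return (m0_path, m1_path), or exit with code 3 on a bad argument count."""
    if len(argv) != EXPECTED_ARGC:
        sys.exit(3)
    # argv[0] is the program itself, so the paths start at index 1.
    return argv[1], argv[2]

m0_path, m1_path = parse_args(["main.s", "./inputs/m0.bin", "./inputs/m1.bin"])
print(m0_path, m1_path)
```

In main.s, the equivalent check compares a0 against the expected count before any pointer is loaded from a1.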
dot.s, line 18: sw ra, 0(sp). What does this instruction do, and why do we need this instruction?
dot.s, lines 19-26: Why do we save registers s0-s7?
line 36: slli s3, s3, 2
line 17: addi sp sp -36 and line 71: addi sp sp 36