Homework — Programming in Assembler

Overview

The aim of this homework is to explore how high-level language constructs (if/then/else, for loops...) can be expressed in assembly language.

We suppose that i is a variable containing a 32-bit integer (for instance an int32_t in C) stored at address 0x10000000 in memory.

You can test your answers using Ripes, a RISC-V simulator (the instructions are at the bottom of the page).

Load a 32-bit value into a register

The first construction we need is a way to load an arbitrary value into a register of the processor.

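Question 1: using the instruction ORI (OR with immediate value), and the fact that the register x0 always contains the value 0, how can we load a 12-bit unsigned value (such as 0x1A0) into register x6? Note that the 12-bit immediate of ORI is sign-extended, so this works directly only when bit 11 of the value is 0 (values from 0 to 0x7FF).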

Loading a 12-bit value is nice, but we would like to load any 32-bit value into a register. Fortunately, the LUI instruction can help us.

Question 2: using the instruction LUI and the answer to question 1, how can we load an arbitrary 32-bit value (such as 0xBEEF01A0) into register x6?

In real code, an arbitrary constant is often loaded into a register differently. The constant is embedded directly into the code section of the program (as if it were an instruction) and loaded into a register with a LW (load word) instruction.

Note also that loading a 32-bit value into a register can be achieved with the pseudo-instruction LI (load immediate), which is automatically translated to LUI and ORI or ADDI instructions.
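The interaction between LUI and a sign-extended 12-bit immediate can be modelled in C. The following is only a sketch: lui_model, addi_model and li_model are hypothetical helper names, and li_model shows one possible LUI + ADDI expansion, not necessarily the one your assembler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Models LUI: the 20-bit immediate lands in bits 31..12, bits 11..0 are zero. */
uint32_t lui_model(uint32_t imm20)
{
    return (imm20 & 0xFFFFFu) << 12;
}

/* Models ADDI: the 12-bit immediate is sign-extended before the addition. */
uint32_t addi_model(uint32_t rs1, uint32_t imm12)
{
    uint32_t imm = imm12 & 0xFFFu;
    if (imm & 0x800u) {
        imm |= 0xFFFFF000u;   /* sign extension: copy bit 11 into bits 31..12 */
    }
    return rs1 + imm;         /* unsigned arithmetic wraps modulo 2^32 */
}

/* Hypothetical helper: one possible LUI + ADDI expansion of "li rd, value". */
uint32_t li_model(uint32_t value)
{
    uint32_t low = value & 0xFFFu;
    /* Adding 0x800 increments the upper part exactly when bit 11 of the low
       part is set, which cancels the sign extension performed by ADDI. */
    uint32_t high = (value + 0x800u) >> 12;
    uint32_t result = addi_model(lui_model(high), low);
    assert(result == value);  /* the two-instruction sequence rebuilds value */
    return result;
}
```

For 0xBEEF01A0, bit 11 of the low part is 0, so the upper part is simply 0xBEEF0. For a value such as 0xBEEF0FA0, the upper part has to become 0xBEEF1 to compensate for the sign extension.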

Use of variables

As you have seen during the lectures, in RISC-V, all arithmetic and logic operations (addition, subtraction...) are performed on operands stored in registers, and the result is stored in a register. In high-level programming languages (C, Python, Rust...), variables are often stored in memory. So when the content of a variable is manipulated, it is often necessary to:

  • read the value of the variable from the memory to a register,
  • manipulate the content of the register,
  • store the new value from a register to the memory.

Let's consider the following code snippet:

i = i + 1;
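The three steps listed above can be made explicit in C by treating a local variable as if it were a register. This is only an illustration: increment_in_memory and reg are hypothetical names, and the comments indicate which kind of RISC-V instruction each step would correspond to.

```c
#include <assert.h>
#include <stdint.h>

/* Increments the 32-bit word stored at the given address. */
void increment_in_memory(int32_t *address)
{
    assert(address != 0);     /* the address must point to a valid word */
    int32_t reg = *address;   /* step 1: read the variable into a "register" (LW) */
    reg = reg + 1;            /* step 2: manipulate the register (ADDI) */
    *address = reg;           /* step 3: write the register back to memory (SW) */
}
```

Calling increment_in_memory(&i) with i equal to 127 leaves 128 in i.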

Question 3: write the sequence of assembly instructions required to increment the content of the variable i by 1 (corresponding to the code snippet above). Use as many registers as necessary. Hint: you will need the following instructions: LW (load word), SW (store word), ADDI (add with immediate), and the result of question 2.

Reminder: i is stored at address 0x10000000 in memory. In Ripes, you can initialise i by adding the following assembler code to the end of your program:

.data
i: .word 127

The directive .data indicates that the following code (or data) will be stored at the beginning of the data section, which happens to be located at address 0x10000000. You will learn more about memory layout in the second part of the lecture. i: is a label.

If/then/else constructions

We will now see how to translate an if/then/else construction.

If/then

Let's consider the following code snippet (in C language):

if (i == 0) {
    i = 1;     // code block #1
}
i = i + 1;     // code block #2

If you are not (yet) familiar with the C language, the code does the following:

  • If the value of the variable i is 0, the code block in braces { } is executed
  • Next, whatever the value of i, the code block after the braces is executed

In assembly, if constructions are built using conditional branch instructions (BEQ, BNE, BLT, BGE). These instructions take three parameters:

  • two registers (src1 and src2),
  • a 12-bit signed immediate value (offset), which encodes the offset in multiples of 2 bytes.

When the condition is met, i.e.:

  • for BEQ, when the content of the register src1 is equal to the content of the register src2,
  • for BNE, when the content of the register src1 is not equal to the content of the register src2,
  • for BLT, when the content of the register src1 is less than the content of the register src2,
  • for BGE, when the content of the register src1 is greater than or equal to the content of the register src2,

the processor jumps to the instruction at the address PC + offset, where PC is the address of the conditional branch instruction. However, the usual way to indicate a jump target is a label, which is translated into the correct offset during assembly. To declare a label for an assembler instruction, write the label identifier (it must begin with a letter) followed by a colon : at the start of the line, before the instruction.

Two other instructions exist (BLTU and BGEU), which do the same as BLT and BGE but use unsigned comparisons instead of signed comparisons.
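The difference between the signed comparison of BLT and the unsigned comparison of BLTU can be sketched in C. The function names blt_taken, bltu_taken and check_comparisons are hypothetical; the sketch also assumes the usual two's complement conversion from uint32_t to int32_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models the BLT condition: both register contents are read as signed. */
bool blt_taken(uint32_t src1, uint32_t src2)
{
    return (int32_t)src1 < (int32_t)src2;
}

/* Models the BLTU condition: the same bits are read as unsigned. */
bool bltu_taken(uint32_t src1, uint32_t src2)
{
    return src1 < src2;
}

/* Same register contents, different outcome depending on the instruction. */
void check_comparisons(void)
{
    uint32_t all_ones = 0xFFFFFFFFu;    /* -1 if signed, 4294967295 if unsigned */
    assert(blt_taken(all_ones, 1u));    /* -1 < 1: the branch is taken */
    assert(!bltu_taken(all_ones, 1u));  /* 4294967295 < 1 is false: not taken */
}
```

The register content itself carries no sign information; only the choice of instruction decides how the bits are interpreted.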

Question 4: write the sequence of assembly instructions that behaves the same as the code block above. You should reuse your solution to Question 3 in order to load variable i into a register before testing its value. The assignment i = 1 and the addition i = i + 1 can be done locally in a register, but don't forget to store the result in the correct memory location at the end.

If/then/else

We now want to translate the following C code:

if (i == 0) {
    i = 1;     // code block #1
}
else {
    i = -i;    // code block #2
}
i = i + 1;     // code block #3

If the content of the variable i is 0, code block 1 then code block 3 are executed, else (if i is not equal to 0), code block 2 then code block 3 are executed.

To translate this block of code, we will need, in addition to a conditional branch instruction, an unconditional jump instruction. In RISC-V, an unconditional jump is encoded as jal x0, offset where offset is a 20-bit signed immediate value. When the processor executes this instruction, it jumps to the instruction at address A + offset, where A is the address of the jal instruction. As with branches, we can use labels as jump targets.

The 'l' in jal stands for 'link': the address of the next instruction is stored in a register as the return address for a function call. If we don't care about returning, we can use register x0 (zero) as the destination, since writes to the zero register are simply ignored. There is even a pseudo-instruction j (jump), which is translated to jal x0, offset.
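Before writing assembly, it can help to rewrite the if/then/else in C using only goto, since a goto guarded by a test resembles a conditional branch and a plain goto resembles the j pseudo-instruction. The sketch below is one possible arrangement, with hypothetical function and label names; it takes i as a parameter instead of reading it from memory.

```c
#include <assert.h>
#include <stdint.h>

/* if/then/else expressed with gotos only, close to a branch/jump layout. */
int32_t if_then_else_with_gotos(int32_t i)
{
    if (i != 0) goto else_block;  /* conditional branch over code block #1 */
    i = 1;                        /* code block #1 */
    goto end_if;                  /* unconditional jump over code block #2 */
else_block:
    i = -i;                       /* code block #2 */
end_if:
    i = i + 1;                    /* code block #3 */
    return i;
}

/* Checks the three interesting cases: zero, positive, negative. */
void check_if_then_else(void)
{
    assert(if_then_else_with_gotos(0) == 2);
    assert(if_then_else_with_gotos(5) == -4);
    assert(if_then_else_with_gotos(-3) == 4);
}
```

Note that the branch condition (i != 0) is the negation of the condition in the original C code, because the branch skips code block #1 rather than entering it.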

Question 5: write the sequence of assembly instructions that behaves the same as the C code shown above. As before, reuse your code to load i into a register. All arithmetic operations can be performed on this register. Note that arithmetic negation can be realised by subtracting a value from zero. Use Ripes and different initial values of i to test your code.

For loops

The next construction we will study is the for loop. Let's consider the following code snippet:

for (j = 0; j < 10; j++) {
    i = i + 10; // code block # 1
}
// code block #2

At the beginning of the loop, the variable j is initialized with value 0. Next, the loop condition (j < 10) is evaluated. If it is false, we exit the loop (code block 2); otherwise, we execute the content of the loop (code block 1). In the latter case, after one iteration of the loop, the variable j is incremented by 1 (j++) and the loop condition is evaluated again.

An equivalent program using the while construct instead looks like this:

j = 0;
while (j < 10) {
    i = i + 10;  // code block #1
    j++;
}
// code block #2
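The while version can be taken one step further by replacing the loop with a test and gotos, which is close to what a branch-based implementation looks like. This is a sketch with hypothetical function and label names, and it passes i as a parameter instead of reading it from memory.

```c
#include <assert.h>
#include <stdint.h>

/* The for loop rewritten with one conditional and one unconditional jump. */
int32_t loop_with_gotos(int32_t i)
{
    int32_t j = 0;                /* loop variable, kept in a "register" */
loop_test:
    if (j >= 10) goto loop_end;   /* leave the loop when j < 10 is false */
    i = i + 10;                   /* code block #1 */
    j++;
    goto loop_test;               /* jump back to the test */
loop_end:
    return i;                     /* code block #2 would start here */
}

/* Ten iterations add 100 to the initial value. */
void check_loop(void)
{
    assert(loop_with_gotos(0) == 100);
    assert(loop_with_gotos(127) == 227);
}
```

As with the if/then construction, the test that leaves the loop (j >= 10) is the negation of the condition written in the for statement.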

Question 6: write the sequence of assembly instructions that behaves the same as the code block above. You can use a register for the loop variable j; there is no need to store it in memory. Reuse your code to load i from memory and to save it at the end (in code block #2).

Squaring a number

You should now have developed enough assembler skills to implement a function that computes the square of an integer. It uses a simple mathematical fact:

\[ n^2 = \sum_{i=1}^{n} (2i - 1), \quad \text{for}\; n \geq 1 \]

So all we need to do to compute the square is to sum up the first \(n\) odd numbers. In C, this corresponds to the following code:

if (n < 0) {
    // make sure n is positive
    n = -n;
}
s = 0;  // square
k = 1;  // odd number to sum up
for (int i=0; i<n; i++) {
    s = s + k;
    k = k + 2;
}
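To check the algorithm before translating it, the snippet above can be wrapped into a function and compared with known squares. The function name square_by_odd_sum is hypothetical, and the sketch assumes that the result fits into a 32-bit signed integer.

```c
#include <assert.h>
#include <stdint.h>

/* Computes n * n by summing the first |n| odd numbers. */
int32_t square_by_odd_sum(int32_t n)
{
    assert(n != INT32_MIN);   /* -n would overflow for this single value */
    if (n < 0) {
        n = -n;               /* make sure n is positive */
    }
    int32_t s = 0;            /* square */
    int32_t k = 1;            /* odd number to sum up */
    for (int32_t i = 0; i < n; i++) {
        s = s + k;
        k = k + 2;
    }
    return s;
}
```

For example, square_by_odd_sum(42) returns 1764, and square_by_odd_sum(-7) returns 49.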

Question 7: Write an assembler program that computes the square of n, where n is stored at address 0x10000000. The result shall be stored at address 0x10000004. You can use registers for intermediate results and loop variables. You can initialise the memory by adding the following code to the end of your program:

.data
n: .word 42
s: .word 0

Change the value of n to test your code. We do not consider arithmetic overflows, i.e. we assume that \(n^2\) is small enough to fit into a 32-bit signed integer.

Test with Ripes

You can test your answers using Ripes.

Configuration

Launch Ripes.

On the machines in the school's computer labs, Ripes is already installed and can be launched from a terminal:

$ Ripes

Note: do not type $; it represents the prompt, i.e. what the shell displays to invite you to type a command. Also pay attention to uppercase and lowercase letters, as the shell is case-sensitive.

Next, using the first icon Select processor (just below the menu File), select a Single-cycle processor.

Icon Select processor

Select single cycle processor

Write assembly code

In the Editor panel, on the left (below Source code), you can directly write assembly code.

For instance, you can type:

lui  x18, 0x10000
lw   x19, 0(x18)
addi x19, x19, 1
sw   x19, 0(x18)

The next instruction to be executed is highlighted in red. You can execute it using the icon showing a black arrow pointing right (clock the circuit). The values of the registers are shown on the right (GPR), and the registers that have been modified by the previous instruction are highlighted in orange.

Source code panel

See the content of memory

You can see the content of the memory using the Memory panel. The address 0x10000000 used in the text above can be reached by selecting .data in the Go to section list at the bottom of the screen.

Memory panel