Grace Hopper developed the first compiler for a computer programming language.

TA Objectives:

Compile and run C code on a Linux environment.
Show how to step debug in vscode for part 1.
Show how to print pointers in C.
Walk through argc and argv in part 2.

Be sure to read the tutorials page to set up your environment and set up vscode before starting this lab.

Warmup

Using gdb to track a segfaulting program

Ask ChatGPT or TA

What do s, n and b do in gdb?
Show me how to use s, n and b in gdb.
How to print variable values in C?
How to print pointers in hex in C?
How to set up launch.json in VSCODE for debugging C programs? Read here

1. Compile program with appropriate debugging flag(s)

Assume you have the following program, segfault.c, that is segfaulting (you can download it here):

#include <stdio.h>

  int main () {
    int foo[5], n;

    memset((char *)0x0, 1, 100);

    printf (" Initial value of n is %d \n", n);
    return 0;
  }

To use GDB, first compile your program with the -g option of cc or gcc. By default, all CMPT-295 builds include this debugging flag. For example:

$  gcc -g -o segfault segfault.c

2. Open GDB (within editor for added functionality)

# First try cgdb, if it is not found then try module command below
$ cgdb segfault
# If logged into CSIL, you need to do this every time you log in
$ module load cmpt295/cgdb

To investigate why the program is crashing, run the program first using the run command. Then, you could try the where command. It will show you a stack trace, with the source line number where each function in the stack was.

Note: The run command will cause your source code to be loaded in a second window, one of the many advantages of using cgdb.

In this simple case, it is obvious that the program crashed on the memset() function call.

3. Where is the crash ?

In real-world programs with more complicated code, the where command may not be enough to isolate the problem, and you will need to debug at a more detailed level. For that, set breakpoints at the spots where you think the problem might be. In the example shown below, breakpoints are set at line numbers 3, 4, and 6. Then the program is run once again with the run command, and the next command steps from one breakpoint to the following one:

  (gdb) break 3
  Breakpoint 1 at 0x80483a8: file segfault.c, line 3.
  (gdb) break 4
  Breakpoint 2 at 0x80483b8: file segfault.c, line 4.
  (gdb) break 6
  Breakpoint 3 at 0x80483b8: file segfault.c, line 6.
  (gdb)
  (gdb) run
  The program being debugged has been started already.
  Start it from the beginning? (y or n) y

  Starting program: segfault

  Breakpoint 1, main () at segfault.c:3
  (gdb) next

  Breakpoint 2, main () at segfault.c:6
  (gdb) next

  Program received signal SIGSEGV, Segmentation fault.
  0x009d4b47 in memset () from /lib/tls/libc.so.6
  (gdb)

Because the example is so simple, stepping confirms what the stack trace already showed: the call to memset() is what causes the segfault.

4. Exit the debugger

Once you find the bug, you might want to kill the program (which crashed halfway through) using the kill command. The help command lists the other available commands, and quit exits GDB:

  (gdb) help
  List of classes of commands:

  aliases -- Aliases of other commands
  . . .

  Type "help" followed by a class name for a list of commands in that class.
  Type "help" followed by command name for full documentation.
  Command name abbreviations are allowed if unambiguous.
  (gdb)
  (gdb) kill
  Kill the program being debugged? (y or n) y
  (gdb)

Part 1

Debugging code
Editing Code
Compiling Code

Debugging code

# This step will only work if you followed the module instructions
$ module load cmpt295/cgdb # See tutorials page on modules.
# OR if modules did not work. You can hardcode the path
# Add this to your ~/.bashrc if you do not want to do this 
# on each login
$ export PATH=/usr/shared/CMPT/courses/cmpt295/cgdb/bin:$PATH

Editing Code

Open lab1.c in Visual Studio Code (VSCODE). Knowing how to use an editor is a prerequisite for 295; if unsure, watch here or refer to the prereqs page. The lab1.c file includes comments explaining C programming basics. This lab has five parts. Take some time to review the printf() function, which outputs formatted messages to the console.

Compiling Code

The source file lab1.c won't run on its own; you'll need a compiler, specifically the GNU C compiler, to create an executable. The GNU C compiler is available on the instructional Linux machines in the lab. Open a terminal and execute gcc -v. We see:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-7/root/usr/libexec/gcc/x86_64-redhat-linux/7/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/opt/rh/devtoolset-7/root/usr --mandir=/opt/rh/devtoolset-7/root/usr/share/man --infodir=/opt/rh/devtoolset-7/root/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-plugin --with-linker-hash-style=gnu --enable-initfini-array --with-default-libstdcxx-abi=gcc4-compatible --with-isl=/builddir/build/BUILD/gcc-7.3.1-20180303/obj-x86_64-redhat-linux/isl-install --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 7.3.1 20180303 (Red Hat 7.3.1-5) (GCC)

The output lists the configuration options of our GCC installation as well as its version number. Assuming that you have saved lab1.c somewhere on your machine, navigate to that directory and then use GCC to compile it with the following command:

$ gcc -g -Wall -std=c99 -o lab1.bin lab1.c
$ ls
lab1.bin  lab1.c

-g tells the compiler to include debugging symbols; needed for using gdb to debug your code.
-Wall says to print warnings for all types of potential problems
-std=c99 says to use the C99 standard (now only 19 years old!)
-o lab1.bin instructs the compiler to output the executable code to a file named lab1.bin
lab1.c is the source file being compiled.

The lab1.bin file is an executable file, which you can run using the command ./lab1.bin. You should see:

$ ./lab1.bin
Usage: ./lab1.bin <num>

In this case, the executable lab1.bin is expecting a command-line argument, which is text that is provided to the executable from the command-line when the program is run. In particular, lab1.bin wants a number from 1 to 5, corresponding to which part of the lab code you want to run. See main() in lab1.c for more details. For example (your values of p and q may differ):

$ ./lab1.bin 1
*** LAB 1 PART 1 ***
x = 295
y = 410
p = 0x7fffaec6a2ec
q = 0x7fffaec6a2e8
x & x = 295

BE AWARE: Every time you want to test a code modification, you will need to use the gcc -g -Wall -std=c99 -o lab1.bin lab1.c command to produce an updated lab1.bin executable file (Tip: Use the up and down keys to scroll through previous terminal commands you've executed OR you can chain commands together using the && operator).

Part 2 Debugging in VSCODE

(Highly Recommended)

To debug linked list programs in Visual Studio Code, use the built-in debugger. A preconfigured launch setup is provided in the part2 folder of the repository; use it as a template for other assignments. Watch the video below to see how it simplifies debugging (or refer to VS Code debugger help for more guidance).

$ ls part2/.vscode
launch.json tasks.json

The .vscode folder contains two files:

launch.json: Defines the debugging configuration for the linked list programs.
tasks.json: Configures building and cleaning the project using a Makefile.

The figure explains the contents of these files. If you want more details on each field, see here: VScode jsons. Also watch the video below (or ask the TA) to see how to use the debugger in vscode.

Part 3

Ask ChatGPT or TA

How to print pointers in hex ?

You will now debug a C program that implements a simple linked list. The lab includes two driver files: driver1.c and driver2.c. Both use linkedlist.c, which contains a dynamically allocated linked list with two bugs.

Compile and run the code:

$ cd part2
$ gcc -g -Wall driver1.c linkedlist.c -o listtest
$ ./listtest 

It ends with a segmentation fault. Use GDB to find and fix the bugs in linkedlist.c so the output becomes:

$ ./listtest
Test OK.

Talk to the TA and/or watch lab video.

You are encouraged to discuss GDB with other students. However, you should try to find the bugs on your own, and let others using the same driver find them on their own as well.

gdb scripting

Read the gdb tutorial

There are two main ways to debug code containing linked lists in GDB. You can manually go through and print each node in the linked list (which can become tedious), or you can write a script to do it for you. In this example, we will learn how to write a GDB script to traverse the linked list given in part3/ of the repo.

$ gcc -g -o linked_list linked_list.c # compile
$ ./linked_list # Test Run

Place the following GDB commands in a .gdbinit file in the same folder as the linked_list program:

# Gdb init file for linked list debugging
# $PWD/.gdbinit
define p_generic_list
  set var $n = $arg0
  while $n
    print *($n)
    set var $n = $n->next
  end
end

document p_generic_list
        p_generic_list LIST_HEAD_POINTER
        Print all the fields of the nodes in the linked list pointed to by LIST_HEAD_POINTER. Assumes there is a next field in the struct.
end



define indentby
    printf "\n"
    set $i_$arg0 = $arg0
    while $i_$arg0 > 10
        set $i_$arg0 = $i_$arg0 - 1
        printf "%c", ' '
    end
end

Place the following in the ~/.gdbinit file (~/ is your home directory) to give GDB permission to auto-load the local .gdbinit, then compile and start the debugger:

    set auto-load safe-path /

    $ gcc -g -o linked_list linked_list.c
    $ gdb -q ./linked_list
    (gdb) br 18
    Breakpoint 1 at 0x40061c: file linked_list.c, line 18.
    (gdb) r
    Breakpoint 1, main () at linked_list.c:18
    18          print_list(list1);
    (gdb) p_generic_list list1
    $1 = {data = 0, next = 0x602030}
    $2 = {data = 1, next = 0x602050}
    $3 = {data = 2, next = 0x602070}
    $4 = {data = 3, next = 0x602090}
    $5 = {data = 4, next = 0x6020b0}
    $6 = {data = 5, next = 0x6020d0}
    $7 = {data = 6, next = 0x0}
    (gdb)

Script details

We want our script to traverse the list given in the first argument to the command, $arg0. So we create a convenience variable to store our current pointer and set it to $arg0:

set var $n = $arg0

We might want to change our node structure later on, so we create a command that will print out every field in the node, regardless of what it is. We do this by using the gdb command print on the dereferenced node (we could use printf, but we would have to specify each field individually):

print *($n)

The only part of the linked list node that we assume will always be present is the next pointer, which we use to move through the list in the line:

set var $n = $n->next

When our p_generic_list command is run in gdb, it prints out all of the nodes until it reaches a NULL next pointer.

Challenge:

Now modify the p_generic_list command to print a node only if its data == 5 (you can read the gdb tutorial on how to write if statements, or you can ask ChatGPT).

Part 4

Ask ChatGPT or TA

How to print the argv[] array?
Show example of how to use argc and argv[] array in C program.

A running program accesses these additional parameters through the parameters of the function main. Here argc and argv mean argument count and argument vector, respectively. The first, argc, is the number of arguments passed plus one, to include the name of the program that was executed to get the process running. Thus, argc is always greater than zero and argv[0] is the name of the executable (including the path) that was run to begin this process. For example,

#include <stdio.h>

int main(int argc, char *argv[]) {
  printf("argv[0]: %s\n", argv[0]);
  return 0;
}

First compile and run it so that the executable name is a.out. Then, compile it again and run it so the executable name is arg:

$ gcc argument0.c
$ ./a.out
argv[0]: ./a.out
$ gcc -o arg argument0.c
$ ./arg
argv[0]: ./arg

If additional command-line arguments are passed, the string of all characters is parsed and separated into substrings based on a few rules. If all the characters are letters, numbers or spaces, the shell separates them on spaces and assigns argv[1] the address of the first substring, argv[2] the address of the second, and so on. The following program prints all the arguments:

#include <stdio.h>

int main(int argc, char *argv[]) {
  int i;

  printf("argc:     %d\n", argc);
  printf("argv[0]:  %s\n", argv[0]);

  if (argc == 1) {
    printf("No arguments were passed.\n");
  } else {
    printf("Arguments:\n");

    for (i = 1; i < argc; ++i) {
      printf("  %d. %s\n", i, argv[i]);
    }
  }

  return 0;
}

Here we execute this program with one and then four command-line arguments:

$ gcc argument.c
$ ./a.out first
argc:     2
argv[0]:  ./a.out
Arguments:
  1. first
$ ./a.out first second third fourth
argc:     5
argv[0]:  ./a.out
Arguments:
  1. first
  2. second
  3. third
  4. fourth