Summary

This is the first session in a series of 10 labs covering various aspects of programming in C++, ranging from the most basic examples to more complicated topics. Most of the things covered in the theoretical course notes will be applied in practice during these sessions. For starters, you'll see an introduction to the more practical aspects of the labs: setting up the programming environment, version control, etc.

Practical remarks

The reference platform for all assignments and projects is the GNU C++ compiler (including the installed libraries) on the lab computers, which run a recent version of the Ubuntu Linux distribution. We advise you to install and use your own standalone Linux system for development.

At the time of writing, the most recent GNU C++ compiler is version 13.2.1 and the most recent clang is version 16.0.6. You can check the release notes to find out exactly which new C++ features they support. Mac computers usually have their own version of clang installed (based on the open-source clang); you'll probably have to install the Xcode IDE first. To install the newest GNU C++ compiler on a Mac, you can use MacPorts or Homebrew. To check the version of your C++ compiler, run:

    g++ --version

You can use any decent text editor that suits your fancy to write C++ code. The lab computers include a recent version of CLion for programming, code management, debugging, and related things. The more hardcore terminal users will also find a recent version of vim/neovim installed.

I assume that you know how to work on the command line interface (CLI, a.k.a. the terminal) in Linux / UNIX-like systems! In case you have trouble with that you can find plenty of information online.

Hello world

Let's dive directly into coding. I am sure you missed it!

First attempt

Virtually any programming course starts with the well-known Hello world, so here it is. Open up any text editor and create a text file named main.cpp with these contents:

#include <iostream>

int main() {
    std::cout << "Hello world!" << std::endl;
    return 0;
}

Compile & link this source file on the command line (make sure you're in the right directory):

g++ -o hw main.cpp

Finally, run it (in the same directory) to receive a friendly message:

./hw

Modular code

An important concept in programming is modular design, i.e., nobody writes all their code in one huge file. Therefore, split the previous code into two separate files. The file hello.cpp contains the definition of the function that says hello:

#include <iostream>

void hello() {
    std::cout << "Hello world!" << std::endl;
}

The file main.cpp contains the main entry point of the program (the main function) which calls the hello function defined in the other file and quits afterward:

int main() {
    hello();
    return 0;
}

Compile & link them:

g++ -o hw main.cpp hello.cpp

This won't work. You'll get an error similar to this one:

main.cpp: In function 'int main()':
main.cpp:2:8: error: 'hello' was not declared in this scope

In human terms: "Help, I don't know what hello on line 2 in main.cpp means".

The reason for this error is that in C++ every entity (thing) must be declared before it can be used within a compilation unit (source file). We could solve this problem by adding a line in main.cpp that declares the function hello as in:

// Declares a function called hello with no return value and no arguments
void hello();

int main() {
    // Now we can call hello(), even though it's defined in hello.cpp
    hello();
    return 0;
}

Although this solves the problem in this particular case, it isn't a very clean way to do it: you'd have to repeat the declaration for every entity you use in every compilation unit (and remember to update it everywhere if something changes!).

Header files

The standard solution is to gather declarations of entities in header files and include these using the #include preprocessor directive. In our example, the header file is hello.h:

#ifndef HELLO_H
#define HELLO_H

// Declares a function called hello with no return value and no arguments
void hello();

#endif /* HELLO_H */

The #ifndef block that wraps the contents of the header file is called an include guard. It prevents the same header file from being included more than once in one source file by defining a preprocessor macro (HELLO_H in this case; usually some variation of the header's file name) the first time the file is included. If the same header file is included again (through inclusion of other headers that include this particular one, for instance) the #ifndef directive will just skip over the file's contents, since the macro named HELLO_H is already defined. When used correctly it solves some (not all!) subtle issues with including header files. Read more about circular dependencies.
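To see what the guard buys you, here is a small sketch that simulates a double inclusion inside one single file: the guarded contents of a hypothetical header greeting.h are pasted twice, just like the preprocessor would do if two other headers both included it. The names GREETING_H, Greeting, and default_greeting are made up for this illustration.

```cpp
// First "inclusion" of the hypothetical greeting.h:
// GREETING_H is not defined yet, so the contents are kept.
#ifndef GREETING_H
#define GREETING_H

struct Greeting {
    const char* text;
};

#endif /* GREETING_H */

// Second "inclusion" of the same header:
// GREETING_H is defined by now, so everything up to #endif is skipped.
// Without the guard, Greeting would be defined twice: a compile error.
#ifndef GREETING_H
#define GREETING_H

struct Greeting {
    const char* text;
};

#endif /* GREETING_H */

// Uses the single surviving definition of Greeting.
Greeting default_greeting() {
    return Greeting{"Hello world!"};
}
```

If you delete the two #ifndef / #define / #endif groups and try to compile this, the compiler should complain about a redefinition of Greeting.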

Now include this header file in both hello.cpp:

#include "hello.h"
#include <iostream>

void hello() {
    std::cout << "Hello world!" << std::endl;
}

and main.cpp:

#include "hello.h"

int main() {
    hello();
    return 0;
}

Compiling & linking should work fine now. Note that you don't compile the header files explicitly. The code they contain will be included ("copy-pasted") & compiled in the source files that include them.

g++ -o hw main.cpp hello.cpp

Note that although it's not strictly necessary to include hello.h in hello.cpp, it helps if you have many entities in hello.cpp that depend on each other because the order in which you define them becomes less important. It is also a good way of checking that the definitions in the source file actually match the declarations in the header file.
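A minimal single-file sketch of that last point, where the first declaration stands in for what an #include of the header would paste in. The function answer is hypothetical and only serves as an illustration.

```cpp
// This declaration stands in for the line a header file would provide.
int answer();

// The definition is checked against the declaration above. If you changed it
// to `long answer()`, the compiler would reject it, because two functions
// cannot differ only in their return type.
// Caveat: a definition with a different parameter list would be accepted as a
// separate overload, so this check does not catch every mismatch.
int answer() {
    return 42;
}
```

So including the header catches some mismatches at compile time; the remaining ones typically show up later as "undefined reference" errors during linking.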

Compiling & linking

In our simple example, compiling and linking seems to be an easy step. Unfortunately, that only holds for the very simplest projects. The line g++ -o hw main.cpp hello.cpp recompiles everything, always! Very quickly this approach will become painfully slow and cumbersome.

To make the compilation process more efficient we need to distinguish between different phases during the translation of source code to a working binary program:

  • Compilation: Translation of a compilation unit (one .cpp file with all its included headers; often also called translation unit) to a binary .o (object) file. Roughly speaking, it consists of two things:
    • Preprocessing: Handles the preprocessor directives (e.g. #include, #ifndef, #define etc.) to create an intermediate source file with all headers and macros resolved. Usually, you don't get to see that file unless you specifically ask for it by passing the -E option to g++.
    • Code generation: Translates the preprocessed source of the compilation unit into binary code (.o file), which contains the machine code with placeholders for code from other compilation units. This means, for example, that the file main.o will not contain the machine code of the function hello; it only contains a call to a placeholder named hello, which will be resolved during linking. (In other words: declarations get resolved)
  • Linking: "gluing" object files together into one single executable (the placeholders will be resolved and replaced by real code). Resolving any static or shared libraries is also done in this phase. (In other words: definitions get resolved)
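The phases above can be pointed out in a small sketch. Everything lives in one file here to keep it self-contained; the names GREETING, greeting, and greeting_length are made up for the illustration.

```cpp
// Preprocessing: every later use of GREETING is replaced textually
// by the string literal before the actual compiler sees the code.
#define GREETING "Hello world!"

// Declaration only. This is all the compiler needs to translate the call
// inside greeting_length().
const char* greeting();

// Code generation for this function relies on the declaration above,
// not on the body of greeting().
int greeting_length() {
    int n = 0;
    for (const char* p = greeting(); *p != '\0'; ++p) {
        ++n;
    }
    return n;
}

// Definition. In a multi-file project this could live in another .cpp file,
// and the linker would connect the call above to this code.
const char* greeting() {
    return GREETING;  // after preprocessing: return "Hello world!";
}
```

Running g++ -E on such a file lets you check that GREETING has indeed disappeared from the preprocessed source.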

Until now, you've been doing all that in just one command, but it actually makes sense to separate the two phases.
Compilation is done using the -c option in g++ (for all compilation units separately!):

g++ -c main.cpp
g++ -c hello.cpp

The resulting object files are linked into a binary called hw with:

g++ -o hw main.o hello.o

The useful effect of splitting up the compilation into these two phases is that during development you only need to recompile the sources that have changed since the last compilation. On the other hand, linking needs to be re-done for all object files, but since it's a much faster process the speed gains are tremendous anyway. The only issue that still remains is: how do we know precisely which files need to be recompiled?

Makefiles

Since around 1977 a tool called make has been widely used to automate the build process on UNIX machines. The programmer writes a Makefile like this one:

CXX = g++
CXXFLAGS = -Wall
LDFLAGS =

all: oef1 oef2

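# Note: every recipe line must start with a TAB character, not spaces.
oef1: oef1.o
	$(CXX) $(LDFLAGS) -o $@ $^

oef2: oef2.o
	$(CXX) $(LDFLAGS) -o $@ $^

# all and clean are not files, so tell make to always run them
.PHONY: all clean

clean:
	$(RM) oef1.o oef1 oef2.o oef2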

This specifies, for instance, the C++ compiler to be used (the CXX variable), which compilation flags should be passed (CXXFLAGS), and how the different "targets" should be created or cleaned up. The object files themselves (oef1.o and oef2.o) are produced by GNU make's built-in rule for compiling .cpp files, which uses CXX and CXXFLAGS. To build the default target (all) you just type make on the command line and all is done automatically for you. If you really want to know more, read some GNU make documentation.

Although functional, the above example does not track dependencies between source files and header files. That is, it does not know which source files need to be recompiled when a header file is modified.

CMake

Fortunately, there are alternatives to the old & dusty make, one of which is CMake, "Cross Platform Make". Some arguments in favor of using CMake are:

  • It's widely used, modern, and well-documented.
  • CMake is cross-platform: works on just about any decent OS; for example:
    • On Linux / UNIX CMake will automatically create awesome makefiles with dependency checking and other fancy stuff. After generating these, you just call make.
    • On Windows CMake can generate MSVC project files.
    • On OS X you can tell it to generate Xcode project files. Since OS X is UNIX-compliant it can generate makefiles as well; this is what I'd suggest you should do.
    • Other options include KDevelop and Eclipse project files. CLion's project model is itself based on CMake.
  • CMake is pretty good at finding requested libraries on your system, as you'll experience in a later lab session.
  • It's much easier to set up & maintain than writing complicated makefiles.

Some good reading material to get you started can be found here: CMake FAQ.

A working example

Here's a basic working example of how to set up a small project using CMake to get you started. You can use & extend it to fit your purposes.

All project files are contained in a top-level project directory with the following subdirectories and files:

  • build/ (disposable directory for temporary build files; usually you don't need to browse it that often)
  • CMakeLists.txt (top-level CMake file with project definition)
  • src/ (contains sources and other support files)
    • CMakeLists.txt (build information specific to the sources in this directory)
    • hello.cpp
    • hello.h
    • main.cpp

CMakeLists.txt looks like:

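# CMake 3.13 or newer is assumed here: it allows INSTALL(TARGETS ...) for a
# target (hw) that is defined in a subdirectory.
CMAKE_MINIMUM_REQUIRED(VERSION 3.13)
PROJECT(HelloWorld)
# Enable the common compiler warnings
SET(CMAKE_CXX_FLAGS "-Wall")
# Process src/CMakeLists.txt, which defines the hw executable
ADD_SUBDIRECTORY(src)
# Install the hw executable into the bin/ directory of the install prefix
INSTALL(TARGETS hw DESTINATION bin)
# Unless the user chose an install prefix, default to installed/ in the build directory
IF (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)
  SET (CMAKE_INSTALL_PREFIX "${CMAKE_BINARY_DIR}/installed" CACHE PATH "default install path" FORCE)
ENDIF()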

src/CMakeLists.txt looks like:

SET(SRC hello.cpp main.cpp)
ADD_EXECUTABLE(hw ${SRC})

Now go to the project directory and create the temporary build directory:

mkdir build

Go inside the build directory and call CMake with the path to the directory containing the top-level CMakeLists.txt file. This could be a relative path as in this example:

cd build
cmake ..

Still inside the build directory (notice the bunch of files generated by CMake, including a Makefile), invoke make to build the hw executable:

make

If the build was successful you can install the hw binary to the predefined install location, installed/bin/ in this case, by invoking make with the install target:

make install

Finally, you can launch the compiled executable by running:

./installed/bin/hw

If you change any of the source files, you just type make install to automatically rebuild and reinstall only what's needed. By default, CMake will install programs to the /usr/local/ directory on Linux, which is meant for manually installed system programs, as opposed to programs installed by the package manager, which usually reside directly in /usr/. For our simple program, it's not really necessary to install it as a system program, so we set the default CMAKE_INSTALL_PREFIX to our build/installed folder instead, using the last 3 lines of our top-level CMakeLists.txt. If you want to change where the program is installed, you can always set CMAKE_INSTALL_PREFIX to a different directory by running:

cmake -DCMAKE_INSTALL_PREFIX=/tmp/hello_world ..

Use this instead of the regular cmake .. command. After running make install again, the executable will be installed as /tmp/hello_world/bin/hw.

In case you want to get rid of all the temporary files generated by CMake during the build process, just remove the build directory. This keeps your project organized and your sources clean from any useless junk.

You'll see more advanced uses of CMake (finding & linking to libraries, creating your own libraries, etc.) as we progress through the sessions.

Version control

Sooner rather than later you'll be writing hundreds of lines of code and more, possibly collaborating with other people. Keeping track of changes in your code over its development time will become essential. Version or revision control software allows you to do just that: keep a history of what happened to your source files at what time.

Some of the more important reasons to use some kind of revision control software are:

  • Humans make mistakes: revision control allows you to make backups, revert to previous versions, revise changes made in the past, etc.
  • Straightforward collaboration: both distributed systems and those based on a central repository allow you to work together with other people while keeping track of who did what and when.
  • Experimenting: you can create branches to add experimental features to your code without breaking the production code and work on different versions of the same code base in parallel. These branches can be merged later on.

If you're still not convinced, go have a look at Wiki: Revision Control and About git. The most popular version control program is git, which we will also be using in this course for the exercises and project.

In practice

Important: To manage the project and exercises, you need to create an account on GitHub! You should register via the Education program, so you get access to free private repositories and tons of other advantages (scroll down on the website). It is part of the course, and you will need a UAntwerp-based account to submit assignments (i.e., use your UAntwerp student email address to register an account).

I won't cover all the intricate details of how to work with git; that's up to you to discover (you can find excellent documentation for it online). I will, however, give you a short walkthrough of the typical workflow involved.

Your average daily Git session

Suppose you have a project, just like the one you used in the CMake walkthrough earlier, sitting in a directory called HelloWorld. First, you need to make sure git knows this directory is a repository. Go inside the directory and initialize the repository:

cd HelloWorld/
git init

This will create a hidden .git directory inside HelloWorld (you can check this with ls -la). This .git directory will contain the whole repository. Don't remove it, unless you want to lose the repository information, history, etc.

The repository you created is still empty though (regardless of the files inside HelloWorld). To add the files to version control you type:

git add src/
git add CMakeLists.txt

This will make sure Git tracks all files inside the src/ subdirectory of HelloWorld and the top-level CMakeLists.txt file. The reason for the explicit specification of which files to track is to prevent trashing your repository with temporary binary files and other build artifacts that are not worth saving and can be regenerated automatically anyway.

To check what's staged for the upcoming commit, issue: git status. Mine gives:

# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#   new file:   CMakeLists.txt
#   new file:   src/CMakeLists.txt
#   new file:   src/hello.cpp
#   new file:   src/hello.h
#   new file:   src/main.cpp
#

Nice!

To permanently save the current state of the files listed above in your repository you need to commit your changes. Just do:

git commit -m "Initial commit of the HelloWorld project"

which should give something similar to

[master (root-commit) 2aecdc2] Initial commit of the HelloWorld project
 5 files changed, 46 insertions(+)
 create mode 100644 CMakeLists.txt
 create mode 100644 src/CMakeLists.txt
 create mode 100644 src/hello.cpp
 create mode 100644 src/hello.h
 create mode 100644 src/main.cpp

At this point, you've permanently "recorded" the state of your repository. You can check this with git log. Also, git status will now claim that your working copy is clean; all changes have been committed, and the current files reflect the most recent state of the repository.

Now let's make some random changes to the hello.cpp file, save it, and check the state of the repository again with git status:

# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#   modified:   src/hello.cpp
#
no changes added to commit (use "git add" and/or "git commit -a")

Git knows that you've changed hello.cpp with respect to its most recent state in the repository. You can check what the differences are with git diff:

diff --git a/src/hello.cpp b/src/hello.cpp
index 8e37833..68be36c 100644
--- a/src/hello.cpp
+++ b/src/hello.cpp
@@ -3,6 +3,6 @@
 using namespace std;

 void hello() {
-    cout << "Hello world!" << endl;
+    cout << "Almost time for a short break!" << endl;
 }

If you're happy with this change, you can commit it. You first add the files to be committed (i.e., put them in the staging area) and then commit those staged files. You can also use the -a flag to commit all modified tracked files at once, but this is considered bad practice: your individual commits get messy (when you later want to revert to a certain commit, it can contain a lot of changes that you do not want) and the commit messages get too long. It is important to describe exactly, but briefly, which feature you added to your code base, so that future you (or your colleagues) will know what the commit contains:

git add src/hello.cpp
git commit -m "Changed the output message."

which reassures me all is fine with this message:

[master ab94d0d] Changed the output message.
 1 file changed, 1 insertion(+), 1 deletion(-)

git log will now show two commits in history.

Summary

You've used a bunch of the most common Git commands. There are some more that are useful in a typical workflow, for example when working with a remote repository like one on GitHub. Here's a short summary of the commands you'll be using most frequently:

Git commands

Command   Description
init      Initializes a new repository
add       Adds/stages files for the upcoming commit
commit    Commits (saves) staged changes to the repository (or all changes to tracked files, if you use the -a flag)
log       Lists all commits in reverse chronological order (most recent first)
status    Shows the state of the working copy w.r.t. the repository
clone     Makes a clone of a repository for local use. This is the most common way of getting access to someone else's repository
push      If you're working with a remote repository (like a repository on GitHub for instance) this will sync the remote repository with the commits you've made locally
pull      Just like push but the other way around: syncs your repository with the remote

Most of these commands have a wide variety of possible options and parameters. Consult the online documentation or the Git manual (man git) for more information. If you are hungry for even more, you can read Pro Git.

Coding conventions

Important: Make sure you write code that you'll understand 6 months from now: code for readability and consistency! In order to do so, try to follow the coding conventions from the course notes.

Code formatting

Code formatting is a first step towards readable code. A well-known command line code formatter is clang-format. An example is:

clang-format -i -style=file main.cpp

The -style=file option tells clang-format to read the formatting rules from a .clang-format file, which it looks for in the directory of the source file and, if it is not there, in its parent directories. The .clang-format file we use in this course can be downloaded here. The file specifies the formatting instructions. The command above will reformat the code according to those specifications.

The -i option tells clang-format to apply the changes in place, i.e., to overwrite the file. If you drop this option, clang-format leaves the file untouched and only prints the reformatted code to the terminal.

An advantage of this command line tool is that you can auto-format your code every time you commit/push it to your Git repository, by using so-called Git hooks: look here. This way you only have nicely formatted code in your code repository!

In this course, all code needs to be formatted according to this .clang-format file, so using clang-format is required.

Tabs vs. Spaces

Documentation

Writing comments in your code is essential. It will make you happy during code maintenance and make other people's lives easier if they ever need to read or use your code.

Usually, programmers tend to avoid writing comments until the moment they have to deliver their project. Avoid this and learn to comment your code as you write it. This will save you a lot of time later in the project (it is not fun having to focus on writing documentation when a deadline is approaching) and it helps you to understand your code when you are extending it.

There's a well-known tool, Doxygen, that can auto-generate documentation from your C++ code and the comments you write. We strongly advise you to write special Doxygen comment blocks, especially in your project. To use Doxygen, the first step is to go inside your source directory and generate a Doxygen config file:

doxygen -g <config_file>

You can edit the generated config file as much as you want. Then run:

doxygen <config_file>

to run Doxygen based on the config file. You will see (at least) an html directory appear which contains your generated documentation.
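To give you an idea of what such a special comment block can look like, here is a minimal sketch. The function make_greeting is made up for this example; @file, @brief, @param, @return, and @p are standard Doxygen commands. Whether a free function like this one shows up in the output also depends on your config file; the file-level comment at the top is there because Doxygen normally wants the file itself to be documented before it lists global functions.

```cpp
/**
 * @file
 * @brief Helpers for building greeting messages (hypothetical example file).
 */

#include <string>

/**
 * @brief Builds a greeting for the given name.
 *
 * @param name The name of the person to greet; may be empty.
 * @return A greeting of the form "Hello Alice!", or "Hello world!" if
 *         @p name is empty.
 */
std::string make_greeting(const std::string& name) {
    if (name.empty()) {
        return "Hello world!";
    }
    return "Hello " + name + "!";
}
```

Note the double asterisk in /** ... */: a regular /* ... */ comment is ignored by Doxygen.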

Take a look at some Doxygen-generated documentation examples here.

Code inspection

Besides being well formatted and documented, nicely written source code should also avoid typical programming errors, issues, style violations, and so on. While a lot of them will not result in compile errors, they are considered bad practice and can cause surprising run-time effects, so they should be avoided. Luckily, there is a tool for that as well: clang-tidy, part of the LLVM framework. It is an extensible framework for diagnosing such typical programming issues.

clang-tidy can run the checks of the Clang Static Analyzer, a code analysis tool (also part of LLVM) that finds bugs in C++ (or C) programs and that can also be used as a standalone tool. Next to those checks, it also supports groups of other checks (e.g., checks related to Boost, Android, etc.) that can be enabled individually. Check the website for more information.
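As an illustration of the kind of issue meant here, the sketch below shows two functions that compile without errors and behave identically. The first one uses older idioms of the sort that checks such as modernize-use-nullptr and modernize-loop-convert are designed to flag when they are enabled; the second one is the style those checks steer you towards. The function names are made up, and which diagnostics you actually get depends on your clang-tidy version and on the checks you enable.

```cpp
#include <cstddef>
#include <vector>

// Older style: compiles fine, but uses NULL and an index-based loop.
int sum_legacy(const std::vector<int>* values) {
    if (values == NULL) {
        return 0;
    }
    const std::vector<int>& v = *values;
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Modern style: nullptr and a range-based for loop.
int sum_modern(const std::vector<int>* values) {
    if (values == nullptr) {
        return 0;
    }
    int total = 0;
    for (int value : *values) {
        total += value;
    }
    return total;
}
```

Neither version is wrong as such, which is exactly the point: the compiler stays silent, and it takes a tool like clang-tidy to point out the difference.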

The CLion IDE has a similar code inspection built into its interface: Code > Inspect Code. It detects language and runtime errors and suggests corrections and improvements. You can see all the checks it performs in Preferences > Editor > Inspections. As you can see there, clang-tidy is included as well, meaning that when you run the code inspection provided by CLion, you also run clang-tidy!

Check these pages for more information:

Exercises

"Let's see what you remember from last year"

You will only be able to accept an assignment with your UAntwerp-based GitHub account. Once you've accepted an assignment, it will be added to your account as a private repository, to which you will also have to commit your solution to the assignment. The deadline of an exercise will typically be the day before the next lab session. You are expected to submit these exercises, but they will not count towards your grade for this course. They are meant to make sure that you digest the material for each session in a timely manner, because near the end of the semester you will be too busy with the project and the exam to also study all the lab sessions. We will give the solution at the beginning of the next lab session.

Fix My Code

Go to the assignment: https://classroom.github.com/a/2eaEjuwA

ASCII Table

Go to the assignment: https://classroom.github.com/a/XyqVPTpk

Factorials

Go to the assignment: https://classroom.github.com/a/ZQZg7LNp

Fibonacci

Go to the assignment: https://classroom.github.com/a/LkS-IY5w