Crash-course in C for super(computer)students

My first C-program

Type the following text into your favourite editor (preferably emacs, which you can start by typing emacs &), and save it under the filename "helloworld.c".


How to run

After saving the program as "helloworld.c", tell the computer to compile it into a file that can be executed on the computer you are working on.

gridur$ cc -o helloworld helloworld.c
gridur$ helloworld
Hello world
On gridur, the compiler is called cc, while on most other machines gcc is installed (the free compiler made by the GNU project). The -o option to gcc/cc means that the argument following -o is the name of the executable file you are generating. The last argument is the name of the C source file. If you omit "-o helloworld", your code will be placed in a file called a.out. If the shell cannot find the command helloworld, the current directory is probably not in your search path; type ./helloworld instead.

Digression: Portability

The generated (executable) file "helloworld" can only be run on computers similar to "gridur". The clients you are (physically) sitting by run another operating system, and cannot run the file helloworld. What you need to do is to compile your source code helloworld.c on each new computer system. You can always run the generated file on the system you compiled it on.

How it worked

The first line, #include <stdio.h>, tells the compiler to include the declarations of some standard input/output functions (stdio = STanDard Input Output). Among these functions is the printf() function you used.

Next is the declaration of the function main(). A function declaration starts with a specification of its "return"-type. The function main() should always have a return-type of int. main is always the starting point of a C program (that is, where execution starts when the compiled program runs; it does not have to be the first function in your source file).

The contents of the main() function are enclosed by a pair of curly braces { and }. Inside them is a call to the stdio function printf(). This function takes a string as an argument (later we will see many more arguments). The string ends with \n, a newline (check the output without it).

The return 0 ends the function, with an argument of zero. A return value of 0 from the main() function tells the operating system that your program exited successfully. There are no parentheses around the argument because return is a C keyword, not a C function.

Every statement in C, such as a function call or an assignment, ends with a semicolon. Forget it, and the compiler (gcc/cc) will complain.

Data types

There are five basic data types in C which we will cater for here.

A variable is declared in your program with a line such as int count;.

Try this program to calculate the circumference of a circle:


Variables declared outside our main() function are global to the program. (Having global variables is usually a bad programming habit.)

A new way of using printf() is introduced. The symbol %f tells printf to replace the symbol by a float, which follows as the next argument after the first string.

For more info on printf, see its manual page (for example by typing man 3 printf on most Unix-like systems).

Constants like pi should really not be declared as variables in the way used here, because that can be slower and use more storage than necessary. The "correct" way is to insert the line

#define PI 3.14159
just after the #include statements, and replace pi in your code by PI (and don't declare float pi = ...). This is called a macro in C, and macros can be much more complex than this one. Before the actual compilation starts, the preprocessor replaces each macro in your code with its contents.

Getting input from user

Instead of specifying the diameter of your circle at compile-time, you may want to input the diameter while the program is running. Insert the following segment into your circumference-code

  printf("Input the diameter of a circle: ");
  scanf("%f", &diameter);
and modify the other necessary bits. The scanf() function works almost like printf(), but the other way around: it accepts formatted strings from the keyboard (or rather standard input) instead of printing them to standard output. The ampersand in &diameter means that you supply the address of diameter, not the diameter itself, so that scanf() knows where to put the value it reads.

Allocating an array

Next we will try to allocate an array in which we can put usable data.

The array will be of length N. N is a symbol we keep for the length of the array. In C it is called a macro, because we define the symbol with a #define command at the start.

A new #include statement is needed because we need more standard functions. This time we need random numbers, and rand() is declared in <stdlib.h>.

A for-loop is used to traverse the array and fill each value with a number. The rand()-function returns an int, but we would like a float because that is the datatype for our array. We therefore do a type-cast to our wanted datatype. This is done like this: float iAmRandom = (float)rand();

After filling the array, the array is printed using printf again. Now two extra arguments are passed to printf. The first one is the integer i, which we would like to print as an integer using the symbol %d, and the next one is the (float) value in the array at position i. This is printed with the %f symbol as before.


Matrices

For matrices you can substitute myArray[N] with myMatrix[N][N] and you will get an N × N matrix. You will need an extra for-loop inside the other one, like this:


The printing of matrices is more tricky. A somewhat standard way to print matrices is like this:

where each element is separated by a tabulator space (\t) and each row is separated by a newline.

Exercises

Pointers

What is the symbol myArray really? You have used myArray[i] to fetch the contents of the i'th element of the array myArray. But myArray means something on its own: in most expressions it acts as a pointer to the first element of the array, that is, an address in the computer's memory. Every memory cell has its own address, and you use symbols instead of raw numbers for your convenience.

myArray[i] actually tells the computer to take the address held in myArray and move i elements forward from it (the compiler multiplies i by the size of one element). The contents of the resulting address are the contents of the i'th array element.

For matrices this is even more tricky. When you declare myMatrix[10][10] you actually declare an array of 10 arrays, each holding 10 elements, and the rows are stored one after another in memory. No separate row pointers are stored for a matrix declared this way; instead, myMatrix[i] acts as a pointer to the start of row i. So your matrix is really an array of arrays.

Functions

As your program grows, various tasks will eventually be repeated. Every time you want to print a matrix, you shouldn't need to replicate the above code.

The correct place for the matrix-printing code is in a function, which you can call from wherever you want in your program.

You have already written a function called main(). Now we will write various functions. A function consists of a return-type, a name for the function with an argument list, and then a function body enclosed in curly braces { and }.

Allocating memory

You will eventually need to allocate memory as workspace as your program runs. Up until now, we have declared memory at compile-time, but what if you don't know at compile-time how big your matrices are going to be?

The answer is to use malloc() which stands for memory allocation.

To allocate a vector or array of doubles at run-time, you could make a function like this:


When you are finished with the memory you have allocated with a = malloc(), it should be freed by calling free(a); otherwise the memory will be unavailable to other programs and to your own malloc calls until your program finishes.

Exercise: What happens if you try to allocate a vector with a size of 10 GB? (Each element takes sizeof(double) bytes, which is 8 bytes on most systems.)

Allocating memory for matrices

Matrices are again more tricky to allocate. A function which you should not use before you understand it is:


This allocation can be done in many ways. You don't have to have the whole matrix allocated in a single memory block. The rows can be scattered around the memory if you want, but that's usually not that smart in a single-processor program.

Printing matrices

Now we will write a function called printMatrix(). First you need to decide the return-type you need. A printing function probably does not need to return anything, so let's use void. Next come the input arguments. We need a matrix to print, and we need to know the dimensions of the matrix. Let's assume a square matrix for simplicity.

void printMatrix(double **aMatrix, int dimension)

The two stars in front of aMatrix tell the compiler that this is not a value passed in (as for dimension), but a pointer to pointers (one star would indicate that a pointer is incoming).

The full code of the function is now:


This code should be placed above the definition of the main() function in your source file.

Exercise: Extend the function to be able to print rectangular matrices.

Getting input from the command line

Quite commonly you will want to vary, for example, the system size in your simulation. Calling your program with some number as an argument on the command line is an easy way to accomplish that. You might have wondered what the symbols inside the parentheses in int main(int argc, char **argv) are; now it is time to use them.

int argc is an integer counter of how many arguments your program was called with from the command line. The program name itself normally counts as the first argument, so argc is usually at least 1. You typically first check this integer to see if the user has supplied anything at all.

char **argv is an array of strings holding the content of each argument. (C stores each string as an array of characters, so argv is really an array of character arrays, much like a matrix of characters.)

Bear in mind that input arguments are stored as strings (in ASCII representation), and that is a fundamentally different way of storing numbers than having them available in a numeric variable. You will therefore have to convert any numbers into integers (or floats) if you are going to use them. We show an example:


Testing of code:
$ gcc -o inputexample inputexample.c
$ inputexample
Usage: inputexample <n>
where <n> is an integer.
$ inputexample 4
You gave me n=4, n squared is 16.
