These are like Isaac Asimov's 4 laws of robotics:
0:
programs must work properly1:
programs must be readable, provided this does not conflict
with the previous law2:
programs must be compact, provided this does not conflict
with the previous laws3:
programs must be efficient, provided this does not conflict
with the previous lawsA statement in a function tells the computer to do something
/* Find the grade for a mark. */ #include <stdio.h> int grade(int mark) { int grade; if (mark >= 70) grade = 1; else if (mark >= 50) grade = 2; else if (mark >= 40) grade = 3; else grade = 4; return grade; } int main() { printf("My grade is %d\n", grade(66)); return 0; }
To compile and run in a terminal window:
$ clang -std=c11 -Wall grade.c -o grade $ ./grade 2 $
Each name has a limited scope
int grade(int mark) { int grade; ... }
The scope of the local integer variable grade
is the function
body, between the curly brackets
It temporarily hides the global grade
function
Things can be declared first, then defined later
int grade; ... grade = 1;
The statement int grade;
declares the variable without defining
it
The assignment grade = 1;
defines it later
You can also declare functions before defining them
int grade(int mark); int main() { ... } int grade(int mark) { ... }
This is to tell the compiler about functions defined (a) later or (b) in other modules (often via header files)
These declarations are called signatures or prototypes
Simple decisions can be made using if
and else
if (mark >= 70) grade = 1; else if (mark >= 50) grade = 2; else if (mark >= 40) grade = 3; else grade = 4;
/* Sort two numbers. */ #include <stdio.h> // Sort two numbers into ascending order and print int main() { int a = 42, b = 21; if (a > b) { int save = a; a = b; b = save; } printf("%d %d\n", a, b); return 0; }
With no arrays yet, it is difficult to sort lots of things, or separate the calculation from the input/output
An if
statement can be followed by a block
if (a > b) { int save = a; a = b; b = save; }
else
is optional - the default is 'else do nothing'
A block is a sequence of statements between curly brackets, the same as a function body
The scope of save
is the block - it doesn't exist outside
Use %d
in printf
to print out ints
printf("%d %d\n", a, b);
An int
variable in C is a 'box' with an integer in it
int a = 42; ... a = b;
A variable can be re-used by putting different numbers in the box
Swapping needs a third variable because after a = b
the old
number in a
has been forgotten
In C, you often see this:
n = n + 1;
As an equation, it is unsolvable, as an executable definition, it defines
n
as an infinite loop
But in C, the =
operator means "becomes"
So this means "get the number out of the box called n
, add one
to it, and put the result back in the box n
, throwing away the
old number"
In other words, increase n
by one, also written as
n++
An important skill is to be able to trace execution in your head, or on paper, or with a tool like gdb
a b save 42 21 ? int save = a; 42 21 42 a = b; 21 21 42 b = save; 21 42 42
Use gdb only in emergencies, otherwise it soaks up too much time - judicious printfs are usually better
If you are writing a chess program, the constant 8 is likely to appear all over the place in your program
... for (i=0; i<8; i++) ...
Even though you will probably never want to change it, it makes programs more readable to give it a name
const int size = 8; ... for (i=0; i<size; i++) ...
const
doesn't mean constant, just "check that the variable is
never explicitly updated"
It is possible to define a constant outside of any functions, so that it is available everywhere:
#include ... const int size = 8; int doSomething() { ... size ... } int main() { ... size ... }
This is not as bad as global variables (which we are going to avoid) but it is restrictive (e.g. prevents multiple boards of different sizes) and unnecessary (as we will see when we reach structures)
It is possible to declare arguments as const
:
int grade(const char mark[]) { ... }
This only applies to pointer and array arguments where the function could make changes to the original
It documents the fact that the function will not in fact change the argument, and the compiler checks
Doing this is optional (especially if it causes restrictions elsewhere)
Technically, the keyword const
comes after the thing you
want to be readonly (except that it can come before if it is the first thing)
So some programmers always put const
after, e.g.
int grade(char const mark[]) { ... }
In more complex situations, it matters (e.g.
char const *
means
pointer to read-only characters, char * const
means read-only pointer
to characters, char const * const
means read-only pointer to read-only
characters)
Another way of defining constants is enumeration:
enum { First, Second, Third, Fail };
This is (almost) the same as:
const int First=0, Second=1, Third=2, Fail=3;
These are integer constants, and enum
is most often used where
it doesn't matter what the constants are
It is very common (but not necessary) to use capital first letters, or all-capitals, for the names of constants
A grade program might contain:
... if (mark >= 70) grade = First; ... if (grade == First) printf("First"); ...
The constant First
has been used as a code for a
concept, without ever caring what its value is
You can set the values of some or all of the constants:
enum { First=1, Second=2, Third=3, Fail=4 };
Or, you can write code which takes advantage of the fact that the constants are successive integers
enum { Mon, Tue, Wed, Thu, Fri, Sat, Sun }; int next(int day) { if (day == Sun) day = Mon; else day = day + 1; return day; }
switch (day) { case Mon: ... ; break; case Tue: ... ; break; ... case Sat: case Sun: ... break; default: ... ; break; }
A switch is a direct jump, more efficient than sequential tests
Beware: (a) don't forget the break
('fall
through') (b) no agreement on indenting (c) can make functions big: consider
one line per case, maybe a function call
One difference between enum
and const
is that an
enum
is always an int
(or something compatible like
char
) whereas a const
can be any type
Another difference is that an enum
can be used in a switch
statement, but a const
int
can't
That's because const
means readonly, not constant, e.g. a
temperature reading from a device might be declared const
, to
prevent the program from updating it
Sometimes, enum
is used to define isolated constants,
not sequences:
enum { size=8 }; enum { Width=80, Height=24 };
The advantage over using const
is that the compiler
knows that they are truly constant, and can optimize better or
allow them to be used in some special contexts
You can pretend that an enumeration is a different type:
enum Grade { First, Second, Third, Fail }; ... enum Grade grade = First;
The type enum Grade
is just a synonym for int
,
but the code may be more readable, by making intentions clearer
A while loop allows code to be repeated: it is basically a conditional backward jump in the code
/* Print a countdown. */ #define _POSIX_C_SOURCE 200809L #include <unistd.h> #include <stdio.h> int main() { int t = 10; while (t >= 0) { sleep(1); printf("%d\n", t); t = t - 1; } return 0; }
C claims to be portable, not platform independent
To go beyond the minimal standard libraries, you need to find libraries which are available across platforms
For coursework, don't use platform specific libraries,
e.g. #include <windows.h>
You can use the Posix cross-platform libraries, e.g.
#define
_POSIX_C_SOURCE 200809L
#include <unistd.h>
There are increment and decrement abbreviations:
n++; means n = n + 1;
++n; means n = n + 1;
n--; means n = n - 1;
--n; means n = n - 1;
m = n++; means m = n; n++;
m = ++n; means n++; m = n;
n = n++; is a bug
Use ++
sparingly, and avoid the bug n = n++;
Here's a square root function
// Find square roots (like sqrt). double root(double x) { double r = x / 2.0; double eps = 1e-15; while (fabs(r - x/r) > eps) { r = (r + x/r) / 2; } return r; }
The double
type is usually used for floating point
It is stored in 8 bytes, has about 15 decimal significant digits of precision, and has a range of about ±10±308
When compactness is needed and precision/range requirements are low,
e.g. graphics, you can use float
A double
is not exact, even 0.1
can't be stored exactly -
try printf("%.18f\n", 0.1);
Errors accumulate at an average rate of sqrt(n)
for n
operations, when there is no bias (rounding alternates)
The algorithm is essentially Newton's:
r = (r + x/r) / 2;
If r
is less than the real root, then x/r
is
greater, and vice versa, so the average is a better approximation, and the gap
tells you how close you are
It is easy to understand, convergence is rapid (faster than halving the gap) but libraries use even faster special-purpose techniques
One edge case is when x = 4.0
Then the initial guess is exactly correct
A while loop is repeated 0
times if the test starts false:
r = 2; while (r != 2) { ... }
This is almost always what you want, and minimizes edge cases
The countdown loop could be rewritten like this:
int t; for (t = 10; t >= 0; t--) { ... }
It is completely equivalent to
int t = 10; while (t >= 0) { ... t--; }
It gathers the three pieces into one place
Here is another variation, and its equivalent:
for (int t = 10; t >= 0; t--) { ... }
{ int t = 10; while (t >= 0) { ... t--; } }
The outer block limits the scope of t
Because of their logical complexity, you should only use for loops in a few familiar stylised special cases:
for (int i = 0; i < n; i++) ... for (int i = n-1; i >= 0; i--) ... for (item *p = list; p != NULL; p = p->next) ...
The last example, scanning a linked list, we'll see later
If your situation isn't a simple one like these, it is probably better to use a while loop
Sometimes you want to evaluate an expression (with side effects) and there is no result or you don't need it:
n++; printResults();
The first is an increment, where you don't need the value of n
for anything
The second is a function call, where nothing is returned, or you don't need the returned value
There is a do..while
loop, with the test at the end, where the
loop is always executed at least once
It's visually and semantically confusing - don't use it
There is a goto
statement (left over from long ago) - don't use it
Kernighan (co-inventor of C) said debugging is twice as hard as writing the code in the
first place - therefore, if you write the code as cleverly as possible, you are
not smart enough to debug it
So it is important to avoid writing functions which are 'too clever'
A good strategy is to write a function as if you are going to have to prove that it is correct
Functions which are logically simpler are usually also intuitively simpler, and more likely to be correct
The more your code jumps about, the harder it is to debug
So it pays to make the jumps as controlled as possible
Function calls, if statements, and loops are the most controlled statements
The statements in the next few slides should be used 'sparingly', i.e. not at all, or restricted to a few familiar stylised special cases
The return
statement doesn't have to be at the end
int abs(int n) { if (n >= 0) return n; return -n; }
No else
needed here - if n>=0
, execution
returns from the function before reaching the second line
One stylised use is to dispose of an exceptional case, and avoid an extra indent for the general case
The disadvantage is it may be unclear what property holds on return, or how to add extra end-code
The break
statement exits from a loop early:
// Search for first prime in a range while (i < last) { if (isprime(i)) break; i++; }
Again, it can help to separate special cases from the general case, without mangling the general case
Searching is the most common use
One disadvantage is that the loop can end while the test expression is still true
Programmers often say they must use break
for efficient searching,
to avoid unnecessary work
But, arguably, it is logically clearer and cleaner to write
// Search for first prime in a range while (i < last && ! isprime(i)) { i++; }
Now the test tells you exactly what must be true each time round the loop, and false when it ends (making it easier to prove correctness)
The continue
statement restarts a loop early:
// process the even numbers for (int i = 0; i < n; i++) { if (odd(i)) continue; ... }
Some people would say that using an if
..else
inside
a loop is always better than using continue
What if you want to print things out with commas?
The number of commas is one less than the number of times round the loop
There are solutions using break
or continue
, but
my favourite is:
// print s, n times with commas for (int i = 0; i < n; i++) { if (i > 0) printf(", "); printf("%s", s); } printf("\n");
What if you have a search involving a double loop?
// Search for table entry for (r = 0; r < m; r++) { for (c = 0; c < n; c++) { if (table[r][c] ...) ... } }
You can't use break
, because it only breaks out of the inner
loop, not the outer one
Some programmers say this is the sort of exception where you must use goto
, but don't do it!
A good readable approach to the problem is to use an extra variable
// Search for table entry bool found = false; for (r = 0; r < m && !found; r++) { for (c = 0; c < n && !found; c++) { if (table[r][c] ...) found = true; } }
There is one operator in C which has three arguments instead of the usual one or two
It has the form t ? x : y
and is equal to x
or y
according to the test t
:
int max = m > n ? m : n; int abs = n >= 0 ? n : -n;
It is the same as the Haskell feature if..then..else..
, but it
should be used more 'sparingly' than in Haskell
Examples where the advice is "don't use" or "use sparingly" illustrate that adding features to a language is not necessarily a good idea
But inevitably features do get added (C has resisted more than most)
This tendency is called feature creep, and partly explains why languages go out of fashion