More C programming concepts
---------------------------

0. Makefiles
1. Pointers
2. Dynamically allocating memory
3. Passing information to subroutines using pointers (pass by
   reference) rather than variables (pass by value)

------------------------------------------------------------------------

MAKEFILES
---------

The unix "make" command simplifies compiling programs in C, C++,
Fortran, etc.  "make" has a set of default rules: for example, if you
say "make myprog" and there is a source file "myprog.c" in the current
directory, then "make" will issue a command like "cc -o myprog myprog.c".

You can add to or customize make's rules by creating a text file
called "makefile" in your local directory.  For most of our
programming in C, we need to be sure we are using gcc, and we need to
add the flag "-lm", which links in the math functions library if
needed.  We can do this by saving the following two lines in
"makefile":

    CC = gcc
    LDLIBS = -lm

[With GNU make (the "make" on linux), libraries listed in LDLIBS are
placed after the source file on the link command line, which is where
the linker expects them.  Putting "-lm" in LDFLAGS instead places it
before the source file, and some linkers then skip the library.]

"make" and makefiles can do many more things than this, but just
"make" with the two-line makefile above will simplify a lot of our
compiling needs.

For a brief introduction to "make" try typing "man make" at the linux
command prompt, or better yet see the short discussion under the
subheading "make" in the C tutorial at

    http://www-h.eng.cam.ac.uk/help/tpl/languages/C/teaching_C/teaching_C.html

For a longer introduction, type "info make" at a unix prompt on one of
the stat dept workstations [this works in andrew-land too I think].

POINTERS
--------

Variables (in any computer language) have (at least) three attributes:

 - the address, or location, of the variable in memory
 - the size, or number of bytes, set aside for the variable
 - the value "stored in" the variable

In order to use the value stored in the variable, you have to know the
location of the variable in memory, i.e. you need to know where to
look for the value!  (And of course you need to know the size, to know
how many bytes to read to get the current value of the variable.)
This bit of sophistry underlies absolutely every variable and constant
in every computer language.

If you have access to the variable's address then there are lots of
neat things you can do.  In C (and C++, Pascal and a few other
languages) you can do the following sorts of things:

 - given any variable, find its address [ using the & operator ];

 - define and use variables whose values are the addresses of other
   variables (these are called "pointer variables" since they point
   to the locations of other variables) [ using the * operator ];

 - allocate as much or as little space for a variable as you like,
   *while the program is running* [ using the malloc() function ].

These capabilities turn out to be very powerful for things like

 (a) making more efficient use of memory by only allocating space for
     variables as you need them;

 (b) interfacing with other programming languages like Splus, R, and
     Fortran;

 (c) making data structures like "linked lists", "trees", etc.
     (analogous to directed graphs) that record semantic
     relationships between variables as well as the variables'
     values.

I'll concentrate on (a) and (b) in our use of C in the course, but (c)
is also important!

Examples
--------

Here are some simple examples, taken from the "C in a nutshell"
document:

/************* pointers1.c *************/

#include <stdio.h>

int main (void)
{
  int m = 0, n = 1, k = 2;
  int *p;
  char msg[] = "hello world";
  char *cp;

  p = &m;        /* p now points to m                         */
  *p = 1;        /* m now equals 1                            */
  k = *p;        /* k now equals 1                            */
  cp = msg;      /* cp points to the first character of msg   */
  *cp = 'H';     /* change the case of the 'h' in msg         */
  cp = &msg[6];  /* cp points to the 'w'                      */
  *cp = 'W';     /* change its case                           */

  printf ("m = %d, n = %d, k = %d\nmsg = \"%s\"\n", m, n, k, msg);
  return 0;
}

Note the very important point that the name of an array (`msg' in the
above example), if used without an index, is considered to be a
pointer to the first element of the array.
In fact, an array name followed by an index is exactly equivalent to a
pointer followed by an offset.  For example,

/************* pointers2.c *************/

#include <stdio.h>

int main (void)
{
  char msg[] = "hello world";
  char *cp;

  cp = msg;
  cp[0] = 'H';       /* same as *cp = 'H'      */
  *(msg+6) = 'W';    /* same as msg[6] = 'W'   */

  /* all four lines print the same string */
  printf ("%s\n", msg);
  printf ("%s\n", &msg[0]);
  printf ("%s\n", cp);
  printf ("%s\n", &cp[0]);
  return 0;
}

Pointers used as arguments to functions
---------------------------------------

Suppose we want to write a function that will swap two values.  We
could do something like this:

/***************** swap0.c *****************/

#include <stdio.h>

void swap (int a, int b)
{
  int tmp;

  printf("Before, a = %d and b=%d\n\n",a,b);
  tmp = a;
  a = b;
  b = tmp;
  printf("After, a = %d and b=%d\n\n",a,b);
}

int main(void)
{
  int x = 3, y = 10;

  printf("Before, X = %d and Y=%d\n\n",x,y);
  swap(x,y);
  printf("After, X = %d and Y=%d\n\n",x,y);
  return 0;
}

But if we run it we will see that the program does not work!  The
reason is that C just copies the values from x and y to a and b when
it calls "swap" [this was called "pass by value" when I took a Pascal
course long ago].  Therefore, when "swap" ends, a and b are disposed
of, and along with them, the swapped values!

Instead what we want to do is have "swap" work on the memory locations
where x and y are, directly.  We can do this with pointers:

/***************** swap1.c *****************/

#include <stdio.h>

void swap (int *a, int *b)
{
  /* now "swap" expects to be told the locations of two integers,
     rather than the integers themselves. */

  int tmp;

  printf("Before, a = %d and b=%d\n\n",*a,*b);
  tmp = *a;   /* let tmp = the value at location a */
  *a = *b;    /* replace the value at location a with the value at
                 location b */
  *b = tmp;   /* store the saved value from tmp at location b */
  printf("After, a = %d and b=%d\n\n",*a,*b);
}

int main(void)
{
  int x = 3, y = 10;

  printf("Before, X = %d and Y=%d\n\n",x,y);
  swap(&x, &y);   /* pass the locations of x, y; NOT THEIR VALUES! */
  printf("After, X = %d and Y=%d\n\n",x,y);
  return 0;
}

This works because we pass references (pointers) to the actual
locations of x and y to "swap", so "swap" can work directly with x and
y, rather than with copies of them [ this was called "pass by
reference" when I took that Pascal course long ago ].

It is interesting to note that "pass by reference" is the ONLY way to
pass information to Fortran subroutines, and it will turn out to be
the way to pass information between Splus and C [or Fortran].

----------------------------------------------------------------------

Aside on scanf
--------------

You may have noticed in other programs that scanf (the input function
that corresponds to printf for printing) requires the & sign before
each variable it is reading.  The reason is the same as for the "swap"
function: scanf changes the values of its arguments, so it has to be
told where they are.

So, a slightly fancier version of the swap program would be:

/***************** swap2.c *****************/

#include <stdio.h>

void swap (int *a, int *b)
{
  /* now "swap" expects to be told the locations of two integers,
     rather than the integers themselves. */

  int tmp;

  printf("Before, a = %d and b=%d\n\n",*a,*b);
  tmp = *a;   /* let tmp = the value at location a */
  *a = *b;    /* replace the value at location a with the value at
                 location b */
  *b = tmp;   /* store the saved value from tmp at location b */
  printf("After, a = %d and b=%d\n\n",*a,*b);
}

int main(void)
{
  int x = 3, y = 10;

  printf("Enter a value for X: ");
  scanf ("%d",&x);
  printf("Enter a value for Y: ");
  scanf ("%d",&y);

  printf("Before, X = %d and Y=%d\n\n",x,y);
  swap(&x, &y);   /* pass the locations of x, y; NOT THEIR VALUES! */
  printf("After, X = %d and Y=%d\n\n",x,y);
  return 0;
}

Now that you know about scanf, you can put it inside a loop, to read
sequences of data values, or whatever you want, from the keyboard (or
a file piped into the program).  There are also functions fscanf and
fprintf that work with files directly.

Reading the command line
------------------------

Command line arguments are passed to a C program through two arguments
to main: argc, the number of words on the command line (counting the
program name itself), and argv, an array of pointers to those words,
each stored as a string.  Here's an example:

/***************** cmdline1.c *****************/

#include <stdio.h>
#include <stdlib.h>    /* for exit() */

int main (int argc, char *argv[])
{
  printf ("this program is called '%s'\n", argv[0]);
  if (argc == 1) {
    printf ("it was called without any arguments\n");
  } else {
    int i;
    printf ("it was called with %d arguments\n", argc - 1);
    for (i = 1; i < argc; i++) {
      printf ("argument number %d was <%s>\n", i, argv[i]);
    }
  }
  exit (argc);
}

The 'exit status' of the program can be examined with "echo $status"
(in csh or tcsh; in sh or bash, use "echo $?").

Another way to do the same thing is:

/***************** cmdline2.c *****************/

#include <stdio.h>

int main (int argc, char *argv[])
{
  printf ("this program is called '%s'\n", argv[0]);
  if (argc == 1) {
    printf ("it was called without any arguments\n");
  } else {
    int i;
    printf ("it was called with %d arguments\n", argc - 1);
    for (i = 1; i < argc; i++) {
      printf ("argument number %d was <%s>\n", i, argv[i]);
    }
  }
  return (argc);
}

Once again the 'exit status' can be examined with "echo $status".

Using malloc
------------

Finally here is an example of dynamically allocating space for
variables (usually but not always this will happen when you need to
define an array whose size is not known until the program is running,
as in the following example):

/***************** random-array.c *****************/

#include <stdio.h>
#include <stdlib.h>

/* Generates a two dimensional matrix, and fills it randomly with
   zeroes and ones. */

int main (int argc, char *argv[])
{
  int xdim, ydim;
  int i, j;
  int *p, *q;

  if (argc<3) {
    printf ("x dimension of matrix? > ");
    scanf ("%d", &xdim);
    printf ("y dimension of matrix? > ");
    scanf ("%d", &ydim);
  } else {
    xdim = atoi(argv[1]);
    ydim = atoi(argv[2]);
  }

  p = (int *) malloc (xdim * ydim * sizeof(int));
  if (p == NULL) {
    printf ("malloc failed!\n");
    return 1;
  }

  /* The code on the left treats the matrix as one long block of
     xdim*ydim ints.  The comments on the right show the two
     dimensional loop it stands in for. */

  for (i = 0; i < xdim * ydim; i++) {   /* for (i=0;i<xdim;i++) {        */
    if (rand() > RAND_MAX/2) {          /*   for (j=0;j<ydim;j++) {      */
      *(p+i) = 1;                       /*     if (rand()>RAND_MAX/2) {  */
    } else {                            /*       p[i][j] = 1;            */
      *(p+i) = 0;                       /*     } else {                  */
    }                                   /*       p[i][j] = 0;            */
  }                                     /*     }                         */
                                        /*   }                           */
                                        /* }                             */

  for (i = 0; i < xdim; i++) {
    q = p + i * ydim;                   /* q points to the start of row i */
    for (j = 0; j < ydim; j++) {
      printf ("%d ", *(q++));
    }
    printf ("\n");
  }

  free (p);
  return 0;
}

----------------------------------------------------------------------

For later lectures:
-------------------

 - working more with arrays
 - calling C from Splus (and R?)
 - calling Fortran from C