C Programming: What is the difference between an array and a pointer?

Why is a raven like a writing-desk? (Lewis Carroll)

This is a copy of an article I wrote a long time ago. I'm putting it here to give it a more permanent home. Sorry for being off topic again!

Introduction

I'm glad you asked. The answer is surprisingly simple: almost everything. In other words, they have almost nothing in common. To understand why, we'll take a look at what they are and what operations they support.

Arrays

An array is a fixed-length collection of objects, which are stored sequentially in memory. There are only three things you can do with an array:

  1. sizeof - get its size
    You can apply sizeof to it. An array x of N elements of type T (T x[N]) has the size N * sizeof (T), as you would expect. For example, if sizeof (int) == 2 and arr is declared as int arr[5];, then sizeof arr == 10 == 5 * 2 == 5 * sizeof (int).
  2. & - get its address
    You can take its address with &, which results in a pointer to the entire array.
  3. any other use - implicit pointer conversion
    Any other use of an array results in a pointer to the first array element (the array "decays" to a pointer).

That's all. Yes, this means arrays don't provide direct access to their contents. More specifically, there is no array indexing operator.

Pointers

A pointer is a value that refers to another object (or function). You might say it contains the object's address. Here are the operations that pointers support:

  1. sizeof - get its size
    Like arrays, pointers have a size that can be obtained with sizeof. Note that different pointer types can have different sizes.

  2. & - get its address
    Assuming your pointer is an lvalue, you can take its address with &. The result is a pointer to a pointer.

  3. * - dereference it
    Assuming the base type of your pointer isn't an incomplete type, you can dereference it; i.e., you can follow the pointer and get the object it refers to. Incomplete types include void and struct types that have been declared but not yet defined.

  4. +, - - pointer arithmetic
    If you have a pointer to an array element, you can add an integer amount to it. This amount can be negative, and ptr - n is equivalent to ptr + -n (and -n + ptr, since + is commutative, even with pointers). If ptr is a pointer to the i'th element of an array, then ptr + n is a pointer to the (i + n)'th array element. If i + n is equal to the number of elements, the result is a valid "one past the end" pointer that may be compared but must not be dereferenced. If i + n is negative or greater than the number of array elements, the behavior is undefined.

That's it, really. However, there are a few other pointer operations defined in terms of the above fundamental operations:

  1. -> - struct dereference
    p->m is equivalent to (*p).m, where . is the struct/union member access operator. This means p must be a pointer to a struct or union.

  2. [] - indexed dereference
    a[b] is equivalent to *(a + b). This means a and b must be a pointer to an array element and an integer; not necessarily respectively, because a[b] == *(a + b) == *(b + a) == b[a]. Another important equivalence is p[0] == 0[p] == *p.

A quirk of parameter declarations

However, there's one thing that confuses this issue. Whenever you declare a function parameter to have an array type, it gets silently converted to a pointer and any size information is ignored. Thus the following four declarations are equivalent:

void foo(int [42]);

void foo(int []);

typedef int t_array[23];
void foo(t_array);

void foo(int *);

A more common example is int main(int argc, char *argv[]), which is the same as int main(int argc, char **argv). However, int main(int argc, char argv[][]) would be an error because the above rule isn't recursive: only the outermost array is converted, so the result would be int main(int argc, char (*argv)[]), i.e. argv would be a pointer to an array of unknown size, not a pointer to a pointer. (The declaration is rejected in any case, because C does not allow an array whose element type is incomplete.)

Conclusion

Arrays by themselves are nearly useless in C. Even the fundamental [] operator, which is used for getting at the array's contents, is an illusion: it's defined on pointers and only happens to work with arrays because of the rule that any use of an array outside of sizeof and & yields a pointer.

4 Comments

Yes, pointers in C will drive you batty. Perl's references are much easier to understand.

You should put more emphasis on sizeof for pointers: the pointer has no idea of how many elements it points to.

int array[5];
int *ptr = array;
printf("%zu\n", sizeof(array)); // 5*sizeof(int)
printf("%zu\n", sizeof(ptr));   // sizeof(int*)
printf("%zu\n", sizeof(*ptr));  // sizeof(int)

About mauke

I blog about Perl.