Arthur J. O'Dwyer

CMU 15-113: Why casts are evil

A lot of C code presented to beginners takes the form


    int main()
    {
        int* p;
        p = (int* ) malloc ( 42 * sizeof(int) );
        if (p) {
            puts ("Success!");
            free ( p );
        }
        return(0);
    }
This style is, of course, totally wrong. (If it weren't wrong, why would it be colored red?)

There are several problems with the above code, most relating to ease of reading — whitespace issues, for example. However, there's one insidious problem that doesn't disappear even when the code is reformatted as follows:

    int main(void)
    {
        int *p = (int *)malloc(42 * sizeof(int));
        if (p != NULL)
          puts("Success!");
        free(p);
        return 0;
    }
The insidious problem is the cast applied to malloc's return value. The cast makes the compiler accept the above program as valid C without complaint, and then the program may crash horribly upon execution of the offending line.

Before I explain the problem in detail, let's look at the solution:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int *p = malloc(42 * sizeof *p);
        if (p != NULL)
            puts("Success!");
        free(p);
        return 0;
    }
The cast is gone, and — interestingly — the expression sizeof(int) has been replaced by sizeof *p. Why? Let's find out.

Implicit declarations

Consider the original program (either of the first two programs above, the ones colored red). What happens when the compiler reads it? The compiler accepts it and produces an object file. If you're unlucky, it even produces a binary! Unfortunately, since the programmer forgot to #include <stdlib.h>, there's no prototype for malloc in sight. Therefore, under C89's implicit-declaration rule, the compiler assumes the following non-prototype declaration:

    int malloc();
So the cast tells the compiler to generate machine code to convert an int (the int returned by malloc) to int *, which may be a non-identity operation — for example, it might swap the byte order, or fetch a value from the wrong machine register.

The result? Well, if we're lucky, we get a linker error. If we're unlucky, the code generated by the cast doesn't do anything serious, and the program appears to work, until it's ported to a new machine. If we're very unlucky, the code generated by the cast mangles the pointer so that we end up with a wild pointer in our program — a kind of bug that's very hard to catch, even if the cause isn't as obscure as a missing #include <stdlib.h>!
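
The conversion hazard described above is hard to reproduce directly, because C99 removed implicit function declarations and many current compilers diagnose or reject the original program. The following is only a minimal sketch that simulates one way the damage can happen: a pointer's bits are squeezed through an int-sized value, as they would be if malloc were believed to return int on a machine where pointers are wider than int. The helper names through_int and survives_int_round_trip are hypothetical, and the sketch assumes the platform provides uintptr_t in <stdint.h>.

```c
#include <stdint.h>

/* Hypothetical helper: pass an address-sized value through an int-sized
   value and widen it again. Unsigned types are used so that the narrowing
   is well defined; any bits that do not fit are simply discarded. */
uintptr_t through_int(uintptr_t bits)
{
    unsigned int narrow = (unsigned int)bits;
    return (uintptr_t)narrow;
}

/* Hypothetical helper: returns 1 if the pointer's bit pattern would come
   back unchanged after being treated as an int-sized value, 0 otherwise.
   Assumes uintptr_t is available (it is optional in the standard). */
int survives_int_round_trip(const void *p)
{
    uintptr_t full = (uintptr_t)p;
    return through_int(full) == full;
}
```

On a platform where pointers are wider than int, through_int discards the upper bits, so a pointer whose high bits are nonzero does not survive. On a platform where int and pointers have the same width, every pointer survives, which is one reason the bug can stay hidden until the code is ported.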

You may wonder why the compiler didn't also make up "fake" non-prototype declarations for free and puts. In fact, it does, and the call to free also causes problems: free returns void, not the int the compiler assumed. (Technically speaking, it invokes undefined behavior.) The call to puts is fine, though; puts really does return an int, and its one parameter is of type const char *, which has the same representation as the char * that our string literal decays to when passed as an argument.

Casts are evil

The above scenario is a specific case of the more general "Casts are evil" doctrine. There are perhaps three cases in which C requires casts; I can think of only two that students are likely to encounter in their college careers.1

Noise is evil

The second argument against casting is simply that it's excessive noise. The cast is not needed (and has never been needed, not since C came down from the PDP-7s and got a national standard, in 1989). The correct use of malloc is not black magic; it's really simple, and it needs to be taught that way, or else students will start thinking that C is hard, or obscure, or complicated. Here's the magic bullet again:


     foo = malloc(sizeof *foo);
Ta-da! This always works (as long as foo isn't of type void *, but in that case the programmer already knows C), and nary a typename in sight. This means we can write
     Alpha *foo;
and then later, when we realize that foo really ought to be a Beta, we can write
     Beta *foo;
without needing to also grep for instances of foo = malloc(...) or foo = realloc(...), as we would if we wrote

     foo = (Alpha *)malloc(sizeof (Alpha));
Also, notice that the preferred form is shorter. The less code you write, the less buggy code you write.
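
As a minimal sketch of the idiom, here is how the cast-free form looks for a single object and for a growing array. The Beta type's fields and the helper names new_beta and grow_betas are hypothetical, invented only for illustration. Note that no type name appears in any size computation, so changing the pointee type requires no edits to the allocation calls.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type: the fields are invented for this example. */
typedef struct {
    int id;
    double weight;
} Beta;

/* Allocate and initialize one Beta. Returns NULL if malloc fails. */
Beta *new_beta(int id, double weight)
{
    Beta *foo = malloc(sizeof *foo);   /* no cast, no type name */
    if (foo != NULL) {
        foo->id = id;
        foo->weight = weight;
    }
    return foo;
}

/* Resize *items to hold new_count elements. Returns 1 on success and 0 on
   failure; on failure *items is left untouched and still valid. */
int grow_betas(Beta **items, size_t new_count)
{
    Beta *tmp;

    /* Reject empty requests and sizes whose byte count would overflow. */
    if (new_count == 0 || new_count > SIZE_MAX / sizeof **items)
        return 0;

    tmp = realloc(*items, new_count * sizeof *tmp);
    if (tmp == NULL)
        return 0;

    *items = tmp;
    return 1;
}
```

Assigning realloc's result to a temporary first is a separate habit from dropping the cast: it keeps the original block reachable if the reallocation fails.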

Other reasons why not

Of course there are plenty of lesser arguments against casting malloc, including "ease of reading," "ease of typing," and "C is not C++."2 All these and more can be found online by Googling "cast malloc," of course.

Footnotes

  1. One scenario in which you might use a cast is in the expression printf("%p", (void *)foo), where foo is a value of some pointer type other than void *.

    Another scenario involves "casting away const"; that is, circumventing C's type system in order to call library functions that don't believe in const. For example:

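        #include <stdio.h>

        FILE *open_file(char *filename);  /* library function without const */
        int process_file(FILE *fp);

        int process_file_by_name(const char *filename)
        {
            /* The cast is needed only because open_file's prototype lacks const. */
            FILE *fp = open_file((char *)filename);
            return process_file(fp);
        }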
        
    Here, it would be incorrect to remove const from the prototype of process_file_by_name (implying that the function might modify its string argument); but otherwise, C won't let us remove the const modifier from filename without an explicit cast.

    A third scenario in which a programmer might cast away const is exemplified by the standard library function strchr, whose prototype is char *strchr(const char *s, int k). The parameter s should be const, since the function doesn't modify its argument; but the return value (which is a pointer into the string s, or NULL) must be non-const; otherwise, we couldn't write

        char s[] = "foobar";
        char *trunc = strchr(s, 'b');  /* couldn't assign const to non-const */
        *trunc = '\0';
        
    Therefore, the const modifier must be cast away somewhere inside strchr.
  2. In C++, you cannot implicitly convert between void * and other kinds of data pointers, as you can in C. This was done in the name of type-safety; I personally consider it a mistake. However, C++ programmers should generally be using new and delete instead of malloc in the first place. There is much less need for void * in idiomatic C++.
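
The first and third scenarios in footnote 1 can be sketched as follows. This is an illustration under stated assumptions, not how any particular C library implements strchr: my_strchr and format_pointer are hypothetical names, and format_pointer uses snprintf (standard since C99) rather than printf so that the result lands in a caller-supplied buffer.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for strchr. The parameter is const because the
   function never writes through it, but the result must be non-const, so
   const is cast away in exactly one place. */
char *my_strchr(const char *s, int k)
{
    const char target = (char)k;

    for (;; ++s) {
        if (*s == target)
            return (char *)s;   /* the one required cast */
        if (*s == '\0')
            return NULL;        /* reached the end without a match */
    }
}

/* Hypothetical helper: format a pointer with %p. The conversion expects a
   void *, and variadic arguments are not converted implicitly, so the cast
   is required here. Returns snprintf's result. */
int format_pointer(char *buf, size_t size, int *foo)
{
    return snprintf(buf, size, "%p", (void *)foo);
}
```

Because the target is compared before the end-of-string check, searching for '\0' returns a pointer to the terminator, as the standard strchr does.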


This page was last updated 15 February 2006
All original code, images and documentation on this page are in the public domain.