«Previous | Contents | Next»

31 goto

The goto statement is universally revered and can be here presented without contest.

Just kidding! Over the years, there has been a lot of back-and-forth over whether or not (often not) goto is considered harmful163.

In this programmer’s opinion, you should use whichever constructs leads to the best code, factoring in maintainability and speed. And sometimes this might be goto!

In this chapter, we’ll see how goto works in C, and then check out some of the common cases where it is used164.

31.1 A Simple Example

In this example, we’re going to use goto to skip a line of code and jump to a label. The label is the identifier that can be a goto target—it ends with a colon (:).

#include <stdio.h>

int main(void)
{
    printf("One\n");
    printf("Two\n");

    goto skip_3;

    printf("Three\n");

skip_3:

    printf("Five!\n");
}

The output is:

One
Two
Five!

goto sends execution jumping to the specified label, skipping everything in between.

You can jump forward or backward with goto.

infinite_loop:
    print("Hello, world!\n");
    goto infinite_loop;

Labels are skipped over during execution. The following will print all three numbers in order just as if the labels weren’t there:

    printf("Zero\n");
label_1:
label_2:
    printf("One\n");
label_3:
    printf("Two\n");
label_4:
    printf("Three\n");

As you’ve noticed, it’s common convention to justify the labels all the way on the left. This increases readability because a reader can quickly scan to find the destination.

Labels have function scope. That is, no matter how many levels deep in blocks they appear, you can still goto them from anywhere in the function.

It also means you can only goto labels that are in the same function as the goto itself. Labels in other functions are out of scope from goto’s perspective. And it means you can use the same label name in two functions—just not the same label name in the same function.

31.2 Labeled continue

In some languages, you can actually specify a label for a continue statement. C doesn’t allow it, but you can easily use goto instead.

To show the issue, check out continue in this nested loop:

for (int i = 0; i < 3; i++) {
    for (int j = 0; j < 3; j++) {
        printf("%d, %d\n", i, j);
        continue;   // Always goes to next j
    }
}

As we see, that continue, like all continues, goes to the next iteration of the nearest enclosing loop. What if we want to continue in the next loop out, the loop with i?

Well, we can break to get back to the outer loop, right?

for (int i = 0; i < 3; i++) {
    for (int j = 0; j < 3; j++) {
        printf("%d, %d\n", i, j);
        break;     // Gets us to the next iteration of i
    }
}

That gets us two levels of nested loop. But then if we nest another loop, we’re out of options. What about this, where we don’t have any statement that will get us out to the next iteration of i?

for (int i = 0; i < 3; i++) {
    for (int j = 0; j < 3; j++) {
        for (int k = 0; k < 3; k++) {
            printf("%d, %d, %d\n", i, j, k);

            continue;  // Gets us to the next iteration of k
            break;     // Gets us to the next iteration of j
            ????;      // Gets us to the next iteration of i???

        }
    }
}

The goto statement offers us a way!

    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            for (int k = 0; k < 3; k++) {
                printf("%d, %d, %d\n", i, j, k);

                goto continue_i;   // Now continuing the i loop!!
            }
        }
continue_i: ;
    }

We have a ; at the end there—that’s because you can’t have a label pointing to the plain end of a compound statement (or before a variable declaration).

31.3 Bailing Out

When you’re super nested in the middle of some code, you can use goto to get out of it in a manner that’s often cleaner than nesting more ifs and using flag variables.

    // Pseudocode

    for(...) {
        for (...) {
            while (...) {
                do {
                    if (some_error_condition)
                        goto bail;

                } while(...);
            }
        }
    }

bail:
    // Cleanup here

Without goto, you’d have to check an error condition flag in all of the loops to get all the way out.

31.4 Labeled break

This is a very similar situation to how continue only continues the innermost loop. break also only breaks out of the innermost loop.

    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            printf("%d, %d\n", i, j);
            break;   // Only breaks out of the j loop
        }
    }

    printf("Done!\n");

But we can use goto to break farther:

    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            printf("%d, %d\n", i, j);
            goto break_i;   // Now breaking out of the i loop!
        }
    }

break_i:

    printf("Done!\n");

31.5 Multi-level Cleanup

If you’re calling multiple functions to initialize multiple systems and one of them fails, you should only de-initialize the ones that you’ve gotten to so far.

Let’s do a fake example where we start initializing systems and checking to see if any returns an error (we’ll use -1 to indicate an error). If one of them does, we have to shutdown only the systems we’ve initialized so far.

    if (init_system_1() == -1)
        goto shutdown;

    if (init_system_2() == -1)
        goto shutdown_1;

    if (init_system_3() == -1)
        goto shutdown_2;

    if (init_system_4() == -1)
        goto shutdown_3;

    do_main_thing();   // Run our program

    shutdown_system4();

shutdown_3:
    shutdown_system3();

shutdown_2:
    shutdown_system2();

shutdown_1:
    shutdown_system1();

shutdown:
    print("All subsystems shut down.\n");

Note that we’re shutting down in the reverse order that we initialized the subsystems. So if subsystem 4 fails to start up, it will shut down 3, 2, then 1 in that order.

31.6 Tail Call Optimization

Kinda. For recursive functions only.

If you’re unfamiliar, Tail Call Optimization (TCO)165 is a way to not waste stack space when calling other functions under very specific circumstances. Unfortunately the details are beyond the scope of this guide.

But if you have a recursive function you know can be optimized in this way, you can make use of this technique. (Note that you can’t tail call other functions due to the function scope of labels.)

Let’s do a straightforward example, factorial.

Here’s a recursive version that’s not TCO, but it can be!

#include <stdio.h>
#include <complex.h>

int factorial(int n, int a)
{
    if (n == 0)
        return a;

    return factorial(n - 1, a * n);
}

int main(void)
{
    for (int i = 0; i < 8; i++)
        printf("%d! == %ld\n", i, factorial(i, 1));
}

To make it happen, you can replace the call with two steps:

  1. Set the values of the parameters to what they’d be on the next call.
  2. goto a label on the first line of the function.

Let’s try it:

#include <stdio.h>

int factorial(int n, int a)
{
tco:  // add this

    if (n == 0)
        return a;

    // replace return by setting new parameter values and
    // goto-ing the beginning of the function

    //return factorial(n - 1, a * n);

    int next_n = n - 1;  // See how these match up with
    int next_a = a * n;  // the recursive arguments, above?

    n = next_n;   // Set the parameters to the new values
    a = next_a;

    goto tco;   // And repeat!
}

int main(void)
{
    for (int i = 0; i < 8; i++)
        printf("%d! == %d\n", i, factorial(i, 1));
}

I used temporary variables up there to set the next values of the parameters before jumping to the start of the function. See how they correspond to the recursive arguments that were in the recursive call?

Now, why use temp variables? I could have done this instead:

    a *= n;
    n -= 1;

    goto tco;

and that actually works just fine. But if I carelessly reverse those two lines of code:

    n -= 1;  // BAD NEWS
    a *= n;

—now we’re in trouble. We modified n before using it to modify a. That’s Bad because that’s not how it works when you call recursively. Using the temporary variables avoids this problem even if you’re not looking out for it. And the compiler likely optimizes them out, anyway.

31.7 Restarting Interrupted System Calls

This is outside the spec, but commonly seen in Unix-like systems.

Certain long-lived system calls might return an error if they’re interrupted by a signal, and errno will be set to EINTR to indicate the syscall was doing fine; it was just interrupted.

In those cases, it’s really common for the programmer to want to restart the call and try it again.

retry:
    byte_count = read(0, buf, sizeof(buf) - 1);  // Unix read() syscall

    if (byte_count == -1) {            // An error occurred...
        if (errno == EINTR) {          // But it was just interrupted
            printf("Restarting...\n");
            goto retry;
        }

Many Unix-likes have an SA_RESTART flag you can pass to sigaction() to request the OS automatically restart any slow syscalls instead of failing with EINTR.

Again, this is Unix-specific and is outside the C standard.

That said, it’s possible to use a similar technique any time any function should be restarted.

31.8 goto and Variable Scope

We’ve already seen that labels have function scope, but weird things can happen if we jump past some variable initialization.

Look at this example where we jump from a place where the variable x is out of scope into the middle of its scope (in the block).

    goto label;

    {
        int x = 12345;

label:
        printf("%d\n", x);
    }

This will compile and run, but gives me a warning:

warning: ‘x’ is used uninitialized in this function

And then it prints out 0 when I run it (your mileage may vary).

Basically what has happened is that we jumped into x’s scope (so it was OK to reference it in the printf()) but we jumped over the line that actually initialized it to 12345. So the value was indeterminate.

The fix is, of course, to get the initialization after the label one way or another.

    goto label;

    {
        int x;

label:
        x = 12345;
        printf("%d\n", x);
    }

Let’s look at one more example.

    {
        int x = 10;

label:

        printf("%d\n", x);
    }

    goto label;

What happens here?

The first time through the block, we’re good. x is 10 and that’s what prints.

But after the goto, we’re jumping into the scope of x, but past its initialization. Which means we can still print it, but the value is indeterminate (since it hasn’t been reinitialized).

On my machine, it prints 10 again (to infinity), but that’s just luck. It could print any value after the goto since x is uninitialized.

31.9 goto and Variable-Length Arrays

When it comes to VLAs and goto, there’s one rule: you can’t jump from outside the scope of a VLA into the scope of that VLA.

If I try to do this:

    int x = 10;

    goto label;

    {
        int v[x];

label:

        printf("Hi!\n");
    }

I get an error:

error: jump into scope of identifier with variably modified type

You can jump in ahead of the VLA declaration, like this:

    int x = 10;

    goto label;

    {
label:  ;
        int v[x];

        printf("Hi!\n");
    }

Because that way the VLA gets allocated properly before its inevitable deallocation once it falls out of scope.

32 Types Part V: Compound Literals and Generic Selections

This is the final chapter for types! We’re going to talk about two things:

They’re not particularly related, but don’t really each warrant their own chapters. So I crammed them in here like a rebel!

32.1 Compound Literals

This is a neat feature of the language that allows you to create an object of some type on the fly without ever assigning it to a variable. You can make simple types, arrays, structs, you name it.

One of the main uses for this is passing complex arguments to functions when you don’t want to make a temporary variable to hold the value.

The way you create a compound literal is to put the type name in parentheses, and then put an initializer list after. For example, an unnamed array of ints, might look like this:

(int []){1,2,3,4}

Now, that line of code doesn’t do anything on its own. It creates an unnamed array of 4 ints, and then throws them away without using them.

We could use a pointer to store a reference to the array…

int *p = (int []){1 ,2 ,3 ,4};

printf("%d\n", p[1]);  // 2

But that seems a little like a long-winded way to have an array. I mean, we could have just done this166:

int p[] = {1, 2, 3, 4};

printf("%d\n", p[1]);  // 2

So let’s take a look at a more useful example.

32.1.1 Passing Unnamed Objects to Functions

Let’s say we have a function to sum an array of ints:

int sum(int p[], int count)
{
    int total = 0;

    for (int i = 0; i < count; i++)
        total += p[i];

    return total;
}

If we wanted to call it, we’d normally have to do something like this, declaring an array and storing values in it to pass to the function:

int a[] = {1, 2, 3, 4};

int s = sum(a, 4);

But unnamed objects give us a way to skip the variable by passing it directly in (parameter names listed above). Check it out—we’re going to replace the variable a with an unnamed array that we pass in as the second argument:

//                   p[]         count
//           |-----------------|  |
int s = sum((int []){1, 2, 3, 4}, 4);

Pretty slick!

32.1.2 Unnamed structs

We can do something similar with structs.

First, let’s do things without unnamed objects. We’ll define a struct to hold some x/y coordinates. Then we’ll define one, passing in values into its initializer. Finally, we’ll pass it to a function to print the values:

#include <stdio.h>

struct coord {
    int x, y;
};

void print_coord(struct coord c)
{
    printf("%d, %d\n", c.x, c.y);
}

int main(void)
{
    struct coord t = {.x=10, .y=20};

    print_coord(t);   // prints "10, 20"
}

Straightforward enough?

Let’s modify it to use an unnamed object instead of the variable t we’re passing to print_coord().

We’ll just take t out of there and replace it with an unnamed struct:

    //struct coord t = {.x=10, .y=20};

    print_coord((struct coord){.x=10, .y=20});   // prints "10, 20"

Still works!

32.1.3 Pointers to Unnamed Objects

You might have noticed in the last example that even through we were using a struct, we were passing a copy of the struct to print_coord() as opposed to passing a pointer to the struct.

Turns out, we can just take the address of an unnamed object with & like always.

This is because, in general, if an operator would have worked on a variable of that type, you can use that operator on an unnamed object of that type.

Let’s modify the above code so that we pass a pointer to an unnamed object

#include <stdio.h>

struct coord {
    int x, y;
};

void print_coord(struct coord *c)
{
    printf("%d, %d\n", c->x, c->y);
}

int main(void)
{
    //     Note the &
    //          |
    print_coord(&(struct coord){.x=10, .y=20});   // prints "10, 20"
}

Additionally, this can be a nice way to pass even pointers to simple objects:

// Pass a pointer to an int with value 3490
foo(&(int){3490});

Easy as that.

32.1.4 Unnamed Objects and Scope

The lifetime of an unnamed object ends at the end of its scope. The biggest way this could bite you is if you make a new unnamed object, get a pointer to it, and then leave the object’s scope. In that case, the pointer will refer to a dead object.

So this is undefined behavior:

int *p;

{
    p = &(int){10};
}

printf("%d\n", *p);  // INVALID: The (int){10} fell out of scope

Likewise, you can’t return a pointer to an unnamed object from a function. The object is deallocated when it falls out of scope:

#include <stdio.h>

int *get3490(void)
{
    // Don't do this
    return &(int){3490};
}

int main(void)
{
    printf("%d\n", *get3490());  // INVALID: (int){3490} fell out of scope
}

Just think of their scope like that of an ordinary local variable. You can’t return a pointer to a local variable, either.

32.1.5 Silly Unnamed Object Example

You can put any type in there and make an unnamed object.

For example, these are effectively equivalent:

int x = 3490;

printf("%d\n", x);               // 3490 (variable)
printf("%d\n", 3490);            // 3490 (constant)
printf("%d\n", (int){3490});     // 3490 (unnamed object)

That last one is unnamed, but it’s silly. Might as well do the simple one on the line before.

But hopefully that provides a little more clarity on the syntax.

32.2 Generic Selections

This is an expression that allows you to select different pieces of code depending on the type of the first argument to the expression.

We’ll look at an example in just a second, but it’s important to know this is processed at compile time, not at runtime. There’s no runtime analysis going on here.

The expression begins with _Generic, works kinda like a switch, and it takes at least two arguments.

The first argument is an expression (or variable167) that has a type. All expressions have a type. The remaining arguments to _Generic are the cases of what to substitute in for the result of the expression if the first argument is that type.

Wat?

Let’s try it out and see.

#include <stdio.h>

int main(void)
{
    int i;
    float f;
    char c;

    char *s = _Generic(i,
                    int: "that variable is an int",
                    float: "that variable is a float",
                    default: "that variable is some type"
                );

    printf("%s\n", s);
}

Check out the _Generic expression starting on line 9.

When the compiler sees it, it look at the type of the first argument. (In this example, the type of the variable i.) It then looks through the cases for something of that type. And then it substitutes the argument in place of the entire _Generic expression.

In this case, i is an int, so it matches that case. Then the string is substituted in for the expression. So the line turns into this when the compiler sees it:

    char *s = "that variable is an int";

If the compiler can’t find a type match in the _Generic, it looks for the optional default case and uses that.

If it can’t find a type match and there’s no default, you’ll get a compile error. The first expression must match one of the types or default.

Because it’s inconvenient to write _Generic over and over, it’s often used to make the body of a macro that can be easily repeatedly reused.

Let’s make a macro TYPESTR(x) that takes an argument and returns a string with the type of the argument.

So TYPESTR(1) will return the string "int", for example.

Here we go:

#include <stdio.h>

#define TYPESTR(x) _Generic((x), \
                        int: "int", \
                        long: "long", \
                        float: "float", \
                        double: "double", \
                        default: "something else")

int main(void)
{
    int i;
    long l;
    float f;
    double d;
    char c;

    printf("i is type %s\n", TYPESTR(i));
    printf("l is type %s\n", TYPESTR(l));
    printf("f is type %s\n", TYPESTR(f));
    printf("d is type %s\n", TYPESTR(d));
    printf("c is type %s\n", TYPESTR(c));
}

This outputs:

i is type int
l is type long
f is type float
d is type double
c is type something else

Which should be no surprise, because, like we said, that code in main() is replaced with the following when it is compiled:

    printf("i is type %s\n", "int");
    printf("l is type %s\n", "long");
    printf("f is type %s\n", "float");
    printf("d is type %s\n", "double");
    printf("c is type %s\n", "something else");

And that’s exactly the output we see.

Let’s do one more. I’ve included some macros here so that when you run:

int i = 10;
char *s = "Foo!";

PRINT_VAL(i);
PRINT_VAL(s);

you get the output:

i = 10
s = Foo!

We’ll have to make use of some macro magic to do that.

#include <stdio.h>
#include <string.h>

// Macro that gives back a format specifier for a type
#define FMTSPEC(x) _Generic((x), \
                        int: "%d", \
                        long: "%ld", \
                        float: "%f", \
                        double: "%f", \
                        char *: "%s")
                        // TODO: add more types
                        
// Macro that prints a variable in the form "name = value"
#define PRINT_VAL(x) do { \
    char fmt[512]; \
    snprintf(fmt, sizeof fmt, #x " = %s\n", FMTSPEC(x)); \
    printf(fmt, (x)); \
} while(0)

int main(void)
{
    int i = 10;
    float f = 3.14159;
    char *s = "Hello, world!";

    PRINT_VAL(i);
    PRINT_VAL(f);
    PRINT_VAL(s);
}

for the output:

i = 10
f = 3.141590
s = Hello, world!

We could have crammed that all in one big macro, but I broke it into two to prevent eye bleeding.


«Previous | Contents | Next»