Coroutines in C (1/3)

Concurrency is not part of the C language model.
Depending on your OS you have facilities to "split the logic" of your program so that it will be able to do multiple things in parallel.
An example is the fork() function on Unix (or CreateProcess() on Windows), where you create an entirely separate process that will do its own things.
A more lightweight approach is to use threads: functions are executed in parallel, sharing the same process space (memory, files, ...). If you are on a Posix platform, the pthread library will help you going that route.

However, once your code start splitting into parallel modules, you have to think about syncing them, ensuring atomic access to resources, preventing and detecting race conditions, cleanly exiting and a lot of other fun stuff.

Let's admit it, very few programmers have the ability to correctly follow the logic of parallel code (for sure I'm not one of those!) and debugging such code is a nightmare.

Sometimes, however, you may be ok with something less complex: coroutines.

Coroutines are a form of multitasking where each parallel module voluntarily yelds to other modules whenever they think it's appropriate. This is often described as "cooperative multitasking" as opposed to the "preemptive multitasking" that is used by processes and threads.

With preemptive multitasking, a scheduler decides wich process/thread to suspend and which one to run. Quickly switching between processes (or threads) gives us the impression of multitasking.

A tipical example used for explaining coroutines is the producer/consumer scenario. Let's use it as running example: we'll have two functions:

  • prodrand(), that produces random numbers and store them in a buffer
  • consprime(), that read from the buffer, discards the non prime numbers and uses the primes for its own purpose.

Since multitasking is expensive and, possibly, complicated let's stay in our comfortable single-tasking world.

We could write the producer so that whenever it has generated enough numbers it will call the consumer:

prodrand(buffer buf) {
      ...
      while (1) {
        n=rnd();
        bufferadd(buf,n);
        if (bufferlen(buf) == 10)
           consprime(buf);
      }
      ...
   }

Or we could write the consumer so that whenever the buffer is empty it will call the producer:

consprime(buffer buf) {
  ...
  while(1) {
    while (bufferlen(buf) > 0)  {
    ... // do stuff with prime number
    }
    prodrand(buf);
  }
  ...
}

In both cases, we coupled the two function: one must be aware of the other as they are in a caller/callee relationship. This is a point of inflexibility. In the latter one, for example, you have to change the code if you want to have multiple producers.

If you could use coroutines, your code would have been much cleaner as you could write the two functions in a way that they do not depend on each other:

prodrand(buffer buf) {
   ...
   while (1) {
     n=rnd();
     bufferadd(buf,n);
     if (bufferlen(buf) == 10)
       yeld();
   }
   ...
}

consprime(buffer buf) {
  ...
  while(1) {
    while (bufferlen(buf) > 0)  {
      ...
    }
    yeld();
  }
  ...
}

... // Somewhere else in the code
  while(1) {
    prodrand(buf);
    consprime(buf);
  }
...

The trick is in the yeld() function. It suspends the function and returns back to the caller. The next time the function will be called, the execution will restart from the instruction below yeld(). The role of the scheduler is taken by a simple loop that calls the coroutines. If, say, you want to introduce another producer that takes numbers from a file you don't have to change anything except adding the new producer to the loop:

... // Somewhere else in the code
  while(1) {
    prodrand(buf);
    prodfile(buf, infile);
    consprime(buf);
  }
...

Here is another classica example for an iterator if we had coroutines:

nextindex(int *ndx)
{
  int n = 0;
  while (1) ;
    *ndx = n++;
    yeld();
  }
}

main() {
  int x; 
  ...
  for (int k =0; k<10; k++) {
   nextindex(&x);
   printf("index: %d\n",x);
  }
}

In this example, the problem of using C functions as coroutines is fully visible: if nextindex() returns (by yelding), the variable n will be destroyed as the function stack will be released! We could have n declared as static but this can't be a general solution.

The other major problem is how to implement yeld() so that the execution can be resumed at the next invocation.

Luckily for us, there are solutions to these two problems and many coroutine libraries exist for us C programmers. They can be categorized into two groups:

  • those using Posix functions to create and save function stacks (like getcontext() and setcontext()) possibly with Assembler code to cover specific platforms.
  • those using the Duff device, a way of mixing the switch and while statements that many define being a product of the Devil himself.

The Wikipedia page on Coroutines has a good list of implementation.

As it is often the case, I found them too complex and somehow bloated (i.e. I didn't need all the nice features they offered) and so I wrote an implementation of coroutines myself. I call them
bees. Here is an example:

// The variables followed by semicolon are preserved between calls

beedef(prodrand, buffer buf;) 
{
  int n;
  while (1) {
    if (bufferlen(buf) >= 10)
      beeyeld;
    bufferadd(buffer, rand());
  }
  beereturn;
}

beedef(consprime, buffer buf;)
{
  int n;
  while(1) {
    while (bufferlen(buf) > 0)  {
      n = bufferpop(buf);
      if (isprime(n)) process(n);
    }
    beeyeld;
  }
  beereturn;
}

int main(int argc, char *argv[])
{
  prodrand producer = beenew(prodrand);
  consprime consumer = beenew(consprime);
  buffer buf;

  buf = buffernew();

  // producer and consumer share the same buffer.
  producer->buf = buf;
  consumer->buf = buf;

  while ( beefly(producer) || beefly(consumer));
}

I'm still working on it, in the next article I'll provide more details on those bees.

15