How to "multithread" C code

If the task is highly parallelizable and your compiler is modern, you could try OpenMP. http://en.wikipedia.org/wiki/OpenMP


One alternative to multithread your code would be using pthreads ( provides more precise control than OpenMP ).

Assuming x, y & result are global variable arrays,

#include <pthread.h>

...

void *get_result(void *param)  // param is a dummy pointer
{
...
}

int main()
{
...
pthread_t *tid = malloc( ntimes * sizeof(pthread_t) );

for( i=0; i<ntimes; i++ ) 
    pthread_create( &tid[i], NULL, get_result, NULL );

... // do some tasks unrelated to result    

for( i=0; i<ntimes; i++ ) 
    pthread_join( tid[i], NULL );
...
}

(Compile your code with gcc prog.c -lpthread)


You should have a look at openMP for this. The C/C++ example on this page is similar to your code: https://computing.llnl.gov/tutorials/openMP/#SECTIONS

#include <omp.h>
#define N     1000

main ()
{

int i;
float a[N], b[N], c[N], d[N];

/* Some initializations */
for (i=0; i < N; i++) {
  a[i] = i * 1.5;
  b[i] = i + 22.35;
  }

#pragma omp parallel shared(a,b,c,d) private(i)
  {

  #pragma omp sections nowait
    {

    #pragma omp section
    for (i=0; i < N; i++)
      c[i] = a[i] + b[i];

    #pragma omp section
    for (i=0; i < N; i++)
      d[i] = a[i] * b[i];

    }  /* end of sections */

  }  /* end of parallel section */

}

If you prefer not to use openMP you could use either pthreads or clone/wait directly.

No matter which route you choose you are just dividing up your arrays into chunks which each thread will process. If all of your processing is purely computational (as suggested by your example function) then you should do well to have only as many threads as you have logical processors.

There is some overhead with adding threads to do parallel processing, so make sure that you give each thread enough work to make up for it. Usually you will, but if each thread only ends up with 1 computation to do and the computations aren't that difficult to do then you may actually slow things down. You can always have fewer threads than you have processors if that is the case.

If you do have some IO going on in your work then you may find that having more threads than processors is a win because while one thread may be blocking waiting for some IO to complete another thread can be doing its computations. You have to be careful doing IO to the same file in threads, though.


If you are hoping to provide concurrency for a single loop for some kind of scientific computing or similar, OpenMP as @Novikov says really is your best bet; this is what it was designed for.

If you're looking to learn the more classical approach that you would more typically see in an application written in C... On POSIX you want pthread_create() et al. I'm not sure what your background might be with concurrency in other languages, but before going too deeply into that, you will want to know your synchronization primitives (mutexes, semaphores, etc.) fairly well, as well as understanding when you will need to use them. That topic could be a whole book or set of SO questions unto itself.


Depending on the OS, you could use posix threads. You could instead implement stack-less multithreading using state machines. There is a really good book entitled "embedded multitasking" by Keith E. Curtis. It's just a neatly crafted set of switch case statements. Works great, I've used it on everything from apple macs, rabbit semiconductor, AVR, PC.

Vali