GNU parallel is a shell tool for executing jobs in parallel locally or using remote computers. A job is typically a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. If you use xargs today you will find GNU parallel very easy to use, as GNU parallel is written to have the same options as xargs. If you write loops in shell, you will find GNU parallel may be able to replace most of the loops and make them run faster by running several jobs in parallel. GNU parallel makes sure output from the commands is the same output as you would get had you run the commands sequentially. This makes it possible to use output from GNU parallel as input for other programs.
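The idea of running one job per input line, several at a time, while keeping each job's output intact can be sketched in a few lines of Python. This is only an illustration of the concept, not how GNU parallel is implemented. The function name `run_per_line` and the upper-casing job are hypothetical, chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess
import sys


def run_per_line(lines, command, max_workers=4):
    """Run `command + [line]` once per input line, several at a time.

    Returns each job's captured stdout as a list in input order, so the
    output of different jobs is never mixed together.
    """
    def job(line):
        done = subprocess.run(
            command + [line], capture_output=True, text=True, check=True
        )
        return done.stdout

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the order of `lines`, not completion order.
        return list(pool.map(job, lines))


if __name__ == "__main__":
    # Hypothetical job: a one-liner that upper-cases its single argument.
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1].upper())"]
    for out in run_per_line(["alpha", "beta", "gamma"], cmd):
        print(out, end="")
```

Because every job's output is captured whole and returned in input order, the combined result can be piped to another program just as if the jobs had run one after another.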
Concurrency Kit provides a plethora of concurrency primitives and lock-less and lock-free data structures designed to aid in the design and implementation of high-performance, scalable concurrent systems. It was designed to minimize dependencies on operating system-specific interfaces, and most of the interface relies only on a strict subset of the standard library and the more popular compiler extensions.
Likwid is a set of easy-to-use command line tools for Linux. It supports programmers in developing high-performance multi-threaded programs. "Likwid" stands for "Like I knew what I am doing". It contains the following tools: likwid-topology, which shows thread and cache topology; likwid-perfctr, which measures hardware performance counters on Intel and AMD processors; likwid-features, which shows and toggles hardware prefetch control bits on Intel Core 2 processors; likwid-pin, which pins a threaded application without touching its code (it supports pthreads, Intel OpenMP, and gcc OpenMP); likwid-powermeter, which prints the Turbo mode steps and measures energy consumption on supported Intel processors; and likwid-bench, a low-level benchmarking framework. It works with any standard Linux kernel. Likwid is lightweight and adds no overhead during measurements.
prll is a utility for parallelizing the execution of shell functions. It provides a convenient interface for parallelizing the execution of a single task over multiple data files or any other kind of data that you can pass as a shell function argument. It is meant to make it simple to fully utilize a multicore/multiprocessor machine. prll is designed to be used not just in shell scripts, but also in interactive shells. To make the latter convenient, it is implemented as a shell function. Shells are not very good at automatic job management, so prll uses helper programs written in C. To prevent race conditions, System V message queues are used to signal job completion. Standard output is buffered, and semaphores are used to prevent interleaving.
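The combination described above (buffered output, a semaphore around writes, and a queue that signals job completion) can be illustrated with an in-process analogy. The sketch below is not prll's code: prll uses C helpers and System V IPC, whereas this example assumes Python threads, `threading.Semaphore`, and `queue.Queue` as stand-ins. The name `run_jobs` and its parameters are hypothetical.

```python
import io
import queue
import threading


def run_jobs(func, args, sink, max_parallel=2):
    """Run func(arg, out) for each arg, at most max_parallel at once.

    Each job writes to a private buffer. The buffer is copied to `sink`
    in one piece while holding a semaphore, so lines from different jobs
    never interleave. Returns job indices in completion order.
    """
    write_lock = threading.Semaphore(1)        # guards the shared sink
    slots = threading.Semaphore(max_parallel)  # limits concurrently running jobs
    done = queue.Queue()                       # completion messages

    def worker(index, arg):
        buf = io.StringIO()
        try:
            func(arg, buf)                     # job sees only its own buffer
            with write_lock:
                sink.write(buf.getvalue())     # flush the whole buffer at once
        finally:
            slots.release()                    # free the slot for the next job
            done.put(index)                    # signal completion

    threads = []
    for index, arg in enumerate(args):
        slots.acquire()                        # wait until a slot is free
        t = threading.Thread(target=worker, args=(index, arg))
        t.start()
        threads.append(t)

    finished = [done.get() for _ in threads]   # one message per job
    for t in threads:
        t.join()
    return finished
```

A job here is any callable that takes an argument and a writable text stream, for example a function that writes a start line and an end line for each input file name. The completion queue plays the role the message queue plays in the description: the parent learns that a job has ended by receiving a message rather than by polling.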