GNU parallel

GNU parallel is a command-line tool that runs jobs in parallel: it takes work that would otherwise run one item at a time in a shell loop and spreads it across the available CPU cores, or even across several computers. It was written by Ole Tange in Perl and is distributed by the GNU Project under the GNU General Public License, version 3 or later (https://www.gnu.org/software/parallel/). A typical use is to feed it a list of inputs, a filename per line for example, and a command template; parallel then runs one copy of the command per input, keeping up to a chosen number of jobs running at once (by default, about one per CPU core).
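
As a minimal sketch of that typical use, the two shell functions below pass a list of inputs and a command template to parallel. The function names and file names are hypothetical, chosen only for illustration; {} is parallel's placeholder for the current input, -j sets the number of simultaneous jobs, ::: supplies inputs on the command line instead of standard input, and -k (--keep-order) is added so that the printed order is predictable.

```shell
# process_list and process_args are hypothetical helper names for this sketch.

# Inputs arrive one per line on standard input; {} is replaced by each line.
# -j 2 runs at most two jobs at once; -k prints results in input order.
process_list() {
  printf '%s\n' alpha.txt beta.txt gamma.txt |
    parallel -j 2 -k echo "processing {}"
}

# The same idea with inputs given after ::: instead of on standard input.
process_args() {
  parallel -j 2 -k echo "processing {}" ::: "$@"
}
```

Calling process_list prints "processing alpha.txt", "processing beta.txt", and "processing gamma.txt" on separate lines. In real use, echo would be replaced by the command that does the work.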

The tool is deliberately built to slot into existing Unix pipelines. The GNU parallel manual presents it as a drop-in replacement for the common patterns it speeds up, describing it as “a shell tool for executing jobs in parallel using one or more computers” and noting that it “can often be used as a substitute for xargs or cat | bash” (https://www.gnu.org/software/parallel/man.html). Where xargs by default packs many arguments onto each command invocation, parallel by default runs a separate job per argument and runs those jobs concurrently, which is usually what is wanted when the work items are independent.
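
The difference from xargs can be seen in a small sketch that sends the same three inputs to each tool. The function names are hypothetical, and the comparison assumes the default behavior of both tools with no batching options given.

```shell
# batched and per_item are hypothetical helper names for this sketch.

# xargs packs the inputs onto one command line by default,
# so echo runs once and prints a single line: "a b c".
batched() {
  printf '%s\n' a b c | xargs echo
}

# GNU parallel starts one job per input line by default, so echo runs
# three times and prints three lines; -k keeps them in input order.
per_item() {
  printf '%s\n' a b c | parallel -k echo
}
```

When the command template contains no {} placeholder, as in per_item, parallel appends the input to the end of the command.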

A defining design choice is that parallel keeps each job’s output grouped and can preserve its order. Running commands concurrently normally interleaves their output into an unreadable mess, but the GNU parallel manual states that it “makes sure output from the commands is the same output as you would get had you run the commands sequentially”, which makes the output safe to pipe into another program. By default parallel buffers each job’s output and emits it as one clean block when the job finishes; with the --keep-order (-k) option the blocks are also printed in the order of the inputs, so a parallelized pipeline produces the same result as a serial one, just faster. This is what lets it be a transparent accelerator rather than a tool you have to design your scripts around.
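
The grouping and ordering behavior can be illustrated with jobs that deliberately finish in reverse order. This is a sketch under stated assumptions: the function names are hypothetical, and the second function relies on the -k (--keep-order) option to make the output order follow the input order.

```shell
# completion_order and input_order are hypothetical helper names.
# Each job prints two lines and sleeps in between; the job for input 2
# finishes last and the job for input 0 finishes first.

# Default behavior: each job's two lines stay together as one block,
# and blocks are printed as the jobs complete.
completion_order() {
  parallel -j 3 'echo start {}; sleep {}; echo end {}' ::: 2 1 0
}

# With -k, the blocks are printed in the order of the inputs (2, 1, 0),
# even though the jobs finish in the opposite order.
input_order() {
  parallel -j 3 -k 'echo start {}; sleep {}; echo end {}' ::: 2 1 0
}
```

In both cases a "start" line is always followed directly by its matching "end" line, because each job's output is buffered until the job is done. Only input_order guarantees that the block for input 2 comes first.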

Beyond a single machine, parallel can distribute jobs to remote hosts over SSH, transferring input files, running the command there, and bringing results back. Combined with options for the number of simultaneous jobs (-j), retries on failure (--retries), progress display (--progress), and a dry-run mode (--dry-run) that prints the commands without executing them, this turns a one-line shell command into a small distributed batch system. The project ships extensive tutorial, how-to, and reference material, reflecting Tange’s emphasis on documentation as part of the tool.
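
A hedged sketch of these options follows. The function names, file names, and host names (server1.example.com, server2.example.com) are hypothetical; the remote example assumes passwordless SSH access to those hosts and is only defined here, not run. It uses -S to list the hosts and --trc, which transfers each input file, returns the named result file, and cleans up the remote copies afterwards.

```shell
# preview and remote_gzip are hypothetical helper names for this sketch.

# --dry-run prints the commands that would be run without executing them,
# which is a cheap way to check a command template before a long batch.
preview() {
  parallel --dry-run -k gzip {} ::: a.log b.log
}

# Compress the given files on two hypothetical remote hosts.
# --retries 3 reruns a failing job up to three times;
# --trc {}.gz copies each file over, fetches the resulting .gz file,
# and removes the remote copies afterwards.
remote_gzip() {
  parallel -S server1.example.com,server2.example.com \
    --retries 3 --trc {}.gz gzip {} ::: "$@"
}
```

Calling preview prints "gzip a.log" and "gzip b.log" on separate lines and touches no files.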

GNU parallel sits squarely in the Unix tradition of small composable programs that read lines and write lines. It does not replace a shell or a job scheduler; it addresses one specific, very common need (doing the same operation to many inputs) by making that operation concurrent without changing how the surrounding pipeline is written. For data processing, media conversion, log crunching, and similar embarrassingly parallel chores, it can cut a multi-hour serial run substantially. For CPU-bound jobs the running time can approach the serial time divided by the number of cores available, while jobs limited by disk, network, or memory bandwidth gain less.