Parallelism with xargs -P and Background Jobs
Run independent tasks concurrently using xargs parallel mode and managed background job pools.
Why Sequential Is Slow
When a shell script runs commands one after another, only one CPU core is busy at a time. Consider resizing 500 images on an eight-core machine: each convert call uses one core while the other seven sit idle.
Parallelism fixes this by dispatching multiple tasks simultaneously. Two tools available to any Bash script make this easy:
- xargs -P — fan out a list of inputs across N parallel worker processes
- Background jobs (&) + wait — manually spawn processes and collect them
This lesson covers both approaches so you can choose the right tool for each situation.
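As a rough preview of the two approaches listed above, the sketch below runs the same stand-in task both ways. The task (a one-second sleep followed by an echo), the input names a to d, and the temporary directory are assumptions made for illustration only; -P is supported by GNU and BSD xargs but is not part of POSIX.

```shell
#!/usr/bin/env bash
# Illustrative sketch: "sleep 1; echo" stands in for real work.

# 1) xargs -P: fan four inputs out across up to 4 worker processes.
#    Completion order is not guaranteed, so sort the output for display.
xargs_out=$(printf '%s\n' a b c d \
  | xargs -P 4 -I {} sh -c 'sleep 1; echo "done $1"' _ {} \
  | sort)
echo "$xargs_out"

# 2) Background jobs + wait: start each task with &, then block until all finish.
tmp=$(mktemp -d)
for name in a b c d; do
    sh -c 'sleep 1; echo "done $1"' _ "$name" > "$tmp/$name.out" &
done
wait                       # returns once every background job has exited
jobs_out=$(cat "$tmp"/*.out)
echo "$jobs_out"
rm -rf "$tmp"
```

Each half takes about one second of wall-clock time instead of four, because the four sleeps overlap. The xargs form caps concurrency with a single flag, while the background-job form gives full control over how each process is started and where its output goes.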
xargs Basics Refresher
Before adding parallelism, recall how xargs works. It reads items from stdin and passes them as arguments to a command.
The -I {} flag lets you place the input item anywhere in the command string, not just at the end.
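The difference between the default behavior and -I {} can be seen in a minimal sketch like the following. The input words and the echo commands are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# Default: xargs appends all input items to the end of a single command line.
appended=$(printf '%s\n' one two three | xargs echo item:)
echo "$appended"
# → item: one two three

# With -I {}: the command runs once per input line, and {} is replaced
# wherever it appears in the arguments.
placed=$(printf '%s\n' one two three | xargs -I {} echo "<{}> processed")
echo "$placed"
# → <one> processed
# → <two> processed
# → <three> processed
```

Note that -I {} changes the invocation count as well as the placement: one command per input line rather than one command for the whole batch.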
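The example below prints the contents of every .txt file in uppercase using tr; the files themselves are not modified. Because -I {} runs the command once per input line, the files are processed one at a time (the sequential baseline).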
#!/usr/bin/env bash
# Create sample files
mkdir -p /tmp/xargs_demo
for i in 1 2 3; do
    echo "hello world $i" > "/tmp/xargs_demo/file$i.txt"
done
# Process files one at a time (sequential)
find /tmp/xargs_demo -name '*.txt' | xargs -I {} sh -c 'tr a-z A-Z < "$1"' _ {}
# Cleanup
rm -rf /tmp/xargs_demo