Script Valley
Bash Scripting for Developers
Process Management and Automation, Lesson 5.5

Bash parallel processing with xargs and GNU parallel

xargs -P for parallelism, GNU parallel basics, parallel vs xargs, controlling job count, parallel with progress, collecting output, handling failures in parallel, rate limiting

Parallelism at Scale

Parallel processing fan-out pattern

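Background jobs with & work well for a small, fixed set of tasks. For dynamic or long lists of items, xargs -P and GNU parallel are a better fit because they cap the number of jobs running at once and start the next item as soon as a slot frees up.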
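
To make the fixed-set case concrete, here is a minimal sketch of manual job control with & and wait. The deploy_to_region function and the region names are hypothetical stand-ins for real work, and the "bad-region" failure is simulated.

```shell
#!/usr/bin/env bash
# Hypothetical task: sleeps briefly and fails only for "bad-region".
deploy_to_region() { sleep 0.1; [ "$1" != "bad-region" ]; }

pids=()
for region in us-east eu-west bad-region; do
  deploy_to_region "$region" &   # start the task in the background
  pids+=("$!")                   # remember its PID so we can wait on it
done

failed=0
for pid in "${pids[@]}"; do
  # wait returns the exit status of that specific background job
  wait "$pid" || failed=$((failed + 1))
done
echo "failed jobs: $failed"
```

This starts every task at once, which is fine for three regions but gives no limit on concurrency for a list of unknown length.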

# xargs -P: run up to N jobs in parallel
# -print0 / -0 keep file names with spaces or newlines intact
find /data -name "*.gz" -print0 | xargs -0 -P 8 -n 1 gunzip

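# Process 4 files at a time
# The file name is passed as $1 rather than spliced into the script text,
# so quotes or $ in a name cannot break or inject into the command.
# If process_file is a shell function, run `export -f process_file` first.
printf '%s\0' *.csv | xargs -0 -P 4 -I{} bash -c '
  echo "Processing: $1"
  process_file "$1" > "processed/$1.out"
' _ {}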
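
The pattern above can be tried end to end with a small self-contained sketch. It assumes bash and an xargs that supports -0 and -P (GNU and BSD versions do); the process_file body, the temporary directory layout, and the sample files are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical worker: upper-cases one file into an output directory.
process_file() {
  local src=$1 out_dir=$2
  tr '[:lower:]' '[:upper:]' < "$src" > "$out_dir/$(basename "$src").out"
}
export -f process_file   # make the function visible to child bash processes

work_dir=$(mktemp -d)
mkdir -p "$work_dir/in" "$work_dir/processed"
printf 'alpha\n' > "$work_dir/in/a.csv"
printf 'beta\n'  > "$work_dir/in/b c.csv"   # space in the name on purpose

# -print0 / -0 keep odd file names intact; -n 1 hands one file to each job.
# The output directory is a fixed argument ($1); xargs appends the file ($2).
find "$work_dir/in" -name '*.csv' -print0 |
  xargs -0 -n 1 -P 4 bash -c 'process_file "$2" "$1"' _ "$work_dir/processed"

ls "$work_dir/processed"
```

Without export -f, each bash -c child would report that process_file is not found, because shell functions are not inherited by new processes by default.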

GNU parallel

# Install: apt install parallel OR brew install parallel

# Process all CSV files with 8 workers
# (if process_file is a bash function, export it first: export -f process_file)
parallel -j 8 process_file ::: *.csv

# With progress bar
parallel --progress -j 4 compress_image ::: images/*.png

# Run one job per argument listed after :::
parallel -j 4 deploy_to_region ::: us-east eu-west ap-south

# From a file of inputs
parallel -j 8 -a servers.txt ping -c 1 {}

Collecting Results

# xargs jobs share one stdout: pipe or redirect it to capture the results
find . -name '*.log' -print0 | xargs -0 -P 4 -n 1 grep -l 'ERROR' \
  | sort > error_files.txt

# GNU parallel can tag output by job
parallel --tag -j 4 wc -l ::: *.txt
# Each output line is prefixed with the argument and a tab: "file.txt\t42 file.txt"
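
Plain xargs has no equivalent of --tag, but a similar effect can be approximated by having each job print its own label. The following is a rough sketch under that assumption; the temporary files and their contents are invented for the example.

```shell
#!/usr/bin/env bash
work_dir=$(mktemp -d)
printf 'a\nb\n' > "$work_dir/one.txt"
printf 'c\n'    > "$work_dir/two.txt"

# Each job emits a single "name<TAB>count" line, so lines from different
# jobs stay whole even though the jobs run concurrently.
results=$(
  find "$work_dir" -name '*.txt' -print0 |
    xargs -0 -n 1 -P 4 sh -c '
      lines=$(wc -l < "$1" | tr -d " ")
      printf "%s\t%s\n" "$(basename "$1")" "$lines"
    ' _ |
    sort   # completion order is not deterministic, so sort for stable output
)
printf '%s\n' "$results"
```

Keeping each job's output to one short line is what makes this safe; jobs that print many lines are better redirected to per-job files and combined afterwards.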

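GNU parallel handles edge cases better than manual & loops: it manages job slots automatically, can retry failed jobs (--retries N during a run, or --retry-failed together with --joblog to re-run the failures recorded by an earlier run), and keeps each job's output grouped rather than interleaved. For more than about 10 parallel tasks, prefer it over handwritten job control.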
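
For comparison, with plain xargs failure handling has to be built by hand. A minimal sketch, assuming a made-up job that fails for the input "bad": each job records its own failure in a log file, and the script checks the exit status of xargs, which is non-zero when any job failed.

```shell
#!/usr/bin/env bash
fail_log=$(mktemp)
status=0

# Hypothetical job: succeeds unless the input item is "bad".
# The log path is a fixed argument ($1); xargs appends the item ($2).
printf '%s\n' ok1 bad ok2 |
  xargs -n 1 -P 3 sh -c '
    [ "$2" != bad ] || { echo "$2" >> "$1"; exit 1; }
  ' _ "$fail_log" || status=$?

# GNU xargs reports 123 when a job exits with 1-125; other
# implementations may use a different non-zero value.
echo "xargs exit status: $status"
echo "failed items: $(cat "$fail_log")"
```

The log of failed items can then be fed back into a second xargs run as a simple retry pass.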
