Text Processing and File Operations: Lesson 4.1

grep, sed, and awk - which tool to use when

grep pattern matching, sed stream editing, awk field processing, when to use each tool, extended regex, in-place editing with sed, awk field separator, combining tools in pipelines

Three Tools, Three Jobs

These three tools solve different problems. Pick the right one and your pipelines stay readable.

grep: filter lines matching a pattern
sed: substitute, delete, or insert text in a stream
awk: process fields and compute per row

# grep: find lines containing ERROR
grep "ERROR" app.log
grep -E "ERROR|WARN" app.log  # extended regex
grep -v "DEBUG" app.log       # invert - exclude DEBUG
grep -c "404" access.log      # count matching lines

# sed: substitute text
sed 's/foo/bar/' file.txt          # replace first occurrence per line
sed 's/foo/bar/g' file.txt         # replace all occurrences
sed -i 's/localhost/db.prod/g' config.env  # in-place edit (GNU sed; BSD/macOS sed needs -i '')
sed '/^#/d' config.txt             # delete lines that start with #

# awk: process fields (default separator: whitespace)
awk '{print $1, $3}' data.txt      # print columns 1 and 3
awk -F',' '{print $2}' data.csv    # simple CSV: comma as separator (no quoted commas)
awk '$3 > 100 {print $0}' data.txt # filter rows where field 3 > 100
awk '{sum += $1} END {print sum}' numbers.txt  # sum a column
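
In-place edits overwrite the file, so it can be worth keeping a backup. The sketch below is a cautious variant of the sed -i example above: attaching a suffix to -i (here .bak) asks sed to save the original under that name, and this form is accepted by both GNU and BSD sed. The temporary directory and the contents of config.env are made up for illustration.

```shell
# Hypothetical config file, created in a temp directory so nothing real is touched.
tmpdir=$(mktemp -d)
printf 'DB_HOST=localhost\nCACHE_HOST=localhost\n' > "$tmpdir/config.env"

# -i.bak (suffix attached, no space) edits the file in place
# and keeps the untouched original as config.env.bak.
sed -i.bak 's/localhost/db.prod/g' "$tmpdir/config.env"

edited=$(cat "$tmpdir/config.env")      # contents after the substitution
backup=$(cat "$tmpdir/config.env.bak")  # original contents, preserved by sed
echo "$edited"

rm -rf "$tmpdir"
```

If the substitution turns out to be wrong, the .bak file can be copied back over the edited one.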

Rule of thumb: if you need to filter lines, use grep. If you need to substitute, delete, or insert text (in a stream, or in place with -i), use sed. If you need to compute over or reformat fields in structured data, use awk. Don't use sed to do awk's job.
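
The three tools also combine well in a single pipeline, each doing the job it is best at. The following is a minimal sketch, not a recommended log format: the sample app.log and its column layout (date, time, level, status code, duration in ms) are assumptions made up for the example.

```shell
# Hypothetical sample log; the column layout is an assumption for this example.
tmpdir=$(mktemp -d)
cat > "$tmpdir/app.log" <<'EOF'
2024-01-01 10:00:01 ERROR 500 120
2024-01-01 10:00:02 DEBUG 200 15
2024-01-01 10:00:03 WARN 404 30
2024-01-01 10:00:04 ERROR 500 250
2024-01-01 10:00:05 INFO 200 10
EOF

# grep filters lines, sed rewrites text, awk aggregates per field.
summary=$(grep -E "ERROR|WARN" "$tmpdir/app.log" \
  | sed 's/WARN/WARNING/' \
  | awk '{count[$3]++; total[$3] += $5}
         END {for (lvl in count) print lvl, count[lvl], total[lvl]}' \
  | sort)

# One line per level: level, number of lines, summed duration.
echo "$summary"

rm -rf "$tmpdir"
```

The trailing sort is there because awk's for-in loop does not guarantee any particular order of keys.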