awk
Process and transform structured text data — extract columns, filter rows, and compute aggregates from log files and CSV data with awk.
- awk
- What awk Does and When to Use It
- How to Install awk
- Core Concepts of awk
- awk Fields and Delimiters
- awk Patterns and Actions
- awk Built-In Variables
- Common Tasks with awk
- How to Print a Specific Column with awk
- How to Filter Lines by Pattern with awk
- How to Sum a Column with awk
- How to Change the Output Delimiter with awk
- awk Troubleshooting
- Related Tools and Guides
awk
awk is a text-processing language and command-line tool that scans input line by line, splits each line into fields, and applies pattern-action rules to extract, transform, and report structured data on Linux, macOS, and other Unix-like systems.
What awk Does and When to Use It
awk reads input one record (line) at a time, splits each record into fields by a delimiter (whitespace by default), and executes user-defined rules against those fields. Each rule consists of a pattern (when to act) and an action (what to do). System administrators use awk to extract columns from log files, calculate sums and averages, reformat CSV data, and build quick reports from command output.
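As a minimal sketch of that record, field, and rule model, the following feeds two invented lines to a single rule. The names, departments, and hour counts are illustrative assumptions, not data from any real system.

```shell
# Invented sample input: name, department, weekly hours.
sample='alice eng 40
bob ops 35'

# Pattern: third field greater than 36. Action: print the first field.
result=$(printf '%s\n' "$sample" | awk '$3 > 36 { print $1 }')
echo "$result"
```

Only the first record satisfies the pattern, so only its first field is printed.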
awk is a poor fit for general-purpose scripting. For tasks involving complex data structures, error handling, or external API calls, use Python or Perl. awk excels at short text transformations, often one-liners, where performance and brevity matter.
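Three implementations are in common use: the original awk (the "one true awk"), gawk (GNU awk, the default on many Linux distributions), and mawk (a fast, minimal implementation, the default on Debian and Ubuntu). gawk adds extensions such as network I/O and true multidimensional arrays (arrays of arrays). POSIX character classes such as [[:alpha:]] are part of standard awk rather than a gawk extension, although some older mawk releases do not support them. For official documentation, see man awk or gnu.org/software/gawk/manual/.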
How to Install awk
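awk ships pre-installed on nearly all Linux distributions and on macOS, although minimal container images may omit it. Ubuntu and Debian install mawk by default; install gawk for GNU extensions:
sudo apt install gawk
Verify the installed version:
awk --version
Older mawk releases may reject --version; use awk -W version there instead.
Core Concepts of awk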
awk Fields and Delimiters
awk splits each input line into fields labeled $1, $2, $3, and so on. $0 represents the entire line. The default field separator is whitespace (any run of spaces or tabs). Use -F to set a custom delimiter: for example, -F: for /etc/passwd or -F, for simple CSV files that have no quoted commas.
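The effect of the field separator can be sketched with a single made-up record in the style of /etc/passwd. The account name and paths are hypothetical. NF, used in the second command, is awk's count of fields in the current line.

```shell
# A made-up colon-delimited record in the style of /etc/passwd.
line='deploy:x:1001:1001:Deploy User:/home/deploy:/bin/bash'

# With -F: each colon-separated piece becomes a field.
first_and_home=$(printf '%s\n' "$line" | awk -F: '{ print $1, $6 }')
echo "$first_and_home"      # deploy /home/deploy

# Without -F the default whitespace splitting applies, so the line
# breaks at the space inside "Deploy User" instead of at the colons.
default_count=$(printf '%s\n' "$line" | awk '{ print NF }')
echo "$default_count"       # 2
```

A field count that is lower than expected, as in the second command, is usually the first sign that the separator does not match the data.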
awk Patterns and Actions
An awk program consists of pattern { action } rules. The pattern determines which lines to process; the action defines what to do with matching lines. Omitting the pattern processes every line. Omitting the action prints the matching line. The special patterns BEGIN and END run before the first line is read and after the last line is processed, respectively.
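The three kinds of rule can be seen together in one small program. This is a sketch over invented log lines; the message text and the counter name n are assumptions chosen for illustration.

```shell
# Invented sample data: one log-like line per record.
log='INFO service started
ERROR disk full
INFO request served
ERROR timeout'

report=$(printf '%s\n' "$log" | awk '
    BEGIN   { print "errors:" }        # runs once, before any input
    /ERROR/ { print "  " $0; n++ }     # runs for each line matching the regex
    END     { print "count: " n }      # runs once, after all input
')
echo "$report"
```

The BEGIN line is printed first, the two ERROR lines follow indented, and the END rule reports a count of 2.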
awk Built-In Variables
awk provides built-in variables for program logic: NR (current record/line number), NF (number of fields in the current line), FS (input field separator), OFS (output field separator), RS (record separator), and FILENAME (current input filename).
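A short sketch shows several of these variables at work. The comma-separated input is invented, and the " | " output separator is an arbitrary choice made to keep the columns easy to read.

```shell
# Invented comma-separated input; the second line has fewer fields.
data='a,b,c
d,e
f,g,h'

# FS and OFS are set once in BEGIN; NR and NF change with every record.
# $NF is the last field of the current line.
summary=$(printf '%s\n' "$data" | awk 'BEGIN { FS = ","; OFS = " | " } { print NR, NF, $NF }')
echo "$summary"
```

Each output line holds the record number, the field count, and the last field, for example 2 | 2 | e for the second record.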
Common Tasks with awk
How to Print a Specific Column with awk
awk extracts a single column from structured text. Print the first column of command output:
df -h | awk '{print $1}'
Print the username field from /etc/passwd (colon-delimited):
awk -F: '{print $1}' /etc/passwd
How to Filter Lines by Pattern with awk
awk prints only the lines that satisfy a pattern, which can be a field comparison or a regular expression. Filter Nginx access log entries returning a 500 status code (assuming status is field 9):
awk '$9 == 500' /var/log/nginx/access.log
Match lines containing a string:
awk '/error/' /var/log/syslog
How to Sum a Column with awk
awk computes column totals using a variable accumulator and the END block:
awk '{sum += $5} END {print "Total:", sum}' data.txt
How to Change the Output Delimiter with awk
awk reformats output by setting the output field separator (OFS). Print the username, UID, and home directory from /etc/passwd as tab-separated columns:
awk -F: 'BEGIN {OFS="\t"} {print $1, $3, $6}' /etc/passwd
awk Troubleshooting
| Error / Symptom | Cause | Fix |
|---|---|---|
| awk: fatal: cannot open file | File path is incorrect or file does not exist | → Full article |
| Empty or incorrect output | Wrong field separator; fields are not where expected | → Full article |
| awk: syntax error at source line 1 | Unbalanced braces, missing quotes, or using double quotes around the program on the shell | → Full article |
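For the "empty or incorrect output" case, one way to check the separator is to print NF alongside the field you expected. The following is a sketch over invented comma-separated rows; the host names and numbers are hypothetical.

```shell
# Invented comma-separated sample standing in for a misbehaving input file.
rows='web01,200,512
web02,500,128'

# Default whitespace splitting sees one field per line, so $2 is empty.
broken=$(printf '%s\n' "$rows" | awk '{ print NF ": [" $2 "]" }')
echo "$broken"

# With the matching separator, the same check shows three fields.
fixed=$(printf '%s\n' "$rows" | awk -F, '{ print NF ": [" $2 "]" }')
echo "$fixed"
```

An NF of 1 together with empty brackets points to a separator mismatch rather than a problem in the rule itself.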
Related Tools and Guides
grep filters lines by pattern but cannot extract or rearrange fields. Use grep for simple matching and awk for field-level processing. See the grep article.
sed performs line-level substitutions and transformations. awk handles column-based processing that is awkward to express in sed. See the sed article.