© Joaquin Menchaca, 2014
Version 1.5
The AWK tool was introduced in Version 7 Unix and named after the authors: Aho, Weinberger, and Kernighan. AWK provided computational features to the Unix pipeline, and at the time, AWK was the only other scripting language available besides Bourne Shell.
AWK was extremely popular in the 1970s and 1980s. The available shell at the time (Bourne shell) was extremely limited, and AWK provided numerous capabilities absent from Bourne Shell. This included rich text processing capabilities, math functions, and the capability to create arrays and associative arrays (hashes).
AWK was updated in the late 1980s with the release of nawk (New AWK) and gawk (GNU AWK). New AWK is available on SVR Unix versions, while GNU AWK is widely available, especially in open source systems like Linux. In the 1990s, the popularity of Perl caused AWK to be used less for text-processing chores.
GNU Awk continues to be updated. Gawk 3.1.5 added the ability to get the size of an array with length(), which previously worked only on strings. Gawk 4.0 added the switch () { ... } statement.
- GNU Awk: This is not Your Father's Awk: https://www.drdobbs.com/open-source/gnu-awk-this-is-not-your-fathers-awk/240158351
- GNU Awk Manual: https://www.gnu.org/software/gawk/manual/gawk.pdf
Today, AWK is found on many Linux systems. UNIX systems, and systems claiming POSIX compatibility, will likely include awk as part of the tool set required for compliance with IEEE Std 1003.1, 2013 Edition.
Thus, with any UNIX or Linux system, you can expect awk to be available. Windows is another matter, but you can get awk from a number of sources.
On Windows, you can run GNU Awk in various environments, some mimicking a Unix-like environment:
- Cygwin - robust environment that uses a special library to provide Unix compatibility.
- GitBash - Git tools that bundle the MSYS environment (https://www.mingw.org/wiki/msys), which provides the Bash shell and some GNU tools, including GNU Awk.
- GNUWin32 - GNU tools that work directly from the Command Shell.
  - GNUWin32 Gawk - https://gnuwin32.sourceforge.net/packages/gawk.htm
- UWIN - toolset that comes directly from AT&T and provides tools found on SVR4 Unix systems. The tools seem to work only within their own environment, i.e. the login.exe program.
C:\> "C:\Program Files (x86)"\GnuWin32\bin\gawk --version | head -1
GNU Awk 3.1.6
C:\>"C:\Program Files (x86)\Git\bin\gawk.exe" --version | head -1
GNU Awk 3.0.4
OS X 10.8.5 comes with a version of AWK that is normally distributed with BSD flavors of UNIX, so the newer features found in GAWK are not available. You can use a tool like Homebrew to install the latest version of GAWK. Here's a sample run from July 2014:
$ brew install gawk
==> Downloading https://downloads.sf.net/project/machomebrew/Bottles/gawk-4.1.1.mountain_lion.bottle.tar.gz
######################################################################## 100.0%
==> Pouring gawk-4.1.1.mountain_lion.bottle.tar.gz
🍺 /usr/local/Cellar/gawk/4.1.1: 61 files, 2.8M
The default awk that comes with CentOS 6.5 is an older release that predates the Gawk 4.0 series:
$ /bin/awk --version | head -1
GNU Awk 3.1.7
AWK differs from most scripting languages in that an AWK script works as a filter: it is oriented toward receiving text on standard input. AWK scripts are therefore generally not interactive scripts that selectively read lines of input; most are run with input piped or redirected directly into the script. Examples:
some_command | awkscript
awkscript < somefile.txt
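A minimal sketch of the filter model follows. The function name `second_field` and the sample text are made up for illustration; the awk program simply reads whatever arrives on standard input and prints the second whitespace-separated field of each line.

```shell
# Hypothetical helper (not part of the tutorial scripts):
# an awk program used as a filter on standard input.
second_field() {
    awk '{ print $2 }'
}

# Input arrives through a pipe, one record per line.
printf 'alpha beta\ngamma delta\n' | second_field
```

The function behaves the same way when its input is redirected from a file instead of a pipe.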
AWK scripts can do pre-processing before any input is read and post-processing after the input is exhausted, through the BEGIN and END blocks. The tutorial scripts here use BEGIN blocks. They illustrate that AWK is in and of itself a powerful scripting language, with capabilities shared by other modern scripting languages.
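As a rough sketch of how these blocks behave, the two helpers below are illustrative only (the names `begin_only` and `count_lines` are assumptions, not taken from the tutorial scripts). The first uses nothing but a BEGIN block, so it runs without reading any input; the second combines BEGIN, a per-line action, and END.

```shell
# Hypothetical helper: a program with only a BEGIN block needs no input.
begin_only() {
    awk 'BEGIN { print "hello" }'
}

# Hypothetical helper: setup, per-line work, and teardown in one program.
count_lines() {
    awk '
        BEGIN { print "start" }              # runs once, before input is read
              { n++ }                        # runs once for every input line
        END   { print "lines: " (n + 0) }    # runs once, after the last line
    '
}

begin_only
printf 'a\nb\nc\n' | count_lines
```

The `(n + 0)` forces a numeric result, so an empty input reports 0 rather than an empty string.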
Environments will have AWK in /bin/awk, /usr/bin/awk, or both. These scripts expect AWK to be in /bin/awk. If it is installed elsewhere, the workaround, provided you have administrative privileges, is to add a symbolic link.
On Mac OS X 10.8.5, you can do this:
sudo ln -s "$(which awk)" /bin/awk
- 📀 Windows 7 (32-bit)
  - 📦 Gawk 3.0.4 (msysgit 1.9.2-preview20140411)
    - 🪲 length(array) does not work.
  - 📦 Nawk (UWIN 2012-08-06)
    - 🪲 length(array) does not work.
  - 📦 GNUWin32, GNU Awk 3.1.6
    - 🪲 length(array) does not work.
    - 🪲 last file from ls is not $9, but $8, so use $NF instead.
    - 🪲 UNIX date needs to be in the path. Unix date is available with the MSYS environment.
- 📀 OS X 10.8.5
  - 📦 BSD Awk 20070501 (bundled with OS)
    - 🪲 awk is not in /bin/awk, but is found in /usr/bin/awk.
- 📀 CentOS 6.5
  - 📦 GNU Awk 3.1.7
    - 🪲 switch will not work, as it requires GNU Awk 4.0 and above.
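The $NF workaround from the ls note above can be sketched like this. The helper name `last_field` and the two listing lines are invented sample data: one line has nine fields and the other eight, yet $NF picks out the file name in both cases.

```shell
# Hypothetical helper: print the last field of each line, whatever its position.
last_field() {
    awk '{ print $NF }'
}

# Made-up listing lines: the first has 9 fields, the second only 8.
printf '%s\n' \
    '-rw-r--r-- 1 user group 120 Jul 4 2014 notes.txt' \
    '-rw-r--r-- 1 user 120 Jul 4 2014 notes.txt' | last_field
```

This assumes file names without spaces; a name containing a space would be split across several fields, and $NF would return only its final word.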
- 📚 Output
  - 📗 Standard Output [A00]
  - 📕 Standard Error [A11] GAWK only
- 📚 Variables
  - 📗 String Concatenation [B00]
  - 📗 Formatting [B20]
- 📚 Arithmetic
  - 📗 Basic Arithmetic [C00]
  - 📗 Boolean Logic [C10]
  - 📗 Exponential [C20]
  - 📗 Math Function [C30]
- 📚 Input
  - 📗 Reading a Line [D00]
- 📚 Branching
  - 📗 Branch on String [E00]
  - 📗 Ternary [E10]
  - 📗 Branch on Number Range [E20]
  - 📗 Branch on Number [E30]
  - 📕 Multiway Branch on Number [E41] GAWK only
  - 📕 Multiway Branch on String Pattern
    - 📄 Character Sets [E51] GAWK only
    - 📄 POSIX Character Sets [E52] GAWK only
  - 📗 Branch on String Pattern
    - 📄 Character Sets [E60]
    - 📄 POSIX Character Sets [E61]
- 📚 Looping
  - 📕 Collection Loop
    - 📄 Conditional Loop test directory listing [F01]
  - 📗 Count Loop
    - 📄 Loop with for ( ; ; ) [F10]
    - 📄 Loop with while () [F11]
  - 📗 Conditional Loop [F20]
    - 📄 Loop with do...while() [F10]
    - 📄 Loop with while () [F11]
  - 📗 Spin Loop [F30]
  - 📗 Skipping [F40]
- 📚 Arrays
  - 📗 Assign by Index and Length [G00]
    - 📄 helper function to count array items [G00]
    - 📄 Gawk's length() [G01]
  - 📗 Assign by List and Enumeration by Item [G10]
  - 📗 Assign by List and Enumeration by Index [G20]
    - 📄 helper function to count array items [G00]
    - 📄 Gawk's length() [G01]
- 📚 Associative Arrays
  - 📗 Assign by Key [H00]
  - 📗 Assign by List and Appending [H10]
- 📚 Sub-Routines
  - 📗 Creation and Calling [I00]
  - 📗 Global Variable (Scope) [I10]
  - 📗 Local Variable (Scope) [I20]
- 📚 Arguments from the Command Line
  - 📕 Usage Statement (Script Name and Arg Count) [J01] GAWK only
  - 📗 Enumerate Arguments in Order [J00]
  - 📗 Enumerate Arguments in Reverse Order [J10]
- 📚 Parameters
  - 📗 Pass a Single Parameter [K00]
  - 📗 Pass Variable Number of Parameters [K10]
- 📚 Exit
  - 📕 Returning an Exit Status Code [L01] GAWK only
- 📚 Functions [M00]
  - 📗 Return an Integer [M00]
  - 📗 Return a String [M10]
This covers notes regarding each section.
- Output
- Variables
  - output variables using string concatenation with print
  - output variables using string interpolation with printf
- Arithmetic
- Input
- Branch
  - branch on a string with if
  - yes/no - branch on a string with ternary
  - branch on number range
  - branch on an exact number using if
  - multiway branch on an exact number using switch (Gawk 4.0+ only)
  - multiway branch on a pattern using switch (Gawk 4.0+ only)
  - select on pattern using if
- Looping
  - iterative (count) loop
  - conditional loop
  - collection loop
    - OMITTED Awk does not have a true collection loop, so a conditional loop that pulls from standard input was used as an alternative
- Arrays
  - populate array using index
  - populate array using list of items
    - NOTES
      - Awk has no syntax for declaring an array on one line. The split(string, array) function with a space-delimited string will work.
  - enumerate array using collection loop
    - NOTES
      - Awk does not have a collection loop over items, but rather pulls a key from the array
      - Awk does not have real arrays, as indexes are actually strings. The for loop, i.e. for (key in array), can pull indexes (keys) in any order.
  - enumerate array using iterative loop
    - NOTES
      - Awk doesn't support length(array); it is only available in GNU Awk (gawk) 3.1.5 and later.
      - A helper function, array_length(array), was created to support this.
- Associative Arrays
  - Create Associative Array using key index
    - NOTES
      - helper function keys() to return a string of all the keys
      - helper function values() to return a string of all the values
  - Create Associative Array using supplied list of key and value pairs
    - NOTES
      - helper function make_array() to create an associative array from a supplied string
      - helper function merge() to merge two associative arrays
- Subroutines
  - Demonstrate creating and calling a subroutine
    - utilize subroutine that prints the current date in "Month Day, Year" format
  - Demonstrate global variables
    - NOTES All variables in AWK are global variables, unless passed in as parameters
  - Demonstrate local variables
    - NOTES Local variables can be created by listing them in the parameter list
- Arguments
  - demonstrate testing for two arguments
  - print list of all arguments with count
    - NOTES
      - skipping for (key in array), as keys are fetched out of order and ARGV contains the script name.
  - print list of all arguments in reverse with count
- Parameters
  - demonstrate passing 1 parameter
    - utilize subroutine that prints celsius temperature when supplied fahrenheit temperature
  - demonstrate passing unlimited parameters
    - Note AWK does not support variable parameters, so an array is passed instead.
    - utilize subroutine that prints the sum of all numbers passed into it.
- Functions
  - demonstrate returning integer
    - returns summation of all numbers passed into function
  - demonstrate returning string
    - returns capitalized string from lower case string
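Several of the array notes above can be seen together in one short sketch. This is an illustrative guess at how such helpers might look, not the code from the tutorial scripts: the wrapper name `array_demo` and the sample words are assumptions, and `array_length` only borrows the name mentioned in the notes. It populates an array with split(), walks it by numeric index for a predictable order, and counts the elements without relying on length(array), using extra parameters as local variables.

```shell
# Hypothetical wrapper so the awk program can be called by name.
array_demo() {
    awk '
        # Count elements without length(array). The extra parameters
        # "key" and "count" are never passed by callers, so they act
        # as local variables inside the function.
        function array_length(array,    key, count) {
            count = 0
            for (key in array) count++
            return count
        }

        BEGIN {
            # split() fills items[1] .. items[n] from a space-delimited string.
            n = split("apple banana cherry", items, " ")

            # for (key in array) may visit keys in any order, so use
            # a counting loop when the order matters.
            for (i = 1; i <= n; i++) print i ": " items[i]

            print "count: " array_length(items)
        }
    '
}

array_demo
```

Because every variable not named in a parameter list is global, leaving `key` and `count` out of the parameter list would let the helper overwrite same-named variables in the caller.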