awk_power

Date post: 06-Apr-2018
Upload: saroj-kumar
8/3/2019 awk_power

http://slidepdf.com/reader/full/awkpower 1/35

9 Powerful Awk Built-in Functions for Numeric

Like its built-in variables, awk provides many built-in functions for numeric, string, input, and output operations. They fall into three high-level categories:

1. Built-in functions for numeric operations
2. Built-in functions for string operations
3. Built-in functions for input/output operations

1. Awk int(n) Function

The int() function returns the integer part of its argument n, which may be any number, with or without a fractional part. A whole number is returned unchanged. A floating point number is truncated toward zero, so int(-5.223) is -5, not -6.

Example

$ awk 'BEGIN{
print int(3.534);
print int(4);
print int(-5.223);
print int(-5);
}'
3
4
-5
-5
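Because int() truncates toward zero, it differs from rounding down for negative inputs. The following is a minimal sketch of that difference; floor() and int_vs_floor are hypothetical helper names introduced here for illustration, not awk built-ins.

```shell
# floor() is a user-defined helper for illustration; awk has no built-in floor.
int_vs_floor() {
  awk 'function floor(x,   t) { t = int(x); return (x < t) ? t - 1 : t }
       BEGIN { print int(-5.223), floor(-5.223), int(3.534), floor(3.534) }'
}
int_vs_floor
```

For positive numbers the two agree; they differ only when the argument is negative and has a fractional part.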

2. Awk log(n) Function

The log() function returns the natural logarithm of its argument n. The result is a meaningful number only when n is positive: as the example shows, log(0) prints -inf and a negative argument prints nan rather than stopping the program.

Example

$ awk 'BEGIN{
print log(12);
print log(0);
print log(1);
print log(-1);
}'
2.48491
-inf
0
nan

In the above output, log(0) is negative infinity, shown as -inf, and log(-1) gives nan (Not a Number). The exact spelling of these special values, and whether a warning is printed, can vary between awk implementations.
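Since awk only provides the natural logarithm, logarithms in other bases have to be derived from it. A small sketch, assuming a hypothetical helper named log10_of that divides by log(10):

```shell
# log10_of is a hypothetical helper name; awk itself only provides log().
log10_of() {
  awk -v x="$1" 'BEGIN { printf "%.4f\n", log(x) / log(10) }'
}
log10_of 1000
```

The printf format rounds the result to four decimals, which hides the tiny floating point error in the division.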

3. Awk sqrt(n) Function

The sqrt() function returns the positive square root of n, which may be any non-negative number, integer or not. It returns nan if you give a negative number as the argument.

Example

$ awk 'BEGIN{
print sqrt(16);
print sqrt(0);
print sqrt(-12);
}'
4
0
nan

4. Awk exp(n) Function

The exp() function returns e raised to the power of n.

Example

$ awk 'BEGIN{
print exp(123434346);
print exp(0);
print exp(-12);
}'
inf
1
6.14421e-06

In the above output, exp(123434346) gives infinity because the result is out of range.

5. Awk sin(n) Function

The sin() function returns the sine of n, with n in radians.

Example

$ awk 'BEGIN {
print sin(90);
print sin(45);
}'
0.893997
0.850904

6. Awk cos(n) Function

The cos() function returns the cosine of n, with n in radians.

Example

$ awk 'BEGIN {
print cos(90);
print cos(45);
}'
-0.448074
0.525322

7. Awk atan2(m,n) Function

This function returns the arc-tangent of m/n in radians.

Example

$ awk 'BEGIN {
print atan2(30,45);
}'
0.588003
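The sin() and cos() examples above pass 90 and 45 as radians, which is why the results do not look like the familiar values for 90 and 45 degrees. A hedged sketch of converting degrees first, using the fact that atan2(0, -1) evaluates to pi; deg_trig is a hypothetical helper name.

```shell
# deg_trig is a hypothetical helper: prints sin and cos of an angle in degrees.
deg_trig() {
  awk -v deg="$1" 'BEGIN {
      pi  = atan2(0, -1)          # atan2(0, -1) evaluates to pi
      rad = deg * pi / 180        # degrees -> radians
      printf "%.4f %.4f\n", sin(rad), cos(rad)
  }'
}
deg_trig 90
deg_trig 60
```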

8. Awk rand() Function

rand() generates a random number n such that 0 <= n < 1. It never returns 1. Numbers are random within one awk run, but predictable from run to run: awk uses a fixed algorithm to generate them, so unless the seed is changed the sequence is repeatable (some implementations seed themselves automatically at startup).

Example

The following example generates 1000 random integers from 0 to 99 and shows how often each number occurred.

$ cat rand.awk
BEGIN {
    while (i < 1000) {
        n = int(rand() * 100);
        rnd[n]++;
        i++;
    }
    for (i = 0; i < 100; i++) {
        print i, "Occurred", rnd[i], "times";
    }
}
$

Part of the output of the above script:

$ awk -f rand.awk
0 Occurred 6 times
1 Occurred 16 times
2 Occurred 12 times
3 Occurred 6 times
4 Occurred 13 times
5 Occurred 13 times
6 Occurred 8 times
7 Occurred 7 times
8 Occurred 16 times
9 Occurred 9 times
10 Occurred 6 times
11 Occurred 9 times
12 Occurred 17 times
13 Occurred 12 times

The output shows that, over 1000 draws, rand() produces the same values many times.

9. Awk srand(n) Function

srand() initializes the random number generator with the seed n, so that every run of the program that uses the same seed produces the same sequence. If no argument is given, it uses the time of day as the seed.

Example. Generate 5 random numbers between 5 and 50

$ cat srand.awk
BEGIN {
    # Initialize the seed with 5.
    srand(5);
    # Generate 5 numbers in total.
    total = 5;
    # Maximum number is 50.
    max = 50;
    count = 0;
    while (count < total) {
        rnd = int(rand() * max);
        if (array[rnd] == 0) {
            count++;
            array[rnd]++;
        }
    }
    for (i = 5; i <= max; i++) {
        if (array[i])
            print i;
    }
}

In srand.awk, each rand() result is multiplied by max and truncated, giving an integer from 0 to 49. The script checks whether that number is already an index in the array; if it is not, it stores the number and increments the loop count, so the loop ends once 5 distinct numbers have been collected. The final for loop walks from the minimum (5) to the maximum and prints only the indexes that hold a value. Note that a number below 5 can also be drawn and counted; it would not be printed, so a run may print fewer than 5 numbers.

Here is the output of the above script

$ awk -f srand.awk
9
15
26
37
39
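The effect of a fixed seed can be checked directly: two runs with the same seed should print the same numbers. This is a sketch only; seeded_draws is a hypothetical helper, and the actual numbers depend on the awk implementation, so none are shown.

```shell
# seeded_draws is a hypothetical helper: prints 3 integers in 0..49 for a given seed.
seeded_draws() {
  awk -v seed="$1" 'BEGIN {
      srand(seed)
      for (i = 0; i < 3; i++) printf "%d ", int(rand() * 50)
      print ""
  }'
}
first=$(seeded_draws 5)
second=$(seeded_draws 5)
[ "$first" = "$second" ] && echo "same seed, same sequence"
```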

AWK Arrays Explained with 5 Practical Examples

The awk programming language supports arrays. An array is a variable that holds more than one value, and like any variable it has a name. In some programming languages, arrays have to be declared so that memory can be allocated for them, and array indexes are typically integers, like array[1], array[2], etc.

Awk Associative Array

Awk supports only associative arrays. Associative arrays are like traditional arrays except that they use strings as their indexes rather than numbers. You can mimic a traditional array by using numeric strings as indexes.

Syntax:

arrayname[string]=value

In the above awk syntax:

arrayname is the name of the array.

string is the index of an array.

value is any value assigned to the element of the array.

Accessing elements of the AWK array

To access a particular element of an array, use its index: arrayname[index] gives you the value assigned at that index.

To access all the array elements, use a loop that goes through all the indexes of the array, as shown below.

Syntax:

for (var in arrayname)
    actions

In the above awk syntax:

var is any variable name

in is a keyword

arrayname is the name of the array.

actions is the list of statements to be performed. If you want to perform more than one action, they have to be enclosed within braces.

This loop executes the actions once for each index used in the array, with the variable var set to that index.
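A short sketch of the for (var in arrayname) loop in use, counting words read from standard input. count_words is a hypothetical helper name. The order in which the loop visits indexes is not specified by awk, so the sketch pipes the result through sort to get stable output.

```shell
# count_words is a hypothetical helper: counts each whitespace-separated word on stdin.
count_words() {
  awk '{ for (i = 1; i <= NF; i++) seen[$i]++ }
       END { for (w in seen) print w, seen[w] }' | sort
}
printf 'b a b\nc b\n' | count_words
```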

Removing an element from the AWK array

To remove the element at a particular index of an array, use the awk delete statement. Once you have deleted an element from an awk array, you can no longer obtain its value.

Syntax:

delete arrayname[index];

The loop below removes all elements from an array by deleting each one in turn with the delete statement. This form works in every awk; many implementations (gawk and mawk, for example) also accept a bare "delete arrayname" to clear the whole array in one statement.

for (var in array)
    delete array[var]
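The following sketch shows delete on one element and then a commonly used portable idiom, split("", a), for emptying an array. delete_demo is a hypothetical name. The "in" operator is used to test membership because it does not create the element it tests for.

```shell
# delete_demo is a hypothetical helper illustrating delete and array clearing.
delete_demo() {
  awk 'BEGIN {
      a["x"] = 1; a["y"] = 2
      delete a["x"]
      has_x = ("x" in a); has_y = ("y" in a)   # membership test, creates nothing
      print has_x, has_y
      split("", a)                             # splitting an empty string empties a
      n = 0; for (k in a) n++
      print n
  }'
}
delete_demo
```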

5 Practical Awk Array Examples

All the examples below use the Iplogs.txt file shown here. This sample text file lists the IP addresses that sent requests through the gateway server, in the following format:

[date] [time] [ip-address] [number-of-websites-accessed]

$ cat Iplogs.txt
180607 093423 123.12.23.122 133
180607 121234 125.25.45.221 153
190607 084849 202.178.23.4 44
190607 084859 164.78.22.64 12
200607 012312 202.188.3.2 13
210607 084849 202.178.23.4 34
210607 121435 202.178.23.4 32
210607 132423 202.188.3.2 167

Example 1. List all unique IP addresses and the number of times each was requested

$ awk '{
> Ip[$3]++;
> }
> END{
> for (var in Ip)
> print var, "access", Ip[var], "times"
> }
> ' Iplogs.txt
125.25.45.221 access 1 times
123.12.23.122 access 1 times
164.78.22.64 access 1 times
202.188.3.2 access 2 times
202.178.23.4 access 3 times

In the above script:

The third field ($3) is an IP address. It is used as an index of an array called Ip.

For each line, the script increments the value stored at the corresponding IP address index.

By the END section, the indexes of Ip are the unique IP addresses and the corresponding values are their occurrence counts.

Example 2. List all the IP addresses and calculate how many sites each accessed

The last field in Iplogs.txt is the number of sites each IP address accessed at a particular date and time. The script below generates a report listing each IP address, how many times it made a request through the gateway, and the total number of sites it accessed.

$ cat ex2.awk
BEGIN {
    print "IP Address\tAccess Count\tNumber of sites";
}
{
    Ip[$3]++;
    count[$3]+=$NF;
}
END {
    for (var in Ip)
        print var,"\t",Ip[var],"\t\t",count[var];
}

$ awk -f ex2.awk Iplogs.txt
IP Address      Access Count    Number of sites
125.25.45.221   1               153
123.12.23.122   1               133
164.78.22.64    1               12
202.188.3.2     2               180
202.178.23.4    3               110

In the above example:

It has two arrays. The index for both arrays is the same: the IP address (third field).

The first array, named “Ip”, holds each unique IP address and its occurrence count. The second array, called “count”, also has the IP address as its index, and its value accumulates the last field (number of sites): whenever the IP address appears, the last field is added to it.

In the END section, the script goes through all the IP addresses and prints each IP address with its access count from the array Ip and its number of sites from the array count.

Example 3. Identify maximum access day

$ cat ex3.awk
{
    date[$1]++;
}
END {
    for (count in date) {
        if ( max < date[count] ) {
            max = date[count];
            maxdate = count;
        }
    }
    print "Maximum access is on", maxdate;
}

$ awk -f ex3.awk Iplogs.txt
Maximum access is on 210607

In this example:

The array named “date” has the date as its index and the occurrence count as its value.

count is the loop variable; despite its name, it holds each date (array index) in turn.

max is a variable that holds the highest count seen so far; it is used to find the date with the maximum count.

maxdate is a variable that holds the date for which the count is maximum.

Example 4. Reverse the order of lines in a file

$ awk '{ a[i++] = $0 } END { for (j=i-1; j>=0;) print a[j--] }' Iplogs.txt
210607 132423 202.188.3.2 167
210607 121435 202.178.23.4 32
210607 084849 202.178.23.4 34
200607 012312 202.188.3.2 13
190607 084859 164.78.22.64 12
190607 084849 202.178.23.4 44
180607 121234 125.25.45.221 153
180607 093423 123.12.23.122 133

In this example,

It starts by recording all the lines in the array ‘a’.

When the program has finished processing all lines, Awk executes the END { } block.

The END block loops over the elements in the array ‘a’ and prints the recorded lines in reverse order.

Example 5. Remove duplicate lines, including nonconsecutive ones, using awk

$ cat > temp
foo
bar
foo
baz
bar

$ awk '!($0 in array) { array[$0]; print }' temp
foo
bar
baz

In this example:

Awk reads every line from the file “temp” and, using the “in” operator, checks whether the current line already exists as an index in the array named “array”.

If it does not exist, it stores and prints the current line.
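The same result is often obtained with a shorter, widely used idiom that counts occurrences instead of testing membership. This is an alternative sketch, not the script from the example above; dedupe and seen are names chosen here for illustration.

```shell
# dedupe is a hypothetical helper. seen[$0]++ evaluates to 0 (false) the first
# time a line appears, so !seen[$0]++ is true only for first occurrences, and
# the default action prints the line.
dedupe() {
  awk '!seen[$0]++'
}
printf 'foo\nbar\nfoo\nbaz\nbar\n' | dedupe
```

Both versions keep the first occurrence of each line and preserve the original order.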

Caught In the Loop? Awk While, Do While, For Loop, Break, Continue, Exit Examples

In this article, let us review awk loop statements (while, do while, for loops, break, continue, and exit statements) along with 7 practical examples.

Awk looping statements are used to perform a set of actions again and again in succession: a loop repeatedly executes a statement as long as its condition is true. Awk has a number of looping statements, much like the ‘C’ programming language.

Awk While Loop

Syntax:

while(condition)

actions

while is a keyword.

condition is a conditional expression.

actions is the body of the while loop, which can have one or more statements. If actions has more than one statement, it has to be enclosed within curly braces.

How it works? The awk while loop checks the condition first. If the condition is true, it executes the list of actions. After the actions have been executed, the condition is checked again, and if it is still true, the actions are performed again. This process repeats until the condition becomes false. If the condition is false on the first check, the actions are never executed.

1. Awk While Loop Example: Create a string of a specific length

$ awk 'BEGIN { while (count++<50) string=string "x"; print string }'
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The above example uses the ‘BEGIN { }’ special block, which gets executed before anything else in an awk program. In this block, the while loop appends the character ‘x’ to the variable ‘string’ 50 times. count is a variable that is compared with 50 and then incremented on every check, so after 50 iterations the condition becomes false. After the loop finishes, the ‘string’ variable is printed. As this awk program has no main body, it quits after executing the BEGIN block.

Awk Do-While Loop

How it works? The awk do while loop is called an exit-controlled loop, whereas the awk while loop is called an entry-controlled loop. The while loop checks the condition first and then decides whether to execute the body. The do while loop executes the body once, then repeats the body as long as the condition is true.

Syntax:

do
    action
while(condition)

Even if the condition is false at the beginning, action is performed at least once.

2. Awk Do While Loop Example: Print the message at least once

$ awk 'BEGIN{
count=1;
do
print "This gets printed at least once";
while(count!=1)
}'
This gets printed at least once

In the above script, the print statement is executed at least once. With a while statement, the condition would be checked first, right after count is initialized to 1; it would be false on the first check, so the print statement would never be executed. With do while the body is executed first, so the print statement runs once.
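The entry-controlled versus exit-controlled distinction can be seen side by side by giving both loops a condition that is false from the start and counting how often each body runs. A minimal sketch; loop_counts, w, and d are names introduced here for illustration.

```shell
# loop_counts is a hypothetical helper comparing while and do-while.
loop_counts() {
  awk 'BEGIN {
      w = 0; while (0) { w++ }         # entry-controlled: body never runs
      d = 0; do { d++ } while (0)      # exit-controlled: body runs exactly once
      print w, d
  }'
}
loop_counts
```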

Awk For Loop Statement

The awk for statement does the same job as the awk while loop, but its syntax is often easier to use.

Syntax:

for(initialization;condition;increment/decrement)
    actions

How it works? The awk for statement starts by executing initialization, then checks the condition. If the condition is true, it executes the actions, then performs the increment or decrement. It keeps executing the actions followed by the increment/decrement for as long as the condition is true.

3. Awk For Loop Example: Print the sum of fields in all lines.

$ awk '{ for (i = 1; i <= NF; i++) total = total+$i }; END { print total }'
12 23 34 45 56
34 56 23 45 23
351

For each line, the variable i is initialized to 1 and the loop runs while i is less than or equal to the number of fields, adding each field to the variable total. The END block just prints the variable total. The two input lines here are typed on standard input, since no input file is given.
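A variation on the same loop, sketched here under the assumption that a per-line sum is wanted instead of a grand total: the accumulator is reset at the start of every record. row_sums and sum are hypothetical names.

```shell
# row_sums is a hypothetical helper: prints the sum of the fields of each line.
row_sums() {
  awk '{ sum = 0; for (i = 1; i <= NF; i++) sum += $i; print sum }'
}
printf '12 23 34 45 56\n34 56 23 45 23\n' | row_sums
```

The two per-line sums add up to the total of 351 printed by the example above.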

4. Awk For Loop Example: Print the fields in reverse order on every line.

$ awk 'BEGIN{ORS="";}{ for (i=NF; i>0; i--) print $i," "; print "\n"; }' student-marks
77 84 78 2143 Jones
45 58 56 2321 Gondrol
37 38 2122 RinRao
95 97 87 2537 Edwin
47 30 2415 Dayan

We discussed the awk NF built-in variable in our previous article. After reading each line, awk sets the NF variable to the number of fields found on that line.

The above script loops in reverse order from NF to 1 and outputs the fields one by one. It starts with field $NF, then $(NF-1), …, $1. After that it prints a newline character.

Now let us see some other statements which can be used with looping statements.

Awk Break statement

The break statement is used for jumping out of the innermost loop (while, do-while, or for) that encloses it.

5. Awk Break Example: Awk script to go through only 10 iterations

$ awk 'BEGIN{while(1) print "forever"}'

The above awk while loop prints the string “forever” forever, because the condition never fails. If you want to stop the loop after the first 10 iterations, see the script below.

$ awk 'BEGIN{
x=1;
while(1)
{
print "Iteration";
if ( x==10 )
break;
x++;
}}'
Iteration
Iteration
Iteration
Iteration
Iteration
Iteration
Iteration
Iteration
Iteration
Iteration

The above script checks the value of the variable “x”; when it reaches 10, it jumps out of the loop using the break statement.

Awk Continue statement

The continue statement skips over the rest of the loop body, causing the next cycle around the loop to begin immediately.

6. Awk Continue Example: Execute the loop except 5th iteration

$ awk 'BEGIN{
x=1;
while(x<=10)
{
if(x==5){
x++;
continue;
}
print "Value of x",x;
x++;
}
}'
Value of x 1
Value of x 2
Value of x 3
Value of x 4
Value of x 6
Value of x 7
Value of x 8
Value of x 9
Value of x 10

The above script prints the value of x at each iteration. When x reaches 5, it just increments x and continues with the next iteration without executing the rest of the loop body, so the value 5 is not printed. The continue statement has meaning only when used within a loop.

Awk Exit statement

The exit statement causes the script to immediately stop executing the current rule and to stop processing input; all the remaining input is ignored.

Exit accepts any integer as an argument, which will be the exit status code of the awk process. If no argument is supplied, exit returns status zero.

7. Awk Exit Example: Exit from the loop at 5th iteration

$ awk 'BEGIN{
x=1;
while(x<=10)
{
if(x==5){
exit;
}
print "Value of x",x;
x++;
}
}'
Value of x 1
Value of x 2
Value of x 3
Value of x 4

In the above script, once the value of x reaches 5, it calls exit, which stops the execution of the awk process. So the value of x is printed only up to 4.
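Two details of exit are worth seeing in a small sketch: when exit is called from a main rule, the END block still runs before awk terminates, and the argument becomes the process exit status. exit_demo is a hypothetical helper and the status value 7 is arbitrary.

```shell
# exit_demo is a hypothetical helper: stops at the second input line with status 7.
exit_demo() {
  awk 'NR == 2 { exit 7 }
       { print }
       END { print "END ran" }' <<'EOF'
a
b
c
EOF
}
# In the else branch, $? still holds the status returned by exit_demo.
if exit_demo; then echo "status: 0"; else echo "status: $?"; fi
```

Only the first line is printed by the main rule; the remaining input is never read.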

4 Awk If Statement Examples ( if, if else, if else if, :? )

In this awk tutorial, let us review awk conditional if statements with practical examples.

Awk supports many conditional statements to control the flow of the program. Most of the awk conditional statement syntax looks like the ‘C’ programming language.

Normally a conditional statement checks the condition before performing any action. If the condition is true, the action(s) are performed. Similarly, an action can be performed if the condition is false.

A conditional statement starts with the keyword ‘if’. Awk supports three different kinds of if statement.

1. Awk Simple If statement
2. Awk If-Else statement
3. Awk If-ElseIf-Ladder

Awk Simple If Statement

Single Action: The simple if statement is used to check a condition; if the condition returns true, it performs its corresponding action.

Syntax:

if (conditional-expression)
    action

if is a keyword.

conditional-expression is the expression that checks the condition.

action is any awk statement to perform.

Multiple Action: If the conditional expression returns true, the action will be performed. If more than one action needs to be performed, the actions should be enclosed in curly braces and separated by a newline or semicolon, as shown below.

Syntax:

if (conditional-expression)
{
    action1;
    action2;
}

If the condition is true, all the actions enclosed in braces will be performed in the given order. After all the actions are performed, awk continues with the next statements.

Awk If Else Statement

The simple awk if statement above has no set of actions for the case where the condition is false. In the awk if else statement you can give the list of actions to perform if the condition is false: if the condition returns true, action1 will be performed, and if the condition is false, action2 will be performed.

Syntax:

if (conditional-expression)
    action1
else
    action2

Awk also has a conditional operator, the ternary operator ( ?: ), whose behavior is similar to the awk if else statement. If the conditional-expression is true, action1 will be evaluated, and if the conditional-expression is false, action2 will be evaluated.

Syntax:

conditional-expression ? action1 : action2 ;
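As a small illustration of the ternary operator, the sketch below picks one of two strings depending on whether a number is even. parity is a hypothetical helper name; the parentheses around the whole expression keep the print statement unambiguous.

```shell
# parity is a hypothetical helper: prints "even" or "odd" for its argument.
parity() {
  awk -v n="$1" 'BEGIN { print (n % 2 == 0 ? "even" : "odd") }'
}
parity 7
parity 10
```

Unlike if else, the ternary form is an expression, so its result can be printed or assigned directly.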

Awk If Else If ladder

if(conditional-expression1)
    action1;
else if(conditional-expression2)
    action2;
else if(conditional-expression3)
    action3;
.
.
else
    action n;

If conditional-expression1 is true, then action1 will be performed.

If conditional-expression1 is false, then conditional-expression2 will be checked; if it is true, action2 will be performed, and so on. The last else part will be performed if none of the conditional expressions is true.

Now let us create the sample input file which has the student marks.

$ cat student-marks
Jones 2143 78 84 77
Gondrol 2321 56 58 45
RinRao 2122 38 37
Edwin 2537 87 97 95
Dayan 2415 30 47

1. Awk If Example: Check that all the marks exist

$ awk '{
if ($3 == "" || $4 == "" || $5 == "")
    print "Some score for the student",$1,"is missing";
}' student-marks
Some score for the student RinRao is missing
Some score for the student Dayan is missing

$3, $4 and $5 are the test scores of the student. If any test score is empty, the script prints the message. The || operator checks whether any one of the marks is missing.

2. Awk If Else Example: Generate Pass/Fail Report based on Student marks in each subject

$ awk '{
if ($3 >=35 && $4 >= 35 && $5 >= 35)
    print $0,"=>","Pass";
else
    print $0,"=>","Fail";
}' student-marks
Jones 2143 78 84 77 => Pass
Gondrol 2321 56 58 45 => Pass
RinRao 2122 38 37 => Fail
Edwin 2537 87 97 95 => Pass
Dayan 2415 30 47 => Fail

The condition for Pass is that every test score must be greater than or equal to 35. If all three scores meet that condition, the script prints the whole line and the string “Pass”; otherwise, i.e. if even one test score does not meet the condition, it prints the whole line and the string “Fail”.

3. Awk If Else If Example: Find the average and grade for every student

$ cat grade.awk
{
    total=$3+$4+$5;
    avg=total/3;
    if ( avg >= 90 ) grade="A";
    else if ( avg >= 80) grade ="B";
    else if (avg >= 70) grade ="C";
    else grade="D";

    print $0,"=>",grade;
}
$ awk -f grade.awk student-marks
Jones 2143 78 84 77 => C
Gondrol 2321 56 58 45 => D
RinRao 2122 38 37 => D
Edwin 2537 87 97 95 => A
Dayan 2415 30 47 => D

In the above awk script, the variable called ‘avg’ holds the average of the three test scores. If the average is greater than or equal to 90, the grade is A; if it is greater than or equal to 80, the grade is B; if it is greater than or equal to 70, the grade is C. Otherwise the grade is D.

4. Awk Ternary ( ?: ) Example: Concatenate every 3 lines of input with a comma.

$ awk 'ORS=NR%3?",":"\n"' student-marks
Jones 2143 78 84 77,Gondrol 2321 56 58 45,RinRao 2122 38 37
Edwin 2537 87 97 95,Dayan 2415 30 47,

We discussed the awk ORS built-in variable earlier. This variable gets appended after every line that gets output. In this example, it is set for every line: to a comma when NR is not a multiple of 3, and to a newline on every 3rd line. For lines 1, 2 it’s a comma, for line 3 it’s a newline, for lines 4, 5 it’s a comma, for line 6 a newline, etc.

7 Powerful Awk Operators Examples (Unary, Binary, Arithmetic, String, Assignment, Conditional, Reg-Ex Awk Operators)
Like any other programming language, awk has many operators for number and string operations. In this article let us discuss all the key awk operators.

There are two types of operators in awk.

1. Unary Operator: an operator which accepts a single operand.
2. Binary Operator: an operator which accepts two operands.

Awk Unary Operator

Operator    Description
+           Unary plus (converts the operand to a number)
-           Negate the number
++          Auto-increment
--          Auto-decrement

Awk Binary Operator

There are different kinds of binary operators available in awk. They are classified based on their usage.

Awk Arithmetic Operators

The following operators are used for performing arithmetic calculations.

Operator    Description
+           Addition
-           Subtraction
*           Multiplication
/           Division
%           Modulo Division

Awk String Operator

For string concatenation awk has the following operator.

Operator    Description
(space)     String Concatenation

Awk Assignment Operators

Awk has an assignment operator and shortcut assignment operators, as listed below.

Operator    Description
=           Assignment
+=          Shortcut addition assignment
-=          Shortcut subtraction assignment
*=          Shortcut multiplication assignment
/=          Shortcut division assignment
%=          Shortcut modulo division assignment

Awk Conditional Operators

Awk has the following conditional operators, which can be used with control structures and looping statements that will be covered in the coming article.

Operator    Description
>           Is greater than
>=          Is greater than or equal to
<           Is less than
<=          Is less than or equal to
==          Is equal to
!=          Is not equal to
&&          Both the conditional expressions should be true
||          Any one of the conditional expressions should be true

Awk Regular Expression Operator

Operator    Description
~           Match operator
!~          No Match operator
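Several of the operators listed above can be exercised together in one BEGIN block. This is an illustrative sketch; operator_demo and the variables s, n, and m are names introduced here.

```shell
# operator_demo is a hypothetical helper touching string, assignment,
# conditional, and regular expression operators.
operator_demo() {
  awk 'BEGIN {
      s = "foo" "bar"                    # juxtaposition concatenates strings
      n = 7; n += 3; n %= 4              # 7 -> 10 -> 2
      m = (s ~ /^foo/) && (s !~ /baz/)   # both tests true, so m is 1
      print s, n, m
  }'
}
operator_demo
```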

Awk Operator Examples

Now let us review some examples that use awk operators. Let us use /etc/passwd as the input file in these examples.

$ cat /etc/passwd
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
syslog:x:101:102::/home/syslog:/bin/false
hplip:x:103:7:HPLIP system user,,,:/var/run/hplip:/bin/false
saned:x:110:116::/home/saned:/bin/false
pulse:x:111:117:PulseAudio daemon,,,:/var/run/pulse:/bin/false
gdm:x:112:119:Gnome Display Manager:/var/lib/gdm:/bin/false

Awk Example 1: Count the total number of fields in a file.

The awk script below matches all the lines and keeps adding the number of fields in each line, using the shortcut addition assignment operator. The number of fields seen so far is kept in a variable named ‘total’. Once the input has been processed, the special pattern ‘END {…}’ is executed, which prints the total number of fields.

$ awk -F ':' '{ total += NF }; END { print total }' /etc/passwd
49

Awk Example 2: Count number of users who is using /bin/sh shell

In the awk script below, the last field of each line is matched against the pattern /bin/sh. A regular expression is enclosed between slashes (//), so every forward slash (/) inside it has to be escaped. When a line matches, the variable ‘n’ is incremented by one. The END section prints the value of ‘n’.

$ awk -F ':' '$NF ~ /\/bin\/sh/ { n++ }; END { print n }' /etc/passwd
2

Awk Example 3: Find the user details who is having the highest USER ID

The awk script below keeps track of the largest number in the third field in the variable ‘maxuid’, and stores the corresponding line in the variable ‘maxline’. Once it has looped over all lines, it prints them out.

$ awk -F ':' '$3 > maxuid { maxuid=$3; maxline=$0 }; END { print maxuid, maxline }' /etc/passwd
112 gdm:x:112:119:Gnome Display Manager:/var/lib/gdm:/bin/false

Awk Example 4: Print the even-numbered lines

The awk script below processes each line and checks NR % 2 == 0, i.e. whether NR is a multiple of 2. For matching lines it performs the default action, which is printing the whole line.

$ awk 'NR % 2 == 0' /etc/passwd
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
hplip:x:103:7:HPLIP system user,,,:/var/run/hplip:/bin/false
pulse:x:111:117:PulseAudio daemon,,,:/var/run/pulse:/bin/false

Awk Example 5: Print every line which has the same USER ID and GROUP ID

The awk script below prints a line only if $3 (USER ID) and $4 (GROUP ID) are equal. It checks this condition for each line of input and, if it matches, prints the whole line.

$ awk -F ':' '$3==$4' passwd.txt
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh

Awk Example 6: Print user details who has USER ID greater than or equal to 100 and who has to use /bin/sh

In the awk statement below there are two conditional expressions: the user ID ($3) must be greater than or equal to 100, and the last field must match /bin/sh. The ‘&&’ operator makes awk print a line only if both conditions are true.

$ awk -F ':' '$3>=100 && $NF ~ /\/bin\/sh/' passwd.txt
libuuid:x:100:101::/var/lib/libuuid:/bin/sh

Awk Example 7: Print user details who doesn’t have the comments in /etc/passwd file

The awk script below reads each line and checks whether the fifth field is empty; if it is, it prints the line.

$ awk -F ':' '$5 == "" ' passwd.txt
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
syslog:x:101:102::/home/syslog:/bin/false
saned:x:110:116::/home/saned:/bin/false
8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

Awk has several powerful built-in variables. There are two types of built-in variables in awk.

1. Variables whose values can be changed, such as the field separator and record separator.
2. Variables which can be used for processing and reports, such as the number of records and number of fields.

1. Awk FS Example: Input field separator variable.

By default, awk reads each line from the input, splits it on whitespace, and sets the variables $1, $2, and so on. The awk FS variable is used to set the field separator for each record. FS can be set to any single character or regular expression. You can set the input field separator using one of the following two options:

1. Using the -F command line option.
2. Setting FS like a normal variable.

Syntax:

$ awk -F 'FS' 'commands' inputfilename

(or)

$ awk 'BEGIN{FS="FS";}'

FS is any single character or regular expression which you want to use as the input field separator.

FS can be changed any number of times; it retains its value until it is explicitly changed. If you want to change the field separator, it is better to change it before the line is read, so that the change affects that line.

Here is an awk FS example to read the /etc/passwd file, which has “:” as the field delimiter.

$ cat etc_passwd.awk
BEGIN {
    FS=":";
    print "Name\tUserID\tGroupID\tHomeDirectory";
}
{
    print $1"\t"$3"\t"$4"\t"$6;
}
END {
    print NR,"Records Processed";
}

$ awk -f etc_passwd.awk /etc/passwd
Name    UserID  GroupID HomeDirectory
gnats   41      41      /var/lib/gnats
libuuid 100     101     /var/lib/libuuid
syslog  101     102     /home/syslog
hplip   103     7       /var/run/hplip
avahi   105     111     /var/run/avahi-daemon
saned   110     116     /home/saned
pulse   111     117     /var/run/pulse
gdm     112     119     /var/lib/gdm
8 Records Processed
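The ways of setting FS described above can be compared side by side. The following is a minimal sketch with made-up input lines (they are not taken from /etc/passwd), assuming a POSIX-style awk such as gawk or mawk. POSIX specifies that an assignment to FS inside a main rule only applies to records read afterwards, which is why the BEGIN section (or the command line) is the safe place to set it.

```shell
# Hypothetical sample input; the three commands below are equivalent.
printf 'root:0\nbin:2\n' | awk -F ':' '{ print $1 }'
printf 'root:0\nbin:2\n' | awk 'BEGIN { FS = ":" } { print $1 }'
printf 'root:0\nbin:2\n' | awk -v FS=':' '{ print $1 }'
# each command prints:
# root
# bin

# FS may also be a regular expression: here a comma or a semicolon,
# followed by any number of spaces.
printf 'a, b;c\n' | awk -F '[,;] *' '{ print $1 "|" $2 "|" $3 }'
# prints: a|b|c
```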

2. Awk OFS Example: Output Field Separator Variable
Awk OFS is the output equivalent of the awk FS variable. By default, awk OFS is a single space character.

Following is an awk OFS example.

$ awk -F':' '{print $3,$4;}' /etc/passwd
41 41
100 101
101 102
103 7
105 111
110 116
111 117
112 119

The comma in the print statement separates two output fields, and awk prints the value of OFS between them, which is a single space by default. When OFS is set to something else, that value is inserted between the fields in the output, as shown below.

$ awk -F':' 'BEGIN{OFS="=";} {print $3,$4;}' /etc/passwd
41=41
100=101
101=102
103=7
105=111
110=116
111=117
112=119
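To see exactly when OFS is used, here is a small sketch on a single made-up record. It relies only on standard awk behaviour: a comma in print inserts OFS, writing two expressions side by side concatenates them without OFS, and $0 is only rebuilt with OFS after a field has been assigned.

```shell
printf 'gnats:x:41:41\n' | awk -F ':' 'BEGIN { OFS = "=" } {
    print $3, $4      # comma: fields joined with OFS
    print $3 $4       # no comma: plain concatenation, OFS not used
    print             # $0 untouched: original separators kept
    $1 = $1           # assigning any field rebuilds $0 with OFS
    print
}'
# prints:
# 41=41
# 4141
# gnats:x:41:41
# gnats=x=41=41
```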

3. Awk RS Example: Record Separator variable
Awk RS defines what separates one record from the next. By default RS is a newline, so awk reads its input line by line.

Suppose student marks are stored in a file in which records are separated by a blank line (two consecutive newlines) and the fields within a record are separated by a single newline character.

$ cat student.txt
Jones
2143
78
84
77

Gondrol
2321
56
58
45

RinRao
2122
38
37
65

Edwin
2537
78
67
45

Dayan
2415
30
47
20

The Awk script below prints the student name and roll number from the above input file.

$ cat student.awk
BEGIN {
    RS="\n\n";
    FS="\n";
}
{
    print $1,$2;
}

$ awk -f student.awk student.txt
Jones 2143
Gondrol 2321
RinRao 2122
Edwin 2537
Dayan 2415

In the script student.awk, each student's details are read as a single record, because awk RS has been assigned a double newline, and each line within a record is a field, since FS is a newline character.
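A note on portability: POSIX only defines RS as a single character or as the empty string, and a multi-character value such as "\n\n" is an extension that gawk and mawk treat as a regular expression. The sketch below is an alternative, not the script used above: it sets RS to the empty string ("paragraph mode"), in which records are separated by one or more blank lines. The input is a shortened, made-up version of student.txt.

```shell
printf 'Jones\n2143\n78\n\nGondrol\n2321\n56\n' |
awk 'BEGIN { RS = ""; FS = "\n" } { print NR ":", $1, $2 }'
# prints:
# 1: Jones 2143
# 2: Gondrol 2321
```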

4. Awk ORS Example: Output Record Separator Variable
Awk ORS is the output equivalent of RS. Each record in the output is terminated with this delimiter.

Following is an awk ORS example:

$ awk 'BEGIN{ORS="=";} {print;}' student-marks
Jones 2143 78 84 77=Gondrol 2321 56 58 45=RinRao 2122 38 37 65=Edwin 2537 78 67 45=Dayan 2415 30 47 20=

In the above output, each record of the student-marks file is terminated by the character "=" instead of a newline.
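As a further illustration, ORS is a simple way to join lines into a comma-separated list. This is a sketch on made-up input; note that ORS is written after every record, including the last one, so the second command shows one possible way to avoid the trailing separator by using printf instead.

```shell
# ORS is appended after every record, so the list ends with a comma.
printf 'Jones\nGondrol\nRinRao\n' |
awk 'BEGIN { ORS = "," } { print $1 } END { printf "\n" }'
# prints: Jones,Gondrol,RinRao,

# Writing the separator before every record except the first avoids that.
printf 'Jones\nGondrol\nRinRao\n' |
awk '{ printf "%s%s", (NR > 1 ? "," : ""), $1 } END { print "" }'
# prints: Jones,Gondrol,RinRao
```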

5. Awk NR Example: Number of Records Variable
Awk NR gives you the number of records processed so far, that is, the current line number. In the following awk NR example, the NR variable holds the line number in the main block, and in the END section awk NR tells you the total number of records in the file.

$ awk '{print "Processing Record - ",NR;}END {print NR, "Students Records are processed";}' student-marks
Processing Record -  1
Processing Record -  2
Processing Record -  3
Processing Record -  4
Processing Record -  5
5 Students Records are processed

6. Awk NF Example: Number of Fields in a record
Awk NF gives you the total number of fields in a record. Awk NF is very useful for validating whether all the fields exist in a record.

Suppose that in the student-marks file the Test3 score is missing for two students, as shown below.

$ cat student-marks
Jones 2143 78 84 77
Gondrol 2321 56 58 45
RinRao 2122 38 37
Edwin 2537 78 67 45
Dayan 2415 30 47

The following Awk script prints the record (line) number and the number of fields in that record, which makes it very simple to find the records in which the Test3 score is missing.

$ awk '{print NR,"->",NF}' student-marks
1 -> 5
2 -> 5
3 -> 4
4 -> 5
5 -> 4
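The same check can be turned into a filter that reports only the incomplete records. This is a minimal sketch on three made-up lines in the student-marks layout; the expected field count of 5 and the wording of the message are assumptions chosen for this illustration.

```shell
printf 'Jones 2143 78 84 77\nRinRao 2122 38 37\nDayan 2415 30 47\n' |
awk 'NF != 5 { print "line " NR ": " $1 " has " NF " fields, expected 5" }'
# prints:
# line 2: RinRao has 4 fields, expected 5
# line 3: Dayan has 4 fields, expected 5
```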

7. Awk FILENAME Example: Name of the current input file
The FILENAME variable gives the name of the file being read. Awk can accept any number of input files to process.

$ awk '{print FILENAME}' student-marks
student-marks
student-marks
student-marks
student-marks
student-marks

The above example prints the FILENAME, i.e. student-marks, for each record of the input file.

8. Awk FNR Example: Number of Records relative to the current input file
When awk reads from multiple input files, the awk NR variable gives the total number of records read so far across all the input files. Awk FNR gives the number of records read from the current input file.

$ awk '{print FILENAME, FNR;}' student-marks bookdetails
student-marks 1
student-marks 2
student-marks 3
student-marks 4
student-marks 5
bookdetails 1
bookdetails 2
bookdetails 3
bookdetails 4
bookdetails 5

In the above example, if you use awk NR instead of awk FNR, the records of the file bookdetails are numbered from 6 to 10.
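The difference is easiest to see with both counters printed next to each other. The sketch below creates two small throwaway files (first.txt and second.txt are hypothetical names used only here) in a temporary directory and prints FILENAME, FNR and NR for every record.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\nd\n' > "$tmpdir/second.txt"

# FNR restarts at 1 for each file; NR keeps counting across files.
(cd "$tmpdir" && awk '{ print FILENAME, FNR, NR }' first.txt second.txt)
# prints:
# first.txt 1 1
# first.txt 2 2
# second.txt 1 3
# second.txt 2 4

rm -rf "$tmpdir"
```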

Awk Tutorial: Understand Awk Variables with 3 Practical Examples

Like any other programming language, Awk has both user-defined variables and built-in variables.

In this article let us review how to define and use awk variables.

An awk variable name should begin with a letter or an underscore, followed by any number of alphanumeric characters or underscores.

Keywords cannot be used as awk variable names.

Awk does not support variable declarations as other programming languages do; a variable comes into existence when it is first used.

It is always better to initialize awk variables in the BEGIN section, which is executed only once, at the beginning.

There are no data types in Awk. Whether an awk variable is treated as a number or as a string depends on the context in which it is used.
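The last point can be tried directly from the command line. In this small sketch the variable names x, count and name are arbitrary choices for the illustration; the same value is used once as a number and once as a string, and two variables that were never assigned show their default values.

```shell
awk 'BEGIN {
    x = "3"                 # assigned as a string
    print x + 4             # numeric context
    print x "4"             # string context (concatenation)
    print count + 0         # never assigned: acts as 0 in numeric context
    print "[" name "]"      # never assigned: acts as "" in string context
}'
# prints:
# 7
# 34
# 0
# []
```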

Now let us review a few simple examples to learn how to use user-defined awk variables.

Awk Example 1: Billing for Books
In this example, the input file bookdetails.txt contains records with the following fields: item number, book name, quantity and rate per book.

$ cat bookdetails.txt
1 Linux-programming 2 450
2 Advanced-Linux 3 300
3 Computer-Networks 4 400
4 OOAD&UML 3 450
5 Java2 5 200

The following Awk script reads and processes the above bookdetails.txt file and generates a report that displays the amount for each book sold and the total amount for all the books sold.

So far we have seen Awk read its commands from the command line, but Awk can also read the commands from a file using the -f option.

Syntax:

$ awk -f script-filename inputfilename

Now our Awk script for billing calculation for books is given below.

$ cat book-calculation.awk
BEGIN {
    total=0;
}
{
    itemno=$1;
    book=$2;
    bookamount=$3*$4;
    total=total+bookamount;
    print itemno," ", book,"\t","$"bookamount;
}
END {
    print "Total Amount = $"total;
}

In the above script:

The Awk BEGIN section initializes the variable total. itemno, total, book and bookamount are user-defined awk variables.

In the Awk action section, Quantity*bookprice is stored in a variable called bookamount. Each bookamount is added to total.

Finally, in the Awk END section, the total variable holds the total amount.

Now execute the book-calculation.awk script to generate the report that displays the amount for each book and the total amount, as shown below.

$ awk -f book-calculation.awk bookdetails.txt
1   Linux-programming    $900
2   Advanced-Linux       $900
3   Computer-Networks    $1600
4   OOAD&UML             $1350
5   Java2                $1000
Total Amount = $5750
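For comparison, the running total alone can be computed in a one-liner. This sketch is a shortened variant, not the script above: it uses the += operator and relies on the fact that an awk variable that has never been assigned behaves as 0 in a numeric context, so the BEGIN initialization is a matter of style rather than a requirement. The input repeats the bookdetails.txt records inline.

```shell
printf '%s\n' \
    '1 Linux-programming 2 450' \
    '2 Advanced-Linux 3 300' \
    '3 Computer-Networks 4 400' \
    '4 OOAD&UML 3 450' \
    '5 Java2 5 200' |
awk '{ total += $3 * $4 } END { print "Total Amount = $" total }'
# prints: Total Amount = $5750
```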

Awk Example 2. Student Mark Calculation
In this example, create an input file "student-marks.txt" with the following content: student name, roll number, Test1 score, Test2 score and Test3 score.

$ cat student-marks.txt
Jones 2143 78 84 77
Gondrol 2321 56 58 45
RinRao 2122 38 37 65
Edwin 2537 78 67 45
Dayan 2415 30 47 20

The following Awk script calculates and generates a report that shows the average mark of each student and the averages of the Test1, Test2 and Test3 scores.

$ cat student.awk
BEGIN {
    test1=0;
    test2=0;
    test3=0;
    print "Name\tRollNo\t Average Score";
}
{
    total=$3+$4+$5;
    test1=test1+$3;
    test2=test2+$4;
    test3=test3+$5;
    print $1"\t"$2"\t",total/3;
}
END {
    print "Average of Test1="test1/NR;
    print "Average of Test2="test2/NR;
    print "Average of Test3="test3/NR;
}

In the above Awk script:

In the Awk BEGIN section, the awk variables test1, test2 and test3 are initialized to zero. test1, test2, test3 and total are user-defined awk variables.

In the Awk action section, $3, $4 and $5 are the Test1, Test2 and Test3 scores respectively. The total variable is the sum of the three test scores for each student. The awk variables test1, test2 and test3 accumulate the total score for each corresponding test.

So in the Awk END section, dividing each test total by the total number of records (i.e. students) gives you the average score. NR is an Awk built-in variable which, in the END section, holds the total number of records in the input.
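Plain print shows averages with awk's default number formatting, for example 79.6667 for Jones. The following sketch is a variation on the script above, not a replacement for it: it uses printf to fix the number of decimal places and guards the division in END so that empty input does not cause a division by zero. Only the first two students and the Test1 average are included to keep it short.

```shell
printf 'Jones 2143 78 84 77\nGondrol 2321 56 58 45\n' |
awk '{
    test1 += $3
    printf "%s\t%s\t%.2f\n", $1, $2, ($3 + $4 + $5) / 3
}
END {
    if (NR > 0)                       # avoid dividing by zero on empty input
        printf "Average of Test1=%.1f\n", test1 / NR
}'
# prints (the columns are separated by tabs):
# Jones    2143    79.67
# Gondrol  2321    53.00
# Average of Test1=67.0
```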

Awk Example 3. HTML Report for Student Details
In the above two examples, we have seen awk variables that hold numbers as their values. This example shows an awk script that generates an HTML report of the students' names and their roll numbers.

$ cat string.awk
BEGIN {
    title="AWK";
    print "<html>\n<title>"title"</title><body bgcolor=\"#ffffff\">\n<table border=1><th colspan=2 align=center>Student Details</th>";
}
{
    name=$1;
    rollno=$2;
    print "<tr><td>"name"</td><td>"rollno"</td></tr>";
}
END {
    print "</table></body>\n</html>";
}

Use the same student-marks.txt input file that we created in the above example.

$ awk -f string.awk student-marks.txt
<html>
<title>AWK</title><body bgcolor="#ffffff">
<table border=1><th colspan=2 align=center>Student Details</th>
<tr><td>Jones</td><td>2143</td></tr>
<tr><td>Gondrol</td><td>2321</td></tr>
<tr><td>RinRao</td><td>2122</td></tr>
<tr><td>Edwin</td><td>2537</td></tr>
<tr><td>Dayan</td><td>2415</td></tr>
</table></body>
</html>

Saving the above output to a file and opening it in a browser gives the following HTML table. In the above script, the variables name and rollno are string variables, because they are used in a string context.

Student Details
Jones    2143
Gondrol  2321
RinRao   2122
Edwin    2537
Dayan    2415

Yet Another Sudoku Puzzle Solver Using AWK

We have seen in a previous awk introduction article that awk can be an effective tool for everything from small one-liners up through some interesting applications. There are certainly more complex languages at our disposal if a situation calls for it; perl and python come to mind. Applications requiring networking support, database access, user interfaces, binary data or more extensive library support and complexity are best left to other languages with better support in these areas.

Nevertheless, awk is a superb language for testing algorithms and applications with some complexity, especially where the problem can be broken into chunks which can be streamed as part of a pipe. It's an ideal tool for augmenting the features of shell programming as it is ubiquitous, found in some form on almost all Unix/Linux/BSD systems. Many problems dealing with text, log lines or symbol tables are handily solved, or at the very least prototyped, with awk along with the other tools found on Unix/Linux systems.

While awk lends itself well to operating on input a line at a time, processing it and then sending some output for each line, it may also be used in a more traditional batch-style application where it reads all the input, processes it and then sends the processed output onward.

Yet Another Sudoku Puzzle Solver – YASPS for Awk
The application I chose to use as an example is "yet another sudoku puzzle solver". I must confess at the outset that I have never sat down to solve one of these puzzles myself, but sketched this out over a few days while commuting on a train and watching other people work on them. It was far more fun, I think, than actually doing any of the puzzles.

Download YASPS Program for Awk: solve.awk

The input format I've chosen is one which is easy to parse in awk and is fairly traditional in the Unix environment. Blank lines and those starting with a hash (#) character are ignored, making it easy to insert comments. Extra spaces may be used between columns for readability. An example is shown in the following figure:

-------------------------------------------------------------------------------
# I forget where I got this..

# this is one of the hardest ones I've found for this algorithm, although
# after transposing the matrix it can be solved in a fraction of the time

2 0 0 6 7 0 0 0 0
0 0 6 0 0 0 2 0 1
4 0 0 0 0 0 8 0 0

5 0 0 0 0 9 3 0 0
0 3 0 0 0 0 0 5 0
0 0 2 8 0 0 0 0 7

0 0 1 0 0 0 0 0 4
7 0 8 0 0 0 6 0 0
0 0 0 0 5 3 0 0 8
-------------------------------------------------------------------------------

There is almost no error checking in this program, but it would be easy to add in front or as part of a wrapper script. I'll leave that as an exercise for the reader.

This program uses a very simple depth-first recursive backtracking algorithm with up-front and ongoing elimination of invalid entries. Awk may not have the expressive power for representing complex data that Perl or other languages have, but with care, many moderate-sized problems and data sets can be handled. This algorithm may not be the best one around, but it is certainly fast enough for most problems and is easy to implement.

With any problem, representing the data effectively makes the task of designing a program much easier. I have represented the complete state of the puzzle in a matrix called "master". This is hardly used for much except keeping a record of what is where and for doing the final output.

The main workhorse variables are three other arrays. Intuitively we know from the recursive trial and error method we've chosen that we will need to check the validity of trial numbers quite often. To accelerate that process we maintain associative arrays for storing the current state for the rows, columns and each region (which, although not technically correct, we'll call a "quadrant"). These are the arrays R, C and Q. (Note that awk is case sensitive.)

Sometimes it helps when trying to factor out common computations from nested for-loops or recursive function calls to pre-compute values which are used often. I had tried this with the "regmap" matrix, which would store a quadrant number given the row and column values, but I found the time savings in this case to be ambiguous. I've left it commented out, but your mileage may vary and the technique is often very useful.

Recursive algorithms are often best designed, and therefore described, in a top-down manner. The top function in this program is called "search()" and is called from the "END" pattern after the problem data has been read into the arrays.

At a high level, search() starts with the supplied row and column parameters and looks for the next empty space to check. If there aren't any, the problem has been solved and it returns with the solution. If there is an empty space (represented by zero), it tests available numbers (with the inuse() function, for numbers not in use in that row, column or quadrant) by inserting a number into the arrays using mark() and calling itself recursively. If the recursive search() function returns a zero it means that it has failed, so the mark() function is again called to un-mark the trial number. It then loops around until the possibilities are exhausted or the search() call returns success.

The beauty of many recursive algorithms is the inherent elegance and simplicity of the solutions. While they are sometimes not the fastest algorithms, they are often "fast enough" and easier to design. This program solves most puzzles in less than a few seconds. One thing I noticed while trying this program on different puzzles is that if a problem took a longer period of time to solve (in tens of seconds), simply transposing the matrix would often give the same solution in a fraction of a second. With today's multi-core CPUs, this suggests one possibility for speeding it up: write a wrapper script which starts several instances of the program with different transpositions of the matrix. The following figure shows the puzzle above and its transposed version; the transposed problem was solved four times faster.

-------------------------------------------------------------------------------
marion:~/sudoku$ time awk -f solve.awk THE-HARDEST-SO-FAR.dat

# Searches=134214

2 8 3 6 7 1 9 4 5
9 7 6 5 4 8 2 3 1
4 1 5 3 9 2 8 7 6

5 6 7 4 1 9 3 8 2
8 3 4 2 6 7 1 5 9
1 9 2 8 3 5 4 6 7

3 2 1 7 8 6 5 9 4
7 5 8 9 2 4 6 1 3
6 4 9 1 5 3 7 2 8

real 0m10.009s
user 0m9.889s
sys 0m0.024s

marion:~/sudoku$ time awk -f solve.awk /tmp/transposed.dat

# Searches=32253

8 3 4 7 9 2 6 1 5
2 1 9 6 5 8 7 3 4
7 6 5 4 1 3 8 2 9

3 4 6 5 7 9 2 8 1
5 2 8 3 6 1 9 4 7
1 9 7 8 2 4 3 5 6

9 8 1 2 4 7 5 6 3
4 5 2 9 3 6 1 7 8
6 7 3 1 8 5 4 9 2

real 0m2.487s
user 0m2.456s
sys 0m0.008s
-------------------------------------------------------------------------------
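The search() procedure described earlier can be sketched in a few dozen lines of awk. The following is an illustrative reconstruction, not the author's solve.awk: it borrows the names master, R, C, Q, search(), inuse() and mark() from the description, while the helper quad(), the 0-based indexing, the file names sketch.awk and puzzle.dat, and the absence of the search counter and of any up-front elimination are assumptions made for this sketch. The sample input is the solved grid from the first run above with a number of cells blanked out, so that it solves quickly.

```shell
work=$(mktemp -d)

cat > "$work/sketch.awk" <<'EOF'
# Sketch of the search described in the text; NOT the original solve.awk.
# Rows and columns are numbered 0-8; a 0 in master[] means "empty".
function quad(r, c) { return int(r / 3) * 3 + int(c / 3) }

# Is digit n already present in row r, column c or their quadrant?
function inuse(r, c, n) {
    return R[r, n] || C[c, n] || Q[quad(r, c), n]
}

# val = 1 places digit n at (r, c); val = 0 removes it again.
function mark(r, c, n, val) {
    R[r, n] = val; C[c, n] = val; Q[quad(r, c), n] = val
    master[r, c] = val ? n : 0
}

# Returns 1 when every cell from (r, c) onward has been filled, else 0.
function search(r, c,    n) {
    while (r < 9 && master[r, c] != 0)      # skip cells already filled
        if (++c == 9) { c = 0; r++ }
    if (r == 9) return 1                    # no empty cell left: solved
    for (n = 1; n <= 9; n++)
        if (!inuse(r, c, n)) {
            mark(r, c, n, 1)                # try n ...
            if (search(r, c)) return 1
            mark(r, c, n, 0)                # ... and undo it on failure
        }
    return 0
}

BEGIN { row = 0 }
/^#/ || NF == 0 { next }                    # comments and blank lines
{
    for (i = 1; i <= 9; i++)
        if ($i + 0) mark(row, i - 1, $i + 0, 1)
        else master[row, i - 1] = 0
    row++
}
END {
    if (!search(0, 0)) { print "no solution"; exit 1 }
    for (r = 0; r < 9; r++) {
        line = master[r, 0]
        for (c = 1; c < 9; c++) line = line " " master[r, c]
        print line
    }
}
EOF

cat > "$work/puzzle.dat" <<'EOF'
# the solved grid from the figure above with some cells blanked out
0 0 0 6 7 1 9 4 5
0 0 0 0 4 8 2 3 1
0 0 0 3 9 2 8 7 6

5 6 7 0 0 0 3 8 2
8 3 4 0 0 0 1 5 9
1 9 2 0 0 0 4 6 7

3 2 1 7 8 6 0 0 0
7 5 8 9 2 4 0 0 0
6 4 9 1 5 3 0 0 0
EOF

solution=$(awk -f "$work/sketch.awk" "$work/puzzle.dat")
printf '%s\n' "$solution"       # first line printed: 2 8 3 6 7 1 9 4 5
rm -rf "$work"
```

Storing 1 or 0 in R, C and Q keeps inuse() down to three array lookups, and calling mark() with 0 is all that is needed to take back a trial digit, which is what makes the backtracking step cheap.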

When something even faster is required, it can often be accomplished by translating the algorithm into another language with a more direct representation of the data sets. I did a translation of this program to C once, with an interesting twist on the data indexing. This version probably executes a few orders of magnitude faster, largely due to the way that the data is represented. We will probably release "Yet Another Sudoku Puzzle Solver Using C" as another article later.

I believe that awk deserves a place in everyone's toolkit. Its simplicity relative to other languages is perhaps seen as a weakness, but I see it as one of its strengths. The language can be learned in an afternoon and used without resorting to reference books for solving many day-to-day problems. I use it on a regular basis straight from the command line, right through to implementing things such as compilers for DSLs (Domain Specific Languages).