Date post: 13-Jan-2016
Upload: ethelbert-mccormick

Programming in Unix

Regular Expressions

These expressions are used in grep, sed, awk, ed, vi and the various shells


Regular Expressions

A regular expression is a pattern to be matched

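Perl's regular expressions are close to a superset of those in these tools

Most regular expressions used in Unix tools can be used in Perl, sometimes with small syntax changes (for example, basic regular expressions in grep and sed write grouping as \( \) where Perl uses plain parentheses)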


Regular Expressions

The string abc can be a regular expression by enclosing the string in slashes:

$_ = "I know my abc's";
if (/abc/) {
    print $_;
}


Regular Expressions

Single character patterns - a character in the expression must match a single character in the string

The dot “.” matches any single character other than “\n”

/r.g/ would match rug or rag


Regular Expressions

Metacharacters or escape sequences allow you to match certain conditions in a string. \ | ( ) [ { ^ $ * + ? . are all metacharacters

A backslash in front of any metacharacter makes it non-special

5.18 would use /5\.18/

01\20\03 would use /01\\20\\03/
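A minimal sketch of the escaping rule above. The two sample strings are assumptions chosen for illustration; the point is that an unescaped dot matches more than a literal period.

```perl
use strict;
use warnings;

# Hypothetical sample strings, used only for illustration.
my @candidates = ('5.18', '5x18');

# Unescaped: the dot matches any single character, so both strings match.
my @loose = grep { /5.18/ } @candidates;

# Escaped: the dot must be a literal period, so only '5.18' matches.
my @exact = grep { /5\.18/ } @candidates;

print "unescaped dot matches: @loose\n";
print "escaped dot matches:   @exact\n";
```

Escaping costs one extra character and avoids accidental matches such as 5x18.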


Regular Expressions

Some escape sequences you might see

\a An alarm bell

\d A digit between 0 and 9

\D A non-digit character

\e The character generated by pressing Escape

\f A form feed


Regular Expressions

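\l Lowercases the next character (in double-quoted strings and replacement text)

\r A carriage return

\s A whitespace character

\S A non-whitespace character

\U Uppercases all following characters up to \E (\u uppercases only the next character)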


Regular Expressions

Pattern /m./ matches any two-character sequence that starts with m

my or me would be examples of matches


Regular Expressions

A character class uses a list of possible characters enclosed in brackets [ ]

It will match any one character listed within the brackets

[a-z] will match any single lowercase letter (a range can be used with the hyphen)

Negated character class: a ^ immediately after the opening bracket, as in [^a-z], matches any single character not in the list
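The character class rules can be sketched as follows. The word list is an assumption made up for the example; each pattern is tried against every word.

```perl
use strict;
use warnings;

# Hypothetical words, used only for illustration.
my @words = ('rug', 'rag', 'rig', 'r9g');

# [au] matches exactly one character that is either a or u.
my @a_or_u = grep { /r[au]g/ } @words;

# [a-z] matches any one lowercase letter.
my @lower = grep { /r[a-z]g/ } @words;

# [^a-z] matches any one character that is NOT a lowercase letter.
my @not_lower = grep { /r[^a-z]g/ } @words;

print "r[au]g:   @a_or_u\n";     # rug rag
print "r[a-z]g:  @lower\n";      # rug rag rig
print "r[^a-z]g: @not_lower\n";  # r9g
```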


Regular Expressions

Grouping Patterns - one or more of…

Sequence - i.e., abc means a followed by b followed by c

Multipliers

* means zero or more of the immediately previous character

+ means one or more of the immediately previous character

? means zero or one of the immediately previous character


Regular Expressions

General Multiplier

$_ = "fred xxxxxxxxxx barney";
/x{5,10}/;         # would look for 5 to 10 repetitions of the letter x
s/x{5,10}/and/;    # would substitute "and" for the x's
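A short sketch of the multipliers side by side. The three sample strings are assumptions for illustration; the last part applies the general multiplier with braces to the fred/barney string.

```perl
use strict;
use warnings;

# Hypothetical sample strings with zero, one and three b's.
my @samples = ('ac', 'abc', 'abbbc');

my @star     = grep { /ab*c/ } @samples;   # zero or more b: ac abc abbbc
my @plus     = grep { /ab+c/ } @samples;   # one or more b:  abc abbbc
my @optional = grep { /ab?c/ } @samples;   # zero or one b:  ac abc

# General multiplier: {5,10} means between 5 and 10 repetitions.
my $text = 'fred xxxxxxxxxx barney';
(my $changed = $text) =~ s/x{5,10}/and/;

print "star:     @star\n";
print "plus:     @plus\n";
print "optional: @optional\n";
print "$changed\n";                         # fred and barney
```

Note that the braces form is {5,10}; square brackets would instead create a character class.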


Regular Expressions

Parentheses

(a) matches an a

([a-z]) matches any single lowercase letter

Alternation

match exactly one of the alternatives a|b|c

/[abc]/ works the same way when every alternative is a single character
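A small sketch comparing alternation with a character class. The letter list and the sentence are assumptions for illustration; the last statement relies on the fact that parentheses also capture what they matched.

```perl
use strict;
use warnings;

# Hypothetical single-letter strings.
my @letters = ('a', 'b', 'c', 'd');

# For single characters, alternation and a character class agree.
my @by_alternation = grep { /a|b|c/ } @letters;
my @by_class       = grep { /[abc]/ } @letters;

# Alternation can also choose between whole words, which a class cannot.
# In list context the match returns the text captured by the parentheses.
my ($who) = 'I saw barney' =~ /(fred|barney)/;

print "alternation: @by_alternation\n";   # a b c
print "class:       @by_class\n";         # a b c
print "captured:    $who\n";              # barney
```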


Regular Expressions

Anchoring Patterns

Generally when a pattern is matched against a string it is evaluated from left to right, matching at the first opportunity

\b anchor requires a word boundary at the indicated point

\B requires that there is not a word boundary

^ matches the beginning of the string

$ matches the end of the string (or just before a trailing newline)


Regular Expressions

/fred\b/;    # matches fred but not frederick
/\bmo/;      # matches moe but not Elmo
/\bFred\b/;  # matches Fred but not Freddy or AlFred
/\b\+\b/;    # matches "x+y" but not "++" or " + "
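The anchor behaviour can be checked with a sketch like the one below. The candidate strings are assumptions chosen so that each pattern accepts some and rejects others.

```perl
use strict;
use warnings;

# \b needs a word character on one side and a non-word character
# (or the edge of the string) on the other.
my @fred_end   = grep { /fred\b/ }   ('fred', 'frederick');
my @mo_start   = grep { /\bmo/ }     ('moe', 'Elmo');
my @whole_fred = grep { /\bFred\b/ } ('Fred', 'Freddy', 'AlFred');

# A literal + with a word boundary on both sides needs word characters
# next to it, so only x+y qualifies.
my @plus = grep { /\b\+\b/ } ('x+y', '++', ' + ');

# ^ and $ anchor to the start and end of the string.
my @starts = grep { /^fred/ } ('fred flintstone', 'alfred');
my @ends   = grep { /fred$/ } ('fred flintstone', 'alfred');

print "fred\\b:     @fred_end\n";     # fred
print "\\bmo:       @mo_start\n";     # moe
print "\\bFred\\b:   @whole_fred\n";   # Fred
print "\\b\\+\\b:    @plus\n";         # x+y
print "^fred:      @starts\n";       # fred flintstone
print "fred\$:      @ends\n";         # alfred
```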


Regular Expressions

Precedence (highest to lowest)

Parentheses ( )

Quantifiers * + ? { }

Anchors and sequence ^ $ \b \B

Alternation |


Regular Expressions

Matches with m// (m not needed when using //)

Searching with /pattern/ is actually a shortcut for m/pattern/

You may choose any pair of delimiters to quote the contents

Where you used /fred/ you can use m(fred) or m,fred, or m<fred> or m!fred!


Regular Expressions

To use a delimiter other than the slash (/), put the letter m in front of the new delimiter, e.g. m@/usr/etc@
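A brief sketch of why alternative delimiters help when the pattern itself contains slashes. The path value is a hypothetical example.

```perl
use strict;
use warnings;

# Hypothetical path, used only for illustration.
my $path = '/usr/etc/hosts';

# With slash delimiters every slash in the pattern must be escaped.
my $slashes = ($path =~ /\/usr\/etc/) ? 'match' : 'no match';

# With m and another delimiter the slashes can be written as they are.
my $at_sign = ($path =~ m@/usr/etc@) ? 'match' : 'no match';
my $braces  = ($path =~ m{/usr/etc}) ? 'match' : 'no match';
my $bangs   = ($path =~ m!/usr/etc!) ? 'match' : 'no match';

print "slashes: $slashes\n";
print "at sign: $at_sign\n";
print "braces:  $braces\n";
print "bangs:   $bangs\n";
```

All four forms describe the same pattern; only the amount of escaping differs.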


Regular Expressions

The binding operator =~ selects a different target: it tells Perl to match the pattern on the right against the string on the left (instead of matching $_)

Ignoring case with /i

[yY] matches either upper or lower case y

/^procedure/i # matches procedure, Procedure, PROCEDURE and so on
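A minimal sketch of the binding operator and the /i modifier together. The variable names and sample strings are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical input values.
my $line = 'Procedure main';
$_ = 'nothing to see here';

# Without =~ the pattern is tested against $_.
my $against_default = /^procedure/i ? 'yes' : 'no';

# With =~ the pattern is tested against the string on the left.
my $against_line = ($line =~ /^procedure/i) ? 'yes' : 'no';

# Without /i the capital P no longer matches a lowercase p.
my $case_sensitive = ($line =~ /^procedure/) ? 'yes' : 'no';

print "matched \$_:            $against_default\n";   # no
print "matched \$line with /i: $against_line\n";      # yes
print "matched \$line exactly: $case_sensitive\n";    # no
```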


Regular Expressions

Case shifting

$_ = "I saw Barney with Fred.";
s/(fred|barney)/\U$1/gi;
# Now $_ is "I saw BARNEY with FRED."


Regular Expressions

The split operator will break up a string according to a separator. This is useful for tab-separated or colon-separated data

@fields = split /:/, "abc:def:g:h";     # gives you ("abc", "def", "g", "h")
@fields = split /:/, "abc:def::g:h";    # gives you ("abc", "def", "", "g", "h")


Regular Expressions

It is common to split on whitespace using /\s+/ as the pattern

Every run of whitespace is treated as a single separator

$input = "This is a \t test.\n";
@words = split /\s+/, $input;
# gives you ("This", "is", "a", "test.")
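A sketch of split on a colon-separated record and on whitespace. The passwd-style record is a made-up sample, not real data.

```perl
use strict;
use warnings;

# Hypothetical passwd-style record.
my $record = 'root:x:0:0:root:/root:/bin/sh';
my @fields = split /:/, $record;
my ($login, $shell) = @fields[0, 6];

# An empty field in the middle is kept as an empty string.
my @with_empty = split /:/, 'abc:def::g:h';

# Trailing empty fields are dropped by default.
my @trailing = split /:/, 'a:b:::';

# A run of spaces, tabs and newlines counts as one separator.
my @words = split /\s+/, "This is a \t test.\n";

print "login=$login shell=$shell\n";
print scalar(@with_empty), " fields with an empty one in the middle\n";   # 5
print scalar(@trailing), " fields after trailing empties are dropped\n";  # 2
print join('|', @words), "\n";                                            # This|is|a|test.
```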


Regular Expressions

Substitutions

$_ = "foot fool buffoon";
s/foo/bar/;     # $_ is now "bart fool buffoon"

s/// will make just one replacement

s/foo/bar/g;    # $_ is now "bart barl bufbarn"

/g globally replaces on all possible matches


Regular Expressions

The join function takes a list of values and glues them together. Performs the opposite of split.

For example

$info = join("\n", "Name", "Address", "Zip Code");
print $info;

will display

Name
Address
Zip Code


Regular Expressions

Or take a list

@values = (2, 4, 6, 8, 10);
$new_value = join "-", @values;           # $new_value looks like "2-4-6-8-10"
$new_value = join ":", @values;           # $new_value looks like "2:4:6:8:10"
$new_value = join "-", "cat", @values;    # $new_value looks like "cat-2-4-6-8-10"


Filehandles and File Tests

What is a filehandle?

An I/O connection between your Perl process and the outside world.

Filehandle names are like the names for labeled blocks

They are easy to confuse with future reserved words, so the recommendation is to use all UPPERCASE letters in your filehandle names


Filehandles and File Tests

Syntax is like:

open(FILEHANDLE, "somename");

FILEHANDLE is the new filehandle and somename is the external filename (such as a file or device)

To open a file for writing, use the same open statement but prefix the filename with a greater-than sign (caution: this will overwrite any existing file with the same name)

open(OUT, ">outfile");


Filehandles and File Tests

Syntax continued:

To open a file to append data to it

open(LOGFILE, ">>mylogfile");

All forms of open return true for success and false for failure

When finished with a filehandle you close it

close(LOGFILE);

Reopening a filehandle will close the previous version


Filehandles and File Tests

When a filehandle does not open successfully you can use the die function to report that an error has occurred

An unless statement can be used like a logical or

unless (this) { that; }
this || that;

unless statement used to report a failed open

unless (open(DATAPLACE, ">/tmp/dataplace")) {
    print "Sorry, I couldn't create your file";
} else {
    # the rest of your program
}


Filehandles and File Tests

Or make it even simpler with:

unless (open DATAPLACE, ">/tmp/dataplace") {
    die "Sorry, I couldn't create your file";
}

or

open(DATAPLACE, ">/tmp/dataplace") ||
    die "Sorry, I couldn't create your file";
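The write, append and failure cases can be sketched together as below. This is a variation rather than the slides' own style: it assumes the three-argument form of open with lexical filehandles instead of the two-argument bareword form, and it writes into a temporary directory (via File::Temp) so that no real file is overwritten. The file name is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a throwaway directory that is removed when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/dataplace";          # hypothetical file name

# '>' creates or overwrites; $! holds the system's error message on failure.
open(my $out, '>', $path) or die "Sorry, I couldn't create $path: $!";
print {$out} "first line\n";
close($out) or die "Couldn't close $path: $!";

# '>>' appends to the existing contents.
open(my $log, '>>', $path) or die "Sorry, I couldn't append to $path: $!";
print {$log} "second line\n";
close($log) or die "Couldn't close $path: $!";

# Read the file back to confirm both lines are there.
open(my $in, '<', $path) or die "Sorry, I couldn't read $path: $!";
my @lines = <$in>;
close($in);

# A failed open returns false; it does not stop the program by itself.
my $ok = open(my $missing, '<', "$dir/no/such/file");
print "open failed as expected: $!\n" unless $ok;

print "lines in file: ", scalar(@lines), "\n";   # 2
```

Including $! in the die message tells the user why the open failed, for example a missing directory or a permissions problem.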


Filehandles and File Tests

The -x File Tests

Suppose you wanted to make sure that there wasn't a file by that name (so you don't blow away valuable data) when you open and write to a file

Use file tests (see page 157-8)

-e tests whether a file or directory exists
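A minimal sketch of guarding a write with the -e test. The helper name create_if_absent and the file name are hypothetical, and the example assumes a temporary directory from File::Temp so it is safe to run.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/outfile";            # hypothetical file name

# Hypothetical helper: create the file only if nothing by that name exists.
# Returns 1 if the file was created, 0 if it was left alone.
sub create_if_absent {
    my ($file) = @_;
    return 0 if -e $file;             # -e is true for an existing file or directory
    open(my $fh, '>', $file) or die "Cannot create $file: $!";
    print {$fh} "valuable data\n";
    close($fh) or die "Cannot close $file: $!";
    return 1;
}

my $first  = create_if_absent($path);   # creates the file
my $second = create_if_absent($path);   # refuses, the file is already there

print "first call created the file:  $first\n";    # 1
print "second call created the file: $second\n";   # 0
```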


Formats

Helps you generate simple, formatted reports and charts

Keeps track of number of lines per page, current page

Use “format” to declare and “write” to execute


Declaring a Format

format MYNAME =
FORMLIST
.

Note: if MYNAME is omitted, the format is named STDOUT

FORMLIST is a list containing the following

A comment (start the line with #)

A "picture" line giving the layout for one output line

An argument line supplying values to plug into the previous "picture" line


Special Values

FORMAT_NAME_TOP defines text that will appear at the top of each page

FORMAT_NAME section defines format and variables for each line that should print as the body of the report

You should define the format and format_top together somewhere in your program (often seen at the end).


Example

# a report on the /etc/passwd file
format MY_REPORT_TOP =
Password File Report
Name Login Uid Gid Shell Home
-------------------------------------------------------------------
.


Example

# how to send output to the screen
format STDOUT =
Password File Report
Name Login Uid Gid Shell Home
-------------------------------------------------------------------
.
write;    # STDOUT is already open, so no open call is needed


Example (cont...)

format MY_REPORT =
@<<<<< @||||||| @<<<< @>>>> @>>>> @<<<<<<<<<<<<
$name, $login, $uid, $gid, $shell, $home
.

Then to print this when you want:

write MY_REPORT;
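A small runnable sketch of a format whose name matches its filehandle. The two records are invented sample data, and the filehandle is assumed to be opened on an in-memory string (instead of a real file) so the result is easy to inspect; only three of the report's columns are shown.

```perl
use strict;
use warnings;

# Package variables, so the format can see them each time write runs.
our ($name, $login, $uid);

# Hypothetical sample records (not real /etc/passwd data).
my @records = (
    [ 'Alice Example', 'alice', 1001 ],
    [ 'Bob Example',   'bob',   1002 ],
);

# Open the filehandle on an in-memory string rather than a file.
my $report = '';
open(MY_REPORT, '>', \$report) or die "Cannot open in-memory file: $!";

for my $rec (@records) {
    ($name, $login, $uid) = @$rec;
    write MY_REPORT;          # uses the format named MY_REPORT by default
}
close(MY_REPORT);

print $report;

# Left-justified name, centered login, right-justified uid.
format MY_REPORT =
@<<<<<<<<<<<<<< @|||||| @>>>>
$name,          $login, $uid
.
```

Each call to write produces one output line, because the format contains a single picture line.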


Example of Code

#!/usr/local/bin/perl -w
print "This is an address label program\n";
print "Enter your name: \n";
chomp($name = <STDIN>);       # read one line and strip the trailing newline
print "Enter your street address: \n";
chomp($street = <STDIN>);
print "Enter your City, State, and Zip: \n";
chomp($therest = <STDIN>);
# The format below has the same name as the filehandle, so write uses it by default
open(AddressLabel, ">myaddrlist") || die "Cannot create myaddrlist: $!";
write(AddressLabel);
close(AddressLabel);

format AddressLabel =
==================================
| @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
$name
| @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
$street
| @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< |
$therest
==================================
.


Example Entering Data

# addrlabel.pl
This is an address label program
Enter your name:
Mike
Enter your street address:
14590 Roller Coaster Rd
Enter your City, State, and Zip:
Denver, CO 80931


Example Output to File

# cat myaddrlist
==================================
| Mike                           |
| 14590 Roller Coaster Rd        |
| Denver, CO 80931               |
==================================


Format Pictures

@ or ^ indicates substitution at run-time

< left justify

> right justify

| centering

If the variable has more characters than the format picture, it will be truncated

To avoid truncating use "@*" on a format line by itself.


The ^ Picture

Starting a field with ^ allows you to print part of the text with the first call

The next time you reference it, the string will only contain that part of the string that has not been printed and the next n characters will be printed and so on...

Warning!: this does destroy the original value of the variable so store it off if you will need it again.
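A sketch of how a ^ field consumes its variable across successive picture lines. The description text and the 20-character field width are assumptions for illustration, and the filehandle is opened on an in-memory string so the output can be inspected.

```perl
use strict;
use warnings;

# Hypothetical text that is too long for one 20-character field.
our $description = 'The quick brown fox jumps over the lazy dog near the river bank';

# Keep a copy, because each ^ field removes the text it printed.
my $original = $description;

my $out = '';
open(BUG, '>', \$out) or die "Cannot open in-memory file: $!";
write BUG;
close(BUG);

print $out;
print "left over in \$description: '$description'\n";

# Three ^ fields: each prints the next chunk of whatever text remains.
format BUG =
^<<<<<<<<<<<<<<<<<<<
$description
^<<<<<<<<<<<<<<<<<<<
$description
^<<<<<<<<<<<<<<<<<<<
$description
.
```

After write returns, $description holds only the text that did not fit, which is why the copy in $original is taken first.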


Example of the ^

# a report from a bug report form
format BUG_REPORT =
Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$subject
From: @<<<<<<<<<<<<<< Priority: @<<<<<<<<<<
$from, $priority
Description:
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$description
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
$description
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<…
$description
.


Special Variables

$~ contains $FORMAT_NAME

$^ contains $FORMAT_NAME_TOP

$% contains the current output page number

$= contains number of lines per page

$- contains lines remaining on current page (set to zero to force a new page)


To Use Special Variables

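You can use these by "selecting":

$myform = select(MYFORMAT);    # select returns the previously selected filehandle
$~ = "My_Other_Format";        # body format for MYFORMAT
$^ = "My_Top_Format";          # top-of-page format for MYFORMAT
select($myform);               # restore the previous filehandle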
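A runnable sketch of the select pattern, assuming hypothetical format names ITEM_TOP and ITEM_LINE, a filehandle named REPORT opened on an in-memory string, and made-up rows of data.

```perl
use strict;
use warnings;

# Package variables read by the body format.
our ($item, $count);

my $buffer = '';
open(REPORT, '>', \$buffer) or die "Cannot open in-memory file: $!";

# $~ and $^ apply to the currently selected filehandle, so select REPORT,
# set them, then restore whatever was selected before.
my $previous = select(REPORT);
$^ = 'ITEM_TOP';     # format printed at the top of each page
$~ = 'ITEM_LINE';    # format printed by each write
select($previous);

# Hypothetical rows of data.
for my $row (['apples', 3], ['pears', 12]) {
    ($item, $count) = @$row;
    write REPORT;
}
close(REPORT);

print $buffer;

format ITEM_TOP =
Item       Count
----------------
.

format ITEM_LINE =
@<<<<<<<<< @>>>>
$item,     $count
.
```

Because the format names are set explicitly, they no longer need to match the filehandle name.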

