
Introduction to Programming the WWW I CMSC 10100-1 Summer 2004 Lecture 13.

Date post: 02-Jan-2016
Upload: valentine-hodges

Introduction to Programming the WWW I

CMSC 10100-1

Summer 2004

Lecture 13

Today’s Topics

• CGI module

• Patterns and regular expressions

Perl Modules

• A Perl module is a self-contained piece of Perl code that can be used by a Perl program or by other Perl modules
  - Conceptually similar to a C link library, or a C++ class
  - Perl 5 module list

• Each Perl module has a unique name
  - Perl provides a hierarchical name space for modules
  - Components of a module name are separated by double colons (::)
  - Example:
    • CGI
    • Math::Complex

Perl Modules (cont’d)

• Each module is contained in a single file
  - Module files are stored in a subdirectory hierarchy that parallels the module name hierarchy
  - All module files have an extension of .pm
  - Example:
    • Math::Complex is stored in Math/Complex.pm

• Finding module libraries
  - The Perl interpreter has a list of directories in which it searches for modules. This list is available in the global array @INC
  - Use perl -V to see the initial contents of @INC
  - Local modules vs. modules coming from the standard distribution
    • CGI is stored in /opt/perl/perl-5.005.03/lib/5.00503/CGI.pm
    • Math::Complex is actually stored in /opt/perl/perl-5.005.03/lib/5.00503/Math/Complex.pm
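The mapping from a module name to a file under one of the @INC directories can be sketched in a few lines. This is only an illustration: module_to_path is a hypothetical helper name, and Math::Complex is used because it ships with standard Perl distributions.

```perl
use strict;
use warnings;
use Math::Complex;    # loading the module records its location in %INC

# Hypothetical helper: convert a module name to the relative file path
# that Perl looks for in each @INC directory.
sub module_to_path {
    my ($module) = @_;
    ( my $path = $module ) =~ s{::}{/}g;    # Math::Complex -> Math/Complex
    return "$path.pm";
}

my $relative = module_to_path('Math::Complex');
print "Relative path: $relative\n";
print "Loaded from:   $INC{$relative}\n";
print "Directories searched (\@INC):\n";
print "  $_\n" for @INC;
```

The directories printed will differ from the /opt/perl/... paths on the slide, since @INC depends on how Perl was installed on the machine.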

Using Perl Modules

• Modules must be imported in order to be accessible to a script
  - This is done with the use function
  - use statements are commonly made at the beginning of a program or subroutine
    • This makes it easier to understand the program and see which modules are loaded.
  - Example:

use Math::Complex;
use CGI ":standard";    # ":standard" is a modifier to a module

http://world.std.com/~swmcd/steven/perl/module_mechanics.html
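As a small sketch of what importing buys you, the following uses Math::Complex, one of the two modules named in the example. The variable names are illustrative only; cplx(), Re(), Im() and the complex-aware sqrt() are functions the module exports by default.

```perl
use strict;
use warnings;
use Math::Complex;    # imports cplx(), Re(), Im() and a complex-aware sqrt()

# Without the import, sqrt() of a negative number is a fatal error.
my $root = sqrt(-4);
my $sum  = cplx( 1, 2 ) + cplx( 3, 4 );

print "sqrt(-4) = $root\n";
print "sum      = $sum\n";
print "real part of sum = ", Re($sum), ", imaginary part = ", Im($sum), "\n";
```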

Using CGI.pm to generate HTML

• The CGI.pm module provides several functions that can be used to concisely output HTML tags

• For example,

$mypage = 'It is a New Day';
print "<HTML><HEAD><TITLE> $mypage </TITLE></HEAD><BODY>";

can also be written as:

$mypage = 'It is a New Day';
print start_html($mypage);

3 Basic CGI.pm Functions

• header creates the MIME Content-type line

• start_html creates starting HTML tags

• end_html creates ending HTML tags

#!/usr/local/bin/perl
use CGI ':standard';
print header;
print start_html;
print '<FONT size=4 color="blue">';
print 'Welcome <I>humans</I> to my site</FONT>';
print end_html;

http://people.cs.uchicago.edu/~hai/hw4/cgipm1.cgi

CGI.pm Basic Functions

• The various CGI.pm functions accept 3 basic syntactic formats:
  - No argument format
    • functions that can be used without any arguments
  - Positional argument format
    • functions that can accept comma-separated arguments within parentheses
  - Name-value argument format
    • functions that accept parameters submitted as name-and-value pairs

No Argument Format

• The previous example shows the start_html, header, and end_html functions

• You can place one or more functions directly within a print statement

print start_html, br, br, hr;

  - Comma-separate each CGI.pm function call.
  - print will output the HTML tags generated directly.

Would output

<HTML><HEAD><TITLE></TITLE></HEAD><BODY><BR><BR><HR>

Some Single Argument Functions

CGI.pm Function | Example of Use | Example Output
header: the MIME Content-type line | print header; | Content-type: text/html\n\n
start_html: tags to start an HTML document | print start_html; | <HTML><HEAD><TITLE></TITLE></HEAD><BODY>
br: output <BR> tag | print br; | <BR>
hr: generate horizontal rule | print hr; | <HR>
end_html: end an HTML document | print end_html; | </BODY></HTML>

Positional Argument Format

• Specify multiple arguments based on the position of the argument

• For example

print h1('Hello World');

  - h1 generates the <H1> ... </H1> tags.
  - The argument is used as the string to include in the <H1> ... </H1> tags.

would output

<H1>Hello World</H1>

Some Positional Functions

CGI.pm Function | Example of Use | Example Output
start_html(): tags needed to start an HTML document | print start_html('My Page'); | <HTML><HEAD><TITLE>My Page</TITLE></HEAD><BODY>
h1(): header level 1 tags (also h2(), ..., h6()) | print h1('Hello There'); | <H1>Hello There</H1>
strong(): outputs the argument in strong | print strong('Now'); | <STRONG>Now</STRONG>
p(): creates a paragraph | print p('Time to move'); | <P>Time to move</P>
b(): prints the argument in bold | print b('Exit'); | <B>Exit</B>
i(): prints the argument in italics | print i('Quick'); | <I>Quick</I>

Operating on Variables

• Can concisely use functions with a single print statement:

print i('Please '), 'come when I call you ', strong('immediately.');

• This code would output the following:

<I>Please </I>come when I call you <STRONG>immediately.</STRONG>

Consider the following example:

#!/usr/local/bin/perl
use CGI ':standard';
print header, start_html('Positional Example'), h1('Simple Math');
print b('two times two='), 2*2;
print br, 'but ', b('four times four='), 4*4;
print br, 'Finally, ', b('eight times eight='), 8*8;
print end_html;

http://people.cs.uchicago.edu/~hai/hw4/cgipm2.cgi

Name-Value Argument Format

• Can specify names and values as follows:

print start_html( { -title=>'My Title', -bgcolor=>'yellow' } );

  - Use curly brackets to enclose your arguments.
  - The argument name is specified after a dash.
  - The => sequence separates the argument name from the value.
  - Place the argument value in single quotes.
  - Comma-separate arguments.

• Would output the following:

<HTML><HEAD><TITLE>My Title</TITLE></HEAD><BODY BGCOLOR="yellow">

Some name/value functions

CGI.pm Function | Example Usage | Example Output
start_html: start HTML document | print start_html({ -title=>'my title', -bgcolor=>'red' }); | <HTML><HEAD><TITLE>my title</TITLE></HEAD><BODY BGCOLOR="red">
img: inserts an image | print img({ -src=>'myfile.gif', -alt=>'picture' }); | <IMG SRC="myfile.gif" ALT="picture">
a: establishes links | print a({ -href=>'http://www.mysite.com' }, 'Click Here'); | <A HREF="http://www.mysite.com">Click Here</A>
font(): creates <FONT> ... </FONT> tags | print font({ -color=>'BLUE', -size=>'4' }, 'Lean, and mean.'); | <FONT SIZE="4" COLOR="BLUE">Lean, and mean.</FONT>

Example Name/Value Program

#!/usr/local/bin/perl
use CGI ':standard';
print header;
print start_html({-title=>'New Day ', -bgcolor=>'yellow'});
print 'Welcome One And ', i('All');
print end_html;

http://people.cs.uchicago.edu/~hai/hw4/cgipm3.cgi

Using CGI.pm with HTML forms

CGI.pm Function | Example Usage | Example Output
start_form: start HTML form element | print start_form({ -method=>'post', -action=>'http://people.cs.uchicago.edu/~wfreis/cgi-bin/reflector.pl' }); | <form method="post" action="http://people.cs.uchicago.edu/~wfreis/cgi-bin/reflector.pl">
textfield, password_field: inserts a text field or password field | print textfield(-name=>'textfield1', -size=>'50', -maxlength=>'50'); | <input type="text" name="textfield1" size=50 maxlength=50 />
scrolling_list: inserts a multiple-selection list | print scrolling_list(-name=>'list1', -values=>['eenie', 'minie', 'moe'], -default=>['eenie', 'moe'], -size=>5, -multiple=>'true'); | <select name="list1" size=5 multiple><option selected value="eenie">eenie</option><option value="minie">minie</option><option selected value="moe">moe</option></select>
textarea: inserts a text area | print textarea(-name=>'large_field_name', -rows=>10, -columns=>50); | <textarea name="large_field_name" rows=10 cols=50></textarea>

Using CGI.pm with HTML forms (cont’d)

CGI.pm Function | Example Usage | Example Output
checkbox_group: inserts a group of checkboxes | print checkbox_group(-name=>'color', -values=>['red ','orange ','yellow '], -default=>['red ']); | <input type="checkbox" name="color" value="red " checked />red <input type="checkbox" name="color" value="orange " />orange <input type="checkbox" name="color" value="yellow " />yellow
radio_group: inserts a group of radio buttons | print radio_group(-name=>'color blind', -values=>['Yes','No'], -default=>'No'); | <input type="radio" name="color blind" value="Yes" />Yes<input type="radio" name="color blind" value="No" checked />No
submit, reset: insert a submit or reset button | print submit('submit', 'Submit'); print reset; | <input type="submit" name="submit" value="Submit" /><input type="reset" />
endform: prints the end form tag | print endform(); | </form>

Perl CGI Reference

A CGI Form Example

http://people.cs.uchicago.edu/~hai/hw4/cgiform1.cgi

Receiving HTML Form Arguments

• Within the CGI program, call the param() function
  - Input variables into CGI/Perl programs are called CGI variables
    • Values received through your Web server as input from a Web browser, usually filled in a form
  - To use param():

$thecolor = param('color');

  - The CGI variable name goes in quotation marks.
  - The value of the CGI variable is assigned to $thecolor.

Receiving HTML Form Arguments

The Calling HTML Form

<FORM ACTION="cgiform1_checker.cgi" METHOD="POST">
print "What is your favourite color?";
print checkbox_group(-name=>'color',
    -values=>['red ','orange ','yellow ','green ','blue ','indigo ','violet '],
    -default=>['red ','blue ']);
.
.
.
</FORM>

  - ACTION gives the URL of the program to send form output to.
  - The name of the argument from the checkbox is color.

The Receiving CGI/Perl Program

#!/usr/local/bin/perl
use CGI ":standard";
print header;
print "Your favourite color: ", param('color');
. . .
print end_html;

  - param('color') gets the value of the form element called color.

http://people.cs.uchicago.edu/~hai/hw4/cgiform1.cgi

Sending Arguments

• You can send arguments to your CGI program directly from the URL address of a browser

http://people.cs.uchicago.edu/~hai/hw4/cgiform1_checker.cgi?color=red

  - The part before the "?" is the URL of the CGI program to start.
  - The "?" signals that an argument follows.
  - The argument name is color. Its value is red.

Sending Multiple Arguments

http://people.cs.uchicago.edu/~hai/hw4/cgiform1_checker.cgi?color=red&secret=nothing

  - Precede the first argument with ?
  - Precede the next argument with &

Debug CGI Program in Command Line

• To start the program and send it an argument, you can execute the following:

perl cgiform1_checker.cgi color=red

• Enclose blank spaces or multiple arguments in quotation marks:

perl cgiform1_checker.cgi 'color=rose red'

perl cgiform1_checker.cgi 'color=red&secret=none'

Check CGI Variables' Values

• Perl provides a simple way to test whether a parameter was received or is empty:

$var = param('some_cgi_variable');
if ($var) {
    statement(s) to execute when $var has a value
} else {
    statement(s) to execute when $var has no value
}
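One caveat with the if ($var) test: Perl treats the string '0' and the empty string as false, so a field whose submitted value is 0 takes the else branch. A minimal sketch of a stricter check follows. The hash %form is a hypothetical stand-in for values that param() would return in a real CGI program, and has_value is an illustrative helper name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for submitted form data; in a real CGI program
# these values would come from param().
my %form = ( color => 'red', quantity => '0', comment => '' );

# Report whether a field was received and is non-empty, even if it is '0'.
sub has_value {
    my ($value) = @_;
    return defined $value && length $value ? 1 : 0;
}

for my $field (qw(color quantity comment missing)) {
    my $status = has_value( $form{$field} ) ? 'has a value' : 'has no value';
    print "$field $status\n";
}
```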

Combining Program Files

• Applications so far have required two separate files: one file to generate the form, and the other to process the form
  - Example: cgiform1.cgi and cgiform1_checker.cgi
  - Can test the return value of param() to combine these

• At least two advantages
  - With one file, it is easier to change arguments
  - It is easier to maintain one file

Combining Program Files

if ( !param() ) {
    &create_form();
} else {
    &process_form();
}

  - !param() checks to see if there are any parameters.
  - If there are no parameters, then this is the first time for the program. Call create_form to create the form.
  - Otherwise there must be some parameters to process, so call process_form.

http://people.cs.uchicago.edu/~hai/hw4/cgiform2.cgi

CGI Module: Advanced Topic*

• Functional (procedural) orientation
  - use CGI ':standard';

• Object orientation
  - use CGI;
  - Call the new() operator to create a CGI object and store it in a variable. The functions of CGI.pm are accessed through the -> operator with the object variable on the left side
    • $q = new CGI;
    • print $q->header();

http://www.classes.cs.uchicago.edu/classes/archive/2004/winter/10100-1/02/perl/perl_index.html

Several Resources

• URL
  - http://www.classes.cs.uchicago.edu/classes/archive/2004/winter/10100-1/02/perl/perl_index.html

• Topics
  - How to write your first CGI script
  - Checking CGI Parameters on the Command Line
  - Server-side Validation
  - Hidden HTML Form Fields
  - Sorting with Perl

Patterns in String Variables

• Many programming problems require matching, changing, or manipulating patterns in string variables.
  - An important use is verifying input fields of a form
    • This helps provide security against accidental or malicious attacks.
    • For example, if expecting a form field to provide a telephone number as input, your program needs a way to verify that the input comprises a string of seven digits.
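The seven-digit telephone check mentioned above could look roughly like this. It is a sketch, not the course's own code: is_seven_digit_phone is a hypothetical helper name, and the pattern uses the anchors and quantifiers introduced later in this lecture.

```perl
use strict;
use warnings;

# Return 1 when the input is exactly seven digits and nothing else.
sub is_seven_digit_phone {
    my ($input) = @_;
    return $input =~ /^\d{7}$/ ? 1 : 0;
}

for my $candidate ( '5551234', '555-1234', '55512345', 'call me' ) {
    my $verdict = is_seven_digit_phone($candidate) ? 'accepted' : 'rejected';
    print "'$candidate' $verdict\n";
}
```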

Four Different Constructs

• Will look at 4 different Perl string manipulation constructs:
  - The match operator enables your program to look for patterns in strings.
  - The substitute operator enables your program to change patterns in strings.
  - The split function enables your program to split strings into separate variables based on a pattern. (already covered)
  - Regular expressions provide a pattern matching language that can work with these operators and functions to work on string variables.

The Match Operator

• The match operator is used to test if a pattern appears in a string.
  - It is used with the binding operator ("=~") to see whether a variable contains a particular pattern.

if ( $name =~ m/edu/ ) {
    set of statements to execute
}

  - The binding operator indicates to examine the contents of $name.
  - m/edu/ tries to match the pattern inside the slashes ("/"). In this case the pattern is "edu".
  - The statements execute if 'edu' is ANYWHERE in the contents of the string variable $name.

Possible Values of $name

Value of $name | Test from Figure 7.1
'www.myschool.edu' | True because the string contains edu
'www.myschool.com' | False because edu is not in the string
'I like my education' | True because the string contains edu
'I Like My Education' | False because matching is case sensitive
'I liked umbrellas' | False because edu is not in the string
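The table can be reproduced by running the same match over each value. The sketch below uses only the five strings from the table; the variable names are illustrative.

```perl
use strict;
use warnings;

my @names = (
    'www.myschool.edu',
    'www.myschool.com',
    'I like my education',
    'I Like My Education',
    'I liked umbrellas',
);

# Map each string to 1 (pattern found) or 0 (pattern not found).
my %contains_edu = map { $_ => ( $_ =~ m/edu/ ? 1 : 0 ) } @names;

printf "%-22s %s\n", $_, $contains_edu{$_} ? 'True' : 'False' for @names;
```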

Using Character Class

• Matching any one in a set of characters enclosed within square brackets
  - foo[bc]ar will match foobar and foocar

• Ranges can be expressed inside of a character class by using a dash between two characters
  - [a-g] is equal to [abcdefg]
  - [0-9] is equal to any digit
  - [a-zA-Z] matches any uppercase or lowercase letter

• Negative character class: use the caret (^) symbol as the first thing in the character class
  - a[^bc]d, [^0-9]
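A short sketch that exercises the character classes above. The helper name matches and the test strings other than foobar and foocar are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if $string matches $pattern, else 0.
sub matches {
    my ( $pattern, $string ) = @_;
    return $string =~ $pattern ? 1 : 0;
}

print "foo[bc]ar vs foocar: ", matches( qr/foo[bc]ar/, 'foocar' ),  "\n";
print "foo[bc]ar vs foodar: ", matches( qr/foo[bc]ar/, 'foodar' ),  "\n";
print "^[a-g]+\$ vs cabbage: ", matches( qr/^[a-g]+$/,  'cabbage' ), "\n";
print "^[0-9]+\$ vs 2004:    ", matches( qr/^[0-9]+$/,  '2004' ),    "\n";
print "a[^bc]d vs abd:      ", matches( qr/a[^bc]d/,   'abd' ),     "\n";
print "a[^bc]d vs axd:      ", matches( qr/a[^bc]d/,   'axd' ),     "\n";
```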

Other Delimiters?

• Slash ("/") is the most common match pattern delimiter
  - Others are possible. For example, both of the following use valid match operator syntax:

if ( $name =~ m!Dave! ) {
if ( $name =~ m<Dave> ) {

• The reverse binding operator tests if the pattern is NOT found:

if ( $color !~ m/blue/ ) {

• Demo: http://www.people.cs.uchicago.edu/~wfreis/regex/regex_match.pl

The Substitution Operator

• Similar to the match operator but also enables you to change the matched string.
  - Use with the binding operator ("=~") to test whether a variable contains a pattern and, if so, replace it

$stringvar =~ s/ABC/abc/;

  - $stringvar is the string variable to search and potentially substitute the pattern in.
  - ABC is the pattern to search for.
  - abc is the pattern to change to if there is a match.

How It Works

• Replaces the first occurrence of the search pattern with the change pattern in the string variable.

• For example, the following changes the first occurrence of t to T:

$name = "tom turtle";
$name =~ s/t/T/;
print "Name=$name";

• The output of this code would be

Name=Tom turtle

Changing All Occurrences

• You can place a g (for global substitution) at the end of the substitution expression to change all occurrences of the target pattern string in the search string. For example,

$name = "tom turtle";
$name =~ s/t/T/g;
print "Name=$name";

• The output of this code would be

Name=Tom TurTle

• Demo: http://www.people.cs.uchicago.edu/~wfreis/regex/regex_sub.pl
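The difference between a single and a global substitution can be seen side by side. In this sketch the variable names are illustrative; it also relies on the fact that s/// returns the number of substitutions it made.

```perl
use strict;
use warnings;

my $first_only = 'tom turtle';
my $every      = 'tom turtle';

# s/// returns how many substitutions were made.
my $changed_once = ( $first_only =~ s/t/T/ );     # first occurrence only
my $changed_all  = ( $every      =~ s/t/T/g );    # every occurrence

print "Without g: $first_only ($changed_once change)\n";
print "With g:    $every ($changed_all changes)\n";
```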

Using Translate

• A similar function is called tr (for "translate"). Useful for translating characters from uppercase to lowercase, and vice versa.
  - The tr function allows you to specify a range of characters to translate from and a range of characters to translate to:

$name = "smokeY";
$name =~ tr/a-z/A-Z/;
print "name=$name";

Would output the following

name=SMOKEY
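A small sketch of two common tr idioms: translating a copy while leaving the original alone, and using tr with an empty replacement list purely to count characters. Variable names are illustrative.

```perl
use strict;
use warnings;

my $name = 'smokeY';

# Copy $name, then translate the copy; $name itself is left unchanged.
( my $upper = $name ) =~ tr/a-z/A-Z/;

# With an empty replacement list, tr only counts the matching characters.
my $lowercase_letters = ( $name =~ tr/a-z// );

print "original=$name upper=$upper lowercase letters=$lowercase_letters\n";
```

For simple case conversion, Perl's built-in uc() and lc() functions do the same job; tr is more useful when the two character ranges are arbitrary.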

A Full Pattern Matching Example

#!/usr/local/bin/perl
use CGI ':standard';
print header, start_html('Command Search');
@PartNums = ( 'XX1234', 'XX1892', 'XX9510' );
$com = param('command');
$prod = param('uprod');
if ($com eq "ORDER" || $com eq "RETURN") {
    $prod =~ s/xx/XX/g;    # switch xx to XX
    if ($prod =~ /XX/ ) {
        foreach $item ( @PartNums ) {
            if ( $item eq $prod ) {
                print "VALIDATED command=$com prodnum=$prod";
                $found = 1;
            }
        }
        if ( $found != 1 ) {
            print br, "Sorry Prod Num=$prod NOT FOUND";
        }
    } else {
        print br, "Sorry that prod num prodnum=$prod looks wrong";
    }
} else {
    print br, "Invalid command=$com did not receive ORDER or RETURN";
}
print end_html;

Would Output The Following ...

[Screenshot of the browser output]

Using Regular Expressions

• Perl supports regular expressions to enable programs to match patterns more completely.
  - They actually make up a small language of special matching operators that can be employed to enhance the Perl string pattern matching.

The Alternation Operator

• The alternation operator looks for alternative strings for matching within a pattern.
  - That is, you use it to indicate that the program should match one pattern OR the other.
  - The following shows a match statement using the alternation operator (left) and some possible matches based on the contents of $address (right); this pattern matches either com or edu.

Example Alternation Operator

Match Statement | Possible Matching String Values for $address
if ( $address =~ /com|edu/ ) { | "www.mysite.com", "Welcome to my site", "Time for education", "www.mysite.edu"

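The alternation table above can be checked directly. The sketch below adds one extra address (www.mysite.org, an assumption for contrast) and also shows a tighter, anchored variant that is not from the slides: it only accepts strings that end in .com or .edu.

```perl
use strict;
use warnings;

my @addresses = (
    'www.mysite.com',
    'Welcome to my site',      # matches because "Welcome" contains "com"
    'Time for education',      # matches because "education" contains "edu"
    'www.mysite.edu',
    'www.mysite.org',
);

# The pattern from the slide: com OR edu anywhere in the string.
my @loose = grep { /com|edu/ } @addresses;

# A stricter variant (an assumption, not from the slides): the string
# must end in a literal dot followed by com or edu.
my @strict = grep { /\.(com|edu)$/ } @addresses;

print "Loose matches:  ", join( ', ', @loose ),  "\n";
print "Strict matches: ", join( ', ', @strict ), "\n";
```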
Parentheses For Groupings

• You use parentheses within regular expressions to specify groupings. For example, the following matches a $name value of Dave or David.

Match Statement | Possible Matching String Values for $name
if ( $name =~ /Dav(e|id)/ ) { | "Dave", "David", "Dave was here", "How long before David comes home"

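Grouping changes what the alternation applies to, and the parentheses also capture the matched alternative in $1. A brief sketch, where which_dave is a hypothetical helper name and 'Sidney' is an illustrative test string:

```perl
use strict;
use warnings;

# Return the name that was found, rebuilt from the captured group,
# or the string 'no match'.
sub which_dave {
    my ($text) = @_;
    return $text =~ /Dav(e|id)/ ? "Dav$1" : 'no match';
}

print which_dave('Dave was here'),                    "\n";
print which_dave('How long before David comes home'), "\n";
print which_dave('Sidney'),                           "\n";

# Without the parentheses the alternatives are "Dave" and "id",
# so any string containing "id" matches.
my $ungrouped = 'Sidney' =~ /Dave|id/ ? 1 : 0;
print "Ungrouped pattern matches Sidney: $ungrouped\n";
```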
Special Character Classes

• Perl has a special set of character classes for shorthand pattern matching

• For example, consider these two statements

if ( $name =~ m/ / ) {

will match $name with an embedded space character

if ( $name =~ m/\s/ ) {

will match $name with an embedded space, tab, or newline

Special Character Classes

Character Class | Meaning
\s | Matches a single whitespace character (space, tab, newline, return, or formfeed). For example, the following matches "Apple Core", "Alle y", and "Here you go"; it does not match "Alone": if ( $name =~ m/e\s/ ) {
\S | Matches any character that is not a space, tab, newline, return, or formfeed. For example, the following matches "ZT", "YT", and ";T": if ( $part =~ m/\ST/ ) {

Special Character Classes - II

Character Class | Meaning
\w | Matches any word character (uppercase or lowercase letters, digits, or the underscore character). For example, the following matches "Apple", "Time", "Part time", "time_to_go", " Time", and "1234"; it does not match "#%^&": if ( $part =~ m/\w/ ) {
\W | Matches any nonword character (not uppercase or lowercase letters, digits, or the underscore character). For example, the following matches "A*B" and "A{B", but not "A**B", "AB*", "AB101", or "1234": if ( $part =~ m/A\WB/ ) {

Special Character Classes - III

Character Class | Meaning
\d | Matches any valid numerical digit (that is, any number 0-9). For example, the following matches "B12abc", "The B1 product is late", "I won bingo with a B9", and "Product B00121"; it does not match "B 0", "Product BX 111", or "Be late 1": if ( $part =~ m/B\d/ ) {
\D | Matches any non-numerical character (that is, any character not a digit 0-9). For example, the following matches "AB1234", "Product number 1111", "Number VG928321212", "The number_A1234", and "Product 1212"; it does not match "1212" or "PR12": if ( $part =~ m/\D\D\d\d\d\d/ ) {
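A handful of the examples from the three character-class tables can be verified mechanically. This sketch pairs each pattern with a string and the result the tables claim, then compares that claim with what Perl reports; the table-driven layout and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

# Each case: [ pattern, test string, result claimed in the tables ]
my @cases = (
    [ qr/e\s/,           'Apple Core',     1 ],
    [ qr/e\s/,           'Alone',          0 ],
    [ qr/\ST/,           ';T',             1 ],
    [ qr/A\WB/,          'A*B',            1 ],
    [ qr/A\WB/,          'A**B',           0 ],
    [ qr/B\d/,           'Product B00121', 1 ],
    [ qr/B\d/,           'Be late 1',      0 ],
    [ qr/\D\D\d\d\d\d/,  'AB1234',         1 ],
    [ qr/\D\D\d\d\d\d/,  'PR12',           0 ],
);

my $agreements = 0;
for my $case (@cases) {
    my ( $pattern, $string, $expected ) = @$case;
    my $actual = $string =~ $pattern ? 1 : 0;
    $agreements++ if $actual == $expected;
    printf "%-24s %-18s expected=%d actual=%d\n",
        $pattern, "'$string'", $expected, $actual;
}
print "$agreements of ", scalar(@cases), " cases agree with the tables\n";
```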

Setting Specific Patterns w/ Quantifiers

• Character quantifiers let you look for very specific patterns

• For example, use the dollar sign ("$") to match if a string ends with a specified pattern.

if ($Name =~ /Jones$/ ) {

• Matches "John Jones" but not "Jones is here". Also, "The guilty party is Jones" would match.

Selected Perl Character Quantifiers I

Character Quantifier | Meaning
^ | Matches when the following character starts the string. For example, the following matches "Smith is OK", "Smithsonian", and "Smith, Black": if ( $name =~ m/^Smith/ ) {
$ | Matches when the preceding character ends the string. For example, the following matches "the end", "Tend", and "Time to Bend": if ( $part =~ m/end$/ ) {

Selected Perl Character Quantifiers II

Quantifier | Meaning
? | Matches zero or one occurrences of the preceding character. For example, the following matches "A101" and "AB101", but not "ABB101": if ( $part =~ m/^AB?101/ ) {
{n} {min,} {min,max} | Matches the preceding character exactly n times, min or more times, or at least min but at most max times. For example, if ( $part =~ m/^AB{2}101/ ) matches "ABB101" but not "ABBB101"; if ( $part =~ m/^AB{2,}101/ ) will match "ABBB101" and "ABBBBB101"; if ( $part =~ m/^AB{2,4}101/ ) will match "ABBB101" but not "ABBBBB101".
+ | Matches one or more occurrences of the preceding character. For example, the following matches "AB101", "ABB101", and "ABBB101 is the right part": if ( $part =~ m/^AB+101/ ) {
* | Matches zero or more occurrences of the preceding character. For example, the following matches "AB101", "ABB101", "A101", and "A101 is broke": if ( $part =~ m/^AB*101/ ) {
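To see the quantifiers side by side, the sketch below runs each pattern from the table against the same set of part numbers and prints a 1 or 0 per combination. The layout and variable names are illustrative.

```perl
use strict;
use warnings;

# Label => compiled pattern, in the order used in the table.
my @patterns = (
    [ 'AB?101'     => qr/^AB?101/ ],
    [ 'AB{2}101'   => qr/^AB{2}101/ ],
    [ 'AB{2,}101'  => qr/^AB{2,}101/ ],
    [ 'AB{2,4}101' => qr/^AB{2,4}101/ ],
    [ 'AB+101'     => qr/^AB+101/ ],
    [ 'AB*101'     => qr/^AB*101/ ],
);

my @parts = qw(A101 AB101 ABB101 ABBB101 ABBBBB101);

my %result;
for my $entry (@patterns) {
    my ( $label, $regex ) = @$entry;
    for my $part (@parts) {
        $result{$label}{$part} = $part =~ $regex ? 1 : 0;
    }
    print join( ' ', sprintf( '%-11s', $label ),
        map { "$_=$result{$label}{$_}" } @parts ), "\n";
}
```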

Selected Perl Character Quantifiers III

Character Quantifier | Meaning
. | A wildcard symbol that matches any one character. For example, the following matches "Stop", "Soap", "Szxp", and "Soap is good"; it does not match "Sxp": if ( $name =~ m/^S..p/ ) {

Match the Special Characters Themselves

• Use a backslash before the special character
  - \^, \$, \., \?, \(, \), \+, \*, \\, \/ etc.
  - Examples
    • a\??bc matches abc and a?bc
    • a\++bc matches a+bc and a++bc
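The two escaped patterns can be tried against a few candidates. In this sketch the patterns are anchored with ^ and $ (an addition, so that near misses are rejected rather than matched as substrings), and the last lines show quotemeta(), a Perl built-in that adds the backslashes for you. The test strings are illustrative.

```perl
use strict;
use warnings;

my @question_cases = ( 'abc', 'a?bc', 'a??bc' );
my @plus_cases     = ( 'abc', 'a+bc', 'a++bc' );

# \? is a literal question mark; the second ? makes it optional.
my @question_hits = grep { /^a\??bc$/ } @question_cases;

# \+ is a literal plus sign; the second + means "one or more".
my @plus_hits = grep { /^a\++bc$/ } @plus_cases;

# quotemeta() escapes the special characters in a string for us.
my $text    = '1+1=2?';
my $literal = quotemeta($text);
my $exact   = $text =~ /^$literal\z/ ? 1 : 0;

print "a\\??bc matches: @question_hits\n";
print "a\\++bc matches: @plus_hits\n";
print "quotemeta pattern matches its own text: $exact\n";
```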

Building Regular Expressions That Work

• Regular expressions are very powerful, but they can also be virtually unreadable.
  - When building one, start with a simple regular expression and then refine it incrementally.
    • Build a piece and then test
  - The following example will build a regular expression for a date checker
    • mm/dd/yyyy format (for example, 05/05/2002 but not 5/12/01).

Building Regular Expressions That Work

1. Determine the precise field rules.
  - What is valid input and what is not valid input?
  - E.g., for a date field, think through the valid and invalid rules for the field. You might allow 09/09/2002 but not 9/9/2002 or Sep/9/2002.
  - Work through several examples as follows:

Work through several examples

Rule | Reject These
05/05/2002: / as a separator | 05-05-2002: Require slash delimiters
05/05/2002: Use a four-digit year | 05/05/02: Four-digit years only
05/05/2001: Contain only a date | The date is 05/05/2002: Only date fields; 05/05/2002 is my date: Only date fields
05/05/2001: Two digits for months and days | 5/05/2002: Two-digit months only; 05/5/2002: Two-digit days only; 5/5/2002: Two-digit days and months only

Building Regular Expressions that Work

2. Get form and form-handling programs working
  - Build a sending form with the input field
  - Build the receiving program that accepts the field.
  - For example, a first-cut receiving program:

$date = param('udate');
if ( $date =~ m/.+/ ) {
    print 'Valid date=', $date;
} else {
    print 'Invalid date=', $date;
}

  - The pattern .+ matches any sequence of characters.

Building Regular Expressions that Work

3. Start with the most specific term possible.
  - For example, slashes must always separate two characters (for the month), followed by two more characters (for the day), followed by four characters (for the year).

if ( $date =~ m{../../....} ) {

  - Any 2 characters, a slash, any 2 characters, a slash, then any 4 characters.

Building Regular Expressions that Work

4. Anchor and refine. (Use ^ and $ when possible)

if ( $date =~ m{^\d\d/\d\d/\d\d\d\d$} ) {

  - Starts with 2 digits, has 2 digits in the middle, and ends with 4 digits.

Building Regular Expressions that Work

5. Get more specific if possible.
  - The first digit of the day can be only 0, 1, 2, or 3. For example, 05/55/2002 is clearly an illegal date.
  - Only years from 2000 through 2999 are allowed, because we don’t care about dates like 05/05/1999 or 05/05/3003.

Building Regular Expressions that Work

• Add these rules below

if ( $date =~ m{^\d\d/[0-3]\d/2\d\d\d$} ) {

  - Day starts with a digit 0-3.
  - Year starts with a "2".

Now the regular expression recognizes input like 09/99/2001 and 05/05/4000 as illegal.
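The incremental refinement in steps 2 through 5 can be replayed by running every stage of the pattern against the same inputs. This is a sketch: the stage labels and the stages_passed helper are hypothetical names, while the four patterns and the sample dates are taken from the slides.

```perl
use strict;
use warnings;

# The four refinement stages from the slides, loosest first.
my @stages = (
    [ 'any characters'    => qr{.+} ],
    [ 'slash positions'   => qr{../../....} ],
    [ 'anchored digits'   => qr{^\d\d/\d\d/\d\d\d\d$} ],
    [ 'restricted ranges' => qr{^\d\d/[0-3]\d/2\d\d\d$} ],
);

# Count how many of the stages accept the given input.
sub stages_passed {
    my ($date) = @_;
    return scalar grep { $date =~ $_->[1] } @stages;
}

for my $input ( '05/05/2002', '5/12/01', 'The date is 05/05/2002',
                '09/99/2001', '05/05/4000' ) {
    printf "%-24s passes %d of %d stages\n",
        $input, stages_passed($input), scalar @stages;
}
```

Only an input that passes all four stages would be reported as valid by the final program. Note that the final pattern still accepts some impossible dates, such as 99/39/2002, because the first field is not range-checked.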

Tip: Regular Expression Special Variables

• Perl regexes set several special scalar variables:
  - $& will be equal to the first matching text
  - $` will be the text before the match, and
  - $' will be the text after the first match.

$name = '*****Marty';
if ( $name =~ m/\w/ ) {
    print "got match at=$& ";
    print "B4=$` after=$'";
} else {
    print "Not match";
}

• would output: got match at=M B4=***** after=arty
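Because these special variables are overwritten by the next successful match, it is common to copy them into ordinary variables right away. A minimal sketch using the same string as the slide; the variable names are illustrative.

```perl
use strict;
use warnings;

my $name = '*****Marty';
my ( $before, $match, $after ) = ( '', '', '' );

if ( $name =~ m/\w/ ) {
    # Copy the special variables immediately; a later match would replace them.
    ( $before, $match, $after ) = ( $`, $&, $' );
}

print "before=$before match=$match after=$after\n";
```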

Full Example Program

#!/usr/local/bin/perl
use CGI ':standard';
print header, start_html('Date Check');
$date = param('udate');
if ($date =~ m{^\d\d/[0-3]\d/2\d\d\d$}) {
    print 'Valid date=', $date;
} else {
    print 'Invalid date=', $date;
}
print end_html;

Would Output The Following ...

[Screenshot of the browser output]