7/31/2019 Lex-rev
Using Lex
The Structure of a Lex Program
(Definition section)
%%
(Rules section)
%%
(User subroutines section)
%{
/*
* this sample demonstrates (very) simple recognition:
* a verb/not a verb.
*/
%}
%%
[\t ]+ /* ignore white space */ ;
is |
am |
are |
were |
was |
be |
being |
been |
do |
does |
did |
will |
would |
should |
can |
could |
has |
have |
had |
go { printf("%s: is a verb\n", yytext); }
[a-zA-Z]+ { printf("%s: is not a verb\n", yytext); }
.|\n { ECHO; /* normal default anyway */ }
%%
int main(void)
{
    yylex();
    return 0;
}
Example 1-1: Word recognizer ch1-02.l
The definition section
Lex copies the material between %{ and
%} directly to the generated C file, so you
may write any valid C code here.
Rules section
Each rule is made up of two parts
A pattern
An action
E.g.
[\t ]+ /* ignore white space */ ;
Rules section (Contd)
E.g.
is |
am |
are |
were |
was |
be |
being |
been |
do |
does |
did |
will |
would |
should |
can |
could |
has |
have |
had |
go { printf("%s: is a verb\n", yytext); }
Rules section (Contd)
E.g.
[a-zA-Z]+ { printf("%s: is not a verb\n", yytext); }
.|\n { ECHO; /* normal default anyway */ }
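Lex has a set of simple disambiguating rules:
1. Lex patterns only match a given input character or string once.
2. Lex executes the action for the longest possible match for the current input.
When two patterns match the same number of characters, Lex uses the rule that appears first in the file. This is why the verb list comes before [a-zA-Z]+.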
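The selection logic behind these rules can be sketched in plain C. This is only an illustration, not the code Lex generates: the matcher_fn type, the shortened verb list, and the pick_rule function are hypothetical names introduced here.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical matcher type: returns the length of the match at the
 * start of s, or 0 if the pattern does not match there. */
typedef size_t (*matcher_fn)(const char *s);

/* Rule 0: a few of the verbs from the word recognizer (list shortened). */
static size_t match_verb(const char *s)
{
    static const char *verbs[] = { "is", "am", "are", "be", "being", "go" };
    size_t best = 0;
    for (size_t i = 0; i < sizeof verbs / sizeof verbs[0]; i++) {
        size_t n = strlen(verbs[i]);
        if (strncmp(s, verbs[i], n) == 0 && n > best)
            best = n;
    }
    return best;
}

/* Rule 1: [a-zA-Z]+ */
static size_t match_word(const char *s)
{
    size_t n = 0;
    while (isalpha((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns the index of the winning rule, or -1 if no rule matches.
 * *len receives the length of the winning match. */
int pick_rule(const char *s, size_t *len)
{
    static const matcher_fn rules[] = { match_verb, match_word };
    int winner = -1;
    *len = 0;
    for (int i = 0; i < 2; i++) {
        size_t n = rules[i](s);
        if (n > *len) {      /* strict > keeps the earlier rule on a tie */
            *len = n;
            winner = i;
        }
    }
    return winner;
}
```

For the input "island" the word rule wins because it matches six characters against two for the verb "is"; for the input "is" both rules match two characters and the earlier verb rule is chosen.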
User subroutines section
It can consist of any legal C code
Lex copies it to the C file after the end of
the Lex generated code
%%
int main(void)
{
    yylex();
    return 0;
}
Examples of Regular Expressions
[0-9]
[0-9]+
[0-9]*
-?[0-9]+
[0-9]*\.[0-9]+
([0-9]+)|([0-9]*\.[0-9]+)
-?(([0-9]+)|([0-9]*\.[0-9]+))
[eE][-+]?[0-9]+
-?(([0-9]+)|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)
Example 2-1
%%
[\n\t ] ;
-?(([0-9]+)|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?) { printf("number\n"); }
. ECHO;
%%
int main(void)
{
    yylex();
    return 0;
}
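The number pattern in the rule above can be tried outside Lex. The sketch below uses the POSIX regex functions (regcomp, regexec, regfree) with extended syntax, not Lex itself, and assumes a POSIX system. The function name is_number and the added ^ and $ anchors are choices made for this illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if the whole string s matches the number pattern, 0 if it
 * does not, and -1 if the pattern fails to compile. */
int is_number(const char *s)
{
    /* Same pattern as the Lex rule, anchored so the whole string must match. */
    static const char *pattern =
        "^-?(([0-9]+)|([0-9]*\\.[0-9]+)([eE][-+]?[0-9]+)?)$";
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, s, 0, NULL, 0);   /* 0 means the string matched */
    regfree(&re);
    return rc == 0;
}
```

With this grouping, is_number("6.02e23") returns 1 and is_number("abc") returns 0. The exponent is attached to the decimal alternative only, so a string such as "1e5" is not accepted as a whole.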
A Word Counting Program
The definition section
%{
unsigned charCount = 0, wordCount = 0, lineCount = 0;
%}
word [^ \t\n]+
eol \n
A Word Counting Program (Contd)
The rules section
{word} { wordCount++; charCount += yyleng; }
{eol} { charCount++; lineCount++; }
. charCount++;
A Word Counting Program (Contd)
The user subroutines section
int main(int argc, char **argv)
{
    if (argc > 1) {
        FILE *file;

        file = fopen(argv[1], "r");
        if (!file) {
            fprintf(stderr, "could not open %s\n", argv[1]);
            return 1;
        }
        yyin = file;    /* scan the named file instead of stdin */
    }
    yylex();
    /* the counters are unsigned, so print them with %u */
    printf("%u %u %u\n", charCount, wordCount, lineCount);
    return 0;
}
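To see what the three rules count, here is a hand-written C sketch that applies the same classification to a string: a maximal run of non-blank characters is a word, a newline is an end of line, and any other character (a space or tab) only adds to the character count. The struct counts type and the count_text function are hypothetical names for this illustration and are not part of the Lex program.

```c
#include <stddef.h>

struct counts {
    unsigned chars, words, lines;
};

/* Counts characters, words and lines in s the way the three rules do. */
struct counts count_text(const char *s)
{
    struct counts c = { 0, 0, 0 };
    size_t i = 0;

    while (s[i] != '\0') {
        if (s[i] == '\n') {                        /* {eol} */
            c.chars++;
            c.lines++;
            i++;
        } else if (s[i] == ' ' || s[i] == '\t') {  /* . */
            c.chars++;
            i++;
        } else {                                   /* {word}: longest run */
            size_t start = i;
            while (s[i] != '\0' && s[i] != ' ' &&
                   s[i] != '\t' && s[i] != '\n')
                i++;
            c.words++;
            c.chars += (unsigned)(i - start);      /* plays the role of yyleng */
        }
    }
    return c;
}
```

For the text "hello world\nfoo\n" this yields 16 characters, 3 words and 2 lines.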
Another Problem
letter [A-Za-z]
digit [0-9]
%%
begin {return (BEGIN);}
end {return (END);}
:= {return (ASGOP);}
{letter}({letter}|{digit})* { yylval = enter_id();
return (ID); }
{digit}+ { yylval = enter_num();
return (NUM); }
%%
int enter_id(void)
{ /* enter the id in the symbol table and return the entry number */ }
int enter_num(void)
{ /* enter the number in the constant table and return the entry number */ }
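The slide leaves the table routines as stubs. One possible shape for the identifier table is sketched below, assuming a small fixed-size array with linear search. The names enter_id_text, MAX_IDS and MAX_NAME are hypothetical; inside a scanner the lexeme would come from yytext, but here it is passed as a parameter so the function can be tried on its own.

```c
#include <string.h>

#define MAX_IDS  256   /* assumed table capacity */
#define MAX_NAME 32    /* assumed maximum name length, including the NUL */

static char id_table[MAX_IDS][MAX_NAME];
static int id_count = 0;

/* Returns the entry number for name, adding it if it is new.
 * Returns -1 if the table is full or the name is too long. */
int enter_id_text(const char *name)
{
    for (int i = 0; i < id_count; i++)
        if (strcmp(id_table[i], name) == 0)
            return i;                      /* already present */
    if (id_count == MAX_IDS || strlen(name) >= MAX_NAME)
        return -1;
    strcpy(id_table[id_count], name);      /* length was checked above */
    return id_count++;
}
```

Entering the same name twice returns the same entry number, so the parser can compare identifiers by number instead of by string. A real symbol table would more likely use hashing.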