
CS 152: Programming Language Paradigms

April 2 Class Meeting

Department of Computer Science
San Jose State University

Spring 2014
Instructor: Ron Mak

www.cs.sjsu.edu/~mak


SJSU Dept. of Computer Science, Spring 2014: April 2

CS 152: Programming Language Paradigms © R. Mak

Introduction to Compilers and Interpreters

In this class, you will use Java to write an interpreter for the Scheme language.

You will be able to execute simple Scheme programs.


A Compiler is a Translator

A compiler translates a program you’ve written

... in a high-level language (C, C++, Java, Pascal, etc.)

... into a low-level language (assembly language or machine language)

... that a computer can understand and eventually execute.


Conceptual Design (Version 1)

Parser
  Controls the translation process.
  Repeatedly asks the scanner for the next token.

Scanner
  Repeatedly reads characters from the source to construct tokens for the parser.

Token
  A source language element: identifier (name), number, special symbol (+ - * / = etc.), reserved word.
  Also reads from the source.

Source
  The source program.


Token

A low-level element of the source language. AKA lexeme

Java language tokens

identifiers: names of variables, types, procedures, functions, enumeration values, etc.

numbers: integer and real (floating-point)

reserved words: class interface if else for while etc.

special symbols: + - * / = < <= = >= > >>= <<= . , : ( ) [ ] { } ' "


Parser

Controls the translation process.
Repeatedly asks the scanner for the next token.

Knows the syntax (“grammar”) of the source language’s statements and expressions.
Analyzes the sequence of tokens to determine what kind of statement or expression it is translating.
Verifies that what it’s seeing is syntactically correct.
Flags any syntax errors that it finds and attempts to recover from them.

What the parser does is called parsing.
It parses the source program in order to translate it.
AKA syntax analyzer


Scanner

Whenever requested by the parser, the scanner reads characters sequentially from the source in order to construct the next token.

It knows the syntax of the source language’s tokens.

What the scanner does is called scanning.
It scans the source program in order to extract tokens.
AKA lexical analyzer


Conceptual Design (Version 2)

We can architect a compiler with three major parts:


Major Parts of a Compiler

Front end
  Parser, Scanner, Source, Token

Intermediate tier
  Intermediate code (icode)
    “Predigested” form of the source code that the back end can process efficiently.
    Example: parse trees
    AKA intermediate representation (IR)
  Symbol table (symtab)
    Stores information about the symbols (such as the identifiers) contained in the source program.

Back end
  Code generator
    Processes the icode and the symtab in order to generate the object code.

Only the front end needs to be source language-specific.

The intermediate tier and the back end can be language-independent!


What Else Can Compilers Do?

Allow you to program in a high-level language and think about your algorithms, not about machine architecture.

Provide language portability.
You can run your C++ and Java programs on different machines because their compilers enforce language standards.

Can optimize and improve how your programs execute.
Optimize the object code for speed.
Optimize the object code for size.
Optimize the object code for power consumption.


What about Interpreters?

An interpreter executes a source program instead of generating object code.

It executes a source program using the intermediate code and the symbol table.

It shares many of the components of a compiler.

Instead of a code generator in the back end, it has an executor.


Conceptual Design (Version 3)

A compiler and an interpreter can both use the same front end and intermediate tier.
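
As a rough illustration of this design, the sketch below feeds one piece of intermediate code to two different back ends. All names here (Icode, Backend, CodeGenerator, Executor) are hypothetical, and the PUSH/ADD/MUL instructions are made up for the example; this is not the course's actual framework.

```java
import java.util.ArrayList;
import java.util.List;

final class BackendDemo {
    public static void main(String[] args) {
        // Intermediate code for 2 + 3 * 4, as a front end might produce it.
        Icode icode = new Icode("+", new Icode(2),
                                new Icode("*", new Icode(3), new Icode(4)));
        System.out.println(new CodeGenerator().process(icode)); // made-up object code
        System.out.println(new Executor().process(icode));      // prints 14
    }
}

// Hypothetical intermediate code: a tiny expression tree.
final class Icode {
    final String op;          // "+" or "*", or null for a numeric leaf
    final int value;          // used only when op == null
    final Icode left, right;

    Icode(int value) {
        this.op = null; this.value = value; this.left = null; this.right = null;
    }

    Icode(String op, Icode left, Icode right) {
        this.op = op; this.value = 0; this.left = left; this.right = right;
    }
}

// Both back ends consume the same intermediate code.
interface Backend {
    String process(Icode icode);
}

// Compiler back end: emits instructions instead of running anything.
final class CodeGenerator implements Backend {
    @Override
    public String process(Icode icode) {
        List<String> lines = new ArrayList<>();
        emit(icode, lines);
        return String.join("\n", lines);
    }

    private void emit(Icode node, List<String> lines) {
        if (node.op == null) {
            lines.add("PUSH " + node.value);
            return;
        }
        emit(node.left, lines);
        emit(node.right, lines);
        lines.add(node.op.equals("+") ? "ADD" : "MUL");
    }
}

// Interpreter back end: executes the intermediate code directly.
final class Executor implements Backend {
    @Override
    public String process(Icode icode) {
        return Integer.toString(eval(icode));
    }

    private int eval(Icode node) {
        if (node.op == null) return node.value;
        int l = eval(node.left);
        int r = eval(node.right);
        return node.op.equals("+") ? l + r : l * r;
    }
}
```

Swapping CodeGenerator for Executor changes only the last stage; the tree handed to process() is the same in both cases.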


Comparing Compilers and Interpreters

A compiler generates object code, but an interpreter does not.

Executing the source program from object code can be several orders of magnitude faster than executing the program by interpreting the intermediate code and the symbol table.

But an interpreter requires less effort to get a source program to execute = faster turnaround time.

An interpreter maintains control of the source program’s execution.

Interpreters often come with interactive source-level debuggers that allow you to refer to source program elements, such as variable names.

AKA symbolic debugger


Compilers and Interpreters

Therefore ...

Interpreters are useful during program development.

Compilers are useful to run released programs in a production environment.


How to Scan for Pascal Tokens

Suppose the source line contains

IF (index >= 10) THEN

The scanner skips over the leading blanks.
The current character is I, so the next token must be a word.

The scanner extracts a word token by copying characters up to but not including the first character that is not valid for a word, which in this case is a blank.
The blank becomes the current character.
The scanner determines that the word is a reserved word.


How to Scan for Pascal Tokens, cont’d

The scanner skips over any blanks between tokens. The current character is (. The next token must be a special symbol.

After extracting the special symbol token, the current character is i. The next token must be a word.

After extracting the word token, the current character is a blank.


How to Scan for Pascal Tokens, cont’d

Skip the blank. The current character is >.

Extract the special symbol token. The current character is a blank.

Skip the blank. The current character is 1, so the next token must be a number.

After extracting the number token, the current character is ).


How to Scan for Pascal Tokens, cont’d

Extract the special symbol token. The current character is a blank.

Skip the blank. The current character is T, so the next token must be a word.

Extract the word token. Determine that it’s a reserved word.

The current character is \n, so the scanner is done with this line.


Basic Scanning Algorithm

Skip any blanks until the current character is nonblank.
Treat each comment and the end-of-line character as a blank.

The current (nonblank) character determines what the next token is and becomes that token’s first character.

Extract the rest of the next token by copying successive characters up to but not including the first character that does not belong to that token.

Extracting a token consumes all the source characters that constitute the token.
After extracting a token, the current character is the first character after the last character of that token.
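
The scanning steps can be sketched in Java roughly as follows. This is a minimal illustration, not the course's scanner: the class name SimpleScanner, the short reserved-word set, and the "TYPE text" output format are assumptions, comments are not handled, and numbers are limited to unsigned integers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class ScannerDemo {
    public static void main(String[] args) {
        for (String token : SimpleScanner.scan("IF (index >= 10) THEN")) {
            System.out.println(token);
        }
    }
}

final class SimpleScanner {
    // A small, incomplete set of Pascal reserved words, just for the demo.
    private static final Set<String> RESERVED =
        new HashSet<>(Arrays.asList("IF", "THEN", "ELSE", "BEGIN", "END"));

    // Returns one "TYPE text" string per token found in the line.
    static List<String> scan(String line) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;  // index of the current character
        while (true) {
            // Skip blanks until the current character is nonblank.
            while (pos < line.length() && Character.isWhitespace(line.charAt(pos))) pos++;
            if (pos >= line.length()) break;  // done with this line

            int start = pos;
            char ch = line.charAt(pos);  // determines what the next token is
            String type;
            if (Character.isLetter(ch)) {
                // Word: copy letters and digits up to the first other character.
                while (pos < line.length() && Character.isLetterOrDigit(line.charAt(pos))) pos++;
                String word = line.substring(start, pos).toUpperCase(Locale.ROOT);
                type = RESERVED.contains(word) ? "RESERVED_WORD" : "IDENTIFIER";
            } else if (Character.isDigit(ch)) {
                // Number: copy digits up to the first non-digit.
                while (pos < line.length() && Character.isDigit(line.charAt(pos))) pos++;
                type = "NUMBER";
            } else {
                // Special symbol: one character, or two for >=, <=, <>, :=.
                pos++;
                if (pos < line.length()) {
                    String two = line.substring(start, pos + 1);
                    if (two.equals(">=") || two.equals("<=")
                            || two.equals("<>") || two.equals(":=")) pos++;
                }
                type = "SPECIAL_SYMBOL";
            }
            // pos now sits on the first character after the token.
            tokens.add(type + " " + line.substring(start, pos));
        }
        return tokens;
    }
}
```

For the sample line, the sketch yields seven tokens, starting with RESERVED_WORD IF and ending with RESERVED_WORD THEN.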


Scheme Syntax Diagrams

[Syntax diagrams for digit (0, 1, ... 9), letter (A, B, ... Z, a, b, ... z), and word (composed of letter, digit, ?, and -).]


Scheme Syntax Diagrams, cont’d

[Syntax diagrams for unsigned integer (built from digit) and number (built from unsigned integer and .).]


Scheme Syntax Diagrams, cont’d

[Syntax diagrams for character (# \ any character), string (" any character except " "), and boolean (# t or # f).]


Scheme Syntax Diagrams, cont’d

[Syntax diagrams for element and list. Labels shown: word, number, boolean, symbol, element, list, and the parentheses ( ).]
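
One way to turn the token-level diagrams into code is a set of regular expressions. The sketch below assumes particular readings of the diagrams: a word is a letter followed by letters, digits, ?, or -; a number is an unsigned integer optionally followed by . and another unsigned integer; a boolean is #t or #f; a character is #\ plus one character; a string is any run of non-quote characters between double quotes. The original figures may differ in detail, and the class name SchemeTokens is hypothetical.

```java
import java.util.regex.Pattern;

final class SchemeTokenDemo {
    public static void main(String[] args) {
        String[] samples = {"deriv-term", "null?", "42", "3.14", "#t", "#\\a", "\"hi\"", "("};
        for (String s : samples) {
            System.out.println(s + " -> " + SchemeTokens.classify(s));
        }
    }
}

final class SchemeTokens {
    // Assumed readings of the syntax diagrams (see the note above).
    private static final Pattern WORD      = Pattern.compile("[A-Za-z][A-Za-z0-9?-]*");
    private static final Pattern NUMBER    = Pattern.compile("[0-9]+(\\.[0-9]+)?");
    private static final Pattern BOOLEAN   = Pattern.compile("#[tf]");
    private static final Pattern CHARACTER = Pattern.compile("#\\\\.");
    private static final Pattern STRING    = Pattern.compile("\"[^\"]*\"");

    // Classifies a complete token string; anything unrecognized is "other".
    static String classify(String text) {
        if (WORD.matcher(text).matches())      return "word";
        if (NUMBER.matcher(text).matches())    return "number";
        if (BOOLEAN.matcher(text).matches())   return "boolean";
        if (CHARACTER.matcher(text).matches()) return "character";
        if (STRING.matcher(text).matches())    return "string";
        return "other";
    }
}
```

Under this assumed word pattern, a name such as let* is not a word because of the *, so a real scanner would need a broader rule for such names.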


Scheme Keywords (Partial List)

and begin cond define else if lambda let letrec let* not or quote


Scheme Intermediate Code

Scheme programs have a simple structure.
Everything is a list.

The Scheme parser can translate a list into a binary tree.
The left subtree is the car of the list.
The right subtree is the cdr of the list.
Each leaf node contains an element of the list.

Example: (1 2 3)

[Binary tree diagram with leaf nodes 1, 2, and 3.]
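
A minimal sketch of such a parser in Java is shown below. The Node and ListParser classes are hypothetical, and the tokenizer is a shortcut that splits on whitespace and parentheses, so it does not handle strings, comments, or quote marks.

```java
import java.util.Arrays;
import java.util.List;

final class ListParserDemo {
    public static void main(String[] args) {
        Node list = new ListParser("(1 2 3)").parse();
        System.out.println(list.left.element);               // car of the list: 1
        System.out.println(list.right.left.element);         // car of the cdr: 2
        System.out.println(list.right.right.left.element);   // 3
        System.out.println(list.right.right.right == null);  // true: end of the list
    }
}

// Hypothetical tree node: a leaf holds an element; a cell holds car (left) and cdr (right).
final class Node {
    final String element;  // non-null only for leaf nodes
    final Node left;       // car
    final Node right;      // cdr, or null at the end of a list

    private Node(String element, Node left, Node right) {
        this.element = element; this.left = left; this.right = right;
    }

    static Node leaf(String element) { return new Node(element, null, null); }
    static Node cell(Node car, Node cdr) { return new Node(null, car, cdr); }
    boolean isLeaf() { return element != null; }
}

final class ListParser {
    private final List<String> tokens;
    private int pos = 0;

    ListParser(String source) {
        // Shortcut tokenizer: isolate parentheses, then split on whitespace.
        String spaced = source.replace("(", " ( ").replace(")", " ) ").trim();
        tokens = Arrays.asList(spaced.split("\\s+"));
    }

    // Parses one top-level list; the first token must be "(".
    Node parse() {
        if (!tokens.get(pos).equals("(")) throw new IllegalArgumentException("expected (");
        pos++;
        return parseRest();
    }

    // Parses the elements remaining before the matching ")".
    private Node parseRest() {
        if (pos >= tokens.size()) throw new IllegalArgumentException("missing )");
        String token = tokens.get(pos++);
        if (token.equals(")")) return null;               // end of this list
        Node car = token.equals("(") ? parseRest()         // nested list
                                     : Node.leaf(token);   // single element
        return Node.cell(car, parseRest());
    }
}
```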


Scheme Intermediate Code, cont’d

Example: ((a b) c (d))

Do a preorder walk of the tree to recreate the list:
Visit the root.
If the left subtree is not an element node, open a set of parentheses.
Visit the left subtree. If the left subtree is a leaf, print its element.
Visit the right subtree.

[Binary tree diagram with leaf nodes a, b, c, and d.]
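
The walk can be sketched in Java as below. This version recurses into each left subtree and loops along the right links, which visits the nodes in the same preorder; it also emits the closing parentheses, which the steps above leave implicit. Node and TreeWalker are hypothetical names, and the tree is built by hand rather than by a parser.

```java
final class TreeWalkDemo {
    public static void main(String[] args) {
        // Hand-built tree for ((a b) c (d)).
        Node ab = Node.cell(Node.leaf("a"), Node.cell(Node.leaf("b"), null));
        Node d = Node.cell(Node.leaf("d"), null);
        Node list = Node.cell(ab, Node.cell(Node.leaf("c"), Node.cell(d, null)));
        System.out.println(TreeWalker.listToString(list));  // prints ((a b) c (d))
    }
}

// Hypothetical tree node: a leaf holds an element; a cell holds car (left) and cdr (right).
final class Node {
    final String element;  // non-null only for leaf nodes
    final Node left;       // car
    final Node right;      // cdr, or null at the end of a list

    private Node(String element, Node left, Node right) {
        this.element = element; this.left = left; this.right = right;
    }

    static Node leaf(String element) { return new Node(element, null, null); }
    static Node cell(Node car, Node cdr) { return new Node(null, car, cdr); }
    boolean isLeaf() { return element != null; }
}

final class TreeWalker {
    // Recreates the text of the list rooted at the given cell (null means the empty list).
    static String listToString(Node list) {
        StringBuilder out = new StringBuilder("(");
        for (Node cell = list; cell != null; cell = cell.right) {
            if (cell != list) out.append(' ');       // separate elements with a blank
            Node car = cell.left;
            if (car != null && car.isLeaf()) {
                out.append(car.element);             // leaf: print its element
            } else {
                out.append(listToString(car));       // sublist: new set of parentheses
            }
        }
        return out.append(')').toString();
    }
}
```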


The Symbol Table: Basic Concepts

Purpose
  To store information about certain tokens during the translation process (i.e., parsing and scanning).

What information to store?
  Anything that’s useful!
  For a symbol: name; how it’s defined (as a variable, procedure name, etc.)

Basic operations
  Enter new information
  Look up existing information
  Update existing information


The Symbol Table: Conceptual Design

Each symbol table entry has
the name of a symbol
the symbol’s attributes

At the conceptual level, we don’t worry about implementation.


The Symbol Table: Implementation

Each symbol table entry includes the name of a symbol and its attributes.

To maintain maximum flexibility, implement the attributes as a hash table.
Key: the attribute name
Value: the attribute value


The Symbol Table: Implementation, cont’d

The symbol table itself can be a hash table.
Key: the symbol name
Value: the symbol table entry for the symbol

Therefore, we have a hash table of hash tables.
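
A hash table of hash tables might look like this in Java. The class name SymTab, the method names enter and lookup, and the "definition" attribute key are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class SymTabDemo {
    public static void main(String[] args) {
        SymTab symtab = new SymTab();
        symtab.enter("deriv").put("definition", "procedure");  // enter, then set an attribute
        symtab.enter("poly");                                   // enter with no attributes yet
        symtab.lookup("poly").put("definition", "variable");    // update an existing entry
        System.out.println(symtab.lookup("deriv").get("definition"));  // prints procedure
        System.out.println(symtab.lookup("missing"));                  // prints null
    }
}

// Hypothetical symbol table: symbol name -> (attribute name -> attribute value).
final class SymTab {
    private final Map<String, Map<String, Object>> entries = new HashMap<>();

    // Enters a symbol if it is absent; returns its attribute table either way.
    Map<String, Object> enter(String name) {
        return entries.computeIfAbsent(name, k -> new HashMap<>());
    }

    // Looks up a symbol; returns null if it was never entered.
    Map<String, Object> lookup(String name) {
        return entries.get(name);
    }
}
```

Updating needs no separate operation here, because lookup returns the entry's own attribute table rather than a copy.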


Assignment #5

Use Java to write a parser and a scanner for Scheme.

Your scanner should recognize Scheme tokens.

Print each source input line after the scanner reads the line.

For each token, your scanner should print the token string and the token type (identifier, number, keyword, special symbol, etc.). One token per output line.

The scanner should ignore comments.


Assignment #5, cont’d

The parser should enter each identifier into a symbol table.

Don’t enter keywords.
For now, the attributes of each symbol table entry can be null.

For this assignment, you’ll have only one symbol table.


Assignment #5, cont’d

Your parser should translate Scheme lists.
Build the intermediate code (binary trees).

After an entire top-level list has been parsed, your backend takes over to perform the following:

Walk the binary tree and print the list with the proper parentheses.
The list output does not need to be “pretty” but it must be correct.
For example, you can start each sublist on a new line after an indent.

Print the contents of the symbol table in alphabetical order.
Hint: Use a Java TreeMap instead of a Hashtable or HashMap for the symbol table.
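
The reason for the hint is that a TreeMap iterates over its keys in sorted order, so no separate sort is needed before printing. A small sketch, with the hypothetical helper sortedNames and null attributes as the assignment allows:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SortedSymTabDemo {
    public static void main(String[] args) {
        List<String> names = SortedSymTab.sortedNames("terms", "deriv", "poly", "var");
        System.out.println(String.join(" ", names));  // prints deriv poly terms var
    }
}

final class SortedSymTab {
    // Enters each name with null attributes and returns the names in key order.
    static List<String> sortedNames(String... names) {
        Map<String, Object> symtab = new TreeMap<>();
        for (String name : names) {
            symtab.put(name, null);  // attributes can be null for now
        }
        return new ArrayList<>(symtab.keySet());
    }
}
```

With String keys the natural ordering compares character codes, so uppercase letters sort before lowercase ones unless a comparator such as String.CASE_INSENSITIVE_ORDER is passed to the TreeMap constructor.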


Assignment #5, cont’d

Clear the intermediate code and the symbol table before parsing the next top-level list.

To ensure that your code is properly partitioned, put your classes into three packages: frontend, intermediate, and backend.

Your test input file is on the last slide.

Email [email protected] a zip file containing:
The source directory containing all your Java source files.
A text file of the output from your program.

Subject: CS 152 Assignment #5, team-name


Assignment #5, cont’d

Due Wednesday, April 16

Do not wait until the last minute to do this assignment!


Assignment #5, cont’d

; Find the derivative of polynomial poly with respect to variable var.
; The polynomial must be in canonical infix form.
(define deriv
  (lambda (poly var)
    (let* ((terms (terminize poly)) ; "terminize" the polynomial
           (deriv-term              ; local procedure deriv-term
            (lambda (term)
              (cond
                ((null? term) '())
                ((not (member? var term)) '(0))           ; deriv = 0
                ((not (member? '^ term)) (upto var term)) ; deriv = coeff
                (else (deriv-term-expo term var))         ; handle exponent
              )))
           (diff (map deriv-term terms))) ; map deriv-term over the terms
      (remove-trailing-plus (polyize diff)) ; finalize the answer
    )))

; Convert an infix polynomial into a list of sublists,
; where each sublist is a term.
(define terminize
  (lambda (poly)
    (cond
      ((null? poly) '())
      (else (cons (upto '+ poly)
                  (terminize (after '+ poly)))))))

input.lisp

