ISBN 0-321-49362-1 Chapter 1 Preliminaries Original slides extended by Geylani Kardas.


Copyright © 2009 Addison-Wesley. All rights reserved.

Chapter 1 Topics

• Reasons for Studying Concepts of Programming Languages (PLs)

• Programming Domains

• Language Evaluation Criteria

• Influences on Language Design

• Language Categories

• Language Design Trade-Offs

• Implementation Methods

• Programming Environments


Reasons for Studying Concepts of Programming Languages

• Increased ability to express ideas

– The depth at which people can think is influenced by the expressive power of the language in which they communicate their thoughts.

– Programmers can increase the range of their software development thought processes by learning new language constructs.

– It might be argued that learning the capabilities of other languages does not help a programmer who is forced to use a language that lacks those capabilities

• This argument does not hold, because language constructs can often be simulated in other languages that do not support those constructs directly



• Improved background for choosing appropriate languages

– Programmers with little or no formal education prefer to use the language with which they are most familiar, even if it is poorly suited to the requirements.

– If these programmers were familiar with a wider range of languages and language constructs, they would be better able to choose the language with the features that best address the problem.

– Some of the features of one language often can be simulated in another language.

• However, it is preferable to use a feature whose design has been integrated into a language than to use a simulation of that feature, which is often less elegant, more cumbersome, and less safe



• Increased ability to learn new languages

– The process of learning a new programming language can be lengthy and difficult for someone who is only comfortable with one or two languages and has never examined programming concepts in general.

• For example, programmers who understand the concepts of object-oriented programming will have a much easier time learning Java than those who have never used those concepts

– The same phenomenon occurs in natural languages

• The better you know the grammar of your native language, the easier you will find it to learn a second natural language

• Furthermore, learning a second language also has the beneficial side effect of teaching you more about your first language

– It is essential that practicing programmers know the vocabulary and fundamental concepts of programming languages so they can read and understand programming language descriptions and evaluations



• Better understanding of significance of implementation

– In some cases, an understanding of implementation issues leads to an understanding of why languages are designed the way they are.

– Provides the ability to use a language more intelligently, as it was designed to be used.

– Certain kinds of bugs can be found and fixed only by programmers who know some related implementation details

– Allows us to visualize how a computer executes various language constructs

– Provides hints about the relative efficiency of alternative constructs

• e.g. Awareness of the inefficiency of frequent subprogram calls.



• Better use of languages that are already known

– Many contemporary programming languages are large and complex

• It is uncommon for a programmer to be familiar with, and use, all of the features of a language he/she uses

– By studying the concepts of programming languages programmers can:

• learn about previously unknown and unused parts of the languages they already use and

• begin to use those features


Reasons for Studying Concepts of Programming Languages

• Overall advancement of computing

– The most popular languages are not always the best available

– In some cases, it might be concluded that a language became widely used, at least in part, because those in positions to choose languages were not sufficiently familiar with programming language concepts

– e.g. Many people believe that it would have been better if ALGOL 60 (with its benefits of block structure, recursion and well-structured control statements) had displaced Fortran in the early 1960s.

– If those who choose languages were better informed, perhaps better languages would eventually squeeze out poorer ones.


Programming Domains

• Computers have been applied to a myriad of different areas, from controlling nuclear power plants to providing video games in mobile phones

• Because of this great diversity in computer use, programming languages with very different goals have been developed

• A few of the areas of computer applications and their associated languages:

– Scientific Applications

– Business Applications

– Artificial Intelligence

– Systems Programming

– Web Software


Programming Domains

• Scientific applications

– The first digital computers were used and indeed invented for scientific applications.

– Scientific applications

• have relatively simple data structures but

• require large numbers of floating point computations

– Most common data structures: arrays and matrices

– Most common control structures: counting loops and selections

– Fortran: First language for scientific applications

– ALGOL 60 and most of its descendants were also intended to be used in this area

• However, for scientific applications in which efficiency is the primary concern, no subsequent language is significantly better than Fortran, which explains why Fortran is still used.


Programming Domains

• Business applications

– The use of computers for business applications began in the 1950s.

• Special computers were developed for this purpose, along with special languages

– Business languages are characterized by

• facilities for producing elaborate reports,

• precise ways of describing and storing decimal numbers and character data, and

• the ability to specify decimal arithmetic operations

– COBOL: First successful high-level language for business

• It is still the most commonly used language for business applications


Programming Domains

• Artificial intelligence (AI)

– Characterized by symbolic computation

• Symbols, rather than numbers, are manipulated

• Symbolic computation is more conveniently done with linked lists of data rather than arrays

– AI programming sometimes requires more flexibility than other programming domains

• For example in some AI applications the ability to create and execute code segments during execution is convenient

– LISP: First widely used programming language developed for AI

• Some important successors are Prolog and Scheme

• More recently some AI applications have been written in systems languages such as C


Programming Domains

• Systems programming

– The operating system and all of the programming support tools of a computer system are collectively known as its systems software

– A language for systems programming must provide fast execution, because systems software is used almost continuously

– Furthermore, it must have low-level features that allow the software interfaces to external devices to be written

– C: low-level and execution efficient; does not burden the user with safety restrictions

• The UNIX OS is written almost entirely in C

• Systems programmers favor C, while some nonsystems programmers find C to be too dangerous to use on large, important software systems


Programming Domains

• Web Software

– Eclectic collection of languages: markup (e.g. XHTML), scripting (e.g. JavaScript, PHP), general-purpose (e.g. Java)

– Because of the pervasive need for dynamic Web content, some computation capability is often included in the technology of content presentation.

• This functionality can be provided by embedding programming code in an XHTML document.

• Such code is often in the form of a scripting language, such as JavaScript or PHP.

• There are also some markup-like languages that have been extended to include constructs that control document processing


Language Evaluation Criteria

• Readability: the ease with which programs can be read and understood

• Writability: the ease with which a language can be used to create programs

• Reliability: conformance to specifications (i.e. performs to its specifications)

• Cost: the ultimate total cost

The fourth criterion (“cost”) is not included because it is only slightly related to the other three criteria and the characteristics that influence them.


Evaluation Criteria: Readability

• Before 1970, software development was largely thought of in terms of writing code.

– The primary positive characteristics of PLs were efficiency and machine readability.

– Language constructs were designed more from the point of view of the computer than of computer users.

• The software lifecycle concept was developed later

– Coding was relegated to a much smaller role

– Maintenance was recognized as a major part of the cycle

– Ease of maintenance is determined in large part by the readability of programs

– Readability became an important measure of the quality of programs and programming languages

– There was a distinct crossover from a focus on machine orientation to a focus on human orientation

• Characteristics that contribute to the readability of a PL:


Evaluation Criteria: Readability

• Overall simplicity

– A language that has a large number of basic constructs is more difficult to learn

– A manageable set of features and constructs

– Minimal feature multiplicity

• Feature multiplicity: Having more than one way to accomplish a particular operation

– e.g. four ways of incrementing a simple integer variable in Java:

count = count + 1

count += 1

count++

++count

– Minimal operator overloading

• Operator overloading: A single operator has more than one meaning

• Although this is often useful, it can lead to reduced readability if users are allowed to create their own overloading and do not do it sensibly.


Evaluation Criteria: Readability

• Overall simplicity (continued)

– Simplicity in languages can, of course, be carried too far.

• e.g. The form and meaning of most assembly language statements are models of simplicity

– This very simplicity, however, makes assembly language programs less readable

– Because they lack more complex control statements, program structure is less obvious

– Because the statements are simple, far more of them are required than in an equivalent program in a high-level language


Evaluation Criteria: Readability

• Orthogonality

– A relatively small set of primitive constructs can be combined in a relatively small number of ways to build the control and the data structures of the language

– Every possible combination of primitives is legal and meaningful

• Suppose a language has four primitive data types (integer, float, double and character) and two type operators (array and pointer)

– If the two type operators can be applied to themselves and the four primitive data types, a large number of data structures can be defined

– The more orthogonal the design of a language, the fewer exceptions the language rules require

• For example,

– it should be possible in a programming language that supports pointers to define a pointer to point to any specific type defined in the language

– However, if pointers are not allowed to point to arrays, many potentially useful user-defined data structures could not be defined
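The combinatorial claim on this slide (four primitive types, two type operators) can be made concrete with a rough sketch. The names PRIMITIVES, TYPE_OPERATORS and constructed_types are hypothetical, chosen only for this illustration, and the nesting depth is limited to two so the list stays finite.

```python
from itertools import product

# The four primitive types and two type operators from the slide.
PRIMITIVES = ["integer", "float", "double", "character"]
TYPE_OPERATORS = ["array of", "pointer to"]


def constructed_types(max_depth):
    """List every type built by applying 1..max_depth type operators to a primitive."""
    names = []
    for depth in range(1, max_depth + 1):
        # Every sequence of operators is allowed: no special cases, no exceptions.
        for operators in product(TYPE_OPERATORS, repeat=depth):
            for base in PRIMITIVES:
                names.append(" ".join(operators) + " " + base)
    return names


types = constructed_types(2)
print(len(types))   # 8 types with one operator + 16 with two = 24
print(types[:3])
```

If a rule such as "pointers may not point to arrays" were added, every name containing "pointer to array of" would have to be filtered out, which is exactly the kind of exception an orthogonal design avoids.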


Evaluation Criteria: Readability

• Orthogonality (continued)

– Illustrating orthogonality: Adding two 32-bit integer values that reside in either memory or registers and replacing one of the two values with the sum

• in the assembly language of IBM mainframes:

A Reg1, memory_cell

AR Reg1, Reg2

• in the assembly language of the VAX series of minicomputers:

ADDL operand_1, operand_2

(either operand can be a register or a memory cell)

• VAX instruction design is orthogonal since a single instruction can use either registers or memory cells as operands

• The IBM design is not orthogonal

– Only two operand combinations are legal out of four possibilities and the two require different instructions, A and AR.

• The IBM design is more restricted

– It is also more difficult to learn because of the restrictions and the additional instruction


Evaluation Criteria: Readability

• Orthogonality (continued)

– An example of the lack of orthogonality in a high-level language:

• Although C has two kinds of structured data types, arrays and records (structs), only records can be returned from functions

– Too much orthogonality can also cause problems

• e.g. In ALGOL 68 (perhaps the most orthogonal programming language), every language construct has a type and there are no restrictions on those types; this extreme form of orthogonality leads to unnecessary complexity

– Some believe that functional languages offer a good combination of simplicity and orthogonality

• A functional language, such as LISP, is one in which computations are made primarily by applying functions to given parameters

• In contrast, in imperative languages such as C, C++, and Java, computations are usually specified with variables and assignment statements

• Functional languages offer potentially the greatest overall simplicity, because they can accomplish everything with a single construct, the function call

• Concerns about efficiency, however, have prevented functional languages from becoming more widely used.


Evaluation Criteria: Readability

• Data types

– The presence of adequate facilities for defining data types and data structures

– For example

• suppose a numeric type is used for an indicator flag because there is no Boolean type in the language

timeOut = 1

• The meaning of this statement is unclear, whereas in a language that includes Boolean types, we would have the following:

timeOut = true

• The meaning of this statement is perfectly clear


Evaluation Criteria: Readability

• Syntax considerations

– Identifier forms: flexible composition; identifiers should not be restricted to very short lengths

• e.g. In Fortran 77, identifiers can have at most six characters

• A more extreme example is the original ANSI BASIC: an identifier could consist only of a single letter or a single letter followed by a single digit.

– Special words and methods of forming compound statements

• Program readability is strongly influenced by the forms of a language’s special words (for example while, class, and for)

• e.g. Ada is arguably more readable than C and its descendants in this respect, since it uses a distinct closing syntax for each type of statement group (e.g. end if, end loop) instead of just braces

• Another important issue: Can the special words of a language be used as names for program variables?

– If so, the resulting programs can be very confusing

– For example, in Fortran 95, special words, such as Do and End, are legal variable names


Evaluation Criteria: Readability

• Syntax considerations (continued)

– Form and meaning: self-descriptive constructs, meaningful keywords

• Designing statements so that their appearance at least partially indicates their purpose

• Semantics or meaning should follow directly from syntax or form

• In some cases, this principle is violated by two language constructs that are identical or similar in appearance but have different meanings, depending perhaps on context

– e.g. use of the reserved word static in C:

» if it is used on the definition of a variable inside a function, it means the variable is allocated statically, before execution begins, rather than each time the function is called

» if it is used on the definition of a variable that is outside all functions, it means the variable is visible only in the file in which its definition appears; that is, it is not exported from that file


Evaluation Criteria: Writability

• Most of the language characteristics that affect readability also affect writability

– Because the process of writing a program requires the programmer frequently to reread the part of the program that is already written

• It is simply not reasonable to compare the writability of two languages in the realm of a particular application when one was designed for that application and the other was not

– e.g. The writability of Visual Basic (VB) and that of C differ dramatically

• VB is ideal for writing GUI enabled programs

• C was designed for writing system programs

• Characteristics influencing the writability of a language:



• Simplicity and orthogonality

– Few constructs, a small number of primitives, a small set of rules for combining them

• A programmer can design a solution to a complex problem after learning only a simple set of primitive constructs

– Too much orthogonality can be a detriment to writability

• Errors in programs can go undetected when nearly any combination of primitives is legal



• Support for abstraction

– The ability to define and use complex structures or operations in ways that allow details to be ignored

– PLs can support two distinct categories of abstraction

• Process abstraction

– e.g. The use of a subprogram to implement a sort algorithm that is required several times in a program

» If the subprogram were not used, the code that used the sort subprogram would be cluttered with the sort algorithm details, greatly obscuring the flow and overall intent of that code

• Data abstraction

– e.g. Consider a binary tree that stores integer data in its nodes.

» Fortran 77 does not support pointers and dynamic storage management. Three parallel integer arrays, where two of the integers are used as subscripts to specify offspring nodes, can be used.

» In Java and C++, trees can be implemented by using abstraction of a tree node in the form of a simple class
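The two binary-tree representations on this slide can be contrasted in a small sketch. Python is used here only for illustration; the array names value, left and right, the class TreeNode, and the choice of a binary search tree with subscript 0 meaning "no child" are all assumptions made for the example, not details from the slides.

```python
# Parallel-array representation (in the spirit of Fortran 77).
# Subscript 0 is reserved to mean "no child"; node 1 is the root.
value = [0, 50, 30, 70]   # value[i] is the integer stored in node i
left = [0, 2, 0, 0]       # left[i] is the subscript of node i's left child
right = [0, 3, 0, 0]      # right[i] is the subscript of node i's right child


def contains_arrays(target):
    """Search the array-based tree, following subscripts instead of references."""
    i = 1
    while i != 0:
        if target == value[i]:
            return True
        i = left[i] if target < value[i] else right[i]
    return False


# Class-based representation (in the spirit of Java or C++).
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def contains_nodes(node, target):
    """Search the node-based tree; the node abstraction hides the storage details."""
    while node is not None:
        if target == node.value:
            return True
        node = node.left if target < node.value else node.right
    return False


root = TreeNode(50, TreeNode(30), TreeNode(70))
print(contains_arrays(70), contains_nodes(root, 70))
print(contains_arrays(40), contains_nodes(root, 40))
```

Both versions do the same work, but in the array version the reader has to keep three separate arrays and the meaning of subscript 0 in mind, while the class version names the concept directly.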



• Expressivity

– A set of relatively convenient ways of specifying operations

– Strength and number of operators and predefined functions

• e.g. In C, count++ is more convenient and shorter than count = count + 1

• e.g. The and then Boolean operator in Ada conveniently specifies short-circuit evaluation of a Boolean expression

• e.g. The inclusion of the for statement in Java makes writing counting loops easier than with the use of while
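Short-circuit evaluation, mentioned above for Ada's and then, can be sketched in Python, whose and operator also short-circuits. The helper eager_and is hypothetical; it exists only to show what happens when both operands are evaluated before they are combined.

```python
def first_is_positive(values):
    # "and" short-circuits: values[0] is evaluated only if the list is non-empty.
    return len(values) > 0 and values[0] > 0


def eager_and(left_operand, right_operand):
    # Both arguments are already evaluated by the time this function runs.
    return left_operand and right_operand


def first_is_positive_eager(values):
    # Without short-circuiting, values[0] is evaluated even for an empty list.
    return eager_and(len(values) > 0, values[0] > 0)


print(first_is_positive([]))      # False
print(first_is_positive([3]))     # True
try:
    first_is_positive_eager([])
except IndexError:
    print("eager evaluation indexed an empty list")
```

A language without a short-circuit operator forces the programmer to write the guard as a separate nested selection, which is the loss of expressivity the slide refers to.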


Evaluation Criteria: Reliability

• A program is said to be reliable if it performs to its specifications under all conditions.

• Following are the language features that have a significant effect on the reliability of a program in a given language:

• Type checking

– Testing for type errors

– Compile-time type checking: by the compiler

– Run-time type checking: during program execution

– Run-time type checking is expensive, so compile-time type checking is more desirable

– e.g. The design of Java requires checks of the types of nearly all variables and expressions at compile time

• Virtually eliminates type errors at run-time
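The difference between the two kinds of type checking can be seen in a language that checks types only at run time. The sketch below uses Python for that reason; the function report and its parameters are hypothetical. The type error sits in a branch that is rarely taken, so it goes unnoticed until that branch finally executes.

```python
def report(count, verbose):
    if verbose:
        # Type error: a string cannot be concatenated with an integer.
        # Python detects this only when this line actually runs.
        return "count: " + count
    return "ok"


print(report(3, False))   # the faulty line is never reached, so no error appears

try:
    report(3, True)
except TypeError as error:
    print("detected at run time:", error)
```

A compiler for a language with compile-time type checking, such as Java, would reject the equivalent of the faulty line before the program could run at all, regardless of which branch the tests happen to exercise.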


Evaluation Criteria: Reliability

• Exception handling

– Intercept run-time errors and take corrective measures

– Ada, C++ and Java include extensive capabilities for exception handling

– Exception handling is practically nonexistent in C and Fortran

• Aliasing

– Presence of two or more distinct referencing methods for the same memory location

• e.g. Two pointers set to point to the same variable

– It is now widely accepted that aliasing is a dangerous feature in a PL

• Readability and writability

– A language that does not support “natural” ways of expressing an algorithm will require the use of “unnatural” approaches, and hence reduce reliability

• The easier a program is to write, the more likely it is to be correct

– Programs that are difficult to read are difficult both to write and to modify later
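The aliasing hazard listed on this slide can be shown with a minimal sketch. Python has no pointers, so two names bound to the same list stand in for two pointers to the same variable; the names readings, backup, snapshot and scale_in_place are hypothetical.

```python
def scale_in_place(values, factor):
    """Multiply every element of the list by factor, modifying the list itself."""
    for i in range(len(values)):
        values[i] *= factor


readings = [1, 2, 3]
backup = readings            # alias: a second name for the same list object
snapshot = list(readings)    # independent copy: a different object

scale_in_place(readings, 10)

print(backup)     # [10, 20, 30]  changed through the other name
print(snapshot)   # [1, 2, 3]     unaffected
```

The reader of the code that uses backup sees no assignment to it, yet its contents change. That hidden dependency is why aliasing is considered a threat to reliability.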


Evaluation Criteria: Cost

• The total cost of a programming language is a function of many of its characteristics

• Factor 1: Training programmers to use the language

• Factor 2: Writing programs in the language (closeness to particular applications)

– The original efforts to design and implement high-level languages were driven by the desire to lower the costs of creating software

• Factor 3: Compiling programs

– A major impediment to the early use of Ada was the prohibitively high cost of running the first-generation Ada compilers


Evaluation Criteria: Cost

• Factor 4: Executing programs

– A language that requires many run-time type checks will prohibit fast code execution

• Factor 5: Language implementation system: availability of free compilers

– One of the factors that explains the rapid acceptance of Java was that free compiler/interpreter systems have been available for it since soon after its design was first released.

– The high cost of first-generation Ada compilers helped prevent Ada from becoming popular in its early days

• Factor 6: Reliability: poor reliability leads to high costs

– If the software fails in a critical system such as a nuclear power plant or an X-ray machine for medical use, the cost would be very high.


Evaluation Criteria: Cost

• Factor 7: Maintaining programs

– includes both corrections and modifications to add new functionality

– Maintenance is often done by individuals other than the original author of the software

• Poor readability can make the task extremely challenging

– It has been estimated that for large software systems with relatively long lifetimes, maintenance costs can be as high as two to four times as much as development costs

• As a result, program development, reliability and maintenance are the three most important contributors to the cost

– Because these are functions of writability and readability, these two evaluation criteria are, in turn, the most important


Evaluation Criteria: Others

• Portability

– The ease with which programs can be moved from one implementation to another

– Most strongly influenced by the degree of standardization of the language

– Some languages, such as BASIC, are only weakly standardized in practice, with implementations that differ widely, making programs in these languages very difficult to move from one implementation to another

– Standardization is a time-consuming and difficult process.

• A committee began work on producing a standard version of C++ in 1989. It was approved in 1998

• Generality

– The applicability to a wide range of applications

• Well-definedness

– The completeness and precision of the language’s official definition documents


Evaluation Criteria

• Most criteria, particularly readability, writability and reliability, are neither precisely defined nor exactly measurable

– Nevertheless, they are useful concepts and they provide valuable insight into the design and evaluation of PLs.

• Language design criteria are weighted differently from different perspectives:

– Language implementers are concerned primarily with the difficulty of implementing the constructs and features of the language

– Language users are worried about writability first and readability later

– Language designers are likely to emphasize elegance and the ability to attract widespread use

– These characteristics often conflict with one another


Influences on Language Design

• Computer Architecture

– Languages are developed around the prevalent computer architecture, known as the von Neumann architecture

• Programming Methodologies

– New software development methodologies (e.g., object-oriented software development) led to new programming paradigms and by extension, new programming languages


Computer Architecture Influence

• Well-known computer architecture: von Neumann

• Most of the popular languages of the past 50 years have been designed around the von Neumann architecture

– These dominant languages are called imperative languages

• In a von Neumann architecture:

– Data and programs stored in memory

– Memory is separate from CPU

– Instructions and data are piped from memory to CPU

• Nearly all digital computers built since the 1940s have been based on the von Neumann architecture


The von Neumann Architecture


• Central features of imperative languages:

– Variables which model memory cells

– Assignment statements which are based on the piping operation

– Iterative form of repetition which is the most efficient way to implement repetition on von Neumann architecture

• Operands in the expressions are piped from memory to the CPU and the result of evaluating the expression is piped back to the memory cell represented by the left side of the assignment



• Iteration is fast in von Neumann computers

– Because instructions are stored in adjacent memory cells and repeating the execution of a section of code requires only a simple branch instruction

– This efficiency discourages the use of recursion for repetition, although recursion is sometimes more natural

• Fetch-execute cycle (on a von Neumann architecture computer):

– It is the process in which the execution of a machine code program occurs

• Programs reside in memory but are executed in the CPU. Hence, each instruction to be executed must be moved from memory to the processor

• The address of the next instruction to be executed is maintained in a register called the program counter




The fetch-execute cycle can be simply described by the following algorithm:

initialize the program counter
repeat forever
    fetch the instruction pointed to by the counter
    increment the counter
    decode the instruction
    execute the instruction
end repeat



• “decode the instruction” means the instruction is examined to determine what action it specifies

• Program execution terminates when a stop instruction is encountered

– However, on an actual computer a stop instruction is rarely executed.

– Rather, control transfers from the operating system to a user program for its execution and then back to the operating system when the user program execution is complete

– In a computer system in which more than one user program may be in memory at a given time, this process is far more complex
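The fetch-execute cycle described above can be simulated in a few lines. This is only a sketch for a made-up machine: the instruction names LOAD, ADD, JUMP_IF_NEGATIVE and STOP, the single accumulator, and the function run are assumptions for illustration and do not correspond to any real instruction set.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs and return the accumulator."""
    accumulator = 0
    program_counter = 0                           # initialize the program counter
    while True:                                   # repeat forever
        instruction = program[program_counter]    # fetch the instruction
        program_counter += 1                      # increment the counter
        opcode, operand = instruction             # decode the instruction
        if opcode == "LOAD":                      # execute the instruction
            accumulator = operand
        elif opcode == "ADD":
            accumulator += operand
        elif opcode == "JUMP_IF_NEGATIVE":
            if accumulator < 0:
                program_counter = operand         # a branch just overwrites the counter
        elif opcode == "STOP":
            return accumulator
        else:
            raise ValueError("unknown opcode: " + opcode)


# Count up from -3 to 0: the loop body is repeated with a single branch instruction.
counting_loop = [
    ("LOAD", -3),
    ("ADD", 1),
    ("JUMP_IF_NEGATIVE", 1),
    ("STOP", None),
]
print(run(counting_loop))   # 0
```

The counting loop also shows why iteration is cheap on this kind of machine: repeating a section of code requires nothing more than resetting the program counter.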



• A functional language is one in which the primary means of computation is applying functions to given parameters

– Programming can be done in a functional language without using variables, assignment operators and iterations

• Although many computer scientists have expounded on the myriad benefits of functional languages, it is unlikely that they will displace the imperative languages until a non-von Neumann computer is designed that allows efficient execution of programs in functional languages

• In spite of the fact that the structure of the imperative programming languages is modeled on a machine architecture, rather than on the abilities and inclinations of the users of programming languages, some believe that using imperative languages is somehow more natural than using a functional language

– So, many believe that even if functional programs were as efficient as imperative programs, use of imperative programming languages would still dominate
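The contrast between the two styles can be sketched with one small computation written both ways. Python supports both styles, which makes the comparison easy to read; the function names are hypothetical, and the functional version is only an approximation of how the computation would look in a language such as LISP or Scheme.

```python
from functools import reduce


def sum_of_squares_imperative(numbers):
    # Imperative style: a variable models a memory cell, and the loop
    # repeatedly assigns a new value to it.
    total = 0
    for n in numbers:
        total = total + n * n
    return total


def sum_of_squares_functional(numbers):
    # Functional style: no assignment statements and no loop, only
    # functions applied to parameters.
    return reduce(lambda acc, n: acc + n, map(lambda n: n * n, numbers), 0)


print(sum_of_squares_imperative([1, 2, 3]))   # 14
print(sum_of_squares_functional([1, 2, 3]))   # 14
```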


Programming Methodologies Influences

• 1950s and early 1960s: Simple applications; worry about machine efficiency

• The late 1960s and early 1970s brought an intense analysis, begun in large part by the structured-programming movement, of both the software development process and programming language design

– An important reason for this research was the shift in the major cost of computing from hardware to software, as hardware costs decreased and programmer costs increased

• Late 1960s: People efficiency became important; readability, better control structures

– structured programming

– The new software development methodologies that emerged were called top-down design and step-wise refinement

– The primary programming language deficiencies that were discovered:

• Incompleteness of type checking

• Inadequacy of control statements (requiring the extensive use of gotos)


Programming Methodologies Influences

• Late 1970s: A shift from process-oriented to data-oriented

– data-oriented methods emphasize data design, focusing on the use of abstract data types to solve problems

– SIMULA 67: The first PL to provide even limited support for data abstraction

– The benefits of data abstraction were not widely recognized until the early 1970s

• However, most languages designed since the late 1970s support data abstraction

• Middle 1980s: Object-oriented programming (OOP)

– Latest step in the evolution of data-oriented software development

– Data abstraction + inheritance + polymorphism

– OOP was developed along with a language that supported its concepts: Smalltalk

– Support for OOP is now part of the most popular imperative languages such as Ada 95, C++ and Java

– Object-oriented concepts have also found their way into functional programming in CLOS and logic programming in Prolog++


Programming Methodologies Influences

• Procedure-oriented (the opposite of data-oriented) programming has not been abandoned, although data-oriented methods now dominate software development

– in recent years, a good deal of research has occurred in procedure-oriented programming

• esp. in the area of concurrency

– e.g. Ada, Java and C# include language facilities for creating and controlling concurrent program units.

• All of these evolutionary steps in software development methodologies led to new language constructs to support them


Language Categories

• Programming languages are categorized as:

– Imperative

• It covers:

– Object-oriented (OO)

– Visual

– Markup

– etc.

– Functional

– Logic


Language Categories

• We do not consider languages that support OO programming to form a separate category

– They are included in imperative languages category

– Most popular languages that support OOP grew out of imperative languages.

– Although OO paradigm differs greatly from the procedure-oriented paradigm usually used with imperative languages, the extensions to an imperative language required to support OOP are not overwhelming

• e.g. The expressions, assignment statements, and control statements of C and Java are nearly identical. On the other hand, the arrays, subprograms and semantics of Java are very different from those of C


Language Categories

• Imperative

– Central features are variables, assignment statements and iteration

• e.g. C, Java, Perl, Visual BASIC .NET

– Include languages that support object-oriented programming

• e.g. Java, C++

– Include the visual languages

• e.g. Visual BASIC .NET

• These languages include capabilities for drag-and-drop generation of code segments

• Such languages were once called fourth-generation languages, although that name has fallen out of use

• The visual languages provide a simple way to generate GUIs to programs

– Include scripting languages

• e.g. Perl, JavaScript, Ruby


Language Categories

• Functional

– Main means of making computations is by applying functions to given parameters

– Eliminating side-effects can make it much easier to understand and predict the behavior of a program, which is one of the key motivations for the development of functional programming

– Examples: LISP, Scheme

• Logic

– Rule-based (rules are specified in no particular order)

• Language implementation system must choose an order in which the rules are used to produce the desired result

• In contrast to imperative languages, in which an algorithm is specified in great detail and the specific order of execution of the instructions or statements must be included

– The approach to software development is radically different from those used with the other categories of languages and clearly requires a completely different kind of language

– Example: Prolog


Language Categories

• Markup/programming hybrid languages

– Markup languages are not PLs

• e.g. XHTML

– the most widely used markup language

– is used to specify the layout of information in Web documents

– However, some programming capability has crept into some extensions to XHTML and XML

• Examples:

– JSTL (Java Server Pages Standard Tag Library)

– XSLT (eXtensible Stylesheet Language Transformations)

– Those languages cannot be compared to any of the complete programming languages


Language Categories

• In addition to all of the above categories a host of special-purpose languages have appeared

– RPG (Report Program Generator): used to produce business reports

– APT (Automatically Programmed Tools): used for instructing programmable machine tools

– GPSS (General Purpose Simulation System): used for system simulation

– These are not covered in our course because of their narrow applicability and the difficulty of comparing them with other languages


Language Design Trade-Offs

• The framework of language evaluation criteria is self-contradictory

• reliability vs. cost of execution

– Example:

• Java demands all references to array elements be checked for proper indexing, which leads to increased execution costs.

• C does not require index range checking, so C programs execute faster than semantically equivalent Java programs, although Java programs are more reliable

• The designers of Java traded execution efficiency for reliability

• readability vs. writability

– Example:

• APL provides many powerful operators (and a large number of new symbols), allowing complex computations to be written in a compact program but at the cost of poor readability

• Well-known author Daniel McCracken once noted that it took him four hours to read and understand a four-line APL program.
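
Returning to the reliability vs. cost of execution example (array index checking), the trade-off can be sketched as follows. This is a hypothetical model in Python, not how Java or C is actually implemented: `memory`, `unchecked_read`, and `checked_read` are names invented for illustration, and two arrays are laid out back to back in one flat list to mimic adjacent storage.

```python
# Flat "memory": array a occupies cells 0..2, array b occupies cells 3..5.
memory = [10, 20, 30, 77, 88, 99]
A_BASE, A_LEN = 0, 3

def unchecked_read(base, index):
    # C-like: no range check, so the access is cheap but may read a neighbor.
    return memory[base + index]

def checked_read(base, length, index):
    # Java-like: one extra comparison per access buys a reliable error.
    if not 0 <= index < length:
        raise IndexError(f"index {index} out of range for length {length}")
    return memory[base + index]

print(unchecked_read(A_BASE, 3))   # silently reads the first element of b
try:
    checked_read(A_BASE, A_LEN, 3)
except IndexError as err:
    print("caught:", err)
```

The checked version does strictly more work on every access, which is the execution cost; the unchecked version returns a plausible-looking but wrong value, which is the reliability cost.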


Language Design Trade-Offs

• writability (flexibility) vs. reliability

– Example:

• C++ pointers are powerful and very flexible but are unreliable

• Because of the potential reliability problems with pointers, they are not included in Java

• Examples of conflicts among language design (and evaluation) criteria abound; some are subtle, others are obvious

• It is clear that the task of choosing constructs and features when designing a PL requires many compromises and trade-offs


Implementation Methods

• Two of the primary components of a computer:

– Internal memory: is used to store programs and data

– Processor: provides a realization of a set of primitive operations, or machine instructions, such as those for arithmetic and logic operations

• The machine language of the computer is its set of instructions

• In the absence of other supporting software, its own machine language is the only language that most hardware computers “understand”.

• Theoretically, a computer could be designed and built with a particular high-level language as its machine language, but it would be

– very complex and expensive

– highly inflexible

• because it would be difficult to use it with other high-level languages



• The more practical machine design choice implements in hardware a very low-level language that

– provides the most commonly needed primitive operations and

– requires system software to create an interface to programs in higher-level languages

• A language implementation system cannot be the only software on a computer. Also required is a large collection of programs, called the operating system


Layered View of Computer

• The operating system and language implementation are layered over the machine interface of a computer

• These layers can be thought of as virtual computers, providing interfaces to the user at higher levels.

• For example, an operating system and a C compiler provide a virtual C computer.



• PLs can be implemented by any of the three general methods:

– Compilation

• Programs are translated into machine language

– Pure Interpretation

• Programs are interpreted by another program known as an interpreter

– Hybrid Implementation Systems

• A compromise between compilers and pure interpreters


Compilation

• Translate high-level program (source language) into machine code (machine language)

• Slow translation, fast execution

– e.g. C, COBOL, Ada

• Compilation process has several phases:

– lexical analysis: converts characters in the source program into lexical units

• Lexical units are identifiers, special words, operators and punctuation symbols

• The lexical analyzer ignores comments in the source program because the compiler has no use for them

– syntax analysis: transforms lexical units into parse trees, which represent the syntactic structure of the program as a hierarchy

• In many cases no actual parse tree structure is constructed; rather the information that would be required to build a tree is generated and used directly
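
The lexical analysis phase described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real compiler front end: the token classes in `TOKEN_SPEC`, the set `SPECIAL_WORDS`, and the C-like `//` comment syntax are all chosen for the example.

```python
import re

# Hypothetical token classes for a tiny C-like language.
TOKEN_SPEC = [
    ("COMMENT",     r"//[^\n]*"),        # recognized, then discarded
    ("NUMBER",      r"\d+(?:\.\d+)?"),
    ("IDENT",       r"[A-Za-z_]\w*"),
    ("OPERATOR",    r"[+\-*/=<>]"),
    ("PUNCTUATION", r"[;(){}]"),
    ("SKIP",        r"\s+"),             # whitespace, also discarded
]
SPECIAL_WORDS = {"if", "else", "while", "return"}
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return (kind, text) pairs; comments and whitespace are dropped."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("COMMENT", "SKIP"):
            continue
        if kind == "IDENT" and text in SPECIAL_WORDS:
            kind = "SPECIAL_WORD"
        tokens.append((kind, text))
    return tokens

print(tokenize("while (x < 10) x = x + 1; // count up"))
```

The syntax analyzer would then consume this list of lexical units rather than raw characters.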


Compilation

• Compilation process has several phases (continued):

– semantic analysis: generates intermediate code

• This code sometimes closely resembles assembly language, and sometimes is actual assembly language.

– In other cases it is at a level somewhat higher than an assembly language.

• The semantic analyzer checks for errors that are difficult if not impossible to detect during syntax analysis such as type errors.

• Optimization, which improves programs (usually in their intermediate code version) by making them smaller or faster or both, is often an optional part of compilation

– Because many kinds of optimization are difficult to do on machine language, most optimization is done on the intermediate code.

– code generation: machine code is generated from the optimized intermediate code
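
As an illustration of optimization on intermediate code, here is a sketch of one simple optimization, constant folding. The intermediate form (a list of `(dest, op, left, right)` tuples) and the function name `fold_constants` are assumptions made for this example. It also simplifies matters by recording folded names in a dictionary instead of emitting assignments for them, which a real compiler would need for variables that are used later at run time.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(code):
    """Fold operations whose operands are known constants.

    `code` is a list of (dest, op, left, right) tuples; operands are
    either ints (constants) or strings (names).
    """
    known = {}        # names currently known to hold a constant
    optimized = []
    for dest, op, left, right in code:
        left = known.get(left, left)       # substitute known constants
        right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int):
            known[dest] = OPS[op](left, right)   # computed at compile time
        else:
            known.pop(dest, None)                # dest is no longer constant
            optimized.append((dest, op, left, right))
    return optimized, known

code = [
    ("t1", "*", 60, 60),
    ("t2", "*", "t1", 24),
    ("secs", "*", "days", "t2"),
]
optimized, known = fold_constants(code)
print(optimized)
```

Three intermediate instructions shrink to one, so the generated program is both smaller and faster.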


Compilation

• Symbol table

– serves as a database for the compilation process

• The primary contents of the symbol table are the type and the attribute information of each user-defined name in the program.

• This information

– is placed in the symbol table by the lexical and syntax analyzers and

– is used by the semantic analyzer and the code generator
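
A symbol table can be pictured as a dictionary keyed by user-defined name. The sketch below is a hypothetical illustration: the functions `declare` and `check_assignment` and the attribute names `type` and `declared_at` are invented for the example and do not come from any particular compiler.

```python
symbol_table = {}

def declare(name, type_name, line):
    # Done by the early phases: record the name and its attributes.
    if name in symbol_table:
        raise NameError(f"line {line}: {name!r} already declared")
    symbol_table[name] = {"type": type_name, "declared_at": line}

def check_assignment(name, value_type, line):
    # Done by the semantic analyzer: look the name up and compare types.
    entry = symbol_table.get(name)
    if entry is None:
        raise NameError(f"line {line}: {name!r} is not declared")
    if entry["type"] != value_type:
        raise TypeError(
            f"line {line}: cannot assign {value_type} to {entry['type']} {name!r}"
        )

declare("count", "int", 1)
check_assignment("count", "int", 2)         # accepted
try:
    check_assignment("count", "string", 3)  # a type error
except TypeError as err:
    print(err)
```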


The Compilation Process


Additional Compilation Terminologies

• The user and system code together are sometimes called a load module (executable image)

• Linking and loading: the process of collecting system program units and linking them to a user program

– Sometimes just called linking

• Accomplished by a system program called a linker

– Linking operation connects the user program to the system programs (e.g. for I/O operations) by placing the address of the entry points of the system programs in the calls to them in the user program

– In addition to system programs, user programs must often be linked to previously compiled user programs that reside in libraries

• Linker also links a given program to other user programs
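
The linking operation described above, placing entry-point addresses into calls, can be sketched as follows. Everything here is a simplifying assumption: code is modeled as a list of `(opcode, operand)` pairs, an address is just a list index, and `link`, `user_program`, and `library` are hypothetical names.

```python
def link(user_code, library):
    """Build a load module and patch calls with entry-point addresses.

    `library` maps a routine name to its list of instructions.
    """
    load_module = list(user_code)
    entry_points = {}
    i = 0
    # Collect only the program units that are actually called.
    while i < len(load_module):
        opcode, operand = load_module[i]
        if opcode == "CALL" and operand not in entry_points:
            if operand not in library:
                raise LookupError(f"unresolved symbol {operand!r}")
            entry_points[operand] = len(load_module)   # address of the routine
            load_module.extend(library[operand])
        i += 1
    # Replace each symbolic call target with its address.
    return [
        (op, entry_points[arg]) if op == "CALL" else (op, arg)
        for op, arg in load_module
    ]

user_program = [("PUSH", 42), ("CALL", "print_int"), ("HALT", None)]
library = {
    "print_int": [("OUT", None), ("RET", None)],
    "read_int": [("IN", None), ("RET", None)],
}
print(link(user_program, library))
```

Note that `read_int` is never called, so it is not copied into the load module.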


von Neumann Bottleneck

• Connection speed between a computer’s memory and its processor determines the speed of a computer

• Program instructions often can be executed much faster than the speed of the connection; the connection speed thus results in a bottleneck

• Known as the von Neumann bottleneck; it is the primary limiting factor in the speed of computers


Pure Interpretation

• No translation

• Programs are interpreted by another program called an interpreter

• Interpreter acts as a software simulation of a machine

– whose fetch-execute cycle deals with high-level language program statements rather than machine instructions

• Easier implementation of programs

– esp. many source-level debugging operations

– run-time errors can easily and immediately be displayed

• For example, if an array index is found to be out of range, the error message can easily indicate the source line and the name of the array


Pure Interpretation

• Slower execution (10 to 100 times slower than compiled programs)

– Decoding of high-level language statements is the main reason

– Furthermore, regardless of how many times a statement is executed, it must be decoded every time

• Therefore, statement decoding, rather than the connection between the processor and memory, is the bottleneck of a pure interpreter

• Often requires more space

– In addition to the source program, the symbol table must be present during interpretation

• Although some simple early languages of the 1960s (APL, SNOBOL, and LISP) were purely interpreted, by the 1980s, the approach was rarely used on high-level languages

• Significant comeback with some Web scripting languages (e.g. JavaScript, PHP)
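
The cost of decoding every statement each time it is executed can be made visible with a toy pure interpreter. This is a sketch under stated assumptions: the statement syntax (`target = left op right` with `+` or `*` only) and the names `decode`, `value_of`, and `run_loop` are invented for illustration.

```python
decode_count = 0

def decode(statement):
    """Split a source statement such as 'total = total + i' into parts."""
    global decode_count
    decode_count += 1
    target, expr = [part.strip() for part in statement.split("=", 1)]
    left, op, right = expr.split()
    return target, left, op, right

def value_of(token, env):
    # A token is either an unsigned integer literal or a variable name.
    return int(token) if token.isdigit() else env[token]

def run_loop(body, times, env):
    """Execute the statements in `body` `times` times, decoding each time."""
    for _ in range(times):
        for statement in body:
            target, left, op, right = decode(statement)   # repeated work
            a, b = value_of(left, env), value_of(right, env)
            env[target] = a + b if op == "+" else a * b
    return env

env = run_loop(["total = total + i", "i = i + 1"], 100, {"total": 0, "i": 0})
print(env["total"], decode_count)
```

Two source statements executed 100 times each cause 200 decoding steps, which is the overhead the slide refers to.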


Pure Interpretation Process


Hybrid Implementation Systems

• A compromise between compilers and pure interpreters

• A high-level language program is translated to an intermediate language that allows easy interpretation

• Faster than pure interpretation

– Because the source language statements are decoded only once

• Instead of translating intermediate language code to machine code, it simply interprets the intermediate code.

• Examples

– Perl programs are partially compiled to detect errors before interpretation

– Initial implementations of Java were hybrid; the intermediate form, byte code, provides portability to any machine that has a byte code interpreter and a run-time system (together, these are called the Java Virtual Machine)
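
A hybrid system can be sketched as two separate steps: translate the source once into an intermediate form, then interpret that form. The code below is a toy illustration, not how Perl or Java works internally: the tuple-based intermediate form and the names `translate` and `interpret` are assumptions made for the example.

```python
translate_count = 0

def translate(statement):
    """Translate 'x = a + b' once into an intermediate tuple."""
    global translate_count
    translate_count += 1
    target, expr = [part.strip() for part in statement.split("=", 1)]
    left, op, right = expr.split()

    def operand(token):
        # Literals become ints now, so no text parsing is needed later.
        return int(token) if token.isdigit() else token

    return (op, target, operand(left), operand(right))

def interpret(code, times, env):
    """Interpret the intermediate code; no source text is decoded here."""
    def fetch(x):
        return x if isinstance(x, int) else env[x]

    for _ in range(times):
        for op, target, left, right in code:
            a, b = fetch(left), fetch(right)
            env[target] = a + b if op == "+" else a * b
    return env

source = ["total = total + i", "i = i + 1"]
code = [translate(s) for s in source]          # each statement decoded once
env = interpret(code, 100, {"total": 0, "i": 0})
print(env["total"], translate_count)
```

However many times the loop body runs, each source statement is decoded only once, which is why this approach is faster than pure interpretation.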


Hybrid Implementation Process


Just-in-Time Implementation Systems

• Initially translate programs to an intermediate language

• Then compile the intermediate language of the subprograms into machine code when they are called

• Machine code version is kept for subsequent calls

• JIT systems are widely used for Java programs

• .NET languages are implemented with a JIT system


Just-in-Time Implementation Systems

• Sometimes an implementor may provide both compiled and interpreted implementations for a language.

– In these cases, the interpreter is used to develop and debug programs

– Then, after a (relatively) bug-free state is reached, the programs are compiled to increase their execution speed


Preprocessors

• A preprocessor is a program that processes a program immediately before the program is compiled.

• Preprocessor instructions are embedded in programs.

• The preprocessor is essentially a macro expander.

• Preprocessor instructions are commonly used to specify that the code from another file is to be included.


Preprocessors

• A well-known example: C preprocessor

• expands #include, #define, and similar macros

• #include "myLib.h"

– causes the preprocessor to copy the contents of myLib.h into the program at the position of the #include

• #define max(A, B) ((A) > (B) ? (A) : (B))

– x = max(2 * y, z / 1.73);

– would be expanded by the preprocessor to

– x = ((2 * y) > (z / 1.73) ? (2 * y) : (z / 1.73));

– Notice that this is one of those cases where expression side effects can cause trouble

• e.g. If either of the expressions given to the max macro has side effects (such as z++), it could cause a problem.

– Because one of the two expression parameters is evaluated twice, this could result in z being incremented twice by the code produced by the macro expansion
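
The double evaluation caused by the max macro can be modeled in Python. This is a simulation, not C: `expand_max` imitates the textual substitution done by the preprocessor, and the hypothetical `Counter` class with its `post_increment` method imitates the behavior of `z++`.

```python
def expand_max(a, b):
    """Textually expand max(A, B) the way a macro expander would."""
    return f"(({a}) > ({b}) ? ({a}) : ({b}))"

expanded = expand_max("2 * y", "z++")
print(expanded)                 # the argument text z++ now appears twice

class Counter:
    """Models a C variable that supports z++ (use the value, then add 1)."""
    def __init__(self, value):
        self.value = value

    def post_increment(self):
        old = self.value
        self.value += 1
        return old

# Python rendering of the expanded C expression with y = 1 and z = 5.
y = 1
z = Counter(5)
x = (2 * y) if (2 * y) > z.post_increment() else z.post_increment()
print(x, z.value)
```

Because the comparison fails, the second copy of `z++` is also evaluated: `z` ends up incremented twice, and `x` receives 6 rather than the original value 5.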


Programming Environments

• A collection of tools used in software development

– may consist only of a file system, a text editor, a linker and a compiler, or

– may include a large collection of integrated tools, each accessed through a uniform user interface

• In the latter case, the development and maintenance of software is greatly enhanced.

• Therefore, the characteristics of a programming language are not the only measure of the software development capability of a system.

• Some programming environments:

– UNIX

• an older operating system and tool collection

• nowadays often used through a GUI (e.g., CDE, KDE or GNOME) that runs on top of UNIX


Programming Environments

• Some programming environments (continued):

– Borland JBuilder

• a programming environment that provides an integrated compiler, editor, debugger and file system for Java development

– Microsoft Visual Studio .NET

• a large, complex visual environment

• used to build Web applications and non-Web applications in any .NET language (C#, Visual BASIC .NET, JScript, J#)

– NetBeans

• a development environment that is primarily used for Java Web application development but also supports JavaScript, Ruby and PHP

– Both Visual Studio and NetBeans are more than development environments—they are also frameworks,

• which means they actually provide common parts of the code of the application


Summary

• The study of programming languages is valuable for a number of reasons:

– Increases our capacity to use different constructs

– Enables us to choose languages more intelligently

– Makes learning new languages easier

• The design and evaluation of a particular programming language is highly dependent on the domain in which it is to be used.

• Most important criteria for evaluating programming languages include: readability, writability, reliability, cost

• Major influences on language design have been machine architecture and software development methodologies

• A long list of trade-offs must be made among features, constructs, and capabilities in order to design a PL

• The major methods of implementing programming languages are: compilation, pure interpretation, and hybrid implementation

• Programming environments have become important parts of software development systems, in which the language is just one of the components
