Programming Language Concepts
Tatjana
Programming Language Pragmatics
Michael Scott
http://www.cs.rochester.edu/u/scott/pragmatics/
Contents
1 Introduction
2 Programming Language Syntax
  2.1 Specifying Syntax
  2.2 Recognizing Syntax
  2.3* Theoretical Foundations
3 Names, Scopes, and Bindings
  3.1 The Notion of Binding Time
  3.2 Object Lifetime and Storage Management
  3.3 Scope Rules
  3.4 The Binding of Referencing Environments
  3.5 Overloading and Related Concepts
  3.6 Naming-Related Pitfalls in Language Design
4 Semantic Analysis
5 Assembly-Level Computer Architecture
6 Control Flow
  6.1 Expression Evaluation
  6.2 Structured and Unstructured Flow
  6.3 Sequencing
  6.4 Selection
  6.5 Iteration
  6.6 Recursion
7 Data Types
  7.1 Type Systems
  7.2 Type Checking
  7.3 Records (Structures) and Variants (Unions)
  7.4 Arrays
  7.5 Strings
  7.6 Sets
  7.7 Pointers and Recursive Types
  7.8 Lists
8 Subroutines and Control Abstraction
  8.1 Review of Stack Layout
  8.2 Calling Sequences
  8.3 Parameter Passing
  8.4 Generic Subroutines and Modules
  8.5 Exception Handling
  8.6 Coroutines
9 Building a Runnable Program
10 Data Abstraction and Object Orientation
11 Alternative Programming Models: Functional and Logic Languages
12 Concurrency
13 Code Improvement
Introduction
Computing devices:
  Mechanical:
    Fingers, abacus
    Blaise Pascal, 1642: + -
    Gottfried Wilhelm von Leibniz: + - * /
    Charles Babbage, 1832: programmable
  Electronic:
    COLOSSUS, 1943
    ENIAC (Electronic Numerical Integrator and Computer), 1946
Machine language
  binary system - John von Neumann
  [Figure: GCD program in machine language for the MIPS R4000]
coding in the true sense of the word
code is not
  reusable: monolithic structure
  relocatable: consider adding one instruction in the middle
  readable
practically impossible to create large programs
Assembly languages
  assembler
  [Figure: GCD program in assembly language]
Assembler
  translator from symbolic language to machine language (one-to-one mapping)
  tool to assemble the symbolic program in the machine
Advantages
  relocatable and reusable (copy) programs
  macro expansion: first step towards higher-level programming
  larger programs (like operating systems) become possible
But,
  each kind of computer has its own assembly language
  programmers must learn to think like computers
  maintenance of larger programs is difficult
Higher-level languages
  portability
  natural notation (for anything)
  support for software development
Machine-independent languages
  Fortran, 1956
  Cobol, 1959
  Algol, 1958, 1960
  ...
compilers
Fortran (Mathematical Formula Translator)
  Backus, 1957, IBM
  compilation instead of translation
  language for scientific computing
    the most important task in those days
  efficiency was important in order to replace assemblers
  introduced many important language concepts that are still in use
  Fortran 90: array operations
Cobol (Common Business Oriented Language)
  1959
  COBOL committee (IBM, Honeywell, Flow-Matic, ...)
  at some point 60% of all business applications
Algol 60 (Algorithmic Language)
  the first European language
  never widely used in practice
  introduced modern concepts
  big influence on further development
    Ada
Basic (Beginner's All-purpose Symbolic Instruction Code)
  1964
  popular in the eighties
  Visual Basic, Visual Basic for Applications
PL/1 (Programming Language One)
  general-purpose
  meant to replace Fortran, Cobol and Algol
Algol 68
  the same idea of universality
Both:
  too complex
  hardware could not support them
  Algol 68 compiler never completely realized
Pascal
  N. Wirth
  late sixties
  simple to learn, easy to use, ...
  introduces subrange and enumeration types, unified structures, unions
  Pascal-like notation
  Turbo Pascal
  free availability
  Modula
an analysis from the beginning of the seventies
  predicted for the next 15-20 years:
    software cost not in proportion to hardware cost
  about 450 languages
Ada
  1983
  new attempt at a universal language
  US DoD
  expectations were too high and never fulfilled
  theoretically significant: data types, modules, abstraction, concurrency, exception handling
C
  1970
  UNIX, system software programming
  1978: D. M. Ritchie and B. W. Kernighan
  1983: ANSI C
  close to assembly languages
  not reliable: weak type checking, no dynamic semantic checks
  → C++
object-oriented languages
  data abstraction
  objects, classes
  inheritance, polymorphism
  roots in Simula 67
  Smalltalk 80, Eiffel, Omega, Oberon, C++, Delta, Java
  visual environment, interactive, event-driven programming: Visual Basic, Delphi
Language classification
imperative
  how the computer should solve the problem
  first do this, then repeat that, then branch there...
  procedural languages (Pascal, C, Basic, ...)
    computing via side effects
    von Neumann architecture (1946)
  object-oriented
declarative languages
  program = description of the problem
  a formal statement of what the problem is
  closer to humans than to computers
  functional languages
    Lisp, 1958
    λ-calculus, Church, 1930s
    computing without variables
  logic programming
    predicate logic, Frege, 1879
    Prolog, seventies
    computing with relations
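The difference between "computing via side effects" and "computing without variables" can be sketched on the GCD example used elsewhere in these slides. This is only an illustration in Python, assuming positive integer inputs; the function names are hypothetical.

```python
def gcd_imperative(i, j):
    # Imperative style: the state of i and j changes through repeated assignment.
    while i != j:
        if i > j:
            i = i - j
        else:
            j = j - i
    return i


def gcd_functional(i, j):
    # Functional style: nothing is reassigned; recursion replaces the loop.
    if i == j:
        return i
    if i > j:
        return gcd_functional(i - j, j)
    return gcd_functional(i, j - i)
```

Both functions compute the same result; only the way the computation is expressed differs.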
The programming language spectrum
sequential
concurrent
  in conjunction with sequential (Fortran, C, ...)
  explicit (Java, Ada, Modula-3)
Why so many languages?
evolution
  goto → while, case, ... → object-oriented
special purposes
  symbolic data: Lisp
  character strings: Snobol, Icon
  low-level programming: C
  numeric data: Fortran
  logic programming: Prolog
personal preference
  iteration vs. recursion
  pointers vs. implicit dereferencing
What makes a language successful?
expressive power
ease of use for a novice
  Basic, Logo, Pascal, Java, ...
ease of implementation
excellent compilers
  Fortran
economics, patronage, inertia
  Cobol
programming vs. implementation: conceptual clarity vs. implementation efficiency
Language characteristics
formally defined syntax (grammars, syntax diagrams)
data types (predefined, others)
data structures (array, record, file, set)
control (if, case, for, while)
subroutines
modules, abstract data types
  data + procedures + functions
  closed
concurrency, parallelism
low-level mechanisms to access registers, memory, format data
exception handling mechanisms
I/O procedures
Evaluating languages
readability
  more readable → less documentation
  factors: from key words to degree of modularity
  simplicity
    num = num + 1, num += 1, ++num, num++
readability (continued)
  orthogonality
    small number of concepts and ways to combine them
  control flow
    structured languages
  data structures
    records are clearer than arrays
  syntax
    begin .. end, if .. fi (end if)
ease of use
  depends on the application
  simplicity and orthogonality
    programmers accept a limited number of new concepts
    small number of concepts and constructs
  abstraction support
    emphasizes global characteristics
    subroutines, modules, classes
  expressivity
    num = num + 1 or num++
    while or for
reliability
  to decrease the number of run-time errors
  early binding
  data types
    explicitly defined
    operator types determined
    casting
  exception handling
    run-time errors caused by the program or the system
  aliasing
    multiple references to the same memory location
    Fortran: EQUIVALENCE
    Pascal: pointers
    may cause errors
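Aliasing is easy to demonstrate in any language with reference semantics. The sketch below uses Python lists rather than Fortran EQUIVALENCE or Pascal pointers; the names `prices`, `discounted`, and `scale_in_place` are made up for the illustration.

```python
def scale_in_place(values, factor):
    # Modifies the list object it is given; no copy is made.
    for k in range(len(values)):
        values[k] = values[k] * factor


prices = [10, 20, 30]
discounted = prices              # alias: a second name for the same list
scale_in_place(discounted, 0.5)  # also changes what `prices` refers to

independent = list(prices)       # explicit copy: a separate memory location
scale_in_place(independent, 2)   # leaves `prices` untouched
```

After the first call, `prices` has changed even though it was never mentioned in that call, which is exactly the kind of surprise that makes aliasing a reliability concern.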
efficiency
  of a program
    important for real-time systems
  of the compiler
    important for frequently modified programs
  overall
    important for widely used software
Why study programming languages?
interesting, practical
choose the most appropriate language
  scientific applications, system software, embedded systems, word processors
  C, Fortran, Java, Ada, Visual Basic, Modula-2
easier to learn new languages
  C → C++ → Java
  Pascal → Modula-2
  common concepts: types, control, naming, abstraction
Our aim is to:
Understand obscure features
  C: unions, arrays vs. pointers, separate compilation, varargs, ...
  understanding the basic concepts is necessary for understanding non-basic ones
Choose the best alternative depending on implementation costs
  alternative ways of doing the same thing
    x * x instead of x**2
    pointer arithmetic or arrays
    computation vs. memory (function or table)
  things to avoid
    Pascal: value parameters for large types
Make good use of the environment
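The "computation vs. memory (function or table)" trade-off can be sketched as follows. The example, counting the 1-bits in a byte, is not from the slides; it is only one common case where a precomputed table replaces repeated computation, and all names are hypothetical.

```python
def count_bits_computed(n):
    # Computation: loop over the bits every time the answer is needed.
    count = 0
    while n:
        count += n & 1
        n >>= 1
    return count


# Memory: pay once to build a 256-entry table, then answer by indexing.
BIT_COUNTS = [count_bits_computed(n) for n in range(256)]


def count_bits_lookup(byte):
    return BIT_COUNTS[byte]
```

Which version is better depends on how often the function is called and how expensive memory is on the target machine.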
Simulate features where they do not exist
  Fortran (pre-90)
    bad control structures → use comments and programmer discipline
    no recursion → eliminate recursion
    no named constants → use variables
  C, Pascal
    no modules → use naming and discipline
Equip with basic knowledge for further study of language design and implementation, or of the interactions of languages with operating systems
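"Eliminate recursion" usually means managing by hand the stack that the language would otherwise manage. A minimal sketch, assuming a nested list of integers as input (the data shape and function names are chosen only for this illustration):

```python
def total_recursive(tree):
    # Natural recursive formulation: a tree is an int or a list of trees.
    if isinstance(tree, int):
        return tree
    return sum(total_recursive(child) for child in tree)


def total_iterative(tree):
    # Same computation without recursion: an explicit stack holds pending work.
    stack = [tree]
    total = 0
    while stack:
        node = stack.pop()
        if isinstance(node, int):
            total += node
        else:
            stack.extend(node)
    return total
```

The iterative version is what a programmer would have to write in a language without recursion, with an array playing the role of the stack.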
Useful in designing command interpreters, programmable editors, text processors, ...
Many system programs are like languages
  command shells
  programmable editors
  programmable applications
Many system programs are like compilers
  read and analyze configuration files and command line options
Easier to use and design such things once you know about real languages
Compilation and interpretation
Interpretation
  greater flexibility
  better diagnostics
    excellent source-level debugger
  copes with variable sizes, types, names
  write and execute program pieces on the fly (Prolog, Lisp)
Compilation
  better performance
    saves time, memory
a mixture of both
compilation or interpretation?
preprocessor (in interpreted languages)
  removes comments and white space, forms tokens, expands abbreviations, identifies high-level structures
compilation
  thorough analysis and nontrivial transformation
examples
  Basic: purely interpreted
  Fortran: purely compiled
    format interpreter
C
  preprocessor removes comments, expands macros, conditional compilation
C++
  early AT&T compiler
Pascal
  early compilers:
    a Pascal compiler written in Pascal
    the same compiler in P-code
    a P-code interpreter written in Pascal
    translate (by hand) the P-code interpreter into a local language
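The reason a P-code interpreter is small enough to translate by hand is that a stack-machine interpreter is little more than a loop over instructions. The sketch below is a toy: its four-instruction set (`push`, `add`, `mul`, `print`) is invented for illustration and is not the real P-code instruction set.

```python
def run(code):
    # Interpret a list of (opcode, operand...) tuples on an operand stack.
    stack = []
    output = []
    pc = 0  # program counter
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "print":
            output.append(stack.pop())
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return output


# (2 + 3) * 4
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",), ("print",)]
```

Calling `run(program)` returns the list of printed values. Porting such a loop to a new machine is far less work than porting a whole compiler.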
still both for Pascal, C, other imperative languages
late binding
  Prolog, Lisp
Java, byte code (interpreter or just-in-time compiler)
assembly languages run on an interpreter
some compilers produce C code
translating automatically from one nontrivial language to another
  text processors, query language processors for databases
Programming environments
Assemblers, debuggers, preprocessors, linkers, editors, configuration management tools
Explicit request of the user (Unix)
Integrated environments (Smalltalk, Visual Studio)
Overview of compiling
phases
  front end
    figure out the meaning of the source program
  back end
    construct the target program
passes
  a serialized set of phases
  separate programs, input/output files
  economical memory use
  division such that
    front end for more than one machine
    back end for more than one language
Lexical and Syntax Analysis
  program gcd (input, output);
  var i, j : integer;
  begin
    read (i, j);
    while i <> j do
      if i > j then i := i - j
      else j := j - i;
    writeln (i)
  end.
scanner
  lexical analysis
  tokens: program, gcd, (, i, ,, j, ), ;, ... , end, .
  removes comments, tags tokens with line and column numbers
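A scanner of this kind can be sketched with a single regular expression. This is a minimal illustration, not the scanner of any real Pascal compiler: the token classes, the `scan` function, and the decision to treat keywords as ordinary identifiers are all simplifying assumptions.

```python
import re

# One alternative per token class; comments and white space are matched
# so that they can be skipped.
TOKEN_RE = re.compile(r"""
    (?P<comment>\{[^}]*\})
  | (?P<space>\s+)
  | (?P<number>\d+)
  | (?P<ident>[A-Za-z][A-Za-z0-9_]*)
  | (?P<symbol>:=|<>|<=|>=|[-+*/=<>()\[\];:,.])
""", re.VERBOSE)


def scan(source):
    # Return a list of (kind, text, line, column) tuples.
    tokens = []
    line, line_start, pos = 1, 0, 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            column = pos - line_start + 1
            raise SyntaxError(f"unexpected {source[pos]!r} at line {line}, column {column}")
        kind, text = match.lastgroup, match.group()
        if kind not in ("comment", "space"):
            tokens.append((kind, text, line, pos - line_start + 1))
        newlines = text.count("\n")
        if newlines:
            # Keep line and column numbers in step with the input.
            line += newlines
            line_start = match.start() + text.rfind("\n") + 1
        pos = match.end()
    return tokens
```

For the loop of the GCD program, `scan("while i <> j do")` yields one tuple per token, each tagged with its position, and the parser never sees comments or white space.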
parser
  syntactic analysis
  parse tree
  CFG (context-free grammar)
    program → PROGRAM identifier ( identifier more_identifiers ) ; block .
    block → labels constants types variables subroutines BEGIN statement more_statements END
    more_identifiers → , identifier more_identifiers | ε
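The grammar rules above map almost directly onto a recursive-descent parser, with one function per nonterminal. The sketch below handles only the program header (everything before `block`), takes tokens as plain strings, and does not distinguish keywords from identifiers; these are simplifications for illustration, and the function names are hypothetical.

```python
def parse_program_header(tokens):
    # Parses: PROGRAM identifier ( identifier more_identifiers ) ;
    pos = 0

    def expect(text):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != text:
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {text!r}, found {found!r}")
        pos += 1

    def identifier():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isidentifier():
            raise SyntaxError("expected an identifier")
        name = tokens[pos]
        pos += 1
        return name

    def more_identifiers():
        # more_identifiers -> , identifier more_identifiers | epsilon
        if pos < len(tokens) and tokens[pos] == ",":
            expect(",")
            first = identifier()
            return [first] + more_identifiers()
        return []  # the epsilon alternative: consume nothing

    expect("program")
    name = identifier()
    expect("(")
    names = [identifier()] + more_identifiers()
    expect(")")
    expect(";")
    return ("program", name, names)
```

The recursion in `more_identifiers` mirrors the recursion in the grammar rule; the empty alternative corresponds to returning without consuming a token.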
Semantic Analysis, Intermediate Code Generation
meaning
  recognizes multiple occurrences of an identifier, tracks types of identifiers and expressions
symbol table
  identifier, type, internal structure, scope
checks that
  identifiers are defined before they are used
  identifiers are not used inappropriately
  subroutine calls have the correct arguments
  arms of a CASE statement are distinct constants
  functions have return values
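Two of these checks, "defined before used" and "not used inappropriately", can be sketched with a symbol table organized as a stack of scopes. The class and function names are hypothetical, and types are represented as plain strings to keep the example short.

```python
class SymbolTable:
    def __init__(self):
        self.scopes = [{}]  # innermost scope is last

    def enter_scope(self):
        self.scopes.append({})

    def leave_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name} declared twice in the same scope")
        scope[name] = type_name

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name} used before declaration")


def check_assignment(table, target, source):
    # Static check: both names must be declared and have the same type.
    target_type = table.lookup(target)
    source_type = table.lookup(source)
    if target_type != source_type:
        raise TypeError(f"cannot assign {source_type} to {target_type}")
    return target_type
```

A real semantic analyzer would call such routines from the parser's semantic actions as declarations and uses are encountered.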
semantic action routines invoked by the parser
static semantics (at compile time)
dynamic semantics (at run time)
  variables in expressions have values
  pointers refer to valid objects
  array subscripts are within bounds
  functions return values
exception: raised if a dynamic check fails
erroneous: a program that breaks a rule that is too expensive to check
parse tree (concrete syntax tree)
syntax tree (abstract syntax tree)
  decorated with attributes, e.g., pointers from identifiers to their symbol table entries
intermediate form between front end and back end:
  annotated syntax tree
  traversal of some intermediate tree (resembles assembly language)
Target code generation
code generation: intermediate form → target language
  traverses the symbol table to assign locations to variables
  traverses the syntax tree generating loads and stores
  arithmetic, tests, branches
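Generating loads and stores by traversing a syntax tree can be sketched as a post-order walk. The tree encoding (tuples of operator, left, right) and the stack-machine instructions emitted here are assumptions made for the illustration, not the output format of any particular compiler.

```python
OPCODES = {"+": "add", "-": "sub", "*": "mul"}


def generate(node, code):
    # Post-order walk: operands first, then the operator.
    if isinstance(node, int):
        code.append(("push", node))   # constant
    elif isinstance(node, str):
        code.append(("load", node))   # variable: load from its location
    else:
        op, left, right = node
        generate(left, code)
        generate(right, code)
        code.append((OPCODES[op],))


def generate_assignment(target, expression):
    # Code for: target := expression
    code = []
    generate(expression, code)
    code.append(("store", target))
    return code
```

For the statement `i := i - j` from the GCD program, `generate_assignment("i", ("-", "i", "j"))` produces two loads, a subtraction, and a store.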
Code improvement
more efficient
  quicker and/or less memory
two phases:
  machine independent, on the intermediate form
  target program improvement, register allocation, reordering instructions
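Constant folding is one simple example of a machine-independent improvement on the intermediate form. The sketch assumes expressions are encoded as nested tuples of (operator, left, right) with integer constants and variable names as leaves; this encoding and the function name `fold` are chosen only for the illustration.

```python
import operator

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    # Replace subtrees whose operands are all constants by their value.
    if not isinstance(node, tuple):
        return node  # constant or variable name: nothing to do
    op, left, right = node
    left, right = fold(left), fold(right)
    if op in FOLDABLE and isinstance(left, int) and isinstance(right, int):
        return FOLDABLE[op](left, right)
    return (op, left, right)
```

The transformation does not depend on the target machine, which is why it belongs in the first of the two phases.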