Introduction Computer Science
Henri Bal
Vrije Universiteit Amsterdam
Goals of this course
● Understand typical Computer Science topics
● Meet with students and some staff members
● Develop skills:
  ● Reading (English) scientific literature
  ● Critical/analytical thinking about CS topics
  ● Discussing
  ● Presenting
  ● Scientific writing
Structure
● Tuesdays: guest lectures
  ● 2 scientific papers provided as context
  ● Questions made up by lecturers beforehand
● Thursday/Friday/Monday: working groups
  ● 2 students per group present a paper
  ● Each group discusses both papers + questions
Topics (Tuesday lectures)
● Intro & high-performance computing (Henri Bal)
● Finding & reading scientific literature (Michel Klein, with LI & IMM students)
● e-Science infrastructures (Cees de Laat)
● e-Health (Aart van Halteren)
● Astronomy & manycores (Rob van Nieuwpoort)
● Watson (Lora Aroyo, with LI & IMM students)
● Luggage handling at Heathrow Terminal 5 (Huub van der Wouden, with IMM students)
Working Groups
● Supervised by staff members (instructors)
● First meeting:
  ● Instructors will present 1 paper, you do the discussions
● Other meetings:
  ● Students present/discuss papers
● Course material + working group composition will be made available on Blackboard (bb.vu.nl)
Your tasks
● Attend Tuesday lectures
● Send brief answers to questions + pose 2 new questions per paper before workgroup deadline
● Give 1 presentation in a working group
  ● Make slides, talk for 10-15 minutes
● Participate in working group discussions
● Write 2-page paper on 1 topic of your choice
  ● Use (find!) 2 extra publications in the literature
● Grading:
  ● 40% participation, 40% paper, 20% presentation
First presentation
● My personal view on Computer Science
  ● Why is Computer Science so interesting?
● Biased towards my own research area:
  ● High performance distributed computing
Computer Science (CS)
● CS sits between technology and applications, both of which are developing turbulently
  ● Processors, networks, mobiles, wearables, …
  ● Data explosion in virtually all applications
● CS also studies many fundamental problems of its own
  ● Programming languages, security, AI, theory, …
Outline
● Technology
  ● Computers
    ● Some history
    ● High performance computers
    ● Modern (multicore) PCs
  ● Networks & mobile computing
● Applications
  ● Data explosion
  ● Computation demands
● Fundamental CS questions
Computers
● Mainframe: powerful centralized computer
  ● IBM 704 (1954)
● Minicomputers: <$25K, for small groups
  ● PDP-8, PDP-11, VAX (1960s-1980s)
● Workstations: expensive personal graphical machine
  ● Xerox Alto (1973)
● PCs: inexpensive machine for the masses
  ● IBM PC (1981)
High Performance Computers
● Computer systems with many processors, all computing in parallel
● Paper: “Back to Thin-Core Massively Parallel Processors”
Warning
● Scientific papers may be overwhelming
  ● Have to learn how to read scientific literature, without understanding every word
● “Moreover, smart algorithms that exploit data locality, perform loop unrolling, eliminate iterative loops and recursive algorithms, and use idle-power-friendly programming languages and libraries as well as auto-tuning based on multiversion algorithms can achieve higher-energy-efficiency applications.”
● (You’re not supposed to understand this yet!)
High Performance Computers (1)
● Vector machines
  ● Can do vector operations in parallel
  ● A and B: 1-dimensional arrays (vectors) with 100 elements
  ● Computing A+B (= 100 additions) takes as much time as doing 1 addition on a sequential computer
● History
  ● 1970s, 1980s (e.g., Cray)
  ● 2000s (Japanese Earth Simulator)
  ● 2010s (GPUs, Graphics Processing Units)
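The A+B example above can be sketched in Python. This is only a conceptual model: plain Python executes everything sequentially, so the "vector" version simply counts the whole addition as one step, the way a vector machine would issue it. The function names and the step counting are assumptions made for illustration.

```python
def sequential_add(a, b):
    """Add two vectors element by element, counting one step per addition."""
    result = []
    steps = 0
    for x, y in zip(a, b):
        result.append(x + y)
        steps += 1
    return result, steps


def vector_add(a, b):
    """Model a vector machine: the whole addition is issued as one step.

    Assumption: the hardware has enough lanes to handle all elements at once.
    """
    return [x + y for x, y in zip(a, b)], 1


A = list(range(100))        # 100 elements, as on the slide
B = list(range(100, 200))

seq_result, seq_steps = sequential_add(A, B)
vec_result, vec_steps = vector_add(A, B)
print(seq_steps, vec_steps)
```

Both versions produce the same vector; only the number of time steps in this toy model differs.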
High Performance Computers (2)
● Massively parallel machines
  ● 1000s of special processors connected by a special network, all running in parallel, each doing part of the overall computation
  ● E.g., CM-1, CM-5, Intel Paragon, IBM BlueGene
  ● Connection network uses graph theory (math)
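The idea that each processor does part of the overall computation can be sketched as follows. This is a minimal, sequential illustration, assuming a simple sum as the workload and 8 hypothetical processors. The helper `split_work` is invented for this example, and a real machine would run the partial sums at the same time.

```python
def split_work(items, num_workers):
    """Split items into num_workers contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(len(items), num_workers)
    chunks = []
    start = 0
    for worker in range(num_workers):
        size = base + (1 if worker < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


data = list(range(1, 1001))                      # overall computation: sum 1..1000
chunks = split_work(data, 8)                     # 8 hypothetical processors
partial_sums = [sum(chunk) for chunk in chunks]  # each processor's share of the work
total = sum(partial_sums)                        # combining results needs communication
print([len(c) for c in chunks], total)
```

The final `sum(partial_sums)` stands in for the step where processors exchange their partial results over the network.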
High Performance Computers (3)
● Cluster computers
  ● Parallel machines built from off-the-shelf (commodity) PCs and networks
  ● Excellent price/performance ratio
● Exponential performance growth of processor speeds
● See http://www.top500.org for the 500 fastest supercomputers
Multicores & Manycores
● All PCs now have >1 compute core
  ● Every PC is a parallel computer!
● Some PCs already have 48 cores
● Core count will increase to hundreds
● GPUs (manycores): 1000s of very simple cores
● Intel Phi (2012): 60 Pentium-1s on 1 chip, with advanced vector support
● Challenge: how to program these things?
Thinking in parallel is hard
● How to split up the work?
  ● Load balancing
    ● All cores should do the same amount of work
● Communication & synchronization
  ● Cores must exchange data (= overhead)
● Nondeterminism:
  ● A single processor always gives the same outcome
  ● With >1 core the outcome may depend on the order in which the cores execute (called a “race condition” bug)
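The race condition mentioned above can be made concrete with a small deterministic simulation. This is a sketch, assuming two cores that each increment a shared counter in two separate steps (load, then store). Rather than using real threads, it enumerates every order in which those steps can interleave. The function `run` and the schedule encoding are invented for this illustration.

```python
from itertools import permutations


def run(schedule):
    """Run two cores that each do counter = counter + 1 in the given step order.

    Each core performs two steps: 'load' the shared counter into a private
    register, then 'store' register + 1 back. schedule is a sequence of core
    ids (0 or 1) saying which core takes its next step.
    """
    counter = 0
    register = [None, None]   # private register per core
    next_step = [0, 0]        # 0 = load, 1 = store
    for core in schedule:
        if next_step[core] == 0:
            register[core] = counter
        else:
            counter = register[core] + 1
        next_step[core] += 1
    return counter


# All distinct orders in which the four steps can be interleaved.
schedules = sorted(set(permutations([0, 0, 1, 1])))
outcomes = {s: run(s) for s in schedules}
for s, result in outcomes.items():
    print(s, "->", result)
```

Only the two schedules in which one core finishes before the other starts give the expected result 2. In the other four, both cores load the old value and one increment is lost.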
Current debates
● Should we build chips with:
  ● Very fast/complicated (superscalar) processors?
    ● Hits a “power wall”, hard to increase clock frequency
  ● Many slower/simpler (thin) processors?
    ● Hard to program
● How to deal with energy consumption?
  ● Performance per Watt becomes key factor
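Performance per Watt is simply performance divided by power draw. The sketch below assumes performance is given in Gflop/s and power in Watts. Both machines and all their numbers are made up for illustration and are not real measurements.

```python
def gflops_per_watt(performance_gflops, power_watts):
    """Energy efficiency: performance divided by power draw."""
    if power_watts <= 0:
        raise ValueError("power must be positive")
    return performance_gflops / power_watts


# Hypothetical machines (made-up numbers, not real measurements).
fat_core_machine = {"gflops": 1_000_000.0, "watts": 2_000_000.0}
thin_core_machine = {"gflops": 800_000.0, "watts": 400_000.0}

fat = gflops_per_watt(fat_core_machine["gflops"], fat_core_machine["watts"])
thin = gflops_per_watt(thin_core_machine["gflops"], thin_core_machine["watts"])
print(fat, thin)
```

In this invented comparison the thin-core machine is slower in absolute terms but delivers four times as much computation per Watt.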
Networks
● Wide area networks (WANs)
● Local area networks (LANs)
● Mobile networks
● Much more in Computer Networks class
Wide area networks
● ARPANET
  ● First computer network, connecting some US sites (1960s)
  ● Speeds measured in kbit/s
● Internet
  ● Based on standardized (IP) protocol suite
  ● Connect everyone/everything (Internet-of-things)
● Dedicated optical networks (light paths)
  ● 10 Gbit/s, point-to-point
Local Area Networks
● Ethernet: developed by Xerox PARC (1974)
  ● Speed increased from 10 Mbit/s to 100 Gbit/s
● Cluster computers use Ethernet or faster commodity networks
  ● Myrinet
  ● InfiniBand
An aside
● In Computer Science
  ● k(ilo) = 1024
  ● m(ega) = 1024²
  ● g(iga) = 1024³
  ● t(era) = 1024⁴
  ● p(eta) = 1024⁵
  ● e(xa) = 1024⁶
● All has to do with binary numbers
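The link with binary numbers is that 1024 = 2¹⁰, so each prefix step multiplies by another ten binary digits. A short sketch of the table above (the dictionary name is an assumption made for this example):

```python
PREFIXES = ["k", "m", "g", "t", "p", "e"]

# Each prefix is the next power of 1024, which is the same as 2 to the power 10*n.
binary_prefixes = {}
for n, prefix in enumerate(PREFIXES, start=1):
    binary_prefixes[prefix] = 1024 ** n
    assert 1024 ** n == 2 ** (10 * n)

print(bin(1024))              # a 1 followed by ten 0s
print(binary_prefixes["m"])   # 1024 squared
```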
DAS-4
● Dual quad-core Xeon E5620
● 24-48 GB memory
● 1-10 TB disk
● Infiniband + 1 Gb/s Ethernet
● Various accelerators (GPUs, multicores, …)
● Scientific Linux
● Built by ClusterVision
● Sites: VU (74), TU Delft (32), Leiden (16), UvA/MultimediaN (16/36), ASTRON (23)
● SURFnet6: 10 Gb/s light paths
Mobile computing
● Laptops, sensors, smartphones, tablets
● Many forms of mobile networks
  ● Wifi (local range)
  ● 3G, 4G (lower bandwidth, high coverage)
  ● Bluetooth (for pairing devices)
● Ultimately: ubiquitous computing?
  ● Vision by Mark Weiser (1988)
  ● “machines that fit the human environment instead of forcing humans to enter theirs”
Application developments
● There is a “data explosion” in many application areas
  ● Huge amounts of data (up to Petabytes/year)
  ● Very complicated/heterogeneous data
● Demand for computing
  ● Model (simulate) designs on a computer
Data explosion
● Society:
  ● Web, social networks
● Industry, economy:
  ● Banks, stock markets
● Science
  ● LHC (“Higgs particle”)
    ● Data stored on world-wide “grid”
  ● Bioinformatics (next generation sequencing)
  ● Astronomy: software telescopes (LOFAR, SKA)
Computing demands
● Computational science:
  ● Modeling ozone layer, climate, ocean, human brain
  ● Simulating galaxies
● Engineering:
  ● Aircraft modeling, designing F1 cars (Virgin VR01)
  ● TVs (mostly software), embedded systems
● Games and multimedia:
  ● Computer chess (Deep Blue)
  ● Watson (Jeopardy)
  ● Analyzing multimedia content
  ● Generating movies
Pixar’s “Up” (2009)
Rendering the whole movie (96 minutes) would take about 94 years on 1 PC
(4 frames per day; 1 second of film takes 6 days; 1 minute takes about a year)
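The figures above can be checked with a few lines of arithmetic. The movie length and the rendering speed come from the slide. The frame rate of 24 frames per second is an assumption (the standard cinema rate), chosen because it matches "1 second takes 6 days".

```python
MINUTES = 96             # movie length, from the slide
FRAMES_PER_DAY = 4       # rendering speed on 1 PC, from the slide
FRAMES_PER_SECOND = 24   # assumption: standard cinema frame rate

total_frames = MINUTES * 60 * FRAMES_PER_SECOND
days_per_film_second = FRAMES_PER_SECOND / FRAMES_PER_DAY
days_per_film_minute = 60 * days_per_film_second
total_days = total_frames / FRAMES_PER_DAY
total_years = total_days / 365.25

print(total_frames, days_per_film_second, days_per_film_minute, round(total_years, 1))
```

One minute of film takes 360 days, which is where "1 minute per year" comes from, and the full movie lands between 94 and 95 years.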
Some fundamental Computer Science topics (1)
● Operating systems:
  ● Windows, Linux, Minix (Andy Tanenbaum)
● Programming languages and systems
  ● Fortran, Cobol, C, Java, Python … (thousands)
What happens if you ask a computer scientist to solve a problem?
He/she will come back 3 months later, with …
a new programming language ideally suited for solving your problem
Some fundamental Computer Science topics (2)
● Security
  ● Preventing/detecting attacks, privacy, etc.
● (Semantic) web technology
  ● Finding and reasoning about content on the web
● Cloud computing
  ● Store data and programs remotely, in the Cloud
Some fundamental Computer Science topics (3)
● Artificial intelligence
  ● E.g., automatic machine learning
● Databases
  ● Storing and searching huge amounts of data
● Logic, modelling, graph theory, complexity
  ● Essential for many applications
Conclusion
● Modern Computer Science deals with hectic developments in technology and applications
  ● Both provide us with many research problems
  ● Application-driven vs technology-driven research
● There are also many fundamental CS problems
Literature (Context)
● Ami Marowka: Back to Thin-Core Massively Parallel Processors, IEEE Computer, December 2011, pp. 49-54
QUESTIONS
● Explain what “thin cores” are
● What are the arguments for and against using “thin cores”?
● What role does energy consumption play in this discussion?
● Compute the energy efficiency of the current 10 largest supercomputers on www.top500.org
● Which type of machine is currently the most energy efficient?
● Compare the maximum performance of the current #1 against the performance of the #1 of 10 years ago. What is the difference?