Computer Science — An Overview J. Glenn Brookshear Chapter Two Data Manipulation.

Post on 21-Dec-2015


Computer Science — An Overview

J. Glenn Brookshear

Chapter Two: Data Manipulation

Outline

• The Central Processing Unit

• The Stored-Program Concept

• Program Execution

• Arithmetic/Logic Instructions

• Communicating with Other Devices

• Other Architectures

Turing Machines

• Turing’s computer design was for theoretical purposes.

• It has the three central components:
  – A processor
  – A program (instructions for the processor)
  – A memory (to store the data)

The Central Processing Unit (CPU)

Inside the Computer

• Every computer can be broken down into three parts:
  – CPU (Central Processing Unit)
  – Memory
  – I/O (Input/Output) devices: peripherals

[Figure: CPU, memory (RAM, ROM), and peripherals (monitor, printer, etc.)]

Computer System

• Memory Unit

• Arithmetic and Logic Unit (ALU)

• Control Unit

• Input Unit

• Output Unit

[Figure: input unit, output unit, main memory, secondary memory, and the CPU (control unit and ALU)]

Input/Output Units

• Provide a means of communicating between users and the CPU.

• What kinds of input/output (I/O) devices are used?
  – Input: keyboard, mouse, scanner, bar code reader, touch pad, microphone
  – Output: printer, monitor, speaker

Central Processing Unit (CPU)

• Arithmetic/logic unit (ALU): performs the arithmetic and logic operations

• Control unit: coordinates the machine’s activities

• Registers: store temporary data

Figure 2-1

Registers

• General-purpose registers
  – R0, R1, R2, ...
  – Register banks: a set of registers (R0..R7)

• Special-purpose registers
  – Program Counter (PC)
  – Instruction Register (IR)
  – Program Status Word (PSW)

The Registers of a Computer

[Figure: register layout with general registers R0, R1, R2, ... and special registers S, IR, PC, PSW, ...; regions labeled Instructions and Workspace]

Bus

• Buses are used to communicate between the computer components.
  – Data bus
  – Address bus
  – Control bus

[Figure: Bus architecture. The CPU, memory, and I/O devices are connected by the data bus, control bus, and address bus.]

The Operation of the Bus

• For a device (memory or I/O) to be recognized by the CPU, it must be assigned an address.
  – The address of every device must be unique.
  – The CPU puts the address on the address bus, and the decoding circuitry finds the device.
  – The CPU uses the data bus either to get data from that device or to send data to it.
  – The control bus is used to provide read or write signals.

Data Bus

• The more data lines available (the wider the data bus), the better the CPU.
  – An 8-bit bus can send 1 byte at a time.
  – 8 bits (slow), 16 bits, 32 bits, 64 bits (fast).

• Data buses are bidirectional.

8 lines for an 8-bit bus

Address Bus

• The more address lines available, the larger the number of devices that can be addressed.
  – Example: 8 bits (small), 16 bits, 32 bits, 64 bits (large).
  – A 16-bit address bus can indicate 2^16 = 64K bytes of addressable memory.
  – This holds regardless of the size of the data bus.

• Address buses are unidirectional.
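The relationship between address lines and addressable locations can be checked with a few lines of Python. This is only a sketch; the function name `addressable_locations` is made up for illustration.

```python
def addressable_locations(address_lines: int) -> int:
    """Return how many distinct addresses the given number of address lines can select."""
    # Each line carries one bit, so n lines can form 2**n different bit patterns.
    return 2 ** address_lines


for lines in (8, 16, 32):
    count = addressable_locations(lines)
    print(f"{lines}-bit address bus: {count} locations")
```

With 16 lines the result is 65536, which is the 64K quoted on the slide (64 x 1024).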

Capacity of CPU

• The address bus and data bus determine the capacity of a given CPU.
  – The processing power of a computer is related to the size of its buses.
  – The number of address lines determines the number of locations with which a CPU can communicate.
  – More data and address lines mean a more expensive CPU and computer.

Machine Clock

• One bit is sent on a wire in one clock period.

• The clock is used to synchronize the work of the components of the machine.

• The clock rate largely determines the performance of the computer.

[Figure: a clock generator producing a 1.4 GHz clock signal that alternates between 1 and 0]

Data Transmission

• Parallel communication
  – Several data lines send data together.
  – Ex: LPT1 sends 8 bits of data in parallel

• Serial communication
  – A single data line sends one bit per time unit.
  – Ex: COM port, USB (Universal Serial Bus)

• Note that control lines are needed for the negotiation between the two ends.

Data Transmission Rate

• bps: bits per second

• Kbps, Mbps, Gbps

• Examples:
  – PCI bus (32 lines): 33 MHz, 132 MB/sec
  – USB: 12 Mbps
  – Modem (modulator-demodulator): download 56 kbps, upload 33.6 kbps
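The PCI figure above follows from the rule that one bit travels on each wire per clock period. The sketch below redoes that arithmetic; the function names are illustrative, and it assumes an ideal bus with no protocol overhead.

```python
def parallel_bus_bits_per_sec(lines: int, clock_hz: int) -> int:
    """Peak rate of a parallel bus: one bit per line per clock period."""
    return lines * clock_hz


def bits_to_bytes_per_sec(bits_per_sec: float) -> float:
    """Convert a bit rate to a byte rate (8 bits per byte)."""
    return bits_per_sec / 8


pci_bits = parallel_bus_bits_per_sec(32, 33_000_000)   # 32 lines at 33 MHz
print(bits_to_bytes_per_sec(pci_bits) / 1_000_000)     # MB/sec for PCI
print(bits_to_bytes_per_sec(12_000_000) / 1_000_000)   # MB/sec for 12 Mbps USB
```

The first line printed is 132.0, matching the 132 MB/sec on the slide; the serial USB link at 12 Mbps works out to 1.5 MB/sec.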

Parallel vs. Serial

The Central Processing Unit

- Instruction Set

How does the Computer Work

• Figure 2.2 shows the algorithm (i.e., the list of execution steps) for addition; Figure 2.3 does the same for division.

• How does the computer perform the task?

• The processor provides a collection of instructions that perform basic operations: the instruction set.

Figure 2.2 Adding values stored in memory (instructions used: LOAD, STORE, ADD, HALT)

Figure 2.3 Dividing values stored in memory (instructions used: conditional jump, DIVIDE)

Types of Instructions

1. Data Transfer: get or store data

2. Arithmetic/Logic: perform basic operations (Chapter 2.4)

3. Control: jump or branch

Data Transfer Instruction

• Memory access: copy data from one place to another
  – LOAD (memory → register), STORE (register → memory), MOV
  – See Example 2-2

• I/O Access

Arithmetic/Logic Instruction

• Perform operations on registers to get results (which are, of course, put in registers)

1. Logical
  – AND, OR, NOT, XOR

2. Arithmetic
  – SHIFT, ROTATE
  – ADD, SUB, MUL, DIV
  – Example 2-2

Control Instruction

• Jump or branch
  – JUMP
  – Unconditional jumps and conditional jumps
    • Unconditional jump: always jump (ex: calling a subroutine)
    • Conditional jump: jump if some requirement is met (ex: if-else, for-loop)
  – Example 2-3

Arithmetic/Logic Instructions

- Logic Instruction

Logic Operation - AND

• AND

• AND is used as a mask (network mask, interrupt mask) or to test a bit map (each bit denotes whether we own an object).

      10011010          10011010
  AND 11001001      AND 00000001
      --------          --------
      10001000          00000000

Check whether the bit is 1; if it is, we own card A.

Logic Operation - OR

• OR

      10011010          10011010
   OR 11001001       OR 11110000
      --------          --------
      11011011          11111010

• Use 0 to mask the needed bits: the 4 bits ORed with 0 are the ones we need, and they pass through unchanged.

Logic Operation - XOR

• XOR (exclusive OR)

• XOR with all 1s is used to find the 1’s complement.

      10011010          10011010
  XOR 11001001      XOR 11111111
      --------          --------
      01010011          01100101  (1’s complement)
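The bit patterns on the three logic slides can be reproduced directly with Python's bitwise operators. This is a small sketch; `to_bits` and the `CARD_A` mask are names chosen here for illustration, with bit 0 assumed to stand for "card A".

```python
def to_bits(value: int) -> str:
    """Format the low 8 bits of value as a binary string."""
    return format(value & 0xFF, "08b")


value = 0b10011010

print(to_bits(value & 0b11001001))  # AND with a general mask
print(to_bits(value | 0b11110000))  # OR: force the high 4 bits to 1, keep the low 4
print(to_bits(value ^ 0b11111111))  # XOR with all 1s: the 1's complement

CARD_A = 0b00000001                  # hypothetical bit-map position for card A
owns_card_a = (value & CARD_A) != 0  # AND isolates that one bit
print(owns_card_a)
```

AND clears every bit where the mask holds 0, OR sets every bit where the mask holds 1, and XOR flips every bit where the mask holds 1.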

Logic Operation - SHIFT

• SHIFT right
  – Shift right = ÷2
  – 10011010 → 01001101 (shift right and insert 0)

• SHIFT left
  – Shift left = ×2 (as long as no 1 bit is shifted out)
  – 10011010 → 00110100 (shift left and insert 0)

Logic Operation - ROTATE

• ROTATE right: d7...d0 → d0d7...d1
  – 10011010 → 01001101

• ROTATE left: d7...d0 → d6...d0d7
  – 01001101 → 10011010

Logic Operation -ROTATE with carry bit

• Shift right with carry bit

• Shift left with carry bit

[Figure: 10011010 with carry flag 1 → 11001101; 10011010 with carry flag 0 → 01001101]
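The shift and rotate operations can be sketched on 8-bit values as follows. The function names are illustrative, and the carry version assumes the usual convention that the old carry enters at the top bit while the bit shifted out becomes the new carry.

```python
MASK = 0xFF  # keep results to 8 bits, like a one-byte register


def shift_right(x: int) -> int:
    return (x & MASK) >> 1            # a 0 enters at the left


def shift_left(x: int) -> int:
    return (x << 1) & MASK            # a 0 enters at the right, d7 is lost


def rotate_right(x: int) -> int:
    x &= MASK
    return (x >> 1) | ((x & 1) << 7)  # d0 wraps around to d7


def rotate_left(x: int) -> int:
    x &= MASK
    return ((x << 1) & MASK) | (x >> 7)  # d7 wraps around to d0


def rotate_right_through_carry(x: int, carry: int):
    x &= MASK
    result = (x >> 1) | ((carry & 1) << 7)  # old carry enters at d7
    return result, x & 1                    # old d0 becomes the new carry


print(format(shift_right(0b10011010), "08b"))
print(format(rotate_right(0b10011010), "08b"))
print(rotate_right_through_carry(0b10011010, 1))
```

Shifting 10011010 (154) right gives 01001101 (77), which is the division by 2; shifting it left drops the leading 1, so the doubling only holds when no 1 is shifted out.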

The Stored-Program Concept

Machine Language

• How do we communicate with the computer?

• Answer:
  – The coding system
  – The collection of instructions: a program

Execution in the Computer

1. Computers provide a collection of instructions (i.e., instruction set).

2. Programmers write programs (i.e., a list of instructions in some order.)

3. The computer reads the program and executes the instructions one by one.

4. The instruction set format depends on the structure of the computer.

Stored-program Concept

• John von Neumann

• If the control unit is designed to
  – extract the program from memory
  – and execute it,
  then a computer’s program can be changed merely by changing the contents of the computer’s memory (without changing the control unit).

Pseudo Machine

• Appendix C

• 16 registers (R0 - RF) and S, T, PC, IR
  – of length 1 byte
  – 4 bits are used to distinguish R0 - RF

• 256 bytes of memory

• Instruction set
  – 12 instructions
  – Every instruction occupies 2 bytes of memory.

Some Special Registers

• PC points to the next instruction.

• IR stores the current instruction to be decoded.

Figure 2.4 The architecture of the machine

The format of an Instruction

Figure 2.5 (example instruction shown as four hexadecimal digits: 1 3 4 7)

Instruction of Pseudo Machine

Opcode  Operand  Execution
1       RXY      load, R ← value in address XY
2       RXY      load, R ← the value XY
3       RXY      store, address XY ← R
4       0RS      move, S ← R
5       RST      add, R ← S + T

R (R0 - RF), S, T: registers

Examples of Instruction

1. 1347 ; load, R3 ← value in address 47

2. 2347 ; load, R3 ← value 47H

3. 70C5 ; or, R0 ← RC or R5

4. B7F3 ; jump to address F3 if R7 = R0

5. B0F3 ; jump to address F3 (R0 = R0 always holds)
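Because every instruction is two bytes (four hexadecimal digits), decoding amounts to slicing out 4-bit fields. The sketch below splits the examples above into opcode and operand; the helper names are illustrative.

```python
def decode(instruction: int):
    """Split a 16-bit instruction into its 4-bit opcode and 12-bit operand."""
    opcode = (instruction >> 12) & 0xF
    operand = instruction & 0xFFF
    return opcode, operand


def split_rxy(operand: int):
    """Operand form RXY: a register number and an 8-bit address or value."""
    return operand >> 8, operand & 0xFF


def split_rst(operand: int):
    """Operand form RST: three register numbers."""
    return operand >> 8, (operand >> 4) & 0xF, operand & 0xF


opcode, operand = decode(0x1347)
print(opcode, [hex(field) for field in split_rxy(operand)])  # load R3 from address 47

opcode, operand = decode(0x70C5)
print(opcode, [hex(field) for field in split_rst(operand)])  # or: R0, RC, R5
```

The first hex digit selects the operation, and the remaining three are interpreted according to that operation.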

An Example of Program

Step 1: 156C ; load, R5 ← value in 6C

Step 2: 166D ; load, R6 ← value in 6D

Step 3: 5056 ; add, R0 ← R5 + R6

Step 4: 306E ; store, address 6E ← R0

Step 5: C000 ; halt

Program Execution

Program Stored in Memory

• Our “add” program stored in memory starting at address A0.

Figure 2.7

The Machine Cycle

• Every instruction in memory is executed in three steps: fetch, decode, execute.
  – Each instruction has its micro-instructions (or micro-operations).
    • A micro-operation is an elementary operation that can be performed in parallel during one clock pulse period.
  – The CPU has separate internal units for performing fetch, decode, and execute.

Figure 2.6 The machine cycle (decode is done by the instruction decoder, execute by the ALU)

Example of Program Execution

1. PC=A0

2. IR=156C, PC=A2

3. IR=166D, PC=A4

4. IR=5056, PC=A6

5. IR=306E, PC=A8

6. IR=C000, PC=AA

Figure 2.7
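The trace above can be reproduced with a small simulator of the fetch/decode/execute cycle. This is a minimal sketch, not the book's machine: it implements only opcodes 1, 2, 3, 5, B, and C as described on the slides, treats addition as 8-bit wraparound (an assumption), and uses made-up input values 3 and 4 at addresses 6C and 6D.

```python
def run(memory: bytearray, start: int):
    """Run a program until HALT; return the registers and a trace of (IR, PC) pairs."""
    registers = [0] * 16
    pc = start
    trace = []
    while True:
        # Fetch: read the two-byte instruction at PC, then advance PC by 2.
        ir = (memory[pc] << 8) | memory[(pc + 1) & 0xFF]
        pc = (pc + 2) & 0xFF
        trace.append((ir, pc))
        # Decode: slice the instruction into 4-bit fields.
        opcode = ir >> 12
        r, s, t = (ir >> 8) & 0xF, (ir >> 4) & 0xF, ir & 0xF
        xy = ir & 0xFF
        # Execute.
        if opcode == 0x1:                      # load R from address XY
            registers[r] = memory[xy]
        elif opcode == 0x2:                    # load R with the value XY
            registers[r] = xy
        elif opcode == 0x3:                    # store R at address XY
            memory[xy] = registers[r]
        elif opcode == 0x5:                    # add: R <- S + T (8-bit wraparound assumed)
            registers[r] = (registers[s] + registers[t]) & 0xFF
        elif opcode == 0xB:                    # jump to XY if R equals R0
            if registers[r] == registers[0]:
                pc = xy
        elif opcode == 0xC:                    # halt
            return registers, trace
        else:
            raise ValueError(f"opcode {opcode:X} is not implemented in this sketch")


memory = bytearray(256)
memory[0xA0:0xAA] = bytes.fromhex("156C166D5056306EC000")  # the "add" program at A0
memory[0x6C], memory[0x6D] = 3, 4                          # hypothetical input values

registers, trace = run(memory, 0xA0)
for ir, pc in trace:
    print(f"IR={ir:04X}, PC={pc:02X}")
print("result at 6E:", memory[0x6E])
```

Since the program sits in the same memory as its data, loading different bytes at A0 changes what the machine does without touching the `run` loop, which is the stored-program idea in miniature.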

Communicating with Other Devices

I/O Controller

• Each peripheral device has its own controller.

• Controllers attach to the bus and communicate with the CPU.

Figure 2.8

I/O Management

1. How does the CPU address the I/O device?
  – Isolated I/O vs. memory-mapped I/O

2. Where is the data transmitted to/received from?
  – Buffer

3. Which device can control the communication process?
  – CPU? I/O? Buffer?

I/O Addressing

• How does the CPU address the I/O device?

• Two kinds of addressing methods:
  – Isolated I/O
  – Memory-mapped I/O

Isolated I/O

• Separate I/O addresses and memory addresses
  – Different address numbering, different decoder
  – Ex: a PC’s graphics card has I/O ports for basic control (I/O addresses 03B0-03BB, 03C0-03DF)

• Separate I/O and memory instructions
  – Ex: Intel uses IOR\, IOW\ for I/O access and MEMR\, MEMW\ for memory access

Memory-mapped I/O

• The I/O device uses a memory location as its address.

• The CPU uses memory read/write instructions to access I/O.

Figure 2.9 (the port is a memory address)
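A toy model may make the memory-mapped idea concrete: the same read/write path serves both RAM and a device, and only the address decides which one answers. The `Bus` and `Printer` classes and the port address F0 are invented for this sketch.

```python
class Printer:
    """A stand-in output device that records every byte written to its port."""

    def __init__(self):
        self.output = []

    def read(self) -> int:
        return 0  # this device has nothing to report

    def write(self, value: int) -> None:
        self.output.append(value)


class Bus:
    """Address decoder: mapped addresses go to a device, all others to RAM."""

    def __init__(self, size: int = 256):
        self.ram = bytearray(size)
        self.devices = {}

    def map_device(self, address: int, device) -> None:
        self.devices[address] = device

    def read(self, address: int) -> int:
        if address in self.devices:
            return self.devices[address].read()
        return self.ram[address]

    def write(self, address: int, value: int) -> None:
        if address in self.devices:
            self.devices[address].write(value)
        else:
            self.ram[address] = value


bus = Bus()
printer = Printer()
bus.map_device(0xF0, printer)  # hypothetical port address
bus.write(0x10, 7)             # an ordinary memory write
bus.write(0xF0, 65)            # the same operation, but it reaches the printer
print(bus.read(0x10), printer.output)
```

Under isolated I/O, by contrast, the device would live in a separate address space reached through separate instructions and control signals.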

Buffering

• Communication data is stored in the buffer.

• The transmitter/receiver reads/writes the data at its own working rate.
  – Usually, the CPU is fast and I/O is slow.

[Figure: CPU ↔ Buffer ↔ I/O]

Buffer Location

• A buffer can be a register, cache, FIFO RAM, dual-port RAM, main memory (MM), or disk.
  – Ex: a graphics controller reads its data from memory at 000A0000-000CBFFF and 58000000-5800FFFF

Control Mechanism

• How is the communication between the CPU and I/O devices controlled?

1. PIO (Programmed I/O)

2. Interrupt

3. DMA (Direct Memory Access)

PIO (Programmed I/O)

• The CPU controls the communication process.

• PIO is sometimes called the polling method.

Polling Concept

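[Figure: five telephones, numbered 1 to 5]

• Five telephones that never ring!
• You have to pick up each one in turn and listen to find out whether a call has come in.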

Polling for I/O

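• To control I/O this way, the CPU needs a test loop in which it asks (polls) each I/O device in turn for its status.

[Figure: a CPU connected to a row of controller cards]

Pros and Cons of Polling

• Advantages: simple and direct
• Disadvantages:
  – The program structure is hard to arrange
  – CPU efficiency drops because of fruitless polling
  – Cannot respond to I/O immediately (no real-time response)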
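The test loop described above might look like the following sketch. The `Controller` class and `poll_once` function are hypothetical names; the five controllers mirror the five silent telephones.

```python
from collections import deque


class Controller:
    """A device controller holding data that waits for the CPU."""

    def __init__(self, name: str):
        self.name = name
        self.pending = deque()

    def has_data(self) -> bool:
        return bool(self.pending)

    def take(self):
        return self.pending.popleft()


def poll_once(controllers):
    """One pass of the test loop: ask every controller in turn for its status."""
    serviced, wasted_checks = [], 0
    for controller in controllers:
        if controller.has_data():
            serviced.append((controller.name, controller.take()))
        else:
            wasted_checks += 1  # the CPU asked, but nothing was waiting
    return serviced, wasted_checks


phones = [Controller(f"phone{i}") for i in range(1, 6)]
phones[2].pending.append("hello")  # only phone3 has a call waiting

print(poll_once(phones))
```

Four of the five checks find nothing, which is the wasted CPU time listed among the disadvantages.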

Interrupt

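[Figure: five telephones, numbered 1 to 5; one of them rings]

• Telephones that ring!
• You only need to pick one up and deal with it when a call comes in.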

Interrupt (1)

• The peripheral or program that needs service sends a signal to the system.
• The I/O device itself sends an interrupt to notify the CPU of its status.

[Figure: a CPU connected to a row of controller cards]

Interrupt (2)

[Figure: the CPU and controller cards. An interrupt vector selects an entry in the interrupt vector table (01: mm, 02: pp, 03: ...), and that entry gives the address (mm) of the interrupt service routine.]

Interrupt Handling

• The OS takes control of the CPU.
• The OS saves the state of the interrupted process in its PCB.
• The OS uses the interrupt vector to look up the starting address of the interrupt service routine (ISR) in the interrupt vector table, and hands control of the CPU to the appropriate ISR.
• When the ISR finishes, it returns control of the CPU to the OS, and the OS selects the next process to run.
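The handling steps above can be sketched with a dictionary standing in for the interrupt vector table. Everything named here (`VECTOR_TABLE`, the two ISR functions, the process dictionary) is hypothetical and only illustrates the lookup-and-dispatch idea.

```python
def keyboard_isr(log):
    log.append("keyboard serviced")


def disk_isr(log):
    log.append("disk serviced")


# Interrupt vector table: vector number -> interrupt service routine.
VECTOR_TABLE = {0x01: keyboard_isr, 0x02: disk_isr}


def handle_interrupt(vector: int, running_process: dict, log: list) -> dict:
    """Follow the handling steps: save state, look up the ISR, run it, return to the OS."""
    pcb = dict(running_process)           # save the interrupted process state (the PCB)
    log.append(f"saved {pcb['name']} at pc={pcb['pc']}")
    isr = VECTOR_TABLE[vector]            # the vector indexes the table
    isr(log)                              # control passes to the ISR
    log.append("back in OS: select next process")
    return pcb


log = []
saved = handle_interrupt(0x02, {"name": "editor", "pc": 0xA4}, log)
print(log)
```

Unlike polling, no work is done for a device until it raises its vector.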

DMA (Direct Memory Access)

• The CPU, memory, and I/O share the system bus.

• The CPU is in charge of the bus.
  – Normally the CPU must control every data transfer between memory and I/O on the system bus.

• The DMA controller (DMAC) requests the system bus from the CPU. The DMAC then takes control and lets data transfer directly between memory and I/O on the system bus.
  – Meanwhile the CPU can work separately.

Steps of DMA Operation

1. The I/O device sends a DMA request to the DMAC.

2. The DMAC sends a system bus request to the CPU.

3. The CPU sends a system bus grant to the DMAC.

4. The DMAC replies to the I/O device with a DMA acknowledge.

5. The DMA transfer begins.
  – The CPU closes its I/O ports and address/data latches.
  – The DMAC controls the data transfer on the system bus.
  – The DMAC releases the bus.

DMA Operation Diagram

[Figure: the CPU, DMAC, memory, and I/O on the system bus: 1. DMA request; 2. system bus request; 3. system bus grant; 4. DMA acknowledge; 5. data transfer]

Other Architectures

Complex Instruction Set Computer (CISC)

• Traditional machine
  – Each new CPU generation becomes more and more complex
  – VAX, Intel Pentium Pro, x86

• Features
  – Variable instruction format and length
  – Many addressing modes
  – Large instruction set (100-300 instructions)

Reduced Instruction Set Computer (RISC)

• Since 1980, another viewpoint:
  – Faster processors with a cheaper design cost
  – Reliable and fast instruction execution
  – UltraSPARC, Motorola’s PowerPC (Mac)

• Features of RISC
  – Small and simple instruction set (< 100 instructions)
  – Fixed-length and fixed-format instructions
  – Few addressing modes
  – A single machine cycle per instruction

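CISC vs. RISC

• Which is the winner?

CISC
  Advantages: a task that needs many instructions is combined into one single instruction, which is easier for the programmer to write. To adapt the CPU to a different purpose, only the microprogram has to be replaced.
  Disadvantages: the complexity of the microprogram limits improvements in CPU performance and is a big challenge for CPU designers.

RISC
  Advantages: the instruction set is much smaller, so CPU designers can more easily use advanced techniques to improve CPU performance.
  Disadvantages: harder for the programmer to write, because more instructions are needed to complete a task.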

Pipeline

IF: Fetch  ID: Decode  EX: Execution

Non-pipeline
  Instruction 1  IF ID EX
  Instruction 2           IF ID EX
  Instruction 3                    IF ID EX

Pipeline
  Instruction 1  IF ID EX
  Instruction 2     IF ID EX
  Instruction 3        IF ID EX

Let the fetch/decode/execute units work in parallel

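Pipeline Architectures

Executing one instruction takes several steps, for example: fetch (read the instruction), decode (interpret it), execute (carry it out).

Advanced techniques can execute several instructions at the same time, or do the work for several instructions ahead of time.

• Pipelining: executing many instructions within the same time period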
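The benefit of overlapping the stages can be estimated with a simple cycle count. This sketch assumes an idealized machine in which every stage takes exactly one clock cycle and the pipeline never stalls; the function names are illustrative.

```python
def non_pipelined_cycles(instructions: int, stages: int = 3) -> int:
    """Each instruction finishes all of its stages before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int = 3) -> int:
    """The first instruction fills the pipeline; each later one finishes one cycle after."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


for n in (1, 3, 100):
    print(n, non_pipelined_cycles(n), pipelined_cycles(n))
```

For the three instructions drawn on the slide this gives 9 cycles without pipelining and 5 with it; as the instruction count grows, the ratio approaches the number of stages.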

Distributed System

• Parallel processing: the performance of several activities at the same time.
  – The tasks execute in several units.

• Two types of distributed systems:
  – Loosely coupled
  – Tightly coupled

• Throughput: the amount of work that can be completed per unit of time, used to evaluate CPU performance.

Tightly Coupled System

• A system contains more than one processor. These processors share the bus, the clock, and the storage devices.

• Messages are passed through shared memory.

• Also called a parallel system

[Figure: two processors sharing one memory]

Mesh and Hypercube

• Each processor performs the same operation and generates the outputs in parallel. For example, multiplication.

[Figure: processors connected as a hypercube and as a mesh]

Supercomputer

• 1997/5, Deep Blue of IBM
  – 32 nodes, 256 accumulators

• 2001/10, Terascale of Compaq
  – 750 servers, 3000 Alpha EV68 processors, Quadrics connections.

Multiprocessor

• In a multiprocessor, several processors are used, and the processing work is split between them.

• Each processor has some private memory for frequently accessed programs, data, and workspaces.

• The processors also share a single memory system, where most information is held.

Parallel Processing

• Traditional machines use the SISD architecture.
  – SISD: single-instruction stream, single-data stream (traditional machine)

• Parallel processing has two types:
  – SIMD: single-instruction stream, multiple-data stream (ex: multiplication for all nodes)
  – MIMD: multiple-instruction stream, multiple-data stream
    • Different instruction sequences are performed on different sets of data.

Loosely Coupled System

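• The processors do not share memory, bus, or clock.

• They are connected by telephone lines, cables, or buses, and communicate by message passing or remote procedure call (RPC).

[Figure: two processors, each with its own memory, connected by a communication line]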

Challenge of Parallel Processing

• Load balance
  – Dynamically allocating tasks to the processors so that all processors are used efficiently.

• Scaling
  – Dividing the present task into subtasks which are compatible with the available processors.

• Complexity
  – The cost of management grows exponentially when the task grows large.

Homework

• Social Issues: 7

• Problems: 7, 11, 13, 17, 23, 28, 38, 44, 46, 47