I308 Information Representation - IU South Bend: …hhakimza/I308/Notes/Part1.pdf · I308...

Post on 15-Feb-2018

1

I308 Information Representation

Dr. Hossein Hakimzadeh Computer Science and Informatics

IU South Bend

2

What is a Computer?

An electronic digital device that can store and process data.

A fast and accurate electronic symbol (or data) manipulating system that is designed to accept and store input data, process it, and produce output results.

A programmable, multi-use machine that accepts data (raw facts) and processes, or manipulates, it into information we can use.

3

How does a Computer work?

We must answer the following two questions:

1) How is data represented inside a computer? (Encoding)

2) How is data manipulated inside a computer? (Algorithms)

4

How is Data represented?

Encoding is the process of transforming information from one format into another. The opposite operation is called decoding. This is often used in many digital devices. (http://en.wikipedia.org/wiki/Encoding)

UPC (Universal Product Code)

Chinese Calligraphy

5

How is Data represented?

Encoding

http://www.unicode.org/charts/PDF/U2800.pdf

6

How is Data represented?

Encoding

http://en.wikipedia.org/wiki/DNA

http://en.wikipedia.org/wiki/DNA_sequencing

7

How is Data represented?

Encoding

8

How is Data represented?

Encoding

http://en.wikipedia.org/wiki/Musical_notation

9

How is Data represented inside the Computer?

Remember, a computer is an electronic digital device that can store and process data.

10

Digital vs. Analog?

Analog systems have a continuous range of values. Examples: vinyl records, analog clocks, the set of real numbers.

Digital systems have a set of discrete values. Examples: CDs and DVDs, digital clocks, the set of integers.

11

How is information represented inside the Computer?

Binary digits or BITs

(0’s and 1’s)

Why Binary Digits?

12

How is information represented inside the Computer?

Digital computers are designed to process data in numerical form. They can store and manipulate information such as numbers, characters, images, and sound using numbers.

The information inside the computer is expressed in the binary system.

Binary digits (bits) are 0’s and 1’s, and binary numbers are made up of them (e.g. 0, 1, 110, 11, 1010, and 1011 are all binary numbers).

Binary digits are easily expressed in the computer circuitry by the presence or absence of voltage. For example, 1 may mean 5 volts and 0 may mean 0 volts.

13

How is Data represented inside the Computer?

Bit (Binary digIT)

(A bit is a unit of storage in a computer) (A bit is a single binary digit. 0 or 1)

A Byte is 8 Bits

KiloByte (KB) = 2^10 or 1,024 bytes (approximately 1,000 bytes)

MegaByte (MB) = 2^20 bytes (approximately 1,000,000 bytes)

GigaByte (GB) = 2^30 bytes (approximately 1,000,000,000 bytes)

TeraByte (TB) = 2^40 bytes (approximately 1,000,000,000,000 bytes)

PetaByte (PB) = 2^50 bytes

14

Problem 1:

You just bought a 60 gigabyte drive. After formatting the drive, you found out that it is only 58.6 Gigabytes? What should you do?

(c) Copyright 2009, H. Hakimzadeh

15

Solution 1:

You just bought a 60 gigabyte drive. After formatting the drive, you found out that it is only 58.6 Gigabytes? What should you do?

Nothing!

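Most drive manufacturers use a gigabyte to mean one billion (10^9) bytes, while the computer reports sizes in binary units. 60,000,000,000 / (1024*1000*1000) = 58.59. (Dividing by the full binary gigabyte, 2^30 = 1,073,741,824 bytes, gives about 55.88.)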
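The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name reported_size is made up for illustration, and the two divisors are the one written on the slide and the binary gigabyte (2^30) from the units slide.

```python
def reported_size(advertised_bytes: int, unit_bytes: int) -> float:
    """Express a byte count in multiples of the given unit size."""
    return advertised_bytes / unit_bytes


# The manufacturer advertises 60 decimal gigabytes (60 * 10^9 bytes).
advertised = 60 * 10**9

# Divisor used on the slide: one binary factor and two decimal factors.
slide_figure = reported_size(advertised, 1024 * 1000 * 1000)

# Divisor when a gigabyte is taken as 2^30 bytes.
binary_gigabytes = reported_size(advertised, 2**30)

print(f"slide divisor:  {slide_figure:.2f}")
print(f"2^30 divisor:   {binary_gigabytes:.2f}")
```

Either way the drive is not defective; the two numbers describe the same number of bytes in different units.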

16

Encoding

Given that computers only understand binary numbers, in order to store and manipulate information inside a computer, we must find a way to encode information in binary.

This information may be NUMBERS, TEXT, or other types of data such as AUDIO, IMAGE, or VIDEO.

17

Encoding Text

Imagine our language was restricted to the following symbols (letters):

Source alphabet: {n, k, b, e, r, d, i}
Target alphabet: {0, 1}

Encoding:

n = 000
k = 001
b = 010
e = 011
r = 100
d = 101
i = 110

Decode: 101100110000001010011011100 = _________________
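A fixed-width code like this one can be decoded mechanically by cutting the bit string into 3-bit groups and looking each group up. The sketch below uses the table from the slide; the function names encode and decode are illustrative, and the sample word is a different one so the exercise above is left open.

```python
# Code table copied from the slide: every letter maps to exactly 3 bits.
CODE = {
    "n": "000", "k": "001", "b": "010", "e": "011",
    "r": "100", "d": "101", "i": "110",
}
WIDTH = 3


def encode(text: str) -> str:
    """Replace each letter with its 3-bit code."""
    return "".join(CODE[letter] for letter in text)


def decode(bits: str) -> str:
    """Split the bit string into 3-bit groups and map each back to a letter."""
    if len(bits) % WIDTH != 0:
        raise ValueError("bit string length must be a multiple of 3")
    inverse = {pattern: letter for letter, pattern in CODE.items()}
    groups = [bits[i:i + WIDTH] for i in range(0, len(bits), WIDTH)]
    return "".join(inverse[group] for group in groups)


encoded = encode("bird")
print(encoded, "->", decode(encoded))
```

Note that 3 bits give 8 patterns but only 7 are assigned; decode raises KeyError on the unused pattern 111.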

18

Problem 2:

How many bits do we need in order to represent all the 26 upper case English letters?

How many bits do we need to represent the upper and lower case letters, plus all the digits (0 to 9), and the symbols (@, $, & and #)?
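One way to reason about both questions: k bits give 2^k distinct patterns, so we need the smallest k with 2^k at least as large as the number of symbols. A minimal sketch, with bits_needed as an illustrative name:

```python
def bits_needed(symbol_count: int) -> int:
    """Smallest k such that 2**k >= symbol_count (at least 1 bit)."""
    if symbol_count < 1:
        raise ValueError("need at least one symbol")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point.
    return max(1, (symbol_count - 1).bit_length())


upper_case_only = 26
# 26 upper case + 26 lower case + 10 digits + 4 symbols
full_set = 26 + 26 + 10 + 4

print(bits_needed(upper_case_only), bits_needed(full_set))
```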

19

ASCII Code

American Standard Code for Information Interchange

Why? …. Standardization between computers

7 or 8 bits are used to represent all the letters, numbers, and symbols that appear on the English language keyboard.

A = 01000001 = 65
B = 01000010 = 66
C = 01000011 = 67

http://www.asciitable.com/
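The same table entries can be reproduced in Python, whose built-in ord returns the numeric code of a character. The helper name to_ascii_bits is an assumption made for this sketch.

```python
def to_ascii_bits(text: str) -> list[str]:
    """Return the 8-bit binary string of each ASCII character in text."""
    return [format(byte, "08b") for byte in text.encode("ascii")]


for letter, bits in zip("ABC", to_ascii_bits("ABC")):
    # ord() gives the decimal code, format(..., "08b") the 8-bit pattern.
    print(letter, "=", bits, "=", ord(letter))
```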

20

UNICODE

16 to 32 bit codes vs. an 8 bit code. Why? … Internationalization of computers and applications.

Hello = U+0048 U+0065 U+006C U+006C U+006F

There are many UNICODE encoding standards. These include:

UTF-8 (stores ASCII characters in the same single byte that ASCII uses, then accommodates other characters as sequences of 2 to 4 bytes)

UTF-16 (2 bytes for most characters and 4 bytes for the rest; the older UCS-2 used a fixed 2 byte code)

UTF-32 or UCS-4 (4 byte code to store each code point or Unicode character)

It is important to know what encoding standard is being used before attempting to decode a string!
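A short Python sketch makes the difference between code points and encodings concrete. It uses only the standard str.encode and bytes.decode methods; the variable names are illustrative.

```python
text = "Hello"

# Code points are the same regardless of encoding.
code_points = [f"U+{ord(ch):04X}" for ch in text]

# The number of bytes depends on the encoding chosen.
sizes = {name: len(text.encode(name)) for name in ("utf-8", "utf-16-be", "utf-32-be")}

# Decoding with the wrong standard silently produces the wrong characters:
# U+00E9 (e with acute accent) is two bytes in UTF-8, which Latin-1 reads as two characters.
garbled = "\u00e9".encode("utf-8").decode("latin-1")

print(code_points)
print(sizes)
print(repr(garbled))
```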

21

Unicode Example:

Insert the following Unicode text into an HTML file and try to view it using a browser:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Unicode Characters</title>
</head>
<body>
hello. <BR>
Chinese: (&#x5FAA;&#x74B0;&#x6548;&#x7387;) <BR>
Persian: (&#1575;&#1604;&#1601;&#1576;&#1575;&#1740; &#1601;&#1575;&#1585;&#1587;&#1740;) <BR>
done.
</body>
</html>

(c) Copyright 2007, H. Hakimzadeh

23

24

Encoding Numbers:

How do we represent numbers?

Character “1” in the ASCII table is encoded as 00110001 (decimal 49).
Character “2” is 00110010 = 50
Character “3” is 00110011 = 51

Can we use the ASCII representation of numbers for the purpose of calculations? Can we add Character “1” and Character “2” to get “3”?
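The question can be tried directly. In this sketch (function names are illustrative), adding the character codes of "1" and "2" gives 99, which is the code of the letter "c", not of the digit "3". The characters have to be converted to numbers first.

```python
def add_as_characters(a: str, b: str) -> str:
    """Add the ASCII codes of two characters and read the result as a character."""
    return chr(ord(a) + ord(b))


def add_as_numbers(a: str, b: str) -> str:
    """Convert the digit characters to integers, add, and convert back to text."""
    return str(int(a) + int(b))


print(ord("1"), "+", ord("2"), "=", ord("1") + ord("2"))
print("as characters:", add_as_characters("1", "2"))
print("as numbers:   ", add_as_numbers("1", "2"))
```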

25

Encoding Numbers:

If the ASCII representation of numbers cannot be used, then we need a different encoding to be able to represent numbers and perform calculations.

What is a suitable encoding?

Hold on to this idea. We’ll come back to it……

26

Number Systems

Decimal (Base 10)

Binary (Base 2)

Octal (Base 8)

Hexadecimal (Base 16)

27

Decimal (Base 10)

Used by humans (probably because we have 10 fingers!)

Digits in base 10 are (0, 1, 2, 3, 4, ......, 9) (always from 0 to Base - 1)

Example: (254)

What is 254?

Digit   Place value   Value
2       2 * 10^2      200
5       5 * 10^1       50
4       4 * 10^0        4
                      ---
                      254

28

Binary (Base 2)

Used by digital computers (remember the ON / OFF states)

Digits in base 2 are (0, 1) (always from 0 to Base - 1)

Example: (Binary 110)

What is Binary 110?

Digit   Place value   Value
1       1 * 2^2       4
1       1 * 2^1       2
0       0 * 2^0       0
                      -
                      6

29

Octal (Base 8)

Used by people when they want to represent large binary numbers. It's easier to deal with.

Digits in base 8 are (0, 1, 2, ..., 7) (always from 0 to Base - 1)

Octal   Binary
0       000
1       001
2       010
3       011
4       100
5       101
6       110
7       111

30

Octal (Base 8)

Example: (Octal 251)

Digit   Place value   Value
2       2 * 8^2       128
5       5 * 8^1        40
1       1 * 8^0         1
                      ---
                      169

Octal 251 can be converted to binary very easily. Each digit will be represented by 3 binary digits (bits).

Octal digit   Binary
2             010
5             101
1             001

Octal 251 = Binary 010 101 001 = Decimal 169
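The positional expansion used in the last few slides works for any base. The following sketch (function names are illustrative) evaluates a digit string place by place and also performs the digit-by-digit octal to binary conversion; Python's built-in int(text, base) is used only as a cross-check.

```python
def to_decimal(digits: str, base: int) -> int:
    """Evaluate a digit string in the given base (2 to 10) by positional expansion."""
    value = 0
    for digit in digits:
        d = int(digit)
        if d >= base:
            raise ValueError(f"digit {digit} is not valid in base {base}")
        # Shift everything seen so far one place to the left, then add the new digit.
        value = value * base + d
    return value


def octal_to_binary(digits: str) -> str:
    """Replace each octal digit with its 3-bit binary pattern."""
    return " ".join(format(int(digit, 8), "03b") for digit in digits)


print(to_decimal("254", 10), to_decimal("110", 2), to_decimal("251", 8))
print(octal_to_binary("251"))
```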

31

Hexadecimal (Base 16)

Similar to Octal, Hex numbers are used by people when they want to represent larger binary numbers. It's easier to deal with.

Digits in base 16 are (0, 1, 2, ...., 9, A, B, C, D, E, F) (always from 0 to Base - 1)

To keep each digit as one character we use the letters “A” through “F” as the numbers 10 to 15. (A = 10, B = 11, C = 12, D = 13, E = 14, F = 15)

Hexadecimal   Binary
0             0000
1             0001
2             0010
3             0011
4             0100
5             0101
6             0110
7             0111
8             1000
9             1001
A             1010
B             1011
C             1100
D             1101
E             1110
F             1111

32

Hexadecimal (Base 16)

Example: What is HEX 25A?

Digit   Place value   Value
2       2 * 16^2      512
5       5 * 16^1       80
A       10 * 16^0      10
                      ---
                      602

HEX 25A can be converted to binary very easily. Each digit will be represented by 4 binary digits (bits).

Hex digit   Binary
2           0010
5           0101
A           1010

HEX 25A = Binary 0010 0101 1010 = Decimal 602
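The conversion also runs in the other direction: group the bits in fours from the right and replace each group with one hex digit. A minimal sketch, with binary_to_hex as an illustrative name:

```python
def binary_to_hex(bits: str) -> str:
    """Group bits in fours (padding on the left) and map each group to a hex digit."""
    bits = bits.replace(" ", "")
    # Pad with leading zeros so the length is a multiple of 4.
    padded_length = -(-len(bits) // 4) * 4
    bits = bits.zfill(padded_length)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)


print(binary_to_hex("0010 0101 1010"))
print(int("25A", 16))
```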

33

Different Ways of Representing Binary Numbers:

Unsigned Integers

Signed Magnitude

1's Complement

2's Complement

34

Unsigned Integers (non-negative numbers)

With k bits, we can represent 2^k non-negative integers, ranging from 0 to 2^k - 1 (0 to 31 with 5 bits).

Unsigned Integer   Representation
0                  00000
1                  00001
2                  00010
3                  00011
4                  00100
5                  00101
6                  00110
7                  00111
8                  01000
9                  01001
10                 01010
11                 01011
12                 01100
13                 01101
14                 01110
15                 01111
16                 10000
17                 10001
18                 10010
19                 10011
20                 10100
21                 10101
22                 10110
23                 10111
24                 11000
25                 11001
26                 11010
27                 11011
28                 11100
29                 11101
30                 11110
31                 11111

35

Signed Magnitude

With k bits, we have 2^k bit patterns, representing integers ranging from -(2^(k-1) - 1) to +(2^(k-1) - 1) (-15 to +15 with 5 bits). Zero has two representations (+0 and -0). The leftmost bit is a sign bit (0 = positive, 1 = negative).

Signed Magnitude   Representation
0                  00000
1                  00001
2                  00010
3                  00011
4                  00100
5                  00101
6                  00110
7                  00111
8                  01000
9                  01001
10                 01010
11                 01011
12                 01100
13                 01101
14                 01110
15                 01111
-0                 10000
-1                 10001
-2                 10010
-3                 10011
-4                 10100
-5                 10101
-6                 10110
-7                 10111
-8                 11000
-9                 11001
-10                11010
-11                11011
-12                11100
-13                11101
-14                11110
-15                11111

36

1's Complement

With k bits, we have 2^k bit patterns, representing integers ranging from -(2^(k-1) - 1) to +(2^(k-1) - 1) (-15 to +15 with 5 bits). Zero has two representations (+0 and -0). Negative numbers are represented by taking the positive numbers and flipping all their bits.

1's Complement   Representation
0                00000
1                00001
2                00010
3                00011
4                00100
5                00101
6                00110
7                00111
8                01000
9                01001
10               01010
11               01011
12               01100
13               01101
14               01110
15               01111
-15              10000
-14              10001
-13              10010
-12              10011
-11              10100
-10              10101
-9               10110
-8               10111
-7               11000
-6               11001
-5               11010
-4               11011
-3               11100
-2               11101
-1               11110
-0               11111

37

2's Complement

With k bits, we can represent 2^k integers ranging from -2^(k-1) to +(2^(k-1) - 1) (-16 to +15 with 5 bits). Negative numbers are represented by taking the positive number, flipping all its bits, and then adding 1.

2's Complement   Representation
0                00000
1                00001
2                00010
3                00011
4                00100
5                00101
6                00110
7                00111
8                01000
9                01001
10               01010
11               01011
12               01100
13               01101
14               01110
15               01111
-16              10000
-15              10001
-14              10010
-13              10011
-12              10100
-11              10101
-10              10110
-9               10111
-8               11000
-7               11001
-6               11010
-5               11011
-4               11100
-3               11101
-2               11110
-1               11111
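The three signed representations can be generated with a few lines of Python, which is a convenient way to check rows of the tables. This is a sketch under the slides' assumption of k = 5 bits; the function names are illustrative, and since Python integers have no negative zero, the -0 rows cannot be produced this way.

```python
def sign_magnitude(n: int, k: int = 5) -> str:
    """Sign bit followed by the magnitude in k - 1 bits."""
    if abs(n) > 2 ** (k - 1) - 1:
        raise ValueError("value out of range")
    sign = "1" if n < 0 else "0"
    return sign + format(abs(n), f"0{k - 1}b")


def ones_complement(n: int, k: int = 5) -> str:
    """Positive values as plain binary; negatives by flipping every bit of |n|."""
    if abs(n) > 2 ** (k - 1) - 1:
        raise ValueError("value out of range")
    if n >= 0:
        return format(n, f"0{k}b")
    # Flipping all k bits of |n| is the same as subtracting it from 2^k - 1.
    return format((2 ** k - 1) - abs(n), f"0{k}b")


def twos_complement(n: int, k: int = 5) -> str:
    """Flip the bits and add 1, which is the same as n modulo 2^k."""
    if not -(2 ** (k - 1)) <= n <= 2 ** (k - 1) - 1:
        raise ValueError("value out of range")
    return format(n % 2 ** k, f"0{k}b")


for value in (5, -5, -15):
    print(value, sign_magnitude(value), ones_complement(value), twos_complement(value))
```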

38

What are the advantages of one number system vs. another?

It is a lot easier to implement computer hardware that is able to calculate numbers in 2's complement.

Virtually all computers use the 2's complement number system to do binary arithmetic.

39

Binary Arithmetic (Using 2's complement)

Two binary numbers can be added, starting at the rightmost bit and adding the corresponding bits.

Carry                       1 1 1
First number (augend)     0 0 0 1 1
Second number (addend)  + 0 0 1 0 1
                          ---------
Sum                       0 1 0 0 0

If a carry is generated, it is carried one position to the left, just as in decimal arithmetic.
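The right-to-left procedure can be written out as a loop. The sketch below (add_binary is an illustrative name) adds two bit strings one column at a time and keeps track of the carry, using the numbers 00011 and 00101 from the slide.

```python
def add_binary(a: str, b: str) -> tuple[str, int]:
    """Add two bit strings column by column; return (sum bits, final carry)."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    sum_bits = []
    # Walk from the rightmost (least significant) column to the leftmost.
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        sum_bits.append(str(total % 2))  # the bit written in this column
        carry = total // 2               # the carry passed one position left
    return "".join(reversed(sum_bits)), carry


result, final_carry = add_binary("00011", "00101")
print(result, final_carry)
```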

47

Binary Arithmetic (Using 2's complement)

Examples:

   5        00101
 + 4      + 00100
 ===      -------
   9        01001

   7        00111
 + 4      + 00100
 ===      -------
  11        01011

   7        00111
 + 7      + 00111
 ===      -------
  14        01110

48

Overflow in 2's Complement

If the sum of two positive numbers carries into the sign bit (the leftmost bit), then an overflow has occurred and the sum becomes a negative number (incorrect).

15 + 15 = 30 (Note that with 5 bits this sum produces a negative number: 11110 is -2)

   01111
 + 01111
 -------
   11110
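Overflow can be detected with a simple rule: if both operands have the same sign and the result has the opposite sign, the true sum did not fit. A minimal sketch for k-bit 2's complement, with add_twos_complement as an illustrative name:

```python
def add_twos_complement(a: int, b: int, k: int = 5) -> tuple[int, bool]:
    """Add two k-bit 2's complement values; return (wrapped result, overflow flag)."""
    mask = (1 << k) - 1
    raw = (a + b) & mask  # keep only the low k bits
    # Reinterpret the k-bit pattern as a signed value.
    result = raw - (1 << k) if raw & (1 << (k - 1)) else raw
    same_sign_inputs = (a >= 0) == (b >= 0)
    overflow = same_sign_inputs and ((result >= 0) != (a >= 0))
    return result, overflow


print(add_twos_complement(15, 15))
print(add_twos_complement(7, 4))
```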

49

Adding Negative Numbers

In 2’s complement arithmetic, a carry generated by the addition of the leftmost bits is simply thrown away.

7 + (-7) = 0 (2's complement)

   000111
 + 111001
 --------
   000000

7 + (-6) = 1 (2's complement)

   000111
 + 111010
 --------
   000001

(-6) + (-6) = (-12) (2's complement)

   111010
 + 111010
 --------
   110100
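Throwing away the carry out of the leftmost bit amounts to keeping only the low k bits of the sum. The sketch below (add_and_discard_carry is an illustrative name) reproduces the three 6-bit examples above on bit strings.

```python
def add_and_discard_carry(a: str, b: str) -> str:
    """Add two equal-length bit strings and drop any carry out of the leftmost bit."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of bits")
    width = len(a)
    total = int(a, 2) + int(b, 2)
    # The mask keeps the low `width` bits, which discards the extra carry bit.
    return format(total & ((1 << width) - 1), f"0{width}b")


print(add_and_discard_carry("000111", "111001"))
print(add_and_discard_carry("000111", "111010"))
print(add_and_discard_carry("111010", "111010"))
```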

50

Hardware Circuitry for Representing and Manipulating Information

Gates and Circuits:

The NOT Gate

The AND Gate

The OR Gate

The XOR Gate

51

Hardware Circuitry for Representing and Manipulating Information

Gates and Circuits:

The NOT Gate

52

Hardware Circuitry for Representing and Manipulating Information

Gates and Circuits:

The AND Gate

53

Hardware Circuitry for Representing and Manipulating Information

Gates and Circuits:

The OR Gate

54

Hardware Circuitry for Representing and Manipulating Information

Gates and Circuits:

The XOR Gate
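The gate diagrams did not survive in this transcript, but the behavior of the four gates is fully described by their truth tables. A small Python sketch, modeling each gate as a function on the bits 0 and 1 (the function names are illustrative):

```python
def NOT(a: int) -> int:
    return 1 - a


def AND(a: int, b: int) -> int:
    return a & b


def OR(a: int, b: int) -> int:
    return a | b


def XOR(a: int, b: int) -> int:
    return a ^ b


print("A B | AND OR XOR")
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "|", AND(a, b), " ", OR(a, b), " ", XOR(a, b))
print("NOT 0 =", NOT(0), " NOT 1 =", NOT(1))
```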

55

Other Circuits:

Gates and Circuits:

The OR Gate

The OR gate can also be implemented as an AND gate with a few NOT gates:

A OR B = NOT( (NOT A) AND (NOT B) )
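This identity (one of De Morgan's laws) has only four input combinations, so it can be checked exhaustively. A minimal sketch, with or_from_and_not as an illustrative name:

```python
def or_from_and_not(a: bool, b: bool) -> bool:
    """OR built from one AND gate and three NOT gates."""
    return not ((not a) and (not b))


for a in (False, True):
    for b in (False, True):
        # Compare the constructed circuit with Python's own `or`.
        print(a, b, or_from_and_not(a, b), a or b)
```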

56

Other Circuits:

Gates and Circuits: The NOR Gate

57

Other Circuits:

Gates and Circuits:

The XNOR Gate

58

Other Circuits:

Gates and Circuits:

The XOR Gate

XOR gate made with 4 NAND gates
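The slide's diagram is not available here, so the wiring below is the commonly used four-NAND arrangement and is an assumption about what the figure showed. The sketch checks that it matches XOR for every input.

```python
def NAND(a: int, b: int) -> int:
    return 1 - (a & b)


def xor_from_nand(a: int, b: int) -> int:
    """XOR built from exactly four NAND gates."""
    n1 = NAND(a, b)    # shared first stage
    n2 = NAND(a, n1)
    n3 = NAND(b, n1)
    return NAND(n2, n3)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_from_nand(a, b))
```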

59

Hardware Circuitry for Representing and Manipulating Information

Building an ADDer Circuit

We proceed from the rightmost (least significant) bit position to the leftmost (most significant) bit position. In each position, we add three binary digits A, B, and Cin, and as a result we get two binary digits S (Sum) and Cout.

A and B are the bits from the two numbers we want to add. Cin is the "carry-in" from the previous bit position, and Cout is the "carry-out" to the next bit position.

    1 1 1 1 0 0 0 0   <--- Cin
    0 1 0 1 1 1 0 1   <--- A
  + 0 0 1 1 1 0 1 0   <--- B
  ------------------------
    1 0 0 1 0 1 1 1   <--- S
    0 1 1 1 1 0 0 0   <--- Cout

60

Hardware Circuitry for Representing and Manipulating Information

Building an ADDer Circuit using an XOR and an AND Circuit

64

Hardware Circuitry for Representing and Manipulating Information

Building an ADDer Circuit

Half Adder
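A half adder adds two bits and ignores any carry-in: the sum bit is A XOR B and the carry-out is A AND B. A minimal sketch in Python (the function name is illustrative):

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two input bits."""
    return a ^ b, a & b  # XOR gives the sum bit, AND gives the carry


for a in (0, 1):
    for b in (0, 1):
        s, cout = half_adder(a, b)
        print(f"A={a} B={b} -> S={s} Cout={cout}")
```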

65

66

Hardware Circuitry for Representing and Manipulating Information

A full adder
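A full adder also accepts a carry-in. One standard construction, assumed here because the slide's diagram is not available, chains two half adders and ORs their carries. Connecting one full adder per bit position gives a ripple-carry adder, sketched below with the 8-bit A and B values from the earlier adder slides.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    return a ^ b, a & b


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two input bits plus a carry-in."""
    partial, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial, cin)
    return total, carry1 | carry2


def ripple_add(a_bits: str, b_bits: str) -> tuple[str, str]:
    """Add two equal-length bit strings; return (S row, Cout row)."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same number of bits")
    carry = 0
    sums, carries = [], []
    # Start at the least significant bit; each carry-out feeds the next carry-in.
    for a, b in zip(reversed(a_bits), reversed(b_bits)):
        s, carry = full_adder(int(a), int(b), carry)
        sums.append(str(s))
        carries.append(str(carry))
    return "".join(reversed(sums)), "".join(reversed(carries))


s_row, cout_row = ripple_add("01011101", "00111010")
print("S    =", s_row)
print("Cout =", cout_row)
```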

67

Representing Real Numbers

How do we represent real numbers such as 2/3 or PI, which may have infinite repeating or non-repeating digit sequences?

We approximate using Floating Point representation. Conceptually this representation is very similar to scientific notation. For example 3456 = 3.456 * 10^3

Floating point numbers are generally allocated as either 32 or 64 bits. These bits are divided into 3 parts:

Sign bit

Exponent

Fraction

Floating point number = (+/-) (1 + Fraction) x 2^(Exponent - Bias)

68

Representing Real Numbers

Sign Bit:

The sign bit is the very first bit of the floating point number and determines whether the number is positive or negative. 0 = positive, 1 = negative

Exponent:

The exponent consists of the next 8 (32bFP) or 11 (64bFP) bits. The bias is a fixed number: 127 for 32bFP, 1023 for 64bFP

The Fraction:

The last 23 (32bFP) or 52 (64bFP) bits are the fraction. This is an unsigned binary string that represents binary places to the right of the binary point and is therefore a value of at least 0 and less than 1.

Sign (1 bit)   Exponent (8 bits)   Fraction (23 bits)
0 or 1         0000 0000           000 00000 00000 00000 00000

69

Example: Interpreting Floating Point Numbers

0 0000 0000 11000000000000000000000

This fraction represents 1*2^-1 + 1*2^-2

Therefore, our fraction value is 1/2 + 1/4 or .75

All floating point fractions are expressed in powers of 2

(+/-) (1 + Fraction) x 2^(Exponent - Bias)

70

Example: Interpreting Floating Point Numbers

0 10000001 101 00000 00000 00000 00000

Sign bit: 0 = positive
Exponent: 129
Bias: 127
Fraction: 1*2^-1 + 0*2^-2 + 1*2^-3 = 1/2 + 1/8 = .5 + .125 = .625

71

Example: Interpreting Floating Point Numbers

0 10000001 101 00000 00000 00000 00000

(+/-) (1 + Fraction) x 2^(Exponent - Bias)

+ (1.625) x 2^(129 - 127) = 1.625 x 2^2 = 1.625 x 4 = 6.5
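The same interpretation can be carried out in Python. The sketch below applies the formula from the slides to a 32-bit pattern; decode_float32 is an illustrative name, and the special cases of the format (exponent fields of all 0s or all 1s) are deliberately not handled. The standard struct module is used only to cross-check the result against the machine's own single-precision decoding.

```python
import struct


def decode_float32(bits: str) -> float:
    """Apply (+/-)(1 + Fraction) x 2^(Exponent - 127) to a 32-bit pattern.

    Assumes a normal number; exponent fields 00000000 and 11111111 are not handled.
    """
    bits = bits.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = -1.0 if bits[0] == "1" else 1.0
    exponent = int(bits[1:9], 2)
    fraction = int(bits[9:], 2) / 2 ** 23  # 23 fraction bits after the binary point
    return sign * (1 + fraction) * 2.0 ** (exponent - 127)


pattern = "0 10000001 101 00000 00000 00000 00000"
value = decode_float32(pattern)

# Cross-check: pack the same 32 bits into 4 bytes and let struct decode them.
raw = int(pattern.replace(" ", ""), 2).to_bytes(4, "big")
hardware_value = struct.unpack(">f", raw)[0]

print(value, hardware_value)
```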

72

Encoding Images

[Figure: a grid of colored pixels. Pixel colors in the order listed: red, red, blue, blue, green, green, green, black, black, black, black, blue, blue, green, green, green, green, green, green]

73

Encoding Images

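[Figure: a grid of colored pixels. Pixel colors in the order listed: red, red, blue, blue, green, green, green, black, black, black, black, blue, blue, green, green, green, green, green, green]

The image is represented as a 2D array and each color is represented as a number.

Pixel values in the same order: 53, 53, 79, 79, 32, 32, 32, 0, 0, 0, 0, 79, 79, 32, 32, 32, 32, 32, 32

(red = 53, blue = 79, green = 32, black = 0)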
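As a sketch of the idea, the color-to-number mapping from the slide can be applied to a small 2D array. The four codes are taken from the slide; the 2 x 3 sample grid is hypothetical, because the row and column layout of the original figure is not recoverable from this transcript.

```python
# Color codes as they appear on the slide.
PALETTE = {"black": 0, "green": 32, "red": 53, "blue": 79}


def encode_image(rows: list[list[str]]) -> list[list[int]]:
    """Replace every color name in a 2D grid with its numeric code."""
    return [[PALETTE[color] for color in row] for row in rows]


# Hypothetical 2 x 3 image, used only to illustrate the encoding.
sample = [
    ["red", "red", "blue"],
    ["green", "black", "blue"],
]

for row in encode_image(sample):
    print(row)
```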

74

Encoding Sound http://www.school-for-champions.com/science/sound.htm

75

Encoding Sound

In order to encode sound, we have to sample the wave.

Encoding of the above sound: 1, 3, 3, 1, -2, -4, -2, 1, 3, 3, 1, -2, -4, -2, 1
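Sampling means measuring the height of the wave at regular time intervals and rounding each measurement to a whole number. The sketch below is illustrative only: the sine wave, the sample rate, and the number of levels are assumptions chosen for the example, not the wave shown on the slide.

```python
import math
from typing import Callable


def sample_wave(wave: Callable[[float], float], duration: float,
                sample_rate: int, levels: int) -> list[int]:
    """Measure wave(t) at regular intervals and round to integer levels.

    `wave` is expected to return values between -1 and 1.
    """
    count = int(duration * sample_rate)
    return [round(wave(i / sample_rate) * levels) for i in range(count)]


def one_hertz_sine(t: float) -> float:
    """A 1 Hz sine wave, standing in for a real sound."""
    return math.sin(2 * math.pi * t)


# One second of sound, 8 samples per second, amplitudes scaled to -4..4.
print(sample_wave(one_hertz_sine, duration=1.0, sample_rate=8, levels=4))
```

A higher sample rate and more levels follow the original wave more closely, at the cost of more numbers to store.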

76

Encoding Video

Video can be encoded by combining images (typically 30 frames per second) plus one or more channels of sound.

Problem 1: (Video recording)

We want to record a 4 minute 400x400 video. Assume each frame is stored as a .bmp file with a 54 byte header, each pixel is 16 bits (64K colors), and the frame rate is 10/sec. Audio is stereo, 16 bit samples taken at 40 kHz.

What is the total size of the file?

77

Solution:

78

[Figure: the file is made of Video (Frame 1, ...) + Audio]

Step 1: Calculate Frame size

Frame size = Header + Image resolution * Size of each pixel
Header: 54B
Pixels: 400 * 400 = 160,000 pixels
Each pixel (color): 2B
Pixel data per frame = 160,000 * 2B = 320,000B
Total size for each image: 54B + 320,000B = 320,054B
Approx: 320 KB (taking 1 KB as roughly 1,000 B)

79

Step 2: Calculate Video size

Video size = Frame size * Frame rate * Duration
Video size = 320 KB * 10/sec * 60 sec/min * 4 min
           = 320 KB * 2,400
           = 768,000 KB

80

Step 3: Calculate Audio Size

Audio size = #Channels * Sample size * Sample rate * Duration
Audio size = 2 * 16b * 40,000/sec * (60 sec/min * 4 min)
           = 307,200,000b * (1B/8b)
           = 38,400,000B * (1KB/1024B)
           = 37,500 KB

81

Step 4: Video + Audio

File size = 768,000 KB + 37,500 KB = 805,500 KB = 805 MB (approx)
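The four steps can be reproduced exactly in bytes, which avoids the rounding introduced by the KB approximations. A sketch with the values taken from the problem statement; the constant and variable names are illustrative.

```python
# Values from the problem statement.
HEADER_BYTES = 54
WIDTH, HEIGHT = 400, 400
BYTES_PER_PIXEL = 2            # 16 bits per pixel
FRAMES_PER_SECOND = 10
DURATION_SECONDS = 4 * 60
AUDIO_CHANNELS = 2             # stereo
BYTES_PER_SAMPLE = 2           # 16 bit samples
SAMPLES_PER_SECOND = 40_000    # 40 kHz

# Step 1: one frame = header + pixel data.
frame_bytes = HEADER_BYTES + WIDTH * HEIGHT * BYTES_PER_PIXEL

# Step 2: all frames for the whole duration.
video_bytes = frame_bytes * FRAMES_PER_SECOND * DURATION_SECONDS

# Step 3: all audio samples for the whole duration.
audio_bytes = AUDIO_CHANNELS * BYTES_PER_SAMPLE * SAMPLES_PER_SECOND * DURATION_SECONDS

# Step 4: video plus audio.
total_bytes = video_bytes + audio_bytes

print(f"frame: {frame_bytes:,} B")
print(f"video: {video_bytes:,} B")
print(f"audio: {audio_bytes:,} B")
print(f"total: {total_bytes:,} B (about {total_bytes / 10**6:.0f} million bytes)")
```

The exact total is slightly above the 805,500 KB figure because the slides round each frame down to 320 KB before multiplying.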

82