Instructor: Ch. Bilal Ahmad Khan
Data Representation in Computers
Data Forms
Human communication
Includes language, images and sounds
Computers
Process and store all forms of data in binary format
Conversion to a computer-usable representation uses data formats
Data formats define the different ways human data may be represented, stored and processed by a computer
Data conversion and representation
Common Data Representations
Type of Data                Standard(s)
Alphanumeric                BCD, ASCII, EBCDIC, Unicode
Image (bitmapped)           GIF (Graphics Interchange Format)
                            TIF (Tagged Image File Format)
                            PNG (Portable Network Graphics)
                            JPEG
Image (object)              PostScript, SWF (Macromedia Flash), SVG
Outline graphics and fonts  PostScript, TrueType
Sound                       WAV, AVI, MP3, MIDI, WMA
Page description            PDF (Adobe Portable Document Format), HTML, XML
Video and Sound             QuickTime, MPEG-2, MPEG-4, RealVideo, WMV
Alphanumeric Data
Groups of data:
Characters: A, B, …, Z and a, b, …, z
Numbers/digits: 0 … 9
Punctuation: !, ;, :, ?, etc.
Special-purpose characters: $, @, #, *, …, &
Four coding systems/standards represent the groups above:
BCD (Binary-Coded Decimal)
ASCII (American Standard Code for Information Interchange)
EBCDIC (Extended Binary Coded Decimal Interchange Code)
Unicode
Standard Alphanumeric Formats
BCD
ASCII
EBCDIC
Unicode
Binary-Coded Decimal (BCD)
Four bits per digit
Digit Bit pattern
0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000
9 1001
Note: the following six bit patterns are not used:
1010
1011
1100
1101
1110
1111
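The digit table and the note on unused patterns can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the helper name `bcd_digit` is hypothetical. It decodes one 4-bit pattern and rejects the six patterns that BCD does not use.

```python
def bcd_digit(bits: str) -> int:
    """Decode one 4-bit BCD pattern (for example "1001") into a decimal digit."""
    if len(bits) != 4 or any(b not in "01" for b in bits):
        raise ValueError(f"expected 4 binary digits, got {bits!r}")
    value = int(bits, 2)
    if value > 9:  # 1010 through 1111 are not valid BCD digits
        raise ValueError(f"{bits} is not a valid BCD digit")
    return value


# Of the 16 possible 4-bit patterns, only 10 are used; list the other 6.
unused = [format(n, "04b") for n in range(16) if n > 9]
print(bcd_digit("1001"))  # 9
print(unused)
```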
BCD: Example
7093 (base 10) = ? (in BCD)
7    0    9    3
0111 0000 1001 0011
or
0111000010010011
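The same conversion can be sketched in Python by encoding each decimal digit separately as four bits. The function name `to_bcd` is an assumption made for illustration, and the sketch handles non-negative integers only.

```python
def to_bcd(number: int) -> str:
    """Return the BCD encoding of a non-negative integer, 4 bits per digit."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # Each decimal digit is converted on its own, unlike pure binary.
    return " ".join(format(int(digit), "04b") for digit in str(number))


print(to_bcd(7093))                   # 0111 0000 1001 0011
print(to_bcd(7093).replace(" ", ""))  # 0111000010010011
print(len(format(7093, "b")))         # 13 bits in pure binary, versus 16 in BCD
```

BCD spends more bits than pure binary for the same value, but each digit can be read back without any arithmetic.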
ASCII Features
Developed by ANSI (American National Standards Institute)
Defined in ANSI document X3.4-1977
7-bit code
8th bit is unused (or used for a parity bit or to indicate “extended” character set)
2^7 = 128 different codes
Two general types of codes:
95 are “Printing” codes (displayable on a console)
33 are “Control” codes (control features of the console or communications channel)
Represents the Latin alphabet, Arabic numerals and standard punctuation characters
The 8-bit Latin-1 extension adds a small set of accents and other European special characters
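The code counts and the parity-bit option can be illustrated with a short Python sketch. It assumes the usual split of the 7-bit range (printing codes 20 to 7E hex, everything else control) and even parity; the helper name `with_even_parity` is hypothetical, and real systems may use odd parity or leave the 8th bit at zero.

```python
TOTAL_CODES = 2 ** 7  # a 7-bit code has 128 possible values

# Printing codes run from space (0x20) to tilde (0x7E); the rest are control codes.
printing = [c for c in range(TOTAL_CODES) if 0x20 <= c <= 0x7E]
control = [c for c in range(TOTAL_CODES) if c < 0x20 or c == 0x7F]


def with_even_parity(code: int) -> int:
    """Return an 8-bit value whose top bit makes the number of 1 bits even."""
    if not 0 <= code < TOTAL_CODES:
        raise ValueError("expected a 7-bit code")
    parity = bin(code).count("1") % 2
    return (parity << 7) | code


print(TOTAL_CODES, len(printing), len(control))     # 128 95 33
print(format(with_even_parity(ord("a")), "08b"))    # 11100001
print(format(with_even_parity(ord("A")), "08b"))    # 01000001
```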
ASCII Table
      000  001  010  011  100  101  110  111
0000  NUL  DLE  SP   0    @    P    `    p
0001  SOH  DC1  !    1    A    Q    a    q
0010  STX  DC2  "    2    B    R    b    r
0011  ETX  DC3  #    3    C    S    c    s
0100  EOT  DC4  $    4    D    T    d    t
0101  ENQ  NAK  %    5    E    U    e    u
0110  ACK  SYN  &    6    F    V    f    v
0111  BEL  ETB  '    7    G    W    g    w
1000  BS   CAN  (    8    H    X    h    x
1001  HT   EM   )    9    I    Y    i    y
1010  LF   SUB  *    :    J    Z    j    z
1011  VT   ESC  +    ;    K    [    k    {
1100  FF   FS   ,    <    L    \    l    |
1101  CR   GS   -    =    M    ]    m    }
1110  SO   RS   .    >    N    ^    n    ~
1111  SI   US   /    ?    O    _    o    DEL
Most significant bits: column heading (3 bits)
Least significant bits: row label (4 bits)
e.g., ‘a’ = 1100001 (column 110, row 0001)
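The lookup ‘a’ = 1100001 can be reproduced by splitting a character's 7-bit code into the 3 column bits and the 4 row bits of the ASCII table. This is a minimal sketch; the function name `split_ascii` is an assumption made for illustration.

```python
def split_ascii(ch: str) -> tuple[str, str]:
    """Return (column bits, row bits) for a single 7-bit ASCII character."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
    bits = format(code, "07b")
    return bits[:3], bits[3:]  # 3 most significant bits, 4 least significant bits


print(split_ascii("a"))  # ('110', '0001')
print(split_ascii("Z"))  # ('101', '1010')
```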
95 printing codes: SP (010 0000) through ~ (111 1110)
33 control codes: NUL (000 0000) through US (001 1111), plus DEL (111 1111)
Alphabetic codes: A to Z are 100 0001 through 101 1010; a to z are 110 0001 through 111 1010
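One consequence of the alphabetic layout is that an uppercase letter and its lowercase partner sit in the same table row, so their codes differ in a single bit. The sketch below shows this in Python; `CASE_BIT` and `toggle_case` are hypothetical names, and the trick applies to ASCII letters only.

```python
CASE_BIT = 0b0100000  # 0x20, the only bit that differs between 'A' and 'a'


def toggle_case(ch: str) -> str:
    """Flip the case of one ASCII letter by toggling a single bit."""
    if len(ch) != 1 or not (ch.isascii() and ch.isalpha()):
        raise ValueError(f"{ch!r} is not a single ASCII letter")
    return chr(ord(ch) ^ CASE_BIT)


print(format(ord("A"), "07b"), format(ord("a"), "07b"))  # 1000001 1100001
print(toggle_case("A"), toggle_case("z"))                # a Z
```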
Numeric codes: 0 to 9 are 011 0000 through 011 1001
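The numeric codes share the column bits 011, and their four row bits equal the BCD pattern of the digit. A short Python sketch of that relationship follows; the helper name `digit_value` is an assumption made for illustration.

```python
def digit_value(ch: str) -> int:
    """Return the numeric value of one ASCII digit character."""
    if len(ch) != 1 or not ("0" <= ch <= "9"):
        raise ValueError(f"{ch!r} is not a single ASCII digit")
    return ord(ch) & 0b0001111  # keep only the 4 least significant bits


print(format(ord("7"), "07b"))  # 0110111
print(digit_value("7"))         # 7
```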
Punctuation, etc.: the remaining printing codes, such as ! (010 0001), @ (100 0000) and ~ (111 1110)
ASCII Table
Columns: most significant digit (MSD); rows: least significant digit (LSD)
   0    1    2    3    4    5    6    7
0  NUL  DLE  SP   0    @    P    `    p
1  SOH  DC1  !    1    A    Q    a    q
2  STX  DC2  "    2    B    R    b    r
3  ETX  DC3  #    3    C    S    c    s
4  EOT  DC4  $    4    D    T    d    t
5  ENQ  NAK  %    5    E    U    e    u
6  ACK  SYN  &    6    F    V    f    v
7  BEL  ETB  '    7    G    W    g    w
8  BS   CAN  (    8    H    X    h    x
9  HT   EM   )    9    I    Y    i    y
A  LF   SUB  *    :    J    Z    j    z
B  VT   ESC  +    ;    K    [    k    {
C  FF   FS   ,    <    L    \    l    |
D  CR   GS   -    =    M    ]    m    }
E  SO   RS   .    >    N    ^    n    ~
F  SI   US   /    ?    O    _    o    DEL
e.g., 74 (base 16) = 111 0100 = ‘t’
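The hexadecimal lookup can be followed step by step in Python: read the two hex digits as a number, show its 7-bit pattern, and map it back to a character. This is a small illustrative sketch using only built-in functions.

```python
code = int("74", 16)        # MSD 7, LSD 4
print(code)                 # 116
print(format(code, "07b"))  # 1110100
print(chr(code))            # t
```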
Example: “Hello, world”
Binary: 1001000 1100101 1101100 1101100 1101111 0101100 0100000 1110111 1101111 1110010 1101100 1100100
Hexadecimal: 48 65 6C 6C 6F 2C 20 77 6F 72 6C 64
Decimal: 72 101 108 108 111 44 32 119 111 114 108 100
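The three representations of a string can be generated rather than looked up by hand. The sketch below, which is not part of the original slides, encodes the text as ASCII and prints each character code in binary, hexadecimal and decimal.

```python
text = "Hello, world"
codes = list(text.encode("ascii"))  # one integer code per character

binary = " ".join(format(c, "07b") for c in codes)
hexadecimal = " ".join(format(c, "02X") for c in codes)
decimal = " ".join(str(c) for c in codes)

print(binary)
print(hexadecimal)  # 48 65 6C 6C 6F 2C 20 77 6F 72 6C 64
print(decimal)      # 72 101 108 108 111 44 32 119 111 114 108 100
```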
EASCII
EBCDIC
8-bit code
Developed by IBM for mainframe computers
Rarely used today; common in archival data
Character codes differ from ASCII
Conversion software to/from ASCII is available
Character  ASCII (hex)  EBCDIC (hex)
Space      20           40
A          41           C1
b          62           82
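Python's standard codecs can play the role of the conversion software. The sketch below assumes code page `cp037` (EBCDIC, US/Canada), which is only one of several EBCDIC variants; other code pages may assign some characters differently.

```python
# Compare the ASCII and EBCDIC (code page 037, an assumed variant) codes.
for ch in (" ", "A", "b"):
    ascii_code = ch.encode("ascii")[0]
    ebcdic_code = ch.encode("cp037")[0]
    print(f"{ch!r}: ASCII {ascii_code:02X}, EBCDIC {ebcdic_code:02X}")

# Round trip: text -> EBCDIC bytes -> text
ebcdic_bytes = "Hello".encode("cp037")
print(ebcdic_bytes.hex().upper())
print(ebcdic_bytes.decode("cp037"))  # Hello
```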
EBCDIC Table (1 of 2)
EBCDIC Table (2 of 2)
Unicode
Most common 16-bit form represents 65,536 characters
EASCII (Latin-1) is a subset of Unicode
Values 0 to 255 in the Unicode table
Multilingual: defines codes for
Nearly every character-based alphabet
Chinese, Japanese and Korean ideographs
Allows software to be adapted for local-language representations
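Both points can be illustrated with a brief Python sketch: characters from different scripts each fit in two bytes in the 16-bit form, and code values 0 to 255 match an 8-bit extended ASCII set. The sketch assumes the Latin-1 (ISO 8859-1) variant of EASCII and uses the `utf-16-be` codec to show the two-byte form; the sample characters are chosen for illustration.

```python
for ch in ("A", "é", "中"):
    code_point = ord(ch)
    two_bytes = ch.encode("utf-16-be")  # 16-bit form, most significant byte first
    print(f"{ch}: U+{code_point:04X} -> {two_bytes.hex().upper()}")

# Code points 0 to 255 coincide with the Latin-1 (ISO 8859-1) character set.
latin1_matches = all(chr(n).encode("latin-1") == bytes([n]) for n in range(256))
print(latin1_matches)  # True
print(2 ** 16)         # 65536 values available in a 16-bit code
```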
Two-byte Unicode Assignment Table