Quiz on Ch.2
Carry values
go here
1 1 0 1 1 1 0 0
+0 1 0 0 0 1 0 1
14
Binary addition (Check in base ten!)
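One way to check this sum is sketched below in Python. The helper name `add_binary` is made up for illustration; it adds the two bit strings column by column from the right, carrying as on the slide, and then confirms the result in base ten.

```python
def add_binary(a, b):
    """Add two binary strings column by column, right to left."""
    a, b = a.replace(" ", ""), b.replace(" ", "")
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    digits = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        digits.append(str(total % 2))  # bit written in this column
        carry = total // 2             # carry into the next column
    if carry:
        digits.append("1")             # final carry becomes a new leading bit
    return "".join(reversed(digits))

result = add_binary("1 1 0 1 1 1 0 0", "0 1 0 0 0 1 0 1")
# Check in base ten: the operands are 220 and 69
print(result, "=", int(result, 2))
```

Note that the sum needs nine bits, one more than either operand.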
Computers are multimedia devices, dealing with a vast array of information categories
Computers store, present, and help us modify:
• Numbers
• Text
• Audio
• Images and graphics
• Video
• Haptics (touch)
• Smell (machine olfaction: see link on our webpage)
• Chess positions (computer chess: see link on our webpage)
• Etc. etc. etc.
Positive integers
Negative integers
Real (various precisions)
Complex
Data compression
Reduction in the amount of space (memory) needed to store or transmit the data
Measured by the Compression ratio = The size of the compressed data divided by the size of the original data
Example: Two files are compressed with the ZIP utility:
• One is originally 200 MB, and becomes 150 MB after compression
• The other is originally 15 MB, and 11 MB after compression
Which file is better compressed?
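The comparison can be worked out directly from the definition above. This is a small sketch (the function name is illustrative): the file with the smaller ratio shrank more, so it is the better compressed one.

```python
def compression_ratio(compressed_size, original_size):
    """Size of the compressed data divided by the size of the original data."""
    return compressed_size / original_size

file_a = compression_ratio(150, 200)  # sizes in MB
file_b = compression_ratio(11, 15)    # sizes in MB

print(f"first file:  {file_a:.3f}")
print(f"second file: {file_b:.3f}")
print("better compressed:", "first" if file_a < file_b else "second")
```

Both sizes must be in the same unit before dividing, which matters when one is given in GB and the other in MB or KB.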
Quiz
A video file is originally 3.5 GB long.
We compress it to 350 MB. What is the compression ratio?
Quiz
A video file is originally 3.5 GB long.
We compress with a compression ratio of 0.2
What is the final size of the file?
Data compression
The Compression ratio is always between 0 and 1
(0% and 100%)
Compression techniques can be
Lossless → the data can be retrieved without any loss of the original information
Lossy → some information may be lost in the process (but it doesn’t matter for the purposes of the intended application)
Analog vs. Digital
Many quantities of interest in the real-world are infinite and continuous
• Zeno’s paradox: “That which is in locomotion must arrive at the half-way stage before it arrives at the goal. ” — Aristotle, Physics VI:9
But computers are finite and discrete!
How do we represent an infinite and continuous quantity in a computer?
Answer: We approximate → represent only enough to satisfy our computational needs and our senses of sight and sound.
Information can be represented in one of two ways: analog or digital
Analog data
A continuous representation, similar to the actual information it represents
Digital data
A discrete representation, breaking the information up into separate elements
Analog and Digital Information
Computers cannot work well with analog data, so we digitize the data
Digitizing = Breaking data into pieces and representing those pieces separately, by using a finite number of binary digits
Fine distinction: there are two operations performed:
• one in time (a.k.a. sampling)
• the other in amplitude (a.k.a. quantizing)
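The two operations can be sketched in a few lines of Python. Everything here is an illustrative assumption (the function name, a sine wave as the "analog" signal, a range of -1 to 1): sampling picks discrete instants in time, and quantizing rounds each measured value to one of a finite number of levels.

```python
import math

def digitize(signal, duration, sample_rate, bits, low=-1.0, high=1.0):
    """Sample `signal` in time, then quantize each sample to `bits` bits."""
    levels = 2 ** bits
    step = (high - low) / (levels - 1)        # size of one quantization interval
    codes = []
    for i in range(int(duration * sample_rate)):
        t = i / sample_rate                   # sampling: discrete points in time
        value = signal(t)
        code = round((value - low) / step)    # quantizing: nearest discrete level
        codes.append(max(0, min(levels - 1, code)))
    return codes

# One second of a 1 Hz sine wave, 8 samples per second, 3 bits per sample
codes = digitize(lambda t: math.sin(2 * math.pi * t), 1.0, 8, 3)
print(codes)
```

Raising `sample_rate` reduces the error in time; raising `bits` reduces the error in amplitude.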
Quiz
A digital compass reads the position of a robot 20 times a second.
What is the time elapsed between two consecutive readings?
Is this a sampling error or quantization error?
Analog and Digital Information
Why do we use binary to represent digitized data?
• Price: transistors are cheap to produce
–Remember Babbage!
• Reliability: transistors don’t get jammed
–Remember Babbage!
Electronic Signals
Important facts about electronic signals
• An analog signal continually fluctuates in voltage up and down
• A digital signal has only a high or low state, corresponding to the two binary digits
Figure 3.2
An analog and a digital signal
• All electronic signals (both analog and digital) degrade due to absorption in transmission lines
• The amplitude (voltage) of electronic signals (both analog and digital) fluctuates due to environmental effects, a.k.a. noise
Figure 3.3
Degradation of analog and digital signals
The difference is that digital signals can be regenerated!
Binary Representations
One bit can be either 0 or 1
• One bit can represent two things
Two bits can represent four things (Why?)
How many things can three bits represent?
How many things can four bits represent?
Binary Representations
How many things can n bits represent?
What happens every time you increase the number of bits by one?
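To explore these questions, the short sketch below (the helper name is illustrative) lists every bit pattern of a given length and counts them.

```python
from itertools import product

def all_patterns(n):
    """Return every distinct string of n bits."""
    return ["".join(bits) for bits in product("01", repeat=n)]

for n in range(1, 5):
    patterns = all_patterns(n)
    print(n, "bits:", len(patterns), "patterns")

print(all_patterns(2))
```

Each extra bit doubles the count, because every existing pattern can be extended with either a 0 or a 1.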
Quiz
A digital thermometer has a scale from 50 to 100 degrees (F). The temperature is represented on 7 bits. What is the smallest temperature difference that it can measure?
Is this a sampling error or quantization error?
Solution
A digital thermometer has a scale from 50 to 100 degrees (F). The temperature is represented on 7 bits. What is the smallest temperature difference that it can measure?
7 bits → 2^7 = 128 values → 127 intervals
(100-50)/127 = 0.394 deg/interval
Is this a sampling error or quantization error?
Quantization, since it’s in the vertical direction
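The same calculation, written as a small Python sketch (the function name is illustrative). It counts intervals rather than values, which is how it avoids the off-by-one mistake.

```python
def smallest_step(low, high, bits):
    """Smallest difference representable when [low, high] is coded on `bits` bits."""
    values = 2 ** bits
    intervals = values - 1   # n values mark off n - 1 intervals
    return (high - low) / intervals

step = smallest_step(50, 100, 7)
print(f"{step:.3f} degrees per interval")
```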
Beware of the “fencepost error”!
Image source: http://en.wikipedia.org/wiki/Fencepost_error
Quiz
A digital volt-meter has a scale from 10 to 30 volt (V).
The voltage is represented on 9 bits. What is the smallest voltage difference that it can represent?
Is this a sampling error or quantization error?
For next time:
• Read pp.54-57 of text
• Solve end of chapter ex. 1 through 5, 27, 28 in notebook
Quiz
A weather log file is originally 4.2 GB long.
We compress it to 150 KB. What is the compression ratio?
Quiz
An Analog-to-Digital Converter (ADC) accepts an input voltage between -3 and +12 V, and uses 10 bits to represent it digitally.
What is the precision?
Binary Representations
How many things can n bits represent?
Reversing the problem: How many bits are needed to represent N things?
• All desktops in this lab?
How many bits are needed to represent all 45 desktops in this lab?
The inverse of the power (2^n) is the logarithm:
What’s wrong with this answer?
Base is 2
How many bits are needed to represent all 45 desktops in this lab?
Alternative solution:
What’s the smallest power of 2 that is ≥ N?
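Both methods can be sketched in Python; the function names are illustrative. The logarithm method must round up, since a fractional number of bits is not possible, and the power method simply counts up until 2^bits is large enough.

```python
import math

def bits_needed_log(n_things):
    """Logarithm method: round log2(N) up to a whole number of bits."""
    return math.ceil(math.log2(n_things))

def bits_needed_power(n_things):
    """Alternative method: find the smallest power of 2 that is >= N."""
    bits = 0
    while 2 ** bits < n_things:
        bits += 1
    return bits

print(math.log2(45))          # not a whole number
print(bits_needed_log(45))    # rounded up
print(bits_needed_power(45))  # same answer
```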
QUIZ
How many bits are needed to represent all the days you’re planning to spend in college for your undergraduate degree?
– Use either method!
Remember: Data and Computers
Computers are multimedia devices, dealing with a vast array of information categories
Computers store, present, and help us modify:
• Numbers
• Text
• Audio
• Images and graphics
• Video
• Smell (machine olfaction!)
• Haptics (touch)
• Etc. etc. etc.
Positive integers
Negative integers
Real (various precisions)
Complex
3.2 Representing Numeric Data
Negative integers
Signed-magnitude representation
The sign represents the ordering, and the digits represent the magnitude of the number
Negative Integers
There is a problem with the sign-magnitude representation: plus zero and minus zero.
• More complex hardware is required!
Solution: Let’s not represent the sign explicitly!
“Complement” representation
Ten’s complement
Using two decimal digits, represent 100 numbers
• If unsigned, the range would be 0…?
• Let 0 through 49 represent 0 … 49
• Let 50 through 99 represent -50 … -1
Ten’s complement
Top: representations (the “label on the jar”)
Bottom: the actual numbers that are being represented (the “content of the jar”)
QUIZ
Given the following representations, find in each case what actual number is being represented:
• 51
• 52
• 96
• 47
Why the “complement” in ten’s complement?
100 – 50 = 50
100 – 49 = 51
……………………..
100 – 1 = 99
In general:
100 – i is the representation of – i
Positive number
Negative number
Let’s use ten’s complement!
To perform addition, add the numbers and discard any carry
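A minimal Python sketch of two-digit ten's complement follows. The function names are made up for illustration; the encoding rule is the "100 – i" rule from above, and addition discards the carry by keeping only the last two digits.

```python
def encode(number, digits=2):
    """Ten's complement representation (the "label on the jar")."""
    modulus = 10 ** digits
    half = modulus // 2
    if not -half <= number < half:
        raise ValueError(f"{number} does not fit in {digits} digits")
    return number % modulus          # for a negative -i this equals modulus - i

def decode(representation, digits=2):
    """Actual number represented (the "content of the jar")."""
    modulus = 10 ** digits
    half = modulus // 2
    return representation if representation < half else representation - modulus

def add(rep_a, rep_b, digits=2):
    """Add two representations and discard any carry."""
    return (rep_a + rep_b) % (10 ** digits)

print(encode(-1), encode(-49), encode(-50))
total = add(encode(48), encode(-1))  # 48 + 99 = 147, carry discarded
print(total, "represents", decode(total))
```

The adder never looks at the sign of its operands, which is the point of the complement scheme.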
Now you try it
48 (signed-magnitude)
- 1
47
How does it work in
the new scheme?
Important conclusions
In the complement representation:
• Positive and negative numbers are treated the same! We can add without knowing if they’re positive or negative!
• Subtraction is performed as addition, by changing signs: a – b = a + (-b). This greatly simplifies the hardware!
Two’s Complement
What do you notice about the left-most bit (MSB)?
Important: It’s not sign-magnitude!!
QUIZ: Two’s Complement
John has encountered this two’s complement number:
1000 0111
He says: The number is negative, because the MSB is one. The magnitude is just 111, which means 7. Therefore the number is -7 in decimal!
Is John correct?
Two’s complement on 4 bits (k = 4)
What is:
• The largest positive number?
• The largest negative number?
• -1?
Repeat the questions above for:
• 5 bits (k = 5)
• 6 bits
• 8 bits
• N bits (general N)
“Signposts” for two’s comp.
0000 0000 means ...
0111 1111 means ...
1000 0000 means ...
1111 1111 means ...
Formula to compute the negative of a number on k digits:
• for ten’s comp: Negative(I) → 10^k - I
• for two’s comp: Negative(I) → 2^k - I
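The two's complement formula can be sketched in Python as below. The function name is illustrative, and the range check is an addition that rejects numbers that do not fit in k bits.

```python
def twos_complement(number, k=8):
    """Return the k-bit two's complement representation of `number` as a bit string."""
    low, high = -(2 ** (k - 1)), 2 ** (k - 1) - 1
    if not low <= number <= high:
        raise OverflowError(f"{number} does not fit in {k}-bit two's complement")
    if number < 0:
        number = 2 ** k - (-number)   # Negative(I) -> 2^k - I
    return format(number, f"0{k}b")

print(twos_complement(7))
print(twos_complement(-7))
print(twos_complement(-1, k=4))
```

With k = 8 the representable range is -128 to 127, so values outside it raise an error instead of silently producing a wrong bit pattern.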
Practice: find the 8-bit two’s comp. representations:
7
-7
-110
200 (trick question!)
-129 (trick question!)
0
“Fast” two’s complement
Easier way to change the sign of a number:
Flip all bits, then add 1
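As a sketch, the shortcut looks like this in Python (the function name is illustrative). Any carry out of the leftmost bit is discarded, so the result always has as many bits as the input.

```python
def negate(bits):
    """Change the sign of a two's complement number: flip all bits, then add 1."""
    bits = bits.replace(" ", "")
    k = len(bits)
    flipped = "".join("1" if b == "0" else "0" for b in bits)
    result = (int(flipped, 2) + 1) % (2 ** k)   # add 1, discard any carry out
    return format(result, f"0{k}b")

print(negate("0000 0011"))
print(negate(negate("1001 0110")))  # negating twice returns the original
```

This gives the same answer as the 2^k - I formula, just without a subtraction.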
Try it out! Find the negatives of the following
two’s complement numbers:
0000 0011
1000 0000
1000 0001
1000 0011
1001 0110
1111 1111
This is how subtraction is implemented in
computer hardware! A – B = A + (-B)
Two’s complement
Addition and subtraction are the same as in unsigned:
  -127    1000 0001
+    1  + 0000 0001
  -126    1000 0010
Ignore any Carry out of the MSB:
    -1    1111 1111
+   -1  + 1111 1111
    -2    1111 1110
What happens if the computed value won't fit in the given number of bits k?
Overflow
If k = 8 bits, adding 255 to 3 in unsigned overflows:
  1111 1111
+ 0000 0011
1 0000 0010
… but adding -1 to 3 (the same bits, read as two’s complement) doesn’t!
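The sketch below (all names are illustrative) adds two 8-bit patterns and tests for overflow under two readings of the same bits. In unsigned, overflow means a carry out of the MSB; in two's complement, it means the true sum falls outside the range -128 to 127.

```python
def add_bits(a, b, k=8):
    """Add two k-bit strings; return (k-bit sum, carry out of the MSB)."""
    raw = int(a, 2) + int(b, 2)
    return format(raw % (2 ** k), f"0{k}b"), raw >= 2 ** k

def as_signed(bits):
    """Read a bit string as a two's complement number."""
    value = int(bits, 2)
    return value - 2 ** len(bits) if bits[0] == "1" else value

def unsigned_overflow(a, b, k=8):
    return add_bits(a, b, k)[1]

def signed_overflow(a, b, k=8):
    total = as_signed(a) + as_signed(b)
    return not -(2 ** (k - 1)) <= total <= 2 ** (k - 1) - 1

a, b = "11111111", "00000011"
print(add_bits(a, b))
print("unsigned overflow:", unsigned_overflow(a, b))
print("two's complement overflow:", signed_overflow(a, b))
print("127 + 3 in two's complement:", signed_overflow("01111111", "00000011"))
```

The same bit patterns overflow in one representation and not in the other.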
Conclusions:
Overflow is specific to the representation (unsigned, sign-mag., two’s comp., floating point etc.)
Overflow is something we should always expect (and make provisions for) when mapping an infinite world onto a finite machine!
3.3 Representing Text
Basic idea: There is a finite number of characters to represent, so list them all and assign each a (binary) number, a.k.a. code.
Character set
A list of characters and the codes used to represent each one
Computer manufacturers (eventually) agreed to standardize
– Read “Character Set Maze” on p.67
The ASCII Character Set
ASCII = American Standard Code for Information Interchange
ASCII originally used seven bits to represent each character, allowing for 128 unique characters
Later extended ASCII evolved so that all eight bits were used
• How many characters can be represented?
For next time:
• Read pp.58-62 of text
• Go through all examples presented in class and make sure you understand them. Ask next time if you have questions!
• Solve end-of-chapter ex. 6, 33 in notebook
The ASCII Character Set
The first 32 characters in the ASCII character chart do not have a simple character representation to print to the screen.
They are called control characters
8-bit (“extended”) ASCII Character Sets
By using 8 bits instead of 7, the number of codes extends from 128 to 256.
Extended ASCII is always a superset of 7-bit ASCII:
• The first 128 characters correspond exactly to 7-bit ASCII
Problem ???
Not in text
Extended ASCII: IBM code page 437
http://en.wikipedia.org/wiki/Code_page_437
Not in text
Extended ASCII: Latin-1
http://en.wikipedia.org/wiki/ISO/IEC_8859-1
Not in text
Trick QUIZ
What do these bits represent?
1101 1110
Unsigned integer: …
Signed integer (2’s complement): …
Fixed-point fraction
IBM 437 character: …
Latin-1 character: …
Floating-point fraction
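Several of these readings can be tried out in Python, as sketched below. The fixed-point and floating-point readings are left out because they depend on a format that is not specified here. The codec names `latin-1` and `cp437` are those of Python's standard library.

```python
bits = "1101 1110".replace(" ", "")
value = int(bits, 2)

unsigned = value
signed = value - 2 ** 8 if bits[0] == "1" else value  # 8-bit two's complement

raw = bytes([value])                # the same eight bits as a single byte
latin1_char = raw.decode("latin-1")
cp437_char = raw.decode("cp437")

print("unsigned:", unsigned)
print("signed:  ", signed)
print("Latin-1: ", latin1_char)
print("IBM 437: ", cp437_char)
```

The bits alone do not say which reading is intended; the interpretation comes from the context in which they are used.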
The Unicode Character Set
None of the Extended ASCII character sets were enough for international use (256!)
Unicode was originally designed to use 16 bits per character (the current standard has since grown beyond 16 bits)
How many characters can UNICODE represent?
Unicode is a superset of Latin-1: The first 256 characters correspond exactly to Latin-1 characters (http://unicode.org/charts/PDF/U0080.pdf )
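The superset claim can be checked with a short Python sketch (the function name is illustrative): decoding each of the 256 possible byte values as Latin-1 should give the Unicode character with the same number.

```python
def latin1_matches_unicode():
    """True if every Latin-1 code equals the Unicode code point of its character."""
    return all(bytes([code]).decode("latin-1") == chr(code) for code in range(256))

print(latin1_matches_unicode())
print(2 ** 16, "codes are available with 16 bits")
```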
Simplified Chinese has 6500!
Miscellaneous Characters in Unicode
See more online at the official Unicode site
QUIZ:
Your boss tells you to develop a webpage using the extended ASCII character set. What do you reply?
QUIZ Select all that apply:
The Latin-1 character set:
• Is a 5-bit representation
• Is a 7-bit representation
• Is a 16-bit representation
• Is an extension of ASCII
• Is an extension of Unicode
• Contains letters used in European languages
QUIZ Select all that apply:
The Unicode representation:
• Uses 16 bits
• Is an extension of ASCII
• Is an extension of Latin-1
• Is an extension of IBM-437
• Contains letters used in all world languages
• Can accommodate over 65,000 characters
• Is used in the majority of web pages today
Text Compression
Sometimes, assigning 8 or 16 bits to each character in a document uses too much memory
We need ways to store and transmit text efficiently
Text compression techniques:
– keyword encoding
– run-length encoding
– Huffman encoding
Keyword Encoding
Replace frequently used words with a single character. For example, here’s a substitution chart:
Keyword Encoding
Given the following paragraph:
We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed, That whenever any Form of Government becomes destructive of these ends, it is the Right of the People to alter or to abolish it, and to institute new Government, laying its foundation on such principles and organizing its powers in such form, as to them shall seem most likely to effect their Safety and Happiness.
Keyword Encoding
The encoded paragraph is:
We hold # truths to be self-evident, $ all men are created equal, $ ~y are endowed by ~ir Creator with certain unalienable Rights, $ among # are Life, Liberty + ~ pursuit of Happiness. $ to secure # rights, Governments are instituted among Men, deriving ~ir just powers from ~ consent of ~ governed, $ whenever any Form of Government becomes destructive of # ends, it is ~ Right of ~ People to alter or to abolish it, + to institute new Government, laying its foundation on such principles + organizing its powers in such form, ^ to ~m shall seem most likely to effect ~ir Safety + Happiness.
Keyword Encoding
How much did we compress?
Original paragraph: 656 characters
Encoded paragraph: 596 characters
Characters saved: 60 characters
Compression ratio: 596/656 = 0.9085
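Keyword encoding can be sketched in Python as below. The substitution chart in the code is an assumption: it is inferred from the encoded paragraph (these → #, that → $, the → ~, and → +, as → ^) and may not match the original chart exactly. As in the encoded paragraph, "the" is replaced even inside longer words, so "they" becomes "~y".

```python
# Assumed chart, inferred from the encoded paragraph. Longer words come first,
# so that "these" is replaced before the "the" inside it.
CHART = [("these", "#"), ("that", "$"), ("the", "~"), ("and", "+"), ("as", "^")]

def keyword_encode(text):
    for word, symbol in CHART:
        text = text.replace(word, symbol)
    return text

def keyword_decode(text):
    for word, symbol in CHART:
        text = text.replace(symbol, word)
    return text

original = ("We hold these truths to be self-evident, that all men are created "
            "equal, that they are endowed by their Creator")
encoded = keyword_encode(original)
print(encoded)
print(f"compression ratio: {len(encoded) / len(original):.3f}")
```

This sketch is case-sensitive and only works if the symbols never appear in the original text, which hints at why one chart cannot serve for all text.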
Could we use this substitution chart for all text?
Run-Length Encoding
A single character may be repeated over and over again in a long sequence.
Replace a repeated sequence with
– a flag character
– repeated character
– number of repetitions
Example: *n8
– * is the flag character
– n is the repeated character
– 8 is the number of times n is repeated
Run-Length Encoding
Encoding example:
Original text is
bbbbbbbbjjjkLLqqqqqq+++++
Encoded text is
*b8jjjkLL*q6*+5
(Why isn't L encoded? What about j?)
Compression ratio: 15/25 = .6
Q: This type of repetition doesn’t occur in English text; can you think of a situation where it is very likely to occur?
Run-Length Encoding
Decoding example:
Encoded text is
*x4*p4l*k7
Original text is
xxxxpppplkkkkkkk
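Both directions can be sketched in Python. The assumptions here are labeled in the code: the count is a single digit, runs of three or fewer characters are left alone because encoding them saves nothing, and the text itself never contains the flag character.

```python
from itertools import groupby

FLAG = "*"  # assumed not to appear in the text being encoded

def rle_encode(text):
    out = []
    for char, group in groupby(text):
        run = len(list(group))
        while run > 9:                        # single-digit counts: split long runs
            out.append(f"{FLAG}{char}9")
            run -= 9
        if run > 3:
            out.append(f"{FLAG}{char}{run}")  # flag, repeated character, count
        else:
            out.append(char * run)            # too short to be worth encoding
    return "".join(out)

def rle_decode(encoded):
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == FLAG:
            out.append(encoded[i + 1] * int(encoded[i + 2]))
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)

print(rle_encode("bbbbbbbbjjjkLLqqqqqq+++++"))
print(rle_decode("*x4*p4l*k7"))
```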
Huffman Codes
Conclusion: each language and each topic have specific frequencies of characters and groups of characters (digraphs, trigraphs etc.)
Why should the characters "X" or "z" take up the same number of bits as "e" or "t"?
Huffman codes use variable-length bit strings to represent each character. More frequently-used letters have shorter strings to represent them, and vice-versa!
Huffman encoding example
“ballboard” would be 1010001001001010110001111011
Compression ratio: 28/63 (compared with 7-bit ASCII)
QUIZ:
Encode “roadbed”
Huffman decoding
In Huffman encoding no character's bit string is the prefix of any other character's bit string. Codes with this property are called prefix codes.
To decode
look for match left to right, bit by bit
record letter when the first match is found
continue where you left off, going left to right
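The decoding steps above can be sketched in Python. The code table below is an assumption, since the table from the slides is not reproduced here: it is chosen so that "ballboard" encodes to the 28-bit string shown earlier, and the code for "e" in particular is a guess. It is a prefix code, so the first match found while reading left to right is always the right one.

```python
# Assumed table, consistent with the "ballboard" example; not necessarily
# the table used in the quizzes.
CODES = {"a": "00", "e": "01", "l": "100", "o": "110",
         "r": "111", "b": "1010", "d": "1011"}

def huffman_encode(text):
    return "".join(CODES[ch] for ch in text)

def huffman_decode(bits):
    lookup = {code: ch for ch, code in CODES.items()}
    letters = []
    current = ""
    for bit in bits:                  # left to right, bit by bit
        current += bit
        if current in lookup:         # first match found: record the letter
            letters.append(lookup[current])
            current = ""              # continue where we left off
    if current:
        raise ValueError("leftover bits do not form a complete code")
    return "".join(letters)

encoded = huffman_encode("ballboard")
print(encoded, len(encoded), "bits")
print(huffman_decode(encoded))
```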
QUIZ: Decipher the coded text using the Huffman table:
0010110101001110100110011011010011000111
01111001110100111
3.4 Representing Audio Data
We perceive sound when:
• a series of air waves causes a membrane in our ear (the eardrum) to vibrate, which
• is connected to the malleus, incus, and stapes (hammer, anvil, and stirrup), which
• are connected to the cochlea, which
• sends nerve signals to our brain.
Anatomy of the middle ear
Not in text
Source of figures: http://en.wikipedia.org/wiki/Evolution_of_mammalian_auditory_ossicles
Morganucodon, a.k.a. Morgie, discovered in the 1950s
Correspondence discovered in 1837 (!) through embryology
Analog Audio
Record players and stereos send analog signals to speakers to produce sound.
These signals are analog representations of the sound waves.
The voltage in the signal varies in direct proportion to the amplitude of the sound wave.
Remember: Sampling and Quantizing
Some information is lost, but a reasonable sound is reproduced
Not in text
From Analog to Digital Audio
Digitize the signal by sampling and quantizing
– periodically measure the voltage
– record the numeric value
How often should we sample? A sampling rate of about 40,000 times per second is enough to create a reasonable sound reproduction
44,100 for audio CD, to be exact
QUIZ: Sampling
A telephone voice channel is designed to allow frequencies up to 4,000 Hz (4 kHz).
How many samples must be collected every second?
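A common rule of thumb, the Nyquist criterion (not named on the slide, so treat it as an added assumption), is to sample at least twice as fast as the highest frequency to be captured. It agrees with the figure of about 40,000 samples per second for sound up to roughly 20 kHz. A one-line sketch:

```python
def min_sample_rate(max_frequency_hz):
    """Minimum samples per second: twice the highest frequency (Nyquist criterion)."""
    return 2 * max_frequency_hz

print(min_sample_rate(20_000))  # full range of human hearing
print(min_sample_rate(4_000))   # telephone voice channel
```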
Digital Audio on a CD
On the surface of the CD are microscopic pits and lands that represent binary digits
A low intensity laser is pointed at the disc. The laser light reflects strongly if the surface is smooth and poorly if the surface is pitted ??? (p.75 of text)
Pit height is about ¼ the laser’s wavelength
“destructive interference”
FYI: How the pits and lands are actually read
Not in text
Both halves of the laser beam reflect off pit or both halves off land. The two halves are “in phase”.
Half of the laser beam reflects off pit and half off land. The 2 halves are “out of phase”.
Audio Formats
– WAV, AU, AIFF, VQF, and MP3
MP3 (MPEG-1/MPEG-2 Audio Layer III) is dominant
– analyzes the frequency spread and discards information that can’t be heard by humans (>16 kHz)
– bit stream is compressed using a form of Huffman encoding to achieve additional compression
Is this a lossy or lossless compression?
QUIZ: Audio Formats
MP3
– analyzes the frequency spread and discards information that can’t be heard by most humans (>16 kHz)
How many MP3 samples are there in a 3-minute song?
If each sample is represented as one Byte, what is the total size of the file?
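One way to set up this calculation is sketched below. It rests on assumptions that the quiz does not state: a single (mono) channel, and a sampling rate of twice the 16 kHz cutoff. Real MP3 files are further compressed, so this is only the raw sample count.

```python
def sample_count(max_frequency_hz, seconds):
    """Samples needed at twice the highest frequency (assumed mono)."""
    return 2 * max_frequency_hz * seconds

samples = sample_count(16_000, 3 * 60)
size_bytes = samples * 1           # one byte per sample
print(samples, "samples")
print(size_bytes / 1_000_000, "MB")
```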
3.5 Representing Images and Graphics
Color
Perception of the frequencies of light that reach the retinas of our eyes
Human retinas have three types of color photoreceptor cone cells that correspond to the colors of red, green, and blue
Color is expressed as an RGB (red-green-blue) value = three numbers that indicate the relative contribution of each of these three primary colors
An RGB value of (255, 255, 0) maximizes the contribution of red and green, and minimizes the contribution of blue, which results in a bright yellow.
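RGB values are often written in hexadecimal, two hex digits per color. The sketch below (function names are illustrative) converts in both directions.

```python
def rgb_to_hex(red, green, blue):
    """Format an RGB value (each part 0..255) as a hex color code."""
    return f"#{red:02X}{green:02X}{blue:02X}"

def hex_to_rgb(code):
    """Split a hex color code into its red, green, and blue parts."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

print(rgb_to_hex(255, 255, 0))  # bright yellow
print(hex_to_rgb("#FF0000"))    # pure red
```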
Dark means low number, light means high.
Look at the snow and the black side of the barn!
Source: Wikipedia – RGB color model
Representing Images and Graphics
Can you understand this HTML code?
<font color="#FF0000"> Blah blah …
</font>
RGB Color Chart in hex
Representing Images and Graphics
color depth
The amount of data that is used to represent a color
HiColor
A 16-bit color depth: five bits used for each number in an RGB value with the extra bit sometimes used to represent transparency
TrueColor
A 24-bit color depth: eight bits used for each number in an RGB value
Tarleton Purple
The correct "Tarleton Purple" color codes:
• Hex: 0x4F 2D 7F
• RGB: 79 45 127
Source: http://www.tarleton.edu/webservices/guidelines.html
Indexed Color
A browser may support only a certain number of specific colors, creating a palette from which to choose
Figure 3.11
The Netscape color palette
How to digitize a picture
• Sample it → Represent it as a collection of individual dots called pixels
• Quantize it → Represent each pixel as one of 2^24 possible colors (TrueColor)
Resolution = The # of pixels used to represent a picture
Digitized Images and Graphics
Figure 3.12 A digitized picture composed of many individual pixels (whole picture, and a magnified portion of the picture)
See the pixels?
Hands-on: paste the high-res image from the previous slide in Paint, then choose ZOOM = 800
QUIZ: Images
A low-res image has 200 rows and 300 columns of pixels.
• What is the resolution?
• If the pixels are represented in TrueColor, what is the size of the file?
• Same question in HiColor
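A sketch of the size calculation follows (the function name is illustrative). It counts raw pixel data only, assuming no file header and no compression, with 24 bits per pixel for TrueColor and 16 for HiColor.

```python
def image_size_bytes(rows, columns, bits_per_pixel):
    """Raw size of an uncompressed image, ignoring any file header."""
    return rows * columns * bits_per_pixel // 8

pixels = 200 * 300
true_color = image_size_bytes(200, 300, 24)
hi_color = image_size_bytes(200, 300, 16)
print(pixels, "pixels")
print(true_color, "bytes in TrueColor")
print(hi_color, "bytes in HiColor")
```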
Two types of image formats
• Raster Graphics = Storage on a pixel-by-pixel basis
• Vector Graphics = Storage in vector (i.e. mathematical) form
Raster Graphics
GIF format
• Each image is made up of only 256 colors (indexed color – similar to palette!)
• But they can be a different 256 for each image!
• Supports animation! Example
• Optimal for line art
PNG format (“ping” = Portable Network Graphics)
• Like GIF but achieves greater compression with wider range of color depth
• No animations
Bitmap format
Contains the pixel color values of the image from left to right and from top to bottom
• Great candidate for run-length compression!
• Lossless, but files are large!
JPEG format
Averages color hues over short distances
• Lossy compression
• Optimal for color photographs
Vector Graphics
A format that describes an image in terms of lines and geometric shapes
A vector graphic is a series of commands that describe a line’s direction, thickness, and color
The file sizes tend to be smaller because not every pixel is described
Example: Flash
Vector Graphics
The good side:
Vector graphics can be resized mathematically and changes can be calculated dynamically as needed
The bad side:
Vector graphics are not good for representing real-world images
Representing Video
Video codec (COmpressor/DECompressor)
Methods used to shrink the size of a movie to allow it to be played on a computer or over a network
Almost all video codecs use lossy compression to minimize the huge amounts of data associated with video
QUIZ video
A 10-minute videoclip is shot in a resolution of 768x1024 True-Color pixels, and 24 frames/second.
• What is the uncompressed size of the video file?
• How long would it take to transmit it over a 100 Mbps Ethernet connection?
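The arithmetic for this kind of question can be sketched as follows. The function names are illustrative, and the assumptions are 3 bytes per TrueColor pixel, no audio track, and 1 Mbps = 1,000,000 bits per second with no protocol overhead.

```python
def video_size_bytes(width, height, bytes_per_pixel, frames_per_second, seconds):
    """Uncompressed size: every pixel of every frame is stored."""
    return width * height * bytes_per_pixel * frames_per_second * seconds

def transfer_seconds(size_bytes, megabits_per_second):
    """Time to send `size_bytes` over a link of the given speed."""
    return size_bytes * 8 / (megabits_per_second * 1_000_000)

size = video_size_bytes(768, 1024, 3, 24, 10 * 60)
duration = transfer_seconds(size, 100)
print(f"{size:,} bytes (about {size / 1e9:.1f} GB)")
print(f"{duration:.0f} seconds (about {duration / 60:.1f} minutes)")
```

Sending the uncompressed clip takes several times longer than playing it, which is why video is almost always compressed.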
Representing Video
Temporal compression
A technique based on differences between consecutive frames: If most of an image in two frames hasn’t changed, why should we waste space to duplicate all of the similar information?
Spatial compression
A technique based on removing redundant information within a frame: This problem is essentially the same as that faced when compressing still images
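The idea behind temporal compression can be illustrated with a toy Python sketch. This is not how any particular codec works: real codecs are far more elaborate. Here a frame is a flat list of pixel values, and only the pixels that changed since the previous frame are stored.

```python
def frame_delta(previous, current):
    """List (position, new value) for each pixel that changed."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current))
            if old != new]

def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for position, value in delta:
        frame[position] = value
    return frame

frame1 = [0, 0, 0, 7, 7, 0, 0, 0]
frame2 = [0, 0, 0, 0, 7, 7, 0, 0]   # the bright spot moved one pixel to the right
delta = frame_delta(frame1, frame2)
print(delta)
print(apply_delta(frame1, delta) == frame2)
```

When little changes between frames, the list of changes is much shorter than the frame itself.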
Chapter Review Questions
• Distinguish between analog and digital information
• Explain data compression and calculate compression ratios
• Explain the binary formats for negative (two’s complement), fractional, and floating-point values
• Describe the characteristics of the ASCII and Unicode character sets
• Perform various types of text compression with pencil and paper: Keyword, Run-length, Huffman
Chapter Review Questions
• Explain the nature of sound and its representation
• Explain how RGB values define a color
• Distinguish between raster and vector graphics
• Explain temporal and spatial video compression
Individual work for Wednesday:
Read pages 69-80 of text.
Solve in the notebook these problems:
● 10 through 20
● 50 through 53
Not homework! Do not turn in, it’s for your own review