Date post: | 21-Jan-2016 |
Upload: | emil-harrison |
Computer Organization
CENG331 Section 3, Fall 2012-2013
1st Lecture
Instructor: Erol Şahin
Acknowledgement: Most of the slides are adapted from the ones prepared by R.E. Bryant, D.R. O’Hallaron, G. Kesden and Markus Püschel of Carnegie-Mellon Univ.
Overview
- Course theme
- Five realities
- How the course fits into the CENG curriculum
- Logistics
Course Theme: Abstraction Is Good But Don’t Forget Reality
- Most CENG courses emphasize abstraction
  - Abstract data types
  - Asymptotic analysis
- These abstractions have limits
  - Especially in the presence of bugs
  - Need to understand details of underlying implementations
- Useful outcomes
  - Become more effective programmers
    - Able to find and eliminate bugs efficiently
    - Able to understand and tune for program performance
  - Prepare for later “systems” classes in CENG
    - Compilers, Operating Systems, Networks, Embedded Systems
Great Reality #1: Int’s are not Integers, Float’s are not Reals
- Example 1: Is x² ≥ 0?
  - Float’s: Yes!
  - Int’s:
    - 40000 * 40000 --> 1600000000
    - 50000 * 50000 --> ??
- Example 2: Is (x + y) + z = x + (y + z)?
  - Unsigned & Signed Int’s: Yes!
  - Float’s:
    - (1e20 + -1e20) + 3.14 --> 3.14
    - 1e20 + (-1e20 + 3.14) --> ??
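The two "??" cases can be checked with a small sketch. It assumes 32-bit two's-complement ints and IEEE double precision; the helper names wrap_mul32, sum_left and sum_right are hypothetical. Signed overflow is undefined behaviour in C, so the product is formed in unsigned arithmetic and mapped back by hand. Under these assumptions 50000 * 50000 wraps to -1794967296, and the second grouping of the floating-point sum gives 0.0 because 3.14 is lost when added to -1e20.

```c
#include <stdint.h>

/* 32-bit two's-complement product, computed without signed-overflow UB. */
int32_t wrap_mul32(int32_t a, int32_t b)
{
    /* Unsigned multiplication wraps modulo 2^32 by definition. */
    uint32_t u = (uint32_t)a * (uint32_t)b;
    /* Map the bit pattern back to a signed value by hand. */
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    return (int32_t)(u - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}

/* The two groupings from Example 2, evaluated in double precision. */
double sum_left(double x, double y, double z)  { return (x + y) + z; }
double sum_right(double x, double y, double z) { return x + (y + z); }
```

Calling sum_left(1e20, -1e20, 3.14) and sum_right(1e20, -1e20, 3.14) side by side shows that floating-point addition is not associative.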
Code Security Example
- Similar to code found in FreeBSD’s implementation of getpeername
- There are legions of smart people trying to find vulnerabilities in programs

#include <string.h>

/* Kernel memory region holding user-accessible data */
#define KSIZE 1024
char kbuf[KSIZE];

/* Copy at most maxlen bytes from kernel region to user buffer */
int copy_from_kernel(void *user_dest, int maxlen) {
    /* Byte count len is minimum of buffer size and maxlen */
    int len = KSIZE < maxlen ? KSIZE : maxlen;
    memcpy(user_dest, kbuf, len);
    return len;
}
Typical Usage

#define MSIZE 528

void getstuff() {
    char mybuf[MSIZE];
    copy_from_kernel(mybuf, MSIZE);
    printf("%s\n", mybuf);
}
Malicious Usage

#define MSIZE 528

void getstuff() {
    char mybuf[MSIZE];
    copy_from_kernel(mybuf, -MSIZE);
    . . .
}
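Why the negative argument is dangerous: with maxlen = -528 the signed comparison KSIZE < maxlen is false, so len becomes -528. memcpy takes its count as a size_t, so the negative value is converted to a very large unsigned number and the copy runs far past kbuf. The following is a minimal sketch of one possible guard, not the actual FreeBSD fix; the name copy_from_kernel_checked and the -1 error return are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define KSIZE 1024
static char kbuf[KSIZE];

/* Hypothetical hardened variant: reject negative lengths and
   do the minimum computation in size_t. */
int copy_from_kernel_checked(void *user_dest, int maxlen)
{
    if (user_dest == NULL || maxlen < 0)
        return -1;   /* refuse instead of converting to a huge size_t */
    size_t len = (size_t)maxlen < (size_t)KSIZE ? (size_t)maxlen : (size_t)KSIZE;
    memcpy(user_dest, kbuf, len);
    return (int)len;
}
```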
Computer Arithmetic
- Does not generate random values
  - Arithmetic operations have important mathematical properties
- Cannot assume all “usual” mathematical properties
  - Due to finiteness of representations
  - Integer operations satisfy “ring” properties
    - Commutativity, associativity, distributivity
  - Floating point operations satisfy “ordering” properties
    - Monotonicity, values of signs
- Observation
  - Need to understand which abstractions apply in which contexts
  - Important issues for compiler writers and serious application programmers
Great Reality #2: You’ve Got to Know Assembly
- Chances are, you’ll never write a program in assembly
  - Compilers are much better & more patient than you are
- But: understanding assembly is key to the machine-level execution model
  - Behavior of programs in presence of bugs
    - High-level language model breaks down
  - Tuning program performance
    - Understand optimizations done/not done by the compiler
    - Understanding sources of program inefficiency
  - Implementing system software
    - Compiler has machine code as target
    - Operating systems must manage process state
  - Creating / fighting malware
    - x86 assembly is the language of choice!
Assembly Code Example
- Time Stamp Counter
  - Special 64-bit register in Intel-compatible machines
  - Incremented every clock cycle
  - Read with rdtsc instruction
- Application
  - Measure time (in clock cycles) required by procedure

double t;
start_counter();
P();
t = get_counter();
printf("P required %f clock cycles\n", t);
Code to Read Counter
- Write small amount of assembly code using GCC’s asm facility
- Inserts assembly code into machine code generated by compiler

static unsigned cyc_hi = 0;
static unsigned cyc_lo = 0;

/* Set *hi and *lo to the high and low order bits of the cycle counter. */
void access_counter(unsigned *hi, unsigned *lo)
{
    asm("rdtsc; movl %%edx,%0; movl %%eax,%1"
        : "=r" (*hi), "=r" (*lo)
        :
        : "%edx", "%eax");
}

asm ( "assembly instructions"
    : output operands           // optional
    : input operands            // optional
    : list of clobbered regs    // optional
    );
Great Reality #3: Memory Matters
Random Access Memory Is an Unphysical Abstraction
- Memory is not unbounded
  - It must be allocated and managed
  - Many applications are memory dominated
- Memory referencing bugs especially pernicious
  - Effects are distant in both time and space
- Memory performance is not uniform
  - Cache and virtual memory effects can greatly affect program performance
  - Adapting program to characteristics of memory system can lead to major speed improvements
Memory Referencing Bug Example

double fun(int i)
{
    volatile double d[1] = {3.14};
    volatile long int a[2];
    a[i] = 1073741824; /* Possibly out of bounds */
    return d[0];
}

fun(0) --> 3.14
fun(1) --> 3.14
fun(2) --> 3.1399998664856
fun(3) --> 2.00000061035156
fun(4) --> 3.14, then segmentation fault
volatile keyword is intended to prevent the compiler from applying any optimizations on the code that assume values of variables cannot change "on their own."
Memory Referencing Bug Example: Explanation
Stack layout, with the location accessed by fun(i) in the left column:

  4    Saved State
  3    d7 … d4
  2    d3 … d0
  1    a[1]
  0    a[0]
Memory Referencing Errors
- C and C++ do not provide any memory protection
  - Out of bounds array references
  - Invalid pointer values
  - Abuses of malloc/free
- Can lead to nasty bugs
  - Whether or not bug has any effect depends on system and compiler
  - Action at a distance
    - Corrupted object logically unrelated to one being accessed
    - Effect of bug may be first observed long after it is generated
- How can I deal with this?
  - Program in Java or ML
  - Understand what possible interactions may occur
  - Use or develop tools to detect referencing errors
Memory System Performance Example
- Hierarchical memory organization
- Performance depends on access patterns
  - Including how you step through a multi-dimensional array

void copyji(int src[2048][2048], int dst[2048][2048])
{
    int i,j;
    for (j = 0; j < 2048; j++)
        for (i = 0; i < 2048; i++)
            dst[i][j] = src[i][j];
}

void copyij(int src[2048][2048], int dst[2048][2048])
{
    int i,j;
    for (i = 0; i < 2048; i++)
        for (j = 0; j < 2048; j++)
            dst[i][j] = src[i][j];
}

copyji is 21 times slower than copyij (Pentium 4)
The Memory Mountain
[Figure: read throughput (MB/s, 0 to 7000) as a function of stride (x8 bytes, s1 to s64) and working-set size (bytes, 1 K to 64 M), with regions labelled L1, L2, L3 and Mem; the copyij and copyji access patterns are marked. Machine: Intel Core i7, 2.67 GHz, 32 KB L1 d-cache, 256 KB L2 cache, 8 MB L3 cache.]
Great Reality #4: There’s more to performance than asymptotic complexity
- Constant factors matter too!
- And even exact op count does not predict performance
  - Easily see 10:1 performance range depending on how code written
  - Must optimize at multiple levels: algorithm, data representations, procedures, and loops
- Must understand system to optimize performance
  - How programs compiled and executed
  - How to measure program performance and identify bottlenecks
  - How to improve performance without destroying code modularity and generality
Example Matrix Multiplication
[Figure: Matrix-Matrix Multiplication (MMM) on 2 x Core 2 Duo 3 GHz (double precision). Gflop/s (0 to 50) versus matrix size (0 to 9,000). Two curves, "Triple loop" and "Best code (K. Goto)", with the gap annotated as 160x.]
- Standard desktop computer, vendor compiler, using optimization flags
- Both implementations have exactly the same operations count (2n³)
- What is going on?
MMM Plot: Analysis
[Figure: Matrix-Matrix Multiplication (MMM) on 2 x Core 2 Duo 3 GHz. Gflop/s (0 to 50) versus matrix size (0 to 9,000). Annotations on the plot: memory hierarchy and other optimizations, 20x; vector instructions, 4x; multiple threads, 4x.]
- Reason for 20x: blocking or tiling, loop unrolling, array scalarization, instruction scheduling, search to find best choice
- Effect: fewer register spills, fewer L1/L2 cache misses, fewer TLB misses
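To make "blocking or tiling" concrete, here is a minimal sketch of a tiled matrix multiplication next to the plain triple loop. It is not the Goto code from the plot and makes no performance claim. The function names and the tile-size parameter bs are hypothetical, and matrices are assumed to be square, row-major arrays of doubles. Both versions perform the same 2n³ floating-point operations; only the order in which elements are visited changes.

```c
#include <stddef.h>

/* Reference triple loop: C += A * B for n x n row-major matrices. */
void mmm_naive(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                C[i * n + j] += A[i * n + k] * B[k * n + j];
}

/* Tiled variant: works on bs x bs blocks so that a block of each
   matrix can stay in cache while it is reused. Requires bs > 0. */
void mmm_blocked(size_t n, size_t bs, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += bs)
        for (size_t jj = 0; jj < n; jj += bs)
            for (size_t kk = 0; kk < n; kk += bs) {
                /* Clamp the tile at the matrix edge. */
                size_t imax = ii + bs < n ? ii + bs : n;
                size_t jmax = jj + bs < n ? jj + bs : n;
                size_t kmax = kk + bs < n ? kk + bs : n;
                for (size_t i = ii; i < imax; i++)
                    for (size_t j = jj; j < jmax; j++)
                        for (size_t k = kk; k < kmax; k++)
                            C[i * n + j] += A[i * n + k] * B[k * n + j];
            }
}
```

A good value of bs depends on the cache sizes of the machine, which is why the slide mentions a search to find the best choice.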
Great Reality #5: Computers do more than execute programs
- They need to get data in and out
  - I/O system critical to program reliability and performance
- They communicate with each other over networks
  - Many system-level issues arise in presence of network
    - Concurrent operations by autonomous processes
    - Coping with unreliable media
    - Cross platform compatibility
    - Complex performance issues
Role within CENG Curriculum
[Figure: diagram linking CENG331 Computer Organization to CENG140 C(++) Programming, CENG334 Operating Systems, CENG336 Embedded Systems, CS 444 Compilers, CS 352 Databases and CS 477 Computer Graphics. Link labels: Processes, Mem. Mgmt; Data Reps., Memory Model; Machine Code, Arithmetic; Execution Model, Memory System.]
Course Perspective
- Most Systems Courses are Builder-Centric
  - Computer Architecture
    - Design pipelined processor in Verilog
  - Operating Systems
    - Implement large portions of operating system
  - Compilers
    - Write compiler for simple language
  - Networking
    - Implement and simulate network protocols
Course Perspective (Cont.)
- Our Course is Programmer-Centric
  - Purpose is to show how, by knowing more about the underlying system, one can be more effective as a programmer
  - Enable you to
    - Write programs that are more reliable and efficient
  - Not just a course for dedicated hackers
    - We bring out the hidden hacker in everyone
  - Cover material in this course that you won’t see elsewhere
Teaching staff
- Instructor
  - Dr. Erol Sahin
  - Location: B-111, Tel: 210 5539, E-mail: [email protected]
  - Office hours: By appointment.
- TA’s
  - Fatih Gokce (E-mail: [email protected])
  - Asli Genctav (E-mail: [email protected])
Textbooks
- Randal E. Bryant and David R. O’Hallaron, “Computer Systems: A Programmer’s Perspective, Second Edition”, Prentice Hall 2011.
  - http://csapp.cs.cmu.edu
  - This book really matters for the course!
    - How to solve labs
    - Practice problems typical of exam problems
Course Components
- Lectures
  - Higher level concepts
- Assignments (3)
  - The heart of the course
  - Provide in-depth understanding of an aspect of systems
  - Programming and measurement
- Exams (midterm + final)
  - Test your understanding of concepts & mathematical principles
Policies: Grading
- Midterm 1: 25%
- Midterm 2: 25%
- Assignments: 20%
  - 4 homeworks
- Precondition for entering final exam:
  - Accumulate 25/100 overall from midterms and homeworks
- Final: 30%
Assignments
1. Binary bomb: Defuse a binary bomb by disassembling and reverse engineering the program.
2. Buffer overflow: Modify the run-time behaviour of a binary executable by using the buffer overflow bug.
3. Architecture: Modify the HCL description of a processor to add new instructions.
4. Performance: Optimize the performance of a function.

Details regarding the scheduling and grading of these assignments will be announced later.
Assignment Rationale
- Each assignment should have a well-defined goal such as solving a puzzle or winning a contest.
- Doing an assignment should result in new skills and concepts
- We try to use competition in a fun and healthy way.
  - Set a reasonable threshold for full credit.
  - Post intermediate results (anonymized) on Web page for glory!
Communication
- Class Web Pages
  - Web: http://kovan.ceng.metu.edu.tr/~erol/Courses/CENG331
    - Copies of lectures, exams, solutions
  - http://cow.ceng.metu.edu.tr/
    - Assignments, grades etc.
- Newsgroup
  - news://metu.ceng.course.331
    - Announcements about the course, clarifications to assignments, general discussion

Communication
- Questions that are general should be posted to the newsgroup. Please put [Section 3] on the subject line of your posting.
- If you have a specific question, you can send an e-mail to the instructors or to your teaching assistants. However, make sure that the subject line starts with CENG331 [capital letters, and no spaces] to get a faster reply.
Policies
- Late assignments:
  - Late submission policy will be announced for each assignment.
- Academic dishonesty:
  - All assignments submitted should be fully your own. We have a zero tolerance policy on cheating and plagiarism. Your work will be regularly checked for such misconduct, and any misconduct will be prosecuted.
  - We would like to remind you that, if found guilty, the legal code of the university proposes a minimum of six-month expulsion from the university.
Cheating
- What is cheating?
  - Sharing code: either by copying, retyping, looking at, or supplying a copy of a file.
  - Coaching: helping your friend to write a lab, line by line.
  - Copying code from previous course or from elsewhere on WWW
    - Only allowed to use code we supply, or from CS:APP website
- What is NOT cheating?
  - Explaining how to use systems or tools.
  - Helping others with high-level design issues.
- Penalty for cheating:
  - Removal from course with failing grade.
- Detection of cheating:
  - We do check and our tools for doing this are much better than you think!
If you don’t want to cry at the end of the semester..
- Keep in mind that this is a MUST course!
  - If you fail to get passing grades you may lose a year!
- No extra exams, or extra time to submit assignments after grading.
- We’ll have fun!
Conclusion
- 5 Great Realities
  - Ints are not Integers, Floats are not Reals
  - You’ve Got to Know Assembly
  - Memory Matters
  - There’s more to performance than asymptotic complexity
  - Computers do more than execute programs
- Abstraction Is Good But Don’t Forget Reality
A Tour of Computer Systems
CENG331 Section 1 & 2, Fall 2011-2012
Instructor: Erol Şahin
Acknowledgement: Most of the slides are adapted from the ones prepared by R.E. Bryant, D.R. O’Hallaron of Carnegie-Mellon Univ.
hello.c

#include <stdio.h>

int main() {
    printf("Hello World");
}
Compilation of hello.c
hello.c (source program, text)
  -> Pre-processor (cpp) -> hello.i (modified source program, text)
  -> Compiler (cc1) -> hello.s (assembly program, text)
  -> Assembler (as) -> hello.o (relocatable object program, binary)
  -> Linker (ld), together with printf.o -> hello (executable object program, binary)
Preprocessing
hello.c (source program, text) -> Pre-processor (cpp) -> hello.i (modified source program, text)

cpp hello.c > hello.i

# 1 "hello.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "hello.c"
# 1 "/usr/include/stdio.h" 1 3 4
……………………………………..
typedef unsigned char __u_char;
typedef unsigned short int __u_short;
typedef unsigned int __u_int;
typedef unsigned long int __u_long;
………………………………………..
int main() {
    printf("Hello World");
}
Compiler
hello.i -> Compiler (cc1) -> hello.s

gcc -Wall -S hello.i -o hello.s

        .file   "hello.c"
        .section        .rodata
.LC0:
        .string "Hello World"
        .text
.globl main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        andl    $-16, %esp
        movl    $0, %eax
        addl    $15, %eax
        addl    $15, %eax
        shrl    $4, %eax
        sall    $4, %eax
        subl    %eax, %esp
        subl    $12, %esp
        pushl   $.LC0
        call    printf
        addl    $16, %esp
        leave
        ret
        .size   main, .-main
        .section        .note.GNU-stack,"",@progbits
        .ident  "GCC: (GNU) 3.4.1"
Assembler
hello.s -> Assembler (as) -> hello.o

as hello.s -o hello.o
0000500 nul nul nul nul esc nul nul nul ht nul nul nul nul nul nul nul0000520 nul nul nul nul d etx nul nul dle nul nul nul ht nul nul nul0000540 soh nul nul nul eot nul nul nul bs nul nul nul % nul nul nul0000560 soh nul nul nul etx nul nul nul nul nul nul nul d nul nul nul0000600 nul nul nul nul nul nul nul nul nul nul nul nul eot nul nul nul0000620 nul nul nul nul + nul nul nul bs nul nul nul etx nul nul nul0000640 nul nul nul nul d nul nul nul nul nul nul nul nul nul nul nul0000660 nul nul nul nul eot nul nul nul nul nul nul nul 0 nul nul nul0000700 soh nul nul nul stx nul nul nul nul nul nul nul d nul nul nul0000720 ff nul nul nul nul nul nul nul nul nul nul nul soh nul nul nul0000740 nul nul nul nul 8 nul nul nul soh nul nul nul nul nul nul nul0000760 nul nul nul nul p nul nul nul nul nul nul nul nul nul nul nul0001000 nul nul nul nul soh nul nul nul nul nul nul nul H nul nul nul0001020 soh nul nul nul nul nul nul nul nul nul nul nul p nul nul nul0001040 2 nul nul nul nul nul nul nul nul nul nul nul soh nul nul nul0001060 nul nul nul nul dc1 nul nul nul etx nul nul nul nul nul nul nul0001100 nul nul nul nul " nul nul nul Q nul nul nul nul nul nul nul0001120 nul nul nul nul soh nul nul nul nul nul nul nul soh nul nul nul0001140 stx nul nul nul nul nul nul nul nul nul nul nul , stx nul nul0001160 sp nul nul nul nl nul nul nul bs nul nul nul eot nul nul nul0001200 dle nul nul nul ht nul nul nul etx nul nul nul nul nul nul nul0001220 nul nul nul nul L etx nul nul nak nul nul nul nul nul nul nul0001240 nul nul nul nul soh nul nul nul nul nul nul nul nul nul nul nul0001260 nul nul nul nul nul nul nul nul nul nul nul nul soh nul nul nul0001300 nul nul nul nul nul nul nul nul eot nul q del nul nul nul nul0001320 nul nul nul nul nul nul nul nul etx nul soh nul nul nul nul nul0001340 nul nul nul nul nul nul nul nul etx nul etx nul nul nul nul nul0001360 nul nul nul nul nul nul nul nul etx nul eot nul nul nul nul nul0001400 nul nul nul nul nul nul nul nul etx nul enq 
nul nul nul nul nul0001420 nul nul nul nul nul nul nul nul etx nul ack nul nul nul nul nul0001440 nul nul nul nul nul nul nul nul etx nul bel nul ht nul nul nul0001460 nul nul nul nul . nul nul nul dc2 nul soh nul so nul nul nul0001500 nul nul nul nul nul nul nul nul dle nul nul nul nul h e l0001520 l o . c nul m a i n nul p r i n t f0001540 nul nul nul nul sp nul nul nul soh enq nul nul % nul nul nul0001560 stx ht nul nul0001564
od -a hello.o
Linker
od -a hello
0000000 del E L F soh soh soh nul nul nul nul nul nul nul nul nul0000020 stx nul etx nul soh nul nul nul @ stx eot bs 4 nul nul nul0000040 X cr nul nul nul nul nul nul 4 nul sp nul bel nul ( nul0000060 ! nul rs nul ack nul nul nul 4 nul nul nul 4 nul eot bs0000100 4 nul eot bs ` nul nul nul ` nul nul nul enq nul nul nul0000120 eot nul nul nul etx nul nul nul dc4 soh nul nul dc4 soh eot bs0000140 dc4 soh eot bs dc3 nul nul nul dc3 nul nul nul eot nul nul nul0000160 soh nul nul nul soh nul nul nul nul nul nul nul nul nul eot bs0000200 nul nul eot bs bs eot nul nul bs eot nul nul enq nul nul nul0000220 nul dle nul nul soh nul nul nul bs eot nul nul bs dc4 eot bs0000240 bs dc4 eot bs nul soh nul nul eot soh nul nul ack nul nul nul0000260 nul dle nul nul stx nul nul nul dc4 eot nul nul dc4 dc4 eot bs0000300 dc4 dc4 eot bs H nul nul nul H nul nul nul ack nul nul nul0000320 eot nul nul nul eot nul nul nul ( soh nul nul ( soh eot bs0000340 ( soh eot bs sp nul nul nul sp nul nul nul eot nul nul nul0000360 eot nul nul nul Q e t d nul nul nul nul nul nul nul nul0000400 nul nul nul nul nul nul nul nul nul nul nul nul ack nul nul nul0000420 eot nul nul nul / l i b / l d - l i n u0000440 x . s o . 2 nul nul eot nul nul nul dle nul nul nul0000460 soh nul nul nul G N U nul nul nul nul nul stx nul nul nul0000500 stx nul nul nul enq nul nul nul etx nul nul nul ack nul nul nul0000520 enq nul nul nul soh nul nul nul etx nul nul nul nul nul nul nul0000540 nul nul nul nul nul nul nul nul stx nul nul nul nul nul nul nul……………………………………………………………………………..
hello.o (relocatable object program, binary) + printf.o -> Linker (ld) -> hello (executable object program, binary)

gcc hello.o -o hello
Finally…
$ gcc hello.o -o hello
$ ./hello
Hello World$
How do you say “Hello World”?
Typical Organization of System
[Figure: CPU (PC, register file, ALU, bus interface) connected by the system bus to the I/O bridge; the I/O bridge connects to main memory over the memory bus and to the I/O bus. On the I/O bus: USB controller (mouse, keyboard), graphics adapter (display), disk controller (disk), and expansion slots for other devices such as network adapters. The hello executable is stored on disk.]

Reading hello command from keyboard
[Figure: the same organization; the user types "hello" at the keyboard.]
Byte-Oriented Memory Organization
[Figure: memory drawn as an array of bytes, with addresses running from 00•••0 to FF•••F.]
- Programs Refer to Virtual Addresses
  - Conceptually very large array of bytes
  - Actually implemented with hierarchy of different memory types
  - System provides address space private to particular “process”
    - Program being executed
    - Program can clobber its own data, but not that of others
- Compiler + Run-Time System Control Allocation
  - Where different program objects should be stored
  - All allocation within single virtual address space
Machine Words
- Machine Has “Word Size”
  - Nominal size of integer-valued data
    - Including addresses
  - Most current machines use 32-bit (4-byte) words
    - Limits addresses to 4GB
    - Becoming too small for memory-intensive applications
  - High-end systems use 64-bit (8-byte) words
    - Potential address space ≈ 1.8 × 10^19 bytes
    - x86-64 machines support 48-bit addresses: 256 Terabytes
  - Machines support multiple data formats
    - Fractions or multiples of word size
    - Always integral number of bytes
Word-Oriented Memory Organization
- Addresses Specify Byte Locations
  - Address of first byte in word
  - Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

Byte addresses    32-bit words    64-bit words
0000 - 0003       Addr = 0000     Addr = 0000
0004 - 0007       Addr = 0004
0008 - 0011       Addr = 0008     Addr = 0008
0012 - 0015       Addr = 0012
Data Representations (sizes in bytes)

C Data Type    Typical 32-bit    Intel IA32    x86-64
char           1                 1             1
short          2                 2             2
int            4                 4             4
long           4                 4             8
long long      8                 8             8
float          4                 4             4
double         8                 8             8
long double    8                 10/12         10/16
pointer        4                 4             8
Byte Ordering
- How should bytes within a multi-byte word be ordered in memory?
- Conventions
  - Big Endian: Sun, PPC Mac, Internet
    - Least significant byte has highest address
  - Little Endian: x86
    - Least significant byte has lowest address
Byte Ordering Example
- Big Endian
  - Least significant byte has highest address
- Little Endian
  - Least significant byte has lowest address
- Example
  - Variable x has 4-byte representation 0x01234567
  - Address given by &x is 0x100

Address          0x100    0x101    0x102    0x103
Big Endian       01       23       45       67
Little Endian    67       45       23       01
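The byte order of the machine running a program can be observed directly. The sketch below copies the in-memory bytes of 0x01234567 into an array and inspects the byte at the lowest address; the function names bytes_of_u32 and is_little_endian are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of a 32-bit value, lowest address first. */
void bytes_of_u32(uint32_t x, unsigned char out[4])
{
    memcpy(out, &x, sizeof x);
}

/* Returns 1 if the least significant byte sits at the lowest address. */
int is_little_endian(void)
{
    unsigned char b[4];
    bytes_of_u32(0x01234567u, b);
    return b[0] == 0x67;
}
```

On an x86 machine is_little_endian() is expected to return 1; on a big-endian machine such as the Sun systems mentioned in the slides it would return 0.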
The origin of the word Endian from Wikipedia
The very terms big-endian and little-endian were taken from the Big-Endians and Little-Endians of Jonathan Swift's satiric novel Gulliver's Travels, where in Lilliput and Blefuscu Gulliver finds two groups of people in conflict over which end of an egg to crack.
http://sisu.typepad.com/sisu/images/gulliver.jpg
http://www.tallstories.org.uk/shows/other/the-egg.jpg
Reading Byte-Reversed Listings
- Disassembly
  - Text representation of binary machine code
  - Generated by program that reads the machine code
- Example Fragment

Address     Instruction Code        Assembly Rendition
8048365:    5b                      pop    %ebx
8048366:    81 c3 ab 12 00 00       add    $0x12ab,%ebx
804836c:    83 bb 28 00 00 00 00    cmpl   $0x0,0x28(%ebx)

- Deciphering Numbers
  - Value: 0x12ab
  - Pad to 32 bits: 0x000012ab
  - Split into bytes: 00 00 12 ab
  - Reverse: ab 12 00 00
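Going the other way, from the bytes in a listing back to the number, can be sketched as follows. The helper name read_le32 is hypothetical. It assembles a 32-bit value from four bytes stored least significant byte first, which is the order used for the immediate in the add instruction.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored least significant byte first. */
uint32_t read_le32(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

Applied to the bytes ab 12 00 00 this yields 0x12ab. Because it uses shifts rather than a pointer cast, the result does not depend on the byte order of the machine running the code.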
Examining Data Representations
- Code to Print Byte Representation of Data
  - Casting pointer to unsigned char * creates byte array
- Printf directives:
  - %p: Print pointer
  - %x: Print hexadecimal

typedef unsigned char *pointer;

void show_bytes(pointer start, int len)
{
    int i;
    for (i = 0; i < len; i++)
        printf("%p\t0x%.2x\n", (void *)(start+i), start[i]);
    printf("\n");
}
show_bytes Execution Example

int a = 15213;
printf("int a = 15213;\n");
show_bytes((pointer) &a, sizeof(int));

Result (Linux):

int a = 15213;
0x11ffffcb8    0x6d
0x11ffffcb9    0x3b
0x11ffffcba    0x00
0x11ffffcbb    0x00
Representing Integers
Decimal: 15213
Binary: 0011 1011 0110 1101
Hex: 3 B 6 D

Bytes listed in increasing address order:

Declaration            IA32           x86-64                     Sun
int A = 15213;         6D 3B 00 00    6D 3B 00 00                00 00 3B 6D
int B = -15213;        93 C4 FF FF    93 C4 FF FF                FF FF C4 93
long int C = 15213;    6D 3B 00 00    6D 3B 00 00 00 00 00 00    00 00 3B 6D

B uses two’s complement representation (covered later).
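The byte patterns for 15213 and -15213 can be reproduced with a short sketch. The function name int32_to_le_bytes is hypothetical. It writes the 32-bit two's-complement encoding least significant byte first, the IA32 / x86-64 order, and relies on the fact that converting a signed value to uint32_t is defined modulo 2^32.

```c
#include <stdint.h>

/* Write the 32-bit two's-complement encoding of v, least significant byte first. */
void int32_to_le_bytes(int32_t v, unsigned char out[4])
{
    uint32_t u = (uint32_t)v;   /* conversion is modulo 2^32, i.e. the two's-complement pattern */
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)((u >> (8 * i)) & 0xFFu);
}
```

For 15213 this produces 6D 3B 00 00, and for -15213 it produces 93 C4 FF FF. Reading the array from index 3 down to 0 gives the big-endian (Sun) order.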
Representing Pointers

int B = -15213;
int *P = &B;

Bytes of P in increasing address order:

Sun       EF FF FB 2C
IA32      D4 F8 FF BF
x86-64    0C 89 EC FF FF 7F 00 00

Different compilers & machines assign different locations to objects
Representing Strings
- Strings in C
  - Represented by array of characters
  - Each character encoded in ASCII format
    - Standard 7-bit encoding of character set
    - Character “0” has code 0x30
      - Digit i has code 0x30+i
  - String should be null-terminated
    - Final character = 0
- Compatibility
  - Byte ordering not an issue

char S[6] = "18243";

Linux/Alpha    31 38 32 34 33 00
Sun            31 38 32 34 33 00
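The digit-encoding rule and the null terminator can be checked with a small sketch. The helper names digit_code and bytes_equal are hypothetical, and the comparison with character literals assumes an ASCII execution character set.

```c
#include <stddef.h>

/* ASCII code of decimal digit i (0 <= i <= 9): 0x30 + i. */
unsigned char digit_code(int i)
{
    return (unsigned char)(0x30 + i);
}

/* Returns 1 if the first n bytes of s match the expected byte sequence. */
int bytes_equal(const char *s, const unsigned char *expected, size_t n)
{
    for (size_t k = 0; k < n; k++)
        if ((unsigned char)s[k] != expected[k])
            return 0;
    return 1;
}
```

Comparing the six bytes of S with 31 38 32 34 33 00 gives the same result on little-endian and big-endian machines, since each character occupies a single byte.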
Boolean Algebra
- Developed by George Boole in 19th Century
  - Algebraic representation of logic
    - Encode “True” as 1 and “False” as 0
- And: A&B = 1 when both A=1 and B=1
- Or: A|B = 1 when either A=1 or B=1
- Not: ~A = 1 when A=0
- Exclusive-Or (Xor): A^B = 1 when either A=1 or B=1, but not both
General Boolean Algebras
- Operate on Bit Vectors
  - Operations applied bitwise

  01101001 & 01010101 = 01000001
  01101001 | 01010101 = 01111101
  01101001 ^ 01010101 = 00111100
           ~ 01010101 = 10101010

- All of the Properties of Boolean Algebra Apply
Representing & Manipulating Sets
- Representation
  - Width w bit vector represents subsets of {0, …, w–1}
  - a_j = 1 if j ∈ A

  01101001    { 0, 3, 5, 6 }
  76543210

  01010101    { 0, 2, 4, 6 }
  76543210

- Operations

  &    Intersection            01000001    { 0, 6 }
  |    Union                   01111101    { 0, 2, 3, 4, 5, 6 }
  ^    Symmetric difference    00111100    { 2, 3, 4, 5 }
  ~    Complement              10101010    { 1, 3, 5, 7 }
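The set interpretation maps directly onto C's bitwise operators. Below is a minimal sketch for subsets of {0, …, 7} stored in one byte; the type name set8 and the function names are hypothetical.

```c
#include <stdint.h>

typedef uint8_t set8;   /* subset of {0, ..., 7}; bit j set means j is a member */

set8 set_add(set8 s, unsigned j)       { return (set8)(s | (1u << j)); }
int  set_contains(set8 s, unsigned j)  { return (int)((s >> j) & 1u); }
set8 set_intersection(set8 a, set8 b)  { return (set8)(a & b); }
set8 set_union(set8 a, set8 b)         { return (set8)(a | b); }
set8 set_symdiff(set8 a, set8 b)       { return (set8)(a ^ b); }
set8 set_complement(set8 a)            { return (set8)~a; }
```

With a = 0x69 ({0, 3, 5, 6}) and b = 0x55 ({0, 2, 4, 6}), these functions return 0x41, 0x7D and 0x3C for intersection, union and symmetric difference, and 0xAA for the complement of b. The casts back to set8 matter because C promotes the operands to int before applying ~ and |.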
Bit-Level Operations in C
- Operations &, |, ~, ^ Available in C
  - Apply to any “integral” data type
    - long, int, short, char, unsigned
  - View arguments as bit vectors
  - Arguments applied bit-wise
- Examples (char data type)
  - ~0x41 ➙ 0xBE
    - ~01000001₂ ➙ 10111110₂
  - ~0x00 ➙ 0xFF
    - ~00000000₂ ➙ 11111111₂
  - 0x69 & 0x55 ➙ 0x41
    - 01101001₂ & 01010101₂ ➙ 01000001₂
  - 0x69 | 0x55 ➙ 0x7D
    - 01101001₂ | 01010101₂ ➙ 01111101₂
Contrast: Logic Operations in C
- Contrast to Logical Operators
  - &&, ||, !
    - View 0 as “False”
    - Anything nonzero as “True”
    - Always return 0 or 1
    - Early termination
- Examples (char data type)
  - !0x41 ➙ 0x00
  - !0x00 ➙ 0x01
  - !!0x41 ➙ 0x01
  - 0x69 && 0x55 ➙ 0x01
  - 0x69 || 0x55 ➙ 0x01
  - p && *p (avoids null pointer access)
Shift Operations
- Left Shift: x << y
  - Shift bit-vector x left y positions
    - Throw away extra bits on left
    - Fill with 0’s on right
- Right Shift: x >> y
  - Shift bit-vector x right y positions
    - Throw away extra bits on right
  - Logical shift
    - Fill with 0’s on left
  - Arithmetic shift
    - Replicate most significant bit on left
- Undefined Behavior
  - Shift amount < 0 or ≥ word size

Argument x     01100010    10100010
<< 3           00010000    00010000
Log. >> 2      00011000    00101000
Arith. >> 2    00011000    11101000

C, however, has only one right shift operator, >>. Many C compilers choose which right shift to perform depending on what type of integer is being shifted; often signed integers are shifted using the arithmetic shift, and unsigned integers are shifted using the logical shift.
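Because the result of right-shifting a negative signed value is left to the implementation, both kinds of shift can be written using only unsigned operations. The sketch below does this for 8-bit vectors; the function names are hypothetical and the shift amount k is assumed to satisfy 0 <= k < 8.

```c
#include <stdint.h>

/* Logical right shift of an 8-bit vector: zeros enter on the left. */
uint8_t shr_logical8(uint8_t x, unsigned k)
{
    return (uint8_t)(x >> k);
}

/* Arithmetic right shift of an 8-bit vector, built from unsigned
   operations so it does not depend on how the compiler treats
   negative signed operands. */
uint8_t shr_arith8(uint8_t x, unsigned k)
{
    uint8_t shifted = (uint8_t)(x >> k);
    if (x & 0x80u)                               /* sign bit set: fill the top k bits with 1s */
        shifted |= (uint8_t)(0xFFu << (8 - k));
    return shifted;
}
```

For x = 10100010 (0xA2) and k = 2, the logical shift gives 00101000 (0x28) and the arithmetic shift gives 11101000 (0xE8). For x = 01100010 (0x62) both give 00011000 (0x18).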
Swap

Challenge: Can you code swap function without using a temporary variable?

Hint: Use bitwise XOR (^)

void swap(int *x, int *y)
{
    int temp;
    temp = *x;
    *x = *y;
    *y = temp;
}
Swapping with Xor

void funny(int *x, int *y)
{
    *x = *x ^ *y; /* #1 */
    *y = *x ^ *y; /* #2 */
    *x = *x ^ *y; /* #3 */
}

- Bitwise Xor is a form of addition
- With extra property that every value is its own additive inverse
  - A ^ A = 0

Step     *x             *y
Begin    A              B
1        A^B            B
2        A^B            (A^B)^B = A
3        (A^B)^A = B    A
End      B              A
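One caveat with the XOR trick: if x and y point to the same object, step #1 computes A ^ A = 0 and the value is lost. A minimal sketch with a guard for that case follows; the name xor_swap is hypothetical.

```c
/* XOR swap that leaves the value untouched when both pointers alias. */
void xor_swap(int *x, int *y)
{
    if (x == y)
        return;     /* without this check, *x ^= *x would zero the object */
    *x ^= *y;       /* *x = A^B           */
    *y ^= *x;       /* *y = B^(A^B) = A   */
    *x ^= *y;       /* *x = (A^B)^A = B   */
}
```

In practice the version with a temporary variable is usually at least as fast and easier to read, so the XOR form is mainly a useful exercise in the algebra of ^.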
What’s the story behind C and UNIX?