Lecture1a Bit Hacks Merge

Date post: 02-Jun-2018
Upload: muhammad-reyhan-fahlevi
  • 8/11/2019 Lecture1a Bit Hacks Merge

    Computer Architecture

    Lecture 1: Bit Hacks

    Semester 101, 2014, 22/8

    Disclaimer

    This module was taken from Charles E. Leiserson's lecture notes for Performance Engineering of Software Systems, MIT OpenCourseWare.

    Agenda

    Swap
    Minimum of Two Integers
    Modular Addition
    Round up to a Power of 2
    Least Significant 1
    Log Base 2 of a Power of 2
    Population Count
    Queens Problem

    Swap

    2009 Charles E. Leiserson 2

    Problem

    Swap two integers x and y.

    t = x;
    x = y;
    y = t;

    No Temp Swap

    Problem

    Swap two integers x and y without using a temporary.

    x = x ^ y;
    y = x ^ y;
    x = x ^ y;

    Why it works

    XOR is its own inverse: (x ^ y) ^ y = x.

    Example

         initial    x = x ^ y   y = x ^ y   x = x ^ y
    x    10111101   10010011    10010011    00101110
    y    00101110   00101110    10111101    10111101

    Performance

    Poor at exploiting instruction-level parallelism (ILP): each statement depends on the result of the previous one.
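
A minimal sketch of the three XOR statements above wrapped in a function. The name xor_swap and the pointer interface are assumptions made for illustration. The early return guards the case where both pointers refer to the same object, since x ^ x is 0 and the value would otherwise be lost.

```c
#include <stdint.h>

/* XOR swap through pointers (hypothetical helper, not from the slides). */
static void xor_swap(uint32_t *x, uint32_t *y) {
    if (x == y) {
        return;            /* same object: XOR with itself would zero it */
    }
    *x = *x ^ *y;          /* x now holds x ^ y */
    *y = *x ^ *y;          /* (x ^ y) ^ y = original x */
    *x = *x ^ *y;          /* (x ^ y) ^ x = original y */
}
```

With x = 0xBD (10111101) and y = 0x2E (00101110), the call reproduces the example table above.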

    Minimum of Two Integers

    Problem

    Find the minimum r of two integers x and y.

    if (x < y)
      r = x;
    else
      r = y;

    or

    r = (x < y) ? x : y;

    Performance

    A mispredicted branch empties the processor pipeline, costing about 16 cycles on the cloud facility's Intel Core i7s.

    The compiler might be smart enough to avoid the unpredictable branch, but maybe not.

    No Branch Minimum

    Problem

    Find the minimum r of two integers x and y without a branch.

    r = y ^ ((x ^ y) & -(x < y));

    Why it works:

    C represents the Booleans TRUE and FALSE with the integers 1 and 0, respectively.

    If x < y, then -(x < y) = -1, which is all 1s in two's complement representation. Therefore, we have y ^ (x ^ y) = x.

    If x ≥ y, then -(x < y) = 0. Therefore, we have y ^ 0 = y.
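
The expression can be tried out as a small function. This is a sketch assuming 32-bit signed operands; the name min_no_branch is hypothetical.

```c
#include <stdint.h>

/* Branch-free minimum (hypothetical helper name). */
static int32_t min_no_branch(int32_t x, int32_t y) {
    /* (x < y) is 1 or 0, so its negation is -1 (all ones) or 0. */
    int32_t mask = -(int32_t)(x < y);
    /* mask all ones: y ^ (x ^ y) = x.  mask zero: y ^ 0 = y. */
    return y ^ ((x ^ y) & mask);
}
```

Whether this beats the compiler's own code for the ternary form depends on the target and optimization level, so it is worth measuring.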

    Modular Addition

    Problem

    Compute (x + y) mod n, assuming that 0 ≤ x < n and 0 ≤ y < n.

    r = (x + y) % n;            // Divide is expensive, unless by a power of 2.

    z = x + y;                  // Unpredictable branch
    r = (z < n) ? z : z-n;      // is expensive.

    z = x + y;                  // Same trick as
    r = z - (n & -(z >= n));    // minimum.
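
A sketch of the branch-free variant as a function, assuming unsigned 32-bit operands and that x + y does not wrap (for example n ≤ 2^31). The name mod_add is hypothetical.

```c
#include <stdint.h>

/* (x + y) mod n for 0 <= x < n and 0 <= y < n, without divide or branch. */
static uint32_t mod_add(uint32_t x, uint32_t y, uint32_t n) {
    uint32_t z = x + y;                    /* assumed not to overflow */
    uint32_t mask = -(uint32_t)(z >= n);   /* all ones if z >= n, else 0 */
    return z - (n & mask);                 /* subtract n only when needed */
}
```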

    Round up to a Power of 2

    Problem

    Compute 2^⌈lg n⌉.

    // 64-bit integers
    --n;
    n |= n >> 1;
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n |= n >> 32;
    ++n;

    Example

    0010000001010000    n
    0010000001001111    --n
    0011000001101111    n |= n >> 1
    0011110001111111    n |= n >> 2
    0011111111111111    n |= n >> 4 (later shifts leave n unchanged)
    0100000000000000    ++n

    Why decrement and increment?

    To handle the boundary case when n is a power of 2.
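
The sequence above as a function, a sketch assuming uint64_t operands; the name round_up_pow2 is hypothetical. Note that n = 0 wraps to all ones on the decrement and comes back as 0.

```c
#include <stdint.h>

/* Smallest power of 2 that is >= n (for 1 <= n <= 2^63). */
static uint64_t round_up_pow2(uint64_t n) {
    --n;              /* so that an exact power of 2 maps to itself */
    n |= n >> 1;      /* smear the top 1 bit into every lower position */
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    n |= n >> 32;
    ++n;              /* 0...011...1 becomes 0...100...0 */
    return n;
}
```

The 16-bit example above corresponds to round_up_pow2(0x2050), which yields 0x4000.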

    Least Significant 1

    Problem

    Compute the mask of the least-significant 1 in word x.

    r = x & (-x);

    Example

    x           0010000001010000
    -x          1101111110110000
    x & (-x)    0000000000010000
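
A minimal sketch of the mask computation, assuming an unsigned 64-bit word so that negation is well defined as arithmetic modulo 2^64. The name lowest_set_bit is hypothetical.

```c
#include <stdint.h>

/* Mask with only the least-significant 1 of x set (0 if x is 0). */
static uint64_t lowest_set_bit(uint64_t x) {
    /* -x flips every bit above the lowest 1 and keeps the rest,
       so the AND leaves exactly that one bit. */
    return x & -x;
}
```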

    Question

    How do you find the index of the bit, i.e., lg r = log2 r?

    Log Base 2 of a Power of 2

    Problem

    Compute lg x, where x is a power of 2.

    const uint64_t deBruijn = 0x022fdd63cc95386d;
    const unsigned int convert[64] =
    {  0,  1,  2, 53,  3,  7, 54, 27,
       4, 38, 41,  8, 34, 55, 48, 28,
      62,  5, 39, 46, 44, 42, 22,  9,
      24, 35, 59, 56, 49, 18, 29, 11,
      63, 52,  6, 26, 37, 40, 33, 47,
      61, 45, 43, 21, 23, 58, 17, 10,
      51, 25, 36, 32, 60, 20, 57, 16,
      50, 31, 19, 15, 30, 14, 13, 12};

    r = convert[(x*deBruijn) >> 58];
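
The lookup can be combined with x & -x to get the index of the least-significant 1. The sketch below uses the constant and table from the slide; the function names lg_pow2 and lowest_set_index are hypothetical.

```c
#include <stdint.h>

static const uint64_t deBruijn = UINT64_C(0x022fdd63cc95386d);
static const unsigned int convert[64] = {
     0,  1,  2, 53,  3,  7, 54, 27,
     4, 38, 41,  8, 34, 55, 48, 28,
    62,  5, 39, 46, 44, 42, 22,  9,
    24, 35, 59, 56, 49, 18, 29, 11,
    63, 52,  6, 26, 37, 40, 33, 47,
    61, 45, 43, 21, 23, 58, 17, 10,
    51, 25, 36, 32, 60, 20, 57, 16,
    50, 31, 19, 15, 30, 14, 13, 12
};

/* lg x, assuming x is a power of 2. */
static unsigned lg_pow2(uint64_t x) {
    /* Multiplying by 2^i shifts the sequence left by i; the top 6 bits
       then identify i uniquely. */
    return convert[(x * deBruijn) >> 58];
}

/* Index of the least-significant 1, assuming x != 0. */
static unsigned lowest_set_index(uint64_t x) {
    return lg_pow2(x & -x);
}
```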

    Log Base 2 of a Power of 2

    Why it works

    A de Bruijn sequence s of length 2^k is a cyclic 0-1 sequence such that each of the 2^k 0-1 strings of length k occurs exactly once as a substring of s.

    Example k=3

    s = 00011101 (binary)

    offset  substring
    0       000
    1       001
    2       011
    3       111
    4       110
    5       101
    6       010
    7       100

    convert[8] = {0, 1, 6, 2, 7, 5, 4, 3};

    00011101 * 2⁴ = 11010000 (binary)
    11010000 >> 5 = 6
    convert[6] = 4

    Performance

    Limited by multiply and table look-up.
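
The k=3 table can be derived mechanically from the sequence. The sketch below is an illustration in 8-bit arithmetic; the names SEQ, build_convert and lg_pow2_8 are hypothetical.

```c
#include <stdint.h>

enum { K = 3, N = 1 << K };         /* sequence length N = 2^K = 8 */
static const uint8_t SEQ = 0x1D;    /* 00011101 in binary */

/* Fill table[] so that table[top K bits of (SEQ << i)] == i. */
static void build_convert(uint8_t table[N]) {
    for (unsigned i = 0; i < N; ++i) {
        /* Truncate to 8 bits, then keep the top K bits as the window. */
        uint8_t window = (uint8_t)(SEQ << i) >> (N - K);
        table[window] = (uint8_t)i;
    }
}

/* lg x for an 8-bit power of 2, using a table from build_convert. */
static unsigned lg_pow2_8(uint8_t x, const uint8_t table[N]) {
    return table[(uint8_t)(x * SEQ) >> (N - K)];
}
```

Because every 3-bit window occurs exactly once, each table slot is written exactly once, and the result matches the convert[8] array above.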

    Population Count I

    Problem

    Count the number of 1 bits in a word x.

    for (r=0; x!=0; ++r)    // Repeatedly eliminate
      x &= x - 1;           // the least-significant 1.

    Issue

    Fast if the popcount is small, but in the worst case, the running time is proportional to the number of bits in the word.

    Example

    x 0010110111010000

    x-1 0010110111001111

    x & (x-1) 0010110111000000
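
The loop as a function, a sketch assuming a 64-bit word; the name popcount_clear_lowest is hypothetical.

```c
#include <stdint.h>

/* Popcount by repeatedly clearing the least-significant 1. */
static unsigned popcount_clear_lowest(uint64_t x) {
    unsigned r;
    for (r = 0; x != 0; ++r) {
        x &= x - 1;    /* x - 1 flips the lowest 1 and the 0s below it */
    }
    return r;          /* one iteration per 1 bit */
}
```

The example word 0010110111010000 (0x2DD0) has seven 1 bits, so the loop runs seven times.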

    Population Count II

    Table look up

    static const int count[256] =
      {0,1,1,2,1,2,2,3,1,...,8};    // #1s in index

    for (r=0; x!=0; x>>=8)
      r += count[x & 0xFF];

    Performance

    Memory operations are much more costly than register operations:

    register: 1 cycle (6 ops issued per cycle per core)
    L1-cache: 4 cycles
    L2-cache: 10 cycles
    L3-cache: 50 cycles
    DRAM: 150 cycles

    Cache and DRAM latencies are per 64-byte cache line.
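
A runnable sketch of the table method. The slide elides most of the 256 entries, so this version fills the table at startup instead of spelling it out; init_count and popcount_table are hypothetical names, and the runtime fill is a substitution for the static initializer.

```c
#include <stdint.h>

static uint8_t count[256];    /* count[i] = number of 1 bits in i */

/* Fill the table: bits(i) = bits(i >> 1) + lowest bit of i. */
static void init_count(void) {
    count[0] = 0;
    for (int i = 1; i < 256; ++i) {
        count[i] = (uint8_t)(count[i >> 1] + (i & 1));
    }
}

/* Popcount one byte at a time; call init_count() first. */
static unsigned popcount_table(uint64_t x) {
    unsigned r = 0;
    for (; x != 0; x >>= 8) {
        r += count[x & 0xFF];
    }
    return r;
}
```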

    Population Count III

    Parallel divide and conquer

    // Create masks
    B5 = ~((uint64_t)(-1) << 32);
    [...]
    x = ((x >> 4) + x) & B2;
    x = ((x >> 8) + x) & B3;
    x = ((x >> 16) + x) & B4;
    x = ((x >> 32) + x) & B5;
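
A complete sketch of the technique for a 64-bit word. Only B5 and the last four summation steps appear in the text above; the definitions of B4 down to B0 and the first two summation steps are assumptions, filled in with the usual form of this method. The name popcount_parallel is hypothetical.

```c
#include <stdint.h>

/* Popcount by summing adjacent fields of width 1, 2, 4, 8, 16, 32. */
static unsigned popcount_parallel(uint64_t x) {
    /* Masks (B4..B0 are assumed, see lead-in). */
    const uint64_t B5 = ~(~UINT64_C(0) << 32);   /* 0x00000000FFFFFFFF */
    const uint64_t B4 = B5 ^ (B5 << 16);         /* 0x0000FFFF0000FFFF */
    const uint64_t B3 = B4 ^ (B4 << 8);          /* 0x00FF00FF00FF00FF */
    const uint64_t B2 = B3 ^ (B3 << 4);          /* 0x0F0F0F0F0F0F0F0F */
    const uint64_t B1 = B2 ^ (B2 << 2);          /* 0x3333333333333333 */
    const uint64_t B0 = B1 ^ (B1 << 1);          /* 0x5555555555555555 */

    /* Narrow fields: mask before adding so a sum cannot spill over. */
    x = ((x >> 1) & B0) + (x & B0);
    x = ((x >> 2) & B1) + (x & B1);
    /* Wider fields have headroom, so one mask after the add suffices. */
    x = ((x >> 4) + x) & B2;
    x = ((x >> 8) + x) & B3;
    x = ((x >> 16) + x) & B4;
    x = ((x >> 32) + x) & B5;
    return (unsigned)x;
}
```

For the word 11110101000110000011011111001010 (0xF51837CA) the function returns 17.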

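    Population Count III

    Example (32-bit word)

    x = 11110101000110000011011111001010

    Step 1 (1-bit fields):
        1  1  1  1  0  1  0  0  0  1  1  1  1  0  0  0
      + 1  1  0  0  0  0  1  0  0  1  0  1  1  0  1  1
      = 10 10 01 01 00 01 01 00 00 10 01 10 10 00 01 01

    Step 2 (2-bit fields):
        10   01   01   00   10   10   00   01
      + 10   01   00   01   00   01   10   01
      = 0100 0010 0001 0001 0010 0011 0010 0010

    Step 3 (4-bit fields):
        0010     0001     0011     0010
      + 0100     0001     0010     0010
      = 00000110 00000010 00000101 00000100

    Step 4 (8-bit fields):
        00000010         00000100
      + 00000110         00000101
      = 0000000000001000 0000000000001001

    Step 5 (16-bit fields):
        0000000000001001
      + 0000000000001000
      = 00000000000000000000000000010001

    Popcount = 17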

    Queens Problem

    Problem

    Place n queens on an n × n chessboard so that no queen attacks another, i.e., no two queens in any row, column, or diagonal.

    Backtracking Search

    Strategy

    Try placing queens row by row. If you can't place a queen in a row, backtrack.

    Board Representation

    The backtrack search can be implemented as a simple recursive procedure, but how should the board be represented to facilitate queen placement?

    array of n² bytes?
    array of n² bits?
    array of n bytes?
    3 bitvectors of size n, 2n-1, and 2n-1!

    Bitvector Representation

    Placing a queen in column c is not safe if

    down & (1 << c)

    is nonzero.

    Bitvector Representation

    [Figure: chessboard with the bitvector right]

    Placing a queen in row r and column c is not safe if

    right & (1 << …)

    is nonzero.

    Bitvector Representation

    [Figure: chessboard with the bitvector left]

    Placing a queen in row r and column c is not safe if

    left & (1 << …)

    is nonzero.
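
A sketch of the row-by-row backtracking search using three bitvectors. The shift amounts for left and right are truncated in the text above, so the indices used here, r + c for left and n - 1 - r + c for right, are assumptions chosen so that each diagonal gets its own bit in a vector of size 2n-1. The function names are hypothetical, and the sketch assumes n ≤ 32.

```c
#include <stdint.h>

/* Count ways to fill rows r..n-1, given the occupied columns (down)
   and diagonals (left, right). */
static unsigned long solve(int n, int r,
                           uint32_t down, uint64_t left, uint64_t right) {
    if (r == n) {
        return 1;                      /* every row has a queen */
    }
    unsigned long total = 0;
    for (int c = 0; c < n; ++c) {
        uint32_t d  = UINT32_C(1) << c;
        uint64_t l  = UINT64_C(1) << (r + c);          /* assumed index */
        uint64_t rt = UINT64_C(1) << (n - 1 - r + c);  /* assumed index */
        if ((down & d) || (left & l) || (right & rt)) {
            continue;                  /* square is attacked */
        }
        /* Place the queen, recurse; returning from the call backtracks. */
        total += solve(n, r + 1, down | d, left | l, right | rt);
    }
    return total;
}

/* Number of solutions to the n-queens problem. */
static unsigned long count_queens(int n) {
    return solve(n, 0, 0, 0, 0);
}
```

Passing the bitvectors by value means backtracking needs no explicit undo step: the caller's copies are untouched when a call returns.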

    Further Reading

    Sean Eron Anderson, Bit twiddling hacks, http://graphics.stanford.edu/~seander/bithacks.html, 2009.

    Happy Hacking!

