Subject: CAD For VLSI (7CS4), Unit 5: Floor-planning, Placement & Routing

Date post: 06-Jan-2018
Upload: harriet-griffin
Transcript

Subject: CAD For VLSI (7CS4)
Unit 5: Floor-planning, Placement & Routing

Synthesis Flow
High-Level Synthesis -> Logic Synthesis -> Physical Design -> Fabrication and Packaging
Figures adopted with permission from Prof. Ciesielski, UMASS

Physical Design
Circuit Design -> Partitioning -> Floorplanning & Placement -> Routing -> Fabrication

What is Backend?
Physical design, at the sub-micron level:
  1. Floorplanning: the architect's job
  2. Placement: the builder's job
  3. Routing: the electrician's job

So, what is Partitioning?
  System level partitioning: the system is split into PCBs
  Board level partitioning: each PCB is split into chips
  Chip level partitioning: each chip is split into subcircuits / blocks

Partitioning of a Circuit
[Figure]

Why Partitioning?
Since each partition can correspond to a chip, interesting objectives are:
  Minimum number of partitions, subject to a maximum size (area) for each partition
  Minimum number of interconnections between partitions, since these correspond to off-chip wiring, which has more delay and less reliability
  Lower pin count on ICs (I/O pins are large and add greatly to packaging cost)
  Balanced partitioning, given a bound on the area of each partition

Circuit Representation
Netlist:
  Gates: A, B, C, D
  Nets: {A,B,C}, {B,D}, {C,D}
Hypergraph:
  Vertices: A, B, C, D
  Hyperedges: {A,B,C}, {B,D}, {C,D}
  Vertex label: gate size/area
  Hyperedge label: importance of the net (weight)
[Figure: the netlist and its hypergraph on gates A, B, C, D]

Floorplanning

Floorplanning
The floorplanning problem is to plan the positions and shapes of the modules at the beginning of the design cycle to optimize the circuit performance:
  chip area
  total wirelength
  delay of the critical path
  routability
  others, e.g., noise, heat dissipation, etc.
Floorplanning also decides the I/O structure and the aspect ratio of the design.
A bad floorplan leads to wasted die area and routing congestion.

Contd.
Floorplanning is a mapping between the logical description (the netlist) and the physical description (the floorplan).
Floorplanning is the process of identifying structures that should be placed close together, and allocating space for them so as to meet the sometimes conflicting goals of available space (cost of the chip), required performance, and the desire to have everything close to everything else.

Goals and Objectives
Goals of floorplanning:
  arrange the blocks on a chip,
  decide the location of the I/O pads,
  decide the location and number of the power pads,
  decide the type of power distribution, and
  decide the location and type of clock distribution.
Objectives of floorplanning:
  minimize the chip area, and
  minimize delay.

Polar Graph Representation
A graph representation of a floorplan. Each floorplan is modeled by a pair of directed acyclic graphs:
  Horizontal polar graph
  Vertical polar graph
For the horizontal (vertical) polar graph:
  Vertex: a vertical (horizontal) channel
  Edge: joins the two channels on the two sides of a block
  Edge weight: the width (height) of the block
Note: There are many other graph representations.

Polar Graph: Example
[Figure: horizontal polar graph and vertical polar graph]

Bounds on Aspect Ratios
If there is no bound on the aspect ratios, can we pack everything tightly? Sure!
But we don't want to lay out blocks as long strips, so we require $r_i \le h_i / w_i \le s_i$ for each $i$, where $h_i$ and $w_i$ are the height and width of block $i$, and $r_i$ and $s_i$ are the lower and upper bounds on its aspect ratio.

Slicing and Non-Slicing Floorplan
Slicing Floorplan: one that can be obtained by repeatedly subdividing (slicing) rectangles horizontally or vertically.
Non-Slicing Floorplan: one that cannot be obtained by repeated subdivision alone.

Representation of Slicing Floorplan
[Figure: a slicing floorplan with blocks 1 to 7 and its slicing tree, with internal nodes labeled H or V]
Polish Expression (postorder traversal of the slicing tree): 21H67V45VH3HV

Skewed ST and Normalized PE
Skewed Slicing Tree: no node and its right son are the same.
Normalized Polish Expression: no consecutive Hs or Vs.
[Figure: the slicing floorplan with a skewed slicing tree, Polish expression 21H67V45VH3HV, and a second slicing tree, Polish expression H67V45V3HHV]

Example of Moves
[Figure: a sequence of Polish expressions V2H5V1H, V4H5V1H, V45HV1H, V45VH1H, with the steps labeled M1, M3, M2]

Channel Definition
[Figure]

I/O and Power Planning
The next step is to plan and create power and ground structures for both the I/O pads and the core logic.
For the core logic, there is a core ring enclosing the core, with one or more sets of power and ground rings. The horizontal metal layer is used for the top and bottom sides, while the vertical metal layer is used for the left and right sides.

Contd.
The internal core power and ground busses consist of one or two sets of wires or strips that repeat at regular intervals across the core logic, or a specified region, within the design.
Each of these power and ground strips runs vertically, horizontally, or in both directions. If the strips run both vertically and horizontally at regular intervals, the style is known as a power mesh.
As the ASIC core power consumption increases, the spacing between power and ground strips decreases.

Clock Planning
The aim of a clock distribution network is to deliver the clock to all clocked elements in the design in a symmetrically structured manner.
The basic idea of manual implementation of clock distribution networks is to build a low resistance/capacitance grid, similar to the power and ground mesh, that covers the entire logic core area.

Contd.
It is essential to realize that clock grid networks consume a great deal of power because they are active all the time, and that it may not be possible to make such networks uniform owing to floorplanning constraints (e.g. to spread the power dissipation evenly across the chip).
Another aspect of clock planning is that it is well suited to hierarchical physical design. This type of clock distribution is manually crafted at the chip level, providing the clock to each sub-block that is placed and routed individually.
Placement

Floorplanning vs. Placement
Both determine block positions to optimize the circuit performance.
Floorplanning: details like the shapes of blocks, I/O pin positions, etc. are not yet fixed (blocks with flexible shape are called soft blocks).
Placement: details like module shapes and I/O pin positions are fixed (blocks with no flexibility in shape are called hard blocks).

Importance of Placement
Placement is a key step in physical design.
Poor placement consumes a large area and leads to a difficult or impossible routing task.
An ill-placed layout cannot be improved by high-quality routing.
Quality of placement:
  Layout area
  Routability
  Performance (usually timing, measured by the delay of the critical/longest net)

Placement Goals and Objectives
Goals:
  (1) Guarantee the router can complete the routing step
  (2) Minimize all the critical net delays
  (3) Make the chip as dense as possible
Objectives:
  (1) Minimize power dissipation
  (2) Minimize crosstalk between signals

Problem formulation
Input:
  Blocks (standard cells and macros) $B_1, \dots, B_n$
  Shapes and pin positions for each block $B_i$
  Nets $N_1, \dots, N_m$
Output:
  Coordinates $(x_i, y_i)$ for each block $B_i$
  No overlaps between blocks
  The total wire length is minimized
  The area of the resulting layout is minimized, or a fixed die is given
Other considerations: timing, routability, clock, buffering and interaction with physical synthesis

Placement affects chip area
[Figure]

And also Wire Length
[Figure]

Placement Algorithms
There are two classes of placement algorithms:
  1) Constructive placement uses a set of rules to arrive at a constructed placement. The most commonly used methods are variations on the min-cut algorithm. The other commonly used constructive placement algorithm is the eigenvalue method.
  2) Iterative placement improvement. As in system partitioning, placement usually starts with a constructed solution and then improves it using an iterative algorithm.
In most tools we can specify the locations and relative placements of certain critical logic cells as seed placements.

Min-cut placement
The min-cut placement method uses successive application of partitioning:
  1. Cut the placement area into two pieces.
  2. Swap the logic cells to minimize the cut cost.
  3. Repeat the process from step 1, cutting smaller pieces until all the logic cells are placed.

[Figure] Min-cut placement. (a) Divide the chip into bins using a grid. (b) Merge all connections to the center of each bin. (c) Make a cut and swap logic cells between bins to minimize the cost of the cut. (d) Take the cut pieces and throw out all the edges that are not inside the piece. (e) Repeat the process with a new cut and continue until we reach the individual bins.

Iterative Placement Improvement
An iterative placement improvement algorithm takes an existing placement and tries to improve it by moving the logic cells. There are two parts to the algorithm:
  1. The selection criteria, which decide which logic cells to try moving.
  2. The measurement criteria, which decide whether to move the selected cells.
There are several interchange or iterative exchange methods that differ in their selection and measurement criteria:
  1. pairwise interchange,
  2. force-directed interchange,
  3. force-directed relaxation, and
  4. force-directed pairwise relaxation.

Pairwise-interchange algorithm
All of these methods usually consider only pairs of logic cells to be exchanged. A source logic cell is picked for trial exchange with a destination logic cell.
The pairwise-interchange algorithm is similar to the interchange algorithm used for iterative improvement in the system partitioning step:
  1. Select the source logic cell at random.
  2. Try all the other logic cells in turn as the destination logic cell.
  3. Use any of the measurement methods we have discussed to decide whether to accept the interchange.
  4. The process repeats from step 1, selecting each logic cell in turn as a source logic cell.

[Figure] (a) and (b) show how we can extend pairwise interchange to swap more than two logic cells at a time. If we swap $\lambda$ logic cells at a time and find a locally optimum solution, we say that solution is $\lambda$-optimum.
The neighborhood exchange algorithm is a modification to pairwise interchange that considers only destination logic cells in a neighborhood: cells within a certain distance, $\epsilon$, of the source logic cell. Limiting the search area for the destination logic cell to the $\epsilon$-neighborhood reduces the search time.

Force Directed Approach
Transform the placement problem into the classical mechanics problem of a system of objects attached to springs.
Analogies:
  Module (Block/Cell/Gate) = Object
  Net = Spring
  Net weight = Spring constant
  Optimal placement = Equilibrium configuration

An Example
[Figure: resultant force on a module]

[Figure] Force-directed placement. (a) A network with nine logic cells. (b) We make a grid (one logic cell per bin). (c) Forces are calculated as if springs were attached to the centers of each logic cell for each connection. The two nets connecting logic cells A and I correspond to two springs. (d) The forces are proportional to the spring extensions.

Comments on Force-Directed Placement
Advantages:
  Uses the directions of forces to guide the search
  Usually much faster than simulated annealing
Drawbacks:
  Focuses on connections, not on the shapes of blocks
  Only a heuristic; an equilibrium configuration does not necessarily give a good placement
Open question:
  Whether it succeeds depends on how overlaps are eliminated

Simulated Annealing
A very general search technique.
Tries to avoid being trapped in a local minimum by making probabilistic moves.
Popularized as a heuristic for optimization.

Basic Idea of Simulated Annealing
Inspired by the annealing process: the process of carefully cooling molten metals in order to obtain a good crystal structure.
First, the metal is heated to a very high temperature. Then it is slowly cooled.
By cooling at a proper rate, atoms have an increased chance to regain a proper crystal structure.
Attaining a minimum-cost state in simulated annealing is analogous to attaining a good crystal structure in annealing.

Simulated Annealing
[Figure: cost plotted against state; labels: temperature dropping, drop back]

The Simulated Annealing Procedure
Let t be the initial temperature.
Repeat
  Repeat
    Pick a neighbour of the current state randomly.
    Let c = cost of the current state.
    Let c' = cost of the neighbour picked.
    If c' < c, then move to the neighbour (downhill move).
    If c' > c, then move to the neighbour with probability $e^{-(c'-c)/t}$ (uphill move).
  Until equilibrium is reached.
  Reduce t according to the cooling schedule.
Until the freezing point is reached.

Things to decide when using SA
When solving a combinatorial problem, we have to decide:
  The state space
  The neighborhood structure
  The cost function
  The initial state
  The initial temperature
  The cooling schedule (how to change t)
  The freezing point

Routing

Routing in design flow
[Figure: a post-placement netlist (AND, OR and INV gates; blocks A, B, C) passing from floorplan/placement to routing]
Routing: the process of finding geometric layouts of the nets.

The Routing Problem
Applied after placement.
Input:
  Netlist
  Timing budget for, typically, the critical nets
  Locations of blocks and locations of pins
Output:
  Geometric layouts of all nets
Objective:
  Minimize the total wire length or the number of vias, or just complete all connections without increasing the chip area.
  Each net meets its timing budget.

The Routing Constraints
Examples:
  Placement constraint
  Number of routing layers
  Delay constraint
  Meet all geometrical constraints (design rules)
  Physical/Electrical/Manufacturing constraints: crosstalk

Steiner Tree
For a multi-terminal net, we can construct a spanning tree to connect all the terminals together. But the wire length will be large.
Better to use a Steiner Tree: a tree connecting all terminals and some additional nodes (Steiner nodes).
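To make the spanning-tree versus Steiner-tree comparison concrete, here is a minimal sketch. It rests on several assumptions that are not in the slides: the three terminal coordinates are hypothetical, wire length is measured as Manhattan distance, and the Steiner node is picked by hand rather than computed.

```python
def manhattan(p, q):
    """Wire length between two points, assuming horizontal/vertical wires."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def mst_length(points):
    """Total length of a minimum spanning tree (Prim's algorithm)."""
    points = list(points)
    in_tree = {0}
    # dist[i] = cheapest connection from point i to the tree built so far
    dist = [manhattan(points[0], p) for p in points]
    total = 0
    while len(in_tree) < len(points):
        outside = (i for i in range(len(points)) if i not in in_tree)
        j = min(outside, key=lambda i: dist[i])
        total += dist[j]
        in_tree.add(j)
        for i in range(len(points)):
            if i not in in_tree:
                dist[i] = min(dist[i], manhattan(points[j], points[i]))
    return total


# Hypothetical three-terminal net.
terminals = [(0, 0), (4, 0), (2, 3)]

# Spanning tree: only terminals may be joined to each other.
spanning = mst_length(terminals)

# Steiner tree: one extra (hand-picked) node at (2, 0) is allowed.
steiner = mst_length(terminals + [(2, 0)])

print(spanning, steiner)  # 9 7
```

In this instance the extra node shortens the wiring from 9 to 7 units; finding the best extra nodes in general is the hard part.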
Rectilinear Steiner Tree: a Steiner tree in which all the edges run horizontally or vertically.
[Figure: a rectilinear Steiner tree with a Steiner node]

Routing Problem is Very Hard
Minimum Steiner Tree Problem: given a net, find the Steiner tree with the minimum length.
Input: an edge-weighted graph $G = (V, E)$ and a subset $D \subseteq V$ (demand points)
Output: a subset of vertices $V' \subseteq V$ that covers $D$ and induces a tree of minimum cost over all such trees
This problem is NP-Complete!

Heuristic Algorithms
Use MST (minimum spanning tree) algorithms to start with.
Cost(MST) / Cost(RMST) $\le 3/2$
Heuristics can therefore guarantee that the weight of the RST is at most 3/2 of the weight of the optimal tree.
Apply local modifications to approach an RMST (rectilinear minimum Steiner tree).

Kinds of Routing
Global Routing
Detailed Routing:
  Channel
  Switchbox
Others:
  Maze routing
  Over-the-cell routing
  Clock routing

General Routing Paradigm
Two phases: global routing, then detailed routing.
[Figure]

Extraction and Timing Analysis
After global routing and detailed routing, information about the nets can be extracted and delays can be analyzed.
If some nets fail to meet their timing budget, detailed routing and/or global routing needs to be repeated.

Routing Regions
[Figure]

Global Routing
Global routing is divided into 3 phases:
  1. Region definition
  2. Region assignment
  3. Pin assignment to routing regions

Maze Routing Problem
Given:
  A planar rectangular grid graph.
  Two points S and T on the graph.
  Obstacles modeled as blocked vertices.
Objective:
  Find the shortest path connecting S and T.
This technique can be used in global or detailed routing (switchbox) problems.

Grid Graph
[Figure: an area routing problem, its grid graph (maze), and a simplified representation; S and T are the two terminals and X marks blocked cells]

Maze Routing
[Figure: a maze with terminals S and T]

Lee's Algorithm
"An Algorithm for Path Connections and Its Applications", C. Y. Lee, IRE Transactions on Electronic Computers, 1961.

Basic Idea
A Breadth-First Search (BFS) of the grid graph.
Always finds the shortest path possible.
Consists of two phases:
  Wave Propagation
  Retrace

An Illustration
[Figure: a grid with terminals S and T]

Wave Propagation
At step k, all vertices at distance k from S are labeled with k (this is the Manhattan distance when no obstacle forces a detour).
A Propagation List (FIFO) is used to keep track of the vertices to be considered next.
[Figure: the labeling after step 0, after step 3 and after step 6]

Retrace
Trace back the actual route, starting from T.
At a vertex with label k, go to any vertex with label k-1.
[Figure: final labeling]

How many grids visited using Lee's algorithm?
[Figure]

Time and Space Complexity
For a grid structure of size w × h:
  Time per net = O(wh)
  Space = O(wh log wh) (O(log wh) bits per vertex are needed to hold the label during the exploration phase, plus one additional bit to indicate whether the vertex is blocked)
For a 2000 × 2000 grid structure:
  12 bits per label
  Total 6 Mbytes of memory!
For 4000 × 4000, 48 Mbytes!

Akers's coding: Improvement to Lee's Algorithm
The vertices in wavefront L are always adjacent to vertices in wavefronts L-1 and L+1.
Solution: label the wavefronts so that the predecessor of any wavefront is labeled differently from its successor, using the sequence 0, 0, 1, 1, 0, ...
We also need to indicate whether a vertex is blocked.
Hence 2 bits per vertex are enough.
Time complexity is not improved.

Akers's Technique
[Figure: a grid with terminals S and T labeled using Akers's coding]

Detailed routing
Global routing does not define wires; it defines routing regions.
The detailed router places the actual wires within the regions indicated by the global router.
We consider the channel routing problem here.

Channel Routing
A channel is the routing region bounded by two parallel rows of terminals.
Assume a top and a bottom boundary.
Each terminal is assigned a number to indicate which net it belongs to.
0 indicates that the terminal does not require an electrical connection.

Channel Routing
[Figure: a channel]

Channel Routing
[Figure: a routed channel, showing the upper boundary, lower boundary, tracks, terminals, a via, trunks, branches and a dogleg]

Channel Routing
How do we connect all the points with the same label using the smallest number of tracks (to minimize the channel height)?
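The two phases of Lee's algorithm described in the maze routing slides (wave propagation, then retrace) can be sketched as follows. This is an illustrative sketch, not code from the lecture: the function name `lee_route`, the grid encoding (a list of strings with 'X' for a blocked cell) and the example maze are all assumptions made here.

```python
from collections import deque


def lee_route(grid, source, target):
    """Return a shortest path of (row, col) cells, or None if T is unreachable.

    Assumed encoding: grid is a list of equal-length strings; 'X' marks a
    blocked cell and every other character is a free cell.
    """
    rows, cols = len(grid), len(grid[0])

    def neighbours(cell):
        r, c = cell
        return ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))

    # Phase 1: wave propagation (BFS). label[v] = step at which v was reached.
    label = {source: 0}
    frontier = deque([source])  # the FIFO propagation list
    while frontier:
        cell = frontier.popleft()
        if cell == target:
            break
        for nr, nc in neighbours(cell):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "X" and (nr, nc) not in label:
                label[(nr, nc)] = label[cell] + 1
                frontier.append((nr, nc))

    if target not in label:
        return None

    # Phase 2: retrace. From a cell labeled k, step to any neighbour labeled k-1.
    path = [target]
    while path[-1] != source:
        k = label[path[-1]]
        for n in neighbours(path[-1]):
            if label.get(n) == k - 1:
                path.append(n)
                break
    path.reverse()
    return path


# Hypothetical 4 x 5 maze; S and T only mark the terminals for the reader.
maze = [
    "S..X.",
    ".X.X.",
    ".X...",
    "...XT",
]
path = lee_route(maze, (0, 0), (3, 4))
print(len(path) - 1)  # number of grid steps on the route
```

The sketch stores a full integer label per cell for clarity; the 2-bit coding discussed in the slides would replace the `label` dictionary without changing the search order.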
Horizontal Constraint Graph (HCG)
[Figure: a horizontal constraint graph containing a clique of size 4]

Left-Edge Algorithm
  1. Sort the horizontal segments of the nets in increasing order of their left end points.
  2. Place them one by one greedily on the bottommost available track.

Left-Edge Algorithm
[Figure: sort by left end points, then place the nets greedily]

Vertical Constraint Graph and Doglegs
Net 1 imposes a vertical constraint on net 2, as the top terminal belongs to net 1 and the bottom terminal belongs to net 2.
Net 2 imposes a vertical constraint on net 1.
[Figure: the VCG of nets 1 and 2 contains a cycle, which is broken with a dogleg]

Conclusion
We have discussed the problem of partitioning and the role of partitioning in floorplanning.
We have covered the concept and physical significance of floorplanning, placement and routing, along with various algorithms used in physical design automation.

Thanks. Queries?
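As a supplement to the channel routing slides, the two left-edge steps (sort by left end point, then place each segment on the bottommost available track) can be sketched as below. The function name `left_edge`, the segment encoding and the four example nets are hypothetical, and the sketch assumes that vertical constraints are ignored and that two segments sharing a column cannot share a track.

```python
def left_edge(segments):
    """Assign each net's horizontal segment to a track.

    segments: dict mapping net name -> (left, right) column span (assumed
    encoding). Returns a dict mapping net name -> track index, where track 0
    is the bottommost track. Vertical constraints are NOT considered.
    """
    track_right_end = []  # rightmost occupied column on each track
    assignment = {}
    # Step 1: sort the segments by increasing left end point.
    for net, (left, right) in sorted(segments.items(), key=lambda item: item[1][0]):
        # Step 2: place on the bottommost track whose last segment ends
        # strictly before this one starts.
        for track, end in enumerate(track_right_end):
            if end < left:
                track_right_end[track] = right
                assignment[net] = track
                break
        else:
            # No existing track is free: open a new one on top.
            track_right_end.append(right)
            assignment[net] = len(track_right_end) - 1
    return assignment


# Hypothetical channel with four nets.
spans = {1: (1, 4), 2: (2, 6), 3: (5, 8), 4: (7, 9)}
tracks = left_edge(spans)
print(tracks)  # {1: 0, 2: 1, 3: 0, 4: 1}
```

Here at most two segments overlap at any column, and the greedy placement uses exactly two tracks. When vertical constraints are present, a pure left-edge assignment like this one may be infeasible, which is where doglegs come in.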