SRAM Design

Date posted: 29-Nov-2014
Description:
Designed a fully custom 128x10b SRAM in Cadence by constructing the schematic and Virtuoso layout of the memory cell array (6T cell), row and column decoders, pre-charge circuit, write circuit and sense amplifier. All components were manually placed and routed. DRC and LVS debugging was performed on the schematic and layout, and PEX was run to generate the final netlist. The final design was simulated (HSPICE/Spectre) to verify correct functionality and to analyze the best-case read and write cycles and the worst-case read and write timing. Timing and power were analyzed with PrimeTime static timing analysis (STA).
Transcript:
  • 1. SRAM Design and Layout
    Project Description
    Design and layout of a 128 word SRAM using the IBM 130nm process. The key design tools used are Cadence's Virtuoso for layout editing, DRC (design rule checking), LVS (layout versus schematic, for verifying that the layout matches the schematic netlist) and circuit simulation (for measuring the read/write times).
    Word size is 10 bits.
    An output capacitance of 30 fF is used for all outputs when simulating for delays.
    All input signals and clocks are provided by inverters sized PMOS = 0.75 µm and NMOS = 0.25 µm.
    Introduction
    Static random access memory (SRAM) is a type of volatile semiconductor memory, meaning it stores data only as long as it is powered. SRAM uses bi-stable latching circuitry made of transistors to store each bit. Unlike dynamic RAM (DRAM), SRAM does not store the data on a capacitor; hence, SRAM works without refreshing. SRAM is often used as cache memory. The most commonly used SRAM cell consists of 6 transistors, and this configuration is called the 6T memory cell. It consists of two cross-coupled inverters and two access transistors.
    Figure 1: 6T SRAM Cell
    EE 7325 Page 1
  • 2. The access transistors are connected to the word line (WL) at their gate terminals and to the bit lines (BL and BLbar) at their source/drain terminals. The word line selects the cell, while the bit lines carry the data for read and write operations.
    Read Operation
    Figure 2: Read Operation
    The read operation of the memory cell is illustrated in Figure 2. Assume that a 0 is stored on the left side of the cell and a 1 on the right side, so M1 is on and M2 is off. Initially, BL and BLbar are precharged to VDD. When a row is selected by asserting the word line, access transistors M3 and M4 turn on. Current flows through M3 and M1 to ground, so the cell discharges the bit line capacitance Cbit. On the other side of the cell, the bit line connected through M4 remains high since there is no path to ground through M2. The difference between BL and BLbar is fed to a sense amplifier, which generates a valid low output.
    Write Operation
    Figure 3: Write Operation
  • 3. To write to the cell, it has to be driven from both sides: a 1 is placed on one of the bit lines and a 0 on the other. This flips the value stored in the cell and writes the new value. The access transistors controlled by WL must be on during both read and write operations.
    SRAM Implementation
    The top level block diagram of the SRAM is shown in Figure 4.
    Figure 4: Top Level Block Diagram
    The signal description is as follows:

    Port      I/O            Type    Description
    WR        Input          1 bit   Write/Read signal: 1 = write, 0 = read
    clk       Input                  Clock signal: 0 = precharge, 1 = evaluate
    addr0-6   Input          7 bit   Input address
    addr_en   Input          1 bit   Address enable; a word line is selected only when addr_en = 1
    data0-9   Bidirectional  10 bit  SRAM data; when WR is 0 the data stored in the SRAM is read, when WR is 1 the data is written to the SRAM
    vdd, vss  Inputs                 Supply (1.2 V) and ground

    With a word size of 10 bits, the 128 word SRAM has 128*10 = 1280 memory cells. The array is designed to have 40 columns and 32 rows. Hence, we need a 5 bit address to select one of the rows (word lines) and a 2 bit address to select one of the four words in that row. The overall architecture of the memory design is shown in Figure 5.
  • 4. Figure 5: Memory Architecture
    Our goal now is to design each individual unit of this architecture, integrate the units, and ensure that the read and write operations work correctly for the design.
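The address split described above (5 bits for the row, 2 bits for the word within the row) can be sketched as follows. The source does not say which address bits go to the row decoder and which to the column decoder, so the assignment used here (upper 5 bits for the row, lower 2 bits for the word) is an assumption, and the function name `split_address` is hypothetical.

```python
def split_address(addr: int) -> tuple[int, int]:
    """Split a 7-bit word address into (row, word_select).

    Assumption (not stated in the source): the upper 5 bits select the
    row and the lower 2 bits select one of the four words in that row.
    """
    if not 0 <= addr < 128:
        raise ValueError("address must fit in 7 bits (0..127)")
    row = addr >> 2             # 5 bits: one of 32 word lines
    word_select = addr & 0b11   # 2 bits: one of 4 words per row
    return row, word_select


if __name__ == "__main__":
    for a in (0, 5, 127):
        print(a, split_address(a))
```

Whatever the actual bit assignment, the counts must agree with the array: 32 rows times 4 words per row gives the 128 addressable words.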
  • 5. Component Design
    SRAM Cell
    The layout and schematic of the designed SRAM cell are illustrated in Figures 6 and 7.
    Figure 6: Memory Cell Layout
  • 6. Figure 7: Memory Cell Schematic
    Since there are usually millions of bits to be stored in these memories, all the transistors are minimum size (0.28 µm here) in order to achieve the minimum area.
    Width = 2.49 µm, Length = 2.38 µm => Aspect ratio = 2.49/2.38 = 1.04
    Hence, the area per memory cell is 2.49 * 2.38 = 5.92 µm^2
    Precharge Circuit
    In both read and write operations, the bitlines are initially pulled up to a high voltage. This is done using a precharge circuit. The schematic of the circuit is shown in Figure 9. A clock input is applied to the two pull-up transistors, called the balance transistors, connected between the two bitlines. When the wordline (WL) signal goes high, one bitline remains high and the other falls until WL goes low. The layout of the precharge circuit is shown in Figure 8.
  • 7. Figure 8: Layout of the Precharge Circuit
    Figure 9: Schematic of the Precharge Circuit
  • 8. Clock Driver Circuit
    Since we have used a clocked precharge circuit to charge the bitlines, the clock buffer circuit must be sized as well. All calculations are based on the fact that the clock drives 2 PFETs between every BL and BLbar pair, that is, a total of 2*40 PFETs.
    Cpoly = 2 fF/µm * 2 µm * 2 * 40 = 320 fF
    Cwire = 0.2 fF/µm * Width of the memory cell * Number of columns = 0.2 fF/µm * 2.38 µm * 40 = 19.04 fF
    CLoad = (Cpoly + Cwire)/2 = 339.04/2 = 169.52 fF
    F = GBH = 169.52
    Number of stages, N = log(169.52)/log(3.6) = 4 stages
    f = F^(1/N) = 3.6
    Hence the circuit is as below.
    Figure 10: Clock Driver Circuit
    The sizing equation is Cin = Cout/f, applied from the load (Cload) back to the input:
    4: 169.52/3.6 = 47.08 => Wp = 31.38 µm, Wn = 15.69 µm
    3: 47.08/3.6 = 13.08 => Wp = 8.72 µm, Wn = 4.36 µm
    2: 13.08/3.6 = 3.633 => Wp = 2.42 µm, Wn = 1.21 µm
    1: 3.633/3.6 = 1.00 => Wp = 0.66 µm, Wn = 0.33 µm
    The schematic and layout of the clock driver circuit are shown below.
  • 9. Figure 11: Layout and Schematic of Clock Driver
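The clock driver sizing above follows the usual logical-effort recipe for an inverter chain: pick the number of stages from the path effort, take the N-th root as the stage effort, then divide the load back through the chain. The sketch below reproduces that arithmetic. The function names are hypothetical, and the 2:1 PMOS/NMOS split is inferred from the widths listed in the report (for example 31.38 µm and 15.69 µm for an input capacitance of 47.08) rather than stated in it.

```python
import math


def size_inverter_chain(path_effort, target_stage_effort=3.6):
    """Return (n_stages, stage_effort, input_caps) for an inverter chain.

    input_caps is ordered from the stage nearest the load back to the
    first stage, in the same normalized units as path_effort.
    """
    n_stages = max(1, round(math.log(path_effort) / math.log(target_stage_effort)))
    stage_effort = path_effort ** (1.0 / n_stages)
    input_caps = []
    c_out = path_effort
    for _ in range(n_stages):
        c_in = c_out / stage_effort  # Cin = Cout / f
        input_caps.append(c_in)
        c_out = c_in
    return n_stages, stage_effort, input_caps


def split_widths(c_in):
    """Split a normalized input capacitance into (Wp, Wn) in micrometres.

    Assumption: a 2:1 PMOS/NMOS ratio, as implied by the listed widths.
    """
    return c_in * 2.0 / 3.0, c_in / 3.0


if __name__ == "__main__":
    n, f, caps = size_inverter_chain(169.52)
    print("stages:", n, "stage effort:", round(f, 3))
    for stage, c in zip(range(n, 0, -1), caps):
        wp, wn = split_widths(c)
        print(f"stage {stage}: Cin={c:.2f} Wp={wp:.2f}um Wn={wn:.2f}um")
```

The exact fourth root of 169.52 is slightly above 3.6, so the computed capacitances differ from the listed ones in the second decimal place; the report rounds the stage effort to 3.6 before dividing.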
  • 10. Sense Amplifier
    The designed SRAM uses ten identical sense amplifiers to output ten data bits simultaneously. We have used a current mode, differential input, single ended sense amplifier in order to attenuate common mode noise and amplify differential mode signals. The main reason for using this type of sense amplifier is to improve the noise immunity and speed of the read circuit. The differential signal that develops between the two bit lines during a read operation is amplified by the differential pair of the sense amplifier. The transistors are sized such that the differential voltage is amplified suitably for the read operation. The output of the sense amplifier is then passed through a pair of inverters to obtain a digital output. The inverted write (WR) signal is applied to the gate of the current source transistor so that the sense amplifier is enabled only during a read operation. The schematic and layout of the sense amplifier are shown in the figures below.
    Figure 10: Sense Amplifier Layout
  • 11. Figure 11: Sense Amplifier Schematic
    Row Decoder
    The access time and power consumption of a memory may be largely determined by the decoder design. A row decoder takes an n-bit address and produces 2^n outputs, and is used to select the required row in the memory array. The required wordline is activated based on the address given to the decoder. In our design we have 32 rows, hence n = 5 address bits are used to select a row. Since the row decoder activates one of the 2^5 wordlines, it has to be sized using logical effort based on the capacitance of the wordline. The gate level schematic of one stage of the row decoder is shown in Figure 12.
    Figure 12: Row Decoder Circuit
  • 12. The gates are sized as below:
    Cpoly = 2 fF/µm * 2 µm * 0.28 * 40 = 44.8 fF
    Cwire = 0.2 fF/µm * 2.38 µm * 40 = 19.04 fF
    CLoad = (Cpoly + Cwire)/2 = 63.84/2 = 31.92 fF = H
    B = 16, G = 5/3 * 5/3 = 25/9
    F = GBH = 1418.67
    Number of stages, N = log(1418.67)/log(3.6) = 5.66 = 6 stages
    f = F^(1/N) = 2.82 (this value corresponds to N = 7, the number of stages sized below)
    1: 31.92/2.82 = 11.32 => Wp = 7.55 µm, Wn = 3.78 µm
    2: 11.32/2.82 = 4.02 => Wp = 2.68 µm, Wn = 1.34 µm
    3: (5 * 4.02)/(3 * 2.82) = 2.38 => Wp = 0.48 µm, Wn = 1.9 µm
    4: (5 * 2.38)/(3 * 2.82) = 1.4 => Wp = 0.56 µm, Wn = 0.84 µm
    5: (16 * 1.4)/2.82 = 7.95 => Wp = 5.3 µm, Wn = 2.65 µm
    6: 7.95/2.82 = 2.82 => Wp = 1.88 µm, Wn = 0.94 µm
    7: 2.82/2.82 = 1.00 => Wp = 0.66 µm, Wn = 0.33 µm
    Two more inverters are used to get the non-inverted input:
    8: 1.68 => Wp = 1.12 µm, Wn = 0.56 µm
    9: 1.00 => Wp = 0.66 µm, Wn = 0.33 µm
    10: 15.89 => Wp = 10.6 µm, Wn = 5.3 µm
    11: 15.89/2.82 = 5.64 => Wp = 3.76 µm, Wn = 1.88 µm
    12: 5.64/2.82 = 2 => Wp = 1.34 µm, Wn = 0.67 µm
    13: 2/2.82 = 0.71 => The transistor widths are below the minimum. Hence, Wp = 0.66 µm, Wn = 0.33 µm are chosen.
  • 13. Figure 13: Layout and Schematic of Row Decoder
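For the row decoder, the path effort is the product of the logical effort G, the branching effort B and the electrical effort H. The sketch below recomputes F from the values given in the report (two gates with logical effort 5/3, B = 16, H = 31.92) and shows how the stage effort depends on the chosen number of stages. The helper names are hypothetical.

```python
import math
from fractions import Fraction


def path_effort(logical_efforts, branching, electrical_effort):
    """F = G * B * H, where G is the product of the per-gate logical efforts."""
    g_total = Fraction(1)
    for g in logical_efforts:
        g_total *= g
    return float(g_total) * branching * electrical_effort


def stage_effort(total_effort, n_stages):
    """f = F^(1/N) for a path of n_stages stages."""
    return total_effort ** (1.0 / n_stages)


# Values taken from the row decoder sizing in the report.
F = path_effort([Fraction(5, 3), Fraction(5, 3)], branching=16, electrical_effort=31.92)
n_estimate = math.log(F) / math.log(3.6)

if __name__ == "__main__":
    print(f"F = {F:.2f}, estimated N = {n_estimate:.2f}")
    for n in (6, 7):
        print(f"N = {n}: f = {stage_effort(F, n):.2f}")
```

Running this shows that the listed stage effort of 2.82 is the seventh root of F, which agrees with the seven stages sized in the list, while six stages would give a stage effort of about 3.35.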
  • 14. Column Decoder
    After precharging all the bitlines to a high voltage, the next step is to select the columns of the memory cell array that will be involved in the read or write operation. This column selection is performed using a decoder/multiplexer combination. The m-bit column address is used to select one or more of the 2^m columns. In our case, the array is arranged such that four words are placed in a row, with all the first bits of the words together, then all the second bits, and so on. The column decoder therefore selects one bit from among 4 bits; hence, 2 address bits are used to select a column. The transistors are sized based on the bitline capacitances. The column decoder is sized as below.
    Figure 15: Column Decoder
    Cpoly = 2 fF/µm * Wn of the Transmission Gate * Number of gates = 2 fF/µm * 1 µm * 20 = 40 fF
    Cwire = 0.2 fF/µm * Width of the memory cell * Number of columns = 0.2 fF/µm * 2.38 µm * 40 = 19.04 fF
    CLoad = (Cpoly + Cwire)/2 = 59.04/2 = 29.52 fF = H
    F = GBH = 4/3 * 29.52 * 4 = 157.44
    Number of stages, N = log(157.44)/log(3.6) = 4 stages
    f = F^(1/N) = 3.54
    The circuit is modified as shown below to get the required number of stages.
  • 15. Figure 16: One input-output block of the column decoder with the required number of stages
    The gates were sized as follows:
    7&9: 29.52/3.54 = 8.33 => Wp = 5.55 µm, Wn = 2.77 µm
    6: 4.42 => Wp = 2.94 µm, Wn = 1.47 µm
    5&8: 8.33/3.54 = 2.35 => Wp = 1.56 µm, Wn = 0.78 µm
    4: 1.77 => Wp = 0.88 µm, Wn = 0.88 µm
    1&2: 1 => Wp = 0.66 µm, Wn = 0.33 µm
    3: 1.88 => Wp = 1.25 µm, Wn = 0.62 µm
    The layout and schematic of the column decoder are shown below.
  • 16. Figure 17: Layout and Schematic of the Column Decoder
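The column arrangement described for the column decoder (four words per row, with the same bit position of all four words placed side by side) can be illustrated with a small index mapping. The ordering of the four words inside each group of four columns is an assumption, and the names `physical_column` and `columns_for_word` are hypothetical.

```python
WORDS_PER_ROW = 4    # selected by the 2-bit column address
BITS_PER_WORD = 10   # data0..data9


def physical_column(bit: int, word_select: int) -> int:
    """Column index (0..39) holding `bit` of word `word_select` in a row.

    Assumption: bit 0 of words 0..3 occupies columns 0..3, bit 1 of
    words 0..3 occupies columns 4..7, and so on.
    """
    if not 0 <= bit < BITS_PER_WORD:
        raise ValueError("bit index out of range")
    if not 0 <= word_select < WORDS_PER_ROW:
        raise ValueError("word select out of range")
    return bit * WORDS_PER_ROW + word_select


def columns_for_word(word_select: int) -> list[int]:
    """All ten columns that one word occupies, one per 4:1 column multiplexer."""
    return [physical_column(b, word_select) for b in range(BITS_PER_WORD)]


if __name__ == "__main__":
    for w in range(WORDS_PER_ROW):
        print(w, columns_for_word(w))
```

With this interleaving, each of the ten data bits has its own 4:1 multiplexer feeding one sense amplifier and one write driver, which matches the ten sense amplifiers used in the design.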
  • 17. Write Driver
    During precharge, both the BL and BLbar lines are charged to VDD. For a write operation, one of the bitlines must then be driven high and the other low, based on the data bit that is being written. The schematic of the write circuitry used in our design is shown in Figure 19. During a write operation, the WR signal goes high and the 10 bit data word can be written by applying the required values to the corresponding input bits. These values are then passed through a set of pass transistors attached to the BL and BLbar lines so that each data bit is written into the corresponding memory cell.
    Figure 18: Write Driver Layout
  • 18. Figure 19: Write Driver Schematic
    Write Enable (WR) Driver
    Since the WR signal drives two NFETs in each column, a total of 20 NFETs are driven by WR. In addition, its complement is applied to the 20 PFETs of the write driver. Hence, the buffer circuit for WR must be sized so that it can drive the required load. The transistor sizing is given below. The layout and schematic of the buffer circuit are shown in Figures 22 and 23 respectively.
    Figure 20: WR Driver
  • 19. Load to WRbar is
    Cpoly = (2 fF/µm * 4.22 µm * 20) + (2 fF/µm * 2 * 10) = 208.8 fF
    Cwire = 0.2 fF/µm * 2.38 µm * 40 = 19.04 fF
    CLoad = (Cpoly + Cwire)/2 = (208.8 + 19.04)/2 = 113.92 fF = H
    F = GBH = 113.92
    Number of stages, N = log(113.92)/log(3.6) = 3.69 stages
    In order to get the inversion, 5 stages were chosen.
    f = F^(1/N) = 2.57
    Load to WR is
    Cpoly = (2 fF/µm * 0.48 µm * 20) + (2 fF/µm * 2 * 10) = 59.2 fF
    Cwire = 0.2 fF/µm * 2.38 µm * 40 = 19.04 fF
    CLoad = (Cpoly + Cwire)/2 = (59.2 + 19.04)/2 = 39.12 fF = H
    F = GBH = 39.12
    Number of stages, N = log(39.12)/log(3.6) = 2.86 = 4 stages
    f = F^(1/N) = 2.77
    The circuit is modified as shown below to get the required number of stages.
    Figure 21: Modified WR Driver
    The gates were sized as follows:
    1: 113.92/2.57 = 44.32 => Wp = 29.6 µm, Wn = 14.8 µm
    2: 44.32/2.57 = 17.24 => Wp = 11.5 µm, Wn = 5.74 µm
    3: 17.24/2.57 = 6.7 => Wp = 4.45 µm, Wn = 2.23 µm
    4: 6.7/2.57 = 2.607 => Wp = 1.73 µm, Wn = 0.86 µm
  • 20. 5: 2.607/2.57 = 1.01 => Wp = 0.66 µm, Wn = 0.33 µm
    6: 21.34 => Wp = 14.2 µm, Wn = 7.11 µm
    7: 21.34/2.77 = 7.703 => Wp = 5.13 µm, Wn = 2.56 µm
    8: 7.703/2.77 = 2.78 => Wp = 1.85 µm, Wn = 0.92 µm
    9: 2.78/2.77 = 1.003 => Wp = 0.66 µm, Wn = 0.33 µm
    Figure 22: Layout of WR Driver
    Figure 23: Schematic of WR Driver
  • 21. Data Buffer
    The data is applied to the bitlines through two tristate inverters. The data buffer has to be sized appropriately to drive these tristate inverters. The sizing is shown below. The layout and schematic are shown in Figures 25 and 26.
    Figure 24: Data Buffer Circuit
    Cpoly = (2 fF/µm * 4.22 µm) + (2 fF/µm * 0.48 µm) = 9.4 fF
    Cwire = 0.2 fF/µm * 2.38 µm * 40 = 19.04 fF
    CLoad = (Cpoly + Cwire)/2 = (9.4 + 19.04)/2 = 14.22 fF = H
    F = GBH = 14.22
    Number of stages, N = log(14.22)/log(3.6) = 2.07 = 3 stages in order to invert.
    f = F^(1/N) = 2.42
    3: 14.22/2.42 = 5.87 => Wp = 3.2 µm, Wn = 1.96 µm
    2: 5.87/2.42 = 2.42 => Wp = 1.61 µm, Wn = 0.8 µm
    1: 2.42/2.42 = 1.00 => Wp = 0.66 µm, Wn = 0.33 µm
  • 22. Figure 25: Data Driver Layout
    Figure 26: Data Driver Schematic
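For the WR driver and the data buffer, the number of stages is not just the logarithmic estimate rounded to the nearest integer: the report picks 5 stages for WRbar and 3 for the data buffer to obtain an inversion, and 4 stages for WR. One reading of that choice, which is an interpretation rather than a rule stated in the report, is to round the estimate up and then add a stage if the parity does not give the required polarity. The function name is hypothetical.

```python
import math


def stages_with_polarity(path_effort, inverting, target_stage_effort=3.6):
    """Smallest stage count at or above the estimate with the required polarity.

    An odd number of inverters inverts the signal; an even number does not.
    Assumption: the estimate is rounded up before the parity check.
    """
    estimate = math.log(path_effort) / math.log(target_stage_effort)
    n = max(1, math.ceil(estimate))
    if (n % 2 == 1) != inverting:
        n += 1  # add one stage to fix the polarity
    return n


if __name__ == "__main__":
    print("WRbar path :", stages_with_polarity(113.92, inverting=True))
    print("WR path    :", stages_with_polarity(39.12, inverting=False))
    print("data buffer:", stages_with_polarity(14.22, inverting=True))
```

Adding a stage lowers the stage effort below the target of 3.6, which is why these paths end up with stage efforts between 2.4 and 2.8.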
  • 23. Transmission Gate
    In our design, we have used transmission gates to select the columns. The transmission gates are also sized for optimal speed. The layout and schematic of the transmission gate are shown in Figures 27 and 28 respectively.
    Figure 27: Transmission Gate Layout
    Figure 28: Transmission Gate Schematic
  • 24. Complete Schematic and Layout
    Once all the peripheral circuits are designed, the units are integrated with the memory cell array. The complete SRAM schematic, including the precharge circuit, clock buffer, row decoders, column decoders, sense amplifiers and write circuit, is shown in Figure 29. The corresponding layout of the design, along with the rulers, is given in Figure 30.
    Figure 29: Complete Schematic
  • 25. Figure 30: Complete Layout
    The total area of the design is 107.84 µm * 114.85 µm = 12385.42 µm^2
    Therefore, the total area that accounts for one bit is given by
    Area/bit = 12385.42 / 1028 = 12.048 µm^2
    (With the full 128 * 10 = 1280 cells as the divisor, the same area gives 12385.42 / 1280 = 9.68 µm^2 per bit.)
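The area figures can be rechecked with a few lines of arithmetic. The sketch below uses only the dimensions and word counts given in the report; the variable names are hypothetical. It prints the area per bit for both divisors: 1028, as used in the report, and 1280, the cell count that follows from 128 words of 10 bits.

```python
# Dimensions and counts taken from the report.
LAYOUT_WIDTH_UM = 107.84
LAYOUT_HEIGHT_UM = 114.85
WORDS = 128
BITS_PER_WORD = 10

total_area_um2 = LAYOUT_WIDTH_UM * LAYOUT_HEIGHT_UM
cell_count = WORDS * BITS_PER_WORD

area_per_bit_report = total_area_um2 / 1028       # divisor used in the report
area_per_bit_cells = total_area_um2 / cell_count  # divisor from 128 x 10 cells

if __name__ == "__main__":
    print(f"total area          : {total_area_um2:.2f} um^2")
    print(f"area/bit (1028)     : {area_per_bit_report:.3f} um^2")
    print(f"area/bit ({cell_count} cells): {area_per_bit_cells:.3f} um^2")
```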
  • 26. DRC and LVS Reports
    The designed layout was checked in Cadence for design rule errors. There were no DRC errors in the layout. A snapshot of the DRC report is shown in Figure 31. The layout was then compared against the schematic (LVS). LVS matched, and the report is shown in Figure 32.
    Figure 31: DRC Report for complete SRAM layout
  • 27. Figure 32: LVS Report of the complete SRAM layout
  • 28. Simulation and Results
    The functionality of the SRAM is tested by writing the 10 bit data word 0110001010 into the first row and first column of each super column of the design and then reading the written value back in the next clock cycle. When clk is low, all the bitlines are precharged to VDD. During evaluation, the write enable (WR) signal is activated. During this phase, the write operation takes place and the word bits are written into the corresponding memory cells, depending on the row and column address.
    Figure 33: SRAM Simulation Result
  • 29. The worst case write time is found by reducing the width of the WR pulse until the write no longer works properly. The smallest WR pulse width at which the data is still written correctly is the worst case write time. The read delay is measured as the 50%-to-50% delay between the addr_en signal and the data bits being read. The simulated waveforms are shown below.
    Figure 34: Worst case time simulation
    The operating frequency is calculated as shown below:
    Operating frequency = 1/(2 * worst case time) = 1/(2 * 815 ps) = 613.5 MHz
    The noise margin is obtained by drawing the overlapped voltage transfer characteristics (the butterfly diagram) of the cross-coupled inverters that form the memory cell. The largest square that fits in the eyes of the butterfly diagram determines the noise margin. The butterfly diagram obtained is shown below.
  • 30. Figure 35: Noise Margin
    Conclusion

    Parameter              Value
    Aspect ratio           1.065
    Worst case write time  815 ps
    Worst case read time   714 ps
    Operating frequency    613.5 MHz
    Noise margin           0.35 V

    The SRAM has a comparatively low operating frequency, but that is a trade-off for the low area/bit that we have tried to achieve. The memory cell has a good noise margin and good control over read and write operations.
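The operating frequency in the table is consistent with a clock period of twice the worst case write time, which fits the two-phase clocking described earlier (precharge while clk is low, evaluate while clk is high). The sketch below assumes that each phase must be at least as long as the slowest operation; the function name is hypothetical.

```python
def operating_frequency_hz(worst_case_time_s: float) -> float:
    """Clock frequency when each clock phase lasts one worst-case access time.

    Assumption: the precharge and evaluate phases are equally long, so the
    clock period is twice the worst-case time.
    """
    if worst_case_time_s <= 0:
        raise ValueError("worst-case time must be positive")
    return 1.0 / (2.0 * worst_case_time_s)


WRITE_TIME_S = 815e-12  # worst case write time from the table
READ_TIME_S = 714e-12   # worst case read time from the table

if __name__ == "__main__":
    limit = max(WRITE_TIME_S, READ_TIME_S)  # the slower operation sets the clock
    print(f"{operating_frequency_hz(limit) / 1e6:.1f} MHz")
```

Because the write time (815 ps) is longer than the read time (714 ps), the write operation sets the clock period.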