
Sram Implementation

Date post: 06-Apr-2018
Author: gattu-sreeja

  • 8/3/2019 Sram Implementation





    Very-large-scale integration (VLSI) is the process of creating integrated circuits by

    combining thousands of transistors into a single chip. VLSI began in the 1970s when

    complex semiconductor and communication technologies were being developed.

    The first semiconductor chips held two transistors. Subsequent advances added more

    and more transistors, and, as a consequence, more individual functions or systems were

    integrated over time. The first integrated circuits held only a few devices, perhaps as many as

    ten diodes, transistors, resistors and capacitors, making it possible to fabricate one or

    more logic gates on a single device. Now known retrospectively as small-scale

    integration (SSI), improvements in technique led to devices with hundreds of logic gates,

    known as medium-scale integration (MSI). Further improvements led to large-scale

    integration (LSI), i.e. systems with at least a thousand logic gates. Current technology has

    moved far past this mark and today's microprocessors have many millions of gates and billions of individual transistors.

    At one time, there was an effort to name and calibrate various levels of large-scale

    integration above VLSI. Terms like ultra-large-scale integration (ULSI) were used. But the

    huge number of gates and transistors available on common devices has rendered such fine

    distinctions moot. Terms suggesting greater than VLSI levels of integration are no longer in

    widespread use.

    As of early 2008, billion-transistor processors are commercially available. This is expected to become more commonplace as semiconductor fabrication moves from the current

    generation of 65 nm processes to the next 45 nm generations. Current designs, as opposed to

    the earliest devices, use extensive design automation and automated logic synthesis to lay

    out the transistors, enabling higher levels of complexity in the resulting logic functionality.

    Certain high-performance logic blocks like the SRAM (Static Random Access Memory) cell,

    however, are still designed by hand to ensure the highest efficiency.





    The system prototyping methodology is a natural outgrowth of recent developments in

    software and hardware facilities intended to make it simple for designers with an idea for a

    particular application to turn that idea into a working system based on very-large-scale

    integrated chips. Today, VLSI CMOS technologies deliver individual integrated circuits

    (ICs) containing millions of gates, sufficient to implement substantial systems-on-chip or

    major subsystems-on-a-chip. System-on-chip design may involve expertise from many

    fields of electronics, such as signal processing, communication and device physics.

    It is unreasonable to expect the architect of a speech recognition system, for example, to be

    an expert in device physics as well as in signal processing. The Mead Conway methodology

    for integrated-circuit design makes VLSI technology available to such an application


    The structured design methodology of Mead and Conway is an approach to VLSI system

    design that attacks the problems of complex chip designs. The structured design

    methodology is similar in concept to structured programming: the design proceeds in a top-

    down manner in which the problem is decomposed and refined. The structured design

    methodology has two major parts: hierarchy and regularity. Hierarchical techniques have

    long been used to design complex systems. Hierarchies are used to partition designs and

    common parts of a design can be factored out and specified only once. By introducing

    regularity into a system, the design problem is reduced in complexity as subunits are

    replicated many times and connections between units are simplified.

    Regularity means that the hierarchical decomposition of a large system should result in

    not only simple, but also similar blocks, as much as possible. A good example of regularity

    is the design of array structures consisting of identical cells such as a parallel multiplication

    array. Regularity can exist at all levels of abstraction: If the designer has a small library of

    well-defined and well-characterized basic building blocks, a number of different functions

    can be constructed by using this principle. Regularity usually reduces the number of different

    modules that need to be designed and verified, at all levels of abstraction.


    VLSI design style mainly uses three domains of design description, viz. the

    behavioral, the description of the function of the design; the structural, the description of the




    form of the implementation; and the physical, the description of the physical implementation

    of the design. There are many possible representations of a circuit in each description, and

    judicious choice of representations is important in tool design.

    A simplified view of design flow is shown in Fig. 1. Regardless of the actual size of

    the project, the basic principles of structured design will improve the prospects of success.



    Fig 1.1: VLSI design flow (representation levels shown: logic (gate level), circuit, layout, followed by fabrication and testing)

    At the beginning of a design it is important to specify the system requirements

    without unduly restricting the design. The object is to describe the purpose of the design

    including all aspects, such as the functions to be realized, timing constraints, and power

    dissipation requirements, etc.


    Stages shown in Fig 1.1: System Specification, Functional (Architecture) Design, Logic Design, Logic Verification, Circuit Design, Circuit Verification, Physical Design, Layout Verification, Circuit Modeling



    Functional design specifies the functional relationships among subunits or registers.

    In general, a description of the IC in either the functional or the block diagram domain

    consists both of the input-output description, and the way that this behavior is to be realized

    in terms of subordinate modules. In turn each of these modules is described both in terms of

    input-output behaviors and as an interconnection of other modules. These hierarchical ideas

    apply to all the domains. The internal description of a module may be given in another

    domain. If a module has no internal description then the design is incomplete. Ultimately

    this hierarchy stops when the internal description is in terms of mask geometry, which is

    primitive. Hierarchy and modularity are used in block diagrams or computer programs. In

    these domains hierarchy suppresses unnecessary details, simplifies system design through a

    divide-and-conquer strategy and leads to more easily understood designs that are more

    readily debugged and documented.

    In summary, designing a digital system starts with specifying

    the system performance, which is called the system specification. The system

    must then be broken down into subunits or registers, giving a functional design which

    specifies the functional relationships among subunits or registers. Architecture usually

    means the functional design and system specification, and often includes part of the subsequent

    logic design.

    The next step is the logic design of the networks which constitute subunits or registers.

    When a system architecture or logic networks are designed, performance and errors are

    checked by CAD programs, a step called logic simulation. The subject of logic design is

    to decide the overall structure of blocks and their interconnection pattern, to specify the structure of

    the data path and to control sequences of the data path. The logic simulator performs logic verification

    considering the propagation delays of interconnection signals and the element delays.

    The simulator also checks whether the network contains hazards. Logic design and

    simulation is a key issue in VLSI CAD. The flow of the logic design process is determined by

    the level at which the design can begin: system level, behavioral level or functional level.

    Logic design consists of a series of design steps leading from a higher level to a circuit

    description at the logic level.

    For electronic circuit design and simulation, CAD programs perform complex

    numerical analysis calculations of nonlinear differential equations which characterize

    electronic circuits. Since we need to finish calculation within a reasonable time limit,

    keeping the required accuracy, many advanced numerical analysis techniques are used. The




    CAD programs usually yield the analysis of transient behavior, direct-current performance,

    stationary alternating-current performance, temperature, signal distortion, noise interference,

    sensitivity and parameter optimization of the electronic circuits.

    The Layout system is used to convert block/cell placement data into actual locations,

    and to construct a routing maze containing all spacing rules. The format used for relative cell

    placement data is the same for automatic as for manual placements in order to simplify their

    interchange. In fact, the output of the automatic placement program can be modified by hand

    before input into the chip building step as manual placement data.

    The layout for random-logic networks is the most time-consuming stage throughout

    the entire sequence of LSI/VLSI chip design. After having finished the layout, designers

    usually check by CAD programs whether the layout conforms to the layout rules. As the

    integration size of LSI/VLSI chips becomes larger, design verification and testing at each

    design stage is vitally important, because any errors which slip through from earlier design

    stages become more difficult to find and more expensive to fix: once they are found, the

    earlier design stages must be redone. As the integration size increases, the test time increases very rapidly,

    so it is crucial to find a good way to test within as short a time as possible, though it appears

    very difficult to find good solutions. Complete test and design verification with software or

    hardware (i.e., computers specialized in testing) is usually done to find a design mistake.

    The last domain in which the design of an IC can exist includes the mask set and, of

    course, the final fabricated chip followed by prototype testing.




    MEMORY refers to the physical devices used to store programs or data on a

    temporary or permanent basis for use in a computer or other digital electronic device.

    The term primary memory is used for information held in physical systems which are

    fast (e.g. RAM), as a distinction from secondary memory, which refers to physical devices




    for program and data storage which are slow to access but offer higher memory capacity.

    Primary memory stored on secondary memory is called "virtual memory".

    The term "storage" is often used for traditional secondary

    memory such as tape, magnetic disks and optical discs (CD-ROM and DVD-ROM). The term "memory" is often associated with addressable semiconductor memory, i.e. integrated

    circuits consisting of silicon-based transistors, used for example as primary memory but also

    other purposes in computers and other digital electronic devices.

    There are two main types of semiconductor memory: volatile and non-volatile.

    Examples of non-volatile memory are flash memory and ROM, PROM, EPROM,

    EEPROM memory. Examples of volatile memory are primary memory (dynamic RAM), and

    fast CPU cache memory (static RAM), which is fast but energy-consuming and offers lower

    memory capacity per unit area than DRAM.

    The semiconductor memory is organized into memory cells or bistable flip-flops, each

    storing one binary bit (0 or 1). The memory cells are grouped into words of fixed word length,

    for example 1, 2, 4, 8, 16, 32, 64 or 128 bits. Each word can be accessed by a binary address

    of N bits, making it possible to store 2 raised to the power N words in the memory. This implies

    that processor registers normally are not considered as memory, since they only store one

    word and do not include an addressing mechanism.


    SRAM operates in three modes. They are:

    1. Standby mode or idle mode

    2. Read mode and

    3. Write mode.

    In idle mode, the SRAM is disabled. In read mode the data is read from a selected address location. In write mode the data is written to a particular address location.
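
    The addressing scheme and the three operating modes described above can be sketched as a small behavioural model. This is only an illustration in Python, not a circuit-level description; the class name `SramModel`, its parameters and the `chip_enable` flag are hypothetical names chosen for this example.

```python
class SramModel:
    """Behavioural sketch of an SRAM array (hypothetical names, not from the source)."""

    def __init__(self, address_bits: int, word_bits: int) -> None:
        self.address_bits = address_bits
        self.word_bits = word_bits
        # An N-bit address selects one of 2**N words.
        self.words = [0] * (2 ** address_bits)
        self.chip_enable = False  # False models standby (idle) mode

    def _check(self, address: int) -> None:
        if not self.chip_enable:
            raise RuntimeError("SRAM is in standby mode")
        if not 0 <= address < len(self.words):
            raise ValueError("address out of range")

    def read(self, address: int) -> int:
        """Read mode: return the word stored at the selected address."""
        self._check(address)
        return self.words[address]

    def write(self, address: int, value: int) -> None:
        """Write mode: store a word at the selected address."""
        self._check(address)
        if not 0 <= value < 2 ** self.word_bits:
            raise ValueError("value does not fit in one word")
        self.words[address] = value


sram = SramModel(address_bits=4, word_bits=8)  # 16 words of 8 bits
sram.chip_enable = True
sram.write(0b1010, 0x5A)
print(len(sram.words), hex(sram.read(0b1010)))  # prints: 16 0x5a
```

    With `chip_enable` left at `False`, both `read` and `write` raise an error, which mirrors the disabled state of the idle mode.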


    SRAM is used in many embedded applications.

    Many categories of industrial and scientific subsystems and automotive electronics use

    static RAM. Some amount (kilobytes or less) is also embedded in practically all

    modern appliances, toys, etc., that implement an electronic user interface.

    Several megabytes may be used in complex products such as digital cameras, cell phones,

    synthesizers etc.







    This is one of the most common approaches to reduce leakage currents, where two

    different types of transistors are fabricated on the chip: a high-Vth type to lower sub-threshold

    leakage current and a low-Vth type to preserve switching speed. Based on the multi-threshold technologies previously described, several

    multiple-threshold circuit design techniques have been developed.

    Multi-threshold voltage CMOS (MTCMOS) reduces the leakage by inserting high-threshold

    devices in series with low-Vth circuitry. Fig 3.1(a) shows the schematic of an MTCMOS circuit. A sleep control scheme is introduced for efficient power management. In the active mode,

    Sleep is set low and the high-Vth sleep control transistors (MP and MN) are turned on. Since their

    on-resistances are small, the virtual supply voltages (Virtual Vdd and Virtual GND) almost

    function as real power lines. In the standby mode, Sleep is set high, MN and MP are turned

    off, and the leakage current is low. In fact, only one type of high-Vth transistor is enough for

    leakage control. Fig 3.1(b) and (c) show the PMOS insertion and NMOS insertion schemes,

    respectively. The NMOS insertion scheme is preferable, since the NMOS on-resistance is

    smaller at the same width. NMOS can be sized smaller than corresponding PMOS.

    MTCMOS can be easily implemented based on existing circuits.




    Fig 3.1 MTCMOS Technique

    In the active mode, the sleep control signal SL is set low and the sleep control transistors

    are turned on. Since the on-resistance of the high-Vt sleep transistors is low, the virtual VDD and VSS lines act like power

    supply lines. In the standby mode, SL is set high and the high-threshold sleep control transistors

    are turned off, resulting in low leakage current.

    Short channel transistors require lower power supply levels to reduce power

    consumption. This forces a reduction in the threshold voltage that causes a substantial

    increase of weak inversion current. Among the leakage control techniques proposed so

    far, power gating, also known as MTCMOS, has traditionally been the most effective

    way to lower the leakage. Power gating uses a PMOS transistor or an NMOS transistor to

    disconnect the circuit's supply voltage from the logic when the logic is inactive. This

    technique can reduce leakage by more than two orders of magnitude with negligible speed

    degradation.


    MTCMOS power gating works to reduce leakage currents by disconnecting the power

    supply from specific portions of a circuit when those portions are not needed.




    Multi-threshold CMOS (MTCMOS) has been described as a method to reduce

    standby leakage current in the circuit, with the use of a high threshold MOS device to de-

    couple the logic from the supply or ground during long idle periods, or sleep states.

    In an MTCMOS circuit, the logic block is constructed using low-threshold devices, and

    either the power supply is gated by a high-threshold header switch or the ground

    terminal is gated by a high-threshold footer switch.

    During active operation of the MTCMOS circuit described by Fig 3.2, the power

    interrupt switch is turned on by the SLEEPN (or SLEEPP) signal and current dissipated by

    the logic is drawn through the interrupt switch which causes a reduction in drive voltage seen

    by the logic, reducing logic performance. To compensate for the reduction in logic

    performance, larger power supply voltages can be used, at the expense of increased active

    power for similar performance. Larger device widths for the power interrupt switch can be

    used to minimize the performance impact, at the expense of increased area and power for

    entering and exiting sleep mode. Adjusting device implants to allow moderately

    high threshold values is another technique that can be used to increase performance at the

    expense of increased device leakage during idle mode.

    Fig 3.2: MTCMOS Logic

    The MTCMOS scheme is a good technique for reducing both gate and

    sub-threshold leakages, but it slows down circuits considerably as VDD is scaled below 0.6 V.





    A principal source of Igate arises from the tunneling of electrons through the gate

    oxide. The probability of electron tunneling is a strong function of the applied electric field

    and the barrier thickness itself, which is simply Tox, with a small change in Tox having a

    tremendous impact on Igate. For example, in MOS devices with SiO2 gate oxides, a

    difference in Tox of only 2 Å can result in an order of magnitude increase in Igate, so that

    reducing Tox from 18 Å to 12 Å increases Igate by approximately 1000×. The most

    effective way to control Igate is through the use of new materials, high-k dielectrics, but such

    materials are not expected to be manufacturable until approximately 2007 at the earliest. The

    issue of power dissipation due to gate leakage arises in two contexts. In the stand-by mode,

    when a circuit is not undergoing any active operations, leakage may be controlled through

    various means, prominent among which are the use of multiple-threshold CMOS sleep

    transistors, the assignment of circuit inputs to send the circuit into a low-leakage state, and

    body biasing. In the active mode, i.e., in normal operation, clearly, the use of neither sleep

    transistors nor state assignment is viable. Recent studies show that at the 90 nm node, leakage

    can contribute over 40% of the total power.
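
    The oxide-thickness example above is simple arithmetic: if every 2 Å of thinning costs one decade of gate leakage, then 6 Å of thinning costs three decades. A minimal sketch, assuming a purely exponential dependence on Tox (the function name and the `angstrom_per_decade` parameter are illustrative, not from the source):

```python
def gate_leakage_ratio(tox_from_angstrom: float, tox_to_angstrom: float,
                       angstrom_per_decade: float = 2.0) -> float:
    """Factor by which Igate grows when Tox shrinks, assuming one decade per 2 angstrom."""
    decades = (tox_from_angstrom - tox_to_angstrom) / angstrom_per_decade
    return 10.0 ** decades


ratio = gate_leakage_ratio(18.0, 12.0)
print(ratio)  # 1000.0
```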

    Leakage power in modern CMOS VLSI circuits has become a component comparable

    to dynamic power dissipation. Typically, the sub threshold leakage current dominates the

    device off-state leakage due to low Vth transistors employed in logic cell blocks in order to

    maintain the circuit switching speed in spite of decreasing VDD levels. The Multi-Threshold

    CMOS (MTCMOS) technique can significantly reduce the sub threshold leakage currents

    during the circuit sleep (standby) mode by adding high-Vth power switches (sleep transistors)

    to low-Vth logic cell blocks. This is because the stacked high-Vth sleep transistor connected to

    the bottom of the pull-down network of all logic cells in the circuit acts as a high-resistance

    element during the sleep mode, which limits the leakage current from the Vdd to the ground lines. At

    the same time, because of the stack effect, the sub-threshold leakage of the low-Vth transistors

    in the logic block itself goes down. Ideally, this leakage reduction is achieved with small

    performance degradation because, during the active mode of the circuit, the sleep transistor is

    fully on (i.e., it operates in the linear mode), and thus, all low-Vth logic cells in the MTCMOS

    logic block can switch very fast. Unfortunately, the situation is different in real designs. More

    precisely, during the active mode of the circuit operation, the high-Vth sleep transistor acts as

    a small linear resistance placed at the bottom of the transistor stack to ground, causing the

    propagation delay of the cells in the logic block to increase. In addition, the virtual ground

    network itself acts as a distributed RC network, which causes the voltage of the virtual ground node to rise even further, thereby degrading the switching speed of the logic cells




    even more. The former effect is a function of the size of the sleep transistor whereas the latter

    effect is a function of the physical distance of the logic cell from the sleep transistor.

    Fig 3.3: (a) MTCMOS circuit structure (b) The circuit model with the virtual ground

    interconnect and the sleep transistor modeled as resistors Ri and Rs, respectively

    Fig 3.3(a) depicts a logic block LB, in which a group of low-Vth logic cells are first

    connected to the virtual ground node and then through a high-Vth sleep transistor, S, to the

    actual ground, GND. Fig 3.3(b) models the virtual ground interconnection and the high-Vth

    sleep transistor, which behaves like a linear resistor in the active mode of the circuit

    operation, as resistors Ri and Rs, respectively. The virtual ground is at voltage Vx above the

    actual ground, i.e., Vx = I·(Rs + Ri), where I is the current flowing through the virtual ground

    sub-network and the sleep transistor. The voltage drop across Rs + Ri reduces the gate over-

    drive voltage of MTCMOS logic cells (i.e., their Vgs value) from Vdd to Vdd − Vx. An optimal

    algorithm for placing sleep transistors in a standard cell-based layout

    minimizes the performance degradation of MTCMOS circuits due to the interconnect

    resistance of the virtual ground network.
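
    The relation between the virtual ground voltage and the lost gate overdrive can be checked numerically. The sketch below simply evaluates Vx = I·(Rs + Ri) and Vdd − Vx; the function names and the current, resistance and supply values are assumptions chosen for illustration only.

```python
def virtual_ground_voltage(current_a: float, r_sleep_ohm: float,
                           r_interconnect_ohm: float) -> float:
    """Vx = I * (Rs + Ri): voltage of the virtual ground above real ground."""
    return current_a * (r_sleep_ohm + r_interconnect_ohm)


def gate_overdrive(vdd_v: float, vx_v: float) -> float:
    """Effective Vgs seen by the low-Vth logic cells: Vdd - Vx."""
    return vdd_v - vx_v


# Illustrative values only (assumed, not taken from the source).
vx = virtual_ground_voltage(current_a=2e-3, r_sleep_ohm=20.0, r_interconnect_ohm=5.0)
print(round(vx, 4), round(gate_overdrive(1.2, vx), 4))  # 0.05 1.15
```

    A cell placed farther from the sleep transistor sees a larger Ri, so its Vx is higher and its overdrive lower, which is why placement matters as well as sizing.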

    Technology scaling causes sub threshold leakage currents to increase exponentially.

    As technology scales into the deep-submicron (DSM) regime, standby sub threshold leakage

    power increases exponentially with the reduction of the supply voltage (VDD) and the

    threshold voltage (Vth). For many event driven applications, such as mobile devices where

    circuits spend most of their time in an idle state with no computation, standby leakage power

    is especially detrimental on overall power dissipation. Multi-Threshold CMOS (MTCMOS)

    is an effective circuit-level methodology that provides high performance in the active mode

    and saves leakage power during the standby mode. The basic principle of the MTCMOS

    technique is to use low Vth transistors to design the logic gates where the switching speed is

    essential, while the high Vth transistors are used to effectively isolate the logic gates in

    standby state and limit the leakage dissipation. In the active mode, the sleep transistor works

    as a resistor.




    A downside of using Multi-Threshold CMOS (MTCMOS) technique for leakage

    reduction is the energy consumption during transitions between sleep and active modes.

    Previously, a charge recycling (CR) MTCMOS architecture was proposed to reduce the large

    amount of energy consumption that occurs during the mode transitions in power gated

    circuits. Considering the RC parasitics of the virtual ground and VDD lines, proper sizing and

    placement of charge recycling transistors is key to achieving the maximum power saving.

    Power gating technique provides low leakage and high performance operation by

    using low Vt transistors for logic cells and high Vt devices as sleep transistors for

    disconnecting logic cells from power supply and/or ground. This Multi-threshold CMOS

    technology reduces the leakage in the sleep mode. One of the key concerns in MTCMOS is

    the wake up time latency of the circuit, which is defined as the time required to turn on the

    circuit after receiving the wake up signal. Reducing the wake up time latency is an important

    issue since it can affect the overall performance of the VLSI circuit. Another important issue

    in power gating is minimizing the energy wasted during mode transition, i.e., while switching

    from active to sleep mode and vice versa. Both virtual ground and virtual VDD nodes

    experience a voltage change during mode transition. Since there is a considerable number of

    cells connected to the virtual ground and virtual supply nodes, the total switching capacitance

    at these nodes is large, and as a result the switching power consumption during mode

    transition can be significant.
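
    To get a feel for why the transition energy matters, the energy drawn from the supply to recharge a virtual rail can be estimated from its total capacitance. This is a rough sketch, assuming the rail is modelled as a single lumped capacitor charged from the supply through a voltage swing ΔV (energy drawn = C·VDD·ΔV); the cell count and per-cell capacitance are made-up illustrative values.

```python
def transition_energy_j(total_cap_f: float, vdd_v: float, swing_v: float) -> float:
    """Energy drawn from the supply to raise a lumped capacitance by swing_v volts."""
    return total_cap_f * vdd_v * swing_v


cells = 10_000            # assumed number of cells on the virtual rail
cap_per_cell_f = 2e-15    # assumed 2 fF contributed by each cell
total_cap_f = cells * cap_per_cell_f

energy = transition_energy_j(total_cap_f, vdd_v=1.0, swing_v=1.0)
print(f"{energy * 1e12:.1f} pJ per full-swing transition")
```

    Because the energy scales linearly with the number of attached cells, a block that enters and leaves sleep mode frequently can spend more energy on transitions than it saves in leakage.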

    Sleep transistor sizing is an important issue in designing MTCMOS circuits.

    A charge recycling technique has recently been proposed in order to reduce the energy

    consumption during mode transitions of MTCMOS circuits. It has been shown that by

    applying this technique, up to 46% of the switching energy due to mode transition can be

    saved.


    The MTCMOS circuit scheme is a very efficient low-power and high performance

    circuit technique that employs high Vth transistors to switch on and off the power supplies to

    the low Vth logic blocks.


    The MTCMOS technique is a well-known way to combine high switching speed with

    low standby current, by using low-Vt transistors for the logic part and high-Vt transistors for

    the so-called sleep transistors. However, a practical analytic formula for correctly

    dimensioning the sleep transistor for a demanded performance has not been provided.

    MTCMOS circuits can be simplified by using NMOS sleep transistors (see Fig 3.4).

    These transistors will be in their linear mode when the circuit is active. The logic transistors,




    however, will work in the saturation region. Since the current through logic and sleep

    transistor must be identical, the following equation describes the resulting ground shift VH

    due to the sleep transistor.

    Fig 3.4: Modified MTCMOS design with NMOS sleep control transistor

    The factor q (q>1) is used to describe the specified delay time factor of the MTCMOS

    circuit in comparison to a standard CMOS configuration. With the help of Equation (1) it is

    possible to calculate the necessary width WH of a sleep transistor, with WL as the

    accumulated width of all low-Vt logic transistors that are controlled by the sleep transistor.

    A drawback of the common MTCMOS technique is the floating of nodes in the circuits. To

    prevent data from being lost, circuitry must be added to each flip flop.


    First, process modifications for supporting the high-VTH of the sleep MOSFET are

    required. Second, when a circuit goes into the sleep state, it takes a non-negligible amount of

    time to wake up and re-activate because the large sleep transistor must be switched on and it

    must initially discharge the slow virtual ground capacitance. Third, gates in the sleep region

    may be interfaced with gates outside. This means that the outputs of inactive gates (gates in

    the sleep region) can float at intermediate voltages, causing large short-circuit currents in the

    active gates they drive.




    In recent years, technology scaling has increased the role of leakage power in the

    overall power consumption of circuits. Supply voltage reduction is a widely accepted

    methodology for reducing dynamic power, but it has an adverse effect on circuit

    performance. To maintain high performance, the threshold voltage Vt must also be scaled

    down which causes an exponential increase in the sub-threshold leakage currents. This is a

    more potent problem in deep-submicron technologies. In applications which involve large

    standby times, this high sub-threshold leakage can be detrimental to the overall power

    consumption of the circuit. Multi-threshold CMOS has emerged as an effective technique for

    reducing sub-threshold currents in the standby mode while maintaining circuit performance.

    MTCMOS technology essentially places a sleep transistor on gates and puts them in sleep

    mode when the circuit is non-operational. State of the art techniques in leakage optimization

    using MTCMOS essentially assign a sleep transistor to each gate and size them such that all

    gates have a fixed slowdown. This is followed by a clustering approach that clusters gates

    with mutually exclusive switching patterns. This reduces the overall area penalty of the

    MTCMOS transistor. There are several problems in this approach. First the traditional

    approach sizes the sleep transistors such that all gates have the same slowdown. It does not

    investigate the possibility of slowing down non-critical gates more than critical gates for

    better improvements in leakage. Second, it has been shown that clustering MTCMOS gates

    has adverse effects on signal integrity due to ground bounce issues. In this work we address

    these issues by developing a fine-grained methodology for MTCMOS-based leakage

    optimization. First, sleep transistors are assigned selectively to gates such that the overall slack

    can be effectively utilized. Moreover, no clustering is performed, hence the signal integrity

    issues are not critical in our approach.

    As shown in figure 3.5(a), low Vt logic modules or gates are connected to the virtual

    supply rails through high Vt sleep transistors which behave like a linear resistor in

    active mode as shown in figure 3.5(b). The high threshold sleep transistor is controlled using

    the Sleep signal and limits the leakage current to a low value in the standby mode.

    The load dependent delay di of a gate i in the absence of a sleep transistor can be

    expressed as




    where CL is the load capacitance at the gate output, VtL is the low threshold voltage (= 350 mV),

    Vdd = 1.8 V and α is the velocity saturation index (α ≈ 1.3 in 0.18-µm CMOS technology). In

    the presence of a sleep transistor, the propagation delay of a gate can be expressed as

    Where Vx is the potential of the virtual rails as shown in figure 1 and K is the proportionality

    constant. Let us suppose Isleep ON is the current flowing in the gate during active mode of

    operation. During this mode, the sleep transistor is in the linear region of operation. Using the

    basic device equations for a transistor in linear region, the drain to source current in the sleep

    transistor (which is the same as Isleep ON) is given by

    The sub-threshold leakage current Ileak in the sleep mode will be determined by the sleep

    transistor and is expressed as given by

    where µn is the N-mobility, Cox is the oxide capacitance, Vth is the high threshold

    voltage (= 500 mV), VT is the thermal voltage (= 26 mV) and n is the sub-threshold swing

    parameter.
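
    The expressions referred to in this passage are not reproduced in the text. A sketch consistent with the symbols defined above is given below. It assumes the standard alpha-power-law delay model, the textbook linear-region drain current (with the quadratic term in V_x dropped for small V_x), and the usual sub-threshold current model with the sleep transistor gate at 0 V and a drain voltage well above V_T. The exact forms and the numbering are assumptions, chosen to agree with the later references to equations 2, 4 and 5.

```latex
\begin{align}
d_i &= \frac{K\, C_L\, V_{dd}}{(V_{dd} - V_{tL})^{\alpha}} \tag{1}\\
d_i^{sleep} &= \frac{K\, C_L\, V_{dd}}{(V_{dd} - V_x - V_{tL})^{\alpha}} \tag{2}\\
I_{sleepON} &= \mu_n C_{ox} \left(\frac{W}{L}\right)_{sleep}
  \left[ (V_{dd} - V_{th})\, V_x - \frac{V_x^{2}}{2} \right] \tag{3}\\
I_{sleepON} &\approx \mu_n C_{ox} \left(\frac{W}{L}\right)_{sleep} (V_{dd} - V_{th})\, V_x \tag{4}\\
I_{leak} &= \mu_n C_{ox} \left(\frac{W}{L}\right)_{sleep} e^{1.8}\, V_T^{2}\,
  e^{-V_{th} / (n V_T)} \tag{5}
\end{align}
```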


    Equation 2 establishes a relation between the delay of a gate, d_i^sleep, and Vx. By replacing

    Vx in equation 4 in terms of d_i^sleep (using equation 2), we get a dependence between (W/L)sleep

    and d_i^sleep (assuming the ON current is constant for each gate). Thus, a range of (W/L)sleep

    for the sleep transistor would correspond to a range of gate delays. Finally, (W/L)sleep

    in equation 5 can be replaced in terms of d_i^sleep, hence establishing a relationship between

    gate delay and gate leakage. The final relation between leakage and delay can be expressed as

    This relationship exists for only those gates that have a sleep transistor assigned to

    them. Note that the moment a sleep transistor is assigned, some delay penalty is incurred. The

    range of delay that a gate can have is decided by the range of the acceptable (W/L)sleep. The

    objective of sleep transistor sizing is to decide the best values of (W/L)sleep for all sleep

    transistors such that the global delay constraint is satisfied and the total leakage is minimized.
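
    The trade-off behind sleep transistor sizing can be illustrated with a small numerical sketch. It assumes the alpha-power-law delay model (delay proportional to Vdd / (Vdd - Vx - VtL)^α), the linear-region current expression for the sleep transistor, and leakage proportional to (W/L)sleep; these model forms and the function names are assumptions for illustration, while the voltages and α are the values quoted in the text.

```python
# Illustrative constants taken from the text; everything else is normalized.
VDD = 1.8     # supply voltage (V)
VTL = 0.35    # low threshold voltage of the logic gates (V)
VTH = 0.5     # high threshold voltage of the sleep transistor (V)
ALPHA = 1.3   # velocity saturation index


def delay_penalty(vx):
    """Ratio of gate delay with a sleep transistor to delay without one.

    K and CL cancel in the ratio, so only the voltages matter (assumed model).
    """
    return ((VDD - VTL) / (VDD - vx - VTL)) ** ALPHA


def relative_sleep_size(vx):
    """(W/L)sleep needed to carry a fixed ON current, up to a constant factor."""
    return 1.0 / ((VDD - VTH) * vx - vx ** 2 / 2.0)


def tradeoff(vx_values):
    """Return (vx, delay penalty, leakage relative to the first entry).

    Leakage is assumed proportional to (W/L)sleep.
    """
    base = relative_sleep_size(vx_values[0])
    return [(vx, delay_penalty(vx), relative_sleep_size(vx) / base)
            for vx in vx_values]


if __name__ == "__main__":
    for vx, penalty, leak in tradeoff([0.02, 0.05, 0.10, 0.20]):
        print(f"Vx={vx:.2f} V  delay x{penalty:.3f}  relative leakage {leak:.3f}")
```

    Under these assumptions, allowing a larger virtual-rail drop Vx permits a smaller sleep transistor and therefore lower leakage, at the cost of a larger gate delay.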




    Fig 3.5 : Sleep Transistors in MTCMOS circuits


    1. Low Vth Transistors (lvt)

    The Low Vth transistor type is the fastest available flavor in the STM 90nm general

    purpose technology, and is used for applications where the speed is of primary importance.

    The disadvantage of this type of transistors is that, due to the low threshold voltage (Vth), the

    static power is very high.

    2. Standard Vth Transistors (svt)

    The Standard Vth transistor type is an all-purpose flavor where delay and static

    power have been traded off to match typical design requirements. The procedure used to

    characterize this technology variation is exactly the same as the one used for lvt.

    3. High Vth Transistors (hvt)

    The High Vth transistor type is a flavor especially optimized for extremely low static

    power consumption. Typical applications for this technology variation are circuits that are idle most of

    the time and/or where speed/performance are not of utmost importance. The procedure used

    to characterize this technology variation is exactly the same as the one used for lvt.





    -Performance can be improved and the leakage current minimized.

    -Sub-threshold leakage current is reduced by the sleep transistor while performance loss is



    MTCMOS has a serious drawback: the data stored in latches and flip-flops in logic

    blocks cannot be preserved when the power supply is turned off (sleep mode). Therefore,

    extra circuits and complex timing design must be provided for holding the stored data. These

    impose significant penalties on the performance, power and area of the system.


    Placing a high Vth PMOS transistor between Vdd and the logic block results in the

    MTCMOS design with PMOS sleep control transistor as shown in Fig 3.6.

    Fig 3.6 : Modified MTCMOS design with PMOS sleep control transistor






    The project involves the implementation of an SRAM using the MTCMOS technique in

    the Cadence Virtuoso Analog Design Environment.


    4.2 CMOS LOGIC

    In CMOS (complementary MOS) logic, only two complementary types of MOSFET

    transistors, n-channel (also known as NMOS) and p-channel (also known as PMOS), are used to

    create the circuit. The logic symbols for the NMOS and PMOS transistors are shown in

    Figure (a) and Figure (b), respectively. In designing CMOS circuits, we are interested only in

    the three connections (source, drain, and gate) of the transistor. The substrate for the

    NMOS is always connected to ground, while the substrate for the PMOS is always connected

    to VCC. Notice that the only difference between these two logic symbols is that one has a

    circle at the gate input, while the other does not. By the convention that a circle denotes

    active-low (i.e., a 0 activates the signal), the PMOS gate input is active-low, while the NMOS gate input is active-high.

    The operation of the NMOS transistor is as follows:

    When the input at gate is a 1, the NMOS transistor is turned on or enabled, and the

    source input that is supplying the 0 can pass through to the drain output through the

    connecting n-channel. However, if the source has a 1, the 1 will not pass through to the drain

    even if the transistor is turned on, because the NMOS does not create a p-channel. Instead,

    only a weak 1 will pass through to the drain. On the other hand, when the gate is a 0 (or any

    value other than a 1), the transistor is turned off, and the connection between the source and

    the drain is disconnected. In this case, the drain will always have a high-impedance Z value

    independent of the source value. The × (don't-care) in the Input Signal column means that it

    doesn't matter what the input value is; the output will be Z. The high-impedance value,

    denoted by Z, means no value or no output. This is like having an insulator with an infinite

    resistance or a break in a wire; therefore, whatever the input is, it will not pass over to the

    output.


    Fig 4.1: Design Flow of SRAM


    Fig 4.2: NMOS symbol    Fig 4.3: Truth table

    The PMOS transistor works exactly the opposite of the NMOS transistor. The

    operation of the PMOS transistor is as follows.

    When the input at gate is a 0, the PMOS transistor is turned on or enabled, and the

    source input that is supplying the 1 can pass through to the drain output through the

    connecting p-channel. However, if the source has a 0, the 0 will not pass through to the drain

    even if the transistor is turned on, because the PMOS does not create an n-channel. Instead,

    only a weak 0 will pass through to the drain. On the other hand, when the gate is a 1 (or any

    value other than a 0), the transistor is turned off, and the connection between the source and

    the drain is disconnected. In this case, the drain will always have a high-impedance Z value

    independent of the source value.

    Fig 4.4: PMOS transistor (a) symbol (b) truth table
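
    The switch behaviour of the two transistor types described above can be summarized in a minimal sketch. The function names and the string encoding of the values ("0", "1", "weak 0", "weak 1", "Z") are assumptions made for illustration only.

```python
def nmos(gate, source):
    """Value seen at the drain of an NMOS transistor used as a switch."""
    if gate != "1":
        return "Z"                 # transistor off: drain is high impedance
    # Transistor on: a 0 passes cleanly, a 1 only passes as a weak 1.
    return "0" if source == "0" else "weak 1"


def pmos(gate, source):
    """Value seen at the drain of a PMOS transistor used as a switch."""
    if gate != "0":
        return "Z"                 # transistor off: drain is high impedance
    # Transistor on: a 1 passes cleanly, a 0 only passes as a weak 0.
    return "1" if source == "1" else "weak 0"


if __name__ == "__main__":
    for gate in ("0", "1"):
        for source in ("0", "1"):
            print(f"gate={gate} source={source}  "
                  f"NMOS drain={nmos(gate, source):7s} PMOS drain={pmos(gate, source)}")
```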




    4.3 INVERTER

    When the gate input is a 1, the bottom NMOS transistor is turned on while the top

    PMOS transistor is turned off. With this configuration, a 0 from ground will pass through the

    bottom NMOS transistor to the output while the top PMOS transistor will output a high-

    impedance Z value. A Z combined with a 0 is still a 0, because a high impedance carries no value.


    Alternatively, when the gate input is a 0, the bottom NMOS transistor is turned off

    while the top PMOS transistor is turned on. In this case, a 1 from VCC will pass through the

    top PMOS transistor to the output while the bottom NMOS transistor will output a Z. The

    resulting output value is a 1.



    Fig 4.5 INVERTER (a) circuit (b) truth table




    Fig 4.6 Switch model for INVERTER (a) low input; (b) high input
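
    The inverter behaviour can be sketched with the same switch model: each transistor either drives the output or leaves it at Z, and a Z combined with a driven value yields the driven value. The helper names below are hypothetical.

```python
def resolve(a, b):
    """Combine two drivers on one node; a high-impedance Z yields to the other."""
    if a == "Z":
        return b
    if b == "Z":
        return a
    if a == b:
        return a
    raise ValueError("two conflicting drivers on the same node")


def inverter(x):
    """CMOS inverter: PMOS on top (to VCC), NMOS at the bottom (to ground)."""
    pull_up = "1" if x == "0" else "Z"     # PMOS conducts when the gate is 0
    pull_down = "0" if x == "1" else "Z"   # NMOS conducts when the gate is 1
    return resolve(pull_up, pull_down)


if __name__ == "__main__":
    for x in ("0", "1"):
        print(f"input={x} output={inverter(x)}")
```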

    4.4 NAND GATE

    If either input is LOW, the output Z has a low-impedance connection to VDD through

    the corresponding "on" p-channel transistor, and the path to ground is blocked by the

    corresponding "off" n-channel transistor. If both inputs are HIGH, the path to VDD is

    blocked, and Z has a low-impedance connection to ground. (Here Z names the gate output,

    not the high-impedance state.)




    Fig 4.7 NAND GATE (a) circuit (b) truth table (c) symbol




    Fig 4.8 : Switch model for 2 input NAND gate (a) both inputs low;

    (b) one input high; (c) both inputs high
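
    The NAND gate can be sketched in the same way, assuming the usual arrangement of two p-channel transistors in parallel to VDD and two n-channel transistors in series to ground; the function name is illustrative.

```python
def nand(a, b):
    """2-input CMOS NAND modelled as pull-up and pull-down paths."""
    # Parallel p-channel transistors: a path to VDD exists if either input is LOW.
    path_to_vdd = (a == 0) or (b == 0)
    # Series n-channel transistors: a path to ground exists only if both are HIGH.
    path_to_ground = (a == 1) and (b == 1)
    # In a static CMOS gate exactly one of the two paths conducts.
    assert path_to_vdd != path_to_ground
    return 1 if path_to_vdd else 0


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(f"A={a} B={b} Z={nand(a, b)}")
```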


    When the E input is asserted, the Q output follows the D input. In this situation, the latch

    is said to be open and the path from D input to Q output is transparent; the circuit is

    often called a transparent latch for this reason. When the E input is negated, the latch

    closes; the Q output retains its last value and no longer changes in response to D, as long as

    E remains negated.

    Fig 4.9: D-Latch with enable
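
    A behavioural sketch of the D latch with enable follows. It models only the logical behaviour described above (transparent while E is asserted, holding otherwise), not the gate-level circuit; the class and method names are assumptions.

```python
class DLatch:
    """D latch with an active-high enable input E."""

    def __init__(self, q=0):
        self.q = q          # stored value (initial state chosen arbitrarily)

    def update(self, d, e):
        """Apply inputs D and E and return the Q output."""
        if e:
            self.q = d      # latch open: Q follows D
        return self.q       # latch closed: Q retains its last value


if __name__ == "__main__":
    latch = DLatch()
    for d, e in [(1, 1), (0, 0), (0, 1), (1, 0)]:
        print(f"D={d} E={e} Q={latch.update(d, e)}")
```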


    A tri-state buffer, as the name suggests, has three states: 0, 1, and a third state

    denoted by Z. The value Z represents a high-impedance state, which acts like a switch that is

    opened or a wire that is cut. Tri-state buffers are used to connect several devices to the same

    bus. A bus is one or more wires for transferring signals. If two or more devices are connected

    directly to a bus without using tri-state buffers, signals will get corrupted on the bus because

    the devices are always outputting either a 0 or a 1. However, with a tri-state buffer in

    between, devices that are not using the bus can disable the tri-state buffer so that it acts as if

    those devices are physically disconnected from the bus. At any one time, only one active

    device will have its tri-state buffers enabled, and thus, use the bus.

    The active-high enable line E turns the buffer on or off. When E is de-asserted with a

    0, the tri-state buffer is disabled, and the output y is in its high-impedance Z state. When E is

    asserted with a 1, the buffer is enabled, and the output y follows the input d.

    The truth table is derived as follows.

    - When E = 0, it does not matter what the input d is; we want both transistors to be

      disabled so that the output y has the Z value. The PMOS transistor is disabled

      when the input A = 1, whereas the NMOS transistor is disabled when the input

      B = 0.

    - When E = 1 and d = 0, the output y is 0. To get a 0 on y, we need to enable the

      bottom NMOS transistor and disable the top PMOS transistor so that a 0 will pass

      through the NMOS transistor to y.

    - When E = 1 and d = 1, the output y is 1. Here we need to do the reverse by

      enabling the top PMOS transistor and disabling the bottom NMOS transistor.

    Fig 4.10: Tri-state buffer: (a) truth table; (b) logic symbol; (c) circuit; (d) truth table for

    the control portion of the tri-state buffer circuit
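
    The three cases above fix the values of the control signals A (PMOS gate) and B (NMOS gate) for every input combination. One pair of logic expressions consistent with those cases is A = NAND(E, d) and B = E AND (NOT d); these expressions are derived here from the description and are an assumption about the control circuit, not a quotation of it.

```python
def tri_state_buffer(e, d):
    """Tri-state buffer with active-high enable E and data input d."""
    a = 0 if (e == 1 and d == 1) else 1   # PMOS gate control: A = NAND(E, d)
    b = 1 if (e == 1 and d == 0) else 0   # NMOS gate control: B = E AND (NOT d)
    pmos_on = (a == 0)                    # PMOS conducts when its gate is 0
    nmos_on = (b == 1)                    # NMOS conducts when its gate is 1
    if pmos_on:
        return 1                          # a 1 from the supply reaches y
    if nmos_on:
        return 0                          # a 0 from ground reaches y
    return "Z"                            # both transistors off


if __name__ == "__main__":
    for e in (0, 1):
        for d in (0, 1):
            print(f"E={e} d={d} y={tri_state_buffer(e, d)}")
```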





    Each bit in a static RAM chip is stored in a memory cell similar to the circuit shown

    in Fig 4.11 (a). The main component in the cell is a D latch with enable. A tri-state buffer is

    connected to the output of the D latch so that it can be selectively read from. The Cell enable

    signal is used to enable the memory cell for both reading and writing. For reading, the Cell

    enable signal is used to enable the tri-state buffer. For writing, the Cell enable together with

    the Write enable signals are used to enable the D latch so that the data on the Input line is

    latched into the cell.

    Fig 4.11: Memory cell (a) circuit; (b) logic symbol.
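
    A behavioural sketch of one memory cell, combining the D latch and the output tri-state buffer as described above. The class name, method name and argument names are hypothetical, and the initial stored value is chosen arbitrarily.

```python
class MemoryCell:
    """One SRAM bit: a D latch followed by a tri-state output buffer."""

    def __init__(self):
        self.q = 0                      # latch contents (arbitrary initial value)

    def step(self, data_in, cell_enable, write_enable):
        """Apply the inputs and return the value on the cell output."""
        # Writing: Cell enable AND Write enable open the D latch.
        if cell_enable and write_enable:
            self.q = data_in
        # Reading: Cell enable turns on the tri-state buffer; otherwise output is Z.
        return self.q if cell_enable else "Z"


if __name__ == "__main__":
    cell = MemoryCell()
    print(cell.step(1, 1, 1))   # write a 1 into the cell
    print(cell.step(0, 0, 1))   # cell not enabled: output floats, contents kept
    print(cell.step(0, 1, 0))   # read back the stored bit
```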


    The write operation begins with a valid address on the address lines and valid data on

    the data lines, followed immediately by the CE line being asserted. As soon as the WR line is

    asserted, the data present on the data lines is written into the memory location that is

    addressed by the address lines.

    A memory read operation also begins with setting a valid address on the address lines,

    followed by CE going high. The WR line is then pulled low, and shortly after, valid data from

    the addressed memory location is available on the data lines.

    Fig 4.12 Memory Timing Diagram (a) read operation (b) write operation.




    In SRAM, there is a set of data lines, Di, and a set of address lines, Ai. The data lines

    serve for both input and output of the data to the location that is specified by the address

    lines. The number of data lines is dependent on how many bits are used for storing data in

    each memory location. The number of address lines is dependent on how many locations are

    in the memory chip.

    In addition to the data and address lines, there are usually two control lines: chip

    enable (CE), and write enable (WR). In order for a microprocessor to access memory, either

    with the read operation or the write operation, the active-high CE line must first be asserted.

    Asserting the CE line enables the entire memory chip. The active-high WR line selects which

    of the two memory operations is to be performed. Setting WR to a 0 selects the read

    operation, and data from the memory is retrieved. Setting WR to a 1 selects the write

    operation, and data from the microprocessor is written into the memory. Instead of having

    just the WR line for selecting the two operations, read and write, some memory chips have

    both a read enable and a write enable line. In this case, only one line can be asserted at any

    one time. The memory location in which the read and write operations are to take place, of

    course, is selected by the value of the address lines. The operation of the memory chip is

    shown in Fig 4.13(b).

    Fig 4.13 A 2^n x m RAM chip: (a) logic symbol; (b) operation table.

    Notice in Fig 4.13(a) that the RAM chip does not require a clock signal. Both the read

    and write memory operations are not synchronized to the global system clock. Instead the

    data operations are synchronized to the two control lines, CE and WR.
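
    The chip-level behaviour described above (n address lines selecting one of 2^n locations of m bits each, with CE and WR choosing the operation) can be sketched as follows. The class and method names are illustrative assumptions, and returning None stands in for floating data lines.

```python
class Ram:
    """Behavioural model of a 2^n x m RAM chip controlled by CE and WR."""

    def __init__(self, n, m):
        self.m = m                        # number of data lines (bits per location)
        self.words = [0] * (2 ** n)       # n address lines select 2^n locations

    def access(self, ce, wr, address, data=0):
        """Perform one operation; returns the data read, or None otherwise."""
        if not ce:
            return None                   # chip not enabled: no operation
        if wr:
            # Write operation: keep only the m bits that fit in one location.
            self.words[address] = data & ((1 << self.m) - 1)
            return None
        return self.words[address]        # read operation (WR = 0)


if __name__ == "__main__":
    ram = Ram(n=3, m=8)
    ram.access(ce=1, wr=1, address=5, data=0xAB)
    print(hex(ram.access(ce=1, wr=0, address=5)))
```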

    To create an 8X8 static RAM chip, we need 64 memory cells forming an 8X8 grid, as

    shown in Fig 4.14.

    Each row forms a single storage location, and the number of memory cells in a row

    determines the bit width of each location. So all of the memory cells in a row are enabled




    with the same address. Again, a decoder is used to decode the address lines, A0, A1, A2. In

    this example, a 3-to-8 decoder is used to decode the eight address locations. The CE

    signal is for enabling the chip, specifically to enable the read and write functions through the

    two AND gates.

    The data comes in from the external data bus, Di, through the input buffer and to the

    Input line of each memory cell. The purpose of using an input buffer for each data line is so

    that the external signal coming in needs to drive just one device (the buffer) rather than

    having to drive several devices (i.e., all of the memory cells in the same column). Which row

    of memory cells actually gets written to will depend on the given address. The read operation

    requires CE to be asserted and WR to be de-asserted. This will assert the internal RE signal,

    which in turn will enable the eight output tri-state buffers at the bottom of the circuit diagram.

    Again, the location that is read from is selected by the address lines.

    Fig 4.14 An 8X8 SRAM chip circuit.
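
    The structure of the 8X8 chip (a 3-to-8 address decoder, two AND gates deriving the internal write and read enables from CE and WR, and eight output tri-state buffers) can be sketched structurally as below. All names are hypothetical, and the bit ordering of the address is an assumption.

```python
def decoder_3to8(a2, a1, a0):
    """3-to-8 decoder: exactly one of the eight row-select lines is 1."""
    index = (a2 << 2) | (a1 << 1) | a0
    return [1 if row == index else 0 for row in range(8)]


class Sram8x8:
    """Eight rows of eight memory cells sharing the column data lines."""

    def __init__(self):
        self.cells = [[0] * 8 for _ in range(8)]

    def cycle(self, ce, wr, address_bits, data_in=None):
        """Run one access and return the values on the eight data outputs."""
        write_enable = ce and wr            # first AND gate
        read_enable = ce and not wr         # second AND gate (internal RE)
        row = decoder_3to8(*address_bits).index(1)
        if write_enable:
            self.cells[row] = list(data_in)     # whole row latches the input word
        if read_enable:
            return list(self.cells[row])        # output tri-state buffers enabled
        return ["Z"] * 8                        # output buffers disabled


if __name__ == "__main__":
    sram = Sram8x8()
    sram.cycle(1, 1, (1, 0, 1), [1, 0, 1, 1, 0, 0, 1, 0])
    print(sram.cycle(1, 0, (1, 0, 1)))
```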




    Fig 5.1: Output waveforms of SRAM without MTCMOS




    Fig 5.2: Output waveforms of SRAM with MTCMOS







    S.No  Circuit            Power without MTCMOS  Power with MTCMOS
    1     INVERTER           117.5 nW              71.53 nW
    2     NAND GATE          248.9 nW              98.12 nW
    3     D-LATCH            1.507 uW              0.6268 uW
    4     TRI-STATE BUFFER   1.216 uW              0.801 uW
    5     3x8 DECODER        9.47 uW               2.113 uW
    6     MEMORY CELL        1.559 uW              0.368 uW
    7     SRAM               97.92 uW              13.11 uW

    Table 5.1 Comparison of power consumed with and without MTCMOS
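
    The relative reduction implied by Table 5.1 can be computed with a short sketch. It assumes that the larger figure in each row is the power without MTCMOS and the smaller one the power with MTCMOS; the dictionary and function names are illustrative.

```python
# Power figures copied from Table 5.1, in watts: (without MTCMOS, with MTCMOS).
# The assignment of the two columns is an assumption (larger value = without).
POWER = {
    "INVERTER": (117.5e-9, 71.53e-9),
    "NAND GATE": (248.9e-9, 98.12e-9),
    "D-LATCH": (1.507e-6, 0.6268e-6),
    "BUFFER": (1.216e-6, 0.801e-6),
    "3x8 DECODER": (9.47e-6, 2.113e-6),
    "MEMORY CELL": (1.559e-6, 0.368e-6),
    "SRAM": (97.92e-6, 13.11e-6),
}


def saving_percent(without, with_mtcmos):
    """Percentage reduction in power when MTCMOS is applied."""
    return 100.0 * (without - with_mtcmos) / without


if __name__ == "__main__":
    for name, (without, with_mtcmos) in POWER.items():
        print(f"{name:12s} {saving_percent(without, with_mtcmos):5.1f} % lower")
```

    Under that column assumption, the complete SRAM consumes roughly 86.6 % less power with MTCMOS, and the inverter roughly 39.1 % less.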









    A.1 Cadence: Virtuoso Analog Design Environment

    Cadence is an Electronic Design Automation (EDA) environment which allows

    different applications and tools to be integrated into a single framework, thereby supporting

    all the stages of IC design and verification from a single environment. These tools are

    completely general, supporting different fabrication technologies.

    Fig A.1 :Cadence design flow

    A.2 Various Design steps

    Firstly a schematic view of the circuit is created using the Cadence Composer

    Schematic Editor. Alternatively, a text netlist input can be employed. Then, the circuit is

    simulated using the Cadence Affirma analog simulation environment. Different simulators

    can be employed, some sold with the Cadence software (e.g., Spectre) some from other

    vendors (e.g., HSPICE) if they are installed and licensed.

    1. Invoking Cadence tool

    The Command Interpreter Window (CIW) can be invoked by typing icfb90. The tool is

    available on vlsi34, vlsi35, vlsi36, vlsi27. The following window will appear on the screen on

    invoking the command.




    Fig A.2 Log Window

    2. Create Library

    In order to create the library go to Tools >Library Manager on the Tools menu of the CIW.

    Fig A.3 Library window

    Now, to create a new library, go to File >New >Library from the File menu of the Library

    Manager.


    Fig A.4 Library Creation window

    3. Create Schematic

    Start by clicking on the library (created by you) in the Library Manager window, then

    go to File >New >Cell View and fill in Inverter (in this case) as the cell name,

    schematic as the view name, and Composer Schematic as the tool, then press OK.

    Fig A.5 File Creation window




    An empty window appears, as shown in the next figure.

    Fig A.6 Schematic Window

    Now place the instances, add the I/O pins, and add the wires.

    Now you need to Check and Save your design (either the top left button or Design >Check

    and Save).

    Make sure you look at the CIW window and confirm that there are no errors or warnings; if there are any,

    you have to go back and fix them. Assuming there are no errors, we are now ready to start

    the simulation.


    4. Simulation

    In the Virtuoso Schematic window go to Tools >Analog Environment. The design should

    be set to the right Library, Cell and View.

    Fig A.7 Simulation window

    5. Choosing the Analyses

    In the Affirma Analog Circuit Design Environment window, click Analyses >Choose in the

    pull-down menu to open the analyses window. Several analysis modes are available.




    6. Transient Analysis

    In the Analysis section, select the transient analysis, set the Stop Time, and click the

    Apply button before clicking the OK button.

    Fig A.8 Analysis Window

    7. Saving and Plotting Simulation Data

    Select Outputs >To Be Plotted >Select On Schematic to choose the nodes to be

    plotted. Click on a wire in the schematic window to select a voltage node, and click

    on a terminal to select a current. Select the input and output wires in the circuit.

    Observe the simulation window as the wires get added.

    8. Run the Simulation

    Click on the Run Simulation icon.

    When it completes, the plots are shown automatically.






    Fig B.1: Inverter without MTCMOS


    Fig B.2: Inverter with MTCMOS


    Fig B.3: SRAM without MTCMOS


    Fig B.4: SRAM with MTCMOS




    Fig B.5 : Power Calculation window without MTCMOS




    Fig B.6 :Power Calculation window with MTCMOS