
CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

VLP0101-1

Low power On-Chip Amplifier for CCD Array

Er. Rahul Malhotra*, Er. Amit Kumar** Bhai Maha Singh College of Engineering, Sri Muktsar Sahib, India

*[email protected], **[email protected]

Abstract— Analog VLSI design is an essential part of any electronic system because the real world is analog. This paper presents a low-power on-chip amplifier for a CCD array [1]. CCDs are used to capture images; modern digital cameras and high-resolution cameras are built around a CCD array, but the performance of the array depends on the performance of the on-chip amplifier placed at the end of the array. Single-stage and two-stage amplifiers are simulated, and power and bandwidth results are presented for varying transistor sizes. All results are verified using the Tanner tool (version 7.1) [11]. Many analyses in the literature aim to reduce power dissipation, but most of the resulting structures compromise on either area or bandwidth. Here, lower power dissipation is achieved while a high bandwidth is maintained; the detailed results supporting this claim are presented in the results section.

Keywords: Gain, power dissipation, bandwidth, capacitance

INTRODUCTION

Charge Coupled Devices (CCDs) were invented in the 1970s and originally found application as memory devices. CCDs have many applications, but the most important is imaging [3]. The basic operation of the sensor is to convert light into electrons. When light is incident on the active area of the image sensor, it interacts with the atoms that make up the silicon crystal. The energy transmitted by the light (photons) enables an electron to escape from the tight control of one atom and roam more freely about the device as a “conduction” electron, leaving behind an atom shy of one electron. Modern CCDs have two types of architecture:

1. Full-Frame (FF)

2. Frame-Transfer (FT)

FF CCDs have the simplest architecture and are the easiest to fabricate and operate. They consist of a parallel CCD shift register, a serial CCD shift register and a signal-sensing output amplifier. Images are optically projected onto the parallel array, which acts as the image plane. The architecture is shown in Fig. 1.

FT CCDs are very much like FF architectures. The difference is that a separate and identical parallel register, called a storage array, is added; it is not light sensitive. The idea is to shift a captured scene very quickly from the photosensitive (image) array to the storage array [5]. Readout off chip from the storage register is then performed as described for the FF device, while the image array is integrating the next frame. The architecture is shown in Fig. 2.

Fig. 1 Full Frame architecture

Fig. 2 Frame transfer architecture

Both architectures are widely used, but the performance of each depends on the type and quality of the on-chip (output) amplifier, which is fabricated at the last stage of the structure as shown in the figures above.

ARCHITECTURE OF ON-CHIP AMPLIFIER

The output amplifier also has two types of architecture:

1. Single stage amplifier

2. Two stage amplifier

[Schematic labels: VDD, VRD, VRG, FD (detection node), M1, Mc, out; each of the three transistors is annotated L = 2u, W = 22u.]

Fig. 3 Single Stage CCD On-Chip amplifier

The single-stage amplifier consists of the source follower M1 and the load transistor Mc for biasing. The reset FET is connected to the detection node, which consists of the floating diffusion [6, 7] and the gate of M1. In the ON state the reset FET sets the detection node to a reference voltage (VRD); in the OFF state the floating diffusion can receive the next charge packet. The voltage source between the gate and source of the current-sink transistor Mc determines the bias current of the first stage. It can also be used as a signal injection point to measure the ratio between the total capacitance and the effective sense capacitance, and the bandwidth, in the OFF state.

The two-stage amplifier further improves the characteristics of the amplifier, as shown in the results section of the paper; its architecture is shown in Fig. 4. The two-stage amplifier also improves the sensitivity of the amplifier and reduces the noise level of the overall CCD.

[Schematic labels: Vdd, VRD, VCS, reset gate pulse, FD (detection node), Mr, M1, Mc, M2, M3, output; each of the five transistors is annotated L = 2u, W = 22u.]

Fig. 4 Two Stage CCD On-Chip amplifier

OPTIMIZATION

For optimization of the on-chip amplifier, the length and width of the individual transistors are varied and the resulting performance is recorded. The effect of increasing or decreasing the length and width of each transistor is as follows.

To achieve maximum gain:

Transistor 'M1': The gain can be maximized by increasing the width of this transistor, as this increases the difference in the output voltage amplitude.

Transistor 'MC': The gain can be maximized by decreasing the width of this transistor, as this increases the difference in the output voltage amplitude.

Transistor 'M2': The gain can be maximized by increasing the width of this transistor, as this increases the difference in the output voltage amplitude.

Transistor 'M3': The gain can be maximized by decreasing the width of this transistor, as this increases the difference in the output voltage amplitude.

To achieve maximum bandwidth:

Transistor 'M1': The bandwidth can be increased by increasing the width of this transistor. A larger width increases the transconductance, which lowers the impedance and so raises the bandwidth.

Transistor 'MC': The bandwidth can be increased by increasing the width of this transistor. A larger width increases the transconductance, which lowers the impedance and so raises the bandwidth.

Transistor 'M2': The bandwidth can be increased by increasing the width of this transistor.

Transistor 'M3': The bandwidth can be increased by increasing the width of this transistor. A larger width increases the transconductance, which lowers the impedance and so raises the bandwidth, although the resulting change is not large.

To achieve minimum power dissipation:

Transistor 'M1': The power dissipation can be reduced by reducing the width of this transistor, since the current flowing into it falls with the width. It can also be reduced by increasing the length, because a longer channel reduces the transconductance, which in turn reduces the current flowing into the transistor.

Transistor 'MC': The power dissipation can be reduced by reducing the width of this transistor, since the current flowing into it falls with the width. It can also be reduced by increasing the length, because a longer channel reduces the transconductance, which in turn reduces the current flowing into the transistor.

Transistor 'M2': The power dissipation can be reduced by reducing the width of this transistor, since the current flowing into it falls with the width.

Transistor 'M3': The power dissipation can be reduced by reducing the width of this transistor, since the current flowing into it falls with the width.
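The width and length trends listed above follow, to first order, from the textbook square-law MOSFET model, in which both the saturation current and the transconductance scale with W/L at a fixed overdrive voltage. The Python sketch below only illustrates that scaling; it is not the Tanner simulation used in the paper, and the process transconductance `KP` and overdrive `VOV` are assumed values chosen for illustration.

```python
def drain_current(kp, w, l, vov):
    """Saturation drain current (A) from the square-law model: 0.5*kp*(W/L)*Vov^2."""
    return 0.5 * kp * (w / l) * vov ** 2


def transconductance(kp, w, l, vov):
    """Small-signal transconductance (S) from the square-law model: kp*(W/L)*Vov."""
    return kp * (w / l) * vov


KP = 50e-6   # assumed process transconductance in A/V^2 (hypothetical value)
VOV = 0.5    # assumed gate overdrive in V (hypothetical value)

# Compare the drawn size (W=22u, L=2u) with a narrower and a longer device.
for w_um, l_um in [(22, 2), (10, 2), (22, 4)]:
    i_d = drain_current(KP, w_um, l_um, VOV)
    g_m = transconductance(KP, w_um, l_um, VOV)
    print(f"W={w_um}u L={l_um}u  Id={i_d * 1e6:.2f} uA  gm={g_m * 1e6:.1f} uS")
```

At fixed overdrive, a narrower or longer device draws less current (lower power) but also has lower transconductance (lower bandwidth), which is the trade-off the optimization explores.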

RESULTS

Table 1: When the width of the transistor M3 varied

M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power Dissipation (mW)   Bandwidth (MHz)
15×25         12×10         20×10         10×25         5.9                      302
15×25         12×10         20×10         12×25         5.95                     320
15×25         12×10         20×10         15×25         6.0                      242
15×25         12×10         20×10         18×25         6.1                      207

Table 2: When the width of the transistor M2 varied

M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power Dissipation (mW)   Bandwidth (MHz)
15×25         12×10         20×10         10×25         5.15                     69
15×25         12×10         18×10         10×25         5.25                     62
15×25         12×10         16×10         10×25         5.2                      78
15×25         12×10         14×10         10×25         5.3                      70
15×25         12×10         12×10         10×25         5.4                      87
15×25         12×10         10×10         10×25         5.7                      122
15×25         12×10         8×10          10×25         5.8                      148

Table 3: When the Length of the transistor M3 varied

M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power Dissipation (mW)   Bandwidth (MHz)
15×25         12×10         20×10         10×5          7.0                      580
15×25         12×10         20×10         10×10         6.4                      594
15×25         12×10         20×10         10×15         6.1                      596
15×25         12×10         20×10         10×18         6.0                      365
15×25         12×10         20×10         10×20         5.9                      270
15×25         12×10         20×10         10×25         5.7                      122
15×25         12×10         20×10         10×30         5.8                      109

Table 4: When the Length of the transistor M2 varied

M1 (W×L) μm   Mc (W×L) μm   M2 (W×L) μm   M3 (W×L) μm   Power Dissipation (mW)   Bandwidth (MHz)
15×25         12×10         20×5          10×15         6.4                      150
15×25         12×10         20×10         10×15         6.1                      490
15×25         12×10         20×15         10×15         5.9                      550
15×25         12×10         20×18         10×15         5.8                      570
15×25         12×10         20×20         10×15         5.8                      326
15×25         12×10         20×25         10×15         5.75                     380

The results in the above tables were obtained with the Tanner T-Spice tool using the 2.0 MOSIS model file for enhancement MOSFET transistors. The power dissipation and the bandwidth were measured directly from the waveform editor in the Tanner EDA tool.
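As a quick cross-check of the tabulated trade-off, the rows in which the length of M3 is varied (Table 3, with M1 = 15×25, Mc = 12×10 and M2 = 20×10 held fixed) can be scanned for the bandwidth and power extremes. The Python sketch below is only a reading aid for the table; it is not part of the original Tanner flow, and the names `TABLE_3`, `max_bandwidth_row` and `min_power_row` are hypothetical.

```python
# Rows of Table 3: (M3 length in um, power dissipation in mW, bandwidth in MHz).
# M3 width is 10 um in every row.
TABLE_3 = [
    (5, 7.0, 580),
    (10, 6.4, 594),
    (15, 6.1, 596),
    (18, 6.0, 365),
    (20, 5.9, 270),
    (25, 5.7, 122),
    (30, 5.8, 109),
]


def max_bandwidth_row(rows):
    """Return the row with the largest bandwidth."""
    return max(rows, key=lambda row: row[2])


def min_power_row(rows):
    """Return the row with the smallest power dissipation."""
    return min(rows, key=lambda row: row[1])


best_bw = max_bandwidth_row(TABLE_3)
best_pw = min_power_row(TABLE_3)
print(f"max bandwidth: M3 = 10x{best_bw[0]} um, {best_bw[2]} MHz at {best_bw[1]} mW")
print(f"min power:     M3 = 10x{best_pw[0]} um, {best_pw[1]} mW at {best_pw[2]} MHz")
```

The two extremes fall on different rows, which shows that bandwidth and power cannot both be optimized with a single choice of M3 length.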

CONCLUSION AND FUTURE SCOPE

It is observed from the results that, for the single-stage on-chip amplifier, minimum power dissipation and maximum bandwidth are achieved when the M1 transistor has a width of 18 μm and a length of 25 μm and the Mc transistor has a width of 10 μm and a length of 16 μm. In this case the power dissipation is 4.3 mW, the gain of the amplifier is 0.82 and the bandwidth is 617 MHz. For the two-stage amplifier, maximum bandwidth is achieved with transistor dimensions M1 (15 μm × 25 μm), M2 (20 μm × 10 μm), M3 (10 μm × 15 μm) and Mc (12 μm × 10 μm). For minimum power dissipation the dimensions should be M1 (15 μm × 25 μm), M2 (20 μm × 10 μm), M3 (10 μm × 25 μm) and Mc (12 μm × 10 μm). The whole design was simulated in the MOSIS/Orbit 2.0 μm process using the Tanner tool.

In this work the analog simulation is done with the Tanner tool using enhancement-type MOSFET transistors. The work can be extended to depletion-type MOSFETs, for which the noise level is expected to be lower. Semiconductor and environmental noise effects, which are not considered in the current work, can also be addressed in future.

REFERENCES

[1] Sol M. Gruner, Mark W. Tate and Eric F. Eikenberry, “Charge-coupled device area x-ray detectors”, Review of Scientific Instruments, Vol. 73, Issue 8, pp. 2815-2842.

[2] M. J. Howes & D. V. Morgan, “Charge-Coupled Devices and Systems”, John Wiley & Sons.

[3] James R. Janesick, “Scientific Charge-Coupled Devices”, SPIE Press Monograph Vol. 85.

[4] M. S. Tyagi, “Introduction to Semiconductor Materials and Devices”, John Wiley & Sons, Inc., 1991.

[5] Dalsa web site; CCD Technology Primer; http://www.dalsa.com/corp/markets/ccd_vs_cmos.aspx

[6] Kodak CCD Primer, #KCP-001, “Charge coupled device (CCD) Image Sensors”, Eastman Kodak Company - Microelectronics Technology Division.

[7] D. Barbe, “Imaging Devices Using the Charge-Coupled Concept”, Proceedings of the IEEE, pp. 38-67, Jan. 1975.

[8] Stuart A. Taylor, “CCD and CMOS Imaging Array Technologies: Technology Review”, Technical Report EPC106, Xerox Research Centre Europe, 1998.

[9] J. D. E. Beynon, “The Basic Principles of Charge Coupled Devices”, Microelectronics, Vol. 7, No. 2, 1975, Mackintosh Publications Ltd., Luton.

[10] P. Centen, E. Roks, “Characterization of Surface- and Buried-Channel Detection Transistors for On-Chip Amplifiers”, Technical Digest IEDM97, pp. 193-196, San Francisco, Dec. 7-10, 1997.

[11] http://www.mosis.com/products/fab/vendors/tsmc/tsmc-kits.html


VLP0102-1

Programmable Input Output Resistances of FET Amplifier

Mrs. Meena Singh
Lecturer, Deptt. of ECE, University Polytechnic, B.I.T. Mesra, Ranchi
([email protected]) +91-9279265054

Arun Kumar Singh
Deptt. of ECE, Madan Mohan Malaviya Engg. College, Gorakhpur
([email protected]) +91-9312801316

Dr. B. P. Singh
Professor, Deptt. of ECE & EEE, Mody Institute of Technology & Science, Lakshmangarh
([email protected]) +91-9468688102

Abstract— A mathematical model provides an insight into the complete behavior of a physical system and reduces the problem to its essential characteristics. The floating admittance matrix (FAM) approach is a neat method for the mathematical modeling of electronic devices and their use in circuits. The zero sum property of the floating admittance matrix provides a check on whether to proceed further or to re-examine the first equation itself. All transfer functions are represented as cofactors of the floating admittance matrix of the circuit.

Keywords: Amplifier, Common Source FET, Floating Admittance Matrix, Zero Sum property, Cofactors, Plots

INTRODUCTION

The most commonly used MOSFET amplifier configuration is the common source amplifier. The common-source (CS) amplifier may be viewed as a transconductance amplifier or as a voltage amplifier. As a transconductance amplifier, the input voltage modulates the current going to the load. As a voltage amplifier, the input voltage modulates the current flowing through the FET, changing the voltage across the output resistance accordingly.

This paper develops the mathematical model of the common source amplifier. The floating admittance matrix of the FET is used to derive its voltage gain, input resistance and output resistance in the common source configuration.

MATHEMATICAL MODEL OF FET

The two-stage Common Source FET amplifier is shown in Fig. 1.

Fig.1 Two-stage Common Source Amplifier

The a.c. equivalent circuit of Fig. 1 is shown in Fig. 2.

Fig.2 ac circuit of two-stage Common Source Amplifier

The matrix representation of the FET as a two-port network (four terminals) is written as

$$
\begin{bmatrix} i_g \\ i_d \\ i_s \end{bmatrix}
=
\begin{bmatrix}
g_g & 0 & -g_g \\
g_m & g_d & -(g_m + g_d) \\
-(g_g + g_m) & -g_d & g_g + g_m + g_d
\end{bmatrix}
\begin{bmatrix} v_g \\ v_d \\ v_s \end{bmatrix}
\tag{1}
$$

with rows and columns ordered as terminals 1 (gate), 2 (drain) and 3 (source).

The admittance matrix of the FET as a device is expressed in

(1). Its coefficient matrix is expressed as

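$$
Y =
\begin{bmatrix}
g_g & 0 & -g_g \\
g_m & g_d & -(g_m + g_d) \\
-(g_g + g_m) & -g_d & g_g + g_m + g_d
\end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 0 \\
g_m & g_d & -(g_m + g_d) \\
-g_m & -g_d & g_m + g_d
\end{bmatrix}
\tag{2}
$$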

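The gate-to-source resistance of the FET is assumed to be very large (ideally infinite) as the gate is always reverse biased, hence $g_g = 0$ S. The coefficient matrix of the FET in (1) then reduces to (2). Thus, the admittance matrices of the two FETs (device 1 and device 2) connected as in Fig. 2 can be written as

$$
Y_{device1} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) \\
-g_{m1} & -g_{d1} & g_{m1} + g_{d1}
\end{bmatrix}
\quad \text{(rows and columns: nodes 1, 2, 3)}
\tag{3}
$$

$$
Y_{device2} =
\begin{bmatrix}
0 & 0 & 0 \\
g_{m2} & g_{d2} & -(g_{m2} + g_{d2}) \\
-g_{m2} & -g_{d2} & g_{m2} + g_{d2}
\end{bmatrix}
\quad \text{(rows and columns: nodes 2, 4, 3)}
\tag{4}
$$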

Now the composite matrix of two devices (device1 and

device2) is written as

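$$
Y_{devices} =
\begin{bmatrix}
0 & 0 & 0 & 0 \\
g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) & 0 \\
-g_{m1} & -(g_{d1} + g_{m2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} & -g_{d2} \\
0 & g_{m2} & -(g_{m2} + g_{d2}) & g_{d2}
\end{bmatrix}
\quad \text{(rows and columns: nodes 1, 2, 3, 4)}
\tag{5}
$$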

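The overall admittance matrix for Fig. 2 is written as

$$
Y =
\begin{bmatrix}
g_s + G_{G1} + G_F & 0 & -(g_s + G_{G1}) & -G_F \\
g_{m1} & g_{d1} + G_{D1} + G_{G2} & -(g_{m1} + g_{d1} + G_{D1} + G_{G2}) & 0 \\
-(g_{m1} + g_s + G_{G1}) & -(g_{d1} + g_{m2} + G_{D1} + G_{G2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} + g_s + G_{G1} + G_{D1} + G_{G2} + G_L & -(g_{d2} + G_L) \\
-G_F & g_{m2} & -(g_{m2} + g_{d2} + G_L) & g_{d2} + G_L + G_F
\end{bmatrix}
\tag{6}
$$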

Equation (6) represents the Floating Admittance Matrix [3],

[4], [5] of two stages Common Source Amplifier.
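The construction of the floating admittance matrix and its zero sum check can be sketched numerically. The Python below is an illustration, not the authors' MATLAB program: it stamps each FET (with $g_g = 0$) and each two-terminal conductance into a 4×4 matrix and verifies that every row and column sums to zero. The node placement of the external conductances (source conductance and gate bias from node 1 to node 3, drain and gate bias from node 2 to node 3, load from node 4 to node 3, feedback from node 4 to node 1) is an assumption read off the matrix entries, and all function names are hypothetical.

```python
def stamp_fet(Y, gate, drain, source, gm, gd, gg=0.0):
    """Add the 3x3 floating admittance matrix of one FET into Y (0-based nodes)."""
    nodes = (gate, drain, source)
    fam = [
        [gg, 0.0, -gg],
        [gm, gd, -(gm + gd)],
        [-(gg + gm), -gd, gg + gm + gd],
    ]
    for r, node_r in enumerate(nodes):
        for c, node_c in enumerate(nodes):
            Y[node_r][node_c] += fam[r][c]


def stamp_conductance(Y, a, b, G):
    """Add a two-terminal conductance G connected between nodes a and b."""
    Y[a][a] += G
    Y[b][b] += G
    Y[a][b] -= G
    Y[b][a] -= G


def zero_sum(Y, tol=1e-9):
    """True if every row and every column of Y sums to zero."""
    rows_ok = all(abs(sum(row)) < tol for row in Y)
    cols_ok = all(abs(sum(col)) < tol for col in zip(*Y))
    return rows_ok and cols_ok


def build_two_stage(GF, GL=1.0, gs=0.0, gm=5.0, gd=0.1, GD=1.0, GG=0.001):
    """Assemble the 4x4 matrix of the two-stage amplifier (all values in mS)."""
    Y = [[0.0] * 4 for _ in range(4)]
    stamp_fet(Y, 0, 1, 2, gm, gd)        # device 1: gate 1, drain 2, source 3
    stamp_fet(Y, 1, 3, 2, gm, gd)        # device 2: gate 2, drain 4, source 3
    stamp_conductance(Y, 0, 2, gs + GG)  # g_s and G_G1 between nodes 1 and 3
    stamp_conductance(Y, 1, 2, GD + GG)  # G_D1 and G_G2 between nodes 2 and 3
    stamp_conductance(Y, 3, 2, GL)       # load between nodes 4 and 3
    stamp_conductance(Y, 3, 0, GF)       # feedback between nodes 4 and 1
    return Y


Y = build_two_stage(GF=0.05)
print("zero sum property holds:", zero_sum(Y))
```

Stamping element by element keeps the zero sum property by construction, so a failed check points to a bookkeeping error in the matrix rather than in the circuit.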

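Now from (6) the input impedance of the circuit in Fig. 2 can be expressed as [1], [2]

$$
Z_{in} =
\frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}
{(g_{g1} + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_L + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]}
\tag{7}
$$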

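Similarly, its output impedance and voltage gain can be expressed as [1], [2]

$$
Z_{out} =
\frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{g1} + G_G + G_F)}
{(g_{g1} + g_s + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]}
\tag{8}
$$

$$
A_V\Big|^{43}_{13} = \operatorname{sgn}(4-3)\,\operatorname{sgn}(1-3)\,(-1)^{11}\,\frac{Y^{43}_{13}}{Y^{13}_{13}}
$$

$$
A_V = \frac{g_{m1} g_{m2} + G_F\,(g_{d1} + G_D + G_G)}{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}
\tag{9}
$$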

VERIFICATION ON MATLAB

The values of the input resistance, the output resistance and the voltage gain $A_V|^{43}_{13}$ for different values of source conductance and load conductance (0 mS, 1 mS and 2 mS) have been programmed in MATLAB. The outputs of the MATLAB programs have been plotted for the input resistance, the output resistance and the voltage gain with respect to the feedback conductance, Gf.

If we assume that the two MOSFETs of Fig. 2 are properly biased to yield the same values of their internal parameters ($g_{d1} = g_{d2}$ and $g_{m1} = g_{m2}$), then for plotting on-demand values of the simulated input and output resistances, typical values of the external and internal parameters can be given as:

$g_{d1} = g_{d2}$ = 0.1 mS, $g_{m1} = g_{m2}$ = 5 mS, $G_L = G_D$ = 1 mS, $G_{G1} = G_{G2} = G_G$ = 0.001 mS, $g_{g1} = g_{g2}$ = 0.0001 mS, $G_F$ = variable (0 mS to 0.15 mS).

The plots show that the input and output resistances can be set on demand: the simulated input and output resistances can take any value, negative or positive, controlled by the feedback conductance between the two stages of the amplifier.
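The sign change of the input resistance can be reproduced with a few lines of code. The Python sketch below (a stand-in for the MATLAB program, which is not listed in the paper) evaluates the input-impedance expression (7), assuming it reads as numerator $(g_{d1}+g_{g2}+G_D+G_G)(g_{d2}+G_L+G_F)$ over denominator $(g_{g1}+G_G+G_F)\,b\,(g_{d2}+G_L+G_F) - G_F(g_{m1}g_{m2} + b\,G_F)$ with $b = g_{d1}+g_{g2}+g_{m2}+G_D+G_G$. Expanding that denominator cancels the $G_F^2$ terms, so the feedback conductance at which the resistance flips sign has a closed form. Function names are hypothetical; conductances are in mS, so results are in kΩ.

```python
def input_resistance(GF, GL, gd=0.1, gm=5.0, gg=0.0001, GD=1.0, GG=0.001):
    """Input resistance from the assumed reading of (7); mS in, kOhm out."""
    b = gd + gg + gm + GD + GG
    numerator = (gd + gg + GD + GG) * (gd + GL + GF)
    denominator = (gg + GG + GF) * b * (gd + GL + GF) - GF * (gm * gm + b * GF)
    return numerator / denominator


def critical_feedback(GL, gd=0.1, gm=5.0, gg=0.0001, GD=1.0, GG=0.001):
    """Feedback conductance (mS) at which the denominator of (7) crosses zero.

    With a0 = gg + GG and c0 = gd + GL the denominator expands to
    b*a0*c0 - (gm*gm - b*(a0 + c0)) * GF, which is linear in GF.
    """
    b = gd + gg + gm + GD + GG
    a0 = gg + GG
    c0 = gd + GL
    return b * a0 * c0 / (gm * gm - b * (a0 + c0))


for GL in (0.0, 1.0, 2.0):
    gf_star = critical_feedback(GL)
    below = input_resistance(0.99 * gf_star, GL)
    above = input_resistance(1.01 * gf_star, GL)
    print(f"GL={GL} mS: sign change near Gf={gf_star:.4e} mS "
          f"(Ri>0 below: {below > 0}, Ri<0 above: {above < 0})")
```

With the parameter values listed above, the computed crossover points fall inside the narrow Gf intervals where the plotted input resistance jumps from positive to negative.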

The plot of input resistance as a function of feedback

conductance is shown in Figs.3, 4, and 5 for 0 S, 1 mS and 2

mS of load conductance respectively as per (7).

Following observations are recorded from the plots in Fig. 3,

4 and 5:

Fig.3 Input resistance as a function of feedback conductance for

GL= 0 S

a) For GL = 0 S, the input resistance is almost constant (1.148e+06 Ω) from the initial values of Gf until Gf reaches 2.7520e-05 mS. It then rises exponentially (from 1.148e+06 Ω to 4.837e+06 Ω) as Gf varies from 2.7520e-05 mS to 2.7523e-05 mS. Interestingly, Ri then jumps down abruptly (from 4.837e+06 Ω to -6.828e+07 Ω) between Gf = 2.7523e-05 mS and 2.7524e-05 mS, and rises sharply again to -4.237e+06 Ω as Gf approaches 2.7525e-05 mS. The curve then increases linearly (from -4.237e+06 Ω to -1.473e+06 Ω) from Gf = 2.7525e-05 mS to Gf = 2.7527e-05 mS, and Ri remains constant at -1.473e+06 Ω for higher values of Gf.

Fig.4 Input resistance as a function of feedback conductance for

GL= 1 mS

b) For GL = 1 mS, the input resistance is almost constant at 3.289e+05 Ω from the initial values of Gf until Gf reaches 0.0004036 mS. Ri then increases linearly (from 3.289e+05 Ω to 4.393e+07 Ω) from Gf = 0.0004036 mS to Gf = 0.0004038 mS and jumps down abruptly (to -7.805e+06 Ω) as Gf reaches 0.00040381 mS. Ri then rises again (from -7.805e+06 Ω to -6.729e+05 Ω) from Gf = 0.00040381 mS to Gf = 0.0004039 mS, and remains constant at -6.729e+05 Ω for higher values of Gf.

Fig.5 Input resistance as a function of feedback conductance for

GL = 2 mS

c) For GL = 2 mS, the input resistance rises exponentially (from 216.5 Ω to 3331 Ω) from Gf = 0.0001 mS to Gf = 0.0011 mS, then jumps down abruptly to Ri = -4418 Ω at Gf = 0.0012 mS. It rises exponentially again (to -225.4 Ω) until Gf = 0.002 mS and remains constant at -225.4 Ω for higher values of Gf.


The plot of output resistance as a function of feedback

conductance (Gf) is shown in Figs.6, 7, and 8 for 0 S, 1 mS

and 2 mS of source conductance respectively as per (8).

Following observations are recorded from the plots in Fig. 6,

7 and 8:

Fig.6 Output resistance as a function of feedback conductance for

GS = 0 S

a) For gs = 0 S, the output resistance is almost constant (1.735e+04 Ω) from the initial values of Gf until Gf reaches 2.752e-05 mS. It then rises exponentially (from 1.735e+04 Ω to 5.452e+04 Ω) as Gf varies from 2.7520e-05 mS to 2.7522e-05 mS. Interestingly, Ro then jumps down abruptly (from 5.452e+04 Ω to -7.697e+05 Ω) between Gf = 2.7522e-05 mS and 2.75242e-05 mS, and rises sharply again to -4.776e+05 Ω as Gf reaches 2.75262e-05 mS. It then increases exponentially (from -4.776e+05 Ω to -1.252e+04 Ω) from Gf = 2.75262e-05 mS to Gf = 2.753e-05 mS, and Ro remains constant at -1.252e+04 Ω for higher values of Gf.

Fig.7 Output resistance as a function of feedback conductance for

GS = 1 mS

b) For Gs = 1 mS, the output resistance is almost constant at 237.9 Ω from the initial values of Gf until Gf reaches 0.03340 mS. Ro then increases exponentially (from 237.9 Ω to 2829 Ω) from Gf = 0.03340 mS to Gf = 0.03341 mS and jumps down abruptly (to -7836 Ω) as Gf reaches 0.033411 mS. Ro then rises again (from -7836 Ω to -22.83 Ω) from Gf = 0.033411 mS to Gf = 0.0335 mS, and remains constant at -22.83 Ω for higher values of Gf.

Fig.8 Output resistance as a function of feedback conductance for

GS = 2 mS

c) For Gs = 2 mS, the output resistance rises exponentially (from 0.805 Ω to 39.85 Ω) from Gf = 0.09 mS to Gf = 0.1 mS, then jumps down abruptly to Ro = -1.028 Ω at Gf = 0.11 mS and remains constant at -1.028 Ω for higher values of Gf.

The plot of voltage gain as a function of feedback

conductance is shown in Figs.9 and 10 for 0 S, 1 mS and 2

mS of load conductance respectively as per (9).

Fig.9 Voltage gain as a function of feedback conductance for

GL = 0 S


Fig.10 Voltage gain as a function of feedback conductance for

GL = 1 mS and 2 mS

The plots in Figs. 9 and 10 reveal that the voltage gain (AV) is an inverse function of the feedback conductance (Gf). The voltage gain also decreases as the load conductance (GL) increases, owing to the inverse relationship given by (9).

CONCLUSION

The plots in Figs. 3 to 8 reveal a region of very sudden change in the input and output resistances, from very high positive values to large negative values, for a very small change (of the order of 10^-5) in the value of the feedback conductance, Gf. This zone of very high variation in the input and output resistances can be used for compensation of resistances to obtain a very high Q-factor in lossy networks.

REFERENCES

[1] Wai-Kai Chen, On second order cofactors and null return difference in feedback amplifier theory, International Journal of Circuit Theory and Applications, Vol. 6, Issue 3, pp. 305-312, Dec. 2006.

[2] Otso Juntunen, A two port S-parameter data transformation, Circuit Theory Laboratory Report Series, CT-35, Helsinki University of Technology, Espoo, Finland, 1998.

[3] B.P. Singh, Unified approach to electronics circuit analysis, IJEEE, pp. 276-285, July 1978.

[4] B.P. Singh, Active bridge for measurement of admittance parameters of the transistors, Indian Journal of Pure and Applied Physics, Vol. 15, pp. 783-786, Nov. 1976.

[5] B.P. Singh, A new active bridge for measuring FET parameters, J. Phys. E. Scientific Instrument, Vol. II, pp. 667-670, 1978.

[6] Jacob Millman and Christos C. Halkias, Integrated Electronics, Analog and Digital Circuits and Systems, Tata McGraw-Hill publication, pp. 471-475, 2004.

[7] B.P. Singh, Meena Singh, Sanjay Kumar Roy and S.N. Shukla, Mathematical Modeling of Electronic Devices and its integration, Proceedings of National Seminar on Recent Advances on Information Technology, Allied Publishers Pvt. Ltd., Indian School of Mines Dhanbad University, pp. 494-502, Feb. 6-7, 2009.

[8] B.P. Singh, Arun Kumar Singh, Verification of transfer functions of BJT obtained by using MATLAB, Proceedings of IEEE National Symposium on Innovative Development in Electronics Arena, Arya College of Engineering, pp. 92-96, Dec. 12, 2009.


VLP0103-1

RELIABILITY PREDICTION FOR IGBT BASED INVERTERS UNDER

DIFFERENT SWITCHING PATTERNS

Fuzail Ahmad#1, S.K.Singh*2, Amit Kumar Verma

#DOEACC CENTRE GORAKHPUR, INDIA, DOEACC CENTRE GORAKHPUR, INDIA, IBM GURGAON, INDIA

[email protected] [email protected] [email protected]

Abstract—Because power electronics is increasingly important in the control of devices, particularly in electric vehicles, reliability analysis has become important. The reliability of a component is the probability that this component will perform its intended function after a time ‘t’ in a given operating condition. Considering only the power losses is no longer sufficient to assess component reliability: to predict the reliability of power electronic components, the temperature and the temperature cycles must also be determined.

The Military Handbook [3], released by the US Department of Defense, is generally accepted and often used to determine reliability [1]. However, the handbook is no longer revised, new components such as IGBTs are not considered, and its values are too conservative for currently available devices. Some manufacturers give reliability information, but it only extends to finding the switching losses and total power losses; very few of them give a thermal model of the devices. The calculation of the power losses and the thermal modelling, based on a PWM reconstruction technique, is presented in [5]. This method is useful for large simulation time steps and particularly for long mission profiles. D. Hirschman presented an approach with simple formulas for reliability prediction of inverters in HEVs. Work presented in the literature so far has developed reliability models for power electronic components but has not addressed the effect of the PWM method on reliability. This work compares the reliability of a six-step PWM based inverter and an SVPWM based IGBT inverter. Reliability is found by the conventional method and also by considering thermal cycles. MATLAB/Simulink based models for finding the switching losses and temperature cycles are developed.

I. INTRODUCTION

The use of power electronic components in automobile applications is increasing day by day. It is therefore important to determine the reliability of the power electronic components used in automotive applications.

Inverters are used in hybrid electric vehicles to convert the DC supply from the battery into AC for the motor that runs the vehicle. Inverters are made up of semiconductors and capacitors, so it is important to assure the reliability of these components, because a malfunction of any of the power electronic components may prevent the vehicle from operating.

Three-phase voltage source inverters are mainly used in these applications, with IGBTs as the switching devices. When designing an inverter, it is important to make a good thermal design such that, on the one hand, the temperature of the components never exceeds their specified maximum temperature and, on the other hand, the cooling system is not oversized.


II. BASICS OF RELIABILITY CALCULATION

2.1 INTRODUCTION

“The reliability of a component is the probability that this component will perform its intended function after a time t in a given working condition.”

The global reliability of the system is the product of the reliabilities of all its components:

$$R_{sys}(t) = \prod_{i=1}^{n} R_i(t)$$

Here n is the number of components and $0 \le R_i(t) \le 1$. This means that adding a component reduces the reliability of the system.

The starting point in reliability analysis is the

evaluation of reliability of a device or a component. This is

generally done from the available failure data. That is, a large

number of identical components are subjected to identical

operating conditions and the frequency of their failures is

tabulated.


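Let $N_0$ be the total number of identical components tested for reliability, $N_s(t)$ the number of components surviving at time $t$, and $N_f(t)$ the number of components that have failed by time $t$.

Then at any time $N_s(t) + N_f(t) = N_0$, and the reliability is given as

$$R(t) = \frac{N_s(t)}{N_0} \qquad (2.1)$$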

Reliability is characterized by the failure rate $\lambda$. The failure rate is the probability that a component that is still operational at time $t$ fails in the time interval $[t, t + \Delta t]$, where $\Delta t \to 0$. Thus, it gives the fraction of failures in a certain time interval for defined boundary conditions. The unit of the failure rate is FIT (failures in time):

$$1\ \mathrm{FIT} = 10^{-9}\ \text{failures/h} \qquad (2.2)$$

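The total failure rate for a system consisting of $k$ components is the sum of all single failure rates $\lambda_i$:

$$\lambda_{total} = \sum_{i=1}^{k} \lambda_i \qquad (2.3)$$

The mean time to failure (MTTF) is also used to characterize reliability. The MTTF is the mean time that elapses before the first failure occurs and is equal to the area under the reliability curve:

$$MTTF = \int_0^{\infty} R(t)\,dt \qquad (2.4)$$

For a constant failure rate it can be calculated easily as

$$MTTF = \frac{1}{\lambda_{total}} \qquad (2.5)$$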
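The step from the area under the reliability curve to the simple reciprocal form holds when the failure rate is constant over time, an assumption that the text does not state explicitly. A short derivation under that assumption:

```latex
% Assumption: constant failure rate \lambda (exponential reliability law).
R(t) = e^{-\lambda t}
\quad\Longrightarrow\quad
\mathrm{MTTF} = \int_0^{\infty} e^{-\lambda t}\,dt
= \left[-\frac{1}{\lambda}\, e^{-\lambda t}\right]_0^{\infty}
= \frac{1}{\lambda}
```

For a system of several components in series, $\lambda$ is replaced by the total failure rate of the system.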

Different approaches can be used to calculate reliability. A well-known method is to use failure rate catalogs. Various catalogs are available, e.g. the Military Handbook (MIL-HDBK-217F) and the Recueil de Données de Fiabilité (RDF 2000).

2.2 Military Handbook (MIL-HDBK-217F) Method

Military Handbook 217F was issued by the US Department of Defense, Washington DC, in 1991 and last revised in 1995. This revision is also the last one, as the Department of Defense has discontinued updating the standard. Hence, newer electronic devices such as IGBTs are not considered in this standard, and many reference values are too conservative for currently available devices. Regardless, MIL-HDBK-217F is generally accepted and often used to determine reliability. Its models were developed from historical part failure rates.

2.2.1 Component failure rate for IGBT

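The component failure rate $\lambda_p$ is computed by multiplying a component base failure rate by application-specific $\pi$-factors:

$$\lambda_p = \lambda_b \, \pi_T \, \pi_A \, \pi_Q \, \pi_E \quad \text{failures}/10^{6}\ \text{hours} \qquad (2.6)$$

Here $\lambda_b$ is the base failure rate,
$\pi_T$ is the temperature factor,
$\pi_A$ is the application factor,
$\pi_Q$ is the quality factor, and
$\pi_E$ is the environmental factor.

However, no $\pi$-factor exists that takes temperature cycles into consideration.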
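The handbook calculation can be sketched in a few lines of Python. This is a minimal illustration, not the handbook procedure itself: the function names are hypothetical, the factor values are placeholders chosen for easy arithmetic rather than handbook data, and a series system with constant failure rates is assumed.

```python
def part_failure_rate(lambda_b, pi_t, pi_a, pi_q, pi_e):
    """Part failure rate in failures per 10^6 hours: base rate times the pi-factors."""
    return lambda_b * pi_t * pi_a * pi_q * pi_e


def system_mttf_hours(part_rates_per_1e6_h):
    """MTTF in hours of a series system, assuming constant failure rates."""
    total_per_hour = sum(part_rates_per_1e6_h) / 1.0e6
    return 1.0 / total_per_hour


# Placeholder factor values chosen for easy arithmetic, not handbook data.
igbt_rate = part_failure_rate(lambda_b=0.01, pi_t=2.0, pi_a=10.0, pi_q=1.0, pi_e=5.0)

# Six identical switches in series: the single failure rates add up.
inverter_mttf = system_mttf_hours([igbt_rate] * 6)
print(igbt_rate, inverter_mttf)
```

Because the rates simply add, every extra component shortens the system MTTF, which mirrors the product rule for reliabilities given earlier.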

III. ELECTRICAL MODELING AND CALCULATION OF POWER LOSSES

During the design phase of an inverter, it is important to make a good thermal design such that, on the one hand, the temperatures of the components never exceed their specified maximum temperature and, on the other hand, the cooling system is not oversized. In hybrid electric vehicles, the inverter load cannot be derived directly from the current load status. Instead, the inverter load is computed by a complex algorithm that considers the motor speed, the required torque, the state of charge of the traction battery, etc.

The electrical simulation includes the inverter model and

computes the currents and voltages at the terminals of the

inverter. These values are stored in a file which is used as

input for the thermal simulation. The advantage of this

procedure is that the results of the electrical simulation can be

reused for different thermal simulations, if nothing in the

model is changed.

SIMULATION AND RESULTS

5.1 INTRODUCTION

A block diagram representation of the whole work is shown in Fig. 5.1. The figure shows a three-phase inverter with IGBT/diode pairs as switching devices, a constant DC supply as the input to the inverter model, and a three-phase load.

The losses in the IGBT, i.e. the conduction loss and the switching loss, are calculated and fed to the thermal model. Note that the switching losses in an IGBT can be found from datasheets. The thermal model gives the junction temperature as its output, which is later used to calculate the reliability of the devices.


Fig.5.1 Block Diagram Representation of Model

5.2 THREE PHASE INVERTER

The Universal Bridge block used in the simulation model implements a universal three-phase power converter that consists of up to six power switches connected in a bridge configuration. The type of power switch and the converter configuration are selectable from the dialog box. The Universal Bridge block allows simulation of converters using both naturally commutated (or line-commutated) power electronic devices (diodes or thyristors) and forced-commutated devices (GTO, IGBT, MOSFET).

5.2.1 DESCRIPTION OF IGBT

The important specifications of IGBTs are as follows:

INPUT (g) - PWM switching signal to control the opening and

closing of the IGBT.

OUTPUT (m) - The Simulink output of the block is a vector

containing two signals. These signals are demultiplexed by using the

Bus Selector block provided in the Simulink library. These signals

are -

1. IGBT Current (A)

2. IGBT Voltage (V)

The Parameters of the IGBT used in simulation model are as follows:

1. Internal resistance (Ron) - The internal resistance Ron of the IGBT

device, in ohms (Ω). In this model it is 1mΩ.

2. Snubber resistance (Rs) - The snubber resistance, in ohms (Ω).

The Snubber resistance Rs is set to infinite to eliminate the snubber

from the model.

3. Snubber capacitance (Cs) - The snubber capacitance in farads (F).

The Snubber capacitance Cs is set to zero to eliminate the snubber.

5.3 CONTROL CIRCUIT

The control circuit generates pulses for a carrier-based pulse width modulator (PWM) that drives the IGBTs. For each arm, the pulses are generated by comparing a triangular carrier waveform with a reference modulating signal. The modulating signals can be generated by the PWM generator itself, or they can be a vector of external signals connected at the input of the block. Three reference signals are needed to generate the pulses for a three-phase bridge.

The amplitude (modulation index), phase, and frequency of the reference signals can be changed to control the output voltage on the AC terminals of the bridge connected to the PWM Generator block. The two pulses firing the two devices of an arm are complementary to each other; for example, whenever pulse 1 is low (0), pulse 2 is high (1), and likewise for the pairs 3/4 and 5/6.

INPUT – Internal generation of modulating signals.

OUTPUT - Six pulses are generated for a three-arm bridge. Pulses 1,

3, and 5 fire the upper devices of the first, second, and third arms.

Pulses 2, 4, and 6 fire the lower devices.

The parameters of the control circuit used in simulation are as

follows:

1. Carrier Frequency (Hz) – 1080 Hz.

2. Sample Time (sec) – 5.14 µsec.

3. Modulation Index – 0.8. The amplitude of the internal

sinusoidal modulating signal. The Modulation index must be greater

than 0 and lower than or equal to 1. This parameter is used to control

the amplitude of the fundamental component of the output voltage of

the controlled bridge.

4. Frequency of Output Voltage – 60 Hz. The frequency, in hertz, of

the internal modulating signals. This parameter is used to control the

fundamental frequency of the output voltage of the

controlled bridge.

5. Phase of Output Voltage (degrees) – 0.
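The carrier comparison described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the Simulink block's implementation: the function name `pwm_pulses`, the carrier range of -1 to +1, and the 120 degree spacing of the three references are choices made here. The default parameter values are the ones listed above.

```python
import math


def pwm_pulses(t, f_carrier=1080.0, f_out=60.0, m=0.8, phase_deg=0.0):
    """Return the six gate pulses [p1, p2, p3, p4, p5, p6] at time t (seconds)."""
    # Triangular carrier between -1 and +1 (assumed shape: starts at -1 and rises).
    x = (t * f_carrier) % 1.0
    carrier = 4.0 * x - 1.0 if x < 0.5 else 3.0 - 4.0 * x

    pulses = []
    for arm in range(3):
        # Sinusoidal reference for this arm, shifted by 120 degrees per arm.
        angle = 2.0 * math.pi * f_out * t + math.radians(phase_deg) - arm * 2.0 * math.pi / 3.0
        reference = m * math.sin(angle)
        upper = 1 if reference >= carrier else 0
        # The lower device of the arm gets the complementary pulse.
        pulses += [upper, 1 - upper]
    return pulses


sample_time = 5.14e-6
samples = [pwm_pulses(k * sample_time) for k in range(3000)]
print(samples[0])
```

Pulses 1, 3 and 5 correspond to the upper devices and pulses 2, 4 and 6 to the lower devices, so each pair always sums to one.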

5.4 CALCULATION OF LOSSES IN IGBT

5.4.1 CONDUCTION LOSSES

As described in detail in chapter 4, the conduction loss in an IGBT is given by the product of the collector-to-emitter voltage of the IGBT while it is conducting and the collector current:

$$P_{cond} = V_{CE(on)} \, I_C \qquad (5.1)$$

5.4.2 SWITCHING LOSSES

The best way to find the switching losses in an IGBT is to use the datasheets provided by the manufacturer. For this model, the IXER 35N120D1 by IXYS is used.

5.5 THERMAL MODEL

In the thermal model, the transient thermal impedance curve (Fig. 5.3) provided in IGBT/diode datasheets is used to find the parameters of the thermal network (given in Fig. 4.2).


Fig.5.2 Block Diagram Representation of THERMAL MODEL.

Some manufacturers provide the values of thermal resistance and capacitance in their datasheets. In most datasheets, however, the information required to obtain the thermal network parameters is given in the form of a transient thermal impedance curve ($Z_{th}(t)$).

Fig. 5.3 Transient Thermal Impedance Curve
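Once thermal resistances and capacitances are known, the junction temperature can be obtained by integrating the thermal network. The sketch below assumes a Foster-type network (parallel RC stages in series), which may differ from the network of Fig. 4.2; the ambient temperature, the constant 40 W input and the function name are illustrative assumptions. The R and C values are the calculated IGBT values reported in this paper.

```python
def junction_temperature(power_w, dt, r_th, c_th, t_ambient=25.0):
    """Integrate an assumed Foster-type RC network with forward Euler.

    power_w: power loss samples in W, one per time step of dt seconds.
    Returns the junction temperature trace in degC.
    """
    stage_rise = [0.0] * len(r_th)  # temperature rise across each RC stage
    trace = []
    for p in power_w:
        for i, (r, c) in enumerate(zip(r_th, c_th)):
            # C * dT/dt = P - T / R for each stage
            stage_rise[i] += dt * (p - stage_rise[i] / r) / c
        trace.append(t_ambient + sum(stage_rise))
    return trace


R_TH = [0.238, 0.362]  # IGBT thermal resistances, degC/W
C_TH = [0.095, 0.240]  # IGBT thermal capacitances, J/degC

dt = 1.0e-3
losses = [40.0] * 2000  # hypothetical constant 40 W for 2 s
tj = junction_temperature(losses, dt, R_TH, C_TH)
print(round(tj[-1], 2))
```

With a constant input the trace settles at the ambient temperature plus the power times the sum of the resistances, which is a quick sanity check for any set of fitted parameters.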

5.5.1 CURVE FITTING

Here a curve fitting technique is used to approximate the curve by eq. 4.10. The points taken for data fitting are:

For IGBT

t = [0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0]

Zth = [0 0.38 0.5 0.58 0.59 0.595 0.6 0.6 0.6 0.6 0.6]

For Diode

t = [0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0]

Zth = [0 0.79 1.01 1.19 1.25 1.28 1.3 1.3 1.3 1.3 1.3]

The curve is first smoothed with a moving average. An exponential fit is then applied to the smoothed curve; the governing equation is

a+b*exp(-c*x)+d*exp(-e*x)

The fitted values of a, b, c, d and e give the coefficients for the thermal network equations.

Table 5.1 gives the values of the coefficients of the equation fitted by the exponential curve fitting technique.

Table 5.2 gives the values of the coefficients of the transfer function found from the transient thermal impedance curves.

The calculated thermal resistance and thermal capacitance values are given in Table 5.3.
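The exponential fit can be reproduced without a fitting toolbox. The sketch below is one possible approach, not necessarily the one used by the authors: it grid-searches the two decay rates c and e and, for each pair, solves the remaining coefficients a, b and d by linear least squares. The candidate range of decay rates, the helper names and the omission of the moving-average step are assumptions made here. The data points are the IGBT values listed above.

```python
import math

t_data = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
z_data = [0.0, 0.38, 0.5, 0.58, 0.59, 0.595, 0.6, 0.6, 0.6, 0.6, 0.6]


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return None  # singular: the basis functions are not distinguishable
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for k in range(col, 4):
                a[r][k] -= f * a[col][k]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = a[i][3] - sum(a[i][k] * x[k] for k in range(i + 1, 3))
        x[i] = s / a[i][i]
    return x


def fit_two_exponentials(t, z, rates):
    """Fit a + b*exp(-c*t) + d*exp(-e*t) with c < e taken from `rates`."""
    best = None
    for i, c in enumerate(rates):
        for e in rates[i + 1:]:
            basis = [[1.0, math.exp(-c * ti), math.exp(-e * ti)] for ti in t]
            # Normal equations (B^T B) x = B^T z for the linear coefficients.
            btb = [[sum(row[p] * row[q] for row in basis) for q in range(3)] for p in range(3)]
            btz = [sum(row[p] * zi for row, zi in zip(basis, z)) for p in range(3)]
            coeffs = solve3(btb, btz)
            if coeffs is None:
                continue
            a, b, d = coeffs
            sse = sum((a + b * r[1] + d * r[2] - zi) ** 2 for r, zi in zip(basis, z))
            if best is None or sse < best[0]:
                best = (sse, a, b, c, d, e)
    return best


rates = [float(k) for k in range(1, 101)]  # candidate decay rates (assumed range)
sse, a, b, c, d, e = fit_two_exponentials(t_data, z_data, rates)
rmse = math.sqrt(sse / len(t_data))
print(f"a={a:.3f} b={b:.3f} c={c:.1f} d={d:.3f} e={e:.1f} rmse={rmse:.4f}")
```

Separating the linear coefficients from the decay rates keeps the search two-dimensional and avoids the need for starting guesses; a nonlinear least-squares routine would be the usual choice when one is available.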

TABLE 5.1

Table 5.3

5.6 RESULTS AND DISCUSSION

The simulation was carried out for a three-phase IGBT inverter used in two different applications: a six-step VSI induction motor drive and a space vector PWM VSI induction motor drive. The junction temperatures of six IGBTs and six diodes are simulated. In this case the temperatures do not differ significantly among the IGBT junctions or among the diode junctions. Hence, the temperatures of one IGBT junction and one diode junction are presented here. The simulations are carried out for both cases with different simulation times, and the speed and torque values are changed during the simulations to better represent driving cycles. The results improve for longer simulation run times.

5.6.1 SIX STEP GENERATION TECHNIQUE

The results shown here are for the thermal coefficient values given in datasheets. The results for thermal coefficient values calculated from the transient thermal impedance curve by the curve fitting technique are given in Appendix III.

Parameters of fitted curve

IGBT        DIODE
-0.499      -1.099
7.772       6.901
69.518      40.212

Calculated thermal resistance and thermal capacitance

                       IGBT          DIODE
Thermal resistance     0.238 °C/W    0.650 °C/W
Thermal resistance     0.362 °C/W    0.650 °C/W
Thermal capacitance    0.095 J/°C    0.064 J/°C
Thermal capacitance    0.240 J/°C    0.133 J/°C

Values of coefficients

IGBT        DIODE
0.60        1.30
0.0202      0.0562
0.1431      0.1698

Fig. 5.4 Stator Current, Speed and Torque Curve (panels: stator current (A), rotor speed (rpm) and torque (Nm), each versus time (sec))

Fig. 5.4 shows the changes in torque and speed, which occur at 1 sec, 1.5 sec, 2.5 sec and 4 sec. The stator current changes in accordance with the changes in speed and torque.

A speed reference step from 0 to 1800 rpm is applied at t = 0. The speed set point does not go instantaneously to 1800 rpm but follows the acceleration ramp. The motor reaches steady state at t = 1 s.

At t = 1.5 s, a decelerating torque is applied to the motor's shaft, and a speed decrease can be observed. Since the rotor speed is higher than the synchronous speed, the motor works in generator mode. The braking energy is transferred to the DC link and the bus voltage tends to increase. However, the overvoltage activates the braking chopper, which causes the voltage to reduce. In this example, the braking resistance is not big enough to avoid a voltage increase, but the bus voltage is maintained within tolerable limits.

At t = 2.5 s, the torque applied to the motor's shaft steps from 30 Nm to 0 Nm. A drop in DC bus voltage and speed can be observed. At this point, the DC bus controller switches from braking to motoring mode.

At t = 4 s, the load torque is switched from 0 to 15 Nm and the speed of the motor again starts following the acceleration ramp. The motor again reaches a steady state at t = 4.4 s.


Fig. 5.5 Power Loss, Junction Temperature Curve (panels: power loss (W) and two junction temperatures (deg C), each versus time (sec))

Fig. 5.5 shows the power losses that occur in the IGBTs. The total power loss is the sum of the conduction losses and the switching losses. The switching losses are constant at 40 W, as shown in Fig. 5.5, so the fluctuation in the curve is due only to the variation of the conduction losses. The losses in a diode are the same as those in the IGBTs.

The junction temperatures of the IGBT and the diode are also shown. They indicate that the temperature in a diode is higher than that in the IGBTs. The curves show the variation in power and the temperature cycles caused by the variation in motor speed.

Fig. 5.6 Detected Temperature Cycles for IGBT

Fig. 5.7 Detected Temperature Cycles for Diode

The temperature cycles of the junction temperature are detected by the algorithm given in chapter 4. A total of 210 temperature cycles are detected. The curve clearly indicates that the number of temperature cycles is high at low values of temperature difference and decreases as the difference grows. Temperature cycles below 15 ºC are not very harmful to semiconductors, but they should still be considered because of their large number.
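The detection algorithm itself is not reproduced in this section. As a rough illustration of the idea only, the sketch below reduces a temperature trace to its local extrema and counts each swing between successive extrema by its size. This simple peak-to-valley counting is an assumption made here and may differ from the authors' algorithm; rainflow counting is a common alternative. The sample trace and function names are hypothetical.

```python
from collections import Counter


def turning_points(temps):
    """Reduce a temperature trace to its local maxima and minima."""
    points = [temps[0]]
    for value in temps[1:]:
        if value == points[-1]:
            continue  # flat segment: nothing new
        rising_before = len(points) >= 2 and points[-1] > points[-2]
        falling_before = len(points) >= 2 and points[-1] < points[-2]
        if (rising_before and value > points[-1]) or (falling_before and value < points[-1]):
            points[-1] = value  # same direction: extend the current swing
        else:
            points.append(value)  # direction changed: new extremum
    return points


def swing_histogram(temps):
    """Count swings between successive extrema, binned to the nearest 1 degC."""
    points = turning_points(temps)
    return Counter(int(round(abs(b - a))) for a, b in zip(points, points[1:]))


# Hypothetical junction temperature samples in degC.
trace = [30.0, 31.0, 33.0, 35.0, 32.0, 40.0, 30.0, 33.0, 30.0]
histogram = swing_histogram(trace)
print(sorted(histogram.items()))
```

The resulting histogram has the same shape as the detected-cycle plots: temperature difference on one axis and number of occurrences on the other.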

∆T (°C)   n(reldata)   N   N(Total) = N*n(reldata)   1/N(Total)

3 22 8.6071e+006 1.89E+08 5.28E-09

4 20 8.1873e+006 1.64E+08 6.11E-09

5 27 7.7880e+006 2.10E+08 4.76E-09

6 7 7.4082e+006 5.19E+07 1.93E-08

7 4 7.0469e+006 2.82E+07 3.55E-08

8 4 6.7032e+006 2.68E+07 3.73E-08

9 9 6.3763e+006 5.74E+07 1.74E-08

10 5 6.0653e+006 3.03E+07 3.30E-08

11 5 5.7695e+006 2.88E+07 3.47E-08

12 5 5.4881e+006 2.74E+07 3.64E-08

13 5 5.2205e+006 2.61E+07 3.83E-08

14 3 4.9659e+006 1.49E+07 6.71E-08

15 3 4.7237e+006 1.42E+07 7.06E-08

16 3 4.4933e+006 1.35E+07 7.42E-08

17 5 4.2741e+006 2.14E+07 4.68E-08

18 3 4.0657e+006 1.22E+07 8.20E-08

19 3 3.8674e+006 1.16E+07 8.62E-08

20 5 3.6788e+006 1.84E+07 5.44E-08

21 4 3.4994e+006 1.40E+07 7.14E-08

22 4 3.3287e+006 1.33E+07 7.51E-08

23 4 3.1664e+006 1.27E+07 7.90E-08

24 4 3.0119e+006 1.20E+07 8.30E-08

25 4 2.8650e+006 1.15E+07 8.73E-08

26 4 2.7253e+006 1.09E+07 9.17E-08

27 4 2.5924e+006 1.04E+07 9.64E-08

28 4 2.4660e+006 9.86E+06 1.01E-07

29 4 2.3457e+006 9.38E+06 1.07E-07

30 4 2.2313e+006 8.93E+06 1.12E-07

31 4 2.1225e+006 8.49E+06 1.18E-07

32 4 2.0190e+006 8.08E+06 1.24E-07

33 4 1.9205e+006 7.68E+06 1.30E-07

34 4 1.8268e+006 7.31E+06 1.37E-07

35 4 1.7377e+006 6.95E+06 1.44E-07

36 4 1.6530e+006 6.61E+06 1.51E-07

37 4 1.5724e+006 6.29E+06 1.59E-07

38 4 1.4957e+006 5.98E+06 1.67E-07

F(t) = sum of 1/N(Total) = 2.78e-6

R(t) = 1 - F(t) = 0.99999722


Table 5.4 Number of Temperature Cycles and Reliability
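The arithmetic behind Table 5.4 can be checked with a short script. The cycle counts are copied from the table. The closed form used for N is inferred here from the tabulated values, which it matches to the printed precision; the authors do not state it in this section, so it should be read as an assumption.

```python
import math

delta_t = list(range(3, 39))  # temperature differences in degC, as in Table 5.4
n_cycles = [22, 20, 27, 7, 4, 4, 9, 5, 5, 5, 5, 3, 3, 3, 5, 3, 3, 5] + [4] * 18


def cycles_to_failure(dt):
    """N for a given temperature difference (form inferred from the N column)."""
    return 1.0e7 * math.exp(-0.05 * dt)


# N(Total) = N * n for every row, then F(t) as the sum of the last column.
n_total = [cycles_to_failure(dt) * n for dt, n in zip(delta_t, n_cycles)]
f_t = sum(1.0 / x for x in n_total)
r_t = 1.0 - f_t

print(sum(n_cycles), f"{f_t:.3e}", f"{r_t:.8f}")
```

The cycle counts add up to the 210 detected cycles mentioned in the text, and the script follows the table's own column definitions rather than proposing a different damage model.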

The reliabilities found by inserting the datasheet values directly and by using the curve fitting technique show that the curve fitting technique gives the better reliability.

5.6.2 SPACE VECTOR PWM TECHNIQUE

Fig. 5.8 Stator Current, Speed and Torque (panels: stator current (A), rotor speed (rpm) and electromagnetic torque (Nm), each versus time (sec))

At time t = 0 s, the speed set point is 1800 rpm. The speed follows the acceleration ramp precisely and reaches a steady state at t = 1 s.

At t = 1.5 s, the full load torque is applied to the motor shaft while the motor speed is still ramping to its final value. This forces the electromagnetic torque to increase to a high value and then to stabilize at 20 Nm once the speed ramping is completed and the motor has reached 1200 rpm.

At t = 2.5 s, the speed set point is changed to 1500 rpm and the electromagnetic torque again reaches a high value, so that the speed ramps precisely at 1800 rpm/s up to 1500 rpm under full load.

At t = 4 s, the mechanical load changes from 0 Nm to 15 Nm, which causes the electromagnetic torque to stabilize at approximately 20 Nm shortly after. Note that the DC bus voltage increases since the motor is in braking mode. This increase is limited by the action of the braking chopper.

Fig. 5.9 Power Loss and Temperature Curves (panels: power loss (W) and two junction temperatures (deg C), each versus time (sec))

Fig. 5.10 Detected Temperature Cycles for IGBT (temperature curve: temperature in degrees Celsius versus time in seconds; histogram: number of times versus difference in temperature)

Fig. 5.11 Detected Temperature Cycles for Diode (temperature curve: temperature in degrees Celsius versus time in seconds; histogram: number of times versus difference in temperature)

Table 5.5 Number of Temperature Cycles and Reliability

5.6.3 COMPARISON OF RELIABILITIES

TABLE 5.6 CALCULATED MTTFs

In Table 5.6 the calculated MTTFs for the two approaches are compared. Even though the same simulation data were used, the two approaches yield component MTTFs that differ by orders of magnitude. The MTTFs for IGBTs and diodes come out nearly the same in both applications, because both experiments are carried out under nearly the same operating conditions. The reliability calculated with the Military Handbook does not consider the effect of temperature cycling; hence the MTTFs from this method are the same for all three cases. In all cases the IGBTs come out as the least reliable component.

∆T (°C)   n(reldata2)   N   N(Total) = N*n(reldata2)   1/N(Total)

3 3 8.6071e+006 2.58E+07 3.87E-08

4 3 8.1873e+006 2.46E+07 4.07E-08

5 3 7.7880e+006 2.34E+07 4.28E-08

6 4 7.4082e+006 2.96E+07 3.37E-08

7 4 7.0469e+006 2.82E+07 3.55E-08

8 4 6.7032e+006 2.68E+07 3.73E-08

9 3 6.3763e+006 1.91E+07 5.23E-08

10 3 6.0653e+006 1.82E+07 5.50E-08

11 3 5.7695e+006 1.73E+07 5.78E-08

12 3 5.4881e+006 1.65E+07 6.07E-08

13 3 5.2205e+006 1.57E+07 6.39E-08

14 3 4.9659e+006 1.49E+07 6.71E-08

15 3 4.7237e+006 1.42E+07 7.06E-08

16 3 4.4933e+006 1.35E+07 7.42E-08

17 3 4.2741e+006 1.28E+07 7.80E-08

18 3 4.0657e+006 1.22E+07 8.20E-08

19 3 3.8674e+006 1.16E+07 8.62E-08

20 3 3.6788e+006 1.10E+07 9.06E-08

21 3 3.4994e+006 1.05E+07 9.53E-08

22 3 3.3287e+006 9.99E+06 1.00E-07

23 3 3.1664e+006 9.50E+06 1.05E-07

24 3 3.0119e+006 9.04E+06 1.11E-07

25 3 2.8650e+006 8.60E+06 1.16E-07

26 3 2.7253e+006 8.18E+06 1.22E-07

27 3 2.5924e+006 7.78E+06 1.29E-07

28 3 2.4660e+006 7.40E+06 1.35E-07

29 3 2.3457e+006 7.04E+06 1.42E-07

30 3 2.2313e+006 6.69E+06 1.49E-07

31 3 2.1225e+006 6.37E+06 1.57E-07

32 3 2.0190e+006 6.06E+06 1.65E-07

33 3 1.9205e+006 5.76E+06 1.74E-07

34 3 1.8268e+006 5.48E+06 1.82E-07

F(t) = sum of 1/N(Total) = 2.95E-06

R(t) = 1 - F(t) = 0.99999705

MTTF (hrs)       Six-step             SVPWM                SVPWM (Ts = 50 sec)
                 IGBTs    Diodes      IGBTs    Diodes      IGBTs    Diodes
MIL-HDBK-217
Coffin-Manson

The results for the SVPWM technique for a simulation time of 50 sec are given in Appendix II.


Carry Ripple Adder based on Charge Recycling for Lower Energy MTCMOS

Arvind Kumar, Member, IEEE, Sanjeev Rai, Sarad Shrestha, ECED, MNNIT, Allahabad

Abstract: Multi-threshold CMOS (MTCMOS) technology uses MOSFETs with a low threshold voltage (for speed enhancement) and MOSFETs with a high threshold voltage (for suppressing standby leakage current during the sleep period). In such a design, frequent mode transitions, i.e. active to sleep and sleep to active, may occur and consume a significant amount of energy. This paper presents a charge recycling concept between virtual supply and virtual ground that reduces the dynamic energy consumption during mode transitions. It presents the simulation of a two-bit carry ripple adder used in a 2-bit accumulator, showing a 75% reduction in dynamic energy consumption during mode transition compared with a ripple adder using conventional MTCMOS.

Index Terms: Charge recycling, Gated ground, Gated power, Multi-threshold voltage, Virtual power node

I. INTRODUCTION

Low power design is one of the most significant challenges in designing today's advanced VLSI circuits. Currently, portable devices consume a lot of energy during idle periods due to leakage current, which shortens battery lifetime. A popular low-leakage circuit technique is multi-threshold voltage technology, which disconnects the low threshold voltage (low Vt) logic gates from the power supply and/or the ground line through a high-Vt sleep transistor (Fig. 1) that is turned off during standby mode [1]. However, during the mode transitions from active to sleep and sleep to active, a significant amount of energy is consumed. If mode transitions are frequent, the energy overhead of turning the power gating structure off and on becomes more significant. As shown in Fig. 1, the virtual power node and the virtual ground node have high parasitic capacitance due to the diffusion capacitances of the transistors connected to the virtual lines and the wire capacitances.

This paper applies a new charge recycling technique to minimize energy consumption during mode transitions from active to sleep and sleep to active. The charge stored on the parasitic capacitances of the virtual power node (VP) and the virtual ground node (VG) is recycled during mode transition.

The remainder of the paper is organized as follows. The conventional MTCMOS and the virtual node voltages are described in section II, the charge recycling technique during sleep to active and active to sleep transitions and the parasitic capacitance calculation in section III, simulation results in section IV and the conclusion in section V.

II. CONVENTIONAL MTCMOS

The conventional MTCMOS circuit shown in Fig. 1 consists of two blocks. The first block is power gated by an NMOS sleep transistor, creating a virtual ground node (VG) between the block and the sleep transistor. The second block is power gated by a PMOS sleep transistor, creating a virtual power node (VP) between the sleep transistor and the block.

Fig. 1. Power gating structure using NMOS and PMOS sleep transistors. High Vt transistor is represented with thick line in channel region

A. Virtual Ground and virtual power voltages

In active mode, the NMOS and PMOS sleep transistors are turned on (linear region). During active mode, the voltage at the virtual ground node (VG) is zero and the voltage at the virtual power node (VP) is Vdd [2]. In sleep mode, both the NMOS and the PMOS sleep transistors are in cut-off. The virtual ground node (VG) then charges up to a high steady-state voltage (≈ 1.4 V) and the virtual power node (VP) settles at a low voltage (≈ 0 V) for a supply of 1.8 V, as shown in Fig. 2. A large portion of the total energy drawn from the power supply is stored in the parasitic capacitance (shown in Fig. 1 as a lumped capacitance) associated with the virtual nodes. The remaining portion of the energy is dissipated in the parasitic impedances of the low-Vt circuitry, i.e. a full adder in Fig. 1. In order to calculate the total dynamic energy, i.e. the energy consumed during the sleep to active and active to sleep mode transitions, we assume that the sleep period is long enough to charge the virtual ground node (VG) to VDD and discharge the virtual power node (VP) to zero.

Let CG-virtual and CP-virtual represents the total parasitic

capacitances at Virtual ground node (VG) and virtual power

node (VP) respectively. Then energy consumed during sleep to

active mode transition is as follows:

(1)

Similarly during active to sleep mode, we assumed virtual

power node (VP) is at value of VDD and virtual ground node

(VG) at zero. For active to sleep mode transition, energy

consumed is as follows:

(2)

The total energy consumed during one cycle of active to

sleep and sleep to active is follows:

. (3)

III. CHARGE RECYCLING MTCMOS TECHNIQUE

The charge recycling technique includes the charge

recycling of charges stored at virtual ground node (VG) and

virtual power node (VP) is recycled through a transmission

gate [3, 4] shown Fig. 3.

A. Charge recycling during sleep to active mode transition

As mentioned in Section II, during sleep mode the virtual ground node (VG) is charged to almost VDD and the virtual power node (VP) is discharged to almost zero. Before the sleep transistors are turned on to enter the active state, the transmission gate is turned on for a short period [3]. This allows charge sharing between the virtual ground node (VG) and the virtual power node (VP) until the parasitic capacitances on the two nodes settle at a common voltage (Vf) [5], as shown in Fig. 5. Here we assume that the parasitic capacitances on the virtual nodes are almost equal. Once charge sharing is complete, i.e. the voltages on the virtual nodes are equal, the transmission gate is switched off and the sleep transistors are turned on to complete the sleep-to-active transition. The total energy drawn from the power supply to charge the parasitic capacitance at the virtual power node (VP) during the sleep-to-active transition is as follows:

(4)

B. Charge recycling during active to sleep mode transition

During the active state, the virtual power node (VP) is at VDD and the virtual ground node (VG) is at almost zero. Before the sleep transistors are turned off for the active-to-sleep transition, the transmission gate is switched on briefly to recycle charge between the virtual ground node (VG) and the virtual power node (VP). Charge sharing continues until both nodes reach the common voltage (Vf), at which point the transmission gate is switched off. The sleep transistors are then turned off. The charge recycling process is shown in Fig. 7. The parasitic capacitance at the virtual ground node (VG) draws the energy Eactive-sleep from the supply during the active-to-sleep transition, which is as follows:

(5)

Hence, the total energy drawn from the power supply during the active-to-sleep and sleep-to-active mode transitions is given as follows:

(6)

Fig. 2. Virtual ground voltage VG = 1.3 V and virtual supply voltage VP = 0 V during sleep mode.

Fig. 3. Charge recycling MTCMOS circuit with a transmission gate between the virtual ground node (VG) and the virtual power node (VP).

Fig. 4. Charge recycling signal (VCR).

Equations (1) to (5):

$$E_{\text{sleep-active}} = C_{\text{P-virtual}}\, V_{DD}^{2} \tag{1}$$

$$E_{\text{active-sleep}} = C_{\text{G-virtual}}\, V_{DD}^{2} \tag{2}$$

$$E_{\text{TOTAL}} = C_{\text{G-virtual}}\, V_{DD}^{2} + C_{\text{P-virtual}}\, V_{DD}^{2} = V_{DD}^{2}\,(C_{\text{G-virtual}} + C_{\text{P-virtual}}) \tag{3}$$

$$E_{\text{sleep-active}} = V_{DD}\,(V_{DD} - V_f)\, C_{\text{P-virtual}} \tag{4}$$

$$E_{\text{active-sleep}} = V_{DD}\,(V_{DD} - V_f)\, C_{\text{G-virtual}} \tag{5}$$
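The energy expressions for conventional MTCMOS, (3), and for charge recycling, (6), can be compared numerically. The following is a minimal Python sketch, not the authors' simulation: the capacitance value is hypothetical, and the common voltage Vf is estimated from ideal charge conservation between the two virtual rails, which is an assumption added here for illustration.

```python
def shared_voltage(c_g, c_p, v_g0, v_p0):
    """Common voltage Vf after ideal charge sharing (charge conservation)."""
    return (c_g * v_g0 + c_p * v_p0) / (c_g + c_p)


def conventional_energy(c_g, c_p, vdd):
    """Eq. (3): supply energy per full sleep/active cycle, no recycling."""
    return vdd ** 2 * (c_g + c_p)


def recycling_energy(c_g, c_p, vdd, v_f):
    """Eq. (6): supply energy per full cycle with charge recycling."""
    return vdd * (vdd - v_f) * (c_p + c_g)


VDD = 1.8            # V, supply voltage quoted for the 180 nm process
C_G = C_P = 10e-15   # F, hypothetical equal virtual-rail capacitances

# Sleep mode: VG sits near VDD and VP near zero before the gate closes.
v_f = shared_voltage(C_G, C_P, v_g0=VDD, v_p0=0.0)
e_conv = conventional_energy(C_G, C_P, VDD)
e_cr = recycling_energy(C_G, C_P, VDD, v_f)
saving = 1.0 - e_cr / e_conv

print(f"Vf = {v_f:.3f} V")
print(f"conventional: {e_conv * 1e15:.2f} fJ, recycling: {e_cr * 1e15:.2f} fJ")
print(f"ideal saving from (3) and (6): {saving:.0%}")
```

Under these idealized assumptions the saving predicted by (3) and (6) equals Vf/VDD, so it depends only on how far the virtual rails equalize before the sleep transistors switch.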

C. Capacitance calculation

For the capacitance calculation on a virtual node, the parasitic capacitances of all transistors connected to that node are summed. The MOSFET intrinsic capacitance (Fig. 6) mainly comprises structural capacitance, channel capacitance, and diffusion capacitance [6] [7].

Structural capacitance comprises the overlap capacitances: the gate-to-source overlap capacitance (CGSO) and the gate-to-drain overlap capacitance (CGDO). Channel capacitance depends on the operating region; for a digital circuit, we can take the average value over the three operating regions. Likewise, diffusion capacitance comprises the source-to-body (CSB) and drain-to-body (CDB) capacitances, each of which is calculated by the following equation:

(7)

where CJ is the zero-bias bulk capacitance per square meter and CJSW is the zero-bias perimeter capacitance per meter.
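As an illustration of the diffusion capacitance formula (7), the following Python sketch evaluates CJ x Area + CJSW x Perimeter with the CJ and CJSW values listed in Table I. The diffusion geometry (1 µm wide, 0.5 µm long) is a hypothetical example, and counting the full rectangle perimeter is a simplifying assumption.

```python
def diffusion_capacitance(cj, cjsw, width_um, length_um):
    """Eq. (7): bottom-plate plus sidewall capacitance, in fF.

    cj   -- zero-bias bulk capacitance, fF/um^2
    cjsw -- zero-bias perimeter capacitance, fF/um
    """
    area = width_um * length_um
    perimeter = 2.0 * (width_um + length_um)  # full rectangle (assumption)
    return cj * area + cjsw * perimeter


# CJ and CJSW values from Table I (TSMC 180 nm).
NMOS = {"cj": 0.77, "cjsw": 0.18}
PMOS = {"cj": 0.85, "cjsw": 0.33}

# Hypothetical diffusion region: 1 um wide, 0.5 um long.
c_n = diffusion_capacitance(NMOS["cj"], NMOS["cjsw"], 1.0, 0.5)
c_p = diffusion_capacitance(PMOS["cj"], PMOS["cjsw"], 1.0, 0.5)

print(f"NMOS diffusion capacitance: {c_n:.3f} fF")
print(f"PMOS diffusion capacitance: {c_p:.3f} fF")
```

Summing such contributions over every source and drain region tied to a virtual rail gives an estimate of CG-virtual or CP-virtual.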

IV. SIMULATION RESULT

We used the Cadence Spectre simulator and a 180 nm technology (Vtn,low = |Vtp,low| = 0.156 V and Vtn,high = |Vtp,high| = 0.386 V) for the simulation of the circuit. Two-bit static carry-ripple adders (using 28 transistors) are designed. The carry-ripple adder with conventional MTCMOS shown in Fig. 1 and the adder with the charge recycling technique shown in Fig. 3 are compared in terms of dynamic energy during mode transitions. All possible input vectors are applied to the circuit, and almost the same energy overhead is found for each. With charge recycling, the energy overhead during mode transition of the ripple adder is 75% lower than that of the adder with conventional MTCMOS. Fig. 8 shows the total energy overheads for a full cycle of mode transition, i.e. from active to sleep and sleep to active.

Fig. 5. Charge recycling waveform of the two-bit carry-ripple adder during mode transition from sleep mode to active mode.

Fig. 6. Capacitances of the MOS transistor (CGD, CGS, CGB, CDB, CSB).

Fig. 7. Charge recycling waveform of the two-bit carry-ripple adder during mode transition from active mode to sleep mode.

TABLE I. Process parameters of the TSMC 180 nm process for VDD = 1.8 V

Parameter        NMOS    PMOS
CGDO (fF/µm)     0.37    0.33
CJ (fF/µm2)      0.77    0.85
CJSW (fF/µm)     0.18    0.33

Equations (6) and (7):

$$E_{\text{Total(CR)}} = E_{\text{sleep-active}} + E_{\text{active-sleep}} = V_{DD}(V_{DD} - V_f)\,C_{\text{P-virtual}} + V_{DD}(V_{DD} - V_f)\,C_{\text{G-virtual}} = V_{DD}(V_{DD} - V_f)\,(C_{\text{P-virtual}} + C_{\text{G-virtual}}) \tag{6}$$

$$C_{\text{DIFF}} = C_{BP} + C_{SW} = C_J \times \text{Area} + C_{JSW} \times \text{Perimeter} \tag{7}$$


V. CONCLUSION

In this paper, a charge recycling MTCMOS technique for a two-bit ripple adder is proposed to reduce the dynamic energy overhead during the sleep-to-active and active-to-sleep mode transitions. A transmission gate is used for charge recycling between the virtual rails. We have shown a 75% reduction in the energy overhead during mode transitions, i.e. active to sleep and sleep to active, with the charge recycling technique compared to the conventional one. In standby mode, however, the circuit loses its data. As future work, a data-retention circuit can therefore be added to this design.

REFERENCES

[1] S. Mutoh, T. Douseki, Y. Matsuya, T. Aoki, S. Shigematsu, and J. Yamada, "1 V power supply high-speed digital circuit technology with multi-threshold-voltage CMOS," IEEE J. Solid-State Circuits, vol. 30, no. 8, pp. 847-854, Aug. 1995.

[2] A. Abdollahi, F. Fallah, and M. Pedram, "A Robust Power Gating Structure and Power Mode Transition Strategy for MTCMOS Design," IEEE Trans. Very Large Scale Integration (VLSI) Systems, vol. 15, Jan. 2007.

[3] E. Pakbaznia, F. Fallah, and M. Pedram, "Charge recycling in MTCMOS circuits: concept and analysis," in Proc. ACM/IEEE Des. Autom. Conf., 2006, pp. 97-102.

[4] Z. Liu and V. Kursun, "Charge Recycling between Virtual Power and Ground Lines for Low Energy MTCMOS," Proceedings of the IEEE/ACM International Symposium on Quality Electronic Design, pp. 239-244, March 2007.

[5] J. P. Uyemura, Introduction to VLSI Circuits and Systems, Wiley student edition.

[6] S. M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits: Analysis and Design, 3rd ed. Pearson Education.

[7] N. H. E. Weste, D. Harris, and A. Banerjee, CMOS VLSI Design: A Circuits and Systems Perspective, 3rd ed. Pearson Education.

[8] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolic, Digital Integrated Circuits: A Design Perspective, 2nd ed.

Fig. 8. The energy overheads of the MTCMOS 2-bit ripple adders (bar chart; vertical axis: energy in fJ, 0 to 50; bars: conventional gated MTCMOS and charge recycling MTCMOS).


VLP0105-1

Forthcoming CMOS Technology in Nanoscale Era

Shashank Mishra#1, Kshitij Bhargava#2, Rohit Tripathi#3, Piyush Jain#4

Electronics and Communication Engineering (Microelectronics and Embedded Technology) Department

Jaypee Institute of Information Technology, Noida-201307, U.P., India

[email protected]

[email protected]

[email protected] [email protected]

Abstract: CMOS technology has reached the sub-45 nm range. Nano-CMOS technology is expected to govern IC manufacturing for at least another couple of decades. Although many challenges lie ahead, further downsizing of the device to a few nanometers is still on the schedule of the International Technology Roadmap for Semiconductors (ITRS). Several technological options for manufacturing nano-CMOS microchips are available or will be available very soon. This paper reviews the challenges of nano-CMOS downsizing and focuses on recent developments in the key technologies for nano-CMOS in the years to come.

I. INTRODUCTION

Among the numerous great inventions of the 20th century, electronics is the most important one. Almost everything related to human activities, such as power generation, transportation, entertainment, and medical care, is now provided and controlled by electronics. Semiconductors are a strategically important technological area for all nations. Electronic circuit development has been accomplished through the downscaling of component size since the replacement of vacuum tubes with transistors 40 years ago, and circuit characteristics have benefited greatly from this downsizing. We are now able to integrate millions of nanoscale CMOS transistors on a silicon chip occupying only a few square centimetres. The operating speed of recently developed microprocessors has already reached up to 5 GHz and is expected to increase further, although recent trends indicate that the increase in clock frequency may gradually saturate. CMOS integrated circuits and their core device technology are expected to evolve further for at least a couple of decades, and their importance will increase in future intelligent systems. CMOS device dimensions have been reduced to a millionth at the production level in the past 100 years. A hundred years ago, no one could have imagined that mankind would be able to make circuits consisting of billions of electronic components, each smaller than a bacterium, and that those circuits would fulfil the different needs of society. Future scaling trends have been predicted by the International Technology Roadmap for Semiconductors (ITRS) for 30 years up to 2040, when the physical gate length is expected to be 1 nm (as shown in Figure 1) [2]. It is believed that CMOS device downsizing will then approach its physical limit.

Figure 1: Feature size versus time in silicon ICs.

II. CHALLENGES IN SCALING

Device downsizing from 10 μm to the sub-45 nm range has brought many benefits in terms of speed, power, and cost. Apart from these improvements, however, one of the major causes of performance degradation in ultra-large-scale circuits is the interconnect delay, which arises from the increased parasitic resistance and capacitance of narrow, dense interconnect metal lines. Furthermore, the performance improvement is questionable for the ultra-small MOSFET itself. According to scaling theory, the drain current per unit gate width should remain constant. However, a significant reduction of the drain current per unit gate width was recently reported for MOSFETs with sub-45 nm gate lengths (as in Fig. 2) [2]. This phenomenon is due to the non-optimized MOSFET structure and process. In addition, the small drain current (several tens of microamperes per micrometer) at the scaled supply voltage becomes a major concern. Besides, the fringing capacitance of the gate electrode and the inversion layer capacitance also degrade the performance of ultra-small MOSFETs (as in Fig. 3) [2]. It is still doubtful at this moment whether such a small MOSFET can be used for high-speed devices. Hence, without any new technology support, further downscaling may only result in performance degradation.

Figure 2: Significant reductions of the unit drain currents

Figure 3: Challenging issues in further downsizing of the MOS transistor

III. IMPROVEMENTS IN CMOS

There have been proposals to change the structure of the transistor itself. Here we discuss the two most prominent structural changes: Silicon on Insulator (SOI) and Double-Gate CMOS (DGCMOS). The basic concept of Silicon on Insulator is fairly simple. Rather than fabricating a transistor whose body is connected to the substrate (Fig. 4.a), which is the normal method, an insulating oxide is first deposited on the substrate and the transistor is then fabricated on top of it (Fig. 4.b). In this way the body is electrically isolated from its surroundings, which means that the bulk-to-source voltage Vbs is now floating. This design provides a number of performance benefits. Vbs is now greater than or equal to zero, which lowers the threshold voltage, Vt, and so increases performance. Also, there is no junction area capacitance. Finally, stacked circuits do not suffer from the reverse body effect. The new structure also lends itself to some new uses, such as using the insulating layer as a high-resistance element.

Figure 4.a: Bulk CMOS Gate

Figure 4.b: SOI Gate

There are of course some disadvantages associated with the new structure as well. While the floating Vbs provides many benefits, its variability can also become problematic. The value of Vbs is a function of the present current level in the gate as well as the history of previous states of the gate. This means that the threshold of a gate may vary significantly throughout its operation. Also, if Vbs climbs too high it can cause pass-gate leakage. Techniques have been developed to address some of these issues. To test this technology, IBM redesigned some of the chips in its PowerPC line using SOI and demonstrated a 22-33% performance increase over the bulk CMOS versions of these chips. IBM also found that, although implementing SOI structures requires a proper understanding of the unique problems associated with the technology, it was possible to redesign existing technologies in a reasonable amount of time.

The second structure is more experimental, but promises great benefits in the future. That structure is the Double-Gate CMOS (DGCMOS). The basic idea is to add an extra gate (or more) to increase the coupling between the gate and the channel. Some have called this the "ideal structure for scalability". Most people agree that it is the design of the future, but there are some difficulties to overcome first. The difficulties arise in how to implement the DGCMOS structure. Using traditional fabrication processes, a second gate could be added below the body; however, aligning such a gate is troublesome. The proposed solution is known as the FinFET. This structure builds the drain, source, and gate up vertically (as in Fig. 5).


Figure 5: FinFET structure

This may solve the alignment issue, but there is one other challenge to overcome. In order to control short-channel effects (SCE), the body thickness must be ¼ of the gate length. This is a daunting challenge because the gate length is usually the smallest dimension that can be fabricated. Some technologies may address this, but more work needs to be done in this area.

The most popular idea is to use carbon nanotubes (CNTs) as transistors (a configuration example is shown in Fig. 6). This concept is very appealing because the device is still a transistor and could make use of all the architectural knowledge developed for CMOS. Carbon nanotubes, however, have a long way to go before they can start replacing silicon-based MOS transistors. First of all, the nanotube transistors developed to date have shown very poor performance characteristics. Many of the problems they exhibit are similar to the challenges CMOS is currently facing, such as high off-state leakage and source-to-drain tunneling. Also, despite the hopes for chemical self-assembly some day, it is still very difficult to produce nanotube transistors.

Figure 6: Basic carbon nanotube transistor

IV. CONCLUSIONS

Silicon MOSFETs have been the smallest electronic devices for several decades. The gate length used for high-performance logic is 45 nm in production and 5 nm in research. Note that a 5 nm gate length is the distance of 18 atoms, and a 0.8 nm oxide thickness is only two atomic layers. Si technology is without doubt the most successful nano-device technology. We do not see any realistic replacement for silicon devices. Even when Si devices reach the downsizing limit, whether at 10 nm, 5 nm, or 1 nm, other emerging devices such as molecular transistors will reach their own downsizing limits at similar dimensions. This decade is a critical period for moving from 45 nm to 10 nm technology. Most of the materials and manufacturing processes used in the deep-submicron era are now being pushed to their physical limits. New materials and technologies are required for further downscaling of the device to 10 nm technology and below. Immersion lithography for ultra-fine patterning, strained channels, nickel salicide, high-k gate dielectrics, low-k interlayers for interconnect, plasma doping, flash and laser annealing for source and drain doping, and elevated source and drain and three-dimensional MOSFETs for controlling short-channel effects would help to overcome the material and technological constraints and improve device performance at the ultra-small scale.

The final remark concerns a non-technical issue. We anticipate that it will be one of the most important issues for nano-CMOS technology development in the next 15 years. Most of the new mega-fabs being planned or under construction are in East and Southeast Asia, particularly Mainland China. In 10 or 15 years' time, the share of semiconductor manufacturing sites located in Asia (including Japan) will be quite substantial. Currently, Korea and Taiwan are in first place for semiconductor memory manufacturing and semiconductor foundry, respectively. They also lead technology development in the Asian region. Mainland China seems set to become another superpower in semiconductor manufacturing. China's share of semiconductor manufacturing will keep growing fast, supported by booming IC design houses and by the construction of new fabs with a remarkable increase in industrial investment, and China will be the most important large and rapidly expanding market. As in many other industries and other sectors of electronic products, Mainland China will eventually become "the factory of the world" in semiconductor manufacturing in 15 years or longer and will have a great impact on future nano-CMOS technology.

REFERENCES

[1] G. E. Moore, "Cramming more components onto integrated circuits," Electronics, vol. 38, no. 8, 1965.

[2] International Technology Roadmap for Semiconductors, 2003 Edition, Semiconductor Industry Association (SIA), Austin, Texas: SEMATECH, USA.

[3] H. Iwai, "Future semiconductor manufacturing-challenges and opportunities," IEDM Tech. Dig., 2004, pp. 1-16.

[4] H. Iwai, "CMOS downsizing toward sub-100 nm," Solid-State Electron., vol. 48, 2003, pp. 497-503.

[5] W. Zhao and Y. Cao, "New generation of Predictive Technology Model for sub-45 nm early design exploration," IEEE Trans. Electron Devices, no. 11, pp. 2816-2823, 2006.

[6] T. Morimoto, H. S. Momose, T. Iinuma, et al., "A NiSi salicide technology for advanced logic devices," IEDM Tech. Dig., 1991, pp. 653-656.

[7] T. Iizima, A. Nishiyama, Y. Ushiku, et al., "A novel selective Ni3Si contact plug technique for deep-submicron ULSIs," Symp. VLSI Technology, 1992, pp. 70-71.

[8] R. Tsuchiya, M. Horiuchi, S. Kimura, et al., "Silicon on thin BOX: A new paradigm of the CMOSFET for low-power and high-performance application featuring wide-range back-bias control," IEDM Tech. Dig., 2004, pp. 631-634.

[9] T. Ghani et al., "Scaling challenges and device design requirements for high performance sub-50 nm gate length planar CMOS transistors," Symp. VLSI Technology, 2000, pp. 174-175.

[10] B. Yu, "Scaling towards 35 nm gate length CMOS," in Proc. VLSI Symp., Kyoto, AMD, June 12-14, 2001, pp. 9-10.

[11] D. Connelly, C. Faulkner, and D.E. Group, "Performance advantage of Schottky source/drain in ultrathin-body silicon-on-insulator and dual gate CMOS," IEEE Trans. Electron Devices, vol. 50, no. 5, pp. 1340-1345, May 2003.

[12] J. Knickerbocker et al., IEEE Custom Integrated Circuits Conference (CICC), p. 659, 2005.

[13] G. Anelli, "Design and characterization of radiation tolerant integrated circuits in deep submicron CMOS technologies for the LHC experiments," Ph.D. Thesis, Institut National Polytechnique de Grenoble, France, December 2000, also available at http://www.cern.ch/RD49.

[14] D. Frank et al., "CMOS device and circuit limits," Proc. IEEE, vol. 89, Mar. 2001.

[15] Davari, R. H. Dennard, and G. G. Shahidi, "CMOS scaling, the next ten years," Proc. IEEE, vol. 83, p. 595, 1995.

C. Mead, "Scaling of MOS technology to submicrometer feature sizes," J. VLSI Signal Processing, pp. 9-25, 1994.

[16] Y. Taur and E. Nowak, "CMOS devices below 0.1 µm: How high will performance go?" in Proc. Int. Electron Devices Meeting, 1997, p. 215.


VLP0106-1

Abstract: A very simple MOSFET amplifier circuit that realizes both very high positive and very high negative resistances at its input and output terminals is presented. A mathematical model is a representation of a device or system that predicts its response under different types of excitation. The floating admittance matrix (FAM) approach is one of the neat methods for the mathematical modeling of electronic devices and their use in circuits. The zero-sum property of the floating admittance matrix provides a check that tells the analyst whether to proceed further or to re-examine the first equation itself. All transfer functions are represented as cofactors of the floating admittance matrix of the circuit.

Keywords: Amplifier, Common Source FET, Floating Admittance Matrix, Zero Sum property, Cofactors, Plots

INTRODUCTION

The input resistance of a MOSFET is supposed to be very high, yet a single-stage MOSFET amplifier is sometimes not suitable for certain applications, especially when high gain is required along with a change in resistance levels from positive to negative or from very high to very low. This requirement is met by cascading, cascoding, or a combination of both in different sections of the amplifier stages. Fig. 1 shows two stages of a MOSFET amplifier with RF connected between the output of the second stage and the input of the first stage. With proper adjustment of the feedback resistance, RF, one may realize extreme values of input and output resistance, both positive and negative.

The common-source amplifier is the most versatile MOSFET amplifier configuration. The common-source (CS) amplifier may be viewed as a transconductance amplifier or as a voltage amplifier. As a transconductance amplifier, the input voltage modulates the current going to the load. As a voltage amplifier, the input voltage modulates the current flowing through the MOSFET, changing the voltage across the output resistance accordingly. The input resistance of a conventional emitter follower, cathode follower, or source follower is limited by the finite value of the passive emitter/cathode/source resistance as well as the input bias resistance. In fact, the input bias resistance shunts the input resistance of the amplifier, and hence the effective input resistance is limited to at most the value of the input bias resistance.

A number of papers in the literature describe separate circuits for the realization of positive and negative resistances. The single simple set-up presented here realizes both positive and negative input and output resistances and saves a large number of active and passive components. Negative resistance is important in the design of oscillators, multivibrators, and filters, and in the synthesis of driving-point functions. An attractive method of controlling the line loss in telephone lines to any extent is to introduce a resistance covering a very large range of values in the impedance-boosting network. The realization of very high positive as well as negative resistances in an amplifier is all the more important for instrumentation.

This paper aims to develop the mathematical model of the common-source amplifier in the form of a floating admittance matrix. The floating admittance matrix of the MOSFET is used to derive the voltage gain, input resistance, and output resistance of the amplifier in its common-source configuration.

MATHEMATICAL MODEL OF FET

The two-stage common-source MOSFET amplifier can be represented as in Fig. 1, with feedback through RF from the output of the second stage to the input of the first stage.

Fig. 1. Two-stage common-source amplifier

The a.c. equivalent circuit of Fig. 1 is shown in Fig. 2. The matrix representation of the MOSFET as a two-port network (four terminals) is written as

On Demand Simulation of Input and Output Resistances of MOSFET Amplifier

Mrs. Meena Singh, Lecturer, Deptt. of ECE, University Polytechnic, B.I.T. Mesra, Ranchi ([email protected]) +91-9279265054

Arun Kumar Singh, Deptt. of ECE, Madan Mohan Malaviya Engg. College, Gorakhpur ([email protected]) +91-9312801316

Dr. B. P. Singh, Professor, Deptt. of ECE & EEE, Mody Institute of Technology & Science, Lakshmangarh ([email protected]) +91-9468688102


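$$\begin{bmatrix} i_1 = i_g \\ i_2 = i_d \\ i_3 = i_s \end{bmatrix} = \begin{bmatrix} g_g & 0 & -g_g \\ g_m & g_d & -(g_m + g_d) \\ -(g_g + g_m) & -g_d & g_g + g_m + g_d \end{bmatrix} \begin{bmatrix} v_1 = v_g \\ v_2 = v_d \\ v_3 = v_s \end{bmatrix} \tag{1}$$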

Fig.2 ac circuit of two-stage Common Source Amplifier

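The admittance matrix of the MOSFET as a device is expressed in [1]-[3]. With rows and columns ordered as terminals 1 (gate), 2 (drain), and 3 (source), its coefficient matrix is expressed as

$$Y = \begin{bmatrix} g_g & 0 & -g_g \\ g_m & g_d & -(g_m + g_d) \\ -(g_g + g_m) & -g_d & g_g + g_m + g_d \end{bmatrix} \tag{1}$$

The gate-to-source resistance of the MOSFET is assumed to be very large (ideally infinite) because the gate is insulated from the channel; hence g_g = 0 S. The coefficient matrix of the MOSFET in (1) then reduces to (2).

$$Y = \begin{bmatrix} 0 & 0 & 0 \\ g_m & g_d & -(g_m + g_d) \\ -g_m & -g_d & g_m + g_d \end{bmatrix} \tag{2}$$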
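The zero-sum property mentioned in the abstract can be checked mechanically on the device matrix. The sketch below is an illustrative Python check, not the authors' code; the helper names are hypothetical, and the signs follow the reading of the device matrix in which every row and every column sums to zero.

```python
def mosfet_fam(gm, gd, gg=0.0):
    """3x3 floating admittance matrix, terminals ordered gate, drain, source."""
    return [
        [gg, 0.0, -gg],
        [gm, gd, -(gm + gd)],
        [-(gg + gm), -gd, gg + gm + gd],
    ]


def is_zero_sum(matrix, tol=1e-12):
    """True if every row and every column of the matrix sums to zero."""
    rows_ok = all(abs(sum(row)) < tol for row in matrix)
    cols_ok = all(abs(sum(col)) < tol for col in zip(*matrix))
    return rows_ok and cols_ok


full = mosfet_fam(gm=5.0, gd=0.1, gg=1e-4)   # conductances in mS
reduced = mosfet_fam(gm=5.0, gd=0.1)         # gg = 0, the reduced form

print("full matrix zero-sum:", is_zero_sum(full))
print("reduced matrix zero-sum:", is_zero_sum(reduced))
```

A failed check at this stage signals a sign or bookkeeping error before any cofactor is expanded, which is the use of the property described in the abstract.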

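Thus the floating admittance matrices of the two MOSFETs (device 1 and device 2) connected as in Fig. 2 can be written as

$$Y_{\text{device1}} = \begin{bmatrix} 0 & 0 & 0 \\ g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) \\ -g_{m1} & -g_{d1} & g_{m1} + g_{d1} \end{bmatrix} \tag{3}$$

with rows and columns ordered as nodes 1, 2, 3, and

$$Y_{\text{device2}} = \begin{bmatrix} 0 & 0 & 0 \\ g_{m2} & g_{d2} & -(g_{m2} + g_{d2}) \\ -g_{m2} & -g_{d2} & g_{m2} + g_{d2} \end{bmatrix} \tag{4}$$

with rows and columns ordered as nodes 2, 4, 3.

Now the composite matrix of the two devices (device 1 and device 2), with rows and columns ordered as nodes 1, 2, 3, 4, is written as

$$Y_{\text{devices}} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ g_{m1} & g_{d1} & -(g_{m1} + g_{d1}) & 0 \\ -g_{m1} & -(g_{d1} + g_{m2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} & -g_{d2} \\ 0 & g_{m2} & -(g_{m2} + g_{d2}) & g_{d2} \end{bmatrix} \tag{5}$$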

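The overall admittance matrix for Fig. 2, with rows and columns ordered as nodes 1, 2, 3, 4, is written as

$$Y = \begin{bmatrix} g_s + G_{G1} + G_F & 0 & -(g_s + G_{G1}) & -G_F \\ g_{m1} & g_{d1} + G_{D1} + G_{G2} & -(g_{m1} + g_{d1} + G_{D1} + G_{G2}) & 0 \\ -(g_{m1} + g_s + G_{G1}) & -(g_{d1} + g_{m2} + G_{D1} + G_{G2}) & g_{m1} + g_{d1} + g_{m2} + g_{d2} + g_s + G_{G1} + G_{D1} + G_{G2} + G_L & -(g_{d2} + G_L) \\ -G_F & g_{m2} & -(g_{m2} + g_{d2} + G_L) & g_{d2} + G_L + G_F \end{bmatrix} \tag{6}$$

Equation (6) represents the floating admittance matrix [3], [4], [5] of the two-stage common-source amplifier.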
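One way to arrive at a matrix of the form (6) is to add the contribution of each element to a 4x4 array, one element at a time. The Python sketch below illustrates this; the function names are hypothetical, and the element placement (which conductance sits between which pair of nodes) is an assumption read off the entries of (6) rather than taken from the circuit diagram.

```python
def stamp_conductance(y, a, b, g):
    """Add a two-terminal conductance g between nodes a and b (0-based)."""
    y[a][a] += g
    y[b][b] += g
    y[a][b] -= g
    y[b][a] -= g


def stamp_mosfet(y, gate, drain, source, gm, gd):
    """Add the reduced MOSFET matrix (gate conductance taken as zero)."""
    y[drain][gate] += gm
    y[drain][drain] += gd
    y[drain][source] -= gm + gd
    y[source][gate] -= gm
    y[source][drain] -= gd
    y[source][source] += gm + gd


def overall_fam(gm1, gd1, gm2, gd2, gs, g_g1, g_d1x, g_g2, gl, gf):
    """Assemble the 4-node matrix; nodes 1..4 map to indices 0..3."""
    y = [[0.0] * 4 for _ in range(4)]
    n1, n2, n3, n4 = 0, 1, 2, 3
    stamp_mosfet(y, n1, n2, n3, gm1, gd1)          # device 1: g=1, d=2, s=3
    stamp_mosfet(y, n2, n4, n3, gm2, gd2)          # device 2: g=2, d=4, s=3
    stamp_conductance(y, n1, n3, gs + g_g1)        # source and bias at input
    stamp_conductance(y, n2, n3, g_d1x + g_g2)     # interstage GD1 and GG2
    stamp_conductance(y, n4, n3, gl)               # load
    stamp_conductance(y, n1, n4, gf)               # feedback
    return y


# Example values in mS (gs, gl and gf chosen only for illustration).
Y = overall_fam(gm1=5.0, gd1=0.1, gm2=5.0, gd2=0.1, gs=1.0,
                g_g1=0.001, g_d1x=1.0, g_g2=0.001, gl=1.0, gf=0.01)

row_sums = [sum(row) for row in Y]
col_sums = [sum(col) for col in zip(*Y)]
print("max |row sum|:", max(abs(s) for s in row_sums))
print("max |column sum|:", max(abs(s) for s in col_sums))
```

Because each stamp is itself zero-sum, the assembled matrix keeps the property automatically, so a nonzero row or column sum points to a wrongly placed element.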

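Now from (6) the input impedance of the circuit in Fig. 2 can be expressed as [1]-[3]

$$R_i = \frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)}{(g_{g1} + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_L + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]} \tag{7}$$

Similarly, its output impedance and voltage gain can be expressed as [1]-[3]

$$R_o = \frac{(g_{d1} + g_{g2} + G_D + G_G)(g_{g1} + G_G + G_F)}{(g_{g1} + g_s + G_G + G_F)(g_{d1} + g_{g2} + g_{m2} + G_D + G_G)(g_{d2} + G_F) - G_F\left[g_{m1} g_{m2} + (g_{d1} + g_{g2} + g_{m2} + G_D + G_G)\,G_F\right]} \tag{8}$$

$$A_V\big|^{43}_{13} = \frac{g_{m1} g_{m2} + G_F\,(g_{d1} + G_D + G_G)}{(g_{d1} + g_{g2} + G_D + G_G)(g_{d2} + G_L + G_F)} \tag{9}$$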

VERIFICATION ON MATLAB

The values of Ri, Ro, and the voltage gain AV for different values of source conductance and load conductance (0 mS, 1 mS, and 2 mS) have been computed in MATLAB. The outputs of the MATLAB programs have been plotted for Ri, Ro, and AV with respect to the feedback conductance, GF. If we assume that the two MOSFETs of Fig. 2 are properly biased to yield the same values of their internal parameters (gd1 = gd2 and gm1 = gm2), then typical values of the external and internal parameters for plotting the simulated input and output resistances can be given as:

gd1 = gd2 = 0.1 mS, gm1 = gm2 = 5 mS, GL = GD = 1 mS, GG1 = GG2 = GG = 0.001 mS, gg1 = gg2 = 0.0001 mS, GF = variable (0 mS to 0.15 mS).
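The sweep over GF can be reproduced in a few lines. The sketch below uses Python rather than the authors' MATLAB, and it assumes a particular reading of the input resistance expression (7): numerator (gd1 + gg2 + GD + GG)(gd2 + GL + GF), and denominator (gg1 + GG + GF)(gd1 + gg2 + gm2 + GD + GG)(gd2 + GL + GF) minus GF[gm1 gm2 + (gd1 + gg2 + gm2 + GD + GG) GF]. The function and variable names are hypothetical.

```python
# Conductances in mS, so resistances come out in kilo-ohms.
PARAMS = dict(gm1=5.0, gm2=5.0, gd1=0.1, gd2=0.1,
              g_D=1.0, g_G=0.001, gg1=1e-4, gg2=1e-4)


def input_resistance(gf, gl, p=PARAMS):
    """Input resistance from the assumed reading of (7), in kilo-ohms."""
    a = p["gg1"] + p["g_G"] + gf
    b_num = p["gd1"] + p["gg2"] + p["g_D"] + p["g_G"]
    b_den = b_num + p["gm2"]
    c = p["gd2"] + gl + gf
    den = a * b_den * c - gf * (p["gm1"] * p["gm2"] + b_den * gf)
    return b_num * c / den


def critical_gf(gl, p=PARAMS):
    """Feedback conductance (mS) at which the denominator crosses zero.

    The GF^2 terms cancel, so the denominator is linear in GF.
    """
    a0 = p["gg1"] + p["g_G"]
    b_den = p["gd1"] + p["gg2"] + p["g_D"] + p["g_G"] + p["gm2"]
    c0 = p["gd2"] + gl
    return a0 * b_den * c0 / (p["gm1"] * p["gm2"] - b_den * (a0 + c0))


for gl in (0.0, 1.0, 2.0):
    gf0 = critical_gf(gl)
    below = input_resistance(0.999 * gf0, gl)
    above = input_resistance(1.001 * gf0, gl)
    print(f"GL = {gl} mS: sign change near GF = {gf0:.4e} mS "
          f"(Ri = {below:.3e} kOhm just below, {above:.3e} kOhm just above)")
```

With these parameter values the sign change falls near GF = 2.7524e-05 mS for GL = 0 S and near GF = 0.0004038 mS for GL = 1 mS, which is where the plots of input resistance are reported to jump from large positive to large negative values.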

The plots show that the simulated input and output resistances can take on-demand values, both negative and positive, controlled by the feedback conductance connected between the two stages of the amplifier.

The plot of input resistance as a function of feedback conductance is shown in Figs. 3, 4, and 5 for load conductances of 0 S, 1 mS, and 2 mS, respectively, as per (7).

The following observations are recorded from the plots in Figs. 3, 4, and 5:

Fig.3 Input resistance as a function of feedback conductance for

GL= 0 S

a) For GL = 0 S, the input resistance is almost constant (1.148e+06 Ω) from the initial values of GF until GF reaches 2.7520e-05 mS; thereafter it rises exponentially (from 1.148e+06 Ω to 4.837e+06 Ω) as GF varies from 2.7520e-05 mS to 2.7523e-05 mS. It is interesting to note that Ri suddenly jumps down (from 4.837e+06 Ω to -6.828e+07 Ω) as GF varies from 2.7523e-05 mS to 2.7524e-05 mS. Ri then rises sharply to -4.237e+06 Ω as GF approaches 2.7525e-05 mS, increases linearly (from -4.237e+06 Ω to -1.473e+06 Ω) from GF = 2.7525e-05 mS to GF = 2.7527e-05 mS, and remains constant thereafter at -1.473e+06 Ω for higher values of GF.

b) For GL = 1 mS, the input resistance is almost constant at 3.289e+05 Ω from the initial values of GF until GF reaches 0.0004036 mS; thereafter Ri increases linearly (from 3.289e+05 Ω to 4.393e+07 Ω) from GF = 0.0004036 mS to GF = 0.0004038 mS and suddenly jumps down (to -7.805e+06 Ω) as GF reaches 0.00040381 mS. Ri then rises again (from -7.805e+06 Ω to -6.729e+05 Ω) from GF = 0.00040381 mS to GF = 0.0004039 mS and remains constant thereafter at -6.729e+05 Ω for higher values of GF.

c) For GL = 2 mS, the input resistance rises exponentially (from 216.5 Ω to 3331 Ω) from GF = 0.0001 mS to GF = 0.0011 mS, then suddenly jumps down to Ri = -4418 Ω at GF = 0.0012 mS, rises exponentially again (to -225.4 Ω) until GF = 0.002 mS, and remains constant thereafter at -225.4 Ω for higher values of GF.

Fig.4 Input resistance as a function of feedback conductance for

GL= 1 mS

Fig.5 Input resistance as a function of feedback conductance for

GL = 2 mS

The plot of output resistance as a function of feedback conductance (GF) is shown in Figs. 6, 7, and 8 for source conductances of 0 S, 1 mS, and 2 mS, respectively, as per (8).

The following observations are recorded from the plots in Figs. 6, 7, and 8:

a) For gs = 0 S, the output resistance is almost constant (1.735e+04 Ω) from the initial values of GF until GF reaches 2.752e-05 mS; thereafter it rises exponentially (from 1.735e+04 Ω to 5.452e+04 Ω) as GF varies from 2.7520e-05 mS to 2.7522e-05 mS. It is interesting to note that Ro suddenly jumps down (from 5.452e+04 Ω to -7.697e+05 Ω) as GF varies from 2.7522e-05 mS to 2.75242e-05 mS. Ro then rises sharply to -4.776e+05 Ω as GF reaches 2.75262e-05 mS, increases exponentially (from -4.776e+05 Ω to -1.252e+04 Ω) from GF = 2.75262e-05 mS to GF = 2.753e-05 mS, and remains constant thereafter at -1.252e+04 Ω for higher values of GF.


Fig.6 Output resistance as a function of feedback conductance for

GS = 0 S

Fig.7 Output resistance as a function of feedback conductance for

GS = 1 mS

b) For gs = 1 mS, the output resistance is almost constant at 237.9 Ω from the initial values of GF until GF reaches 0.03340 mS; thereafter Ro increases exponentially (from 237.9 Ω to 2829 Ω) from GF = 0.03340 mS to GF = 0.03341 mS and suddenly jumps down (to -7836 Ω) as GF reaches 0.033411 mS. Ro then rises again (from -7836 Ω to -22.83 Ω) from GF = 0.033411 mS to GF = 0.0335 mS and remains constant thereafter at -22.83 Ω for higher values of GF.

c) For gs = 2 mS, the output resistance rises exponentially (from 0.805 Ω to 39.85 Ω) from GF = 0.09 mS to GF = 0.1 mS, then suddenly jumps down to Ro = -1.028 Ω at GF = 0.11 mS and remains constant thereafter at -1.028 Ω for higher values of GF.

The voltage gain as a function of feedback conductance is plotted in Figs. 9 and 10 for load conductances of 0 S, 1 mS and 2 mS, as per (9).

The plots in Figs. 9 and 10 reveal that the voltage gain (AV) is an inverse function of the feedback conductance (GF); further, the voltage gain decreases as the source conductance (gs) increases, owing to their inverse relationship given by (9).

Fig.8 Output resistance as a function of feedback conductance for GS = 2 mS

Fig.9 Voltage gain as a function of feedback conductance for GL = 0 S

Fig.10 Voltage gain as a function of feedback conductance for GL = 1 mS and 2 mS

CONCLUSION

The plots in Figs. 3 to 8 reveal a region of very sudden change in the input and output resistances, from very high positive values to large negative values, for a very small change, of the order of 10^-5, in the value of the feedback conductance GF. This zone of very large variation in input and output resistance, both negative and positive, can be used for compensation of resistances to obtain a very high Q-factor in lossy networks.



Performance Analysis and Comparison of PFSCL and MCML

Kirti Gupta, Ranjana Sridhar, Jaya Chaudhary

DTU (formerly Delhi College of Engineering)

ABSTRACT:

CML, or current mode logic, is a differential logic style which offers high noise immunity and high speed of operation. In this paper we compare the performance of PFSCL (positive feedback source coupled logic) with MCML (MOS current mode logic), both of which are derivatives of the CML style. We show through simulations on Orcad PSPICE using 180 nm technology that PFSCL offers significant advantages over MCML in terms of power consumption, area occupied and propagation delay.

Owing to the growing market for digital signal processing and optical communication applications, commercial interest in high-resolution mixed-signal ICs has been growing. In mixed-signal ICs the analog and digital blocks are integrated on the same substrate, and hence the resolution of the analog block is limited by the dynamic switching noise produced by the digital block. The CMOS logic style is therefore not suitable, as it suffers from dynamic switching noise. Moreover, the CMOS advantage of zero static power consumption is lost when it is used at frequencies of hundreds of MHz to GHz. Several other logic styles have been proposed to reduce the dynamic switching noise in mixed-signal ICs, such as those in [2], [3] and [4]. The CML style is more robust to switching noise than the CMOS logic style [1]. Also, at high frequencies (hundreds of MHz to the GHz range) the CML style is more power efficient than CMOS logic [7]. This type of logic was first implemented using bipolar transistors [5] and later extended to MOS transistors. It has lower power consumption than ECL but is slower than ECL.

MCML is an extension of current mode logic in which MOSFETs are used as the transistors instead of BJTs. A constant current source biases the differential pair of transistors, which switches the current from one transistor of the pair to the other depending on the applied input. The differential operation suppresses the noise coupled with the signal inputs.

PFSCL is a new logic style which introduces positive feedback into single-ended MCML gates [7]. This eliminates the need for a complementary second input signal while still maintaining the differential mode of operation.

In the following, the operation of MCML gates is explained in section II. The architecture of PFSCL and its operation are addressed in section III. In section IV, the performance of PFSCL and MCML is compared and the simulation results are presented.

II MCML GATES

To understand the operation and the unique properties of MCML, we consider the simple case of an inverter and look at different configurations for its construction [8].

Inverters can be implemented using transistors operating as voltage-controlled switches. The simplest configuration is shown in the figure below:

[from ref 9]

When vi is low the switch is open and vo = vdd, since no current flows through the resistance R. When vi is high the switch is closed and vo = 0.

We can modify the above configuration by using a pair of complementary switches, called PU and PD.

[from ref 9]

The PU switch connects the output node to vdd and the PD switch connects the output to ground. When vi is low, the PU switch is closed and the PD switch is open, establishing vo = vdd. When vi is raised to logic high, the PU switch opens while the PD switch closes, establishing vo = 0. This circuit constitutes the basis of the CMOS inverter.

The third type of configuration can be implemented using a double-throw switch, as shown below:

[from ref 9]

The switch steers the constant current IEE into one of the two resistors connected to the positive supply VCC. If a logic high applied at vi connects the switch to Rc1, a logic inversion function is realized at vo1. This current steering is the basis for current mode logic circuits.


PRINCIPLE OF OPERATION AND STATIC MODEL OF MCML:

MCML is a dual-rail logic style that uses both the applied input and its complement as an input pair. The schematic consists of an NMOS source-coupled pair whose transistors operate in either saturation or cutoff. A resistive load is considered here, although other loads, such as an active PMOS load, can be used. The total current IT is steered into one of the two branches and converted to a differential output voltage by the two resistors RD1 and RD2. M1 and M2 constitute a differential pair.

If VGS(M2) is higher than VGS(M1), the current ID2 exceeds the current ID1. The output voltage Vo2 therefore drops until it reaches steady state. The output voltage swing Vswing is defined as the voltage difference between Vo1 and Vo2 at steady state.

With RD1 = RD2 = RD, the differential output voltage Vo is

Vo = Vo1 – Vo2 = RD (iD2 – iD1)

The voltage swing is defined as the difference in the output voltage between the cutoff and saturation conditions and is given by

Vswing = RD IT

The small-signal gain Av of an MCML gate with matched gm, for a single-ended output, is given by

Av = gm RD / 2

The noise margin is given by

NM = (Vswing/2)(1 − √2/Av)

where Av >> 1/√2 was assumed [9].

The delay associated with an SCL gate is given by

τ = 0.69 RD Cout

where Cout is the net parasitic capacitance at the output.
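The static expressions above can be evaluated numerically. The following is a minimal sketch, not the authors' simulation setup: the function name and the component values (RD = 8 kΩ, IT = 50 µA, gm = 1.5 mS, Cout = 0.1 pF) are assumptions chosen only so that Vswing comes out at 0.4 V and Av at 6.

```python
import math

def mcml_static_model(rd, it, gm, cout):
    """Evaluate the static MCML expressions quoted in the text.

    rd: load resistance (ohm), it: tail current (A),
    gm: transconductance of each transistor (S), cout: output capacitance (F).
    """
    v_swing = rd * it                                   # Vswing = RD * IT
    av = gm * rd / 2.0                                  # single-ended small-signal gain
    nm = (v_swing / 2.0) * (1.0 - math.sqrt(2.0) / av)  # NM = (Vswing/2)(1 - sqrt(2)/Av)
    delay = 0.69 * rd * cout                            # tau = 0.69 * RD * Cout
    return {"v_swing": v_swing, "av": av, "nm": nm, "delay": delay}

# Illustrative (assumed) values, not taken from the paper's simulations.
example = mcml_static_model(rd=8e3, it=50e-6, gm=1.5e-3, cout=0.1e-12)
for name, value in example.items():
    print(f"{name} = {value:.4g}")
```

Because Vswing and τ both scale with RD, lowering RD at a fixed tail current shortens the delay but also shrinks the swing and the gain.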


III PFSCL

In PFSCL, the MCML logic style is modified to include positive feedback from the drain of M1 (vo1) to the gate of M2, the second transistor of the differential pair [10].

STATIC BEHAVIOUR OF PFSCL GATE:

The bias current Iss is steered through either M1 or M2 depending on the input signal vin. Transistors M1 and M2 operate in either the cutoff or the saturation region depending on vin. The logic high voltage level is Vdd and the logic low level is Vdd − Iss Rd. Hence, PFSCL has the same Vswing as MCML.

The small-signal circuit around a given bias point can be represented as in the above figure. The source voltage vx is calculated by superposing the input voltages vin and vout at the gates of M1 and M2, observing that the voltage gain between the gate and the source of M1 is gm1/(gm1 + gm2) and that of M2 is gm2/(gm1 + gm2). For gm1 = gm2 = gmn, calculating Av from the small-signal equivalent circuit of PFSCL gives:

Av = (gmn Rd/2) / (1 − gmn Rd/2)

From this expression we see that a very high closed-loop gain is achieved as gmn Rd/2 tends to 1.
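To make the effect of the positive feedback concrete, the sketch below compares the product gm·Rd/2 that each style needs to reach a target gain. The helper names are hypothetical, and the MCML expression Av = gm Rd/2 is the one quoted earlier in the paper.

```python
def mcml_gain(gm, rd):
    """MCML single-ended gain: Av = gm * Rd / 2."""
    return gm * rd / 2.0

def pfscl_gain(gmn, rd):
    """PFSCL closed-loop gain: Av = x / (1 - x), with x = gmn * Rd / 2."""
    x = gmn * rd / 2.0
    if x >= 1.0:
        raise ValueError("gmn*Rd/2 must stay below 1 for a finite positive gain")
    return x / (1.0 - x)

def pfscl_loop_factor(av):
    """Invert Av = x / (1 - x) to get the required x = gmn * Rd / 2."""
    return av / (1.0 + av)

for target_av in (2.0, 6.0):
    x_pfscl = pfscl_loop_factor(target_av)  # gmn*Rd/2 needed by PFSCL
    x_mcml = target_av                      # gm*Rd/2 needed by MCML
    print(f"Av = {target_av}: MCML needs gm*Rd/2 = {x_mcml}, "
          f"PFSCL needs gmn*Rd/2 = {x_pfscl:.3f}")
```

For Av = 6, for example, MCML needs gm·Rd/2 = 6 whereas PFSCL needs only 6/7, which is the basis of the area argument made in the summary.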

Piecewise linear approximation of DC transfer characteristics of PFSCL gates

The factor gm1gm2/(gm1+gm2) reduces significantly outside the transition region around VLT and, owing to the high sensitivity of Av to this factor, the closed-loop gain rapidly reduces to zero outside the transition region. Hence, the DC transfer characteristic has a slope that tends sharply to zero outside the transition region, and it can be approximated by a three-segment piecewise linear function with slope −Av around VLT and zero slope in the other two segments.

Due to positive feedback, the expression for the noise margin changes to

NM = (Vswing/2)(1 − 1/Av)

From this expression we see that the noise margin is lower than half of Vswing and tends to it for high values of Av (i.e. gmn Rd/2 → 1).
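The two noise-margin expressions can be compared directly at the gain values used later in the simulations. This is a small sketch with hypothetical function names; Vswing = 0.4 V and Av = 2, 6 are the values quoted in the comparison section.

```python
import math

def nm_mcml(v_swing, av):
    """MCML noise margin: (Vswing/2) * (1 - sqrt(2)/Av)."""
    return (v_swing / 2.0) * (1.0 - math.sqrt(2.0) / av)

def nm_pfscl(v_swing, av):
    """PFSCL noise margin: (Vswing/2) * (1 - 1/Av)."""
    return (v_swing / 2.0) * (1.0 - 1.0 / av)

for av in (2.0, 6.0):
    print(f"Av = {av}: NM(MCML) = {nm_mcml(0.4, av):.4f} V, "
          f"NM(PFSCL) = {nm_pfscl(0.4, av):.4f} V")
```

For the same Vswing and Av the PFSCL expression always gives the larger margin, and both approach Vswing/2 as Av grows.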

SUMMARY FROM THE ABOVE SECTIONS ON PFSCL AND MCML

From the expressions for Av, voltage swing and NM (noise margin) we see that the PFSCL topology offers the following advantages with respect to MCML:

1) With all design parameters (voltage swing, bias voltages and noise margin) held constant, PFSCL achieves the same gain with lower values of gmn and Rd.

2) A smaller Rd implies an area saving.

3) From the expression for Av of MCML, increasing Av requires a proportional increase in gmn (that is, in the width of transistors M1 and M2) or in Rd.

4) Since gmn depends directly on Iss, increasing gmn requires either a higher gate voltage on the transistor implementing the constant current source or a larger (W/L), that is, more power or more area.

5) For PFSCL, by contrast, only the relation gmn Rd/2 → 1 has to be satisfied, which is easily achieved.

6) The reduction in area of the NMOS transistors for a given Vswing and Av decreases the associated parasitic capacitances, which gives PFSCL a speed advantage over MCML circuits.

7) This increase in speed can be exploited directly in some applications, or traded off for a corresponding decrease in power supply voltage, as required in low-power design.

IV COMPARISON OF MCML AND PFSCL LOGIC GATES PERFORMANCE

In this section we present the results of simulations carried out on PFSCL and MCML gates. The simulations were carried out on Orcad PSPICE using the 180 nm BSIMv3 MOS model.

The values of the circuit parameters (voltage gain, gate bias voltage and voltage swing) were taken within the range used in practical applications. pMOS loads were used in both PFSCL and MCML.

For simulation purposes, the voltage swing was taken to be 400 mV, which is within the practical range of 350 mV to 650 mV. The value of the voltage gain Av is generally between 2 and 10; simulations have been performed using Av = 2 and Av = 6. The Cload value is taken as 0.1 pF. All results are presented for an input signal frequency of 500 MHz, with an input swing of 1.4 V to 1.8 V.

[Figure: Area required (W1+W2+W3) vs. bias current Iss for MCML and PFSCL, for Av = 2 and Vswing = 0.4 V]

This graph shows that as the bias current increases, the area occupied by MCML increases at a faster rate than the area occupied by PFSCL. The advantage in area also leads to a decrease in the associated parasitic capacitance, which in turn makes the PFSCL gate faster than an MCML gate.

[Figure: t_delay vs. bias current Iss for PFSCL and MCML]

This graph shows the advantage of the PFSCL gate over MCML in terms of speed of operation (for Av = 6, Vswing = 0.4 V and Cload = 0.1 pF). This enables the extension of the CML architecture into the GHz frequency range.

Monte Carlo simulations were also carried out on the PFSCL and MCML gates to determine the robustness of each logic style to process variations (e.g. tox) and to variations in Vth (the threshold voltage of the MOS transistor).

The simulation results showed that PFSCL is more robust, and that its robustness increases as the bias current increases.


REFERENCES:

1) D. Allstot, S. Chee, S. Kiaei, and M. Shristawa, “Folded source-coupled logic vs. CMOS static logic for low-noise mixed-signal ICs,” IEEE Trans. Circuits Syst. I, vol. 40, pp. 553–563, Sept. 1993.

2) S. Kiaei, S. Chee, and D. Allstot, “CMOS source-coupled logic for mixed-mode VLSI,” in Proc. Int. Symp. Circuits Systems, 1990, pp. 1608–1611.

3) J. Kundan and S. Hasan, “Enhanced folded source-coupled logic technique for low-voltage mixed-signal integrated circuits,” IEEE Trans. Circuits Syst. II, vol. 47, pp. 810–817, Aug. 2000.

4) J. Kundan and S. Hasan, “Current mode BiCMOS folded source-coupled logic circuits,” in Proc. ISCAS, June 1997, pp. 1880–1883.

5) P. Gray, P. Hurst, S. Lewis, and R. Meyer, Analysis and Design of Analog Integrated Circuits, 4th ed. New York: John Wiley & Sons, 2000.

6) M. Alioto, “Design of nanometer MOS Current Mode Logic (MCML): basic concepts and perspectives” (lecture), 2007.

7) M. Alioto, L. Pancioni, S. Rocchi, and V. Vignoli, “Modeling and evaluation of positive-feedback source-coupled logic,” IEEE Trans. Circuits Syst. I: Regular Papers, vol. 51, no. 12, Dec. 2004.

8) A. Sedra and K. Smith, Microelectronic Circuits, Oxford University Press.

9) M. Alioto and G. Palumbo, “Design strategies for source coupled logic gates,” IEEE Trans. Circuits Syst. I, vol. 50, pp. 640–654, May 2003.

10) M. Alioto, L. Pancioni, S. Rocchi, and V. Vignoli, “Modeling and evaluation of positive-feedback source-coupled logic,” IEEE Trans. Circuits Syst. I, vol. 51, no. 12, Dec. 2004.


Comparative Study of Fast Adders using VHDL and FPGA

Nishi Chandra, ET Deptt., H.B.T.I., Kanpur

Rajani Bisht, Associate Professor, ET Deptt., H.B.T.I., Kanpur

Abstract: Adders are among the most widely used components in integrated circuits and are common in many electronic applications. The major challenge for the VLSI designer is to reduce chip area; the next is to increase the speed of operation.

Therefore, various adders such as the ripple adder, carry-look-ahead adder and carry select adder are compared, with VHDL used for the comparison. The comparative study uses Xilinx 9.2i as the synthesis tool, Xilinx ISE Simulator as the simulation tool and the FPGA Spartan-II kit for the implementation of these adders. Area and delay reports are generated for these adders, and the VHDL codes can also be implemented on the FPGA Spartan-II kit.

Introduction

Adders are among the most widely used components in integrated circuits, so designing efficient adders has long been a goal of research in VLSI design. Addition is a crucial arithmetic function for most digital systems, and it can be executed by various adder structures, both serial and parallel. Adders are used not only for addition but also for other operations such as subtraction, multiplication, division and address computation. They are common in electronic applications such as digital signal processing, where they are used to implement algorithms like FIR and IIR filters [1]. In the past, the major challenge for the VLSI designer was to reduce chip area by using efficient optimization techniques. Apart from aiding a designer in selecting an adder with favorable characteristics, the aim here is to provide insight into design tradeoffs that can save power and enhance performance. The adders studied include the linear-time ripple carry and Manchester carry chain adders, carry skip and carry select adders, the carry lookahead adder and its variations, and carry-save adders.

Several researchers have worked on the performance analysis of adders, and others on the performance analysis of multipliers. A lot of research is also going on to reduce power consumption. There are thus three performance parameters on which a VLSI designer has to optimize a design: area, speed and power [2]. It is very difficult to satisfy all constraints for a particular design, so some compromise between them has to be made depending on the demand or application. Hence, VHDL codes have been written for these fast adders; Xilinx 9.2i is used as the synthesis tool to obtain the area and delay reports, Xilinx ISE Simulator is used for simulation, and the FPGA Spartan-II kit is used for implementation.

Fast Parallel Adders

Ripple Carry Adder (RCA)

A logic circuit that adds N-bit numbers can be built from multiple full adders. Each full adder takes as its Cin the Cout of the previous adder. This kind of adder is called a ripple carry adder, since each carry bit "ripples" to the next full adder. The full adder is thus the basic building block of the ripple carry adder. The major limitation of the ripple carry adder is that its delay increases with the bit length, so it is not suitable when a large number of bits are to be added. The two Boolean functions for the sum and carry are:

SUM = Ai ⊕ Bi ⊕ Ci (i)

Cout = Ci+1 = Ai · Bi + (Ai ⊕ Bi) · Ci (ii)
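Equations (i) and (ii) can be checked with a short bit-level model. This is a behavioural sketch in Python rather than the VHDL used in the paper, and the function names are hypothetical.

```python
def full_adder(a, b, c):
    """One full adder: equations (i) and (ii) on single bits."""
    s = a ^ b ^ c                    # SUM = Ai xor Bi xor Ci
    cout = (a & b) | ((a ^ b) & c)   # Cout = Ai.Bi + (Ai xor Bi).Ci
    return s, cout

def ripple_carry_add(x, y, n=16, cin=0):
    """Add two n-bit unsigned integers by chaining n full adders."""
    total = 0
    carry = cin
    for i in range(n):               # the carry ripples from bit 0 to bit n-1
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry

print(ripple_carry_add(1234, 4321))    # sum fits in 16 bits, carry out is 0
print(ripple_carry_add(0xFFFF, 1))     # overflow: sum wraps to 0, carry out is 1
```

The loop makes the limitation visible: bit i cannot be resolved before the carry from bit i-1 is known, so the delay grows linearly with n.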


Fig 1. Ripple carry adder

Condition Carry Adder (CCA)

This adder computes the sum and carry depending on the status of the previous carry, i.e.

1. If ci = 0 then Sout = ai xor bi, ci+1 = ai and bi (iii)

2. If ci = 1 then Sout = ai xnor bi, ci+1 = ai or bi (iv)

The adder does not compute the sum and carry directly with a full adder.
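A quick way to see that conditions (iii) and (iv) are equivalent to a full adder is to enumerate all eight input combinations. The sketch below uses a hypothetical function name and is only a behavioural check.

```python
def condition_carry_bit(a, b, c):
    """One bit of the condition carry adder, following (iii) and (iv)."""
    if c == 0:
        return a ^ b, a & b          # Sout = a xor b,  c_next = a and b
    return 1 ^ (a ^ b), a | b        # Sout = a xnor b, c_next = a or b

# Compare against ordinary binary addition of the three bits.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            total = a + b + c
            assert condition_carry_bit(a, b, c) == (total & 1, total >> 1)
print("conditions (iii) and (iv) match a full adder for all inputs")
```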

Fig 2. Condition carry adder

Carry Lookahead Adder (CLA)

In the carry lookahead algorithm, the carry for the next stages is calculated in advance from the input signals, which speeds up the addition. If X and Y are the two inputs, ci is the initial carry, and sout and cout are the output sum and carry respectively, then the Boolean expressions for calculating the next carry are [3]:

Pi = xi xor yi --- Carry Propagate (v)

Gi = xi and yi --- Carry Generate (vi)

Ci+1 = Gi or (Pi and Ci) --- Next Carry (vii)
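Unrolling recurrence (vii) gives every carry of a block directly in terms of Pi, Gi and the block's input carry, which is what "lookahead" means in practice. The following sketch does this for a 4-bit block and chains four blocks into a 16-bit adder; the function names are hypothetical, and the block-to-block carry simply ripples here, which is an assumption rather than the paper's exact structure.

```python
def cla4(a, b, c0):
    """4-bit carry lookahead block: all carries derived from P, G and c0."""
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]   # (v)
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]   # (vi)
    # Recurrence (vii) unrolled so that no carry waits for the previous one.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    s = 0
    for i in range(4):
        s |= (p[i] ^ carries[i]) << i     # sum bit = Pi xor Ci
    return s, c4

def cla16(x, y, cin=0):
    """16-bit adder built from four cla4 blocks."""
    result, carry = 0, cin
    for base in range(0, 16, 4):
        s, carry = cla4((x >> base) & 0xF, (y >> base) & 0xF, carry)
        result |= s << base
    return result, carry

print(cla16(1234, 4321))
```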

Fig 3. Carry lookahead adder

The two Boolean functions for the sum and carry are as follows [1]:

SUM = Ai ⊕ Bi ⊕ Ci (viii)

Cout = Ci+1 = Ai · Bi + (Ai ⊕ Bi) · Ci (ix)

Manchester Carry Chain Adder

The Manchester adder is also a type of carry lookahead adder. It slightly modifies the calculation of the next carry: instead of the Boolean expression Ci+1 = Gi + Ci.Pi, the Manchester carry adder uses

Ci+1 = Gi + Ci.ti (x)

ti = Xi + Yi (xi)

The carry recurrence can thus be written in terms of ti instead of Pi, which leads to a slightly faster adder because in binary addition ti is easier to produce than Pi (OR instead of XOR).

Conventional Carry Skip Adder (CSKA)

In an N-bit ripple carry adder the carry has to propagate through all N stages, which results in a large delay. In a carry skip adder, on the other hand, the carry can skip over a group of n bits, which results in less delay than in the ripple carry adder. The logic used for the carry skip is shown in the figure below and is also evident from the equation:

P(0:3) <= ((x(0) or y(0)) and (x(1) or y(1)) and (x(2) or y(2)) and (x(3) or y(3))); (xii)
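The group signal in (xii) can be exercised with a behavioural model of a 16-bit carry skip adder with 4-bit blocks. This is a sketch under assumptions: the function name is hypothetical, and the skip path is modelled as an OR of the rippled block carry with (P and block carry-in), which suits the OR-based propagate used in (xii).

```python
def carry_skip_add(x, y, n=16, block=4, cin=0):
    """Carry skip adder model: ripple inside each block, skip path across it."""
    result = 0
    carry = cin
    for base in range(0, n, block):
        block_cin = carry
        group_p = 1                      # AND of (x_i or y_i), as in (xii)
        c = block_cin
        for i in range(base, base + block):
            a = (x >> i) & 1
            b = (y >> i) & 1
            group_p &= (a | b)
            result |= (a ^ b ^ c) << i   # sum bit from the rippled carry
            c = (a & b) | ((a ^ b) & c)
        # When every bit position propagates and the block carry-in is 1,
        # the carry-out is known to be 1 without waiting for the ripple.
        carry = c | (group_p & block_cin)
    return result, carry

print(carry_skip_add(12345, 23456))
```

In software the skip path does not change the result, only which signal determines it; in hardware it shortens the worst-case carry path.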


Fig 4. Conventional carry skip adder

Modified Carry Skip Adder (CLSKAs)

In the conventional carry skip adder, each block consists of a ripple carry adder, and skip logic is used after each block to generate the carry for the next block. The speed of operation is affected by the method of carry propagation from one block to the next [4]. In CLSKAs, a carry lookahead scheme is used in each block to generate the carry for the next block. This gives better performance in terms of speed, as the carry lookahead adder is faster than the ripple carry adder [5]. The figure shows a modified CLSKA with a fixed block size of 4 bits.

Fig 5. Modified carry skip adder

Carry Select Adder (CSA)

In the carry select adder, the sum is calculated by assuming the input carry from the previous stage. One adder calculates the sum assuming an input carry of 0 while the other calculates the sum assuming an input carry of 1 [6]. The actual carry then drives a multiplexer that selects the appropriate sum. The figure shows the schematic block diagram of a 16-bit carry select adder consisting of 4 blocks, each a 4-bit carry lookahead adder. The carry output of each block is fed into the next block as its input carry.
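The select principle can be sketched as follows: each 4-bit block is evaluated twice, once per assumed carry, and the real carry picks one result. The names are hypothetical, and the block adder is modelled with integer addition instead of the 4-bit lookahead block the paper uses.

```python
def block_add(a, b, cin, width=4):
    """Stand-in for one block adder; returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width

def carry_select_add(x, y, n=16, block=4, cin=0):
    """Carry select adder model with fixed-size blocks."""
    mask = (1 << block) - 1
    result, carry = 0, cin
    for base in range(0, n, block):
        a = (x >> base) & mask
        b = (y >> base) & mask
        s0, c0 = block_add(a, b, 0, block)   # computed assuming carry-in = 0
        s1, c1 = block_add(a, b, 1, block)   # computed assuming carry-in = 1
        s, carry = (s1, c1) if carry else (s0, c0)   # multiplexer
        result |= s << base
    return result, carry

print(carry_select_add(40000, 30000))
```

Both candidate results are ready before the real carry arrives, which explains why this adder trades extra area for reduced delay.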

Fig 6. Carry select adder

Carry Save Adder (CSA)

In the carry save adder, if the sum of two 16-bit binary numbers is to be computed, 16 half adders are used in the first stage instead of 16 full adders. The carry save unit therefore consists of 16 half adders, each of which computes a single sum and carry bit based only on the corresponding bits of the two input numbers. In general, a carry save adder is used to compute the sum of three or more n-bit binary numbers, and its cell is then the same as a full adder. Let x and y be two 16-bit numbers that produce the partial sum s and carry c:

Si = xi xor yi (xiii)

Ci = xi and yi (xiv)

The final addition is then computed by:

1. Shifting the carry sequence C left by one place.

2. Placing a 0 at the front (MSB) of the partial sum sequence S.

3. Using a ripple carry adder to add these two together and compute the resulting sum.
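The three steps above can be written out on whole words at once, since the partial sum and carry are bitwise operations. This is a sketch with hypothetical function names; the final ripple carry addition is stood in for by Python's integer addition, and the three-operand variant is included only to illustrate the general case mentioned in the text.

```python
def carry_save_two(x, y):
    """Half-adder stage on every bit position: partial sum S and carry C."""
    return x ^ y, x & y

def carry_save_three(x, y, z):
    """Full-adder stage for three operands: S is the bitwise sum, C the majority."""
    return x ^ y ^ z, (x & y) | (x & z) | (y & z)

def resolve(s, c):
    """Shift the carry word left by one place and add it to the partial sum."""
    return s + (c << 1)   # this final addition is the ripple carry adder stage

s, c = carry_save_two(12345, 23456)
print(resolve(s, c))                            # equals 12345 + 23456

s3, c3 = carry_save_three(1000, 2000, 3000)
print(resolve(s3, c3))                          # equals 1000 + 2000 + 3000
```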

Fig 7. Carry save adder


RESULTS AND ANALYSIS

The ripple carry adder, carry lookahead adder, Manchester carry chain adder, carry select adder, carry save adder, condition carry adder, conventional carry skip adder and modified carry skip adder have been designed using VHDL (Very High Speed Integrated Circuit Hardware Description Language) for 16-bit unsigned data. To demonstrate their performance, the adders are compared on the basis of their delay and the area occupied. To obtain the delay and area reports, the following tools are used:

1. Xilinx 9.2i is used as the synthesis tool.

2. Xilinx ISE Simulator is used for simulation.

3. FPGA Spartan-II is used for implementation.

The VHDL codes written for these adders are first simulated using the Xilinx ISE Simulator and then synthesized with Xilinx 9.2i. After synthesis, the tool generates a synthesis report that gives the propagation delay and the number of 4-input LUTs used by the design out of the total number of LUTs. After simulation and synthesis, the VHDL codes can be implemented on the FPGA kit: the codes are converted into the programming file, which is then downloaded onto the kit. The delay and area reports generated for these adders are given in Table 1.

ADDERS (with fixed block size = 4 bit)    DELAY (ns)    LUTs (out of 1536)

Ripple Carry Adder                        32.997        32

Carry Lookahead Adder                     22.792        17

Manchester Carry Chain Adder              31.744        32

Carry Select Adder                        26.056        45

Carry Save Adder                          35.424        46

Condition Carry Adder                     33.378        32

Conventional Carry Skip Adder             17.636        43

Modified Carry Skip Adder                 27.163        69

Table 1. Delay and area report of 16-bit fast adders
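The trade-off in Table 1 can be summarised with a short script. The delay and LUT figures are copied from the table; the delay × LUT product is an illustrative figure of merit assumed here, not one reported in the paper.

```python
# (delay in ns, number of LUTs) as listed in Table 1
adders = {
    "Ripple Carry Adder": (32.997, 32),
    "Carry Lookahead Adder": (22.792, 17),
    "Manchester Carry Chain Adder": (31.744, 32),
    "Carry Select Adder": (26.056, 45),
    "Carry Save Adder": (35.424, 46),
    "Condition Carry Adder": (33.378, 32),
    "Conventional Carry Skip Adder": (17.636, 43),
    "Modified Carry Skip Adder": (27.163, 69),
}

fastest = min(adders, key=lambda name: adders[name][0])
smallest = min(adders, key=lambda name: adders[name][1])

# Assumed figure of merit: area-delay product (lower is better).
area_delay = {name: delay * luts for name, (delay, luts) in adders.items()}
best_product = min(area_delay, key=area_delay.get)

print("Lowest delay:        ", fastest)
print("Fewest LUTs:         ", smallest)
print("Lowest delay x LUTs: ", best_product)
```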

CONCLUSION

The delay and area reports generated by simulating and synthesizing the VHDL codes provide the performance analysis of these 16-bit adders. A comparison of the delays shows that the conventional carry skip adder has the minimum propagation delay (17.636 ns) while occupying 43 LUTs out of a total of 1536 LUTs on the Spartan II -XC3S50-5-TQ144 FPGA kit. The carry lookahead adder has the next lowest propagation delay (22.792 ns) and occupies the fewest LUTs on the FPGA kit (17 LUTs out of 1536).

The area and delay reports show that there are trade-offs between the performance parameters, i.e. area and delay. For a delay-efficient design, the conventional carry skip adder is suitable, since it can skip the carry over a group of n bits. This results in less delay than the ripple carry adder in generating the output sum and the carry bit for the next block. The result is fast operation, but at the cost of a few more LUTs due to the carry skip logic.


References

1. R.P.P. Singh, Parveen Kumar and Balwinder Singh, “Performance Analysis of Fast Adders using VHDL”, 2009 International Conference on Advances in Recent Technologies in Communication and Computing.

2. C. Nagendra, M.J. Irwin and R.M. Owens, “Area-time-power tradeoffs in parallel adders”, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Volume 43, Issue 10, Page(s): 689 – 702, 1996.

3. Hasan Krad and Aws Yousif Al-Taie, “Performance Analysis of a 32-Bit Multiplier with a Carry-Look-Ahead Adder and a 32-bit Multiplier with a Ripple Adder using VHDL”, Journal of Computer Science 4 (4): 305-308, 2008.

4. Y. Wang, C. Pai and X. Song, “The design of hybrid carry lookahead/carry-select adders”, IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Volume 49, Page(s): 16-24, 2002.

5. Min Cha and Earl E. Swartzlander, Jr, “Modified Carry Skip Adder for reducing first block delay”, Proc. 43rd IEEE Midwest Symp. on Circuits and Systems, Lansing MI, Page(s): 346-348, 2000.

6. Behnam Amelifard, Farzan Fallah and Massoud Pedram, “Closing the gap between Carry Select Adder and Ripple Carry Adder: A new class of Low-power and High-performance Adders”, Proceedings of the Sixth International Symposium on Quality Electronic Design (ISQED’05), 2005.


Organic Thin Film Transistor: Materials, Structures and Operational Parameters

Poornima Mittal1, Brijesh Kumar2, B. K. Kaushik3, Y. S. Negi4 and Krishna Raj5

1Electronics and Communication Engineering, Graphic Era University, Dehradun, INDIA

3Electronics and Computer Engineering, Indian Institute of Technology, Roorkee, INDIA

2,4Polymer Science and Technology Group, DPT, Indian Institute of Technology, Roorkee, INDIA

5Department of Electronics Engineering, H.B.T.I., Kanpur, INDIA


ABSTRACT: Organic Thin Film Transistors (OTFTs) have improved markedly in performance over the past few years and are becoming very attractive for a wide range of applications such as oscillators, flexible display devices, and small and large scale, and even integrated, optoelectronic devices. A transistor that uses an organic semiconductor as the active layer to control current flow is known as an organic thin film transistor. Over the last decade organic/polymeric materials have been extensively investigated for the substrate, semiconductor layer, dielectric and contact electrodes of thin film transistor (TFT) devices. In an OTFT, the type of semiconductor, processing, doping and structure all affect the electrical characteristics. This paper presents new insight into the structure, organic materials, conduction mechanism and performance characteristics of OTFTs. Pentacene-based bottom and top contact structures have been modelled to characterise the structures adopted for organic transistors. The paper explores the current status of OTFTs in terms of parameters such as contact resistance, effect of channel length, active layer thickness and on/off current ratio. Organic electronic products are lighter, more flexible and less expensive than their inorganic counterparts. They are also biodegradable, being made from carbon. This opens the door to many exciting applications that would be impossible using silicon. Since OTFTs provide simple and low-cost processes, their application to displays is discussed.

Keywords: Bottom and Top Contact Structures of

OTFTs, Contact Resistance, Mobility, Organic

Materials, Organic Thin Film Transistors.

1. INTRODUCTION

Organic electronics has the potential to create a new range of devices, circuits and applications. Some important applications are display drivers, advertising boards, smart cards, wall-sized televisions, identification tags, and portable products such as modern cell phones and video games [1]. Organic-material-based devices such as the Organic Thin Film Transistor (OTFT), Organic Field Effect Transistor (OFET), Organic Light Emitting Diode (OLED) and solar cell offer the advantages of low cost, flexibility and light weight over their inorganic counterparts. Organic semiconductors can be processed at low temperatures compatible with plastic substrates, whereas higher temperatures are required for the alternative Si-based devices [2, 3]. Organic transistors can usually be manufactured at or near room temperature, unlike silicon-based transistors, which typically require fairly high process temperatures (>800ºC for a crystalline Si transistor).

Certain structures have been proposed for the simulation of OTFTs. To enhance device speed, considerable research effort has been devoted to increasing the mobility of organic materials by improving deposition conditions [4, 5]. As a result of this effort, mobilities exceeding 1 cm²/V·s have been reported for pentacene [6], which is comparable to hydrogenated amorphous silicon (a-Si:H), and 0.1 cm²/V·s for poly(3-hexylthiophene) (P3HT) [7]. In addition to mobility, other ways of improving the performance of OTFTs, such as channel length scaling and active layer thickness, have also attracted considerable attention [8]. This paper first

describes the different structures of organic thin film transistors in section 2 and the organic and polymer materials used for the active semiconductor and dielectric layers in section 3, followed by operation and characteristics in section 4. Finally, device parameters and display applications are discussed in sections 5 and 6 respectively.

2. OTFT STRUCTURES

OTFTs adopt the architecture of the thin film transistor (TFT), which has proven its adaptability to low conductivity materials. It contains three electrodes (source, drain and gate), a dielectric layer and an active organic semiconducting (OSC) layer. The structure can be top gate or bottom gate, and both architectures can be further divided into top contact and bottom contact alternatives, as depicted in fig. 1 (a) and (b). Depositing the organic semiconductor on the insulator is much easier than the reverse because of the fragile nature of organic semiconductors; hence the bottom gate architecture is used in the majority of current OTFTs.

The well known structure for standard silicon MOSFETs is top-gate-top-contact (TGTC); for the simulation of OTFTs, however, the bottom-gate-top-contact (BGTC) and bottom-gate-bottom-contact (BGBC) architectures are the ones most often modeled. Each OTFT structure has its own advantages and disadvantages. In terms of field effect mobility, the BGTC structure performs better than the BGBC structure. The better field effect mobility of the top contact OTFT is due to its lower contact resistance compared with the bottom contact one [9]. The performance of OTFTs in the BGBC device structure is generally observed to be lower by two orders of magnitude than in the top contact device configuration [10-13].

Fig. 1 Schematic cross-section of OTFT structures with pentacene as active semiconductor, Al2O3 as dielectric and gold contact electrodes: (a) Bottom Gate Top Contact (BGTC); (b) Bottom Gate Bottom Contact (BGBC).

3. ORGANIC MATERIALS

The performance of OTFTs depends on their constituent organic semiconductors and insulator materials. The materials used for the different layers of OTFTs are described below.

3.1. SUBSTRATE

Quartz, polycarbonate, polyethylene naphthalate (PEN), glass, silicon wafer and polyimide can be used as substrates [14, 15]. Inorganic substrates have a high melting point and good flatness, whereas polymer substrates offer high toughness, flexibility and light weight.

3.2. CONTACT ELECTRODE

To improve the electrical characteristics, an ohmic contact can be formed between gold (Au) and the organic semiconductor, because the work function of Au is 5.0 eV and the HOMO of most organic

semiconducting materials is around this level. Adding nickel to the gold improves adhesion of the gold to the oxide. Platinum electrodes are inferior to gold electrodes. Aluminum shows a slightly higher electron mobility (2.2 cm²/V·s) at room temperature in single crystals [15, 16].

3.3. P-TYPE ORGANIC SEMICONDUCTORS

Organic thin film transistors fabricated on light weight flexible substrates are expected to replace hydrogenated amorphous silicon TFTs on glass substrates. Table-1 shows the mobility and on/off current ratio measured for OTFTs using p-type organic molecules deposited by different techniques. Among all the oligomeric and polymeric materials investigated, pentacene thin films have demonstrated the best electrical performance. Pentacene exhibits typical p-channel semiconductor characteristics.

TABLE-1 MOBILITY (µ) AND CURRENT ON/OFF RATIO FOR SOME P-TYPE SEMICONDUCTORS [17].

Material                      Mobility (cm²/V·s)    Ion/Ioff
Pentacene                     3.2                   10⁹
Copper phthalocyanine         0.01-0.02             NR
Polythiophene                 10⁻⁵                  >10²
α,ω-dihexyl-hexathiophene     0.13                  >10⁴
P3HT                          0.1                   10⁶

3.4. N-TYPE ORGANIC SEMICONDUCTORS

Most of the work to date has focused on p-type organic materials, although some recent effort has been directed towards the preparation of novel n-type semiconductor materials. An n-type device requires a semiconductor that allows electrons to be injected into its LUMO. Gold has been optimized for source and drain electrodes [10]; it has a work function of 5.0 eV, whereas most n-type materials have solid state electron affinity levels of about 4.0 eV. Charge injection into the semiconductor is therefore limited by an energy barrier of approximately 1 eV, which is a further issue adding to the complexity of n-type devices. Substantial effort has gone into the development of organic n-channel OTFTs because they allow the implementation of complementary circuits with low static power consumption [9, 18]. Table-2 gives the mobility and current on/off ratio for some n-type semiconductors.
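The injection barrier quoted above can be written out as a one-line estimate. This is only a sketch in the simplest (Schottky-Mott) picture, which ignores interface dipoles and image-force lowering; the symbols Φ_B (electron injection barrier), Φ_Au (gold work function) and χ (electron affinity of the n-type material) are notation introduced here for illustration.

```latex
\Phi_B \;\approx\; \Phi_{\mathrm{Au}} - \chi
      \;=\; 5.0\,\mathrm{eV} - 4.0\,\mathrm{eV}
      \;=\; 1.0\,\mathrm{eV}
```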

TABLE-2 MOBILITY (µ) AND CURRENT ON/OFF RATIO FOR SOME N-TYPE SEMICONDUCTORS [17].

Material                                Mobility (cm² V⁻¹ s⁻¹)    Ion/Ioff
Pc2Lu (lutetium bisphthalocyanine)      2×10⁻⁴                    NR
TCNQ (tetracyanoquinodimethane)         3×10⁻⁵                    4-450
C60                                     0.08                      10⁶
F16CuPc                                 0.03                      5×10⁴

3.5. MATERIALS FOR DIELECTRIC

Organic polymers with good processability and dielectric properties, such as poly methyl methacrylate (PMMA), poly vinyl phenol (PVP), polyimide (PI) and poly vinyl alcohol (PVA), have been extensively employed as the gate insulator. The switching voltage of OTFTs increases as the dielectric constant of the insulator decreases. Some important dielectric materials and their dielectric constants are polyimide (2.6), PMMA (2.65), SOG (spin on glass, 3.9) and Al2O3 (9) [17].
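The link between dielectric constant and switching voltage can be illustrated with the parallel-plate expression Ci = ε0·k/t for the gate capacitance per unit area. The sketch below is illustrative only: the 100 nm film thickness and the function name are assumptions, not values from the text; only the dielectric constants are taken from the paragraph above.

```python
EPS0_F_PER_M = 8.854e-12  # vacuum permittivity in F/m

# Dielectric constants quoted in the text
DIELECTRIC_CONSTANT = {"polyimide": 2.6, "PMMA": 2.65, "SOG": 3.9, "Al2O3": 9.0}


def gate_capacitance_per_area(k: float, thickness_m: float) -> float:
    """Parallel-plate gate capacitance per unit area, Ci = eps0 * k / t, in F/m^2."""
    if thickness_m <= 0:
        raise ValueError("thickness must be positive")
    return EPS0_F_PER_M * k / thickness_m


ASSUMED_THICKNESS_M = 100e-9  # hypothetical 100 nm film, for illustration only

for name, k in DIELECTRIC_CONSTANT.items():
    ci = gate_capacitance_per_area(k, ASSUMED_THICKNESS_M)
    # 1 F/m^2 equals 1e5 nF/cm^2
    print(f"{name:10s} k = {k:<5} Ci = {ci * 1e5:.1f} nF/cm^2")
```

For the same film thickness, a higher-k insulator gives a larger Ci, so a given accumulated channel charge is reached at a lower gate voltage.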

4. OPERATION AND CHARACTERISTICS

4.1. OPERATION

TFTs cannot accommodate band bending because of the absence of a bulk region [19]. In MOSFETs the conducting channel is formed by an inversion layer, whereas in TFTs it is formed by accumulation. Depending on the polarity of the gate voltage, TFTs can operate in unipolar carrier (electron or hole) accumulation modes. In a thin film FET, or accumulation type FET, the charge-voltage relation is simply given as:

$\rho(x) = [V(x) - V_g]\,C_{ox}$    (1)

where ρ and V are the local charge per unit area and the voltage in the channel, respectively. An organic material such as pentacene acts as a p-type semiconductor, with holes as the majority carriers.

Fig.2 Top contact OTFT operation with pentacene as active

semiconductor layer.

When a negative gate voltage is applied, an electric field is formed across the dielectric, causing an accumulation region of holes at the dielectric-semiconductor boundary, as shown in fig. 2. Applying a voltage across the source and drain terminals allows a current to flow through this accumulation layer between the contacts. Basically, an OTFT operates like a capacitor: when a voltage is applied to the gate, an equal charge of opposite sign is induced on each side of the insulator [20].

4.2. CHARACTERISTICS

Despite the fact that the transport physics in organic/polymeric TFTs is different from that in silicon MOSFETs, the current-voltage characteristics can, to first order, be described with the same formalism:

$I_D = \frac{\mu W C_i}{L}\left[(V_{GS} - V_{th})\,V_{DS} - \frac{V_{DS}^2}{2}\right]$    (2)

for $V_{GS} - V_{th} > V_{DS}$ (linear regime), and

$I_D = \frac{\mu W C_i}{2L}\,(V_{GS} - V_{th})^2$    (3)

for $V_{DS} > V_{GS} - V_{th} > 0$ (saturation regime), where W is the channel width, L is the channel length, Ci is the gate dielectric capacitance per unit area and µ is the carrier mobility in the semiconductor. The current-voltage (ID-VDS) characteristics of an OTFT are similar to those of inorganic FETs at gate bias voltages VGS higher than the threshold voltage Vth, as illustrated in fig. 3.

Fig. 3 Output characteristics of organic thin film transistor

with Pentacene as semiconductor layer, Al2O3 dielectric

material and gold as contacts.

The characteristics show a linear (ohmic) region, in which ID depends on VDS, at low drain-source bias (VDS << VGS), and saturation of ID at high drain voltages (VDS > VGS). The bias voltages and current polarities follow those of an NMOS or a PMOS device, depending on the behavior of the OTFT.
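The first-order model of Eqs. (2) and (3), with the standard square-law form in saturation, can be sketched as a small function. This is an illustration under stated assumptions rather than a device model: it uses an n-channel-like sign convention (for a p-channel pentacene device, pass voltage magnitudes), ignores subthreshold current, and the function name and numerical values are hypothetical.

```python
def drain_current(vgs: float, vds: float, vth: float,
                  mu: float, ci: float, w: float, l: float) -> float:
    """First-order drain current in amperes (SI units throughout).

    mu: mobility in m^2/(V s), ci: gate capacitance per area in F/m^2,
    w, l: channel width and length in m.
    """
    overdrive = vgs - vth
    if overdrive <= 0 or vds <= 0:
        return 0.0  # off state: this sketch ignores subthreshold conduction
    k = mu * ci * w / l
    if vds < overdrive:
        # Linear regime, Eq. (2)
        return k * (overdrive * vds - vds ** 2 / 2)
    # Saturation regime, Eq. (3)
    return k * overdrive ** 2 / 2


# Hypothetical device: mu = 1 cm^2/(V s), Ci = 1e-3 F/m^2, W = 1 mm, L = 10 um
params = dict(vth=2.0, mu=1e-4, ci=1e-3, w=1e-3, l=1e-5)
for vds in (1.0, 4.0, 8.0, 12.0):
    print(vds, drain_current(vgs=10.0, vds=vds, **params))
```

The two branches agree at VDS = VGS - Vth, so the modeled output curve is continuous where the linear region meets saturation.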

5. PARAMETERS

5.1. MOBILITY

Field effect mobility is a key parameter in determining the processing speed of organic devices. The carrier mobility can be modulated by the gate voltage; it tends to increase as the gate bias increases [20]. Quoted values of the effective mobility of organic transistors span many decades, from 10⁻⁵ to 10 cm²/V·s. The mobility depends on many other factors, such as the gate bias, the method of fabrication and the method used to evaluate the mobility from simulation and experiment [21]. The bias dependent mobility, expressed as a power law for polymer based field effect transistors, is given by:

$\mu(V_{GS}) = \mu_0\,(V_{GS} - V_T)^{\gamma}$    (4)

The parameter γ is usually estimated to lie in the range 0.2 to 0.5 for different OTFTs/PFETs [22, 23]. TFTs exhibit mobilities of up to 0.4 cm² V⁻¹ s⁻¹ at low operating voltages (5 V) [24, 25]. The mobility increases from very low values about

0.02 cm² V⁻¹ s⁻¹ at VG = -14 V to 1.26 cm² V⁻¹ s⁻¹ at -146 V [26].
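The power law of Eq. (4) is easy to evaluate numerically. The sketch below takes the magnitude of the gate overdrive |VGS - VT| as its input, which is an assumption made here so that the same function covers p-channel devices with negative biases; the function name and the values of µ0 and γ are hypothetical.

```python
def bias_dependent_mobility(overdrive_v: float, mu0: float, gamma: float) -> float:
    """Power-law mobility, mu = mu0 * overdrive**gamma.

    overdrive_v: magnitude of (VGS - VT) in volts, must be non-negative.
    mu0: mobility prefactor, in mobility units per V**gamma.
    """
    if overdrive_v < 0:
        raise ValueError("pass the magnitude of the gate overdrive")
    return mu0 * overdrive_v ** gamma


# Hypothetical parameters: mu0 = 0.1 cm^2/(V s) per V^gamma, gamma = 0.5
for overdrive in (1.0, 4.0, 16.0):
    print(overdrive, bias_dependent_mobility(overdrive, mu0=0.1, gamma=0.5))
```

With γ between 0.2 and 0.5 the mobility grows sublinearly with gate overdrive, so each additional volt of bias yields a smaller gain.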

5.2. ON/OFF CURRENT RATIO

The ratio of the current in the accumulation mode to the current in the depletion mode is called Ion/Ioff. The current ratio depends on various factors such as the materials, the channel length and the thickness of the semiconductor. Short channel devices show a higher on/off current ratio than devices with a long conducting channel [6]. The ratio increases as the thickness of the semiconducting layer decreases. For memory and display applications, a high on/off current ratio is a more important requirement than high mobility. An on/off current ratio of 10⁸ has been reported for a BGBC thin film transistor structure with pentacene as the organic semiconductor, cross linked PVP as the insulator and gold source and drain contacts [29]. A ratio of around 10⁹ has also been observed with pentacene as the active organic semiconductor [31].

5.3. THRESHOLD VOLTAGE

To extract information about impurity concentrations, interface states and traps, it is common practice to use the threshold voltage and subthreshold current as device evaluation parameters. In MOSFETs, the subthreshold current depends exponentially on the gate bias as well as the drain-source bias, because below threshold the free carrier density depends exponentially on the local bias. The threshold voltage (Vt) of OTFTs varies with either the gate insulator capacitance [27] or the thickness of the organic film [20].

5.4. CONTACT RESISTANCE

Ideally the contact resistance should be ohmic and small, so that the whole voltage applied to the device contributes to the transport current. For top contact devices it depends strongly on the gate bias and increases sharply at low gate-source voltage, while the contact resistance appears to be almost independent of the gate bias in bottom contact structures. Necliudov et al. measured a contact resistance of 1.3×10⁸ Ω·µm with a mobility of approximately 0.9 cm²/V·s for a bottom contact pentacene OTFT, consistent with an injection barrier of between 0.2 and 0.3 eV in the simulation. It has also been reported that the top contact resistance depends strongly on the gate voltage [22] and is much lower than the bottom contact resistance at high gate bias.

5.5. EFFECT OF CHANNEL LENGTH

The drain current depends strongly on the semiconductor used for the channel and can be modulated by the length of the conducting channel. M. Austin et al. reported the dependence of drain current on channel length for poly(3-hexylthiophene) (P3HT) OTFTs with channel lengths of 1000 nm and 70 nm. A saturation region is present for the long channel (1000 nm) device, but no saturation region appears in the short channel (70 nm) device. Long channel devices are relatively immune to high contact resistance; when devices are scaled to smaller channel lengths, their performance may degrade [28]. The on/off current ratio is higher for short channel devices than for long channel devices.

5.6. EFFECT OF ACTIVE LAYER THICKNESS

The electrical parameters of an OTFT do not depend solely on the gate capacitance; they can also be modulated by the film thickness and by charge injection from the source electrode. There are trends that can be expressed as a function of the product of the polymeric film thickness and the gate capacitance per unit area. It has been observed that the mobility in OTFTs decreases as the permittivity of the gate insulator and the thickness of the organic material increase [21].

6. OTFT FOR DISPLAY DEVICES

Companies have developed a very diverse set of substrate, drive element and display mode technologies in order to realize flexible displays. The E-paper display market is expected to show a 46.9% average annual growth rate, from US$ 260 million in 2010 to US$ 2.1 billion in 2015 and US$ 7 billion in 2020. OTFTs can be used to make good LCD or E-paper displays, as these need a high on/off current ratio [29].

TABLE-3 DISPLAY APPLICATIONS WITH OTFTS WITH PENTACENE AS OSC [29]

App.    Specification                       Organization
OLED    4×4 pixel on PC                     NHK (Japan)
OLED    8×8 pixel on glass                  Pioneer (Japan)
LCD     64×128 on plastic                   ERSO (Taiwan)
LCD     15 in. full color XGA on glass      Samsung (Korea)
LCD     1.4 in. 80×80 RGB on glass          Hitachi (Japan)

Table-3 summarizes display prototypes that use OTFTs: LCDs (made with an OTFT matrix array) and active matrix organic light emitting diode (AMOLED) displays with dot matrix patterns. Organic/polymer LED displays have the potential to replace LCDs and become the next dominant force in flat panel displays, because they require fewer fabrication steps and have lower material costs than LCDs [30].

7. CONCLUSION

Organic/polymer electronics is a very promising alternative to crystalline, polycrystalline and amorphous silicon processes. Moreover, there are no restrictions on the dimensions of the device. It has been observed that the mobility in OTFTs decreases as the permittivity of the gate insulator and the thickness of the organic material increase. The effect of channel length has been discussed; long channel devices are relatively immune to high contact resistance. The top contact OTFT shows better field effect mobility because its contact resistance is lower than that of the bottom contact one.

The on/off current ratio has been reported to be higher for short channel devices than for long channel devices. For memory and display applications, a high on/off current ratio is a more important requirement than high mobility, and this ratio should be more than 10⁸. In spite of numerous advantages such as large area coverage, structural flexibility and especially low cost, organic material based devices have limitations such as instability, lower carrier mobility and shorter lifetimes, which need to be resolved before OTFT based applications can be commercialized.

REFERENCES

[1] M. Jamal Deen, “Plastic microelectronics with organic

and polymeric thin film transistors,” Proc. 26th

international conference on microelectronics, MIEL,

2008.

[2] Yoshiro Yamashita, “Organic semiconductors for

organic field effect transistor,” Sci. Technol. Adv.

Mater. vol.10, pp-024313, 2009.

[3] H. Klauk, D. J. Gundlach, and T. N. Jackson, “Fast

organic thin-film transistor circuits,” IEEE Electron

Device Lett., vol. 20, pp. 289-291, 1999.

[4] A. R. Brown, A. Pomp, C. M. Hart, and D. M. De

Leeuw, “Logic gates made from polymer transistors

and their use in ring oscillators,” Science, vol. 270, pp.

972-974, 1995.

[5] Y. Sun, Y. Liu and D. Zhu, “Advances in organic field-

effect transistors, ” J. mater. chem. , vol. 15, pp. 53-

65, 2005.

[6] Y. Y. Lin, D. J. Gundlach, S. F. Nelson, and T. N.

Jackson, “Stacked pentacene layer organic thin film

transistors,” IEEE Electron Device Lett., vol. 18, pp.

606–608, Dec. 1997.

[7] Z. Xie, M. Abdou, A. Lu, M. J. Deen, S. Holdcroft, “Electrical Characteristics of Poly (3-Hexylthiophene) Thin Film MISFETs,” Canadian J. of Physics, vol. 70, no. 10–11, pp. 1171-1177, 1992.

[8] O. Marinov, M. J. Deen, and R. Datars, “Compact

modeling of charge mobility in organic thin-film

transistors,” J. Appl. Phys. , vol. 106, no. 6, pp.

064501-1–064501-13, Sep. 2009.

[9] H. Klauk, “Organic thin film transistor,” Chem. Soc.

Rev., 39, pp. 2643-2666, 2010.

[10] O. Marinov, M. J. Deen, and B. Iniguez, “Charge

transport in organic and polymer thin-film transistors:

Recent issues,” Proc. Inst. Elect. Eng. Circuits Devices

Syst., vol. 152, no. 3, pp. 189–209, Jun. 2005.

[11] N. Karl, “Charge Carrier Transport in Organic

Semiconductors,” Synth. Met. , vol. 649, pp . 133-

134, 2002.

[12] R. A. Street and A. Salleo, “Contact effects in polymer transistors,” Appl. Phys. Lett., vol. 81, no. 15, pp. 2887, 2002.

[13] S. F. Nelson, Y. Y. Lin, D. J. Gundlach and T. N.

Jackson, “Temperature independent transport in high

mobility pentacene transistors,” Appl. Phys. Lett. vol.

72, no.15, pp.1854, 1998.

[14] F. Garnier, “Thin-Film Transistors Based on Organic Conjugated Semiconductors,” Chem. Phys., 227, 253, 1998.

[15] H. Klauk, D. J. Gundlach, M. Bonse, C. C. Kuo, and

T. N. Jackson, “A reduced complexity process for

organic thin film transistors,” Appl. Phys. Lett., 76,

1692, 2000.

[16] J. H. Schon, Ch. Kloc, and B. Batlogg, “On the

intrinsic limits of pentacene field-effect transistors,”

Organic Electronics., vol.1, no. 57, 2000.

[17] C. Shekar, T. Lee and S. W. Rhee, “Organic thin film

transistors, material, processes and devices,” Korean J.

Chem. Engg., vol. 21, no. 1, pp. 267-287, 2004.

[18] G. Horowitz, “Organic field-effect transistors,” Adv.

Mater. vol. 5, pp. 365-377, 1998.

[19] P. Stallinga, and H. L. Gomes, “Modelling electrical

characteristics of thin-film-field-effect transistor, I.

Trap-free materials,” Synthetic Metals, 156, pp. 1305-

1315, 2006.

[20] G. Horowitz, “Organic thin film transistors: From

theory to real devices”, J. Mater. Res., vol. 19, no. 7,

pp. 1946-1962, Jul 2004.

[21] O. Marinov, M. J. Deen, and B. Iniguez, “Performance

of organic thin film transistors,” J. Vac. Sci. Technol.,

vol. 24, no. 4, pp. 1728–1733, 2006.

[22] P. Necliudov, M. Shur, D. Gundlach, and T. Jackson,

“Modeling of organic thin film transistors of different

designs,” J. Appl. Phys., vol. 88, no. 11, pp. 6594–

6597, Dec. 2000.

[23] De Leeuw, D. Gelinck, G. Geuns, T. Van Veenendaal,

E. Cantatore, E. and B. Huisman, “Polymeric

integrated circuits: fabrication and first

characterization,” IEEE-IEDM, 2002, pp. 293–296.

[24] C. D. Dimitrakopoulos, S. Purushothaman, J. Kymissis, A. Callegari and J. M. Shaw, “Low-Voltage Organic Transistors on Plastic Comprising High-Dielectric Constant Gate Insulators,” Science, vol. 283, no. 5403, pp. 822-824, February 5, 1999.

[25] C. D. Dimitrakopoulos, I. J. Kymissis, S.

Purushothaman, D. A. Neumayer, P. R. Duncombe,

and R. B. Laibowitz, "Low-Voltage, High-Mobility

Pentacene Transistors with Solution-Processed High

Dielectric Constant Insulators," Adv. Mater. 11, 1372,

1999.

[26] C. D. Dimitrakopoulos and P. R. L. Malenfant, “Organic Thin Film Transistors for Large Area Electronics,” Adv. Mater., vol. 14, pp. 99-117, 2002.

[27] M. J. Deen, O. Marinov, Jianfei Yu, S. Holdcroft and

W. Woods, “Low-frequency noise in polymer

transistors,” IEEE Trans. on Electron Devices, vol. 48,

no. 8, pp. 1688-1694, 2001.

[28] I. G. Hill, “Numerical simulations of contact resistance

in organic thin-film transistors,” Appl. Phys. Lett. Vol.

87, pp. 163505-1-163505-3, 2005.

[29] Jin Jang and S. H. Han, “High performance OTFT and its application,” Current Applied Physics, 6S1, pp. e17-e21, 2006.

[30] A. Afzali, C. D. Dimitrakopoulos and T. L. Breen, “High-performance, solution-processed organic thin film transistors from a novel pentacene precursor,” J. Am. Chem. Soc., vol. 124, pp. 8812, 2002.

[31] J. H. Schon, S. Berg, Ch. Kloc, and B. Batlogg,

“Ambipolar pentacene field-effect transistors and

inverters,” Science, vol. 287, pp. 1022, 2000.

CHARACTERIZATION OF 4T SRAM CELL

Setu Garg1, Prof.S.N.Sharan2, Garima Chandel3 Member IEEE, Hridesh Verma4

1 GCET, Greater Noida; 2 GNIT, Greater Noida; 3,4 ABES IT, Ghaziabad, India.

ABSTRACT: The Static Random Access Memory discussed in this paper is based on a Four-Transistor SRAM cell. This paper focuses on two important parameters, Static Noise Margin (SNM) and Bit Line Leakage current, to characterize the Four-Transistor SRAM cell. The maximum allowable SNM needs to be investigated for efficient operation of the SRAM cell. The purpose of this analysis is to measure the SNM of the bit cell, that is, the noise it tolerates without flipping the cell contents. Bit line leakage current analysis is also carried out. This analysis covers the bit cell contribution to column leakage and the margin available for the sum of the total cell leakage currents in a long column. The performance and results have been validated through simulations using the ELDO tool from Mentor Graphics Corporation.

Index Terms – SRAM, Bit Line, Static Noise Margin, DC Source, Word Line

I. INTRODUCTION

Static random-access memory (SRAM) is a critical component across a wide range of microelectronics applications, from consumer appliances to high-end workstation and microprocessor applications. For almost all fields of application, semiconductor memory has been a key enabling technology. It is forecast that embedded memory in SoC designs will cover up to 90% of the total chip area. A representative example is the use of cache memory in microprocessors. The operational speed can be significantly improved by on-chip cache memory that temporarily stores a fraction of the data and instruction content of the main memory.

The SRAM consists of an array of static memory cells which are connected by horizontal word lines and vertical bit lines. To select one word line out of 2^h, an h-bit address has to be applied. The output data is usually organized as a word of b bits. From the architectural point of view, the output word represents a b-bit input/output (I/O) port. The I/O port consists of b I/O blocks, i.e. one block per bit of the output word. Each bit of the I/O port can be connected to one out of 2^w bit lines by a 2^w-to-1 column or bit line multiplexer. Any SRAM cell can be accessed by an address word which is (h + w) bits long. This address is applied to the control logic block, which controls all the memory operations, e.g. write, read, enable, data-in and data-out.
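The row and column selection described above can be sketched as a simple address split. This is an illustration only: it assumes the address word is formed from h row-select bits followed by w column-select bits, with the row bits in the upper positions, and the function name is hypothetical. The actual bit ordering depends on the decoder design.

```python
from typing import Tuple


def split_address(address: int, h: int, w: int) -> Tuple[int, int]:
    """Split an (h + w)-bit address into (word line index, bit line index).

    Assumed layout: the upper h bits select one of 2**h word lines and
    the lower w bits select one of 2**w bit lines per I/O block.
    """
    if h < 0 or w < 0:
        raise ValueError("bit counts must be non-negative")
    if not 0 <= address < (1 << (h + w)):
        raise ValueError("address does not fit in h + w bits")
    row = address >> w
    column = address & ((1 << w) - 1)
    return row, column


# Example: 3 row bits and 2 column bits, address 0b10110
print(split_address(0b10110, h=3, w=2))
```

The same column index drives all b column multiplexers at once, which is how one address delivers a whole b-bit word.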

II. BASIC SRAM ARCHITECTURE

A typical static random access memory (SRAM) architecture is shown in Figure 1. It consists of a matrix of memory cells arranged in an array of 2^N rows by 2^M columns. The total size of the memory array is 2^M × 2^N bits. During a read operation, one of the 2^N rows (word lines) is selected by the row address decoders by decoding the row addresses. All the memory cells on the selected word line are enabled. The column decoder selects one of the 2^M columns, and the value of the selected memory cell is read out by the sense amplifier. The data into and out of the memory array is controlled by the Read-Write control circuit.

Figure 1. Static Random Access Memory Architecture

III. SCHEMATIC AND READ/WRITE OPERATION OF 4T SRAM CELL

In the four-transistor SRAM cell, two NMOS transistors are used as pass transistors to access the cell and two PMOS transistors are used as drivers for the cell.

Figure 2. Schematic of the cell

A. WRITE OPERATION

To store a logic ‘1’ in the cell, BL is charged to Vdd and BL’ is discharged to ground, and vice versa to store a ‘0’. The word line is then switched to Vdd to turn on the NMOS access transistors. When the access transistors are turned on, the values on the bit lines are written into Q and Q’. The node that stores the logic ‘1’ will not reach the full Vdd because of the voltage drop across the NMOS access transistor. After the write operation, the word line voltage is reset to ground to turn off the NMOS access transistors. The node storing the logic ‘1’ is then pulled up to the full Vdd through the PMOS driver transistors.

B. READ OPERATION

The read operation of the cell is different from that of the 6T cell. To read from the cell, the bit lines are precharged to ground instead of Vdd, and the word line voltage is set to Vdd to turn on the NMOS access transistors. The node storing the logic ‘1’ pulls the voltage on the corresponding bit line up to a high level (not the full Vdd, because of the voltage drop across the NMOS access transistor). The other bit line remains at ground. The sense amplifier detects which bit line is at the high voltage and which is at ground.

If the cell was storing a logic ‘0’, the voltage level of BL will be lower than that of BL’, so the sense amplifier will output a logic ‘0’. If the cell was storing a logic ‘1’, the voltage level of BL will be higher than that of BL’, so the sense amplifier will output a logic ‘1’.

IV. EXPERIMENTS AND RESULT

A. Static Noise Margin Analysis

SNM quantifies the maximum level of voltage noise that can be present at the internal nodes of a bit cell without flipping the cell contents. Figure 3 shows the locations Q and Q’ of the noise sources in the 4T SRAM cell schematic. The purpose of this analysis is to measure the SNM of the bit cell. An SRAM cell should be designed such that under all conditions some SNM is reserved to cope with dynamic disturbances caused by particle strikes, cross talk, voltage supply ripple and thermal noise. SNM analysis was carried out for the 6T and 4T cells at the 0.18 µm technology node. The method and results are shown below.

Figure 3. 4T SRAM cell simulated structure

B. Method to calculate Static Noise Margin

To analyze the Static Noise Margin, introduce a DC noise source VX inside the SRAM cell and observe where the cell flips. Set the word line (WL) to Vdd. Connect the bit lines (BL and BL’) to ground. Initialize Q’ to Vdd and Q to 0. Now slowly increase VX from 0 and monitor nodes Q and Q’ to find where the cell flips. The Static Noise Margin is measured to be 362.3279 mV.
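The sweep procedure above can be sketched as a small search loop. This is a hedged illustration of the procedure only, not of the ELDO setup: `simulate_cell` stands in for one DC simulation run and is hypothetical, `toy_cell` is a made-up stub that flips at an assumed noise level, and the 1.8 V supply is an assumption for a 0.18 µm process.

```python
from typing import Callable, Optional, Tuple


def find_flip_voltage_mv(
    simulate_cell: Callable[[int], Tuple[float, float]],
    vdd_mv: int,
    step_mv: int = 1,
) -> Optional[int]:
    """Raise the DC noise source VX from 0 in fixed steps.

    simulate_cell(vx_mv) must return the node voltages (Q, Q') for that
    noise level. The cell starts with Q = 0 and Q' = Vdd, so it has
    flipped once Q exceeds Q'. Returns the first VX (in mV) at which
    that happens, or None if the cell never flips up to Vdd.
    """
    for vx_mv in range(0, vdd_mv + 1, step_mv):
        q, q_bar = simulate_cell(vx_mv)
        if q > q_bar:
            return vx_mv
    return None


def toy_cell(vx_mv: int, vdd: float = 1.8, flip_at_mv: int = 362) -> Tuple[float, float]:
    """Hypothetical stand-in for a circuit simulation: an ideal bistable cell."""
    if vx_mv > flip_at_mv:
        return vdd, 0.0   # flipped state: Q high, Q' low
    return 0.0, vdd       # initial state: Q low, Q' high


print(find_flip_voltage_mv(toy_cell, vdd_mv=1800))
```

Working in integer millivolts avoids floating-point drift in the sweep; the resolution of the reported margin is set by `step_mv`.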

C. Bit Line Leakage Current Analysis

The purpose of this analysis is to characterize the bit cell contribution to column leakage. The main aim of the test is to find the margin available for the sum of the cell leakage currents in a long column (from unselected WLs) during a read operation. This simulation should be used as a guideline when choosing the maximum number of physical rows in an SRAM array.

Figure 4. Bit Line Leakage Current Calculation

D. Method to calculate Bit Line Leakage Current

To carry out the Bit Line Leakage Current analysis, initialize the output Q to ‘0’ and Q’ to Vdd. The word line (WL) is off at this time and is therefore set to 0. BL and BL’ are connected to ground. The leakage current is then measured as the current through MN4 (the pass transistor facing the ‘1’). The Bit Line Leakage Current for the 4T SRAM cell is measured to be 7.1441 pA.
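The measured per-cell leakage can be turned into a rough bound on column height. The sketch below assumes a worst case in which every unselected cell on the column leaks the measured 7.1441 pA onto the same bit line; the leakage budget of 1 µA and the function name are hypothetical placeholders, since the allowed margin depends on the sense amplifier and the read current.

```python
import math

CELL_LEAKAGE_A = 7.1441e-12  # per-cell bit line leakage reported in the text (7.1441 pA)


def max_unselected_rows(leakage_budget_a: float,
                        cell_leakage_a: float = CELL_LEAKAGE_A) -> int:
    """Largest number of unselected cells whose summed leakage stays within budget.

    Worst-case assumption: every unselected cell contributes cell_leakage_a
    to the same bit line.
    """
    if leakage_budget_a < 0 or cell_leakage_a <= 0:
        raise ValueError("budget must be >= 0 and cell leakage must be > 0")
    return math.floor(leakage_budget_a / cell_leakage_a)


ASSUMED_BUDGET_A = 1e-6  # hypothetical 1 uA leakage budget, for illustration only
rows = max_unselected_rows(ASSUMED_BUDGET_A)
print(rows, rows * CELL_LEAKAGE_A)
```

In practice the usable number of rows would be lower, because leakage rises with temperature and process variation and this sketch uses a single nominal value.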

V. CONCLUSION

The two basic parameters, static noise margin and bit line leakage current, have been measured successfully. All the simulations were carried out with the ELDO tool from Mentor Graphics Corporation. Both parameters discussed in this paper are very important in the characterization of the 4T SRAM cell. An SRAM cell is designed such that under all conditions some SNM is reserved to cope with dynamic disturbances caused by particle strikes, cross talk, voltage supply ripple and thermal noise. The Static Noise Margin is measured to be 362.3279 mV. The bit line leakage current analysis also shows the margin available for the sum of the cell leakage currents in a long column during a read operation. The Bit Line Leakage Current for the 4T SRAM cell is measured to be 7.1441 pA. A further objective is to keep the bit line leakage current as low as possible.

Quantitative Analysis and Optimization Techniques

for On-Chip Cache Leakage Power

Vikas Tiwari, M.Tech (VLSI Design), ITM, Gwalior, India
Shyam Akashe, Associate Professor, ITM, Gwalior
Rajkumar Rajoriya, Assistant Professor, ITM, Gwalior

Abstract: On-chip L1 and L2 caches represent a sizeable fraction of the total power consumption of microprocessors. In nanometer-scale technology, subthreshold leakage power is becoming one of the dominant components of the total power consumption of those caches. In this study, we present optimization techniques to reduce the subthreshold leakage power of on-chip caches, assuming that multiple threshold voltages (Vth's) are available. First, we show a cache leakage optimization technique that examines the tradeoff between access time and subthreshold leakage power by assigning distinct Vth's to each of the four main cache components: address bus drivers, data bus drivers, decoders, and static random access memory (SRAM) cell arrays with sense amplifiers. Second, we show optimization techniques to reduce the leakage power of L1 and L2 on-chip caches without affecting the average memory access time. The key results are: 1) two additional high Vth's are enough to minimize leakage in a single cache (three Vth's if we include a nominal low Vth for microprocessor core logic); 2) if L1 size is fixed, increasing L2 size can result in much lower leakage without reducing average memory access time; 3) if L2 size is fixed, reducing L1 size may result in lower leakage without loss of the average memory access time for the SPEC2K benchmarks; and 4) smaller L1 and larger L2 caches than are typical in today’s processors result in significant leakage and dynamic power reduction without affecting the average memory access time.

Keywords—Microprocessor memory hierarchy, multiple threshold voltage, on-chip caches, SRAM, sub threshold leakage power.

I. INTRODUCTION

Until very recently, only dynamic power has been a significant source of power consumption, and Moore’s law has helped to control it. Shrinking processor technology below 100 nm has allowed, and actually required, reducing the supply voltage to reduce dynamic power consumption. However, smaller geometries with a low threshold voltage exacerbate leakage, so static power is beginning to dominate the power consumption equation [1]. For example, a 90-nm Pentium 4 consumes 110 W, and roughly 40% of the total power dissipation is consumed by leakage power [2]. The excessive heat dissipation by the leakage power in the high-end 90-nm Pentium 4 processor forced Intel Corporation to adopt more expensive power delivery, cooling, and packaging systems.

A potentially important source of this power dissipation is on-chip caches, because larger on-chip caches are being integrated onto the chip. For example, an Intel processor for server applications has 1 and 6 MB on-chip L2 and L3 caches, respectively. Subthreshold leakage power is dissipated by all of the subbanks even if they are not accessed, while dynamic power is dissipated only when a cache subbank is accessed. To alleviate this problem, transistors in caches could be designed for low subthreshold leakage, for example, by assigning them a higher threshold voltage (V_TH), by controlling the V_TH with adaptive body biasing or, if a better balance of speed and power is required, by employing dual V_TH [3]–[7]. Traditionally, at most two V_TH’s (one low and one high V_TH) have been available in high-performance process technologies, allowing cache designers only limited flexibility for suppressing subthreshold leakage current. To further reduce the subthreshold leakage, several circuit and microarchitectural techniques [8]–[13] have therefore been proposed that target the subthreshold leakage power reduction of L1 caches.

One consequence of the increasing importance of subthreshold leakage current is that the number of available V_TH’s in future process technologies will increase. Next-generation 65-nm processes are expected to support three V_TH’s (one low and two high V_TH’s), and future processes are likely to provide designers with even more choices. This increase provides new flexibility for subthreshold leakage power reduction methods, allowing new tradeoffs between the V_TH’s of different parts of a cache and between different levels in the cache hierarchy. The availability of additional V_TH’s suggests a new examination of the tradeoff between cache size and V_TH to reduce power loss from subthreshold leakage current.

In this study, we present systematic techniques for assigning multiple V_TH’s to memory hierarchies to minimize power dissipation, in particular subthreshold leakage [14]. Based on our techniques, we provide a detailed quantitative tradeoff analysis between access time and subthreshold leakage power of on-chip caches as a function of the number and the strength of V_TH’s. Although the qualitative trends of the subthreshold leakage power versus access time tradeoff are well known, this paper provides a detailed quantitative analysis to determine the optimal number of V_TH’s for given design constraints and to justify the cost of extra V_TH’s. First, we examine optimal leakage power dissipation for various access times in on-chip SRAM caches, when more than one high V_TH is available. Then, we show how many high V_TH’s are needed, in addition to a nominal V_TH required for the processor’s general logic circuits, and how much V_TH should be increased for effective leakage power reduction for various cache access time points. Second, we present how cache leakage power can be reduced while maintaining the same average memory access time of a processor memory system using L1 and L2 cache access statistics for SPEC2K workloads [15].

TABLE I CACHE ORGANIZATIONS FOR EACH CACHE SIZE

The remainder of this study is organized as follows. Section II explains our on-chip cache subthreshold leakage power and access time modeling methodologies. Section III presents a subthreshold leakage power optimization technique for a given access time constraint and provides a quantitative tradeoff analysis of on-chip cache subthreshold leakage power and access time. Section IV presents two-level cache leakage power optimization techniques using cache access statistics. Section V discusses future directions for this line of work and adds some concluding remarks.

II. ON-CHIP CACHE LEAKAGE POWER AND ACCESS TIME MODELS

To examine tradeoffs between subthreshold leakage power and access time of a processor cache memory system, we need circuit models to estimate the subthreshold leakage power and access time of caches. Rather than starting from scratch, we could have built on a widely used cache memory model called “CACTI” [16]. This model estimates access time, dynamic energy dissipation, and area of caches for given cache configuration parameters such as total size, line size, associativity, and number of ports. However, it is based on an outdated 0.8-μm CMOS technology and it applies linear scaling to obtain the figures for smaller process technologies. Furthermore, it does not provide access time and leakage power when multiple V_TH’s are available. To address these shortcomings, we designed caches with the 70-nm Berkeley predictive technology model (BPTM, footnote 2) in anticipation of the next generation of process technology. Then, we derived our subthreshold leakage power and access time models based on the HSPICE simulations of the designed cache circuits.

The designed caches ranged from 16 to 1024 KB in size. The bitlines and wordlines were segmented to improve access time, and subbanks were employed to reduce dynamic power dissipation [17] as well; see Table I for the cache subbank organization used in this study. The caches were broken into four components for the purposes of assigning distinct V_TH’s: address bus drivers, data bus drivers, decoders, and 6T-SRAM cell arrays with sense amplifiers. Fig. 1 illustrates the cache subbank organization used in this study.

Fig. 1. Cache subbank organization.

The circuit topology and the ratios of transistors in the decoder circuits are based on the CACTI model but optimized for the 70-nm technology. In addition, modern techniques for lower voltage are employed for the bitline precharge and sense-amplifier circuits. For the address and data bus interconnects, we employed an H-tree topology and inserted repeaters on each branch of the buses to optimize the interconnect delay of cache buses. To obtain the interconnect capacitance and resistance of long wires such as bitlines, wordlines, address, and data buses, the lengths of the interconnects are estimated using SRAM cell dimensions of 1.42 μm × 0.72 μm and the cache organizations in Table I. Then, for a given interconnect length, the predictor provided in footnote 2 is used to estimate the interconnect capacitance and resistance.

HSPICE simulations were run extensively to obtain leakage power and access time (or delay) models for wide ranges of cache sizes and V_TH’s for their four components. We considered V_TH’s between 0.2 and 0.5 V in steps of 0.05 V at 1-V nominal supply voltage. We measured the leakage power and the delay of each cache component separately.

A. Leakage Power Models

Fig. 2 shows V_TH versus leakage power of the 7 × 128, 8 × 256, and 9 × 512 row decoders that we designed. The HSPICE simulation results shown in Fig. 2 agree with the exponential decay in leakage power with a linear increase of V_TH that is characteristic of general CMOS circuits

(1)

To obtain an approximate analytic equation for leakage power as a function of V_TH, we measured the leakage power of the decoders at each discrete V_TH point, and we applied an exponentially decaying curve-fitting method to the measured leakage power as follows:

(2)

where the fit coefficients are constants derived using Origin 6.1, a scientific graphing and analysis software curve-fitting package (footnote 3); the squared fitting error is less than 0.001 for each fitted curve.

TABLE II

CACHE COMPONENT LEAKAGE POWER MODEL COEFFICIENTS AT 70 °C DIE TEMPERATURE AND A TYPICAL CORNER FOR EACH CACHE SIZE

Fig. 2. Leakage power dissipation of the 7 × 128, 8 × 256, and 9 × 512 decoders.

The rest of the cache components (address driver, data driver, and 6T SRAM cell array) show the same leakage power trend characteristics as the decoder of Fig. 2; leakage power decreases exponentially with the linear increase of V_TH. Hence, an identical curve-fitting method can be applied for these components to derive leakage power models like (2). The coefficients for all of the components in (2) can be found in Table II.
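The curve-fitting step can be illustrated with a short sketch. The model form used here, an offset plus a decaying exponential in V_TH with three constants, is an assumption that is consistent with the description of (2) rather than a transcription of it. The sample points are synthetic; they are not the HSPICE measurements or the Table II coefficients. The function name `fit_exp_decay` and the rate-scanning strategy are illustrative and stand in for the Origin 6.1 fit.

```python
import math

def fit_exp_decay(vth, power, rates):
    """Fit power ~ a0 + a1 * exp(-rate * vth) by scanning candidate rates.

    For a fixed rate the model is linear in (a0, a1), so those two
    coefficients follow from ordinary least squares in closed form.
    Returns ((a0, a1, rate), sum_of_squared_errors) for the best rate.
    """
    best = None
    n = len(vth)
    for rate in rates:
        x = [math.exp(-rate * v) for v in vth]
        sx, sy = sum(x), sum(power)
        sxx = sum(xi * xi for xi in x)
        sxy = sum(xi * yi for xi, yi in zip(x, power))
        denom = n * sxx - sx * sx
        if denom == 0:
            continue  # degenerate regressor, skip this rate
        a1 = (n * sxy - sx * sy) / denom
        a0 = (sy - a1 * sx) / n
        sse = sum((a0 + a1 * xi - yi) ** 2 for xi, yi in zip(x, power))
        if best is None or sse < best[0]:
            best = (sse, a0, a1, rate)
    return best[1:], best[0]

# V_TH sweep used in the text: 0.2 V to 0.5 V in 0.05 V steps.
vth_points = [0.20 + 0.05 * i for i in range(7)]
# Synthetic "measurements" generated from assumed constants (not paper data).
measured = [0.5 + 400.0 * math.exp(-20.0 * v) for v in vth_points]
candidate_rates = [r / 10 for r in range(10, 401)]

(a0, a1, rate), sse = fit_exp_decay(vth_points, measured, candidate_rates)
```

Scanning the rate and solving the remaining two coefficients linearly keeps the sketch dependency-free; a general nonlinear least-squares routine would serve the same purpose.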

Once all of the approximate analytic leakage power models for each component are derived for a cache size, the total leakage power of the cache can be approximated as the sum of the leakage power of all the components. Assuming that we apply four distinct V_TH’s, the approximate analytic equation for leakage power (LP) is

(3)

where the four arguments are the V_TH’s for address bus drivers, data bus drivers, decoders, and 6T-SRAM cell arrays, respectively. Each exponential term evaluates the leakage power dissipation of one of the four components.

Fig. 3. Delay time of 7 × 128, 8 × 256, and 9 × 512 decoders.

B. Access Time Models

Fig. 3 shows V_TH versus delay time of the 7 × 128, 8 × 256, and 9 × 512 row decoders that we designed. Basically, the CMOS circuit delay of ultra-deep-submicrometer short-channel transistors is

(4)

where the coefficients and the exponent (footnote 4) are constants depending on the technology and transistor sizes. The measured delay time trends in Fig. 3 agree with (4). However, the circuit delay or access time also fits very well to an exponential growth function with a very small exponent over our range of interest. It was convenient for some of our optimizations to approximate delay this way.

TABLE III

CACHE COMPONENT DELAY MODEL COEFFICIENTS AT 70 °C DIE TEMPERATURE AND A TYPICAL CORNER FOR EACH CACHE SIZE

To obtain an approximate analytic equation for delay time as a function of V_TH, we measured the delay time of the decoders at each discrete V_TH point, and we fit the following exponential curve to the measured delay time:

(5)

where the fit coefficients are constants derived using the same technique as that used for the leakage power models.

The rest of the cache components show the same delay trend characteristics as the decoder case of Fig. 3. Hence, the same curve-fitting technique can be applied for those components to derive approximate delay time models as functions of V_TH like (5). The coefficients for all the components in (5) can be found in Table III.

Once all of the approximate delay time models for each component are extracted for a specific cache size, the total delay or access time of the cache can be approximated as the sum of the delay times of all the cache components. Assuming that we can apply four distinct V_TH’s, the approximate analytic equation for the access time (AT) is

(6)

where the four arguments are the V_TH’s for address bus drivers, data bus drivers, decoders, and 6T-SRAM cell arrays, respectively. Each exponential term corresponds to the delay time of one of the four components.

Fig. 4. Access time and leakage power versus cache size of the baseline caches.

4 The exponent was around 2 in submicrometer technology, but it has decreased to about 1.3 in the current generation of deep-submicrometer technology.
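The two whole-cache models, total leakage power and total access time as sums of four per-component exponential terms, can be sketched as follows. The per-component form (offset, scale, rate) and every coefficient value below are hypothetical placeholders chosen for illustration; they are not the entries of Tables II and III, and the component keys are names invented for this sketch.

```python
import math

# Hypothetical (offset, scale, rate) triples per component.
# Placeholders only: NOT the coefficients of Tables II and III.
LEAK_COEFF = {
    "addr_bus": (0.02, 30.0, 18.0),
    "data_bus": (0.05, 120.0, 18.0),
    "decoder": (0.01, 15.0, 20.0),
    "sram_array": (0.50, 900.0, 19.0),
}
DELAY_COEFF = {
    "addr_bus": (0.10, 0.05, 3.0),
    "data_bus": (0.10, 0.05, 3.0),
    "decoder": (0.15, 0.08, 3.5),
    "sram_array": (0.20, 0.03, 2.5),
}

def leakage_power(vth):
    """Total leakage: one decaying exponential term per component."""
    return sum(a0 + a1 * math.exp(-r * vth[name])
               for name, (a0, a1, r) in LEAK_COEFF.items())

def access_time(vth):
    """Total access time: one growing exponential term per component."""
    return sum(b0 + b1 * math.exp(r * vth[name])
               for name, (b0, b1, r) in DELAY_COEFF.items())

# The two extremes of the V_TH range considered in the text.
baseline = dict.fromkeys(LEAK_COEFF, 0.2)  # fastest, leakiest
slowest = dict.fromkeys(LEAK_COEFF, 0.5)   # slowest, least leaky

normalized_leakage = leakage_power(slowest) / leakage_power(baseline)
normalized_delay = access_time(slowest) / access_time(baseline)
```

Keeping the per-component V_TH in a dictionary makes it straightforward to tie several components to one shared V_TH, or to leave them independent, when the models are used for optimization.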

We also define baseline caches in which the V_TH of all the cache components is set to a low V_TH (0.2 V). Fig. 4 shows the access time and the leakage power of the baseline caches. The cache access time grows logarithmically and the leakage power increases linearly with the cache size. Those trends agree with those of earlier studies on SRAM design. In Fig. 4, we assume a direct-mapped cache organization and consider only the leakage power of data arrays, disregarding the leakage of the tag comparators and other cache control logic.

III. CACHE LEAKAGE OPTIMIZATION WITH MULTIPLE V_TH ASSIGNMENTS

A. Methodology

In this section, we present a leakage power optimization technique assuming that we can assign multiple V_TH’s to a cache. To find the minimum leakage power of caches using a maximum of four distinct V_TH’s under a specified target access time constraint, we formulate the problem as follows:

constraints

(7)

(8)

where the four arguments are the V_TH’s for address bus drivers, data bus drivers, decoders, and 6T-SRAM arrays, respectively.

Fig. 5. Normalized optimum LP and V_TH versus normalized AT of 512-KB caches, schemes I and II.

There exist numerous combinations of the four V_TH’s satisfying a specific target access time. Among those combinations, we find the quadruple of V_TH’s producing minimum leakage power using a numerical optimization method (e.g., Matlab’s fmincon function). We accepted combinations whose access time is within 5% of the specified target. We can repeat this procedure with modified objective and constraint functions to find an optimal V_TH combination for cache memories that have only two or three distinct V_TH’s.

Assuming that we can assign distinct V_TH’s to each component of the cache, it is important to determine how many V_TH’s are cost-effective because an extra mask and process step are needed for each additional V_TH. To examine the dependence of the optimization results on access time, we sweep the target access time from the fastest possible (assigning a low V_TH of 0.2 V to all the cache components) to the slowest possible (assigning a high V_TH of 0.5 V to all the cache components). We present here the summary of the V_TH assignment schemes we examined in this study.

• Scheme I: Assigning a high V_TH to all of the cache components, including address bus drivers, data bus drivers, decoders, and 6T-SRAM cell arrays. This requires two V_TH’s if we include a nominal or low V_TH for the processor’s general logic circuits.

• Scheme II: Assigning a high V_TH only to the 6T-SRAM cell arrays, which dominate leakage power but not the overall cache delay, and assigning a default or low V_TH (0.2 V) to the rest of the transistors. This requires at least two V_TH’s if we include a nominal or low V_TH for the processor’s logic.

• Scheme III: Assigning a high V_TH to the 6T-SRAM cell arrays and assigning another high V_TH to the peripheral components (address bus drivers, data bus drivers, and decoders) of the cache. This requires at least three V_TH’s if we include a nominal or low V_TH for the processor.

• Scheme IV: Assigning four distinct high V_TH’s to the four cache components. This requires at least five V_TH’s if we include a nominal or low V_TH for the processor logic.
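The formulation and the four schemes above can be sketched as a small search problem: minimize total leakage subject to an access time limit, with each scheme restricting which components may share or change V_TH. This is a minimal sketch under stated assumptions, not the procedure used in the study. It replaces the numerical optimizer with an exhaustive search over the 0.05 V grid, treats the target as an upper bound instead of a 5% tolerance band, and uses hypothetical model coefficients and function names.

```python
import itertools
import math

GRID = [round(0.20 + 0.05 * i, 2) for i in range(7)]  # 0.20 V .. 0.50 V
PERIPHERY = ("addr_bus", "data_bus", "decoder")
COMPONENTS = PERIPHERY + ("sram_array",)

# Hypothetical (offset, scale, rate) coefficients, not from Tables II/III.
LEAK = {"addr_bus": (0.02, 30.0, 18.0), "data_bus": (0.05, 120.0, 18.0),
        "decoder": (0.01, 15.0, 20.0), "sram_array": (0.50, 900.0, 19.0)}
DELAY = {"addr_bus": (0.10, 0.05, 3.0), "data_bus": (0.10, 0.05, 3.0),
         "decoder": (0.15, 0.08, 3.5), "sram_array": (0.20, 0.03, 2.5)}

def leakage(v):
    return sum(a0 + a1 * math.exp(-r * v[c]) for c, (a0, a1, r) in LEAK.items())

def delay(v):
    return sum(b0 + b1 * math.exp(r * v[c]) for c, (b0, b1, r) in DELAY.items())

def candidates(scheme):
    """Yield the V_TH assignments each scheme allows."""
    if scheme == "I":      # one high V_TH shared by all four components
        for v in GRID:
            yield dict.fromkeys(COMPONENTS, v)
    elif scheme == "II":   # high V_TH on the array only, periphery at 0.2 V
        for v in GRID:
            yield {**dict.fromkeys(PERIPHERY, 0.2), "sram_array": v}
    elif scheme == "III":  # one V_TH for the array, another for the periphery
        for va, vp in itertools.product(GRID, GRID):
            yield {**dict.fromkeys(PERIPHERY, vp), "sram_array": va}
    elif scheme == "IV":   # four independent V_TH's
        for combo in itertools.product(GRID, repeat=4):
            yield dict(zip(COMPONENTS, combo))
    else:
        raise ValueError(f"unknown scheme: {scheme}")

def optimize(scheme, target_ratio):
    """Return (best assignment, leakage normalized to the all-0.2-V baseline)."""
    base = dict.fromkeys(COMPONENTS, 0.2)
    limit = target_ratio * delay(base)
    feasible = [v for v in candidates(scheme) if delay(v) <= limit]
    best = min(feasible, key=leakage)
    return best, leakage(best) / leakage(base)

results = {s: optimize(s, 1.25)[1] for s in ("I", "II", "III", "IV")}
```

Because the candidate set of scheme III contains those of schemes I and II, and scheme IV contains scheme III, the search can never do worse with more V_TH’s; how much better it does depends entirely on the fitted coefficients.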

B. Leakage Power Optimization and Quantitative Tradeoff Analysis

In Fig. 5, we plot the normalized optimum leakage power and V_TH at different target access times (125%, 150%, 175%, and so forth) of 512-KB caches employing schemes I and II. The optimum leakage power and the V_TH are obtained using (7) and (8) of Section III-A. The parenthesized I and II in Fig. 5 represent schemes I and II, respectively. In the graph, the normalized minimum leakage power and the access time of 100% correspond to the leakage power and the access time of a 512-KB baseline cache designed with a low V_TH (0.2 V) for all four cache components, which is the fastest but leakiest cache. The 125% access time on the x axis means that the cache is 25% slower than the baseline cache.

According to the trends shown in Fig. 5, the leakage power decreases exponentially as the V_TH increases linearly; note that the y axis is a logarithmic scale. The optimization results for the different cache sizes show almost the same normalized optimum leakage power and V_TH trends as those of the 512-KB caches in Fig. 5 as long as the same V_TH assignment scheme is applied; see Table IV for the normalized leakage power of all the cache sizes. Comparing schemes I and II, the 512-KB cache with scheme II dissipates less leakage power than the one with scheme I at the same access time point when the normalized access time constraint is less than 155%. For example, at the 125% access time point, scheme II shows 6% of the leakage dissipation of the baseline 512-KB cache and scheme I shows 13%, a 2× difference. However, scheme I shows better leakage power reduction beyond the 155% normalized access time point.

Fig. 6 shows the normalized optimum leakage power and V_TH versus normalized access time trends for a 512-KB cache of scheme III. The optimum leakage power and the V_TH’s are obtained using (7) and (8) of Section III-A. In Fig. 6, the V_TH of the SRAM cell array, denoted as array in the graph, starts to increase first. This implies that the SRAM cell array is responsible for the most significant fraction of total cache leakage power, but it has the least impact on increasing the total cache access time. After the V_TH of the SRAM cell arrays is saturated at the maximum allowed point (0.5 V), the V_TH of the peripheral components, labeled as peri in the graph, is increased to reduce the leakage power in the peripheral components. However, this just increases the access time without much further cache leakage reduction. For example, the leakage power does not decrease beyond the 215% access time point, even though the V_TH for the peripheral circuits has not reached the maximum value (0.5 V) in this 512-KB cache case.

TABLE IV

PERCENTAGE LEAKAGE POWER OF SCHEMES I–IV NORMALIZED TO LEAKAGE POWER OF EACH CACHE SIZE AT THE 100% AT POINT

Fig. 6. Normalized optimum LP and V_TH versus normalized AT of 512-KB caches, schemes I and III.

These leakage power and V_TH versus access time trends also explain the leakage optimization results shown in Fig. 5: scheme II shows a better optimization result than scheme I when the normalized access time is less than 155%, but not beyond the 155% access time point. Recall that scheme I assigns a high V_TH to all the cache components. It sacrifices more access time unnecessarily by increasing the V_TH of the peripheral components with little leakage reduction at the same access time. In contrast, scheme II assigns the high V_TH to just the SRAM cell arrays, which are responsible for a greater fraction of total cache leakage power but affect access time less. However, scheme II cannot reduce leakage power beyond the 155% access time point, because the leakage power of the peripheral components, where a low V_TH is used, becomes substantial beyond this point.

Fig. 7. Normalized optimum LP and V_TH versus normalized AT, scheme IV.

Fig. 7 shows the normalized optimum leakage power and V_TH versus normalized access time trends for a 512-KB cache of scheme IV. The optimum leakage power and the V_TH’s are obtained again using (7) and (8) of Section III-A. In scheme IV, we can assign up to four distinct V_TH’s for leakage power optimization. According to the results shown in Fig. 7, the V_TH of the 6T-SRAM cell arrays starts to increase first, similar to the scheme III case. Among the peripheral components, the V_TH for the data bus starts to increase first. This implies that the data bus consisting of 128 b, the assumed bus width between the L2 and L1 caches, has the second most significant impact on the leakage power. Even though the address bus has the same structure, the number of bits in the address bus is much smaller than in the data bus. Hence, the leakage power impact of the address bus is much less than that of the data bus. However, in the case of smaller caches (e.g., 16–64 KB caches) where the data bus width is 32 b, both the data and address buses have almost the same impact on the leakage power. Therefore, the V_TH trends for both the data and address buses will be the same. These trends suggest the direction of V_TH optimizations that reduce cache leakage power.

Table IV summarizes the normalized cache leakage power of schemes I–IV. As expected, we can reduce more leakage power while achieving the same access time by having more V_TH’s to control. If the access time is fixed, the caches of schemes III and IV always show 38%–72% better leakage optimization results than those of scheme I. There are a few things we should note from this comparison study. First, as the target access time is increased to more than the 150% point in scheme II, caches dissipate more leakage power than those employing scheme I. This implies that the cache peripheral components consume nonnegligible leakage power. The leakage power of those components becomes substantial when we cut down the leakage power of the 6T-SRAM arrays significantly. Second, the slowest cache access time of scheme II ends around 150% in small-size caches. This means that the peripheral components also play important roles in both cache leakage power and access time. In other words, increasing the V_TH of 6T-SRAM cell arrays alone gives us diminishing returns at some point without reducing the leakage power further. This is why the caches of scheme I give even better results than those of scheme II as V_TH increases. Third, there is a negligible difference between caches of schemes III and IV in terms of leakage power reduction. This implies that scheme III employing two distinct high V_TH’s (three V_TH’s if we include a nominal or low V_TH for the processor) is enough to minimize leakage. Finally, as illustrated in Figs. 5–7 and Table IV, each cache shows a wide range of optimal leakage power consumption depending on target access time. Hence, the right tradeoff point between the leakage power and the access time of the caches will be determined by either system design specifications or constraints.

TABLE V

CACHE DYNAMIC ENERGY CONSUMPTION PER ACCESS AND LEAKAGE POWER DISSIPATION AT 70 °C DIE TEMPERATURE AND A TYPICAL CORNER FOR EACH CACHE SIZE

IV. LEAKAGE OPTIMIZATION TECHNIQUES FOR TWO-LEVEL CACHES

A. Methodology

In a processor memory system, the average memory access time (AMAT) [18] is a key metric for measuring the overall memory system performance. To evaluate the performance or AMAT, it is essential to examine the cache miss characteristics of realistic applications, because the performance or AMAT is a function of L1 and L2 cache miss rates and cache access times. In our study, we assume that the memory system hierarchy consists of separate L1 instruction and data caches with a unified L2 cache. Then, the average performance of the processor memory system can be measured or compared with the AMAT represented by

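$$\mathrm{AMAT} = \mathrm{HitTime}_{L1} + \mathrm{MissRate}_{L1} \times \left(\mathrm{HitTime}_{L2} + \mathrm{MissRate}_{L2} \times \mathrm{MissPenalty}_{L2}\right) \qquad (9)$$

where HitTime_L1 and HitTime_L2 are the access times of the L1 and L2 caches, MissRate_L1 and MissRate_L2 are the miss rates of the L1 and L2 caches, and MissPenalty_L2 is the external memory access and data transfer time. Note that the local miss rate (footnote 5) is used as MissRate_L2.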
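As a worked illustration of the AMAT definition, the sketch below evaluates the standard two-level textbook form, in which an L1 miss costs an L2 access and an L2 miss additionally costs the external memory penalty. The function name and the numeric inputs are hypothetical; they are not values from Table VI or from the cache models.

```python
def amat(hit_time_l1, miss_rate_l1, hit_time_l2, miss_rate_l2_local, miss_penalty_l2):
    """Average memory access time of a two-level cache hierarchy.

    miss_rate_l2_local is the local L2 miss rate (L2 misses / L2 accesses).
    Times can be in any unit as long as all three use the same one.
    """
    return hit_time_l1 + miss_rate_l1 * (
        hit_time_l2 + miss_rate_l2_local * miss_penalty_l2
    )

# Hypothetical inputs in nanoseconds, chosen only to exercise the formula.
example = amat(hit_time_l1=1.0, miss_rate_l1=0.05,
               hit_time_l2=5.0, miss_rate_l2_local=0.2,
               miss_penalty_l2=100.0)
```

With these inputs the L2 term contributes 0.05 × (5 + 0.2 × 100) = 1.25 ns on top of the 1 ns L1 hit time, which shows how strongly the external memory penalty weighs on the average even at a low L1 miss rate.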

Similarly, we measure the average memory access energy (AMAE) to compare the dynamic energy dissipation of each memory system configuration. Assuming that the L1 cache is accessed every cycle, the AMAE represents the average energy dissipation per access in the entire microprocessor memory system that includes L1, L2, and main memory. We can estimate the average memory access energy as follows:

(10)

where HitEnergy is the average energy dissipation per access given in Table V. We assume a two-channel 1066-MHz 256-MB RAMBUS DRAM RIMM whose sustained transfer rate is 4.2 GB/s [19] to derive the main memory access time and dynamic energy dissipation per access. Though the sustained transfer rate is quite high, we should also consider the RAS/CAS latency of the memory, which is about 20 ns. For the energy dissipation per access, we used the number given in [20], which is 3.57 nJ per access. The dynamic energy dissipation per access can vary depending on the number of RIMMs. We assume that one RIMM is installed. See Section IV-B and note that more RIMMs are favorable for our optimization technique, because our technique prefers a larger L2 cache to a smaller one for leakage power reduction. The larger L2 cache accesses DRAM less frequently than the smaller one, resulting in less energy consumption for accessing the external DRAM. Hence, if more RIMM modules are installed, implying more energy dissipation per DRAM access, a larger L2 cache will allow even more energy to be saved.
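The main memory figures quoted above can be combined into a rough miss penalty and an energy estimate. This is a sketch under explicit assumptions: the AMAE form below is assumed by analogy with the AMAT definition and is not a transcription of (10), the 64-byte line size is a hypothetical value that the text does not state, and the cache hit energies are placeholders, not Table V entries. Only the 20-ns latency, the 4.2-GB/s transfer rate, and the 3.57 nJ per DRAM access come from the text.

```python
def dram_miss_penalty_ns(line_bytes, latency_ns=20.0, bandwidth_gb_per_s=4.2):
    """Latency plus transfer time for one cache line.

    bytes divided by (GB/s) yields nanoseconds, since 1 GB/s = 1 byte/ns.
    """
    return latency_ns + line_bytes / bandwidth_gb_per_s

def amae(hit_energy_l1, miss_rate_l1, hit_energy_l2, miss_rate_l2_local,
         dram_energy_nj=3.57):
    """Average memory access energy in nJ (assumed AMAT-like form)."""
    return hit_energy_l1 + miss_rate_l1 * (
        hit_energy_l2 + miss_rate_l2_local * dram_energy_nj
    )

# Hypothetical 64-byte line; the text does not give the line size.
penalty = dram_miss_penalty_ns(line_bytes=64)

# Placeholder hit energies (nJ) and miss rates, not Table V or VI values.
small_l2 = amae(0.1, 0.05, 0.5, 0.30)
large_l2 = amae(0.1, 0.05, 0.7, 0.15)
```

In this assumed form, a larger L2 raises the L2 hit energy but lowers the local miss rate, so whether the average energy falls depends on how expensive a DRAM access is relative to an L2 access, which matches the qualitative argument about RIMM count in the text.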

To obtain L1 and L2 cache miss rates, we use the SimpleScalar/Alpha 3.0 tool set [21], which is a suite of functional and timing simulation tools for the Alpha AXP ISA. In addition, we collected the results from all 25 of the SPEC2K benchmarks [15] to perform our evaluation. All SPEC programs were compiled for a Compaq Alpha AXP-21264 processor using the Compaq C and Fortran compilers under the OSF/1 V4.0 operating system using full compiler optimizations. We completed the execution for each benchmark application to get reliable L2 cache miss rates, because L2 cache accesses are far less frequent than L1 cache accesses; an insufficient number of L2 accesses may result in unrepresentatively higher L2 cache miss rates.

TABLE VI

AVERAGE L1 AND L2 CACHE MISS RATES FROM THE ENTIRE SPEC2K BENCHMARKS

5 This rate is simply the number of misses in a cache divided by the total number of memory accesses to this cache.

Table VI shows the average L1 and L2 cache miss rates from the entire SPEC2K benchmarks for 16-, 32-, and 64-KB L1 caches, respectively. We used direct-mapped L1 instruction caches and four-way set associative L1 data caches. Also, we used eight-way set associative L2 caches. For simplicity, each L1 cache miss rate is obtained by taking the sum of the number of total instruction and data cache misses and dividing by the sum of total instruction and data cache accesses; a 16-KB L1 means instruction and data caches are each 16 KB in size. Since an L2 miss rate is a function of the L1 cache miss rate, we measure the separate L2 cache miss rates for each L1 cache size configuration. Those cache miss characteristics will definitely affect the leakage optimization direction of two-level cache memory systems.
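The miss-rate bookkeeping described above can be written out directly. The sketch assumes that every L1 miss, instruction or data, becomes one access to the unified L2, so the local L2 miss rate is L2 misses divided by L1 misses. The event counts are hypothetical and do not come from the SimpleScalar runs or Table VI.

```python
def combined_l1_miss_rate(i_misses, i_accesses, d_misses, d_accesses):
    """Single L1 miss rate over the split instruction and data caches."""
    return (i_misses + d_misses) / (i_accesses + d_accesses)

def local_l2_miss_rate(l2_misses, i_misses, d_misses):
    """L2 misses divided by L2 accesses, where L2 accesses = all L1 misses."""
    return l2_misses / (i_misses + d_misses)

# Hypothetical event counts, chosen only for illustration.
l1_rate = combined_l1_miss_rate(i_misses=10, i_accesses=1000,
                                d_misses=40, d_accesses=1000)
l2_rate = local_l2_miss_rate(l2_misses=5, i_misses=10, d_misses=40)
```

Because the L2 access count equals the L1 miss count, changing the L1 size changes the denominator of the local L2 miss rate, which is why a separate L2 miss rate is needed for each L1 configuration.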

B. L2 Cache Leakage Power Optimization

Since an L2 cache’s contribution to leakage power dominates due to its size, we will examine the leakage power optimization of the L2 cache first. Consider caches designed with low-V_TH (0.2 V) devices and a baseline cache memory system consisting of 16 and 128 KB for L1 and L2 caches, respectively. Then, we have leakage power consumption and AMAT corresponding to this configuration. Increasing the V_TH of the 128-KB L2 cache will reduce the leakage power of the L2 cache, but it will increase the AMAT of the cache memory system because of the increased access or hit time. However, there is a way to reduce the leakage power of the cache memory system without increasing the AMAT, which significantly impacts the execution time of the system.

The key to reducing leakage power without increasing AMAT is to compensate for the increased L2 access time by reducing the cache miss rate of the cache memory system. To reduce the miss rate, we can increase the L2 cache size. The main memory access penalty is quite significant in terms of both time and energy. Hence, even a slight reduction of L2 cache miss rates results in a significant improvement in the AMAT. We note that although area was one of the most important design constraints in the past, this trend is changing and power is becoming an equally important constraint in many situations [22]. In this argument, we assume that the same AMAT will approximately give us the same execution time for a fixed processor core, L1 cache size, and benchmark program, so that we can fairly compare the total leakage energy consumption as well.

Fig. 8. L2 leakage power optimization at a fixed L1 size (16 KB). (1) and (2) are the leakage power consumption of the 256- and 512-KB caches at the same AMAT as the baseline 128-KB cache, respectively.

Fig. 8 shows the leakage power versus AMAT of L2 caches with a fixed L1 cache size of 16 KB. The leakage power optimization for individual caches is based on scheme III, which requires two additional distinct high V_TH’s for L2. Taking the AMAT of the fastest 128-KB L2 cache designed with low V_TH (0.2 V) as a baseline, we compare the leakage power of other caches at the same AMAT point; see the (1) and (2) points in Fig. 8. The (1) and (2) points are the leakage power consumption of the cache system with the 256- and 512-KB caches at the same AMAT as the baseline 128-KB cache system. As can be seen from the plots, the AMAT can be maintained while the leakage power is reduced by replacing the baseline 128-KB L2 cache with a 256-KB L2 cache that is intentionally slowed down by increasing its V_TH’s to reduce leakage.

This replacement with the double-sized L2 cache reduces the leakage power by 70% compared to the fastest but leakiest 128-KB L2 cache with the same AMAT. Similarly, the use of a 512-KB L2 cache can further reduce leakage compared to the 256-KB cache; see the vertical line in Fig. 8.
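The equal-AMAT argument can be made concrete with a small calculation: when a larger L2 lowers the local miss rate, the saved miss penalty is the amount by which the L2 hit time may grow (through higher V_TH’s) before the AMAT changes. The sketch assumes the standard two-level AMAT form with a fixed L1; all numeric values are hypothetical and are not taken from Table VI or Fig. 8.

```python
def amat(hit_l1, miss_l1, hit_l2, miss_l2_local, miss_penalty):
    """Standard two-level average memory access time."""
    return hit_l1 + miss_l1 * (hit_l2 + miss_l2_local * miss_penalty)

def l2_hit_time_budget(base_hit_l2, base_miss_l2, new_miss_l2, miss_penalty):
    """Largest L2 hit time that keeps AMAT unchanged for a fixed L1.

    With L1 fixed, AMAT is unchanged when hit_l2 + miss_l2 * miss_penalty
    is unchanged, so the miss-penalty saving can be spent on hit time.
    """
    return base_hit_l2 + (base_miss_l2 - new_miss_l2) * miss_penalty

# Hypothetical baseline (small, fast L2) and larger L2 with fewer misses.
HIT_L1, MISS_L1, PENALTY = 1.0, 0.05, 35.0        # ns, ratio, ns
base_hit_l2, base_miss_l2 = 4.0, 0.30
new_miss_l2 = 0.25

budget = l2_hit_time_budget(base_hit_l2, base_miss_l2, new_miss_l2, PENALTY)
amat_base = amat(HIT_L1, MISS_L1, base_hit_l2, base_miss_l2, PENALTY)
amat_new = amat(HIT_L1, MISS_L1, budget, new_miss_l2, PENALTY)
```

The difference between `budget` and the larger cache’s own low-V_TH hit time is the slack that can be traded for higher V_TH’s, and hence lower leakage, at the same AMAT.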

Finally, the employment of larger L2 caches also reduces the average dynamic power of the memory system, because the larger L2 caches reduce the number of external memory accesses that consume a significant amount of dynamic energy. Table VII summarizes the results for the normalized leakage power and normalized average memory access energy for each L1 cache size designed using scheme III at a fixed AMAT. To compare leakage power and AMAE, the following standard cache configurations were used: 128-KB L2 with 16-KB L1, 256-KB L2 with 32-KB L1, and 512-KB L2 with 64-KB L1. The shaded numbers represent the baseline L2 configuration, leakage power, and AMAE. Table VII shows the counterintuitive result that we can reduce both leakage power and AMAE by employing larger L2 caches while maintaining the same AMAT.


TABLE VII

L2 CACHE NORMALIZED LEAKAGE AND AMAE AT THE FIXED L1 SIZE (16 KB) AND AMAT

Fig. 9. L1 leakage power optimization at a fixed L2 size (512 KB). (1) and (2) are the leakage power consumption of the 32- and 16-KB caches at the same AMAT as the baseline 64-KB cache, respectively.

C. L1 Cache Leakage Power Optimization

It is rather difficult to improve the L1 cache miss rates further, because they are already very low for 16-, 32-, and 64-KB caches when SPEC2K benchmarks are run. Hence, the access time of the caches becomes a dominant factor in determining the AMAT. For example, the access time of a 64-KB L1 cache increases by 48% compared to the fastest 16-KB L1 cache, because the access time is very sensitive to size in small caches. Essentially, cache access time increases logarithmically with size, but has a steeper slope for smaller caches than for larger caches. This observation explains why the AMAT of a cache hierarchy with a smaller L1 cache can be faster than one with a larger L1 cache for a certain range of cache sizes (e.g., 16 to 64 KB).

Fig. 9 shows the leakage power versus the AMAT of 16-, 32-, and 64-KB L1 caches using scheme III, each with a fixed L2 cache of size 512 KB. Like the comparison performed in Section IV-B, the leakage power of different caches is compared at the same AMAT point. The plots show that leakage power can be reduced by replacing the fastest 64-KB L1 cache with a 32-KB L1 cache that is intentionally slowed down by increasing its V_TH’s to reduce the leakage power; the resulting cache memory system still has the same AMAT. Similarly, a slowed 16-KB cache with increased V_TH’s can replace a 32-KB cache without changing the AMAT of the L1/L2 hierarchy. The new system consumes much less leakage power; see points (1) and (2) in Fig. 9, which are the leakage power consumption of the cache system with the 32- and 16-KB caches at the same AMAT as the baseline cache system.

TABLE VIII

L1 CACHE NORMALIZED LEAKAGE AND AMAE AT THE FIXED L2 SIZE (512 KB) AND AMAT

Table VIII shows the results for normalized leakage power

and AMAE as a percentage of each fast but leaky L1 cache

size using scheme III with fixed AMATs. The comparisons were

performed in the same manner as Table VII. The shaded num-

bers represent the baseline L1 configuration, leakage power,

and AMAE. According to the comparisons, we can reduce both

leakage power and AMAE by employing smaller L1 caches.

This is the inverse of the case for L2 caches, where the leakage

of the overall memory system can be reduced by increasing their

size. However, it should be noted that these results are only valid

within the specific set of sizes and simulation environment given

in this discussion. First, a 4-KB L1 cache will have a cache

miss rate that is much higher than a 16-KB cache, but its access

time will not be sufficiently smaller to make the tradeoff worth-

while. Also, the normalized AMAE is rather high because the

total power fraction of L1 caches is relatively small compared to

L2 caches. Second, many SPEC2K benchmark programs have

very high locality compared to real-world larger size applica-

tions. This results in quite low cache miss rates for small-size

L1 caches as shown in Table VI. Third, the operating system

(OS) context switching was not modeled due to our limited sim-

ulation environment. The context switching typically increases

cache miss rates, because cache flushing increases cold start

misses. These factors must be considered if one is to perform

realistic cache leakage power optimizations with the proposed

techniques.

V. CONCLUSION

In this study, we examined the leakage power and access time tradeoff for caches where multiple V_TH's are allowed. We used curve fitting techniques to model subthreshold leakage power and access time. Our results show that two extra distinct high V_TH's for caches (three V_TH's in total, including the V_TH for the microprocessor core logic) are sufficient to yield a significant reduction in leakage power. Such an arrangement can reduce the leakage power by as much as 91%. We also show that smaller L1 and larger L2 caches than are typical in today's processors result in significant leakage and dynamic power reduction without affecting the average memory access time. Given that the processor core may need a distinct V_TH, and each of the caches may need up to two V_TH's (scheme III), we could require up to five distinct V_TH's for the leakage power optimization of two-level cache memory systems.

Even though the modeling and optimization techniques pre-

sented in this study have been performed using continuous-do-

main functions, the actual cache latencies are integer numbers of

processor clock cycles. Cache designers or architects can choose

an appropriate discrete point from the continuous-domain re-

sults depending on their target processor core clock frequency.

Furthermore, the circuit techniques combined with microar-

chitectural level controls exemplified by drowsy caches [10] are

designed to reduce the leakage power of L1 caches when sac-

rificing access time is not an option. Such an approach is less

attractive for L2 caches. The same effect can be obtained more

simply by using high-V_TH circuits.

REFERENCES

[1] N. S. Kim et al., "Leakage current: Moore's law meets static power," IEEE Computer, vol. 36, no. 12, pp. 68–75, Dec. 2003.

[2] G. Sery, S. Borkar, and V. De, "Life is CMOS: Why chase life after?," in Proc. IEEE Design Automation Conf., 2002, pp. 78–83.

[3] S. Mutoh et al., "1-V power supply high-speed digital circuit technology with multithreshold-voltage CMOS," IEEE J. Solid-State Circuits, vol. 30, no. 8, pp. 847–854, Aug. 1995.

[4] T. Douseki, N. Shibata, and J. Yamada, "A 0.5–1 V MTCMOS/SIMOX SRAM macro with multi-Vth memory cells," in Proc. IEEE Int. SOI Conf., 2000, pp. 24–25.

[5] K. Nii et al., "A low power SRAM using auto-backgate-controlled MT-CMOS," in Proc. IEEE Int. Symp. Low Power Electronic Device, 1998, pp. 293–298.

[6] H. Mizuno et al., "An 18-μA standby current 1.8-V, 200-MHz microprocessor with self-substrate-biased data-retention mode," IEEE J. Solid-State Circuits, vol. 34, no. 11, pp. 1492–1500, Nov. 1999.

[7] F. Hamzaoglu et al., "Analysis of dual-V_T SRAM cells with full-swing single-ended bit line sensing for on-chip cache," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 10, no. 2, pp. 91–95, Apr. 2002.

[8] M. Powell et al., "Gated-V_DD: A circuit technique to reduce leakage in deep-submicron cache memories," in Proc. IEEE Int. Symp. Low Power Electronics & Design, 2000, pp. 90–95.

[9] A. Agarwal, L. Hai, and K. Roy, "A single-V_t low-leakage gated-ground cache for deep submicron," IEEE J. Solid-State Circuits, vol. 38, no. 2, pp. 319–328, Feb. 2003.

[10] N. S. Kim et al., "Drowsy instruction caches," in Proc. IEEE Int. Symp. Microarchitecture, 2002, pp. 219–230.

[11] S. Yang et al., "An integrated circuit/architecture approach to reducing leakage in deep-submicron high-performance I-caches," in Proc. IEEE Int. Symp. High-Performance Computer Architecture, 2001, pp. 147–157.

[12] S. Kaxiras et al., "Cache decay: Exploiting generational behavior to reduce cache leakage power," in Proc. IEEE Int. Symp. Computer Architecture, 2001, pp. 240–251.

[13] H. Zhou et al., "Adaptive mode-control: A static-power-efficient cache design," in Proc. IEEE Parallel Architecture and Compilation Tech., 2001, pp. 61–70.

[14] N. S. Kim et al., "Leakage power optimization techniques for ultra deep sub-micron multi-level caches," in Proc. IEEE Int. Conf. Computer Aided Design, 2003, pp. 627–632.

[15] Standard Performance Evaluation Corporation [Online]. Available: http://www.specbench.org

[16] S. Wilton et al., "An enhanced access and cycle time model for on-chip caches," Western Res. Lab. Res. Rep. 93/5, 1993.

[17] K. Ghose and M. Kamble, "Reducing power in superscalar processor caches using subbanking, multiple line buffers and bit-line segmentation," in Proc. IEEE Int. Symp. Low Power Electronic and Design, 1999, pp. 70–75.

[18] J. Hennessy et al., Computer Architecture: A Quantitative Approach, 3rd ed. San Mateo, CA: Morgan Kaufmann, 2003, pp. 406–408.

[19] 800/1066 MHz RDRAM Advanced Information (2002). [Online]. Available: http://www.rambus.com

[20] V. Delaluz et al., "Compiler-directed array interleaving for reducing energy in multi-bank memories," in Proc. IEEE Asia South Pacific Design Automation Conf., 2002, pp. 288–293.

[21] T. Austin et al., "SimpleScalar: An infrastructure for computer system modeling," IEEE Computer, vol. 35, no. 2, pp. 59–67, Feb. 2002.

[22] T. Mudge, "Power: A first class design constraint," IEEE Computer, vol. 34, no. 4, pp. 52–57, Apr. 2001.


CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

VLP0112-1

Fault Tolerant Design for Analog and Digital Circuits

Dr. Anand Mohan, Akhilesh Pathak, Tarang Agarwal, Trailokya Nath Sasamal

Department of Electronics Engineering, Institute of Technology, BHU, Varanasi, India

Email: pathak.akhilesh, agarwaltarang07, [email protected]

Abstract: Reliability is important in many applications, and it has become practical to achieve as the size and cost of chips have fallen drastically. In electronic circuits, reliability is achieved through fault tolerance, in which the system itself tolerates a fault and masks the resulting error. Fault tolerance is obtained through various redundancy methods (hardware, software, information, and time), but these methods differ for analog and digital systems. This paper discusses the important methods for making analog and digital circuits fault tolerant. Digital fault tolerant design is explained using majority and minority voting, together with how faults are injected into the circuits for testing using VHDL. Analog fault tolerant design is explained with the help of fuzzification. The platform used for digital circuits is Xilinx-12.4i (ISE) and for analog circuits is MATLAB.

Index Terms: Triple modular redundancy, Majority & Minority Voter, Fuzzification.

I. INTRODUCTION

During the lifetime of a system, it is tested and diagnosed on numerous occasions. For the system to perform its intended mission with high availability, testing and diagnosis must be quick and effective. A sensible way to ensure this is to specify testing as one of the system functions, in other words, self-test. Reliability, availability, and safety (RAS) are the major factors for consideration in system design to provide continuous correct operation [1]. Since faults cannot be completely eliminated, critical systems always employ fault tolerance techniques to guarantee high reliability and availability. Fault tolerance (FT) techniques try to keep the system operational despite the presence of faults [2]. FT can be achieved by hiding the occurrence of faults and preventing them from generating errors (fault masking), or through fault detection and fault repair.

There are various methods to make a system fault tolerant, but the most basic is the triple modular redundancy (TMR) method, in which the module that has to be made reliable is made redundant by taking three identical modules in parallel, in both hardware and software. The reliability of the system increases because it can give the right output even on failure of one module.

The basic block diagram of a TMR system is shown in Figure 1:

Figure 1

The most important part here is the voting unit, which plays a key role in the reliability of the system. Because the results produced by analog and digital systems differ in nature, the voting unit takes a different form in each. In addition, the voting unit itself is not redundant here, so what happens if it fails? These questions are the subject of this paper.

The rest of this paper is organized as follows. Section II briefly reviews the most common fault tolerant technique, together with the mathematical expression showing how reliability is increased, since this is the basic method for both digital and analog systems. Section III describes fault tolerance in digital circuits and how faults are injected into FPGA circuits for testing. Section IV discusses the fault tolerant technique for analog systems with the help of fuzzy logic. Section V provides the results for both analog and digital circuits. Finally, Section VI explains the future work and scope.

II. TRIPLE MODULAR REDUNDANCY

The basic block diagram of the TMR system is shown above. Let the reliability of a single module be $R_m$. The TMR system gives the correct output if either two or three modules operate correctly, so if the reliability of the system is $R_S$, then

$$R_S = \binom{3}{3} R_m^3 (1 - R_m)^0 + \binom{3}{2} R_m^2 (1 - R_m)^1 = 3R_m^2 - 2R_m^3$$


The reliability of the TMR system is greater than that of a single module if

$$3R_m^2 - 2R_m^3 > R_m$$

So the reliability of the overall system is greater than that of a single module if the single-module reliability satisfies $R_m > 0.5$. There is a single voter unit in the above circuit, so if this voter unit fails the complete circuit fails; it is therefore important to consider the reliability of the voting unit as well.

Let the reliability of voter unit is , so the reliability of

triplicated TMR will be greater than TMR system if:

Or 2

2 > 3 - 2

These are the mathematical conditions for a triplicated system

to be more reliable as compared to TMR system.
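As a numerical check of the 2-out-of-3 reliability expression, the sketch below evaluates 3R² − 2R³ for a few single-module reliabilities. The names `tmr_reliability` and `tmr_with_voter` are hypothetical, and modeling the single voter as a series element with reliability `r_v` is an assumption of this sketch rather than a formula taken from the text.

```python
def tmr_reliability(r_m):
    """Reliability of TMR with an ideal voter: at least 2 of 3 modules work."""
    return 3 * r_m**2 - 2 * r_m**3

def tmr_with_voter(r_m, r_v):
    """Assumed model: a single voter of reliability r_v in series with the TMR core."""
    return r_v * tmr_reliability(r_m)

for r_m in (0.3, 0.5, 0.9):
    print(f"R_m = {r_m}: TMR = {tmr_reliability(r_m):.3f}")
```

Below a module reliability of 0.5 the triplicated system is worse than a single module, at exactly 0.5 the two are equal, and above 0.5 TMR gives a gain.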

III. FAULT TOLERANT DESIGN IN DIGITAL CIRCUITS

Figure 2

The truth table and circuit diagram above (Figure 2) show the basic TMR system. Here the voting unit is not redundant, so if it fails the circuit cannot give the correct output. To make the circuit more reliable, triplicated TMR with minority voting is used; its basic circuit diagram is shown below:

Figure 3

The method of using TMR with only one majority voter

circuit is still flawed; this is because the SEU not only could

affect the redundant modules but can also affect the voting

circuit itself. To alleviate this issue the majority voting circuit

must also be redundant. These redundant majority voters must

be compared using minority voter circuits (figure3). The

minority voters also take in three inputs, the primary path and

two other redundant paths in question. If the primary path is in

the majority with one other redundant path then the output is

low. If the primary path is in the minority in comparison with

the two other redundant paths then the output is high. Figure 4

below is a schematic and truth table of the minority voting

circuit.

Figure 4

This minority voter output is fed into the control signal

of a tri-state buffer with an inverted control input. If the path

in question is the minority then the tri-state buffer will be

placed into high-impedance. If the path in question is in the

majority then its corresponding tri-state buffer will allow the

path to pass through to the output. These three outputs are connected together outside the FPGA in a wired-OR fashion. The figure shows the minority voters controlling the tri-state buffers, which feed the external wired-OR gate.
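The voting scheme described above can be sketched behaviorally as follows. This is a software model, not the VHDL used in the paper: the function names are hypothetical, and `None` is used here to stand for a tri-state buffer in high impedance.

```python
def majority(a, b, c):
    """2-out-of-3 majority vote on single bits."""
    return (a & b) | (b & c) | (a & c)

def minority(primary, r1, r2):
    """Return 1 when the primary path disagrees with both redundant paths."""
    return int(primary != r1 and primary != r2)

def wired_or(driven):
    """Resolve tri-state outputs; None models a high-impedance buffer."""
    return int(any(v == 1 for v in driven if v is not None))

def triplicated_output(v0, v1, v2):
    """Gate each voter output with its minority voter, then wire-OR them."""
    voters = (v0, v1, v2)
    driven = []
    for i, value in enumerate(voters):
        others = [voters[j] for j in range(3) if j != i]
        # A path found to be in the minority is disconnected from the output.
        driven.append(None if minority(value, *others) else value)
    return wired_or(driven)

print(triplicated_output(1, 1, 0))  # the faulty third voter is masked
```

A single upset voter is placed in high impedance, so the output follows the two voters that agree.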

IV. FAULT TOLERANT DESIGN IN ANALOG CIRCUITS

Voting on the results of redundant modules with discrete values is straightforward and is referred to as exact voting. The 3-input exact majority voter, for example, produces a correct output when 2 out of 3 of its inputs are equal. However, exact voting on the results of redundant modules with real-number outputs is not appropriate. For data derived directly from noisy sources, for outputs which are read by digital computers, for the output of replicated remote sensors in fault tolerant data acquisition systems, or for the output of diversely implemented software programs which handle floating point arithmetic, an exact match is generally impossible. Various solutions have been proposed. The most common is the median-selector algorithm, which selects the middle value of the voter inputs and uses this value directly as the voter output. Another solution for handling approximately equal redundant values is the use of inexact (threshold) voters. In this technique, two module outputs are in agreement if the difference between them is less than a threshold value, and in disagreement otherwise. To make this more reliable, a dynamic threshold method is used, in which the threshold value is not fixed but varies according to the module outputs; fuzzy voting falls into this category.
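The two basic inexact voters can be sketched as below. The function names are hypothetical, and returning the mean of the first agreeing pair from the threshold voter is an assumption of this sketch; the text only defines when two outputs agree.

```python
import statistics
from itertools import combinations

def median_voter(x1, x2, x3):
    """Select the middle value of the three module outputs."""
    return statistics.median([x1, x2, x3])

def threshold_voter(values, threshold):
    """Return the mean of the first pair that agrees within the threshold.

    Returns None when no pair of module outputs is in agreement.
    """
    for a, b in combinations(values, 2):
        if abs(a - b) < threshold:
            return (a + b) / 2
    return None

outputs = [1.00, 1.02, 5.00]            # third module is faulty
print(median_voter(*outputs))           # 1.02
print(threshold_voter(outputs, 0.1))    # mean of the agreeing pair
```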

A. Fuzzy Voter

The fuzzy voter described here uses fuzzy logic to calculate the weights of the modules and the final output [10].

The basic block diagram of fuzzy voting is shown in Figure 5:

Figure 5

The final output y will be calculated on the basis of weights of

voter inputs as:

Here the value of weights will lie in the range of [0, 1], Where

0 means that the particular module is completely in

disagreement with other modules while 1 means that module is

in complete agreement with other modules. The membership

of difference of input pairs [8] has been defined as:

The membership of output w has been defined as:

So the weight will be calculated with the fuzzy rules as:

This fuzzy voting mechanism shows better availability and safety than previous methods for small and medium errors, but it is not as effective for large errors. If, instead of taking the parameters [p, q, and r] as constants, we make them variable, the system can show better performance even with larger errors.
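One common way to combine module outputs using weights in [0, 1] is a weighted average. The sketch below assumes that form for the voter output; it does not reproduce the membership functions or fuzzy rules of the scheme in [10], and the name `weighted_vote` is hypothetical. The weights are taken as given.

```python
def weighted_vote(inputs, weights):
    """Weighted average of module outputs; each weight lies in [0, 1]."""
    total = sum(weights)
    if total == 0:
        # Every module is in complete disagreement with the others.
        raise ValueError("no module is trusted; no output can be formed")
    return sum(w * x for w, x in zip(weights, inputs)) / total

# The third module disagrees completely, so it gets weight 0 and is ignored.
print(weighted_vote([1.0, 1.1, 3.0], [1.0, 1.0, 0.0]))
```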

V. RESULTS

The results for digital circuits are as follows:

Implementation Results:

The basic circuit used to demonstrate reliability is an ALU; a single ALU module, a triple modular ALU, and a triplicated TMR ALU have been implemented. The tables below show how much area is utilized on the FPGA board in terms of slices/LUTs.

Faults are injected into the circuit by adding an extra component to the actual circuit so that the logic of the circuit is changed; this is known as the saboteur method.
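The idea of a saboteur can be sketched in software as an extra element placed on a signal path that passes the value through unchanged until it is activated. The function names and the three fault modes below are assumptions for illustration; they are not the VHDL components used in the paper.

```python
def majority(a, b, c):
    """2-out-of-3 majority vote on single bits."""
    return (a & b) | (b & c) | (a & c)

def saboteur(bit, active, mode="invert"):
    """Pass the bit through unless the saboteur is active."""
    if not active:
        return bit
    if mode == "invert":
        return bit ^ 1
    if mode == "stuck0":
        return 0
    if mode == "stuck1":
        return 1
    raise ValueError(f"unknown fault mode: {mode}")

def tmr_with_fault(bit, faulty_module=None, mode="invert"):
    """Three copies of a 1-bit signal, with one optional injected fault."""
    outputs = [saboteur(bit, active=(i == faulty_module), mode=mode)
               for i in range(3)]
    return majority(*outputs)

print(tmr_with_fault(1, faulty_module=2))  # the single fault is masked
```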

Circuit Implementation without TMR:

| XUPV5-LX110T, Speed Grade -3 | Used | Available | Utilization |
|---|---|---|---|
| Number of Slice LUTs | 46 | 69,120 | 1% |
| Number of BUFG/BUFGCTRLs | 1 | 32 | 3% |
| Number of occupied Slices | 22 | 17,280 | 1% |
| Number of bonded IOBs | 53 | 640 | 8% |

Circuit Implementation with TMR:

| XUPV5-LX110T, Speed Grade -3 | Used | Available | Utilization |
|---|---|---|---|
| Number of Slice LUTs | 49 | 69,120 | 1% |
| Number of BUFG/BUFGCTRLs | 1 | 32 | 3% |
| Number of occupied Slices | 27 | 17,280 | 1% |
| Number of bonded IOBs | 53 | 640 | 8% |


Circuit Implementation with Triplicated TMR:

Comparison of maximum path delay:

| XUPV5-LX110T, Speed Grade -3 | Without TMR | With TMR | With Triplicated TMR |
|---|---|---|---|
| Utilized area (slices) | 22 | 27 | 28 |
| Maximum path delay (ns) | 5.143 | 5.618 | 5.150 |

The results for analog circuits are as follows:

A comparison of results of existing fuzzy voter and improved

fuzzy voter has been shown here with the basic formula as:

Performance= (1 - )

The first graph has been plotted for a correct output of 1, with errors injected into the TMR modules in the range [-0.5, 0.5] and a maximum allowed error of 0.1.

The second graph has been plotted for a correct output of 5 and a maximum allowed error of 0.5, with errors injected into the modules in the range [-1, +1]:

Both graphs show that the improved fuzzy logic gives better results than the existing fuzzy logic, even in the presence of larger errors, as shown in the second graph.

VI. FUTURE WORK

The demand for reliability is increasing day by day, even in less critical systems. In future, the survivability approach is expected to dominate efforts to make systems more reliable: even if the system fails, the critical part should not go down. The next step for digital circuits in this project is therefore survivability, while for analog circuits fuzzy logic and genetic algorithms may be combined to make a system more reliable.

REFERENCES

[1] Xilinx, "Two Flows for Partial Reconfiguration: Module Based or Difference Based."

[2] J. C. Baraza, J. Gracia, D. Gil, and P. J. Gil, "A prototype of a VHDL-based fault injection tool: description and application."

[3] Tobias Becker, Wayne Luk, and Peter Y. K. Cheung, "Enhancing Relocatability of Partial Bitstreams for Run-Time Reconfiguration."

[4] F. Lima, C. Carmichael, J. Fabula, R. Padovani, and R. Reis, "A Fault Injection Analysis of Virtex FPGA TMR Design Methodology."

[5] C. Carmichael, "Triple Modular Redundancy Design Techniques for Virtex FPGAs," Xilinx, XAPP197 (v1.0), 2001.

[6] Khaled Elshafey and Ahmed Elhosiny, "On-line testing and diagnosis of microcontrollers."

[7] Fabian Vargas, Alexandre Amory, Raoul, "Estimating Circuit Fault-Tolerance by Means of Transient-Fault Injection in VHDL."

[8] Timothy J. Ross, "Fuzzy logic with engineering applications."

[9] George J. Klir and Bo Yuan, "Fuzzy sets and fuzzy logic theory and applications."

[10] G. Latif-Shabgahi and A. J. Hirst, "A fuzzy voting scheme for hardware and software fault tolerant systems," Fuzzy Sets and Systems 150 (2005) 579–598.

[11] "Fuzzy logic tutorial" from MATLAB.

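Circuit Implementation with Triplicated TMR (resource utilization):

| XUPV5-LX110T, Speed Grade -3 | Used | Available | Utilization |
|---|---|---|---|
| Number of Slice LUTs | 49 | 69,120 | 1% |
| Number of BUFG/BUFGCTRLs | 1 | 32 | 3% |
| Number of occupied Slices | 28 | 17,280 | 1% |
| Number of bonded IOBs | 107 | 640 | 16% |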


CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

VLP0113-1

Floating Point Arithmetic Operations Using VHDL

S. C. Yadav, S. S. Chauhan^1, A. R. Khan^2

^1,2 Electronics & Communication Engg., Graphic Era University, 566/6 Bell Road, Dehradun (India)

[email protected], [email protected], [email protected]

Abstract- In this paper an asynchronous programmable chip capable of performing floating point arithmetic operations has been designed. An asynchronous chip is one in which the operations performed are not clock dependent and hence are faster. The developed chip is operated by loading the proper values into the control and status registers. The result is obtained by reading the result register.

I. INTRODUCTION

A OBJECTIVE

The objective is to design an asynchronous programmable chip capable of performing floating point arithmetic operations based on the IEEE 754-1985 standard.

The complete design of the chip constitutes the

individual modules developed for floating point

addition/subtraction, multiplication and division.

The language of choice is VHDL.

B FLOATING POINT

Floating point systems were developed to provide high resolution over a large dynamic range. A floating point system can often provide a solution when fixed point systems, with their limited precision and dynamic range, fail. Floating point systems comply with the published single or double precision IEEE floating point standard. There are basically two types of IEEE floating point representation.

(1) Single Precision

(2) Double Precision

Single Precision

The IEEE single precision floating point standard

representation requires a 32 bit word, which may be

represented as numbered from 0 to 31, left to right. The

first bit is the sign bit, S, the next eight bits are the

exponent bits, 'E', and the final 23 bits are the fraction

'F':

S EEEEEEEE FFFFFFFFFFFFFFFFFFFFFFF

31 30 23 22 0

A standard floating point word consists of

(1) Sign-Bit (s)

(2) Exponent (e)

(3) Normalized Mantissa (m)

1) Sign Bit

The sign bit is as simple as it gets. 0 denotes a positive

number; 1 denotes a negative number. Flipping the

value of this bit flips the sign of the number.

2) Exponent

The exponent field needs to represent both positive

and negative exponents. To do this, a bias is added to

the actual exponent in order to get the stored

exponent. For IEEE single-precision floating point,

this value is 127. Thus, an exponent of zero means that 127 is stored in the exponent field. A stored value of 200 indicates an exponent of (200-127), or 73. For reasons discussed later, exponents of -127 (all 0s) and +128 (all 1s) are reserved for special numbers. For double precision, the exponent field is 11 bits, and has a bias of 1023.
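The sign, stored exponent, and fraction fields of a single-precision value can be inspected with Python's standard `struct` module. This is a minimal sketch for illustration; the helper name `float_to_fields` is hypothetical.

```python
import struct

BIAS = 127  # single-precision exponent bias

def float_to_fields(x):
    """Split a value, rounded to single precision, into (S, E, F)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    stored_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, stored_exponent, fraction

for value in (1.0, -2.0, 0.15625):
    s, e, f = float_to_fields(value)
    print(f"{value}: S={s} E={e} (actual exponent {e - BIAS}) F={f:023b}")
```

For 1.0 the actual exponent is zero, so the stored exponent field holds the bias, 127.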

3) Mantissa

The mantissa, also known as the significand, represents the

precision bits of the number. It is composed of an

implicit leading bit and the fraction bits.

To find out the value of the implicit leading bit,

consider that any number can be expressed in

scientific notation in many different ways.

In order to maximize the quantity of representable

numbers, floating-point numbers are typically stored

in normalized form. This basically puts the radix point

after the first non-zero digit. In normalized form, five

is represented as 5.0 × 10^0.

A nice little optimization is available to us in base

two, since the only possible non-zero digit is 1. Thus,

we can just assume a leading digit of 1, and don't need

to represent it explicitly. As a result, the mantissa has

effectively 24 bits of resolution, by way of 23 fraction

bits.
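The implicit leading 1 can be made visible by rebuilding the significand from the 23 stored fraction bits. The sketch below is illustrative only and valid for normalized numbers; the helper names are hypothetical.

```python
import struct

def significand(x):
    """Significand 1.F of a normalized single-precision value."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    fraction = bits & 0x7FFFFF
    return 1 + fraction / 2**23  # restore the implicit leading 1

def to_single(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

print(significand(5.0))              # 5 = 1.25 x 2^2
print(to_single(2**24 + 1) == 2**24) # 25 significant bits do not fit
```

The second line shows the 24-bit limit: 2^24 + 1 needs 25 significant bits, so it rounds to 2^24 when stored in single precision.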

Special Values

IEEE reserves exponent field values of all 0s and all

1s to denote special values in the floating-point

scheme.

i) Zero

Zero is not directly representable in the straight format, due to the assumption of a leading 1 (we would need to specify a true zero mantissa to yield a value of zero).

Zero is a special value denoted with an exponent field

of zero and a fraction field of zero. Note that -0 and

+0 are distinct values, though they both compare as

equal.

In particular,

0 00000000 00000000000000000000000 = 0

1 00000000 00000000000000000000000 = -0

ii) Denormalized

If the exponent is all 0’s, but the fraction is non-zero

(else it would be interpreted as zero), then the value is

a denormalized number, which does not have an

assumed leading 1 before the binary point. Thus, this

represents a number (-1)^s × 0.f × 2^(-126), where s is the sign bit and f is the fraction. For double precision, denormalized numbers are of the form (-1)^s × 0.f × 2^(-1022). From this you can interpret zero as a special type of denormalized number.

If 0<E<255 then V=(-1)^S * 2^(E-127) * (1.F), where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.

If E=0 and F is nonzero, then V=(-1)^S * 2^(-126) * (0.F). These are "unnormalized" values.

iii) Infinity

The values +∞ and -∞ are denoted with an exponent of

all 1s and a fraction of all 0s. The sign bit distinguishes

between negative infinity and positive infinity. Being

able to denote infinity as a specific value is useful

because it allows operations to continue past overflow situations. Operations with infinite values are well defined in IEEE floating point.

0 11111111 00000000000000000000000 = Infinity

1 11111111 00000000000000000000000 = -Infinity

iv) Not A Number

The value NaN (Not a Number) is used to represent a

value that does not represent a real number. NaN's are

represented by a bit pattern with an exponent of all 1s

and a non-zero fraction.

0 11111111 00000100000000000000000 = NaN

1 11111111 00100010001001010101010 = NaN

There are two categories of NaN: QNaN (Quiet NaN)

and SNaN (Signalling NaN).

a) QNaN is a NaN with the most significant

fraction bit set. QNaN's propagate freely

through most arithmetic operations. These

values pop out of an operation when the result

is not mathematically defined.

b) SNaN is a NaN with the most significant

fraction bit clear. It is used to signal an

exception when used in operations. SNaN's can

be handy to assign to uninitialized variables to

trap premature usage.

Semantically, QNaN's

denote indeterminate operations, while SNaN's denote

invalid operations.

Summary:

The value V represented by the word may be

determined as follows:

If E=255 and F is nonzero, then V=NaN ("Not a number")

If E=255 and F is zero and S is 1, then V=-Infinity

If E=255 and F is zero and S is 0, then V=Infinity

If 0<E<255 then V=(-1)^S * 2^(E-127) * (1.F), where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.

If E=0 and F is nonzero, then V=(-1)^S * 2^(-126) * (0.F). These are "unnormalized" values.

If E=0 and F is zero and S is 1, then V=-0

If E=0 and F is zero and S is 0, then V=0

In particular,

0 00000000 00000000000000000000000 = 0

1 00000000 00000000000000000000000 = -0

0 11111111 00000000000000000000000 = Infinity

1 11111111 00000000000000000000000 = -Infinity

0 11111111 00000100000000000000000 = NaN

1 11111111 00100010001001010101010 = NaN

0 10000000 00000000000000000000000 = +1 * 2^(128-127) * 1.0 = 2

0 00000001 00000000000000000000000 = +1 * 2^(1-127) * 1.0 = 2^(-126)

0 00000000 10000000000000000000000 = +1 * 2^(-126) * 0.1 = 2^(-127)

0 00000000 00000000000000000000001 = +1 * 2^(-126) * 0.00000000000000000000001 = 2^(-149) (smallest positive value)
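The decoding rules for a single-precision word translate directly into a short function. The sketch below is for illustration, with a hypothetical name `decode_single`; it takes the 32-bit pattern as an integer and returns a Python float.

```python
import math

def decode_single(bits):
    """Decode a 32-bit IEEE single-precision pattern into a Python float."""
    s = bits >> 31
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if e == 255:
        # All-ones exponent: NaN if the fraction is nonzero, else infinity.
        return math.nan if f else sign * math.inf
    if e == 0:
        # Zero or denormalized: no implicit leading 1, exponent fixed at -126.
        return sign * (f / 2**23) * 2.0**-126
    # Normalized: implicit leading 1 and biased exponent.
    return sign * (1 + f / 2**23) * 2.0**(e - 127)

print(decode_single(0x40000000))  # 0 10000000 000...0
print(decode_single(0x00000001))  # smallest positive denormalized value
```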

Special Operations

Operations on special numbers are well-defined by

IEEE. In the simplest case, any operation with a NaN yields a NaN result. Other operations are as follows:

Table 1: Special Operations in floating point

| Operation | Result |
|---|---|
| n ÷ ±Infinity | 0 |
| ±Infinity × ±Infinity | ±Infinity |
| ±nonzero ÷ 0 | ±Infinity |
| Infinity + Infinity | Infinity |
| ±0 ÷ ±0 | NaN |
| Infinity – Infinity | NaN |
| ±Infinity ÷ ±Infinity | NaN |
| ±Infinity × 0 | NaN |
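Several of these special operations can be observed with ordinary Python floats, which are IEEE double precision. Note that Python raises `ZeroDivisionError` for a float division by zero instead of returning ±Infinity or NaN, so the two division-by-zero rows are not reproduced in this sketch.

```python
import math

inf = math.inf

results = {
    "1 / inf": 1.0 / inf,
    "inf * inf": inf * inf,
    "inf + inf": inf + inf,
    "inf - inf": inf - inf,
    "inf / inf": inf / inf,
    "inf * 0": inf * 0.0,
}

for expression, value in results.items():
    print(f"{expression} = {value}")
```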

Double Precision

The IEEE double precision floating point standard

representation requires a 64 bit word, which may be

represented as numbered from 0 to 63, left to right.

The first bit is the sign bit, S, the next eleven bits are

the exponent bits, 'E', and the final 52 bits are the

fraction 'F'.

The value V represented by the word may be

determined as follows:

If E=2047 and F is nonzero, then V=NaN ("Not a number")

If E=2047 and F is zero and S is 1, then V=-Infinity

If E=2047 and F is zero and S is 0, then V=Infinity

If 0<E<2047 then V=(-1)^S * 2^(E-1023) * (1.F), where "1.F" is intended to represent the binary number created by prefixing F with an implicit leading 1 and a binary point.

If E=0 and F is nonzero, then V=(-1)^S * 2^(-1022) * (0.F). These are "unnormalized" values.

If E=0 and F is zero and S is 1, then V=-0

If E=0 and F is zero and S is 0, then V=0
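The double-precision fields can be extracted in the same way as for single precision, with an 11-bit exponent and a 52-bit fraction. A minimal sketch, with a hypothetical helper name:

```python
import struct

def double_to_fields(x):
    """Split a Python float (IEEE double) into (S, E, F)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    stored_exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, stored_exponent, fraction

print(double_to_fields(1.0))   # stored exponent equals the bias, 1023
print(double_to_fields(-0.5))
```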

II WORKING PRINCIPLE

A MY CHIP

The chip has two unidirectional 32-bit buses, one for the input and one for the output, and a 2-bit address bus for selecting the desired register in the chip.

Signal description:

1) r/w ( read/write) signal to perform the read or

write operation . A high indicates read

operation and the low indicates write

operation.

2) rst (Reset) signal to reset the chip contents.

3) Int (Interrupt) signal to interrupt the processor

about some abnormality in the functioning of

the chip.

Fig. 1 Block Diagram of My Chip

Control Register:

IE | X | X | Mode | X | Op2 | Op1 | Op0

Fig. 2 Control Register format.

The flags of the control register are defined as:

IE stands for Interrupt Enable. When this flag is low (0) no interrupt is generated, and when this flag is high (1) an interrupt is generated under certain conditions.

Mode flag is low when being used for signed operation and high when being used for floating point operation.

X represents a don't care condition.

The operation to be performed by the chip is selected using the last three bits of the control register.

Table 2: Opcodes for various operations.

| Op2 | Op1 | Op0 | Operation selected |
|---|---|---|---|
| 0 | 0 | 0 | Addition |
| 0 | 0 | 1 | Subtraction |
| 0 | 1 | 0 | Multiplication |
| 0 | 1 | 1 | Division |
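A control word for the chip could be assembled as in the sketch below. It assumes the control register is 8 bits wide with the fields listed from most significant to least significant bit (IE in bit 7, Mode in bit 4, Op2..Op0 in bits 2..0); that bit ordering, the don't-care bits being written as 0, and the helper name `control_word` are assumptions of this sketch.

```python
OPCODES = {"add": 0b000, "sub": 0b001, "mul": 0b010, "div": 0b011}

def control_word(op, floating_point=True, interrupt_enable=False):
    """Build an 8-bit control word: IE X X Mode X Op2 Op1 Op0 (assumed MSB first)."""
    return (int(interrupt_enable) << 7) | (int(floating_point) << 4) | OPCODES[op]

print(f"{control_word('mul'):08b}")                          # floating point multiply
print(f"{control_word('div', interrupt_enable=True):08b}")   # divide with interrupts
```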

Status Register

F1F | F2F | RF | NAN | OF | UF | DE | Z

Fig. 3 Status Register Format

The flags of the status register are defined as:

F1F flag is high when operand 1 is loaded on the chip.

F2F flag is high when operand 2 is loaded on the chip.

RF flag when high indicates the completion of the

selected operation by the chip.

NAN flag is high when the content of the result

register is wrong i.e. NaN (not a number) condition

has been encountered.

OF flag is high when the content of the result register

exceeds the higher bound limit.

UF flag is high when the content of the result register

crosses the lower bound limit or when a denormalized

number is encountered .

DE flag is high when division by zero (0) error

occurs.

Z flag is high when the result of the operation is zero.
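A value read from the status register could be unpacked into named flags as follows. The sketch assumes an 8-bit register with the flags listed from most significant to least significant bit (F1F in bit 7 down to Z in bit 0); that ordering and the helper name `decode_status` are assumptions.

```python
STATUS_FLAGS = ("F1F", "F2F", "RF", "NAN", "OF", "UF", "DE", "Z")

def decode_status(value):
    """Map an 8-bit status value to a dict of flag name -> bool (assumed MSB first)."""
    return {name: bool((value >> (7 - i)) & 1)
            for i, name in enumerate(STATUS_FLAGS)}

status = decode_status(0b11100001)
print([name for name, is_set in status.items() if is_set])
```

In this example both operands are loaded, the operation has completed, and the result is zero.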

Register Mapping

Table 3: Access codes for registers in my_chip module

| Read/write | Address Bus | Register |
|---|---|---|
| X | 00 | F1 |
| X | 01 | F2 |
| 1 | 10 | RES |
| 0 | 10 | Control Register |
| X | 11 | Status Register |

When address bus is loaded with 00 then register F1 is

port mapped for read or write operation. The mapping

of register F2 has been done using address bus code

01.

The optimization of address bus has been done for the

code 10 where RES register is mapped only for read

operation and control register only for write operation.

Status register has been portmapped for address bus

code 11.
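The register mapping can be expressed as a small lookup, shown here as an illustrative software model with a hypothetical function name. Address code 10 is shared: a read selects RES and a write selects the control register.

```python
def select_register(address, read):
    """Return the register selected by the 2-bit address and the r/w signal.

    read=True models r/w high (read), read=False models r/w low (write).
    """
    if address == 0b00:
        return "F1"
    if address == 0b01:
        return "F2"
    if address == 0b10:
        return "RES" if read else "CONTROL"
    if address == 0b11:
        return "STATUS"
    raise ValueError("address must be a 2-bit value")

print(select_register(0b10, read=True))   # RES
print(select_register(0b10, read=False))  # CONTROL
```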

B FLOATING POINT ARITHMETIC OPERATION

B.1 Addition & Subtraction:

Addition and Subtraction are performed

using module fp_ads. Steps to perform the

addition & subtraction operation are:

Step 1: Compare the exponents and right-shift the mantissa of the smaller number by the difference between the two exponents. If the exponents are equal, compare the mantissas of the two numbers to find the larger one.

Step 2: If the sign bits of both numbers are the same, add the aligned mantissas; otherwise subtract the smaller from the larger. The result takes the exponent of the larger number.

Step 3: Resolve any negative exponent abnormality by shifting the result by the required number of bits. Boundary conditions are then checked to see whether the result has encountered an overflow error.
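The align, add and normalize sequence can be sketched in software. The following is a minimal Python model, not the fp_ads module itself. It assumes IEEE-754 single-precision bit patterns, normalized operands, and truncation (no rounding) of bits that are shifted out; NaN, overflow and underflow handling are omitted. The names `unpack` and `fp_add` are hypothetical.

```python
def unpack(bits):
    """Split a 32-bit pattern into sign, biased exponent and 24-bit mantissa."""
    sign = (bits >> 31) & 1
    exponent = (bits >> 23) & 0xFF
    mantissa = (bits & 0x7FFFFF) | 0x800000  # restore the hidden bit (normalized input assumed)
    return sign, exponent, mantissa


def fp_add(a, b):
    """Add two single-precision bit patterns; special cases are not handled."""
    sa, ea, ma = unpack(a)
    sb, eb, mb = unpack(b)
    # Step 1: make operand a the larger magnitude, then align the smaller mantissa.
    if (ea, ma) < (eb, mb):
        sa, ea, ma, sb, eb, mb = sb, eb, mb, sa, ea, ma
    mb >>= ea - eb  # truncating right shift
    # Step 2: add mantissas for equal signs, subtract otherwise.
    mantissa = ma + mb if sa == sb else ma - mb
    if mantissa == 0:
        return 0
    exponent = ea
    # Step 3: normalize so that the hidden bit sits at bit 23.
    while mantissa >= 1 << 24:
        mantissa >>= 1
        exponent += 1
    while mantissa < 1 << 23:
        mantissa <<= 1
        exponent -= 1
    return (sa << 31) | (exponent << 23) | (mantissa & 0x7FFFFF)


# Operands from the FP addition simulation example (4444.44 + 5555.56).
print(hex(fp_add(0x458AE385, 0x45AD9C7A)))  # 0x461c3fff
```

With truncation, the sketch reproduces the HEX result listed for RES in the addition example; a round-to-nearest adder would give 0x461C4000 instead.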

B.2 Multiplication

Multiplication of floating point numbers is performed using module fp_mul. The steps for the multiplication operation are as follows:

Step 1: When two numbers with the same base are multiplied, their powers are added. Similarly, the exponents of the two operands are added here.

Step 2: The Booth multiplication (shift and add) technique is used to multiply the mantissas of the two numbers, including the 'hidden bit'. The result of the mantissa multiplication is saved in a 49 bit temporary register.

Step 3: The negative exponent abnormality is removed to obtain the mantissa and exponent of the result.
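The multiplication steps can be sketched in the same way. This Python model is an illustration under stated assumptions, not the fp_mul module: it uses a plain shift-and-add loop rather than Booth recoding, removes the IEEE-754 single-precision bias of 127 after adding the exponents, truncates the product instead of rounding, and ignores special cases. The names `shift_add_multiply` and `fp_mul` are hypothetical.

```python
def shift_add_multiply(x, y, width=24):
    """Multiply two unsigned integers by adding shifted copies of x for each set bit of y."""
    product = 0
    for i in range(width):
        if (y >> i) & 1:
            product += x << i
    return product


def fp_mul(a, b):
    """Multiply two normalized single-precision bit patterns (no rounding, no special cases)."""
    sign = ((a >> 31) ^ (b >> 31)) & 1
    # Step 1: add the exponents and remove one bias.
    exponent = ((a >> 23) & 0xFF) + ((b >> 23) & 0xFF) - 127
    # Step 2: multiply the 24-bit mantissas, hidden bit included.
    ma = (a & 0x7FFFFF) | 0x800000
    mb = (b & 0x7FFFFF) | 0x800000
    product = shift_add_multiply(ma, mb)  # up to 48 bits wide
    # Step 3: normalize and keep the top 24 bits of the product.
    if product >> 47:
        product >>= 1
        exponent += 1
    mantissa = (product >> 23) & 0x7FFFFF
    return (sign << 31) | (exponent << 23) | mantissa


# Operands from the FP multiplication simulation example (148.75 x 1092.86).
print(hex(fp_mul(0x4314C000, 0x44889B85)))  # 0x481ec0bb
```

The printed value matches the HEX result listed for RES in the multiplication example.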

B.3 Division

Division is performed using module fp_div, which uses a fixed point division technique. The steps to divide p by q, both of n+1 bits, are as follows:

Step 1: Store the numbers p and q in temporary registers p_temp and q_temp, respectively, of 2n+1 bits each.

Step 2: Compare the values of p_temp and q_temp. If p_temp ≥ q_temp, subtract q_temp from p_temp, store 1 in the quotient register and move to the next iteration. If p_temp < q_temp, store 0 in the quotient register and move to the next iteration.

Step 3: After n+1 iterations, the quotient is held in the quotient register and the remainder in p_temp.
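The compare-and-subtract loop can be sketched as follows. The steps do not spell out how the operands are aligned between iterations, so this Python sketch assumes the usual arrangement: q_temp starts as q shifted left by n bits (which is why 2n+1 bits are needed) and is shifted right by one bit after every iteration. The function name `fixed_point_divide` is hypothetical.

```python
def fixed_point_divide(p, q, n):
    """Divide p by q, both (n+1)-bit unsigned integers; return (quotient, remainder)."""
    if q == 0:
        raise ZeroDivisionError("division by zero (the DE flag case)")
    p_temp = p
    q_temp = q << n          # align the divisor with the top of the dividend
    quotient = 0
    for _ in range(n + 1):
        quotient <<= 1
        if p_temp >= q_temp:
            p_temp -= q_temp
            quotient |= 1    # store 1 in the quotient register
        q_temp >>= 1         # move to the next bit position
    return quotient, p_temp


print(fixed_point_divide(13, 3, 3))  # (4, 1)
```

For 4-bit operands (n = 3), dividing 13 by 3 takes four iterations and leaves quotient 4 and remainder 1.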

There are three components used in this design:

i) fp_ads used for floating point addition and

subtraction operation.

ii) fp_mul used for floating point multiplication

operation.

iii) fp_div used for floating point division

operation.

III PROGRAMMING THE CHIP

Chip programming consists of the following steps, which must be followed for the chip to function efficiently:

Step 1: The chip is made available for floating point arithmetic operations by making the rst (reset) signal low. At this point, all contents of the chip registers are erased and the chip is ready for a new computation.

Step 2: To load the first operand onto the chip, the register mapping is set up by making the read/write signal low and loading the address bus with 00.

Step 3: To load the second operand onto the chip, the read/write signal is kept low while the address bus is changed to 01 for the required register mapping.

Step 4: To select the operation to be performed, the last three bits of the control register are set while the address bus indicates 11 and the read/write signal is low. Refer to Table 2 for the opcodes of the various operations.

Step 5: A start signal is generated after checking the F1F and F2F flags of the status register, to commence the selected operation, while the address bus shows 10 and the read/write signal is low.

Step 6: Completion of the operation is confirmed by checking the RF flag of the status register, which should be high on successful completion, while the read/write signal is high and the address bus indicates 10.

Step 7: The result of the arithmetic operation is viewed by checking the dataout signal while the read/write signal is high and the address bus indicates 11.

Step 8: The previously entered input values can be viewed by keeping the read/write signal high with the address bus at 00 for operand 1 and 01 for operand 2.

IV RESULTS AND DISCUSSIONS

A ADDITION

Fig. 4: Floating Point Addition Simulation

Table 5: Simulation example for FP addition.

      Base 10   Sign bit  Exponent bits  Mantissa bits                  HEX equivalent
F1    4444.44   0         1000 1011      0001 0101 1100 0111 0000 101   458AE385
F2    5555.56   0         1000 1011      0101 1011 0011 1000 1111 010   45AD9C7A
RES   10000     0         1000 1100      0011 1000 1000 0000 0000 000   461C3FFF

B SUBTRACTION

Fig. 5:Floating Point Subtraction Simulation

Table 6: Simulation example for FP subtraction.

      Base 10   Sign bit  Exponent bits  Mantissa bits                  HEX equivalent
F1    85.73     0         1000 0101      0101 0111 0000 1010 0011 111   42AB8517
F2    49.96     1         1000 0100      1000 1111 1010 1110 0001 010   C247D70A
RES   35.80     0         1000 1100      0001 1110 0110 0110 0110 011   420F3334

C MULTIPLICATION

Fig. 6: Floating Point Multiplication Simulation

Table 7: Simulation example for FP multiplication.

      Base 10      Sign bit  Exponent bits  Mantissa bits                  HEX equivalent
F1    148.75       0         1000 0110      0010 1001 1000 0000 0000 000   4314C000
F2    1092.86      0         1000 1001      0001 0001 0011 0111 0000 101   44889B85
RES   162562.925   0         1001 0000      0011 1101 1000 0001 0111 011   481EC0BB

D DIVISION

Fig. 7: Floating Point Division Simulation

Fig.8. Simulation of My_chip

V CONCLUSION

Floating point operations are widely used in digital signal processing applications and can be implemented using PDPs (Programmable Digital Processors). However, the complex computations involved require a large amount of data processing, which affects the cost, speed and flexibility of DSP systems. In this paper, floating point arithmetic operations have been successfully simulated using ModelSim.

Future Aspects of project

Future aspects should include the following:

1) Fast Fourier Transform computation.

2) Digital Signal Processing.

3) Infinite Impulse Response (IIR) and Finite

Impulse Response (FIR) filter design.

4) Digital Image Processing.

REFERENCES

[1] Digital System Design Using VHDL by Charles H. Roth

Jr.

[2] The Design Warrior’s Guide to FPGA by Clive ‘Max’

Maxfield.

[3] FPGA Based System Design by Wayne Wolf.

[4] A VHDL Primer by Jayaram Bhaskar.

[5] Circuit Design With VHDL by Volnei A. Pedroni.

VLP0114-1

A Complete CMOS Based Low Power Supply Bandgap Voltage Reference Circuit Implemented On TSMC 0.35-μm Process

Kshitij Bhargava #1, Kirmender Singh *2

ECE Department (Microelectronics And Embedded Technology)

Jaypee Institute Of Information Technology University, Noida, India

[email protected]@gmail.com

[email protected]

Abstract: A complete CMOS based low power supply bandgap voltage reference circuit implemented on the TSMC 0.35 μm CMOS process is presented in this paper. The designed circuit employs a start-up circuit, a beta-multiplier circuit (PTAT circuit) and a MOS based differential amplifier. The circuit provides a nominal reference voltage of 323 mV at a 2 V supply voltage. Experimental results show that the temperature coefficient is 1.16 ppm/ºC in the temperature range from -20 ºC to +90 ºC. The PSRR achieved without any filtering capacitor is -21 dB at 10 kHz. The area occupied by the design is 0.027 mm² and the power consumption is 62.24 μW at room temperature (25 ºC).

Keywords: Bandgap voltage reference, PTAT, CMOS, PSRR.

1. INTRODUCTION

The high-precision voltage reference circuit is an important component in mixed-mode applications. A stable reference circuit provides a reliable reference voltage, and a low supply voltage makes integration with low voltage analog and digital circuits possible. Such reference circuits should exhibit little dependence on process, supply voltage, and temperature (PVT) variations. With steadily decreasing power supply voltages in deep submicron CMOS technologies, the design of any on-chip voltage/current reference becomes a non-trivial task. Numerous approaches to achieve a voltage reference with low supply drift as well as low temperature drift have been proposed to date. However, most of them use BJT devices implemented in a standard CMOS process [1-3], which occupy a large wafer area. Moreover, some implementations use non-standard CMOS processes, which cost more owing to extra process steps [4-5].

This paper presents a complete MOS based bandgap voltage reference circuit. It follows the same general working principle, in which positive and negative temperature coefficient voltages cancel each other to give a reference voltage with a near-zero temperature coefficient, along with a suitable technique to minimize the power supply dependence of this reference voltage [6].

The major parts of the circuit are a start-up circuit, a beta-multiplier circuit made up of NMOS and PMOS current mirror circuits, and a differential amplifier to enhance the power supply rejection capability of the reference voltage.

Section 2 describes the proposed voltage reference circuit design along with a detailed description of its subparts, viz. the start-up circuit, the beta-multiplier circuit and the differential amplifier circuit.

Section 3 presents the experimental results.

Section 4 concludes the paper.

2. Proposed Reference Circuit

Figure 1: The Proposed Reference Circuit

2.1 Start-up Circuit

In any self-biased circuit, the current flowing in the circuit is zero when the power supply is first turned on. In this circuit, at that moment the gates of M1 and M2 are at ground while those of M3 and M4 are at VDD. This forces the IPTAT current to be zero initially. Since this voltage reference can be used as a precision supply voltage in many analog circuits, this unwanted state of the reference circuit can lead to undesired operating points of the transistors. Thus, a start-up circuit is required to turn on transistors M1 and M2 in the initial moments of circuit operation.

The proposed circuit uses a start-up circuit consisting of transistors MU1, MU2 and MU3. When the supply voltage VDD is first turned on, the gate of MU1 is at zero potential, so it is in the off state. At this moment the gate terminal of MU2 is somewhere between VDD and VDD – Vth,p. The transistor MU3 acts like an NMOS switch and leaks current from the gates of M3 and M4 into the gates of M1 and M2, producing the desired value of IPTAT right from the start of circuit operation. Once all the transistors settle to stable operating points, the start-up circuit automatically stops functioning because MU1 starts conducting, which turns MU3 off. This is very important, since the start-up circuit should not obstruct the normal operation of the beta-multiplier circuit (explained in the next subsection).

2.2 Beta-Multiplier Circuit (PTAT Circuit)

The basic building block of any bandgap voltage reference circuit is a current mirror circuit. The proposed circuit has an NMOS current mirror stacked just below a PMOS current mirror. The purpose of this configuration is explained below.

To obtain the desired value of the IPTAT current, the same current must be forced through M1 and M2. This can be achieved by using a PMOS current mirror. We can write,

$$V_{GS1} = V_{GS2} + I_{PTAT} R_{out} \quad (1)$$

and

$$I_{PTAT} = \frac{2}{R_{out}^{2}\,\beta_1}\left[1 - \sqrt{\frac{\beta_1}{\beta_2}}\right]^{2} \quad (2)$$

where

$$\beta = \mu_n C_{ox} (W/L) \quad (3)$$

Equation (1) holds only if VGS1 > VGS2. To ensure this, a beta-multiplier circuit is used, which increases the transistor gain β of M2. This is generally achieved by simply increasing the width of transistor M2 such that W2 = K·W1. This helps in achieving the desired value of IPTAT current even at a low gate-to-source voltage of M2.
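Equations (2) and (3) can be evaluated numerically to see how the width ratio K sets the current. The Python sketch below is illustrative only: it reads the entries of Table 1 as W/L ratios (M1 = 50/2, M2 = 210/2, Rout = 8k), and the process transconductance `MU_COX` is a hypothetical value that does not come from the paper.

```python
import math


def beta(mu_cox, width, length):
    """Transistor gain factor from equation (3): beta = mu_n * C_ox * (W / L)."""
    return mu_cox * width / length


def iptat(r_out, beta1, beta2):
    """PTAT current from equation (2)."""
    return 2.0 / (r_out ** 2 * beta1) * (1.0 - math.sqrt(beta1 / beta2)) ** 2


MU_COX = 170e-6              # A/V^2, hypothetical process value (not from the paper)
b1 = beta(MU_COX, 50, 2)     # M1, assuming Table 1 lists W/L
b2 = beta(MU_COX, 210, 2)    # M2, so K = W2/W1 = 4.2
print(f"IPTAT = {iptat(8e3, b1, b2) * 1e6:.2f} uA")
```

Because only the ratio β1/β2 appears inside the square root, the bracketed term depends on K alone, while Rout and β1 set the overall scale of the current.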

2.3 Reference Voltage Generation Principle

The reference voltage is generated by adding two voltages, one with a positive temperature coefficient and the other with a negative temperature coefficient. The drop across resistor Rout, i.e. VPTAT, provides the positive temperature coefficient voltage, and the drain-to-source voltage (VDS5) of a diode connected NMOS transistor M5 provides the negative temperature coefficient voltage. Together, these two opposite temperature coefficient voltages give a reference voltage with a very small temperature coefficient. Mathematically, the reference voltage can be expressed as

$$V_{REF} = V_{PTAT} + V_{DS5} \quad (4)$$

and

$$\frac{\partial V_{REF}}{\partial T} = \frac{\partial V_{PTAT}}{\partial T} + \frac{\partial V_{DS5}}{\partial T} \quad (5)$$

2.4 Differential Amplifier

To reduce the sensitivity of the reference voltage to power supply variation (PSRR improvement), the variations in the drain-to-source voltages of devices M1 and M2 with changes in VDD must be reduced. For this purpose a MOS based differential amplifier is used, whose output is connected to the common gate terminal of M3 and M4.

The differential amplifier compares the drain voltages of M1 and M2 and regulates them to become equal.

Figure 2 : Differential Amplifier Circuit

TABLE 1: Component Values Of Proposed Reference Circuit

Component  Value
MU1        50/2
MU2        10/20
MU3        10/1
M1         50/2
M2         210/2
M3         100/2
M4         100/2
M5         2.85/0.35
Rout       8k

3. EXPERIMENTAL RESULTS

The proposed temperature insensitive voltage reference circuit shown in Figure 1 generates a voltage of 323 mV at room temperature (25 ºC). Figure 3 shows the variation of the reference voltage with temperature over the range -55 ºC to +125 ºC. Figure 4 shows the variation of the reference voltage with temperature at three corner conditions, viz. fast corner (FF), typical (TT) and slow corner (SS). The circuit operates at a low supply voltage of 2 V. The temperature coefficient of the reference voltage is only 1.16 ppm/ºC within the temperature range of -20 ºC to +90 ºC, and the PSRR is -21 dB at 10 kHz. The power consumption of the circuit is 62.24 μW. The area occupied by the design on the silicon wafer is 0.027 mm².
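To put the quoted temperature coefficient in perspective, it can be related to the total voltage spread over the temperature range. The paper does not state how the coefficient was computed, so the Python sketch below assumes the common box method (maximum minus minimum voltage, divided by the mean voltage and the temperature span). The sample voltages are hypothetical and are chosen only to reproduce a value near 1.16 ppm/ºC.

```python
def temp_coefficient_ppm(v_samples, t_min, t_max):
    """Box-method temperature coefficient in ppm per degree Celsius."""
    v_mean = sum(v_samples) / len(v_samples)
    spread = max(v_samples) - min(v_samples)
    return spread / (v_mean * (t_max - t_min)) * 1e6


# Hypothetical sweep around the 323 mV nominal value, from -20 to +90 degrees Celsius.
samples = [0.3229794, 0.3230000, 0.3230206]
print(f"{temp_coefficient_ppm(samples, -20.0, 90.0):.2f} ppm/degC")
```

Under this definition, 1.16 ppm/ºC over a 110 ºC span corresponds to a total variation of only about 41 μV around 323 mV.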

Figure 3: Reference Voltage Versus Temperature Curve

Table 2: Performance Summary of the Proposed Design

Parameter                         [7]               [8]               This Work
Technology (μm)                   0.6               0.5               0.35
Supply Voltage                    1.4 V             2.6 V             2 V
Reference Voltage                 0.309 V           1.21 V            0.323 V
Temperature Coefficient (ppm/ºC)  36.9              613               1.16
PSRR                              -47 dB at 100 Hz  -30 dB at 100 Hz  -21 dB at 10 kHz
Active Area (mm²)                 0.055             0.045             0.027

Figure 4: Reference Voltage under the three corner conditions

4. CONCLUSIONS

A high precision temperature insensitive voltage reference circuit has been presented in this paper. The circuit was designed using TSMC 0.35 μm CMOS technology and experimental results were illustrated. They show that the proposed circuit can provide a stable reference voltage of 323 mV within the temperature range -20 ºC to +90 ºC, with a power supply rejection of -21 dB at 10 kHz. The proposed reference circuit thus provides a stable reference voltage with very small temperature drift. Such a circuit can be used in applications that require a stable voltage reference, such as MEMS based temperature sensors and low dropout regulators.

REFERENCES

[1] Karel E. Kuijk, “A Precision Reference Voltage Source”, IEEE Journal Of Solid-State Circuits, Vol. SC-8, No. 3, June 1973, pp. 222-226.

[2] Allen, P.E. & Holberg, D.R. (2002). “CMOS Analog Circuit Design”. New York: Oxford.

[3] Matthew C. Guyton and Hae-Seung Lee, MIT, “Bandgap Current Reference”, March 2003.

[4] Lee, I., Kim, G., & Kim, W. (1994). “Exponential curvature compensated BiCMOS bandgap reference”, IEEE Journal Of Solid-State Circuits, 29, 1396-1403.

[5] Malcovati, P., Maloberti, F., Fiocchi, C., Pruzzi, M. (2001). “Curvature-compensated BiCMOS bandgap with 1-V supply voltage”, IEEE Journal Of Solid-State Circuits, 36(7), 1076-1081.

[6] Allen-Holberg, “CMOS Analog Circuit Design”, Second Edition.

[7] Stair, R., Connelly, J.A., & Pulkin, M. (2000). “A Current Mode CMOS Voltage Reference”. In proceedings of Southwest Symposium on Mixed-Signal Design (pp. 23-26).

[8] Kimberly Jane S. Udy, Patricia Angela Reyes-Abu and Wen Yaw Chung, “A High Precision Temperature Insensitive Current And Voltage Reference Generator”. In proceedings of World Academy Of Science, Engineering and Technology 2009.

VLP0115-1

Performance Analysis of Carbon Nanotube FET

Harish Kr. Mishra 1, S.P. Gangwar 2, Dr. Harsh V. Singh 3

1 M.Tech. Student, Department of Electronics Engineering, KNIT, Sultanpur-228 118

2,3 Assistant Professor, Department of Electronics Engineering, KNIT, Sultanpur-228 118

Phone No. (+91)9415177465, (+91)9515763939 Email: [email protected], [email protected]

Abstract: We study field-effect transistors based on individual single- and multi-wall carbon nanotubes and analyze their performance. Transport through the nanotubes is dominated by holes. By varying the gate voltage, we successfully modulated the conductance of a single-wall device by more than 5 orders of magnitude. Multi-wall nanotubes typically show no gate effect.

Keywords: Carbon nanotubes, Semiconductor, Singlewall Nanotube, Multiwall Nanotube, FET

Carbon nanotubes (NTs) are a new form of carbon with unique electrical and mechanical properties [1]. They can be considered as the result of folding graphite layers into carbon cylinders and may be composed of a single shell (single-wall nanotubes, SWNTs) or of several shells (multi-wall nanotubes, MWNTs). Depending on the folding angle and the diameter, nanotubes can be metallic or semiconducting.

The band gap of semiconducting NTs decreases with increasing diameter. In this paper we report on the fabrication and performance of a SWNT-based FET and explore whether MWNTs can be utilized as the active element of carbon-based FETs. Despite their large diameter, we find that structurally deformed MWNTs may well be employed in NT-FETs, based on the output and transfer characteristics of our NT devices.

The SWNTs used in our study were produced by laser ablation of graphite doped with cobalt and nickel catalysts [7]. For cleaning, the SWNTs were ultrasonically treated in an H2SO4/H2O2 solution. MWNTs were produced by an arc-discharge evaporation technique [8] and used without further treatment. The NTs were dispersed by sonication in dichloroethane and then spread on a substrate with predefined electrodes. A schematic cross section of a NT device is shown in Fig. 1.

The devices consist of either an individual SWNT or MWNT bridging two electrodes deposited on a 140 nm thick gate oxide film on a doped Si wafer, which is used as a back gate. The 30 nm thick Au electrodes were defined using electron beam lithography. For imaging, we used an atomic force microscope operating in the noncontact mode.

The source–drain current I through the NTs was measured at room temperature as a function of the bias voltage VSD and the gate voltage VG. Figure 2a shows the output characteristics I–VSD of a device consisting of a single SWNT with a diameter of 1.6 nm for several values of the gate voltage. At VG = 0 V, the I–VSD curve is linear with a resistance of R = 2.9 MΩ. For VG < 0 V, the I–VSD curves remain linear, whereas they become increasingly nonlinear for VG > 0 V, up to a point where the current becomes unmeasurably small, indicating a controllable transition between a quasi-metallic and an insulating state of the NT. Figure 2b shows the transfer characteristics I–VG of our NT device for different source–drain voltages.

The behavior is similar to that of a p-channel metal oxide semiconductor FET [9]. The source–drain current decreases strongly with increasing gate voltage, which demonstrates not only that the NT device operates as a field effect transistor but also that transport through the semiconducting SWNT is dominated by positive carriers (holes).

The conductance modulation of our SWNT-FET exceeds 5 orders of magnitude. For VG < 0 V, the I–VG curves saturate, indicating that the contact resistance RC at the metal electrodes starts to dominate the total resistance R = RNT + 2RC of the device. Here, RNT denotes the gate-dependent resistance of the NT. The saturation value of the current corresponds to RC ≈ 1.1 MΩ. Similar contact resistances were previously found for metallic SWNTs [4]. The origin of the holes is an important question to address. One possibility is that the carrier concentration is inherent to the NT.

FIG.1. Schematic cross section of the FET devices. A single NT of either MW or SW type bridges the gap between two gold electrodes.

FIG. 2. Output and transfer characteristics of a SWNT-FET: (a) I–VSD curves measured for VG = −6, 0, 1, 2, 3, 4, 5, and 6 V. (b) I–VG curves for VSD = 10–100 mV in steps of 10 mV. The inset shows that the gate modulates the conductance by 5 orders of magnitude (VSD = 10 mV).

The higher work function of gold leads to the generation of holes in the NT by electron transfer from the NT to the gold electrodes [2]. Assume that the band-bending length in the SWNT is neither very short nor very long. At VG = 0 V, the device is "on" and the Fermi energy is close to the valence-band edge throughout the NT. If the band-bending length is indeed comparable to the length of the SWNT, a positive gate voltage would generate an energy barrier of an appreciable fraction of eVG in the center of the tube, since the gate/NT distance is shorter than the source/drain separation. The threshold voltage VG,T required to suppress hole conduction by depleting the tube center would be determined by the thermal energy available for overcoming this barrier. Thus, VG,T should be much lower than 6 V.

If the carrier concentration is instead inherent to the NT, we expect a fairly homogeneous hole distribution along the NT, independent of the gate voltage. An estimate of the hole density can then be obtained by writing the total charge on the NT as Q = C·VG,T, where C is the NT capacitance and VG,T the threshold voltage necessary to completely deplete the tube. The NT capacitance per unit length with respect to the back gate is $C/L \approx 2\pi\varepsilon\varepsilon_0/\ln(2h/r)$, with r and L being the NT radius and length, and h and ε the thickness and the average dielectric constant of the device [10]. Using L = 300 nm, r = 0.8 nm, h = 140 nm, and ε ≈ 2.5, we evaluate a one-dimensional hole density of p = Q/eL ≈ 9 × 10⁶ cm⁻¹ from VG,T = 6 V. This value corresponds to about 1 hole per 250 carbon atoms in the NT. For comparison, in graphite there is only 1 hole per 10⁴ atoms [11]. The large hole density suggests that the NT is degenerate and/or that it is doped with acceptors, for example, as a result of its processing [12].
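The hole density estimate can be checked numerically. The following Python sketch evaluates the capacitance per unit length and the resulting one-dimensional hole density from the values quoted in the text; the function names and the rounded physical constants are assumptions of this sketch.

```python
import math

EPS0 = 8.854e-12       # vacuum permittivity, F/m
E_CHARGE = 1.602e-19   # elementary charge, C


def capacitance_per_length(h, r, eps_r):
    """Capacitance per unit length of a wire over a plane: 2*pi*eps*eps0 / ln(2h/r), in F/m."""
    return 2 * math.pi * eps_r * EPS0 / math.log(2 * h / r)


def hole_density_per_cm(h, r, eps_r, v_threshold):
    """One-dimensional hole density p = Q/(eL) = (C/L) * V_GT / e, converted to cm^-1."""
    per_metre = capacitance_per_length(h, r, eps_r) * v_threshold / E_CHARGE
    return per_metre / 100.0


p = hole_density_per_cm(h=140e-9, r=0.8e-9, eps_r=2.5, v_threshold=6.0)
print(f"p = {p:.2e} holes per cm")
```

Note that the tube length L cancels out of p = Q/eL, so the result depends only on the geometry ratio h/r, the dielectric constant and the threshold voltage.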

FIG. 3. I–VG curve of a typical MWNT device (curve A) in comparison with that of a collapsed MWNT of similar cross section (curve B).

We can estimate the mobility of the holes from the transconductance of the FET. In the linear region, it is given by $dI/dV_G = \mu_h (C/L^2) V_{SD}$. After subtracting the contact resistance, we obtain a NT transconductance of dI/dVG = 1.7 × 10⁻⁹ A/V at VSD = 10 mV, corresponding to a hole mobility of 20 cm²/V s. This value is close to the mobility in heavily p-doped silicon of comparable hole density [9], but considerably smaller than the 10⁴ cm²/V s observed in graphite [11]. The low value of the NT mobility is consistent with our initial assumption of diffusive transport and suggests that the SWNT contains a large number of scatterers, possibly related to defects in the NT or to disorder at the NT/gate–oxide interface due to roughness. Such deformations can lead to local changes in electronic structure [13], which may act as scattering centers.
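The mobility figure follows from rearranging the transconductance relation to μh = (dI/dVG)·L / ((C/L)·VSD). The Python sketch below evaluates it with the SWNT values quoted in the text (L = 300 nm, r = 0.8 nm, h = 140 nm, ε ≈ 2.5); the function name and the rounded value of ε0 are assumptions of this sketch.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m


def hole_mobility_cm2(transconductance, length, v_sd, h, r, eps_r):
    """Hole mobility in cm^2/(V s) from dI/dV_G = mu_h * (C / L^2) * V_SD."""
    c_per_length = 2 * math.pi * eps_r * EPS0 / math.log(2 * h / r)  # F/m
    mu_si = transconductance * length / (c_per_length * v_sd)        # m^2/(V s)
    return mu_si * 1e4


mu = hole_mobility_cm2(1.7e-9, 300e-9, 10e-3, h=140e-9, r=0.8e-9, eps_r=2.5)
print(f"mu_h = {mu:.1f} cm^2/(V s)")
```

The result is about 21 cm²/V s, in line with the rounded value of 20 cm²/V s.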

The low mobility is surprising in view of the coherence length of more than 1 μm reported on the basis of energy quantization along a metallic SWNT at low temperature [4]. However, we note that there have been no transport experiments on individual SWNTs that provide evidence for ballistic transport at room temperature, e.g., by observing conductance quantization [1].

Having demonstrated FET operation for a SWNT, we move on to explore whether transport through MWNTs can be controlled by a gate electrode. The band gap of NTs has been predicted to decrease with increasing tube diameter [1]. Therefore, MWNTs with diameters of 10 nm or more are expected to show metallic rather than semiconducting behavior at room temperature. We studied a number of MWNT devices with resistances of R ~ 100 kΩ. Most of these devices showed no gate action; a typical I–VG curve is plotted in Fig. 3 (curve A).

Structural deformations of NTs change their electronic properties. Curve B in Fig. 3 shows that this can lead to a significant gate effect in MWNTs. As is the case for the SWNT-FET, the source–drain current of this MWNT-FET decreases with increasing gate voltage, i.e. the dominant conduction process is hole transport. In contrast to the SWNT device, this MWNT-FET could not be completely depleted. The I–VSD curve remained linear, independent of the gate voltage (not shown). Between VG = −35 and 25 V, the resistance increased only from R = 76 to 120 kΩ, corresponding to a conductance modulation by about a factor of 2.

FIG. 4. (a) Noncontact AFM image of the MWNT-FET. (b) and (c) Close-up views showing three twists in the collapsed nanotube.

The gate effect reaches a sharp maximum between VG = −15 and 0 V [13]. To explain this peculiar behavior, we consider the AFM image of the MWNT-FET in Fig. 4a. The device consists of a collapsed MWNT, which bridges the gap between two Au electrodes separated by about 1 μm.

This nanostripe is 3 nm high, from which we conclude that it has four or five shells, and it exhibits a number of twists (Figs. 4b and 4c), which allow us to determine its width to be 12 nm. Based on the structural information summarized in Fig. 4d, we propose the following explanation for the behavior of the MWNT-FET. Since the intershell interaction in MWNTs is weak, it is reasonable to assume that transport is confined to the outermost shell of the nanostripe [12]. The conductance modulation of about 2 indicates that the bottom "plate" of the outermost shell is depleted by the gate, whereas the top layer is less affected due to screening by the inner shells and by the bottom layer as long as it is conducting.

Our model implies that the bottom "plate" is decoupled from the top layer, which may be the consequence of lateral quantization effects perpendicular to the tube axis. Using R = RNT + 2RC for the "on" state (VG = −15 V) and R = 2RNT + 2RC for the "off" state of the MWNT-FET (VG = 0 V), we estimate a resistance of RNT = 32 kΩ for the outer shell of the NT and deduce a contact resistance of RC = 23 kΩ. Finally, we proceed analogously to the SWNT-FET analysis to evaluate the hole density and mobility of the collapsed MWNT. Numerical calculations show that the capacitance per unit length is reasonably well described by $C/L = 2\pi\varepsilon\varepsilon_0/\ln(2h/r)$ despite the slab-shaped geometry of the collapsed tube. Using L = 1.1 μm, r = 5 nm, and a threshold voltage of VG,T ≈ 8 V to deplete the bottom layer, we obtain p ≈ 1.7 × 10⁷ cm⁻¹ for its hole density.

From the transconductance of dI/dVG = 3.5 × 10⁻⁸ A/V at VSD = 50 mV, we estimate a mobility of μh ≈ 220 cm²/V s. The hole density is similar to that of the SWNT but the mobility is higher, which suggests a reduced number of scatterers. This may arise from the fact that the MWNTs were not ultrasonically treated in acids. Furthermore, they do not deform as much as SWNTs in order to conform to roughness at the NT/gate–oxide interface.

Conclusion: Transport in the Nanotubes is dominated by holes and,

at room temperature, it appears to be diffusive. Using the gate

electrode, the conductance of a SWNT-FET could be modulated by

more than 5 orders of magnitude. An analysis of the transfer

characteristics of the FETs suggests that the NTs have a higher carrier

density than graphite and a hole mobility comparable to heavily p-

doped silicon. Large-diameter MWNTs show typically no gate effect,

but structural deformations can modify their electronic structure

sufficiently to allow FET behavior.

References:

1 - M. S. Dresselhaus, G. Dresselhaus, and P. C. Eklund, Science of Fullerenes and Carbon Nanotubes (Academic, San Diego, 1996).

2 - J. W. G. Wildöer, L. C. Venema, A. G. Rinzler, R. E. Smalley, and C. Dekker, Nature (London) 391, 59 (1998).

3 - T. W. Odom, J.-L. Huang, P. Kim, and C. M. Lieber, Nature (London) 391, 62 (1998).

4 - S. J. Tans, M. H. Devoret, H. Dai, A. Thess, R. E. Smalley, L. J. Geerligs, and C. Dekker, Nature (London) 386, 474 (1997).

5 - M. Bockrath, D. H. Cobden, P. L. McEuen, N. G. Chopra, A. Zettl, A. Thess, and R. E. Smalley, Science 275, 1922 (1997).

6 - S. J. Tans, A. R. M. Verschueren, and C. Dekker, Nature (London) 393, 49 (1998).

7 - T. Guo, P. Nikolaev, A. Thess, D. T. Colbert, and R. E. Smalley, Chem. Phys. Lett. 243, 49 (1995).

8 - D. T. Colbert, J. Zhang, S. M. McClure, P. Nikolaev, J. H. Hafner, D. W. Owens, P. G. Kotula, C. B. Carter, J. H. Weaver, A. G. Rinzler, and R. E. Smalley, Science 266, 1218 (1994).

9 - S. M. Sze, Physics of Semiconductor Devices (Wiley, New York, 1981).

10 - This expression was inferred from P. M. Morse and H. Feshbach, Methods of Theoretical Physics (McGraw–Hill, New York, 1953).

11 - N. B. Brandt, S. M. Chudinov, and Ya. G. Ponomarev, Semimetals, 1. Graphite and its Compounds (North-Holland, Amsterdam, 1988).

12 - H. He, J. Klinowski, M. Forster, and A. Lerf, Chem. Phys. Lett. 287, 53 (1998).

13 - J. E. Fischer, H. Dai, A. Thess, R. Lee, N. M. Hanjani, D. L. Dehaas, and R. E. Smalley, Phys. Rev. B 55, R4921 (1997).

VLP0201-1

POWER AWARE PHYSICAL MODEL FOR EMBEDDED SYSTEMS

Asstt. Prof. Yasmeen Hasan

M.Tech (Electronic Circuits & Systems (VLSI))

DEPT OF ECE, INTEGRAL UNIVERSITY, LUCKNOW

Email: [email protected]

Abstract- In this work we propose a geometric model that is employed to devise a scheme for identifying the hotspots and hot zones in a chip. These spots or zones need to be guarded thermally to ensure the performance and reliability of the embedded system. The model, namely the continuous unit sphere model, is presented on the assumption that the 3D region of the system is uniform, thereby reflecting on the possible locations of heat sources and the target observation points. The experimental results for the continuous domain establish that a region which does not contain any heat sources may become hotter than the regions containing the thermal sources. Thus a hotspot may appear away from the active sources, and placing heat sinks or a cooling system near the active thermal sources alone may not suffice to tackle thermal imbalance.

Keywords: Embedded systems, continuous model, floorplanning, fine mesh (FM), coarse mesh (CM), hotspots.

2: Introduction

In recent years, power density in microprocessors has doubled every three years [1,2,3], and this rate is expected to increase within one to two generations as feature sizes and frequencies scale faster than operating voltages [4,7]. Because the energy consumed by the microprocessor is converted into heat, the corresponding exponential rise in heat density is creating vast difficulties in reliability and manufacturing costs. At any power dissipation level, the heat generated must be removed from the surface of the microprocessor die, and for all but the lowest-power designs today, these cooling solutions have become expensive. For high-performance processors, cooling costs are rising at $1–3 or more per watt of heat dissipated [3, 8], meaning that cooling costs are rising exponentially and threaten the computer industry's ability to deploy new systems.

Thermal-aware floorplanning [6] reduces on-chip hotspots by a significant amount through lateral heat spreading. In the traditional design methodology, worst-case assumptions are used to ensure that the system operates normally in all corner cases, which results in excessive design margins by imposing extreme design constraints. With the shift in design paradigm, worst-case assumptions and post-design solutions are no longer sufficient to address thermal and power issues. It has become important to take these issues into consideration from the very start and to address them at all levels of the design cycle.

In this paper we propose a geometric model that is employed to devise a scheme for identifying the hotspots in an embedded system, and that may facilitate identifying the hot spots/zones in a VLSI chip. In the continuous domain we use the concept of a unit sphere model to calculate the local thermal effect at a point due to the heat dissipated from several point heat sources distributed over the chip plane. We establish that a point on a chip can become very hot due to the conduction effects of other heat sources, although it may not have a heat source in its immediate vicinity. In this model, the heat loss due to radiation has been ignored. If it is to be considered, an appropriate heat loss function has to be incorporated.

Fig 1: SIDE VIEW OF A TYPICAL PACKAGE [9]

2.1: Time Invariant Heat Sources

The study assumes that constantly active (i.e. always-on) heat-generating sources are placed randomly throughout the chip. For continuous thermal sources, we also assume that the heat from the sources propagates through the 3D structure of the chip without being dissipated to the ambient. The objective is to identify the zones in the chip which have heat content greater than a certain threshold.

3.2: Continuous Spatial Domain

The position of a heat source may be any point on the chip, which is assumed to be an embedded system. In the unit sphere model, the contribution of a point heat source S at any target point T is expressed as the amount of heat from S received within the unit sphere centered at the point T. The unit of the sphere's radius is the same as that of the distance between S and T, and may be related to the minimum dimension of the chip. The cumulative heat received at the point T is evaluated as the linear superposition of the amounts received at T from all heat-generating sources on the chip.

As illustrated in Fig. 2, let a heat source at a point S generate an amount of heat Q, henceforth denoted as the strength of the source S. Let the target point T be at a Euclidean distance d from S. Let $C_T$ be the unit sphere centered at T and $C_S$ the sphere of radius d centered at S, and let $C_T$ and $C_S$ intersect at the two points A and B.

Then the area cut out on the surface of the sphere $C_S$ is equal to the product of the solid angle with its vertex at the center of the sphere $C_S$ and the square of the sphere's radius:

$A = \Omega \, d^2$ … (1)

Fig 2: Unit Sphere Model of Heat Received at a Point T

where $\Omega$ is the solid angle formed by the conical surface of the spherical sector and d is the radius of the source sphere.

(A complete sphere forms a solid angle of $4\pi$ steradians. If the solid angle is not formed by the entire sphere, but only by the conical surface of a spherical sector, the angle in this case is equal to the ratio of the sector's spherical surface to the square of the sphere's radius [5].) By denoting the plane angle at the vertex of the spherical sector as $2\theta$, it is possible to express its height h as

$h = r(1 - \cos\theta)$ … (2)

where r is the radius of the source sphere (here r = d). Therefore the spherical area of the sector can be represented as

$A = 2\pi r h = 2\pi r^2 (1 - \cos\theta)$ … (3)

Fig 3: Section of a cone and a spherical cap inside a sphere

By denoting the solid angle which subtends the spherical surface of the sector as $\Omega$, we obtain

$\Omega = \dfrac{A}{r^2} = 2\pi (1 - \cos\theta)$ … (4)

Thus the contribution of heat from S at T is

$Q_T = Q \, \dfrac{A}{A_S} = \dfrac{Q}{2}(1 - \cos\theta)$ … (5)

where $A_S = 4\pi d^2$ is the surface area of the sphere $C_S$.

Consider the triangle $O C_T B$ in Fig. 2, where $C_T$ denotes the center T of the unit sphere and O is the foot of the perpendicular from B onto the line ST:

$(C_T B)^2 = (OB)^2 + (O C_T)^2$

$1^2 = d^2 \sin^2\theta + \bigl(d(1 - \cos\theta)\bigr)^2$

$1 = 2 d^2 (1 - \cos\theta)$ … (6)

Putting eqn (6) in eqn (5), the contribution of heat from S to T is

$Q_T = \dfrac{Q}{4 d^2}$ … (7)

Our concern is the hottest points on the chip. Intuitively, the source points belong to this class. But the more pertinent question is whether these are the only points that need to be considered. The question may be re-phrased as follows: does there exist any non-source point on the floor with heat content greater than that of any of the source points?

The observations reported answer in the affirmative. Before we proceed further, we point out two special cases of the unit sphere model based on the distance d between S and T: (1) 0.5 < d < 1 and (2) 0 < d < 0.5.

Case 1: 1/2 < d < 1        Case 2: 0 < d < 1/2

Fig 4: Special cases of the unit sphere model

In the boundary case when S lies on $C_T$ (d = 1), $\theta$ is equal to $\pi/3$, as SAT becomes an equilateral triangle, and

$Q_T = \dfrac{Q}{2}(1 - \cos\theta) = \dfrac{Q}{2}\left(1 - \cos\dfrac{\pi}{3}\right) = \dfrac{Q}{4}$ … (8)

Hence in case (1) the angle $2\theta$ as defined earlier will be greater than $2\pi/3$, and consequently more than $1/4$ of the heat emanating from S reaches the unit sphere centered at T. In case (2) T is nearer to S, and hence the sphere with radius d around S lies entirely within the unit sphere at T. Hence the unit sphere $C_T$ receives the entire heat of S in this case.
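The unit sphere model can be sketched in a few lines of Python. This is only an illustration, not the implementation used for the experiments: it assumes the contribution of a source of strength Q at distance d is Q/(4d²) whenever the source sphere is not wholly inside the unit sphere, and Q when d ≤ 1/2. The function names and the unit-cube geometry are hypothetical.

```python
import math

def heat_contribution(q, d):
    """Heat from a point source of strength q that falls inside the
    unit sphere centered on a target at distance d (assumed model)."""
    if d <= 0.5:
        # Case (2): the source sphere lies entirely inside the unit sphere.
        return q
    # Fraction (1 - cos(theta)) / 2 with 1 - cos(theta) = 1 / (2 d^2).
    return q / (4.0 * d * d)

def cumulative_heat(target, sources):
    """Linear superposition over (position, strength) pairs."""
    return sum(heat_contribution(q, math.dist(target, pos)) for pos, q in sources)

# Unit-strength sources at the 8 vertices of a unit cube (illustrative geometry).
vertices = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
sources = [(v, 1.0) for v in vertices]

heat_at_vertex = cumulative_heat((0.0, 0.0, 0.0), sources)
heat_at_center = cumulative_heat((0.5, 0.5, 0.5), sources)
print(heat_at_vertex, heat_at_center)
```

Under these assumptions the cube center, which holds no source, receives about 2.67 units of heat while each vertex receives about 2.21, which is the kind of non-source hotspot the model is meant to expose.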

Using the formula derived above, we calculated the cumulative heat received at each point along the diagonal joining two vertices of the 3D geometrical structure under consideration. In this work we took a regular cuboid and a regular octagonal prism, each in several different dimensions, and treated the medium throughout the geometrical structure as isotropic.

An active source of unit strength ($Q_0 = 1$) was placed at each of the vertices of the 3D structure (8 in the cuboid and 16 in the octagonal prism).

4: EXPERIMENTAL RESULTS

Using the continuous domain formula we try to find the hottest spot along a given direction in a 3D structure. Here we took cuboids of different dimensions and placed an active source of unit strength at each of the 8 vertices. The target points are taken along the longest diagonal.

RESULTS FOR THE CONTINUOUS DOMAIN

We performed more experiments with the continuous domain model, implemented in C, to simulate the effect of active sources placed at random points on the 3D floor. Keeping the dimensions of the 3D structure the same, we varied the number of sources from 5 to 50. We studied five trial runs, keeping the number of active sources and the range of their power strengths fixed and allowing only the positions of the sources to vary.

We considered a fine grid around each source point and evaluated the cumulative power at each of those grid points along with the source points. Across the whole floor we also considered a relatively coarse grid and evaluated the power at all of its grid points.

The formula derived from the unit sphere model has been used for the calculation. Our results are reported in Table 1. The threshold value is the minimum of the total power at the active source points, including the contribution from all other sources.
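The thresholding step can be sketched as follows. This is a hedged Python illustration rather than the C program used for the experiments: it assumes a per-source contribution of Q/(4d²) for d > 1/2 and Q otherwise, and the names `total_heat` and `find_hotspots`, the cube geometry and the 5×5×5 coarse grid are all hypothetical. A fine mesh around each source would be probed in the same way.

```python
import math

def total_heat(point, sources):
    """Cumulative heat at `point`, assuming q/(4 d^2) for d > 1/2 and q otherwise."""
    total = 0.0
    for pos, q in sources:
        d = math.dist(point, pos)
        total += q if d <= 0.5 else q / (4.0 * d * d)
    return total

def find_hotspots(sources, probes):
    """Return (threshold, probes whose heat exceeds the threshold).

    The threshold is the minimum total heat found at any source point,
    including the contribution from all other sources.
    """
    threshold = min(total_heat(pos, sources) for pos, _ in sources)
    hot = [p for p in probes if total_heat(p, sources) > threshold]
    return threshold, hot

# Illustrative setup: unit sources on the vertices of a unit cube, coarse probe grid.
sources = [((x, y, z), 1.0) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
steps = [i / 4 for i in range(5)]
coarse_grid = [(x, y, z) for x in steps for y in steps for z in steps]

threshold, hotspots = find_hotspots(sources, coarse_grid)
print(f"threshold={threshold:.4f}, hotspots={len(hotspots)} of {len(coarse_grid)}")
```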

TABLE 1: RESULTS FOR THE CONTINUOUS DOMAIN (FM = fine mesh, CM = coarse mesh)

No. of    Threshold  Total probe  Probe points  Probe points  Hotspots  Hotspots  % hotspots  % hotspots
sources   value      points       in FM         in CM         in FM     in CM     in FM       in CM
5         1.24876    2029160      29160         2000000       9198      2562      0.45%       0.19%
10        1.24338    2058320      58320         2000000       16798     8376      0.81%       0.41%
20        1.26821    2116640      116640        2000000       42238     12048     1.72%       0.68%
40        1.26441    2233280      233280        2000000       93972     19030     4.21%       0.82%
50        1.20101    2291600      291600        2000000       118262    25969     5.16%       1.13%

5: CONCLUSION

In this work we have proposed a model in the continuous domain to describe the thermal behavior of an embedded system. The hotspots were usually concentrated near the active source points, but some points away from the sources were found to be much hotter than the sources themselves. The randomness of the source positions did not affect the result much. One important aspect we have observed in all the models is that there are zones in the chip which become much hotter even without containing a heat source. We conclude that it may not be enough to guard only the active regions to make the chip thermally robust. This also calls for more efficient power and thermal management techniques.

References

[1] S. Borkar. Design challenges of technology scaling. IEEE Micro, pp. 23–29, Jul.–Aug. 1999.

[2] G. Roos, B. Hoefflinger, M. Schubert, and R. Zingg, “Manufacturability of 3D-epitaxial-lateral-overgrowth CMOS circuits with three stacked channels,” Microelectron, 1991.

[3] R. Mahajan. Thermal management of CPUs: A perspective on trends, needs and opportunities, Oct. 2002. Keynote presentation, THERMINIC-8.

[4] Y. K. Cheng and S. M. Kang, “An Efficient Method for Hot-spot Identification in ULSI Circuits”, Proc. of IEEE Int. Conf. on Computer-Aided Design (ICCAD), pp. 124-127, 1999.

[5] “Solid Angle”, Wikipedia, the free encyclopedia.

[6] T. Sherwood, E. Perelman, and B. Calder. Basic block distribution analysis to find periodic behavior and simulation points in applications. In Proc. PACT, Sept. 2001.

[7] SIA. International Technology Roadmap for Semiconductors, 2001.

[8] S. Gunther, F. Binns, D. M. Carmean, and J. C. Hall. Managing the impact of increasing microprocessor power consumption. Intel Tech. J., Q1 2001.

[9] Fig. 1 from: K. Skadron, S. Velusamy, K. Sankaranarayanan and D. Tarjan. “Temperature-Aware Microarchitecture”. Published in the Proceedings of the 30th International Symposium on Computer Architecture, June 9–11, 2003, San Diego, California, USA.


VLP0202-1

Efficient Power Utilisation By Controlling Industrial And Home Appliances Using GSM and Microcontroller

Raj Singh Yadav* and Nidhi Mishra**

*B.Tech IIIrd Year, **Assistant Professor

Krishna Institute of Engineering and Technology

Electronics and Communication Department

Ghaziabad-201206, India

[email protected]

[email protected]

Abstract:- In the present IT age, there is a need for fully automatic systems for remotely controlling and monitoring appliances. This paper focuses on remotely controlling industrial and home appliances and on making efficient use of the power supply [1]. The system is SMS based, using GSM (Global System for Mobile Communication), and is wireless. It provides a practical solution to the problem faced by home owners who forget to switch off their home appliances when leaving home. It is one of the new and emerging applications of GSM technology. It is of great use for the efficient utilisation of power in industry and for cutting down the electricity bill. Here we present the design of a stand-alone embedded system that can monitor and control different appliances installed in industries and homes using built-in input and output peripherals. The system allows the home owner or industry owner to control and monitor their appliances remotely via mobile phone, by sending commands in the form of SMS messages and receiving the current status of the appliances. The software used for simulation is Eclipse with a Java runtime environment.

Keywords- GSM, SMS, Signal Processing and Embedded System.

I. INTRODUCTION

The objective of this paper is to control home appliances remotely and to reduce power wastage by providing a cost-effective solution. The motivation was to enable users to automate their homes with universal access. The aim was to build a home appliance control system at a reasonable cost that is mobile and provides remote access to the appliances. There was a need to automate the home and industry so that users can take advantage of this advancement, for example so that a person returning from the office in hot weather does not come home to an uncomfortably hot house. The motive of this paper is to propose a system that allows the user to control home appliances universally via SMS using GSM technology and to make efficient use of the power supply. A design and implementation of SMS-based control for monitoring systems is proposed in [2]. That work has three modules: a sensing unit for monitoring the complex applications, a processing unit (a microcontroller), and a communication module that uses a GSM module or cell phone. Primary health-care management for the rural population is explored in [3], which proposes providing PHC services to the rural population through mobile web technologies. That system involves the use of SMS and cell phone technology for information management, transactional exchange and personal communication. Internet and wireless communications have been utilized in home automation [6-8].

In this paper, we have tried to implement a method in which an acknowledgement from the receiver can be obtained without any additional cost. It is beneficial for the user to receive feedback from the receiver.


II. HOME APPLIANCE CONTROL

SYSTEM WITHOUT FEEDBACK

We propose a home and industrial appliance control system based on GSM network technology for the transmission of SMS from sender to receiver. The GSM network provides a full duplex link to support the user requirement [4]. SMS sending and receiving is used for universal access to appliances, allowing the appliances at home to be monitored and controlled remotely. The home appliance control system consists mainly of three components: a microcontroller, a GSM module and a mobile device. The microcontroller stores the software program on which the system runs. The GSM module receives the message from the user. The mobile device is used to send the command that is to be carried out by the microcontroller.

III. PROPOSED PAPER WITH FEEDBACK SYSTEM

In the proposed system, feedback is given to the user about the condition of the home appliance according to the user's needs and requirements. The current status of the appliances can be checked. The working of the feedback system can be explained with the help of the figure below [1].

Fig:- Diagram for Home Appliances control system with feedback [1]

This system has basically two units: a transmitter unit and a receiver unit with a feedback system. The message consists of a set of commands to turn specific appliances ON/OFF [5]. The working of this system can be explained in terms of its three parts: the microcontroller, the GSM module and the mobile phone.

The microcontroller is the main component and has the home appliance control system installed on it. The appliance control software is responsible for access to the appliances from anywhere. The system uses GSM technology for the transmission of commands from sender to receiver.

The GSM module is a plug-and-play device attached to the microcontroller through an RS232 port, over which it communicates with the microcontroller. The GSM module acts as a link responsible for enabling/disabling the SMS capability.

The mobile device, with a GSM SIM, communicates with the GSM module via radio waves. The method of communication is wireless and the mechanism works on GSM technology. The cell phone has an authorised SIM card and a GSM subscription. The sender transmits instructions via SMS and the system acts on those instructions.

IV. CONSTRAINTS OF HOME

APPLIANCES

CONTROL SYSTEM

The system functionality is based on GSM technology and a microcontroller, and it needs a power supply, so these technological constraints must be kept in mind. The system is vulnerable to power failure, but this disruption can be avoided by attaching a backup voltage source, allowing users to retain the advantages of the system.

V. RESULTS AND SIMULATION

The result of the system can be explained as follows. The system first runs a series of GSM hardware tests to check that all the hardware components are supported. It then opens the RS232 serial port for communication with the GSM module. If the port opens successfully the system communicates with the GSM module; if it fails, no communication takes place.

The system checks support for battery level, signal strength, the GSM module and other components, as well as the SMS sending and receiving capability. If these tests succeed the system gives a response of 'OK'; if they fail, 'ERROR' is returned. The remote user sent an SMS with the security code (as defined in the program code) from a cell phone to the home appliance control system to turn the specified appliance on/off, and the system performed the respective function by simulating the appliance on/off as directed by the user.

Appliance         SMS sent by user       System response                    Feedback message (current status)
Air conditioner   AC on / AC off         AC button simulated to on/off      AC on / AC off
Light             Light on / Light off   Light button simulated to on/off   Light on / Light off
Fan               Fan on / Fan off       Fan button simulated to on/off     Fan on / Fan off

Fig. Results of home appliances control system with feedback response [1].

Achieved analytical results:-

Security: the system took no action on instructions received by SMS without the security code, or on an SMS received from an unregistered number. The required task was performed only when an SMS with the correct security code instructed the system.

Remote control: the system allowed the user to switch appliances on/off and to check their status by simulating the appliance as directed by the incoming SMS.

Self-test: the system automatically performed tests, checked support for the available features, the hardware and the SMS sending and receiving capability, and configured itself accordingly.
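The security check and feedback behaviour described above can be sketched as follows. This is a hypothetical Python illustration, not the microcontroller firmware: the SMS text format (`<code> <appliance> <on|off>`), the phone number, the security code and the function name `handle_sms` are all assumptions, since the paper does not specify them.

```python
REGISTERED_NUMBERS = {"+910000000000"}   # hypothetical registered sender
SECURITY_CODE = "1234"                   # hypothetical security code
appliance_state = {"AC": "off", "Light": "off", "Fan": "off"}

def handle_sms(sender, text):
    """Apply a command and return the feedback message, or None if ignored."""
    if sender not in REGISTERED_NUMBERS:
        return None                      # unregistered number: take no action
    parts = text.split()
    if len(parts) != 3 or parts[0] != SECURITY_CODE:
        return None                      # missing or wrong security code
    _, appliance, action = parts
    action = action.lower()
    if appliance not in appliance_state or action not in ("on", "off"):
        return None                      # unknown appliance or command
    appliance_state[appliance] = action  # simulate the appliance button
    return f"{appliance} {action}"       # feedback: current status

print(handle_sms("+910000000000", "1234 AC on"))
```

An SMS from any other number, or one carrying a different code, leaves `appliance_state` unchanged and produces no feedback message.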

The program code is written in a high-level language such as C or C++; the compiler converts it into machine code, which is stored in the microcontroller. The development software is Eclipse with a Java runtime environment, in which the program code can be edited and compiled. The code is transferred from the computer to the microcontroller through a USB port using a USBtiny programmer and an RS232 device. The programming utility used is AVRDUDE. The sender and receiver GSM numbers and the security code are defined in the program code.

VI. CONCLUSION

In this paper a low-cost, secure, universally accessible, remotely controlled solution with feedback for the automation of homes has been introduced [1]. The system achieves remote control of home appliances through SMS. The GSM-based solution can be controlled remotely, provides home automation and is cost-effective, as it can reduce the electricity bill through efficient utilisation of the home appliances: the appliances are used only when they are required. It is also of great use for industrial appliances. Hence we conclude that the required objectives and goals of the home appliance control system have been achieved.

VII. FUTURE DIRECTION

The basic level of home appliance control and remote monitoring with feedback has been implemented. Remote monitoring can be extended to other home appliances, for example so that the system generates an SMS if the temperature rises above a certain level. Sensors that detect gas, smoke or fire can also be added so that the system automatically generates an SMS in case of emergency.

In future the system will be a small box containing the microcontroller and the GSM module at a reduced size.

REFERENCES

1) Tahmina Begum, Md. Shazzat Hossain, Md. Bashir Uddin and Md. Shaheen Hasan Chowdhury, “Design and Development of Activation and Monitoring of Home Automation System via SMS through Microcontroller”, in 2009 International Conference on Computers and Devices for Communication.

2) B. Ciubotaru-Petrescu, D. Chiciudean, R. Cioarga, D. Stanescu, “Wireless Solutions for Telemetry in Civil Equipment and Infrastructure Monitoring”, in 3rd Romanian-Hungarian Joint Symposium on Applied Computational Intelligence (SACI), May 25-26, 2006.

3) Z. Alkar, U. Buhur (2005), “An Internet Based Wireless Home Automation System for Multifunctional Devices”, in IEEE Consumer Electronics, 51(4), 1169-1174.

4) A. Alheraish, W. Alomar, and M. Abu-Al-Ela, “Programmable Logic Controller System for Controlling and Monitoring Home Application Using Mobile Network”, in IMTC 2006 - Instrumentation and Measurement Technology Conference, Sorrento, Italy, 24-27 April 2006, pp. 469.

5) A. R. Al-Ali, M. Al-Rousan and M. Mohandes, “GSM-Based Wireless Home Appliances Monitoring & Control System”, IEEE, pp. 237.

6) Liang, Li-Chen Fu and Chao-Lin W, “An integrated, flexible, and Internet-based control architecture for home automation system in the Internet era”, The IEEE Proceedings of the International Conference on Robotics and Automation, Volume 2, 2002, pp. 1101-1106.

7) W. Qinglong, F. Y. Wang and L. Yueton, “A mobile-agent based distributed intelligent control system architecture for home automation”, The IEEE International Conference on Systems, Man, and Cybernetics, Volume 3, 2001, pp. 1599-1605.

8) R. Shepherd, “Bluetooth wireless technology in the home”, Electronics & Communication Engineering Journal, V. 13, I. 5, Oct 2001, pp. 195-203.


VLP0301-1

Design and Implementation of Radix-2 & Radix-4

Booth Multipliers Using VHDL

S. S. Chauhan¹, S. C. Yadav², A. R. Khan³

Graphic Era University (E&CE Deptt.)¹ ² ³

[email protected], [email protected], [email protected]

Abstract: Low power consumption and smaller area are some of the most important criteria for the fabrication of DSP systems and high performance systems. Optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints, so that improving speed mostly results in larger area. In this paper, we try to determine the best solution to this problem by comparing a few multipliers.

This paper presents an efficient implementation of high speed multipliers using the shift-and-add method and the Radix-2 and Radix-4 modified Booth multiplier algorithms. We compare the working of the three multipliers by implementing each of them separately in a transversal FIR filter.

Index Terms: Transversal FIR Filter, Booth algorithms, VHDL, Xilinx.

1. INTRODUCTION

Multipliers are key components of many high performance systems such as FIR filters, microprocessors, digital signal processors, etc. A system's performance is generally determined by the performance of the multiplier, because the multiplier is generally the slowest element in the system. Furthermore, it is generally the most area consuming. Hence, optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints, so that improving speed mostly results in larger area. As a result, a whole spectrum of multipliers with different area-speed constraints has been designed, with fully parallel multipliers at one end of the spectrum and fully serial multipliers at the other end. In between are digit serial multipliers, where single digits consisting of several bits are operated on. These multipliers have moderate performance in both speed and area. However, existing digit serial multipliers have been plagued by complicated switching systems and/or irregularities in design. Radix 2^n multipliers, which operate on digits in a parallel fashion instead of bits, bring the pipelining to the digit level and avoid most of the above problems. They were introduced by M. K. Ibrahim. These structures are iterative and modular. The pipelining done at the digit level brings the benefit of constant operation speed irrespective of the size of the multiplier. The clock speed is determined only by the digit size, which is already fixed before the design is implemented.

2. THE BASIC TRANSVERSAL FILTER

An N-tap transversal filter was assumed as the basis for this adaptive filter. The value of N is determined by practical considerations. An FIR filter was chosen because of its stability. The use of the transversal structure allows relatively straightforward construction of the filter, as shown in Figure 1.

As the input, coefficients and output of the filter are all assumed to be complex valued, the natural choice for the property measurement is the modulus, or instantaneous amplitude. If y(k) is the complex valued filter output, then |y(k)| denotes the amplitude. The convergence error p(k) can be defined as follows:

$p(k) = |y(k)| - A$

where A is the amplitude in the absence of signal degradations. The error p(k) should be zero when the envelope has the proper value, and non-zero otherwise. The error carries sign information to indicate in which direction the envelope is in error. The adaptive algorithm is defined by specifying a performance/cost/fitness function based on the error p(k) and then developing a procedure that adjusts the filter impulse response so as to minimize or maximize that performance function.

$y_k = \sum_{i=0}^{N-1} w_k(i) \, x_{k-i}$

Figure 1: Transversal FIR Filter

The gradient search algorithm was selected to simplify the filter design. The filter coefficient update equation is given by:

$W_{k+1} = W_k - \mu \, e_k X_k$

where $X_k$ is the filter input at sample k, $e_k = p_k \, y_k$ is the error term at sample k, and $\mu$ is the step size for updating the weight values.
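A minimal Python sketch of one filter evaluation and one coefficient update is given below, taking the convergence error as p(k) = |y(k)| − A and the error term as e_k = p_k · y_k. It is an illustration under simplifying assumptions, not the VHDL design: real-valued samples are used (the complex case would also need a conjugation convention that the text does not specify), and the function names, tap count and step size are hypothetical.

```python
def filter_output(weights, window):
    """y_k = sum_i w_k(i) * x_{k-i}; window[i] holds x_{k-i}."""
    return sum(w * x for w, x in zip(weights, window))

def update_weights(weights, window, amplitude, mu):
    """One step of W_{k+1} = W_k - mu * e_k * X_k with e_k = p_k * y_k."""
    y = filter_output(weights, window)
    p = abs(y) - amplitude            # convergence error p(k) = |y(k)| - A
    e = p * y                         # error term e_k
    return [w - mu * e * x for w, x in zip(weights, window)]

# Two-tap example with illustrative numbers.
weights = [1.0, 0.0]
window = [2.0, 1.0]                   # x_k, x_{k-1}
new_weights = update_weights(weights, window, amplitude=1.0, mu=0.1)
print(new_weights, filter_output(new_weights, window))
```

In this small example the output amplitude moves from 2.0 to the target amplitude 1.0 after a single update; in general many iterations with a suitably small step size would be needed.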

3. MULTIPLIERS

3.1. BINARY Multiplier

A binary multiplier is an electronic hardware device used in digital electronics, in a computer or in another electronic device to perform rapid multiplication of two numbers in binary representation. It is built using binary adders.

The rules for binary multiplication can be stated as follows:

(i) If the multiplier digit is a 1, the multiplicand is simply copied down and represents the partial product.

(ii) If the multiplier digit is a 0, the partial product is also 0.

For designing a multiplier circuit we should have circuitry to do the following:

It should be capable of identifying whether a bit is 0 or 1.

It should be capable of shifting partial products left.

It should be able to add all the partial products to give the product as the sum of the partial products.

It should examine the sign bits. If they are alike, the sign of the product will be positive; if the sign bits are opposite, the product will be negative. The sign bit of the product, determined by the above criteria, should be displayed along with the product.

From the above discussion we observe that it is not necessary to wait until all the partial products have been formed before summing them. In fact, the addition of a partial product can be carried out as soon as that partial product is formed.

Binary multiplication (e.g. n = 4)

$p = a \times b$

$a = a_{n-1} a_{n-2} \dots a_1 a_0$

$b = b_{n-1} b_{n-2} \dots b_1 b_0$

$p = p_{2n-1} p_{2n-2} \dots p_1 p_0$

where a is the multiplicand, b is the multiplier and p is the product.

              x x x x     a
              x x x x     b
              -------
              x x x x     b0 · a · 2^0
            x x x x       b1 · a · 2^1
          x x x x         b2 · a · 2^2
        x x x x           b3 · a · 2^3
        ---------------
        x x x x x x x x   p
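The partial-product scheme above can be sketched in Python as follows. This is a behavioural illustration only, not the VHDL implementation; the function names are hypothetical, and the sign handling follows the rule stated earlier (like signs give a positive product, opposite signs a negative one).

```python
def shift_add_multiply(a, b, n=4):
    """Multiply two unsigned n-bit integers by summing shifted partial products."""
    assert 0 <= a < (1 << n) and 0 <= b < (1 << n)
    product = 0
    for i in range(n):
        if (b >> i) & 1:          # multiplier bit b_i is 1: copy the multiplicand
            product += a << i     # shifted left by i, i.e. b_i * a * 2^i
        # multiplier bit 0 contributes a zero partial product
    return product

def signed_multiply(a, b, n=4):
    """Sign-magnitude wrapper: multiply magnitudes, then fix the sign."""
    negative = (a < 0) != (b < 0)
    magnitude = shift_add_multiply(abs(a), abs(b), n)
    return -magnitude if negative else magnitude

print(shift_add_multiply(13, 11))   # 1101 x 1011
print(signed_multiply(-5, 3))
```

Each partial product is added as soon as it is formed, so no storage for the full set of partial products is needed.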

3.2. Multiply Accumulate Circuit

Multiplication followed by accumulation is a common operation in many digital systems, particularly highly interconnected ones such as digital filters, neural networks, data quantizers, etc. One typical MAC (multiply-accumulate) architecture is illustrated in the figure. It consists of multiplying two values, then adding the result to the previously accumulated value, which must then be stored back in the registers for future accumulations. Another feature of a MAC circuit is that it must check for overflow, which might happen when the number of MAC operations is large. This design can be built from components, because we have already designed each of the units shown in the figure. However, since it is a relatively simple circuit, it can also be designed directly. In any case the MAC circuit, as a whole, can be used as a component in applications like digital filters and neural networks.
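The multiply-accumulate behaviour, including the overflow check, can be sketched in Python as follows. The accumulator width and the saturating overflow policy are assumptions made for illustration; the text only states that overflow must be detected.

```python
def mac(pairs, width=8):
    """Multiply-accumulate over (a, b) pairs with a signed `width`-bit accumulator.

    Returns (accumulator, overflow_flag). On overflow the accumulator is
    saturated, which is one possible policy (an assumption, not from the text).
    """
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    acc = 0
    overflow = False
    for a, b in pairs:
        acc += a * b                     # multiply, then add to the stored value
        if acc < lo or acc > hi:
            overflow = True
            acc = max(lo, min(hi, acc))  # clamp to the representable range
    return acc, overflow

print(mac([(2, 3), (4, 5)]))
print(mac([(10, 10), (10, 10)]))
```

The first call stays within the 8-bit range, while the second exceeds it on the second accumulation and reports the overflow.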

3.3. Architecture of a Radix 2^n Multiplier

The architecture of a radix 2^n multiplier is given in Figure 2. This block diagram shows the multiplication of

two numbers with four digits each. These numbers are

denoted as V and U while the digit size was chosen as four

bits. The reason for this will become apparent in the

following sections. Each circle in the figure corresponds to

a radix cell which is the heart of the design. Every radix

cell has four digit inputs and two digit outputs. The input

digits are also fed through the corresponding cells. The

dots in the figure represent latches for pipelining. Every

dot consists of four latches. The ellipses represent adders

which are included to calculate the higher order bits. They

do not fit the regularity of the design as they are used to

“terminate” the design at the boundary. The outputs are

again in terms of four bit digits and are shown by W’s. The

1’s denote the clock period at which the data appear.

3.4. BOOTH MULTIPLIER

The Radix-4 modified Booth algorithm is preferred over the Radix-2 Booth algorithm because in Radix-4 the number of partial products is reduced to n/2. Wallace tree multipliers could also be used, but in that format the multiplier array becomes very large and requires large numbers of logic gates and interconnecting wires, which makes the chip design large and slows down the operating speed.

Figure 2: Radix 2^n multiplier architecture

3.5. BOOTH MULTIPLICATION ALGORITHM:

(a) Booth Multiplication Algorithm for radix-2

The Booth algorithm gives a procedure for multiplying binary integers in signed 2's complement representation. We illustrate it with the following example:

Example: 2_ten × (-4)_ten, i.e. 0010_two × 1100_two

Step 1: Making the Booth table

I. From the two numbers, pick the one with the fewest transitions between consecutive bits and make it the multiplier. For 0010: 0 to 0 is no change, 0 to 1 is one change, 1 to 0 is another change, so there are two changes. For 1100: 1 to 1 is no change, 1 to 0 is one change, 0 to 0 is no change, so there is only one change. Therefore, in the multiplication 2 × (-4), 2_ten (0010_two) is the multiplicand and (-4)_ten (1100_two) is the multiplier.

II. Let X = 1100 (multiplier) and Y = 0010 (multiplicand). Take the 2's complement of Y and call it -Y:

-Y = 1110

III. Load the X value in the table.

IV. Load 0 for the X-1 value; X-1 holds the bit most recently shifted out of the least significant position of X.

V. Load 0 in the U and V columns, which will hold the product of X and Y at the end of the operation.

VI. Make rows for four cycles, because we are multiplying four-bit numbers.

U     V     X     X-1
0000  0000  1100  0     Load the value
                        1st cycle
                        2nd cycle
                        3rd cycle
                        4th cycle

Step 2: Booth Algorithm

The Booth algorithm requires examination of the multiplier bits and shifting of the partial product. Prior to the shifting, the multiplicand may be added to the partial product, subtracted from the partial product, or left unchanged according to the following rules:

I. Look at the least significant bit of the multiplier, X, and the previous least significant bit, X-1:

X  X-1  Action
0  0    Shift only
1  1    Shift only
0  1    Add Y to U, and shift
1  0    Subtract Y from U (or add -Y to U), and shift

II. Take U and V together and perform an arithmetic right shift, which preserves the sign bit of a 2's complement number. Thus a positive number remains positive, and a negative number remains negative.

III. Shift X with a circular right shift, because this avoids using two registers for the X value.

Repeat the same steps until the four cycles are completed.
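The three rules can be traced with a small register-level model. This is a sketch under stated assumptions: the function name `booth_radix2` and the use of Python integers as n-bit registers are illustrative choices, and the multiplicand is assumed not to be the most negative n-bit value, whose negation does not fit in U.

```python
def booth_radix2(x, y, n=4):
    """Multiply two n-bit two's-complement integers with radix-2 Booth steps.

    x is the multiplier, y the multiplicand; returns the signed 2n-bit product.
    """
    mask = (1 << n) - 1
    u, v = 0, 0                 # upper and lower halves of the product register
    xr = x & mask               # multiplier register X
    x_1 = 0                     # previous least significant bit, X-1
    for _ in range(n):
        x0 = xr & 1
        if (x0, x_1) == (0, 1):
            u = (u + y) & mask              # add Y to U
        elif (x0, x_1) == (1, 0):
            u = (u - y) & mask              # add -Y to U
        # Arithmetic right shift of U:V, keeping the sign bit of U.
        sign = u >> (n - 1)
        v = ((u & 1) << (n - 1)) | (v >> 1)
        u = (sign << (n - 1)) | (u >> 1)
        # Circular right shift of X; the bit shifted out becomes X-1.
        x_1 = x0
        xr = (x0 << (n - 1)) | (xr >> 1)
    product = (u << n) | v
    if product >> (2 * n - 1):              # reinterpret the 2n bits as signed
        product -= 1 << (2 * n)
    return product
```

Calling `booth_radix2(-4, 2)` follows the worked example: two shift-only cycles, one cycle that adds -Y and shifts, and a final shift-only cycle.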

U     V     X     X-1
0000  0000  1100  0     Load the value
0000  0000  0110  0     1st cycle: shift only
0000  0000  0011  0     2nd cycle: shift only
1110  0000  0011  0     3rd cycle: add -Y (0000 + 1110 = 1110)
1111  0000  1001  1                then shift
1111  1000  1100  1     4th cycle: shift only

We have finished four cycles, so the answer is shown in the last row of U and V, which is 11111000_two.

Note: By the fourth cycle, the two algorithms have the same values in the product register.

(b) Booth Multiplication Algorithm for radix-4:

One way of realizing high speed multipliers is to enhance parallelism, which helps to decrease the number of subsequent calculation stages. The original version of the Booth algorithm (Radix-2) had two drawbacks:

(i) The number of add/subtract operations and the number of shift operations are variable, which is inconvenient when designing parallel multipliers.

(ii) The algorithm becomes inefficient when there are isolated 1's.

These problems are overcome by using the modified Radix-4 Booth algorithm, which scans strings of three bits with the algorithm given below:

1) Extend the sign bit by one position if necessary to ensure that n is even.

2) Append a 0 to the right of the LSB of the multiplier.

3) According to the value of each three-bit vector, each partial product will be 0, +y, -y, +2y or -2y.

The negative values of y are formed by taking the 2's complement, and in this paper carry-look-ahead (CLA) fast adders are used. The multiplication of y by 2 is done by shifting y one bit to the left. Thus, in designing an n-bit parallel multiplier, only n/2 partial products are generated.
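The three recoding steps can be illustrated with a short sketch. It uses the standard modified Booth digit value, -2·x(i) + x(i-1) + x(i-2) for each overlapping three-bit group; the function names are hypothetical and the code models only the arithmetic, not the CLA adder hardware.

```python
def booth_radix4_digits(x, n):
    """Recode an n-bit two's-complement multiplier into digits in {-2, ..., +2}.

    Returns the digits least significant first; there are n/2 of them for even n.
    """
    if n % 2:                       # step 1: extend the sign bit so that n is even
        n += 1
    bits = [(x >> i) & 1 for i in range(n)]   # Python sign-extends negative ints
    bits = [0] + bits               # step 2: append a 0 to the right of the LSB
    digits = []
    for i in range(0, n, 2):
        low, mid, high = bits[i], bits[i + 1], bits[i + 2]
        digits.append(low + mid - 2 * high)   # step 3: 0, +1, -1, +2 or -2
    return digits


def booth_radix4_multiply(x, y, n):
    """Sum the n/2 partial products d * y, each weighted by a power of 4."""
    return sum(d * y * 4 ** k for k, d in enumerate(booth_radix4_digits(x, n)))
```

For the 4-bit multiplier 1100 (-4) only two digits are produced, so two partial products replace the four cycles of the radix-2 procedure.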

X(i)  X(i-1)  X(i-2)  Partial product
0     0       0       +0
0     0       1       +y
0     1       0       +y
0     1       1       +2y
1     0       0       -2y
1     0       1       -y
1     1       0       -y
1     1       1       +0

Table 1: Radix-4 modified Booth algorithm scheme for odd values of i.

4. RESULTS & CONCLUSION

This paper gives a clear picture of different multipliers and their implementation in a tap-delay FIR filter. We found that parallel multipliers are a much better option than serial multipliers. We concluded this from the results for power consumption and total area. The results for the radix-2 and radix-4 multipliers are shown in Table 2 and Table 3 respectively.

Number of Slices            130
Number of 4 input LUTs      249
Number of bonded inputs     16
Number of bonded outputs    17
CLB Logic Power             79 mW

Table 2: Results of Radix-2 multiplier

Number of Slices            229
Number of 4 input LUTs      300
Number of bonded inputs     16
Number of bonded outputs    16
CLB Logic Power             47 mW

Table 3: Results of Radix-4 multiplier

Figure: Multiplier output

In the case of parallel multipliers, the total area is much less than that of serial multipliers. Hence the power consumption is also less. This is clearly depicted in our results. It speeds up the calculation and makes the system faster. Comparing the radix-2 and radix-4 Booth multipliers, we found that radix-4 consumes less power than radix-2. This is because it uses almost half the number of iterations and adders. When all three multipliers were compared, we found that array multipliers consume the most power and have the largest area. This is because an array multiplier uses a large number of adders. As a result it slows down the system, because the system has to do a lot of calculation.

Multipliers are one of the most important components of many systems, so we always need to find a better multiplier solution. Multipliers should consume less power and cover less area. Through our project we tried to determine which of the three algorithms works best. In the end we determined that the radix-4 modified Booth algorithm works best.

REFERENCES

1. Y. C. Lim, “Single-Precision Multiplier with Reduced Circuit Complexity for Signal Processing Applications,” IEEE Trans. Computers, vol. 41, no. 10, pp. 1333-1336, Oct. 1992.

2. J. Isoaho, J. Pasanen, O. Vainio, and H. Tenhunen, “DSP System Integration and Prototyping with FPGAs,” Journal of VLSI Signal Processing, vol. 6, pp. 155-172, 1993.

3. S. S. Kidambi, F. El-Guibaly, and A. Antoniou, “Area-Efficient Multipliers for Digital Signal Processing Applications,” IEEE Trans. Circuits and Systems-II: Analog and Digital Signal Processing, vol. 43, no. 2, pp. 90-95, Feb. 1996.

4. J. E. Stine and O. M. Duverne, “Variations on Truncated Multiplication,” in Proc. Euromicro Symposium on Digital System Design, 2003, pp. 112-119.

5. C. Ebeling, C. Fisher, G. Xing, M. Shen, and H. Liu, “Implementing an OFDM Receiver on the RaPiD Reconfigurable Architecture,” IEEE Trans. on Computers, vol. 53, no. 11, pp. 1436-1448, 2004.

6. Xilinx Staff, “Celebrating 20 years of innovation,” Xcell Journal, no. 48, Spring 2004.

7. S. Knapp, “Using Programmable Logic to Accelerate DSP Functions,” http://www.xilinx.com/appnotes/dspintro.pdf


A Novel Approach to Design of a Multiplier Using

Reversible Logic Gates

S. S. Chauhan1, S. C. Yadav2, A. R. Khan3

Graphic Era University (E&CE Deptt.) 1, 2, 3

[email protected], [email protected], [email protected]

Abstract: Reversible logic gates are in great demand for future computing technologies, as they are known to produce zero power dissipation under ideal conditions. This paper proposes an improved design of a multiplier using reversible logic gates. Multipliers are essential for the construction of various computational units of a quantum computer. The quantum cost of a reversible logic circuit can be minimized by reducing the number of reversible logic gates. For this purpose, two 4*4 reversible logic gates, called the DPG gate and the BVF gate, are used.

Index Terms- Reversible logic circuits; Quantum computing;

Nanotechnology.

1. INTRODUCTION

Reversible logic has received great attention in recent years due to its ability to reduce power dissipation, which is the main requirement in low power VLSI design. Quantum computers are constructed using reversible logic circuits. Reversible logic has wide applications in low power CMOS and optical information processing, quantum computation and nanotechnology. R. Landauer [1] demonstrated that circuits and systems constructed using irreversible hardware dissipate energy through information loss: the loss of one bit of information dissipates kT ln 2 joules of energy, where k is Boltzmann's constant and T is the absolute temperature at which the operation is performed. The heat generated by the loss of one bit of information is very small at room temperature, but when the number of bits is large, as in high speed computation, the heat dissipated becomes large enough to affect performance and reduce the lifetime of the components. Furthermore, Bennett [2] showed that reversible circuits do not lose information, owing to the one-to-one mapping between inputs and outputs; hence there is no extra energy loss.
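To give a sense of scale for the kT ln 2 bound, the following sketch evaluates it numerically. The function name and the choice of 300 K as room temperature are assumptions made for illustration.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def landauer_limit(temperature_kelvin, bits=1):
    """Minimum energy in joules dissipated when `bits` bits of information are lost."""
    return bits * BOLTZMANN * temperature_kelvin * math.log(2)
```

At 300 K a single lost bit costs roughly 2.9e-21 J, which is negligible on its own; the total only becomes significant when it is multiplied by the very large number of bit operations per second in a high speed circuit.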

In the design of reversible circuits two restrictions should

be considered:

Fan-out is not permitted

Loops are not permitted

Due to these restrictions, synthesis of reversible circuits

can be carried out from the inputs towards the outputs and

vice versa.

2. BACKGROUND OF REVERSIBLE CIRCUITS

An n×n reversible circuit consists of n inputs and n

outputs with mapping of each input assignment to a unique

output assignment and vice versa. Also in the synthesis of

reversible circuits direct fan-out is not allowed as

one–to-many concept is not reversible. However fanout in

reversible circuits is achieved using additional gates. A

reversible circuit should be designed using minimum

number of reversible logic gates.

A. Reversible Gates and Circuits

There are two main types of reversible gates: Toffoli [3] and Fredkin [4]. An n×n Toffoli gate passes the first (n-1) inputs to the outputs unaltered (as control signals) and inverts the nth input (the target signal) on the last output if all of the first (n-1) signals are '1'. Taking x_i as the inputs and y_i as the outputs, then [3]:

$y_i = x_i, \quad 1 \le i \le n-1$

$y_n = x_n \oplus (x_1 x_2 \cdots x_{n-1})$

Toffoli Gate: A 3*3 Toffoli gate [3] is shown in figure 1. The input vector is I (A, B, C) and the output vector is O (P, Q, R). The outputs are defined by P=A, Q=B, R=AB xor C. The quantum cost of a Toffoli gate is 5.

A Toffoli gate with one input is also known as a NOT gate, and with two inputs as a CNOT or Feynman gate.

Fredkin Gate: A 3*3 Fredkin gate [4] is shown in figure 2. The input vector is I (A, B, C) and the output vector is O (P, Q, R). The outputs are defined by P=A, Q=A′B xor AC and R=A′C xor AB. The quantum cost of a Fredkin gate is 5.

Fig.1 Toffoli Gate

BVF Gate: A 4*4 BVF gate is shown in figure 3. This is a reversible double XOR gate and can be used to duplicate the required inputs to meet the fan-out requirements. The input vector is I (A, B, C, D), the output vector is O (P, Q, R, S) and the outputs are defined by P = A, Q = A xor B, R = C and S = C xor D. The quantum cost of a BVF gate is 2. In the proposed design this gate is used to copy the operand bits, and it is shown that the number of gates required for copying is reduced by 50% with the same quantum cost.

Peres Gate: A 3*3 Peres gate [10] is shown in figure 4. The input vector is I (A, B, C) and the output vector is O (P, Q, R). The outputs are defined by P = A, Q = A xor B and R = AB xor C. The quantum cost of a Peres gate is 4. In the proposed design the Peres gate is used because it has the lowest quantum cost.
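The output equations of the Toffoli, Fredkin, BVF and Peres gates can be checked for reversibility by confirming that every input pattern maps to a distinct output pattern. The sketch below is a behavioural model only; the helper `is_reversible` is a name introduced here for illustration.

```python
from itertools import product


def toffoli(a, b, c):
    return a, b, (a & b) ^ c


def fredkin(a, b, c):
    return a, ((1 - a) & b) ^ (a & c), ((1 - a) & c) ^ (a & b)


def bvf(a, b, c, d):
    return a, a ^ b, c, c ^ d


def peres(a, b, c):
    return a, a ^ b, (a & b) ^ c


def is_reversible(gate, width):
    """A gate is reversible when distinct inputs always give distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=width)}
    return len(outputs) == 2 ** width
```

Setting the constant inputs B = D = 0 on the BVF gate yields Q = A and S = C, which is how one gate duplicates two operand bits. With C = 0, the Peres gate outputs Q and R are the sum and carry of a half adder.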

Double Peres Gate: A Double Peres Gate (DPG) is shown in figure 5. The inputs and outputs are as shown in Table-1. A full adder is obtained from the DPG with C=0 and D=Cin, and its quantum cost is calculated to be 6 from its quantum realization [11] shown in figure 5.

Inputs Outputs

A B C D P Q R S

0 0 0 0 0 0 0 0

0 0 0 1 0 0 1 0

0 0 1 0 0 0 0 1

0 0 1 1 0 0 1 1

0 1 0 0 0 1 1 0

0 1 0 1 0 1 0 1

0 1 1 0 0 1 1 1

0 1 1 1 0 1 0 0

1 0 0 0 1 1 1 0

1 0 0 1 1 1 0 1

1 0 1 0 1 1 1 1

1 0 1 1 1 1 0 0

1 1 0 0 1 0 0 1

1 1 0 1 1 0 1 1

1 1 1 0 1 0 0 0

1 1 1 1 1 0 1 0
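The truth table can be reproduced from a set of output equations, which also makes the full-adder configuration easy to verify. The equations below are not stated in the text; they are inferred from the table rows and should be read as an assumption: P = A, Q = A xor B, R = A xor B xor D, S = (A xor B)D xor AB xor C.

```python
from itertools import product


def dpg(a, b, c, d):
    """Double Peres gate; output equations inferred from the truth table (assumption)."""
    q = a ^ b
    return a, q, q ^ d, (q & d) ^ (a & b) ^ c


def full_adder(a, b, cin):
    """Full adder obtained with C = 0 and D = Cin: R is the sum, S the carry."""
    _, _, s, carry = dpg(a, b, 0, cin)
    return s, carry


def is_bijective(gate, width):
    """Check that all input patterns produce distinct output patterns."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=width)}
    return len(outputs) == 2 ** width
```

For example, `dpg(0, 1, 0, 1)` gives `(0, 1, 0, 1)`, matching the corresponding table row, and `full_adder(1, 1, 1)` returns a sum of 1 with a carry of 1.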

B. Reversible Gates Implemented Using Elementary Quantum Gates

Reversible implementations of 3×3 Toffoli, Peres and Fredkin gates using elementary quantum gates are shown in figure 6, figure 7, and figure 8 respectively.

3. PARALLEL MULTIPLIERS

There are two types of multipliers, known as sequential and parallel multipliers. The first type computes the final product iteratively. It needs feedback and loops to implement the iterative portion. This design is too slow and, because loops are not permitted, not suitable for reversible implementation. The second type, the parallel multiplier, conventionally consists of two main steps:

Partial product generation

Multi-operand addition

Fig.2 Fredkin Gate

Fig.3 BVF Gate

Fig.4 Peres Gate

Fig.5 Double Peres Gate

Fig.6 Implementation of the 3×3 Toffoli gate [11]

Fig.7 Implementation of the 3×3 Peres gate [12]

Fig.8 Implementation of the 3×3 Fredkin gate [11, 13]

Table 1 Truth Table of Double Peres Gate

Algorithm 1 (The n×n parallel multiplier):

Inputs: Two n-bit operands X: $x_{n-1} \ldots x_1 x_0$ and Y: $y_{n-1} \ldots y_1 y_0$

Output: A 2n-bit product Z: $z_{2n-1} \ldots z_1 z_0$

I. Generate n partial products $P_i$: $p_{i,n-1} \ldots p_{i,1}\, p_{i,0}$, where $0 \le i \le n-1$, such that $p_{ij} = x_j \cdot y_i$

II. Produce the final product $Z = \sum_{i=0}^{n-1} P_i \, 2^{i}$
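Algorithm 1 can be sketched directly in software to confirm that the two steps reproduce ordinary multiplication. The function names are illustrative, and the sketch models only the arithmetic on unsigned operands, not the reversible gate network.

```python
def partial_products(x, y, n):
    """Step I: return n rows P_i, each a list of bits p_ij = x_j AND y_i (j = 0 first)."""
    return [[((x >> j) & 1) & ((y >> i) & 1) for j in range(n)] for i in range(n)]


def multiply(x, y, n):
    """Step II: multi-operand addition, where bit p_ij carries weight 2**(i + j)."""
    rows = partial_products(x, y, n)
    return sum(bit << (i + j)
               for i, row in enumerate(rows)
               for j, bit in enumerate(row))
```

For n = 4 the first step yields 16 partial product bits, and the sum always fits in 2n = 8 bits.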

The operation of a 4*4 reversible multiplier is shown in figure 9. It uses 16 partial product bits of the X and Y inputs to perform a 4*4 multiplication. However, it can be extended to any other n*n reversible multiplier.

3.1 Partial Product Generation

Partial products can be generated in parallel using 16 Peres gates as shown in figure 10.

An important point is that, to generate partial products in parallel in an n×n reversible multiplier, n copies of each bit of the operands are needed. Therefore, some fan-out gates are needed. The number of fan-out gates needed for the reversible 4×4 multiplier is 24. The design uses 4*4 BVF gates with two constant inputs, as shown in figure 11.

3.2 Multi-operand Addition (MOA)

As discussed in the previous section, the next step is an n-operand addition. To implement this part of the circuit, we use a carry save adder (CSA). The CSA tree reduces the four operands to two. Thereafter, a Carry Propagating Adder (CPA) adds these two operands and produces the final 8-bit product. The proposed four-operand adder shown in figure 12 uses the Double Peres Gate (DPG) as a reversible full adder and the Peres gate as a half adder.

The proposed reversible multiplier circuit uses 8 reversible DPG gates and 4 Peres gates. The Peres gate half adder has a quantum cost of 4 and the DPG adder has a quantum cost of 6, so the total quantum cost of this circuit is 8×6 + 4×4 = 64.

4. RESULTS & DISCUSSION

We found three different designs for reversible multipliers in the literature, all of which, for the sake of simplicity, implement a 4-bit multiplier. Therefore, in this section we compare our proposed multiplier with its prior counterparts based on the 4-bit reversible multiplier. In order to have a reasonable comparison, we first examine the detailed implementation of the previous works. Next, we compare the proposed design with the previously mentioned cases based on the quantum cost and the number of garbage outputs, as follows:

                    x3   x2   x1   x0
               ×    y3   y2   y1   y0
                    p03  p02  p01  p00     Partial Product Generation
               p13  p12  p11  p10
          p23  p22  p21  p20
     p33  p32  p31  p30
z7   z6   z5   z4   z3   z2   z1   z0      Multi-Operand Addition

Fig.9 The operation of the 4×4 parallel multiplier

Fig.10 Partial product generator using Peres gates

Fig.11 Fan-out circuit to duplicate the operand bits

Fig.12 Four-operand Addition

A. Reversible 4-bit multiplier of [10]

For the partial product generation phase of their multiplier, they used 24 2×2 Toffoli (TOF2) gates to prepare the essential fan-outs. Moreover, 16 Fredkin gates are used to generate the partial products. For the multi-operand addition phase they used three 4-bit binary adders, each composed of 4 TSG gates, plus an extra TSG for the generation of the most significant bit of the final product.

Overall, the gate consumption of their reversible multiplier is (24×TOF2) + (16×Fredkin) + (13×TSG). The critical path of their multiplier consists of two TOF2 gates, a Fredkin gate, and seven TSG gates. Unfortunately, there is no reference for how the TSG can be implemented. Moreover, nothing is mentioned in [14] about how a TSG can be built from elementary 2×2 reversible/quantum gates. For the sake of a fair comparison we assume the QC and GO of a TSG gate to be equal to those of a full adder (FA). Nevertheless, we believe that the QC and GO of a TSG gate are much higher than those of an FA.

B. Reversible 4-bit multiplier of [11]

For the partial product generation phase of their multiplier, like that of [10], they used 24 TOF2 gates to prepare the necessary fan-outs. Moreover, 16 Peres gates are used to generate the partial products.

For the multi-operand addition phase they used 12 MKG gates, where an MKG gate is a 4×4 reversible gate. Therefore, the overall gate count of their reversible multiplier is (24×TOF2) + (16×Peres) + (12×MKG). The critical path of their multiplier consists of two TOF2 gates, a Peres gate, and seven MKG gates. As is the case for the TSG, there is no reference for the implementation of the MKG. Therefore, although we believe that the QC, depth, and GO of an MKG gate are much higher than those of an FA, we assume, for the sake of a fair comparison, that the QC, depth, and GO of an MKG gate are the same as those of a full adder.

C. Reversible 4-bit multiplier of [12]

This multiplier and that of [11] are largely the same, except for the multi-operand addition phase, which is implemented in [12] by means of 8 HNG gates along with four Peres gates. This modification leads to the following critical path: (2×TOF2) + (2×Peres) + (6×HNG).

D. The proposed reversible 4-bit multiplier

In the proposed design, for the partial product generation phase, like those of [11] and [12], we take advantage of Peres gates to generate the partial products. For the multi-operand addition phase, as shown in Fig. 15, we use 8 full adders and 4 half adders. The critical path of this new design consists of two TOF2 gates plus a Peres gate for the partial product generation phase, and 5 full adders plus a half adder for the multi-operand addition phase. Table-2 gives the comparative study of the partial product generation of the circuit.

Partial product generation   No. of gates (N)   No. of garbage outputs (GO)   Quantum cost (QC)
Proposed                     20                 32                            88
TSG [10]                     40                 32                            104
MKG [11]                     40                 32                            88
HNG [12]                     40                 32                            88

Table-3 gives the comparative study of multi-operand

addition of the proposed design with other existing

designs.

Multi-operand addition (MOA)   No. of gates (N)   No. of garbage outputs (GO)   Quantum cost (QC)
Proposed                       12                 20                            62
TSG [10]                       13                 26                            130
MKG [11]                       12                 24                            120
HNG [12]                       12                 20                            64

The comparative study of the different reversible multipliers is shown in Table-4.

Reversible multipliers   No. of gates (N)   No. of garbage outputs (GO)   Quantum cost (QC)
Proposed                 40                 50                            150
TSG [10]                 13                 26                            130
MKG [11]                 52                 56                            208
HNG [12]                 53                 58                            234

From the above study in our opinion the proposed design

is better when compared to the other existing designs as

the total circuit cost is much less compared to the other

designs.

5. CONCLUSION

The multiplier is a basic arithmetic cell in computer arithmetic units. Furthermore, a reversible implementation of this unit is necessary for quantum computers. For this purpose, various designs can be found in the literature. In this paper we proposed a novel reversible multiplier with no increase in quantum cost or in the number of garbage outputs with respect to its previous counterparts. In the proposed design, partial products were generated using Peres gates. Next, the final product was obtained using a multi-operand adder consisting of a CSA tree and carry propagate addition.

TABLE.2 Partial product generation

TABLE.3 Multi-operand addition (MOA)

TABLE.4 Comparative study of different reversible multipliers

REFERENCES

[1] R. Landauer, "Irreversibility and heat generation in the computing process", IBM J. Res. Develop., Vol. 5, pp. 183-191, July 1961.

[2] C. H. Bennett, "Logical Reversibility of Computation", IBM J. Res. Develop., pp. 525-532, November 1973.

[3] T. Toffoli, "Reversible computing", MIT, Tech. Rep., 1980.

[4] E. Fredkin and T. Toffoli, "Conservative logic", Int'l J. Theoretical Physics, Vol. 21, pp. 219-253, 1982.

[5] A. Peres, "Reversible logic and quantum computers", Physical Review A, Vol. 32, pp. 3266-3276, 1985.

[6] J. A. Smolin and D. P. DiVincenzo, "Five Two-Bit Quantum Gates are Sufficient to Implement the Quantum Fredkin Gate", Physical Review A (Atomic, Molecular, and Optical Physics), Vol. 53, No. 4, pp. 2855-2856, April 1996.

[7] D. Maslov, G. W. Dueck and D. M. Miller, "Simplification of Toffoli Networks via Templates", Proc. 16th Symposium on Integrated Circuits and Systems Design, pp. 53-58, September 2003.

[8] W. N. N. Hung, X. Song, G. Yang, J. Yang and M. A. Perkowski, "Quantum Logic Synthesis by Symbolic Reachability Analysis", Proc. 41st Annual Conference on Design Automation (DAC), pp. 838-841, January 2004.

[9] D. Maslov, C. Young, D. M. Miller, and G. W. Dueck, "Quantum Circuit Simplification Using Templates", Proc. Design Automation and Test in Europe (DATE), Vol. 2, pp. 1208-1213, March 2005.

[10] H. Thapliyal and M. B. Srinivas, "Novel Reversible Multiplier Architecture Using Reversible TSG Gate", Proc. IEEE International Conference on Computer Systems and Applications, pp. 100-103, March 2006.

[11] M. Shams, M. Haghparast and K. Navi, "Novel Reversible Multiplier Circuit in Nanotechnology", World Applied Science Journal, Vol. 3, No. 5, pp. 806-810, 2008.

[12] M. Haghparast, S. Jafarali Jassbi, K. Navi and O. Hashemipour, "Design of a Novel Reversible Multiplier Circuit Using HNG Gate in Nanotechnology", World Applied Science Journal, Vol. 3, No. 6, pp. 974-978, 2008.

[13] M. S. Islam et al., "Low cost quantum realization of reversible multiplier circuit", Information Technology Journal, 8 (2009) 208.


VHDL environment for floating point Arithmetic Logic Unit - ALU design and simulation

1Rajit Ram Singh 2Vinay Kumar Singh 3poornima shrivastav 4Dr. GS [email protected] VINDHYAIndore- India

[email protected] Motors Ltd. Luck now -India

[email protected] Gwalior -India

ABSTRACT

A VHDL environment for floating point arithmetic logic unit (ALU) design using pipelining is introduced; pipelining is the novelty in the ALU design. Pipelining provides a high performance ALU by executing multiple instructions simultaneously. In the top-down design approach, four arithmetic modules (addition, subtraction, multiplication and division) are combined to form a floating point ALU. Each module is divided into sub-modules. Two selection bits are combined to select a particular operation. Each module is independent of the others. All modules in the ALU design are realized using VHDL, and the design functionality is validated through VHDL simulation. All components and modules were successfully synthesized and simulated in the Xilinx 12.1i software.

Keywords: ALU (Arithmetic Logic Unit), Top-Down design, Validation, Floating point, Test-Vector

I. INTRODUCTION

Floating point describes a system for representing numbers that would be too large or too small to be represented as integers. Floating point representation retains its resolution and accuracy over a wider range than fixed point representation. Numbers are in general represented approximately to a fixed number of significant digits and scaled using an exponent. The base for the scaling is normally 2, 10 or 16. The typical number that can be represented exactly is of the form:

Significant digits × Base^exponent, i.e. $S \times B^{e}$

The IEEE 754 standard for floating point representation was published in 1985. Under this standard, the floating point representation for digital systems is platform-independent, so data can be interchanged freely among different digital systems.
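The S × B^e form can be made concrete for a 16-bit word. The sketch below assumes the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits with bias 15, 10 fraction bits), which the text does not spell out; the function names are illustrative. It relies on the `e` format code of Python's `struct` module.

```python
import struct


def decode_binary16(value):
    """Split a number into the sign, biased exponent and fraction fields of binary16."""
    (raw,) = struct.unpack("<H", struct.pack("<e", value))
    sign = raw >> 15
    exponent = (raw >> 10) & 0x1F      # 5-bit biased exponent (bias 15)
    fraction = raw & 0x3FF             # 10 stored significand bits
    return sign, exponent, fraction


def rebuild(sign, exponent, fraction):
    """Recompute S x B^e for a normalized number (implied leading 1, base 2)."""
    significand = 1 + fraction / 1024
    return (-1) ** sign * significand * 2.0 ** (exponent - 15)
```

For instance, -6.5 is stored as sign 1, biased exponent 17 and fraction 640, which corresponds to -1.625 × 2^2.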

An arithmetic logic unit (ALU) is a digital circuit that performs arithmetic and logical operations. The ALU is a fundamental building block of the central processing unit (CPU) of a computer. The inputs to the ALU are the data to be operated on (called operands) and a code from the control unit indicating which operation to perform. Its output is the result of the computation.

In many designs the ALU also takes or generates as inputs or outputs a set of condition codes from or to a status register. These codes are used to indicate cases such as carry-in or carry-out, overflow, divide-by-zero, etc. A floating point unit (FPU) also performs arithmetic operations between two values, but it does so for numbers in floating point representation. An ALU with floating point operations is called an FPU.

The top-down approach (also known as step-wise design) is essentially the breaking down of a system to gain insight into its compositional sub-systems. In a top-down approach an overview of the system is formulated, specifying but not detailing any first-level subsystems. Each subsystem is then refined in yet greater detail, sometimes in many additional subsystem levels, until the entire specification is reduced to base elements. A top-down model is often specified with the assistance of "black boxes", which make it easier to manipulate. However, black boxes may fail to elucidate elementary mechanisms or be detailed enough to realistically validate the model.

In order to stimulate a device off board, a series of logical vectors must be applied to the device inputs. These vectors are called test vectors and are mostly used to stimulate the design inputs and check the outputs against the expected values.

A pipeline is a technique used in the design of computers and other digital electronic devices to increase their instruction throughput (the number of instructions that can be executed in a unit of time).

The fundamental idea is to split the processing of a computer instruction into a series of independent steps, with storage at the end of each step. This allows the computer's control circuitry to issue instructions at the processing rate of the slowest step, which is much faster than the time needed to perform all steps at once. The term pipeline refers to the fact that each step is carrying data at once (like water), and each step is connected to the next (like the links of a pipe). The origin of pipelining is thought to be the IBM Stretch project (1954). Implementing a pipeline requires the various phases of floating point operations to be separated and pipelined into sequential stages.

We propose a VHDL environment for floating point ALU design and simulation, to ease description, verification, simulation and hardware realization. VHDL is a widely adopted standard and has numerous capabilities that are suited to designs of this sort. The use of VHDL for modeling is especially appealing since it provides a formal description of the system and allows the use of specific description styles to cover the different abstraction levels (architectural, register transfer and logic level) employed in the design.
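The throughput argument for pipelining can be checked with a simple timing model. This is a sketch under idealized assumptions (no stalls, no register overhead); the stage delays used in the example are hypothetical values, not measurements from the design.

```python
def unpipelined_time(stage_delays, n_ops):
    """Each operation passes through every stage before the next one starts."""
    return n_ops * sum(stage_delays)


def pipelined_time(stage_delays, n_ops):
    """The clock period is set by the slowest stage; once the pipeline is full,
    one result completes per cycle."""
    cycle = max(stage_delays)
    return (len(stage_delays) + n_ops - 1) * cycle
```

With four stages of 2, 3, 4 and 3 time units and 100 operations, the unpipelined model needs 1200 units while the pipelined one needs 412, because operations are issued at the rate of the slowest stage.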

II. MATERIAL AND METHODS

The main objective of this paper is to describe the implementation of pipelining in the design of a floating point ALU using VHDL. The first sub-objective is to design a 16-bit floating point ALU operating on IEEE 754 standard floating point representations and supporting the four basic arithmetic operations: addition, subtraction, multiplication and division. The second sub-objective is to model the behavior of the ALU design using VHDL.

Specifications for the 16-bit floating point ALU design:

i. Inputs A and B and the output result are 16-bit binary floating point numbers.

ii. Operands A and B operate as follows: A (operation) B = result. The operation can be addition (+), subtraction (-), multiplication (*) or division (/).

iii. 'Selection' is a 2-bit input signal that selects the ALU operation, as shown in Table 1.

iv. 'Status' is a 4-bit output signal that works as a flag, as in a microprocessor.

v. The clock pulse is only provided to the module which is selected using the demux.

vi. Concurrent processes are used to allow processes to run in parallel, hence pipelining.

Selection   Operation
00          Addition
01          Subtraction
10          Multiplication
11          Division

Table1: select ALU operation.

Output status   Meaning
0000            Normal operation
0001            Overflow
0010            Underflow
0100            Result zero
1000            Divide by zero

Fig:1 top level view of the ALU design

The ALU is separated into smaller modules: addition, subtraction, multiplication, division, demux and mux. Each arithmetic module is further divided into smaller modules. Fig. 1 shows the top level view of the ALU. It consists of four functional arithmetic modules, three demultiplexers and two multiplexers. The demuxes and muxes are used to route the input operands and the clock signal to the correct functional modules. They also route outputs and status signals based on the selector pins.
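The selection and status codes can be modelled behaviourally as follows. This is an illustrative sketch, not the VHDL design: selection code 01 is taken to be subtraction, the overflow and underflow thresholds assume the IEEE 754 binary16 range, and the names `alu`, `STATUS`, `MAX_MAG` and `MIN_MAG` are introduced here.

```python
STATUS = {"normal": 0b0000, "overflow": 0b0001, "underflow": 0b0010,
          "zero": 0b0100, "div_by_zero": 0b1000}
MAX_MAG = 65504.0        # largest finite binary16 value (assumed format)
MIN_MAG = 2.0 ** -24     # smallest positive binary16 value (assumed format)


def alu(a, b, selection):
    """Return (result, status) for the 2-bit selection code."""
    if selection == 0b11 and b == 0:
        return None, STATUS["div_by_zero"]
    operations = {
        0b00: lambda: a + b,     # addition
        0b01: lambda: a - b,     # subtraction
        0b10: lambda: a * b,     # multiplication
        0b11: lambda: a / b,     # division
    }
    result = operations[selection]()   # only the selected operation is evaluated
    if result == 0:
        return result, STATUS["zero"]
    if abs(result) > MAX_MAG:
        return result, STATUS["overflow"]
    if abs(result) < MIN_MAG:
        return result, STATUS["underflow"]
    return result, STATUS["normal"]
```

Evaluating only the selected operation loosely mirrors the clock gating in the design, where only the chosen functional module is active.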


Fig: 2 view of selection of a add module

After a module completes its task, its outputs and status signals are sent to the muxes, where they are multiplexed with the outputs from the other modules to produce the output result. The selector pins are routed to these muxes so that only the output from the currently operating functional module is sent to the output port. The clock is specifically routed, rather than tied permanently to each module, since only the selected functional module needs a clock signal. This saves power, since the clock is supplied to the required module only, and avoids invalid results at the output, since the clock is used as a trigger in every process.

Pipelining floating point addition module:

The addition module has two 16-bit inputs and one 16-bit output. The selection input is used to enable or disable the module. This module is further divided into four sub-modules: zero check, align, add_sub and normalize.

Fig: 3 pipeline floating point addition

Zero check module:

This module detects zero operands early in the operation and sets two status signals based on the detection result. This eliminates the need for subsequent processes to check for the presence of zero operands. Table 1 summarizes the algorithm.

I/P a   I/P b   Zero_a1   Zero_b1
0       0       1         1
0       NZ      1         0
NZ      0       0         1
NZ      NZ      0         0

Tab: 1 setting zero check bit

Align module:

In this module operations are performed based on the status signals from the previous stage. Zero operands are checked in the align module as well. This module introduces the implied bit into the operands, as shown in Table 2.

Zero_a1 xor zero_b1   a_sign           Implied bit for a   Implied bit for b
0                     X (don't care)   0                   0
1                     1                0                   1
1                     0                1                   0

Tab: 2 setting of implied bit

Add_sub module:

This module performs the actual addition and subtraction of the operands. First the operands are checked via the status signals. Results are obtained automatically if either operand is zero, as shown in Table 3, and no normalization is needed if no calculation is done. Otherwise the operation is chosen based on the signs and the relative magnitudes of the mantissas, as summarized in Table 4. A status signal is set to one to indicate the need for normalization by the next stage.

Zero_a2 & zero_b2   Zero_a1 xor zero_b1   Zero_a2   Result
0                   0                     X         Perform add_sub
0                   1                     1         b stage2
0                   1                     0         a stage2
1                   X                     X         0

Tab: 3 check for add_sub module

Operation    a_sign xor b_sign   a>b   Result   Sign
a + b        0                   X     a+b      +ve
(-a)+(-b)    0                   X     a+b      -ve
a+(-b)       1                   Yes   a-b      +ve
a+(-b)       1                   No    b-a      -ve
(-a)+b       1                   Yes   a-b      -ve
(-a)+b       1                   No    b-a      +ve

Tab: 4 add_sub operation
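The add_sub decision table can be read as a short sign-magnitude rule. The following Python sketch is only an illustration of that rule, not the VHDL implementation; the function name `add_sub` and the encoding of the sign as 0 for +ve and 1 for -ve are assumptions made for the example.

```python
def add_sub(a_sign, a_mag, b_sign, b_mag):
    """Add two sign-magnitude operands following the add_sub table.

    Signs are encoded as 0 (+ve) or 1 (-ve); magnitudes are non-negative
    integers standing in for the aligned mantissas.
    Returns (result_sign, result_magnitude).
    """
    if a_sign ^ b_sign == 0:
        # Same signs: magnitudes add, the common sign is kept.
        return a_sign, a_mag + b_mag
    if a_mag > b_mag:
        # Different signs, a dominates: result a-b with the sign of a.
        return a_sign, a_mag - b_mag
    # Different signs, b dominates (or equal): result b-a with the sign of b.
    return b_sign, b_mag - a_mag
```

For instance, `add_sub(1, 5, 0, 3)` models (-5)+3 and returns `(1, 2)`, which matches the row "(-a)+b, a>b: a-b, -ve".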


Normalize module

The input is normalized and packed into the IEEE 754 floating point representation. If the normalize status signal is set, normalization is performed; otherwise only the MSB is dropped.

Pipelined floating point subtraction module:

The subtraction module has two 16-bit inputs and one 16-bit output. A selection input is used to enable or disable the entity depending on the operation. This module is further divided into four sub-modules: zero check, align, add_sub and normalize. The subtraction algorithm differs only in the add_sub module, where the subtraction operator changes the sign of the result. The remaining three modules are similar to those in the addition module. Tables 5 and 6 summarize the operation.

Tab: 5 checks for add_sub module

Operation    a_sign xor b_sign   a>b   Result   Sign
(-a)-b       1                   X     a+b      -ve
a-(-b)       1                   X     a+b      +ve
(-a)-(-b)    0                   Yes   a-b      -ve
(-a)-(-b)    0                   No    b-a      +ve
a-b          1                   Yes   a-b      +ve
a-b          1                   No    b-a      -ve

Tab: 6 add_sub operation and sign fixing

Pipelined floating point multiplication module

The multiplication entity has three 16-bit inputs and two 16-bit outputs. A selection input is used to enable or disable the entity. The multiplication module is divided into check-zero, check-sign, add-exponent and normalize-and-concatenate modules, all of which are executed concurrently. Status signals indicate special result cases such as overflow, underflow and result zero. In this project the pipelined floating point multiplication is divided into three stages (Fig. 4). Stage 1 checks whether an operand is zero and reports the result accordingly. Stage 2 determines the product sign, adds the exponents and multiplies the fractions. Stage 3 normalizes and concatenates the product.

Fig 4. Pipeline structure of multiplication module

Check-zero module

Initially the two operands are checked to determine whether either of them is zero. If one of the operands is zero, the zero_flag is set to 1 and the output result is zero. If neither of them is zero, the inputs in IEEE 754 format are unpacked and assigned to the check sign, add exponent and multiply mantissa modules. The mantissa is packed with the hidden bit 1.

Add exponent module

The module is activated if the zero flag is set; otherwise zero is passed to the next stage and exp_flag is set to 0. Two extra bits are added to the exponent to indicate overflow and underflow.

Multiply mantissa module

In this stage the zero_flag is checked first. If the zero_flag is set to 0, no calculation or normalization is performed and the mant_flag is set to 0. If both operands are nonzero, the multiplication is done and the mant_flag is set to 1 to indicate that this operation has been executed.

Check sign module

This module determines the product sign of the two operands. The product is positive when the two operands have the same sign; otherwise it is negative. The sign bits are compared using an XOR circuit. The sign_flag is set to 1.

Normalize and concatenate module

This module checks for overflow and underflow. Underflow occurs if the 9th bit is 1. Overflow occurs if the 8th bit is 1. If exp_flag, sign_flag and mant_flag are set, normalization is carried out. Otherwise, 16 zero bits are assigned to the result.

Checks for the add_sub module of the subtraction unit (data of Tab: 5):

Zero_a2 & zero_b2   Zero_a2 xor zero_b2   Zero_a2   b_sign   Result            Sign
0                   0                     X         X        Perform add_sub   NA
0                   1                     1         0        b_stage2          b_sign=1
0                   1                     1         1        b_stage2          b_sign=0
0                   1                     0         X        a_stage2          a_sign
1                   X                     X         X        0                 NA


During the normalization operation, if the mantissa MSB is already 1, no normalization is needed: the hidden bit is dropped and the remaining bits are packed and assigned to the output port. Otherwise the normalization module brings the mantissa MSB to 1. The current mantissa is shifted left until a 1 is encountered, and for each shift the exponent is decreased by 1. Once the mantissa MSB is 1, normalization is complete and the first bit, which is the implied bit, is dropped. The remaining bits are packed and assigned to the output port. The final normalized product with the correct biased exponent is concatenated with the product sign.
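The shift-and-decrement loop described above can be sketched as follows. This is a behavioural illustration in Python rather than the VHDL of the design; the function name `normalize`, the `width` parameter and the handling of an all-zero mantissa (returned unchanged) are assumptions made for the example, since the paper handles zero results through separate flags.

```python
def normalize(mantissa, exponent, width):
    """Left-shift a width-bit mantissa until its MSB is 1.

    The exponent is decreased by 1 for every shift. The MSB (the implied
    bit) is then dropped and the remaining width-1 bits are returned
    together with the adjusted exponent.
    """
    if mantissa == 0:
        # Assumption: a zero mantissa cannot be normalized, pass it through.
        return 0, exponent
    msb = 1 << (width - 1)
    mask = (1 << width) - 1
    while not (mantissa & msb):
        mantissa = (mantissa << 1) & mask
        exponent -= 1
    # Drop the implied bit, keep the fraction bits.
    return mantissa & (msb - 1), exponent
```

For an 8-bit mantissa `0b00101100` with exponent 10, two shifts are needed, so the sketch returns the fraction `0b0110000` and exponent 8.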

Pipelined floating point division module

The division entity has three 16-bit inputs and two 16-bit outputs. A selection input is used to enable or disable the entity. The division module is divided into six modules: check zero, align dividend, check sign, subtract exponent, divide mantissa and normalize-and-concatenate. Each module is executed concurrently. Status signals indicate special cases such as overflow, underflow, result zero and divide by zero. Fig. 5 shows the pipeline structure of the division module.

Fig: 5 pipeline structure of the division module

Check-zero module:

Initially the two operands are checked to determine whether either of them is zero. If one of the operands is zero, the zero_flag is set to 1 and the output result is zero. If neither of them is zero, the inputs in IEEE 754 format are unpacked and assigned to the check sign, add exponent and multiply mantissa modules. The mantissa is packed with the hidden bit 1.

Add exponent module:

The module is activated if the zero flag is set; otherwise zero is passed to the next stage and exp_flag is set to 0. Two extra bits are added to the exponent to indicate overflow and underflow.

Multiply mantissa module:

In this stage the zero_flag is checked first. If the zero_flag is set to 0, no calculation or normalization is performed and the mant_flag is set to 0. If both operands are nonzero, the multiplication is done and the mant_flag is set to 1 to indicate that this operation has been executed.

Check sign module:

This module determines the product sign of the two operands. The product is positive when the two operands have the same sign; otherwise it is negative. The sign bits are compared using an XOR circuit. The sign_flag is set to 1.

Align dividend module:

This module compares both mantissas. If mant_a is greater than or equal to mant_b, then mant_a must be aligned. For every one-bit right shift of mant_a, the exponent of mant_a is increased by 1. This increase may result in an exponent overflow, in which case an overflow flag is set. Otherwise, the process continues with the parallel operation of exponent subtraction and mantissa division. Align_flag is set to 1.

Subtract exponent module:

This module is activated if the zero flag is set. If not, a zero value is passed to the next stage and exp_flag is set to 0. Two extra bits are added to the exponent to indicate overflow. Here the two exponents are subtracted and the bias is added back. After this the exp_flag is set to 1.

Divide mantissa module:

In this stage the align flag is checked first. If the align flag is 0, no mantissa division is performed and mant_flag is set to 0. If both operands are nonzero, mant_a is divided by mant_b. In the division algorithm, the comparison between the two mantissas is done by subtracting the two values and checking the sign of the output.
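The compare-by-subtraction step and the exponent handling can be sketched as below. This is a minimal Python illustration under stated assumptions, not the authors' VHDL: the function names `subtract_exponent` and `divide_mantissa` are hypothetical, the bias is left as a parameter because its value is not given here, and the division assumes that the align stage has already made mant_a smaller than mant_b.

```python
def subtract_exponent(exp_a, exp_b, bias):
    """Exponent of a quotient: subtract the biased exponents, add the bias back."""
    return exp_a - exp_b + bias


def divide_mantissa(mant_a, mant_b, bits):
    """Shift-and-subtract division producing `bits` quotient bits.

    Each step compares the shifted remainder with mant_b by subtracting
    and checking the sign of the difference. Assumes 0 <= mant_a < mant_b,
    so the result is floor(mant_a * 2**bits / mant_b).
    """
    quotient = 0
    remainder = mant_a
    for _ in range(bits):
        remainder <<= 1
        diff = remainder - mant_b      # comparison done by subtraction
        quotient <<= 1
        if diff >= 0:                  # non-negative difference: quotient bit is 1
            remainder = diff
            quotient |= 1
    return quotient
```

As a usage example, `divide_mantissa(3, 4, 8)` returns 192, i.e. 0.75 expressed with eight fraction bits.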

III. SIMULATION AND DISCUSSION

The design is verified through simulation, which is done in a bottom-up fashion. Small modules are simulated in separate test benches before they are integrated and tested as a whole.

Align RTL1:


Simulation Result of Align:

RTL of Demux:

Demux wave:


Multiplexer:

Simulation result of Mux:

RTL of division:

RTL division:

IV. CONCLUSION

By simulating with various test vectors, the proposed pipelined floating point ALU design using VHDL has been successfully designed, tested and implemented. Currently, we are conducting further research that considers a further reduction of the hardware complexity in terms of synthesis, and the full download of the code into an Altera FLEX10K (EPF10K10LC) FPGA chip in an LC84 package for hardware realization.

Reference:

[1] ANSI/IEEE Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic, IEEE, New York, 1985.

[2] M. Daumas, C. Finot, "Division of Floating Point Expansions with an Application to the Computation of a Determinant", Journal of Universal Computer Science, vol. 5, no. 6, pp. 323-338, June 1999.

[3] AMD Athlon Processor technical brief, Advanced Micro Devices Inc., Publication no. 22054, Rev. D, Dec. 1999.

[4] S. Chen, B. Mulgrew, and P. M. Grant, "A clustering technique for digital communications channel equalization using radial basis function networks," IEEE Trans. Neural Networks, vol. 4, pp. 570-578, July 1993.

[5] Mamun Bin Ibne Reaz, MIEEE, Md. Shabiul Islam, MIEEE, Mohd. S. Sulaiman, MIEEE. ICSE2002 Proc. 2002, Penang, Malaysia.

Simulation of division:


CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

VLP0401-1

OTRA based Grounded Inductor and its application

Rajeshwari Pandey (member IEEE), Neeta Pandey (member IEEE), Ajay Singh, B. Sriram, Kaushalendra Trivedi

Delhi Technological University, Delhi

Abstract: In this paper a lossless grounded inductor is proposed using the Operational Transresistance Amplifier (OTRA). PSPICE simulation results are included to demonstrate the performance and verify the theoretical analysis.

Index Terms: Inductor simulators, OTRA, grounded inductor.

I. INTRODUCTION

The operational transresistance amplifier (OTRA) is gaining considerable attention amongst analog integrated circuit designers as it inherits all the advantages offered by current-mode techniques. The OTRA is a high gain, current input, voltage output device. The input terminals of the OTRA are internally grounded, thereby eliminating response limitations due to parasitic capacitances and resistances at the input [1]. Although the OTRA is commercially available from several sources under the name of current differencing amplifier or Norton amplifier, it has not gained attention until recently. These commercial realizations do not provide internal ground at the input port and they allow the input current to flow in one direction only. The former disadvantage limits the functionality of the OTRA, whereas the latter forces the use of an external DC bias current, leading to complex and unattractive designs [2]. Several high performance CMOS OTRA topologies have been proposed in the literature [1,2,3,4], leading to growing interest in OTRA based analog signal processing circuits.

In the recent past the OTRA has been extensively used as an analog building block for realizing a number of signal processing circuits such as filters [5,6,7,8], oscillators [9,10,11], multivibrators [12,13] and immittance simulation circuits [9,14,15,16], the application dealt with in this paper. [14] presents the simulation of a lossy grounded inductor, whereas a negative inductance has been proposed in [15]. Lossless grounded inductor topologies have been presented in [9,16]. In this paper another lossless grounded inductor topology is proposed together with its applications, which will give further flexibility to analog circuit designers.

II. CIRCUIT DESCRIPTION

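The OTRA is a three terminal device, shown symbolically in Fig. 1, and its port relations can be characterized by matrix (1):

$$\begin{bmatrix} V_p \\ V_n \\ V_o \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ R_m & -R_m & 0 \end{bmatrix} \begin{bmatrix} I_p \\ I_n \\ I_o \end{bmatrix} \qquad (1)$$

Fig. 1 OTRA circuit symbol

For ideal operation the transresistance gain Rm approaches infinity and forces the input currents to be equal. Thus the OTRA must be used in a negative feedback configuration.

The proposed circuit is shown in Fig. 2.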


Fig. 2. Grounded Inductor

Routine analysis yields

(2)

subject to the condition

(3)

For simulation, the CMOS implementation of the OTRA proposed in [4] and reproduced in Fig. 3 was used. The aspect ratios used for the different transistors are the same as in [4] and are given in Table 1. The supply voltages taken are ±1.5 V for the SPICE simulation.

Fig. 3. CMOS Implementation of OTRA[4]

Table 1. Aspect ratios of the transistors in the OTRA circuit

Transistor   W (µm) / L (µm)
M1-M3        100/2.5
M4           10/2.5
M5, M6       30/2.5
M7           10/2.5
M8-M11       50/2.5
M12, M13     100/2.5
M14          50/0.5

III. APPLICATION

The proposed inductor is used to design (i) a high pass filter and (ii) an LC oscillator.

A. High Pass Filter

A high pass filter, as shown in Fig. 4(a), can be constructed using the proposed inductor. The transfer function for the high pass response is

(4)

Where

, (5)

Fig. 4(a) High Pass Filter

To verify the theoretical propositions, a HP filter with a cutoff frequency of 159 kHz is designed, for which the component values are computed as R = 1 kΩ, C = 1 nF and Leq = 1 mH. For this value of Leq, the component values of the inductor circuit are chosen as 1 kΩ for the resistance and 1 nF for the capacitance. The frequency response of the filter simulated using PSPICE is depicted in Fig. 4(b) and is found to be in close agreement with the theoretical response.
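The quoted cutoff frequency can be cross-checked numerically. The sketch below assumes the usual second-order relation f = 1/(2π√(Leq·C)) for the corner frequency, which is an assumption here because the transfer function itself is not reproduced in this text; the function name `lc_corner_frequency` is hypothetical.

```python
import math


def lc_corner_frequency(l_eq, c):
    """Corner frequency in Hz of a second-order LC section, 1/(2*pi*sqrt(L*C)).

    Assumed relation; l_eq is in henry and c in farad.
    """
    return 1.0 / (2.0 * math.pi * math.sqrt(l_eq * c))


# Values quoted in the text: Leq = 1 mH, C = 1 nF.
f_c = lc_corner_frequency(1e-3, 1e-9)
print(f"{f_c / 1e3:.2f} kHz")
```

Under this assumption the result is about 159.15 kHz, consistent with the 159 kHz design value.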

Fig. 4 (b) HP Response

B. LC Oscillator

An LC oscillator is designed as a signal generating application employing the proposed inductor, and is shown in Fig. 5(a). The condition of oscillation and the frequency of oscillation are given as

(6)

(7)

A typical simulation with the inductor circuit's resistance set to 1 kΩ and its capacitance to 10 pF, which results in Leq = 0.1 mH, and with C = 1 nF is shown in Fig. 5(b). The simulated frequency of oscillation is 775 kHz and is in close agreement with the theoretically calculated value of 795.77 kHz. Fig. 5(c) shows the output frequency spectrum. The total harmonic distortion is measured as 4.906%.

Fig.5 (a) Oscillator

Fig.5 (b) Oscillator Output.

Fig. 5(c) Frequency Spectrum


V. CONCLUSION

A new OTRA based lossless grounded inductor topology is presented. A high pass filter and an oscillator are realized to illustrate the applications of the proposed topology. PSPICE simulation results are included to verify the theoretical propositions.

It is expected that the proposed circuits will be useful in the design of analog signal processing and generation applications and will provide further possibilities to the designer in the field.

References

[1] J.-J. Chen, H.-W. Tsao and C.-C. Chen, "Operational Transresistance Amplifier using CMOS Technology," Electronics Letters, Vol. 28, No. 22, pp. 2087-2088, October 1992.

[2] K. N. Salama and A. M. Soliman, "CMOS OTRA for analog signal processing applications," Microelectron. J., 30, pp. 235-245, 1999.

[3] Hasan Mostafa, Ahmed M. Soliman, "A Modified realization of the OTRA," Frequenz 60 (2006), pp. 70-76.

[4] Abedelrahman K. Kafrawy and Ahmed M. Soliman, "A modified CMOS differential OTRA," Int. J. Elect. Commun. (AEU), Vol. 63, issue 12, Dec. 2009, pp. 1067-1071.

[5] Selcuk Kilinc, Ugur Cam, "Cascadable allpass and notch filters employing single operational transresistance amplifier," Computers and Electrical Engineering 31 (2005), pp. 391-401.

[6] Cem Cakir, Ugur Cam and Oguzhan Cicekoglu, "Novel All pass Filter Configuration Employing Single OTRA," IEEE Transactions on Circuits and Systems-II: Express Briefs, Vol. 52, No. 3, March 2005, pp. 122-125.

[7] J.-J. Chen, H.-W. Tsao and S.-I. Liu, "Parasitic-capacitance-insensitive current-mode filters using OTRA," IEE Proc.-Circuits Devices Syst., Vol. 142, No. 3, June 1995.

[8] Ahmet Gokcen, Ugur Cam, "MOS-C single amplifier biquads using the OTRA," Int. J. Elect. Commun. (AEU), Vol. 63 (2009), pp. 660-664.

[9] K. N. Salama and A. M. Soliman, "Novel oscillators using operational transresistance amplifier," Microelectron. J., 31, pp. 39-47, 2000.

[10] U. Cam, "A Novel Single-Resistance-Controlled Sinusoidal Oscillator Employing Single Operational Transresistance Amplifier," Analog Integrated Circuits and Signal Processing, Vol. 32, pp. 183-186, August 2002.

[11] Rajeshwari Pandey, Mayank Bothra, "Multiphase Sinusoidal oscillator using Operational Transresistance Amplifier," IEEE Symposium on Industrial Electronics and Applications (ISIEA-2009), pp. 371-376, Oct. 2009.

[12] C. L. Hou, H. C. Chien and Y. K. Lo, "Squarewave generators employing OTRAs," IEE Proc.-Circuits Devices Syst., Vol. 152, No. 6, Dec. 2005.

[13] Y. K. Lo, H. C. Chien, H. G. Chiu, "Switch Controllable OTRA Based Bistable Multivibrator," IET Circuits Devices Syst., 2008, Vol. 2, No. 4, pp. 373–382.

[14] U. Cam, F. Kacar, O. Cicekoglu, H. Kuntman and A. Kuntman, "Novel grounded parallel immittance simulator topologies employing single OTRA," AEU - Int. J. Electronics and Communications, Vol. 57, No. 4, pp. 287-290, 2003.

[15] Selcuk Kilinc, Khaled N. Salama, and Ugur Cam, "Realization of fully Controllable negative Inductance with single operational Transresistance Amplifier," Circuits Systems Signal Processing, Vol. 25, No. 1, pp. 47-57, 2006.

[16] U. Cam, F. Kacar, O. Cicekoglu, H. Kuntman and A. Kuntman, "Novel two OTRA-based grounded Immittance simulator topologies," Analog Integrated Circuits and Signal Processing, Vol. 29, pp. 233-235, 2001; Analog Integrated Circuits and Signal Processing, Vol. 39, pp. 169-175, 2004.


CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

VLP0402-1

OTRA based Precision Full Wave Rectifier

Rajeshwari Pandey (member IEEE), Ajay Singh, B. Sriram, Kaushalendra Trivedi

Department of Electronics and Communication Engineering, Delhi Technological University, Delhi

Abstract: This paper presents an operational transresistance amplifier based precision full-wave rectifier using an all-pass filter as a 90° phase shifter. The circuit gives a dc output voltage that is almost the same as the peak input voltage over a frequency range of 50 Hz–30 MHz, with a very low ripple voltage and low harmonic distortion.

Index Terms: OTRA, all-pass filter, harmonic distortion, precision rectifier, ripple voltage.

I. INTRODUCTION

State-of-the-art analog integrated circuit design is receiving a tremendous boost due to the development and application of current-mode processing [1]. It is well known that the key performance features of the current-mode technique are an inherently wide bandwidth which is virtually independent of closed loop gain, greater linearity and a large dynamic range. Recently the operational transresistance amplifier (OTRA) has emerged as an effective alternative analog building block. It is a high gain current input, voltage output amplifier [2]. The OTRA, being a current processing building block, inherits all the advantages of the current mode technique. It is also free from parasitic input capacitances and resistances as its input terminals are virtually grounded, thus eliminating response limitations due to parasitics. The OTRA is now being used as an analog building block for realizing a number of circuits having applications in signal processing and generation [2-6].

Precise rectification is one of the important requirements in instrumentation and measurement. It finds applications in ac voltmeters, ammeters, signal-polarity detectors, averaging circuits, sample-and-hold circuits, peak value detectors and amplitude-modulated signal detectors [7-10]. In general diodes are used as rectifiers, with the drawback of a threshold voltage: rectification is not possible below a voltage of ∼0.7 V for a silicon diode and ∼0.3 V for a germanium diode. Low-voltage rectification is required in applications such as amplitude modulated signal detectors. Slew rate limitation prevents the fast turning on of the diodes in the high frequency range and thus results in distortion. In view of the above, a precision rectifying circuit using OTRAs is proposed in this paper. The performance of the circuit has been verified in the frequency range 50 Hz-30 MHz using P-SPICE.

II. PROPOSED RECTIFIER CIRCUIT

The OTRA is a three terminal device. Its circuit symbol is shown in Fig. 1 and its port relations can be characterized by the following matrix:

$$\begin{bmatrix} V_p \\ V_n \\ V_o \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ R_m & -R_m & 0 \end{bmatrix} \begin{bmatrix} I_p \\ I_n \\ I_o \end{bmatrix} \qquad (1)$$

Fig. 1 OTRA circuit symbol

For ideal operation the transresistance gain Rm approaches infinity and forces the input currents to be equal. Thus the OTRA must be used in a negative feedback configuration.

Fig. 2(a) shows the block diagram of the proposed rectifier circuit. It consists of an all-pass filter that acts as a 90° phase shifter, two squaring circuits, one summer and one square rooter. The phase of the input sinusoidal signal Vin = A sin(2πft) is shifted by 90° by adjusting the resistance (R) and capacitance (C) of the RC network of the all-pass filter in accordance with equation (2). The amplitude of the phase-shifted output of the all-pass filter remains the same as that of the input signal.

$$\varphi = -2\tan^{-1}(2\pi f R C) = 90^{\circ} \qquad (2)$$

The output of the all-pass filter can be written as Vp = A cos(2πft). The squaring of Vin and Vp is done using analog multipliers. The squared signals are summed using a summer circuit implemented with an OTRA. Since sin² + cos² = 1, the sum equals A², so the summed signal, after square rooting, becomes ~A, which provides a rectified output.
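The principle can be checked numerically with an idealized model. The Python sketch below ignores all circuit non-idealities and is not a simulation of the OTRA circuit; the function names `rc_for_quadrature` and `rectified_output` are hypothetical, and the RC product is chosen so that the magnitude of the phase shift in equation (2) is 90°, i.e. 2πfRC = 1.

```python
import math


def rc_for_quadrature(f_hz):
    """RC product for which 2*atan(2*pi*f*R*C) equals 90 degrees (2*pi*f*RC = 1)."""
    return 1.0 / (2.0 * math.pi * f_hz)


def rectified_output(amplitude, f_hz, t):
    """Ideal square, sum and square-root chain applied to the quadrature pair."""
    v_in = amplitude * math.sin(2.0 * math.pi * f_hz * t)
    v_p = amplitude * math.cos(2.0 * math.pi * f_hz * t)   # 90 degree shifted copy
    return math.sqrt(v_in ** 2 + v_p ** 2)


rc = rc_for_quadrature(100.0)   # about 1.59 ms for a 100 Hz input
v_out = rectified_output(0.01, 100.0, 0.00123)
```

In this ideal model the output equals the input amplitude at every instant, which is why a 10 mV input, far below a diode threshold, can still be rectified.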

Fig.2 (a) block diagram of proposed circuit

Fig.2(b)Circuit diagram of proposed circuit

III. SIMULATION RESULTS

To verify the theoretical propositions the rectifier circuit is simulated using the P-SPICE program. For simulation, the CMOS implementation of the OTRA proposed in [11] and reproduced in Fig. 3 was used. Simulation was carried out for the frequency range 50 Hz-30 MHz and the results are compared with a diode based full wave rectifier circuit.

Fig.3 CMOS implementation of OTRA[11]

A. Rectified Output

The waveform tests were performed for both the proposed circuit and previously reported circuits. Fig. 4(a) shows an input sinusoidal signal of frequency 100 Hz, and the rectified output of the proposed circuit is shown in Fig. 4(b).

Fig. 4(a) sinusoidal input

Fig. 4(b) Rectified output of proposed circuit

It is seen that the output of the proposed circuit contains less ripple than that of the previously reported circuit [7], in which one diode conducts for one half cycle and the other diode conducts for the other half cycle, as shown in Fig. 4(c).


Fig. 4(c) Rectified output of diode based

In the proposed circuit, rectification is not performed by diodes, and therefore it has fewer ripples.

Low voltage rectification, i.e. below the threshold level of the diode, was also carried out. Fig. 5 shows a typical output of the proposed circuit for a frequency of 100 Hz.

Fig. 5(a) Sinusoidal input of frequency 100 Hz and amplitude 10 mV along with the 90 degrees phase shifted signal

Fig. 5(b) Rectified output with input signal

Similarly, a high frequency signal of frequency 100 kHz and amplitude 10 mV is analyzed and the result is shown in Fig. 6, where (b) is the rectified output.

Fig. 6(a) 10 mV, 100 kHz input signal and 90 degrees phase shifted signal

Fig. 6(b) Rectified output with input signal

B. Harmonic Distortion

The harmonics in the signal cause distortion in the output of the circuit. Thus the harmonic components need to be examined for circuit performance analysis. Being periodic in nature, these harmonic components can be analyzed by Fourier series. The magnitude of each harmonic of a waveform is obtained with the fast Fourier transform using PSPICE. Fig. 7(a) shows the FFT of the 100 Hz input signal along with that of the rectified output, whereas Fig. 7(b) shows the same for an input of frequency 100 kHz.

Fig. 7(a) Frequency spectrum of the rectified output and the 100 Hz input

Fig. 7(b) Frequency spectrum of the rectified output and the input at a frequency of 100 kHz

C. Ripple Factor:

The ripple factor of the output has been computed. The ripple factor is given by

$$r = \frac{V_{rms}}{V_{DC}}$$

where
r = ripple factor,
Vrms = rms value of the AC component of the output,
VDC = DC component present in the output.

In Fig. 8(a) the ripple factor is shown for an input of 100 Hz; its maximum value is 0.316 while its average value is 0.03. The ripple factor for an input of frequency 100 kHz is shown in Fig. 8(b), with an average value of 0.035.

Fig. 8(a)

Fig. 8(b)

The previously reported circuit gives a ripple factor of 0.483 [12].
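As a cross-check of the reference value, the ripple factor of an ideal, unfiltered full-wave rectified sine can be evaluated from the definition above. The Python sketch below is illustrative only; the function names `ripple_factor` and `full_wave_rectified_sine` and the number of sample points are assumptions, and the waveform is an ideal |sin| rather than a simulated circuit output.

```python
import math


def ripple_factor(samples):
    """r = (rms of the AC component) / (DC component) for uniformly sampled data."""
    n = len(samples)
    v_dc = sum(samples) / n
    v_ac_rms = math.sqrt(sum((v - v_dc) ** 2 for v in samples) / n)
    return v_ac_rms / v_dc


def full_wave_rectified_sine(amplitude=1.0, points=100_000):
    """One period of |A sin(x)|, the ideal diode full-wave rectifier output."""
    return [amplitude * abs(math.sin(2.0 * math.pi * k / points))
            for k in range(points)]


r = ripple_factor(full_wave_rectified_sine())
print(round(r, 3))
```

The value obtained is about 0.483, the same figure quoted for the previously reported circuit, and roughly an order of magnitude above the average values of 0.03 and 0.035 reported for the proposed circuit.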

V. CONCLUSION

In this paper, a precision full wave rectifier is implemented using the operational transresistance amplifier (OTRA). The circuit provides an output voltage amplitude almost equal to the input voltage. The circuit works well in the frequency range of 50 Hz–30 MHz. The excellent performance of the circuit is obtained by using the OTRA, which makes it work over a much higher frequency range than the previously reported circuit.

REFERENCES:

[1] C. Toumazou, F. J. Lidgey, "Analog IC design: The current mode approach," Peter Peregrinus Ltd., 1990.

[2] Salama Khaled N., Soliman Ahmed M., "CMOS operational transresistance amplifier for analog signal processing," Microelectronics Journal, Vol. 30, No. 9, pp. 235-245, March 1999.

[3] U. Cam, "A Novel Single-Resistance-Controlled Sinusoidal Oscillator Employing Single Operational Transresistance Amplifier," Analog Integrated Circuits and Signal Processing, Vol. 32, pp. 183-186, August 2002.

[4] Rajeshwari Pandey, Mayank Bothra, "Multiphase Sinusoidal Oscillators Using Operational Trans-Resistance Amplifier," IEEE Symposium on Industrial Electronics and Applications (ISIEA 2009), pp. 371-376, October 4-6, 2009.

[5] U. Cam, F. Kacar, O. Cicekoglu, H. Kuntman and A. Kuntman, "Novel grounded parallel immittance simulator topologies employing single OTRA," AEU - Int. J. Electronics and Communications, Vol. 57, No. 4, pp. 287-290, 2003.

[6] U. Cam, F. Kacar, O. Cicekoglu, H. Kuntman and A. Kuntman, "Novel two OTRA-based grounded Immittance simulator topologies," Analog Integrated Circuits and Signal Processing, Vol. 29, pp. 233-235, 2001; Analog Integrated Circuits and Signal Processing, Vol. 39, pp. 169-175, 2004.

[7] S. J. G. Gift and B. Maundy, "Versatile precision full-wave rectifiers for instrumentation and measurement," IEEE Trans. Instrum. Meas., vol. 56, no. 5, pp. 1703–1710, Oct. 2007.

[8] S. R. Djukic, "Full-wave current conveyor precision rectifier," Serbian J. Elect. Eng., vol. 5, no. 2, pp. 263–271, Nov. 2008.

[9] P. Gray, P. J. Hurst, S. H. Lewis, and R. G. Meyer, Analysis and Design of Analog Integrated Circuits. New York: Wiley, 2001.

[10] S. J. G. Gift, "A high-performance full-wave rectifier circuit," Int. J. Electron., vol. 87, no. 8, pp. 925–930, Aug. 2000.

[11] Hasan Mustafa, Ahmed M. Soliman, "A Modified realization of the OTRA," Frequenz 60 (2006), pp. 70-76.

[12] R. A. Gayakwad, Op-Amps and Linear Integrated Circuits, 3rd ed. New Delhi, India: Prentice-Hall, 2007, pp. 316–318.


CONFERENCE ON “SIGNAL PROCESSING AND REAL TIME OPERATING SYSTEM (SPRTOS)” MARCH 26-27 2011

VLP0403-1

GaN-based HEMTs for Communication Circuits

T R Lenka1 and A K Panda2

National Institute of Science and Technology
Palur Hills, Berhampur, Odisha, Pin-761008

E-mail: [email protected] and [email protected]

Abstract:

In this paper the role of GaN-based high electron mobility transistors (HEMTs) in microwave communication circuits is discussed. Due to superior material properties, GaN-based devices produce a record maximum frequency of oscillation of around 300 GHz and a high cutoff frequency. They have become one of the prime candidates for solid-state power amplifiers at frequencies up to 50 GHz. The unique properties of GaN include a peak saturation velocity of 2.5×10^7 cm/s, a high breakdown electric field of 3.3 MV/cm, and output power densities in excess of 10 W/mm at 40 GHz and more than 2 W/mm at 80.5 GHz. Recent wide-spread R&D to advance HEMT technology has led to high-speed low-power LSI circuits and ultra-low noise amplifiers. In this paper the microwave characteristics of the HEMT, which include available gain (GA), maximum available gain (MAG/GMax), unilateral gain (GU), maximum stable gain (MSG), noise figure (NF) and minimum noise figure (NFmin), are discussed. The potential usability of the HEMT as an amplifier and as an oscillator is also discussed.

Key Words: GaN, HEMT, Microwave, MMIC, Gain

1. INTRODUCTION

GaN-based semiconductor devices are currently the focus of great interest in academia as well as industry because of their very interesting material properties [1]. These semiconductor alloys have a wide bandgap (>3.4 eV), high temperature sustainability and high electric breakdown fields, which allow them to be used for the fabrication of short-wavelength (blue, UV) optical devices and of high-frequency and high power electronics [2].

Due to the conduction band discontinuity, a two dimensional electron gas (2DEG) channel is created at the heterointerface between two undoped materials by piezoelectric and spontaneous polarizations [3]. The 2DEG is the heart of the HEMT. The modeling of GaN-based HEMTs still presents many challenges to the worldwide research community. Due to the lack of scattering effects, the mobility of the electrons in the 2DEG is very high, which makes the device suitable for microwave applications [4]. Advanced HEMT Monolithic Millimeter-wave Integrated Circuits (MMIC) for millimeter and sub-millimeter-wave power sources and power amplifiers, with applications to heterodyne receivers, transmitters and communication circuits, are highly popular and dominated by GaN based devices [5]. In discrete device applications, low-noise HEMTs are commercially available and are in use in broadcast satellite and radio telescope systems. This paper reviews the state-of-the-art HEMT technology for communication systems.

The commonly used HEMT structure is discussed in section 2. The microwave characteristics of GaN-based HEMTs are discussed in section 3, and finally the conclusion is drawn in section 4.

2. HEMT STRUCTURE

The AlGaN/GaN heterostructure is generally grown on a sapphire/SiC substrate by a molecular beam epitaxy (MBE) or metal organic vapor phase epitaxy (MOVPE) process [6]. For Schottky ohmic contacts Ti/Al/Ni/Au is mostly used. The TCAD simulated structure of this device is shown in figure 1. Schrödinger's wave equation and the Poisson equation are solved self-consistently to give rise to a two dimensional electron gas (2DEG), which is created at the heterointerface of AlGaN/GaN due to the growth of a wide bandgap material over a narrow bandgap material and is the heart of any heterostructure device [7]-[8]. The electron concentration in the 2DEG depends on the conduction band discontinuity. However, in order to reduce the scattering in the 2DEG formed at the heterointerface, a binary nanoscale AlN layer is epitaxially grown at the heterointerface of the AlGaN/GaN heterostructure [7]-[8].


Fig. 1 Simulated Structure of AlGaN/GaN-based

HEMT

The two dimensional electron gas (2DEG)

created at the heterointerface of AlGaN/GaN

with a mole fraction of 0.3 is shown in figure 2.

Fig. 2 Formation of 2DEG at the heterointerface

3. MICROWAVE CHARACTERISTICS

3.1 HEMT as an Amplifier

Fig. 3 Small Signal Model of HEMT

Two-port network analysis was carried out in Microwave Office to understand the microwave characteristics of the HEMT [9]. When embarking on any amplifier design it is very important to understand the stability of the device chosen; otherwise the amplifier may well turn into an oscillator. The microwave parameters include available gain (GA), maximum available gain (MAG), unilateral gain (GU), maximum stable gain (MSG), noise figure (NF) and minimum noise figure (NFmin) [9]. The small signal model of the HEMT is shown in figure 3.

The main way of determining the stability of a device is to calculate Rollett's stability factor (K) from a set of S-parameters for the device at the frequency of operation. Two stability parameters, K and |Δ|, indicate whether a device is likely to oscillate and whether it is conditionally or unconditionally stable [9].

$$K = \frac{1 - |S_{11}|^2 - |S_{22}|^2 + |\Delta|^2}{2\,|S_{12} S_{21}|}, \quad \text{where } \Delta = S_{11} S_{22} - S_{12} S_{21} \qquad (1)$$

The parameters must satisfy K > 1 and |Δ| < 1

for a transistor to be unconditionally stable.

Once the K factor is calculated and we find that

the device is unconditionally stable then we can

calculate the Maximum available gain (MAG).

$$MAG = G_{Max} = \frac{|S_{21}|}{|S_{12}|}\left(K - \sqrt{K^2 - 1}\right) \qquad (2)$$

when K is on the limit of unity the above

equation reduces down to

$$MSG = \frac{|S_{21}|}{|S_{12}|} \qquad (3)$$

In this case the MAG is known as the maximum

stable gain MSG and is shown in figure 4.
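The stability test and the gain expressions in equations (1) to (3) can be checked numerically. The following is a minimal sketch, not the authors' Microwave Office setup: the function names and the S-parameter values are hypothetical and chosen only for illustration.

```python
import math

def stability(s11, s12, s21, s22):
    """Return (K, |Delta|) as defined in equation (1)."""
    delta = s11 * s22 - s12 * s21
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * abs(s12 * s21))
    return k, abs(delta)

def max_stable_gain(s12, s21):
    """MSG from equation (3), as a linear power ratio."""
    return abs(s21) / abs(s12)

def max_available_gain(s11, s12, s21, s22):
    """MAG from equation (2); only defined when K >= 1."""
    k, _ = stability(s11, s12, s21, s22)
    if k < 1:
        raise ValueError("MAG is undefined for K < 1; use MSG instead")
    return max_stable_gain(s12, s21) * (k - math.sqrt(k * k - 1))

# Hypothetical S-parameters at a single frequency (illustrative values only).
S11, S12, S21, S22 = 0.5 + 0j, 0.1 + 0j, 2.0 + 0j, 0.5 + 0j

k, delta_mag = stability(S11, S12, S21, S22)
msg = max_stable_gain(S12, S21)
mag = max_available_gain(S11, S12, S21, S22)
print(f"K = {k:.4f}, |Delta| = {delta_mag:.4f}")
print(f"unconditionally stable: {k > 1 and delta_mag < 1}")
print(f"MSG = {10 * math.log10(msg):.2f} dB, MAG = {10 * math.log10(mag):.2f} dB")
```

For K > 1 the factor multiplying MSG in equation (2) is less than one, so MAG lies below MSG and the two meet only at K = 1.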

Fig. 4 MAG/GMax and MSG of HEMT


As the frequency increases from 1 to 50 GHz, the maximum available gain (GMax) and the maximum stable gain (MSG) both decrease, and the two curves coincide. This means that K = 1 and the device is unconditionally stable. The gains at different frequencies are shown in figure 4.

Fig. 5 Available Gain (GA) and Unilateral Gain (GU)

of HEMT

Mason's unilateral gain (MUG/GU) and the available gain are plotted in figure 5 over a frequency range up to 50 GHz. Figure 5 shows that the available gain peaks at 29.7 dB at 1 GHz, and the unilateral gain (GU) varies from 56 dB at 1 GHz to 24 dB at 50 GHz.

Fig. 6 S21 and S12 with respect to frequency

The two-port network is connected to a load impedance ZL and a source impedance ZS, and is characterized by a scattering matrix [S]. The S-parameters S21 and S12 are the forward voltage gain and the reverse voltage gain, respectively, and are shown in figure 6. With the chosen values of the lumped elements of the small signal model, the forward gain of the device is 22.94 dB at 1 GHz and decreases with frequency, whereas the reverse gain is negative (in dB). By choosing suitable values for the lumped elements of the small signal circuit, the forward gain can be increased to the desired value.

Fig. 7 Noise Figure (NF) and NFMin of HEMT

The microwave noise figure (NF) and NFMin are shown in figure 7. The figure shows that NF increases with the frequency of operation and is at its minimum up to 5.5 GHz, whereas NFMin is negligibly small over the span from 1 to 50 GHz. The NF can be optimized to the required value by tuning the values of the lumped elements of the small signal circuit.

3.2 HEMT as an Oscillator

In spite of the great progress in performance achieved during the last few years, several important issues must still be overcome to further increase the performance of GaN HEMTs at millimeter-wave frequencies (30-300 GHz). One of the key challenges in achieving high-gain millimeter-wave power amplification is to increase the maximum power-gain cutoff frequency (fmax), the maximum frequency at which the transistor still provides power gain. It can be expressed as [6]-[11]

$$f_{max} = \frac{f_T}{2\sqrt{(R_i + R_s + R_g)/R_{ds} + 2\pi f_T R_g C_{gd}}} \qquad (4)$$

where $f_T$ is the current-gain cutoff frequency and $C_{gd}$ is the gate-drain (depletion region) capacitance, while $R_i$, $R_s$, $R_g$ and $R_{ds}$ represent the gate-charging, source, gate and output resistance, respectively. To maximize $f_{max}$, each parameter needs to be carefully optimized. In FETs, short-channel effects play an important role in the high-frequency characteristics [6]. Gate-recess technology can suppress the short-channel effects, which improves the high-frequency characteristics.
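To see how each parameter in equation (4) moves the maximum power-gain cutoff frequency, the expression can be evaluated directly. This is a small sketch under stated assumptions: the function name and every numeric value below are hypothetical and are not taken from the paper or its references.

```python
import math

def f_max(f_t, r_i, r_s, r_g, r_ds, c_gd):
    """Maximum power-gain cutoff frequency from equation (4).

    f_t in Hz, resistances in ohms, c_gd in farads.
    """
    term = (r_i + r_s + r_g) / r_ds + 2 * math.pi * f_t * r_g * c_gd
    return f_t / (2 * math.sqrt(term))

# Hypothetical device values, for illustration only.
baseline = f_max(f_t=100e9, r_i=1.0, r_s=1.0, r_g=1.0, r_ds=100.0, c_gd=10e-15)
more_cgd = f_max(f_t=100e9, r_i=1.0, r_s=1.0, r_g=1.0, r_ds=100.0, c_gd=20e-15)

print(f"f_max (baseline)      = {baseline / 1e9:.1f} GHz")
print(f"f_max (doubled C_gd)  = {more_cgd / 1e9:.1f} GHz")
```

Doubling the gate-drain capacitance lowers the result, which matches the point in the text that each parasitic must be kept small.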


The general figure of merit given in equation 5,

for comparing microwave circuits is the cut-off

frequency (Fc) and is defined by the on

resistance (Ron) and off–state capacitance (Coff)

of the device [10]-[11].

$$F_C = \frac{1}{2\pi R_{on} C_{off}} \qquad (5)$$

The on resistance of the HEMT is governed by

the total source-drain resistances at microwave

frequencies for voltages higher than threshold.

Below threshold voltage the 2DEG is

suppressed under the gate and the resistance

increases dramatically.
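The figure of merit in equation (5) is easy to evaluate once the on-resistance and the off-state capacitance are known. A minimal sketch, with a hypothetical function name and illustrative values that do not come from the paper:

```python
import math

def cutoff_frequency(r_on, c_off):
    """Figure-of-merit cutoff frequency F_C from equation (5), in Hz."""
    return 1.0 / (2 * math.pi * r_on * c_off)

# Hypothetical values: R_on = 2 ohm, C_off = 0.1 pF.
fc = cutoff_frequency(r_on=2.0, c_off=0.1e-12)
print(f"F_C = {fc / 1e9:.1f} GHz")
```

Halving either the on-resistance or the off-state capacitance doubles the figure of merit.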

The general channel resistance $R_{DS}$ is composed of several resistance components and may be written as

$$R_{DS} = R_g + R_{sg} + R_{dg} \qquad (6)$$

where $R_g$ is the interface (or channel) resistance under the gate, and $R_{sg}$ and $R_{dg}$ are the source-gate and drain-gate channel resistances, respectively. The contribution of $R_{sg}$ and $R_{dg}$ to the total on-state resistance $R_{on}$ depends on the gate-drain and gate-source electrode spacing. This spacing also governs the breakdown voltage, with wider spacing yielding higher breakdown voltages. The resistances making up $R_{DS}$ are governed by the 2DEG that is induced at the heterointerface. Below the threshold voltage, the 2DEG carrier density goes to zero and $R_{DS}$ approaches its maximum, $R_{max}$, due to carriers in the GaN material.

Since the 2DEG governs the resistance in the conductive channel, the resistance of each element may be estimated as [10]

$$R_i = R_s \frac{L_i}{W} \qquad (7)$$

where $R_s$ is the sheet resistance of the interface channel, $W$ is the gate width of the HEMT and $L_i$ is the approximate geometrical length. The value of the sheet resistance depends on the density of the 2DEG and the mobility of the carriers in the channel, and can be written as [11]

$$R_s = \frac{1}{q \mu_n n_s} \qquad (8)$$

where $q$ is the elementary charge, $\mu_n$ is the low-field mobility of the 2DEG and $n_s$ is the 2DEG density.
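Equations (6) to (8) chain together: the 2DEG density and mobility give the sheet resistance, the geometry turns that into element resistances, and the elements add up to the channel resistance. The sketch below assumes SI units throughout; the function names and the sample density, mobility and dimensions are hypothetical and are not values reported in the paper.

```python
Q = 1.602176634e-19  # elementary charge in coulombs

def sheet_resistance(mu_n, n_s):
    """Sheet resistance from equation (8); mu_n in m^2/(V s), n_s in m^-2."""
    return 1.0 / (Q * mu_n * n_s)

def element_resistance(r_sheet, length, width):
    """Resistance of one channel element from equation (7); lengths in metres."""
    return r_sheet * length / width

def channel_resistance(r_g, r_sg, r_dg):
    """Total channel resistance from equation (6)."""
    return r_g + r_sg + r_dg

# Hypothetical 2DEG: n_s = 1e13 cm^-2 (1e17 m^-2), mobility 1500 cm^2/(V s) (0.15 m^2/(V s)).
r_sheet = sheet_resistance(mu_n=0.15, n_s=1e17)

# Hypothetical geometry: 100 um gate width, 1 um element lengths.
r_elem = element_resistance(r_sheet, length=1e-6, width=100e-6)
r_total = channel_resistance(r_elem, r_elem, r_elem)

print(f"R_s  = {r_sheet:.1f} ohm/sq")
print(f"R_i  = {r_elem:.2f} ohm")
print(f"R_DS = {r_total:.2f} ohm")
```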

In estimating the resistance directly under the gate, $R_g$, the 2DEG is assumed to be under the influence of the gate voltage, making $n_s$ a function of the gate voltage $V_g$ [11]. The resistance elements $R_{sg}$ and $R_{dg}$ are assumed not to be controlled by the applied gate voltage, and thus $n_s$ is not a function of $V_g$ in the source-gate and drain-gate regions.

The capacitance model includes both voltage-dependent and parasitic capacitances. The voltage-dependent capacitances used in modeling the GaN HEMT are the source-gate and drain-gate capacitances, $C_g$, and the capacitances between the gate and the inner side of the source and drain electrodes, $C_{ig}$. The total capacitance $C_{DS}$ can be written as [11]

$$C_{DS} = C_g + C_{ig} + C_{par} \qquad (9)$$

where $C_{par}$ is the total parasitic capacitance.

4. CONCLUSION

The small signal model of the HEMT is designed for two-port network analysis using Microwave Office, and the corresponding GaN-based HEMT is simulated using a TCAD tool. Various microwave parameters such as MSG, MUG, MAG, NF and NFMin are discussed. The amplifier and oscillator behavior of the HEMT is also discussed.

ACKNOWLEDGEMENT

The authors acknowledge the DST-FIST and

DST-SERC fund received by National Institute

of Science and Technology from Department of

Science & Technology (DST), Government of

India.

REFERENCES

1. David F. Brown et al: N-Polar InAlN/AlN/GaN MIS-

HEMTs, IEEE Electron Device Letters, Vol. 31,

No.8, Aug, 2010

2. T R Lenka and A. K. Panda, “Role of Nanoscale AlN

and InN for the Microwave Characteristics of AlGaN/

(Al, In) N/GaN - based HEMT,” Accepted for

publication in “Fizika i Tehnika Poluprovodnikov”/

Semiconductors (Springer) (2011).


3. Haifeng Sun et al: 205-GHz (Al, In)N/GaN HEMTs,

IEEE Electron Device Letters, Vol. 31, No.9, Sept,

2010.

4. T R Lenka and A. K. Panda, “Characteristics Study of

2DEG Transport Properties of AlGaN/GaN and

AlGaAs/GaAs-based HEMT,” “Fizika i Tehnika

Poluprovodnikov”/ Semiconductors (Springer), Vol.

45, No 5, 2011, pp.660-665.

5. Haifeng Sun et al: 102 GHz AlInN/GaN HEMTs on

Silicon With 2.5-W/mm Output Power at 10GHz,

IEEE Electron Device Letters, Vol. 30, No.8, Aug,

2009.

6. Jinwook W. Chung et al: AlGaN/GaN HEMT with

300-GHz fmax, IEEE Electron Device Letters, Vol. 31,

No.3, Mar, 2010.

7. T R Lenka and A. K. Panda, “Self-consistent

Subband Calculations of AlxGa1-xN/(AlN)/GaN-based

High Electron Mobility Transistor,” Advanced

Materials Research, Vol. 159, pp 342-347, 2011.

8. T R Lenka and A. K. Panda, “Effect of Nanoscale

AlN layer for improving 2DEG Transport properties

in AlGaN/AlN/GaN-based HEMT,” International

Journal of Pure and Applied Physics (IJPAP), Vol. 6,

No.4, pp.419-427, 2010.

9. Microwave Office Manuals.

10. Kelson D. Chabak et al: Strained AlInN/GaN HEMTs

on SiC with 2.1-A/mm Output Current and 104GHz

Cutoff Frequency, IEEE Electron Device Letters,

Vol. 31, No.6, June, 2010.

11. Nikolai V. Drozdovski et al: GaN-Based High

Electron-Mobility Transistors for Microwave and RF

Control Applications, IEEE Trans on Microwave

Theory and Techniques, Vol. 50, No.1, Jan, 2002.


VLP0405-1

Abstract - This paper, “Digital Transceiver using Advance Ternary Technique”, describes a digital transmitter and receiver built around a ternary line code. In this scheme, computer data (bytes) are converted into base-3 data elements. Line codes are now widely used in data transmission networks and in information recording and storage systems. The applications include local and wide area networks, both wireless and wired. A coding technique named advanced ternary line code can be derived from three popular line codes: NRZ-L, NRZ and polar RZ. In this scheme, six ternary signal elements are required for each eight-bit binary data pattern.

I INTRODUCTION

This scheme focuses on electric signal and data processing. It provides a means of encoding a binary data word as a ternary code word; at decoding time, the ternary code word is converted back to recover the binary data word. The main advantage of this scheme is that it maintains DC balance while the ternary data word is transmitted. Another advantage is that a ternary code carries more data per symbol than a binary code. Six binary bits can represent 64 different values (0-63), whereas six ternary digits can represent 365 different non-negative values (000000 to 111111 in balanced ternary, i.e. 0 to 364), or 729 values in total when negative values are included.

Line Coding is the process of converting digital data

to digital signals. We assume that data, in the form

of text, numbers, graphical images, audio, or video

are stored in computer memory as sequences of bits.

Line coding converts a sequence of bits to a digital

signal. At the sender, digital data are encoded into a

digital signal; at the receiver, the digital data are

recreated by decoding the digital signal [1]

Fig. 1: Digital data to digital signal encoding

Line codes for data transmission fall into three categories. The first type is binary in nature. The second type consists of ternary codes, which operate on three signal levels (+, 0, and -). The third type consists of multilevel codes, which have more than three output levels. The encoder and decoder circuits can be simulated and implemented using simple combinational logic circuits.

Ternary logic in digital communication for high

speed and performance

Email: [email protected],[email protected]


Figure 2: Unipolar NRZ, Polar NRZ,Unipolar RZ

II TERNARY REPRESENTATION

(a) DECIMAL TO TERNARY CONVERSION

S.NO. Decimal Ternary

1. 0 0 0 0 0

2. 1 0 0 0 1

3. 2 0 0 1 -1

4. 3 0 0 1 0

5. 4 0 0 1 1

6. 5 0 1 -1 -1

7. 6 0 1 -1 0

8. 7 0 1 -1 1

9. 8 0 1 0 -1

10. 9 0 1 0 0

11. 10 0 1 0 1

12. 11 0 1 1 -1

13. 12 0 1 1 0

14. 13 0 1 1 1

15. 14 1 -1 -1 -1

16. 15 1 -1 -1 0

17. 16 1 -1 -1 1

18. 17 1 -1 0 -1

19. 18 1 -1 0 0

20. 19 1 -1 0 1

21. 20 1 -1 1 -1

22. 21 1 -1 1 0

23. 22 1 -1 1 1

24. 23 1 0 -1 -1

25. 24 1 0 -1 0

26. 25 1 0 -1 1

27. 26 1 0 0 -1

28. 27 1 0 0 0

29. 28 1 0 0 1

30. 29 1 0 1 -1

31. 30 1 0 1 0

32. 31 1 0 1 1

33. 32 1 1 -1 -1

34. 33 1 1 -1 0

35. 34 1 1 -1 1

36. 35 1 1 0 -1

37. 36 1 1 0 0

38. 37 1 1 0 1

39. 38 1 1 1 -1

40. 39 1 1 1 0

41. 40 1 1 1 1

Table1: Decimal -Ternary

(b) DECIMAL TO TERNARY CONVERSION

The decimal (base 10) numeral system has ten possible values (0, 1, 2, 3, 4, 5, 6, 7, 8 or 9) for each place value. In contrast, the ternary (base 3) numeral system has three possible values, here represented as -1, 0 or 1, for each place value. As with decimal to binary conversion, the conversion takes the following steps:

Algorithm:

Step-1: Write the decimal number.

Step-2: Divide the decimal value by three (3), and write down the quotient and the remainder.

Step-3: If the remainder is 2, increase the quotient by one and decrease the remainder by 3 (so that it becomes -1).

Step-4: Repeat steps 2 and 3 on the quotient until the quotient becomes zero.

Step-5: Write all the remainder digits in reverse order (last remainder first) to form the final result.

Example: (25)10 =(X)3

(25)10 =(1 0 -1 1)3
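Steps 1 to 5 above can be sketched in a few lines of code. This is an illustrative implementation of the algorithm as described, not code from the paper; the function name `to_balanced_ternary` is hypothetical.

```python
def to_balanced_ternary(n):
    """Convert an integer to balanced ternary digits (-1, 0, 1), most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        n, remainder = divmod(n, 3)      # Step 2: divide by three
        if remainder == 2:               # Step 3: a remainder of 2 becomes -1 ...
            remainder = -1
            n += 1                       # ... and the quotient increases by one
        digits.append(remainder)         # Step 4: repeat until the quotient is zero
    return digits[::-1]                  # Step 5: last remainder first

print(to_balanced_ternary(25))   # the worked example above
print(to_balanced_ternary(5))    # row 6 of Table 1, without the leading zero
```

For 25 the successive remainders are 1, -1 (from a raw remainder of 2), 0 and 1, which read in reverse give (1 0 -1 1).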

(c) TERNARY TO DECIMAL CONVERSION

(1 0 -1 1)3 = (X)10

$1 \cdot 3^3 + 0 \cdot 3^2 + (-1) \cdot 3^1 + 1 \cdot 3^0$

27 + 0 + (-3) + 1 = 25

(1 0 -1 1)3 = (25)10
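The reverse conversion is a weighted sum of powers of three, which can be evaluated digit by digit. A minimal sketch with a hypothetical function name:

```python
def from_balanced_ternary(digits):
    """Evaluate balanced ternary digits (most significant first) as an integer."""
    value = 0
    for d in digits:
        if d not in (-1, 0, 1):
            raise ValueError(f"invalid balanced ternary digit: {d}")
        value = value * 3 + d    # shift one place left, then add the new digit
    return value

print(from_balanced_ternary([1, 0, -1, 1]))   # the worked example above
```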

(d) TERNARY ADDITION

Ternary addition can be performed by the

following rules:

Two inputs:

A    B    Carry  Sum
0    0    0      0
0    1    0      1
0    -1   0      -1
1    0    0      1
1    1    1      -1
1    -1   0      0
-1   0    0      -1
-1   1    0      0
-1   -1   -1     1

Three inputs:

A    B    C    Carry  Sum
-1   -1   -1   -1     0
1    1    1    1      0

Table 2: Rules of Ternary addition
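The rules in Table 2 amount to a one-digit full adder that folds any total outside the range -1 to 1 back into range and emits a carry. The sketch below is an illustration, not the paper's circuit; `add_trits` and `add_balanced` are hypothetical names.

```python
def add_trits(a, b, c=0):
    """Add up to three balanced ternary digits; return (carry, sum) as in Table 2."""
    total = a + b + c
    if total > 1:
        return 1, total - 3
    if total < -1:
        return -1, total + 3
    return 0, total

def add_balanced(x, y):
    """Add two balanced ternary numbers given most significant digit first."""
    width = max(len(x), len(y))
    x = [0] * (width - len(x)) + list(x)
    y = [0] * (width - len(y)) + list(y)
    carry, out = 0, []
    for a, b in zip(reversed(x), reversed(y)):   # least significant digit first
        carry, s = add_trits(a, b, carry)
        out.append(s)
    out.append(carry)                            # final carry becomes the top digit
    return out[::-1]

print(add_trits(1, 1))                              # row "1 1" of Table 2
print(add_balanced([1, 0, -1, 1], [1, 1, 1, 1]))    # 25 + 40
```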

Example:

(e) TERNARY SUBTRACTION

The negative of a balanced ternary number is obtained by changing all +1's to -1's and vice versa, leaving all zeroes unchanged. Hence addition and subtraction may be performed with the same hardware in the balanced ternary system, by changing the signs of the digits of the subtrahend.

(i) A - B = X

(ii) X = A + B', where B' is B with every +1 changed to -1 and vice versa

There is no need for a separate representation of negative magnitudes: for example, (-28) can be represented as

(0 0 -1 0 0 -1)
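Because negation is a digit-wise sign flip, subtraction can reuse an adder unchanged. The following sketch illustrates this; all function names are hypothetical, and the adder is a compact software stand-in rather than the hardware the text refers to.

```python
def negate(trits):
    """Negate a balanced ternary number by flipping the sign of every digit."""
    return [-t for t in trits]

def add_balanced(x, y):
    """Add two balanced ternary numbers (most significant digit first)."""
    width = max(len(x), len(y))
    x = [0] * (width - len(x)) + list(x)
    y = [0] * (width - len(y)) + list(y)
    out, carry = [], 0
    for a, b in zip(reversed(x), reversed(y)):
        total = a + b + carry
        carry = (total > 1) - (total < -1)   # +1, 0 or -1
        out.append(total - 3 * carry)
    out.append(carry)
    return out[::-1]

def subtract_balanced(a, b):
    """Compute A - B as A + B', where B' is the negation of B."""
    return add_balanced(a, negate(b))

def to_int(trits):
    """Evaluate balanced ternary digits as an integer (for checking only)."""
    value = 0
    for t in trits:
        value = value * 3 + t
    return value

print(negate([0, 0, 1, 0, 0, 1]))                              # 28 -> -28
print(to_int(subtract_balanced([1, 1, 0, 1], [0, 0, 1, 1])))   # 37 - 4
```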

(f) TERNARY MULTIPLICATION

Ternary multiplication can be performed in a way similar to binary multiplication. The following basic rules are applied:

S. No. A B A x B

1 0 0 0

2 0 1 0

3 0 -1 0

4 1 0 0

5 1 1 1

6 1 -1 -1

7 -1 0 0

8 -1 1 -1

9 -1 -1 1

Table 3: Rules for Ternary Multiplication

Example:

(i) (37)10 x (4)10= (148)10

(1 1 0 1)3 * (0 0 1 1)3 = (X)3

1 1 0 1

X 0 0 1 1

---------------------------------------

1 1 0 1

1 1 0 1 x

0 0 0 0 x x

0 0 0 0 x x x

----------------------------------------

0 1 -1 -1 1 1 1

----------------------------------------

(0 1 -1 -1 1 1 1)3 = (148)10

(ii) (14)10 x (15)10= (210)10

(1 -1 -1 -1)3 * (1 -1 -1 0)3 = (X)3

1 -1 -1 -1

X 1 -1 -1 0

---------------------------------

0 0 0 0

-1 1 1 1 x

-1 1 1 1 x x


1 -1 -1 -1 x x x

---------------------------------

0 1 0 -1 -1 1 0

---------------------------------

(0 1 0 -1 -1 1 0)3 = (210)10
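The long-multiplication layout in the two examples can be sketched as follows: digit products from Table 3 are accumulated per column, and the column totals are then folded back into the digits -1, 0 and 1 with carries. This is one possible software rendering with hypothetical function names, not the paper's own procedure.

```python
def multiply_balanced(x, y):
    """Multiply two balanced ternary numbers given most significant digit first."""
    columns = [0] * (len(x) + len(y))            # column sums, least significant first
    for i, a in enumerate(reversed(x)):
        for j, b in enumerate(reversed(y)):
            columns[i + j] += a * b              # Table 3: product of two digits
    digits, carry = [], 0
    for total in columns:
        carry, r = divmod(total + carry, 3)      # r is 0, 1 or 2
        if r == 2:                               # fold 2 into -1 with an extra carry
            r = -1
            carry += 1
        digits.append(r)
    return digits[::-1]

def to_int(trits):
    """Evaluate balanced ternary digits as an integer (for checking only)."""
    value = 0
    for t in trits:
        value = value * 3 + t
    return value

product = multiply_balanced([1, 1, 0, 1], [0, 0, 1, 1])   # example (i): 37 x 4
print(product, to_int(product))
```

The result carries one more leading zero than the hand-worked answer because the sketch always returns as many digits as the two operands have together.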

III PRINCIPLES OF TERNARY DATA PATTERNS ENCODING

The method for transmitting 8-bit binary data as a 6-digit ternary code includes an encoder and a decoder. Each 8-bit binary data word has a unique 6-digit ternary codeword that is optimized for communication. Ternary data patterns are encoded as signal patterns with three levels [3]. The three signal levels are represented as -, 0 and +. For example, the 8-bit binary pattern 10100011 is converted to (163)10, which is encoded as the 6-digit ternary word (1 -1 0 0 0 1)3.

This data pattern is encoded as the signal pattern (+ - 0 0 0 +).
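The worked example above (10100011 to a six-symbol pattern) can be reproduced with a short sketch. It follows only the value-based mapping shown in the example; it is not the table-driven 8B6T code of reference [8], and the names `encode_byte` and `decode_pattern` are hypothetical.

```python
SYMBOL = {1: "+", 0: "0", -1: "-"}
LEVEL = {symbol: digit for digit, symbol in SYMBOL.items()}

def encode_byte(value):
    """Encode one byte (0-255) as six ternary signal symbols, most significant first."""
    if not 0 <= value <= 255:
        raise ValueError("value must fit in 8 bits")
    trits = []
    for _ in range(6):                   # six digits cover 0..364, enough for a byte
        value, r = divmod(value, 3)
        if r == 2:                       # remainder 2 becomes -1 with a carry
            r = -1
            value += 1
        trits.append(r)
    return "".join(SYMBOL[t] for t in reversed(trits))

def decode_pattern(pattern):
    """Decode six ternary signal symbols back to the byte value."""
    value = 0
    for ch in pattern:
        value = value * 3 + LEVEL[ch]
    return value

pattern = encode_byte(0b10100011)
print(pattern, decode_pattern(pattern))
```

Decoding is the reverse weighted sum, so every byte value survives an encode and decode round trip.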

Figure3: Binary Data Communication

Figure 4:Ternary Data Communication

The logic circuitry of this method is optimized to accomplish the translation using a small number of combinational logic gates. Implementing ternary communication increases the speed and performance relative to 8-bit data word communication and also decreases the size of the encoder.

IV PRINCIPLES OF TERNARY DATA

PATTERNS DECODING

The principle of the decoding system is very simple: it is the reverse of the encoding process. The decoding system receives the 6-digit ternary pattern, and the decoder circuit converts it into the 8-bit binary pattern, a format that the receiver can understand.

V ADVANTAGES OF TERNARY CODES

The concept of 6-digit ternary data communication between two devices makes the system fast and high-performing, and also reduces the size of the overall circuitry. This concept relates generally to electric signal and data processing [5].

The encoder converts the 8-bit binary data word into a 6-digit ternary data word, and the decoder converts the 6-digit ternary data word back into an 8-bit binary data word. Ternary data transmission can be used in a high-speed network. Ternary data transmission maintains DC balance in transmission. Converting a binary data word to ternary is beneficial for the placement of data on an electromagnetic channel.


Ternary data communication increases the data-carrying capacity, which can be used to increase the speed of data transmission. In future, it may also increase the data capacity of storage media [8].

VI CONCLUSION

In this paper we have discussed ternary logic in digital communication for providing high speed and performance. In this scheme, we have discussed the encoding of a binary data word as a ternary data word to improve the data word transmission time and, correspondingly, to achieve higher-speed communication than the binary logic generally used in digital communication. For the implementation of this scheme, we have discussed some major algorithms and various conversion methods that are useful for understanding this logic. In this paper, we have also tried to shed light on the principle of ternary data pattern encoding, which underlies ternary data communication.

VII REFERENCES

[1] Glass, A., Ali, B. and Bastaki, E. “Design and

modeling of H-Ternary line encoder for digital data

transmission”. International Conference on Info-

Tech & Info-Net, Beijing, China, 2001, pp 503-

507.

[2] A. Mahadevan, Digital Transceiver using H

Ternary Line Coding Technique, Proceedings of the

World Congress on Engineering 2007 Vol I

[3] A. Srivastava and K. Venkatapathy, “Design and

Implementation of a Low Power Ternary Full

Adder”1996 OPA (Overseas Publishers

Association) Amsterdam B.V. Published in The

Netherlands under license by Gordon and Breach

Science Publishers SA

Printed in Malaysia

[4] Abdullatif Glass and Bahman Ali, Nidhal

Abdulaziz, “H-Ternary Line Decoder for Digital

Data Transmission:

Circuit Design and Modelling”,

[5] Bylanski, P. and Ingram, D., “Digital

transmission systems,” Peter Peregrinus, 1976, pp.

216-246.

[6] Lathi, P., “Modern digital and analog

communication systems (3rd Ed),” Oxford

University Press, 1998, pp.294-353.

[7] Takasaki, Y., “Digital transmission design and

jitter analysis,” Artech House, 1991, pp.35-60.

[8] Sandeep Patel, Howard W. Johnson, “Methods

and apparatus for implementing a type 8B6T

Encoder and decoder “, 1996, patent no 5,525,983.


Abstract - This paper introduces the working principle of space vector pulse width modulation (SVPWM) and presents a new circuit realization of an SVPWM generator based on a flexible, fast and cost-effective field programmable gate array (FPGA) embedded technique. Control of machines using vector control techniques is becoming more popular. The need for extensive computation is no longer an obstacle to implementing vector control, owing to the wide availability of high-speed digital processors. Vector control is the method of decoupling the variables and controlling them independently. To relieve the controller of the time-consuming task of PWM signal generation, a new method of space vector PWM signal generation is implemented in an FPGA using the hardware description language VHDL. The space vector PWM pulses are first designed in the MATLAB/Simulink environment, where the code that generates the pulses is written; a software conversion tool then converts the M-files into VHDL code. The resulting triggering pulses are applied to the inverter circuit, and the switching pattern generated reduces the harmonic content and switching losses. Keywords: FPGA (Field Programmable Gate Array), SVM, Space Vector PWM, VHDL, Induction motor drive

1. Introduction

The pulse width modulation (PWM) technique called “vector modulation”, which is based on space vector theory, is the most important development of the last few years [1]. Although several PWM methods have been created in the past, the vector modulation technique appears to be the best alternative. FPGA development has reached a level of maturity that makes FPGAs a good choice for implementation in many fields [2]. An FPGA-based embedded implementation of SVPWM combines the computing power of a processor with the logic processing power of a hardware circuit, so the processing efficiency of the CPU and the utilization of the logic units can both be improved. Figure 1 shows an SVPWM control system based on the FPGA-embedded technique.

Figure 1: SVPWM control system based on FPGA-embedded technique

Recent applications of FPGAs in industrial electronics include mobile-robot path planning and intelligent transportation [3], current control applied to power converters, real-time hardware-in-the-loop testing for control design, controller implementation, separating and recovering independent source signals, and neural computation. Since the concept of the multilevel PWM converter was introduced, various modulation strategies have been developed and studied in detail, such as multilevel sinusoidal PWM, multilevel selective harmonic elimination and space vector modulation. Among these strategies, space vector PWM (SVPWM) [4] stands out because it offers significant flexibility to optimize switching waveforms and is well suited to digital implementation. The complexity and computational cost of traditional SVPWM techniques increase with the number of levels of the converter, and most of them use trigonometric functions or pre-computed tables. A symmetrical space vector modulation PWM pattern is proposed

“Embedded Implementation of Space Vector PWM using FPGA”

Ashish Gupta

Assistant Professor Department of Electronics Engineering,

MPEC, Kanpur [email protected]


in this paper; it offers lower THD without increasing the switching losses. This paper thus demonstrates that Field Programmable Gate Arrays (FPGAs) offer a more efficient and faster solution, and investigates how to generate a variable PWM waveform on a Xilinx FPGA [5]. The rest of the paper is organized as follows. Section 2 introduces the principle of the symmetrical space vector PWM method. Section 3 gives details on the FPGA. Section 4 shows the M-file coding and Simulink blocks required to generate the space vector pulses. Section 5 explains the experimental results and Section 6 is the conclusion.

2. Principle of Space Vector PWM

In vector coordinates, the combinations of three-phase inverter output voltages form the eight space vectors shown in Figure 2. There are six nonzero space vectors (V1 to V6) forming an origin-centered hexagon, and two zero space vectors (V0 and V7) located at the origin. The hexagon is the maximum boundary of the space vector, and the circle is the maximum trajectory of the regular sinusoidal outputs in linear modulation. This figure also explains the PWM output patterns in the six regions (denoted as sectors I-VI) separately. In accordance with the three-phase to two-phase transformation, the three-phase inputs (Va, Vb, Vc) are transformed into (Vα, Vβ) as the reference vector.

Figure 2: Basic eight switching vectors and vector representation of sector 1.

As shown in Figure 3, there are eight possible combinations of on and off patterns for the three upper power switches. The on and off states of the lower power devices are opposite to those of the upper ones, and so are easily determined once the states of the upper power transistors are known. The eight switching vectors (V0 to V7), the output line-to-neutral (phase) voltages and the output line-to-line voltages, in terms of the DC-link voltage Vdc, are given in Table 1.

Figure 3: Circuit model of PWM inverter with center-taped grounded DC bus.

Table-1 Details of different phase and line

voltages for the eight states.

3. Field Programmable Gate Array

A Field-Programmable Gate Array (FPGA) is a silicon chip containing an array of configurable logic blocks (CLBs). Unlike an Application Specific Integrated Circuit (ASIC), which performs a single specific function for the lifetime of the chip, an FPGA can be reprogrammed to perform a different function in a matter of microseconds. The design used Xilinx development tools and is realized in a single FPGA chip with no external memory. The benefits of this design are as follows:

The whole system is implemented in a single chip; consequently the circuit is very compact.

FPGA-based systems are more reliable because they do not need any control software.

Voltage   Switching vectors   Line-to-neutral voltage   Line-to-line voltage
vector    a    b    c         Van    Vbn    Vcn         Vab   Vbc   Vca
V0        0    0    0         0      0      0           0     0     0
V1        1    0    0         2/3    -1/3   -1/3        1     0     -1
V2        1    1    0         1/3    1/3    -2/3        0     1     -1
V3        0    1    0         -1/3   2/3    -1/3        -1    1     0
V4        0    1    1         -2/3   1/3    1/3         -1    0     1
V5        0    0    1         -1/3   -1/3   2/3         0     -1    1
V6        1    0    1         1/3    -2/3   1/3         1     -1    0
V7        1    1    1         0      0      0           0     0     0

(All voltages are expressed as multiples of Vdc.)
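The entries of Table 1 follow directly from the switching states of the three upper switches. The sketch below regenerates the table, assuming a balanced star-connected load so that each phase voltage is the leg voltage minus the average of the three legs; the function name is hypothetical and all voltages are in units of Vdc.

```python
from fractions import Fraction

def inverter_voltages(a, b, c):
    """Return ((Van, Vbn, Vcn), (Vab, Vbc, Vca)) in units of Vdc.

    a, b, c are the states (0 or 1) of the three upper switches.
    """
    total = a + b + c
    phase = tuple(Fraction(3 * s - total, 3) for s in (a, b, c))   # e.g. Van = (2a - b - c) / 3
    line = (a - b, b - c, c - a)
    return phase, line

# Switching states in the order V0 .. V7 used by Table 1.
STATES = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
          (0, 1, 1), (0, 0, 1), (1, 0, 1), (1, 1, 1)]

for index, state in enumerate(STATES):
    phase, line = inverter_voltages(*state)
    print(f"V{index}", state, [str(v) for v in phase], line)
```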


Faster design and verification time, design change without penalty.

In this paper, the FPGA is programmed using a hardware description language, and the code generates the space vector modulation for the inverter circuit. The point to remember is that, instead of writing VHDL code directly, the M-file code that generates the SVPWM pulses is written first, and the VHDL code is then generated using the software converter. Hence the work requires less time. The MATLAB/Simulink environment is familiar to a large number of programmers, and since M-file coding is common to most of them, it is easier to work in this software. The FPGA vendor, Xilinx, provides an attractive high-level design and simulation tool. It is a very flexible design tool that allows testing of a high-level structural description of the design and makes quick changes and corrections possible. The circuit description structure is very similar to the way the design could be implemented later. A mapping tool that converts such a structure into VHDL code therefore saves the designer time that would otherwise be spent rewriting the same structure in VHDL, and probably making mistakes that need debugging.

4. Simulation Steps:

(1) Initialize system parameters in MATLAB/Simulink.

(2) Perform M-file coding to

(i) Determine the sector.

(ii) Determine the time durations T1, T2, T0.

(iii) Determine the switching time (Ta, Tb and Tc) of each transistor (S1 to S6).

(iv) Generate the inverter output voltages (VAB, VBC, VCA).

(v) Generate VHDL code through the software conversion tool.

(vi) Burn the program into the FPGA kit.

(3) View the SVPWM waveform using the Xilinx tools.
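Steps (i) to (iii) can be sketched in software. The paper does not list its M-file code or the timing equations, so the following uses the common textbook SVPWM formulation as an assumption: an amplitude-invariant three-phase to two-phase transform, dwell times proportional to sines of the angle within the sector, and a symmetrical pattern that splits the zero time equally. All names and the sample operating point are hypothetical.

```python
import math

def clarke(va, vb, vc):
    """Three-phase to two-phase transform: (Va, Vb, Vc) -> (V_alpha, V_beta)."""
    return (2 * va - vb - vc) / 3, (vb - vc) / math.sqrt(3)

def svpwm_times(v_alpha, v_beta, v_dc, t_s):
    """Return (sector, T1, T2, T0) for one switching period t_s (linear range assumed)."""
    v_ref = math.hypot(v_alpha, v_beta)
    theta = math.atan2(v_beta, v_alpha) % (2 * math.pi)
    sector = min(int(theta // (math.pi / 3)), 5) + 1          # step (i): sectors 1..6
    k = math.sqrt(3) * t_s * v_ref / v_dc
    t1 = k * math.sin(sector * math.pi / 3 - theta)           # step (ii)
    t2 = k * math.sin(theta - (sector - 1) * math.pi / 3)
    return sector, t1, t2, t_s - t1 - t2

def switching_times(sector, t1, t2, t0):
    """Step (iii): on-times (Ta, Tb, Tc) of the upper switches, symmetrical pattern."""
    h = t0 / 2
    table = {
        1: (t1 + t2 + h, t2 + h, h),
        2: (t1 + h, t1 + t2 + h, h),
        3: (h, t1 + t2 + h, t2 + h),
        4: (h, t1 + h, t1 + t2 + h),
        5: (t2 + h, h, t1 + t2 + h),
        6: (t1 + t2 + h, h, t1 + h),
    }
    return table[sector]

# Hypothetical operating point: 200 V reference at 30 degrees, Vdc = 600 V, 10 kHz switching.
angle = math.radians(30)
va, vb, vc = (200 * math.cos(angle - shift) for shift in (0, 2 * math.pi / 3, 4 * math.pi / 3))
v_alpha, v_beta = clarke(va, vb, vc)
sector, t1, t2, t0 = svpwm_times(v_alpha, v_beta, v_dc=600.0, t_s=1e-4)
ta, tb, tc = switching_times(sector, t1, t2, t0)
print(sector, t1, t2, t0)
print(ta, tb, tc)
```

At 30 degrees the reference sits midway between the two active vectors of sector 1, so T1 and T2 come out equal.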

4.1 Simulink Model to generate Space Vector PWM

Figure 4.1: Simulink Model for Overall System

Figure 4.2: Subsystem Simulink Model for

“Space Vector PWM Generator”


Figure 4.3: Subsystem Simulink Model for “Making Switching Time”

5. Results and Discussions

The control scheme is simple in architecture and thus facilitates the realization of the developed SVPWM controller using an FPGA-based circuit design approach. The designed SVPWM control IC has been realized using a single FPGA. The simulation results of the internal modules and the final output of the space vector PWM switching pattern have been obtained with a fundamental frequency of 50 Hz. Such wide frequency control with very high switching frequency is only possible by utilizing the state-of-the-art VLSI digital circuit design approach. The results show that the switching pattern generated reduces the harmonic content and switching losses. A comparison between SPWM and SVPWM for varying modulation index is given in Table 2, which shows the advantage of controlling the drive by the SVPWM technique. Figure 5 shows the locus comparison of maximum linear control voltage in sine PWM and SVPWM. Figures 6, 7 and 8 show the axis converter, the delay time and the output of each inverter, respectively. Figures 9 and 10 show the simulation results of Van, Vab and Vac and of the pulse patterns, respectively.

Table 2: Comparisons between SPWM and SVPWM by varying modulation index.

Figure 5: Locus comparison of maximum linear control voltage in Sine PWM and SVPWM.

Fig 6: Three to Two axis converter. (Va, Vb, Vc) are transformed into (Vα, Vβ)

            SPWM                                    SVPWM
M. I. (M)   Output line voltage (peak V)  THD (%)   Output line voltage (peak V)  THD (%)
0.4         180.80                        162.11    192.70                        154.07
0.5         266.50                        123.35    312.20                        108.78
0.6         289.40                        117.12    318.10                        105.69
0.7         369.20                        94.52     436.60                        81.19
0.8         396.10                        89.73     442.90                        78.56
0.9         472.90                        70.69     552.30                        53.62
1.0         502.40                        64.83     567.90                        49.15

Parameters used: fundamental frequency 50 Hz, switching frequency 10 kHz, DC voltage 600 V


Fig 7: Delay time

Fig 8: Output of each inverter

Fig 9: Simulation results of Van, Vab and Vac

Fig 10: Simulation results of pulse patterns

6. Conclusion

In this paper, a theoretical study of the SVPWM control strategy for a voltage inverter based on an FPGA is presented. On the one hand, it aims to show the effectiveness of SVPWM in reducing switching power losses. SVPWM is among the best solutions for achieving good voltage transfer and reduced harmonic distortion at the output of an inverter. On the other hand, since field programmable gate arrays (FPGAs) have advantages over microprocessor and DSP control, this modulation technique is implemented in an FPGA by initially generating M-files in the MATLAB/Simulink environment. The FPGA coding makes it easier to design the vector modulation pattern generator using a field programmable gate array. Moreover, the MATLAB/Simulink environment is familiar to a large number of programmers, and since M-file coding is common to most of them, it is easier for individuals to work in this software. The switching pattern generated reduces the harmonic content, provides efficient as well as flexible control, and reduces the total size of the system. This SVPWM IC can be used as a modulator for high-performance AC drives and power conditioning equipment.

References

[1] Ying-Yu Tzou, Hau-Jean Hsu and Tien-Sung Kuo, “FPGA based SVPWM control IC for 3-phase PWM inverters”, Proceedings of the 1996 IEEE IECON 22nd International Conference on Industrial Electronics, Control, and Instrumentation, vol. 1, 5-10 Aug. 1996, pp. 138-143.

[2] J.J. Rodriguez-Andina, M.J. Moure, and M.D. Valdes, “Features, design tools, and application domains of FPGAs”, IEEE Trans. Ind. Electron., vol. 54, no. 4, pp. 1810-1823, Aug. 2007.

[3] K. Sridharan and T. Priya, “The design of a hardware accelerator for realtime complete visibility graph construction and efficient FPGA implementation”, IEEE Trans. Ind. Electron., vol. 52, no. 4, pp. 1185-1187, Aug. 2005.

[4] L. Franquelo, M. Prats, R. Portillo, J. Galvan, M. Perales, J. Carrasco, E. Diez, and J. Jimenez, “Three-dimensional space-vector modulation algorithm for four-leg multilevel converters using abc coordinates”, IEEE Trans. Ind. Electron., vol. 53, no. 2, pp. 459-466, Apr. 2006.

[5] Xilinx Inc., “Foundation Series ISE 3.11 User Guide”, 2000.

