
ISSN 2322-0929

Vol.04, Issue.10,

October-2016,

Pages:0921-0926

www.ijvdcs.org

Copyright @ 2016 IJVDCS. All rights reserved.

Implementation of Brent-Kung Adder Using QCA Technology

MALOTHU KIRAN KUMAR1, CH.CURY2

1PG Scholar, Sreekavitha Engineering College, Khammam, TS, India.

2Associate Professor, Sreekavitha Engineering College, Khammam, TS, India.

Abstract: Quantum-dot cellular automata (QCA) are a new technology suitable for the implementation of ultra-dense, low-power, high-performance digital circuits. As transistors decrease in size, more and more of them can be accommodated on a single chip, thus increasing chip computational capabilities. However, transistors cannot get much smaller than their current size. The QCA approach represents one of the possible solutions for overcoming this physical limit. A quantum-dot cellular automaton has a simple cell as its basic element, and the cell is used as a building block to construct gates and wires. This paper presents a Brent-Kung adder (BKA) designed for a good area-delay tradeoff. The 32-bit version of the BKA requires 60 LUTs and shows a delay of 14.876 ns; compared with the existing ripple carry adder (RCA) in Table I, it uses fewer LUTs and less power, while its delay is slightly higher.

Keywords: QCA (Quantum Dot Cellular Automata), Majority Gates (MG), Xilinx ISE, Verilog.

I. INTRODUCTION

Quantum dot cellular automata (sometimes referred to simply

as quantum cellular automata or QCA) are proposed models of

quantum computation, which have been devised in analogy to

conventional models of cellular automata introduced by von

Neumann. The Notre Dame group has developed a new

paradigm for ultra-dense and ultra-fast information processing

in nanoelectronic systems. These "Quantum Cellular

Automata" (QCAs) are the first concrete proposal for a

technology based on arrays of coupled quantum dots. The basic

building block of these cellular arrays is the Notre Dame Logic

Cell, as it has been called in the literature. The phenomenon of

Coulomb exclusion, which is a synergistic interplay of quantum

confinement and Coulomb interaction, leads to a bi-stable

behavior of each cell which makes possible their use in large-

scale cellular arrays. The physical interaction between

neighboring cells has been exploited to implement logic

functions. New functionality may be achieved in this fashion,

and the Notre Dame group invented a versatile majority logic

gate. In a series of papers, the feasibility of QCA wires, wire

crossings, inverters, and Boolean logic gates was demonstrated.

A major finding is that all logic functions may be integrated in a

hierarchical fashion which allows the design of complicated

QCA structures. The most complicated system which was

simulated to date is a one-bit full adder consisting of some 200

cells. In addition to exploring these new concepts, efforts are

under way to physically realize such structures both in

semiconductor and metal systems. Extensive modeling work of

semiconductor quantum dot structures has helped identify

optimum design parameters for QCA experimental

implementations. In the first year of this project, the

experimental effort has laid the groundwork for the study of

Coulomb coupling in quantum dot structures.

The newly acquired ultra-high resolution field emission

scanning electron microscope was installed and used to examine

nanostructures with a resolution of 1 nanometer. Coupled dot

structures were fabricated in gallium arsenide heterostructures

and await testing in the newly-established cryogenic

measurement facility. QCA is a novel emerging technology in

which logic states are not stored as voltage levels, but rather the

position of individual electrons. Conceptually, QCA represents

binary information by utilizing a bi-stable charge configuration

rather than a current switch. A QCA cell can be viewed as a set

of four “dots” that are positioned at the corners of a square. A

quantum dot is a site in a cell in which a charge can be

localized. The cell contains two extra mobile electrons that can

quantum mechanically tunnel between dots, but not cells. In the

ground state and in the absence of external electrostatic

perturbation, the electrons are forced to the corner positions to

maximize their separation due to Coulomb repulsion, as shown

in Fig. 1.

Fig.1. QCA Cell.

MALOTHU KIRAN KUMAR, CH.CURY

International Journal of VLSI System Design and Communication Systems

Volume.04, IssueNo.10, October-2016, Pages: 0921-0926

The two possible charge configurations are used to represent

binary "0" and "1". Note that in the case of an isolated cell, the

two polarization states are energetically degenerate. However,

the presence of other charges (neighbor cells) breaks the

degeneracy and one polarization state becomes the cell ground

state. Polarization P measures the extent to which the charge

distribution is aligned along one of the diagonal axes. If the

charge density on dot i is ρ_i, then the polarization is defined as:

$$P = \frac{(\rho_1 + \rho_3) - (\rho_2 + \rho_4)}{\rho_1 + \rho_2 + \rho_3 + \rho_4} \qquad (1)$$
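The polarization in (1) can be evaluated numerically as a quick check. The following is a minimal sketch, assuming the four charge densities are given as plain numbers in dot order 1 to 4; the function name `polarization` is hypothetical and not part of the paper.

```python
def polarization(rho):
    """Cell polarization per Eq. (1); rho = (rho1, rho2, rho3, rho4)."""
    r1, r2, r3, r4 = rho
    total = r1 + r2 + r3 + r4
    if total == 0:
        raise ValueError("total charge density must be non-zero")
    # Charge on one diagonal (dots 1 and 3) minus charge on the other (dots 2 and 4).
    return ((r1 + r3) - (r2 + r4)) / total
```

With both electrons on dots 1 and 3 the function gives +1, with both on dots 2 and 4 it gives −1, and an evenly spread charge gives 0.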

II. QCA ADDER

This section presents the QCA binary adder together with its simulation and synthesis results in terms of the sum block and the carry chain. The ModelSim tool is used to simulate the design and check its functionality. Once functional verification is done, the design is taken to the Xilinx tool for synthesis and netlist generation. Appropriate test cases were identified to test the modeled QCA design, and the simulation results obtained for them show that the QCA adder operates as intended.

To introduce the novel architecture proposed for implementing ripple adders in QCA, consider two n-bit addends A = a_{n−1}, . . . , a_0 and B = b_{n−1}, . . . , b_0 and suppose that for the i-th bit position (with i = n − 1, . . . , 0) the auxiliary propagate and generate signals, namely p_i = a_i + b_i and g_i = a_i · b_i, are computed. With c_i being the carry produced at the generic (i−1)-th bit position, the carry signal c_{i+2}, furnished at the (i+1)-th bit position, can be computed using the conventional CLA logic given in (2). The latter can be rewritten as given in (3) by exploiting Theorems 1 and 2 on majority logic. In this way, the RCA action, needed to propagate the carry c_i through the two subsequent bit positions, requires only one MG.

Fig.2. Novel n-Bit Adder (a) Carry Chain and (b) Sum

Block.

Conversely, conventional circuits operating in the RCA fashion, namely the RCA and the CFA, require two cascaded MGs to perform the same operation. In other words, an RCA adder designed as proposed has a worst-case path almost halved with respect to the conventional RCA and CFA. Equation (3) is exploited in the design of the novel 2-bit module, which also computes the carry c_{i+1} = M(p_i, g_i, c_i). The proposed n-bit adder is then implemented by cascading n/2 2-bit modules, as shown in Fig. 2(a). Having assumed that the carry-in of the adder is c_in = 0, the signal p_0 is not required and the 2-bit module used at the least significant bit position is simplified. The sum bits are finally computed as shown in Fig. 2(b). Note that the time-critical addition occurs when a carry is generated at the least significant bit position (i.e., g_0 = 1) and is then propagated through the subsequent bit positions to the most significant one. In this case, the first 2-bit module computes c_2, contributing two cascaded MGs to the worst-case computational path. Each subsequent 2-bit module contributes only one MG, adding (n − 2)/2 cascaded MGs in total. Considering that two further MGs and one inverter are required to compute the sum bits, the worst-case path of the novel adder consists of (n/2) + 3 MGs and one inverter.

$$c_{i+2} = g_{i+1} + p_{i+1} \cdot g_i + p_{i+1} \cdot p_i \cdot c_i \qquad (2)$$

$$c_{i+2} = M\big(M(a_{i+1}, b_{i+1}, g_i),\; M(a_{i+1}, b_{i+1}, p_i),\; c_i\big) \qquad (3)$$
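The equivalence of (2) and (3) can be checked exhaustively, since only five input bits are involved. The sketch below is an illustration rather than the authors' code; the helper names `maj`, `carry_cla`, `carry_mg` and `forms_agree` are hypothetical, and bits are modelled as Python integers 0 and 1.

```python
from itertools import product


def maj(a, b, c):
    """Three-input majority gate: M(a, b, c) = a.b + a.c + b.c."""
    return (a & b) | (a & c) | (b & c)


def carry_cla(a0, b0, a1, b1, c):
    """c_{i+2} from the conventional CLA expression, Eq. (2)."""
    p0, g0 = a0 | b0, a0 & b0
    p1, g1 = a1 | b1, a1 & b1
    return g1 | (p1 & g0) | (p1 & p0 & c)


def carry_mg(a0, b0, a1, b1, c):
    """c_{i+2} from the majority-gate form, Eq. (3)."""
    p0, g0 = a0 | b0, a0 & b0
    return maj(maj(a1, b1, g0), maj(a1, b1, p0), c)


def forms_agree():
    """True if (2) and (3) match on all 32 input combinations."""
    return all(carry_cla(*bits) == carry_mg(*bits)
               for bits in product((0, 1), repeat=5))
```

Note that once g_i and p_i are available, the incoming carry c_i passes through a single majority gate in `carry_mg`, which is the property the text relies on for the shortened carry path.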

To verify the basic operation of the QCA adder, recall that the majority gate computes M(a, b, c) = a · b + a · c + b · c. The operation is divided into two blocks:

Carry chain

Sum chain

A. Operation with Example

Consider a 4-bit example with a = 5 and b = 2, which gives sum = 7 and carry = 0, each 4 bits wide.

Inputs:

A(a3,a2,a1,a0)=0101=5

B(b3,b2,b1,b0)=0010=2

Outputs:

Sum (s3,s2,s1,s0)=0111=7

Carry (c3,c2,c1,c0)=0000=0

B. Carry Chain

As seen in the carry chain of Fig. 2(a), the 4-bit inputs a and b are applied to majority gates, each of which takes three 1-bit inputs and produces one output. The label at the right of each line names the carry by the bit position that produces it, while C2 on the input side denotes the carry into bit 2, as in (2) and (3).

M(a0,b0,0) = M(1,0,0) = 0 ---- c0 = g0

M(a1,b1,g0) = M(0,1,0) = 0 ---- c1

M(a2,b2,0) = M(1,0,0) = 0 ---- g2

M(a2,b2,1) = M(1,0,1) = 1 ---- p2

M(P2,g2,C2) = M(1,0,0) = 0 ---- c2

M(a3,b3,g2) = M(0,0,0) = 0

M(a3,b3,P2) = M(0,0,1) = 0

M(0,0,C2) = 0 ---- c3

Finally, carry (c3,c2,c1,c0) = 0000 = 0

C. Sum Block

The carries obtained in the carry chain example above are used in the sum block, shown in Fig. 2(b), to generate the sum bits. Here c(i+1) denotes the carry out of bit i.

M(a0,~c1,g0) = M(1,1,0) = 1

M(~c1,1,0) = M(1,1,0) = 1 ---- s0

M(a1,~c2,b1) = M(0,1,1) = 1

M(~c2,1,0) = M(1,1,0) = 1 ---- s1

M(p2,~c3,g2) = M(1,1,0) = 1

M(~c3,1,0) = M(1,1,0) = 1 ---- s2

M(a3,~c4,b3) = M(0,1,0) = 0

M(~c4,0,0) = M(1,0,0) = 0 ---- s3

Finally, sum (s3,s2,s1,s0) = 0111 = 7
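The worked example can be reproduced with a small bit-level model built only from majority gates and inverters. This is a sketch under stated assumptions, not the authors' Verilog: the function names `maj` and `qca_ripple_add` are hypothetical, the carry chain uses one gate per bit, c_{i+1} = M(a_i, b_i, c_i), instead of the 2-bit module, and each sum bit uses the two-gate form seen in the example, s_i = M(~c_{i+1}, M(a_i, b_i, ~c_{i+1}), c_i).

```python
def maj(a, b, c):
    """Three-input majority gate."""
    return (a & b) | (a & c) | (b & c)


def qca_ripple_add(a, b, n=4):
    """Add two n-bit integers with majority gates; carry-in is assumed 0.

    Returns (sum, carries), where carries[i] is the carry into bit i
    and carries[n] is the final carry out.
    """
    carries = [0] * (n + 1)
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        carries[i + 1] = maj(ai, bi, carries[i])

    total = 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        not_cout = 1 - carries[i + 1]          # inverter on c_{i+1}
        inner = maj(ai, bi, not_cout)          # first sum MG
        total |= maj(not_cout, inner, carries[i]) << i  # second sum MG
    return total, carries
```

Calling `qca_ripple_add(5, 2)` returns a sum of 7 with all carries equal to 0, in agreement with the example.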

III. BRENT KUNG ADDER

The Brent-Kung adder is a parallel prefix adder. Parallel prefix adders are a special class of adders based on the use of generate and propagate signals. The simpler Brent-Kung adder was proposed to address the disadvantages of the Kogge-Stone adder: its cost and wiring complexity are greatly reduced. However, the logic depth of a Brent-Kung adder increases to 2 log2(n) − 1, so its speed is lower. The block diagram of a 4-bit Brent-Kung adder is shown in Fig. 3.

Fig.3. Brent-Kung Prefix Adder

In an IC design environment, chip performance is influenced by the design environment, the schematic, and the sizing parameters of the transistors. At transistor level, the performance of a 4-bit Brent-Kung parallel prefix adder has therefore been investigated using Silvaco EDA tools targeted to 0.18 µm Silterra technology. The objective is to review the performance of the adder under different transistor gate sizings and schematics. The Brent-Kung adder is implemented with basic logic gates and with compound gates, and the design is simulated for various transistor sizes in order to see the effects on propagation delay, power consumption, and the number of transistors used. Increasing the transistor size reduces the propagation delay and proportionally increases the power consumption. In addition, the compound-gate version consumes about 35.58% less power, has 9.16% lower propagation delay, and uses 96 fewer transistors than the basic-logic-gate version. Nevertheless, larger buffers are required to stabilize the output in the compound-gate schematic.

An adder is one of the basic building blocks of common data path components. Adders are therefore of immense importance to designers, being so commonly used and such a critical part of the data path. For smaller adders, carry-lookahead, carry-skip, or carry-select designs will suffice, but as the width of the adder grows, the delay of passing the carry through the stages begins to dominate.

Therefore, in current technology, parallel prefix adders are among the best adders with respect to area and time (cost-to-performance ratio), and are particularly good for the high-speed addition of large numbers. Moreover, the requirements of the adder are that it is primarily fast and secondarily efficient in terms of power consumption and chip area. In the term "parallel prefix adder", "prefix" means that the outcome of the operation depends on the initial inputs, and "parallel" means that the operation is executed in parallel. This is done by segmenting the operation into smaller pieces that are computed in parallel, so that all bits of the sum begin the process concurrently. Many parallel prefix adders have been developed, for example: 1960, J. Sklansky conditional adder; 1973, Kogge-Stone adder; 1980, Ladner-Fischer adder; 1982, Brent-Kung adder; 1987, Han-Carlson adder; and 1999, S. Knowles. Other parallel adder architectures include the H. Ling adder (1981) and the Beaumont-Smith adder (2001). In practice, the Brent-Kung parallel prefix adder has a low fan-out from each prefix cell but has a long critical path and is not capable of extremely high-speed addition. In spite of that, this adder was proposed as an optimized and regular design of a parallel adder that addresses the problem of connecting gates in a way that minimizes chip area. Accordingly, it is considered one of the better tree adders for minimizing wiring tracks, fan-out, and gate count, and is used as a basis for many other networks.

Fig.4. Schematic View Implementation of 16-Bit Adder

Stages: 2 log2 N − 1

Fan-out: 2

Avoids explosion of wires

Odd computation, then even

In any row, at most one pair

High-performance microprocessor units require high-performance adders and other arithmetic units. Modern microprocessors, however, are 32-bit or 64-bit, as that is the minimum required for floating-point arithmetic as per the IEEE 754 standard. 8-bit and 16-bit arithmetic processors are normally found in microcontroller applications for embedded systems, where high speed is important but low-power constraints dominate system design. A good metric of performance for such designs is the power-delay product (or, equivalently, energy per bit). Many designs give high speed at the cost of more power, or low power at the cost of low speed. The design of a 16-bit Brent-Kung adder presented here has the lowest delay and also the lowest power-delay product among the adders compared (Table 2) in similar technology nodes. The design makes use of logical-effort-based sizing of transistors and advanced layout techniques such as fingering and interdigitating to reduce the self-loading of the transistors from parasitic transistor capacitances.

Addition is

one of the simplest and commonly used operations and is in

most cases a speed determining factor for arithmetic operations.

The addition of two binary numbers is the fundamental

arithmetic operation in microprocessors, digital signal

processors, and data-processing application-specific integrated

circuits. Several algorithms have been presented for high speed

parallel addition, and there is generally a tradeoff between speed

and area. Hence, binary adders are crucial building blocks in

very large-scale integrated circuits. For the optimization of

speed in adders, the most important factor is carry generation.

For the implementation of a fast adder, the generated carry

should be driven to the output as fast as possible, thereby

reducing the worst path delay which determines the ultimate

speed of the digital structure.

In the design for timing optimization, a network can be

optimized either at circuit or logic level. Logic level

optimization is done by manipulating boolean equations,

whereas circuit level optimization can be carried out by

manipulating transistor sizes and circuit topologies. In the

optimization for area, care should be taken in the design of the

building blocks of the structure, which determine the area

occupied by the architecture and, finally, also affect the speed.

The type of structure of any adder greatly affects the speed of

the circuit. The logarithmic structure is considered to be one of

the fastest structures. The logarithmic concept is used to

combine its operands in a tree-like fashion. The logarithmic

delay is obtained by restructuring the look-ahead adder. The restructuring depends on the associative property, and the delay obtained is equal to (log2 N)·t, where N is the number of input bits to the adder and t is the propagation delay time. Hence, for a 16-bit structure, the logarithmic adder has a delay equal to 4t, while for a simple ripple carry adder the delay is given by (N−1)·t and is equal to 15t.

Hence it is seen that this structure greatly reduces the delay, and

would be especially beneficial for a structure with large number

of inputs. This advantage is, however, obtained at the expense

of large area and a complex structure. In the following section, a




structure known as the Brent Kung Structure, which was first

proposed by Brent and Kung in 1982 and which uses the

logarithmic concept, is discussed. This structure used an

operator known as the dot (·) operator, which is explained in the

architecture, for its basic blocks.
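The delay comparison above, (log2 N)·t for the logarithmic structure against (N−1)·t for the ripple carry adder, can be tabulated for a few widths. This is a small sketch of the arithmetic only; the function name `adder_delays` is hypothetical and t is treated as a unit stage delay.

```python
import math


def adder_delays(n_bits, t=1.0):
    """Return (logarithmic_delay, ripple_delay) in units of t.

    Uses the idealized expressions from the text: (log2 N) * t for the
    logarithmic adder and (N - 1) * t for the ripple carry adder.
    """
    return math.log2(n_bits) * t, (n_bits - 1) * t


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        log_delay, ripple_delay = adder_delays(n)
        print(f"N={n:2d}  logarithmic={log_delay:g}t  ripple={ripple_delay:g}t")
```

For N = 16 the function returns 4 and 15, matching the 4t and 15t quoted in the text; the gap widens as N grows.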

A. Brent Kung Architecture

In order to approach the structure known as the Brent Kung

Structure, which uses the logarithmic concept, the entire

architecture is easily understood by dividing the system into

three separate stages:

Generate/Propagate Generation

The Dot ( · ) Operation

Sum generation

B. Generate/Propagate Generation

If the inputs to the adder are given by the signals A and B,

then the generate and propagate signals are obtained according

to the following equations.

G = A.B

P = A xor B

C. The Dot ( · ) Operation

The most important building block in the Brent Kung Structure is the dot (·) operator. The basic inputs to this structure are the generate and propagate signals generated in the previous stage. The · operator is a function that takes in two sets of inputs, (g, p) and (g', p'), and generates a set of outputs, (g + pg', pp'). These building blocks are used for the generation of the carry signals in the structure. The carry for the kth bit from the carry look-ahead concept is given by

$$C_{o,k} = G_k + P_k\,(G_{k-1} + P_{k-1}\,(\cdots + P_1\,(G_0 + P_0\,C_{i,0})))$$

Using the dot operator explained above, and taking the carry-in C_{i,0} = 0, this equation can be written for the different carry signals as

$$C_{o,0} = G_0 + P_0\,C_{i,0} = a\big((G_0, P_0)\big)$$

$$C_{o,1} = G_1 + G_0 P_1 = a\big((G_1, P_1) \cdot (G_0, P_0)\big)$$

$$\vdots$$

$$C_{o,k} = a\big((G_k, P_k) \cdot (G_{k-1}, P_{k-1}) \cdot \ldots \cdot (G_0, P_0)\big)$$

where a is a function defined in order to access the tuples; in these expressions it returns the first (generate) element of its argument. The 8-bit Brent Kung Structure is shown in Figure 3.3. This figure shows all the carry signals generated at different stages in the structure. In the structure, two binary trees are represented: the forward and the reverse trees. The forward binary tree alone is not sufficient for the generation of all the carry signals. It can only generate the signals shown as C_{o,0}, C_{o,1}, C_{o,3} and C_{o,7}. The remaining carry signals are generated by the reverse binary tree.
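The dot operator and its use for carry generation can be sketched in a few lines. The code below is illustrative only: the names `dot` and `carries_by_dot` are hypothetical, the carry-in is assumed to be 0, and the tuples are combined one after another (serially) rather than in the tree order of the Brent-Kung structure, which is permissible because the operator is associative.

```python
def dot(hi, lo):
    """Dot operator: (g, p) . (g', p') = (g + p.g', p.p')."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)


def carries_by_dot(a, b, n):
    """Return [C_o0, ..., C_o(n-1)] for n-bit a and b with carry-in 0."""
    # Per-bit (G, P) tuples with G = A.B and P = A xor B.
    gp = [(((a >> k) & 1) & ((b >> k) & 1),
           ((a >> k) & 1) ^ ((b >> k) & 1)) for k in range(n)]
    carries = []
    acc = None
    for k in range(n):
        # acc holds (G_k, P_k) . (G_{k-1}, P_{k-1}) . ... . (G_0, P_0)
        acc = gp[k] if acc is None else dot(gp[k], acc)
        carries.append(acc[0])  # the access function a(): take the generate part
    return carries
```

For example, `carries_by_dot(0b0111, 0b0001, 4)` gives `[1, 1, 1, 0]`, since the carry generated at bit 0 ripples through bits 1 and 2 and is absorbed at bit 3.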

D. Sum Generation

The final stage in this architecture is the sum generation

stage. The sum is given by

S= A xor B xor C (4)

where A and B are the input signals, and C is the carry signal.

The carry is obtained from the dot operator stage discussed

earlier, and the exclusive-OR of A and B is actually the propagate

signal itself. Hence the sum S can finally be represented and

realized as

S = P xor C (5)

Using the above three stages, the complete architecture is built.
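The three stages can be put together in a short behavioural model. The following is a sketch under stated assumptions rather than the synthesized design: the function names `dot` and `brent_kung_add` are hypothetical, the width n is assumed to be a power of two, the carry-in is assumed to be 0, and the loop bounds are one common way of indexing the forward and reverse trees.

```python
def dot(hi, lo):
    """Dot operator: (g, p) . (g', p') = (g + p.g', p.p')."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)


def brent_kung_add(a, b, n=8):
    """Add two n-bit integers (n a power of two); returns (sum, carry_out)."""
    # Stage 1: per-bit generate and propagate signals.
    g = [(a >> k) & (b >> k) & 1 for k in range(n)]
    p = [((a >> k) ^ (b >> k)) & 1 for k in range(n)]
    gp = list(zip(g, p))

    # Stage 2a: forward tree. Afterwards the entries at positions
    # k = 1, 3, 7, ... (k = 2^j - 1) cover bits k down to 0.
    d = 1
    while d < n:
        for k in range(2 * d - 1, n, 2 * d):
            gp[k] = dot(gp[k], gp[k - d])
        d *= 2

    # Stage 2b: reverse tree completes the prefixes at the other positions.
    d = n // 4
    while d >= 1:
        for k in range(3 * d - 1, n, 2 * d):
            gp[k] = dot(gp[k], gp[k - d])
        d //= 2

    # Stage 3: S = P xor C; the carry into bit k is the carry out of bit k-1.
    carry_in = [0] + [gp[k][0] for k in range(n - 1)]
    total = 0
    for k in range(n):
        total |= (p[k] ^ carry_in[k]) << k
    return total, gp[n - 1][0]
```

As a usage example, `brent_kung_add(5, 2, n=4)` returns `(7, 0)`, the same result as the ripple example in Section II.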

IV. RESULTS

Fig.5. RTL Schematic of Majority Gate.

A test bench is developed to test the modeled design. The test bench automatically forces the inputs and exercises the operations of the adder.

Fig.6. RTL Schematic of QCA Adders

Fig.7. RTL Schematic of Binary Adder(QCA) –Carry Chain




Fig.8. Technical Schematic of Binary Adder BK (QCA) –

Carry Chain.

V. SIMULATION RESULT

Fig.9.

TABLE I

Design                   Number of 4-Input LUTs   Combinational Delay (ns)   Power (mW)
RCA (Existing)           76                       13.495                     0.6196
Brent-Kung (Proposed)    60                       14.876                     0.4892

VI. CONCLUSION

QCA, the technology studied in this project, proves to be a strong competitor, along with SET, to complement CMOS technology in digital integrated circuits in the near future. It must be remembered that analog CMOS technology will be needed, at least, to interface the real analog world to QCA quantum dots. QCA technology seems particularly suitable for high-throughput and deeply pipelined architectures, given the inherent pipelined operation of a single QCA wire, which acts as a chain of latches. Applications such as audio and video stream processing might benefit greatly from QCA architectures. On the other hand, heavily conditional processing would be penalized, given the high cost of a stall in an extremely deep pipeline. Regarding the design flow of QCA circuits, from system specifications to physical fabrication, many points have to be tuned to reach a viable alternative to CMOS in the digital domain. Logic synthesis is a key operation that may drastically reduce the area and delay of the circuits. Since the professional tools currently available are deeply oriented to the most basic logic gates feasible in CMOS (usually NAND gates), this will be an area of great interest. The greatest challenge is, perhaps, to adapt current tools and design flows from current CMOS processes in order to accommodate the special features of QCA, such as gate-level synchronization and in-wire memory. There may be a need to adapt many existing CMOS circuits to QCA, and this may result in the exclusive use of AND, OR, and inverter gates in QCA circuits. Although this may be the solution for small circuits, it can become very ineffective for larger ones. In the future, the design may also be carried out at transistor level. From the gate-level comparison in Table I, it can be concluded that the proposed Brent-Kung based QCA adder is better than the existing RCA based QCA adder in LUT count and power, at the cost of a slightly higher combinational delay.

