ISSN 2322-0929
Vol.04, Issue.10,
October-2016,
Pages:0921-0926
www.ijvdcs.org
Copyright @ 2016 IJVDCS. All rights reserved.
Implementation of Brent-Kung Adder Using QCA Technology
MALOTHU KIRAN KUMAR1, CH.CURY2
1PG Scholar, Sreekavitha Engineering College, Khammam, TS, India.
2Associate Professor, Sreekavitha Engineering College, Khammam, TS, India.
Abstract: Quantum-dot cellular automata (QCA) is an emerging technology suitable for the implementation of ultra-dense, low-power,
high-performance digital circuits. As transistors decrease in size, more and more of them can be accommodated on a single chip,
thus increasing the chip's computational capability. However, transistors cannot get much smaller than their current size. The
QCA approach represents one of the possible solutions for overcoming this physical limit. A quantum-dot cellular automaton has a
simple cell as its basic element, and the cell is used as a building block to construct gates and wires. In this paper, a new
Brent-Kung adder (BKA) is designed with the aim of achieving the best area-delay tradeoff. The 32-bit version of the Brent-Kung
adder requires 60 LUTs and shows a delay of 14.876 ns, compared with 76 LUTs and 13.495 ns for the existing ripple carry
adder (RCA).
Keywords: QCA (Quantum Dot Cellular Automata), Majority Gates (MG), Xilinx ISE, Verilog.
I. INTRODUCTION
Quantum dot cellular automata (sometimes referred to simply
as quantum cellular automata or QCA) are proposed models of
quantum computation, which have been devised in analogy to
conventional models of cellular automata introduced by von
Neumann. The Notre Dame group has developed a new
paradigm for ultra-dense and ultra-fast information processing
in nanoelectronic systems. These "Quantum Cellular
Automata" (QCAs) are the first concrete proposal for a
technology based on arrays of coupled quantum dots. The basic
building block of these cellular arrays is the Notre Dame Logic
Cell, as it has been called in the literature. The phenomenon of
Coulomb exclusion, which is a synergistic interplay of quantum
confinement and Coulomb interaction, leads to a bistable
behavior of each cell, which makes possible their use in
large-scale cellular arrays. The physical interaction between
neighboring cells has been exploited to implement logic
functions. New functionality may be achieved in this fashion,
and the Notre Dame group invented a versatile majority logic
gate. In a series of papers, the feasibility of QCA wires, wire
crossings, inverters, and Boolean logic gates was demonstrated.
A major finding is that all logic functions may be integrated in a
hierarchical fashion which allows the design of complicated
QCA structures. The most complicated system which was
simulated to date is a one-bit full adder consisting of some 200
cells. In addition to exploring these new concepts, efforts are
under way to physically realize such structures both in
semiconductor and metal systems. Extensive modeling work of
semiconductor quantum dot structures has helped identify
optimum design parameters for QCA experimental
implementations. In the first year of this project, the
experimental effort has laid the groundwork for the study of
Coulomb coupling in quantum dot structures.
The newly acquired ultra-high-resolution field emission
scanning electron microscope was installed and used to examine
nanostructures with a resolution of 1 nanometer. Coupled dot
structures were fabricated in gallium arsenide heterostructures
and await testing in the newly established cryogenic
measurement facility. QCA is a novel emerging technology in
which logic states are stored not as voltage levels but as the
positions of individual electrons. Conceptually, QCA represents
binary information by utilizing a bi-stable charge configuration
rather than a current switch. A QCA cell can be viewed as a set
of four “dots” that are positioned at the corners of a square. A
quantum dot is a site in a cell in which a charge can be
localized. The cell contains two extra mobile electrons that can
quantum mechanically tunnel between dots, but not between cells. In the
ground state and in the absence of external electrostatic
perturbation, the electrons are forced to the corner positions to
maximize their separation due to Coulomb repulsion, as shown
in Fig. 1.
Fig.1. QCA Cell.
MALOTHU KIRAN KUMAR, CH.CURY
International Journal of VLSI System Design and Communication Systems
Volume.04, IssueNo.10, October-2016, Pages: 0921-0926
The two possible charge configurations are used to represent
binary “0” and “1”. Note that in the case of an isolated cell, the
two polarization states are energetically degenerate. However,
the presence of other charges (neighbor cells) breaks the
degeneracy, and one polarization state becomes the cell ground
state. Polarization P measures the extent to which the charge
distribution is aligned along one of the diagonal axes. If the
charge density on dot i is $\rho_i$, then the polarization is defined as:
$$P = \frac{(\rho_1 + \rho_3) - (\rho_2 + \rho_4)}{\rho_1 + \rho_2 + \rho_3 + \rho_4} \qquad (1)$$
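As a minimal sketch of Eq. (1), the polarization can be evaluated for a few idealized charge distributions. The function name `polarization` and the sample charge densities are illustrative assumptions, not taken from the paper; which sign of P encodes binary "1" is likewise not fixed here.

```python
def polarization(rho1, rho2, rho3, rho4):
    """Cell polarization P from the four dot charge densities, as in Eq. (1)."""
    total = rho1 + rho2 + rho3 + rho4
    if total == 0:
        raise ValueError("cell holds no charge")
    return ((rho1 + rho3) - (rho2 + rho4)) / total

# Hypothetical, idealized distributions of the two mobile electrons:
p_diag_13 = polarization(1.0, 0.0, 1.0, 0.0)   # both electrons on dots 1 and 3
p_diag_24 = polarization(0.0, 1.0, 0.0, 1.0)   # both electrons on dots 2 and 4
p_uniform = polarization(0.5, 0.5, 0.5, 0.5)   # charge spread evenly (no polarization)

print(p_diag_13, p_diag_24, p_uniform)
```

The two fully localized configurations give the extreme values of P with opposite signs, which is what allows them to stand for the two binary states.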
II. QCA ADDER
This section describes the binary adder using QCA, its
functionality, and its simulation and synthesis in terms of the
sum block and carry chain. ModelSim is used to simulate the
design and check its functionality. Once functional verification
is done, the design is taken to the Xilinx tool for synthesis and
netlist generation. Appropriate test cases were identified to
test the modeled QCA design, and the simulation results obtained
for them show that the modeled design works as intended. To
introduce the novel architecture proposed for implementing
ripple adders in QCA, consider two n-bit addends
$A = a_{n-1}, \ldots, a_0$ and $B = b_{n-1}, \ldots, b_0$, and suppose that for the i-th bit
position (with i = n − 1, . . . , 0) the auxiliary propagate and
generate signals, namely $p_i = a_i + b_i$ and $g_i = a_i \cdot b_i$, are
computed. With $c_i$ being the carry produced at the generic (i−1)-th bit
position, the carry signal $c_{i+2}$, furnished at the (i+1)-th bit
position, can be computed using the conventional CLA logic
given in (2). The latter can be rewritten as given in (3)
by exploiting Theorems 1 and 2. In this way, the RCA action
needed to propagate the carry $c_i$ through the two subsequent bit
positions requires only one MG.
Fig.2. Novel n-Bit Adder (a) Carry Chain and (b) Sum
Block.
Conversely, conventional circuits operating in the RCA
fashion, namely the RCA and the CFA, require two cascaded
MGs to perform the same operation. In other words, an RCA
adder designed as proposed has a worst-case path almost halved
with respect to the conventional RCA and CFA. Equation (3) is
exploited in the design of the novel 2-bit module used in the
carry chain of Fig. 2(a), which also computes the carry
$c_{i+1} = M(p_i, g_i, c_i)$.
The proposed n-bit adder is then implemented by cascading n/2
2-bit modules, as shown in Fig. 2(a). Having assumed that the
carry-in of the adder is $c_{in} = 0$, the signal $p_0$ is not required and
the 2-bit module used at the least significant bit position is
simplified. The sum bits are finally computed as shown in Fig.
2(b). It must be noted that the time-critical addition is performed
when a carry is generated at the least significant bit position (i.e.,
$g_0 = 1$) and is then propagated through the subsequent bit
positions to the most significant one. In this case, the first 2-bit
module computes $c_2$, contributing two cascaded MGs to the
worst-case computational path. Each subsequent 2-bit module
contributes only one MG, thus introducing
a total of (n − 2)/2 cascaded MGs. Considering
that two further MGs and one inverter are required to compute
the sum bits, the worst-case path of the novel adder consists of
(n/2) + 3 MGs and one inverter.
$$c_{i+2} = g_{i+1} + p_{i+1} \cdot g_i + p_{i+1} \cdot p_i \cdot c_i \qquad (2)$$
$$c_{i+2} = M\big(M(a_{i+1}, b_{i+1}, g_i),\ M(a_{i+1}, b_{i+1}, p_i),\ c_i\big) \qquad (3)$$
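The equivalence of (2) and (3) can be checked exhaustively over all input combinations. The sketch below is only an illustration; the helper names `maj`, `carry_cla` and `carry_mg` are assumptions introduced here, with (a0, b0) standing for bit position i and (a1, b1) for bit position i+1.

```python
from itertools import product

def maj(a, b, c):
    """Three-input majority gate: M(a, b, c) = ab + ac + bc."""
    return (a & b) | (a & c) | (b & c)

def carry_cla(a0, b0, a1, b1, c):
    """Eq. (2): c_{i+2} from conventional carry look-ahead logic."""
    g0, p0 = a0 & b0, a0 | b0
    g1, p1 = a1 & b1, a1 | b1
    return g1 | (p1 & g0) | (p1 & p0 & c)

def carry_mg(a0, b0, a1, b1, c):
    """Eq. (3): the same carry built from majority gates.

    The incoming carry c passes through only the outermost gate.
    """
    g0, p0 = a0 & b0, a0 | b0
    return maj(maj(a1, b1, g0), maj(a1, b1, p0), c)

# Compare both forms on all 2^5 combinations of (a0, b0, a1, b1, c).
mismatches = [bits for bits in product((0, 1), repeat=5)
              if carry_cla(*bits) != carry_mg(*bits)]
print("mismatches:", mismatches)
```

Because the two inner majority gates depend only on the addend bits, they can be evaluated before the carry arrives, which is why the carry crosses a single MG per 2-bit module.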
To verify the basic operation of the QCA adder, the majority
operation $M(a, b, c) = a \cdot b + a \cdot c + b \cdot c$ must be
performed. The operation is divided into two blocks:
Carry chain
Sum chain
A. Operation with Example
Consider a 4-bit example with a = 5 and b = 2, which gives
sum = 7 and carry = 0, each 4 bits wide. The carry-in is c0 = 0,
and c(i+1) denotes the carry out of bit position i.
Inputs:
A(a3,a2,a1,a0)=0101=5
B(b3,b2,b1,b0)=0010=2
Outputs:
Sum (s3,s2,s1,s0)=0111=7
Carry (c4,c3,c2,c1)=0000=0
B. Carry Chain
As the carry chain of Fig. 2(a) shows, the bits of the 4-bit
inputs a and b are applied to three-input majority gates, each
of which produces one output:
M(a0,b0,0) = M(1,0,0) = 0--------------------------c1=g0
M(a1,b1,g0) = M(0,1,0) = 0-----------------------c2
M(a2,b2,0) = M(1,0,0)= 0---------------------------g2
M(a2,b2,1) = M(1,0,1)= 1---------------------------p2
M(p2,g2,c2) = M(1,0,0) = 0-----------------------c3
M(a3,b3,g2) = M(0,0,0) = 0
M(a3,b3,p2) = M(0,0,1) = 0
M(0,0,c2) = M(0,0,0) = 0--------------------------c4
Finally carry (c4,c3,c2,c1) =0000=0
C. Sum Block
The carries obtained from the carry chain in the above example
are used in the sum block shown in Fig. 2(b) to generate the
sum bits.
M(a0,~c1,b0)=M(1,1,0)= 1
M(~c1,1,0)=M(1,1,0)= 1-----------------s0
M(a1,~c2,b1)=M(0,1,1)= 1
M(~c2,1,0)=M(1,1,0)= 1------------------s1
M(p2,~c3,g2)=M(1,1,0)= 1
M(~c3,1,0)=M(1,1,0)= 1-----------------s2
M(a3,~c4,b3)=M(0,1,0)= 0
M(~c4,0,0)=M(1,0,0)= 0------------------s3
Finally sum (s3,s2,s1,s0)=0111=7
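The worked example can be reproduced with a small behavioral model built only from majority gates and inverters. This is a sketch under stated assumptions, not the authors' Verilog: the names `maj` and `qca_add` are hypothetical, the carry list is indexed so that c[0] is the carry-in and c[i+1] is the carry out of bit i, and the least significant 2-bit module is not simplified as it is in the paper.

```python
def maj(a, b, c):
    """Three-input majority gate: M(a, b, c) = ab + ac + bc."""
    return (a & b) | (a & c) | (b & c)

def qca_add(a, b, n=4):
    """Add two n-bit integers (n even, carry-in 0) using majority gates only.

    Returns (sum as an integer, [c1, ..., cn]).
    """
    abit = [(a >> i) & 1 for i in range(n)]
    bbit = [(b >> i) & 1 for i in range(n)]
    c = [0] * (n + 1)                          # c[0] is the carry-in
    for i in range(0, n, 2):                   # one 2-bit module per iteration
        g = maj(abit[i], bbit[i], 0)           # g_i = a_i AND b_i
        p = maj(abit[i], bbit[i], 1)           # p_i = a_i OR b_i
        c[i + 1] = maj(p, g, c[i])             # c_{i+1} = M(p_i, g_i, c_i)
        c[i + 2] = maj(maj(abit[i + 1], bbit[i + 1], g),
                       maj(abit[i + 1], bbit[i + 1], p),
                       c[i])                   # Eq. (3)
    s = []
    for i in range(n):
        nc = 1 - c[i + 1]                      # inverter on the outgoing carry
        s.append(maj(nc, maj(abit[i], bbit[i], nc), c[i]))
    total = sum(bit << i for i, bit in enumerate(s))
    return total, c[1:]

total, carries = qca_add(5, 2)
print(total, carries)
```

Each sum bit uses two majority gates and one inverter after its carry is available, matching the count used in the worst-case path estimate of (n/2) + 3 MGs and one inverter.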
III. BRENT KUNG ADDER
The Brent-Kung adder is a parallel prefix adder. Parallel
prefix adders are a special class of adders based on the
use of generate and propagate signals. The simpler Brent-Kung
adder was proposed to overcome the disadvantages of the
Kogge-Stone adder: the cost and wiring complexity are greatly
reduced. However, the logic depth of the Brent-Kung adder increases to
$2\log_2 n - 1$, so the speed is lower. The block diagram of the 4-bit
Brent-Kung adder is shown in Fig. 3.
Fig.3. Brent-Kung Prefix Adder
In an IC design environment, chip performance is influenced
by the design environment, the schematic, and the sizing parameters
of the transistors. Therefore, this study is an attempt to investigate the
performance of a 4-bit Brent-Kung parallel prefix adder using
Silvaco EDA tools, targeted to 0.18 um Silterra technology.
The objective of this project is to review the performance of the
adder for different transistor gate sizings and
schematics. Furthermore, the study was carried out by
implementing the Brent-Kung adder with basic logic gates and
with compound gates, and then simulating the design with various
transistor sizes in order to see the effects on propagation delay,
power consumption, and the number of transistors used. The
results show that larger transistor sizes
reduce the propagation delay and proportionally
increase the power consumption. In addition, the compound-gate
implementation consumes about 35.58% less power, has
9.16% lower propagation delay, and uses 96 fewer transistors
than the basic-logic-gate implementation. Nevertheless, larger buffers
are required to stabilize the output in the compound-gate
schematic. An adder is one of the basic building blocks of
common data path components. As such, adders are of immense
importance to designers, being so commonly used and such a
critical part of the data path. For smaller adders, carry-lookahead,
carry-skip, or carry-select will suffice, but as the width of
the adder grows, the delay of passing the carry through the
stages begins to dominate.
Therefore, in current technology, parallel prefix adders are
among the best adders with respect to area and time
(cost-performance ratio), and are particularly good for the high-speed
addition of large numbers. Moreover, the requirements of the
adder are that it is primarily fast and secondarily efficient in
terms of power consumption and chip area. In the term
"parallel prefix adder", "prefix" means that the
outcome of the operation depends on the initial
inputs, and "parallel" means that the operation is
executed in parallel. This is done by
segmenting the operation into smaller pieces that are computed in
parallel, so that the computation of all bits of the sum begins
concurrently. Many parallel prefix adders have been developed, for
example the J. Sklansky conditional adder (1960), the Kogge-Stone
adder (1973), the Ladner-Fischer adder (1980), the Brent-Kung adder
(1982), the Han-Carlson adder (1987), and the S. Knowles adder (1999).
Other parallel adder architectures include the H. Ling adder (1981)
and the Beaumont-Smith adder (2001). In practice, the Brent-Kung
parallel prefix adder has a low fan-out from each prefix cell but a long
critical path, and is not capable of extremely high-speed addition.
In spite of that, this adder was proposed as an optimized and regular
design of a parallel adder that addresses the problem of
connecting gates in a way that minimizes chip area. Accordingly, it
is considered one of the better tree adders for minimizing
wiring tracks, fan-out, and gate count, and is used as a basis for
many other networks.
Fig.4. Schematic View Implementation of 16-Bit Adder
Stages: 2 log2(N) − 1
Fan-out: 2
Avoids explosion of wires
Odd computation, then even
In any row, at most one pair
High-performance microprocessor units require high-performance
adders and other arithmetic units. Modern microprocessors, however,
are 32-bit or 64-bit, as that is the minimum required for
floating-point arithmetic as per the IEEE 754 standard. 8-bit
and 16-bit arithmetic processors are normally found in
microcontroller applications for embedded systems, where
high speed is important but low-power constraints dominate
system design. A good metric of performance for such
designs is the power-delay product (or, equivalently,
energy per bit). Many designs give high speed at the cost of
more power, or low power at the cost of low speed. The
design of a 16-bit Brent-Kung adder presented here has the
lowest delay and also the lowest power-delay product
among the adders compared (Table 2) in similar technology nodes. The
design makes use of logical-effort-based sizing of
transistors and advanced layout techniques like fingering
and interdigitating to reduce the self-loading of the
transistors from parasitic transistor capacitances. Addition is
one of the simplest and most commonly used operations and is in
most cases the speed-determining factor for arithmetic operations.
The addition of two binary numbers is the fundamental
arithmetic operation in microprocessors, digital signal
processors, and data-processing application-specific integrated
circuits. Several algorithms have been presented for high speed
parallel addition, and there is generally a tradeoff between speed
and area. Hence, binary adders are crucial building blocks in
very large-scale integrated circuits. For the optimization of
speed in adders, the most important factor is carry generation.
For the implementation of a fast adder, the generated carry
should be driven to the output as fast as possible, thereby
reducing the worst path delay which determines the ultimate
speed of the digital structure.
In the design for timing optimization, a network can be
optimized at either the circuit or the logic level. Logic-level
optimization is done by manipulating Boolean equations,
whereas circuit level optimization can be carried out by
manipulating transistor sizes and circuit topologies. In the
optimization for area, care should be taken in the design of the
building blocks of the structure, which determine the area
occupied by the architecture and, finally, also affect the speed.
The type of structure of any adder greatly affects the speed of
the circuit. The logarithmic structure is considered to be one of
the fastest structures. The logarithmic concept is used to
combine its operands in a tree-like fashion. The logarithmic
delay is obtained by restructuring the look-ahead adder. The
restructuring is dependent on the associative property, and the
delay obtained is equal to $(\log_2 N)\,t$, where N is the
number of input bits to the adder and t is the propagation delay
time. Hence, for a 16-bit structure, the logarithmic adder has a
delay equal to 4t, while for a simple ripple carry adder the
delay is given by $(N-1)\,t$ and is equal to 15t.
Hence it is seen that this structure greatly reduces the delay and
would be especially beneficial for a structure with a large number
of inputs. This advantage is, however, obtained at the expense
of large area and a complex structure. In the following section, a
structure known as the Brent-Kung structure, which was first
proposed by Brent and Kung in 1982 and which uses the
logarithmic concept, is discussed. This structure uses an
operator known as the dot (·) operator, which is explained in the
architecture description, for its basic blocks.
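The delay comparison between the logarithmic structure and the ripple carry adder can be tabulated for a few word lengths. This is a simple arithmetic sketch of the two expressions quoted in the text, (log2 N) t and (N − 1) t; the function names are assumptions made for illustration, and delays are expressed in units of t.

```python
import math

def logarithmic_delay(n_bits):
    """Delay of a logarithmic (tree) adder in units of t: log2(N)."""
    return math.log2(n_bits)

def ripple_delay(n_bits):
    """Delay of a simple ripple carry adder in units of t: N - 1."""
    return n_bits - 1

# Word lengths chosen as powers of two so that log2(N) is an integer.
for n in (8, 16, 32, 64):
    print(f"N={n}: logarithmic {logarithmic_delay(n):.0f}t, ripple {ripple_delay(n)}t")
```

For N = 16 this gives the 4t and 15t figures stated above, and the gap widens as the word length grows.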
A. Brent Kung Architecture
The Brent-Kung structure, which uses the logarithmic concept,
is most easily understood by dividing the architecture into
three separate stages:
Generate/Propagate Generation
The Dot ( · ) Operation
Sum generation
B. Generate/Propagate Generation
If the inputs to the adder are given by the signals A and B,
then the generate and propagate signals are obtained according
to the following equations.
G = A.B
P = A xor B
C. The Dot ( · ) Operation
The most important building block in the Brent-Kung
structure is the dot (·) operator. The basic inputs to this
structure are the generate and propagate signals generated in the
previous stage. The · operator is a function that takes in two sets
of inputs, (g, p) and (g', p'), and generates a set of outputs:
$$(g, p) \cdot (g', p') = (g + p\,g',\ p\,p')$$
These building blocks are used for the generation of
the carry signals in the structure. From the carry look-ahead
concept, the carry for the kth bit is given by
$$C_{o,k} = G_k + P_k\,(G_{k-1} + P_{k-1}\,(\cdots + P_1\,(G_0 + P_0\,C_{i,0})))$$
Using the dot operator explained above,
this equation can be written for the different carry signals as
$$C_{o,0} = G_0 + P_0\,C_{i,0} = a((G_0, P_0))$$
$$C_{o,1} = G_1 + P_1\,G_0 + P_1 P_0\,C_{i,0} = a((G_1, P_1) \cdot (G_0, P_0))$$
$$\vdots$$
$$C_{o,k} = a((G_k, P_k) \cdot (G_{k-1}, P_{k-1}) \cdot \ldots \cdot (G_0, P_0))$$
where a is a function defined in order to access
the tuples, here $a((g, p)) = g + p\,C_{i,0}$; for $C_{i,0} = 0$, $C_{o,1}$
reduces to $G_1 + G_0 P_1$. The 8-bit Brent-Kung structure is shown in Figure
3.3. This figure shows all the carry signals generated at different
stages in the structure. In the structure, two binary tree structures
are represented: the forward and the reverse trees. The forward
binary tree alone is not sufficient for the generation of all the
carry signals. It can only generate the signals shown as
$C_{o,0}$, $C_{o,1}$, $C_{o,3}$, and $C_{o,7}$. The remaining carry signals are
generated by the reverse binary tree.
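The forward and reverse trees can be sketched in software as two passes over a list of (G, P) pairs. The following is a behavioral illustration under stated assumptions rather than the synthesized design: the names `dot` and `brent_kung_carries` are hypothetical, the carry-in is taken as 0, the word length n is assumed to be a power of two, and the propagate signal is A xor B as defined in the previous stage.

```python
def dot(left, right):
    """(g, p) . (g', p') = (g + p g', p p'); `left` is the more significant group."""
    g, p = left
    g2, p2 = right
    return (g | (p & g2), p & p2)

def brent_kung_carries(a, b, n=8):
    """Carry out of every bit position of the n-bit sum a + b (carry-in 0)."""
    # Stage 1: per-bit generate and propagate signals.
    gp = [(((a >> i) & (b >> i)) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]
    # Forward tree: combine groups of doubling width; completes the
    # prefixes at positions 0, 1, 3, 7, ...
    d = 1
    while d < n:
        for i in range(2 * d - 1, n, 2 * d):
            gp[i] = dot(gp[i], gp[i - d])
        d *= 2
    # Reverse tree: fill in the remaining positions from completed prefixes.
    d = n // 4
    while d >= 1:
        for i in range(3 * d - 1, n, 2 * d):
            gp[i] = dot(gp[i], gp[i - d])
        d //= 2
    # With carry-in 0, the group generate at position k is the carry C_{o,k}.
    return [g for g, _ in gp]

print(brent_kung_carries(0b10110101, 0b01101011))
```

For n = 8 the forward pass uses three levels and the reverse pass two, and every dot cell in the sketch feeds at most two later cells, which is consistent with the low fan-out noted for this adder.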
D. Sum Generation
The final stage in this architecture is the sum generation
stage. The sum is given by
S= A xor B xor C (4)
where A and B are the input signals, and C is the carry signal.
The carry is obtained from the dot operator stage discussed
earlier, and the exclusive OR of A and B is actually the propagate
signal itself. Hence the sum S can finally be represented and
realized as
S = P xor C (5)
Using the above three stages, the complete architecture is built.
IV. RESULTS
Fig.5. RTL Schematic of Majority Gate.
A test bench was developed to test the modeled
design. The test bench automatically forces the
inputs and exercises the operations of the adder.
Fig.6. RTL Schematic of QCA Adders
Fig.7. RTL Schematic of Binary Adder(QCA) –Carry Chain
Fig.8. Technical Schematic of Binary Adder BK (QCA) –
Carry Chain.
V. SIMULATION RESULT
Fig.9.
TABLE I
Design                  Number of 4-input LUTs   Combinational delay (ns)   Power (mW)
RCA (existing)          76                       13.495                     0.6196
Brent-Kung (proposed)   60                       14.876                     0.4892
VI. CONCLUSION
The technology studied in this project, QCA, proves to be a
strong candidate, along with SET, to complement CMOS technology
in digital integrated circuits in the near future. It
must be remembered that analog CMOS technology will still be
needed, at least to interface the real analog world to QCA
quantum dots. QCA technology seems particularly suitable
for high-throughput and deeply pipelined architectures, given
the inherent pipelined operation of a single QCA wire, which acts as
a chain of latches. Applications such as audio and video stream
processing might benefit greatly from QCA architectures. On the
other hand, heavily conditional processing would be penalized,
given the high cost of a stall in an extremely deep pipeline.
Regarding the design flow of QCA circuits, from system
specification to physical fabrication, many points have to be
tuned to reach a viable alternative to CMOS in the digital
domain. Logic synthesis is a key operation that may drastically
reduce the area and delay of the circuits. Since the professional
tools currently available are deeply oriented to the most basic
logic gates feasible in CMOS (usually NAND gates), this will
be an area of great interest. The greatest challenge is, perhaps,
to adapt current tools and design flows from current CMOS
processes in order to accommodate the special features of QCA,
such as gate-level synchronization and in-wire memory. There
may be a need to adapt many existing CMOS circuits to QCA,
and this may result in the exclusive use of AND, OR, and inverter
gates in QCA circuits. Although this may be the solution for
small circuits, it can become very ineffective for larger
ones. In the future, the project may also be designed at the
transistor level. From the gate-level design comparison in
Table I, it can be concluded that the proposed Brent-Kung-based
QCA adder is better than the existing RCA-based QCA
adder in terms of LUT count and power consumption.
VII. REFERENCES
[1] V. Pudi and K. Sridharan, “Low complexity design of ripple
carry and Brent–Kung adders in QCA,” IEEE Trans.
Nanotechnol., vol. 11, no. 1, pp. 105–119, Jan. 2012.
[2] V. Pudi and K. Sridharan, “Efficient design of a hybrid adder
in quantum dot cellular automata,” IEEE Trans. Very Large
Scale Integr. (VLSI) Syst., vol. 19, no. 9, pp. 1535–1548, Sep.
2011.
[3] S. Perri and P. Corsonello, “New methodology for the design
of efficient binary addition in QCA,” IEEE Trans. Nanotechnol.,
vol. 11, no. 6, pp. 1192–1200, Nov. 2012.
[4] V. Pudi and K. Sridharan, “New decomposition theorems on
majority logic for low-delay adder designs in quantum dot
cellular automata,” IEEE Trans. Circuits Syst. II, Exp. Briefs,
vol. 59, no. 10, pp. 678–682, Oct. 2012.
[5] K. Walus and G. A. Jullien, “Design tools for an emerging
SoC technology: Quantum-dot cellular automata,” Proc. IEEE,
vol. 94, no. 6, pp. 1225–1244, Jun. 2006.
[6] S. Bhanja, M. Ottavi, S. Pontarelli, and F. Lombardi, “QCA
circuits for robust coplanar crossing,” J. Electron. Testing,
Theory Appl., vol. 23, no. 2, pp. 193–210, Jun. 2007.