Fast Time-Recursive Block Correlators for Pseudorandom Sequences

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS


David Akopian, Senior Member, IEEE, Phanikrishna Sagiraju, and Brent Nowak, Member, IEEE

Abstract—Correlators for pseudorandom sequences are used in direct sequence spread spectrum communication systems for signal synchronization and identification of transmitters. Implementation aspects of these correlators are critical for real-time processing of signals. This paper presents fast time-domain correlator structures which significantly reduce redundant operations using time-recursive and block processing. The approach can be extended to other applications which use correlations with both binary and non-binary sequences.

Index Terms—Convolution, correlators, matched filters, pseudonoise codes.

Manuscript received August 04, 2012; revised October 14, 2012; accepted November 12, 2012. This work was supported in part by NEEC-UTSA Center and NSF Grants 0942852 and 0932339.
D. Akopian is with the Department of Electrical and Computer Engineering, University of Texas San Antonio, San Antonio, TX 78249 USA (e-mail: [email protected]).
P. Sagiraju is with CSR Technology Inc, Cedar Rapids, IA 52402 USA (e-mail: [email protected]).
B. Nowak is with the Department of Mechanical Engineering, University of Texas San Antonio, San Antonio, TX 78249 USA (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSI.2012.2230593
1549-8328/$31.00 © 2013 IEEE

I. INTRODUCTION

Receivers of direct sequence spread spectrum (DSSS) systems correlate received and replica signals (codes) for synchronization, identification of transmitters, reading data, measuring propagation delays, etc. [1]. In ranging DSSS systems such as the Global Positioning System (GPS), estimated delays are used to obtain range measurements [2].

The conventional synchronization process can be very time and resource consuming, as it is often realized as a search in several parameter dimensions involving a massive number of correlations. Real-time applications gain from reduced correlation complexity, which relieves computational resources and leads to faster receiver responses. State-of-the-art receivers typically comprise multiple correlators to parallelize and accelerate the processing [3]. For so-called software receivers, however, the complexity must be reduced algorithmically [4], [5]. Software receivers have gained popularity due to their reconfiguration flexibility for multiple and/or adaptive operational modes. FPGAs and digital signal processors are also used to accelerate software receivers.

Correlators are implemented in the time [2], [3], [6], [7] and frequency domains [8]–[12]. Time-domain correlators perform a direct correlation of a replica sequence (code) with received signals. They may be implemented as matched filters or correlator banks [3]. The matched filter is based on a time-recursive approach and is applicable when correlations are implemented in a "filter"-like convolution mode [3], [13]. At each correlation stage, a new fragment of the incoming signal, shifted with respect to the previous fragment, is correlated with an available replica code sequence. Correlator banks comprise multiple autonomous correlators for parallel search over many synchronization options. Time-domain correlators are convenient for hardware implementations due to their simplicity and parallel implementation options. Frequency-domain correlators perform most of the computations in the frequency domain using the Discrete Fourier Transform (DFT), which enables parallel processing of many consecutive correlation phases at once [13]. DFT-based correlators are computationally efficient due to the existence of fast DFT computation methods called Fast Fourier Transforms (FFT) [13]. In particular, FFT-based correlators for GPS receivers have been reported, e.g., in [8]–[12]; similar new algorithms appear each year [14] and are extended to other satellite systems [15].

Despite the computational attractiveness of FFT-based methods, time-domain correlators are still widely used because of their implementation simplicity. Many DSSS systems use binary signals, so multiplications are implemented as sign changes. Fixed-point FFT implementations for long sequences are challenging, and the calculations are not exact due to quantization of the transform coefficients and intermediate products [13]. Another reason for performing calculations in the time domain is that very often only a limited set of correlations is needed for signal synchronization, as the range of possible correlation phases can be reduced based on partial a priori knowledge. Currently, such a limited search is realized efficiently only with time-domain correlators, since FFT methods compute correlations for all signal misalignments (phases). In addition, the performance and type of FFT algorithms vary with sequence length, which is typically not a concern for time-domain techniques.

Several fast time-domain algorithms have previously been proposed for specific types of sequences [16]–[21]. In [6] we proposed computationally efficient algorithms for arbitrary binary DSSS signals that exploit joint (block) processing of sequences, but they are not efficient at high oversampling rates. A time-recursive approach proposed in [7] addresses the oversampled scenario but does not exploit the benefits of block processing for higher computational gains.

In this paper we suggest a time-domain approach which nontrivially integrates our block processing and time-recursive techniques of [6] and [7] to achieve higher computational efficiency. As with other time-domain techniques, the proposed method does not require approximate computations or sequence length constraints, and it can be used for limited phase range correlation scenarios. A time-recursive extension of [6] has also been proposed in [22], but with a higher computational cost than the algorithm proposed in this paper for a moderate number of block-processed sequences. Similarly, another solution [23] does not eliminate the computational redundancy due to high sampling rates. The correlators described in this paper can be used with arbitrary binary sequences and with non-binary sequences whose elements come from a small alphabet, which have been introduced for various applications [24], [25].

The paper is organized as follows. Section II presents time-domain block processing concepts. The time-recursive block-processing technique is derived in Section III. Computational analysis is provided in Section IV.

II. BLOCK CORRELATOR ALGORITHMS

In this section we review our block correlator algorithms [6], which are used to derive the time-recursive structure in the following section. The following correlation model is assumed. A set of input samples is fetched into a buffer memory denoted as $\mathbf{x} = (x_1, x_2, \ldots, x_N)$, where $N$ is the number of samples. This array of samples is correlated with $K$ pseudorandom sequences in a block processing mode, i.e., by sharing computations to reduce arithmetic complexity. The sequences, or equivalently masks, are denoted as $\mathbf{m}^{(k)} = (m_1^{(k)}, m_2^{(k)}, \ldots, m_N^{(k)})$, where $k = 1, \ldots, K$ indexes the respective mask or replica sequence. Initially we assume binary sequences, i.e., $m_i^{(k)} \in \{-1, +1\}$. An extension to non-binary sequences is presented afterwards.

In conventional correlators the samples of the received signal are multiplied element-wise by the samples of the mask, and the resulting products are added to obtain the correlation values

$$c_k = \sum_{i=1}^{N} m_i^{(k)} x_i, \quad k = 1, \ldots, K. \tag{1}$$

One can express this equation in a block processing format using matrix form:

$$\mathbf{c} = \sum_{i=1}^{N} \mathbf{m}_i x_i \tag{2}$$

where $\mathbf{c} = (c_1, \ldots, c_K)^T$ and $\mathbf{m}_i = (m_i^{(1)}, \ldots, m_i^{(K)})^T$.

A. Fast Block Processing Method I

In this block-correlator algorithm, the received samples $x_i$ which multiply the same vectors $\mathbf{m}_i = (m_i^{(1)}, \ldots, m_i^{(K)})^T$ in (2) are processed jointly as a group. As index $i$ identifies received samples, let $G_j$ be the set of sample indices belonging to the same group. Formally, $i \in G_j$ if $\mathbf{m}_i = \mathbf{v}_j$, where $\mathbf{v}_j = (v_j^{(1)}, \ldots, v_j^{(K)})^T$, $j = 0, \ldots, 2^K - 1$, enumerates all possible sign vectors. There are $2^K$ such groups, as the $m_i^{(k)}$ are binary. Then one can represent (2) as

$$\mathbf{c} = \sum_{j=0}^{2^K-1} \sum_{i \in G_j} \mathbf{m}_i x_i.$$

By definition $\mathbf{m}_i = \mathbf{v}_j$ for $i \in G_j$, thus

$$\mathbf{c} = \sum_{j=0}^{2^K-1} \mathbf{v}_j S_j \tag{3}$$

where $S_j = \sum_{i \in G_j} x_i$ is one of $2^K$ partial sums. Equation (3) means that each $S_j$ is computed only once and the $K$ concurrent (block) correlator outputs are computed as linear combinations of these partial sums. The weights $v_j^{(k)} = \pm 1$ are sign changes, i.e., only additions and subtractions are used. The total number of additions is reduced almost $K$ times if the number of partial sums is significantly less than the number of samples. For $K = 3$ there are eight groups $G_j$, from $\mathbf{v}_0 = (-1, -1, -1)$ to $\mathbf{v}_7 = (+1, +1, +1)$.

An accumulator register can be assigned to each group to accumulate and store the partial sum $S_j$. All such registers are also indexed by $j = 0, \ldots, 2^K - 1$. For illustration purposes an equivalent notation for $\mathbf{v}_j$ is used, where the signed binary values $(-1, +1)$ are replaced with $(0, 1)$ binary values, or $\mathbf{v}_j$ is replaced by the integer $j$ using the binary codeword convention. Example: $(-1, +1, -1) \rightarrow (0, 1, 0) \rightarrow 2$.

To accumulate the partial sums for each group, all registers are initialized to zero at the beginning of each correlation cycle. Then the algorithm processes the stored received samples one after the other, and each sample $x_i$ is added to the partial sum corresponding to its index group $G_j$. In Fig. 1, one of the received samples is denoted as $x_i$. The samples of the three replica code sequences having the same index $i$ are $(m_i^{(1)}, m_i^{(2)}, m_i^{(3)}) = (-1, +1, -1)$, i.e., $\mathbf{m}_i = \mathbf{v}_2$, which corresponds to the group with partial sum accumulator address (010) or "2," and thus $x_i$ is added to the partial sum $S_2$. The combining unit of Fig. 1 then applies (3) to obtain the correlation values $c_1, c_2, c_3$ jointly. Note that the multiplications are just sign changes, as $v_j^{(k)}$ is either $+1$ or $-1$. In the example shown in Fig. 1, for $K = 3$ correlation values the number of groups is eight. Fig. 2 illustrates how partial sums are formed, while Fig. 3 shows how partial sums are combined to produce three parallel outputs corresponding to the correlations with three parallel replicas. Note as well that the block correlation (3) is valid for any set of samples and is not limited to a consecutive set as in (1) and (2).

Fig. 1. Suggested correlator structure as a matched filter implementation in time domain. An example with three replica sequences.
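The grouping and combining steps of Method I can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the address convention (a sign of -1 maps to bit 0, +1 maps to bit 1, with the first replica as the most significant bit) is an assumption chosen to match the (010) example above.

```python
import random

def group_addresses(masks):
    """Offline step: one integer group address per sample index.

    Assumed convention: -1 -> bit 0, +1 -> bit 1, first mask is the MSB.
    """
    n = len(masks[0])
    addresses = []
    for i in range(n):
        a = 0
        for mask in masks:
            a = (a << 1) | (1 if mask[i] > 0 else 0)
        addresses.append(a)
    return addresses

def block_correlate(x, addresses, k):
    """Block correlator I: 2**k partial sums, then sign-only combining as in (3)."""
    sums = [0] * (1 << k)
    for sample, a in zip(x, addresses):
        sums[a] += sample              # one addition per input sample
    out = []
    for q in range(k):
        bit = k - 1 - q                # bit position of replica q in the address
        c = 0
        for j, s in enumerate(sums):
            c += s if (j >> bit) & 1 else -s
        out.append(c)
    return out

def direct_correlate(x, masks):
    """Conventional correlation (1), used here only as a reference."""
    return [sum(m * v for m, v in zip(mask, x)) for mask in masks]

rng = random.Random(0)
N, K = 64, 3
masks = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(K)]
x = [rng.randint(-8, 8) for _ in range(N)]
addr = group_addresses(masks)
assert block_correlate(x, addr, K) == direct_correlate(x, masks)
```

The per-sample work is a single indexed addition regardless of the number of replicas; only the final combining step depends on the number of outputs.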



Fig. 2. An example of sub-sum computation. Replica arrays can be replacedby the address array (the first column) to facilitate implementation.

Fig. 3. Sub-sum combining for forming parallel correlator outputs.

Fig. 4. Example of two replica difference vectors obtained from three replicavectors. Differences are relative to the first replica.

B. Fast Block Processing Method II

The second approach is illustrated in Fig. 4. It is similar to the first one; however, an additional correlation can be computed with the same size of the accumulator register bank which stores the partial sums.

The samples of the replica code sequences are again denoted as $m_i^{(k)}$, where now $k = 0, 1, \ldots, K$, with $k = 0$ being reserved for the first (extra) binary mask (replica code sequence). In the example of Fig. 4, $K$ is equal to two and each iteration results in three correlation outputs.

Conventional correlation values $c_k = \sum_{i=1}^{N} m_i^{(k)} x_i$ of the received samples $x_i$ are calculated according to (1). Again $i$ and $j$ are used to identify samples and groups. For the second technique, though, the input samples are grouped in a slightly different manner: $i \in G_j$ if $m_i^{(0)} \mathbf{m}_i = \mathbf{v}_j$, where $\mathbf{m}_i = (m_i^{(1)}, \ldots, m_i^{(K)})^T$ and $\mathbf{v}_j$, $j = 0, \ldots, 2^K - 1$, enumerates the $K$-element sign vectors. One can see that then $\mathbf{m}_i = \mathbf{v}_j$ or $\mathbf{m}_i = -\mathbf{v}_j$. In other words, $m_i^{(k)} = m_i^{(0)} v_j^{(k)}$ if $i \in G_j$. If the replica samples $m_i^{(0)}$ and $m_i^{(k)}$ are stored in (0, 1) format, then a sign change can be detected using the XOR sum of the two stored bits, which equals 1 exactly when the two signs differ.

For this second approach the partial sums are slightly different: $S_j = \sum_{i \in G_j} m_i^{(0)} x_i$. For the first replica code sequence, the correlation value is simply calculated as the sum of all sub-sums

$$c_0 = \sum_{j=0}^{2^K-1} S_j. \tag{4}$$

For the $k$th replica code sequence, taking into account that $m_i^{(k)} = v_j^{(k)} m_i^{(0)}$ for $i \in G_j$,

$$c_k = \sum_{j=0}^{2^K-1} \sum_{i \in G_j} m_i^{(k)} x_i = \sum_{j=0}^{2^K-1} v_j^{(k)} S_j, \quad k = 1, \ldots, K. \tag{5}$$

The multiplication by $v_j^{(k)}$ is just a sign change, as it is either $+1$ or $-1$. The partial sums $S_j$ differ slightly from those of the previous algorithm, as samples are not simply summed but accumulated with plus or minus signs depending on the corresponding first mask value $m_i^{(0)}$. They are again reused for all correlators. Computations are reduced almost $K + 1$ times if the number of partial sums is significantly less than the number of samples. Similar to the first approach, all registers are initialized to zero at the beginning of each full correlation cycle. Then the algorithm processes the stored received samples one after the other to form the partial sums $S_j$. A received sample $x_i$ is accumulated (added or subtracted as defined by $m_i^{(0)}$) in the register for $S_j$ if $i \in G_j$. This procedure is performed for all received samples in each correlation cycle. Then the registers are reset to zero and the next correlation cycle begins. The combining unit of the correlator combines all stored sub-sums to obtain the $K + 1$ correlation values at once.
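The second method can be sketched in the same style. The following Python fragment is illustrative only; the function name and the address convention (a bit is 1 when a replica sign differs from the first replica, first of the remaining replicas as the most significant bit) are assumptions, not taken from the paper's figures.

```python
import random

def block_correlate_ii(x, m0, others):
    """Block correlator II sketch: K + 1 outputs from 2**K partial sums.

    m0 is the extra first mask, others is a list of K masks; all values +/-1.
    """
    k = len(others)
    sums = [0] * (1 << k)
    for i, sample in enumerate(x):
        a = 0
        for mask in others:
            # Address bit = 1 when the sign differs from the first mask
            # (the XOR of the two sign bits in a (0, 1) representation).
            a = (a << 1) | (1 if mask[i] != m0[i] else 0)
        # Samples are accumulated with the sign of the first mask.
        sums[a] += sample if m0[i] > 0 else -sample
    out = [sum(sums)]                      # first replica, as in (4)
    for q in range(k):
        bit = k - 1 - q
        # Remaining replicas, as in (5): flip the sign where the bit is set.
        out.append(sum(-s if (j >> bit) & 1 else s for j, s in enumerate(sums)))
    return out

rng = random.Random(0)
N = 48
m0 = [rng.choice((-1, 1)) for _ in range(N)]
others = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(2)]
x = [rng.randint(-8, 8) for _ in range(N)]
reference = [sum(m * v for m, v in zip(mask, x)) for mask in [m0] + others]
assert block_correlate_ii(x, m0, others) == reference
```

With two difference masks the register bank holds four partial sums, yet three correlation outputs are produced, which is the extra output the second method provides.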

C. Nonbinary Finite Coefficient Set

The first method easily extends to nonbinary mask samples with the constraint that the samples take values from a finite discrete set. Group definitions change according to the following indexing: $i \in G_j$ if $\mathbf{m}_i = \mathbf{v}_j$, $j = 0, \ldots, Q^K - 1$, where the values $m_i^{(k)}$ can be nonbinary from a finite set of $Q$ elements, $m_i^{(k)} \in \{a_1, \ldots, a_Q\}$. With all combinations of $K$ such values, the number of groups will be $Q^K$. The derivation of the first method (1)–(3) did not specifically rely on the binary nature of $m_i^{(k)}$, but the binary case is the most efficient. Partial sum combining follows the weights $v_j^{(k)}$, i.e., partial sum $S_j$ is weighted by the value $v_j^{(k)}$ for the $k$th output according to (3). For the binary case these weights are just sign changes. Similarly, for the ternary case used in the following section, with $m_i^{(k)} \in \{-1, 0, +1\}$, the weights just indicate sign changes and zeroing of partial sums.
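The extension to a finite coefficient set changes only the address computation and the combining weights. The sketch below is a hypothetical illustration in Python: the function name and the base-Q address convention (first mask as the most significant digit) are assumptions, and the ternary alphabet is used as the example.

```python
from itertools import product

def block_correlate_finite(x, masks, alphabet):
    """Block correlator I over a finite alphabet: Q**K partial sums."""
    k, q = len(masks), len(alphabet)
    digit = {value: d for d, value in enumerate(alphabet)}
    sums = [0] * (q ** k)
    for i, sample in enumerate(x):
        addr = 0
        for mask in masks:
            addr = addr * q + digit[mask[i]]   # base-Q group address
        sums[addr] += sample
    out = [0] * k
    # product() enumerates digit tuples in the same order as the addresses.
    for addr, digits in enumerate(product(range(q), repeat=k)):
        for r in range(k):
            out[r] += alphabet[digits[r]] * sums[addr]
    return out

x = [1, 2, 3, 4]
masks = [[1, 0, -1, 1], [0, 0, 1, -1]]
print(block_correlate_finite(x, masks, (-1, 0, 1)))
```

For the ternary alphabet the weights are -1, 0 or +1, so the products in the combining loop reduce to sign changes and skipped terms in a hand-optimized version.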

D. Implementation Issues

The data flow management in block correlators is not complicated, as demonstrated next. Instead of saving binary masks, the correlator may save the group identifications of the input samples as they are located in the buffer memory. The group identifications are computed beforehand as described previously (e.g., $(-1, +1, -1) \rightarrow (0, 1, 0) \rightarrow 2$). Let us assume that $x[1], \ldots, x[N]$ is the current received signal array saved in the buffer memory. Let $A[1], \ldots, A[N]$ be the array of group identifiers for each sample. The same identifiers apply to the corresponding registers or designated array elements accumulating partial sums. The identifier array is computed offline from the binary mask-replica samples (see Fig. 2, the first column). The value of $A[i]$ is one of the $2^K$ possible group addresses, as described in the previous subsections. The register content is denoted as an array $S[0], \ldots, S[2^K - 1]$ of partial sums. For the second approach the first replica sequence $m_0[1], \ldots, m_0[N]$ is also saved along with $A$, as this sequence specifies the signs which apply to samples during accumulation in registers. The partial sum computation can be performed, e.g., by the following pseudo-code:

/* Sub-sum computation */
S[0 .. 2^K - 1] = 0;
/* for each received sample in the receiver memory */
For i = 1, ..., N
    /* partial sum computation */
    S[A[i]] = S[A[i]] + x[i];            /* first method */
    S[A[i]] = S[A[i]] + m0[i] * x[i];    /* second method */
End

The partial sum combining is similar for both approaches. As the number of partial sums should not be high, their combining can be coded directly. For the example in Figs. 2 and 3 (first approach) and for any sequence length it will be:

/* Partial-sum combining option 1 */
C[1] = - S[0] - S[1] - S[2] - S[3] + S[4] + S[5] + S[6] + S[7];
C[2] = - S[0] - S[1] + S[2] + S[3] - S[4] - S[5] + S[6] + S[7];
C[3] = - S[0] + S[1] - S[2] + S[3] - S[4] + S[5] - S[6] + S[7];

This expression follows directly from (3). The subsums $S_j$ are simply $S[j]$, where $j$ is the decimal index corresponding to the binary signed word $\mathbf{v}_j$ as described in the first paragraph of this subsection. The column of signs before $S[j]$ is $\mathbf{v}_j$. Alternatively, other approaches can be used. Fig. 4 shows an example of pre-computed group addresses (first column) which can be saved along with the first replica code for the implementation of the second approach. Group addresses are determined using replica sign differences with respect to the first replica.

III. JOINT BLOCK PROCESSING AND TIME-RECURSIVE MODE

During synchronization, correlations are often implemented in a (matched) "filter" type convolution mode, in which incoming signal samples shift into a memory buffer and the buffer array is then correlated with replica masks in order to identify matching signal fragments. Each cycle a new sample shifts into the memory and the oldest sample is discarded, while the other samples "shift" towards the memory exit. The configuration is similar to finite impulse response filters, where the weights are the binary replica values. At each stage a large fraction of the samples used in the previous stage is still in the buffer, and the time-recursive concept reuses computations of consecutive cycles to reduce complexity.

Reusing computations becomes feasible at high oversampling rates, when each chip of the sequence is represented by several samples of the oversampled signal. For a mask length of $N_c$ chips and a sampling rate of $R$ samples per chip, the mask sequence length is $N = N_c R$. Let the samples of the mask sequence be the following:

$$(m_N, m_{N-1}, \ldots, m_2, m_1). \tag{6}$$

This mask vector is correlated each cycle with the updated received signal fragment. Let the current signal fragment be:

$$(x_N, x_{N-1}, \ldots, x_2, x_1). \tag{7}$$

These two vectors are multiplied element-wise and the products are integrated, i.e., the inner product of the vectors is computed. Let us assume that the first such iteration results in the correlation value $c_1$ for the first correlation phase. In this section the subscript of a correlation value denotes the consecutive correlator iteration. At the next cycle the memory of the matched filter is updated by discarding the oldest sample (rightmost) and by introducing a new sample (leftmost). The updated stored set of samples for the second iteration is thus given by:

$$(x_{N+1}, x_N, \ldots, x_3, x_2). \tag{8}$$

The mask sequence, in contrast, stays the same for all "matched filter" iterations. A correlation value $c_2$ is calculated as the inner product of the updated set with the mask array. Comparing the first and second sets of stored input samples, one sees that the samples $x_2, \ldots, x_N$ occur in both sets, even though they are aligned differently with the mask samples. In this paper this overlap is used for reducing computational complexity. The conventional correlator output, i.e., the correlation value for a specific code phase $n$, can be written as:

$$c_n = \sum_{i=1}^{N} m_i x_{n+i-1}. \tag{9}$$

We consider a set of $K$ mask (replica) sequences for joint processing, $m_i^{(k)}$, $k = 1, \ldots, K$, and thus can rewrite the previous equation in matrix form

$$\mathbf{c}_n = \sum_{i=1}^{N} \mathbf{m}_i x_{n+i-1} \tag{10}$$

where $\mathbf{m}_i = (m_i^{(1)}, \ldots, m_i^{(K)})^T$ is defined as in (2) in the previous section. Taking the oldest sample out of the sum (10) results in

$$\mathbf{c}_n = \mathbf{m}_1 x_n + \sum_{i=2}^{N} \mathbf{m}_i x_{n+i-1}. \tag{11}$$

Then, the remaining sum can be considered as a composition of two sub-sums:

$$\sum_{i=2}^{N} \mathbf{m}_i x_{n+i-1} = \sum_{i \in I_s} \mathbf{m}_i x_{n+i-1} + \sum_{i \in I_d} \mathbf{m}_i x_{n+i-1}. \tag{12}$$

Here

$$I_s = \{\, i : \mathbf{m}_i = \mathbf{m}_{i-1},\ 2 \le i \le N \,\} \tag{13}$$

$$I_d = \{\, i : \mathbf{m}_i \ne \mathbf{m}_{i-1},\ 2 \le i \le N \,\}. \tag{14}$$

The reason for defining these two groups relates to our final goal of excluding redundant computations due to high sampling rates. If neighboring replica samples are different, as in (14), this indicates a chip boundary. If, on the other hand, neighboring replica samples are the same, then an input sample shifted during correlation is multiplied by the same replica value after the time-recursion. Thus our goal is to eliminate the sub-sum corresponding to $I_s$, as its contribution is the same from iteration to iteration and can be shared. The indices of the input samples and the mask samples increase from right to left. While the mask samples remain unchanged over consecutive iteration cycles, the input samples are shifted to the right sample by sample, such that the oldest sample is removed from the set and a new sample is introduced at the left-hand side.

Using definitions (13) and (14), the correlation values in vector $\mathbf{c}_{n+1}$ for the next phase can be written as:

$$\mathbf{c}_{n+1} = \mathbf{m}_N x_{n+N} + \sum_{i \in I_s} \mathbf{m}_{i-1} x_{n+i-1} + \sum_{i \in I_d} \mathbf{m}_{i-1} x_{n+i-1}. \tag{15}$$

From (12), and because $\mathbf{m}_{i-1} = \mathbf{m}_i$ for $i \in I_s$, $\sum_{i \in I_s} \mathbf{m}_{i-1} x_{n+i-1} = \mathbf{c}_n - \mathbf{m}_1 x_n - \sum_{i \in I_d} \mathbf{m}_i x_{n+i-1}$. Then from (15)

$$\mathbf{c}_{n+1} = \mathbf{c}_n - \mathbf{m}_1 x_n + \mathbf{m}_N x_{n+N} + \sum_{i \in I_d} (\mathbf{m}_{i-1} - \mathbf{m}_i)\, x_{n+i-1}. \tag{16}$$

As $m_i^{(k)} \in \{-1, +1\}$, one can note that the elements of $\mathbf{m}_{i-1} - \mathbf{m}_i$ take values in $\{-2, 0, +2\}$, and the previous equation transforms to:

$$\mathbf{c}_{n+1} = \mathbf{c}_n - \mathbf{m}_1 x_n + \mathbf{m}_N x_{n+N} + 2 \sum_{i \in I_d} \mathbf{d}_i x_{n+i-1} \tag{17}$$

where $\mathbf{d}_i = (\mathbf{m}_{i-1} - \mathbf{m}_i)/2$ has elements in $\{-1, 0, +1\}$.

The multiplications by $\mathbf{m}_1$ and $\mathbf{m}_N$ are just sign assignments. The sum $\sum_{i \in I_d} \mathbf{d}_i x_{n+i-1}$ can be computed using the block correlator algorithm of Method I described in the previous section, with the difference that the replica samples $\mathbf{d}_i$ are not binary but ternary; the algorithm still applies, with the number of partial sums being a power of three. The nonbinary case has been discussed in Subsection II-C. One should note here that the block correlation (3) applies to any sum of samples and is not limited to a consecutive set as in (2), so the sum in (17) can also be presented similarly to (3). Group definitions change according to the following indexing: $i \in G_j$ if $\mathbf{d}_i = \mathbf{v}_j$, $j = 0, \ldots, 3^K - 1$. With all combinations of the elements of $\mathbf{d}_i$, the number of groups will be $3^K$. The subsum indexing follows $\mathbf{v}_j$ and is similar to the binary case. For example, for three-sequence block processing, $\mathbf{v}_j$ will be one of $(-1, -1, -1), (-1, -1, 0), \ldots, (+1, +1, +1)$, which maps accordingly to subsum array addresses $0, 1, \ldots, 26$. Thus, $x_{n+i-1}$ in the summation component of (17) is added to the subsum with the address corresponding to $\mathbf{d}_i$. Similar to the address array for the binary block correlator in Subsection II-D, a separate address array is used for this ternary case.

If signals are oversampled with several samples per chip, then the number of samples in $I_s$ is significantly larger than in set $I_d$, as the vectors $\mathbf{m}_i$ are the same for all but the last sample within the chip duration. Thus the sum in (17) includes only "boundary" samples, and the arithmetic complexity decreases (see Fig. 5). The expression (17) excludes all additions within the chip, and only the boundary samples corresponding to $I_d$ are accounted for in the correlations. A pseudocode example is provided in Appendix I, where the chip boundary indices in $I_d$ are placed in an index array, so that the inner loop cycles only through the indices in $I_d$.
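The time-recursive update can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: indices are zero-based, the correlation for phase p is taken as the inner product of each mask with x[p : p + N], the ternary address uses the first mask as the most significant base-3 digit, and the first correlation is computed directly for brevity instead of with the binary block correlator. All names are hypothetical.

```python
import random

def time_recursive_block_correlate(x, masks, phases):
    """Correlate K oversampled +/-1 masks with x for consecutive phases."""
    k, n = len(masks), len(masks[0])
    # Offline: chip-boundary indices and their ternary group addresses.
    boundary, address = [], []
    for i in range(1, n):
        d = [(mask[i - 1] - mask[i]) // 2 for mask in masks]  # in {-1, 0, 1}
        if any(d):
            a = 0
            for dk in d:
                a = a * 3 + (dk + 1)
            boundary.append(i)
            address.append(a)
    # Offline: combining weights, i.e., the ternary vector of each address.
    weights = []
    for a in range(3 ** k):
        rem, digits = a, []
        for _ in range(k):
            digits.append(rem % 3 - 1)
            rem //= 3
        weights.append(digits[::-1])
    # Initial correlations start the recursion.
    c = [sum(mask[i] * x[i] for i in range(n)) for mask in masks]
    out = [list(c)]
    for p in range(phases - 1):
        sums = [0] * (3 ** k)
        for i, a in zip(boundary, address):
            sums[a] += x[p + i]          # only chip-boundary samples
        for q, mask in enumerate(masks):
            s = sum(w[q] * sj for w, sj in zip(weights, sums))
            c[q] += -mask[0] * x[p] + mask[n - 1] * x[p + n] + 2 * s
        out.append(list(c))
    return out

rng = random.Random(0)
K, CHIPS, R, PHASES = 3, 15, 4, 20
N = CHIPS * R
masks = [[s for s in [rng.choice((-1, 1)) for _ in range(CHIPS)] for _ in range(R)]
         for _ in range(K)]
x = [rng.randint(-9, 9) for _ in range(N + PHASES - 1)]
direct = [[sum(m[i] * x[p + i] for i in range(N)) for m in masks]
          for p in range(PHASES)]
assert time_recursive_block_correlate(x, masks, PHASES) == direct
```

In this example each recursion touches at most 14 boundary samples instead of all 60 buffer samples, which is where the gain at high sampling rates comes from. The products by the ternary weights stand in for the sign changes and omitted terms of a hand-coded combiner.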

IV. COMPLEXITY ANALYSIS

For complexity analysis let us assume the following notation: $N$ is the replica (mask) sequence length in samples; $K$ is the number of jointly block-processed replicas; $R$ is the sampling rate in samples per chip, so that a code of $N_c$ chips has length $N = N_c R$; $\mu$ denotes the real-valued multiplication-to-addition cost ratio, i.e., (multiplication cost)/(addition cost); and $P$ is the total number of processed phases (correlations). Additive complexities of the various methods, normalized per correlator output, are estimated in Table I. Time-domain correlators are more efficient if a limited number of correlations is needed. This is because of the block processing nature of FFT-based algorithms, which exploit joint processing of a set of correlations and therefore perform many redundant computations if only a reduced set of correlations is of interest.

Fig. 5. Examples of oversampled signals. For a sampling rate of 4 samples per chip only a fraction of samples (filled bullets) contributes directly to the output. The contribution from the other samples is derived from the previous iteration. (a) One mask; (b) two masks.

TABLE I
COMPUTATIONAL COMPLEXITIES PER CORRELATOR OUTPUT SAMPLE

Processing single replica sequence correlations. The block-correlator methods described in the previous sections address concurrent (block) correlations of received signals with several replica sequences. The time-recursive mode described in the previous section combines consecutive block processing stages with different (alignment) phases of these replica sequences. When dealing with a single replica sequence the described process still applies. Several phases of a single replica sequence can be block processed jointly while exploiting the time-recursive mode as well. This consideration justifies the comparison of block correlators with conventional methods correlating signals with a single sequence. For example, if the replica sequence is the whole code, then a limited number of circularly shifted versions of this sequence can be block processed concurrently as illustrated below

(18)

At the same time, time-recursive cycles will process the other phases. By block processing the sequences in (18) and continuing time-recursively for 32 cycles, correlations are computed with consecutive phases of this single sequence. This type of implementation is also shown in Appendix I.

Block correlators can also be used for linear correlation of single sequences by splitting the replica sequence into fractions, correlating the fractions with the received sequence using block correlators, and combining the results in an overlap-add manner [26].

Arithmetic complexities of block and time-recursive block correlators. The complexity figures for block correlators I and II and for the time-recursive block correlator are estimated from (3), (5) and (17). As each input sample is added to the subsums only once, the number of these additions is $N$. Extra additions are then needed for combining the partial sums. For block correlator I, $K$ linear combinations of $2^K$ subsums result in at most $K(2^K - 1)$ additions. The total is then at most $N + K(2^K - 1)$ additions, and normalization to one correlation gives $N/K + 2^K - 1$. Similar reasoning applies to block correlator II. For the time-recursive block correlator (17), the sum may include only one (boundary) sample from each chip. For the complexity upper bound we include all these boundary samples, i.e., about $N/R$ samples. The sum is computed using block correlation with ternary coefficients, i.e., with $3^K$ subsums. Thus, using the same reasoning as for block correlator I and accounting for the few extra additions per output in (17), the complexity grows as about $N/R + K\,3^K$, which after normalization to one correlation is about $N/(KR) + 3^K$. Note that the parameter $K$ is a fixed number of jointly processed correlations. It should be constrained due to the exponential terms $2^K$ and $3^K$. In other words, all correlations are computed by consecutively processing groups of $K$ joint correlations. The acceleration factor is about $K$ times for block correlators, assuming $N \gg K 2^K$, and about $KR$ times for the time-recursive scenario, assuming $N/R \gg K 3^K$. In Figs. 6 and 8, $K = 4$ and $K = 3$, respectively, which satisfy these conditions.

Arithmetic complexities of FFT-based methods. For estimating the number of operations using FFT-based methods we consider overlap-add linear correlation [26]. The FFT-based correlator complexity is calculated using split-radix FFT (SR-FFT) complexity figures [27]. The SR-FFT algorithm provides close to optimal complexity and a regular structure suitable for wide practical adoption [28]. For the correlations, the FFT of the received signal fragment is computed and then multiplied element-wise by the pre-computed FFT of the replica fragment. The result is transformed to the time domain by an inverse FFT. Both the received and replica fragments are zero-padded, and the FFTs apply to vectors about twice as long, of power-of-two lengths [26]. Three real multiplications and three real additions are assumed per complex multiplication.



Fig. 6. Comparison of additive correlator complexities: conventional; block correlators I and II; SR-FFT-based approach computing the full range of correlations, normalized to one correlation cost (fft-corr-seq-length); SR-FFT-based approach computing a limited range (100) of correlations, normalized to one correlation cost (fft-corr-seq-length); time-recursive block correlator. 8 samples per chip; 4 replicas are block-processed jointly. Input sequences are real-valued numbers. For complex-valued sequences all complexities double except for the SR-FFT-based approach. Replica sequences are binary real-valued sequences.

The complexity figures of an $M$-point complex-valued SR-FFT [27], in real-valued operations, are:
• Multiplications: $M \log_2 M - 3M + 4$
• Additions: $3M \log_2 M - 3M + 4$
• Total: $4M \log_2 M - 6M + 8$
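For a quick numerical feel, the operation counts can be tabulated with a short Python helper. The split-radix figures used below are the standard counts for a power-of-two complex FFT with three real multiplications and three real additions per complex multiplication. The block-correlator count is an assumed upper bound (one addition per sample plus sign-only combining of all partial sums), not a figure copied from Table I, and both function names are hypothetical.

```python
def srfft_real_ops(m):
    """Real (multiplications, additions, total) of an m-point complex SR-FFT."""
    log2m = m.bit_length() - 1
    if m < 2 or (1 << log2m) != m:
        raise ValueError("m must be a power of two")
    mults = m * log2m - 3 * m + 4
    adds = 3 * m * log2m - 3 * m + 4
    return mults, adds, mults + adds

def block_adds_per_output(n, k):
    """Assumed upper bound on additions per output for block correlator I."""
    accumulate = n                   # each sample is added to one partial sum
    combine = k * (2 ** k - 1)       # k outputs, 2**k partial sums each
    return (accumulate + combine) / k

print(srfft_real_ops(1024))
print(block_adds_per_output(8 * 1023, 4))
```

The second call uses 8 samples per chip, a 1023-chip code and four jointly processed replicas, the setting quoted in the caption of Fig. 6.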

For overlap-add linear correlation [26], complexity estimate perone correlation output will be:

(19)

This expression is valid if the total number of correlations is a multiple of the block length. As FFT-based processing applies to a set of correlations jointly, its cost does not scale down linearly when only part of the correlations is needed, e.g., in reacquisition scenarios. It is also not convenient for joint processing of a limited number of correlations during the tracking stage. For partial correlations the complexity figure will be:

(20)

Fig. 6 plots correlation complexities versus sequence length according to Table I. For the FFT, the numbers are extrapolated for sequence lengths other than powers of two. Two FFT-based cases are considered, with the total number of correlations equal to the sequence length and to 100, as the efficiency of this approach deteriorates when the number of correlations is less than the FFT block size. Even though multiplicative costs are generally higher than additive costs, they are assumed equal in our case studies to address various computing platforms. All complexities are for real-valued sequences. For complex-valued sequences, the numbers double except for the FFT-based approaches.

One can observe that FFT performance is excellent when the total number of correlations is equal to the FFT block size. For partial correlation scenarios, however, time-domain correlators are preferable. FFT algorithms for oversampled signals can be made more efficient if chip edges can be found, but in many applications signals are significantly corrupted by noise and chip edge detection might not be possible or feasible. Other solutions, such as FFT-domain processing of a limited number of correlations [29], may reduce the costs by up to a factor of two, but for a low number of correlations the complexity of time-domain techniques is lower, as evidenced in Fig. 6. Further alternatives, such as Number Theoretic Transforms (NTT), may accelerate processing on fixed-point processors. Similar to FFT-based correlators, however, NTTs process correlations block-wise and are similarly less efficient for a limited number of correlations. They accelerate computations by about 1.6 times [30] for a transform size of 1024, which is not enough to outperform the proposed time-domain correlators. NTTs rely on modulo arithmetic of fixed-point platforms with certain wordlengths and are not easily portable. Wordlengths and sequence sizes are interrelated, and two-dimensional algorithms are used to resolve some of these constraints. For a massive number of correlations NTTs might still be an attractive solution [31].

Comparative analysis and implementation validation.

Fig. 6 demonstrates the efficiency of the proposed correlators. The block correlator accelerates processing by a factor of about the number of jointly processed replicas, while in combination with time-recursions the gain is further multiplied by about the number of samples per chip. Using Table I, one can also estimate the range in which the proposed algorithm is theoretically preferable to FFT methods for given values of the sampling rate, code length, and number of block-processed replicas, separately for real and complex-valued sequences.

The block correlator I implementation is shown in Fig. 7. Code multiplications and integrations are replaced by addressing and accumulations. Parallel correlator outputs are generated by combining sub-sums. The time-recursive structure follows from (17), where the summation component is based on a block-correlator structure similar to Fig. 7.

An implementation on a Dell Latitude E6510 laptop has been used to validate the theoretical estimates. The processor is an Intel Core i5, 2.67 GHz (4 CPUs), running MS Windows XP; MS Visual Studio Team System 2008 is used for profiling the C++ implementation. Performance is evaluated for conventional correlation with and without multiplications, for the block correlator of type I, and for the proposed hybrid time-recursive block correlator. Fig. 8 shows the average correlation times for these algorithms. The results are normalized to obtain the average computing time for a single correlation output, and they confirm the relative theoretical complexity estimates in Table I. The received signal is simulated and saved in a data file in integer format. Profiling is performed for the correlation function only. The timing is first obtained over repeated iterations of the correlation function performed for all code phases, and the total time is then normalized by the number of correlation outputs. This experiment is repeated twenty times to estimate the mean and standard deviation of the estimates.

Fig. 7. Implementation of block correlator I. Only additions are used. Code multiplications and integrations are replaced by accumulations and addressing.

Fig. 8. Correlation computation times for different sampling rates, normalized to show single correlation cost. Sequence length is 1023 × (number of samples per chip). Compared methods are (a) conventional correlation with binary weights multiplied (Conv-corr-mult); (b) conventional correlation with additions only, using positive and negative weight index arrays for each code (Conv-corr-adds); (c) block correlator I (Block-Corr-I); (d) proposed time-recursive block correlator (Time-rec-block-corr). Block processing is for three replicas.

V. CONCLUSION

In this paper we described time-domain correlators for pseudorandom sequences with reduced computational complexity. First, block correlators are reviewed and formal proofs of their fast algorithms are provided. They accelerate processing several times by jointly processing several correlations. Then block correlators are extended to perform time-recursive computations, which reduce the complexity increase due to high sampling rates. The approach is significantly more efficient than conventional correlation, as demonstrated using theoretical complexity estimates and timing measurements of a real implementation. It is particularly suitable when a limited number of correlations is needed, e.g., in reacquisition scenarios or advanced tracking correlators in software-defined radio implementations. In certain scenarios the method is even competitive with FFT-based approaches for a massive number of correlations. Unlike FFT-based methods, the presented method is exact, applies directly to arbitrary sequence lengths, and requires no multiplications. The approach is extended to non-binary sequences as well.

APPENDIX I

An example of time recursive block correlation pseudocode

The full code-phase range is divided into three sections. The correlation length is N. These three sections are processed jointly by the block correlator, . Thus time-recursions are used, and three correlations are computed concurrently. Correlation values are saved in array C(i). Replica codes define index arrays .

/* Compute initial correlation values for each section using the block correlator I method. The first correlations are needed to start the time-recursive iterations. */

; /* Initialize the partial sums to zero. */

/* Compute subsums using the binary block correlator. */

For

/* subsum accumulations */

End j

/* Compute initial correlations using the subsums above. */

/* Compute correlations using time-recursions (17). */

/* The sum component in (17) is computed using the block correlator for 3 sequences with ternary coefficients. */

/* The resulting vector-array is . */

/* The number of subsums is defined as an array */

For ; ; /* i - time-recursion index */

/* Initialize the partial subsums to zero. */

;

/* p - chip boundary sample index, is # of samples in */

For ; ;

/* Subsum accumulations. - an array of chip boundary indices in . - an array of subsum addresses. Computations are saved by using only chip boundary samples. */

End p

/* Combine subsums. As they multiply by ternary coefficients , multiplications by '0' are not included. */

/* Time-recursive computations in (17) follow, using the components above. Multiplications are sign changes. */

End i
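Because the statements of the pseudocode above did not survive extraction, the following Python sketch illustrates the same flow under explicitly stated assumptions rather than reproducing the authors' listing. It assumes circular correlation C(tau) = sum_n x[n] r[(n - tau) mod N] with a binary (+1/-1) code upsampled to `spc` samples per chip, a number of chips divisible by the number of sections (so the sections share the same chip-boundary samples), and initial correlations computed directly instead of with block correlator I. The update it uses, C(tau + 1) = C(tau) + sum over chip boundaries m of (c[m-1] - c[m]) x[(m*spc + tau) mod N], follows from these assumptions and may differ in form from (17). All names are hypothetical.

```python
def upsample(chips, spc):
    """Repeat each +1/-1 chip spc times (spc = samples per chip)."""
    return [c for c in chips for _ in range(spc)]


def direct_circular(x, replica, tau):
    """Reference correlation: sum_n x[n] * replica[(n - tau) mod N]."""
    length = len(x)
    return sum(x[n] * replica[(n - tau) % length] for n in range(length))


def time_recursive_block(x, chips, spc, sections=3):
    """Correlate x with every circular phase of the upsampled code."""
    n_chips = len(chips)
    length = n_chips * spc
    assert len(x) == length and n_chips % sections == 0
    chip_step = n_chips // sections       # section offset in chips
    phase_step = chip_step * spc          # section offset in samples
    replica = upsample(chips, spc)

    # One base-3 subsum address per chip boundary; digit j holds the ternary
    # coefficient (c[k-1] - c[k]) / 2 seen by section j at that boundary.
    address = []
    for m in range(n_chips):
        a = 0
        for j in range(sections):
            k = (m - j * chip_step) % n_chips
            t = (chips[k - 1] - chips[k]) // 2      # -1, 0, or +1
            a = 3 * a + (t + 1)
        address.append(a)

    # Decode table: digits[a][j] is the coefficient of subsum a for section j.
    digits = []
    for a in range(3 ** sections):
        d = []
        for _ in range(sections):
            d.append(a % 3 - 1)
            a //= 3
        digits.append(d[::-1])

    corr = [0] * length
    for j in range(sections):             # first correlation of each section
        corr[j * phase_step] = direct_circular(x, replica, j * phase_step)

    for tau in range(phase_step - 1):     # time-recursion index
        subsums = [0] * (3 ** sections)
        for m in range(n_chips):          # chip-boundary samples only
            subsums[address[m]] += x[(m * spc + tau) % length]
        for j in range(sections):
            delta = 0
            for a, s in enumerate(subsums):
                t = digits[a][j]
                if t > 0:
                    delta += s            # multiplication is a sign change
                elif t < 0:
                    delta -= s            # zero coefficients are skipped
            base = j * phase_step + tau
            corr[base + 1] = corr[base] + 2 * delta
    return corr
```

Each recursion step touches only one sample per chip, so the per-step cost in this sketch depends on the number of chips rather than on the number of samples per chip.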

ACKNOWLEDGMENT

This work was partly supported by Naval Engineering Edu-cation Center and NSF Grants 0942852 and 0932339.

REFERENCES

[1] R. L. Peterson, R. E. Ziemer, and D. E. Borth, Introduction to Spread Spectrum Communications. Upper Saddle River, NJ: Prentice-Hall, 1995.

[2] P. Misra and P. Enge, Global Positioning System, Signals, Measurements, and Performance. Lincoln, MA: Ganga-Jamuna Press, 2001.

[3] P. Dafesh and J. Holmes, “Practical and theoretical tradeoffs of active parallel correlator and passive matched filter acquisition implementations,” in Proc. IAIN World Congr., ION 56th Annu. Meet., San Diego, CA, USA, Jun. 26–28, 2000, pp. 352–360.

[4] K. Borre, D. M. Akos, N. Bertelsen, P. Rinder, and S. H. Jensen, A Software-Defined GPS and Galileo Receiver: A Single-Frequency Approach. Boston, MA, USA: Birkhauser, 2006.

[5] D. Akos, “The role of Global Navigation Satellite System (GNSS) software radios in embedded systems,” GPS Solutions, 2003.

[6] D. Akopian and S. Agaian, “Fast and parallel matched filters in time domain,” in Proc. ION GNSS Conf., Long Beach, CA, USA, Sep. 21–24, 2004, pp. 491–501.

[7] D. Akopian and S. Agaian, “A fast time-recursive correlator for DSSS systems,” IEEE Signal Process. Lett., vol. 15, pp. 589–592, 2008.

[8] D. J. R. Van Nee and A. J. R. M. Coenen, “New fast GPS code-acquisition technique using FFT,” Electron. Lett., vol. 27, no. 2, pp. 158–160, 1991.

[9] C. Yang, “Fast code acquisition with FFT and its sampling schemes,” in Proc. Inst. Navigation Nat. Tech. Meet. (ION-96).

[10] N. F. Krasner, “GPS Receiver and Method for Processing GPS Signals,” U.S. Patent PN 5 781 156, Jul. 14, 1998.

[11] T. G. Stockham, “High-speed convolution and correlation,” in Conf. Proc. Spring Joint Comput. Conf. (AFIPS), 1966, vol. 28.

[12] D. Akopian, “A fast satellite acquisition method,” in USA Inst. Navigation Proc. ION-GPS’2001 Conf., Salt Lake City, UT, USA, Sep. 11–14, 2001, pp. 2871–2881.

[13] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975.

[14] X. Yao, X. Qin, S. Cui, and J. Fang, “The optimizations of FFT algorithms in GPS software receiver,” in Proc. IEEE Int. Conf. Geoinformatics, 2011, DOI: 10.1109/GeoInformatics.2011.5981131.

[15] D. Borio, “M-sequence and secondary code constraints for GNSS signal acquisition,” IEEE Trans. Aerosp. Electron. Syst., vol. 47, no. 2, pp. 928–945, 2011.

[16] M. Cohn and A. Lempel, “On fast M-sequence transforms,” IEEE Trans. Inf. Theory, vol. IT-23, pp. 135–137, 1977.

[17] T. Matsumoto, Y. Tanada, and T. Watanabe, “Digital matched filter of reduced operation elements for M-ary/DS-SS system using real-valued shift-orthogonal finite-length sequences,” in Proc. 3rd IEEE Signal Process. Workshop Wireless Commun., Taoyuan, Taiwan, Mar. 20–23, 2001, pp. 46–49.

[18] M. Werman, “Fast convolution,” J. WSCG, vol. 11, no. 1, Feb. 3–7, 2003.

[19] M. C. Perez, J. Urena, A. Hernandez, A. Jimenez, and C. D. Marziani, “Efficient generation and correlation of sequence pairs with three zero-correlation zones,” IEEE Trans. Signal Process., vol. 57, no. 9, pp. 3450–3465, 2009.

[20] C. D. Marziani et al., “Modular architecture for efficient generation and correlation of complementary set of sequences,” IEEE Trans. Signal Process., vol. 55, no. 5, pp. 2323–2337, 2007.

[21] Z. P. Yi, W. Jiang, and W. Jie, “Fast correlation for gold large sets of Kasami sequences,” in Proc. IEEE 71st Veh. Technol. Conf. (VTC 2010-Spring), Taipei, Taiwan, May 16–19, 2010, pp. 1–5.

[22] J. Zhao, J. Zhang, and J. Yin, “A parallel differential correlation acquisition algorithms in time domain,” in Proc. IEEE Int. Conf. Wireless Commun., Netw., Mobile Comput. (WiCom), Beijing, China, Sep. 24–26, 2009, DOI: 10.1109/WICOM.2009.5301893.

[23] D. Akopian and S. Agaian, “Fast matched filters in time domain for global positioning system receivers,” IEE Proc. Radar Sonar Navig., vol. 153, no. 6, pp. 525–531, 2006.

[24] R. Kuehnel, J. Theiler, and Y. Wang, “Parallel random generators for sequences uniformly distributed over any range of integers,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 53, no. 7, pp. 1496–1505, 2006.

[25] R. S. Katti, R. G. Kavasseri, and V. Sai, “Pseudorandom bit generation using coupled congruential generators,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 3, pp. 203–207, 2010.

[26] E. C. Ifeachor and B. Jervis, Digital Signal Processing: A Practical Approach, 2nd ed. Upper Saddle River, NJ: Prentice-Hall, 2002.

[27] P. Duhamel and M. Vetterli, “Fast Fourier transforms: A tutorial review and a state of the art,” Signal Process., vol. 19, pp. 259–299, 1990.

[28] D. Jones, “Choosing the best FFT algorithm,” Connexions, Feb. 6, 2007 [Online]. Available: http://cnx.org/content/m12060/1.3/

[29] P. Sagiraju, S. Agaian, and D. Akopian, “A reduced complexity acquisition of GPS signals for software embedded applications,” IEE Proc. Radar, Sonar, Navigation, vol. 153, no. 1, pp. 69–78, 2006.

[30] R. C. Agarwal and C. S. Burrus, “Number theoretic transforms to implement fast digital convolution,” Proc. IEEE, vol. 63, no. 4, pp. 550–560, 1975.

[31] J. York, J. Little, and D. Munton, “A fast number-theoretic transform approach to a GPS receiver,” Navigation, vol. 57, no. 4, pp. 297–307, 2010.

David Akopian (M’02-SM’04) received the Ph.D. degree in electrical engineering from Tampere University of Technology, Finland.

He is an Associate Professor at the University of Texas at San Antonio (UTSA). Prior to joining UTSA he was a Senior Research Engineer and Specialist with Nokia Corporation from 1999 to 2003. From 1993 to 1999 he was a member of teaching and research staff of Tampere University of Technology, Finland. His current research interests include digital signal processing algorithms for communication and navigation receivers, positioning, dedicated hardware architectures, and platforms for software defined radio and communication technologies for healthcare applications. He authored and co-authored more than 30 patents and 120 publications.

Dr. Akopian has served in organizing and program committees of many IEEE conferences and co-chairs SPIE Multimedia on Mobile Devices conference. His research has been supported by National Science Foundation, National Institutes of Health, USAF, U.S. Navy, and Texas Higher Education Coordinating Board.

Phanikrishna Sagiraju received the Ph.D. degree from the University of Texas at San Antonio in 2007.

He is a Staff Engineer at Samsung Semiconductors Inc. Prior to that he was a Staff Engineer at CSR Technology Inc, from 2009 to 2012 and Sr. GPS Engineer at SiRF Technology Inc, from 2008 to 2009. He was a Postdoctoral Research Associate in the Department of Electrical and Computer Engineering at University of Texas at San Antonio in 2007–2008. His current research interests are GNSS base band algorithm development, memory and power optimization for GNSS receivers, hybrid navigation solutions, software GNSS receiver development, mobile application development, and GIS. He is an author of one filed patent application and more than 15 publications.

Brent Nowak (M’11) received the Ph.D. degree from the University of Texas at Austin, TX, USA, in 1997.

He is an Associate Professor at the University of Texas at San Antonio (UTSA). Prior to joining UTSA he conducted and led large scale, multidisciplinary research projects and programs at Southwest Research Institute from 1996 to 2007. From 1997 to 2007 he held periodic adjunct professor positions at UTSA in the Electrical Engineering Department and the Mechanical Engineering Department, as well as at Saint Mary’s University in San Antonio’s Engineering Department. His current research focuses on emergent behaviors of autonomous devices, with an emphasis on algorithm coupling across multi-energy domains (mechatronics systems). Of particular interest is the development of advanced sensing and control strategies for integrated real-time, open-architecture control for intelligent/adaptive behaviors. He authored and coauthored more than 8 patents and 50 industrial research reports, publications, and presentations. His research has been supported by National Science Foundation, NASA, U.S. Navy (NAVSEA, ONR), Boeing, and Texas Higher Education Coordinating Board.
