Radio Engineering (From Software To Cognitive Radio) || Cognitive Radio Sensors


Chapter 3

Cognitive Radio Sensors

Throughout this book, we define sensors broadly as tools for transmitting useful information for the cognitive cycle. The information is used to optimize the radio link to enhance the quality of service (QoS) provided. The sensing tools range from classic sensors, such as microphones, to cognitive sensors that provide information acquired through advanced processing techniques, for example, the impulse response of the channel. Sensors are chosen according to the environment considered, as illustrated in Table 3.1. The classification of sensors presented in Table 3.2 is explained in the rest of the chapter. It is consistent with the model proposed in Figure 1.11 and introduces a non-exhaustive list of sensing tools divided according to the layer served (lower, intermediate, or higher).

3.1. Lower layer sensors

The lower layer sensors contain, amongst others, the physical layer sensors (see Table 3.2). In this section, we focus on the physical layer sensors.

3.1.1. Hole detection sensor

This sensor is extensively studied in the literature under the name sensing. The cognitive radio (CR) is often limited to this sensor for detecting holes (or white spaces) in the spectrum, as already discussed in Chapter 1. CRs require that the secondary network users are able to detect free spaces in the spectrum and use them in such a way that they do not interfere with the transmissions of the primary network. The QoS required by the primary network will otherwise be significantly decreased. The problem of detection of primary network activity can be cast as a simple energy detection problem, for which various efforts and methods have existed since 1960 [DIG 03, GAR 91, KOS 02, URK 67].

Chapter written by Renaud SÉGUIER, Jacques PALICOT, Christophe MOY, Romain COUILLET and Mérouane DEBBAH.

Electromagnetic: spectral occupancy; blanks/holes in the spectrum; signal-to-noise ratio; channel impulse response; number and position of hotspots and base stations.

Network: number and positions of users; usable standards in proximity; operators and services in proximity; load on the radio link.

Material: battery level; energy consumption; circuit utilization rate (FPGA); utilization rate of the ALU; memory utilization rate; temperature of the material.

User: microphone, camera; user identification; spatial position, velocity, time, interior/exterior; preferences; user's profile detection; facial recognition; voice recognition, etc.

Case study, medical application: emotional state; user's temperature; blood pressure level; sugar level, etc.

Table 3.1. Classification list (not exhaustive) of sensors according to the environment

The problem of signal detection is cast as a hypothesis test:

H0 : no signal is present

H1 : a signal is being transmitted [3.1]

However, contrary to the classical techniques of signal detection, CRs are deployed in large networks so that:

– many players potentially intervene in the process of signal detection;

– all the players in the network are mobile; this imposes new conditions on the established model of detection;

– each user of the primary and secondary network is potentially equipped with multiple antennas for transmission/reception.

Higher layer (application and MMI (man–machine interface)): user's profile; price per kilo-octet; operator; personal choices, etc.; sound; video.

Intermediate layer (transport, network): velocity; position; security; vertical mobility; inter- and intra-network (see Figure 1.3); load on the radio link; services and networks in proximity.

Lower layer (physical, MAC, platform): detection of holes/blanks; access type; received power; transmitted power; modulation type; channel coding; carrier frequency; symbol frequency; horizontal mobility; channel estimation; antenna lobe formation; consumption; material temperature.

Table 3.2. Classification of sensors based on a simplified three-layer model

The model that we will follow here consists of the following hypothesis test:

H0 : yk = nk

H1 : yk = Hxk + nk [3.2]

where yk ∈ C^N is the vector of signals received by the combination of N secondary users (or, more precisely, by the total number of reception antennas) at time k; nk ∈ C^N is the additive Gaussian noise received by the N users, regardless of the transmission hypothesis, at time k; xk ∈ C^n is the vector transmitted by n primary users at time k (or, more precisely, by the total number of antennas used by the primary users for transmission); and H ∈ C^(N×n) is the transmission channel, whose elements are taken to be standard Gaussian and are unchanged for a fixed time duration.


The channel is assumed to be sampled with a sampling period N. Combining the received vectors yk, we rewrite the hypothesis test in matrix format as:

H0 : Y = N

H1 : Y = HX + N  [3.3]

where the columns of Y, N ∈ C^(N×L) are the vectors y1, . . . , yL and n1, . . . , nL, respectively, and the columns of X ∈ C^(n×L) are the vectors x1, . . . , xL of the above model.
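As a concrete illustration of model [3.3], the sketch below (Python/NumPy; the function name and parameters are ours, not the book's) draws an observation matrix Y under either hypothesis, with a channel H of standard Gaussian entries held fixed over the L samples:

```python
import numpy as np

def draw_observation(N, n, L, sigma2, h1, rng):
    """Draw Y in C^(N x L) following the matrix model [3.3]:
    H0 -> Y = N (noise only), H1 -> Y = HX + N."""
    # Circularly symmetric complex Gaussian noise, variance sigma2 per entry.
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((N, L))
                                   + 1j * rng.standard_normal((N, L)))
    if not h1:
        return noise  # H0: Y = N
    # H1: Y = HX + N, with H fixed over the L observation instants.
    H = (rng.standard_normal((N, n)) + 1j * rng.standard_normal((N, n))) / np.sqrt(2)
    X = (rng.standard_normal((n, L)) + 1j * rng.standard_normal((n, L))) / np.sqrt(2)
    return H @ X + noise

rng = np.random.default_rng(0)
Y0 = draw_observation(N=4, n=1, L=8, sigma2=0.5, h1=False, rng=rng)
Y1 = draw_observation(N=4, n=1, L=8, sigma2=0.5, h1=True, rng=rng)
```

The detectors discussed in the following subsections all operate on such a Y.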

Conventional signal detection methods are often insufficient in the context of opportunistic spectrum access. We present these methods in sections 3.1.1.1, 3.1.1.2, and 3.1.1.3, and explain their shortcomings. We then present an optimal collaborative detection model in section 3.1.1.4 and compare it with the models presented in the previous sections.

3.1.1.1. Matched filtering

We first present traditional methods for the detection of non-cooperative sources (pilot-based or blind detection), and then discuss the framework for new collaborative detection techniques in finite dimension, when n and N are not considerably large. We conclude with detection techniques for large system models.

In this section, we present a pilot-aided detection technique called matched filtering. For this technique, we assume that X is known a priori to the receiver or receivers. We also assume that the receiver has knowledge of the sampling frequency and the transmission rate. We simplify model [3.3] by assuming N = n = 1 and H = 1. The vectors x1, . . . , xL and y1, . . . , yL become scalar parameters x1, . . . , xL and y1, . . . , yL, respectively. In this context, the matched filtering technique maximizes the signal detection capability. It consists of evaluating the following:

Cmf = Σ_{k=1}^{L} x*_k yk  [3.4]

If |xk|² = 1 for all k, then Cmf is a random variable distributed according to the following rule:

Cmf ∼ N(0, σ²) under H0; Cmf ∼ N(L, σ²) under H1  [3.5]

where σ² is the variance of the additive noise nk for all k. The decision between hypotheses H0 and H1 depends on the desired error probability. In particular, it is generally required to have a minimal probability of deciding H0 when H1 is the actual hypothesis (to avoid interference with the primary network). In this context, it is desirable to decide H0 only when Cmf/L ≪ 1. The matched filtering approach requires a systematic transmission of pilot sequences from the primary network users to allow the secondary users to opportunistically access the spectrum. This implies that the secondary network has a priori knowledge of the center frequency and the sampling rate of the primary network data. These hypotheses are highly unrealistic in the context of opportunistic radio.
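The matched filtering statistic [3.4] is straightforward to compute when the pilot is known. The sketch below (Python/NumPy; the pilot construction and threshold value are illustrative assumptions, not from the book) normalizes Cmf by L so that the statistic concentrates near 0 under H0 and near 1 under H1:

```python
import numpy as np

def matched_filter_statistic(x, y):
    """C_mf = sum_k conj(x_k) * y_k, as in [3.4]; x is the known pilot."""
    return np.vdot(x, y)  # np.vdot conjugates its first argument

def matched_filter_detect(x, y, threshold):
    """Decide H1 when Re(C_mf)/L exceeds the threshold (near 1 under H1)."""
    L = len(x)
    return (matched_filter_statistic(x, y).real / L) > threshold

rng = np.random.default_rng(1)
L, sigma2 = 64, 0.5
x = np.sign(rng.standard_normal(L)).astype(complex)  # unit-modulus pilot, |x_k|^2 = 1
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
present = matched_filter_detect(x, x + noise, threshold=0.5)  # H1: pilot + noise
absent = matched_filter_detect(x, noise, threshold=0.5)       # H0: noise only
```

With L = 64 samples, the normalized statistic stays far from the 0.5 threshold under both hypotheses, so the decision is reliable in this toy setting.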

3.1.1.2. Detection

In a more realistic case in which the sequences transmitted by the primary network are not known a priori to the secondary network, it is always possible to isolate the intrinsic properties of the type of signals transmitted, in order to identify them. This is always true, in principle, for a telecommunications signal [BIC 86]. In particular, let us consider the case of orthogonal frequency division multiplexing (OFDM), which conventionally uses a large cyclic prefix and hence generates a temporal redundancy in the signal transmitted. The work of Gardner [GAR 91] mainly studies the cyclostationarity of the signals, stemming from the redundancy in frequency of the transmitted signals and completely characterized by the spectral coherence function ρ^α_X(f), defined for a process X taken at frequency f for a cyclicity α. Various criteria of spectral coherence can be envisioned; for more detail, see [GAR 91] and [ENS 95]. Whenever the signals detected at frequency f show a cyclostationary character at cyclic frequency α, H1 is decided. If, however, no particular correlation of the signal is detected at any cyclic frequency α, then H0 is decided.

A detection threshold ξ must be explicitly chosen for the statistic:

Ccyc(y1, . . . , yL) = sup_α |ρ^α_{y1,...,yL}(f)|  [3.6]

We make the decision:

Ccyc > ξ ⇒ H1 is chosen;

Ccyc < ξ ⇒ H0 is chosen. [3.7]

This decision threshold is a function of the variance of the additive noise. It is, however, not trivial to simply highlight the cyclostationary character of the signal to be detected. In the context of transmissions with minimum redundancy, it may turn out that the spectral coherence of the received signal is very weak. Extensions of the cyclostationarity method, especially the kth-order cyclostationarity method of Dandawate and Giannakis [DAN 94], have been proposed. In all these methods, it is important to highlight the fact that the noise variance is not accounted for in the cyclicity measures, although it clearly affects the test performance.
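As a rough illustration of the test [3.6]–[3.7], the sketch below computes an empirical cyclic autocorrelation, a common surrogate for the spectral coherence function (the function names, lag parameter, and threshold are our illustrative choices, not the exact method of [GAR 91]):

```python
import numpy as np

def cyclic_autocorrelation(y, alpha, lag):
    """Empirical cyclic autocorrelation R_y^alpha(lag) at cyclic frequency
    alpha (cycles/sample); a simple surrogate for the coherence in [3.6]."""
    L = len(y)
    k = np.arange(L - lag)
    return np.mean(y[k + lag] * np.conj(y[k]) * np.exp(-2j * np.pi * alpha * k))

def cyclo_detect(y, alphas, lag, threshold):
    """Decide H1 if the strongest cyclic feature over the candidate cyclic
    frequencies exceeds the threshold, mirroring the test [3.6]-[3.7]."""
    c = max(abs(cyclic_autocorrelation(y, a, lag)) for a in alphas)
    return c > threshold
```

In practice the candidate set of cyclic frequencies α and the lag are chosen from the suspected standard (e.g. the OFDM useful symbol length), and the threshold must account for the noise level, which is precisely the difficulty noted above.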

3.1.1.3. Energy detection

The most commonly used technique for detection was introduced by Urkowitz in 1967 [URK 67]. This technique is called energy detection. It does not require any preliminary assumptions on the transmitted signal and optimizes the decision of the hypothesis test when N = n = 1 and H = 1. In this case, the decision criterion Ced consists of summing:

Ced(y1, . . . , yL) = (1/L) Σ_{k=1}^{L} |yk|²  [3.8]

When Ced is greater than a given detection level, H1 is declared. This level is again a function of the variance σ² of the noise and can be varied in order to minimize the probability of detection error.

The assumption H = 1, however, implies that the source to be detected is in line of sight of the receiver. The assumption N = n = 1 implies that a single source (single antenna) can be detected by a single-antenna receiver. Within the framework of cognitive networks, it is highly desirable to permit cooperation among the different players of the secondary cognitive network, who generally receive a total of N ≫ 1 independent signals. It is to be noted, however, that the energy detector can trivially be generalized to the case where N, n > 1 by considering not only the scalar sum of |yk|² but the trace of the matrix (1/L)YY^H. We will see afterward that this suboptimal technique provides detection capabilities comparable to the optimal methods developed henceforth.
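The energy detector [3.8] and its trace-based extension to N > 1 reduce to a few lines. A minimal sketch (Python/NumPy; the threshold is supplied by the caller, as in the text, since it depends on σ²):

```python
import numpy as np

def energy_statistic(Y):
    """Energy detector: (1/L) * sum_k |y_k|^2 in the scalar case [3.8].
    For an N x L matrix Y this equals the trace extension (1/L) * tr(Y Y^H),
    since tr(Y Y^H) is the sum of |Y_ij|^2 over all entries."""
    Y = np.atleast_2d(Y)
    L = Y.shape[1]
    return np.sum(np.abs(Y) ** 2) / L

def energy_detect(Y, threshold):
    """Decide H1 when the averaged energy exceeds the detection level,
    which in the book's setting is a function of the noise variance sigma^2."""
    return energy_statistic(Y) > threshold
```

Note that no knowledge of the transmitted signal is needed, but the threshold still requires σ², which motivates the blind criteria discussed later.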

3.1.1.4. Collaborative detection

In the framework of collaborative cognitive networks, numerous efforts were made to devise an optimal estimator in the context of the general model [3.3]. The optimal decision is based on the Neyman–Pearson criterion, which depends on the following relationship:

Copt(Y) = P_{H1|Y}(Y) / P_{H0|Y}(Y)  [3.9]

where P_X(x) denotes the probability of the event x for the random variable X. When Copt > 1, hypothesis H1 is more probable than H0, and vice versa. To optimize the decision under a minimum error probability constraint, it is sufficient to set a level ξ such that Copt > ξ implies that H1 will be decided, while Copt < ξ implies that H0 will be decided.

The exact relationship Copt is derived in [COU 10b] for a matrix H of standard Gaussian entries, when n and N are not constrained. This modeling of H is based on maximum entropy considerations, when the receivers have no a priori knowledge of the transmission channel [JAY 03, JEY 46]. Copt(Y) is taken to be a function of only the eigenvalues of the Gram matrix YY^H associated with Y. In the particular case where the cognitive network is searching to detect a single source, i.e. n = 1, if z1, . . . , zN denote the N eigenvalues of YY^H, we get:

Copt(Y) = (1/N) Σ_{l=1}^{N} [σ^{2(N+L−1)} e^{σ² + zl/σ²} / Π_{i=1, i≠l}^{N} (zl − zi)] J_{N−L−1}(σ², zl)  [3.10]

where:

Jk(x, y) = ∫_x^{+∞} t^k e^{−t − y/t} dt.  [3.11]

Copt requires a priori knowledge of σ², which in practice implies a priori knowledge of the statistics of the additive noise. Generally, this constraint is unrealistic. It is, however, possible to generalize Copt to the case where σ² is not known a priori. In this case, Copt becomes:

Copt(Y) = [∫_{σ²−}^{σ²+} P_{Y|σ²,H1}(Y, σ²) dσ²] / [∫_{σ²−}^{σ²+} P_{Y|σ²,H0}(Y, σ²) dσ²]  [3.12]

where σ²− and σ²+ are such that σ² ∈ [σ²−, σ²+]. While explicit calculations of P_{Y|σ²,H1}(Y, σ²) and P_{Y|σ²,H0}(Y, σ²) are detailed in [COU 10b], an explicit computation of Copt is not possible, and only numerical methods can be used for the evaluation of [3.12].

The performance of the optimal technique for N = 4, n = 1 with σ² known a priori is compared with the conventional energy detector of Urkowitz, extended to the case N > 1, in Figure 3.1 (a natural extension of the energy detector is obtained by summing not the squares of the scalars received at a single receiver, but the trace of the matrix YY^H received at the secondary receivers). It is clear that the energy detector technique is suboptimal, but it nonetheless shows only a small performance degradation compared with the optimum detector. Our calculations therefore suggest that the natural extension of the energy detector method is a convenient substitute and is less onerous than the optimal detection method for N > 1.

The detection technique for the case N > 1 is by far more complex than the original solution proposed by Urkowitz. The practical calculation of Copt may be particularly prohibitive if a large number of frequency bands are to be tested. Moreover, it is to be noted that no natural extension of Urkowitz's method exists for the case where σ² is not known a priori. Suboptimal techniques that do not require a priori knowledge of σ² are hence considered. Such techniques are much easier to implement. When N becomes extremely large, however, the random matrix domain provides simple solutions, described henceforth.


Figure 3.1. ROC curve for n = 1, N = 4, L = 8, SNR = −3 dB

When N and L increase simultaneously toward infinity with the relation 0 < c = N/L < ∞:

– under the hypothesis H0, the distribution of the eigenvalues of (1/L)YY^H converges almost surely toward the Marcenko–Pastur law, which has compact support [σ²(1 − √c)², σ²(1 + √c)²] [BAI 98];

– on the contrary, under the hypothesis H1, if n is finite, then the distribution of the eigenvalues of YY^H shows a finite number of eigenvalues outside the support [σ²(1 − √c)², σ²(1 + √c)²].

This effect is shown in Figure 3.2, when n = 4.

In the particular case where n = 1 and σ² + Σ_{k=1}^{N} |Hk1|² > 1 + √c, the maximum eigenvalue of (1/L)YY^H almost surely tends toward α + cα/(α − 1), where α = σ² + Σ_{k=1}^{N} |Hk1|². If α < 1 + √c, it is not possible to identify the presence of the transmitting source in the frequency band under consideration [SIL 10]. This condition gives a first answer to the fundamental limits of signal detection in large dimension systems. We assume here α > 1 + √c, which can be satisfied in practice by sampling the channel L times, such that c = N/L is sufficiently small. Three simple criteria to decide the presence of the signal are:

– to determine the presence or absence of eigenvalues outside the support [σ²(1 − √c)², σ²(1 + √c)²]. If no eigenvalues are present outside the support of the Marcenko–Pastur law, H0 is decided, otherwise H1;

– when only a single source is transmitting, a more detailed criterion consists of comparing the extreme eigenvalue of (1/L)YY^H with the two possible values σ²(1 + √c)² and α = σ² + Σ_{k=1}^{N} |Hk1|². A deep knowledge of the large deviation statistics of the extreme eigenvalues can then yield an exact decision criterion. Under the null hypothesis H0, the maximum eigenvalue (properly normalized) follows the Tracy–Widom law [JOH 01], whereas in the case of H1, the maximum eigenvalue follows a Gaussian law;

Figure 3.2. Distribution of eigenvalues of a large dimension system under the hypothesis H1, n = 4 ≪ N

– another criterion is the condition number of the matrix (1/L)YY^H, which permits avoidance of a priori knowledge of the noise variance σ² [CAR 08]. If we denote the minimum and maximum eigenvalues of (1/L)YY^H by λmin and λmax, respectively, then asymptotically:

Ccn = λmax/λmin = σ²(1 + √c)² / [σ²(1 − √c)²] = (1 + √c)²/(1 − √c)²  [3.13]

which no longer depends on σ². It is then sufficient to select a decision level ξ that minimizes the errors of detection, i.e. to decide H1 when Ccn > ξ and H0 otherwise.

The most attractive feature of the latter technique is that it does not require a priori knowledge of σ². In Figure 3.3, the ROC curve of this method for finite N is compared with the large-N method for N = 4, n = 1. It is clearly visible that the finite-N method outperforms the asymptotic method. We can thus think about an extension of the asymptotic method through a more precise study of the large deviations of λmax and λmin.
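The condition-number criterion [3.13] is simple to implement; a sketch (Python/NumPy; the decision level ξ is left to the caller, as in the text):

```python
import numpy as np

def condition_number_statistic(Y):
    """C_cn = lambda_max / lambda_min of (1/L) Y Y^H, as in [3.13];
    the statistic is independent of the noise variance sigma^2."""
    N, L = Y.shape
    eig = np.linalg.eigvalsh(Y @ Y.conj().T / L)  # Hermitian: real, ascending
    return eig[-1] / eig[0]

def condition_number_detect(Y, xi):
    """Decide H1 when C_cn exceeds the level xi. The asymptotic H0 reference
    value is (1 + sqrt(c))^2 / (1 - sqrt(c))^2, with c = N/L."""
    return condition_number_statistic(Y) > xi
```

A natural choice is to set ξ slightly above the asymptotic H0 value (1 + √c)²/(1 − √c)², trading false alarms against missed detections.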


A natural extension is given by the technique known as generalized maximum likelihood. This technique is based on the following relationship:

CGLRT(Y) = sup_{H,σ²} P_{Y|H,σ²}(Y) / sup_{σ²} P_{Y|σ²}(Y)  [3.14]

instead of the optimal relation [3.9]. Intuitively, this approach isolates the hypothesis on the pair (H, σ²) that is most appropriate if H1 is the correct hypothesis, isolates the value of σ² that is most likely if the hypothesis H0 is correct, and takes the ratio of the probabilities of these two hypotheses. This method is highly suboptimal in the sense that it systematically eliminates all hypotheses having a probability less than the most probable hypothesis. Typically, even if a large number of channels H show a high probability P_{Y|H,σ²}(Y), only the maximum hypothesis (and hence a unique H) is retained in the new relationship CGLRT.

Detailed calculations of [3.14], however, lead to a relatively simple decision criterion in the case n = 1 [BIA 09]:

CGLRT(Y) = sup_{H,σ²} P_{H1|Y,H,σ²}(Y) / sup_{σ²} P_{H0|Y,σ²}(Y)  [3.15]

= [((N − 1)/N)^{N−1} · (max_i{zi} / ((1/N) Σ_{i=1}^{N} zi)) · (1 − max_i{zi}/Σ_{i=1}^{N} zi)^{N−1}]^{−L}  [3.16]

where z1, . . . , zN are the eigenvalues of the matrix (1/L)YY^H. Precise details of how to adjust the H1 decision threshold can be found in [BIA 09].
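The GLRT statistic [3.16] depends on the data only through the eigenvalues of (1/L)YY^H; a direct transcription (Python/NumPy; the threshold selection procedure of [BIA 09] is not reproduced here):

```python
import numpy as np

def glrt_statistic(Y):
    """GLRT criterion [3.16] for n = 1, computed from the eigenvalues
    z_1, ..., z_N of (1/L) Y Y^H."""
    N, L = Y.shape
    z = np.linalg.eigvalsh(Y @ Y.conj().T / L)  # real eigenvalues, ascending
    z_max, z_sum = z[-1], z.sum()
    ratio = (((N - 1) / N) ** (N - 1)
             * (z_max / (z_sum / N))
             * (1 - z_max / z_sum) ** (N - 1))
    return ratio ** (-L)

def glrt_detect(Y, xi):
    """Decide H1 when C_GLRT exceeds the threshold xi (see [BIA 09])."""
    return glrt_statistic(Y) > xi
```

The statistic is a monotone function of the ratio between the largest eigenvalue and the average eigenvalue, which is why no knowledge of σ² is required.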

The performance of the generalized maximum likelihood detector is shown in Figure 3.3. The GLRT does not require a priori knowledge of the variance σ² of the additive noise and shows better performance than the other detectors, as it takes advantage of the conditioning of the matrix (1/L)YY^H. This solution thus provides an adequate substitute for the optimal detector when σ² is not known a priori to the secondary receivers of the cognitive network.

To sum up, matrix model [3.3], more general than the scalar models used in the conventional techniques, presents a mathematical challenge due to the complexity of the optimal solutions obtained by explicit calculations. From a practical point of view, many suboptimal techniques, based on a priori knowledge, show a performance close to that of the optimal Bayesian detector. Currently, the first barrier has been overcome, allowing us to integrate inexpensive source detectors in CRs that require a priori knowledge of neither the transmission channel nor the signal-to-noise ratio. However, it is conceivable that some external information on the channel may be known a priori to certain users, and this knowledge must be integrated into model [3.3]; in such a situation, a new model must be set up, for which new optimal Bayesian calculations must be carried out.

Figure 3.3. ROC curve where σ² is unknown a priori: finite N method, asymptotic method, n = 1, N = 4, L = 8, SNR = 0 dB

In addition, it seems that the detection capabilities mentioned above imply that the secondary users must have the capability to share their signals in a clean and fast manner so as not to interfere with the primary network. If the primary network is dense and occupies a lot of resources, then it seems difficult to imagine a scenario of information exchange between secondary network users that does not affect the primary network at all. In such a situation, the effective number of users in the secondary network must be restricted. Similarly, if the primary network is sparse but the secondary network contains a large number of users, it is suspected that the enormous amount of information to pass through this network cannot be exchanged without disturbing the primary network communications. In such cases, a restriction is imposed on the amount of information exchanged and/or on the total number of opportunistic users. Future research in CR for the detection of holes will require setting up more realistic models, taking into account network density constraints.

3.1.2. Other sensors

3.1.2.1. Recognition of channel bandwidth

In 2001, it was shown in [PAL 01, ROL 01] that the channel bandwidth (BWc) of a given standard is completely discriminatory among all the existing commercial wireless standards (2G, 3G, digital broadcast, and wireless local area network (WLAN)). The authors used a radial basis function (RBF) neural network to compare the power spectral densities (PSDs) of the received signals with the reference PSDs, given by equation [3.17], of the different standards to be identified. The bandwidth pattern is given by the shaping filter and the bandwidth. This pattern is specific to each standard and can thus be recognized by a neural network of the RBF type, as illustrated in Figure 3.4. The advantage of a neural network in the search for the BWc is that it carries out pattern recognition on the spectrum of the signals. It therefore takes into account the parameters of bandwidth, modulation, and shaping filter. Moreover, it is resistant to perturbations caused by the transmission channel.

Figure 3.4. Spectral pattern recognition, given by PSDs of the signal, using an RBF neural network

γref(k) = |Fems(fp/fe − k)|² γmod(fp/fe − k)  [3.17]

where γref is the PSD of the reference signal, Fems the shaping filter (e.g. root Nyquist) of the modulated carrier p of the standard s under consideration, and γmod(fp/fe − k) the PSD of the modulation of the carrier p of the standard s.

The received multistandard signal (also called the composite signal), sampled at frequency fe, is given by equation [3.18]:

x(kTe) = Σ_{s=1}^{S} Σ_{p=1}^{P} hs,p(kTe) ∗ (Fems(kTe) ∗ ms,p(kTe) exp(2jπ k fp/fe)) + bT(kTe)  [3.18]

where Fems is the shaping filter, ms,p(t) the modulation on the carrier p, and hs,p(kTe) the channel response of the carrier p of the standard s. The PSD of this signal is obtained by using an average periodogram. This spectrum therefore contains a sum of p channels, each of which is a product of the different spectral densities of the modulations due to the shaping filter, multiplied by the frequency perturbations of the channel. Using the neural network of Figure 3.4, we compared this real spectrum with a base of reference spectra given by equation [3.17].

An indoor test campaign was conducted on real signals (fixed and mobile). The PSDs of these signals were obtained by means of an average periodogram with eight fast Fourier transforms (FFTs). The neurons' threshold was fixed to maximize the rate of correct detections on the one hand, and to minimize that of false detections on the other. Figure 3.5 shows that, for the global system for mobile communication (GSM) neuron, a threshold of 0.002 optimizes the distance between the desired correct detections and the false detections of other systems.

Figure 3.5. Percentage of correct detections of the channel bandwidth recognition sensor according to the threshold for the GSM neuron [ROL 01], and percentage of false detections of the GSM neuron when the received signals are CT2 or PHS. The optimal threshold is the one that maximizes the distance between correct detections and false detections
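The average periodogram mentioned above can be sketched as a Bartlett-style average of eight FFTs (Python/NumPy; the segment handling and normalization are our illustrative choices, not necessarily those of [ROL 01]):

```python
import numpy as np

def averaged_periodogram(x, n_segments=8):
    """Bartlett-style averaged periodogram: split the signal into n_segments
    blocks and average the squared FFT magnitudes, as in the test campaign
    (eight FFTs). Trailing samples that do not fill a block are dropped."""
    seg_len = len(x) // n_segments
    segs = np.reshape(x[:seg_len * n_segments], (n_segments, seg_len))
    spectra = np.abs(np.fft.fft(segs, axis=1)) ** 2 / seg_len
    return spectra.mean(axis=0)
```

Averaging over segments reduces the variance of the PSD estimate, which is what makes the spectral pattern stable enough for the RBF network to classify.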

3.1.2.2. Single- and multicarrier detection

Certain standards having very similar PSD shapes, for example digital audio broadcasting (DAB) and digital enhanced cordless telecommunications (DECT) signals, do not provide satisfactory results with the sensor presented in [PAL 01]. These signals differ by the fact that they are single- or multicarrier. This characteristic can be detected by identifying the guard interval (GI) present in the multicarrier signals. The GI is generally created by copying a part of the end of the OFDM symbol and appending it to the beginning of this symbol. This generates a particular cyclic frequency that can be detected (see section 3.1.1.2). An example of this detection is given in Figure 3.6.


Figure 3.6. Detection of the GI of an OFDM signal (OFDM symbol 2K; GI/Tu = 1/6)
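The GI-induced redundancy can be illustrated by correlating the received signal with a copy of itself delayed by the useful symbol length Tu. In the noiseless sketch below (Python/NumPy; the Tu and GI lengths are hypothetical, not those of Figure 3.6), the two correlated windows coincide exactly:

```python
import numpy as np

def guard_interval_metric(y, Tu):
    """Normalized correlation between y and its copy delayed by the useful
    symbol length Tu; it peaks when a cyclic prefix (GI) repeats the
    end of the symbol at its beginning."""
    a, b = y[:-Tu], y[Tu:]
    return abs(np.vdot(a, b)) / np.sqrt(np.vdot(a, a).real * np.vdot(b, b).real)

# Illustrative OFDM-like symbol: a random "useful part" of length Tu with a
# guard interval copied from its tail and prepended (hypothetical parameters).
rng = np.random.default_rng(2)
Tu, gi = 128, 32
useful = rng.standard_normal(Tu) + 1j * rng.standard_normal(Tu)
symbol = np.concatenate([useful[-gi:], useful])  # GI prepended to the symbol
metric = guard_interval_metric(symbol, Tu)  # equals 1: the GI windows coincide
```

With noise and multiple symbols, the metric peaks periodically at the symbol rate, which is the cyclic signature exploited by the cyclostationarity test of section 3.1.1.2.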

3.1.2.3. Detection of spread spectrum type

With the emergence of new IEEE standards, discrimination with the two previous sensors is no longer sufficient. In particular, it is not possible to discriminate between Bluetooth and WiFi (IEEE 802.11b) at 2.4 GHz. Indeed, these two standards can coexist in the same band and, at the same time, their spectra are identical. Yet, they differ by the type of spread spectrum used (frequency hopping for Bluetooth and direct sequence for WiFi). This is the reason behind the proposal of this new sensor. A previous study proposed to make this discrimination after a time/frequency transform [GAN 04, GAN 05]. A simpler solution was proposed in [LOP 09]. This solution uses a Choi–Williams transform (see Figure 3.7) followed by segmentation. Three measurements are then taken on the resulting image: the length of the time segments, the width of the frequency segments, and the interval between two time segments. By using these three measurements, we can determine whether frequency hopping is present in the signal or not.
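As a simplified stand-in for the Choi–Williams transform plus segmentation of [LOP 09], the sketch below builds a crude time-frequency image by framing the signal, tracks the dominant FFT bin per frame, and declares frequency hopping when that bin changes (the frame length and tone frequencies are illustrative assumptions):

```python
import numpy as np

def dominant_frequency_track(x, frame_len):
    """Split x into frames and return the FFT-bin index of the strongest
    frequency in each frame (a crude substitute for the Choi-Williams
    time-frequency image used in [LOP 09])."""
    n_frames = len(x) // frame_len
    frames = np.reshape(x[:n_frames * frame_len], (n_frames, frame_len))
    spectra = np.abs(np.fft.fft(frames, axis=1))
    return np.argmax(spectra[:, :frame_len // 2], axis=1)  # keep low bins only

def is_frequency_hopping(x, frame_len):
    """Declare frequency hopping if the dominant bin changes across frames."""
    track = dominant_frequency_track(x, frame_len)
    return len(np.unique(track)) > 1

# Synthetic hop between two complex tones (illustrative frequencies).
n, frame_len = 512, 64
t = np.arange(n)
x = np.where(t < n // 2,
             np.exp(2j * np.pi * 0.10 * t),   # first tone, peak near bin 6
             np.exp(2j * np.pi * 0.25 * t))   # second tone, peak at bin 16
```

The measurements of [LOP 09] (segment lengths, widths, and spacings) refine this idea: they also separate frequency hopping from other time-frequency patterns, which the dominant-bin test alone cannot do.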

3.1.2.4. Other sensors of the lower layer

Depending on the required level of knowledge of the environment and on the application needs, it is possible to use information coming from various other sensors. Depending on the desired degree of independence of the equipment and on the a priori knowledge level of the parameters of the physical layers considered1, the number and type of sensors could belong to the following (non-exhaustive) list:

– detection of carrier and symbol frequencies;

– recognition of modulation type;

1. In military applications of the interception type, this level of knowledge is practically null and void.


Figure 3.7. Detection of the presence of FH: Choi–Williams transform of the received signal; illustration of frequency hopping with four frequencies

– recognition of coding (convolutional coding, block coding, space-time coding, etc.);

– recognition of access type;

– detection of synchronizations.

3.2. Intermediate layer sensors

3.2.1. Introduction

A CR terminal, when launched in an environment characterized by heterogeneous and multiple access networks, is initially aware neither of the radio access technologies (RATs) that are accessible to it, nor of the RAT and frequency band best suited to its needs. The CR terminal also faces situations in which approaches such as dynamic spectrum allocation (DSA) and flexible spectrum management (FSM) are adopted by regulators (see Chapter 1, section on spectrum management). In the latter case, the terminal completely ignores the spectral distribution. Without any a priori information, the terminal needs to scan a significant portion of the spectrum to recognize the existing RATs in its environment and to be able to launch communications. To avoid this search, various solutions have been proposed. Some are based on the use of a common carrier accessible from anywhere, like the cognitive pilot channel (CPC; see section 3.2.2); others are based on geolocalization (see section 3.2.3), or on more futuristic blind spectrum recognition (see section 3.2.4).


Figure 3.8. Selection procedure of the RAT by using the CPC (activation of the radio terminal; determination of the geographical location; reading of the CPC; extraction of the information corresponding to the network where the terminal is present; selection of the RAT)

3.2.2. Cognitive pilot channel

The CPC is a recent concept in the CR domain [COR 06, HOU 06, PER 07]. As indicated by its name, it designates a radio channel through which a CR terminal can recover, for the place where it is located, the pertinent information regarding frequency band allocation, RATs, services, etc.

The notion of CPC was introduced to simplify the task of the terminal to retrievethis information in a simple and fast manner. When the radio terminal is turnedon, it first determines its geographical position as shown in Figure 3.8. In this

Page 17: Radio Engineering (From Software To Cognitive Radio) || Cognitive Radio Sensors

Cognitive Radio Sensors 59

context, positioning is related to the geolocalization method presented in section 3.2.3. Positioning allows the terminal to decode the contents of the CPC channel soon after. The terminal can hence determine all the networks available to it. For each network, it can identify all the available operators, their preferred RATs, and thus their frequency bands.
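The selection procedure of Figure 3.8 can be sketched in a few lines of Python. The CPC message format, the region names, and the operator entries below are invented for illustration only; the real channel contents are still under standardization study:

```python
# Sketch of the CPC-based RAT selection of Figure 3.8. The broadcast table
# and its entries are illustrative assumptions, not a real CPC format.
CPC_BROADCAST = {
    # region -> operator -> (RAT, band in MHz), all values illustrative
    "region-A": {"op1": ("UMTS", 2100), "op2": ("GSM", 900)},
    "region-B": {"op1": ("LTE", 1800)},
}

def select_rat(position_to_region, position, preference=("UMTS", "LTE", "GSM")):
    """Locate the terminal, read the CPC entry for its region, pick a RAT."""
    region = position_to_region(position)      # geolocalization step
    offers = CPC_BROADCAST.get(region, {})     # reading/extraction steps
    for wanted in preference:                  # selection step
        for operator, (rat, band) in offers.items():
            if rat == wanted:
                return operator, rat, band
    return None

# A dummy geolocalization function standing in for a GPS-to-region mapping
print(select_rat(lambda p: "region-A", (48.85, 2.35)))  # ('op1', 'UMTS', 2100)
```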

The simplification of the RAT selection procedure using the CPC has several advantages:

1) reduced acquisition time for a communication network;

2) reduced energy consumption by battery;

3) ease of deployment of new spectrum management approaches such as DSA/FSM.

A lot of effort remains, however, to be made before the materialization of this concept. The main difficulties faced are the following:

– the definition of the frequency of this CPC channel. The frequency can be defined on bands dedicated to each country or even to each region, in the sense of the International Telecommunication Union's (ITU) regions (in-band solution), or on a single frequency for everyone (out-band solution). This solution is still under study in the European Project E3 and has been proposed to various standardization organizations;

– the operators must agree to share information currently protected due to market competition.

3.2.3. Localization-based identification

The solution presented henceforth is based on the fact that, in every geographical location, there exists a known set of standards available to the user. It is thus necessary to precisely locate the equipment and to associate with its geographical position a list of standards stored in its database.

3.2.3.1. Geographical location-based systems synthesis

The available systems depend on the geographical location of the terminal at the time of transmission. The list of these usable systems can easily be edited thanks to the frequency maps of any geographical region. Unfortunately, the list varies depending on the movement of the terminal. Each standard is normalized for a certain defined geographic region. It can, however, cover various geographical entities such as a group of countries (European Union), a single country (France), a radius around an antenna, or even the entire world. It should be noted that the definition of a standard in a region does not mean that the receiver will have the right to communicate at any point in this region. This requires an adjustment of the geographical region to


the user rights. For certain proximity systems whose frequency map is nevertheless continental, like DECT, it is preferable to limit the usage area to a very localized coverage. We can divide the standards according to their utilization areas (see Figure 3.9):

– Global coverage: normally, it does not require GPS localization. There are, however, exceptions to this. The standards included in this category are the universal mobile telecommunication system (UMTS) and S-UMTS.

– Continental coverage (as defined by the ITU): this kind of coverage is relatively easy to manage. The communications follow the time zones. Yet again, there are exceptions that may complicate management. The standards included in this category are GSM, IS95, PDC (personal digital cellular), DAB, etc.

– Regional/national coverage: it must manage the contour. This is the main difficulty in database management. The standards included in this category are Radiocom2000, DVB-T, FM radio, etc.

– Local coverage: it is relatively easy to manage, if we know the transmitter center and if the terrain is not of utmost importance (mountains and buildings). This category consists of standards such as DECT, PHS (personal handy phone system), WLAN, and other local area networks. The frequencies of these standards are allocated by continental region, but in practice the user will have only partial rights within very limited places. For this coverage, a manual input is often preferable: it indicates the location of the transmitter and its coverage radius.

Figure 3.9. Examples of possible coverage
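For illustration, the local-coverage case above (a transmitter center plus a coverage radius) can be sketched as a simple distance test. The transmitter position and the 0.3 km radius are invented values for the example, not real DECT parameters:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def in_local_coverage(user, transmitter, radius_km):
    """Local coverage: a disk of radius_km around the transmitter center."""
    return haversine_km(*user, *transmitter) <= radius_km

# A DECT-like base assumed a few hundred meters from the user (illustrative)
print(in_local_coverage((48.8500, 2.3500), (48.8510, 2.3510), 0.3))  # True
```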


The knowledge of the frequency alone is not sufficient to distinguish all the systems. For Hertzian transmissions, the knowledge of the place where the equipment is located completely determines all the systems (and hence the frequencies) that can be used for transmission. Embedding geolocalization (e.g. GPS) in a receiver, associated with a concise table of frequency allocation according to geographic location (see Figure 3.10), makes it possible for a mobile to have a permanent knowledge of the systems it will be able to connect with. This table can be stored either in the user's customized card or in the terminal's memory. In both cases, it must be reprogrammable. Roaming is thus facilitated by this knowledge of the other networks.

Figure 3.10. Receiver architecture
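A minimal sketch of the receiver-side lookup is given below. The table entries and their bounding boxes are illustrative stand-ins for a real frequency-allocation database, whose coverage contours would be far more precise:

```python
# Sketch of the table of Figure 3.10: coverage entries and coordinates are
# illustrative assumptions, not a real frequency-allocation database.
FREQUENCY_MAP = [
    # (standard, coverage type, bounding box (lat_min, lat_max, lon_min, lon_max))
    ("UMTS",  "global",      (-90.0, 90.0, -180.0, 180.0)),
    ("GSM",   "continental", (35.0, 72.0, -25.0, 45.0)),   # roughly Europe
    ("DVB-T", "national",    (41.0, 51.5, -5.5, 9.5)),     # roughly France
]

def usable_standards(lat, lon):
    """Return the standards whose stored coverage contains the position."""
    found = []
    for name, _, (lat_lo, lat_hi, lon_lo, lon_hi) in FREQUENCY_MAP:
        if lat_lo <= lat <= lat_hi and lon_lo <= lon <= lon_hi:
            found.append(name)
    return found

print(usable_standards(48.85, 2.35))  # Paris: ['UMTS', 'GSM', 'DVB-T']
```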

3.2.3.2. Rights of database use and update

Many solutions are available to manage the database and to verify user access rights to a particular network. These solutions are divided into two general cases. In the first, the device is free to choose its operator, whereas in the second the device is attached to the service provider's own network or is subjected to the agreements between operators. Two logical conflicts, however, may arise: the operator favors its own network, while the user, according to his/her criteria of cost or bit rate, would like to have a wide range of possibilities. If the device can operate in single-operator mode, the number of standards available in one place could be reduced to about 10 (DECT, GSM900, DCS1800, the radio and TV frequencies of various countries and some local area networks useful for the user, the Globalstar system, UMTS, and S-UMTS). These standards are international and can thus be included in the databases in a straightforward manner. This configuration limits the usefulness of the process, as it is the operator that provides the information to update the


database and access rights. If, however, the device can be used in multioperator mode, the question of the database update and access rights does not arise in the same way. For example, a service provider can put at the disposal of all users an Internet server containing the frequencies and mapping of existing standards. Access rights then come into play at the time of network entry, and payment is made by credit card.

Database management is the most subtle point in this method. Indeed, for this method to be effective, it is necessary to constantly update the equipment's database. This can be done in several different ways, listed below:

– A download over the air, also called over-the-air reconfiguration (OTAR) [KOU 02]. It uses reconfiguration by downloading via the aerial route so that a terminal can download the code (binary/bitstream for a processor/field programmable gate array, FPGA) regardless of where it is located, to change all or part of its radio processing. The medium used for the download is a radio link of the terminal itself. In the case of an ongoing service (a communication), the data are multiplexed with the code to be downloaded.

– A download via the SIM card of the user. The user recharges or exchanges his/her card on a regular basis. His/her rights of use are also updated according to the location.

– A download via a network is identical to the operation by OTAR.

– Manual configuration is perhaps useful to allow modifications in unforeseen situations.

The management of the coverage is surely the crucial point of the system. Optimization of the management style and the size of the database are at the heart of the problem to be solved. The system cartography must take up the smallest possible amount of space in the database. Nevertheless, it is greatly simplified if we consider that, practically, only four types of coverage exist (see section 3.2.3.1).

If CPC or database-oriented approaches are not adopted or do not exist in a place, stand-alone techniques can be used. The idea is to identify all the standards present in an area without connecting to these standards (and thus to avoid standard switching and network connection). The next section presents a blind standard recognition sensor (BSRS) for this context.

3.2.4. Blind standard recognition sensor

3.2.4.1. General description

The BSRS analyzes the received signal in three stages, as shown in Figure 3.11. In the first stage, the received broadband signal is analyzed in a coarse manner (e.g. using a radiometer) in order to determine which frequency bands contain significant energy. This analysis is performed iteratively to select frequency bands of increasing narrowness. In the second stage, a very precise analysis of the selected


bands is performed. This analysis provides access to information such as the channel bandwidth (BWc), the distinction between single- and multicarrier signals, the type of spread spectrum, etc. This analysis is performed with several sensors of the lower layer. Finally, in the third stage, a fusion process of all the information obtained during the second stage (see section 3.1) is performed, which can decide about the standards that are present in the spectrum.

Figure 3.11. Blind standard recognition sensor [HAC 07]

3.2.4.2. Stage 1: band adaptation

The frequency bandwidth sampled in SR technology (see Chapter 7) is very large. As a result, efficient functioning of the second-stage sensors is extremely difficult with existing signal processing tools. That is why an adaptation of the bandwidth to be analyzed is performed during this first stage. This adaptation uses a classical energy detection followed by filtering and decimation around the energy peaks that are detected. It is performed in an iterative manner in order to analyze bands a few MHz wide.
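The iterative band adaptation can be sketched as a recursive radiometer: the band is split, the sub-bands whose mean energy exceeds a threshold are kept, and the split is repeated until the retained bands are narrow enough. The threshold, band widths, and the synthetic signal below are all illustrative assumptions:

```python
import numpy as np

def occupied_bands(spectrum, f_lo, f_hi, min_width, threshold):
    """Recursively split [f_lo, f_hi]; keep sub-bands with energy above threshold."""
    if f_hi - f_lo <= min_width:            # narrow enough for stage-2 sensors
        return [(f_lo, f_hi)]
    mid = len(spectrum) // 2
    halves = [(spectrum[:mid], f_lo, (f_lo + f_hi) / 2),
              (spectrum[mid:], (f_lo + f_hi) / 2, f_hi)]
    bands = []
    for s, lo, hi in halves:
        if np.mean(np.abs(s) ** 2) > threshold:   # radiometer decision
            bands.extend(occupied_bands(s, lo, hi, min_width, threshold))
    return bands

rng = np.random.default_rng(0)
samples = rng.normal(scale=0.1, size=1024)  # noise floor over a 0-100 MHz band
samples[256:384] += 1.0                     # a strong carrier in one sub-band
print(occupied_bands(samples, 0.0, 100.0, 12.5, 0.05))  # -> [(25.0, 37.5)]
```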

3.2.4.3. Stage 2: analysis with lower layer sensors

After studying the discriminating characteristics of the parameters in the various standards considered, three sensors were selected to identify the received signal within a predefined list of standards. These three sensors, described in section 3.1, are the BWc recognition of a standard, single- and multicarrier detection, and the detection of the spread spectrum type, between frequency hopping and direct sequence. The list of sensors can be extended to other characteristics if these three sensors prove insufficient for certain standards or to differentiate between future standards.


3.2.4.4. Stage 3: fusion

At the end of the second stage, three information sets are obtained. These are merged to decide which standard is present. The fusion is performed with simple logical rules or by using more powerful tools such as neural networks or Bayesian networks.
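The logical-rule variant of this fusion step can be sketched as a match against a table of standard signatures. The (tiny) signature table below is an illustrative assumption; a real fusion stage could use neural or Bayesian networks instead:

```python
# Sketch of stage-3 fusion with simple logical rules. The signature values
# are rounded, illustrative figures, not normative parameters.
SIGNATURES = {
    # standard: (bandwidth in MHz, multicarrier?, spreading type)
    "GSM":     (0.2,  False, "none"),
    "802.11b": (20.0, False, "direct-sequence"),
    "802.11g": (20.0, True,  "none"),
    "UMTS":    (5.0,  False, "direct-sequence"),
}

def identify(bandwidth_mhz, multicarrier, spreading, tol=0.1):
    """Return the standards consistent with the three sensor outputs."""
    return [name for name, (bw, mc, sp) in SIGNATURES.items()
            if abs(bw - bandwidth_mhz) <= tol * bw
            and mc == multicarrier and sp == spreading]

print(identify(20.0, True, "none"))  # -> ['802.11g']
```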

3.2.5. Comparison of the three abovementioned sensors for standard recognition

Methods                               | CPC                  | LBI       | BSRS
Need for a service provider           | Yes                  | Yes       | No
Content level (1)                     | High                 | Medium    | Low
Dependence on the radio coverage (2)  | Yes                  | No        | No
Computation complexity                | Very low             | Medium    | Very high
Need for normalization                | Yes                  | Yes       | No
Spectrum utilization                  | Yes                  | Yes       | No
Dependence on the operator            | Yes                  | Yes       | No
Need for an additional radio link     | Yes (the CPC itself) | Yes (GPS) | No

Table 3.3. Sensor comparison for standard recognition

Table 3.3 clearly shows that the BSRS, with the exception of the complexity criterion, is better than the other propositions on all other criteria. Criterion (1) in Table 3.3 indicates that the information provided is more complete with the CPC2 than with the BSRS. The CPC can provide additional information on standards, operators, services, etc., whereas the BSRS only gives information about the existence of the standard (in this case, access to more information requires demodulation of the signal). Criterion (2) of Table 3.3 means that the information provided by the method considered depends on the standard coverage. In fact, it is hardly imaginable that the CPC can give clear-cut and reliable information about all WiFi hotspots, while the BSRS can detect these standards. The localization-based identification (LBI) can also detect them, under the assumption that the database is correctly updated.

3.3. Higher layer sensors

3.3.1. Introduction

According to our earlier classification (see Table 3.2), higher layer sensors include the application layer sensors. There are close relationships between the higher layer

2 The CPC itself is currently being studied at ETSI.


sensors of CR and those used in context aware (CA; see Figure 1.11) systems. The domain of CA is very broad and is defined by taking into account the context of the system (computer, cell phone, embedded systems, etc.). The application layer sensors in CR and those of CA can be differentiated simply by the radio link. The sensors in CR are exploited to accomplish the main objective of the CR, i.e. to maximize the quality of the information transmitted (e.g. QoS, power consumption, and radiation level).

Let us discuss the following scenario, inspired by a scenario proposed by J. Mitola [MIT 09]. In this scenario, we use the video sensor for face analysis in both CA and CR. A soldier progresses into the enemy's territory in a vehicle, where he is involved in an accident. He is injured and is thrown out of his vehicle with his cognitive personal digital assistant (CPDA). This CPDA, by means of a higher-level energy sensor (a CR sensor), comes to know that the nature of the energy source has changed: the energy provided by the vehicle is no longer accessible, i.e. it now has to use its own battery. The audio and video sensors (CA sensors) are responsible for recognizing the person handling the CPDA so that it is not used by the enemy. The same video sensor (as a CR sensor) has to recognize the person handling the CPDA in order to optimize the compression of the video transmission by implementing a source coding adapted to the face of the soldier. It is clear from this example that the same sensor (face recognition) can be used both in CA (biometrics) and in CR (source coding).

In the next section, we identify the sensors that can potentially be used in CR and illustrate them using different scenarios. Section 3.3.3 discusses the video sensor used to adapt the video compression mode according to the radio link constraints or to influence the transmission system parameters. Hereafter, a mobile generally refers to a cell phone.

3.3.2. Potential sensors

All sensors that can be used to achieve the main goal of improving the radio link are intelligent sensors of CR, as we saw in the previous scenario. A lot of futuristic scenarios that extend far beyond the scope of these sensors can be envisioned:

– A video sensor can be used to determine whether the terminal is located inside or outside a building, as proposed in [PAL 07, PAL 09b]. This information may have an impact on the transmission characteristics.

– The GPS of the cell phone can be used to detect the nearest antenna and to transmit only in its direction in order to minimize the electromagnetic pollution of the user's brain.

– The high-level information of the receiver can be accessed at the transmitter level. For example, it is possible at the transmitter level to know the signal restoration elements as well as the resolution and refresh rate of the receiver screen.


– The cell phone may be able to know whether the user is in a car or not. In this case, the car operates as a Faraday cage, and a high transmission level is needed to reach the base station. One strategy can be to reduce the bit rate and to prohibit, for example, video communication, owing to the fact that it is not desirable to look at a screen while driving.

REMARK.– In the latter scenario, the first strategy would be to remember the good rules for using a cell phone terminal, i.e. using the hands-free options of the equipment while being connected to an antenna exterior to the vehicle.

The video sensor is a privileged sensor of the higher layer in CR because of its non-invasive qualities and its flexibility of use. Its most natural exploitation is to assist the image source coder so that it can adapt to the available bandwidth. The following scenario illustrates a very classical situation in CR. A user gets a video clip and sends it to a friend, who views it. The management system (e.g. hierarchical and distributed cognitive radio architecture management (HDCRAM), explained in Chapter 5) of Figure 3.12 must adapt the radio equipment to furnish the best transmission quality taking into account the available resources. Depending on the information coming from the different sensors and on the decision-making algorithms, the HDCRAM defines the optimal configuration to provide the best service.

Figure 3.12. Hierarchical and distributed cognitive radio architecture management (HDCRAM): block diagram in which the HDCRAM links the video and audio codecs to the hardware, bandwidth, video, and audio sensors

In audio coding, current cell phone equipment analyzes the received signal to detect the possible presence of a voice message. For example, in MPEG4 compression, if a voice signal is not detected, it uses a generic audio coder such as


the transform-domain-weighted interleaved vector quantization (TwinVQ) coder developed by Nippon Telegraph and Telephone Corporation (NTT). In the opposite case, a coder adapted to voice signal compression, such as code-excited linear prediction (CELP) [SCH 85], can be used to improve the quality of the transmitted signal. In this way, for the same bit rate, the subjective quality of the compressed audio signal is much better than if it had been compressed by the generic coder.
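The coder switch described above can be sketched with a crude voice activity test based on frame energy and zero-crossing rate (voiced speech is typically energetic with few zero crossings). The thresholds and test signals are illustrative assumptions, not those of a real voice activity detector:

```python
import numpy as np

def select_coder(frame, energy_thr=0.01, zcr_thr=0.25):
    """Pick a speech coder (CELP) or a generic coder (TwinVQ) for one frame."""
    energy = np.mean(frame ** 2)
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2  # zero-crossing rate
    is_voice = energy > energy_thr and zcr < zcr_thr
    return "CELP" if is_voice else "TwinVQ"

t = np.arange(160) / 8000                       # one 20 ms frame at 8 kHz
voiced = 0.5 * np.sin(2 * np.pi * 200 * t)      # voiced-like 200 Hz tone
noise = np.random.default_rng(1).normal(scale=0.2, size=160)
print(select_coder(voiced), select_coder(noise))  # CELP TwinVQ
```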

In video coding, the same strategy can be applied: a generic coder, H.264 (MPEG4-AVC), or a face coder is used based on the video signal contents. It is to be noted that the face coder synthesizes the user's face starting from a model: this operation is costly in terms of computation time and cannot be performed in real time except by a graphics processing unit (GPU). We will see that video and audio sensors provide valuable information for the video codec, making it possible to improve or degrade the coding quality of certain regions of the image according to their importance.

Figure 3.13. Decision tree: the successive tests are face detected?, orientation within [−45°, +45°]?, GPU available?, face recognized?, and voice detected?; every negative answer to one of the first three tests leads to the H264 codec, while the remaining branches lead to the generic face codec, the codec adapted to the face, or the codec adapted to the face with emphasis on the mouth

A decision tree is shown in Figure 3.13; it is part of the decision algorithms presented in Chapter 4. When this tree is built (hence perfectly known) and the


decision space is reduced (as is the case in Figure 3.13), then decisions are evident and easy to make. It is, therefore, a very effective decision-making algorithm. The decisions provided by this algorithm enable the manager to specify the optimal configuration of the video codec taking into account the information provided by the different sensors.

For the decision tree in Figure 3.13, the ovals represent information coming from the different sensors, whereas the leaves give the video codec configuration. The first sensor informs the HDCRAM of the possible presence of a face. If a face is not detected, a generic codec (H264) is applied to the entire image. In the opposite case, the orientation of the face is taken into account. If the person does not really look at the camera (e.g. a face in profile), then we consider that the most relevant information for transmission is not in the face itself: the H264 codec is then used.

In the contrary case, if there is no GPU available, then the H264 coder is used as the face codec. If a GPU exists, it is interesting to know whether the detected face is known to the system or not. If it is not known, then a generic version of the face codec is used: the face is modeled on-line and the transmission bit rate varies throughout the communication. If the face is known, then its model is available and will be transferred through the manager to the receiver and the video codec. Only the high-level parameters, which require little bandwidth and allow the receiver to reconstruct the face, will be transmitted. The required bandwidth is thus very heavily reduced.

The audio sensor is also used: a detected voice message indicates that the image zone corresponding to the mouth is important and must be enhanced compared with those relative to the eyes, the skin, and obviously the image background. For such an image, the background must be highly compressed since it contains little relevant information intended to be communicated. This can be accomplished using the audio-video objects (AVOs) of MPEG4, which give the option of compressing the objects that constitute the scene with different ratios. If a voice message is not detected, it means that the most important information is contained in the eyes, and the image zone of the eyes must be enhanced with respect to the other characteristics of the face (nose, mouth, and skin).
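The decision tree of Figure 3.13 can be transcribed directly as nested rules. The sensor inputs are the booleans and angle supplied by the HDCRAM; the configuration names returned are illustrative labels, not real codec identifiers:

```python
def video_codec_config(face, orientation_deg, gpu, face_known, voice):
    """Map the five sensor outputs of Figure 3.13 to a codec configuration."""
    if not face:
        return "H264"                       # no face: generic codec
    if not -45 <= orientation_deg <= 45:
        return "H264"                       # face not looking at the camera
    if not gpu:
        return "H264"                       # the face codec needs a GPU
    if not face_known:
        return "generic face codec (model built on-line)"
    if voice:
        return "face codec, emphasis on the mouth"
    return "face codec, emphasis on the eyes"

print(video_codec_config(True, 10, True, True, True))
# -> face codec, emphasis on the mouth
```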

In this scenario, the image codec sends image information through the HDCRAM. For example, it specifies the throughput that it produces and the data that are important to protect. The information is conveyed through the HDCRAM to the decision algorithms, which decide which type of modulation is optimum (e.g. GSM or 802.11g) and which configurations (channel coding and error-correcting code) must be implemented. In addition to the information that the system obtains through the sensors of the radio equipment, such as channel estimation, broadcast quality, estimation of available energy, and recognition of the available standards, it sends specific information to the video codec such as the available bit rates or the resolutions and image display frequencies of the final receiver. For example, the equipment can identify GSM as the only means available to


transmit the audio-video message and choose a low-resolution profile for images because the receiver will display the message on a cell phone. Thus, from the information collected, the HDCRAM sends the pertinent configuration parameters to the audio and video codecs.

The following inputs of Figure 3.13 permit the HDCRAM to parameterize the video codec:

– GPU detection;

– voice message detection;

– face detection;

– face recognition;

– estimation of face orientation.

The first sensor is trivial. The second is currently incorporated into commercial equipment. For the third sensor, we use a current face detector [VIO 04]. The market of face analysis itself is extremely active (Canon was a pioneer in the development of cameras with automatic face detection; Sony, in 2007, released a camera that triggers when the person to be photographed smiles). Currently, there exist no obvious solutions for the last two sensors.

Face recognition can be performed in real time in an efficient way if the position of the face is fixed, which is not the case in our application context. The user freely takes a picture of a person who is not centered, and whose face may appear small or large. Under these difficult conditions, the recognition rate of identification algorithms falls dramatically. To circumvent this problem, precise detection of the face orientation as well as the localization of its features (eyes, nose, and mouth) is necessary. When this information is available, it is possible to synthesize the analyzed face in a standardized form (frontal view, properly centered, and normalized in size). The synthesized face is then processed by the face recognition algorithms for identification. Active appearance models (AAMs) [COO 98] have been used for efficient face alignment for the last 10 years. By alignment of faces, we mean the capability to detect the pose of a face and to locate a set of key points on the eyes, nose, and mouth. In order to improve the robustness of the conventional AAM of [COO 98] under real conditions of use, new optimization algorithms have been proposed [SEG 09, SAT 09]. Work on face alignment therefore allows us to use the two sensors, "face identification" and "estimation of face orientation", for which no other obvious solutions are available at the moment.

3.3.3. Video sensor and compression

In this section, we illustrate, by means of a real scenario (adaptive compression), the close relationship that may exist between video sensors and compression


algorithms. The video sensor makes it possible to manipulate the codec and transmission system parameters. To compress video data, it is possible to use JPEG2000 coding due to its scalability (spatial and temporal) and its ability to compress images independently. The latter point is very important for CR because it must have the option to change the quality of transmitted images instantaneously when the HDCRAM makes a request; there must not be a wait of 15 images, as for example in MPEG2, and even more in MPEG4 (every 15 images, MPEG2 encodes an image independently of the video stream; the other images result from an interpolation or a prediction with adjacent images). JPEG2000 provides the choice to encode multiple regions of interest (ROIs) with different compression ratios. In our application context, these regions are linked to the different objects that constitute a face, besides the image background. It is to be noted that the notion of ROI also exists in MPEG4 part 2.

The AAMs define four ROIs in video communication, as illustrated in Figure 3.14. The image background constitutes the first region of interest (ROI-1) and is compressed very heavily. Usually, the most important information to be transmitted comes from the mouth region (ROI-4), which will be slightly compressed, while the region of the eyes (ROI-3) will be a little more compressed; the region of the skin (ROI-2) will be slightly less compressed than the image background. Obviously, these compression ratios vary according to the decisions taken by the decision algorithms (see Figure 3.13).

Figure 3.14. Different regions of interest detected by the active appearance model (ROI-1: background, ROI-2: skin, ROI-3: eyes, ROI-4: mouth)

After explaining the AAMs in section 3.3.3.1, the scenario envisioned by the Signal Communication and Embedded Electronics (SCEE) team [NAF 07] to illustrate the interest of CR in transmission rate optimization is presented in section 3.3.3.2.


3.3.3.1. Active appearance models

The AAMs produce a face model from a database of sample faces. The shape of the face is denoted by a vector s that consists of the coordinates of the points characterizing it. Its texture is denoted by a vector g representing the values of the pixels of the image inside the shape defined by s. To create the model, a principal component analysis (PCA) is performed on the shape vectors on the one hand, and another PCA on the texture vectors on the other hand.

For image i:

s_i = s̄ + Φ_s b_s
g_i = ḡ + Φ_g b_g        [3.19]

where s_i and g_i are the shape and texture of the face in image i; s̄ and ḡ are the average shape and texture of the faces; Φ_s and Φ_g are the matrices consisting of the eigenvectors of the sample shapes and textures, respectively; and b_s and b_g are the vectors of projection coefficients of the shapes s_i and textures g_i on their respective bases. Applying a third PCA to the concatenated vector

b = [b_s ; b_g]

we get:

b = Φ c        [3.20]

where Φ is the matrix of eigenvectors found by this PCA, and the vector c contains the appearance parameters, i.e. the projection coefficients of the vector b on its own basis vectors. Thus, it is possible to synthesize any face by acting on the appearance parameters, which deform at the same time both the texture and the shape of the synthesized face.
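The three PCAs of equations [3.19] and [3.20] can be sketched with NumPy on random stand-in data (real AAMs use annotated face shapes and warped textures, and often weight the shape coefficients before concatenation; both are omitted here for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
shapes = rng.normal(size=(50, 20))     # 50 samples, 10 landmark (x, y) pairs
textures = rng.normal(size=(50, 100))  # 50 sampled textures

def pca(data, k):
    """Mean and first k eigenvectors (as columns) via SVD of centered data."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:k].T

s_bar, phi_s = pca(shapes, 5)          # shape PCA
g_bar, phi_g = pca(textures, 5)        # texture PCA

b_s = (shapes - s_bar) @ phi_s         # projection coefficients, eq. [3.19]
b_g = (textures - g_bar) @ phi_g
b = np.hstack([b_s, b_g])              # concatenated vector of eq. [3.20]
b_bar, phi = pca(b, 4)                 # third PCA
c = (b - b_bar) @ phi                  # appearance parameters

# synthesis: appearance parameters of sample 0 -> shape
b0 = b_bar + phi @ c[0]
s0 = s_bar + phi_s @ b0[:5]
print(s0.shape)                        # (20,)
```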

To align a face in an image, the appearance parameters of the vector c must be adjusted in order to minimize the error between the segmented image (the texture of the input image inside the shape given by the vector c) and the texture generated by the model (produced by the vector c). For example, to find a face in the image (Figure 3.15a), the detector gives the approximate position of the center of the face (shown with a white rectangle, Figure 3.15b), from which the active model is initialized. After convergence, the shape of the AAM "glues" to the analyzed facial features, i.e. the positions of the eyes, nose, and mouth in the image (Figure 3.15c), and hence the texture of the face (Figure 3.15d) can be given. The face and its characteristic points are therefore correctly positioned.

Adaptive histogram equalization over a set of images is performed to improve robustness against illumination changes. The gray levels in the textures are replaced by the contour orientation at each pixel [GIR 06].


3.3.3.2. A real scenario

The following scenario illustrates the close collaboration that exists between the video sensor and the codec in a cross-layer perspective. The sensor is responsible for analyzing the face and, in particular, detecting the ROIs; the different parts of the face are compressed more or less strongly by the codec depending on their contribution in terms of information. We analyze the impact that such a coding can have on the transmission chain.


Figure 3.15. (a) Original image, (b) initialization of the AAM, (c) and (d) shape and texture after optimization

A person switches on his terminal and starts a video telephone conversation. At the beginning of the communication, the face of this person and the background image are transmitted using a traditional compression that requires a high bit rate. As time evolves, a face model of the transmitting person is evaluated. After having transmitted this model to the receiver, it is simply enough to send the high-level parameters (orientation of the face, opening of the mouth and eyes, direction of gaze, etc.) to reconstruct the image of the person's face. It is consequently possible to significantly reduce the volume of data to be transmitted since the face model (texture, shape of mouth, eyes, etc.) is to be sent only once. It is sufficient for the


following transmissions to send the high-level parameters characterizing the behavior of this face, so as to reconstruct everything as accurately as if a conventional image compressor were used. Furthermore, it is possible to change the bit rate on the fly if a dynamic reconfiguration is considered. Figure 3.16 illustrates the evolution of the bit rate and the reconfigurations over time.

Figure 3.16. Dynamic reconfiguration of the standard according to the video codec (throughput versus stages: transmission of the global compressed image, then of the face model and the background, then of the mouth model and the eyes model, and finally of the parameters and errors of the face, the eyes, and the mouth): in the course of time, as the video codec learns the face model to be transmitted, the required bit rate decreases more and more and therefore requires less and less bandwidth from a standard

The AAMs can be applied to any object. The facial features contained in the various ROIs are modeled. The long-term objective is to analyze each ROI with an AAM. It is therefore necessary to first construct an AAM of the face (stage 1), then an AAM of the mouth (stage 3), and finally an AAM of the eyes (stage 5). Each object (face, eyes, and mouth) is modeled by means of eigenvectors, represented by the matrices Φs, Φg, and Φ of equations [3.19] and [3.20]. The high-level parameters that will be transmitted are the coordinates of the vector c (equation [3.20]), which permit the reconstruction of each of the modeled objects.

Let us consider the texture of Figure 3.15d, which is reconstructed from the parameters c of the AAM whose basis is a set of sample images of the face of a person who is communicating. During each stage of the modeling procedure, a time period is required to compile images of the object to be analyzed and to implement the various AAMs in order to produce the eigenvectors that form our model. For this reason, the entire image is compressed in a classical manner (JPEG2000) at the beginning of the communication and then transmitted (stage 1).
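The parametric reconstruction described above can be sketched as follows. The dimensions, the split of the parameters into separate shape and texture coefficient vectors, and the random stand-ins for the PCA bases are illustrative assumptions; the actual model is defined by equations [3.19] and [3.20] of the chapter.

```python
import numpy as np

# Hypothetical dimensions (not from the chapter): 68 landmarks -> 136
# shape coordinates, 5,000 texture samples, 20 retained modes.
n_shape, n_tex, k = 136, 5000, 20

rng = np.random.default_rng(0)
# In a real AAM, the mean vectors and the orthonormal eigenvector
# matrices Phi_s (shape) and Phi_g (texture) come from a PCA on sample
# images of the speaker; random orthonormal stand-ins are used here.
x_mean = rng.normal(size=n_shape)
g_mean = rng.normal(size=n_tex)
Phi_s = np.linalg.qr(rng.normal(size=(n_shape, k)))[0]
Phi_g = np.linalg.qr(rng.normal(size=(n_tex, k)))[0]

def reconstruct(b_s, b_g):
    """Rebuild shape and texture from the low-dimensional AAM parameters."""
    x = x_mean + Phi_s @ b_s   # landmark coordinates
    g = g_mean + Phi_g @ b_g   # shape-normalized texture
    return x, g

# Only 2*k coefficients cross the radio link instead of the full texture.
b_s, b_g = rng.normal(size=k), rng.normal(size=k)
x, g = reconstruct(b_s, b_g)
print(x.shape, g.shape)   # (136,) (5000,)
```

The volume reduction is precisely why the later stages of the scenario can run over narrowband standards: once the eigenvector matrices have been sent, each frame costs only a few tens of coefficients.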


Once the AAM of the face is available, it is feasible to transmit this model (eigenvectors) and the background image to the receiver (stage 2), and then to send only the model parameters (stage 3). The model of the mouth is computed in the third stage and transmitted in stage 4. The high-level parameters characterizing the behavior of the mouth are then sent, and the modeling of the eyes is performed, in stage 5. The AAM of the eyes is transmitted in stage 6, after which only the appearance parameters are sent. Finally, in the last stage, only the high-level parameters of the different models are transmitted, so that the throughput is significantly reduced compared to that of the first stage. It is thus feasible to transmit a good-quality image throughout the communication. The transmission system is reconfigured over time, passing seamlessly from 802.11 to GSM from the user's point of view.

3.3.3.3. Different stages

A person switches on his terminal and starts a video telephone conversation at t0. The video sensor progressively learns the models of the face, mouth, and eyes. When the learning process is completed, only the relevant high-level parameters of the AAMs are transmitted. The various stages of Figure 3.16 are chained as follows:

– Stage 1 (t0 to t1):
- Source coding: the video source is encoded in a conventional manner. The transmitter learns the 3D model of the face of the person;
- Radio link: high bit rate transmission, OFDM modulation (802.11g) with a standard error correcting code.

– Stage 2 (t1 to t2):
- Source coding: the video codec has completed the face analysis;
- Radio link: the same type of data as in the previous stage, plus the face model and the image background, is transmitted. OFDM modulation (802.11g standard) with a robust error correcting code for the model and the background.

– Stage 3 (t2 to t3):
- Source coding: the codec learns the different shapes and textures of the mouth;
- Radio link: only the parameters characterizing the size, orientation, and shape of the face are sent, so that the receiver can reconstruct the 3D face model on the background image already transmitted. The reconstruction errors between the synthesized image and the image to be transmitted (primarily at the level of the eyes and mouth) are also transmitted to improve the image reconstruction at the receiver. A UMTS-type standard is used with a classical error correcting code.

– Stage 4 (t3 to t4):
- Source coding: the video codec has completed the analysis of the different mouth shapes;
- Radio link: the high-level parameters characterizing the face, as well as the model of the mouth, are transmitted. The UMTS standard is used with an error correcting code that is particularly robust for the AAM of the mouth.

– Stage 5 (t4 to t5):
- Source coding: the codec learns the different configurations of the person's eyes;
- Radio link: the high-level parameters that permit encoding of the face and the mouth, as well as the reconstruction errors (primarily in the eye area), are sent. The UMTS standard is used with a classic error correcting code.

– Stage 6 (t5 to t6):
- Source coding: the video codec has finalized the modeling of the eyes;
- Radio link: the appearance parameters of the AAMs of the face and the eyes are transmitted. The model of the eyes, as well as the reconstruction errors, is sent. UMTS is used with an error correcting code that is particularly robust for the AAM of the eyes.

– Stage 7 (t6 to t7):
- Source coding: the codec has finished learning the different models;
- Radio link: only the appearance parameters of the various AAMs and the reconstruction errors are transmitted, using the GSM standard.
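The schedule above can be summarized as a small reconfiguration table. This is only an illustrative encoding of Figure 3.16: the `Stage` record, the payload descriptions, and the `stage_at` helper are invented for this sketch and paraphrase the text rather than reproduce any actual HDCRAM interface.

```python
from dataclasses import dataclass

@dataclass
class Stage:
    number: int
    payload: str       # what the radio link carries during this stage
    standard: str      # radio standard in use after reconfiguration
    robust_fec: bool   # extra-robust error correcting code for a model?

# The seven stages of the scenario, from full-image coding to
# parameters-only transmission over GSM.
STAGES = [
    Stage(1, "globally compressed image (JPEG2000)",        "802.11g", False),
    Stage(2, "compressed image + face model + background",  "802.11g", True),
    Stage(3, "face parameters + reconstruction errors",     "UMTS",    False),
    Stage(4, "face parameters + mouth model",               "UMTS",    True),
    Stage(5, "face/mouth parameters + errors (eye area)",   "UMTS",    False),
    Stage(6, "face/eye parameters + eye model + errors",    "UMTS",    True),
    Stage(7, "AAM parameters + reconstruction errors only", "GSM",     False),
]

def stage_at(t, boundaries):
    """Return the stage active at time t, given the instants [t1, ..., t7]."""
    for stage, t_end in zip(STAGES, boundaries):
        if t < t_end:
            return stage
    return STAGES[-1]
```

Reading the table row by row makes the cross-layer trade visible: each time the codec finishes learning a model, the payload shrinks and the radio link can fall back to a narrower-band standard.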

In this scenario, it is the video codec that determines the bandwidth, through the HDCRAM, and hence the modulation type (and thus, in this example, the standard) to be implemented on the radio link. However, as exemplified in Figure 3.12, the HDCRAM can also drive the video codec and impose a particular output bandwidth on it. For example, in the final stage (stage 7), the HDCRAM may require an extremely reduced bandwidth, so that the video codec transmits only the high-level parameters of the various models and no information about the error. The image displayed at the receiver (a GSM cell phone) will then be very smooth, but with some reconstruction errors whenever the person exhibits a behavior that was not modeled during the learning phases of the different models. Note that this scenario has to be refreshed when the video conditions change (e.g. the background).
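The bandwidth constraint imposed by the HDCRAM can be sketched as a per-frame budget test. This is a hypothetical decision rule illustrating the behavior described above, not an actual HDCRAM API: the parameter sizes, the function name, and the frame-rate argument are all assumptions.

```python
def build_payload(params_bits, error_bits, imposed_bps, frame_rate):
    """Decide per frame whether the error signal fits the imposed budget.

    The appearance parameters are always sent; the reconstruction
    errors are dropped when the bandwidth imposed by the HDCRAM cannot
    carry them (as in stage 7 over GSM).
    """
    budget = imposed_bps / frame_rate        # bits available per frame
    send_errors = params_bits + error_bits <= budget
    return {"parameters": True, "errors": send_errors}
```

For instance, with a GSM-like budget of 9,600 bit/s at 10 frames/s (960 bits per frame), a 500-bit parameter vector fits but a 20,000-bit error signal does not, which reproduces the smooth-but-approximate image described in the text.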

3.4. Conclusion

In this chapter, we have proposed a classification of sensors based on a three-layer model: higher, intermediate, and lower. First, the conventional sensors of the CR, the most studied in the literature, were described; these belong to the lower layer of the model. The sensors of the physical layer, especially those that propose solutions for detecting the free spaces of the spectrum, were detailed. These "hole" detectors may be refined in the future by taking network density constraints into account. The notion of sensor was then broadened by describing some sensors of the higher and intermediate layers. The sensors of the intermediate layer focus on spectrum analysis to identify the available networks, with significant effort devoted to blind standard recognition. Finally, we described the higher layer sensors, which enable us, among other functions, to compress the audio-visual signal intelligently in order to ensure the best possible reconstruction quality according to the context.

