NEURAL-NETWORK ANALYSIS
& its applications
DATA FILTERS
Saint-Petersburg State University JASS 2006
About me
Name: Alexey Minin
Place of study: Saint-Petersburg State University
Current semester: 7th
Fields of interest: Neural Nets, Data filters for Optics (Holography), Computational Physics, Econophysics
Content:
What is a Neural Net & its applications
Neural Net analysis
Self-organizing Kohonen maps
Data filters
Obtained results
What is a Neural Net & its applications
Recognition of images
Processing of noisy signals
Addition of images
Associative search
Classification
Scheduling
Optimization
Forecasting
Diagnostics
Prediction of risks
What is a Neural Net & its applications
Recognition of images
[Image: example of image recognition — "M-X2 9980"]
What is a Neural Net & its applications
Paradigms of neurocomputing
Neural Net analysis
Connectionism
Localness and parallelism of calculations
Training based on data (instead of programming)
Universality of training algorithms
Neural Net analysis
What is a Neuron?
A typical formal neuron performs the most elementary operation: it weighs the values of its inputs with locally stored weights and applies a nonlinear transformation to their sum:

$$y = f(u), \qquad u = w_0 + \sum_i w_i x_i$$

[Figure: a formal neuron with inputs $x_1, \dots, x_n$, weighted sum $u$, and output $y$]

A neuron applies a nonlinear operation to a linear combination of its inputs.
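Not from the slides, but as a minimal sketch of the formal neuron just defined (assuming the logistic activation function that is mentioned later in the talk):

```python
import numpy as np

def formal_neuron(x, w, w0):
    """Formal neuron: nonlinear transform of a weighted sum of inputs."""
    u = w0 + np.dot(w, x)            # u = w0 + sum_i w_i x_i
    return 1.0 / (1.0 + np.exp(-u))  # logistic activation f(u)

# Example: 3 inputs with arbitrary weights
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.3])
print(formal_neuron(x, w, w0=0.2))
```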
Neural Net analysis
Global connections
Formal neurons
Layers
Connectionism
Neural Net analysis
Localness and parallelism of calculations
Localness of information processing: every neuron reacts only to the information from the neurons connected to it, without appeal to a global plan of calculations.
Parallelism of calculations: neurons are capable of functioning in parallel.
Comparison of ANN & BNN

                     BRAIN               PC (IBM)
Clock frequency      ~100 Hz             ~10^9 Hz
Number of elements   N = 10^10-10^11     N = 10^9
Propagation speed    v = 100 m/s         v = 3*10^8 m/s

The degree of parallelism is ~10^14: the brain works like 10^14 processors with 100 Hz frequency, with ~10^4 connections active at the same time.
Training based on data (instead of programming)
Neural Net analysis
Absence of a global plan: the information is distributed over the network, with corresponding adaptation of the neurons.
The algorithm is not set in advance; it is generated by the data.
Training of a network takes place on a small share of all possible situations; the trained network is then capable of functioning over a much wider range of patterns.
Each neuron locally changes its selected parameters: the synaptic weights.
Training of a network = patterns on which the network is trained + an ability for generalization.
Neural Net analysis
Universality of training algorithms
The single principle of learning is to find the minimum of the empirical error $E(W)$, where $W$ is the set of synaptic weights and $E(W)$ is the error function.
The task is to find the global minimum.
Stochastic optimization is a way not to get stuck in a local minimum.
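A toy sketch of this principle (the error surface and the annealed noise schedule are illustrative assumptions, not the talk's actual optimizer): gradient descent on an empirical error with added decaying noise, so the search has a chance to escape local minima:

```python
import numpy as np

rng = np.random.default_rng(0)

def E(w):
    """Toy empirical-error surface with several local minima."""
    return np.sum(w**2) + np.sum(np.cos(3 * np.pi * w))

def dE(w):
    return 2 * w - 3 * np.pi * np.sin(3 * np.pi * w)

w = rng.normal(size=2)            # random initial synaptic weights
eta, temperature = 0.01, 1.0
for step in range(5000):
    # gradient step plus decaying noise -> chance to escape local minima
    w -= eta * dE(w) + np.sqrt(temperature) * eta * rng.normal(size=2)
    temperature *= 0.999          # anneal the noise away
print("w =", w, "E(w) =", E(w))
```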
Neural Net analysis
BASIC NEURAL NETS
Perceptron
Hopfield network
Kohonen maps
Probabilistic NNets
General regression NN (GRNN)
Polynomial nets
The architecture of NN
Neural Net analysis
LAYERED (level-by-level), WITHOUT FEEDBACK
RECURRENT, WITH FEEDBACK (Elman-Jordan)
These are the prototypes of any neural architecture.
Classification of NN
Neural Net analysis
By type of training:
with tutor (supervised): $E(W) = E\{x, y', y(x, W)\}$
without tutor (unsupervised): $E(W) = E\{x, y(x, W)\}$

In the unsupervised case the network is asked to find the hidden laws in the data file on its own. Redundancy of the data allows compression of the information, and the network can learn to find the most compact representation of such data, i.e. to perform optimal coding of the given kind of input information.
Methodology of self-organizing maps
Self-organizing Kohonen maps are a type of neural network trained without a teacher. The network forms its outputs independently, adapting to the signals arriving at its input. The only "teacher" of the network is the data itself, i.e. the information contained in it: the laws that distinguish the input data from random noise.
Maps combine in themselves two types of information compression:
Lowering the dimension of the data with minimal loss of information
Reducing the variety of the data by allocating a finite set of prototypes and assigning each data point to one of these types
Schematic representation of a self-organizing network
Methodology of self-organizing maps
The neurons in the output layer are ordered and correspond to the cells of a two-dimensional map, which can be colored according to the affinity of attributes.
Hebb training rule
(Hebb, 1949) The change of a weight at the presentation of the i-th example is proportional to the neuron's input and output:

$$\Delta w_j = \eta\, y\, x_j, \qquad \text{in vector form: } \Delta \mathbf{w} = \eta\, y\, \mathbf{x}$$

If training is formulated as an optimization problem, a neuron trained by Hebb's rule aspires to increase the amplitude of its output:

$$E(\mathbf{w}) = -\tfrac{1}{2} \langle y^2 \rangle_{\mathbf{x}}, \qquad \Delta \mathbf{w} = -\eta\, \frac{\partial E}{\partial \mathbf{w}},$$

where the averaging $\langle \cdot \rangle_{\mathbf{x}}$ is taken over the training sample.

Hebbian training in the form described above is not useful in practice, since it leads to unlimited growth of the weight amplitudes.
NB: in this case the error function has no minimum.
Oja training rule

$$\Delta w_j = \eta\, y\,(x_j - y\, w_j), \qquad \text{in vector form: } \Delta \mathbf{w} = \eta\, y\,(\mathbf{x} - y\,\mathbf{w})$$

A term is added to the Hebb rule that prevents the unlimited growth of the weights.

Oja's rule maximizes the sensitivity of the neuron's output at limited amplitude of the weights. It is easy to verify: set the average change of the weights to zero and multiply the right-hand side of the equality by $\mathbf{w}$. In equilibrium

$$\langle y^2 \rangle\,(1 - |\mathbf{w}|^2) = 0,$$

so the weights of the trained neuron lie on the hypersphere $|\mathbf{w}| = 1$.

Under Oja training the weight vector settles on the hypersphere $|\mathbf{w}| = 1$, in the direction maximizing the projection of the input vectors.
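A minimal numerical sketch of Oja's rule (the data and learning rate are illustrative assumptions): trained on correlated inputs, the weight vector ends up with unit norm, pointing along the leading principal direction:

```python
import numpy as np

rng = np.random.default_rng(1)
# Correlated 2-D data; the leading principal direction is ~(1, 1)/sqrt(2)
X = rng.normal(size=(5000, 2)) @ np.array([[1.0, 0.9], [0.9, 1.0]])

w = rng.normal(size=2)
eta = 0.01
for x in X:
    y = w @ x
    # Plain Hebb (dw = eta*y*x) would blow up; the -y*w decay term keeps |w| ~ 1
    w += eta * y * (x - y * w)

print("|w| =", np.linalg.norm(w))   # ~1: the weights land on the unit hypersphere
print("direction:", w / np.linalg.norm(w))
```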
Competition of neurons: the winner takes all

Base algorithm. Each neuron of the competitive layer computes its response to the input $x_1, \dots, x_d$:

$$y_i = \sum_{j=1}^{d} w_{ij}\, x_j$$

Winner:

$$i^{*}: \quad |\mathbf{w}_{i^{*}} - \mathbf{x}| \le |\mathbf{w}_{i} - \mathbf{x}| \quad \forall i,$$

i.e. the winner is the neuron giving the greatest response to the given input stimulus. The outputs are set to $y_{i^{*}} = 1$ and $y_i = 0$ for $i \ne i^{*}$.

Training of the winner (the weights of all other neurons remain constant):

$$\Delta \mathbf{w}_{i^{*}} = \eta\,(\mathbf{x} - \mathbf{w}_{i^{*}})$$

The winner takes all.
One variant of modifying the base training rule of a competitive layer consists in training not only the neuron-winner, but also its "neighbors", though at a smaller rate. This approach of "pulling up" the neurons nearest to the winner is applied in topographic Kohonen maps.

Kohonen's modified training rule:

$$i^{*}: \quad |\mathbf{x} - \mathbf{w}_{i^{*}}| = \min_i |\mathbf{x} - \mathbf{w}_i|$$

$$\mathbf{w}_i(t+1) = \mathbf{w}_i(t) + \eta(t)\,\Lambda\big(|i - i^{*}|, t\big)\,\big(\mathbf{x} - \mathbf{w}_i(t)\big)$$

The neighborhood function $\Lambda$ is equal to one for the neuron-winner $i^{*}$ and gradually falls off with distance from the winner.

Kohonen training resembles stretching an elastic grid of prototypes over the data of the training sample.
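A compact sketch of the modified Kohonen rule above (the Gaussian neighborhood function and the linearly decaying rates are assumptions; the slides leave both unspecified):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.random((2000, 3))                  # three-dimensional data in [0,1]^3

rows, cols = 10, 10                        # two-dimensional map of prototypes
W = rng.random((rows, cols, 3))
grid = np.dstack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"))

n_steps = 5000
for t, x in enumerate(X[rng.integers(0, len(X), n_steps)]):
    frac = t / n_steps
    eta = 0.5 * (1 - frac)                 # decaying learning rate eta(t)
    sigma = max(rows / 2 * (1 - frac), 1)  # shrinking neighborhood radius
    # winner: prototype nearest to x
    d2 = np.sum((W - x) ** 2, axis=2)
    i_star = np.unravel_index(np.argmin(d2), d2.shape)
    # neighborhood Lambda(|i - i*|, t): 1 at the winner, decays with distance
    lam = np.exp(-np.sum((grid - np.array(i_star)) ** 2, axis=2) / (2 * sigma**2))
    W += eta * lam[..., None] * (x - W)    # pull the winner and its neighbors toward x
```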
A two-dimensional topographic map of a set of three-dimensional data: each point of the three-dimensional space falls into the cell of the grid whose neuron (on the two-dimensional map) is nearest to it.

A convenient tool for visualizing data is coloring the topographic map, similarly to how it is done on ordinary geographical maps. Each attribute of the data generates its own coloring of the map cells: by the average value of this attribute over the data that fell into the given cell.
Visualization of a topographic map induced by the i-th component of the input data.
Collecting together the maps of all attributes of interest, we obtain a topographic atlas that gives an integrated representation of the structure of the multivariate data.
Classified SOM for the NASDAQ100 index for the period from 10-Nov-1997 to 27-Aug-2001
Methodology of self-organizing maps
[Plot: Ln Y(t) vs. time] Change in time of the log-price of the stocks of JP Morgan Chase (top curve) and American Express (bottom curve) for the period from 10-Jan-1994 to 27-Oct-1997.
[Plot: Ln Y(t) vs. time] Change in time of the log-price of the stocks of JP Morgan Chase (top curve) and Citigroup (bottom curve) for the period from 10-Nov-1997 to 27-Aug-2001.
How to choose a variant?
Annual prediction
[Plot: annual Caspian Sea level (CSL), years 1988-2038, showing the TEST PREDICTION]
This is the forecast of the (Caspian) sea level.
DATA FILTERS
Custom filters (e.g. Fourier filter)
Adaptive filters (e.g. Kalman filter)
Empirical mode decomposition
Holder exponent
Adaptive filters
In what follows, keep in mind that we are going to make forecasts; that is why we need filters which do not change the phase of the signal.
[Block diagram: a direct-form IIR filter built from unit delays $z^{-1}$, with feed-forward coefficients $b(1), \dots, b(nb{+}1)$ applied to $x(n), \dots, x(n-nb)$ and feedback coefficients $-a(2), \dots, -a(na{+}1)$ applied to $y(n-1), \dots, y(n-na)$]

$$y(n) = b(1)x(n) + b(2)x(n-1) + \dots + b(nb{+}1)x(n-nb) - a(2)y(n-1) - \dots - a(na{+}1)y(n-na)$$
Adaptive filters
All the maxima are preserved; there is no phase distortion.
[Plot: Siemens stock value, adjusted close (scaled), raw vs. filtered]
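The slides do not name the tool used for zero-phase filtering; one standard recipe (what MATLAB's filtfilt does, mirrored here with SciPy) runs an IIR filter forward and then backward over the data, so the phase shifts cancel:

```python
import numpy as np
from scipy.signal import butter, filtfilt

# Synthetic stand-in for a price series (the slides use Siemens adj. close)
t = np.arange(1000)
x = np.cumsum(np.random.default_rng(3).normal(size=t.size))

b, a = butter(4, 0.05)   # low-pass IIR filter: coefficients b(k), a(k) above
y = filtfilt(b, a, x)    # forward-backward run cancels the phase shift
```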
Adaptive filters
Let's try to predict the next value using the zero-phase filter, given the information about the historical price. I used a perceptron with 3 hidden layers, a logistic activation function, the rotation algorithm, and 20 min of training.
Adaptive filters
Kalman filter

$$x(n) = a\,x(n-1) + w(n-1) \quad \text{(model generating the signal, } w(n) \text{ white noise)}$$

$$y(n) = c\,x(n) + n(n) \quad \text{(signal after the neural net, } n(n) \text{ white noise)}$$

The estimate is

$$\hat{x}(n) = a\,\hat{x}(n-1) + K(n)\,\big[\,y(n) - c\,a\,\hat{x}(n-1)\,\big],$$

where $K(n)$ is the Kalman gain.

[Block diagram: $y(n)$ enters through the gain $K(n)$; the estimate $\hat{x}(n)$ is fed back through a unit delay $z^{-1}$, scaled by $a$ in the prediction branch and by $c\,a$ in the correction branch]
Adaptive filters
Let's use the Kalman filter as the error estimator for the forecast of the zero-phase filtered data.
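A scalar Kalman filter matching the model above, as a hedged sketch (the noise variances q and r are assumed known here; in practice they must be estimated):

```python
import numpy as np

def kalman_1d(y, a, c, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x(n) = a*x(n-1) + w(n-1), y(n) = c*x(n) + noise.

    q, r are the variances of the process and observation noise (assumed known).
    """
    x_hat, p = x0, p0
    out = np.empty(len(y))
    for n, yn in enumerate(y):
        # predict
        x_pred = a * x_hat
        p_pred = a * a * p + q
        # correct with gain K(n)
        k = p_pred * c / (c * c * p_pred + r)
        x_hat = x_pred + k * (yn - c * x_pred)
        p = (1 - k * c) * p_pred
        out[n] = x_hat
    return out
```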
Empirical Mode Decomposition
What is it?
We can heuristically define a (local) high-frequency part {d(t), t− ≤ t ≤ t+}, or local detail, which corresponds to the oscillation terminating at the two minima and passing through the maximum which necessarily exists between them. For the picture to be complete, one still has to identify the corresponding (local) low-frequency part m(t), or local trend, so that we have x(t) = m(t) + d(t) for t− ≤ t ≤ t+.
Eventually, the original signal $x(t)$ is first decomposed through the main loop as

$$x(t) = d_1(t) + m_1(t),$$

and the first residual $m_1(t)$ is itself decomposed as

$$m_1(t) = d_2(t) + m_2(t),$$

so that

$$x(t) = d_1(t) + d_2(t) + \dots + d_K(t) + m_K(t).$$
Empirical Mode Decomposition
Algorithm
Given a signal x(t), the effective algorithm of EMD can be summarized as follows:
1. identify all extrema of x(t)
2. interpolate between the minima (resp. maxima), ending up with an envelope e_min(t) (resp. e_max(t))
3. compute the mean m(t) = (e_min(t) + e_max(t))/2
4. extract the detail d(t) = x(t) − m(t)
5. iterate on the residual m(t)
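A rough Python sketch of this sifting loop (the boundary handling, the fixed sift count, and the stopping test are simplified assumptions; real EMD implementations treat all three more carefully):

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def sift_once(x):
    """One sifting step: detail d(t) = x(t) minus the mean of the two envelopes."""
    t = np.arange(len(x))
    mx = argrelextrema(x, np.greater)[0]   # 1. identify maxima ...
    mn = argrelextrema(x, np.less)[0]      #    ... and minima
    if len(mx) < 2 or len(mn) < 2:
        return None, x                     # no oscillation left -> residual
    e_max = CubicSpline(mx, x[mx])(t)      # 2. interpolate the envelopes
    e_min = CubicSpline(mn, x[mn])(t)
    m = (e_min + e_max) / 2                # 3. local trend m(t)
    return x - m, m                        # 4. detail d(t), trend

def emd(x, n_imfs=6, n_sift=8):
    imfs, resid = [], x.astype(float)
    for _ in range(n_imfs):                # 5. iterate on the residual
        d, _ = sift_once(resid)
        if d is None:
            break
        for _ in range(n_sift - 1):        # refine the detail by repeated sifting
            d_next, _ = sift_once(d)
            if d_next is None:
                break
            d = d_next
        imfs.append(d)
        resid = resid - d
    return imfs, resid
```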
[Figure: test signal — panels: tone, chirp, tone + chirp]
[Animation frames: the EMD sifting process on the tone + chirp signal — panels "IMF 1; iteration 0" through "IMF 2; iteration 5", each shown with the current residue]
Empirical Mode Decomposition
[Figure: the resulting decomposition of the tone + chirp signal into imf1-imf6 and the final residual]
Empirical Mode Decomposition
Let's do it for the Siemens index.
[Plots: imf3, imf4 and the residual of the Siemens series over ~1400 samples]
Empirical Mode Decomposition
Let's do it for the Siemens index.
[Plot: the reconstruction imf3 + imf4 + residual vs. the actual series]
All the strong maxima are preserved and there is no phase distortion.
Empirical Mode Decomposition
Let's make a forecast for the Siemens index.
[Plot: forecast vs. actual over ~250 samples]
THERE WAS NO DELAY IN THE FORECAST AT ALL!!!
Holder exponent

The main idea is as follows. Consider a function $f(t)$. Holder derived that

$$|f(t + \Delta t) - f(t)| \le \text{const} \cdot |\Delta t|^{h(t)}, \qquad h(t) \in [0, 1],$$

where $h(t) = 1$ means that the increment behaves as $O(\Delta t)$ (the function is smooth there), and $h(t) = 0$ means that we have a break (a discontinuity).

So this formula is a kind of bridge between "bad" functions and "good" functions. Looking at it more closely, we notice that we can catch the moments in time when our function "knows" that it is going to change its behavior from one regime to another. This means that today we can make a forecast of tomorrow's behavior. One should mention, however, that we do not know the sign of the coming change of behavior.
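A rough sketch of estimating a local Holder exponent numerically (the oscillation-based method and the choice of scales are assumptions on my part; FracLab, listed at the end, provides proper estimators):

```python
import numpy as np

def local_holder(x, i, scales=(1, 2, 4, 8, 16)):
    """Rough local Holder exponent at index i via a log-log fit of the
    oscillation, using |f(t + dt) - f(t)| ~ const * dt**h(t)."""
    osc = []
    for s in scales:
        lo, hi = max(i - s, 0), min(i + s, len(x) - 1)
        osc.append(x[lo:hi + 1].max() - x[lo:hi + 1].min() + 1e-12)
    h, _ = np.polyfit(np.log(scales), np.log(osc), 1)  # slope = h(t)
    return h

# Example on Brownian-like data (theoretical h ~ 0.5)
x = np.cumsum(np.random.default_rng(4).normal(size=4096))
print(local_holder(x, 2048))
```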
Results
Thank You! Any QUESTIONS? SUGGESTIONS? IDEAS?

Software I'm using:
1) MatLab
2) NeuroShell
3) FracLab
4) Statistica
5) Builder C++