
March 1 - 8, 2016

Hale Collage: Spectropolarimetric Diagnostic Techniques

Rebecca Centeno

Today

✤ Simple solutions to the RTE
    ✤ Milne-Eddington approximation
    ✤ LTE solution
✤ The general inversion problem
✤ Spectral line inversion codes
    ✤ Levenberg-Marquardt techniques
    ✤ Principal Component Analysis

Solutions to the RTE: Milne-Eddington Approximation

Let’s assume:

✤ Polarization is due to the Zeeman effect.
✤ The elements of K are constant: K = K0
✤ S = (S0 + S1τ) (1, 0, 0, 0)ᵀ

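Radiative Transfer Equation (numbered equations are taken from the Problem Set by Rebecca Centeno, January 26, 2016):

$$\frac{d\vec{I}}{d\tau_c} = K(\vec{I} - \vec{S}) \tag{1}$$

with the elements of K:

$$\eta_I = 1 + \frac{\eta_0}{2}\left\{\phi_0\sin^2\theta + \frac{1}{2}\left[\phi_{+1} + \phi_{-1}\right](1 + \cos^2\theta)\right\}$$

$$\eta_Q = \frac{\eta_0}{2}\left\{\phi_0 - \frac{1}{2}\left[\phi_{+1} + \phi_{-1}\right]\right\}\sin^2\theta\cos 2\varphi$$

$$\eta_U = \frac{\eta_0}{2}\left\{\phi_0 - \frac{1}{2}\left[\phi_{+1} + \phi_{-1}\right]\right\}\sin^2\theta\sin 2\varphi$$

$$\eta_V = \frac{\eta_0}{2}\left[\phi_{-1} - \phi_{+1}\right]\cos\theta$$

$$\rho_Q = \frac{\eta_0}{2}\left\{\psi_0 - \frac{1}{2}\left[\psi_{+1} + \psi_{-1}\right]\right\}\sin^2\theta\cos 2\varphi$$

$$\rho_U = \frac{\eta_0}{2}\left\{\psi_0 - \frac{1}{2}\left[\psi_{+1} + \psi_{-1}\right]\right\}\sin^2\theta\sin 2\varphi$$

$$\rho_V = \frac{\eta_0}{2}\left[\psi_{-1} - \psi_{+1}\right]\cos\theta$$

$$K = \begin{pmatrix}
\eta_I & \eta_Q & \eta_U & \eta_V \\
\eta_Q & \eta_I & \rho_V & -\rho_U \\
\eta_U & -\rho_V & \eta_I & \rho_Q \\
\eta_V & \rho_U & -\rho_Q & \eta_I
\end{pmatrix}$$

Then: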


$$\vec{I}(0) = \int_0^\infty e^{-K_0\tau_c} K_0\,(\vec{S}_0 + \vec{S}_1\tau_c)\,d\tau_c = \vec{S}_0 + K_0^{-1}\vec{S}_1 \tag{40}$$

is the Unno-Rachkovsky solution to the RTE, and it is analytical in nature!
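The closed form in Eq. (40) follows from two elementary integrals. The short derivation below is a sketch that assumes all eigenvalues of K0 have a positive real part, so that the exponential vanishes at large optical depth; that condition is an assumption of this note and is not stated on the slide.

$$
\int_0^\infty e^{-K_0\tau_c}K_0\,d\tau_c = \left[-e^{-K_0\tau_c}\right]_0^\infty = \mathbb{1},
\qquad
\int_0^\infty e^{-K_0\tau_c}K_0\,\tau_c\,d\tau_c
= \left[-\tau_c\,e^{-K_0\tau_c}\right]_0^\infty + \int_0^\infty e^{-K_0\tau_c}\,d\tau_c
= K_0^{-1}
$$

Multiplying the first integral by the constant vector S0 and the second by S1 gives I(0) = S0 + K0⁻¹ S1.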

$$I(0) = S_0 + \Delta^{-1}\eta_I\left(\eta_I^2 + \rho_Q^2 + \rho_U^2 + \rho_V^2\right)S_1 \tag{43}$$

$$Q(0) = -\Delta^{-1}\left[\eta_I^2\eta_Q + \eta_I(\eta_V\rho_U - \eta_U\rho_V) + \rho_Q(\eta_Q\rho_Q + \eta_U\rho_U + \eta_V\rho_V)\right]S_1 \tag{44}$$

$$U(0) = -\Delta^{-1}\left[\eta_I^2\eta_U + \eta_I(\eta_Q\rho_V - \eta_V\rho_Q) + \rho_U(\eta_Q\rho_Q + \eta_U\rho_U + \eta_V\rho_V)\right]S_1 \tag{45}$$

$$V(0) = -\Delta^{-1}\left[\eta_I^2\eta_V + \eta_I(\eta_U\rho_Q - \eta_Q\rho_U) + \rho_V(\eta_Q\rho_Q + \eta_U\rho_U + \eta_V\rho_V)\right]S_1 \tag{46}$$

with:

$$\Delta = \eta_I^2\left(\eta_I^2 - \eta_Q^2 - \eta_U^2 - \eta_V^2 + \rho_Q^2 + \rho_U^2 + \rho_V^2\right) - \left(\eta_Q\rho_Q + \eta_U\rho_U + \eta_V\rho_V\right)^2 \tag{48}$$
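As a cross-check of Eqs. (43)-(48), the sketch below evaluates the analytical Stokes vector and compares it with a direct numerical solution of S0 + K0⁻¹ S1. The function names and the numerical values of η, ρ, S0 and S1 are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def absorption_matrix(eta, rho):
    """Build K from (eta_I, eta_Q, eta_U, eta_V) and (rho_Q, rho_U, rho_V)."""
    eI, eQ, eU, eV = eta
    rQ, rU, rV = rho
    return np.array([
        [eI,  eQ,  eU,  eV],
        [eQ,  eI,  rV, -rU],
        [eU, -rV,  eI,  rQ],
        [eV,  rU, -rQ,  eI],
    ])

def unno_rachkovsky(eta, rho, S0, S1):
    """Analytical emergent Stokes vector, Eqs. (43)-(48)."""
    eI, eQ, eU, eV = eta
    rQ, rU, rV = rho
    pi = eQ * rQ + eU * rU + eV * rV
    delta = eI**2 * (eI**2 - eQ**2 - eU**2 - eV**2
                     + rQ**2 + rU**2 + rV**2) - pi**2
    I = S0 + eI * (eI**2 + rQ**2 + rU**2 + rV**2) * S1 / delta
    Q = -(eI**2 * eQ + eI * (eV * rU - eU * rV) + rQ * pi) * S1 / delta
    U = -(eI**2 * eU + eI * (eQ * rV - eV * rQ) + rU * pi) * S1 / delta
    V = -(eI**2 * eV + eI * (eU * rQ - eQ * rU) + rV * pi) * S1 / delta
    return np.array([I, Q, U, V])

# Hypothetical values at a single wavelength (illustration only).
eta = (2.0, 0.3, -0.2, 0.4)
rho = (0.1, -0.15, 0.25)
S0, S1 = 0.4, 0.6

stokes = unno_rachkovsky(eta, rho, S0, S1)

# Direct evaluation of S0 + K^-1 S1 with S along (1, 0, 0, 0).
e0 = np.array([1.0, 0.0, 0.0, 0.0])
direct = S0 * e0 + S1 * np.linalg.solve(absorption_matrix(eta, rho), e0)
print(stokes, direct)
```

The two vectors should agree to rounding error; a mismatch would point to a sign or index slip in one of the four expressions.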

Solutions to the RTE: Milne-Eddington Approximation

Model parameters:

✤ Line-to-continuum absorption: η0
✤ Doppler width: ∆λD
✤ Damping parameter: a
✤ Magnetic field: B, θ, φ
✤ Source function: S0, S1
✤ LOS velocity: vLOS
✤ Magnetic filling factor: α
✤ Macroturbulent velocity: vMAC

Limitations:

✤ No velocity gradients, so no asymmetries: NCP = 0
✤ No thermodynamical information
✤ No magnetic field gradients

If you want to synthesize spectral lines in a Milne-Eddington atmosphere:

http://www.iac.es/proyecto/inversion/online/milne_code/milne.php

Solutions to the RTE: Local Thermodynamical Equilibrium

LTE hypothesis: the plasma is in thermodynamic equilibrium at local values of temperature and density. Hence:

✤ Maxwellian distribution of velocities
✤ Saha and Boltzmann equations give the populations of the different atomic species
✤ Kirchhoff’s Law applies: j = Bν(T) (ηI, ηQ, ηU, ηV)ᵀ
    Absorption profiles have the same shape as emission profiles ⇒ complete redistribution of frequencies
✤ Also assumes hydrostatic equilibrium and Zeeman-induced polarization.

And the RTE still looks like this:

$$\frac{d}{dz}\begin{pmatrix} I \\ Q \\ U \\ V \end{pmatrix} = -
\begin{pmatrix}
\eta_I & \eta_Q & \eta_U & \eta_V \\
\eta_Q & \eta_I & \rho_V & -\rho_U \\
\eta_U & -\rho_V & \eta_I & \rho_Q \\
\eta_V & \rho_U & -\rho_Q & \eta_I
\end{pmatrix}
\begin{pmatrix} I - B_\nu(T) \\ Q \\ U \\ V \end{pmatrix}$$

Solutions to the RTE: Local Thermodynamical Equilibrium

(from Fontenla et al. 1999)

Model atmosphere:

✤ Temperature: T(z)
✤ Pressure: P(z)
✤ LOS velocity: vLOS(z)
✤ Magnetic field: B(z), θ(z), φ(z)
✤ Macroturbulent velocity: vMAC
✤ Microturbulent velocity: vMIC

✤ 9 physical quantities
✤ Stratified atmosphere: 9 x Nz free model parameters
✤ Too many free parameters!
✤ Need to constrain stratification to a limited number of nodes
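To illustrate what "a limited number of nodes" buys, the sketch below rebuilds a full stratification from a few node values by linear interpolation and compares the parameter counts. The depth grid, the node positions and all numerical values are hypothetical, and linear interpolation is just one possible choice.

```python
import numpy as np

def stratify_from_nodes(node_depths, node_values, depth_grid):
    """Linearly interpolate a few node values onto the full depth grid."""
    return np.interp(depth_grid, node_depths, node_values)

# Hypothetical depth grid with Nz = 51 points.
depth_grid = np.linspace(-4.0, 1.0, 51)

# Hypothetical setup: 3 nodes for temperature, 1 node (constant) for field strength.
temperature = stratify_from_nodes([-4.0, -1.5, 1.0],
                                  [4500.0, 5200.0, 8000.0], depth_grid)
field_strength = np.full(depth_grid.size, 1200.0)

n_full = 9 * depth_grid.size   # 9 quantities free at every depth point
n_nodes = 3 + 1                # parameters actually fitted in this toy setup
print(n_full, n_nodes)
```

With nodes, the inversion perturbs only the node values; the full stratification is regenerated from them before each synthesis.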

Spectral Line Inversions

Forward modeling:

Reasonable assumptions (LTE, Milne-Eddington, Zeeman...) + set of physical parameters (T, P, ρ, vLOS, B...) → Solve Radiative Transfer Eq. → Stokes profiles (I, Q, U, V)

Inverse problem:

Given a set of Stokes profiles (data!!), what are the physical conditions in the atmosphere?

Spectral Line Inversions

[Figure: Fig. 5 of Viticchié et al. (2011), "Examples of full Stokes MISMA inversion of HINODE SOT/SP Stokes profiles."]

Open circles: observations
Solid line: synthetic Stokes profiles

fitting metric + educated guess

Spectral Line Inversions: General Problem

ATMOSPHERIC MODEL + EXTERNAL FIELDS → solve STATISTICAL EQUILIBRIUM → atomic populations → solve RADIATIVE TRANSFER → radiation output → compare with OBSERVATIONS → feedback

Spectral Line Inversions: Local Thermodynamic Equilibrium

ATMOSPHERIC MODEL + EXTERNAL FIELDS → atomic excitation → solve RADIATIVE TRANSFER → radiation output → compare with OBSERVATIONS → feedback

Spectral Line Inversions: Milne-Eddington Approximation

ATMOSPHERIC MODEL + EXTERNAL FIELDS → solve RADIATIVE TRANSFER → radiation output → compare with OBSERVATIONS → feedback

Spectral Line Inversion Methods

Blind trial and error?!

Let’s assume we know how to solve the RTE.

Inversion methods:

✤ Levenberg-Marquardt methods (least squares fitting)
✤ Principal Component Analysis techniques (pattern recognition)

Spectral Line Inversions: the merit function

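Let’s assume our model atmosphere is characterized by a series of Np parameters, a.

The solution to the RTE in the model a gives us a set of synthetic Stokes profiles, which we can compare to the observed ones. We can measure the difference using a merit function:

$$\chi^2 = \frac{1}{N_f}\sum_s\sum_\lambda\left[I_s^{obs}(\lambda) - I_s^{syn}(\lambda)\right]^2\omega_s^2 \tag{16}$$

where the number of degrees of freedom is Nf = Ns x Nλ - Np, Isyn and Iobs are the synthetic and observed Stokes profiles, and ωs are weighting factors (related to the measurement error). The sums are over wavelength and Stokes parameters.

For reference, the Problem Set also lists the following relations (Eqs. 4-15):

$$K' = \begin{pmatrix}
\eta_I & \eta_Q & \eta_U & -\eta_V \\
\eta_Q & \eta_I & \rho_V & \rho_U \\
\eta_U & -\rho_V & \eta_I & -\rho_Q \\
-\eta_V & -\rho_U & \rho_Q & \eta_I
\end{pmatrix}$$

$$\frac{d}{d\tau_c}\vec{I}(\lambda - \lambda_0) = K\left(\vec{I}(\lambda - \lambda_0) - \vec{S}\right) \tag{4}$$

$$\frac{d}{d\tau_c}\vec{I}(\lambda_0 - \lambda) = K'\left(\vec{I}(\lambda_0 - \lambda) - \vec{S}\right) \tag{5}$$

$$I(\lambda_0 - \lambda) = I(\lambda - \lambda_0) \tag{6}$$

$$Q(\lambda_0 - \lambda) = Q(\lambda - \lambda_0) \tag{7}$$

$$U(\lambda_0 - \lambda) = U(\lambda - \lambda_0) \tag{8}$$

$$V(\lambda_0 - \lambda) = -V(\lambda - \lambda_0) \tag{9}$$

$$NCP = \int_W V_{obs}(\lambda)\,d\lambda \tag{10}$$

$$\phi_B = \frac{\int_A |B_z|\,dA}{\int_A dA} \tag{11}$$

$$V(\lambda) = -C\,g_{eff}\,B\cos\theta\,\frac{\partial I(\lambda)}{\partial\lambda} \tag{12}$$

$$g_{eff} = \frac{1}{2}(g_u + g_l) + \frac{1}{4}(g_u - g_l)\left[j_u(j_u + 1) - j_l(j_l + 1)\right] \tag{13}$$

$$\phi_\alpha = \frac{1}{\sqrt{\pi}}H(u_0 + \alpha g_{eff}u_B - u_{LOS}, a) \tag{14}$$

$$\psi_\alpha = \frac{1}{\sqrt{\pi}}F(u_0 + \alpha g_{eff}u_B - u_{LOS}, a) \tag{15}$$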
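The merit function of Eq. (16) can be sketched in a few lines. The array layout (one row per Stokes parameter), the function name and the toy numbers below are assumptions made for illustration.

```python
import numpy as np

def merit_function(obs, syn, weights, n_params):
    """Weighted chi^2 of Eq. (16).

    obs, syn : arrays of shape (Ns, Nlambda), one row per Stokes parameter
    weights  : array of shape (Ns,), one weight per Stokes parameter
    n_params : number of free model parameters Np
    """
    obs = np.asarray(obs, dtype=float)
    syn = np.asarray(syn, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n_free = obs.size - n_params            # Nf = Ns * Nlambda - Np
    residual = obs - syn
    return np.sum((weights[:, None] * residual) ** 2) / n_free

# Toy data: 4 Stokes parameters, 5 wavelength points, 10 free parameters.
obs = np.zeros((4, 5))
syn = np.zeros((4, 5))
syn[0] += 0.1                               # only Stokes I is mismatched
chi2 = merit_function(obs, syn, weights=np.ones(4), n_params=10)
print(chi2)
```

In practice the weights are often chosen larger for Q, U and V than for I, so that the much weaker polarization signals still influence the fit.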

Spectral Line Inversions: the merit function

χ2 is a hyper-surface of Np dimensions.

It quantifies the goodness of the fit (the distance between the observed and synthetic Stokes vectors) with one number!

The whole inversion problem boils down to minimizing χ2.


Spectral Line Inversions: Levenberg-Marquardt Techniques

The problem boils down to the minimization of χ2.


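The first derivative is given by:

$$\frac{\partial\chi^2}{\partial a_i} = \frac{2}{N_f}\sum_s\sum_\lambda\left[I_s^{syn}(\lambda) - I_s^{obs}(\lambda)\right]\omega_s^2\,\frac{\partial I_s^{syn}(\lambda)}{\partial a_i} \qquad (i = 1, \ldots, N_p) \tag{17}$$

And the second derivative of χ2 is given by:

$$\frac{\partial^2\chi^2}{\partial a_i\partial a_j} = \frac{2}{N_f}\sum_s\sum_\lambda\omega_s^2\left[\frac{\partial I_s^{syn}(\lambda)}{\partial a_j}\frac{\partial I_s^{syn}(\lambda)}{\partial a_i} + \left(I_s^{syn}(\lambda) - I_s^{obs}(\lambda)\right)\frac{\partial^2 I_s^{syn}(\lambda)}{\partial a_i\partial a_j}\right] \tag{18}$$

When close to the minimum of χ2, we can expect [Isyn - Iobs] ≃ 0, so:

$$\frac{\partial^2\chi^2}{\partial a_i\partial a_j} \approx \frac{2}{N_f}\sum_s\sum_\lambda\omega_s^2\left[\frac{\partial I_s^{syn}(\lambda)}{\partial a_j}\frac{\partial I_s^{syn}(\lambda)}{\partial a_i}\right] \tag{19}$$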


Let’s assume the model a is close to the minimum of χ2, so there is a perturbation δa that takes us directly to the minimum. We can use a quadratic approximation, such that:

$$\chi^2(a + \delta a) \approx \chi^2(a) + \delta a^T\left(\nabla\chi^2 + H'\delta a\right) \tag{20}$$

where H′ is half the Hessian matrix (dimensions Np x Np):

$$H'_{ij} = \frac{1}{2}\frac{\partial^2\chi^2}{\partial a_i\partial a_j} \tag{21}$$

[Figure labels: a_min, a_current]

When one is really close to the minimum, the second-order approximation is adequate, and we can set the term in parentheses to zero:

$$\left(\nabla\chi^2 + H'\delta a\right) = 0 \tag{22}$$

Thus we obtain a better approximation to the minimum of χ2 by shifting in parameter space by an amount δa:

$$\delta a = -H'^{-1}\nabla\chi^2 \tag{23}$$


This involves inverting the Hessian matrix!

When we’re far from the minimum, we can get closer to it by following the gradient downhill (first-order approximation):

$$\delta a = -k\nabla\chi^2 \tag{24}$$

with k > 0 small enough.


Marquardt had two insights:

✤ The diagonal elements of the Hessian matrix give us a sense of what good values for k could be (it’s a dimensional argument):

$$\delta a_i = -\frac{1}{H'_{ii}}\left(\nabla\chi^2\right)_i \tag{25}$$

with 1/H′ii playing the role of k. Introducing a fudge factor λ:

$$\delta a_i = -\frac{1}{\lambda H'_{ii}}\left(\nabla\chi^2\right)_i \tag{26}$$

✤ The two methods can be combined into one equation that allows us to vary smoothly between the gradient and the Hessian approaches:

$$\nabla\chi^2 + H\delta a = 0 \tag{27}$$

where

$$H_{ij} \equiv \begin{cases}(1 + \lambda)H'_{ij} & \text{if } i = j \\ H'_{ij} & \text{if } i \neq j\end{cases} \tag{28}$$

λ↑↑ ⇒ gradient (first order) method
λ↓↓ ⇒ Hessian (second order) method


Evaluate χ2(aini) for the initial guess model

Take a modest value of λ (λ = 10⁻³)

Solve Eq. (27) for δa

Evaluate χ2(a+δa)

If χ2(a+δa) ≥ χ2(a)
    → Do not update a
    → Increase λ: (λnew = λ*10)

If χ2(a+δa) < χ2(a)
    → Decrease λ: (λnew = λ/10)
    → Update a: anew = a+δa

Go back to solving for δa. Stop when χ2 barely decreases once or twice in a row.

This algorithm is explained in detail in “Numerical Recipes” by Press et al.
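The loop above can be sketched as follows. This is a toy illustration, not the code of any particular inversion package: the forward model is a hypothetical Gaussian absorption line standing in for an RTE solver, the derivatives are taken by finite differences, and the step is computed in the Numerical Recipes convention (curvature matrix α built from products of first derivatives, right-hand side β built from the residuals), which is an assumption of this sketch.

```python
import numpy as np

def synth(a, wav):
    """Toy forward model (stand-in for an RTE solver): Gaussian absorption line."""
    continuum, depth, center, width = a
    return continuum - depth * np.exp(-0.5 * ((wav - center) / width) ** 2)

def chi2(a, wav, obs, weight):
    n_free = wav.size - a.size
    return np.sum((weight * (obs - synth(a, wav))) ** 2) / n_free

def jacobian(a, wav, step=1e-6):
    """Forward-difference derivatives of the profile, shape (Nlambda, Np)."""
    base = synth(a, wav)
    jac = np.empty((wav.size, a.size))
    for i in range(a.size):
        shifted = a.copy()
        shifted[i] += step
        jac[:, i] = (synth(shifted, wav) - base) / step
    return jac

def levenberg_marquardt(a_ini, wav, obs, weight, lam=1e-3, max_iter=200, tol=1e-10):
    a = np.array(a_ini, dtype=float)
    current = chi2(a, wav, obs, weight)
    stalled = 0
    for _ in range(max_iter):
        jac = jacobian(a, wav)
        residual = obs - synth(a, wav)
        alpha = jac.T @ (weight[:, None] ** 2 * jac)    # approximate curvature
        beta = jac.T @ (weight ** 2 * residual)         # downhill direction
        damped = alpha + lam * np.diag(np.diag(alpha))  # (1 + lam) on the diagonal
        delta = np.linalg.solve(damped, beta)
        trial = chi2(a + delta, wav, obs, weight)
        if trial < current:
            # Accept the step, relax the damping, track tiny improvements.
            stalled = stalled + 1 if current - trial < tol * current else 0
            a, current, lam = a + delta, trial, lam / 10.0
            if stalled >= 2:
                break
        else:
            lam *= 10.0                                 # reject and damp more
    return a, current

wav = np.linspace(-2.0, 2.0, 81)
truth = np.array([1.0, 0.6, 0.1, 0.3])
obs = synth(truth, wav)                                 # noise-free "observation"
weight = np.ones(wav.size)
fit, final_chi2 = levenberg_marquardt([0.9, 0.4, 0.0, 0.4], wav, obs, weight)
print(fit, final_chi2)
```

Starting the same fit from a guess far from the true parameters is a quick way to see the sensitivity to the initial model.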


white = observations red = synthetic fit



Levenberg-Marquardt Techniques: Issues

H can be quasi-singular due to the different sensitivity of χ2 to the various model parameters. But it has to be inverted!

Singular Value Decomposition methods:

H is real and symmetric ⇒ ∃ Y such that: H = Yᵀ W Y and Y Yᵀ = Yᵀ Y = 1

So H⁻¹ = Yᵀ W⁻¹ Y

If Wk ↓↓ ⇒ we set 1/Wk = 0, so ak does not contribute to the model perturbation.
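A minimal sketch of this truncated inverse, using NumPy's SVD, is shown below. The relative threshold `rel_tol` and the 2 x 2 test matrix are arbitrary choices made for illustration.

```python
import numpy as np

def truncated_inverse(H, rel_tol=1e-8):
    """Pseudo-inverse of H in which 1/W_k is set to 0 for tiny singular values."""
    U, w, Vt = np.linalg.svd(H)
    inv_w = np.zeros_like(w)
    keep = w > rel_tol * w.max()
    inv_w[keep] = 1.0 / w[keep]
    return Vt.T @ np.diag(inv_w) @ U.T

# Hypothetical quasi-singular matrix: one direction is almost unconstrained.
theta = 0.3
Y = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
H = Y.T @ np.diag([2.0, 1e-14]) @ Y

H_inv = truncated_inverse(H)
gradient = np.array([1.0, 1.0])
delta_a = -H_inv @ gradient       # step has no component along the weak direction
print(delta_a, Y[1] @ delta_a)
```

Without the truncation, the factor 1/W_k for the weak direction would be of order 10¹⁴ and would dominate the step.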

Global vs. local minima of χ2:

Levenberg-Marquardt techniques can lead to local (rather than global) minima, depending on the location of the initial guess.

[Figure: χ2 “surface”]

Principal Component Analysis (PCA) Techniques

four Stokes parameters, I, Q, U, and V. Then the spectro-polarimetric observation of the solar region for that parameter naturally defines the N × M matrix

$$S_{ij} = S_j(\lambda_i), \qquad i = 1, \ldots, N;\ j = 1, \ldots, M. \tag{1}$$

The M observed points in the solar region form a set of statistically independent realizations of the Stokes profile S(λ). We can then calculate the averages

$$\bar{S}(\lambda_i) = \frac{1}{M}\sum_{j=1}^{M}S_{ij}, \qquad i = 1, \ldots, N, \tag{2}$$

for each of the wavelength points, and the N × N covariance matrix

$$C_{ij} = \sum_{l=1}^{M}\left[S_{il} - \bar{S}(\lambda_i)\right]\left[S_{jl} - \bar{S}(\lambda_j)\right], \qquad i, j = 1, \ldots, N. \tag{3}$$

This is a real and symmetric matrix, which therefore can always be diagonalized by an orthogonal transformation (e.g., Birkhoff & Mac Lane 1953). The solution of the corresponding eigenvalue problem,

$$C f^{(k)} = e^{(k)} f^{(k)}, \qquad k = 1, \ldots, N, \tag{4}$$

is known to provide an optimal set of orthogonal eigenprofiles, represented by the N-dimensional eigenvectors f(k), for the decomposition of the residual signals Sj(λ) − S̄(λ) (Jolliffe 2002). These eigenprofiles are also known as the principal components of the observed set of profiles. Another property of the covariance matrix is to be positive semidefinite (e.g., Jolliffe 2002), hence e(k) ≥ 0, for all k. In particular, solving the eigenvalue problem (4) by singular value decomposition (SVD; e.g., Press et al. 2007) provides us with an ordered set of eigenprofiles according to the decreasing non-negative amplitude of the corresponding singular values. This ordering reflects the importance of the contribution of the various eigenprofiles to the covariance of the observations.

The eigenprofiles f(k) form a basis for the space of the residual signals Sj(λ) − S̄(λ). In particular, this implies that the j-th profile in the set of M observations can be reconstructed exactly from its set of PCA components,

$$c_j^{(k)} \equiv \sum_{i=1}^{N} f_i^{(k)}\left[S_{ij} - \bar{S}(\lambda_i)\right], \qquad k = 1, \ldots, N, \tag{5}$$

so that

$$S_j(\lambda) - \bar{S}(\lambda) = \sum_{k=1}^{N} c_j^{(k)} f^{(k)}. \tag{6}$$

When the eigenprofiles f(k) are ordered according to their corresponding singular values, then any truncation of the summation in Equation (6) provides an approximation of the residual Sj(λ) − S̄(λ).

Let’s assume we have a set of observations Sij = Sj(λi), Eq. (1):

✤ S: Stokes vector (I, Q, U, V)
✤ N: wavelengths
✤ M: pixels

They are independent realizations of the Stokes profile S(λ), so we can compute the average S̄(λi), Eq. (2).

We can define a covariance matrix Cij, Eq. (3), which is real and symmetric!

It can therefore be diagonalized by an orthogonal transformation, Eq. (4), where f(k) are the eigenvectors that form a basis for the residual Sj(λ) − S̄(λ).

With the PCA components cj(k) of Eq. (5), the residual can be reconstructed as in Eq. (6).
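Equations (1)-(6) translate almost line by line into NumPy. The sketch below uses a small synthetic set of profiles (three Gaussian shapes with random amplitudes) that is purely hypothetical; it uses `numpy.linalg.eigh` for the symmetric eigenvalue problem instead of an SVD, and sorts the eigenprofiles by decreasing eigenvalue.

```python
import numpy as np

def pca_basis(S):
    """S has shape (N, M): N wavelengths, M profiles (Eq. 1)."""
    mean = S.mean(axis=1)                        # Eq. (2)
    residual = S - mean[:, None]
    cov = residual @ residual.T                  # Eq. (3), N x N
    evals, evecs = np.linalg.eigh(cov)           # Eq. (4), ascending order
    order = np.argsort(evals)[::-1]              # largest eigenvalue first
    return mean, evals[order], evecs[:, order]

def pca_coefficients(S, mean, evecs):
    """c[k, j] = sum_i f_i^(k) [S_ij - mean_i] (Eq. 5)."""
    return evecs.T @ (S - mean[:, None])

def reconstruct(coeffs, mean, evecs, n_orders):
    """Eq. (6), truncated to the first n_orders eigenprofiles."""
    return mean[:, None] + evecs[:, :n_orders] @ coeffs[:n_orders]

# Hypothetical data set: N = 40 wavelengths, M = 200 profiles.
rng = np.random.default_rng(0)
wav = np.linspace(-1.0, 1.0, 40)
shapes = np.stack([np.exp(-0.5 * ((wav - c) / 0.2) ** 2)
                   for c in (-0.3, 0.0, 0.3)], axis=1)
S = shapes @ rng.normal(size=(3, 200))

mean, evals, evecs = pca_basis(S)
coeffs = pca_coefficients(S, mean, evecs)
full = reconstruct(coeffs, mean, evecs, n_orders=wav.size)
truncated = reconstruct(coeffs, mean, evecs, n_orders=3)
print(np.abs(full - S).max(), np.abs(truncated - S).max())
```

Because this toy data set is built from only three shapes, three eigenprofiles already reproduce it; real observations need more orders, as discussed further below.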

Principal Component Analysis (PCA) Techniques

[Figure: grid of panels with columns Stokes I, Stokes Q, Stokes U, Stokes V and rows f(0) (average), f(1) (1st PC), f(2) (2nd PC), f(3) (3rd PC), f(4) (4th PC), f(5) (5th PC); horizontal axis: wavelength]

Fig. 1. The average profiles (top row) and the first five PCA eigenprofiles (next five rows) for each of the four Stokes parameters, I, Q, U, and V (columns), extracted from a database of synthetic Stokes profiles of the He I chromospheric lines at 1083 nm. The rows correspond to increasing orders of the eigenprofiles, with f(0) = S̄(λ). The forward model used in the derivation of this eigenbasis corresponds to on-disk observations of the He I lines, for inclinations of the line of sight to the local normal to the surface between 30° and 40°. The database comprises all possible orientations of the magnetic field vector (i.e., 0 ≤ ϑB ≤ π and −π ≤ ϕB ≤ π), and magnetic strengths between 0.2 and 2000 G logarithmically sampled. The calculated profiles emerge from a homogeneous slab of plasma with optical depth at line center between 0.1 and 1.5, and temperature between 5000 and 25,000 K. The position height of the slab (which affects the radiation anisotropy) varies between 0 and 0.06 R⊙. Bulk velocities are accounted for by randomly displacing the rest frequency of each model within a pixel unit of Doppler shift. Because of the homogeneous slab assumption, velocity gradients are not taken into account.

Casini et al. 2012

Principal Component Analysis (PCA) Techniques

Build a complete database of Stokes profiles
    → Calculate the eigen-profiles of the database
    → Determine a good set of Principal Components
    → For each Si(λk) in the database, calculate eigen-coefficients ci(k)

Decompose observations on the eigen-basis
    → observations' eigen-coefficients cj(k)

Compare ci(k) with cj(k)

model’s covariance along any of the “principal directions” in the N-dimensional space spanned by the eigenprofiles. From Figure 1 we see that the peak amplitude of the eigenprofiles is typically around 0.2. Thus, for a polarimetric noise of 10⁻³, we expect to be sensitive to profile covariances of the order of (10⁻³/0.2)² = 2.5 × 10⁻⁵, which is indicated in Figure 2 by the dotted line. From that figure, taking also into account the local drop-off of the covariances around the specified threshold, we can conclude that, for the purpose of Stokes profile reconstruction and inversion, we should retain approximately 11 orders for Stokes I, 14 for Q and U, and 13 for V. These numbers must be compared with the dimension N of the complete set of eigenprofiles in the database, which corresponds to the number of wavelength points (in this case, N = 151) used for the synthesis of the Stokes profiles. As a result, the description of the Stokes profiles for our model in terms of their principal components allows a data compression of the spectro-polarimetric information by approximately a factor 10.

It is important to observe that the above argument about the number of orders that must be retained for spectro-polarimetric inversions relies on two fundamental assumptions. The first assumption is that the error bars of the profiles are dominated by photon noise, and the second one that the photon count is large enough that the associated Poissonian noise can be treated as a random variate, so that its variance can simply be added to that of the model. On the other hand, the systematic errors due to deviations of the observations from the line formation model, especially in the case of complicated atmospheric structures, are very likely to dominate the inversion errors. Thus, in practical cases, we should not expect that retaining such a high number of orders necessarily improves the goodness of the profile fits from the inversion. We will come back to this argument in the next section.

3. Indexing of PCA inversion databases

We created an inversion database of 0.75 million models spanning the same parameter space as the eigenbasis of Figure 1. The profile information in this database is encoded in the expansion coefficients given by Equation (5). The inversion database is constructed by a strategy of “filtered” Monte Carlo sampling, where each new randomly selected point in the parameter space is tested for proximity to previously included models in the database. The testing parameter is the PCA distance between two models, i and j, which is defined as follows, for each of the four Stokes parameters:

$$d_{ij} = \left[\sum_{k=1}^{m}\left(c_i^{(k)} - c_j^{(k)}\right)^2\right]^{1/2}, \tag{7}$$

where m is the maximum number of orders retained for the reconstruction of the Stokes profiles. The filtering criterion is to reject models for which the cumulative PCA distance for the four

dij ≡ PCA distance between model i and observation j
index k sums over the truncated set of eigen-coefficients

For each observation j, choose the model i that minimizes dij
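The database search based on Eq. (7) reduces to a nearest-neighbour lookup in coefficient space. The sketch below uses a brute-force search over a random, hypothetical database of 1000 models with m = 12 retained orders; the indexing scheme of a real inversion database is not reproduced here.

```python
import numpy as np

def pca_distance(db_coeffs, obs_coeffs):
    """Eq. (7): Euclidean distance over the retained orders.

    db_coeffs : (n_models, m) eigen-coefficients of the database models
    obs_coeffs: (m,) eigen-coefficients of one observed profile
    """
    return np.sqrt(np.sum((db_coeffs - obs_coeffs) ** 2, axis=1))

def best_model(db_coeffs, obs_coeffs):
    """Return the index of the closest database model and its distance."""
    distances = pca_distance(db_coeffs, obs_coeffs)
    index = int(np.argmin(distances))
    return index, distances[index]

# Hypothetical database and one "observation" close to model 417.
rng = np.random.default_rng(1)
db_coeffs = rng.normal(size=(1000, 12))
obs_coeffs = db_coeffs[417] + 1e-3 * rng.normal(size=12)

index, distance = best_model(db_coeffs, obs_coeffs)
print(index, distance)
```

The physical parameters assigned to the observation are those of the selected database model, so the accuracy is limited by how densely the database samples the parameter space.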

Principal Component Analysis: Pros and Cons

Pros

● fast (searches for the best fit in a pre-built database of models)
● stable (always finds the best fit: no problems of local minima)
● model independent (universal search/minimization algorithm)

Cons

● no solution refinement (can be fixed by increasing the density of the database)
● database can become unmanageably large (dimensionality of parameter space, parameter ranges; partial mitigation from optimally sampling the parameter space)

All PCA stuff is from Roberto Casini

SDO/HMI

Spectral line inversions from HMI

Courtesy: R. Bogart and K. Hayashi

Spectral line inversions from HMI