Advanced Imaging Optics

Advanced Optical Microscopy 1


Yuval Garini

Delft University of Technology, Faculty of Applied Sciences

The beauty is in the eyes of the observer


Table of contents:

1 Maxwell equations:
1.1 Different unit systems
1.2 Simple case I:
1.3 Boundary conditions:
1.4 The electric dipole moment and polarization
1.5 The field of a dipole moment
1.6 Polarization leads to charge density and surface charge
1.7 Definition of electric displacement
2 Wave equations in free space, reflection & transmission
2.1 The transversal character of electromagnetic waves
2.2 Get the feeling of waves
2.3 Decaying waves and imaginary index of refraction
2.4 Reflection and Transmission
2.5 The amplitudes of reflection and transmission
2.5.1 Special case: Normal incidence
2.5.2 Special case: Brewster angle
2.5.3 Special case: Total reflection
3 Polarization of time-varying fields
3.1 Electromagnetic waves in dielectrics
3.2 Solution for diluted gases
3.3 Solution for dense material
4 Optical properties of bulk metals
4.1 Wave propagation in bulk conductors
4.1.1 Low frequencies
4.1.2 High frequencies
5 Diffraction theory
5.1 Historical review
5.2 The Green function
5.3 The Kirchhoff Integral
5.4 The Kirchhoff formulation of diffraction and boundary conditions
5.5 The Fresnel-Kirchhoff diffraction formula
5.6 The Rayleigh-Sommerfeld diffraction formulation
5.7 Physical interpretation: Rayleigh-Sommerfeld formulation
6 Fresnel and Fraunhofer diffractions
6.1 Fresnel diffraction
6.2 Fraunhofer diffraction
6.2.1 Simple Fraunhofer setup
7 The diffraction limit of imaging systems
7.1 The diffraction limit of light
7.2 The amplitude and optical transfer function
7.2.1 Coherent case
7.2.2 Non-coherent case
7.3 Three Dimensional point spread function
8 Surface Plasmons
8.1 The dispersion relations
8.2 Spatial extent of surface plasmons fields
8.2.1 Perpendicular to the surface plane – Z
8.2.2 In plane XY
8.3 The surface plasmon electric field
9 Surface plasmons in metallic hole arrays
10 Fourier transforms of special functions
10.1 Coulomb (& Screened) potential
10.2 Gaussian function
10.3 Circular hole
11 Spherical coordinate system
Appendix I: Units
References


1 Maxwell equations:

This treatment starts from Maxwell’s equations, and it is assumed that the reader is familiar with them. Their derivation can be found in many textbooks, such as The Feynman Lectures on Physics [1].

Maxwell’s equations in dielectric media are given below in CGS units. See more on units in Appendix I: Units.

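Gauss law: $$\nabla\cdot\mathbf{D} = 4\pi\rho \qquad (1)$$

Gauss law (magnetic): $$\nabla\cdot\mathbf{B} = 0 \qquad (2)$$

Faraday’s law: $$\nabla\times\mathbf{E} + \frac{1}{c}\dot{\mathbf{B}} = 0 \qquad (3)$$

Ampere’s law: $$\nabla\times\mathbf{H} - \frac{1}{c}\dot{\mathbf{D}} = \frac{4\pi}{c}\mathbf{J} \qquad (4)$$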

The continuity equation:

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 \qquad (5)$$

The Maxwell equations can also be derived from three principles:

1. The Lorentz law of force on moving charges (CGS):

$$\mathbf{F} = e\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{B}\right) \qquad (6)$$

where e is the charge of the particle and v its velocity.

2. The superposition principle. This means that the total force on a charge is equal to the superposition of all forces applied by the separate charges.

3. The principle of special relativity (which is in a sense already ‘built into’ the Lorentz force, through the appearance of the magnetic force).

The equations above contain two different electric field quantities, the electric field E and the electric displacement D, and two magnetic field quantities, the magnetic field H and the magnetic flux density B. This already hints at a certain complexity in the equations that must be clarified. D refers to the electric field that is due only to free charges (charges that are not bound, for example, in dense matter), while E is the actual field that would be measured by a test charge at every point in space. In the same way, H is the magnetic field due to currents that are caused by free charges, while B is the actual magnetic field that can be measured at every point in space.

The appearance of the electric displacement is further explained in section 1.7.

1.1 Different unit systems

The Maxwell equations in MKS and CGS unit systems are as follows:


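$$\begin{array}{ll}
\text{CGS} & \text{MKS} \\
\nabla\cdot\mathbf{D} = 4\pi\rho & \nabla\cdot\mathbf{D} = \rho \\
\nabla\cdot\mathbf{B} = 0 & \nabla\cdot\mathbf{B} = 0 \\
\nabla\times\mathbf{E} + \frac{1}{c}\dot{\mathbf{B}} = 0 & \nabla\times\mathbf{E} + \dot{\mathbf{B}} = 0 \\
\nabla\times\mathbf{H} - \frac{1}{c}\dot{\mathbf{D}} = \frac{4\pi}{c}\mathbf{J} & \nabla\times\mathbf{H} - \dot{\mathbf{D}} = \mathbf{J}
\end{array} \qquad (7)$$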

In MKS units:

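$$\mu_0 = 4\pi\times10^{-7}\ \mathrm{Wb\cdot A^{-1}\cdot m^{-1}}\ (\text{or } \mathrm{N\cdot A^{-2}})$$

$$\varepsilon_0 = \frac{1}{\mu_0 c^2} = 8.8542\times10^{-12}\ \mathrm{C^2\cdot N^{-1}\cdot m^{-2}}$$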

These are the values of the electric permittivity and magnetic permeability in vacuum.
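As a quick numerical check of these constants, the following minimal Python sketch computes the speed of light from ε0 and µ0 using c = 1/√(ε0µ0). The constant and function names are chosen here for illustration only; the numerical values are the ones quoted above.

```python
import math

# Vacuum constants in MKS units, as quoted in the text above.
MU_0 = 4 * math.pi * 1e-7   # magnetic permeability, N / A^2
EPS_0 = 8.8542e-12          # electric permittivity, C^2 / (N m^2)


def speed_of_light(eps0=EPS_0, mu0=MU_0):
    """Return c = 1 / sqrt(eps0 * mu0) in m/s."""
    return 1.0 / math.sqrt(eps0 * mu0)


c = speed_of_light()
print(f"c = {c:.4e} m/s")
```

The result should agree with the known speed of light to within the rounding of the quoted ε0.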

1.2 Simple case I:

The Maxwell equations given above cannot be solved without additional equations for the unknown quantities (there are too many unknowns and not enough equations, because some of the equations are redundant). In particular, it is necessary to state the dependence of D and B on E and H, respectively. The simplest possible case assumes:

1. Isotropic material

2. Slowly varying fields

3. Time-harmonic fields

In these cases:

$$\mathbf{j} = \sigma\mathbf{E}, \quad \mathbf{D} = \varepsilon\mathbf{E}, \quad \mathbf{B} = \mu\mathbf{H} \qquad (8)$$

where σ is the conductivity, ε is the dielectric constant (or electric permittivity) and µ is the magnetic permeability.

In the general case these relations are more complex. First, they are not necessarily linear: for example, D may depend on E as $D = \varepsilon E + \chi E^2 + \dots$, which is a non-linear dependence that does occur in some materials.

In addition, the coefficient may not be a scalar but a tensor, so that $\mathbf{D} = \boldsymbol{\varepsilon}\,\mathbf{E}$ with ε a matrix. Such cases also occur, and will be briefly discussed in section 1.4.

These relations depend on the nature of the material and are an interesting and important subject.

1.3 Boundary conditions:

In general, the mathematical derivations performed in this booklet assume that the functions are well-behaved and that the mathematical operations performed on them are valid. In most cases this is a good assumption, based on the fact that the functions that describe the natural laws must exist and be continuous. Nevertheless, on some occasions these assumptions must be checked. As an example, the Maxwell equations given above assume that the derivatives of all the functions exist at every point in space. This assumption fails, for example, at the interface between two materials that have different dielectric functions. Therefore, to complete these equations, it is necessary to state the boundary conditions on these interfaces.

The boundary conditions are derived with the help of the figure below, which shows the transition from material (1) to (2): we assume that the transition is continuous over a distance h and then let h shrink to 0.

Figure 1 Boundary conditions perpendicular to the plane.

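Using Maxwell equation (2) we get:

$$\mathbf{n}_{12}\cdot\left(\mathbf{B}^{(2)} - \mathbf{B}^{(1)}\right) = 0 \qquad (9)$$

which means that the normal component of the magnetic field is continuous.

In a similar way, from equation (1) it turns out that

$$\mathbf{n}_{12}\cdot\left(\mathbf{D}^{(2)} - \mathbf{D}^{(1)}\right) = 4\pi\sigma \qquad (10)$$

where σ here is the surface charge density.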

In a similar way, we can find the tangential boundary conditions, based on the following figure.

Figure 2 Boundary conditions parallel to the plane.

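From equation (3) we get:

$$\mathbf{n}_{12}\times\left(\mathbf{E}^{(2)} - \mathbf{E}^{(1)}\right) = 0 \qquad (11)$$

and, from equation (4),

$$\mathbf{n}_{12}\times\left(\mathbf{H}^{(2)} - \mathbf{H}^{(1)}\right) = \frac{4\pi}{c}\,\mathbf{j} \qquad (12)$$

where j here is the surface current density.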

In summary:

1. The normal component of the electric displacement (D) changes abruptly across the surface, by an amount set by the surface charge density (4πσ in CGS units),

2. The normal component of the magnetic field is continuous across the surface,

3. The tangential electric field is continuous across the surface, and

4. The tangential magnetic field changes abruptly, by an amount set by the surface current density ((4π/c)j in CGS units). The direction of the change is perpendicular to both j and n.

1.4 The electric dipole moment and polarization

In matter there are many charged particles. If an electric field exists in a certain region of space and a material is brought in, the charges inside the matter will move and reorient. Positive and negative charges will in general move in opposite directions, and this can be described by defining the electric dipole.

When there are two equal and opposite charges (q and –q) separated from each other, the electric dipole is defined as

$$\mathbf{p} = q\,\mathbf{r} \qquad (13)$$

where r is the vector connecting the negative to the positive charge (Figure 3).

Figure 3 Definition of the dipole

Matter contains a very large number of charges, and it is more convenient to use an average value in each unit volume. We therefore define the polarization vector P as the electric dipole moment per unit volume, assuming that the electric dipole changes relatively slowly inside the matter:

$$\mathbf{P} = \frac{\langle\mathbf{p}\rangle}{\Delta V} \qquad (14)$$

where the average is taken within the small volume ΔV.

The effect of polarization is usually introduced into the Maxwell equations through the electric polarization vector (a similar treatment is used for the magnetic polarization). It is clear that the electric polarization vector P itself depends on the electromagnetic field, and that its appearance will in turn change the electric field in space. If $\mathbf{E}_0$ is the original electric field that existed before introducing the dielectric matter and $\mathbf{E}'$ is the final electric field in space, this can be schematically described as follows:

$$\mathbf{E}_0 \rightarrow \mathbf{P} \rightarrow \mathbf{E}'$$

We will first deal with the first part, the polarization.

In the general case the dependence of P on E can be complex. It may be non-linear (as in the Kerr and Pockels effects) and/or a tensor-like dependence. In the simple case, the dependence of the electric dipole (and therefore also the polarization) on the electric field can be described as a linear phenomenon with a scalar constant, $\mathbf{p} = \alpha\mathbf{E}$.


The linear model described above looks very similar to a simple harmonic force, such as a spring that connects the two charges: substituting p = qr into $\mathbf{p} = \alpha\mathbf{E}$ and multiplying by q gives q²r = α·qE, so the force qE is linear in the displacement r.

To test the linearity assumption, we can take a simple model of a hydrogen atom and

assume that the electron forms a uniform cloud of charge around the proton (Figure 4).

The spacing in between the center of the charge cloud and the proton can be

calculated by drawing an internal sphere around the center of the cloud (see dashed

circle). The part of the electron cloud charge that is outside that sphere does not

contribute anything to the electric field felt by the proton, only the part that is inside

that sphere.

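The part of the cloud charge that is inside the sphere, q′, is easily calculated to be:

$$q' = \frac{r^3}{R^3}\,q$$

where q is the electron charge, R is the radius of the electron cloud and r is the displacement of the proton from the center of the cloud. The two forces acting on the charges must balance. The external force that results from E is simply qE, and the force between the charges is (MKS units)

$$F = \frac{1}{4\pi\varepsilon_0}\cdot\frac{q\,q'}{r^2} = \frac{1}{4\pi\varepsilon_0}\cdot\frac{q^2 r}{R^3}.$$

Equating this to the external force qE we get:

$$q\,r = \left(4\pi\varepsilon_0 R^3\right)E$$

which confirms the simple case of $\mathbf{p} = \alpha\mathbf{E}$, with α = 4πε0R³.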
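To get a feeling for the magnitudes in the uniform-cloud model, the sketch below evaluates α = 4πε0R³ and the resulting displacement r = αE/q. The cloud radius (taken as the Bohr radius) and the field strength are illustrative assumptions and are not values given in the text; the function names are likewise hypothetical.

```python
import math

EPS_0 = 8.8542e-12          # vacuum permittivity, C^2 / (N m^2)
ELECTRON_CHARGE = 1.602e-19 # magnitude of the electron charge, C


def polarizability_uniform_cloud(radius_m):
    """alpha = 4 * pi * eps0 * R^3 for a uniformly charged spherical cloud."""
    return 4 * math.pi * EPS_0 * radius_m ** 3


def displacement(radius_m, field_v_per_m, charge_c=ELECTRON_CHARGE):
    """Displacement r that balances the external force: q r = alpha E."""
    return polarizability_uniform_cloud(radius_m) * field_v_per_m / charge_c


# Assumed example values (not from the text): Bohr radius and a 1 MV/m field.
R_CLOUD = 0.529e-10
FIELD = 1.0e6

alpha = polarizability_uniform_cloud(R_CLOUD)
r = displacement(R_CLOUD, FIELD)
print(f"alpha = {alpha:.3e} C^2 m / N")
print(f"r / R = {r / R_CLOUD:.2e}")
```

Under these assumptions the displacement is many orders of magnitude smaller than the cloud radius, which is the regime in which the linear model is expected to hold.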

A similar exercise can be done for a more realistic electron cloud distribution that

follows the real wave function solution of the electron in a hydrogen atom.

Figure 4 Source of the electric dipole moment: a simple atomic model.

The validity of this model also depends on the displacements being small relative to the size of the atom. This is confirmed by observing typical values of the constant α for hydrogen and potassium atoms:

$$\alpha_H = 0.667\times10^{-40}\ \mathrm{C^2\cdot m\cdot N^{-1}}$$

$$\alpha_K = 43\times10^{-40}\ \mathrm{C^2\cdot m\cdot N^{-1}}$$

This further justifies the assumption of a linear dependence.

In addition, molecules may sometimes have a permanent internal polarization due to their structure (e.g. water). In such a case, an external electric field can result in a polarization that is due to two effects: 1. the induced electric dipole of the single molecules, and 2. the orientation of the molecules relative to the electric field (due to statistical equilibrium).

1.5 The field of a dipole moment

The potential of a dipole moment is:

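$$V_{dipole} = \frac{1}{4\pi\varepsilon_0}\,\frac{\mathbf{P}\cdot\hat{r}}{r^2} \qquad (15)$$

and the electric field is given by:

$$\mathbf{E}_{dipole} = \frac{P}{4\pi\varepsilon_0 r^3}\left(2\cos\theta\,\hat{r} + \sin\theta\,\hat{\theta}\right) = \frac{P}{4\pi\varepsilon_0}\left[\frac{3xz}{r^5}\hat{x} + \frac{3yz}{r^5}\hat{y} + \left(\frac{3z^2}{r^5} - \frac{1}{r^3}\right)\hat{z}\right] \qquad (16)$$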

when the dipole is oriented along the z axis.
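The spherical and Cartesian forms of the field of a dipole along z can be compared numerically. The sketch below is only an illustration: the function names, the dipole magnitude and the sample point are arbitrary assumptions. It evaluates both forms and returns Cartesian components.

```python
import math

EPS_0 = 8.8542e-12  # vacuum permittivity, C^2 / (N m^2)


def dipole_field_spherical(p, x, y, z):
    """Field from the (r, theta) form, converted to Cartesian components."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    pref = p / (4 * math.pi * EPS_0 * r ** 3)
    e_r = 2 * math.cos(theta) * pref       # radial component
    e_theta = math.sin(theta) * pref       # polar component
    # Unit vectors r_hat and theta_hat expressed in Cartesian coordinates.
    r_hat = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))
    t_hat = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
    return tuple(e_r * a + e_theta * b for a, b in zip(r_hat, t_hat))


def dipole_field_cartesian(p, x, y, z):
    """Field from the Cartesian form with 3xz/r^5, 3yz/r^5, 3z^2/r^5 - 1/r^3."""
    r = math.sqrt(x * x + y * y + z * z)
    pref = p / (4 * math.pi * EPS_0)
    return (
        pref * 3 * x * z / r ** 5,
        pref * 3 * y * z / r ** 5,
        pref * (3 * z * z / r ** 5 - 1 / r ** 3),
    )


# Arbitrary illustrative dipole moment (C m) and observation point (m).
P_DIPOLE = 1.0e-30
POINT = (0.3e-9, -0.2e-9, 0.5e-9)

field_a = dipole_field_spherical(P_DIPOLE, *POINT)
field_b = dipole_field_cartesian(P_DIPOLE, *POINT)
print(field_a)
print(field_b)
```

The two tuples should agree to floating-point precision, since 2cos²θ − sin²θ = 3cos²θ − 1 and the x and y components both reduce to 3 sinθ cosθ times cosφ or sinφ.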

1.6 Polarization leads to charge density and surface charge

The existence of polarization in matter can lead to both a volume charge density and a surface charge density. When the polarization is not uniform, it causes an accumulation of charge in the volume. It can easily be verified that the charge density is given by the spatial derivative of the polarization vector, i.e. its divergence, with a minus sign.

On the boundary of a dielectric, where the polarization vector changes abruptly to zero, a surface charge density is also created; this is also easy to verify.

These can be summarized in the following two important equations:

$$\rho = -\nabla\cdot\mathbf{P}, \qquad \sigma = \mathbf{P}\cdot\hat{a} \qquad (17)$$

where â is the unit vector normal to the surface.

1.7 Definition of electric displacement

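For historical reasons, and for practical reasons in some cases, it is sometimes convenient to separate the contribution of free charges and currents from that of the charges and currents that are induced in dielectric matter.

It is better to start by writing the Maxwell equations in vacuum, and assume that all the charges and currents are free. In this case $\mathbf{D} = \varepsilon_0\mathbf{E}$ and $\mathbf{B} = \mu_0\mathbf{H}$, and substituting these in equation 7 (we will use only the MKS system here) gives:

$$\begin{aligned}
\nabla\cdot\mathbf{E} &= \rho/\varepsilon_0 \\
\nabla\cdot\mathbf{B} &= 0 \\
\nabla\times\mathbf{E} + \dot{\mathbf{B}} &= 0 \\
\nabla\times\mathbf{B} - \mu_0\varepsilon_0\dot{\mathbf{E}} &= \mu_0\mathbf{J}
\end{aligned} \qquad (18)$$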

Now, let us distinguish between free and bound charges and currents and write for ρ:

$$\rho = \rho_{free} + \rho_{bound}.$$

We already know from the previous section (equation 17) that $\rho_{bound} = -\nabla\cdot\mathbf{P}$, so

$$\rho = \rho_{free} - \nabla\cdot\mathbf{P}.$$

Substituting this back into the first Maxwell equation we get $\varepsilon_0\nabla\cdot\mathbf{E} = \rho_{free} - \nabla\cdot\mathbf{P}$, or after rearranging, $\nabla\cdot(\varepsilon_0\mathbf{E} + \mathbf{P}) = \rho_{free}$. We can now define a new vector, D, as $\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}$, which leads to the final version of the Maxwell equation:

$$\nabla\cdot\mathbf{D} = \rho_{free}$$

where the charge density now includes only the free charges.

We have already shown how P depends on the electric field itself (section 1.4), at least in the simple case, $\mathbf{p} = \alpha\mathbf{E}$ (for the electric dipole moment). A similar relation is also written for the polarization vector, $\mathbf{P} = \varepsilon_0\chi_e\mathbf{E}$, where $\chi_e$ is defined as the electric susceptibility. Using this we can relate D and E by $\mathbf{D} = \varepsilon_0\mathbf{E} + \varepsilon_0\chi_e\mathbf{E} = \varepsilon_0(1+\chi_e)\mathbf{E}$ and define the permittivity ε:

$$\varepsilon = \varepsilon_0(1+\chi_e).$$

A similar treatment can be performed for the magnetic polarization vector M: we write the magnetic flux density B as $\mathbf{B} = \mu_0\mathbf{H} + \mathbf{M}$ and then define the magnetic permeability µ as $\mu = \mu_0(1+\chi_m)$, where $\chi_m$ is the magnetic susceptibility.

Substituting these definitions in the Maxwell equations in vacuum leads to the final version of the equations that are commonly used (equation 7).


2 Wave equations in free space, reflection & transmission

One of the most impressive successes of the Maxwell equations is the prediction of the existence of electromagnetic waves. These are made of time-varying electric and magnetic fields that can exist even when the charges and currents that initiated them no longer exist.

Consider a space where there are no charges and no currents, J = 0 and ρ = 0. If we take the curl of the third Maxwell equation (equation 7), we get:

$$\nabla\times(\nabla\times\mathbf{E}) + \mu\,\nabla\times\dot{\mathbf{H}} = 0. \qquad (19)$$

Next, we substitute from the first and fourth equations and use equation 8 to get:

$$\nabla\times(\nabla\times\mathbf{E}) + \varepsilon\mu\,\ddot{\mathbf{E}} = 0. \qquad (20)$$

Using the identity $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$ and assuming a homogeneous space where µ and ε are uniform, we get:

$$\nabla^2\mathbf{E} - \varepsilon\mu\,\ddot{\mathbf{E}} = 0, \qquad \nabla^2\mathbf{B} - \varepsilon\mu\,\ddot{\mathbf{B}} = 0. \qquad (21)$$

These equations show that E and B are waves that propagate with a velocity $v = 1/\sqrt{\varepsilon\mu}$. Note that in this solution every component of the vectors must satisfy the wave equation (six equations). This leads to the scalar approximation, which allows each of the vector components to be treated as a separate scalar function. It is a result of the assumptions mentioned above on the homogeneity and isotropy of the dielectric constant and the magnetic permeability. Note also that in vacuum the speed is exactly the speed of light, since $c = 1/\sqrt{\varepsilon_0\mu_0}$.

Two simple cases are those of plane waves and spherical waves. The plane wave has the solution:

$$E = E_1(\mathbf{r}\cdot\mathbf{s} - vt) + E_2(\mathbf{r}\cdot\mathbf{s} + vt). \qquad (22)$$

This solution represents a linear combination of two waves that propagate in opposite directions but at the same speed. We have also assumed that the propagation is along a single direction in space, given by the unit vector s, and that E points along a single fixed direction.

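We can also write the wave equation in spherical coordinates. In spherical coordinates, the Laplacian is:

$$\nabla^2\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}.$$

A useful relation is:

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) = \frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r} = \frac{1}{r}\frac{\partial^2}{\partial r^2}(r\psi).$$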

If we assume only spherically symmetric waves, there is no dependence on θ and φ, and we get for the wave equation:

$$\frac{\partial^2}{\partial r^2}(rV) - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}(rV) = 0. \qquad (23)$$

This equation has the solution:

$$V = \frac{V_1(r - vt)}{r} + \frac{V_2(r + vt)}{r} \qquad (24)$$

with V1 and V2 being arbitrary functions.

Two simple and well-known solutions of the wave equation are the spherical diverging wave $V = e^{jkr}/r$ and the plane wave $V = A e^{j\mathbf{k}\cdot\mathbf{r}}$, where k depends on the constants that appear in the wave equation.

The following terminology will be used for the various parameters of the wave:

Frequency: $\nu = \omega/2\pi = 1/T$

Angular frequency: ω

Oscillation period: T

Wavelength: $\lambda = V\cdot T = 2\pi V/\omega$ (V being the speed)

Wavenumber: $k = 2\pi/\lambda = \omega/V$ (25)

In general, the solution to the wave equation can be a superposition of monochromatic waves at different frequencies,

$$V(\mathbf{r},t) = \int_0^\infty a_\omega(\mathbf{r})\cos\left[\omega t - g_\omega(\mathbf{r})\right]d\omega. \qquad (26)$$

The group velocity of the wave is:

$$v_g = \frac{d\omega}{dk} \qquad (27)$$

and the phase velocity (the velocity of planes that have the same phase) is:

$$v_p = \frac{\omega}{k}. \qquad (28)$$

In a non-dispersive medium, where k = nω/c with a constant n, the group velocity and phase velocity are the same and equal to c/n.

The index of refraction n is defined as the ratio of the speed of light in vacuum to the phase velocity, and plays an important role as a parameter that can be determined experimentally:

$$n = \frac{c}{v_p}. \qquad (29)$$
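The difference between phase and group velocity can be illustrated numerically. The sketch below evaluates v_p = ω/k directly and estimates v_g = dω/dk by a central finite difference. Both dispersion relations are assumptions made for the example: a non-dispersive medium with a hypothetical constant index, and a plasma-like relation ω² = ω_p² + c²k² that does not appear in this section.

```python
import math

C = 2.998e8          # speed of light in vacuum, m/s
N_MEDIUM = 1.5       # hypothetical constant refractive index
OMEGA_P = 1.0e15     # hypothetical cutoff frequency for the dispersive example, rad/s


def omega_nondispersive(k):
    """omega(k) for a medium with constant index: k = n * omega / c."""
    return C * k / N_MEDIUM


def omega_plasma_like(k):
    """An assumed dispersive relation: omega^2 = omega_p^2 + (c k)^2."""
    return math.sqrt(OMEGA_P ** 2 + (C * k) ** 2)


def phase_velocity(omega_of_k, k):
    """v_p = omega / k."""
    return omega_of_k(k) / k


def group_velocity(omega_of_k, k, rel_step=1e-6):
    """v_g = d(omega)/dk, estimated with a central finite difference."""
    h = k * rel_step
    return (omega_of_k(k + h) - omega_of_k(k - h)) / (2 * h)


k = 1.0e7  # wavenumber in rad/m, chosen arbitrarily
for name, relation in (("non-dispersive", omega_nondispersive), ("plasma-like", omega_plasma_like)):
    print(name, phase_velocity(relation, k), group_velocity(relation, k))
```

For the non-dispersive case both velocities equal c/n, while for the plasma-like case the phase velocity exceeds c and the group velocity stays below it, with their product equal to c².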

2.1 The transversal character of electromagnetic waves

When the wave solution is tested together with the Maxwell equations from which it was derived, we see that there are further restrictions that are not apparent from the general solution. To summarize, a valid solution satisfies:

1. The electric field vector is perpendicular to the propagation direction.

2. The electric and magnetic fields always coexist and propagate along the same direction, but they are perpendicular to each other.

To see this, we start by assuming a field that points along one direction only, say x, so that $\mathbf{E} = E_x\hat{x}$. Note that $E_x$ (which is a scalar) can still be a function of x, y and z, i.e., $E_x = f(x,y,z)$.

Let us now substitute this in the Gauss equation (the first Maxwell equation) with ρ = 0, where we get:

$$\nabla\cdot\mathbf{E} = 0 \quad\text{or}\quad \frac{\partial E_x}{\partial x} = 0.$$

But this means that $E_x$ cannot depend on x, so we are left with $E_x = f(y,z)$. If we now substitute this in the wave equation (21), we see that a wave propagating along either of the remaining directions can be a solution. Let us choose the z direction, which finally gives us one possible solution

$$\mathbf{E} = A e^{-i(kz-\omega t)}\hat{x}.$$

A similar solution can exist with the field along y. This is therefore a transverse wave: the actual field is perpendicular to the propagation direction.

The last piece of the puzzle is found when we observe the third Maxwell equation (Faraday’s law, here in its MKS form from equation 7). For a simple electric field that points along x and depends only on z, the curl has only one component:

$$\dot{\mathbf{B}} = -\nabla\times\mathbf{E} = -\frac{\partial E_x}{\partial z}\hat{y}.$$

That immediately means that

$$\mathbf{B} = \frac{ik}{i\omega}A e^{-i(kz-\omega t)}\hat{y} = \frac{k}{\omega}A e^{-i(kz-\omega t)}\hat{y}.$$

This shows that the magnetic field is perpendicular to both the electric field and the propagation direction, and that it oscillates in phase with the electric field, with an amplitude smaller by the factor k/ω = 1/v.

2.2 Get the feeling of waves

We described above a general solution of a wave. Note that in general it does not have to be periodic, or a sine or cosine wave. Nevertheless, harmonic waves are important for two reasons. First, in many cases the wave is generated by a source that does create a harmonic wave, and second, even if the wave is not harmonic, we know that it can be described as a linear combination of harmonic waves (e.g. by a Fourier transformation of the function), which is therefore easier to deal with.

A general one-dimensional harmonic wave has the form:

$$E = A e^{-i(kx-\omega t)}. \qquad (30)$$

Let us substitute it in the wave equation (21) and see what we get. We will also replace the factor εµ by $v^{-2}$, which has the right units, (m/s)$^{-2}$. After taking the second derivatives in space and time, the substitution gives us:

$$-k^2E + \frac{\omega^2}{v^2}E = 0 \;\rightarrow\; \omega = kv. \qquad (31)$$

So ω and k depend on each other, but there is still one degree of freedom: we can choose one of these parameters freely. That means that the wave solution of the Maxwell equations is not unique. There are infinitely many possible solutions (the whole spectrum, so to speak).

Let us see what the wavelength λ is. It is defined as the distance between points with the same value of the wave at a given time. It is easily found by setting kλ = 2π, which means that k = 2π/λ; k is called the wavevector or the wavenumber.

The meaning of group velocity becomes important only if the wave is actually made of a combination of several harmonic waves, rather than a single monochromatic wave. Let us look at a combination of two waves with indices 1 and 2 and assume the same amplitude. The wave is:

$$e^{-i(k_1x-\omega_1t)} + e^{-i(k_2x-\omega_2t)} = 2e^{-i(k_0x-\omega_0t)}\cos(\Delta k\,x - \Delta\omega\,t) \qquad (32)$$

where the following definitions were used: $k_0 = (k_1+k_2)/2$, $\omega_0 = (\omega_1+\omega_2)/2$, $\Delta k = (k_1-k_2)/2$ and $\Delta\omega = (\omega_1-\omega_2)/2$. If the values of the wavevectors and angular frequencies are not much different, then the first factor oscillates much faster and with a shorter wavelength than the second factor. The second factor is called the envelope function. It is the more important one, and describes the actual flow of energy and information. The speed of the energy flow is therefore equal to $v = \Delta\omega/\Delta k$. If more than two waves are combined, this ratio finally becomes the derivative defined in equation 27.
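The decomposition of two equal-amplitude waves into a fast carrier and a slow envelope can be checked numerically. The sketch below compares the direct sum of the two waves with the product 2·exp[−i(k0x − ω0t)]·cos(Δk·x − Δω·t); the wavenumbers and frequencies are arbitrary illustrative values, and the function names are hypothetical.

```python
import cmath
import math


def two_wave_sum(x, t, k1, w1, k2, w2):
    """Direct sum of two equal-amplitude harmonic waves."""
    return cmath.exp(-1j * (k1 * x - w1 * t)) + cmath.exp(-1j * (k2 * x - w2 * t))


def carrier_times_envelope(x, t, k1, w1, k2, w2):
    """Same wave written as a fast carrier times a slow cosine envelope."""
    k0, w0 = (k1 + k2) / 2, (w1 + w2) / 2   # carrier wavenumber and frequency
    dk, dw = (k1 - k2) / 2, (w1 - w2) / 2   # envelope wavenumber and frequency
    return 2 * cmath.exp(-1j * (k0 * x - w0 * t)) * math.cos(dk * x - dw * t)


def envelope_velocity(k1, w1, k2, w2):
    """Speed of the envelope, delta_omega / delta_k."""
    return (w1 - w2) / (k1 - k2)


# Arbitrary, slightly different wavenumbers and angular frequencies.
K1, W1, K2, W2 = 10.0, 20.0, 9.5, 19.2

for x, t in ((0.0, 0.0), (1.3, 0.7), (-2.1, 4.0)):
    print(abs(two_wave_sum(x, t, K1, W1, K2, W2) - carrier_times_envelope(x, t, K1, W1, K2, W2)))
print(envelope_velocity(K1, W1, K2, W2))
```

The printed differences should vanish to floating-point precision, and the envelope speed is the finite-difference analogue of the group velocity.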

2.3 Decaying waves and imaginary index of refraction

It is not always noted that the index of refraction can also be complex, with an imaginary part. If we use the simple wave function (equation 30), replace k by k = nω/c and substitute n = n′ + in″, we get:

$$E = e^{-i\omega(n'x/c - t)}\cdot e^{-\omega n''x/c}. \qquad (33)$$

The first exponent is a normal harmonic wave, while the second is an exponentially increasing or decreasing function, depending on the sign of n″. This describes nothing more than an oscillating decaying (or growing) wave. It is an important case that, as we will see later, appears in many situations.

The intensity of the field depends on the square of the amplitude, and if we define β as $\beta = 2\omega n_I/c$, where $n_I = n''$ is the imaginary part of the index, we get for propagation over a distance z:

$$I \propto e^{-\beta z} \qquad (34)$$

β is known as the absorption coefficient of matter. It has the dimensions of length$^{-1}$, and if it is small relative to the inverse of the material thickness, the absorption is usually negligible. This is the case for most of the materials that are regarded as transparent.
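As a small worked example, the sketch below computes the absorption coefficient β = 2ωn″/c (equivalently 4πn″/λ0, with λ0 the vacuum wavelength) and the fraction of intensity transmitted through a slab. The imaginary index, wavelength and thickness are hypothetical values chosen only for illustration.

```python
import math

C = 2.998e8  # speed of light in vacuum, m/s


def absorption_coefficient(n_imag, vacuum_wavelength_m):
    """beta = 2 * omega * n'' / c, with omega = 2 * pi * c / lambda_0."""
    omega = 2 * math.pi * C / vacuum_wavelength_m
    return 2 * omega * n_imag / C


def transmitted_fraction(beta, thickness_m):
    """Intensity ratio I / I0 = exp(-beta * z) after a distance z."""
    return math.exp(-beta * thickness_m)


# Hypothetical values for a nearly transparent material.
N_IMAG = 1.0e-7
WAVELENGTH = 500e-9   # m
THICKNESS = 1.0e-3    # m

beta = absorption_coefficient(N_IMAG, WAVELENGTH)
print(f"beta = {beta:.3f} 1/m, penetration depth 1/beta = {1 / beta:.3f} m")
print(f"transmitted fraction through 1 mm: {transmitted_fraction(beta, THICKNESS):.5f}")
```

With these assumed numbers 1/β is much larger than the slab thickness, so almost all of the intensity is transmitted, which is the sense in which such a material is regarded as transparent.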

If we trace back the origin of n in our equations, n = c/v and $v^{-2} = \varepsilon\mu$, we see also that

$$n^2 = c^2\varepsilon\mu = \frac{\varepsilon\mu}{\varepsilon_0\mu_0}. \qquad (35)$$

In non-magnetic materials µ = µ0, so the dependence on µ cancels and n² = ε/ε0.

2.4 Reflection and Transmission

To describe the rules of reflection and transmission, we will use the indices i, r and t for the incident, reflected and transmitted light. We will also denote the direction of propagation of each of these waves by the unit vectors $\mathbf{s}^{(i)}$, $\mathbf{s}^{(r)}$ and $\mathbf{s}^{(t)}$; v1 and v2 are the velocities of the waves in the two media.

The fields at the boundary must have the same time variation. This means that the arguments of the wave functions of the three waves must be equal on the boundary. Using equation 22 and forcing the arguments of the functions to be the same, we find that:

$$\frac{\mathbf{r}\cdot\mathbf{s}^{(i)}}{v_1} = \frac{\mathbf{r}\cdot\mathbf{s}^{(r)}}{v_1} = \frac{\mathbf{r}\cdot\mathbf{s}^{(t)}}{v_2} \qquad (36)$$

which simplifies on the plane z = 0 to:

$$\frac{x\,s_x^{(i)} + y\,s_y^{(i)}}{v_1} = \frac{x\,s_x^{(r)} + y\,s_y^{(r)}}{v_1} = \frac{x\,s_x^{(t)} + y\,s_y^{(t)}}{v_2}. \qquad (37)$$

Since this equation must hold at every point in the plane, it must hold separately for each of the components (x and y), i.e.

$$\frac{s_x^{(i)}}{v_1} = \frac{s_x^{(r)}}{v_1} = \frac{s_x^{(t)}}{v_2}, \qquad \frac{s_y^{(i)}}{v_1} = \frac{s_y^{(r)}}{v_1} = \frac{s_y^{(t)}}{v_2} \qquad (38)$$

which shows that the three waves lie in one and the same plane.

The components of the unit vector s can be expressed through the angle between the vector and the Z axis.


Figure 5 Conditions for the reflection and transmission of light from a surface.

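Therefore, the above equations can be translated to:

$$\frac{\sin\theta_i}{v_1} = \frac{\sin\theta_r}{v_1} = \frac{\sin\theta_t}{v_2} \quad\text{or}\quad n_1\sin\theta_i = n_1\sin\theta_r = n_2\sin\theta_t, \qquad \theta_r = \pi - \theta_i \qquad (39)$$

which are known as the law of reflection and Snell’s law.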
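Snell's law, n1 sin θi = n2 sin θt, translates directly into a small helper function. The sketch below is one possible way to write it; the function name and the choice to return None when no real transmission angle exists are assumptions of this example, and the refractive indices are illustrative.

```python
import math


def transmission_angle(n1, n2, theta_i):
    """Return theta_t (radians) from n1 sin(theta_i) = n2 sin(theta_t).

    Returns None when sin(theta_t) would exceed 1, i.e. when there is no
    real transmission angle.
    """
    sin_t = n1 * math.sin(theta_i) / n2
    if abs(sin_t) > 1.0:
        return None
    return math.asin(sin_t)


# Illustrative values: air to glass at 30 degrees, and glass to air at 60 degrees.
air_to_glass = transmission_angle(1.0, 1.52, math.radians(30.0))
glass_to_air = transmission_angle(1.52, 1.0, math.radians(60.0))

print(math.degrees(air_to_glass))
print(glass_to_air)
```

In the second call the light travels from the denser to the less dense medium beyond the critical angle, so no real angle is returned.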

2.5 The amplitudes of reflection and transmission

Assume two media that are homogeneous and isotropic, with zero conductivity (and therefore perfectly transparent). Assume also that their permeabilities differ negligibly from 1, and simply take them as 1.

Let A be the complex amplitude of the incident light and τi its variable phase,

$$\tau_i = \omega\left(t - \frac{\mathbf{r}\cdot\mathbf{s}^{(i)}}{v_1}\right) = \omega\left(t - \frac{x\sin\theta_i + z\cos\theta_i}{v_1}\right). \qquad (40)$$

It is convenient to resolve each vector into components that are parallel (∥) and perpendicular (⊥) to the plane of incidence (see Figure 5).

The components of the incident electric field can be written as:

$$E_x^{(i)} = -A_\parallel\cos\theta_i\,e^{-i\tau_i}, \qquad E_y^{(i)} = A_\perp e^{-i\tau_i}, \qquad E_z^{(i)} = A_\parallel\sin\theta_i\,e^{-i\tau_i} \qquad (41)$$

By using the Maxwell equations with j = 0, it is possible to show that $\mathbf{H} = \sqrt{\varepsilon}\,\mathbf{s}\times\mathbf{E}$, and by substituting the components from the previous equation one gets:

$$H_x^{(i)} = -A_\perp\cos\theta_i\sqrt{\varepsilon_1}\,e^{-i\tau_i}, \qquad H_y^{(i)} = -A_\parallel\sqrt{\varepsilon_1}\,e^{-i\tau_i}, \qquad H_z^{(i)} = A_\perp\sin\theta_i\sqrt{\varepsilon_1}\,e^{-i\tau_i} \qquad (42)$$

Similar equations are found for the transmitted and reflected fields, and can be written by simply using T and R for the transmitted and reflected amplitudes.


The incident, transmitted and reflected fields must obey the boundary conditions that are listed in equations 9-12, and by substituting we get:

$$E_x^{(i)} + E_x^{(r)} = E_x^{(t)}, \quad E_y^{(i)} + E_y^{(r)} = E_y^{(t)}, \quad H_x^{(i)} + H_x^{(r)} = H_x^{(t)}, \quad H_y^{(i)} + H_y^{(r)} = H_y^{(t)} \qquad (43)$$

After substituting the field components into the above equations, one gets the amplitudes of the transmitted and reflected light:

$$T_\parallel = \frac{2n_1\cos\theta_i}{n_2\cos\theta_i + n_1\cos\theta_t}\cdot A_\parallel, \qquad T_\perp = \frac{2n_1\cos\theta_i}{n_1\cos\theta_i + n_2\cos\theta_t}\cdot A_\perp \qquad (44)$$

$$R_\parallel = \frac{n_2\cos\theta_i - n_1\cos\theta_t}{n_2\cos\theta_i + n_1\cos\theta_t}\cdot A_\parallel, \qquad R_\perp = \frac{n_1\cos\theta_i - n_2\cos\theta_t}{n_1\cos\theta_i + n_2\cos\theta_t}\cdot A_\perp \qquad (45)$$

These are known as the Fresnel equations.
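The Fresnel equations are easy to evaluate numerically. The sketch below computes the amplitude ratios T/A and R/A for both polarizations and checks energy conservation, using the standard result that the transmitted power fraction is (n2 cos θt)/(n1 cos θi) times the squared transmission amplitude. That power relation is not derived in this section, and the function names and refractive indices are illustrative assumptions.

```python
import math


def fresnel_amplitudes(n1, n2, theta_i):
    """Amplitude ratios for parallel and perpendicular polarization.

    theta_i is in radians; the transmission angle follows from Snell's law.
    """
    sin_t = n1 * math.sin(theta_i) / n2
    if abs(sin_t) > 1.0:
        raise ValueError("total reflection: no real transmission angle")
    cos_i = math.cos(theta_i)
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    return {
        "t_par": 2 * n1 * cos_i / (n2 * cos_i + n1 * cos_t),
        "t_perp": 2 * n1 * cos_i / (n1 * cos_i + n2 * cos_t),
        "r_par": (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t),
        "r_perp": (n1 * cos_i - n2 * cos_t) / (n1 * cos_i + n2 * cos_t),
        "cos_i": cos_i,
        "cos_t": cos_t,
    }


def energy_balance(n1, n2, theta_i):
    """Reflected plus transmitted power fraction for (parallel, perpendicular)."""
    f = fresnel_amplitudes(n1, n2, theta_i)
    flux = n2 * f["cos_t"] / (n1 * f["cos_i"])  # ratio of beam cross sections and speeds
    return (
        f["r_par"] ** 2 + flux * f["t_par"] ** 2,
        f["r_perp"] ** 2 + flux * f["t_perp"] ** 2,
    )


# Illustrative case: air to glass (n = 1.5) at normal incidence and at 40 degrees.
print(fresnel_amplitudes(1.0, 1.5, 0.0))
print(energy_balance(1.0, 1.5, math.radians(40.0)))
```

For lossless media both entries of the energy balance should equal 1, which is a convenient sanity check on the signs and denominators of the formulas.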

2.5.1 Special case: Normal incidence

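In the special case of normal incidence, where θi = 0 (and therefore θt = 0), the Fresnel equations reduce to:

$$T_\parallel = \frac{2}{n+1}\cdot A_\parallel, \quad T_\perp = \frac{2}{n+1}\cdot A_\perp, \quad R_\parallel = \frac{n-1}{n+1}\cdot A_\parallel, \quad R_\perp = -\frac{n-1}{n+1}\cdot A_\perp \qquad (46)$$

with n=n2/n1.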

2.5.2 Special case: Brewster angle

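When the angle of incidence satisfies

$$\tan\theta_i = \frac{n_2}{n_1} \qquad (47)$$

the reflected amplitude of the parallel component vanishes, R∥ = 0. This angle is known as the Brewster angle. For air/glass with n = 1.52 the condition is met at θi = 56°40′.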
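The Brewster condition can be verified with a few lines of Python. The sketch below computes the angle from tan θi = n2/n1 for the air/glass example with n = 1.52 and evaluates the parallel reflection amplitude ratio at that angle; the function names are chosen here for illustration.

```python
import math


def brewster_angle(n1, n2):
    """Angle of incidence (radians) satisfying tan(theta_i) = n2 / n1."""
    return math.atan2(n2, n1)


def r_parallel(n1, n2, theta_i):
    """Parallel reflection amplitude ratio R_par / A_par from the Fresnel equations."""
    sin_t = n1 * math.sin(theta_i) / n2          # Snell's law
    cos_t = math.sqrt(1.0 - sin_t ** 2)
    cos_i = math.cos(theta_i)
    return (n2 * cos_i - n1 * cos_t) / (n2 * cos_i + n1 * cos_t)


theta_b = brewster_angle(1.0, 1.52)
degrees = math.degrees(theta_b)
whole_degrees = int(degrees)
arc_minutes = (degrees - whole_degrees) * 60

print(f"Brewster angle: {whole_degrees} deg {arc_minutes:.1f} min")
print(f"R_par / A_par at that angle: {r_parallel(1.0, 1.52, theta_b):.2e}")
```

At the Brewster angle the reflected and transmitted directions are perpendicular to each other (θi + θt = 90°), which is why the numerator of the parallel amplitude vanishes.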

2.5.3 Special case: Total reflection

When light propagates from a denser to a less-denser optical media, the transmission

angle may not be real (see equation 39) because itn

nθθ sinsin

2

1= >1. In this case there

is no flow of energy into the second media, but there is still an electric field at the


other side, that decreases very rapidly along the media and propagates along the

boundary.

In order to find the field, also called an evanescent field, we refer again to Figure 5. It

is clear from Snell’s law (equation 39) that the sine function is greater than one, and

therefore also the cosine is purely imaginary:

itn

nθθ sinsin

2

1= >1 and 1sin

cos2

2

−±=n

i it

θθ (48)

where we defined 12 nnn = or it nnn = . If we look at the result of the imaginary

cosine function on the phase of the planar wave (see equation 40 and the following

one, 41) we find that:

−⋅

−−=−

1sin

expsin

exp2

2

n

zxtie i

tt

ii tθ

ν

ω

ν

θωτ

m . (49)

As we see, the first term is still a harmonic function of x, but the second term is an exponentially decreasing function along z. If we use the relation $\omega / v_t = 2\pi / \lambda_t$, we can write the exponentially decaying term for the electric field as $e^{-z/\delta}$ where:

$$\delta = \frac{\lambda_t n_t}{2\pi} \left( n_i^2 \sin^2\theta_i - n_t^2 \right)^{-1/2}. \qquad (50)$$

Now, note that $\lambda_t$ is the wavelength of the transmitted wave in the material. It is better to express it through the wavelength in vacuum ($n_1\lambda_1 = n_2\lambda_2$), and therefore $n_t\lambda_t = \lambda_0$. Also note that the intensity is proportional to the square of the field, which adds a factor of two to the denominator of δ. We finally get:

$$I = I_0\, e^{-z/d} ; \qquad d = \frac{\lambda_0}{4\pi} \left( n_i^2 \sin^2\theta_i - n_t^2 \right)^{-1/2}. \qquad (51)$$
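To get a feeling for the numbers, equation 51 can be evaluated directly. The sketch below is only an illustration: the function name `penetration_depth` is made up, and the glass/water indices (1.52 and 1.33), the 70° angle and the 500 nm wavelength are assumed example values.

```python
import math

def penetration_depth(wavelength_vacuum, n_i, n_t, theta_i):
    """Intensity penetration depth d of equation 51.
    wavelength_vacuum in meters, theta_i (angle of incidence) in radians."""
    s = (n_i * math.sin(theta_i)) ** 2 - n_t ** 2
    if s <= 0.0:
        raise ValueError("below the critical angle: there is no evanescent wave")
    return wavelength_vacuum / (4.0 * math.pi * math.sqrt(s))

# Assumed example: glass (1.52) to water (1.33), 70 degrees, 500 nm light
d = penetration_depth(500e-9, 1.52, 1.33, math.radians(70.0))
print(f"penetration depth: {d * 1e9:.1f} nm")
```

For these values the depth comes out well below the wavelength, which is why the evanescent field only probes a thin layer next to the boundary.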


3 Polarization of time-varying fields

The optical properties of dense materials are determined by the properties of the

charged particles in the materials. In the presence of an oscillating electric field, the

atoms will also oscillate and usually it is sufficient to consider only the motion of the

electrons relative to the protons as harmonic oscillators. This motion would lead to a

polarization of the material (through the electric dipole moment of each of the atoms)

which will lead to an index of refraction that depends on the frequency. This

phenomenon is known as the dispersion of matter.

The equation of motion for each atom can be written as:

$$m\ddot{\mathbf{r}} + g\dot{\mathbf{r}} + q\mathbf{r} = e\mathbf{E} \qquad (52)$$

where m is the mass of the particle, g is the 'friction' coefficient that is responsible for the energy loss, e is its charge, qr is the elastic restoring force and E is the effective field in the environment of the particle. Note that in particular cases only some of these terms exist. As an example, for an electron in a metal it is assumed that there is no elastic restoring force, which means that q and ω0 are both 0.

If the electric field oscillates with a given frequency ω, the solution for the displacement r must have a similar time dependence, and therefore $\mathbf{r} = \mathbf{r}_0 e^{-i\omega t}$.

Placing this in the equation of motion we get:

$$\mathbf{r} = \frac{e/m}{\omega_0^2 - \omega^2 - i\gamma\omega}\, \mathbf{E} \qquad (53)$$

where γ=g/m is the damping factor and $\omega_0 = \sqrt{q/m}$ is the self frequency of the molecule.

This shows that the oscillation amplitude depends on the oscillation frequency of the field, on the self frequency of the atom and on its damping factor. The imaginary part means that the displacement has a phase shift relative to the field. Important physical insight is hidden in these parameters. As an example, the damping factor is related to the absorption coefficient of the atoms in matter.

The electric dipole moment contributed by each atom is equal to p = er and the polarization is equal to P = Np, where N is the density of electrons (it can of course be any other particle). Substituting r we get for the polarization:

$$\mathbf{P} = \frac{N e^2}{m}\, \frac{1}{\omega_0^2 - \omega^2 - i\gamma\omega}\, \mathbf{E} \qquad (54)$$

In most materials there is more than just a single vibration mode, due to the complex structure of the molecules or solids. Each one of these vibration modes will also have a different strength factor (a different probability to be excited; one can think of it as an effective charge). Assuming that these modes of vibration are independent, we can write the total polarization as:

$$\mathbf{P} = \frac{N e^2}{m} \sum_k \frac{f_k}{\omega_{0k}^2 - \omega^2 - i\gamma_k\omega}\, \mathbf{E}. \qquad (55)$$

The material parameters that are introduced for each oscillation mode are $f_k$ (the oscillator strength), $\omega_{0k}$ and $\gamma_k$.

We have already seen (section 1.4) that it is common to describe the dependence of the polarization on the electric field as:

$$\mathbf{P} = N\alpha\mathbf{E}. \qquad (56)$$

α is the atomic polarizability of matter.

The Lorentzian function found above is shown in Figure 6. For weak damping (γ much smaller than ω0), the full width at half maximum (FWHM) of the imaginary part is approximately equal to γ, with γ=g/m as defined in equation 53.

Figure 6 Lorentzian function. Shown are the real (blue) and imaginary (red) components. This is calculated for γ=0.1ω0.
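The shape of the Lorentzian can be checked numerically. The following is a minimal sketch that evaluates the frequency factor of equation 54 on a grid and estimates the width of its imaginary part; the function names and the grid parameters are illustrative choices, and frequencies are measured in units of ω0.

```python
def lorentz_response(omega, omega0, gamma):
    """Frequency factor 1 / (omega0^2 - omega^2 - i*gamma*omega) of equation 54."""
    return 1.0 / complex(omega0 ** 2 - omega ** 2, -gamma * omega)

def fwhm_of_imaginary_part(omega0, gamma, span=0.5, steps=10000):
    """Estimate the full width at half maximum of the imaginary part
    by scanning a frequency grid around omega0."""
    low = omega0 * (1.0 - span)
    step = 2.0 * span * omega0 / steps
    omegas = [low + i * step for i in range(steps + 1)]
    values = [lorentz_response(w, omega0, gamma).imag for w in omegas]
    half = max(values) / 2.0
    above = [w for w, v in zip(omegas, values) if v >= half]
    return above[-1] - above[0]

omega0 = 1.0
gamma = 0.1 * omega0          # the same ratio as used for Figure 6
width = fwhm_of_imaginary_part(omega0, gamma)
print(f"FWHM / gamma = {width / gamma:.3f}")
```

With γ=g/m as defined in equation 53, the ratio printed at the end comes out close to 1 for this weak damping.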

3.1 Electromagnetic waves in dielectrics

We saw already in section 1.7 the relation between the electric field and the so-called electric displacement vector, which describes the contribution of the 'free charges' to the electric field (and the same for the magnetic flux and field).

In the following treatment, we will write the Maxwell equations based on the more fundamental vectors, namely the electric field E and the polarization P.


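Remember again the definitions we used:

$$\mathbf{D} = \varepsilon_0 \mathbf{E} + \mathbf{P}, \qquad \mathbf{B} = \mu_0 \mathbf{H} + \mathbf{M} \qquad (57)$$

P is the electric polarization and M the magnetic polarization. We also defined the dependence on the electric susceptibility $\chi_e$ and the magnetic susceptibility $\chi_m$: $\varepsilon = \varepsilon_0 (1 + \chi_e)$, $\mu = \mu_0 (1 + \chi_m)$.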

We will now use these additive relations and write the Maxwell equations in a space which is free of charges and magnetic moments, so that the only charges and magnetic moments that may arise result from the material and its properties.

We also saw before that the polarization in general leads to the appearance of electric charges (see section 1.6): $\rho = -\nabla \cdot \mathbf{P}$. If the polarization changes with time, it will also lead to currents, as we can see from the conservation of charge equation: $\partial\rho / \partial t + \nabla \cdot \mathbf{J} = 0$. This simply gives $\mathbf{J} = \dot{\mathbf{P}}$.

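The Maxwell equations can now be written:

$$\begin{aligned}
\nabla \cdot \mathbf{E} &= -\frac{1}{\varepsilon_0} \nabla \cdot \mathbf{P} &\quad (a) \\
\nabla \times \mathbf{E} &= -\dot{\mathbf{B}} &\quad (b) \\
\nabla \cdot \mathbf{B} &= 0 &\quad (c) \\
c^2\, \nabla \times \mathbf{B} &= \dot{\mathbf{E}} + \frac{1}{\varepsilon_0} \dot{\mathbf{P}} &\quad (d)
\end{aligned} \qquad (58)$$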

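Similar to the way we developed the wave equation before (chapter 2), we can solve these equations in the following way:

• Take the curl of equation 58b and use the identity $\nabla \times (\nabla \times \mathbf{A}) = \nabla (\nabla \cdot \mathbf{A}) - \nabla^2 \mathbf{A}$

• Substitute $\nabla \times \mathbf{B}$ from equation 58d

• Substitute $\nabla \cdot \mathbf{E}$ from equation 58a

This will lead to:

$$\nabla (\nabla \cdot \mathbf{E}) - \nabla^2 \mathbf{E} = -\frac{1}{c^2} \ddot{\mathbf{E}} - \frac{1}{\varepsilon_0 c^2} \ddot{\mathbf{P}}$$

and further on to:

$$\nabla^2 \mathbf{E} - \frac{1}{c^2} \ddot{\mathbf{E}} = -\frac{1}{\varepsilon_0} \nabla (\nabla \cdot \mathbf{P}) + \frac{1}{\varepsilon_0 c^2} \ddot{\mathbf{P}}. \qquad (59)$$

This equation will be solved for various cases.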

This equation will be solved for various cases.

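At first, let's assume an isotropic material and a wave that propagates along the Z direction with an electric field only along the X axis, i.e. $\mathbf{E} = \hat{x} E_x e^{-i(\omega t - kz)} = \hat{x} E_x e^{-i\omega (t - nz/c)}$, with the same time dependence as in equation 53. We substituted here $v_p = \omega / k$ and the definition of the index of refraction $n = c / v_p$.

Because of the isotropic nature of the material, as we saw in equation 54, the polarization P is parallel to E and therefore $\mathbf{P} = P \hat{x}$. We can also assume a steady state solution for P that has the same time dependence as E: $\mathbf{P} = \hat{x} P e^{-i\omega t}$. Using this information we can perform all the derivatives of the above equation.

It gives: $\nabla^2 \mathbf{E} \to -k^2 \mathbf{E}$, $\ddot{\mathbf{E}} \to -\omega^2 \mathbf{E}$, $\ddot{\mathbf{P}} \to -\omega^2 \mathbf{P}$, $\nabla \cdot \mathbf{P} = 0$ (P points along X but varies only along Z). Substitution gives:

$$\left( k^2 - \frac{\omega^2}{c^2} \right) E_x = \frac{\omega^2}{\varepsilon_0 c^2} P_x. \qquad (60)$$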

Even though we have developed this for certain directions, in isotropic materials the same solution will be found for every direction, so it can be generalized to a three-dimensional equation.

This is as far as we can get by using the Maxwell equations. In order to keep solving this equation, we must state what the dependence of P on E is. This depends on the solutions of the Schrödinger equation (or the Newtonian equations of motion) for the particles (e.g. electrons) in the matter under the influence of the electromagnetic field (beginning of chapter 3). This dependence is not provided by the Maxwell equations.

Another important comment is worth making. What are we actually trying to find? We already know that the solution must be a plane wave that propagates along a single direction. The main issue is therefore to know the dependence of the wavelength on the frequency or, in more practical terms, the dependence of the wavenumber k on the angular frequency ω. These parameters relate to the momentum and energy and therefore carry the most important physical insight. We are also interested in finding the dependence of the index of refraction n on these values. This is practical because it is relatively easy to measure n, and it therefore provides a link between the theory and the experiment. Note that $n = c / v_p = ck / \omega$, and therefore if k(ω) is known, expressing n is only an algebraic exercise. k(ω) is known as the dispersion relation of a wave or a wave-like phenomenon.

We also recall the relation found before, $n^2 = \varepsilon\mu c^2 = \varepsilon\mu / (\varepsilon_0 \mu_0)$; in a non-magnetic material the dependence on µ is eliminated.

In order to continue we must now combine the solution of the equations of motion with the Maxwell equations. This will be the subject of the next few sections. The cases that will be described are diluted gases, non-conducting dense media and conducting media (such as metals).

It is also worth remembering that the only limiting condition we used up to this point

was the assumption of an isotropic material and a linear dependence of P on E. This

assumption simplifies the continuing treatment, but it still describes the case of many

materials fairly well. The more general case will not be treated here. In this case, the

dependence of P on E is through a tensor.

3.2 Solution for diluted gases

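In diluted gases we can take the dependence of P on E that we have found before (equation 54) and use it together with equation 60.

This will give us:

$$k^2 - \frac{\omega^2}{c^2} = \frac{\omega^2}{c^2}\, \frac{N e^2}{\varepsilon_0 m}\, \frac{1}{\omega_0^2 - \omega^2 - i\gamma\omega}. \qquad (61)$$

The dispersion relation can be isolated here, giving:

$$k^2 = \frac{\omega^2}{c^2} \left[ 1 + \frac{N e^2}{\varepsilon_0 m} \cdot \frac{1}{\omega_0^2 - \omega^2 - i\gamma\omega} \right]. \qquad (62)$$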

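We can also describe the solution for the index of refraction n. Substituting $n = kc / \omega$ we get:

$$n^2 = 1 + \frac{N e^2}{\varepsilon_0 m} \cdot \frac{1}{\omega_0^2 - \omega^2 - i\gamma\omega}. \qquad (63)$$

In gases the index of refraction is very close to 1 (the speed of light is almost the same as in vacuum), so we can use the approximation $n = \sqrt{1 + \epsilon} \approx 1 + \epsilon / 2$, where ε stands here for a small number, and get:

$$n \cong 1 + \frac{N e^2}{2 \varepsilon_0 m} \cdot \frac{1}{\omega_0^2 - \omega^2 - i\gamma\omega}. \qquad (64)$$

That's it. This result is found to agree with experiments to a satisfactory degree of accuracy.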
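The quality of the approximation in equation 64 can be checked with a few lines of code. This is a minimal sketch in SI units; the function names are illustrative, and the gas parameters (number density, resonance frequency, damping and optical frequency) are assumed round numbers, not data for a specific gas.

```python
import cmath

EPS0 = 8.854e-12       # vacuum permittivity, F/m
E_CHARGE = 1.602e-19   # electron charge, C
E_MASS = 9.109e-31     # electron mass, kg

def susceptibility(omega, number_density, omega0, gamma):
    """Second term on the right-hand side of equation 63."""
    strength = number_density * E_CHARGE ** 2 / (EPS0 * E_MASS)
    return strength / complex(omega0 ** 2 - omega ** 2, -gamma * omega)

def index_exact(omega, number_density, omega0, gamma):
    """Complex index of refraction from equation 63."""
    return cmath.sqrt(1.0 + susceptibility(omega, number_density, omega0, gamma))

def index_approx(omega, number_density, omega0, gamma):
    """Approximate index of refraction from equation 64."""
    return 1.0 + 0.5 * susceptibility(omega, number_density, omega0, gamma)

# Assumed example values: a gas at roughly atmospheric density,
# a resonance in the UV and a frequency in the visible range
args = (3.5e15, 2.5e25, 2.0e16, 1.0e14)
print(index_exact(*args), index_approx(*args))
```

For these values the two results differ only far beyond the digits that matter in practice, and n−1 is of the order of 10^-4, as expected for a gas.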

3.3 Solution for dense material

In dense material, the derivation is almost the same; there is only one further complication.

If we think about the procedure we used for the derivation, it looked as follows:

1. The Maxwell equations gave us part of the solution for the electromagnetic wave. The only limiting condition we used was the assumption of isotropic matter.

2. We solved the equations of motion in order to find the dependence of the polarization on the electric field. Here we skipped a complexity that cannot be ignored in dense material. The complexity arises from the fact that we assumed that the polarization of every atom and molecule in the material arises only from the electric field that is not contributed by the matter itself. But if the polarization contributes a significant field, the local electric field that exists at every point depends also on the electric field that all the other dipoles exert on it. This should therefore be taken into account.

In other words, the solution so far assumed that the atoms are not affecting each other, an assumption that fails in condensed matter.

To take it into account, we have to distinguish the local field E′ from the electric field that would exist even without the matter in place. More rigorously:

$$\mathbf{P} = N\alpha \mathbf{E}' = N\alpha \left( \mathbf{E} + \mathbf{E}_{pol} \right) \qquad (65)$$

where Epol is the effective electric field at every point that arises from the polarization of the other atoms (or molecules) that are around the molecule that is inspected. Without developing it here (see reference [1] Vol II Chapters 10-11) the result is:

$$\mathbf{P} = N\alpha \mathbf{E}' = N\alpha \left( \mathbf{E} + \frac{\mathbf{P}}{3\varepsilon_0} \right). \qquad (66)$$

Isolating P we get:

$$\mathbf{P} = \frac{N\alpha}{1 - \dfrac{N\alpha}{3\varepsilon_0}}\, \mathbf{E}. \qquad (67)$$

Equation 60 is still valid. Substituting equation 67 into it we get:

$$k^2 - \frac{\omega^2}{c^2} = \frac{\omega^2}{c^2}\, \frac{1}{\varepsilon_0}\, \frac{N\alpha}{1 - \dfrac{N\alpha}{3\varepsilon_0}}. \qquad (68)$$

Substituting n as it is defined in equations 28 and 29 and rearranging we get:

$$n^2 = 1 + \frac{1}{\varepsilon_0}\, \frac{N\alpha}{1 - \dfrac{N\alpha}{3\varepsilon_0}}. \qquad (69)$$

Solving for Nα and substituting α from equation 54 we get:

$$3\varepsilon_0\, \frac{n^2 - 1}{n^2 + 2} = N\alpha = \frac{N e^2}{m}\, \frac{1}{\omega_0^2 - \omega^2 - i\gamma\omega}. \qquad (70)$$

This important result is known under a few different names, the Lorentz-Lorenz formula or the Clausius-Mossotti equation (well, this is only from two books...). Again it gives a reasonable agreement with experiments.
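The effect of the local field correction is easy to see numerically. The sketch below writes equations 63, 69 and 70 in terms of the dimensionless quantity x = Nα/ε0; the function names are illustrative and the index 1.5 is just an assumed example value.

```python
def n_squared_dilute(x):
    """Equation 63 written with x = N*alpha/eps0 (no local field correction)."""
    return 1.0 + x

def n_squared_dense(x):
    """Equation 69 with x = N*alpha/eps0 (local field correction included)."""
    return 1.0 + x / (1.0 - x / 3.0)

def x_from_index(n):
    """Equation 70 solved for x = N*alpha/eps0."""
    return 3.0 * (n ** 2 - 1.0) / (n ** 2 + 2.0)

# Round trip for an assumed index of refraction of 1.5
x = x_from_index(1.5)
print(x, n_squared_dense(x) ** 0.5)

# For a small x (diluted matter) both expressions nearly coincide
print(n_squared_dense(1e-3) - n_squared_dilute(1e-3))
```

The round trip returns the index we started from, and for small x the dense and diluted expressions agree, as they should when the atoms hardly affect each other.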


4 Optical properties of bulk metals

We consider a homogeneous and isotropic medium with a dielectric constant ε, permeability µ and conductivity σ.

Before we actually develop the full solution, we can solve a very simple case of electron oscillation in metals. This case does not take into account relaxation processes or polarization effects, but it still provides interesting insight into the physics.

Imagine a simple model of electrons vibrating in a metal and thereby forming a uniform charge distribution on the edges (see Figure 7).

Assuming the displacement of the electrons is dx, the surface charge density on each side is σ=Ne(dx) (with opposite signs; in this paragraph σ is the surface charge density, not the conductivity). This gives rise to an electric field E=4πσ (Gaussian units are used in this chapter), which applies a force on each electron F=-eE=-4πNe²(dx). The equation of motion of each electron will therefore be:

$$m\ddot{x} = -4\pi N e^2\, (dx) \qquad (71)$$

which has the obvious solution:

$$x = x_0 e^{i\omega_p t} \qquad (72)$$

where we have used the definition of the plasma frequency ωp:

$$\omega_p^2 = \frac{4\pi N e^2}{m}. \qquad (73)$$


Figure 7 A simplified model of electron oscillations in metal. The electron displacement creates positive and negative surface charge densities (equal but opposite).

Though simple, the solution provides some understanding of plasmon oscillations.

We will now derive the dispersion relation and the index of refraction for a bulk

material and then for a surface.

4.1 Wave propagation in bulk conductors

The treatment for a bulk conductor follows the same guidelines we used in the previous sections 3.2 and 3.3, again with a few modifications.


In a metal, there is a large density of free electrons. There are also electrons that are bound to the atoms in inner shells, but we will neglect their contribution, which is assumed to be small in comparison.

The equation of motion of the electrons in a metal is similar to what we have already developed at the beginning of chapter 3, with the solution given in equation 54. There are, however, a few specific issues related to this.

First, for a free electron there is no restoring force and the electron is in principle free to move. The disturbances to its free motion are due to collisions with the atoms in the metallic lattice. Therefore, we should set the self-oscillation frequency of the electrons to zero, ω0=0.

The collisions of the electrons can be treated by a model similar to the Drude model, which assumes simple collisions of the electrons with the atoms (see [2] chapter 1). This model assumes an average time τ between collisions. In the presence of a static electric field E, the equation of motion simply becomes:

$$m v_{drift} = F\tau = eE\tau \;\Rightarrow\; F = \frac{m v_{drift}}{\tau} \qquad (74)$$

which means that there is no net acceleration of the electron, and the electric field simply leads to a constant average velocity of the electron. Using this model we see that γ that appears in equation 54 is equal to 1/τ.

Also, based on the Drude model, we can express this value in terms of the conductivity σ. This is very useful because σ is a measurable parameter. In a static electric field we have:

$$J = \sigma E, \quad J = N e\, v_{drift} \;\Rightarrow\; \sigma = \frac{N e^2 \tau}{m}. \qquad (75)$$

Another issue to clear up is the local field that we introduced in section 3.3 when we found the dispersion relation for a dense material. The local field correction that we found (equation 66) is significant. Do we have to use a similar correction in the case of free electrons in a metal? The answer is no. The reason for this is that in a dense material the polarization in each small region of space is well defined, because the charges are confined in space within each atom or molecule. In metals, in contrast, the electrons are moving rapidly, so that at every point the electric field is on average simply the electric field in the metal, namely E (it is correct that this field varies very rapidly due to the motion of electrons in the vicinity, but this rapid change averages out to the average field E).

To summarize this long discussion for the case of a metallic structure, we have found that:

1. γ=1/τ and $\sigma = \frac{N e^2 \tau}{m}$.

2. ω0=0

3. There is no correction to the local field.


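This information can now be easily inserted into the solution of the Maxwell equations in dielectric media (equation 60, which in the Gaussian units used here reads $\left( k^2 - \frac{\omega^2}{c^2} \right) E_x = \frac{4\pi\omega^2}{c^2} P_x$) and the solution of the equation of motion for the electrons (equation 54, $\mathbf{P} = \frac{N e^2}{m} \frac{1}{\omega_0^2 - \omega^2 - i\gamma\omega} \mathbf{E}$). The result is:

$$k^2 - \frac{\omega^2}{c^2} = \frac{4\pi\omega^2}{c^2}\, \frac{N e^2}{m}\, \frac{1}{-\omega^2 - i\gamma\omega}. \qquad (76)$$

After substituting σ and τ (that is, $N e^2 / m = \sigma / \tau$ and γ=1/τ):

$$k^2 = \frac{\omega^2}{c^2} \left[ 1 - \frac{4\pi\sigma}{\omega^2 \tau + i\omega} \right] \qquad (77)$$

or

$$n^2 = 1 - \frac{4\pi\sigma}{\omega^2 \tau + i\omega}. \qquad (78)$$

The sign of the imaginary term follows from the time dependence $e^{-i\omega t}$ that was used in equation 53.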

This is it: a relatively simple solution. We have already discussed the meaning of the complex index of refraction, so we should not be too troubled by it, only careful when using the results.

It is convenient in this case to look at the approximation of this result in two special cases:

1. Low frequency where ωτ<<1 and σ/ω>>1, and

2. High frequency where ωτ>>1.

We will test these two cases.

Another fact worth noting is the relation between the complex index of refraction n and the complex dielectric constant ε. The dielectric constant is defined as the ratio between the electric displacement D and the electric field E,

$$\mathbf{D} = \varepsilon \mathbf{E}. \qquad (79)$$

By using this and equation 78 we can see that if we write the complex index of refraction as $n = n' + i n''$, where $n'$ and $n''$ are the real and imaginary parts, then $n^2 = (n' + i n'')^2 = \varepsilon = \varepsilon' + i\varepsilon''$ and therefore:

$$\varepsilon' = n'^2 - n''^2, \qquad \varepsilon'' = 2 n' n''. \qquad (80)$$

Here we have written the values of the real and imaginary components explicitly.

4.1.1 Low frequencies

In this regime, equation 78 reduces to:

$$n^2 = i\, \frac{4\pi\sigma}{\omega}. \qquad (81)$$


To take the square root, we use $\sqrt{i} = \frac{1 + i}{\sqrt{2}}$ and get:

$$n = (1 + i) \sqrt{\frac{2\pi\sigma}{\omega}}. \qquad (82)$$

This solution shows that the attenuation of the field is significant relative to the change of the speed of the wave. We can use the definition of the attenuation (equation 34, $I \propto e^{-\beta z}$) with $\beta = 2\omega n_I / c$, where $n_I$ is the imaginary part of n. It is more convenient to look at the reciprocal of β, $\delta = \beta^{-1}$, which has units of length and reflects the penetration depth. This gives for δ:

$$\delta = \frac{c}{\sqrt{8\pi\omega\sigma}}. \qquad (83)$$

If the approximation that we made is not adequate, we can always use the exact

solution.

An example for the case of copper (Cu) is shown in Figure 8. For this case, 4πσ=6.5×10^18 sec^-1, the atomic weight is 63.5 gram, the density is 8.9 grams/cm³ and τ=2.4×10^-14 sec. When the penetration depth is that small, there is little use in discussing the behaviour of the wave inside the bulk material, because it will not really get there.

Figure 8 The penetration depth calculated for copper. The exact solution (blue) and the approximate solution are shown to be similar up to the point where ωτ~1.
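The numbers quoted for copper can be reproduced with a short calculation. The following is a minimal sketch in Gaussian (cgs) units; the function names are illustrative, one free electron per copper atom is an assumption, and the exact index is written as n² = 1 − 4πσ/(ω²τ + iω), which is the form that follows from the e^{-iωt} time dependence of equation 53.

```python
import cmath
import math

AVOGADRO = 6.022e23    # 1/mol
E_CHARGE = 4.803e-10   # electron charge, esu
E_MASS = 9.109e-28     # electron mass, g
C_LIGHT = 2.998e10     # speed of light, cm/s

def four_pi_sigma(density, atomic_weight, tau, electrons_per_atom=1):
    """4*pi*sigma from equation 75, sigma = N e^2 tau / m (units: 1/sec)."""
    n_e = electrons_per_atom * density / atomic_weight * AVOGADRO  # 1/cm^3
    return 4.0 * math.pi * n_e * E_CHARGE ** 2 * tau / E_MASS

def depth_low_frequency(omega, fps):
    """Equation 83, delta = c / sqrt(8 pi omega sigma), with fps = 4*pi*sigma."""
    return C_LIGHT / math.sqrt(2.0 * fps * omega)

def depth_exact(omega, fps, tau):
    """delta = c / (2 omega n_I), with n_I taken from the full complex index."""
    n = cmath.sqrt(1.0 - fps / complex(omega ** 2 * tau, omega))
    return C_LIGHT / (2.0 * omega * abs(n.imag))

tau = 2.4e-14                             # sec, value quoted in the text
fps = four_pi_sigma(8.9, 63.5, tau)       # density and atomic weight from the text
omega = 1.0e12                            # rad/sec, so that omega*tau << 1
print(fps, depth_low_frequency(omega, fps), depth_exact(omega, fps, tau))
```

With these inputs 4πσ comes out close to the 6.5×10^18 sec^-1 quoted above, and at this frequency the approximate and exact penetration depths (in cm) differ by only about a percent.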


4.1.2 High frequencies

At high frequencies, equation 78 reduces to:

$$n^2 = 1 - \frac{4\pi\sigma}{\omega^2 \tau}. \qquad (84)$$

If we substitute the solution we have found before (equation 75) for the conductivity, $\sigma = \frac{N e^2 \tau}{m}$, and the definition of the plasma frequency (equation 73) we get:

$$n^2 = 1 - \left( \frac{\omega_p}{\omega} \right)^2. \qquad (85)$$

Substituting n=ck/ω and solving for ω(k) we get the dispersion relation, $\omega = \sqrt{(ck)^2 + \omega_p^2}$.

This dispersion curve for bulk plasmons is shown in Figure 23.

This means that the plasma frequency acts like a critical frequency. Above it, the index of refraction n is approximately real, and the wave has no attenuation. The metal therefore becomes transparent. Metals are in general transparent to x-rays and some (like Rubidium, Rb) are fairly transparent even in the UV range.
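The role of the plasma frequency as a critical frequency can be illustrated with equation 85 and the dispersion relation. This is a minimal sketch in Gaussian units; the function names are illustrative and the plasma frequency used is an assumed round number of the order expected for a good metal.

```python
import math

C_LIGHT = 2.998e10   # speed of light, cm/s

def n_squared(omega, omega_p):
    """Equation 85: n^2 = 1 - (omega_p / omega)^2."""
    return 1.0 - (omega_p / omega) ** 2

def omega_of_k(k, omega_p):
    """Bulk plasmon dispersion relation, omega = sqrt((c k)^2 + omega_p^2)."""
    return math.sqrt((C_LIGHT * k) ** 2 + omega_p ** 2)

omega_p = 1.6e16                       # rad/sec, assumed illustrative value
for omega in (0.5 * omega_p, 2.0 * omega_p):
    n2 = n_squared(omega, omega_p)
    kind = "propagating" if n2 > 0.0 else "attenuated"
    print(f"omega/omega_p = {omega / omega_p:.1f}: n^2 = {n2:.2f} ({kind})")
```

Below ωp the value of n² is negative, so n is imaginary and the wave is attenuated; above ωp it is positive and the wave propagates.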


5 Diffraction theory

Diffraction is an important phenomenon that affects every aspect of light propagation. Light can bend in different cases, and not every bending of a ray is diffraction. Actually, we can distinguish between three different types of ray bending: reflection, refraction and diffraction (see section 2.4 for the first two). Diffraction was first defined by Sommerfeld as the deviation of light from its rectilinear path which cannot be explained as reflection or refraction.

I will start with a brief historical description that is also important for the understanding of the phenomenon. For more details, see for example [5, 6].

5.1 Historical review

Around the 11th century, Abu Ali al-Hasan ibn al-Haytham (philosopher, physicist and mathematician) wrote the book Kitab al-manazir, where mirrors, lenses and light aberrations are described.

Francesco Maria Grimaldi (1618-1663) discovered the diffraction patterns of light and became convinced that light is a wave-like phenomenon. The theory was given little attention. In the measurement he performed, he noticed that the light spread beyond the geometrical limits and that the transition from light to dark is gradual.

The first explanation of the phenomenon was provided by Christiaan Huygens in 1678 in his work Treatise on Light. Huygens determined that each point on the wavefront of a wave can be considered as a new source, or a secondary source, of light, and that the wavefront at any later instant can be found by constructing the envelope of all the secondary waves. This was a major step forward in the understanding of the physical properties of light in general, and of diffraction specifically.

The progress was later stopped for a long time as a result of the rejection of the wave theory of light by Isaac Newton, who favored the corpuscular (particle) theory of light in 1704. Newton's influence led his followers to abandon the wave theory, and the previous descriptions of Huygens, for a long time.

Further progress was made by Thomas Young in his famous two-slit experiment in 1804. Young introduced the concept of interference, which strongly favors the wave theory of light. This was a radical idea, as it suggested that in some cases adding two light beams together may produce darkness.

The ideas suggested by Huygens and Young were summarized in a more detailed and mathematically adequate way by Augustin Jean Fresnel in 1818. Fresnel described the wave with phases and calculated the contribution of all the secondary sources, and presented his paper to a prize committee of the French Academy of Sciences. His thesis was first rejected by Poisson (one of the members of the committee), who showed that Fresnel's theory predicts the appearance of a bright spot at the center of the shadow that is projected by an opaque disk. F. Arago, who chaired the committee, ordered the experiment to be performed, and the bright spot was actually found. Fresnel was awarded the prize, and the wave theory received much more attention from that point on.


A major step forward was the identification of light as an electromagnetic wave by Maxwell in 1860. Nevertheless, the diffraction phenomenon was described more accurately only in 1882, by Gustav Kirchhoff. Kirchhoff succeeded in showing that the secondary wave sources can be the result of the wave nature of light. In his mathematical formulation, Kirchhoff made two assumptions on the boundary values of the light that is incident on the surface of an obstacle placed along the path of a light beam. Later on, these two assumptions were shown to be inconsistent with each other by Poincaré and Sommerfeld in 1892-4. Kirchhoff's formulation of what is called the Huygens-Fresnel principle is therefore only a first approximation, even though in the majority of the cases it agrees very well with experiments.

Kirchhoff's theory was improved later by Kottler and Sommerfeld, and one of the assumptions mentioned above was eliminated. This modified theory is called the Rayleigh-Sommerfeld diffraction theory.

In both these theories (Kirchhoff and Rayleigh-Sommerfeld), light is treated as a scalar, which in reality it is not. Nevertheless, it was shown that such a treatment is valuable and accurate as long as the following requirements are met:

1. The aperture that diffracts the light is large compared to the wavelength of light,

2. The diffracted fields are observed at a distance that is large relative to the wavelength of light.

In other cases, it is necessary to use a more rigorous vector-based theory. Such calculations were performed by Sommerfeld in 1896.

As mentioned before, in those cases where the approximations described above hold, the diffraction theory leads to the famous diffraction approximations of Fresnel and Fraunhofer. The Fraunhofer approximation is widely used, as it is a good approximation in many cases and at the same time leads to simple-to-use integrals and calculations. The Fresnel approximation is broader in its nature, but leads to a mathematical formulation that is more difficult to handle. See reference [5] for a detailed discussion.

Another aspect of the Kirchhoff integral theorem that is important to understand relates to the scalar approximation. In deriving the Kirchhoff integral, the only property of the wave that is being used is that it satisfies the homogeneous scalar wave equation. This means that it can be applied to any component of the real vector function of the wave. Such a treatment may become complex, and it turns out that for many calculations and applications in optics it is enough to use the so-called scalar approximation. In this approximation the wave function is taken as a single scalar, without treating each polarization component separately, see [5] section 8.4.

Figure 9 Scientists that contributed to the discovery and understanding of diffraction: Abu Ali al-Hasan ibn al-Haytham, Francesco Maria Grimaldi, Christiaan Huygens, Isaac Newton, Thomas Young, Augustin Jean Fresnel, James Clerk Maxwell and Gustav Kirchhoff.

5.2 The Green function

The main importance of the Green function theorem is in its usage more than in its

abstract form. In general, in some cases it simplifies the problem of finding the values

of a function in a given volume to the problem of finding an auxiliary function and

then performing calculations that involve the values of the desired function and its

derivative on the boundary of a convenient surface that includes that volume.

The Gauss theorem for a vector function A(r) is:

$$\int_V \nabla \cdot \mathbf{A}(\mathbf{r})\, dV = \int_S \mathbf{A}(\mathbf{r}) \cdot \mathbf{n}\, dS \qquad (86)$$

where S is any closed surface that surrounds a volume V and n is a unit vector that is perpendicular to the surface.

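We can define, for the general case, two functions A and B as:

$$\mathbf{A}(\mathbf{r}) = f(\mathbf{r}) \nabla g(\mathbf{r}), \qquad \mathbf{B}(\mathbf{r}) = g(\mathbf{r}) \nabla f(\mathbf{r}) \qquad (87)$$

and their divergences:

$$\nabla \cdot \mathbf{A}(\mathbf{r}) = f(\mathbf{r}) \nabla^2 g(\mathbf{r}) + \nabla f(\mathbf{r}) \cdot \nabla g(\mathbf{r}), \qquad \nabla \cdot \mathbf{B}(\mathbf{r}) = g(\mathbf{r}) \nabla^2 f(\mathbf{r}) + \nabla g(\mathbf{r}) \cdot \nabla f(\mathbf{r}) \qquad (88)$$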

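We now apply the Gauss theorem to A and B, using the definitions of A and B that are given above (we save space by omitting the dependence on r):

$$\int_V \left( f \nabla^2 g + \nabla f \cdot \nabla g \right) dV = \int_S f \nabla g \cdot \mathbf{n}\, dS, \qquad \int_V \left( g \nabla^2 f + \nabla g \cdot \nabla f \right) dV = \int_S g \nabla f \cdot \mathbf{n}\, dS \qquad (89)$$

As a last step we take the difference of these two equations and get:

$$\int_V \left( f \nabla^2 g - g \nabla^2 f \right) dV = \int_S \left( f \nabla g - g \nabla f \right) \cdot \mathbf{n}\, dS \qquad (90)$$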

5.3 The Kirchhoff Integral

We first note that the wave equation which results from Maxwell's equations (equation 21) can be written, for a single frequency, in a form that is known as the Helmholtz equation:

$$\left( \nabla^2 + k^2 \right) U = 0. \qquad (91)$$

where k is the wave number.

If we now look for a solution U for the electric field in a certain volume with given boundary conditions, we can use the Green theorem given above and choose an auxiliary function that by itself obeys the Helmholtz equation. The function selected by Kirchhoff was $G(P_1) = \frac{e^{jkr_{01}}}{r_{01}}$, a spherical wave around a point P0 inside the relevant volume, where $r_{01}$ is the distance from the origin point of the spherical wave to the point P1 where G is calculated. It is easy to verify that G obeys the Helmholtz equation by itself, and so must the actual solution U that we are looking for.
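The statement that the spherical wave obeys the Helmholtz equation can also be verified numerically. The sketch below uses the fact that for a spherically symmetric function the Laplacian is (1/r) d²(rG)/dr² and approximates the second derivative by finite differences; the function names, the step size and the test values of k and r are arbitrary illustrative choices.

```python
import cmath

def green(r, k):
    """Spherical wave G = exp(jkr) / r."""
    return cmath.exp(1j * k * r) / r

def radial_laplacian(func, r, h=1e-4):
    """Laplacian of a spherically symmetric function, (1/r) d^2(r f)/dr^2,
    approximated with a central finite difference of step h."""
    u_plus = (r + h) * func(r + h)
    u_mid = r * func(r)
    u_minus = (r - h) * func(r - h)
    return (u_plus - 2.0 * u_mid + u_minus) / (h * h * r)

k = 2.0   # arbitrary wave number
r = 1.5   # arbitrary distance from P0 (away from the singular point)
residual = radial_laplacian(lambda x: green(x, k), r) + k ** 2 * green(r, k)
print(abs(residual))
```

The printed residual of (∇² + k²)G is zero up to the accuracy of the finite difference approximation, for any point away from r=0.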

We now use the Green theorem with these two functions. Noting that from the Helmholtz equation $\nabla^2 U = -k^2 U$, and the same for G, we insert this into the left side of the Green theorem (equation 90) and get for the integrand:

$$U \nabla^2 G - G \nabla^2 U = -k^2 U G + k^2 G U = 0$$

which leads to a zero contribution from the left side of the equation. We are therefore left only with:

$$\int_S \left( U \nabla G - G \nabla U \right) \cdot \mathbf{n}\, dS = 0$$

Before we continue, we should note that our selection of the spherical wave for the auxiliary function leads to a singular point at P0. To overcome that, we select the surface around the volume to include not only the outer surface S' (Figure 10) but also a small sphere S'' around the singular point, with a radius ε that we then shrink to zero in our calculation.


Figure 10 Description of the surfaces that are defined when using the Green theorem in

Kirchhoff’s integral formalism.

Note that the vector n that is perpendicular to the surface points outwards on S' but inwards (towards P0) on S''. The value of G on the surface S'' is $G = \frac{e^{jk\varepsilon}}{\varepsilon}$ and its gradient projected on n is equal to $\nabla G \cdot \mathbf{n} = \frac{e^{jk\varepsilon}}{\varepsilon} \left( \frac{1}{\varepsilon} - jk \right)$ (this is simply the radial derivative of G times -1, because the direction of the derivative is outward from the sphere while n points inward).


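We insert these values into the surface integral that we were left with and get:

$$\int_S \left( U \nabla G - G \nabla U \right) \cdot \mathbf{n}\, dS = \int_{S'} \left( U \nabla G - G \nabla U \right) \cdot \mathbf{n}\, dS' + \int_{S''} \left( U \nabla G - G \nabla U \right) \cdot \mathbf{n}\, dS'' = 0 \qquad (92)$$

Note that the integral is now calculated on the two surfaces S' and S''. In the second part of this integral, on the small sphere S'', we can substitute the values from above and calculate the limit where ε goes to zero. In this limit U and its gradient are practically constant over the sphere, so the integral can be replaced by a multiplication with the area of the sphere, $4\pi\varepsilon^2$, and it therefore gives:

$$\lim_{\varepsilon \to 0} \int_{S''} \left( U \nabla G - G \nabla U \right) \cdot \mathbf{n}\, dS'' = \lim_{\varepsilon \to 0} 4\pi\varepsilon^2 \left[ U(P_0)\, \frac{e^{jk\varepsilon}}{\varepsilon} \left( \frac{1}{\varepsilon} - jk \right) - \frac{e^{jk\varepsilon}}{\varepsilon}\, \nabla U \cdot \mathbf{n} \right] = 4\pi U(P_0).$$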

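Substituting this into equation 92 finally leads to:

$$U(P_0) = \frac{1}{4\pi} \int_{S'} \left[ \frac{e^{jkr_{01}}}{r_{01}}\, \nabla U \cdot \mathbf{n} - U\, \nabla \left( \frac{e^{jkr_{01}}}{r_{01}} \right) \cdot \mathbf{n} \right] dS'. \qquad (93)$$

This is known as the Helmholtz-Kirchhoff integral theorem. As we can see, it allows calculating the solution U at any point inside the volume by knowing only the values of U and its normal derivative on the surrounding surface S'.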

5.4 The Kirchhoff formulation of diffraction and boundary conditions

For calculating the diffraction of light through an aperture, Kirchhoff continued by constructing a boundary around a point that lies behind a screen, as shown in Figure 11. Using the integral theorem developed above (equation 93), simplifying steps are taken that finally lead to a simpler integral equation. Note that the boundary in Figure 11 consists of three regions: SL in the aperture, S1 on the back side of the screen and S2, which closes a spherical surface around P0. The development follows by showing (or requiring) that:

1. The integral over S2 vanishes.

2. The integral over S1 vanishes,

and we are left with a much simpler integral over SL.



Figure 11 The construction made by Kirchhoff for formulating the diffraction by a plane screen.

We first start by noting that for the auxiliary function G on the surface S2, $G = \frac{e^{jkR}}{R}$, the derivative part gives (for large R):

$$\nabla G \cdot \mathbf{n} = \left( jk - \frac{1}{R} \right) \frac{e^{jkR}}{R} \approx jkG$$

We can also assume that the actual solution for U behaves like a spherical wave at a very large radius R. In Kirchhoff's development this assumption is also a requirement, stating that the function U depends on R so that it vanishes at least as fast as a spherical wave solution. In such a case, the integral of equation 93 over the surface S2 vanishes, because we get:

$$\int_{S_2} \left( G \nabla U - U \nabla G \right) \cdot \mathbf{n}\, dS \approx \int_\Omega \left( G\, jkU - U\, jkG \right) R^2\, d\Omega = 0$$

We are therefore left with the integral only over the boundaries S1 and SL.

Kirchhoff's theory continues with the following assumptions:

1. The distribution of U and its derivative over the aperture SL is the same as it would be in the absence of the screen.

2. Over the screen S1 there is a shadow, and therefore U and its derivative are zero.

These are known as Kirchhoff's boundary conditions.

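The solution for U is now reduced to:

$$U(P_0) = \frac{1}{4\pi} \int_{S_L} \left[ \frac{e^{jkr_{01}}}{r_{01}}\, \nabla U \cdot \mathbf{n} - U\, \nabla \left( \frac{e^{jkr_{01}}}{r_{01}} \right) \cdot \mathbf{n} \right] dS_L. \qquad (94)$$

Even though the conditions above are not exact, when the size of the aperture and the distance at which the field is calculated are larger than a few times the wavelength of light, this solution provides good agreement with experiment.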


5.5 The Fresnel-Kirchhoff diffraction formula

Continuing the development, we now assume that the point P0 where we calculate the

field U is far enough from the aperture (relative to the wavelength) so that 011 Rk >> .

In such a case, the derivative of G at point P1 is:

( ) φφ cos1cos010101

1

0101

Rejk

Re

RjkPG

jkRjkR

≈

−=∇

where φ is the angle between the vector R01 and the perpendicular to the surface at P1.

Again, remember that k=2π/λ, and the whole treatment assumes a single wavelength.

We now continue by assuming that the aperture is illuminated only by a single spherical wave, originating at a point P2 far to the left of the aperture, with an amplitude A: $U(P_1) = A\,e^{jkR_{21}}/R_{21}$. This assumption is not limiting, because any other illumination can be calculated later as a superposition (see Figure 11).

Inserting these two into equation 94 we get:

$$U(P_0) = -\frac{jkA}{4\pi}\int_{S_L}\frac{e^{jk(R_{01}+R_{21})}}{R_{01}R_{21}}\left(\cos\varphi - \cos\theta\right)dS_L \tag{95}$$

where φ was defined before, and in the same way, θ is the angle between the vector

R21 and the perpendicular to the surface SL.

This result is the Fresnel-Kirchhoff diffraction formula. It holds for a single point source, but it can be extended to other illuminations by superposition.

5.6 The Rayleigh-Sommerfeld diffraction formulation

One of the main problems of the Kirchhoff theory is the fact that it was necessary to

assume that both U and its directional derivative are vanishing behind the screen.

Such a condition is especially problematic because it is known that for every potential

function, if such strict conditions exist, it means that the function is zero in the whole

space. It is therefore inconsistent with the solution for U.

Sommerfeld solved this problem in a different way by selecting a different Green function, more specifically, a symmetric Green function that vanishes (or whose derivative vanishes) on the screen. In such a case, it is required that only U (or its derivative) vanish behind the screen (but not both of them) and therefore the inconsistency is removed. The solution can be found relatively easily by following the same steps described above, but using the symmetric auxiliary function G.

The solution that is found is:

$$U(P_0) = -\frac{jkA}{2\pi}\int_{S_L}\frac{e^{jk(R_{01}+R_{21})}}{R_{01}R_{21}}\cos\varphi\,dS_L \tag{96}$$

There is another solution that involves the gradient of U.

This solution is very similar to that developed by Kirchhoff (the Fresnel-Kirchhoff diffraction formula, equation 95). Note that the difference is only in the factor containing the cosines of the angles. The effect of this difference on calculated results is minute. For a more detailed discussion, see reference [6] section 3.5-3.6.
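As a rough illustration of how small this difference is, the following Python sketch compares the two angular factors for the special case of a plane wave arriving along the normal to the aperture. It assumes (this is not derived in the text above) that in this case the Fresnel-Kirchhoff factor reduces to (1 + cos φ)/2 while the Rayleigh-Sommerfeld factor is cos φ. The function names are illustrative only.

```python
import math


def kirchhoff_factor(phi):
    """Angular factor of the Fresnel-Kirchhoff formula.

    Assumes illumination along the aperture normal, for which the factor
    is taken to be (1 + cos(phi)) / 2.
    """
    return 0.5 * (1.0 + math.cos(phi))


def sommerfeld_factor(phi):
    """Angular factor cos(phi) of the Rayleigh-Sommerfeld formula."""
    return math.cos(phi)


# Compare the two factors for a few observation angles (in degrees).
for deg in (0, 5, 10, 20, 40):
    phi = math.radians(deg)
    k_f = kirchhoff_factor(phi)
    s_f = sommerfeld_factor(phi)
    print(f"phi = {deg:2d} deg  Kirchhoff = {k_f:.4f}  "
          f"Sommerfeld = {s_f:.4f}  difference = {k_f - s_f:.4f}")
```

For the small angles that dominate when the observation point is far from the aperture, the two factors are nearly the same, which is consistent with the statement above.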


5.7 Physical interpretation: Rayleigh-Sommerfeld formulation

If the source of the rays at P2 is at infinity, we can write the Rayleigh-Sommerfeld

diffraction formula as:

$$U(P_0) = \frac{1}{j\lambda}\int_{S_L}U(P_1)\,\frac{e^{jkR_{01}}}{R_{01}}\cos\varphi\,dS_L \tag{97}$$

Looking at this formula, we can see that it reproduces the Huygens-Fresnel principle: the field at every point is a sum (or integral) of the contributions of all the rays that originate from a previous wavefront.

This solution expresses the field at P0 as a superposition of spherical waves that originate from the aperture SL (these are secondary waves, as the origin of the field is at P2). The following points are to be emphasized:

1. Each spherical wave at SL has an amplitude (complex in general) U(P1).

2. The amplitude of the wave is proportional to 1/λ (or to the frequency of the wave).

3. It has a phase that leads the incident wave by 90° (the factor 1/j).

4. Each of the waves has a directional factor $\cos\varphi$.

The first point is intuitive, as it reflects the contribution of all the points in the plane, each one with its own amplitude and phase. The second and third points can be related to the fact that the aperture changes the waves towards the observation point, and 1/jλ arises from the derivative of a plane wave $\exp(-2\pi j t/\lambda)$. The last factor cannot be easily explained.


6 Fresnel and Fraunhofer diffraction

The diffraction formulas developed by Kirchhoff and Sommerfeld can be used for

practical purposes to calculate the patterns of light that passes through different

objects. For some simplified cases, the mathematics becomes rather simple, and more

important, these simplified cases are of enormous importance in optics.

Two of the most well-known approximations are those developed by Fresnel and Fraunhofer (these are different). The Fraunhofer approximation is stricter but gives a very simple mathematical formulation.

Both methods are fundamental for the understanding of imaging systems, and optical

systems in general.

6.1 Fresnel diffraction

The construction that we will use is shown in Figure 12. We would like to know the

field at every point P0 in the plane x-y that results from light that travels parallel to the

Z-axis and passes through a certain mask (shown in the figure and described with the

spatial coordinates u, v).

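[Figure 12 labels: mask plane (u, v) with point P1; observation plane (x, y) with point P0; axis z; distance R01; angle Φ.]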

Figure 12 Diffraction geometry for Fresnel diffraction.

The diffraction formula that we use is the Rayleigh-Sommerfeld formula (equation 97), which, as mentioned before, is not much different from the Fresnel-Kirchhoff formula:

$$U(P_0) = \frac{1}{j\lambda}\int_{S_L}U(P_1)\,\frac{e^{jkR_{01}}}{R_{01}}\cos\varphi\,dS_L \tag{98}$$

For the case described above, note that $\cos\varphi = z/R_{01}$. To simplify the notation, we will just use r for the distance R01 from P0 to P1. This distance can also be written as:

$$r = \sqrt{z^2 + (x-u)^2 + (y-v)^2}. \tag{99}$$

We now start with the Fresnel approximation. It holds when the distance z is much

larger than the dimensions of the pattern in both xy and uv planes. In such cases:

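$$r = z\sqrt{1 + \left(\frac{x-u}{z}\right)^2 + \left(\frac{y-v}{z}\right)^2} \approx z\left[1 + 0.5\left(\frac{x-u}{z}\right)^2 + 0.5\left(\frac{y-v}{z}\right)^2\right] \tag{100}$$

where we use the first-order approximation in $\sqrt{1+\alpha} = 1 + 0.5\alpha - \frac{1}{8}\alpha^2 + \ldots$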
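To see how good this approximation is, the short Python sketch below compares the exact distance of equation 99 with the approximation of equation 100 and converts the difference into a phase error k·Δr. The wavelength (500 nm) and distance (5 cm) are the values used later in this section for Figure 13; the function names and the chosen offsets are illustrative assumptions.

```python
import math


def exact_distance(z, dx, dy):
    """Exact distance r of equation 99, with dx = x - u and dy = y - v."""
    return math.sqrt(z * z + dx * dx + dy * dy)


def fresnel_distance(z, dx, dy):
    """Fresnel (first-order) approximation of r, equation 100."""
    return z * (1.0 + 0.5 * (dx / z) ** 2 + 0.5 * (dy / z) ** 2)


wavelength = 500e-9              # metres
z = 0.05                         # metres
k = 2.0 * math.pi / wavelength   # wave number

# Lateral offsets between the mask point and the observation point.
for dx in (1e-4, 1e-3, 3e-3):
    error = fresnel_distance(z, dx, 0.0) - exact_distance(z, dx, 0.0)
    print(f"offset = {dx:.0e} m  distance error = {error:.3e} m  "
          f"phase error = {k * error:.3e} rad")
```

The distance error is tiny compared with z, which is why r can be replaced by z in the denominator, but once it is multiplied by k the phase error grows quickly with the offset, which is why the exponent needs the more careful treatment.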

Now, r appears in two places in the equation. In the denominator, it is actually possible to replace r with z because the differences are very small. In contrast, the value that appears in the exponent is more critical because a small phase change can lead to a large difference (an exponential function), and it is also multiplied by k, which is (relative to 1/r) a large number. In the exponent we therefore use our approximation, and get:

$$U(x,y) = \frac{e^{jkz}}{j\lambda z}\int_{(u,v)} U(u,v)\,\exp\left\{j\frac{k}{2z}\left[(x-u)^2 + (y-v)^2\right]\right\}du\,dv \tag{101}$$

and the integration is over the whole mask in the uv-plane.

This integral is not that complex to calculate. If the transmission function U can be separated into two independent functions (one of u and the other of v), then the whole area integral can be separated into a product of two integrals. It can be further simplified by using the definitions of the Fresnel integrals $\int\cos(x^2)\,dx$ and $\int\sin(x^2)\,dx$, but we leave it that way. To get a feeling for the solution, Figure 13 shows the calculated intensity for a one-dimensional slit on a screen located 5 cm away, assuming monochromatic light with λ=500 nm. This intensity is the squared modulus of the integral given in equation 101. Three slit sizes are used, with dimensions of 0.02, 0.18 and 0.34 cm (these are 4e-3, 3.6e-2 and 6.8e-2 relative to the screen-slit distance).
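A calculation of this kind can be sketched in a few lines of Python. The example below is not the code used for Figure 13; it is a minimal sketch that evaluates the one-dimensional form of equation 101 with a simple midpoint rule for a uniformly illuminated slit. The normalization by λz (so that an unobstructed wave gives intensity 1), the number of samples and the function name are assumptions of this sketch.

```python
import cmath
import math

WAVELENGTH = 500e-9   # metres (value quoted in the text)
DISTANCE = 0.05       # mask-to-screen distance z in metres (5 cm)


def fresnel_slit_intensity(x, slit_width, n_samples=20000):
    """Relative intensity at screen position x behind a 1D slit.

    Midpoint-rule evaluation of the one-dimensional Fresnel integral,
    divided by lambda * z so that no mask at all would give 1.
    """
    k = 2.0 * math.pi / WAVELENGTH
    du = slit_width / n_samples
    total = 0j
    for i in range(n_samples):
        u = -0.5 * slit_width + (i + 0.5) * du   # sample point in the slit
        total += cmath.exp(1j * k / (2.0 * DISTANCE) * (x - u) ** 2)
    return abs(total * du) ** 2 / (WAVELENGTH * DISTANCE)


# Intensity at the center and at the geometric edge for the three slits.
for width_cm in (0.02, 0.18, 0.34):
    width = width_cm * 1e-2
    center = fresnel_slit_intensity(0.0, width)
    edge = fresnel_slit_intensity(0.5 * width, width)
    print(f"slit = {width_cm:.2f} cm  I(center) = {center:.3f}  I(edge) = {edge:.3f}")
```

Sweeping x over a range of screen positions gives a full intensity profile for each slit size.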


Figure 13 Intensity along the x-axis for a Fresnel diffraction through a slit. Each color shows

the diffraction through a different slit size. The slit sizes are also shown. This

calculation is based on the Fresnel equation, assuming λ=500 nm light and a distance

of 5 cm from the mask to the screen.

One can also see that when the slit size (relative to the distance from the source pattern) is large, the diffraction pattern becomes more and more similar to the slit itself, while for a very small slit it becomes similar to the Fraunhofer (Fourier) diffraction discussed below.

If we use a rectangular hole in the uv plane, with the same dimensions as those given above, we get the following images.

d=0.34 cm d=0.18 cm d=0.02 cm

Figure 14 The diffraction images that result for a rectangular mask in the uv plane (see previous figure). The intensity scale is logarithmic in order to emphasize the pattern (especially for the small slit).

6.2 Fraunhofer diffraction

In the case of Fraunhofer diffraction, another approximation is made on the

diffraction equation that we have derived. It can be seen from the Fresnel formula

(equation 101). This approximation restricts even more the size of the mask relative to


the distance of the mask to the screen, z. To see that we look at the exponent factor

that appears in equation 101 and open it as:

$$e^{j\frac{k}{2z}\left[(x-u)^2+(y-v)^2\right]} = e^{j\frac{k}{2z}\left[(x^2+y^2) + (u^2+v^2) - 2(xu+yv)\right]} \tag{102}$$

Looking at the three terms, we can see that the first part, which depends only on x and y, can be taken out of the integral. The approximation is now made by neglecting the term $e^{j\frac{k}{2z}(u^2+v^2)}$. This is justified only if, for all possible u, v at which U(u,v) is not zero:

$$z \gg \frac{k\left(u^2+v^2\right)}{2} \tag{103}$$

Keep in mind that this is a relatively strict restriction. As an example, if the mask size is 1 cm, this restricts z to be at least (for λ=500 nm) on the order of 1000 meters. Nevertheless, it turns out that there is a very simple way to use a lens in front of the mask and get the Fraunhofer (Fourier) condition at the focal plane of the lens. This approximation and its application are of enormous importance.
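The order of magnitude quoted above can be checked with a few lines of Python. The sketch below simply evaluates the right-hand side of equation 103 at the edge of the mask; interpreting the "mask size" as the largest distance of a mask point from the axis is an assumption of this sketch, and the function name is illustrative.

```python
import math


def fraunhofer_bound(mask_radius, wavelength):
    """Right-hand side of equation 103, k (u^2 + v^2) / 2, at the mask edge.

    The distance z has to be much larger than the returned value (metres).
    """
    k = 2.0 * math.pi / wavelength
    return 0.5 * k * mask_radius ** 2


# Mask points up to 1 cm and up to 0.5 cm from the axis, at 500 nm.
for radius in (0.01, 0.005):
    bound = fraunhofer_bound(radius, 500e-9)
    print(f"mask radius = {radius * 100:.1f} cm  ->  z >> {bound:.0f} m")
```

Since z has to be much larger than this value, distances of hundreds to thousands of meters are indeed required without a lens.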

Using this approximation, equation 101 can be written as:

$$U(x,y) = \frac{e^{jkz}\,e^{j\frac{k}{2z}(x^2+y^2)}}{j\lambda z}\int_{(u,v)} U(u,v)\,e^{-j\frac{2\pi}{\lambda z}(xu+yv)}\,du\,dv \tag{104}$$

The terms in front of the integral only scale the intensity, and the exponential phase terms disappear when we take the squared modulus of the field (i.e. the intensity). The integral itself is exactly the definition of the Fourier transform, where the transformation is from u, v to fx, fy that are defined as:

$$f_x = \frac{x}{\lambda z}, \qquad f_y = \frac{y}{\lambda z}. \tag{105}$$

The Fraunhofer diffraction is therefore a kingdom of the Fourier domain. Due to the

simplicity of Fourier-transformed functions, the convolution theorem and the other

Fourier properties, it is easy to find the solutions of an enormous number of masks

and patterns, by knowing the Fourier-transform of a very small number of functions

(such as a delta-function, slit, sine, cosine, square, circle and a Gaussian).
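As a small illustration of this Fourier relation, the Python sketch below evaluates the integral of equation 104 numerically for a one-dimensional slit and compares it with the known Fourier transform of a slit, width·sin(π·fx·width)/(π·fx·width). The slit width, the distance z and the function names are assumptions chosen only for the example.

```python
import cmath
import math


def slit_transform_numeric(fx, width, n_samples=4000):
    """Midpoint-rule value of the integral of exp(-j 2 pi fx u) over the slit."""
    du = width / n_samples
    total = 0j
    for i in range(n_samples):
        u = -0.5 * width + (i + 0.5) * du
        total += cmath.exp(-2j * math.pi * fx * u)
    return total * du


def slit_transform_analytic(fx, width):
    """Analytic Fourier transform of a slit: width * sin(arg) / arg."""
    arg = math.pi * fx * width
    if arg == 0.0:
        return width
    return width * math.sin(arg) / arg


wavelength = 500e-9   # metres
z = 1.0               # metres (assumed; far enough for a 0.1 mm slit)
width = 1e-4          # slit width in metres (assumed)

# Spatial frequency fx = x / (lambda z), as in equation 105.
for x_mm in (0.0, 1.5, 5.0, 7.5):
    fx = (x_mm * 1e-3) / (wavelength * z)
    numeric = slit_transform_numeric(fx, width)
    analytic = slit_transform_analytic(fx, width)
    print(f"x = {x_mm:4.1f} mm  numeric = {numeric.real:+.3e}  analytic = {analytic:+.3e}")
```

The first zero of the pattern appears where fx·width = 1, that is at x = λz/width, which is 5 mm for the values assumed here.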

6.2.1 Simple Fraunhofer setup

A simple and very common setup that is used for observing the Fraunhofer diffraction

is shown in Figure 15. The first part of the system is aimed to produce collimated

light (that is a set of beams that are parallel to one another).


Figure 15 A typical 4f optical system that is used for observing the Fraunhofer diffraction. Each of the FT lenses basically ensures that the Fraunhofer approximation is exact (these lenses bring the image screen from infinity to a distance f from the lens).


7 The diffraction limit of imaging systems

Based on the diffraction patterns, we can calculate the diffraction limit of light. We will refer to a setup of an imaging system with entrance and exit apertures. It can easily be shown that these apertures are images of one another (the light of a point-source object that emits spherical waves and is collected by the entrance aperture will pass exactly through the exit aperture). Note that these apertures in general do not have to be physically located at the edges of the optical system, but for calculation purposes, we can assume that the exit aperture is actually the last lens of the imaging system. A schematic diagram is shown in Figure 16. In an even simpler way, the two lenses can be replaced by a single lens, in which case z0 and z1 should be conjugate based on geometrical optics.

[Figure 16 labels: object plane (u, v); lens aperture plane (s, t) with diameter 2a; image plane (x, y); distances Z0 and Z1.]

Figure 16 A schematic diagram of an imaging system. The entrance aperture is imaged on the

exit aperture.

We have already seen before that the image of the lens aperture itself on the image

plane is the Fourier transform of its two-dimensional function (usually a circular hole).

Let us see it first in a qualitative way. If the object itself (in the object plane) is a δ-function, it will emit a spherical wave that is collimated to a parallel beam by the first lens. This light will now travel through the second lens that will produce, as already said, the Fourier transform of the lens opening function on the image plane.

What if the object is not a δ-function but an image, say $U_o(u,v)$? In such a case what we will get in the image plane is, almost, the convolution of the object function with the Fourier transform of the hole. There is a difference that is due to the magnification of the optical system, and in some cases also the inversion of the image.

What actually happens is that the object function is Fourier-transformed twice, one time at each lens. But we know that the Fourier transform of a Fourier transform is the reflection of the function, or $U(-u,-v)$.

To see how that comes out of the diffraction equations that we have developed, we start with a δ-function light source located at (u0, v0). Based on equation 101, the field distribution right before the lens plane (we assume only a single lens) is

$$U(s,t) = \frac{e^{jkz_0}}{j\lambda z_0}\exp\left\{j\frac{k}{2z_0}\left[(s-u_0)^2+(t-v_0)^2\right]\right\} \tag{106}$$


After the lens, this function is multiplied by the factor of the lens, a factor that adds a

complex phase with a quadratic radial dependence (details not shown) as well as the

actual transmission of the lens. This contribution adds up to:

$$U'(s,t) = \frac{e^{jkz_0}}{j\lambda z_0}\exp\left\{j\frac{k}{2z_0}\left[(s-u_0)^2+(t-v_0)^2\right]\right\}P(s,t)\exp\left\{-j\frac{k}{2f}\left(s^2+t^2\right)\right\}. \tag{107}$$

Here f is the focal length of the lens. Now this field is diffracted again based on the

Fresnel diffraction (equation 101) and finally gives (after a simple arrangement of

terms):

$$U(x,y) = -\frac{e^{jk(z_0+z_1)}}{\lambda^2 z_0 z_1}\int_{(s,t)} P(s,t)\,e^{j\frac{k}{2z_0}\left[(s-u_0)^2+(t-v_0)^2\right]}\,e^{-j\frac{k}{2f}\left(s^2+t^2\right)}\,e^{j\frac{k}{2z_1}\left[(x-s)^2+(y-t)^2\right]}\,ds\,dt. \tag{108}$$

If we open the terms in the exponent by opening the quadratic parentheses and

arranging it, the phase that we get will be:

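$$\begin{aligned}\Phi ={}& \frac{jk}{2z_0}\left(u_0^2+v_0^2\right) + \frac{jk}{2z_1}\left(x^2+y^2\right) \\ &+ \frac{jk}{2}\left(\frac{1}{z_0}+\frac{1}{z_1}-\frac{1}{f}\right)\left(s^2+t^2\right) \\ &- jk\left[\left(\frac{u_0}{z_0}+\frac{x}{z_1}\right)s + \left(\frac{v_0}{z_0}+\frac{y}{z_1}\right)t\right]\end{aligned} \tag{109}$$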

It looks complicated, but it is not really, thanks to two things. First, we note that the term in the second line is zero. That is because the first parenthesis equals zero in the imaging condition of a thin lens; this is actually the 'lens equation' for an imaging condition. Second, we use the paraxial approximation, which allows the quadratic phase terms in the first line to be neglected. This limits the treatment to small distances from the optical axis relative to the distances z0 and z1 along the optical axis. It can be shown that this approximation is exact in certain cases.

We are left only with the third term (third line), which is linear in u0, v0, x and y.

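$$U(x,y) = -\frac{e^{jk(z_0+z_1)}}{\lambda^2 z_0 z_1}\int_{(s,t)} P(s,t)\exp\left\{-jk\left[\left(\frac{u_0}{z_0}+\frac{x}{z_1}\right)s+\left(\frac{v_0}{z_0}+\frac{y}{z_1}\right)t\right]\right\}ds\,dt. \tag{110}$$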

The ratio of z1 to z0 defines the magnification of the lens, and we will define the magnification factor M:

$$M = -\frac{z_1}{z_0}. \tag{111}$$

The minus sign handles the inversion of the image for a simple convex lens. Using

this definition we get:

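$$U(x,y) \approx -\frac{e^{jk(z_0+z_1)}}{\lambda^2 z_0 z_1}\int_{(s,t)} P(s,t)\exp\left\{-\frac{jk}{z_0}\left[\left(u_0-\frac{x}{M}\right)s+\left(v_0-\frac{y}{M}\right)t\right]\right\}ds\,dt. \tag{112}$$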

The phase that precedes the integral is constant, and can usually be neglected. The integration variables s and t can be changed to $s' = s/\lambda z_1$ and $t' = t/\lambda z_1$. Note that ds and dt also have to be changed accordingly. Using this, the equation will change to:

$$U(x,y) = \int_{(s',t')} P(\lambda z_1 s', \lambda z_1 t')\exp\left\{-j2\pi\left[\left(u_0-\frac{x}{M}\right)s'+\left(v_0-\frac{y}{M}\right)t'\right]\right\}ds'\,dt'. \tag{113}$$

This is exactly the Fourier transformation of the lens opening function, or more generally, of the exit pupil of the imaging system. The –M term simply shifts the center of the function to the center of the geometric image and scales it according to the magnification being used. The u0 and v0 terms disappear if the object is a δ-function at the origin.

Let's summarize what we did: in three steps (diffraction from the object to the lens, the phase added by the lens, and diffraction from the lens to the image plane) we have calculated the field that results from a δ-function object.

What happens for an object that is not a δ-function? Because the imaging process that we have found is linear, we can simply divide the object into a collection of many δ-functions, each of which contributes as above, or more rigorously, use a convolution of the original function and the transfer function developed above. It is an important function that received the name point spread function, or PSF:

$$U(x,y) = \int_{(s',t')} P(\lambda z_1 s', \lambda z_1 t')\exp\left\{-j2\pi\left[\frac{x}{M}s'+\frac{y}{M}t'\right]\right\}ds'\,dt'. \tag{114}$$

7.1 The diffraction limit of light

Based on the above derivation, we can now find the diffraction limit of a microscope (or any other optical system). The term diffraction-limited describes a system that is limited only by diffraction, which means that there are no other aberrations or imperfections that limit the performance of that system. Therefore, the diffraction-limit calculation gives the best possible value, and in reality the system can only be worse.

We now just have to take the Fourier transform that is calculated elsewhere (see equation 180) and use it here. The result will give:

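$$U(\rho) = \pi\left(\frac{a}{\lambda z_1}\right)^2\frac{2J_1\!\left(\dfrac{2\pi a\rho}{\lambda z_1 M}\right)}{\dfrac{2\pi a\rho}{\lambda z_1 M}} \tag{115}$$

where we have omitted phase constants. We also took into account the multiplicative factors that appear in the argument of the function P.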

The absolute value of this function is shown in Figure 17. The intensity is equal to the

square of this function, but if plotted, the side-peaks will be barely visible on a linear

scale. The values at which this function is zero are given in section 10.3.


Figure 17 The absolute value of the J1(x)/x function. The values of the zeros are given in section

10.3.

Let us find the value of the argument at which the function reaches its first zero. From section 10.3 we see that this happens when $x/\pi \approx 1.22$.

Therefore:

$$\frac{2\pi a\rho}{\lambda z_1 M} = 1.22\pi \quad\rightarrow\quad \rho = 0.61\,\frac{\lambda M}{a/z_1} \tag{116}$$

The value of $a/z_1$ is defined as the numerical aperture of the optical system (see Figure 16).

The parameter M simply specifies that the image is multiplied by the magnification of the system. If we assume M=1, we see that the first zero is reached when:

$$\rho = 0.61\cdot\frac{\lambda}{NA}. \tag{117}$$

This means that even when a single point is imaged with an optical system, its image has a finite size. The NA of a system is limited by the possible size of the optical elements, and is usually smaller than 1.4. A typical wavelength is 500 nm, which means that $\rho \approx 220$ nm. This also indicates the smallest separation at which two points can be distinguished from each other, but that requires setting a criterion for the brightness levels that can be distinguished by the system. The above value of 220 nm is a very good estimate.
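The numbers above follow directly from equation 117, as the short Python sketch below shows. The function name and the additional NA values are illustrative assumptions; only λ = 500 nm and NA = 1.4 are taken from the text.

```python
def first_zero_radius(wavelength, numerical_aperture):
    """Radius of the first zero of the Airy pattern, equation 117 (with M = 1)."""
    return 0.61 * wavelength / numerical_aperture


wavelength_nm = 500.0
for na in (0.25, 0.75, 1.4):
    radius = first_zero_radius(wavelength_nm, na)
    print(f"NA = {na:.2f}  ->  rho = {radius:.0f} nm")
```

For NA = 1.4 the result is close to the 220 nm quoted above, and it grows in inverse proportion to the NA for weaker objectives.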

Knowing the PSF, we can now predict the image of every object by convolving the PSF with the function of the object:

$$U_i(x,y) = U_o(u,v) * PSF = \iint U_o(u-s, v-t)\cdot PSF(s,t)\,ds\,dt, \tag{118}$$

where we ignored the effect of magnification (and image inversion).


7.2 The amplitude and optical transfer function

In light of the importance of the diffraction pattern of light, another method of

analyzing it was developed.

In the treatment so far, we have looked at the response of the imaging system to an impulse function (a δ-function), namely the point spread function (PSF). Once we know that, a more complex image is the result of the convolution of the original image with the PSF.

In this method, rather than looking at the response of the system to a δ-function, we look at the response of the system to many different spatial frequencies in the object. Therefore, we test images that look like a perfect cosine function with all possible frequencies, and we examine the response of the system to each. Knowing this response for all frequencies (along all possible directions), we can know how every image would look (because every object can be written as a linear combination of all frequencies). The response function to all the frequencies is called the optical transfer function (OTF).

As one can already guess, there is actually a Fourier-transformation relation between the PSF and the OTF, but it requires a bit of care, as will be explained.

The complexity arises from the nature of light and its coherence. Let us consider two extreme cases: 1. The light emitted from the object is spatially coherent, i.e., the time-variation of the signal from different points on the object is correlated. 2. The light emitted from the object is spatially incoherent, i.e. there is no correlation.

Remember that in all our analysis, we first calculate the field amplitude, and then find the intensity by multiplying the amplitude with its complex conjugate.

When we consider the statistical nature of light, we will note that a different treatment is needed for spatially coherent and non-coherent objects.

Because of the statistical nature of the light, we can only talk about some time-averaging of the signal. There is no problem in doing that because the amplitude of light oscillates with a typical frequency of $10^{14}$ sec$^{-1}$ while the detectors we use (including our eye) average many cycles for each readout of the signal. The question is what we should average: the amplitude or the intensity? After all, in the previous section we only calculated the response of a single point, while now we have to add the contributions of many different points from the object.

If we represent the field at a point i of the object by $A_i$ and its complex conjugate by $A_i^*$, then the contribution of this point to the image intensity is found by:

$$I_i = \left\langle\left(A_i * PSF\right)\cdot\left(A_i^* * PSF^*\right)\right\rangle, \tag{119}$$

where $I_i$ represents the image intensity as a result of a single point of the object, the $*$ between two functions denotes convolution, and the triangular brackets represent an average over a time range that is large relative to the time variation of the field.

The complete image is given by a similar equation where we have to sum over all the points:

$$I = \left\langle\left(A_1 * PSF + A_2 * PSF + \ldots + A_n * PSF\right)\cdot\left(A_1^* * PSF^* + A_2^* * PSF^* + \ldots + A_n^* * PSF^*\right)\right\rangle \tag{120}$$

I prefer to write a sum, as I think it will better clarify the process.


If we now perform all the multiplications, and the time average for each term, we get different behavior in the coherent and non-coherent cases.

The simpler case is the coherent case. Here, the amplitudes from different points i are correlated and therefore all the terms are relevant, so that we finally get:

Coherent case:

$$I = \left|A_1 * PSF + A_2 * PSF + \ldots + A_n * PSF\right|^2 = \left|A * PSF\right|^2 \tag{121}$$

In the non-coherent case, all the cross-multiplications of the type $A_i\cdot A_j^*$ disappear unless i=j. More accurately, the time averaging introduces a δ-function of position. As a result, we get

Non-coherent case:

$$I = \left|A\right|^2 * \left|PSF\right|^2 \tag{122}$$

We will soon see that it creates a different OTF.
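The difference between the two cases can be illustrated with two point sources. The Python sketch below is only an illustration: it assumes a hypothetical one-dimensional amplitude PSF of the form sin(πx)/(πx), with its first zero at x = 1, and two equal, in-phase point sources separated by exactly that distance. In the coherent case the amplitudes are added before squaring (equation 121); in the non-coherent case the intensities are added (equation 122).

```python
import math


def amplitude_psf(x):
    """Hypothetical 1D amplitude PSF, sin(pi x) / (pi x), first zero at x = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def coherent_intensity(x, positions, amplitudes):
    """Add the fields of all points, then take the squared modulus."""
    field = sum(a * amplitude_psf(x - p) for p, a in zip(positions, amplitudes))
    return abs(field) ** 2


def incoherent_intensity(x, positions, amplitudes):
    """Add the intensities of all points; cross terms average to zero."""
    return sum(abs(a) ** 2 * amplitude_psf(x - p) ** 2
               for p, a in zip(positions, amplitudes))


positions = (-0.5, 0.5)     # two points separated by the first-zero distance
amplitudes = (1.0, 1.0)     # equal, in-phase amplitudes (assumed)

for x in (-0.5, 0.0, 0.5):
    print(f"x = {x:+.1f}  coherent = {coherent_intensity(x, positions, amplitudes):.3f}  "
          f"non-coherent = {incoherent_intensity(x, positions, amplitudes):.3f}")
```

With these assumptions the non-coherent image shows a dip between the two points, while the coherent image has its maximum there, so the same PSF leads to a different ability to separate the points.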

In order to find the

7.2.1 Coherent case

In the coherent case, as we saw in equation 121, we should calculate the spatial frequency response of the system by transforming the amplitude response. The function that we get is therefore called the amplitude transfer function (ATF).

The calculation simply involves a Fourier transformation of the PSF itself (it is enough to calculate the frequency response to an impulse). The PSF itself is a Fourier transformation of the exit pupil of the optical system, and we get:

$$H(f_x,f_y) = FT\left\{\frac{A}{\lambda z_2}\iint P(s,t)\,e^{-j\frac{2\pi}{\lambda z_2}(xs+yt)}\,ds\,dt\right\}. \tag{123}$$

Here as before, P is the exit aperture function, z2 is the distance to the image plane and A is some constant amplitude of the exit aperture. The Fourier transformation scales such that $FT\left[g(ax,by)\right] = \frac{1}{|ab|}\,G\!\left(\frac{f_x}{a},\frac{f_y}{b}\right)$, where G is the Fourier transform of the function g(x,y).

We therefore get:

$$H(f_x,f_y) = C\cdot P(\lambda z_2 f_x, \lambda z_2 f_y). \tag{124}$$

Here C is a certain constant. This is actually a simple result. Apart from a scaling factor, it means that the spatial-frequency response has the same form as the exit aperture function.

As an example, if the aperture is a square hole with dimensions of 2d along each axis, $P(x,y) = \mathrm{rect}\!\left(\frac{x}{2d}\right)\cdot\mathrm{rect}\!\left(\frac{y}{2d}\right)$, then the ATF is the same function:

$$H(f_x,f_y) = \mathrm{rect}\!\left(\frac{\lambda z_2 f_x}{2d}\right)\cdot\mathrm{rect}\!\left(\frac{\lambda z_2 f_y}{2d}\right). \tag{125}$$


7.2.2 Non-coherent case

This case gives a more interesting result. We saw that in the non-coherent case, the image of an object is given by the intensity function of the object convolved with the squared modulus of the PSF (equation 122). To find the OTF, we simply have to take the Fourier transformation of this function when the object is a δ-function. As a convention, the OTF is defined as a normalized function, where the normalization factor is the response at zero frequency. The OTF is therefore:

$$H(f_x,f_y) = \frac{FT\left\{\left|\dfrac{A}{\lambda z_2}\displaystyle\iint P(s,t)\,e^{-j\frac{2\pi}{\lambda z_2}(xs+yt)}\,ds\,dt\right|^2\right\}}{\displaystyle\iint \left|P(s,t)\right|^2 ds\,dt}. \tag{126}$$

Note that the integral over the function P (the PSF) is squared inside the equation. The Fourier transformation of a product is a convolution of the separate transformations, and again a scaling factor appears in each one. We therefore get:

$$H(f_x,f_y) = \frac{P(\lambda z_2 f_x,\lambda z_2 f_y) * P(\lambda z_2 f_x,\lambda z_2 f_y)}{\displaystyle\iint \left|P(s,t)\right|^2 ds\,dt}. \tag{127}$$

More rigorously, this function is:

$$H(f_x,f_y) = \frac{\displaystyle\iint P(s,t)\cdot P(s-\lambda z_2 f_x,\, t-\lambda z_2 f_y)\,ds\,dt}{\displaystyle\iint \left|P(s,t)\right|^2 ds\,dt}. \tag{128}$$

This is an interesting result and simple to calculate. It states that the OTF depends on

a geometric overlap of the exit pupil function with itself.

Let us look at a few examples. First, let's take a square aperture with a width of 2a. This function can be written as:

$$P(s,t) = \begin{cases}1, & |s|,|t| \le a \\ 0, & \text{otherwise}\end{cases}. \tag{129}$$

Its convolution with the scaled parameters will give:

$$H(f_x,f_y) = \begin{cases}\dfrac{\left(2a-\lambda z_2|f_x|\right)\cdot\left(2a-\lambda z_2|f_y|\right)}{4a^2}, & |f_x|,|f_y| \le \dfrac{2a}{\lambda z_2} \\ 0, & \text{otherwise}\end{cases}, \tag{130}$$

which is a product of two triangle functions.

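For a circular hole with radius a, it is a fun geometric exercise to calculate the overlap, which gives:

$$H(\rho) = \begin{cases}\dfrac{2}{\pi}\left[\arccos\!\left(\dfrac{\rho}{2\rho_0}\right) - \dfrac{\rho}{2\rho_0}\sqrt{1-\left(\dfrac{\rho}{2\rho_0}\right)^2}\right], & \rho \le 2\rho_0 \\ 0, & \text{otherwise}\end{cases}, \tag{131}$$

and $\rho_0 = \dfrac{a}{\lambda z_2}$.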
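Equation 131 is easy to evaluate directly. The Python sketch below is a minimal implementation of it, with the frequency ρ and the scale ρ0 given in the same (arbitrary) units; the function name and the sampled frequencies are illustrative assumptions.

```python
import math


def otf_circular(rho, rho0):
    """Non-coherent OTF of a circular pupil, equation 131.

    rho  : radial spatial frequency
    rho0 : a / (lambda z2), in the same units as rho
    """
    x = abs(rho) / (2.0 * rho0)
    if x >= 1.0:
        return 0.0   # beyond the cut-off frequency 2 * rho0
    return (2.0 / math.pi) * (math.acos(x) - x * math.sqrt(1.0 - x * x))


# Sample the OTF from zero frequency up to the cut-off at 2 * rho0.
rho0 = 1.0
for rho in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"rho / rho0 = {rho:.1f}  ->  OTF = {otf_circular(rho, rho0):.3f}")
```

The function starts at 1 for zero frequency and decreases monotonically to zero at ρ = 2ρ0, so high spatial frequencies are transferred with progressively lower contrast.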


The one-dimensional graphs and two-dimensional OTF surfaces are shown in Figure 18.

Figure 18 The optical transfer function (OTF) for a square (left) and a circular (right) aperture in 2D and the 1D graph. In this case the original size of the functions is 30.

In all cases, we see that the largest frequency that can be observed is equal to $f_{max} = \frac{2a}{\lambda z_2}$. This means the smallest distance that can be observed is $1/f_{max}$, which is equal to $0.5\,\lambda/NA$. This is a very similar value to the diffraction limit that we derived in section 7.1.

7.3 Three Dimensional point spread function

In this section we will develop the three-dimensional point spread function of an

imaging system. We have already done that in section 7.1 for two dimensions, or for

the focal plane where the image is in focus. In this calculation based on reference [5]

chapter 8 we will find the PSF near focus. We will use the geometry described in

Figure 19.

[Figure 19 labels: axes x, y and z; aperture W; points C, Q, O and P; vectors R, s and fq; focal length f.]

Figure 19 Schematic diagram of the geometry for calculating the PSF near focus.

The calculation is based on the diffraction theory that we developed before (see equation 97). As before, we will omit along the way different constants that affect nothing but the total intensity.

What we want to calculate is the field (and intensity) at point P as a result of light that is diffracted from an aperture W. We assume a light beam that converges toward the point O (which means that this is a light beam that has passed the focusing lens). We assume that $f \gg a$ and $f \gg R$: the focal length CO is much larger than both the wavelength being used and the maximal radius in the plane of the aperture and in the image plane. We will use the vector OQ and call its direction vector (unit vector) q and its size f. We start from equation 97 but neglect the cos φ term, which under our approximation changes slowly over the whole integration range. We also assume that R01 is a constant equal to f, the focal length (R01 is actually the distance QP in our current scheme). This value also changes slowly. The exponent, on the other hand, changes rapidly, and we approximate R01 in the exponent as:

$$R_{01} = f - \mathbf{q}\cdot\mathbf{R} \tag{132}$$

The minus sign is here because of the definition of the direction of q. The function U is simply equal to 1 in the circular range, and we therefore get:

$$U(P) = \frac{1}{j\lambda}\iint_W \frac{A\,e^{jk\left(f-\mathbf{q}\cdot\mathbf{R}\right)}}{f}\,dS \tag{133}$$

We can also approximate the area element dS as $f^2 d\Omega$, again assuming that the solid angle around point P is approximately the same as around point O, and get after removing constant phases:

$$U(P) = \frac{Af}{j\lambda}\iint_W e^{-jk\,\mathbf{q}\cdot\mathbf{R}}\,d\Omega \tag{134}$$

We will use point O as the origin for all the coordinates in the system (in the aperture plane and in the image plane). We refer to the coordinates of a point Q on the wavefront by $(\xi,\eta,\chi)$ and a point P in the image plane by (x,y,z). We then use cylindrical coordinates with:

$$\begin{aligned}\xi &= a\rho\sin\theta, & x &= r\sin\varphi \\ \eta &= a\rho\cos\theta, & y &= r\cos\varphi\end{aligned}$$

and $\chi = -\sqrt{f^2-a^2\rho^2} = -f\left[1-\frac{1}{2}\frac{a^2\rho^2}{f^2}+\ldots\right]$.

The scalar product can now be written as:

$$\mathbf{q}\cdot\mathbf{R} = \frac{x\xi+y\eta+z\chi}{f} = \frac{a\rho r\cos(\theta-\varphi)}{f} - z\left[1-\frac{1}{2}\frac{a^2\rho^2}{f^2}+\ldots\right]. \tag{135}$$

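It is now convenient to change the variables in the image plane to the dimensionless:

$$u = \frac{2\pi}{\lambda}\left(\frac{a}{f}\right)^2 z \quad\text{and}\quad v = \frac{2\pi}{\lambda}\left(\frac{a}{f}\right) r. \tag{136}$$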

With these definitions, the exponent of the integral becomes:

$$k\,\mathbf{q}\cdot\mathbf{R} = v\rho\cos(\theta-\varphi) - \left(\frac{f}{a}\right)^2 u + \frac{1}{2}u\rho^2. \tag{137}$$

We use the Taylor series above only up to the second term.

We can also express the solid angle element in terms of the coordinates:

$$d\Omega = \frac{dS}{f^2} = \frac{a^2\rho\,d\rho\,d\theta}{f^2}. \tag{138}$$

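Using these definitions, we can write equation 134 as:

$$U(P) = \frac{Aa^2}{j\lambda f}\,e^{j\left(\frac{f}{a}\right)^2 u}\int_0^1\!\!\int_0^{2\pi} e^{-j\left[v\rho\cos(\theta-\varphi)+\frac{1}{2}u\rho^2\right]}\rho\,d\rho\,d\theta. \tag{139}$$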

The integral over θ simply gives the zero-order Bessel integral (equation 177) and we get:

$$U(P) = \frac{2\pi Aa^2}{j\lambda f}\,e^{j\left(\frac{f}{a}\right)^2 u}\int_0^1 J_0(v\rho)\,e^{-\frac{1}{2}ju\rho^2}\rho\,d\rho. \tag{140}$$

This is not such a complex integral, and it can be evaluated numerically. It is still valuable to develop it into a more explicit form. For that, we write explicitly the real and imaginary parts of the integral as:

$$2\int_0^1 J_0(v\rho)\,e^{-\frac{1}{2}ju\rho^2}\rho\,d\rho = C(u,v) - jS(u,v). \tag{141}$$
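A direct numerical evaluation of the integral in equation 141 can be sketched in Python as follows. This is only a minimal sketch: the Bessel function is computed from its power series (adequate for moderate arguments), the integral uses a simple midpoint rule, and the function names and the number of samples are assumptions. The squared modulus of the integral gives the intensity relative to its value at the focus.

```python
import cmath
import math


def bessel_j(n, x, terms=40):
    """Power series of the Bessel function J_n(x); fine for moderate |x|."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m * (x / 2.0) ** (2 * m + n)
                  / (math.factorial(m) * math.factorial(m + n)))
    return total


def focal_integral(u, v, n_samples=400):
    """Midpoint-rule value of 2 * int_0^1 J0(v rho) exp(-j u rho^2 / 2) rho d rho."""
    d_rho = 1.0 / n_samples
    total = 0j
    for i in range(n_samples):
        rho = (i + 0.5) * d_rho
        total += bessel_j(0, v * rho) * cmath.exp(-0.5j * u * rho * rho) * rho
    return 2.0 * total * d_rho


def relative_intensity(u, v):
    """Intensity near focus divided by the intensity at the focus (u = v = 0)."""
    return abs(focal_integral(u, v)) ** 2


# A few points in the focal plane (u = 0) and along the optical axis (v = 0).
for u, v in ((0.0, 0.0), (0.0, 2.0), (0.0, 3.8317), (2.0 * math.pi, 0.0)):
    print(f"u = {u:.3f}  v = {v:.3f}  I/I0 = {relative_intensity(u, v):.4f}")
```

In the focal plane (u = 0) the integral reduces to 2J1(v)/v, so the intensity vanishes at the first zero of J1, and along the axis (v = 0) it follows [sin(u/4)/(u/4)]².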

These integrals can be developed into a series based on the Lommel functions, which are related to the Bessel functions of the first kind. The Lommel functions [7] are defined as:

$$\begin{aligned}U_n(u,v) &= \sum_{m=0}^{\infty}(-1)^m\left(\frac{u}{v}\right)^{n+2m}J_{n+2m}(v) \\ V_n(u,v) &= \sum_{m=0}^{\infty}(-1)^m\left(\frac{v}{u}\right)^{n+2m}J_{n+2m}(v)\end{aligned}. \tag{142}$$

We now briefly look at the mathematical development.

We start with the function C(u,v) defined as:

$$C(u,v) = 2\int_0^1 J_0(v\rho)\cos\!\left(\tfrac{1}{2}u\rho^2\right)\rho\,d\rho. \tag{143}$$

By using integration by parts for the two functions (J0 and cos) and the relation:

$$\frac{d}{dx}\left[x^{n+1}J_{n+1}(x)\right] = x^{n+1}J_n(x), \tag{144}$$

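we can get for C(u,v):

$$\begin{aligned}C(u,v) &= \frac{2}{v}\int_0^1 \frac{d}{d\rho}\left[\rho J_1(v\rho)\right]\cos\!\left(\tfrac{1}{2}u\rho^2\right)d\rho \\ &= \frac{2}{v}\left[J_1(v)\cos\!\left(\tfrac{1}{2}u\right) + u\int_0^1\rho^2 J_1(v\rho)\sin\!\left(\tfrac{1}{2}u\rho^2\right)d\rho\right]\end{aligned}. \tag{145}$$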

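The process of integration by parts can now continue, and we will get a series of Bessel functions of the first kind that can be expressed with the Lommel functions:

$$\begin{aligned}C(u,v) &= \frac{\cos\left(\frac{1}{2}u\right)}{\frac{1}{2}u}\left[\frac{u}{v}J_1(v) - \left(\frac{u}{v}\right)^3 J_3(v) + \ldots\right] + \frac{\sin\left(\frac{1}{2}u\right)}{\frac{1}{2}u}\left[\left(\frac{u}{v}\right)^2 J_2(v) - \left(\frac{u}{v}\right)^4 J_4(v) + \ldots\right] \\ &= \frac{\cos\left(\frac{1}{2}u\right)}{\frac{1}{2}u}U_1(u,v) + \frac{\sin\left(\frac{1}{2}u\right)}{\frac{1}{2}u}U_2(u,v)\end{aligned}. \tag{146}$$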


Similarly, we can find for S(u,v):

$$S(u,v) = \frac{\sin\left(\frac{1}{2}u\right)}{\frac{1}{2}u}U_1(u,v) - \frac{\cos\left(\frac{1}{2}u\right)}{\frac{1}{2}u}U_2(u,v). \tag{147}$$

These functions are practical to calculate only if the sum converges, which happens

only if u/v that appears in the parentheses is smaller than 1. This requirement

physically means that the point P is lying in the geometric shadow of the beam (see

Figure 19).

For points where u/v>1, another similar development is possible, where the integration by parts is performed by taking the two functions (J and cos) in the opposite order. This time, another Bessel identity is used:

$$\frac{d}{dx}\left[x^{-n}J_n(x)\right] = -x^{-n}J_{n+1}(x)$$

together with:

$$\lim_{x\to 0}\frac{J_n(x)}{x^n} = \frac{1}{2^n\,n!}.$$

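This will finally lead to:

$$\begin{aligned}C(u,v) &= \frac{2}{u}\sin\!\left(\frac{v^2}{2u}\right) + \frac{\sin\left(\frac{1}{2}u\right)}{\frac{1}{2}u}V_0(u,v) - \frac{\cos\left(\frac{1}{2}u\right)}{\frac{1}{2}u}V_1(u,v) \\ S(u,v) &= \frac{2}{u}\cos\!\left(\frac{v^2}{2u}\right) - \frac{\cos\left(\frac{1}{2}u\right)}{\frac{1}{2}u}V_0(u,v) - \frac{\sin\left(\frac{1}{2}u\right)}{\frac{1}{2}u}V_1(u,v)\end{aligned}. \tag{148}$$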

The intensity at point P is $I(P) = |U(P)|^2$ and it can be calculated from the two sets of equations that we have developed for C and S, but each set is practically useful only in a defined range of the ratio u/v:

A. For |u/v|<1:

$$I(u,v) = \left(\frac{2}{u}\right)^2\left[U_1^2(u,v) + U_2^2(u,v)\right] I_0 . \tag{149}$$

B. For |u/v|>1:

$$I(u,v) = \left(\frac{2}{u}\right)^2\left[1 + V_0^2(u,v) + V_1^2(u,v) - 2V_0(u,v)\cos\!\left\{\tfrac{1}{2}\left(u + \frac{v^2}{u}\right)\right\} - 2V_1(u,v)\sin\!\left\{\tfrac{1}{2}\left(u + \frac{v^2}{u}\right)\right\}\right] I_0 . \tag{150}$$

where

$$I_0 = \left(\frac{\pi a^2 |A|}{\lambda f^2}\right)^2 .$$
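The two intensity formulas can be cross-checked numerically. The following is a minimal sketch, not the method used to produce the figures: it assumes the Lommel series $U_n(u,v)=\sum_s(-1)^s(u/v)^{n+2s}J_{n+2s}(v)$ and $V_n(u,v)=\sum_s(-1)^s(v/u)^{n+2s}J_{n+2s}(v)$, evaluates $J_n$ from Bessel's integral, and compares equations 149 and 150 with a direct numerical integration of C and S. All function names are hypothetical, and the case u = 0 is handled only by the direct integration.

```python
import math

def bessel_j(n, x, steps=200):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n*t - x*sin t) dt, midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def lommel(n, ratio, v, terms=20):
    """Truncated series sum_s (-1)^s ratio^(n+2s) J_(n+2s)(v).
    ratio = u/v gives U_n(u, v); ratio = v/u gives V_n(u, v)."""
    return sum((-1) ** s * ratio ** (n + 2 * s) * bessel_j(n + 2 * s, v)
               for s in range(terms))

def intensity_series(u, v):
    """I/I0 from equation 149 (|u/v| < 1) or equation 150 (|u/v| > 1); u must be nonzero."""
    if abs(u) < abs(v):
        u1, u2 = lommel(1, u / v, v), lommel(2, u / v, v)
        return (2.0 / u) ** 2 * (u1 ** 2 + u2 ** 2)
    v0, v1 = lommel(0, v / u, v), lommel(1, v / u, v)
    phase = 0.5 * (u + v * v / u)
    return (2.0 / u) ** 2 * (1.0 + v0 ** 2 + v1 ** 2
                             - 2.0 * v0 * math.cos(phase)
                             - 2.0 * v1 * math.sin(phase))

def intensity_direct(u, v, n=400):
    """I/I0 = C^2 + S^2, with C and S integrated by Simpson's rule (n must be even)."""
    h = 1.0 / n
    c = s = 0.0
    for i in range(n + 1):
        rho = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        f = 2.0 * bessel_j(0, v * rho) * rho
        c += weight * f * math.cos(0.5 * u * rho * rho)
        s += weight * f * math.sin(0.5 * u * rho * rho)
    return (c * h / 3.0) ** 2 + (s * h / 3.0) ** 2

if __name__ == "__main__":
    for u, v in [(2.0, 5.0), (5.0, 2.0)]:
        print(u, v, intensity_series(u, v), intensity_direct(u, v))
```

In the focal plane (u = 0) the direct integration reduces to the Airy pattern $(2J_1(v)/v)^2$, which gives a further check of the normalization.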

The plot of this function is shown in Figure 20. An intensity plot is shown in Figure

21. This function is plotted in real scale (nanometer) calculated for an NA=1 and

λ=500 nm. The large size of the central spot (also called the Airy spot) is obvious.

Note also that it is much broader along the optical axis (Z) than in the plane.


Figure 20 The 3D Point spread function close to the focal point.

Figure 21 An intensity plot of the 3D Point spread function close to the focal point.


8 Surface Plasmons

On an interface between a metal and a non-conducting material, it can be shown that there exist surface modes of oscillation of the electrons, usually called surface plasmons. By solving the appropriate equations for surface plasmons it can be shown that the oscillation modes diminish when moving deeper into the bulk of the metal and therefore the surface plasmons are bound to the surface.

A relevant reference, though a very laconic one, can be found in [3].

To find the solution we start with the model shown in Figure 22.

Figure 22 The schematic description of the model used for finding the surface plasmon modes. (Axes X and Z; medium 2 with wavevector k2 lies above the interface and medium 1 with wavevector k1 below it.)

The solution follows a method similar to the one that was used for finding the bulk plasmons. The main difference is that here it is necessary to write two sets of the Maxwell equations (one in each of the two media) and then to use the appropriate boundary conditions.

More specifically we will follow the following steps:

1. Write two sets of the Maxwell equations (equation 58)

2. Write the boundary conditions (equations 9-12)

3. Assume an appropriate oscillatory solution, substitute in the equations and

solve.

Not too bad but still a lot of work. We will assume that there are no free charges (or

free currents) and the materials are non-magnetic, (therefore, B=H). Maxwell

equations will therefore be:

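$$\begin{aligned}
\nabla\cdot(\varepsilon\mathbf{E}) &= 0 \\
\nabla\cdot\mathbf{H} &= 0 \\
\nabla\times\mathbf{E} &= -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} \\
\nabla\times\mathbf{H} &= \frac{\varepsilon}{c}\frac{\partial\mathbf{E}}{\partial t}
\end{aligned} \tag{151}$$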

and to simplify the steps we will start by assuming an oscillatory solution that has a

component only along one axis. This is not limiting any other solution because we


assume a full symmetry in the XY plane. We therefore set the y components of the

field to zero. The only other simplification here is the assumption that the electric

field in both media has the same frequency ω:

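$$\begin{aligned}
\mathbf{E}_2 &= \left(E_{x2},\,0,\,E_{z2}\right)e^{i\left(k_{x2}x + k_{z2}z - \omega t\right)}, \qquad z>0 \\
\mathbf{E}_1 &= \left(E_{x1},\,0,\,E_{z1}\right)e^{i\left(k_{x1}x - k_{z1}z - \omega t\right)}, \qquad z<0
\end{aligned} \tag{152}$$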

From the boundary conditions we conclude that:

1. The in-plane (perpendicular to the normal) electric field is continuous: $E_{x1} = E_{x2}$

2. The in-plane (perpendicular to the normal) magnetic field is continuous: $H_{y1} = H_{y2}$

(we will see soon that the x and z components of H are zero).

3. The condition on the electric field parallel to the normal gives: $\varepsilon_1 E_{z1} = \varepsilon_2 E_{z2}$

Because of our selection for the electric field (with no component along y), and if we use the third Maxwell equation above, it follows that:

$$\frac{\partial H_x}{\partial t} \propto \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} = 0$$

It is equal to zero because both terms on the right side are zero. The first term is zero because there is no dependence on y and the second term is zero because there is no y component to the electric field. Therefore $H_x = 0$. In a similar way we find that $H_z = 0$. The solutions for the magnetic field are therefore:

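$$\begin{aligned}
\mathbf{H}_2 &= \left(0,\,H_{y2},\,0\right)e^{i\left(k_{x2}x + k_{z2}z - \omega t\right)}, \qquad z>0 \\
\mathbf{H}_1 &= \left(0,\,H_{y1},\,0\right)e^{i\left(k_{x1}x - k_{z1}z - \omega t\right)}, \qquad z<0
\end{aligned} \tag{153}$$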

Because the boundary condition $E_{x1} = E_{x2}$ must hold at z=0 at any time and for every x, we conclude that (see 152):

$$k_{x1} = k_{x2} = k_x . \tag{154}$$

By using the fourth Maxwell equation and substituting the solution we assume for the electric and magnetic fields (equations 152, 153), it is found that (just insert the solution in the two different media):

$$\frac{\varepsilon_1}{\varepsilon_2} = -\frac{k_{z1}}{k_{z2}} . \tag{155}$$

Finally, by substituting the solutions in the third and fourth Maxwell equations above and using the identities that we have found so far, we get:

$$k_x^2 + k_{zi}^2 = \varepsilon_i\left(\frac{\omega}{c}\right)^2 . \tag{156}$$

Combining the last two results finally gives us (it takes a few lines of simple algebra):

$$k_x = \frac{\omega}{c}\sqrt{\frac{\varepsilon_1\varepsilon_2}{\varepsilon_1+\varepsilon_2}} . \tag{157}$$
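The few lines of algebra can be written out as follows. This is one possible route, assuming $\varepsilon_1 \neq \varepsilon_2$: square equation 155, insert $k_{zi}^2$ from equation 156, and solve for $k_x^2$.

```latex
\begin{aligned}
\varepsilon_1^2\,k_{z2}^2 &= \varepsilon_2^2\,k_{z1}^2
  && \text{(equation 155 squared)} \\
\varepsilon_1^2\left[\varepsilon_2\left(\frac{\omega}{c}\right)^2 - k_x^2\right]
  &= \varepsilon_2^2\left[\varepsilon_1\left(\frac{\omega}{c}\right)^2 - k_x^2\right]
  && \text{(equation 156 for } i=1,2\text{)} \\
k_x^2\left(\varepsilon_2^2 - \varepsilon_1^2\right)
  &= \left(\frac{\omega}{c}\right)^2 \varepsilon_1\varepsilon_2\left(\varepsilon_2 - \varepsilon_1\right) \\
k_x^2 &= \left(\frac{\omega}{c}\right)^2 \frac{\varepsilon_1\varepsilon_2}{\varepsilon_1 + \varepsilon_2}
\end{aligned}
```

Taking the square root gives the expression for $k_x$ stated above.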


To summarize what we have done: we wrote the Maxwell equations, assumed a solution that simplifies the treatment without limiting it, substituted this solution and got a few equalities that reduced the number of free parameters. Moreover, we also found relations between the wave-vector k, the angular frequency and the dielectric functions of the materials. The solution can therefore be summarized by:

$$\begin{aligned}
k_{zi} &= \sqrt{\varepsilon_i\left(\frac{\omega}{c}\right)^2 - k_x^2} \\
k_x &= \frac{\omega}{c}\sqrt{\frac{\varepsilon_1\varepsilon_2}{\varepsilon_1+\varepsilon_2}}
\end{aligned} \tag{158}$$

where $\mathbf{k} = \mathbf{k}_x + \mathbf{k}_z$ is the wave vector, and it is somewhat different in each one of the layers (due to the difference in the z component).

We can further develop this by using a few assumptions that are based on the knowledge of the dielectric functions.

If we assume that $\varepsilon_2$ is real (as in air or other transparent media where the imaginary component is relatively small), and if we also assume that the imaginary part of the metallic dielectric function is small relative to the real part, i.e. $\varepsilon_1'' < |\varepsilon_1'|$, then we can write the first order approximation of the above solution, which gives for the real and imaginary parts of $k_x$:

$$\begin{aligned}
\mathrm{Re}(k_x) &= \frac{\omega}{c}\sqrt{\frac{\varepsilon_1'\varepsilon_2}{\varepsilon_1'+\varepsilon_2}} \\
\mathrm{Im}(k_x) &= \frac{\omega}{c}\left(\frac{\varepsilon_1'\varepsilon_2}{\varepsilon_1'+\varepsilon_2}\right)^{3/2}\frac{\varepsilon_1''}{2\left(\varepsilon_1'\right)^2}
\end{aligned} \tag{159}$$

The imaginary part of k determines the absorption in matter and the penetration depth

or spatial extent of the plasmon excitation.
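As a rough numerical illustration, the first-order expressions of equation 159 can be compared with the exact complex square root of equation 157. This is only a sketch: the metal dielectric constant −16 + 0.5i and the 600 nm vacuum wavelength are assumed illustrative values, not data from reference [4], and the function names are hypothetical.

```python
import cmath
import math

def kx_exact(eps1, eps2, k0):
    """Equation 157 with a complex metal dielectric constant eps1."""
    return k0 * cmath.sqrt(eps1 * eps2 / (eps1 + eps2))

def kx_first_order(eps1, eps2, k0):
    """Equation 159: first-order real and imaginary parts of k_x."""
    e1r, e1i = eps1.real, eps1.imag
    base = e1r * eps2 / (e1r + eps2)  # positive when e1r < -eps2
    return complex(k0 * math.sqrt(base),
                   k0 * base ** 1.5 * e1i / (2.0 * e1r ** 2))

wavelength_nm = 600.0                 # assumed vacuum wavelength
k0 = 2.0 * math.pi / wavelength_nm    # omega / c in 1/nm
eps_metal = complex(-16.0, 0.5)       # assumed illustrative value
eps_air = 1.0

exact = kx_exact(eps_metal, eps_air, k0)
approx = kx_first_order(eps_metal, eps_air, k0)

# Distance along x over which the field amplitude falls to 1/e.
in_plane_decay_um = 1.0 / exact.imag / 1000.0

if __name__ == "__main__":
    print("exact  k_x [1/nm]:", exact)
    print("approx k_x [1/nm]:", approx)
    print("in-plane 1/e length [um]:", in_plane_decay_um)
```

For a weakly absorbing metal the two results agree closely, and the in-plane decay length comes out in the range of tens of microns.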

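Another relation that we will use is the ratio of the electric field components in the two principal directions X and Z. From the equations above (the first Maxwell equation applied to the solution 152) we can find that in medium 2:

$$\frac{E_{z2}}{E_{x2}} = -\frac{k_x}{k_{z2}} . \tag{160}$$

With the sign convention of equation 152, the same step in medium 1 gives $E_{z1}/E_{x1} = k_x/k_{z1}$.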

This relation can be used to find the electric field in the whole space.

8.1 The dispersion relations

From the solution above, it is possible to calculate the dispersion relations for the

surface plasmons. We have to remember that the dielectric constants that appear in the

solution are related to the properties of the isotropic materials in each area.

If we assume an air-metal interface then $\varepsilon_2 = 1$, and for the real part of the metal dielectric constant we use the solution that we have found before for metal (equation 78). Remember that $\varepsilon = n^2$ and the real part of this solution is:

$$\varepsilon_1' = 1 - \frac{\omega_p^2}{\omega^2 + \gamma^2} \tag{161}$$

where γ=1/τ is the damping factor defined before in section 3. We already saw that this value can be neglected at high frequencies (section 4.1.2). In fact, a typical value for metals is smaller than ~$10^{14}$ sec$^{-1}$. The typical frequencies of light are larger than ~$10^{15}$ sec$^{-1}$ and therefore, as long as we are working in the optical range of visible light (~400-1000 nm), we can neglect the damping term in the above equation. This is also the solution for the high frequencies case that we have calculated above in section 4.1.2.

Substituting the approximate equation in the real part of equation 159 leads to:

$$\mathrm{Re}(k_x) = \frac{\omega}{c}\sqrt{\frac{1 - \omega_p^2/\omega^2}{\left(1 - \omega_p^2/\omega^2\right) + 1}} \tag{162}$$

Solving this equation for ω finally leads to:

$$\omega^2 = \frac{1}{2}\left(\omega_p^2 + 2c^2k^2 - \sqrt{\omega_p^4 + 4c^4k^4}\right) . \tag{163}$$

Here we have simply used k to describe the real part of the wavevector. We can see that when k grows, ω reaches what we will now define as the surface plasmon frequency:

$$\omega_{sp} = \frac{\omega_p}{\sqrt{2}} . \tag{164}$$

Using this definition, we can write the above solution as:

$$\omega^2 = \omega_{sp}^2 + (ck)^2 - \sqrt{\omega_{sp}^4 + (ck)^4} . \tag{165}$$

Here $\omega_p$ is the plasmon frequency as defined above in equation 61.

The dispersion curve is shown in Figure 23 together with the dispersion curve of the

bulk plasmon as well as light in vacuum.
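The surface plasmon branch can be evaluated directly from equation 163. The sketch below works in normalized units (frequency in units of $\omega_p$, wave vector in units of $\omega_p/c$, an assumption matching the axes of the dispersion plot) and checks two properties stated in the text: the curve stays below the light line ω = ck and saturates at $\omega_p/\sqrt{2}$ for large k. The function names are hypothetical.

```python
import math

def surface_plasmon_omega(k):
    """Equation 163 with omega in units of omega_p and k in units of omega_p / c."""
    return math.sqrt(0.5 * (1.0 + 2.0 * k * k - math.sqrt(1.0 + 4.0 * k ** 4)))

def k_from_omega(omega):
    """Equation 162 in the same units, valid on the branch omega < 1/sqrt(2)."""
    w2 = omega * omega
    return omega * math.sqrt((w2 - 1.0) / (2.0 * w2 - 1.0))

if __name__ == "__main__":
    for k in (0.2, 0.5, 1.0, 3.0, 50.0):
        w = surface_plasmon_omega(k)
        print(f"ck/wp = {k:5.1f}   w/wp = {w:.5f}   light line = {k:.5f}")
    print("limit wp/sqrt(2) =", 1.0 / math.sqrt(2.0))
```

Substituting the computed ω back into equation 162 returns the original k, which confirms that equation 163 is the inverse of equation 162 on this branch.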

If the other material is not air and has a larger index of refraction (and therefore a larger dielectric constant), then the surface plasmon energy for large k will be smaller. It can be evaluated from equation 86 when k approaches infinity, which requires that the denominator in the left side approaches 0. The solution found is that $\omega_{sp} = \omega_p/\sqrt{1+\varepsilon_2}$, which shows that the frequency will be lower than that found in air.


Figure 23 The dispersion curve of bulk plasmon, surface plasmon and light. The units that are used are normalized to the plasmon frequency. (Plot title: Plasmon & Photon dispersion relations. Horizontal axis: wave vector [ck/ωp]. Vertical axis: plasmon frequency [ω/ωp]. Curves: bulk plasmon, photon, surface plasmon; the forbidden frequency gap is marked.)

8.2 Spatial extent of surface plasmons fields

The spatial extent of the plasmons is highly dependent on the actual values of the index of refraction and the dielectric constant (these are of course dependent on one another as $\varepsilon = n^2$).

These two functions for silver are shown in Figure 24. The data is from reference [4].

As one can see, the imaginary part is dominant in the index of refraction and the real

part is dominant in the dielectric constant. Also, the real part of the dielectric constant

is negative and for first-order approximations it is enough to take it as such (real and

negative).

Figure 24 The index of refraction (left) and its dielectric constant (right) for silver (Ag). (Left panel: N = n + ik, with the real part n and the imaginary part k plotted against wavelength [nm].)

8.2.1 Perpendicular to the surface plane – Z


We have seen in the previous section that the surface plasmon will have an oscillatory solution in the plane (dominant real $k_x$). The extent of the oscillation along the z axis can be found by using the solution from equation 158: $k_{zi} = \sqrt{\varepsilon_i(\omega/c)^2 - k_x^2}$. Here we can see that $k_{zi}$ will have a dominant imaginary part, both because $k_x > \omega/c$ and because in the metal $\varepsilon_i$ has a negative real part (whose absolute value is usually larger than the imaginary part). This means that the fields along the z axis will always decrease exponentially. Such fields are sometimes called evanescent fields. The exponential decrease of the fields along z will look like $E_z \propto e^{-|k_{zi}|\cdot|z|}$ and is obviously directed normal to the surface.

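If we therefore refer only to the real parts of both $k_x$ and $\varepsilon_i$, and substitute the solution for $k_x$ (equation 159), we get:

$$k_{zi} = \frac{\omega}{c}\sqrt{\varepsilon_i' - \frac{\varepsilon_1'\cdot\varepsilon_2}{\varepsilon_1'+\varepsilon_2}} \tag{166}$$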

Note that the penetration depth is different in the metal and in the other medium. For an interface of Silver and air and an in-plane wavelength of 600 nm, the penetration depth is 343 nm in the air and 25 nm in the metal. An example of the decrease of the field strength as a function of the distance from the interface is shown in Figure 25.

We define the penetration depth as the value in which the field strength falls to 1/e.

Figure 26 shows the dependence of the penetration depth on the in-plane wavelength

of the field. As one can see, the penetration depth in air increases with the wavelength

while that in the metal does not change a lot.
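The order of magnitude of these penetration depths can be reproduced from equation 166. This is a sketch under stated assumptions: the real part of the silver dielectric constant is taken as −16 (an illustrative value, not the tabulated data of reference [4]) and 600 nm is treated as the vacuum wavelength, so the numbers are only expected to be close to the ones quoted in the text. The function name is hypothetical.

```python
import math

def penetration_depth_nm(eps_i, eps1_real, eps2, wavelength_nm):
    """1/e decay length of the field along z in the medium with dielectric constant eps_i.

    Uses equation 166 with real dielectric constants; k_zi must be purely imaginary."""
    k0 = 2.0 * math.pi / wavelength_nm                         # omega / c in 1/nm
    kz_sq = eps_i - eps1_real * eps2 / (eps1_real + eps2)      # k_zi^2 in units of k0^2
    if kz_sq >= 0.0:
        raise ValueError("k_zi is real: the field is not evanescent in this medium")
    return 1.0 / (k0 * math.sqrt(-kz_sq))

eps_silver = -16.0   # assumed illustrative value near 600 nm
eps_air = 1.0
depth_air = penetration_depth_nm(eps_air, eps_silver, eps_air, 600.0)
depth_metal = penetration_depth_nm(eps_silver, eps_silver, eps_air, 600.0)

if __name__ == "__main__":
    print("air   [nm]:", depth_air)
    print("metal [nm]:", depth_metal)
```

With these assumed inputs the depth in air is more than ten times the depth in the metal, in line with the ratio described for the Air-Ag interface.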

Figure 25 The penetration depth of the field perpendicular to the surface in an Air-Ag interface calculated for an in-plane wavelength of 600 nm. The vertical axis shows the distance from the interface (Z [nm]) and the horizontal axis is the field strength (Ez [arb. units]). The penetration depth is approximately 10 times larger in Air than in the metal.


Figure 26 The penetration depth of the field perpendicular to the surface in an Air-Ag interface calculated for an in-plane wavelength range of 400-1400 nm. The vertical axes show the penetration depth in nm (Left: Air, Right: Silver) and the horizontal axis is the in-plane wavelength in nm. The penetration depth in air increases with wavelength while that in metal does not change a lot.

8.2.2 In plane XY

Even though we saw before that the surface plasmon has mainly an oscillatory

solution in the plane, it still has a decay term (an imaginary k) as well. We have

already seen the solution in Equation 159. This solution is plotted in Figure 27 for the

case of Silver-Air. As clearly observed, the in-plane plasmon spans a large area (note

that the penetration depth here is in microns).

Figure 27 The penetration depth of the in-plane field as a function of wavelength [nm]. Note that the scale here is in µm. The range of the plasmons in the plane is therefore large and macroscopic.

8.3 The surface plasmon electric field

After solving for the plasmon dispersion curves, we know the solution of the wave vectors k for each energy level (or vice versa). These wave vectors can now be substituted back into the electric field solution that we started with in equation 152 to find the actual field. We should also take into account the ratio of the field components that we have found in equation 160. The calculated solutions for the components of the electric field (both Ex and Ez) are shown in Figure 28. Another way to represent it is through a vectorial representation as shown in Figure 29.

Figure 28 The electric field around the surface in the X-Z plane calculated for a Silver/Air interface. The calculation is for a wavelength (in vacuum) of 600 nm. Top: Ex, Bottom: Ez. Note that the field extends much further into the air than into the metal. The color bar shows the strength of the electric field. In the bottom image (Ez, the Z component of the electric field) positive values (red) mean that the field points outward from the interface and negative values (blue) mean that the field points inward, towards the surface. Alternating groups of + and - signs are drawn along the interface. In the top image (Ex) positive values mean that the field points in the positive X-direction and negative values mean the opposite.

Figure 29 Vectorial representation of the electric field. The direction points parallel to the

electric field direction and the size of the vector is proportional to the strength of

the electric field.


9 Surface plasmons in metallic hole arrays

TO ADD


10 Fourier transforms of special functions

10.1 Coulomb (& Screened) potential

The first step is to find the Fourier transformation of a point charge potential 1/r (we neglect here the actual charge value $e^2$ that should be used to find the electron-electron potential). It is actually better to start with the screened potential:

$$V(\mathbf{r}) = \frac{e^{-\alpha r}}{r} \tag{167}$$

for two reasons. First, we will see during the mathematical solution that there is an integration step that is simplified by using this potential, and it is always possible to find the limiting case where $\alpha \to 0$. The second reason is that this potential is the screened Coulomb potential, which is interesting by itself.

The Fourier transformation of the Coulomb potential provides us with the wavevector

components that are of importance especially in matter where they should be

quantized to fulfill also the boundary conditions of the electrons.

The Fourier transformation of the potential is:

$$V(\mathbf{k}) = \iiint_{\mathbf{r}} \frac{e^{-\alpha r}}{r}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{r} \tag{168}$$

where the integration is performed over the whole space. (We neglect here the factor $1/(2\pi)^3$ that is used for the integral normalization; it can be added at the end.) It is better to work in spherical coordinates due to the symmetry, and we note that in spherical coordinates $d\mathbf{r} = r^2\sin\varphi\,d\varphi\,d\theta\,dr$, see Figure 30.

Figure 30 The spherical coordinate system (axes x, y, z; radius r, polar angle φ, azimuthal angle θ). For simplification, we take the k vector to lie parallel to the z axis.

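The integral now reads:

$$V(\mathbf{k}) = \iiint \frac{e^{-\alpha r}}{r}\,e^{ikr\cos\varphi}\,r^2\sin\varphi\,d\varphi\,d\theta\,dr = \iiint r\,e^{-\alpha r}\,e^{ikr\cos\varphi}\,\sin\varphi\,d\varphi\,d\theta\,dr \tag{169}$$

The integral over θ is easily performed from 0 to 2π. For the integral over φ, we change the integration variable to $x = \cos\varphi$ and then $dx = -\sin\varphi\,d\varphi$. φ is integrated in the range 0 to π, so x runs from 1 to -1; the minus sign in dx reverses the limits, and x is therefore integrated in the range -1 to 1. This gives: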


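$$\begin{aligned}
V(\mathbf{k}) &= 2\pi\int_0^\infty r\,e^{-\alpha r}\,dr\int_{-1}^{1} e^{ikrx}\,dx
 = 2\pi\int_0^\infty r\,e^{-\alpha r}\left[\frac{e^{ikrx}}{ikr}\right]_{x=-1}^{1}dr
 = 2\pi\int_0^\infty e^{-\alpha r}\,\frac{2i\sin(kr)}{ik}\,dr \\
 &= \frac{4\pi}{k}\int_0^\infty e^{-\alpha r}\sin(kr)\,dr
 = \frac{4\pi}{k}\left[\frac{e^{-\alpha r}\left(-\alpha\sin(kr) - k\cos(kr)\right)}{\alpha^2+k^2}\right]_0^\infty
 = \frac{4\pi}{\alpha^2+k^2} .
\end{aligned} \tag{170}$$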

We have used the integral formula $\int e^{ax}\sin(bx)\,dx = \dfrac{e^{ax}\left(a\sin bx - b\cos bx\right)}{a^2+b^2}$.
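The last step of the derivation can be checked numerically. The sketch below integrates the remaining radial integral $(4\pi/k)\int_0^\infty e^{-\alpha r}\sin(kr)\,dr$ with Simpson's rule, truncating the upper limit where the exponential is negligible, and compares it with the closed form $4\pi/(\alpha^2+k^2)$. The function names and the truncation at r = 40/α are assumptions made for this illustration.

```python
import math

def screened_coulomb_ft_numeric(alpha, k, cutoff_factor=40.0, n=20000):
    """(4*pi/k) * int_0^R exp(-alpha*r) * sin(k*r) dr with R = cutoff_factor / alpha.

    Simpson's rule with n (even) intervals; requires alpha > 0 and k != 0."""
    r_max = cutoff_factor / alpha
    h = r_max / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.exp(-alpha * r) * math.sin(k * r)
    return 4.0 * math.pi / k * total * h / 3.0

def screened_coulomb_ft_closed(alpha, k):
    """Closed-form result 4*pi / (alpha^2 + k^2)."""
    return 4.0 * math.pi / (alpha ** 2 + k ** 2)

if __name__ == "__main__":
    for alpha, k in [(1.0, 2.0), (0.5, 1.0), (2.0, 0.3)]:
        print(alpha, k,
              screened_coulomb_ft_numeric(alpha, k),
              screened_coulomb_ft_closed(alpha, k))
```

Letting α become small in the closed form shows the approach to the unscreened result 4π/k², while the numerical integral needs an ever larger cutoff, which is the practical reason for starting from the screened potential.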

The non-screened electron-electron Coulomb potential can therefore be written as:

$$V(\mathbf{r}) = e^2\int_{\mathbf{k}} \frac{4\pi}{k^2}\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k} . \tag{171}$$

10.2 Gaussian function

$$FT\left[e^{-ax^2}\right](k) = \sqrt{\frac{\pi}{a}}\cdot e^{-\pi^2k^2/a} . \tag{172}$$
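The Gaussian transform pair can be verified numerically. The sketch assumes the transform convention $FT[g](k)=\int g(x)\,e^{-2\pi i k x}\,dx$ (the same $2\pi$ convention as in the two-dimensional definition used for the circular hole); since the Gaussian is even, only the cosine part contributes. Function names are hypothetical.

```python
import math

def gaussian_ft_numeric(a, k, x_max=10.0, n=4000):
    """Trapezoid-rule estimate of int exp(-a*x^2) * cos(2*pi*k*x) dx over [-x_max, x_max]."""
    h = 2.0 * x_max / n
    total = 0.0
    for i in range(n + 1):
        x = -x_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(-a * x * x) * math.cos(2.0 * math.pi * k * x)
    return total * h

def gaussian_ft_closed(a, k):
    """Closed form sqrt(pi/a) * exp(-pi^2 * k^2 / a)."""
    return math.sqrt(math.pi / a) * math.exp(-math.pi ** 2 * k ** 2 / a)

if __name__ == "__main__":
    for a, k in [(2.0, 0.5), (1.0, 0.0), (0.5, 0.2)]:
        print(a, k, gaussian_ft_numeric(a, k), gaussian_ft_closed(a, k))
```

A narrow Gaussian in x (large a) gives a wide Gaussian in k and vice versa, which is visible directly in the exponent of the closed form.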

10.3 Circular hole

The two-dimensional Fourier transformation in cartesian coordinates is defined as:

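$$FT\left[g(x,y)\right] = \iint_{x,y} g(x,y)\,e^{-j2\pi\left(f_x x + f_y y\right)}\,dx\,dy \tag{173}$$

In cylindrical (polar) coordinates (r, φ), the integration unit is changed to $dx\,dy = r\,dr\,d\varphi$. x, y, $f_x$ and $f_y$ are changed to $x = r\cos\varphi$, $y = r\sin\varphi$, $f_x = \rho\cos\theta$, $f_y = \rho\sin\theta$ and the Fourier integral is changed to:

$$FT\left[g(r,\varphi)\right] = \iint_{r,\varphi} g(r,\varphi)\,e^{-j2\pi r\rho\left(\cos\theta\cos\varphi + \sin\theta\sin\varphi\right)}\,r\,dr\,d\varphi \tag{174}$$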

The circular hole function is defined as:

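$$\mathrm{circ}\!\left(\frac{r}{a},\varphi\right) = \begin{cases} 1, & r/a \le 1 \\ 0, & r/a > 1 \end{cases} \tag{175}$$

where the factor a defines the radius of the hole. Its Fourier transform is therefore:

$$FT\left[\mathrm{circ}\!\left(\frac{r}{a},\varphi\right)\right] = \iint_{r/a\le 1,\;0\le\varphi<2\pi} e^{-j2\pi r\rho\left(\cos\theta\cos\varphi + \sin\theta\sin\varphi\right)}\,r\,dr\,d\varphi \tag{176}$$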

The sum in the exponent can also be written as $\cos(\theta-\varphi)$ and we now use one of the fundamental Bessel identities:

$$J_0(x) = \frac{1}{\pi}\int_0^{\pi} e^{-jx\cos\varphi}\,d\varphi \tag{177}$$

In our case the integration is over 2π, but the integrand is symmetric around φ = π, so this simply multiplies the result by a factor of 2. Also, because the integration covers a full period, the constant offset θ in the cosine does not change the result (for the same reason the sine and cosine functions are interchangeable in the exponent), and we can therefore substitute for the integration over φ:


$$FT\left[\mathrm{circ}\!\left(\frac{r}{a},\varphi\right)\right] = 2\pi\int_{r=0}^{a} r\,J_0(2\pi r\rho)\,dr \tag{178}$$

See below more on the nature of the Bessel functions.

We now use another Bessel function identity:

$$xJ_1(x) = \int_0^x y\,J_0(y)\,dy \tag{179}$$

and get:

$$FT\left[\mathrm{circ}\!\left(\frac{r}{a},\varphi\right)\right] = \pi a^2\,\frac{2J_1(2\pi a\rho)}{2\pi a\rho} \tag{180}$$

This function is well known and behaves in a similar way to the sinc function.

Zeros of the function $J_1(x)/x$:

3.83170597020751, x/π = 1.21966989126650

7.01558666981562, x/π = 2.23313059438153

10.17346813506272, x/π = 3.23831548416624
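Both the closed-form transform and the listed zeros can be reproduced with a short numerical sketch. It is an illustration only: $J_n$ is evaluated from Bessel's integral, the radial integral of equation 178 is done with Simpson's rule, and the zeros are located by bisection inside brackets chosen by hand. All function names are hypothetical.

```python
import math

def bessel_j(n, x, steps=200):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n*t - x*sin t) dt, midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def circ_ft_numeric(a, rho, n=400):
    """Equation 178: 2*pi * int_0^a r * J0(2*pi*r*rho) dr by Simpson's rule (n even)."""
    h = a / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * r * bessel_j(0, 2.0 * math.pi * r * rho)
    return 2.0 * math.pi * total * h / 3.0

def circ_ft_closed(a, rho):
    """Equation 180: pi*a^2 * 2*J1(2*pi*a*rho) / (2*pi*a*rho), for rho != 0."""
    x = 2.0 * math.pi * a * rho
    return math.pi * a * a * 2.0 * bessel_j(1, x) / x

def bisect_zero(f, lo, hi, tol=1e-12):
    """Bisection for a sign change of f inside [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hand-picked brackets around the first three zeros of J1(x) for x > 0.
zeros = [bisect_zero(lambda x: bessel_j(1, x), lo, hi)
         for lo, hi in [(3.0, 4.5), (6.0, 8.0), (9.5, 11.0)]]

if __name__ == "__main__":
    print(circ_ft_numeric(1.0, 0.7), circ_ft_closed(1.0, 0.7))
    for z in zeros:
        print(z, z / math.pi)
```

The first zero, divided by π, gives the familiar factor 1.22 of the Airy disk radius.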


11 Spherical coordinate system

In spherical coordinate system x, y and z depend on θ, φ and r in the following way:

$$\begin{aligned}
x &= r\sin\theta\cos\varphi \\
y &= r\sin\theta\sin\varphi \\
z &= r\cos\theta
\end{aligned}$$

where θ is bound between 0 and π, φ is bound between 0 and 2π and r is positive or

zero.

The Laplacian in spherical coordinates is:

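$$\nabla^2\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} .$$

A useful relation is:

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) = \frac{1}{r}\frac{\partial^2}{\partial r^2}\left(r\psi\right) = \frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r}$$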


Appendix I: Units

In the CGS unit system (centimeter, gram, second), charge is measured in electrostatic units or simply esu. In esu units, the charge of an electron is $e = 4.802\times10^{-10}$ esu. The table below shows a few other units that are used.

Unit                 Symbol    MKS                        CGS
Length               Various   Meter (m)                  Centimeter (cm)
Mass                 m         Kilogram (kg)              Gram (g)
Time                 t         Second (s)                 Second (s)
Charge               q         Coulomb (C)                esu
Current              i         Ampere (A)                 esu × sec^-1
Electric field       E         V × m^-1                   statvolt × cm^-1
Capacitance          C         Farad (F)                  Centimeter
Electric potential   V         Volt (V)                   Statvolt (1 statvolt = 299.792458 Volt)
Energy, work         E, W      Joule (J)                  erg
Force                F         Newton (N)                 Dyne
Temperature          T         Kelvin (K)                 Kelvin (K)
Hbar                 ħ         1.0546×10^-34 Joule × sec  1.0546×10^-27 erg × sec
Inductance           L         Henry (H)                  cm^-1 s^2
Magnetic field       B         Tesla (T)                  Gauss (G)
Magnetic flux        Φ         Weber (Wb)                 Gauss cm^2
Momentum             p         kg m s^-1                  g cm s^-1
Power                P         Watt (W)                   erg s^-1
Pressure             P         Pascal (Pa)                barye (dyn cm^-2)
Resistance           R         Ohm (Ω)                    cm^-1 s


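For comparing the differences in units, following are the Maxwell equations in both CGS and MKS units:

$$\begin{array}{ll}
\text{CGS} & \text{MKS} \\[4pt]
\nabla\cdot\mathbf{E} = 4\pi\rho & \nabla\cdot\mathbf{E} = \rho/\varepsilon_0 \\[4pt]
\nabla\cdot\mathbf{B} = 0 & \nabla\cdot\mathbf{B} = 0 \\[4pt]
\nabla\times\mathbf{E} + \dfrac{1}{c}\dot{\mathbf{B}} = 0 & \nabla\times\mathbf{E} + \dot{\mathbf{B}} = 0 \\[4pt]
\nabla\times\mathbf{H} - \dfrac{1}{c}\dot{\mathbf{D}} = \dfrac{4\pi}{c}\mathbf{J} & \nabla\times\mathbf{H} - \dot{\mathbf{D}} = \mathbf{J}
\end{array}$$

In MKS units:

$\varepsilon_0 = \dfrac{1}{\mu_0 c^2} = 8.8542\times10^{-12}\ \mathrm{C^2\cdot N^{-1}\cdot m^{-2}}$ and $\mu_0 = 4\pi\times10^{-7}\ \mathrm{Wb\cdot A^{-1}\cdot m^{-1}}$ (or $\mathrm{N\cdot A^{-2}}$)

$c = 2.9979\times10^{8}\ \mathrm{m/s}$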
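The relation between the three constants can be checked with a few lines of arithmetic. This is only a consistency check of the quoted values; the variable names are chosen for this illustration.

```python
import math

mu_0 = 4.0 * math.pi * 1e-7   # Wb A^-1 m^-1, value quoted above
c = 2.9979e8                  # m/s, value quoted above

# epsilon_0 = 1 / (mu_0 * c^2), in C^2 N^-1 m^-2
epsilon_0 = 1.0 / (mu_0 * c ** 2)

if __name__ == "__main__":
    print(f"epsilon_0 = {epsilon_0:.4e} C^2 N^-1 m^-2")
```

The computed value agrees with the quoted 8.8542×10^-12 to the number of digits given for c.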


References

1. Feynman, R.P., R.B. Leighton, and M. Sands, The Feynman lectures on

physics. 6 ed. 1964, California: Addison-Wesley Publishing Company.

2. Ashcroft, N.W. and N.D. Mermin, Solid State Physics. 1976, Philadelphia:

Saunders College.

3. Raether, H., Surface Plasmons on smooth and rough surfaces and on gratings.

Springer Tracts in Modern Physics, ed. G. Hohler and E.A. Niekisch. Vol. 111.

1988, Berlin: Springer-Verlag.

4. Palik, E.D., Handbook of Optical Constants of solids. Academic Press

handbook series. 1985, Orlando: Academic Press.

5. Born, M. and E. Wolf, Principles of Optics. 7 ed. 1999, Cambridge:

Cambridge University Press.

6. Goodman, J.W., Introduction to Fourier Optics. Second ed. McGraw-Hill

series in Electrical and Computer Engineering, ed. S.W. Director. 1996, New

York: McGraw-Hill.

7. Lommel, E.C.J.v., Die Beugungserscheinungen einer kreisrunden Oeffnung

und eines kreisrunden Schirmchens theoretisch und experimentell bearbeitet.

Abh. der math. phys. Classe der k. b. Akad. der Wiss. (München), 1884-1886.

15: p. 229-328.

Date post:	07-Oct-2014