Tomography for Multi-guidestar Adaptive Optics
An Architecture for Real-Time Hardware Implementation
Donald Gavel, Marc Reinig, and Carlos Cabrera
UCO/Lick Observatory Laboratory for Adaptive Optics
University of California, Santa Cruz
Presentation at the SPIE Optics and Photonics Conference, 5903-15
San Diego, CA, June 3, 2005
Gavel, Tomography for Multi-guidestar AO SPIE Optics and Photonics, San Diego, Aug. 2005 2
Outline of talk
• Introduction: The problem of real-time AO tomography for extremely large telescopes (ELTs):
  real-time calculations grow with D^4
• An alternative approach using a massively parallel processor (MPP) architecture
• Performance study results
– Experiment
– Simulation
• Conclusions
AO systems are growing in complexity, size, ambition
– MOAO
  • Up to 20 IFUs, each with a DM
  • 8-9 LGS
  • 3-5 TTS
– MCAO
  • 2-3 conjugate DMs
  • 5-7 LGS
  • 3 TTS
Extrapolating the conventional vector-matrix-multiply AO reconstructor method to ELTs is not feasible
General form:
$$\hat{a} = K s$$
Minimum variance solution:
$$\hat{a} = \Sigma H^T \left( H \Sigma H^T + \Sigma_n \right)^{-1} s$$
Least-squares solution:
$$\hat{a} = \left( H^T H \right)^{-1} H^T s$$
H = actuator to sensor influence function matrix
• Online calculation requires a P x M matrix multiply
– M = 10,000 subaps x 9 LGS
– P = 20,000 acts (MCAO) or 100,000 acts (MOAO)
– fs = 1 kHz frame rate
– ~10^11 calcs x 1 kHz = ~10^5 Gflops = ~10^5 Keck AO processors!
• Offline calculation requires O(M^3) flops to (pre)compute the inverse: ~10^15 calcs, or 10^6 sec (12 days) with a 1 Gflop machine
• “Moore’s Law” of computation technology growth: processor capability doubles every 18 months. A 10^5 improvement takes 25 years of growth. If we use 100 x more processors, the remaining 10^3 improvement takes 15 years.
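The operation counts and Moore's-law estimates on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes one multiply-accumulate per matrix entry. Under that assumption the MOAO matrix multiply comes to about 9 x 10^9 operations per frame, so the slide's ~10^11 should be read as a rounded order-of-magnitude figure.

```python
import math

def vmm_ops_per_frame(n_actuators, n_measurements):
    """Multiply-accumulates for one P x M matrix-vector product (assumed: one per entry)."""
    return n_actuators * n_measurements

def moore_years(improvement, doubling_months=18.0):
    """Years needed for a given speed-up if capability doubles every `doubling_months`."""
    return math.log2(improvement) * doubling_months / 12.0

M = 10_000 * 9        # subapertures x laser guidestars
P = 100_000           # actuators, MOAO case
per_frame = vmm_ops_per_frame(P, M)

# Offline inversion: the slide's ~10^15 operations on a 1 Gflop machine.
inversion_days = 1e15 / 1e9 / 86_400

print(f"ops per frame (MOAO): {per_frame:.1e}")
print(f"offline inversion:    {inversion_days:.1f} days")
print(f"10^5 speed-up:        {moore_years(1e5):.1f} years")
print(f"10^3 speed-up:        {moore_years(1e3):.1f} years")
```

The last two lines reproduce the 25-year and 15-year figures quoted above to within rounding.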
Alternative: massively parallel processing
• Advantages
– Many small processors each do a small part of the task, so no single processor is taxed
– Modularity: each processor has a stand-alone task, possibly specialized to one piece of hardware (WFS or DM)
– Modularity makes the system easier to diagnose: each part has a “recognizable” task
– Modularity makes system design easier: each subsection depends only on parameters associated with it, as opposed to global optimization of a monolithic design
• Requires
– Lots of small processors, with high speed data paths
– Iteration to solution. But what if 1 iteration took only 1 µs? Then we would have time for 1000 iterations per 1 ms data frame cycle!
[Figure: system block diagram. Blocks: Wavefront Sensors, Image Processors, Tomography Unit, DM Projection, DM Fit, Deformable Mirrors. Parameter inputs: centroid algorithm, r0, guidestar brightness, guidestar position, Cn2 profile, DM conjugate altitude, actuator influence function.]
1. Wavefront sensor processing
• Hartmann sensor: s = Gy
– s = vector of slopes
– y = vector of phases
– G = gradient operator
• Problem is overdetermined (more measurements than unknowns), assuming no branch points
• High speed algorithms are well known, e.g. the FFT-based algorithm by Poyneer et al., JOSA-A 2002, is O(n0 log(n0))
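As an illustration of why a Fourier-domain reconstructor is O(n0 log(n0)), here is a minimal least-squares sketch in NumPy. It is not the published algorithm: it assumes a periodic grid with forward-difference slopes and ignores aperture boundaries, and the function name is hypothetical.

```python
import numpy as np

def fourier_slopes_to_phase(sx, sy):
    """Least-squares phase from x/y slopes on a periodic grid (piston set to zero).

    Assumes sx, sy are forward differences of the phase along axis 1 and axis 0.
    """
    ny, nx = sx.shape
    kx, ky = np.meshgrid(np.fft.fftfreq(nx), np.fft.fftfreq(ny), indexing="xy")
    # Fourier transfer functions of the forward-difference operators.
    dx = np.exp(2j * np.pi * kx) - 1.0
    dy = np.exp(2j * np.pi * ky) - 1.0
    denom = np.abs(dx) ** 2 + np.abs(dy) ** 2
    denom[0, 0] = 1.0                      # piston is unobservable; avoid 0/0
    est = (np.conj(dx) * np.fft.fft2(sx) + np.conj(dy) * np.fft.fft2(sy)) / denom
    est[0, 0] = 0.0
    return np.real(np.fft.ifft2(est))

rng = np.random.default_rng(0)
phase = rng.standard_normal((16, 16))
slopes_x = np.roll(phase, -1, axis=1) - phase
slopes_y = np.roll(phase, -1, axis=0) - phase
recovered = fourier_slopes_to_phase(slopes_x, slopes_y)
print("max error:", np.abs(recovered - (phase - phase.mean())).max())
```

The cost is dominated by three 2-D FFTs; everything else is an element-wise operation that parallelizes trivially.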
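Wiener solution of the wavefront sensor slope-to-phase problem in the Fourier domain
$$\tilde{y}_{est}(\kappa) = \frac{-i\,\kappa \cdot \tilde{s}(\kappa)}{\kappa^2 + \tilde{C}_{nn}(\kappa)\,\tilde{C}_{\phi}^{-1}(\kappa)}$$
where
– κ = spatial frequency
– ~ indicates Fourier transform
– r0 = Fried’s parameter
– σ_n = meas. noise
– d_a = subap diameter
– C_φ = Kolmogorov spectrum
– C_nn = noise spectrum
The slide also writes the ratio C_nn / C_φ explicitly in terms of r0, σ_n, and d_a; the numerical coefficients are not legible in this copy.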
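A Wiener filter of this kind adds a noise-to-signal term to the denominator of the least-squares Fourier reconstructor. The sketch below is a hedged illustration only: it assumes a periodic grid with forward-difference slopes, white measurement noise, and a Kolmogorov-like phase spectrum proportional to κ^(-11/3), with every physical constant (r0, noise level, subaperture size) lumped into one hypothetical parameter `gamma`.

```python
import numpy as np

def wiener_slopes_to_phase(sx, sy, gamma):
    """Fourier-domain Wiener estimate of phase from slopes on a periodic grid.

    gamma is an assumed lumped constant so that C_nn / C_phi = gamma * kappa**(11/3).
    gamma = 0 reduces to the plain least-squares reconstructor.
    """
    ny, nx = sx.shape
    kx, ky = np.meshgrid(np.fft.fftfreq(nx), np.fft.fftfreq(ny), indexing="xy")
    dx = np.exp(2j * np.pi * kx) - 1.0     # forward-difference transfer functions
    dy = np.exp(2j * np.pi * ky) - 1.0
    kappa = np.hypot(kx, ky)
    ratio = gamma * kappa ** (11.0 / 3.0)  # noise spectrum / phase spectrum
    denom = np.abs(dx) ** 2 + np.abs(dy) ** 2 + ratio
    denom[0, 0] = 1.0                      # piston is unobservable
    est = (np.conj(dx) * np.fft.fft2(sx) + np.conj(dy) * np.fft.fft2(sy)) / denom
    est[0, 0] = 0.0
    return np.real(np.fft.ifft2(est))

rng = np.random.default_rng(1)
phase = rng.standard_normal((16, 16))
sx = np.roll(phase, -1, axis=1) - phase + 0.5 * rng.standard_normal((16, 16))
sy = np.roll(phase, -1, axis=0) - phase + 0.5 * rng.standard_normal((16, 16))
unfiltered = wiener_slopes_to_phase(sx, sy, 0.0)
filtered = wiener_slopes_to_phase(sx, sy, 50.0)
print("rms without / with noise term:", unfiltered.std(), filtered.std())
```

Because the extra term grows with spatial frequency, the filter attenuates the high-frequency content where noise dominates over Kolmogorov turbulence.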
2. Tomographic reconstruction
$$y = A x$$
where
– y = vector of all WFS phase measurements (an M-vector)
– x = value of OPD at each voxel in the turbulent volume (an N-vector)
– A = forward propagation operator, M x N (entries = 0 or 1)
• The problem is underdetermined: there are more unknowns than measurements
• Guidestars probe the atmosphere
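To make the 0/1 structure of the forward propagation operator concrete, the following toy sketch builds A for a one-dimensional slice through the volume. The geometry is an assumption for illustration only: each guidestar's ray is displaced by a non-negative integer number of voxels at each layer, and the function name and `shifts` layout are hypothetical.

```python
import numpy as np

def build_forward_operator(n_sub, shifts):
    """Return the M x N 0/1 matrix A for a 1-D toy geometry.

    shifts[g][l] is the (assumed non-negative integer) voxel offset of
    guidestar g's ray at layer l. Each measurement sums one voxel per layer.
    """
    n_gs = len(shifts)
    n_layers = len(shifts[0])
    width = n_sub + max(max(s) for s in shifts)   # voxels per layer
    A = np.zeros((n_gs * n_sub, n_layers * width), dtype=int)
    for g, gs_shifts in enumerate(shifts):
        for i in range(n_sub):
            for layer, shift in enumerate(gs_shifts):
                A[g * n_sub + i, layer * width + i + shift] = 1
    return A

# 3 guidestars, 3 layers, 4 subapertures across.
shifts = [[0, 0, 0], [0, 1, 2], [0, 2, 4]]
A = build_forward_operator(4, shifts)
M, N = A.shape
print(f"M = {M} measurements, N = {N} unknowns")

x = np.ones(N)          # unit OPD in every voxel
y = A @ x               # forward propagation: each ray crosses 3 layers
back = A.T @ y          # back propagation spreads measurements along rays
```

Even in this tiny case N exceeds M, which is the underdetermined situation described above. Multiplying by A.T is the corresponding back propagation.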
Inverse tomography algorithms
$$\hat{x}_k = P A^T v_k$$
$$e_k = y - \left( A P A^T + N \right) v_k$$
$$\Delta v_k = C e_k$$
Linear feedback:
$$v_{k+1} = v_k + g\,\Delta v_k$$
-or-
Preconditioned conjugate gradient:
$$v_{k+1} = v_k + f(\Delta v_k)$$
where
– A^T is the back propagation operator
– C is the “preconditioner”; it affects convergence rate only
– P, N is the “postconditioner”; it determines the type of solution: P = I, N = 0 gives least squares; P = <xx^T>, N = <nn^T> gives minimum variance
– g = constant feedback gain
– f(.) = 1st order regression (and other hidden details of the CG algorithm)
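A minimal dense-matrix sketch of the linear-feedback variant may help show what the iteration converges to. It assumes C = I and chooses the constant gain as the reciprocal of the largest eigenvalue of A P A^T + N, which is one safe choice rather than anything prescribed by the talk; the function name and the random test problem are hypothetical.

```python
import numpy as np

def linear_feedback_tomography(A, y, P, N, n_iter=2000):
    """Fixed-gain iteration for x = P A^T (A P A^T + N)^-1 y, with C = I (assumed)."""
    S = A @ P @ A.T + N                       # formed here only to pick the gain
    gain = 1.0 / np.linalg.eigvalsh(S).max()  # constant feedback gain (an assumption)
    v = np.zeros_like(y, dtype=float)
    for _ in range(n_iter):
        x_hat = P @ (A.T @ v)                 # back-propagate, then post-condition
        e = y - (A @ x_hat + N @ v)           # forward-propagate and compare
        v = v + gain * e                      # linear feedback update
    return P @ (A.T @ v)

rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(6, 12)).astype(float)   # 0/1 propagation matrix
P = np.eye(12)
N = np.eye(6)
y = rng.standard_normal(6)

x_iter = linear_feedback_tomography(A, y, P, N)
x_direct = P @ A.T @ np.linalg.solve(A @ P @ A.T + N, y)
print("max difference from direct solve:", np.abs(x_iter - x_direct).max())
```

Each pass uses only a back propagation, a forward propagation, and element-wise updates, which is what makes the scheme attractive for a massively parallel implementation.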
Compute count for inverse tomography
• A and A^T are massively parallelizable over transverse dimension, guidestars
• A^T is massively parallelizable over layers
• Optional Fourier domain preconditioning and postconditioning
[Figure: block diagram of one iteration. Labels: WFS data, difference node (- +), Pre-condition, Back-propagate, Post-condition, Volumetric OPD estimates, Forward-propagate, FT, FT^-1, X (multiply), Aperture.]
Operation            CPU         MPPU
Fourier Transform    M log(M)    log(M) per iteration
Prototype implementation on an FPGA
[Figure: a single voxel processor. Voxel local registers (GS1 ... GSN error, current estimated value, Cn2), control logic, and an ALU (word size + NGS) wide, with cumulative values for GS1 ... GSn carried on the forward propagation path and the back propagation path, plus global system state information.]
Note: Because the forward propagation and back propagation paths are parallel, but are used at different times, they will actually be a single bus in the physical implementation.
[Figure: an array of voxel processors.]
Preliminary Results for MPP Timing and Resource Allocation on an FPGA
Timing
• Basic clock speed supported: 50 MHz (Xilinx Virtex-4)
• Total number of states per iteration: 36

Element                   Current value   Derived formula   Comment
Load Measured Value       12              3 n0              Done once per msec
Forward Propagate         27              NGS(2L + 1)
Compare                   1               1
Back Propagate            1               1
Calculate New Estimate    7               3NGS + 4

Parameters (current value)
– L = layers (4)
– NGS = guide stars (3)
– n0 = subapertures (4)

A single iteration takes T = 4NGS + 2L NGS + 6 clock cycles
Currently this is 36 clocks at 50 MHz = 720 nsec per iteration

Note: the algorithm parallelizes over guidestars. For reasons of simplicity and debugging of this first implementation we have not done this yet.

Chip count
• This implementation: the Virtex-4 chip is 20% utilized (2996 of 15360 available logic cells employed)
• Scaling to a system with 10,000 subapertures (such as for the 30 meter telescope) would require 500 of these chips
• Standard packing density is ~50 chips/board, so this equates to 10 circuit boards
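The timing and chip-count figures can be cross-checked with a short calculation. The sketch below simply evaluates the derived formulas and the listed current values; the function name is hypothetical. Note that with NGS = 3 and L = 4 the derived formula evaluates to 42 cycles, whereas the listed current values sum to the quoted 36 (the "Calculate New Estimate" row lists 7, while 3NGS + 4 gives 13), so the sketch reports both rather than guessing which is intended.

```python
def cycles_per_iteration(n_gs, n_layers):
    """Clock cycles per iteration from the derived formulas in the timing table."""
    forward = n_gs * (2 * n_layers + 1)
    compare = 1
    back = 1
    new_estimate = 3 * n_gs + 4
    return forward + compare + back + new_estimate

CLOCK_HZ = 50e6
FRAME_S = 1e-3                      # 1 ms data frame

formula_cycles = cycles_per_iteration(n_gs=3, n_layers=4)
table_cycles = 27 + 1 + 1 + 7       # per-iteration "current value" column

for label, cycles in (("table", table_cycles), ("formula", formula_cycles)):
    t_iter = cycles / CLOCK_HZ
    print(f"{label}: {cycles} cycles, {t_iter * 1e9:.0f} ns, "
          f"{int(FRAME_S / t_iter)} iterations per frame")

utilization = 2996 / 15360          # logic cells used on one chip
boards = 500 / 50                   # chips needed / chips per board
print(f"utilization {utilization:.1%}, boards {boards:.0f}")
```

Either way the iteration time stays under 1 µs, which leaves room for on the order of a thousand iterations per 1 ms frame.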
Simulation: extrapolation to the full ELT spatial scale to estimate convergence rates
• 7800 subapertures per guidestar
• 5 guidestars
• 7 layer atmosphere
• Fixed feedback gain iteration
• A and A^T implemented in the spatial domain
• Initial atmospheric realizations were random with a Kolmogorov spatial power spectrum
Convergence to 3 digits accuracy in 1 ms
3. Projection and fitting to DMs
• MCAO
– Requires filtering and weighted integral over layers for each DM
– Filters and weights chosen to minimize “Generalized Anisoplanatism” (Tokovinin et al., JOSA-A 2002)
– Massively parallelizable over the Fourier domain and over DMs; L steps to integrate
• MOAO
– Requires integral over layers for each science direction (DM)
– Massively parallelizable over spatial or Fourier domain and over DMs; L steps to integrate
• DM fitting
– Deconvolution: massively parallelizable given either spatially invariant or spatially localized actuator influence function
– PCG suppresses aperture effects in 2-3 iterations
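For the spatially invariant case, DM fitting by deconvolution reduces to a per-frequency division in the Fourier domain. The sketch below is an illustration under strong assumptions: a periodic actuator grid (so aperture effects, which the slide handles with PCG, are ignored), a Gaussian influence function, and a small regularization constant `eps`. All names are hypothetical.

```python
import numpy as np

def periodic_gaussian(n, sigma):
    """Gaussian influence function centred on element [0, 0] of a periodic n x n grid."""
    c = np.fft.fftfreq(n) * n                  # wrapped coordinates 0, 1, ..., -1
    xx, yy = np.meshgrid(c, c)
    return np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))

def fit_dm_commands(surface, influence, eps=1e-12):
    """Regularized Fourier deconvolution: commands whose convolution gives `surface`."""
    h = np.fft.fft2(influence)
    s = np.fft.fft2(surface)
    cmd = np.fft.ifft2(np.conj(h) * s / (np.abs(h) ** 2 + eps))
    return np.real(cmd)

rng = np.random.default_rng(0)
n = 16
influence = periodic_gaussian(n, sigma=0.7)
true_cmd = rng.standard_normal((n, n))
# Desired surface = commands convolved (periodically) with the influence function.
surface = np.real(np.fft.ifft2(np.fft.fft2(true_cmd) * np.fft.fft2(influence)))
fitted = fit_dm_commands(surface, influence)
print("max command error:", np.abs(fitted - true_cmd).max())
```

Every spatial frequency is handled independently, which is the sense in which the deconvolution parallelizes.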
Conclusions
• The architecture: massive parallel computation
• Conceptually simple
• Tested with a commercial FPGA; evaluated with simulations: it’s feasible with today’s technology
• Under study: FD-PCG, with extra computation per iteration traded off against faster convergence rate