Sort-Last Parallel Rendering for Viewing Extremely Large Data Sets on Tile Displays
Paper by Kenneth Moreland, Brian Wylie, and Constantine Pavlakos
Presented by Adam Howard
CS594, Spring 2002, Dr. Jian Huang
Research Focus/Goal
Focus:
• Develop highly scalable rendering techniques.
Goal:
• Drive multiple tile displays at frame rates comparable to those of a single tile display, using a sort-last parallel algorithm that scales appropriately with large data sets.
Background
Sort-Last Parallel Rendering:
• Each processor renders its own share of the data; the resulting images are combined after rasterization occurs.
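To make the "combine after rasterization" step concrete, here is a minimal sketch of depth compositing two partial images. It assumes each pixel is a (color, depth) pair and that a smaller depth is nearer to the viewer; the names `composite`, `FAR`, and `BACKGROUND` are illustrative and not taken from the paper.

```python
def composite(image_a, image_b):
    """Merge two partial renderings pixel by pixel, keeping the nearer fragment.

    Each image is a list of (color, depth) pairs; smaller depth is nearer.
    """
    if len(image_a) != len(image_b):
        raise ValueError("images must cover the same pixels")
    return [a if a[1] <= b[1] else b for a, b in zip(image_a, image_b)]


FAR = float("inf")            # assumed depth of a pixel no polygon touched
BACKGROUND = ("black", FAR)

# Two processors rendered different polygons into the same 4-pixel image.
proc0 = [("red", 0.3), BACKGROUND, ("red", 0.7), BACKGROUND]
proc1 = [("blue", 0.5), ("blue", 0.2), ("blue", 0.4), BACKGROUND]

final = composite(proc0, proc1)
```

Because the merge only compares depths, it does not matter which processor rendered which polygons, which is what lets the data be distributed arbitrarily.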
Overview of Approach
The Situation:
Requirement (Output):
• Target resolution of 12 million pixels or more.
Problem:
• Beyond the capabilities of a single commodity computer.
Solution:
• Tiled display where input comes from graphics engines on different computers.
The Approach:
Rather than render a single high-resolution image, each processor generates images for the tiles that make up the display.
More precisely, use N processors to render and compose a large data set with T different projections, one for each display tile.
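As a rough illustration of "one projection per display tile", the sketch below divides a display into a regular grid and returns the sub-rectangle each tile covers. The regular grid, the normalized 0..1 coordinates, and the 4x3 wall are assumptions made for the example; the slides do not give the actual tile layout.

```python
def tile_viewports(columns, rows):
    """Return one (left, right, bottom, top) rectangle per tile.

    Coordinates are fractions of the full display, so the whole wall
    spans 0..1 in both directions.
    """
    viewports = []
    for row in range(rows):
        for col in range(columns):
            viewports.append((col / columns, (col + 1) / columns,
                              row / rows, (row + 1) / rows))
    return viewports


views = tile_viewports(4, 3)   # a hypothetical 4x3 wall, so T = 12
```

Each rectangle would be turned into its own projection, and every processor would render its polygons once per tile that they touch.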
System Organization
[Diagram labels: VIEWS, RiCky, Tile Display, Projector, System Area Network, Communication, Node, Data Set]
Equipment: Compaq 750, nVidia GeForce 256 graphics card
System Organization
Design Strategy:
• Allow any number of processors N to contribute to rendering the T images for a tile display, as long as N >= T.
Software:
• Intended to draw polygons that are evenly distributed among all the processors.
Main Point of Paper
Compositing Strategies
• Serial
• Virtual Trees
• Tile Split and Delegate
• Reduce to Single Tile
Serial
Compose T images for a tile display by serially running a composition algorithm for a single display T times.
Weaknesses:
• No advantage from spatial coherence.
• Poor load balancing.
Virtual Trees
• Based on Binary Tree Algorithm.
• Each tile image has a tree that has processors assigned to it.
• A processor assigned to more than one tree starts its next job as soon as it finishes the current one.
• Processor scheduling is very important.
• Weakness: Load balancing.
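For reference, a generic binary tree composition for a single tile can be sketched as below. This is plain binary tree reduction, not the paper's virtual tree scheduling across several tiles; the function name and the choice of which partner receives are assumptions for illustration.

```python
def binary_tree_schedule(processors):
    """Return composition rounds as lists of (sender, receiver) pairs."""
    rounds = []
    active = list(processors)
    while len(active) > 1:
        pairs = []
        survivors = []
        for i in range(0, len(active) - 1, 2):
            receiver, sender = active[i], active[i + 1]
            pairs.append((sender, receiver))   # sender ships its image and drops out
            survivors.append(receiver)
        if len(active) % 2 == 1:
            survivors.append(active[-1])       # odd one out waits for the next round
        rounds.append(pairs)
        active = survivors
    return rounds


schedule = binary_tree_schedule([0, 1, 2, 3, 4])
```

In every round roughly half of the remaining processors go idle, which is one way to see why load balancing is the weak point of tree-based composition.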
Tile Split and Delegate
Attempts to achieve better load balancing throughout compositing.
Extension of the direct send algorithm.
Load balancing is ensured.
Weakness: Large amount of message passing. The number of messages is O(N^2).
Reduce to Single Tile
Attempts to reduce the problem to that of composing a single image in the same manner as traditional sort-last parallel rendering systems.
Before compositing begins, each processor holds between zero and T images for separate tiles. The goal is for each processor to have one image for a particular tile.
Advantages:
• Good load balancing.
• Fewer messages: on the order of O(N*T + N log N).
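To get a feel for the two message-count bounds, the sketch below simply evaluates N^2 against N*T + N log N. It assumes a base-2 logarithm and drops all constant factors, so the numbers show growth trends only, not measured message counts; N = 64 and T = 16 are arbitrary example values.

```python
import math


def direct_send_messages(n):
    """Growth term for Tile Split and Delegate: N^2."""
    return n * n


def reduce_to_single_tile_messages(n, t):
    """Growth term for Reduce to Single Tile: N*T + N*log2(N)."""
    return n * t + n * math.log2(n)


n, t = 64, 16                                  # hypothetical cluster and wall sizes
quadratic = direct_send_messages(n)            # 4096
reduced = reduce_to_single_tile_messages(n, t) # 1024 + 384 = 1408.0
```

The gap widens as N grows while T stays fixed, since T is set by the display rather than by the size of the cluster.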
Optimizations
• Bucketing: Reduce the number of polygons sent to the graphics hardware.
• Active Pixel Encoding: Reduce the amount of information passed over the network.
• Floating Viewport: Reduce the number of times a polygon is rendered and the number of times the frame buffer is read back.
Experimental Results
The serial strategy has good results when the data is not spatially coherent.
The Tile Split and Delegate and Reduce to Single Tile strategies were the best for spatially coherent data.
Determined that there is a tradeoff between display resolution and rendering time.
Other parallel cluster systems can render larger data sets faster, but not at this level of resolution.
Conclusion
The results support the initial goal of increasing resolution by rendering to a tiled display with a cluster of commodity computers. They also support the desire for scalability, such as larger data sets or higher resolution displays.
Bucketing
Reduce number of polygons sent to the graphics hardware by estimating which polygons can be ignored.
• Before rendering, each processor’s polygons are grouped into several 3D regions called buckets. This happens once, when the data is loaded during initialization.
• Before each tile image is rendered, the buckets are tested to determine which lie in the tile.
• Only the polygons in these buckets are rendered.
Weakness: A large number of buckets reduces rendering time, but increases the overhead of determining screen projections.
The authors ended up using a moderate number of buckets to balance the two.
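The per-tile bucket test can be sketched as a rectangle overlap check. This assumes each bucket's 3D region has already been projected to a screen-space bounding rectangle; the dictionary layout, the `screen_bounds` key, and the tile sizes are hypothetical.

```python
def overlaps(a, b):
    """True if two (xmin, ymin, xmax, ymax) rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def polygons_for_tile(buckets, tile_rect):
    """Collect polygons only from buckets whose projection touches the tile."""
    visible = []
    for bucket in buckets:
        if overlaps(bucket["screen_bounds"], tile_rect):
            visible.extend(bucket["polygons"])
    return visible


buckets = [
    {"screen_bounds": (0, 0, 100, 100), "polygons": ["p0", "p1"]},
    {"screen_bounds": (90, 90, 300, 300), "polygons": ["p2"]},
    {"screen_bounds": (500, 500, 600, 600), "polygons": ["p3"]},
]
left_tile = (0, 0, 256, 256)
right_tile = (256, 0, 512, 256)

left_polygons = polygons_for_tile(buckets, left_tile)
right_polygons = polygons_for_tile(buckets, right_tile)
```

The test is conservative: a bucket that overlaps a tile may still contain polygons that fall outside it, so this only estimates which polygons can be ignored.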
Active Pixel Encoding
This method reduces the amount of information sent across the network by distinguishing between active pixels, which contain geometric information, and inactive pixels, which do not. Inactive pixels need far less information than active pixels, so encoding them compactly reduces the overall overhead of message passing between processors.
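One way to realize this idea is to run-length encode the inactive pixels, sketched below. The run-length scheme, the use of `None` to mark an inactive pixel, and the ("skip", count) / ("pixel", value) tuples are assumptions for illustration; the slides do not specify the encoding.

```python
def encode(pixels):
    """Collapse runs of inactive pixels (None) into ("skip", count) entries."""
    encoded, run = [], 0
    for pixel in pixels:
        if pixel is None:
            run += 1
            continue
        if run:
            encoded.append(("skip", run))
            run = 0
        encoded.append(("pixel", pixel))
    if run:
        encoded.append(("skip", run))
    return encoded


def decode(encoded):
    """Rebuild the full pixel row from its encoded form."""
    pixels = []
    for kind, value in encoded:
        if kind == "skip":
            pixels.extend([None] * value)
        else:
            pixels.append(value)
    return pixels


row = [None, None, None, ("red", 0.3), ("red", 0.31), None, None]
packed = encode(row)
```

The savings grow with the amount of empty space in an image, which is typically large when each processor holds only a small part of the data.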
Floating Viewport
Here a virtual tile is created to encompass an entire polygon.
After rendering, the virtual tile is split and each piece is sent directly to the real tile it overlaps.
Hence the system does not need to render any polygon more than once, and the frame buffer is read back one time instead of four.
This becomes more effective as the ratio of tiles to processors decreases and when the data has good spatial coherency.
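The splitting step can be sketched as follows. The sketch assumes integer pixel coordinates, equally sized tiles on a regular grid, and a floating viewport that lies entirely within the display; the function name and the 1024x768 tile size are hypothetical.

```python
def split_floating_viewport(viewport, tile_size):
    """Map each overlapped real tile (col, row) to its piece of the viewport.

    viewport is (xmin, ymin, xmax, ymax) in display pixels, max exclusive.
    """
    x0, y0, x1, y1 = viewport
    tile_w, tile_h = tile_size
    pieces = {}
    for col in range(x0 // tile_w, (x1 - 1) // tile_w + 1):
        for row in range(y0 // tile_h, (y1 - 1) // tile_h + 1):
            pieces[(col, row)] = (
                max(x0, col * tile_w), max(y0, row * tile_h),
                min(x1, (col + 1) * tile_w), min(y1, (row + 1) * tile_h),
            )
    return pieces


# A tile-sized viewport floated over the corner shared by four real tiles.
pieces = split_floating_viewport((900, 700, 1924, 1468), (1024, 768))
```

Here one render and one frame buffer read-back produce all four pieces, matching the "one time instead of four" case described above.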