Improving Adaptive Display with Temporally Adaptive Rendering

Benjamin Watson, Abhinav Dayal Dept. Computer Science Northwestern University

David Luebke, Cliff Woolley Dept. Computer Science University of Virginia

contact email: [email protected]

1 Why Adapt?

Adaptive display solves a number of very basic and important problems. For this reason, adaptive displays have long been used in the simulation and gaming industries, where they improve portability, accuracy and interactivity [Luebke et al., 2002].

Portability is the ability of a system to run on a wide range of platforms. Without adaptivity, display systems will often slow and become unusable on less capable platforms. Systems often adapt to platform by monitoring their frame rate and reducing displayed detail whenever frame rate falls too low [Funkhouser & Sequin, 1993].

Accuracy is the fidelity of the displayed scene. Unless display is adaptive, systems will waste effort rendering details that will never be seen, even details on objects so far away that they will not change the color of a single pixel. Adaptive displays can avoid such waste, and instead expend rendering effort where it will actually improve the accuracy of the resulting image.

Interactivity refers primarily to the delay between user input and displayed system response. Such delays are quite confusing to users, particularly when they are performing continuous interactions such as driving a vehicle or manipulating a tool [Watson et al., 1998; 2003]. Because adaptivity reduces the effort required to maintain an adequate level of accuracy in display, system response comes more quickly and interactivity improves.

Adaptive display is also beginning to solve problems of more direct concern to users. The perceptibility of visual detail varies with a number of display variables, including spatial frequency, contrast, position and velocity. Where non-adaptive displays spend just as much effort rendering imperceptible detail as perceptible detail, adaptive displays can focus rendering effort where it is most perceptible [Watson et al., 1997, 2004; Reddy, 1998; Luebke et al., 2002]. Exploiting the dependence of perceptibility on retinal position and velocity is most effective when eye or at least head position is known and tracked interactively, giving the display system at least a basic knowledge of user state. More recently, researchers have begun to build systems that take advantage of greater and higher-level knowledge of user state, including attention as indicated by eye tracking, interaction, biometrics, and other predictors [Danforth et al., 2000; Dinh et al., 1999; Yee et al., 2001]. Ultimately, adaptive display systems might even use direct requests from users to guide adaptively displayed detail.
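To make the frame-rate feedback idea above concrete, here is a minimal sketch of such a control loop. It is an illustration only, under assumed names (DetailController, a scalar detail scale), not code from any of the systems cited.

```cpp
#include <algorithm>

// Sketch of a frame-rate feedback controller. Each frame, the renderer
// reports how long the last frame took; the controller returns a detail
// scale in (0, 1] used to pick coarser or finer levels of detail.
class DetailController {
public:
    explicit DetailController(double targetFrameSec = 1.0 / 30.0)
        : target_(targetFrameSec) {}

    double update(double lastFrameSec) {
        // Proportional feedback: positive error (frame finished early)
        // grows detail; negative error (frame ran long) shrinks it.
        double error = (target_ - lastFrameSec) / target_;
        scale_ = std::clamp(scale_ * (1.0 + 0.5 * error), 0.05, 1.0);
        return scale_;
    }

private:
    double target_;       // frame time budget, in seconds
    double scale_ = 1.0;  // current detail scale
};
```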

2 Temporally Adaptive Rendering

In a strange irony, almost all of today's adaptive display systems respond poorly to changes in viewpoint, a very basic indication of user state. For example, when a person's head turns, what the person sees with the naked eye is visible only in coarse spatial detail and is changing quickly. Conversely, when a person's head is still, what the person sees is visible in fine spatial detail and is not changing much. Most current adaptive display systems will perform well when simulating one of these situations, but not both: either displayed detail is not up to date when the user's head turns, or it is too coarse when the user's head is still.

This non-adaptive response in most adaptive displays is due to the inflexibility of the approaches they use for managing the tradeoff between spatial and temporal detail. The most common of these approaches renders all available spatial detail (e.g. polygons, textures) no matter how long it takes, sacrificing interactivity and portability to achieve the best possible accuracy. Another approach ensures adequate temporal detail (e.g. frame rate) no matter how much spatial detail goes unrendered, sacrificing accuracy to achieve good interactivity and reasonable portability.

Our own research improves adaptivity in display by using more flexible approaches to managing the spatial/temporal tradeoff. The systems we produce are capable of emphasizing both interactivity when the displayed scene is dynamic, and accuracy when the scene is static. For example, if the user's viewpoint changes, the displayed scene may be a bit blurry, but quite up to date. On the other hand, if the user's head becomes still, the scene will quickly sharpen. The crucial ingredient in these systems is temporal adaptivity: explicit control of the rate at which not only space but also time is sampled for display. Below we briefly describe two of our prototype temporally adaptive renderers, and then discuss the implications of temporally adaptive rendering, which are much broader than those already described above.

2.1 Interruptible Rendering

Interruptible rendering [Woolley et al., 2003] is a progressive rendering framework that renders a coarse image into the back (off-screen) frame buffer and continuously refines it while tracking the error introduced by subsequent input (such as changes in viewpoint). When this temporal error exceeds the spatial error caused by the remaining coarseness of the off-screen rendering, there is no longer any reason to refine further, since any improvement in the appearance of objects in the image will be overwhelmed by their incorrect position and/or size. In other words, when the error due to the image being late is greater than the error due to the image being coarse, further refinement is pointless. The front (on-screen) and back frame buffers are then swapped, and rendering begins again into the back buffer for the most recent viewpoint. The resulting system, which maximizes combined spatial and temporal detail, quite intuitively results in coarse, high frame rate display when input is changing rapidly, and finely detailed, low frame rate display when input is static. This behavior nicely matches the ability of the human visual system to perceive high spatial frequencies in still imagery, but only low frequencies during retinal motion. Figure 1 shows an example interaction sequence. As the name implies, interruptible rendering requires an interruptible image generation process; a sudden motion by the user can drive up temporal error at any time.
The back buffer should always contain a complete image ready to be displayed, with the rendering process incrementally refining the image until further refinement is pointless. This requires a progressive rendering algorithm, and below we describe one progressive renderer based on ray casting. Later we discuss techniques for estimating temporal and spatial error, and present some resulting imagery.
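The control flow just described can be summarized in a short sketch. The types below (ProgressiveRenderer, Viewpoint) are hypothetical stand-ins with placeholder bodies; only the refine-until-late loop follows the framework above.

```cpp
#include <cmath>
#include <functional>

struct Viewpoint { float x, y, z, yaw, pitch; };  // hypothetical input state

// Placeholder progressive renderer: a real one would trace rays into the
// back buffer and track its own remaining coarseness.
class ProgressiveRenderer {
public:
    void beginImage(const Viewpoint&) { /* clear back buffer, reset refinement */ }
    void refineOneStep() { /* e.g. subdivide one quadtree node, trace its rays */ }
    double spatialError() const { return 0.0; /* remaining coarseness, in pixels */ }
    void present() { /* swap front and back buffers */ }
};

// Placeholder temporal error: a real estimate projects bounding vertices
// under both viewpoints (see the later sketch) and is also in pixels.
double temporalError(const Viewpoint& rendered, const Viewpoint& latest) {
    return 100.0 * std::fabs(latest.yaw - rendered.yaw);  // illustrative only
}

void interruptibleLoop(ProgressiveRenderer& r,
                       const std::function<Viewpoint()>& latestInput) {
    for (;;) {
        Viewpoint view = latestInput();  // input snapshot for this image
        r.beginImage(view);              // start from a complete, coarse image
        // Refine only while coarseness dominates staleness: once the image
        // is more "late" than it is "coarse", refinement is pointless.
        while (r.spatialError() > temporalError(view, latestInput())) {
            r.refineOneStep();
        }
        r.present();                     // swap buffers; restart with fresh input
    }
}
```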

Our current interruptible rendering prototype uses a ray tracer and a quadtree to sample the image in a coarse-to-fine fashion. Reconstruction of a partially sampled image places a splat at the center of each sampled quadtree node, colored with the shading result of the ray cast through the node's center. The splat is an alpha-textured quad twice the size of the node's screen space, with a texture shaped like a Gaussian blob that is fully transparent at its boundaries. All nodes at each quadtree level are visited before the next finer level, but nodes within a level are visited in a (predetermined) random order. Because of this randomness, the prototype's textured splatting is not strictly energy-preserving (it may be a bit too bright or too dim), but it does remove high frequencies. Our implementation achieves over 600,000 rays per second on average.

Interruptible rendering requires lightweight methods for estimating error, so that remaining spatial and temporal error may be compared several times per frame. Enabling these comparisons also requires that spatial and temporal error be quantified in the same space. We estimate screen-space spatial error with the maximum size of the image region sampled by each ray. Estimating temporal error is much more challenging. An ideal interactive renderer could instantly display all available detail using the most current input; temporal error approximates the difference between the image presently being rendered and the image that would be produced by this ideal renderer. Obviously the ideal renderer can only be crudely approximated, but in practice even a crude estimate proves quite useful. Our present implementation uses a small, precomputed set of vertices V surrounding the rendered model. We find the projected distances that vertices in V have moved between the moment presently being rendered and the moment represented by the latest input, and form our estimate of temporal error using the maximum of these projected distances.
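The temporal error estimate can be sketched as follows. The pinhole camera below (position only, looking down +z) is a simplifying assumption for illustration, not our actual projection code; the essential step is taking the maximum projected displacement of the bounding vertices V.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };
struct Vec2 { double x, y; };

// Simplified camera: position only, looking down +z, focal length in
// pixels. Assumed for illustration; any projection would work here.
struct Camera { Vec3 pos; double focalPx; };

Vec2 projectToScreen(const Vec3& p, const Camera& c) {
    double dx = p.x - c.pos.x, dy = p.y - c.pos.y, dz = p.z - c.pos.z;
    return { c.focalPx * dx / dz, c.focalPx * dy / dz };
}

// Temporal error: maximum screen-space displacement of the precomputed
// bounding vertices V between the rendered and the latest camera.
double temporalErrorPx(const std::vector<Vec3>& V,
                       const Camera& renderedCam, const Camera& latestCam) {
    double maxDisp = 0.0;
    for (const Vec3& v : V) {
        Vec2 a = projectToScreen(v, renderedCam);  // where it was drawn
        Vec2 b = projectToScreen(v, latestCam);    // where it should be now
        maxDisp = std::max(maxDisp, std::hypot(b.x - a.x, b.y - a.y));
    }
    return maxDisp;  // in pixels, directly comparable to spatial error
}
```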

Figure 1: Interruptible rendering. The system initially displays the 871,414-polygon dragon coarsely to support dynamic, real-time interaction. It gradually increases detail as user motion ceases, improving rendering quality.

2.2 Adaptive Frameless Rendering

Interruptible rendering retains a basic underlying assumption of interactive computer graphics: all pixels in a given image represent a single moment in time. In contrast, Bishop et al.'s frameless rendering [1994] replaces the frame with randomly distributed spatial samples, each representing the most current input at the time the sample was produced. Frameless rendering thus decouples spatial and temporal sampling, so that the pixels in a frameless image represent many moments in time. Frameless rendering offers the potential for greatly improved temporal adaptivity, since the tradeoff between spatial and temporal detail can differ across the image.

Adaptive frameless rendering improves on the original frameless rendering approach. It samples rapidly changing regions of an image coarsely but frequently to improve temporal detail, while refining static portions of the image to improve spatial detail. It also improves the displayed image by performing a reconstruction step, filtering samples in space and time so that older samples have lower weights than recent samples, with filters adjusted to the image's local sampling density and color gradients.

Rather than sending samples into a frame buffer, adaptive frameless rendering sends samples to a frameless and temporally deep buffer that collects samples scattered across space-time. Sampling is made adaptive with a spatial hierarchy of tiles superimposed over the image and the deep buffer (Figure 2). This hierarchy is constantly managed to ensure that all tiles contain roughly equal amounts of color variation (across both space and time), with tiles continuously merged and split in response to changes in the image. As a result, small tiles are located over image regions containing motion or fine spatial detail, while large tiles emerge over static or coarsely detailed regions. Sampling is thus a biased, probabilistic process: tiles are chosen for sampling with equal probability, and pixels within a selected tile are also chosen with equal likelihood. Because tiles vary in size, sampling is biased toward those regions of the image that exhibit high spatial and/or temporal color variance. Because all tiles are sampled, the renderer remains sensitive to newly emerging motion and detail.
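The biased sampling step can be sketched as follows, assuming a flattened list of current leaf tiles; tile management (merging and splitting to equalize color variation) is omitted.

```cpp
#include <random>
#include <vector>

struct Tile { int x0, y0, x1, y1; };   // pixel bounds, half-open
struct PixelSample { int x, y; };

// Pick the next pixel to sample: uniform over tiles, then uniform within
// the chosen tile. Because the hierarchy keeps per-tile color variation
// roughly equal, small (high-variance) tiles receive many more samples
// per unit area, while every tile still gets sampled occasionally.
PixelSample chooseSample(const std::vector<Tile>& tiles, std::mt19937& rng) {
    // `tiles` is the current set of leaf tiles and is assumed non-empty.
    std::uniform_int_distribution<std::size_t> pickTile(0, tiles.size() - 1);
    const Tile& t = tiles[pickTile(rng)];
    std::uniform_int_distribution<int> px(t.x0, t.x1 - 1);
    std::uniform_int_distribution<int> py(t.y0, t.y1 - 1);
    return { px(rng), py(rng) };
}
```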

Figure 2: Adaptive sampling. In a scene without viewpoint motion, a car moves from left to right, then turns toward the viewer and moves offscreen. Sampling is biased using a tiling over the image. Notice the small tiles behind and in front of the car.

Reconstruction of images from the deep buffer is also adaptive. The key question is what shape and size of filter to use. A temporally narrow, spatially broad filter emphasizes the newest samples, producing a blurry but very current image; it is appropriate when the scene is changing rapidly (Figure 3, car). A temporally broad, spatially narrow filter produces a finely detailed, antialiased image; it is appropriate when the underlying scene is changing slowly (Figure 3, background). We size our filters as if we were reconstructing a regular sampling at the local sampling density, and determine filter widths using local gradients. We reason that a large spatial gradient implies an edge, which should be resolved with a spatially narrow filter to preserve the underlying high frequencies. Similarly, a large temporal gradient implies a temporal edge such as an occlusion, which should be resolved with a temporally narrow filter.
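A minimal sketch of this space-time reconstruction follows, assuming the filter widths sigmaS and sigmaT have already been chosen by the rules above and using Gaussian falloff in both space and time. A real implementation would consider only nearby samples rather than scanning the whole buffer.

```cpp
#include <cmath>
#include <vector>

struct DeepSample { double x, y, t; double r, g, b; };  // space-time sample
struct Color { double r = 0, g = 0, b = 0; };

// Reconstruct the color at pixel (px, py) for display time `now`.
// sigmaS (pixels) and sigmaT (seconds) are the adaptively chosen filter
// widths: spatially narrow at edges, temporally narrow at occlusions.
Color reconstruct(const std::vector<DeepSample>& samples,
                  double px, double py, double now,
                  double sigmaS, double sigmaT) {
    Color sum;
    double wsum = 0.0;
    for (const DeepSample& s : samples) {  // in practice, only nearby samples
        double d2 = (s.x - px) * (s.x - px) + (s.y - py) * (s.y - py);
        double dt = now - s.t;             // older samples weigh less
        double w = std::exp(-d2 / (2.0 * sigmaS * sigmaS)) *
                   std::exp(-dt * dt / (2.0 * sigmaT * sigmaT));
        sum.r += w * s.r; sum.g += w * s.g; sum.b += w * s.b;
        wsum += w;
    }
    if (wsum > 0.0) { sum.r /= wsum; sum.g /= wsum; sum.b /= wsum; }
    return sum;  // normalized weighted average over the deep buffer
}
```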

Figure 3: Adaptive reconstruction, same scene. On the left, a simulated ideal rendering that takes no time to produce. On the right, adaptive frameless reconstruction using samples arriving 20x too slowly to produce 60 Hz frames.

3 Advantages of Temporal Adaptivity

Temporally adaptive rendering systems offer a wide range of improvements over temporally non-adaptive renderers. Because they can move continuously between emphasizing interactivity and emphasizing image accuracy, temporally adaptive systems are more portable than temporally non-adaptive systems. When moved to a more capable platform, a temporally adaptive renderer will strike a balance between reducing delay and increasing image detail, while a temporally non-adaptive, fixed frame rate renderer will never reduce delay; it will only improve image content. Temporally adaptive rendering systems are also more accurate and interactive than temporally non-adaptive systems. We have evaluated this claim empirically using a simulated zero-delay, full-detail ideal renderer as a standard of comparison. Comparing the currently displayed image, with its slightly aged content, to an ideal image with completely current content makes clear that interactivity and accuracy are two sides of the same coin: drops in either harm visual fidelity. As measured by root mean squared (RMS) error, even our early temporally adaptive prototypes achieve 2-3 times better visual fidelity than temporally non-adaptive renderers with the same rendering speed. Ultimately, we expect that temporally adaptive rendering will be almost 10 times more accurate and interactive than temporally non-adaptive rendering.
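Concretely, the fidelity metric is image-space RMS error between the displayed image and the simulated ideal image of the same moment. A sketch, assuming grayscale images of equal size:

```cpp
#include <cmath>
#include <vector>

// RMS error between a displayed image and the simulated ideal rendering
// of the same moment; both are assumed grayscale and of equal size.
double rmsError(const std::vector<double>& displayed,
                const std::vector<double>& ideal) {
    double sumSq = 0.0;
    for (std::size_t i = 0; i < displayed.size(); ++i) {
        double d = displayed[i] - ideal[i];
        sumSq += d * d;  // accumulate squared per-pixel differences
    }
    return std::sqrt(sumSq / displayed.size());
}
```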

Temporally adaptive renderers can also improve response to users' perceptual and cognitive state. Temporally adaptive renderers can balance user sensitivity to temporal frequency and velocity against sensitivity to spatial frequency: for example, when the viewer's head or eye is still, the renderer can incorporate older samples into the image, improving spatial detail beyond what would otherwise be possible. Alternatively, when the viewer is moving, the renderer can quickly discard older samples and high spatial frequencies in favor of newer samples and high temporal frequencies. With adaptive frameless rendering this can even be done within the image, for example by displaying a blurred but accurately placed moving object over a sharp, static background (see again Figure 3). Similar temporally adaptive responses to more cognitive measures, based on eye tracking, biometrics, interaction and user requests, are also possible. For example, an adaptive frameless system might render an object the user interacts with repeatedly more often and more accurately than other background objects.

In the long term, we believe that temporally adaptive renderers will prove more efficient than temporally non-adaptive renderers, even at a basic technical level. If this expectation proves correct, temporally adaptive rendering should enable a new level of richness in display. Increased rendering efficiency will enable the use of previously non-interactive photorealistic rendering effects such as soft shadows, color bleeding, and diffuse reflections. In addition, temporally adaptive renderers may prove to be a crucial enabling technology for tomorrow's hyper-resolution, gigapixel (one billion pixel) displays. Such displays, based on emerging technologies such as digital micromirror devices, E-Ink, and organic LEDs, will likely need to rely heavily on temporally adaptive updating, since a full frame update would take prohibitively long to complete.

4 Acknowledgements

We would like to thank the creators of OpenRT for making their software available and the maintainers of the Stanford 3D Scanning Repository for the use of their models. Thanks also to the makers of the BART ray tracing benchmark. This work was supported in part by NSF awards 0092973, 0093172, and 0112937.

5 References

BISHOP, G., FUCHS, H., MCMILLAN, L. AND SCHER ZAGIER, E.J. 1994. Frameless rendering: double buffering considered harmful. Computer Graphics, 28 (Annual Conference Series), 175-176.

DANFORTH, R., DUCHOWSKI, A., GEIST, R. AND MCALILEY, E. 2000. A platform for gaze-contingent virtual environments. Proc. AAAI Smart Graphics Symposium.

DINH, H.Q., WALKER, N., SONG, C., KOBAYASHI, A. AND HODGES, L.F. 1999. Evaluating the importance of multi-sensory input on learning and the sense of presence in virtual environments. Proc. IEEE Virtual Reality.

FUNKHOUSER, T. AND SEQUIN, C. 1993. Adaptive display algorithm for interactive frame rates during visualization of complex virtual environments. Computer Graphics, 27 (SIGGRAPH 93), 247-254.

LUEBKE, D. AND HALLEN, B. 2001. Perceptually driven simplification using gaze-directed rendering. Rendering Techniques 2001, Springer-Verlag (Proc. Eurographics Workshop on Rendering), 223-234.

LUEBKE, D., REDDY, M., COHEN, J., VARSHNEY, A., WATSON, B. AND HUEBNER, R. 2002. Level of Detail for 3D Graphics. San Francisco: Morgan Kaufmann.

REDDY, M. 1998. Specification and evaluation of level of detail selection criteria. Virtual Reality: Research, Development and Application, 3(2), 132-143.

WATSON, B., WALKER, N. AND HODGES, L.F. 2004. Supra-threshold control of peripheral LOD. Proc. ACM SIGGRAPH 2004, ACM Trans. Graphics, 23, 3, to appear.

WATSON, B., WALKER, N., HODGES, L.F. AND WORDEN, A. 1997. Managing level of detail through peripheral degradation: effects on search performance with a head-mounted display. ACM Trans. on Computer-Human Interaction, 4, 4, 323-346.

WATSON, B., WALKER, N., RIBARSKY, W.R. AND SPAULDING, V. 1998. The effects of variation in system responsiveness on user performance in virtual environments. Human Factors, Special Section on Virtual Environments, 40, 3 (Sept), 403-414.

WATSON, B., WALKER, N., WOYTIUK, P. AND RIBARSKY, W.R. 2003. Maintaining usability during 3D placement despite delay. Proc. IEEE Virtual Reality 2003 (Los Angeles, March), 133-140.

WOOLLEY, C., LUEBKE, D., WATSON, B. AND DAYAL, A. 2003. Interruptible rendering. Proc. ACM Interactive 3D Graphics, 143-151.

YEE, H., PATTANAIK, S. AND GREENBERG, D.P. 2001. Spatiotemporal sensitivity and visual attention for efficient rendering of dynamic environments. ACM Trans. Graphics, 20, 1, 39-65.

