Assets, Dynamics and Behavior Computation for virtual worlds and computer games

Post on 21-Jan-2016


description

Assets, Dynamics and Behavior Computation for virtual worlds and computer games.

Sheldon Brown, UCSD Site Director, CHMPR
Daniel Tracy, Programmer, UCSD Experimental Game Lab
Kristen Kho, Programmer, UCSD Experimental Game Lab
Todd Margolis, CRCA Technical Director

transcript

Assets, Dynamics and Behavior Computation for virtual worlds and computer games


Media environments are becoming less singularly “authored” and more responsive to conditions of operation and social interaction.

• Higher fidelity of HCI environments increases the need for quality in underlying assets

• Media environments are social spaces, and are less predictable as to how they are used

• Repositories of rich data require these developments to enhance understanding. This applies across almost all fields of knowledge building and discovery.

• We develop our methods of expression within the domains of culture. The invention of new forms of culture in areas such as computer gaming will impact how we conduct science, engineering and social interactions of all types.


Interactive media is a good candidate for taking advantage of multicore computing.

GPUs are an example of one area in which multicore approaches have reaped enormous benefits.

Other problems in the field aren’t as homogeneous as graphics and require new approaches in the overall application structure to utilize processing power while still delivering interactive operations.

Whatever cool things we want to do all have to be done within 1/30 of a second.

The creation of assets from databases and algorithms, along with the behaviors of complex systems, currently limits the possibilities of interactive graphic experiences.


Procedural asset pipeline for virtual worlds, games and other forms of digital media.

Data, either taken from real world sources via sensors, mined from databases, or generated, is transformed by varieties of algorithmic stages.

Asset creation becomes increasingly responsive to the application.

Techniques developed from other fields have applicability to transforming data.

For instance, 2D computer vision techniques apply to 3D spatial data.

Kristen Kho

Programmer, UCSD Experimental Game Lab

• SC Road Generation

• Algorithms

• Using the Cell Processor

• Performance

• Conclusion and Future Work

Case Implementation:

Assets can also come from the purely algorithmic, with no initial data seed.

SC Road Generation

• L-system of Archimedes spirals

• Algorithm

1. Choose starting point on existing road

2. Generate a spiral curve (“challenger” road)

3. Test for intersections with existing roads
   • If no intersections, add to list of existing roads

4. Repeat n times or until all starting points have been tried
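
The four steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the Scalable City implementation: the spiral is sampled as a polyline from r = growth · θ, start points are picked at random from the vertices of existing roads, the loop simply runs a fixed number of tries, and the names `spiral_points`, `segments_cross`, `polylines_cross` and `grow_roads` are hypothetical. Touching at the shared start point is deliberately not counted as an intersection.

```python
import math
import random

def spiral_points(origin, heading, growth=2.0, turns=1.0, samples=32):
    """Sample an Archimedean spiral r = growth * theta as a polyline starting at origin."""
    ox, oy = origin
    pts = []
    for i in range(samples + 1):
        theta = 2.0 * math.pi * turns * i / samples
        r = growth * theta
        pts.append((ox + r * math.cos(theta + heading),
                    oy + r * math.sin(theta + heading)))
    return pts

def _orient(a, b, c):
    """Twice the signed area of triangle abc."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(a, b, c, d):
    """True if segments ab and cd properly cross; touching endpoints do not count."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)

def polylines_cross(p, q):
    """True if any segment of polyline p crosses any segment of polyline q."""
    return any(segments_cross(p[i], p[i + 1], q[j], q[j + 1])
               for i in range(len(p) - 1) for j in range(len(q) - 1))

def grow_roads(seed_road, tries, rng):
    roads = [list(seed_road)]
    for _ in range(tries):                      # step 4: repeat n times
        parent = rng.choice(roads)
        start = rng.choice(parent)              # step 1: start point on an existing road
        challenger = spiral_points(start, rng.uniform(0.0, 2.0 * math.pi))  # step 2
        # step 3: keep the challenger only if it crosses no existing road
        if not any(polylines_cross(challenger, road) for road in roads):
            roads.append(challenger)
    return roads
```

Calling `grow_roads([(0.0, 0.0), (10.0, 0.0)], 50, random.Random(0))` returns the seed road plus whichever challengers survived the test. The inner `any(...)` is the all-pairs intersection test, which is the part of the work that parallelizes well.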

Motivation

• Problem
– Initial implementation used Maya/MEL
– Very slow, preprocess only

• Take advantage of data parallelism and Cell processors
– Generate road system in real time during SC runtime

Cell Processor Review

• Power Processing Unit (PPU)

• Synergistic Processing Unit (SPU)

• Direct Memory Access (DMA)

Porting to the Cell

• Road intersection testing
– Function-offload programming model

• Code
– PPU: manages the SPU threads, sums up results
– SPUs: intersection testing
– CAFÉ (Cell Architecture Framework and Extensions): set of libraries that build upon Cell SDK libraries

Pseudocode

PPU
Loop:
    Generate challenger curve
    Send challenger to SPUs
    Receive SPU intersection counts
    If there were no intersections
        Insert challenger into list
    Repeat

SPU
Loop:
    Receive challenger curve
    Update sublist of existing curves
    intersectionCount = 0
    For every curve in sublist:
        If challenger intersects curve
            Increment intersectionCount
    Send intersectionCount to PPU
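
The same function-offload pattern can be sketched in Python, with a thread pool standing in for the SPUs. This is only an analogy under stated assumptions: it does not use the Cell SDK or CAFÉ, the names `count_intersections` and `try_insert` are hypothetical, and placing accepted curves in the shortest sublist is an assumed load-balancing rule that the slides do not specify.

```python
from concurrent.futures import ThreadPoolExecutor

def count_intersections(challenger, sublist, intersects):
    """Worker ("SPU") side: count the curves in this sublist that the challenger hits."""
    return sum(1 for curve in sublist if intersects(challenger, curve))

def try_insert(challenger, sublists, intersects, pool):
    """Manager ("PPU") side: fan out the challenger, sum the counts, insert if zero."""
    futures = [pool.submit(count_intersections, challenger, sub, intersects)
               for sub in sublists]
    total = sum(f.result() for f in futures)   # receive the intersection counts
    if total == 0:
        # Assumption: accepted curves go to the shortest sublist to balance the load.
        min(sublists, key=len).append(challenger)
        return True
    return False
```

Here `intersects` is any predicate taking two curves, and `sublists` is one list of curves per worker. Only the count travels back to the manager, which keeps the per-try communication small.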

Line Segment Intersection

• Using a parametric representation, the line segment ab can be written as a convex combination involving a real parameter s:
  p(s) = (1 - s)a + sb for 0 ≤ s ≤ 1

• Similarly for cd we may introduce a parameter t:
  q(t) = (1 - t)c + td for 0 ≤ t ≤ 1

• An intersection occurs if and only if we can find s and t in the desired ranges such that p(s) = q(t). Thus we get the two equations:
  (1 - s) a_x + s b_x = (1 - t) c_x + t d_x
  (1 - s) a_y + s b_y = (1 - t) c_y + t d_y

• The coordinates of the points are all known, so it is just a simple exercise in linear algebra to solve for s and t.
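
One way to carry out that exercise: rearranging p(s) = q(t) gives s(b - a) - t(d - c) = c - a, a 2×2 linear system. Writing u × v = u_x v_y - u_y v_x, Cramer's rule gives s = ((c - a) × (d - c)) / ((b - a) × (d - c)) and t = ((c - a) × (b - a)) / ((b - a) × (d - c)). The Python sketch below applies this directly. The function names and the `eps` tolerance are assumptions, and parallel or collinear segments are simply reported as non-intersecting, which a production test would need to handle separately.

```python
def cross(u, v):
    """2D cross product u_x * v_y - u_y * v_x."""
    return u[0] * v[1] - u[1] * v[0]

def segment_intersection(a, b, c, d, eps=1e-12):
    """Return (s, t) with p(s) == q(t), or None if segments ab and cd do not meet."""
    r = (b[0] - a[0], b[1] - a[1])       # direction of ab
    q = (d[0] - c[0], d[1] - c[1])       # direction of cd
    ca = (c[0] - a[0], c[1] - a[1])      # right-hand side c - a
    denom = cross(r, q)
    if abs(denom) < eps:
        return None                      # parallel or collinear: not handled in this sketch
    s = cross(ca, q) / denom
    t = cross(ca, r) / denom
    if 0.0 <= s <= 1.0 and 0.0 <= t <= 1.0:
        return s, t
    return None
```

For the two diagonals of a square, `segment_intersection((0, 0), (2, 2), (0, 2), (2, 0))` returns `(0.5, 0.5)`: both segments meet at their midpoints.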


SC/Cell Communication

• Scalable City (Win32) and a Cell Blade communicate on the local network via lightweight TCP/IP server/client programs

• Request for new road system is initiated by SC (the client)

• Cell (the server) sends back the road data after work has completed

Networking Protocol

Client (Win32)

Connect to Server

Send parameters

Send border curve

Receive road hierarchy

Receive road image

Disconnect from Server


Server (Cell)

Open Client connection

Receive parameters

Receive border curve

Run road generation program

Send road hierarchy

Send road image

Close Client connection
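
The exchange above can be sketched with plain Python sockets. The slides do not describe the wire format, so everything about framing here is an assumption: each message is JSON with a 4-byte big-endian length prefix, and the helper names (`send_msg`, `recv_msg`, `serve_one`, `request_roads`) and the `generate_roads` callback are hypothetical. A real road image would more likely travel as binary data.

```python
import json
import socket
import struct

def send_msg(sock, obj):
    """Send one JSON message with a 4-byte big-endian length prefix (assumed framing)."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def _recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf += chunk
    return buf

def recv_msg(sock):
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))

def serve_one(conn, generate_roads):
    """Server (Cell) side of one request on an already accepted connection."""
    with conn:                                   # closes the client connection on exit
        params = recv_msg(conn)                  # receive parameters
        border = recv_msg(conn)                  # receive border curve
        hierarchy, image = generate_roads(params, border)  # run road generation
        send_msg(conn, hierarchy)                # send road hierarchy
        send_msg(conn, image)                    # send road image

def request_roads(sock, params, border):
    """Client (Win32) side of one request on an already connected socket."""
    with sock:                                   # disconnects from the server on exit
        send_msg(sock, params)
        send_msg(sock, border)
        hierarchy = recv_msg(sock)
        image = recv_msg(sock)
    return hierarchy, image
```

The client blocks between sending the border curve and receiving the hierarchy, which matches the slide: the server replies only after the road generation work has completed.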


Optimizations

• Single-precision floating point
– 10x faster than double precision
– Loss of precision requires normalization to prevent errors

Optimizations

• Loop Unrolling + Vector Operations
– Can test 4 intersections at a time

• Bounding circles
– Spirals fit roughly within a circle
– Can skip tests if bounding circles do not intersect

• NUMA & MPI
– Slower due to communication latency
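
A bounding-circle pre-test can be sketched as follows. The expensive segment-by-segment tests are needed only when the two circles overlap; if the circles are disjoint, the curves inside them cannot intersect. The circle used here (centred on the centroid of the sample points) is an assumption chosen for brevity, not necessarily what the original code computes, and the function names are hypothetical.

```python
import math

def bounding_circle(points):
    """Centroid-centred circle enclosing every sample point (not the minimal circle)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radius = max(math.hypot(x - cx, y - cy) for x, y in points)
    return (cx, cy), radius

def circles_overlap(c1, r1, c2, r2):
    """Compare squared distances so that no square root is needed."""
    dx, dy = c1[0] - c2[0], c1[1] - c2[1]
    return dx * dx + dy * dy <= (r1 + r2) ** 2

def may_intersect(curve_a, curve_b):
    """Cheap rejection test: False means the full intersection test can be skipped."""
    (ca, ra), (cb, rb) = bounding_circle(curve_a), bounding_circle(curve_b)
    return circles_overlap(ca, ra, cb, rb)
```

In practice each curve's circle would be computed once when the curve is created and stored alongside it, so the pre-test costs a handful of multiplications per pair.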

Performance

                  | 1,000 tries      | 10,000 tries     | 100,000 tries
8 SPUs            | 2.008266 seconds | 2.371372 seconds | 3.481144 seconds
16 SPUs           | 1.224655 seconds | 1.451451 seconds | 2.226997 seconds
# roads generated | 23               | 25               | 24

Table 1. Total Execution Time for Road Generation on the Cell
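
The scaling in Table 1 can be checked with a few lines of arithmetic. The sketch below computes the speedup from 8 to 16 SPUs and the corresponding parallel efficiency (speedup divided by 2, since the SPU count doubles). The timings are copied from the table; the names are illustrative.

```python
# Seconds from Table 1, as (8 SPUs, 16 SPUs) per number of tries.
times = {
    1_000: (2.008266, 1.224655),
    10_000: (2.371372, 1.451451),
    100_000: (3.481144, 2.226997),
}

def speedup_and_efficiency(t8, t16):
    speedup = t8 / t16
    return speedup, speedup / 2.0   # doubling the SPUs gives an ideal speedup of 2

results = {tries: speedup_and_efficiency(*pair) for tries, pair in times.items()}

for tries, (speedup, efficiency) in results.items():
    print(f"{tries:>7} tries: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

All three columns come out at roughly 1.6x, so doubling the SPUs recovers about 80% of the ideal gain.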

Performance

        | Final    | Final + NUMA | Final + MPI | Final, no bounding circles
4 SPUs  | 1.399935 | 1.517238     | n/a         | 4.764048
8 SPUs  | 1.345195 | 1.352651     | 6.066842    | 2.548669
12 SPUs | 1.306945 | 1.393843     | 5.387366    | 2.091064
16 SPUs | 1.364099 | 1.410301     | 4.585504    | 1.775743
24 SPUs | n/a      | n/a          | 3.728109    | n/a
32 SPUs | n/a      | n/a          | 3.50951     | n/a

Table 2. Comparison of Final technique vs. various optimization approaches

Cell-based computation of a similar pipeline takes about 1 second per blade per landscape.

The Cell processor has met our needs and expectations for real-time and near-real-time asset development, and allows roads to be brought into the program interactively.

Future Work

– Offload other asset classes generation processes onto the multicore compute servers
  • Terrain generation from satellite images
  • House & tree placement/scattering
  • Dynamics, Physics, Animation, AI

Creating these assets outside the main application pipeline assures continued real-time behavior, but how do we further integrate these operations?

How do we come up with effective approaches for dynamically balancing multicore computing when computing assets, dynamics and behavior?

Where do we employ high-level tools, low-level algorithm re-engineering and the capabilities of heterogeneous computing devices?


[Chart: Tree Loading. Vertical axis: Time (seconds), 5.2 to 5.9. Horizontal axis: 1 to 15. Series: serial, tbb, tbb_auto, boost.]

We can see that all of the multi-threading solutions yield a significant improvement over the serial implementation. The TBB solutions yielded very similar results and the boost threading solution was slightly slower on average.


Comparing multi-threading with Intel Threading Building Blocks (TBB) and with Boost threads against serial processing in different areas of interactive graphics applications.

Areas of the application that have execution dependencies necessitate larger scale re-engineering to achieve multi-threading improvements.

See report for more details

Serial implementation is the quickest in this case. The automatic partitioning for TBB seriously degrades performance and the boost and regular TBB implementations are comparable in performance, with TBB yielding slightly better performance on average.


By putting physics on a separate thread, application performance gains range from a 100% improvement when there is little physics activity to no improvement with high levels of physics. Generally we experience about a 30% improvement.

Bottleneck loops in the non-thread-safe physics library require locks around library calls, degrading performance. A re-design of the physics library with data-level parallelism is required.
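
The locking pattern described here can be sketched in Python. `ToyWorld` is a hypothetical stand-in for the non-thread-safe physics library, and `LockedPhysics` and `run_physics` are illustrative names. Python threads also do not give true CPU parallelism, so this shows only the structure, not the performance. Every call into the library takes the same lock, which is why heavy physics activity leaves the other thread waiting.

```python
import threading

class ToyWorld:
    """Hypothetical stand-in for a non-thread-safe physics library."""
    def __init__(self, n):
        self.positions = [0.0] * n

    def step(self, dt):
        for i in range(len(self.positions)):
            self.positions[i] += 1.0 * dt     # constant velocity of 1 unit per second

class LockedPhysics:
    """Serializes every library call behind one coarse lock."""
    def __init__(self, world):
        self._world = world
        self._lock = threading.Lock()

    def step(self, dt):
        with self._lock:                      # the physics thread holds the lock per step
            self._world.step(dt)

    def read_positions(self):
        with self._lock:                      # the render thread waits if a step is running
            return list(self._world.positions)

def run_physics(physics, steps, dt):
    """Body of the physics thread: advance the simulation a fixed number of steps."""
    for _ in range(steps):
        physics.step(dt)
```

A main loop would start `run_physics` on a `threading.Thread` and call `read_positions()` once per frame. Replacing the single lock with per-object or data-parallel updates is the kind of library re-design the slide calls for.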


See report for more details

[Chart: Physics Collision Loop Iteration. Vertical axis: Time (seconds), 0 to 0.08. Horizontal axis: 1 to 977. Series: tbb_auto, tbb 2 threads, tbb 8 threads, boost 2 threads, boost 8 threads, serial.]

All objects in the real world are potentially dynamic.

But most are not active all the time.

Or their activity occurs at scales that differ by orders of magnitude: planets to people to peanuts to proteins.

This is a lot of stuff to keep track of. The most common approach is to make special categories of objects that are subject to dynamic interaction in a virtual world.

However, some new approaches can create a virtual world that is significantly more complex.


See report for more details

ATLAS in silico

An intimate aesthetic encounter with millions of sequences from the Global Ocean Survey

SIGGRAPH, Digital Eyes 2008, FILE, SPIE, Animated Painting Symposium, Ingenuity Festival Cleveland, Animation: from the Avant-Garde to Popular Culture

NSF SGER funded work-in-progress

“…Darwin faced a very basic visual problem: how could natural selection, a concept almost by definition impossible to illustrate directly, be illustrated, especially when the existing visual conventions of the natural sciences were associated in varying degrees with conceptions of species fixity?”
B. Smith, Charles Darwin and Victorian Visual Culture, 2006

Challenge

As algorithmic/discovery-based practices of 21st Century science mine vast and abstract data sets to reveal the diversity and plasticity of biological systems they create new challenges for representing and intuitively comprehending nature.

Can aesthetic experience and artistic practice call out the relationship between data and understanding, and between data about life and the life that it describes?

Approach

Combine novel displays, cyber-infrastructure, metagenomics and new media arts to enable a “larger-than-life,” compelling experience engaging a broad audience through visceral, aesthetic, poetic encounters with vast data sets to foster curiosity and intuitive understanding

Experience

Immersive atlas = exploration + context

• Metadata environment patterns
• Data element “traces” context
• Homology (BLAST - GOS + public) animation + sound

Visualize collective knowledge production

• BLAST server logs “data precipitates” (clusters of sequences that are repeatedly returned as results to public BLAST searches) “social network” visualization as fluid simulation
• See impact of your activity within the data

Hybrid mode that includes text-overlays and ability to link out to sequences, satellite imagery and other supporting data for researchers.

System Diagram: Immersive & Interactive Virtual Reality Installation

Surround sound and real-time 3D computer graphics

• Input: hand tracking, head/body tracking, activity detection
• Data sources: CAMERA/GOS + Public (data, meta-data); CAMERA/BLAST (server logs)
• Engines: state control, graphics layout engine, shape grammar engine, sequence feature calculations, environment graphic engine, environment physics engine, sound engine
• Display: rendering on Varrier, CAVE, projection
• Sound: audio system
• 30,000 sequences


As of September 2009…

Using a hybrid computational approach has provided a 40x speed-up

• GPU (CUDA)
• Multi-Threading
• GPU Shader

1.2 million sequences

Data-Within-Metadata: Immersive Environment

Regions: metadata categories + value ranges

Data elements (sequences): particles in fluid dynamics simulation

Fields: Regions exert +/- forces

Dynamic/abstract aesthetic patterns

• Particle-to-region interactions: metadata annotations (# sets, values/set)

• Particle-to-particle interactions: sequence characteristics, feature annotations and metadata annotation

• Data aggregates visualize social network/collective knowledge production

Hybrid mode: text-annotation toggles

Diagram labels: Value ranges, Potential Field, Meta-data sets / ORF, Simulation/Physics Engine
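
The idea that regions exert positive or negative forces on data particles can be illustrated with a much simpler model than the fluid simulation used in ATLAS in silico. In the Python sketch below everything is an assumption for illustration: regions are points with a signed strength (positive attracts, negative repels), the falloff law is invented, and the names `field_force` and `step` are hypothetical.

```python
import math

def field_force(pos, regions):
    """Sum the forces that regions exert on a particle at pos.

    Each region is (cx, cy, strength); strength > 0 attracts, strength < 0 repels.
    """
    fx = fy = 0.0
    for cx, cy, strength in regions:
        dx, dy = cx - pos[0], cy - pos[1]
        dist = math.hypot(dx, dy)
        if dist < 1e-9:
            continue                          # particle sits on the region centre
        # Assumed falloff: magnitude strength / (1 + dist^2), along the unit vector.
        scale = strength / ((1.0 + dist * dist) * dist)
        fx += scale * dx
        fy += scale * dy
    return fx, fy

def step(pos, vel, regions, dt=0.1, damping=0.9):
    """Advance one particle by one damped explicit Euler step."""
    fx, fy = field_force(pos, regions)
    vel = ((vel[0] + fx * dt) * damping, (vel[1] + fy * dt) * damping)
    pos = (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)
    return pos, vel
```

With one attracting region to the right of a particle at rest, a single `step` moves the particle to the right; flipping the sign of the strength pushes it away instead. Each particle's update is independent of the others in this model, which is the kind of workload that maps well to a GPU.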

Mapping Sequence to Visual Form

Sequence features → 3D shape grammar

• Length
• Primary (AA) sequence
• Homologs/Alignments
• Pfam Domains
• Secondary structure predictions
• Low complexity regions
• Signal peptides

• Ref-seq ORFs (413,707)
• 17M+ ORFs

Pre-render

Real-time render

• BLAST result (top hit)
• Animation
• Sound

Aesthetics inspired by Ernst Haeckel

Outreach & Venues

Formats
• Varrier
• CAVE
• Single-wall VR
• Online
• 2D Print
• Stereograms
• Rapid Prototyped Sculptures

Science Center Venues
Existing relationships:
• San Francisco Exploratorium
• New York Hall of Science
• Maryland Science Center
• Science Museum of Minnesota
• Museum of Science, Boston
• Museum of Life and Science, Durham NC
• Museum of Science and Industry, Chicago
TryScience.org, ASTC

Cultural Venues
Contemporary art/new media venues
• SIGGRAPH 2007
• Digital Eyes 2008
• FILE 2009
• Animated Painting Symposium
• Ingenuity Festival Cleveland
• Animation: from the Avant-Garde to Popular Culture

Publications/Conferences
• SPIE, IEEE Visualization, ACM-CHI, Leonardo, Convergence, and more…

Collaboration

Team members:
Ruth West, CRCA
Todd Margolis, CRCA
JP Lewis, Stanford University
Rajvikram Singh, Calit2/NCMIR
Jurgen Schulze, Calit2
Daniel Tenedorio, UCSD CSE
Iman Mostafavi, UCSD CSE
Javier I. Girado, Calit2 IVL
Paul Gilna, CAMERA
Weizhong Li, CAMERA
Kayo Arima, CAMERA
Tom Cassey, UCSD CSE
Tommy Chheng, UCSD ECE
Jeff Lien, UCSD ECE
Larry Smarr, Calit2
Ramesh Rao, Calit2

ATLAS in silico is the result of an art-science collaboration.

Academic Support:
CRCA: Center for Research in Computing and the Arts
Calit2: California Institute for Telecommunications and Information Technology
CAMERA: the Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis
NCMIR: National Center for Microscopy and Imaging Research
CENS: UCLA Center for Embedded Networked Sensing

Industrial Support:
Da-Lite Screen Company
Meyer Sound Laboratories
VRCO
The Ingenuity Festival
TimeLogic/Active Motive Inc

Government Support:
National Science Foundation

• Milestones Year 1 and 2
– Migrate 3D application environment to modular client server model.
– Adapt for networked multi-user system to handle multiple requests for new road systems simultaneously
– Test modules performance over hybrid multicore computing environment, i.e. configurations include variations of x86, Cell, Larrabee, Nvidia GPU, z10 mainframe, and x86 compute cluster.
– Offload other asset classes generation processes onto the multicore compute servers
  • Terrain generation from satellite images
  • House & tree placement/scattering
  • Dynamics, Physics, Animation, AI


• Deliverables Year 1 and 2
– Performance guidelines for hybrid computing environments for the interactive graphics application class.
– Deploy various solutions in location based exhibitions as well as with downloadable client software
