
C++ on the Web (GDCE 2013)

Date post: 08-May-2015
Upload: andre-weissflog
Description:
My GDCE 2013 presentation about C++ on the web, a more detailed "remix" of the QuoVadis 2013 presentation.
C++ on the Web A Tale from the Trenches Andre Weissflog Head of Development, Berlin Bigpoint GmbH GDC Europe 2013
Transcript
Page 1: C++ on the Web (GDCE 2013)

C++ on the Web
A Tale from the Trenches

Andre Weissflog
Head of Development, Berlin
Bigpoint GmbH

GDC Europe 2013

Page 2: C++ on the Web (GDCE 2013)

What’s this about?

• the web as a new target platform for C++ code

• differences to traditional platforms

• differences between C++/web technologies

• porting problems and solutions

Page 3: C++ on the Web (GDCE 2013)

Demos

• Dragons Demo: minimal 3D skinned character demo [show demo]

• Map Demo: more advanced 3D demo [show demo]

• based on Nebula3 engine, also used in Drakensang Online

Page 4: C++ on the Web (GDCE 2013)

Why develop for the web HTML5 + WebGL?

Create Deploy Play

• no walled gardens, no gate-keepers, no certification process
• free choice of hosting & payment providers
• no installations, no updates, no plugins, no lengthy downloads
• multi-platform “for free”
• battle-hardened security infrastructure

The web is the most open and seamless platform for users and developers.

Page 5: C++ on the Web (GDCE 2013)

C++ to web technologies

LLVM has opened up a lot of new usage scenarios for C/C++, for instance running C/C++ code inside byte code VMs and other sandboxed environments:

• Google’s pNaCl
• Mozilla’s emscripten
• Adobe’s crossbridge

Page 6: C++ on the Web (GDCE 2013)

Mozilla’s emscripten

• open-source project, started in 2010
• .cpp → LLVM .bc → .js
• extremely active and responsive dev team
• lots of wrapper APIs (OpenGL, SDL, GLUT, ...)
• limited threading support (no pthreads)

Recent Developments:
• asm.js (highly optimizable subset of JS)
• massive compilation speed improvements
• inline Javascript directly into C++

Page 7: C++ on the Web (GDCE 2013)

Google’s pNaCl

• open-source project, started in 2008
• .cpp → LLVM .bc → (deploy) x86/x64/ARM
• Google Chrome only
• safe sandbox for native code execution
• full pthreads implementation

Recent Developments:
• pNaCl finally ready for prime-time
• enabled in Chrome v.30 and up
• no longer restricted to Chrome Web Store apps

Page 8: C++ on the Web (GDCE 2013)

Adobe’s crossbridge

• formerly known as Alchemy and flascc
• started in 2008, recently open-sourced
• .cpp → LLVM .bc → AVM2 byte code
• runs in Flash plugin
• proprietary 3D API (Stage3D)
• incredibly slow and resource hungry build process :/

Page 9: C++ on the Web (GDCE 2013)

Focus...

Will mostly talk about emscripten (and some pNaCl).

Why:
• emscripten has widest reach (all major browsers)
• emscripten progresses incredibly fast
• pNaCl currently has edge in threading support
• pNaCl and emscripten are actually quite similar from dev perspective

But Javascript is slow, isn’t it?

asm.js generated code is probably faster than you think, and pNaCl generated code is probably slower than you think (don’t have hard benchmark numbers yet... sorry).

IMHO: for 3D games, the real performance gains will come through WebGL extensions: call overhead is high, so extensions are needed to reduce the number of GL calls!

Page 10: C++ on the Web (GDCE 2013)

My [OSX] dev environment

• Xcode (for compiling/debugging native OSX and iOS apps)

• Eclipse (for emscripten and NaCl specific dev work)

• emscripten SDK

• NaCl SDK

• cmake

• a local HTTP server (e.g. “python -m SimpleHTTPServer”)

Page 11: C++ on the Web (GDCE 2013)

Multiplatform Build System

cmake: flexible meta-build-system, generates IDE project files and/or makefiles from generic “CMakeLists” files.

CMakeLists.txt files define compile targets and their source files.

cmake toolchain files define platform-specific tools, header/lib search paths and compile options:

• ios.toolchain.cmake
• osx.toolchain.cmake
• pnacl.toolchain.cmake
• emscripten.toolchain.cmake
• android.toolchain.cmake
• windows.toolchain.cmake

[Diagram: CMakeLists.txt + toolchain file → cmake → make]

Page 12: C++ on the Web (GDCE 2013)

Multiplatform Ecosystem

• 32 bit + 64 bit (x86, x86_64, ARM): code must be 32/64 bit clean, big-endian no longer matters
• POSIX + Windows: Windows still big, everything else is POSIX-ish
• OpenGL vs Direct3D?: OpenGL renaissance, but D3D9 still relevant
• no exceptions, no RTTI, no STL, no Boost: these often make porting to exotic platforms harder, not easier

What keeps me awake at night:

• WinXP is still incredibly big in Eastern Europe & Asia
• 3D feature base-line is OpenGL ES 2.0 w/o extensions (== WebGL)
• go fully GL on all platforms? (GL driver quality? Win8 Metro apps? ANGLE?)

Page 13: C++ on the Web (GDCE 2013)

N3 Multiplatform Philosophy

Platform-specific code lives in its own sub-directories.

Platform-specific pre-processor defines provided by build system: __POSIX__, __IOS__, __NACL__, __EMSCRIPTEN__, ...

Diamond-shape class hierarchy resolved at compile time:

class DisplayCoreBase
→ class NaclDisplayCore (#if __NACL__) / class EmscDisplayCore (#if __EMSCRIPTEN__) / class IOSDisplayCore (#if __IOS__)
→ class DisplayCore
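A minimal sketch of how such a compile-time hierarchy could look. The class names follow the slide, but the method `PlatformName()`, the alias `PlatformDisplayCore` and the desktop fallback are assumptions added for illustration, not actual Nebula3 code. The “diamond” is only the shape of the diagram: exactly one middle class is compiled in, so there is no multiple inheritance at runtime.

```cpp
#include <cassert>

// Hypothetical stand-ins for the classes named on the slide; the real
// Nebula3 classes are not shown in the talk.
class DisplayCoreBase {
public:
    virtual ~DisplayCoreBase() = default;
    // Shared, platform-agnostic behaviour lives in the base class.
    virtual const char* PlatformName() const { return "base"; }
};

// The build system provides exactly one of these defines per target.
#if defined(__EMSCRIPTEN__)
class EmscDisplayCore : public DisplayCoreBase {
public:
    const char* PlatformName() const override { return "emscripten"; }
};
using PlatformDisplayCore = EmscDisplayCore;
#elif defined(__NACL__)
class NaclDisplayCore : public DisplayCoreBase {
public:
    const char* PlatformName() const override { return "pnacl"; }
};
using PlatformDisplayCore = NaclDisplayCore;
#elif defined(__IOS__)
class IOSDisplayCore : public DisplayCoreBase {
public:
    const char* PlatformName() const override { return "ios"; }
};
using PlatformDisplayCore = IOSDisplayCore;
#else
// Assumption: a desktop fallback so the sketch also compiles natively.
class DesktopDisplayCore : public DisplayCoreBase {
public:
    const char* PlatformName() const override { return "desktop"; }
};
using PlatformDisplayCore = DesktopDisplayCore;
#endif

// Platform-agnostic code only ever refers to DisplayCore.
class DisplayCore : public PlatformDisplayCore {};
```

Application code instantiates `DisplayCore` and never mentions a platform class, so no `#if` blocks leak out of the platform-specific sub-directories.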

Page 14: C++ on the Web (GDCE 2013)

Multiplatform Line Counts

Dragons Demo (~170k lines of code)

• platform-agnostic: 148k
• POSIX/CRT: 7k
• OpenGL: 6.7k
• emscripten: 3k
• pNaCl: 3.5k
• OSX/iOS: 2.2k

about ~2% platform-specific code

Page 15: C++ on the Web (GDCE 2013)

Size Comparisons (Dragons Demo)

~170k lines of C++ code

OSX (-arch i386 -O3):
• orig: 2027 kByte
• +no asserts: 1457 kByte
• +stripped: 1237 kByte
• +gzipped: 413 kByte

OSX (-arch x86_64 -O3):
• orig: 2134 kByte
• +no asserts: 1663 kByte
• +stripped: 1427 kByte
• +gzipped: 460 kByte

iOS (-arch armv7 -O3):
• orig: 1542 kByte
• +no asserts: 1196 kByte
• +stripped: 972 kByte
• +gzipped: 395 kByte

pNaCl (-O2):
• orig: 1654 kByte
• +no asserts: 1333 kByte
• +stripped: 1333 kByte
• +gzipped: 842 kByte

emscripten (-O2 --llvm-opts 3 --llvm-lto 3):
• orig: 5414 kByte
• +no asserts: 2154 kByte
• +closure pass: 1951 kByte
• +gzipped: 486 kByte

wow, surprisingly compact!

smaller than expected

bigger than expected

closure: Google’s JS optimizer/minifier

Page 16: C++ on the Web (GDCE 2013)

The Callback Problem

Key Point to understand (and accept):

Browser runtime environment uses callback model for asynchronous programming.

Start lengthy operation, provide callback which will be called when operation is finished: becomes very messy very quickly.

Games are usually frame-driven, not callback-driven.

This is the main riddle when trying to port a game engine to browser platforms.

Page 17: C++ on the Web (GDCE 2013)

The Game Loop Problem

Most event-driven platforms don’t let you “own the game loop”.

Instead the application runs completely inside event callback functions which must return quickly.

Failing to return quickly results in unresponsive behaviour or even your app being killed.

pNaCl

Page 18: C++ on the Web (GDCE 2013)

The Game Loop Problem

Best solution is to use the app main thread exclusively for system event handling and spawn a “Game Thread” which runs the actual game loop.

• Main Thread: only wakes up on system events (input events, quit event, display change events).
• Game Thread: runs your typical “infinite” game loop.

Page 19: C++ on the Web (GDCE 2013)

The CallOnMainThread-Problem

Some platforms have restrictions on what OS functionality is accessible from threads.

E.g. must call OpenGL or IO functions from the main thread only.

pNaCl

Either run everything on main thread, or dispatch “system calls” to run asynchronously on main thread.

Page 20: C++ on the Web (GDCE 2013)

CallOnMainThread problem

All “PPAPI calls” must happen on main thread, and the main thread must never block.

pNaCl

Threads can push function pointers for deferred execution on main thread.

Deferred function calls and result callbacks execute in a simple run-loop after your per-frame callback on the main thread.

This primitive runloop/callback model makes it easy to shoot yourself in the foot by waiting for events triggered by your own callbacks. This stops the entire runloop and freezes the app.

But: All other threads can block as much as they want, waiting for events triggered by callbacks on the main thread. Nice way to simulate blocking I/O.

Conclusion: pNaCl’s full threading support can be used to work around many of its restrictions by moving the actual game logic into its own thread, and using the main thread only for “system calls” and their result callbacks.
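The pattern can be sketched in portable C++ with standard threads. This is an illustration under assumptions, not PPAPI code: `MainThreadQueue`, `IoRequest` and `BlockingRead` are hypothetical names, the run queue that the pNaCl runtime would own is modelled with a mutex-protected deque, and the asynchronous IO is collapsed into a single deferred function that produces a placeholder result.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical run queue of the main thread. In a real pNaCl app the runtime
// provides this; here it is modelled with a mutex-protected deque.
class MainThreadQueue {
public:
    // Callable from any thread: defer a function to the main thread.
    void CallOnMainThread(std::function<void()> func) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(func));
    }

    // Called on the main thread once per frame. Runs the functions queued so
    // far, never waits for anything, and returns how many it ran.
    int RunPending() {
        std::deque<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(pending_);
        }
        for (auto& func : batch) func();
        return static_cast<int>(batch.size());
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> pending_;
};

// State shared between the game thread and the main-thread callback.
struct IoRequest {
    std::mutex mutex;
    std::condition_variable finished;  // the "finished-condvar"
    bool done = false;
    int bytesRead = 0;
};

// Game-thread side: looks like blocking IO, but the work runs on the main thread.
int BlockingRead(MainThreadQueue& queue) {
    auto request = std::make_shared<IoRequest>();
    queue.CallOnMainThread([request] {
        // StartIO and FinishIO collapsed into one step for brevity. A real
        // port would start an async request here and signal from its callback.
        std::lock_guard<std::mutex> lock(request->mutex);
        request->bytesRead = 1024;  // placeholder result
        request->done = true;
        request->finished.notify_all();
    });
    // Only the game thread blocks; the main thread keeps running its loop.
    std::unique_lock<std::mutex> lock(request->mutex);
    request->finished.wait(lock, [&] { return request->done; });
    return request->bytesRead;
}
```

The main thread calls `RunPending()` from its per-frame callback while the game thread sits inside `BlockingRead()`. Calling `BlockingRead()` on the main thread itself would reproduce the freeze described above, because the function it waits for could never run.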

Page 21: C++ on the Web (GDCE 2013)

CallOnMainThread visualized (pNaCl)

pNaCl runtime (runloop/callbacks) invokes callbacks to pNaCl app code: initialization, invoke deferred funcs, invoke result callbacks, ...

Main Thread (your pNaCl main-thread code):

• Init(): launch Game Thread (new thread)
• StartIO(): begin async IO, set finish-callback to FinishIO()
• FinishIO(): set finished-condvar

your Game Thread:

• CallOnMainThread(StartIO): put StartIO func ptr on main thread’s run queue and wait for finished-condvar
• ...blocked...
• finished-condvar is set, continue Game Thread

Page 22: C++ on the Web (GDCE 2013)

Limitations

emscripten has similar restrictions to pNaCl, but can’t easily use threads to work around them:

• most “interesting functions” (WebGL!) must be called from main thread
• main thread must not block
• no pthreads, only WebWorkers for threading
• WebWorkers have their own “address space”

Can’t move entire game loop into WebWorker thread (yet?)

Browser vendors working towards more flexible WebWorkers, but HTML5 standardization takes time.

Page 23: C++ on the Web (GDCE 2013)

Limitation Workarounds

All your code must run inside “slices”, always return within 16 or 32 ms to browser (for 60 or 30 fps).

If something takes longer, either spread work over several frames, or move into WebWorker.

N3 has new “PhasedApplication” model: app goes through phases, which tick themselves forward when finished:

OnInit → OnPreloading → OnOpening → OnRunning → OnClosing → OnQuit

The emscripten runtime environment calls OnFrame, which runs the current phase for max 16 ms or 32 ms.
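A minimal sketch of such a phased application as a per-frame state machine. The phase names follow the slide; the class layout, the `Phase` enum, the return-next-phase convention and `DemoApp` are assumptions for illustration, not the Nebula3 implementation.

```cpp
#include <cassert>

enum class Phase { Init, Preloading, Opening, Running, Closing, Quit };

class PhasedApplication {
public:
    virtual ~PhasedApplication() = default;

    // Called once per browser frame and must return quickly.
    // Returns false once the application has quit.
    bool OnFrame() {
        switch (phase_) {
            case Phase::Init:       phase_ = OnInit();       break;
            case Phase::Preloading: phase_ = OnPreloading(); break;
            case Phase::Opening:    phase_ = OnOpening();    break;
            case Phase::Running:    phase_ = OnRunning();    break;
            case Phase::Closing:    phase_ = OnClosing();    break;
            case Phase::Quit:       OnQuit();                return false;
        }
        return true;
    }

    Phase CurrentPhase() const { return phase_; }

protected:
    // Each handler does one slice of work and returns the phase to run on the
    // next frame. Returning its own phase means "not finished yet".
    virtual Phase OnInit()       { return Phase::Preloading; }
    virtual Phase OnPreloading() { return Phase::Opening; }
    virtual Phase OnOpening()    { return Phase::Running; }
    virtual Phase OnRunning()    { return Phase::Closing; }
    virtual Phase OnClosing()    { return Phase::Quit; }
    virtual void  OnQuit()       {}

private:
    Phase phase_ = Phase::Init;
};

// Example app: preloading takes three frames, the game runs for two frames.
class DemoApp : public PhasedApplication {
protected:
    Phase OnPreloading() override {
        return (++preloadFrames_ < 3) ? Phase::Preloading : Phase::Opening;
    }
    Phase OnRunning() override {
        return (++runFrames_ < 2) ? Phase::Running : Phase::Closing;
    }

private:
    int preloadFrames_ = 0;
    int runFrames_ = 0;
};
```

In an emscripten build, `OnFrame()` would typically be called from the function registered with `emscripten_set_main_loop()`, while a native build can simply call it in a `while` loop.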

Page 24: C++ on the Web (GDCE 2013)

Threading Workarounds

Failed approach: Try to wrap low-level threading code in some sort of “co-operative thread scheduling” system.

Success: Move abstraction to a higher level (don’t wrap “low level threads”, but wrap “parallel task system”).

2 uses for threading: hide blocking / make use of additional CPU cores.

Nebula3 parallel task system model:

[Diagram: Dispatcher sends requests through a queue to WorkerThread(s), which send responses back]

3 Flavours:

• Blocking: thread sleeps until messages arrive
• Timeout: block until messages arrive, or timeout occurs
• Run-through: infinite loop doing per-frame work, pull messages

emscripten port adds 2 “run modes”:

• Parallel: work is pushed to WebWorker threads (makes use of CPU cores)
• Sliced: runs on main thread, work is “triggered” per frame (hides callback mess)

Page 25: C++ on the Web (GDCE 2013)

Nebula3 IO System

Closer to HTTP philosophy than fopen()/fclose():

• URLs instead of file system paths
• asynchronous IO is default, synchronous is special case
• pluggable filesystem handlers associated with URL scheme (http://, file://, ...)
• Stream objects with StreamReaders and StreamWriters

[Diagram: App Code sends an IO request (http://..., file://...) to the IOSystem, which dispatches it to the HTTP File System or the Local File System and returns an IO response with a Stream object holding the file data]

• Filesystem modules return Stream objects holding downloaded data
• Stream objects have typical Read/Seek/... methods
• IO response is a “Future” object, app code polls whether response has become valid
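The polling model can be sketched as follows. All names here (`Stream`, `IoResponse`, `IoSystem`, `RegisterScheme`, `Update`) are hypothetical and only mirror the concepts above, they are not the Nebula3 API. For brevity the handlers finish synchronously inside `Update()`, whereas a real HTTP handler would complete the response from a download callback several frames later.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Holds the loaded file data.
struct Stream {
    std::vector<char> data;
    std::size_t Size() const { return data.size(); }
};

// Future-style response: invalid until a filesystem handler has completed it.
class IoResponse {
public:
    bool IsValid() const { return stream_ != nullptr; }
    const Stream& GetStream() const { return *stream_; }
    void Complete(std::shared_ptr<Stream> stream) { stream_ = std::move(stream); }

private:
    std::shared_ptr<Stream> stream_;
};

class IoSystem {
public:
    using Handler = std::function<std::shared_ptr<Stream>(const std::string& url)>;

    // Associate a filesystem handler with a URL scheme such as "http" or "file".
    void RegisterScheme(const std::string& scheme, Handler handler) {
        handlers_[scheme] = std::move(handler);
    }

    // Asynchronous by default: returns immediately with a not-yet-valid response.
    std::shared_ptr<IoResponse> Request(const std::string& url) {
        auto response = std::make_shared<IoResponse>();
        pending_.emplace_back(url, response);
        return response;
    }

    // Called once per frame: hands each pending request to the handler for its
    // scheme. Requests with an unknown scheme are dropped and never become valid.
    void Update() {
        for (auto& entry : pending_) {
            const std::string scheme = entry.first.substr(0, entry.first.find("://"));
            auto it = handlers_.find(scheme);
            if (it != handlers_.end()) {
                entry.second->Complete(it->second(entry.first));
            }
        }
        pending_.clear();
    }

private:
    std::map<std::string, Handler> handlers_;
    std::vector<std::pair<std::string, std::shared_ptr<IoResponse>>> pending_;
};
```

App code keeps the returned response object and checks `IsValid()` once per frame, which fits the frame-driven structure of a game better than a completion callback.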

Page 26: C++ on the Web (GDCE 2013)

Asset Loading

Easy way: emscripten can pre-load assets into memory before app starts, accessible through fopen() / fread().

Downside: delay on startup, memory cost - doesn’t work well for big asset sets.

Solution: need to stream and uncompress all assets on demand asynchronously.

Problem: HTTP downloads much slower than loading from HDD, can’t block while waiting for download to finish.

[Diagram: App Code → IO request → HTTP File System → HTTP request → Web Server → HTTP response → Uncompress WebWorker → IO response → App Code]

HTTP File System has platform-specific implementations:

• emscripten: emscripten_async_wget_data()
• pNaCl: pp::URLLoader
• OSX/iOS: NSURLRequest
• Linux / Windows: libcurl
• fallback: home-made HTTP client using raw TCP sockets (tricky!)

Page 27: C++ on the Web (GDCE 2013)

Preloading Phase

Problem: Sometimes asynchronous loading is too much hassle, or even impossible (for instance when using 3rd party libs).

Solution: Have pre-loading app phases, show loading screen, download and pin files into a memory filesystem, continue to next app phase when files have finished downloading.

Synchronous IO functions exclusively access data in memory filesystem, fail if file hasn’t been preloaded.

[Diagram: Preloading (Loading Screen On) populates the Memory File System from the Web Server via HTTP → Running (Loading Screen Off) reads from the Memory File System with fread(); the Preloading/Running cycle can repeat]

Only use this approach when absolutely necessary and only for small files, not for textures, geometry, audio, etc...
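A minimal sketch of the memory filesystem idea, assuming a simple path-to-bytes map. `MemoryFileSystem`, `Populate`, `IsPreloaded` and `Read` are hypothetical names for illustration and not part of emscripten or Nebula3.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

class MemoryFileSystem {
public:
    // Called during a preloading phase when a download has finished:
    // pins the file data in memory.
    void Populate(const std::string& path, std::vector<char> data) {
        files_[path] = std::move(data);
    }

    bool IsPreloaded(const std::string& path) const {
        return files_.count(path) != 0;
    }

    // Synchronous read: succeeds only for preloaded files and never touches
    // the network, so it cannot block the browser's main thread.
    bool Read(const std::string& path, std::vector<char>& out) const {
        auto it = files_.find(path);
        if (it == files_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::map<std::string, std::vector<char>> files_;
};
```

A preloading phase would call `Populate()` for each finished download and advance to the next phase once every required path reports `IsPreloaded()`.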

Page 28: C++ on the Web (GDCE 2013)

Debugging

None of the C++ web solutions have really good interactive debugging support (yet).

Develop and debug your app mainly as a native desktop app for OSX or Windows inside Xcode or Visual Studio, this gives the best turn-around time and “debugging experience”.

Only fall back to low-level debugging for platform-specific code.

emscripten debugging can be surprisingly easy:

• generated Javascript can be made very readable (see -g options in emcc)
• can inject debugging statements without recompiling
• see emscripten/src/settings.js for some interesting runtime debug options

Page 29: C++ on the Web (GDCE 2013)

JS Debugging with Source Maps

emcc -g4 generates source maps containing reference data to the original C++ sources.

Interactively debug C++ code in the browser! (still feels very rough around the edges though)

Page 30: C++ on the Web (GDCE 2013)

Too many slides, too little time...

Other interesting problem areas:

Audio:
• WebAudio vs Audio tag
• no common compressed audio format across browsers

Networking:
• WebSockets or WebRTC
• much more restrictive than Berkeley Sockets (security reasons)

Feels like back in the 90’s, have to roll our own Audio and Networking libs AGAIN :(

Page 31: C++ on the Web (GDCE 2013)

Too many slides, too little time...

OpenGL:
• have to settle on OpenGL ES 2 feature set
• “it just works!” ...even on mobile
• Main problem is call overhead into WebGL, but it’s still surprisingly fast.

Page 32: C++ on the Web (GDCE 2013)

Questions?

Resources

http://flohofwoe.blogspot.com

http://www.flohofwoe.net/demos.html

