Lect5 Video

Transcript
  • CS 414 Multimedia Systems Design
    Lecture 5: Digital Video Representation

    Klara Nahrstedt, Spring 2014

  • Administrative
    MP1 will be out February 2
    MP1 deadline: February 19 (Wednesday), 5pm
    You can use 2 bonus days if needed (keep in mind that you have a total of 3 bonus days across all three MPs)
    Submit via Compass
    MP1 discussion during lecture on February 7 (Friday)

  • Today: Introduced Concepts
    Digital image representation
      Quantization, color issues, image size
    Video
    Additional visual perception dimensions
      Resolution, brightness, temporal resolution
    Television
      Analog, digital
      NTSC, HDTV

  • Color Quantization
    Example of a 24-bit RGB image on a 24-bit color monitor

  • Image Representation Example
    24-bit RGB representation (uncompressed), stored as three color planes.

    Interleaved pixels (R, G, B per pixel):
      (128,135,166) (138,190,132) (129,255,105)
      (189,167,190) (229,213,134) (111,138,187)

    R plane:  128 138 129 189 229 111
    G plane:  135 190 255 167 213 138
    B plane:  166 132 105 190 134 187
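The plane separation above can be sketched in a few lines, using the six sample pixels from the slide:

```python
# Sketch: splitting an interleaved 24-bit RGB buffer into R, G, B planes.
# The values are the six sample pixels from the slide.
pixels = [128, 135, 166,  138, 190, 132,  129, 255, 105,
          189, 167, 190,  229, 213, 134,  111, 138, 187]

r_plane = pixels[0::3]  # every third value, starting at the R sample
g_plane = pixels[1::3]
b_plane = pixels[2::3]

print(r_plane)  # [128, 138, 129, 189, 229, 111]
print(g_plane)  # [135, 190, 255, 167, 213, 138]
print(b_plane)  # [166, 132, 105, 190, 134, 187]
```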

  • Image Properties (Color)

  • Color Histogram

  • Spatial and Frequency Domains
    Spatial domain
      refers to the planar region of intensity values at time t
    Frequency domain
      think of each color plane as a sinusoidal function of changing intensity values
      refers to organizing pixels according to their changing intensity (frequency)

  • Image Size (in Bits)
    Image Size = Height x Width x Bits/pixel

    Example: Consider an image of 320x240 pixels with 8 bits per pixel.
    The image has 320 x 240 = 76,800 pixels, so it takes 76,800 x 8 bits = 614,400 bits, or 76,800 bytes.

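The arithmetic above can be checked with a short sketch of the slide's formula:

```python
# Sketch of the slide's formula: image size = height x width x bits/pixel.
def image_size_bits(height, width, bits_per_pixel):
    return height * width * bits_per_pixel

bits = image_size_bits(240, 320, 8)
print(bits)       # 614400 bits
print(bits // 8)  # 76800 bytes
```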

  • What is 2D Video?
    300 image frames

  • Visual Perception: Resolution and Brightness
    Visual resolution depends on:
      Image size
      Viewing distance
    Brightness
      Perception of brightness is higher than perception of color
      Different perception of primary colors
      Relative brightness green:red:blue = 59%:30%:11%
    Source: wikipedia

  • Visual Perception: Resolution and Brightness

  • Visual Perception: Temporal Resolution
    Effects caused by inertia of the human eye
    Perception of 16 frames/second as a continuous sequence
    Special effect: flicker

  • Temporal Resolution: Flicker
    Perceived if the frame rate or refresh rate of the screen is too low

  • Visual Perception Influence
    Viewing distance
    Display ratio: width/height 4/3 for conventional TV, width/height 16/9 for HDTV
    Number of details still visible
    Intensity (luminance)

  • 3D Video: Stereo Video and Free-Viewpoint Video

  • Stereo 3D Image
    Most stereoscopic methods present two offset images separately to the left and right eye of the viewer. These two-dimensional images are then combined in the brain to give the perception of depth.
    Visual requirements for 3D video:
      Simultaneous perception
      Fusion (binocular single vision)
      Stereopsis (impression of depth)

  • 3D Image
    Binocular viewing of a scene creates two slightly different images of the scene in the two eyes, due to the eyes' different positions on the head.
    These differences, referred to as binocular disparity, provide information that the brain can use to calculate depth in the visual scene, providing depth perception.

    Stereoscopic Image http://www.pixelsonic.com/2011/04/mercedes-300-sl-stereoscopic/

  • 3D Image/Video: Depth Perception
    Depth perception
      Visual ability to perceive the world in 3D and the distance of an object
      Depth sensation is the corresponding term for animals (it is not known whether they perceive depth in the same subjective way that humans do)
    Depth cues
      Binocular cues are based on receipt of sensory information in 3D from both eyes
      Monocular cues can be represented in just 2D and observed (depth) with just one eye


  • 3D Teleimmersive Video

  • Television History (Analog)
    1927: Hoover made a speech in Washington while viewers in NY could see and hear him

    AT&T Bell Labs had the first television:
    18 fps, 2 x 3 inch screen, 2500 pixels

  • Analog Television Concepts
    Production (capture)
      2D structured formats
    Representation and transmission
      popular formats include NTSC, PAL, SECAM
    Re-construction
      scanning
      display issues (refresh rates, temporal resolution)
      relies on principles of the human visual system

  • Color Space: YUV
    PAL video standard
    Y is luminance; U and V are chrominance

    YUV from RGB:
      Y = 0.299R + 0.587G + 0.114B
      U = 0.492 (B - Y)
      V = 0.877 (R - Y)
    (U-V plane shown at Y = 0.5; Source: wikipedia)

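As a sketch (not part of the slides), the YUV formulas above translate directly to code, assuming R, G, B are normalized to the range [0, 1]:

```python
# Sketch: YUV from RGB using the slide's formulas, with R, G, B in [0, 1].
def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # blue-difference chrominance
    v = 0.877 * (r - y)                    # red-difference chrominance
    return y, u, v

# Pure white has full luminance and no chrominance: y ~ 1.0, u ~ 0, v ~ 0.
y, u, v = rgb_to_yuv(1.0, 1.0, 1.0)
```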

  • YIQ (NTSC)

    YIQ from RGB:
      Y = 0.299R + 0.587G + 0.114B
      I = 0.74 (R - Y) - 0.27 (B - Y)
      Q = 0.48 (R - Y) + 0.41 (B - Y)
    (YIQ plane shown at Y = 0.5; Source: wikipedia)

  • Video Representations

  • TV History

  • HDTV (Digital)
    Resolutions:
      1920x1080 (1080p): standard HD (HDTV)
      2160p, 4096x2304 (4K): high HD
    Frame rate:
      HDTV: 50 or 60 frames per second, up to 120 fps


  • HDTV
    Interlaced (i) and/or progressive (p) formats
      Conventional TVs use interlaced formats
      Computer displays (LCDs) use progressive scanning
    MPEG-2 compressed streams
      In Europe (Germany): MPEG-4 compressed streams


  • Aspect Ratio and Refresh Rate
    Aspect ratio
      Conventional TV is 4:3 (1.33)
      HDTV is 16:9 (1.78)
      Cinema uses 1.85:1 or 2.35:1
    Frame rate
      NTSC is 60Hz interlaced (actually 59.94Hz)
      PAL/SECAM is 50Hz interlaced
      Cinema is 24Hz non-interlaced
    Source: wikipedia

  • Resolution

  • Digital Video and TV

    Bit rate: amount of information stored per unit time (second) of a recording
    Color coding: YCbCr
      Subset of YUV that scales and shifts the chrominance values into the range 0..1

      Y  = 0.299R + 0.587G + 0.114B
      Cb = ((B - Y) / 2) + 0.5
      Cr = ((R - Y) / 1.6) + 0.5

  • Digital Video and TV
    Color space compression (chroma subsampling):
      YUV444: 24 bits per pixel
      YUV422: 16 bits per pixel
      YUV411: 12 bits per pixel

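The bits-per-pixel figures above follow from how many samples each scheme keeps per group of 4 pixels. A sketch, assuming 8 bits per sample (the function name is illustrative):

```python
# Sketch: average bits per pixel under chroma subsampling, assuming
# 8 bits per sample. The scheme Y:Cb:Cr gives the number of luma and
# chroma samples kept per group of 4 pixels.
def bits_per_pixel(y, cb, cr, bits_per_sample=8):
    samples_per_group = y + cb + cr      # samples stored per 4-pixel group
    return samples_per_group * bits_per_sample / 4

print(bits_per_pixel(4, 4, 4))  # 24.0 (YUV444)
print(bits_per_pixel(4, 2, 2))  # 16.0 (YUV422)
print(bits_per_pixel(4, 1, 1))  # 12.0 (YUV411)
```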

  • Digital Video and TV: DVD Video
    Since 1997
    Resolution and frame rate:
      704x480 at 29.97 fps (NTSC)
      704x576 at 25 fps (PAL)
    Bit rate: up to 9.8 Mbps

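As a rough sketch (not from the slides), the 9.8 Mbps figure implies a sizeable compression factor, assuming 704x480 at 29.97 fps and 12 bits/pixel (chroma-subsampled, as on the earlier color-space-compression slide):

```python
# Sketch: how much DVD's 9.8 Mbps bit rate compresses the raw video,
# assuming 704x480 at 29.97 fps and 12 bits/pixel.
raw_bps = 704 * 480 * 12 * 29.97    # uncompressed bits per second
dvd_bps = 9.8e6                     # DVD maximum video bit rate

print(round(raw_bps / 1e6, 1))      # 121.5 (Mbps uncompressed)
print(round(raw_bps / dvd_bps, 1))  # 12.4 (compression factor)
```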

  • Digital Video and TV: Blu-ray Video
    Since 2006
    Resolution and frame rate:
      1920x1080 interlaced (at 59.94)
      1920x1080 progressive (at 24 fps)
    Bit rate: up to 40 Mbps

  • 3DTV
    Refresh rate no less than 120 Hz
    Synchronized shutter glasses enable different views for different eyes

  • Summary
    Digitization of video signals
      Composite coding
      Component coding
    Digital Television (DTV)
      DVB (Digital Video Broadcast)
      Satellite connections and CATV networks are best suited for DTV
        DVB-S for satellites (also DVB-S2)
        DVB-C for CATV


  • SMPTE Time Codes
    The Society of Motion Picture and Television Engineers defines time codes for video: HH:MM:SS:FF
    01:12:59:16 represents the picture at 1 hour, 12 minutes, 59 seconds, 16 frames
    At 30 fps, 59 seconds represent 59*30 frames, 12 minutes represent 12*60*30 frames, and 1 hour represents 1*60*60*30 frames
    For NTSC, SMPTE uses a 30 drop-frame code
      increments as if using 30 fps, when really NTSC has only 29.97 fps
      defines rules to remove the difference error

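The frame arithmetic above can be sketched as a timecode-to-frame conversion, using the slide's non-drop-frame counting at 30 fps (the helper name is illustrative):

```python
# Sketch: HH:MM:SS:FF to an absolute frame count, using the slide's
# non-drop-frame arithmetic at 30 fps.
def timecode_to_frames(hh, mm, ss, ff, fps=30):
    return ((hh * 60 + mm) * 60 + ss) * fps + ff

print(timecode_to_frames(1, 12, 59, 16))  # 131386
```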

    Speaker notes (ORCHID Research Group, Department of Computer Science, University of Illinois at Urbana-Champaign):

    Stereo video focuses on the 3D impression presented. The simplest method is ... Nowadays, we can also use depth-image-based rendering, light fields, and other techniques for this.

    FVV (free-viewpoint video) focuses on the 3D attributes of the video object. It tries to render the video object from an arbitrary virtual viewpoint and combine the video objects with 3D graphics. So we want to find out whether free data reduction is indeed possible in tele-immersive video. To do this, let's first understand how tele-immersive video is generated. What is its format? Basically, on the camera host computer there is a whole pipeline: first, 2D images are taken from the different eyes of the camera; then one of them is taken as a reference image for background subtraction; the foreground is partitioned into a polygon mesh; then, importantly, depth mapping is done for the mesh vertices by correlating with the other 2D image (here, the right one). The depth map is finally combined with the texture (color) map to produce a 3D video frame.

    SMPTE timecode is a set of cooperating standards to label individual frames of video or film with a timecode, defined by the Society of Motion Picture and Television Engineers in the SMPTE 12M specification. Timecodes are added to film, video, or audio material, and have also been adapted to synchronize music. They provide a time reference for editing, synchronization, and identification. Timecode is a form of media metadata; its invention made modern videotape editing possible and eventually led to the creation of non-linear editing systems.

    SMPTE (pronounced "sim-tee") timecode contains binary-coded-decimal hour:minute:second:frame identification and 32 bits for use by users. There are also drop-frame and colour-framing flags, and three extra 'binary group flag' bits used for defining the use of the user bits. The formats of other SMPTE timecodes are derived from that of the longitudinal timecode. Timecode can have any of a number of frame rates; common ones are:
      24 frame/s (film)
      25 frame/s (PAL colour television)
      29.97 (30 x 1000/1001) frame/s (NTSC color television)
      30 frame/s (American black-and-white television, virtually obsolete)
    In general, SMPTE timecode frame-rate information is implicit, known from the rate of arrival of the timecode from the medium, or from other metadata encoded in the medium. The interpretation of several bits, including the "colour framing" and "drop frame" bits, depends on the underlying data rate; in particular, the drop-frame bit is only valid for a nominal frame rate of 30 frame/s. More complex timecodes, such as vertical interval timecode, can also include extra information in a variety of encodings.

    SMPTE time code is a digital signal whose ones and zeroes assign a number to every frame of video, representing hours, minutes, seconds, frames, and some additional user-specified information such as tape number. For instance, the time code number 01:12:59:16 represents a picture 1 hour, 12 minutes, 59 seconds, and 16 frames into the tape.

