Page 1: [IEEE 2011 18th IEEE International Conference on Image Processing (ICIP 2011) - Brussels, Belgium (2011.09.11-2011.09.14)] 2011 18th IEEE International Conference on Image Processing

RECONSTRUCTING STATIC SCENE VIEWED THROUGH SMOKE USING VIDEO

Ákos Kiss(1), Tamás Szirányi(2)

(1) Budapest University of Technology and Economics, Dept. of Control Engineering and Information Technology, Magyar tudósok körútja 2., H-1117 Budapest, Hungary, [email protected]

(1)(2) Computer and Automation Research Institute, Hungarian Academy of Sciences, Distributed Events Analysis Research Group, Kende u. 13-17, H-1111 Budapest, Hungary, [email protected]

ABSTRACT

In this paper we present a method for reconstructing a static scene viewed through thick smoke using multiple images. Based on a spatiotemporal statistical approach, our method works well on noisy videos containing swirling smoke. We apply statistical analysis to regions of color input images, and show how to reconstruct the scene by transforming images to alter mean and deviation locally. We introduce a method to extract the necessary parameters using multiple frames of a video. We verify our method with the widely used physical model of aerosols, highlighting some differences from removing haze and fog, a widely studied area. Furthermore, our approach eliminates the need for complex optimization, making real-time processing possible. Results show that our method is capable of reconstructing the scene in challenging cases.

Index Terms — Haze, fog, smoke, deweather, reconstruction, noise, video processing, statistics, spatiotemporal

1. INTRODUCTION

Aerosols like haze, mist and fog degrade image quality. The widely used physical model describing the effect of aerosols [1] leads to a highly underconstrained system of equations, hence additional information or constraints must be introduced to solve the problem and reconstruct the haze-free image, a process referred to as deweathering. However, it is clear from this model that the effect can be computed and removed from the image knowing the ambient light and the distribution of the aerosols.

Many works target deweathering urban images. Some of them use multiple images of the static scene taken under different weather conditions, increasing the number of equations. In [2, 3, 4] a homogeneously distributed aerosol is assumed, which leads to a similar model where the degradation of image data depends on the depth map. This applies to haze and fog, so reconstruction of the clear scene and the depth map take place simultaneously. In [5] some user input is still needed, but fully automatic methods were also developed [6, 2, 3].

Image degradation affects both saturation and contrast of the image. Color holds valuable information, but reconstruction is possible from contrast only [4, 3]. A common method of retrieving the aerosol distribution is based on the observation that in clear color images, usually one channel is pixelwise dark [7]. This so-called dark channel prior leads to a straightforward estimation of the aerosol distribution and image reconstruction from only one image. In this case, pixels with low saturation violate the assumption and will become dark in the reconstructed image.

(This work was partially supported by the Hungarian Scientific Research Fund under grant number 80352.)

Scene-camera geometry can also be used to refine results. In [8], the depth is considered to increase from bottom to top, which holds for a wide range of outdoor photographs.

To refine haze parameters, edge-preserving bilateral filtering [9, 10] and soft matting [7, 1] were used. These help decide whether an edge is due to a change of depth or a change of scene.

It is obvious that the scene and the aerosol distribution have no physical relation, therefore they should be uncorrelated [11]. This statistical constraint can be used to reconstruct the scene, but variation in the background image and low noise are required.

Using appropriate filtering and implementation it is possible to carry out real-time haze removal for videos [9].

These methods were applied to haze and fog, but the model is also valid for smoke. In several cases smoke has specific turbulence patterns, so it can be detected by statistical models [12]. In the case of videos recording a scene inside smoke, the noise [12], the density and variation of the smoke, along with different lighting conditions, prevent these methods from working properly.

Reconstructing a static scene viewed through smoke from video is possible by selecting the clearest candidate throughout the whole video for each region of the images [13]. For good results we need at least one clear view for every region, so thick, constantly present smoke cannot be eliminated.

We address the problem from a statistical point of view, eliminating noise in measurements and variation in the smoke distribution. The idea is introduced in Section 2. In Section 3 we show the physical model and also show that the model explains the method described in Section 2 well. We describe the steps of reconstruction in Section 4, and conclude with the results in Section 5.

2. METHOD

When observing a scene through thick smoke, saturation decreases dramatically, while luminance increases as smoke reflects more of the ambient light. In the case of thick smoke, reflected sunlight seems to completely suppress the characteristics of the scene, as in Fig. 1(b).

However, further analysis shows that certain characteristics are still present: the saturation of these smoky images is proportional to the saturation of the scene. This means that by reducing intensity and emphasizing saturation of the image, we can reconstruct the scene:

2011 18th IEEE International Conference on Image Processing

978-1-4577-1303-3/11/$26.00 ©2011 IEEE 3461


Fig. 1. (a) is the ground truth scene, in (b) the scene is highly occluded by thick smoke, and (c) is the reconstructed scene from (b) using ground truth parameters extracted from (a).

$J^c_x = S^c_x \, (I^c_x - R^c_x)$  (1)

where $x$ denotes the pixel position, $c$ the color channel, $J^c_x$ is the estimated scene value, $S^c_x$ is the chrominance emphasizing parameter, $I^c_x$ is the observed pixel value and $R^c_x$ is the intensity of the ambient light reflection (this phenomenon will be explained by the model in Section 3). Our idea is to consider an image region B, where we assume $R^c_x$ and $S^c_x$ to be constant, denoted by $R^c_B$ and $S^c_B$. These two parameters fully control the mean value and deviation of the scene values in region B:

$\bar{J}^c_B = S^c_B \bar{I}^c_B - S^c_B R^c_B$  (2)

$\sigma(J^c_B) = S^c_B \, \sigma(I^c_B)$  (3)

Having an estimation of the mean $\hat{J}^c_B$ and deviation $\hat{\sigma}(J^c_B)$, we can compute the $R^c_B$ and $S^c_B$ values so that $\bar{J}^c_B = \hat{J}^c_B$ and $\sigma(J^c_B) = \hat{\sigma}(J^c_B)$. The only condition, having non-zero input variance, is a reasonable assumption for real images.
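The parameter recovery implied by (1)-(3) is a closed-form, per-block computation. A minimal NumPy sketch (the function and variable names are ours, not from the paper):

```python
import numpy as np

def block_params(I_block, J_mean, J_std):
    """Solve (2)-(3) for the parameters S and R of one block and one
    color channel, given target scene statistics J_mean and J_std."""
    I_std = I_block.std()
    if I_std == 0:  # the non-zero input variance condition from the text
        raise ValueError("block has zero variance")
    S = J_std / I_std                 # from (3): sigma(J) = S * sigma(I)
    R = I_block.mean() - J_mean / S   # from (2): mean(J) = S*mean(I) - S*R
    J = S * (I_block - R)             # apply (1) to every pixel
    return S, R, J
```

The transformed block then has exactly the requested mean and deviation.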

For verifying this method, we used the ground truth information: the image of the smoke-free scene visible in Fig. 1(a). We split the test (Fig. 1(b)) and reference (clear) images into regions using the same grid, and computed the $\bar{I}^c_B$ and $\sigma(I^c_B)$ values from the test image, as well as the $\hat{J}^c_B$ and $\hat{\sigma}(J^c_B)$ parameters from the reference image. After computing $R^c_B$ and $S^c_B$ for each block, we can reconstruct $J^c$ blockwise; the result is visible in Fig. 1(c).

Applying the same method to a cleaner image, we get a better estimation of the scene (Fig. 2(a,b)). We can also combine subsequent images of a video to cancel out the noise introduced by the acquisition equipment and by inhomogeneous smoke. For every pixel in region B, we compute the weighted sum of the corresponding pixels from every frame using a weighting function $w_{Bf}$ that has a high value for clear image regions and a low value for occluded ones:

$J^c_x = \dfrac{\sum_{f \in F} w_{Bf} \, J^c_{xf}}{\sum_{f \in F} w_{Bf}}, \quad \forall x \in B$  (4)

where $f$ is the frame and $w_{Bf}$ is the weighting function. The result of combining 15 subsequent frames is shown in Fig. 2(c). The amount of smoke varied between what we saw in the two examples. The results indicate that reconstruction can be done by estimating the mean and variance parameters of image regions even from noisy observations.
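Equation (4) is a per-pixel weighted average over frames; the accumulation can be sketched as follows (NumPy assumed; names hypothetical):

```python
import numpy as np

def combine_frames(region_frames, weights):
    """Weighted per-pixel average of one region across frames, as in (4).

    region_frames: array of shape (F, h, w), the region cut from each frame;
    weights: shape (F,), high for clear views of the region, low for occluded ones.
    """
    w = np.asarray(weights, dtype=float)
    # contract the frame axis: sum_f w_f * J_f, then normalise by sum_f w_f
    return np.tensordot(w, region_frames, axes=1) / w.sum()
```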

3. MODEL DESCRIPTION

Smoke, haze and fog consist of particles reflecting light. This leads to occlusion of the scene, decreasing its luminance, while reflecting the ambient light. The latter, for a bright ambient source like sunlight, leads to increased luminance after all. The observation is also affected by camera settings such as white balance and shutter speed, which may change in time. These effects can be expressed in the formula:

$I^c_x = G^c \left( t_x J^c_x + (1 - t_x) A^c_x \right)$  (5)

where $G^c$ is the gain of the acquisition equipment, $t_x$ is the transmission (the ratio of light passing to the observer from the scene), and $A^c_x$ is the ambient light intensity. Note that $1 - t$ is the degree of occlusion, $t \in [0, 1]$, and $t$ is common to every color channel. For a short video, due to the slow adaptation of cameras, the $G^c$ values can be considered constant, leading to a simpler model, where the scene and the ambient light are not white balanced:

$I^c_x = t_x J^c_x + (1 - t_x) A^c_x,$  (6)

a form widely used in the literature [14, 8, 5, 9, 11, 7, 1]. Rearranging this equation, we find that this model explains the empirical formula in (1).
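For experimentation it is useful to have the forward model (6) itself; a sketch (NumPy assumed, names ours) that synthesizes a smoke-degraded observation from a scene, a transmission map and an ambient light color:

```python
import numpy as np

def degrade(J, t, A):
    """Apply the degradation model (6): I = t*J + (1-t)*A.

    J: scene image, shape (H, W, 3); t: transmission map, shape (H, W);
    A: ambient light intensity per channel, shape (3,).
    """
    t3 = t[..., None]           # broadcast transmission over color channels
    return t3 * J + (1.0 - t3) * A
```

With t = 0 everywhere the output is pure airlight A; with t = 1 the scene is returned unchanged.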

The mean value of $I^c_x$ over a region is hard to deal with. The summation can be simplified by considering $t_x$ and $J^c_x$ uncorrelated (especially if $t$ is constant), as $E\{t_x J^c_x\}$ becomes $E\{t_x\} E\{J^c_x\}$. Furthermore, $A^c_x$ depends on the lightness and direction of the light source. Direct sunlight is not an ambient source; some parts of the smoke will shadow out other parts, changing $A^c_x$ in space, however, this change is smooth. As a result, $A^c_x$ will be approximately constant locally, and $E\{(1 - t) A^c_B\} = (1 - E\{t\}) A^c_B$ will hold. Now the mean and deviation are

$\bar{I}^c_B = \bar{t}_B \bar{J}^c_B + (1 - \bar{t}_B) A^c_B$  (7)

$\sigma(I^c_B) = \sigma(t_B (J^c_B - A^c_B)) = \sigma(t_B J^c_B)$  (8)

for any region B. The method in [11] is based on the same assumption; however, in their case, if both $t_B$ and $I^c_B$ appear to be constant, the correlation holds no information, and thus the reconstruction fails. Here the equations remain useful.

In order to simplify the deviation expression, we chose to assume locally homogeneous smoke, for which $t_B$ is approximately constant, thereby:


Fig. 2. (a) is an image with less smoke, (b) is the reconstructed scene using ground truth, and (c) is the reconstructed scene from 15 frames.

$\sigma(I^c_B) = t_B \, \sigma(J^c_B)$  (9)

Reconstructing the scene from equation (6) alone is not possible because of the so-called airlight-albedo ambiguity [1, 11, 10]. This means we cannot distinguish between a change in transmission and a change in scene. To reconstruct the scene, we need more constraints [11], or an estimation of $t$ [7].

4. RECONSTRUCTION

We use the method described in Section 2. A straightforward choice of the set $B = \{B_i\}$ is coincident non-overlapping rectangles; the region size affects the results, as shown in Fig. 5. This way, computing the $\bar{I}^c_B$ and $\sigma^c_B$ values becomes simple, and the $\bar{I}^c_B$ values compose a downscaled version of the input image, eliminating the noise. In our experiments, blurred $B$-s did not improve quality.
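With non-overlapping rectangular blocks, the per-block statistics reduce to a reshape; a sketch (NumPy, single channel, block size dividing the image size assumed, names ours):

```python
import numpy as np

def block_stats(img, b):
    """Mean and deviation over non-overlapping b x b blocks of a single
    channel image whose sides are divisible by b. The means form the
    downscaled version of the input mentioned in the text."""
    H, W = img.shape
    blocks = img.reshape(H // b, b, W // b, b).swapaxes(1, 2)
    return blocks.mean(axis=(2, 3)), blocks.std(axis=(2, 3))
```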

For reconstructing the scene, we will need to estimate the off-scene parameters $A^c_B$ and $t_B$, as well as the scene parameters $\hat{J}^c_B$ and $\hat{\sigma}(J^c_B)$. Our method needs a video stream as input, which we can also use to improve the result by selecting the best input regions. To form the scene, we compute the weighted sum of reconstructed regions according to (4).

4.1. Filtering input

For reconstruction, it is mandatory to use images as clean as possible. To ensure this, we build up a reliable input image from reliable regions by choosing the $I^c_B$ from $\{I^c_{Bf} : f \in F\}$ for which $t_{Bf}$ is maximal. This reliable input image contains the most information about the scene, improving reconstruction quality.
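The selection above is a per-region argmax over frames; sketched below (NumPy assumed, names ours):

```python
import numpy as np

def reliable_region(region_frames, t_values):
    """Pick the clearest observation of one region across frames,
    i.e. the frame with maximal transmission for that region."""
    return region_frames[int(np.argmax(t_values))]
```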

4.2. Computing parameters

Considering haze and fog, the $A^c_B$ values are constant throughout the image, but in our case, smoke shadows out direct sunlight from other parts of the smoke, which leads to slightly changing $A^c_B$ in space. Fortunately, as our goal is to process very smoky input video, the most occluded parts will have transmission close to zero, resulting in $I^c_x \approx A^c_x$. For a video, we can select the maximal intensity of the pixels in B throughout all frames to estimate $A^c_B$.

At low transmission, we measure slightly different values for the RGB color channels, because the auto white balance adapts to the ambient light. However, the differences are amplified while reducing the smoke, so even slightly different $A^c$ values hold information about the white balance of the scene relative to the ambient light (see Fig. 4).

Having the $A^c_B$ values and the noise-free, reliable input images $\{I^c_B\}$, we can compute the $t_B$ values using the dark channel prior [4]. This is based on the observation that colors in real images are usually saturated [7]. This means that the pixel value is low for at least one color channel, and for the darkest channel (7) becomes:

$I^{\mathrm{dark}}_B \approx (1 - t_B) \, A^{\mathrm{dark}}_B$  (10)

yielding the needed estimation of $t_B$. Now we can estimate the mean values $\hat{J}^c_B$ for all color channels using (7).

Rearranging (7) and using (9), we see that the observations should form a line segment in the mean-variance plane:

Fig. 3. (a) is the intensity-deviation model from (11) and (b) is a sample from our real video (points are the observations for a common region in each frame).

$\sigma(I^c_B) = \sigma(J^c_B) \, \dfrac{\bar{I}^c_B - A^c_B}{\bar{J}^c_B - A^c_B}$  (11)

Using a linear approximation, we can determine the linear relationship between the observed mean value and deviation. Fig. 3 shows the theoretical model and the observations for one region throughout our real video. The results show that the data fits this linear model well, making it possible to estimate $\sigma(J^c_B)$ from $\hat{J}^c_B$.
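Since the line relating block mean and deviation passes through the point $(A^c_B, 0)$, fitting it across the frames of the video needs only a slope; a least-squares sketch for one block and channel (NumPy assumed, names ours):

```python
import numpy as np

def sigma_J_from_line(I_means, I_devs, A, J_mean):
    """Estimate sigma(J) for one block and channel from the linear model (11).

    I_means, I_devs: the observed block mean and deviation in each frame;
    A: ambient light for the block; J_mean: estimated scene mean from (7).
    """
    x = np.asarray(I_means, dtype=float) - A
    y = np.asarray(I_devs, dtype=float)
    slope = (x * y).sum() / (x * x).sum()  # least squares through the origin
    # slope = sigma(J) / (mean(J) - A), so:
    return slope * (J_mean - A)
```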

4.3. Reconstructing scene

We can reconstruct the scene regionwise by altering its mean and deviation according to the method described in Section 2. We get:

$J^c_B = \hat{J}^c_B + \left( I^c_B - \bar{I}^c_B \right) \dfrac{\hat{\sigma}(J^c_B)}{\sigma(I^c_B)}$  (12)
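Chaining the parameter estimates of Section 4.2 with the adjustment (12), one block can be reconstructed as follows. This is a NumPy sketch under our own assumptions (per-block constant transmission, known ambient light, sigma(J) already estimated, e.g. from the mean-deviation line fit), not the authors' implementation:

```python
import numpy as np

def reconstruct_block(I_block, A, sigma_J):
    """Reconstruct one (h, w, 3) block: transmission from the dark-channel
    relation (10), scene mean from (7), then the mean/deviation shift (12).

    A: ambient light per channel, shape (3,); sigma_J: estimated scene
    deviations per channel, shape (3,).
    """
    I_mean = I_block.mean(axis=(0, 1))             # per-channel block mean
    t = 1.0 - (I_mean / A).min()                   # (10): darkest channel
    J_mean = (I_mean - (1.0 - t) * A) / t          # (7) solved for mean(J)
    I_std = I_block.std(axis=(0, 1))
    scale = np.divide(sigma_J, I_std, out=np.zeros_like(I_std), where=I_std > 0)
    return J_mean + (I_block - I_mean) * scale     # (12)
```

Note that (10) recovers the transmission exactly only when the scene's darkest channel is near zero in the block, which is the saturation assumption the paper relies on.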

Next, we need to apply some smoothing on the region boundaries while preserving local deviation. But still, a reconstructed image will contain a lot of noise due to camera and smoke distribution properties. To cancel these out, we perform a weighted summation over the


Fig. 4. (a) is the reconstructed scene using spatially changing $A^c_B$; in (b) $A_B$ was considered constant across color channels; and in (c, d) $A^c$ was chosen constant for the whole image, taken from the lightest and the darkest region, respectively.

whole video sequence. For a region in a frame, we define the weight as the transmission, to emphasize clear images, so (4) becomes

Fig. 5. Results with (a) 10x10 pixel and (b) 30x30 pixel regions. Statistical analysis is unreliable for small regions; we found that regions of at least 30x30 pixels should be used. Varying the size between 30x30 and 60x60 pixels had no impact on visible quality.

$J^c_B = \dfrac{\sum_{f \in F} t_{Bf} \, J^c_{Bf}}{\sum_{f \in F} t_{Bf}}$  (13)

We tested our algorithm using 15 consecutive frames (from Fig. 1(b) to Fig. 2(a)). Fig. 4(a) and Fig. 5 show the resulting scene.

5. CONCLUSION

We introduced a statistical approach for reconstructing a static scene viewed through thick smoke using multiple images. For estimating parameters we applied temporal analysis, utilizing the changing distribution of smoke. We applied premises specific to video processing and the effect of smoke, as opposed to the widely used models handling haze and fog. Results show that our method works well under challenging circumstances (Fig. 4(a), Fig. 5). Numerical evaluation was impossible, as ground truth data was available only for the smokeless case.

The algorithm steps indicate that an efficient, real-time implementation is possible. Furthermore, framewise and blockwise processing makes our method an excellent subject for parallelization.

6. REFERENCES

[1] Chengjiang Long, Jianhui Zhao, Shizhong Han, Lu Xiong, Zhiyong Yuan, Jing Huang, and Weiwei Gao, "Transmission: A new feature for computer vision based smoke detection," in Artificial Intelligence and Computational Intelligence, vol. LNCS 6319, pp. 389–396. Springer, 2010.

[2] Srinivasa G. Narasimhan and Shree K. Nayar, "Chromatic framework for vision in bad weather," Computer Vision and Pattern Recognition, IEEE Comp. Soc., vol. 1, pp. 1598, 2000.

[3] Srinivasa G. Narasimhan and Shree K. Nayar, "Contrast restoration of weather degraded images," IEEE Tr. Pattern Analysis and Machine Intelligence, vol. 25, pp. 713–724, 2003.

[4] Robby T. Tan, "Visibility in bad weather from a single image," Computer Vision and Pattern Recognition, IEEE Comp. Soc., pp. 1–8, 2008.

[5] Srinivasa G. Narasimhan and Shree Nayar, "Interactive deweathering of an image using physical models," in IEEE Workshop on Color and Photometric Methods in Computer Vision (ICCV), October 2003.

[6] Shree K. Nayar and Srinivasa G. Narasimhan, "Vision in bad weather," IEEE International Conf. Comp. Vision, vol. 2, pp. 820, 1999.

[7] Kaiming He, Jian Sun, and Xiaoou Tang, "Single image haze removal using dark channel prior," Computer Vision and Pattern Recognition, IEEE Comp. Soc., pp. 1956–1963, 2009.

[8] Peter Carr and Richard Hartley, "Improved single image dehazing using geometry," Digital Image Computing: Techniques and Applications, pp. 103–110, 2009.

[9] Xingyong Lv, Wenbin Chen, and I-fan Shen, "Real-time dehazing for image and video," Pacific Conference on Computer Graphics and Applications, pp. 62–69, 2010.

[10] Jiawan Zhang, Liang Li, Guoqiang Yang, Yi Zhang, and Jizhou Sun, "Local albedo-insensitive single image dehazing," The Visual Computer, vol. 26, pp. 761–768, 2010.

[11] Raanan Fattal, "Single image dehazing," ACM Trans. Graph., vol. 27, pp. 72:1–72:9, August 2008.

[12] I. Kopilovic, B. Vágvölgyi, and T. Szirányi, "Application of panoramic annular lens for motion analysis tasks: Surveillance and smoke detection," in Proc. of 15th ICPR, Barcelona, IEEE & IAPR, 2000, vol. 4, pp. 714–717.

[13] Arturo Donate and Eraldo Ribeiro, "Viewing scenes occluded by smoke," in Advances in Visual Computing, vol. LNCS 4292, pp. 750–759. Springer, 2006.

[14] S. Shwartz, E. Namer, and Y. Y. Schechner, "Blind haze separation," 2006, pp. II: 1984–1991.


