Anisotropic Total Variation Based Image Restoration Using Graph Cuts

    BjΓΈrn Rustad

    Master of Science in Physics and Mathematics

    Supervisor: Markus Grasmair, MATH

    Department of Mathematical Sciences

    Submission date: February 2015

    Norwegian University of Science and Technology

Abstract

In this thesis we consider a particular kind of edge-enhancing image restoration method based on total variation. We want to address the fact that the total variation method in some cases leads to contrast loss in thin structures. To reduce the contrast loss, a directional dependence is introduced through an anisotropy tensor. The tensor controls the regularization applied based on the position in the image and the direction of the gradient. It is constructed using edge information extracted from the noisy image. We optimize the resulting functional using a graph cut framework; a discretization which is made possible by a coarea and a Cauchy–Crofton formula. In the end we perform numerical studies, experiment with the parameters and discuss the results.

Sammendrag (translated from Norwegian)

In this master's thesis we consider a particular edge-preserving denoising algorithm based on total variation. We address the fact that total variation in some cases leads to loss of contrast in details and thin structures. To reduce the contrast loss we introduce a directionally dependent anisotropy tensor. This tensor controls the regularization based on the position in the image and the direction of the gradient at that point. It is constructed from edge information in the original noisy image. We minimize the resulting functional in a graph cut framework, which is made possible by a coarea and a Cauchy–Crofton formula. We conclude with a numerical study, experimentation with the parameters and a discussion of the results.

Preface

This master's thesis concludes my studies in the Applied Physics and Mathematics Master's degree program with specialization in Industrial Mathematics at the Norwegian University of Science and Technology (NTNU).

I would like to thank my supervisor Markus Grasmair at the Department of Mathematical Sciences for invaluable help and discussion throughout my work on my project and this thesis.

Finally I would like to thank my family for their support, and Mats, Lars, Kine, Hager, Edvard and Henrik for productive discussions around the coffee pot.

    BjΓΈrn Rustad, February 8, 2015.

Contents

1 Introduction

2 Methods in image restoration
  2.1 Diffusion filtering
  2.2 Total variation

3 Continuous formulation
  3.1 Anisotropic total variation
  3.2 Well-posedness
  3.3 Anisotropic coarea formula
  3.4 Anisotropic Cauchy–Crofton formula

4 Discrete formulation
  4.1 Discretization
  4.2 Graph cut approach

5 Maximum flow
  5.1 Flow graphs
  5.2 Augmenting path algorithms
  5.3 Other algorithms
  5.4 Push–relabel algorithm
  5.5 Boykov–Kolmogorov algorithm

6 Results
  6.1 Tensor parameters
  6.2 Neighborhood stencils
  6.3 Restoration

7 Discussion and conclusion

Bibliography

List of Figures

List of Tables

List of Symbols

A C++ implementation
  A.1 main.cpp
  A.2 image.hpp
  A.3 image.cpp
  A.4 anisotropy.hpp
  A.5 anisotropy.cpp
  A.6 graph.hpp
  A.7 graph.cpp
  A.8 selectionrule.hpp
  A.9 selectionrule.cpp
  A.10 neighborhood.hpp

Chapter 1
Introduction

Image processing is becoming an increasingly important part of our modern computerized world. Tasks previously only performed by humans, like detecting edges, recognizing textures, and inferring shapes and motions, can now be performed algorithmically. The background of these methods spans several fields, including psychology and biology for the study of human vision, statistics and analysis for the mathematical background, and computer science for their implementation and performance analysis.

Image restoration methods are concerned with trying to remove noise or recover otherwise degraded images. Noise can result from the physical nature of light traveling to the sensor, dust on the lens, and many other sources. Numerous different approaches to denoising therefore exist, each with its own strengths and weaknesses. Some of these are introduced in Chapter 2, and one of the main challenges they all face is the recovery of edges.

A method well known for recovering edges is the total variation method, as the total variation does not favor smooth gradients over edges. I gave an overview of this method in my project work [1], where I used a graph cut framework to obtain a numerical solution. The method consists of trying to reduce the total variation of the image, while still staying "close" to the original.

A problem with the total variation method is that contrast is often lost, especially in fine details and thin structures. In this thesis we try to alleviate this. We extend the method by introducing an anisotropy tensor into the total variation, thus making it directionally dependent. This means we can control the regularization applied to the image based on position and direction. The main idea is then to reduce the regularization applied across edges in the image, while still regularizing along them.

The variational problem we obtain is a convex minimization problem, and many optimization approaches exist. We choose to discretize in such a way that we can apply the same graph cut framework used in my project work [1]. Through the coarea formula, the functional is decomposed into a sequence of minimization problems, one for each level of the image. These separate level problems are then transformed and discretized further using an anisotropic Cauchy–Crofton formula that we develop. Similar formulas have been presented before in other contexts.

A nice property of this numerical approach is that we can prove that the graph cut framework finds an exact global minimizer of the discrete functional. Additionally, we verify that the discrete functional is consistent with the continuous one.

We present and implement two maximum flow algorithms that allow us to find minimum cuts corresponding to minimizers of the discrete functionals. The push–relabel algorithm is considered to be the fastest and most versatile for general graphs, while the Boykov–Kolmogorov algorithm is specially tailored to the type of graphs found in these kinds of imaging applications. We describe every part of the method in detail so that it can be easily implemented by the reader. In addition, a C++ implementation is attached.

In the end we present numerical results that show how the different parameters affect the restoration, and we look into and explain some artifacts caused by approximations in the discretization. Further, we look at how the introduction of the anisotropy in certain cases amends some of the weaknesses of the total variation method. We particularly look at how contrast loss is reduced in images containing thin structures such as fingerprints.

Chapter 2
Methods in image restoration

There are numerous methods in image restoration, but we have neither the time nor the space to discuss them all. In this short overview, which is an extension of the one given in my project [1], we will focus on the methods related to the anisotropic total variation method considered later in this thesis. See [2] and [3] for more background on image processing in general.

In this chapter, and also in the rest of the thesis, we will assume that we are given an image 𝑓 ∢ Ξ© β†’ ℝ where Ξ© is a rectangular, open domain. Because of limitations in the numerical method used, the codomain is ℝ and we are thus restricted to monochrome, or grayscale, images. Such images are produced in large numbers by, for example, ultrasound, X-ray and MRI machines.

The space in which the image 𝑓 resides will vary, but since we are looking at image restoration methods, we assume that it includes some kind of noise. Depending on the application and how the image is obtained, one might construct different models describing different types of noise.

We will assume that the given image 𝑓 is a combination of an underlying, actual image π‘’βˆ— and some noise 𝛿. The simplest model is additive noise, where the assumption is that 𝑓 = π‘’βˆ— + 𝛿. There is also multiplicative noise, where 𝑓 = π‘’βˆ— β‹… 𝛿. Another common noise type is salt and pepper noise, where black and white pixels randomly appear in the image.

These are only models, and in the real world the noise might be more complex, and even come from a combination of sources. Depending on the application, the goal might not even be to recover π‘’βˆ—, but rather to obtain an output which fulfills certain smoothness or regularity properties. In any case, we will continue denoting the noisy input image 𝑓 and use 𝑒 for the output image in the description of the restoration methods.


2.1 Diffusion filtering

Diffusion filtering is a broad group of filtering and restoration methods based on physical diffusion processes. The basic idea is to take the noisy image as the initial value of some diffusion process, and then let it evolve for some time. The best known method is probably the Gaussian filter, or Gaussian blur, in which one convolves the image with the Gaussian function

    𝐾𝜎(π‘₯, 𝑦) =1

    2πœ‹πœŽ2 exp (βˆ’π‘₯2 + 𝑦2

    2𝜎2 ) . (2.1)

In the discrete setting where the image consists of a grid of pixels, the Gaussian blur amounts to calculating each pixel in the output image as a weighted average of its neighboring pixels in the input image.
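This weighted average can be sketched as follows. This is only an illustrative sketch, not the thesis's actual implementation (which lives in image.cpp, Appendix A); the `Image` type, the truncation of the kernel at radius 3𝜎, and the renormalization are assumptions made for brevity. The mirrored indexing implements the symmetric extension mentioned below.

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Grayscale image as rows of doubles (illustrative type, not the thesis's).
using Image = std::vector<std::vector<double>>;

const double kPi = 3.14159265358979323846;

// Sample the Gaussian (2.1) at offset (x, y) for the scale sigma.
double gaussian(double x, double y, double sigma) {
    return std::exp(-(x * x + y * y) / (2.0 * sigma * sigma))
           / (2.0 * kPi * sigma * sigma);
}

// Gaussian blur by direct convolution with a kernel truncated at radius
// 3*sigma. Indices are mirrored across the boundary (symmetric extension,
// i.e. zero flux), and the truncated kernel is renormalized.
Image gaussianBlur(const Image& f, double sigma) {
    const int h = f.size(), w = f[0].size();
    const int r = static_cast<int>(std::ceil(3.0 * sigma));
    Image u(h, std::vector<double>(w, 0.0));
    for (int i = 0; i < h; ++i) {
        for (int j = 0; j < w; ++j) {
            double sum = 0.0, weight = 0.0;
            for (int di = -r; di <= r; ++di) {
                for (int dj = -r; dj <= r; ++dj) {
                    // Mirror out-of-range indices back into the domain.
                    int ii = std::abs(i + di); if (ii >= h) ii = 2 * h - 2 - ii;
                    int jj = std::abs(j + dj); if (jj >= w) jj = 2 * w - 2 - jj;
                    const double k = gaussian(di, dj, sigma);
                    sum += k * f[ii][jj];
                    weight += k;
                }
            }
            u[i][j] = sum / weight;  // weighted average of the neighborhood
        }
    }
    return u;
}
```

Note that a constant image is a fixed point of the filter, since the weights are normalized to sum to one.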

The Gaussian function happens to be the fundamental solution of the heat equation πœ•π‘‘π‘’ = Δ𝑒. Convolving 𝐾𝜎(π‘₯, 𝑦) with the original image 𝑓 is therefore equivalent to solving the heat equation with 𝑓 as initial value, up to some time 𝑇 > 0 depending on 𝜎. Boundary conditions have to be specified, of course, and one common choice is to symmetrically extend the image in the π‘₯ and 𝑦 directions, which corresponds to zero-flux boundary conditions.

By basic Fourier analysis it is possible to show that the Gaussian filter is a low-pass filter which attenuates high frequencies. Further theory can be found in Weickert's book on anisotropic diffusion [4].

The main concern with the Gaussian filter is that it will, in addition to smoothing out possible noise, remove details from the image. This motivates the next set of methods, where the amount of diffusion can vary between different parts of the image.

2.1.1 Non-linear diffusion filtering

In the theory of the heat equation one can introduce a thermal diffusivity 𝛼 such that the equation becomes

πœ•π‘‘π‘’ = div(𝛼(βˆ‡π‘’)βˆ‡π‘’),
𝑒|𝑑=0 = 𝑓.  (2.2)

The thermal diffusivity 𝛼(βˆ‡π‘’) = 𝛼(π‘₯, βˆ‡π‘’) is material dependent, and can also vary throughout the object. It specifies how well heat travels through the specific point in the object. We can make use of this in the image restoration context by specifying different diffusivities in different parts of the image, in an effort to reduce noise without losing image detail. Optimally, we would like there to be a lot of diffusion in smooth parts of the image, and not so much in areas with a lot of detail.

One much-studied non-linear diffusion equation is the Perona–Malik equation

πœ•π‘‘π‘’ = div(βˆ‡π‘’ / (1 + |βˆ‡π‘’|Β²/πœ†Β²)).  (2.3)

The thermal diffusivity 𝛼(βˆ‡π‘’) = (1 + |βˆ‡π‘’|Β²/πœ†Β²)⁻¹ varies from 1 in smooth areas towards 0 as the norm of the gradient |βˆ‡π‘’| grows.

This particular form of the thermal diffusivity has been shown to be related to how brightness is perceived by the human visual system. The model has some theoretical problems related to well-posedness; for more information see [4].
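One explicit time step of (2.3) can be sketched using the classic four-neighbor scheme, in which the flux towards each neighbor is weighted by the diffusivity of the difference in that direction. This is an illustrative sketch under stated assumptions (the `Image` type, clamped indices for zero-flux boundaries), not a discretization used elsewhere in the thesis.

```cpp
#include <algorithm>
#include <vector>

using Image = std::vector<std::vector<double>>;

// Perona-Malik diffusivity g(s) = 1 / (1 + s^2 / lambda^2).
double diffusivity(double s, double lambda) {
    return 1.0 / (1.0 + (s * s) / (lambda * lambda));
}

// One forward Euler step of the Perona-Malik equation (2.3) with a
// four-neighbor stencil. Zero-flux boundaries are imposed by clamping
// indices to the image domain, so boundary differences vanish.
Image peronaMalikStep(const Image& u, double lambda, double tau) {
    const int h = u.size(), w = u[0].size();
    Image v = u;
    auto at = [&](int i, int j) {
        i = std::max(0, std::min(h - 1, i));
        j = std::max(0, std::min(w - 1, j));
        return u[i][j];
    };
    for (int i = 0; i < h; ++i)
        for (int j = 0; j < w; ++j) {
            const double c = u[i][j];
            const double dN = at(i - 1, j) - c, dS = at(i + 1, j) - c;
            const double dE = at(i, j + 1) - c, dW = at(i, j - 1) - c;
            // Each directional flux is damped where the local contrast
            // (the difference) is large relative to lambda.
            v[i][j] = c + tau * (diffusivity(dN, lambda) * dN
                               + diffusivity(dS, lambda) * dS
                               + diffusivity(dE, lambda) * dE
                               + diffusivity(dW, lambda) * dW);
        }
    return v;
}
```

For stability of the explicit scheme the time step 𝜏 must be kept small (𝜏 ≀ 1/4 for this stencil, since the diffusivity is at most 1).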

A different kind of non-linear diffusion model is the total variation flow, which can be formulated as

πœ•π‘‘π‘’ = div(βˆ‡π‘’/|βˆ‡π‘’|),  (2.4)

where the diffusivity has a similar effect of reducing the diffusion in areas of high variation. As the name suggests, this model can be related to the variational total variation formulation presented later. One forward Euler time step in the solution of this partial differential equation corresponds to the Euler–Lagrange equation of the variational formulation.

Note that we follow Weickert's terminology when it comes to the distinction between non-linear and anisotropic diffusion methods. The Perona–Malik equation, and other diffusion equations with non-homogeneous diffusivities, are often called anisotropic by others, as the diffusivity depends on the location. We will call these methods non-linear and reserve the anisotropy term for the "real" anisotropic methods: those where the diffusivity is a tensor, and thus both location and direction dependent.

2.1.2 Anisotropic diffusion

The diffusivity is made directionally dependent by introducing a diffusion tensor 𝐴(𝑒) such that the initial boundary value problem becomes

πœ•π‘‘π‘’ = div(𝐴(𝑒)βˆ‡π‘’)  on Ξ© Γ— (0, ∞),
𝑒|𝑑=0 = 𝑓  on Ξ©,
𝐴(𝑒)βˆ‡π‘’ β‹… 𝜈 = 0  on πœ•Ξ© Γ— (0, ∞),  (2.5)

where 𝜈 is the outer normal of Ξ©. The tensor 𝐴(𝑒) is constructed so as to diminish the effect of βˆ‡π‘’ across what we believe to be edges in the image. This way, there will also be less diffusion through these edges. Weickert [4] suggests constructing 𝐴(𝑒) based on the edge estimator βˆ‡π‘’πœŽ where

    π‘’πœŽ ∢= 𝐾𝜎 βˆ— οΏ½ΜƒοΏ½ (2.6)

    and οΏ½ΜƒοΏ½ is an extension of 𝑒 from Ξ© to ℝ2 made by symmetrically extending 𝑒 acrossthe boundary of Ξ©. Assuming we are at an edge in the image, the direction of βˆ‡π‘’πœŽshould be perpendicular to the edge, while its magnitude will provide informationon the steepness of the edge.

To extract this information, and also to identify features on a larger scale, the structure tensor is introduced:

π‘†πœŒ(π‘₯) ∢= 𝐾𝜌 βˆ— (βˆ‡π‘’πœŽ βŠ— βˆ‡π‘’πœŽ),  (2.7)

where the convolution with the Gaussian function 𝐾𝜌 is done component-wise. The anisotropy tensor 𝐴(𝑒) can then be constructed based on the eigenvectors and eigenvalues of π‘†πœŒ(π‘₯). The structure tensor and its properties will be discussed further when we introduce our anisotropic total variation functional.

Assuming some smoothness, symmetry and uniform positive definiteness of 𝐴(𝑒), one can prove well-posedness, regularity and an extremum principle for the problem (2.5), as done in [4].

However, even if the diffusivity tensor was introduced to reduce the amount of smoothing across edges, the solution of (2.5) will still be infinitely differentiable [4], i.e. 𝑒(𝑇) ∈ 𝐢∞(Ξ©) for 𝑇 > 0. Thus there are no real discontinuities, and no real edges, in the solution.

Further, the anisotropic diffusion may introduce structure based on noise, where there really was no structure to begin with. This is a problem we aim to avoid in our anisotropic total variation method.

2.2 Total variation

Total variation was initially introduced to the field of image restoration by Rudin, Osher and Fatemi in [5] and is usually formulated as a minimization problem

min_{𝑒 ∈ 𝐿^𝑝(Ξ©)} 𝐹(𝑒),    𝐹(𝑒) = βˆ«β„¦ |𝑒 βˆ’ 𝑓|^𝑝 𝑑π‘₯ + 𝛽 βˆ«β„¦ |βˆ‡π‘’| 𝑑π‘₯,  (2.8)

where the first integral is the fidelity term, the second is the regularization term, and 𝑝 is normally taken to be 1 or 2. The fidelity term penalizes images 𝑒 that are far from the original image 𝑓. The regularization term is the total variation of the image, and minimizing it will reduce the variation and thus regularize the image. The 𝛽 parameter controls the strength of the regularization. Note that 𝑒 = 𝑓 is a minimizer of the fidelity term, while a constant image 𝑒 = 𝑐 is a minimizer of the regularization term.
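To make the trade-off between the two terms concrete, the functional (2.8) can be evaluated on a pixel grid using forward differences for the gradient. This is only an illustrative discretization (the `Image` type is an assumption), not the graph cut discretization developed later in the thesis.

```cpp
#include <cmath>
#include <vector>

using Image = std::vector<std::vector<double>>;

// Discrete version of the functional (2.8): an L^p fidelity term plus
// beta times the total variation, with forward differences for the
// gradient (differences off the grid are taken as zero).
double rofEnergy(const Image& u, const Image& f, double beta, int p) {
    const int h = u.size(), w = u[0].size();
    double fidelity = 0.0, tv = 0.0;
    for (int i = 0; i < h; ++i)
        for (int j = 0; j < w; ++j) {
            fidelity += std::pow(std::abs(u[i][j] - f[i][j]), p);
            const double dx = (j + 1 < w) ? u[i][j + 1] - u[i][j] : 0.0;
            const double dy = (i + 1 < h) ? u[i + 1][j] - u[i][j] : 0.0;
            tv += std::sqrt(dx * dx + dy * dy);  // |grad u| at this pixel
        }
    return fidelity + beta * tv;
}
```

For 𝑒 = 𝑓 the fidelity term vanishes and only the total variation remains, which is why increasing 𝛽 pushes the minimizer away from the noisy input towards flatter images.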

As this restoration method is the one which will be extended later in this thesis, we will look a little more deeply into the background and the numerical methods relating to it.

Since we do not want to consider only differentiable images 𝑒 ∈ 𝐢¹(Ξ©), for which the gradient exists, we introduce the total variation using the distributional derivative.

Definition 2.1 (Total variation). Given a function 𝑒 ∈ 𝐿¹(Ξ©), the total variation of 𝑒, often written βˆ«β„¦ |𝐷𝑒| 𝑑π‘₯, where 𝐷 is the gradient taken in the distributional sense, is

TV(𝑒) = βˆ«β„¦ |𝐷𝑒| 𝑑π‘₯ = sup { βˆ«β„¦ 𝑒 β‹… div πœ‘ 𝑑π‘₯ ∢ πœ‘ ∈ πΆπ‘βˆž(Ξ©, ℝ²), β€–πœ‘β€–πΏβˆž(Ξ©) ≀ 1 }.  (2.9)

The test functions πœ‘ are taken from πΆπ‘βˆž(Ξ©, ℝ²), the space of smooth functions from Ξ© to ℝ² with compact support in Ξ©.

Note that since Ξ© is open and bounded, the test functions πœ‘ vanish on the boundary of Ξ©. Thus no variation is measured at the boundary.

As we are searching for an image with low total variation, it is useful to introduce the space of functions of bounded variation.

Definition 2.2 (Functions of bounded variation). The space of functions of bounded variation BV(Ξ©) is the space of functions 𝑒 ∈ 𝐿¹(Ξ©) for which the total variation is finite, i.e.,

BV(Ξ©) = {𝑒 ∈ 𝐿¹(Ξ©) ∢ TV(𝑒) < ∞}.  (2.10)

Our optimization problem has thus become

min_{𝑒 ∈ BV(Ξ©)} βˆ«β„¦ |𝑒 βˆ’ 𝑓|^𝑝 𝑑π‘₯ + 𝛽 TV(𝑒).  (2.11)

As with any restoration method, the total variation method has its strengths and weaknesses. Its main strength is its ability to recover edges in the input image. The total variation of a section only takes the absolute change into account, and does not favor gradual changes like the diffusion methods do.

There is also a theoretical result stating that the set of edges in the solution 𝑒 is contained in the set of edges in the original image 𝑓, thus no new edges are created [6]. However, in the presence of noise, the method may introduce, or rather "find", new edges that were not in the original image, since flat sections of zero variation are encouraged by the functional. This effect is called the staircasing effect, and can be seen in Figure 2.1, where a noisy gradient has been restored using the total variation method.

(a) Noisy gradient (b) Total variation restoration

Figure 2.1: Although the original gradient was smooth, the total variation method manages to find structure in the noise, and create edges in the restored image.

Figure 2.2: A fingerprint heavily regularized using the total variation method. The originally white and black ridges have been brought closer in value, to reduce the total variation.

Fine details, thin objects and corners may suffer from contrast loss, since bringing them closer to their surroundings reduces the total variation. An example of this is shown in Figure 2.2, where a not particularly noisy fingerprint image has been strongly regularized. The original black and white levels have been brought closer together to yield a lower total variation in the regularized image.


2.2.1 Numerical methods

See [7] for an overview of some of the numerical methods relating to total variation image restoration. Amongst others, it describes some dual and primal-dual methods, as well as the graph cut approach we take in this thesis.

    Graph cut approach

Using graph cuts is the approach we will take later when considering the anisotropic total variation regularization, and it is therefore valuable to briefly look into how graph cuts are used in the case of regular total variation.

A graph cut is a set of edges that, when removed, separates the graph into two disconnected parts. A minimum cut is a cut such that the sum of the weights of the edges in the cut is minimal. It has been shown that for some discrete functionals, it is possible to construct graphs for which the minimum cuts correspond to minimizers of the functional.

In the discrete setting our image consists of pixels, and is represented by a function 𝑒 ∢ 𝒒 β†’ 𝒫 where 𝒒 is a regular grid over Ξ©, and 𝒫 = {0, … , 𝐿 βˆ’ 1} is the discrete set of pixel values, or levels. We denote the value in pixel π‘₯ by 𝑒(π‘₯) = 𝑒π‘₯.

For an image 𝑒 and a level πœ† we denote the level set by {𝑒 > πœ†}, defined as the set {π‘₯ ∈ Ξ© ∢ 𝑒π‘₯ > πœ†}. The thresholded image π‘’πœ†, an indicator function, is then defined as

π‘’πœ† = πœ’_{𝑒>πœ†}.  (2.12)

Here, πœ’πΈ signifies the characteristic function of the set 𝐸: the function which is equal to one at every point in 𝐸, and zero elsewhere.

The idea of the graph cut approach is to decompose the minimization problem into one minimization problem for each level of the image, and then solve them separately before combining the results.

Through careful manipulation of the continuous functional in (2.11) it is possible to obtain a discrete functional decomposed as a sum over all the level values on the form

𝐹(𝑒) = βˆ‘_{πœ†=0}^{πΏβˆ’2} βˆ‘_π‘₯ 𝐹_πœ†^π‘₯(𝑒_π‘₯^πœ†) + 𝛽 βˆ‘_{πœ†=0}^{πΏβˆ’2} βˆ‘_{(π‘₯,𝑦)} 𝐹^{π‘₯,𝑦}(𝑒_π‘₯^πœ†, 𝑒_𝑦^πœ†) =∢ βˆ‘_{πœ†=0}^{πΏβˆ’2} 𝐹_πœ†(𝑒^πœ†),  (2.13)

where the sum over (π‘₯, 𝑦) is over all pixel pairs (π‘₯, 𝑦) in a neighbor relation, i.e. pixels that are "close" to each other. The actual form of the functional, and the steps to construct it, will be presented later.

The graph cut we find will, for each level πœ†, give us the thresholded image π‘’πœ†, and these can then be combined to form the complete image 𝑒.

When constructing the graph used to find the thresholded image π‘’πœ†, we have two special vertices, one representing the set {𝑒 > πœ†}, and one representing the set {𝑒 ≀ πœ†}. The pixels are then connected to these vertices with a weight representing how strongly they are related to the corresponding set. This weight will be based on the value of 𝐹_πœ†^π‘₯.

Additionally, there are connections between pixels in a neighborhood relation, representing the energy 𝐹^{π‘₯,𝑦}. Thus, when finding a cut, we partition the pixels into the sets {𝑒 > πœ†} and {𝑒 ≀ πœ†}. If in addition the cut is minimal, we know that the edges cut have minimal weight, and we can prove that the π‘’πœ† found minimizes the functional in (2.13).
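To make the cut computation concrete, here is a minimal max-flow sketch using BFS augmenting paths (the Edmonds–Karp variant of the augmenting path idea; the algorithms actually used in the thesis, push–relabel and Boykov–Kolmogorov, are covered in Chapter 5). The dense capacity matrix is an assumption made for brevity. By the max-flow/min-cut theorem, the returned flow value equals the weight of a minimum cut separating the two special vertices.

```cpp
#include <algorithm>
#include <queue>
#include <vector>

// Toy max-flow solver: vertices 0..n-1, dense residual capacity matrix.
struct MaxFlow {
    int n;
    std::vector<std::vector<double>> cap;
    explicit MaxFlow(int n) : n(n), cap(n, std::vector<double>(n, 0.0)) {}
    void addEdge(int u, int v, double c) { cap[u][v] += c; }

    double run(int s, int t) {
        double flow = 0.0;
        while (true) {
            // BFS for a shortest s-t path with residual capacity left.
            std::vector<int> parent(n, -1);
            parent[s] = s;
            std::queue<int> q;
            q.push(s);
            while (!q.empty() && parent[t] == -1) {
                int u = q.front(); q.pop();
                for (int v = 0; v < n; ++v)
                    if (parent[v] == -1 && cap[u][v] > 1e-12) {
                        parent[v] = u;
                        q.push(v);
                    }
            }
            if (parent[t] == -1) break;  // no augmenting path: flow is maximal
            // Push the bottleneck capacity along the path found.
            double push = 1e300;
            for (int v = t; v != s; v = parent[v])
                push = std::min(push, cap[parent[v]][v]);
            for (int v = t; v != s; v = parent[v]) {
                cap[parent[v]][v] -= push;
                cap[v][parent[v]] += push;  // residual (reverse) capacity
            }
            flow += push;
        }
        return flow;
    }
};
```

After `run` terminates, the vertices reachable from the source in the residual graph form one side of a minimum cut, i.e. the set {𝑒 > πœ†} for that level.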

Chapter 3
Continuous formulation

In the previous chapter we saw that there are many different approaches to the image restoration problem, all with their own strengths and weaknesses. The method considered in this thesis is an anisotropic total variation formulation, and the aim is to keep the strengths of the anisotropic diffusion and total variation methods, while eliminating some of their weaknesses.

This chapter will be devoted to the continuous formulation of the method. We will look at the functional we want to minimize and its different forms, and briefly discuss its well-posedness. Through the anisotropic coarea formula, the anisotropic total variation is rewritten as an integral of the perimeters of all the level sets of the image.

Following that, the anisotropic Cauchy–Crofton formula is introduced to make it feasible to calculate the perimeter of these level sets. All of this leads up to the discretization of our functional in the next chapter.

3.1 Anisotropic total variation

The method considered will build on the total variation regularization method of Section 2.2. From anisotropic diffusion in Section 2.1.2 we borrow the idea of making the regularization in each point directionally dependent. We introduce the anisotropic total variation

TV_𝐴(𝑒) = βˆ«β„¦ √(βˆ‡π‘’(π‘₯)^𝑇 𝐴(π‘₯) βˆ‡π‘’(π‘₯)) 𝑑π‘₯  (3.1)

for all 𝑒 ∈ 𝐢¹(Ξ©). We assume here that 𝐴(π‘₯) is continuous and positive definite, and we will later need the eigenvalues of 𝐴(π‘₯) to be uniformly bounded below and above. If 𝐴(π‘₯) is the identity matrix we get the regular total variation found in (2.8). When minimizing the regular total variation, we will also try to reduce the variation over known edges in the image. This can lead to unwanted contrast loss, especially in fine details. By controlling 𝐴(π‘₯) such that the contribution of βˆ‡π‘’(π‘₯) is reduced across known edges, we hope to retain the regularization properties of the original method while reducing this contrast loss. If the variation across an edge is "ignored" by the functional, there is no gain in reducing the height of the edge as before.

    Note that 𝑒(π‘₯) and 𝐴(π‘₯) are always dependent on the position in the image π‘₯,but we will sometimes drop the π‘₯, when the intended meaning is clear.

As we will not always be working with differentiable images, we extend the definition of the total variation functional. Being symmetric positive definite, the matrix 𝐴 can be factored into two symmetric matrices as 𝐴 = 𝐴^{1/2}𝐴^{1/2}. We can then write

TV_𝐴(𝑒) = βˆ«β„¦ |𝐴^{1/2}βˆ‡π‘’| 𝑑π‘₯
        = sup_{|πœ‰(π‘₯)|≀1} βˆ«β„¦ (𝐴^{1/2}βˆ‡π‘’)^𝑇 πœ‰ 𝑑π‘₯
        = sup_{|πœ‰(π‘₯)|≀1} βˆ«β„¦ βˆ‡π‘’ β‹… 𝐴^{1/2}πœ‰ 𝑑π‘₯
        = sup_{|πœ‰(π‘₯)|≀1} βˆ«β„¦ 𝑒 div(𝐴^{1/2}πœ‰) 𝑑π‘₯
        = sup_{πœ‚^𝑇𝐴^{βˆ’1}πœ‚ ≀ 1} βˆ«β„¦ 𝑒 div πœ‚ 𝑑π‘₯,  (3.2)

where πœ‰ and πœ‚ = 𝐴^{1/2}πœ‰ are in πΆπ‘βˆž(Ξ©, ℝ²), the space of smooth vector fields with compact support. In the following we define the norms β€–πœ‰β€–_𝐴 = sup_π‘₯ (πœ‰^𝑇𝐴 πœ‰)^{1/2} and β€–πœ‚β€–*_𝐴 = sup_π‘₯ (πœ‚^𝑇𝐴^{βˆ’1}πœ‚)^{1/2}, and with that we present the formal definition of the anisotropic total variation.

Definition 3.1 (Anisotropic total variation). For a function 𝑒 ∈ 𝐿²(Ξ©) and a continuous symmetric positive definite tensor 𝐴 ∢ Ξ© β†’ ℝ²ˣ² we define the anisotropic total variation

TV_𝐴(𝑒) = sup { βˆ«β„¦ 𝑒 div πœ‰ 𝑑π‘₯ ∢ πœ‰ ∈ πΆπ‘βˆž(Ξ©, ℝ²), β€–πœ‰β€–*_𝐴 ≀ 1 }.  (3.3)

With this extended definition, we have arrived at a minimization problem where we seek a minimizer of the functional

𝐹(𝑒) = βˆ«β„¦ (𝑒 βˆ’ 𝑓)² 𝑑π‘₯ + 𝛽 TV_𝐴(𝑒).  (3.4)
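A direct discrete evaluation of the anisotropic total variation (3.1) can be sketched as below, with a forward-difference gradient weighted by a per-pixel tensor. The `Image` and `Mat2` types are illustrative assumptions, and this naive discretization is not the Cauchy–Crofton-based one developed in Chapter 4; it only shows what the integrand measures.

```cpp
#include <array>
#include <cmath>
#include <vector>

using Image = std::vector<std::vector<double>>;
using Mat2 = std::array<std::array<double, 2>, 2>;  // symmetric 2x2 tensor

// Discrete anisotropic total variation: at each pixel the forward-difference
// gradient g contributes sqrt(g^T A(x) g), so variation in directions that
// A(x) down-weights (across edges) is penalized less.
double anisotropicTV(const Image& u, const std::vector<std::vector<Mat2>>& A) {
    const int h = u.size(), w = u[0].size();
    double tv = 0.0;
    for (int i = 0; i < h; ++i)
        for (int j = 0; j < w; ++j) {
            const double gx = (j + 1 < w) ? u[i][j + 1] - u[i][j] : 0.0;
            const double gy = (i + 1 < h) ? u[i + 1][j] - u[i][j] : 0.0;
            const Mat2& a = A[i][j];
            const double q = a[0][0] * gx * gx + 2.0 * a[0][1] * gx * gy
                           + a[1][1] * gy * gy;  // g^T A g
            tv += std::sqrt(q);
        }
    return tv;
}
```

With 𝐴(π‘₯) the identity everywhere, this reduces to the discrete total variation of Section 2.2, as expected from (3.1).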


Figure 3.1: A noisy fingerprint on the left; on the right, the largest eigenvalue |βˆ‡π‘“πœŽ(π‘₯)|Β² of the structure tensor, which, as we can see, functions as an edge detector.

Similar functionals have been considered in [8] and [9]. The question is now how to construct the anisotropy tensor 𝐴(π‘₯) to get the improvements we hope for, and how the introduction of the tensor affects our numerical solution method.

3.1.1 Anisotropy tensor

There are many possible choices for the anisotropy tensor 𝐴(π‘₯). Our constraints are that we have assumed it to be continuous and symmetric positive definite, and we have some wishes for its properties. We would first and foremost like it to down-weight βˆ‡π‘’ in (3.1) across true edges, while maintaining normal regularization properties in smooth sections.

By true edges we mean that we do not want the tensor to be sensitive to noise in the image, and thus find edges where there are none; we want to be reasonably sure about the edges we find.

Edges can be found in many different ways, but as suggested by Weickert in his book on anisotropic diffusion [4], and briefly mentioned in Section 2.1.2, a good starting point is the edge detector βˆ‡π‘“πœŽ. The image is smoothed by a Gaussian filter as described in Section 2.1: π‘“πœŽ = 𝐾𝜎 βˆ— fΜƒ, where fΜƒ is the symmetric extension of the initial image 𝑓 to ℝ². The smoothing parameter 𝜎 is called the noise scale, and it controls the scale at which details are considered to be noise.

As seen in Figure 3.1, the edge detector is fine for detecting edges, but it cannot give us information about larger structures, like corners and textures, which is why we introduce the structure tensor π‘†πœŒ(π‘₯). First consider the tensor 𝑆₀(π‘₯) = βˆ‡π‘“πœŽ(π‘₯) βŠ— βˆ‡π‘“πœŽ(π‘₯). It is symmetric positive semi-definite, and obviously contains no more information than the edge detector itself. Its eigenvalues are πœ†β‚ = |βˆ‡π‘“πœŽ(π‘₯)|Β² and πœ†β‚‚ = 0, with corresponding eigenvectors 𝑣₁ and 𝑣₂ parallel and perpendicular to βˆ‡π‘“πœŽ(π‘₯), respectively.

To detect features in a neighborhood around the point π‘₯, such as corners, curved edges and coherent structures, we introduce the component-wise convolution with 𝐾𝜌 such that

π‘†πœŒ(π‘₯) ∢= 𝐾𝜌 βˆ— (βˆ‡π‘“πœŽ βŠ— βˆ‡π‘“πœŽ)(π‘₯).  (3.5)

The parameter 𝜌, called the integration scale, controls the size of the neighborhood which affects the structure tensor. Thus it defines the size of the structures we want our anisotropy tensor to be sensitive to.

The smoothed tensor π‘†πœŒ(π‘₯) can easily be verified to be symmetric positive semi-definite, just like 𝑆₀(π‘₯). In addition, when 𝜌 > 0, the elements of π‘†πœŒ are smooth maps from Ξ© to ℝ.

We order the two real eigenvalues such that πœ†β‚ β‰₯ πœ†β‚‚ and denote the corresponding eigenvectors 𝑣₁ and 𝑣₂. From the characteristic polynomial of π‘†πœŒ(π‘₯) = (𝑠₁₁ 𝑠₁₂; 𝑠₁₂ 𝑠₂₂) we obtain a closed-form expression for the eigenvalues:

πœ† = Β½ (𝑠₁₁ + 𝑠₂₂ Β± √((𝑠₁₁ βˆ’ 𝑠₂₂)Β² + 4𝑠₁₂²)).  (3.6)

The vector 𝑣₁ then indicates the direction of most variation in the neighborhood. An edge will give πœ†β‚ ≫ πœ†β‚‚ β‰ˆ 0, while smooth areas will give πœ†β‚ β‰ˆ πœ†β‚‚ β‰ˆ 0. In corners we have variation in the direction of 𝑣₁ but also perpendicular to it, so we will have πœ†β‚ β‰ˆ πœ†β‚‚ ≫ 0. Thus the quantity (πœ†β‚ βˆ’ πœ†β‚‚)Β² will be large around edges and small in smooth or non-coherent areas.
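The closed form (3.6) is straightforward to implement; a small sketch (the function name is ours, not from the thesis's code):

```cpp
#include <cmath>
#include <utility>

// Eigenvalues of a symmetric 2x2 structure tensor ((s11, s12), (s12, s22)),
// in the closed form (3.6); returned with lambda1 >= lambda2.
std::pair<double, double>
structureEigenvalues(double s11, double s12, double s22) {
    const double tr = s11 + s22;
    const double root =
        std::sqrt((s11 - s22) * (s11 - s22) + 4.0 * s12 * s12);
    return {0.5 * (tr + root), 0.5 * (tr - root)};
}
```

The edge measure (πœ†β‚ βˆ’ πœ†β‚‚)Β² discussed above is then simply the square of the discriminant term, (𝑠₁₁ βˆ’ 𝑠₂₂)Β² + 4𝑠₁₂².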

    To extract this information from the structure tensor, we decompose it as

    π‘†πœŒ(π‘₯) = π‘ˆ(π‘₯)Ξ›(π‘₯)π‘ˆ(π‘₯)𝑇 , (3.7)

    whereΞ›(π‘₯) = (πœ†1 00 πœ†2

    ) (3.8)

    has the eigenvalues πœ†1 β‰₯ πœ†2 on its diagonal, while π‘ˆ(π‘₯) is a rotation matrix andhas the eigenvectors of π‘†πœŒ(π‘₯) as its columns. From this we construct a new matrix𝐴(π‘₯) = π‘ˆ(π‘₯)Ξ£(π‘₯)π‘ˆ(π‘₯)𝑇 where

    Ξ£(π‘₯) = (𝜎1 00 𝜎2) . (3.9)


[Figure 3.2: An edge with the structure and anisotropy tensors visualized using their eigenvectors and eigenvalues. Panel (a) shows the structure tensor π‘†πœŒ with eigenvalues πœ†1 and πœ†2; panel (b) shows the anisotropy tensor 𝐴 with eigenvalues 𝜎1 and 1.]

and for 𝜎1 and 𝜎2 we choose

𝜎1 = (1 + (πœ†1 βˆ’ πœ†2)Β²/πœ”Β²)^(βˆ’1), 𝜎2 = 1. (3.10)

    Thus the eigenvectors of 𝐴(π‘₯) and π‘†πœŒ(π‘₯) are equal, while the eigenvalues aredifferent. A visualization of the two tensors can be seen in Figure 3.2 where thetwo tensors are shown at an edge in the image.

In smooth areas, 𝜎1 β‰ˆ 1 and 𝐴(π‘₯) will be close to the identity matrix. At or around edges, 𝜎1, which corresponds to the eigenvector perpendicular to the edge, will be small.

    Around corners 𝐴(π‘₯) will be close to the identity matrix, which gives regu-larization similar to smooth areas. This is one possible down-side of this tensorchoice, as rounded corners may occur.

    The parameter πœ” controls the amount of anisotropy in the method, such thatif it is very large we are left with the identity matrix and our method becomes theregular total variation method. Note also that changing the parameter πœ” implicitlyaffects the amount of regularization applied. For an image 𝑒, decreasing πœ” will, allelse being equal, decrease the lowest eigenvalue of 𝐴(π‘₯) and in turn decrease theanisotropic total variation TV𝐴(𝑒).

    For the case where πœ†1 = πœ†2, the π‘ˆ(π‘₯) in our decomposition is not well-defined.This is not a problem however, since Ξ£(π‘₯) will be the identity matrix, so anyorthogonal matrix will suffice for π‘ˆ(π‘₯).

    Note that the eigenvalues of π‘†πœŒ are continuous, and so are the eigenvectors(ignoring their sign) except possibly when πœ†1 = πœ†2. Thus 𝐴 is also continuousexcept possibly in these points. When πœ†1 = πœ†2 however, the eigenvalues 𝜎1 and𝜎2 of 𝐴 will both be 1, and 𝐴 is the identity matrix. Thus we can argue that ifπ‘†πœŒ(π‘₯) β†’ πœ†πΌ then 𝐴(π‘₯) β†’ 𝐼 and 𝐴 is continuous in all of Ξ©.


See [10] for a different tensor construction, made to enhance flow structures in the image, relevant for example in fingerprint analysis.
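The pointwise construction of 𝐴 from (3.7)–(3.10) can be sketched in a few lines. The snippet below is an illustrative sketch only; the function name and the value of πœ” are arbitrary choices, not taken from the thesis.

```python
import numpy as np

def anisotropy_tensor(S, omega=1e-2):
    """Build A = U diag(sigma2, sigma1) U^T from a 2x2 structure tensor S, cf. (3.7)-(3.10)."""
    lam, U = np.linalg.eigh(S)      # eigh returns eigenvalues in ascending order,
    lam2, lam1 = lam                # so lam1 >= lam2
    sigma1 = 1.0 / (1.0 + (lam1 - lam2) ** 2 / omega ** 2)  # small across an edge
    sigma2 = 1.0                                            # kept at 1 along the edge
    # eigh's columns are ordered (v2, v1); sigma1 belongs to v1, the dominant direction.
    return U @ np.diag([sigma2, sigma1]) @ U.T

# Rank-one structure tensor of a horizontal gradient (a vertical edge):
S_edge = np.array([[1.0, 0.0], [0.0, 0.0]])
A = anisotropy_tensor(S_edge)
```

Across the edge (the x-direction here) the eigenvalue of 𝐴 is tiny, while along the edge it stays 1; for a vanishing structure tensor the function returns the identity, matching the behavior argued above.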

3.2 Well-posedness

The theory of existence and uniqueness for these kinds of variational methods is a minefield of more or less subtle problems. Even if we restrict ourselves to a nice space such as 𝐿²(Ξ©) we will at some point run into problems. The discussion here is not meant to give the most rigorous background, but rather an overview of what needs to be shown. Some problems will be worked around, while others will be skipped with a reference to further theory.

The basic things we ask of our functional

𝐹(𝑒) = βˆ«β„¦ (𝑒 βˆ’ 𝑓)Β² 𝑑π‘₯ + 𝛽 TV𝐴(𝑒) (3.11)

to have a well-posed problem are lower semi-continuity and coercivity for existence, and convexity for uniqueness. We restrict ourselves to 𝐿²(Ξ©), which makes sense with our fidelity term, assuming that 𝑓 ∈ 𝐿²(Ξ©).

We consider the weak topology, as it will allow us to arrive at an existence result relatively easily. We say that a sequence 𝑓𝑛 in 𝐿²(Ξ©) converges weakly to 𝑓 if

lim_{π‘›β†’βˆž} βˆ«β„¦ 𝑓𝑛 πœ‰ 𝑑π‘₯ = βˆ«β„¦ 𝑓 πœ‰ 𝑑π‘₯ (3.12)

for all πœ‰ ∈ 𝐿²(Ξ©), and we write 𝑓𝑛 ⇀ 𝑓. A weakly convergent sequence is a sequence that converges in the weak topology.

3.2.1 Convexity

We start with convexity as it is the easiest to show. Being quadratic, the fidelity term of our functional

βˆ«β„¦ (𝑒 βˆ’ 𝑓)Β² 𝑑π‘₯ (3.13)

is strictly convex. This can be shown by expanding and rearranging the strict convexity condition

βˆ«β„¦ (πœ†π‘’1 + (1 βˆ’ πœ†)𝑒2 βˆ’ 𝑓)Β² 𝑑π‘₯ < πœ† βˆ«β„¦ (𝑒1 βˆ’ 𝑓)Β² 𝑑π‘₯ + (1 βˆ’ πœ†) βˆ«β„¦ (𝑒2 βˆ’ 𝑓)Β² 𝑑π‘₯ (3.14)

to obtain that it is equivalent to

βˆ’πœ†(1 βˆ’ πœ†) βˆ«β„¦ (𝑒1 βˆ’ 𝑒2)Β² 𝑑π‘₯ < 0, (3.15)


    .. π‘₯..

    Figure 3.3: A lower semi-continuous function 𝑓 ∢ ℝ β†’ ℝ can havediscontinuities, but for a convergent sequence π‘₯π‘˜ β†’ π‘₯ we always have𝑓(π‘₯) ≀ lim infπ‘˜β†’βˆž 𝑓(π‘₯π‘˜).

    which is true for 0 < πœ† < 1 and 𝑒1 β‰  𝑒2.The anisotropic total variation

    TV𝐴(𝑒) = supβ€–πœ‰β€–βˆ—π΄β‰€1

    βˆ«β„¦

    𝑒 div πœ‰ 𝑑π‘₯ (3.16)

    can be thought of asβ€”and has the properties ofβ€”a semi-norm, and is thereforeconvex. The sum of the fidelity and regularization terms is thus strictly convex,which, given the existence of a minimizer, implies uniqueness.

3.2.2 Coercivity

Coercivity relates to how the functional behaves when the norm of the image 𝑒 tends to infinity. What we need in order to conclude with existence is weak sequential coercivity. Thus we need all level sets 𝐹^𝛼 = {𝑒 ∈ 𝐿²(Ξ©) ∢ 𝐹(𝑒) ≀ 𝛼} to be weakly sequentially pre-compact, meaning that all sequences in the set contain a subsequence weakly converging to an element of the closure of the set.

It is obvious from the fidelity term that for some fixed 𝑓 ∈ 𝐿²(Ξ©), if ‖𝑒‖𝐿² β†’ ∞ then 𝐹(𝑒) β†’ ∞. This implies that all the level sets 𝐹^𝛼 are bounded. Since 𝐿²(Ξ©) is a Hilbert space, all bounded sequences contain a weakly convergent subsequence. Thus all the level sets 𝐹^𝛼 are weakly sequentially pre-compact.

3.2.3 Lower semi-continuity

The lower semi-continuity is the most tricky part, and this is where we will take some shortcuts. Lower semi-continuity for a functional 𝐹 at a point 𝑒 means that at points π‘’πœ– close to 𝑒, the functional takes values either close to or above 𝐹(𝑒). More specifically, for every sequence π‘’π‘˜ converging to 𝑒, we have 𝐹(𝑒) ≀ lim inf_{π‘˜} 𝐹(π‘’π‘˜). For a function 𝑓 ∢ ℝ β†’ ℝ this can be visualized as in Figure 3.3.


Since our space 𝐿²(Ξ©) is infinite-dimensional, things become a little problematic here. The problem lies in the fact that a functional which is continuous with respect to sequences is not necessarily continuous with respect to the underlying topology. In other words, in these spaces, there can be a difference between sequential continuity and topological continuity. Topological continuity implies sequential continuity, but the converse does not hold. One way to get around this would be to consider topological nets, an extension of sequences, but for simplicity, and because it might not add much to the understanding of the restoration method, we will stick to proving sequential lower semi-continuity and referring to further theory. For further reading on the theory of sequential versus topological continuity see for example Megginson's book on Banach space theory [11].

The mapping 𝑒 ↦ βˆ«β„¦ π‘’πœ‰ 𝑑π‘₯ is weakly continuous for all πœ‰ ∈ 𝐿²(Ξ©). Note that when we write weakly continuous it is not a weaker version of continuity, but rather continuity in the weak topology, and the same goes for weak lower semi-continuity.

Before arguing that our own functional is sequentially weakly lower semi-continuous, we present a needed result.

Lemma 3.2. Assume that the functional 𝐹 ∢ 𝐿²(Ξ©) β†’ ℝ is defined by

𝐹 = sup_𝑖 𝐹𝑖, (3.17)

where all the 𝐹𝑖 are sequentially weakly lower semi-continuous. Then 𝐹 is sequentially weakly lower semi-continuous, meaning that for any sequence π‘’π‘˜ ⇀ 𝑒 we have 𝐹(𝑒) ≀ lim inf_{π‘˜} 𝐹(π‘’π‘˜).

Proof. For any sequence π‘’π‘˜ ⇀ 𝑒 in 𝐿²(Ξ©) we have

𝐹(𝑒) = sup_𝑖 𝐹𝑖(𝑒) ≀ sup_𝑖 lim inf_{π‘˜β†’βˆž} 𝐹𝑖(π‘’π‘˜) (3.18)

from the sequential weak lower semi-continuity of the 𝐹𝑖. Using that lim inf_{π‘˜β†’βˆž} π‘Žπ‘˜ = sup_π‘˜ inf_{𝑙β‰₯π‘˜} π‘Žπ‘™ for any real sequence π‘Žπ‘˜, we obtain

𝐹(𝑒) ≀ sup_𝑖 sup_π‘˜ inf_{𝑙β‰₯π‘˜} 𝐹𝑖(𝑒𝑙)
 = sup_π‘˜ sup_𝑖 inf_{𝑙β‰₯π‘˜} 𝐹𝑖(𝑒𝑙)
 ≀ sup_π‘˜ inf_{𝑙β‰₯π‘˜} sup_𝑖 𝐹𝑖(𝑒𝑙)
 = lim inf_{π‘˜β†’βˆž} 𝐹(π‘’π‘˜), (3.19)

which proves that 𝐹 is sequentially weakly lower semi-continuous.


In our functional in (3.4), we first consider the fidelity term, and rewrite it as a supremum

βˆ«β„¦ (𝑒 βˆ’ 𝑓)Β² 𝑑π‘₯ = sup {βˆ«β„¦ (𝑒 βˆ’ 𝑓)πœ‰ 𝑑π‘₯ ∢ πœ‰ ∈ 𝐿²(Ξ©), |πœ‰(π‘₯)| ≀ |𝑒(π‘₯) βˆ’ 𝑓(π‘₯)|}. (3.20)

As the map 𝑒 ↦ βˆ«β„¦ (𝑒 βˆ’ 𝑓)πœ‰ 𝑑π‘₯ is continuous in the weak topology, the fidelity term is a supremum of weakly continuous functionals, and is thus by Lemma 3.2 sequentially weakly lower semi-continuous.

For the regularization term the approach is similar. With our extended definition from (3.3), we have

TV𝐴(𝑒) = sup {βˆ«β„¦ 𝑒 div πœ‰ 𝑑π‘₯ ∢ πœ‰ ∈ 𝐢𝑐^∞(Ξ©, ℝ²), β€–πœ‰β€–βˆ—π΄ ≀ 1}. (3.21)

This is again a supremum of weakly continuous functionals. Thus the regularization term is by Lemma 3.2 also sequentially weakly lower semi-continuous.

The sum of the two terms is trivially sequentially weakly lower semi-continuous, since

𝐹1(𝑒) + 𝐹2(𝑒) ≀ lim inf_{π‘˜β†’βˆž} 𝐹1(π‘’π‘˜) + lim inf_{π‘˜β†’βˆž} 𝐹2(π‘’π‘˜)
 = lim_{π‘˜β†’βˆž} (inf_{𝑙β‰₯π‘˜} 𝐹1(𝑒𝑙) + inf_{𝑙β‰₯π‘˜} 𝐹2(𝑒𝑙))
 ≀ lim inf_{π‘˜β†’βˆž} (𝐹1(π‘’π‘˜) + 𝐹2(π‘’π‘˜)), (3.22)

and thus our functional is sequentially weakly lower semi-continuous.

The usual ways of going from coercivity and lower semi-continuity to existence do not work in infinite dimensions. But with sequential coercivity and sequential lower semi-continuity in the weak topology we can conclude that we have existence from [12, Theorem 5.1].

3.3 Anisotropic coarea formula

The anisotropic coarea formula we present here will allow us to write the anisotropic total variation as an integral over the levels of the image. For a similar presentation of the regular coarea formula for all 𝑓 ∈ BV(Ξ©) see [13].

First we define the thresholded image at level 𝑠.

Definition 3.3 (Thresholded image). The thresholded image at level 𝑠 is the function

𝑒^𝑠(π‘₯) = {1 if 𝑒(π‘₯) > 𝑠; 0 otherwise}. (3.23)


This will be used throughout the rest of the thesis. Note that given the thresholded image for every level, we are able to reconstruct the image as

𝑒(π‘₯) = sup {𝑠 ∢ 𝑒^𝑠(π‘₯) = 1}. (3.24)

The thresholded image definition also allows us to write a non-negative image 𝑒 β‰₯ 0 as an integral over all the layers

𝑒(π‘₯) = ∫_0^∞ 𝑒^𝑠(π‘₯) 𝑑𝑠. (3.25)

Note that (3.25) only holds for non-negative images, which complicates the proof of the anisotropic coarea formula a little.

Theorem 3.4 (Anisotropic coarea formula). Given an image 𝑒 ∈ BV(Ξ©), the anisotropic total variation can be written as an integral over all the levels:

TV𝐴(𝑒) = ∫_{βˆ’βˆž}^∞ TV𝐴(𝑒^𝑠) 𝑑𝑠. (3.26)

For the proof we will avoid measure theory and follow a proof given in [9], but first we will present a necessary result from measure theory.

Theorem 3.5 (Lebesgue's dominated convergence theorem). Let {𝑓𝑛} be a sequence of real-valued measurable functions on a space 𝑆 with measure πœ‡ which converges almost everywhere to a real-valued measurable function 𝑓. If there exists an integrable function 𝑔 such that |𝑓𝑛| ≀ 𝑔 for all 𝑛, then 𝑓 is integrable and

lim_{π‘›β†’βˆž} βˆ«π‘† 𝑓𝑛 π‘‘πœ‡ = βˆ«π‘† 𝑓 π‘‘πœ‡. (3.27)

For a proof and further background on measure theory and Lebesgue integration theory see for example [14].

Proof of the anisotropic coarea formula. Assume that 𝑒 ∈ 𝐢¹(Ξ©) ∩ BV(Ξ©). The extension to all functions 𝑒 ∈ BV(Ξ©) will not be considered here, but for the case of regular total variation see [15, Theorem 5.3.3].

Proof of upper bound. Assume that 𝑒 β‰₯ 0, such that the integral representation in (3.25) holds. Inserting (3.25) into the extended total variation definition in (3.3) gives

TV𝐴(𝑒) = sup_{β€–πœ‰β€–βˆ—π΄ ≀ 1} βˆ«β„¦ (∫_0^∞ 𝑒^𝑠 𝑑𝑠) div πœ‰ 𝑑π‘₯ = sup_{β€–πœ‰β€–βˆ—π΄ ≀ 1} βˆ«β„¦ ∫_0^∞ 𝑒^𝑠 div πœ‰ 𝑑𝑠 𝑑π‘₯
 ≀ ∫_0^∞ (sup_{β€–πœ‰β€–βˆ—π΄ ≀ 1} βˆ«β„¦ 𝑒^𝑠 div πœ‰ 𝑑π‘₯) 𝑑𝑠 = ∫_0^∞ TV𝐴(𝑒^𝑠) 𝑑𝑠. (3.28)


For 𝑒 ≀ 0 we use that TV𝐴(βˆ’π‘£) = TV𝐴(𝑣) and that TV𝐴(𝑐 + 𝑣) = TV𝐴(𝑣) for any constant 𝑐. Note that βˆ’π‘’ β‰₯ 0 and that its thresholded image (βˆ’π‘’)^𝑠 will be exactly the opposite of 𝑒^{βˆ’π‘ }, that is, (βˆ’π‘’)^𝑠 = 1 βˆ’ 𝑒^{βˆ’π‘ }. This allows us to show that

TV𝐴(𝑒) = TV𝐴(βˆ’π‘’) ≀ ∫_0^∞ TV𝐴((βˆ’π‘’)^π‘Ÿ) π‘‘π‘Ÿ = ∫_0^∞ TV𝐴(1 βˆ’ 𝑒^{βˆ’π‘Ÿ}) π‘‘π‘Ÿ
 = ∫_0^∞ TV𝐴(𝑒^{βˆ’π‘Ÿ}) π‘‘π‘Ÿ = ∫_{βˆ’βˆž}^0 TV𝐴(𝑒^𝑠) 𝑑𝑠. (3.29)

Following from the supremum definition of the anisotropic total variation in (3.3), we obtain the inequality

TV𝐴(𝑒1 + 𝑒2) = sup_{β€–πœ‰β€–βˆ—π΄ ≀ 1} βˆ«β„¦ (𝑒1 + 𝑒2) div πœ‰ 𝑑π‘₯
 ≀ sup_{β€–πœ‰β€–βˆ—π΄ ≀ 1} βˆ«β„¦ 𝑒1 div πœ‰ 𝑑π‘₯ + sup_{β€–πœ‰β€–βˆ—π΄ ≀ 1} βˆ«β„¦ 𝑒2 div πœ‰ 𝑑π‘₯
 = TV𝐴(𝑒1) + TV𝐴(𝑒2). (3.30)

Next, we write a general 𝑒 as a difference of two non-negative functions 𝑒 = 𝑒+ βˆ’ π‘’βˆ’, where 𝑒+ = max{𝑒, 0} and π‘’βˆ’ = βˆ’min{𝑒, 0}. Inserting (3.28) and (3.29) into (3.30) we obtain

TV𝐴(𝑒) ≀ TV𝐴(π‘’βˆ’) + TV𝐴(𝑒+) = TV𝐴(βˆ’π‘’βˆ’) + TV𝐴(𝑒+)
 ≀ ∫_{βˆ’βˆž}^0 TV𝐴((βˆ’π‘’βˆ’)^𝑠) 𝑑𝑠 + ∫_0^∞ TV𝐴(𝑒+^𝑠) 𝑑𝑠
 = ∫_{βˆ’βˆž}^0 TV𝐴(𝑒^𝑠) 𝑑𝑠 + ∫_0^∞ TV𝐴(𝑒^𝑠) 𝑑𝑠 = ∫_{βˆ’βˆž}^∞ TV𝐴(𝑒^𝑠) 𝑑𝑠. (3.31)

Note that 𝑒+ and π‘’βˆ’ will not be differentiable everywhere, but we did not use the differentiability of 𝑒 in this part of the proof.

Proof of lower bound. Define the function

π‘š(𝑑) = ∫_{{π‘₯ ∈ Ξ© ∢ 𝑒(π‘₯) ≀ 𝑑}} β€–βˆ‡π‘’β€–π΄ 𝑑π‘₯, (3.32)

and note that π‘š(∞) = TV𝐴(𝑒) and π‘š(βˆ’βˆž) = 0. Since π‘š(𝑑) is non-decreasing in 𝑑, we can apply the existence theorems of Lebesgue [16, Thm. 17.12, 18.14] to conclude that π‘šβ€²(𝑑) exists almost everywhere and that the following inequality holds:

∫_{βˆ’βˆž}^∞ π‘šβ€²(𝑑) 𝑑𝑑 ≀ π‘š(∞) βˆ’ π‘š(βˆ’βˆž) = TV𝐴(𝑒). (3.33)


[Figure 3.4: Visualization of the cut-off function πœ‚π‘Ÿ(𝑑), which rises from 0 at 𝑑 = 𝑠 to 1 at 𝑑 = 𝑠 + π‘Ÿ, and its derivative πœ‚π‘Ÿβ€²(𝑑).]

Next, fix an 𝑠 ∈ ℝ and define the cut-off function

πœ‚π‘Ÿ(𝑑) = {0 if 𝑑 < 𝑠; (𝑑 βˆ’ 𝑠)/π‘Ÿ if 𝑠 ≀ 𝑑 < 𝑠 + π‘Ÿ; 1 if 𝑑 β‰₯ 𝑠 + π‘Ÿ},
πœ‚π‘Ÿβ€²(𝑑) = {0 if 𝑑 < 𝑠; 1/π‘Ÿ if 𝑠 < 𝑑 < 𝑠 + π‘Ÿ; 0 if 𝑑 > 𝑠 + π‘Ÿ}, (3.34)

visualized in Figure 3.4. By composing the function πœ‚π‘Ÿ with our image 𝑒 and using Green's formula, for example from [8, Corollary 9.32], we obtain

βˆ«β„¦ βˆ’πœ‚π‘Ÿ(𝑒) div πœ‰ 𝑑π‘₯ = βˆ«β„¦ πœ‚π‘Ÿβ€²(𝑒) βˆ‡π‘’ β‹… πœ‰ 𝑑π‘₯ = (1/π‘Ÿ) ∫_{{𝑠 < 𝑒 < 𝑠 + π‘Ÿ}} βˆ‡π‘’ β‹… πœ‰ 𝑑π‘₯. (3.35)


From (3.36) we then obtain

π‘šβ€²(𝑠) β‰₯ βˆ’βˆ«β„¦ 𝑒^𝑠 div πœ‰ 𝑑π‘₯. (3.38)

As this holds for any β€–πœ‰β€–βˆ—π΄ ≀ 1, we get from the extended total variation definition in (3.3) that π‘šβ€²(𝑠) β‰₯ TV𝐴(𝑒^𝑠) almost everywhere, and conclude using (3.33) that

TV𝐴(𝑒) β‰₯ ∫_{βˆ’βˆž}^∞ π‘šβ€²(𝑠) 𝑑𝑠 β‰₯ ∫_{βˆ’βˆž}^∞ TV𝐴(𝑒^𝑠) 𝑑𝑠. (3.39)

Combining the upper and lower bounds just proved, we have equality.

This coarea formula is our first step in transforming the anisotropic total variation into an easily discretizable expression. It allows us to consider each level separately when calculating the anisotropic total variation.
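The statement can be checked directly in a simple discrete setting. The sketch below uses the β„“1 grid total variation, the sum of absolute 4-neighbor differences, which is a discrete analogue rather than TV𝐴 itself: for an integer-valued image it decomposes exactly into the total variations of the thresholded layers, since |π‘Ž βˆ’ 𝑏| equals the number of levels 𝑠 that separate π‘Ž and 𝑏.

```python
import numpy as np

def tv_l1(u):
    """Discrete l1 total variation: sum of absolute 4-neighbor differences."""
    return np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum()

rng = np.random.default_rng(0)
u = rng.integers(0, 8, size=(16, 16))   # random image with levels 0..7

# Discrete coarea: TV(u) equals the summed TV of all thresholded layers,
# because |a - b| = #{s : exactly one of a, b exceeds s}.
layers = sum(tv_l1((u > s).astype(int)) for s in range(7))
```

The equality is exact here, not just approximate, which is what makes the level-by-level treatment in the graph cut construction later possible.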

The anisotropic total variation of the thresholded images occurring in the anisotropic coarea formula is closely related to the size of the boundary of the level set, as the only variation in a characteristic function occurs at the boundary of the set. This is why we introduce the following definition of the anisotropic set perimeter.

Definition 3.6 (The anisotropic set perimeter). Given an anisotropy tensor 𝐴, the anisotropic perimeter of a set π‘ˆ in Ξ© is defined as

Per𝐴(π‘ˆ; Ξ©) = TV𝐴(πœ’π‘ˆ). (3.40)

The anisotropic set perimeter is not like the regular set perimeter and does not measure the length of the boundary of the set, but for sufficiently nice level sets it can be calculated in the following way:

Per𝐴({𝑒 > 𝑠}; Ξ©) = TV𝐴(𝑒^𝑠)
 = sup_{β€–πœ‰β€–βˆ—π΄ ≀ 1} βˆ«β„¦ 𝑒^𝑠 div πœ‰ 𝑑π‘₯
 = sup_{β€–πœ‰β€–βˆ—π΄ ≀ 1} ∫_{{𝑒 > 𝑠}} div πœ‰ 𝑑π‘₯
 = sup_{β€–πœ‰β€–βˆ—π΄ ≀ 1} ∫_{πœ•{𝑒 > 𝑠}} πœˆπ‘  β‹… πœ‰ 𝑑𝑑
 = sup_{β€–πœ‚β€– ≀ 1} ∫_{πœ•{𝑒 > 𝑠}} πœˆπ‘  β‹… 𝐴^(1/2)πœ‚ 𝑑𝑑
 = ∫_{πœ•{𝑒 > 𝑠}} √(πœˆπ‘ π‘‡ 𝐴 πœˆπ‘ ) 𝑑𝑑. (3.41)


    .. π‘₯.

    𝑦

    .....

    𝜌

    .

    𝜈

    .πœ™

    Figure 3.5: The blue line is parametrized by the angle πœ™ and the distancefrom the origin to the line 𝜌, or alternatively, the pair (𝜈, 𝜌).

    Here, πœˆπ‘  is the unit exterior normal of the level set {𝑒 > 𝑠}. Note that becauseof the compact support of πœ‰ in Definition 3.1, the parts of the boundary of π‘ˆ thatoverlap with the boundary of Ξ© will not be included in the perimeter.

Exterior normals and perimeters of level sets of a general function 𝑒 ∈ BV(Ξ©) will not be considered here, but can for the isotropic case be found in for example [15, Sections 5.4 and 5.5].

Using the anisotropic coarea formula and inserting the anisotropic perimeter definition, we transform the anisotropic total variation and are left with the problem of minimizing the following functional:

𝐹(𝑒) = βˆ«β„¦ (𝑒 βˆ’ 𝑓)Β² 𝑑π‘₯ + 𝛽 ∫_{βˆ’βˆž}^∞ Per𝐴({𝑒 > πœ†}; Ξ©) π‘‘πœ†. (3.42)

The transformation is motivated by our upcoming anisotropic Cauchy–Crofton integration formula, and by the discretization, where an approximation of the perimeter will be computed using graph cut machinery.

3.4 Anisotropic Cauchy–Crofton formula

In the fields of integral geometry and geometric measure theory there are a number of interesting integral formulas. Several of them fall in a category often referred to as Cauchy–Crofton style formulas, and give ways to measure geometric objects using the set of all lines in the plane. The formulas presented here give a way to measure the length of a curve by counting the times it intersects lines in the set of all lines. The first formula is for the isotropic case, and we will use it to prove the anisotropic formula following it.


We write β„’ for the set of all lines in the plane, and parametrize them as shown in Figure 3.5. A line is parametrized by the angle πœ™ ∈ [0, 2πœ‹) of the normal going to the origin, and the distance 𝜌 ∈ [0, ∞) from the origin to the line. Sometimes it is more convenient to consider a unit vector 𝜈 giving the direction of the line instead of the angle parameter πœ™. We denote a line by β„“πœ™,𝜌 = β„“πœˆ,𝜌 where 𝜈 is a unit vector along the line, i.e. 𝜈 = (βˆ’sin πœ™, cos πœ™)𝑇. By defining the measure 𝑑ℒ = π‘‘πœ™ π‘‘πœŒ on this set we are ready to introduce the Cauchy–Crofton formula. Note that the measure 𝑑ℒ is invariant under rotations.

Theorem 3.7 (The Euclidean Cauchy–Crofton formula). Given a differentiable curve 𝐢 in ℝ², the length |𝐢| of this curve is related to the set of lines β„’ as follows:

βˆ«β„’ #(β„“πœ™,𝜌 ∩ 𝐢) 𝑑ℒ(β„“πœ™,𝜌) = 2|𝐢|, (3.43)

where #(β„“πœ™,𝜌 ∩ 𝐢) is the number of times the line β„“πœ™,𝜌 intersects the curve 𝐢.

    Proof. See [17, Theorem 3, Section 1-7].
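The Euclidean formula can be sanity-checked numerically. For a circle of radius π‘Ÿ centered at the origin, a line at distance 𝜌 meets the curve in exactly two points when 𝜌 < π‘Ÿ, independently of πœ™, so the πœ™-integral contributes a factor 2πœ‹. A sketch using a simple Riemann sum in 𝜌:

```python
import numpy as np

r = 1.5                               # circle of radius r centered at the origin
rhos = np.linspace(0.0, 3.0, 4001)    # the distance parameter of the lines
drho = rhos[1] - rhos[0]

# #(l_{phi,rho} ∩ C) = 2 when rho < r and 0 otherwise, for every phi.
counts = np.where(rhos < r, 2.0, 0.0)
lhs = 2.0 * np.pi * counts.sum() * drho   # Riemann sum of the integral over all lines
rhs = 2.0 * (2.0 * np.pi * r)             # 2 |C|, cf. (3.43)
```

Up to the discretization of the 𝜌-axis, the two sides agree.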

If our space is equipped with a metric tensor 𝑀(π‘₯) such that the inner product of two vectors π‘Ž and 𝑏 in a point π‘₯ is calculated as βŸ¨π‘Ž, π‘βŸ©π‘€ = βŸ¨π‘Ž, 𝑀(π‘₯)π‘βŸ©, then the length of a curve 𝛾 parametrized by some parameter 𝑑 becomes

|𝛾|𝑀 = βˆ«π›Ύ √⟨ ̇𝛾, 𝑀(𝛾(𝑑)) Μ‡π›ΎβŸ© 𝑑𝑑. (3.44)

We will now present and prove a Cauchy–Crofton formula in this case, where our domain is equipped with a metric tensor in each point. This elegant formula is very useful when we later discretize our perimeter calculation. The set of lines β„’ is then discretized in a reasonable way, and the length of the curve 𝐢 can be approximated by a sum over all these lines.

Theorem 3.8 (The anisotropic Cauchy–Crofton formula). Assume that our space Ξ© is equipped with a continuous positive definite metric tensor 𝑀(π‘₯), whose eigenvalues are bounded by 0 < π‘˜ ≀ πœ†2 ≀ πœ†1 ≀ 𝐾 < ∞ for all π‘₯ ∈ Ξ©. The Cauchy–Crofton formula for a differentiable curve 𝐢 of finite length then becomes

|𝐢|𝑀 = βˆ«β„’ βˆ‘_{π‘₯ ∈ β„“πœˆ,𝜌 ∩ 𝐢} det 𝑀(π‘₯) / (2 (πœˆπ‘‡ 𝑀(π‘₯) 𝜈)^(3/2)) 𝑑ℒ(β„“πœˆ,𝜌). (3.45)

Proof of the anisotropic Cauchy–Crofton formula. Assume first that our space is equipped with a constant metric tensor 𝑀. The length of a curve in this space can be calculated by transforming the curve and applying the Euclidean Cauchy–Crofton formula:

|𝐢|𝑀 = ∫𝐢 √⟨ ̇𝐢, 𝑀 Μ‡πΆβŸ© 𝑑𝑑 = ∫𝐢 βˆšβŸ¨π‘€^(1/2) ̇𝐢, 𝑀^(1/2) Μ‡πΆβŸ© 𝑑𝑑 = |𝑀^(1/2)𝐢| (3.46)
 = (1/2) βˆ«β„’ #(β„“πœ™,𝜌 ∩ 𝑀^(1/2)𝐢) 𝑑ℒ(β„“πœ™,𝜌) (3.47)
 = (1/2) βˆ«β„’ #(𝑀^(βˆ’1/2)β„“πœ™,𝜌 ∩ 𝐢) 𝑑ℒ(β„“πœ™,𝜌) (3.48)
 = (1/2) βˆ«β„’ #(π‘šπœ™,𝜌 ∩ 𝐢) |𝐽𝑀(π‘šπœ™,𝜌)| 𝑑ℒ(π‘šπœ™,𝜌). (3.49)

Here 𝐽𝑀(β„“πœ™,𝜌) is the Jacobian of the coordinate transformation 𝐹 ∢ β„’ β†’ β„’ which maps β„“πœ™,𝜌 ↦ 𝑀^(1/2)β„“πœ™,𝜌.

We will now compute the Jacobian 𝐽𝑀(β„“πœ™,𝜌). As 𝑀 ∈ ℝ^(2Γ—2) is symmetric, so is 𝑀^(1/2), and it admits a decomposition 𝑀^(1/2) = π‘ˆΞ£π‘ˆπ‘‡ where the components correspond to the following coordinate transformations:

π‘ˆ(β„“πœˆ,𝜌) = β„“πœ™+πœ‰,𝜌 = β„“π‘ˆπœˆ,𝜌, (3.50)
π‘ˆπ‘‡(β„“πœˆ,𝜌) = β„“πœ™βˆ’πœ‰,𝜌 = β„“π‘ˆπ‘‡πœˆ,𝜌, (3.51)
Ξ£ = (𝜎1 0; 0 𝜎2) = (βˆšπœ†1 0; 0 βˆšπœ†2). (3.52)

As π‘ˆ and π‘ˆπ‘‡ correspond to rotations and our measure 𝑑ℒ is invariant under rotations, π‘ˆ and π‘ˆπ‘‡ do not contribute directly to the Jacobian. They do however affect the input angle of the operator Ξ£, such that 𝐽𝑀(β„“πœ™,𝜌) = 𝐽Σ²(π‘ˆπ‘‡β„“πœ™,𝜌). Thus we will now compute 𝐽Σ²(β„“πœ™,𝜌). Given a line

    β„“πœ™,𝜌 = (𝜌 β‹… cos πœ™πœŒ β‹… sin πœ™) + ℝ (

    βˆ’ sin πœ™cos πœ™ ) , (3.53)

    the operator Ξ£ transforms it into

    Ξ£β„“πœ™,𝜌 = (𝜎1𝜌 β‹… cos πœ™πœŽ2𝜌 β‹… sin πœ™

    ) + ℝ (βˆ’πœŽ1 sin πœ™πœŽ2 cos πœ™) , (3.54)

    which equals the line β„“πœƒ,πœ‚ with

    πœƒ = arctan (𝜎1𝜎2tan πœ™) (3.55)

    πœ‚ = ⟨(𝜎1𝜌 β‹… cos πœ™πœŽ2𝜌 β‹… sin πœ™) , (cos πœƒsin πœƒ)⟩ = 𝜎1𝜌 β‹… cos πœ™ β‹… cos πœƒ + 𝜎2𝜌 β‹… sin πœ™ β‹… sin πœƒ. (3.56)


    As πœ•πœŒπœƒ = 0, the Jacobian becomes ∣𝐽Σ2(β„“πœ™,𝜌)∣ = πœ•πœ™πœƒ β‹… πœ•πœŒπœ‚. Differentiation yields

    πœ•πœ™πœƒ =𝜎1𝜎2 sec2 πœ™

    1 + 𝜎21𝜎22 tan2 πœ™

    = 𝜎1𝜎2𝜎21 sin2 πœ™ + 𝜎22 cos2 πœ™, (3.57)

    πœ•πœŒπœ‚ = 𝜎1 cos πœ™ β‹… cos πœƒ + 𝜎2 sin πœ™ β‹… sin πœƒ. (3.58)

    In the expression for πœ•πœŒπœ‚ we insert πœƒ from (3.55) and use that sin(arctan(π‘₯)) =π‘₯/

    √1 + π‘₯2 and that cos(arctan(π‘₯)) = 1/

    √1 + π‘₯2 to obtain

    πœ•πœŒπœ‚ =𝜎1 cos πœ™ + 𝜎2 sin πœ™ 𝜎1𝜎2 tan πœ™

    √1 + 𝜎21𝜎22 tan2 πœ™

    = 𝜎1𝜎2√𝜎21 sin2 πœ™ + 𝜎22 cos2 πœ™

    . (3.59)

    If 𝜈 = (𝜈π‘₯, πœˆπ‘¦)𝑇 is a unit vector along the line β„“πœ™,𝜌 = β„“πœˆ,𝜌 then

    ∣𝐽Σ2(β„“πœˆ,𝜌)∣ =𝜎21𝜎22

    (𝜎21 sin2 πœ™ + 𝜎22 cos2 πœ™)3/2 =

    𝜎21𝜎22(𝜎21𝜈2π‘₯ + 𝜎22𝜈2𝑦)

    3/2 =det Ξ£2

    (πœˆπ‘‡ β‹… Ξ£2 β‹… 𝜈)3/2 .

    (3.60)We are interested in the Jacobian of the whole transformation 𝐽Σ2(π‘ˆπ‘‡ β„“πœˆ,𝜌), so allthat is left to do is insert π‘ˆπ‘‡ β„“πœˆ,𝜌 to obtain

    βˆ£π½π‘€(β„“πœˆ,𝜌)∣ = ∣𝐽Σ2(π‘ˆπ‘‡ β„“πœˆ,𝜌)∣ =det 𝑀

    (πœˆπ‘‡ π‘ˆ β‹… Ξ£2 β‹… π‘ˆπ‘‡ 𝜈)3/2= det 𝑀

    (πœˆπ‘‡ β‹… 𝑀 β‹… 𝜈)3/2(3.61)

We have now proved that for a constant metric tensor 𝑀, the length of the differentiable curve 𝐢 with respect to this tensor can be calculated as

|𝐢|𝑀 = ∫𝐢 √⟨ ̇𝐢, 𝑀 Μ‡πΆβŸ© 𝑑𝑑 = βˆ«β„’ #(β„“πœˆ,𝜌 ∩ 𝐢) det 𝑀 / (2 (πœˆπ‘‡ 𝑀 𝜈)^(3/2)) 𝑑ℒ(β„“πœˆ,𝜌). (3.62)

Further we argue that the similar formula in (3.45) holds for a non-constant but continuous metric tensor 𝑀(π‘₯). By partitioning the domain into disjoint sets π‘ˆπ‘– such that Ξ© = βˆͺπ‘–π‘ˆπ‘–, we make a piecewise constant approximation π‘€πœ‹(π‘₯) such that if π‘₯ ∈ π‘ˆπ‘– then π‘€πœ‹(π‘₯) = 𝑀(π‘₯𝑖) for some fixed π‘₯𝑖 ∈ π‘ˆπ‘–. We then approximate (3.62) by

|𝐢|π‘€πœ‹ = βˆ‘π‘– βˆ«β„’ #(β„“πœˆ,𝜌 ∩ 𝐢 ∩ π‘ˆπ‘–) 𝑀𝑖(𝜈) 𝑑ℒ(β„“πœˆ,𝜌), (3.63)

where 𝑀𝑖 is the weight function used in the set π‘ˆπ‘–, that is,

𝑀𝑖(𝜈) = det 𝑀(π‘₯𝑖) / (2 (πœˆπ‘‡ 𝑀(π‘₯𝑖) 𝜈)^(3/2)). (3.64)


We further simplify the approximation by introducing the global weight function π‘€πœ‹(𝜈, π‘₯), which is equal to 𝑀𝑖(𝜈) when π‘₯ ∈ π‘ˆπ‘–. It can be written as

π‘€πœ‹(𝜈, π‘₯) = det π‘€πœ‹(π‘₯) / (2 (πœˆπ‘‡ π‘€πœ‹(π‘₯) 𝜈)^(3/2)). (3.65)

Using this weight in (3.63) we can get rid of the sum over the partition and form a sum over all intersection points of 𝐢 and the line β„“πœˆ,𝜌 currently being integrated over. The approximation becomes

|𝐢|π‘€πœ‹ = βˆ‘π‘– βˆ«β„’ βˆ‘_{π‘₯ ∈ β„“πœˆ,𝜌 ∩ 𝐢 ∩ π‘ˆπ‘–} π‘€πœ‹(𝜈, π‘₯) 𝑑ℒ(β„“πœˆ,𝜌) = βˆ«β„’ βˆ‘_{π‘₯ ∈ β„“πœˆ,𝜌 ∩ 𝐢} π‘€πœ‹(𝜈, π‘₯) 𝑑ℒ(β„“πœˆ,𝜌). (3.66)

Now it only remains to show that the left- and right-hand sides of (3.66) converge to the left- and right-hand sides of (3.45).

    As our partition πœ‹ is refined, the weight π‘€πœ‹(π‘₯) converges pointwise to thecontinuously varying weight

    𝑀(𝜈, π‘₯) = det 𝑀(π‘₯)(πœˆπ‘‡ β‹… 𝑀(π‘₯) β‹… 𝜈)3/2

    (3.67)

    found in (3.45).Recall from (3.44) that the left-hand side is calculated as

    |𝐢|π‘€πœ‹ = ∫𝐢∣ ̇𝐢(𝑑)∣

    π‘€πœ‹π‘‘π‘‘ = ∫

    𝐢√ ̇𝐢(𝑑)𝑇 π‘€πœ‹(𝐢(𝑑)) ̇𝐢(𝑑) 𝑑𝑑. (3.68)

    We know that π‘€πœ‹(π‘₯) converges pointwise to 𝑀(π‘₯), and thus | ̇𝐢(𝑑)|π‘€πœ‹ convergespointwise to | ̇𝐢(𝑑)|𝑀 . We have assumed bounds on the eigenvalues of 𝑀(π‘₯) suchthat, according to the Rayleigh principle

    𝐾 β‰₯ πœ†1 = maxπœ‰πœ‰π‘‡ π‘€πœ‹(π‘₯)πœ‰

    πœ‰π‘‡ πœ‰ (3.69)

    and therefore we have the bound

    πœ‰π‘‡ π‘€πœ‹(π‘₯)πœ‰ ≀ 𝐾 β€–πœ‰β€–2 , βˆ€πœ‰. (3.70)

Thus the integrand of (3.68) is bounded by 𝑔(𝑑) = (𝐾 β‹… ̇𝐢(𝑑)𝑇 ̇𝐢(𝑑))^(1/2). We know that 𝑔(𝑑) is integrable, as its integral is exactly √𝐾 |𝐢| and we have assumed that the curve is of finite length. This means we can apply Lebesgue's dominated convergence theorem to see that |𝐢|π‘€πœ‹ β†’ |𝐢|𝑀.

We apply the same theorem to show that the right-hand side of (3.66) converges. Recall the definition of π‘€πœ‹ in (3.65). The numerator is equal to 𝜎1²𝜎2Β² = πœ†1πœ†2 and is by assumption bounded from above by 𝐾².

Next we need to bound πœˆπ‘‡ π‘€πœ‹(π‘₯) 𝜈 away from zero. According to the Rayleigh principle,

πœ†2 = min_{β€–πœ‰β€–=1} πœ‰π‘‡ π‘€πœ‹(π‘₯) πœ‰, (3.71)

and thus πœˆπ‘‡ π‘€πœ‹(π‘₯) 𝜈 β‰₯ πœ†2 β‰₯ π‘˜. The weight function π‘€πœ‹ is then bounded such that

    βˆ‘π‘₯βˆˆβ„“πœˆ,𝜌∩𝐢

    π‘€πœ‹(𝜈, π‘₯) ≀ βˆ‘π‘₯βˆˆβ„“πœˆ,𝜌∩𝐢

    𝐾2π‘˜3/2 =

    𝐾2π‘˜3/2 β‹… #(β„“πœˆ,𝜌 ∩ 𝐢) =∢ 𝑔(β„“πœˆ,𝜌). (3.72)

    This is integrable following from the Euclidean Cauchy–Crofton formula in Theo-rem 3.7 and the fact that we assumed 𝐢 to be of finite length:

    βˆ«β„’

    𝑔(β„“πœˆ,𝜌)𝑑ℒ(β„“πœˆ,𝜌) =𝐾2π‘˜3/2 |𝐢| < ∞. (3.73)

Thus we can apply the dominated convergence theorem again and conclude that

βˆ«β„’ βˆ‘_{π‘₯ ∈ β„“πœˆ,𝜌 ∩ 𝐢} π‘€πœ‹(𝜈, π‘₯) 𝑑ℒ(β„“πœˆ,𝜌) β†’ βˆ«β„’ βˆ‘_{π‘₯ ∈ β„“πœˆ,𝜌 ∩ 𝐢} 𝑀(𝜈, π‘₯) 𝑑ℒ(β„“πœˆ,𝜌), (3.74)

which, as both sides of the equality in (3.66) have been shown to converge, leaves us with what we wanted to prove:

|𝐢|𝑀 = βˆ«β„’ βˆ‘_{π‘₯ ∈ β„“πœˆ,𝜌 ∩ 𝐢} det 𝑀(π‘₯) / (2 (πœˆπ‘‡ 𝑀(π‘₯) 𝜈)^(3/2)) 𝑑ℒ(β„“πœˆ,𝜌). (3.75)

With the anisotropic coarea formula in Theorem 3.4 we have a way to calculate the anisotropic total variation by integrating the anisotropic perimeter of each level set of the image. We will now see how the anisotropic Cauchy–Crofton formula can help us calculate the perimeters of the level sets. In the Euclidean case, which here would amount to setting the anisotropy tensor 𝐴 equal to the identity matrix 𝐼, the perimeter coincides nicely with the length of the boundary curve, assuming some regularity for the boundary. In the general case we need to be more careful. As can be seen in (3.41), the anisotropic perimeter is calculated by integrating the norm of the normal vector around the boundary, while the anisotropic curve length in (3.44) is the integral of the norm of the tangent vector of the curve. Thus a 90Β° rotation separates the two.

If 𝑃 is a 90Β° rotation matrix, we have

Per𝐴(π‘ˆ; Ξ©) = ∫_{πœ•π‘ˆ} √⟨νπœ•π‘ˆ, 𝐴(π‘₯)πœˆπœ•π‘ˆβŸ© 𝑑𝑑 = ∫_{πœ•π‘ˆ} βˆšβŸ¨π‘ƒπœˆπœ•π‘ˆ, 𝑃𝐴(π‘₯)𝑃𝑇 π‘ƒπœˆπœ•π‘ˆβŸ© 𝑑𝑑. (3.76)

We simplify the equation by defining the metric tensor 𝑀(π‘₯) = 𝑃𝐴(π‘₯)𝑃𝑇 and letting 𝛾 = πœ•π‘ˆ ∩ Ξ© be an arclength parametrization of the part of the boundary of π‘ˆ that does not overlap with the boundary of Ξ©:

Per𝐴(π‘ˆ; Ξ©) = βˆ«π›Ύ √⟨ ̇𝛾, 𝑀(π‘₯) Μ‡π›ΎβŸ© 𝑑𝑑. (3.77)

Now we make sure that all the assumptions of the anisotropic Cauchy–Crofton formula in Theorem 3.8 are fulfilled, so that it can be applied to the curve length integral we have constructed in (3.77).

The structure tensor is constructed as described in Section 3.1.1:

π‘†πœŒ(π‘₯) = (𝐾𝜌 βˆ— (βˆ‡π‘“πœŽ βŠ— βˆ‡π‘“πœŽ))(π‘₯). (3.78)

Because of the convolutions with the Gaussian function, this is a smooth map from Ξ©Μ„ to ℝ^(2Γ—2). As we can see in (3.6), the eigenvalues depend continuously on the elements of the structure tensor π‘†πœŒ(π‘₯). The extreme value theorem states that a continuous real-valued function on a nonempty compact space is bounded above. Thus the eigenvalues πœ†1 β‰₯ πœ†2 of π‘†πœŒ(π‘₯) are bounded above. Moreover, by the construction in (3.10), there exists a uniform bound π‘˜ such that the smallest eigenvalue 𝜎1 of the anisotropy tensor 𝐴(π‘₯) is bounded away from zero, as

𝜎1 = (1 + (πœ†1 βˆ’ πœ†2)Β²/πœ”Β²)^(βˆ’1) β‰₯ (1 + πœ†1Β²/πœ”Β²)^(βˆ’1) β‰₯ π‘˜ > 0. (3.79)

Hence our metric tensor 𝑀(π‘₯) = 𝑃𝐴(π‘₯)𝑃𝑇 is continuous and positive definite with bounded eigenvalues π‘˜ ≀ 𝜎1 ≀ 𝜎2 ≀ 𝐾 = 1, and thus the curve length calculation in (3.77) fulfills all the assumptions of the anisotropic Cauchy–Crofton formula in Theorem 3.8. Hence we can apply the formula to calculate the perimeter in (3.77) as

Per𝐴(π‘ˆ; Ξ©) = βˆ«β„’ βˆ‘_{π‘₯ ∈ β„“πœˆ,𝜌 ∩ 𝛾} det 𝑀(π‘₯) / (2 (πœˆπ‘‡ 𝑀(π‘₯) 𝜈)^(3/2)) 𝑑ℒ(β„“πœˆ,𝜌), (3.80)


    where 𝛾 = πœ•π‘ˆ ∩ Ξ©. Note that 𝑃 does not affect the determinant, i.e. det 𝐴 =det 𝑃𝐴𝑃 𝑇 = det 𝑀 , and from our decomposition in (3.10) we see that the trans-formation 𝑃𝐴𝑃 𝑇 β†’ 𝑀 actually amounts to switching the two eigenvalues 𝜎1 and𝜎2 in Ξ£.

    This concludes the treatment of the continuous problem. We have seen how theanisotropic coarea formula in Theorem 3.4 allows us to calculate the anisotropictotal variation as an integral of the perimeter of all the level sets. Through theanisotropic Cauchy–Crofton formula in Theorem 3.8 these perimeters are calcu-lated by an integral over the set of all lines. We are then left with the functional

    𝐹(𝑒) = βˆ«β„¦

    (𝑒 βˆ’ 𝑓)2 + 𝛽 TV𝐴(𝑒), (3.81)

    where

    TV_𝐴(𝑒) = ∫_{βˆ’βˆž}^{∞} ∫_β„’ βˆ‘_{π‘₯ ∈ β„“_{𝜈,𝜌} ∩ 𝛾_𝑠} det 𝑀(π‘₯) / (2 (𝜈^𝑇 𝑀(π‘₯) 𝜈)^{3/2}) 𝑑ℒ(β„“_{𝜈,𝜌}) 𝑑𝑠, (3.82)

    and 𝛾𝑠 = πœ•{𝑒 > 𝑠} ∩ Ξ©. Within the restrictions that these theorems put on thetensor 𝑀(π‘₯), we have chosen a construction where one eigenvalue is always 1,while the other varies from 1 in smooth areas towards 0 around edges, with thecorresponding eigenvector perpendicular to the edge.

  • Chapter 4Discrete formulation

The whole transformation from the initial functional in (3.4), through the anisotropic coarea formula and the Cauchy–Crofton formula, was motivated by the discrete formulation which will be described here. After discretizing the functional, we will see how a graph cut approach can be used to find a global minimizer in polynomial time.

4.1 Discretization

Assume that our discrete images are given on a uniform grid 𝒒, where each grid point is called a pixel. The image is a function giving each pixel a value in the set of levels 𝒫 = {0, … , 𝐿 βˆ’ 1}, such that 𝑒 ∢ 𝒒 β†’ 𝒫. This is a reasonable assumption for digital grayscale images. Thus, when discretizing the functional in (3.81), we have to consider that our images now have both a discrete domain and a discrete co-domain.

The integrals in (3.81) will be approximated by discrete sums. The fidelity term is discretized without too much trouble, while for the regularization term there is more choice as to how to discretize the set of lines β„’. In the end we will verify that our discretization is consistent with the continuous functional.

4.1.1 Fidelity term

Since it is not affected by our introduction of the anisotropy tensor, the fidelity term can be discretized as in my project work [1]. For some pixel position π‘₯ ∈ 𝒒 and some level value π‘˜ ∈ 𝒫, we define the following function

    𝑁π‘₯(π‘˜) = |π‘˜ βˆ’ 𝑓π‘₯|2 (4.1)


  • 4.1. DISCRETIZATION 33

which is the value of the fidelity term if we were to give 𝑒π‘₯ the value π‘˜. This allows us to write

    ∫_Ξ© |𝑒 βˆ’ 𝑓|Β² 𝑑π‘₯ β‰ˆ βˆ‘_{π‘₯βˆˆπ’’} |𝑒π‘₯ βˆ’ 𝑓π‘₯|Β² Ξ”π‘₯ = βˆ‘_{π‘₯βˆˆπ’’} 𝑁π‘₯(𝑒π‘₯) Ξ”π‘₯. (4.2)

The reason we introduce the function 𝑁π‘₯(π‘˜) is that we want to apply the following decomposition formula, which holds for any function 𝐹(π‘˜) of π‘˜ ∈ 𝒫:

    𝐹(π‘˜) =π‘˜βˆ’1βˆ‘πœ†=0

    (𝐹(πœ† + 1) βˆ’ 𝐹(πœ†)) + 𝐹(0)

    =πΏβˆ’2βˆ‘πœ†=0

    (𝐹(πœ† + 1) βˆ’ 𝐹(πœ†))𝐼(πœ† < π‘˜) + 𝐹(0),(4.3)

    where 𝐼(π‘₯) is the indicator function that takes the value 1 if π‘₯ is true, and 0 if π‘₯is false. Since 𝐼(πœ† < 𝑒π‘₯) = π‘’πœ†π‘₯ we rewrite (4.2) and obtain

    βˆ‘π‘₯βˆˆπ’’

    |𝑒π‘₯ βˆ’ 𝑓π‘₯|2 = βˆ‘π‘₯βˆˆπ’’

    𝑁π‘₯(𝑒π‘₯) =πΏβˆ’2βˆ‘πœ†=0

    βˆ‘π‘₯βˆˆπ’’

    (𝑁π‘₯(πœ† + 1) βˆ’ 𝑁π‘₯(πœ†))π‘’πœ†π‘₯ + 𝑁π‘₯(0). (4.4)

As our domain is discretized uniformly, we drop the constant Ξ”π‘₯ and absorb it into our parameter 𝛽 of (3.81). Note that since our image takes values in 𝒫 = {0, … , 𝐿 βˆ’ 1}, the thresholded image 𝑒^{πΏβˆ’1} is equal to zero everywhere.
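The decomposition can be illustrated with a small Python sketch (hypothetical, with 𝐿 = 4 levels and a made-up pixel value 𝑓π‘₯ = 2), verifying (4.3) for 𝐹 = 𝑁π‘₯:

```python
# Sketch of the level decomposition (4.3)-(4.4), assuming L = 4 levels.
# N(k) = |k - f_x|^2 is the per-pixel fidelity of (4.1); f_x = 2 is made up.

L = 4
f_x = 2

def N(k):                      # fidelity cost of assigning level k to pixel x
    return (k - f_x) ** 2

def N_decomposed(k):           # right-hand side of (4.3) with F = N
    total = N(0)
    for lam in range(L - 1):   # lam = 0, ..., L-2
        indicator = 1 if lam < k else 0   # equals u^lam_x when k = u_x
        total += (N(lam + 1) - N(lam)) * indicator
    return total

for k in range(L):
    assert N(k) == N_decomposed(k)   # the telescoping sum reproduces N(k)
```

The telescoping structure is what lets the fidelity term split into one binary term per level, which is the form the graph construction needs later.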

    4.1.2 Regularization term

Discretizing the regularization term is more challenging. We introduce the discrete levels to get

    ∫_{βˆ’βˆž}^{∞} Per_𝐴({𝑒 > πœ†}; Ξ©) π‘‘πœ† β‰ˆ βˆ‘_{πœ†=0}^{πΏβˆ’2} Per_𝐴({𝑒 > πœ†}; Ξ©) Ξ”πœ†. (4.5)

    As with the Ξ”π‘₯ difference, we can absorb the Ξ”πœ† difference into the 𝛽 parameter of(3.81). The perimeter is then calculated using a discretized version of the Cauchy–Crofton formula introduced in Theorem 3.8. Again, we stop the sum at πΏβˆ’2 sincethe level set {𝑒 > 𝐿 βˆ’ 1} is empty and has zero perimeter.

  • 34 CHAPTER 4. DISCRETE FORMULATION

[Figure 4.1: The set of lines β„’ is discretized to β„’_𝐷, where each line belongs to a family given by the angle parameter πœ™. (a) The discrete set of lines β„’_𝐷 visualized as a neighborhood. (b) One family of lines having the same πœ™ parameter, with inter-line distance Ξ”πœŒ.]

    Discrete anisotropic Cauchy–Crofton formula

By approximating the integral in Theorem 3.8 by a discrete sum we obtain the approximation

    |𝐢|_𝑀 = ∫_β„’ βˆ‘_{π‘₯ ∈ β„“_{𝜈,𝜌} ∩ 𝐢} det 𝑀(π‘₯) / (2 (𝜈^𝑇 𝑀(π‘₯) 𝜈)^{3/2}) 𝑑ℒ(β„“_{𝜈,𝜌})
        β‰ˆ βˆ‘_{β„“_{𝜈,𝜌} ∈ β„’_𝐷} βˆ‘_{π‘₯ ∈ β„“_{𝜈,𝜌} ∩ 𝐢} det 𝑀(π‘₯) / (2 (𝜈^𝑇 𝑀(π‘₯) 𝜈)^{3/2}) Ξ”β„“_{𝜈,𝜌}
        = βˆ‘_𝜈 βˆ‘_𝜌 βˆ‘_{π‘₯ ∈ β„“_{𝜈,𝜌} ∩ 𝐢} det 𝑀(π‘₯) / (2 (𝜈^𝑇 𝑀(π‘₯) 𝜈)^{3/2}) Ξ”πœŒ Ξ”πœˆ. (4.6)

The set of lines β„’ has been discretized into the set β„’_𝐷. Note that 𝐢 is still a differentiable curve, not yet discretized. Being a difference in the 𝜌 parameter of our line discretization in Figure 3.5, the difference Ξ”πœŒ represents the distance from one line to the next in a line family as shown in Figure 4.1b, and thus depends on the angle πœ™ considered. The difference Ξ”πœ™ is taken to be the average of the distances to the two neighboring line families as shown in Figure 4.1a, and thus also depends on πœ™.

The choice of our discrete set of lines β„’_𝐷 is important, as it will decide the accuracy of our approximation. We need some sensible restrictions on the set

  • 4.1. DISCRETIZATION 35

[Figure 4.2: Here our intersection approximation would not be correct, as only the intersection with edge 𝑏 is counted in (4.15), even though the curve intersects edge π‘Ž twice.]

β„’_𝐷 to simplify the further discussion. All lines intersect at least two grid points, and from the periodicity of our grid they thus intersect an infinite number of grid points. This puts some restrictions on the angles we can choose. For each angle included, we include all possible lines of that family, meaning there are no grid points without a line of that family intersecting it.

The set of lines can then be represented by the neighborhood of a pixel as shown in Figure 4.1a. We write 𝒩(π‘₯) for the neighborhood of grid point π‘₯. Extending the edges shown in Figure 4.1a gives all lines going through the point considered. Figure 4.1b shows all lines of a given family, i.e. lines having the same angle parameter πœ™.

Thus not only have we discretized the set of lines, but each line is made up of edges going from one grid point to the next. We will denote such an edge by 𝑒 or 𝑒_{π‘Žπ‘} when its endpoints are π‘Ž, 𝑏 ∈ 𝒒. Thus we rewrite the discretization of (4.6), and sum over all the edges in the discretization β„’_𝐷 to obtain

    |𝐢|𝑀 β‰ˆ βˆ‘π‘’

    βˆ‘π‘₯βˆˆπ‘’βˆ©πΆ

    det 𝑀(π‘₯) ‖𝑒‖3

    2 (𝑒𝑇 β‹… 𝑀(π‘₯) β‹… 𝑒)3/2Ξ”πœ™ Ξ”πœŒ. (4.7)

This is beginning to look like something we can calculate. One difficulty is finding the intersections 𝑒 ∩ 𝐢. The exact calculation of these points will not fit into our graph cut framework later, and thus for an edge 𝑒 we will only consider the question β€œdid 𝑒 cross 𝐢 or not?” This amounts to checking whether the terminals of 𝑒 lie on different sides of the curve 𝐢. This approximation is exact for zero or one intersection points, but will, as we see in Figure 4.2, be wrong when we have more.

The second difficulty is that in the discrete setting, we will only have an approximation of the metric tensor 𝑀(π‘₯) for each point π‘₯ ∈ 𝒒, and it is thus not available for arbitrary intersection points in Ξ©. For an intersection of edge 𝑒 we will utilize the average of the tensor in the two endpoints of the edge. Thus for an

  • 36 CHAPTER 4. DISCRETE FORMULATION

[Figure 4.3: A visual argument showing that δ² = Ξ”πœŒ ‖𝑒‖. If extended to the whole plane, there will be the same number of blue squares as red rectangles, as each grid point is the upper left corner of both a blue and a red rectangle. Thus their areas must be equal.]

    intersection point π‘₯ somewhere on the edge π‘’π‘Žπ‘, we approximate the metric tensorby

    𝑀(π‘₯) β‰ˆ 𝑀(π‘’π‘Žπ‘) =𝑀(π‘Ž) + 𝑀(𝑏)

    2 , (4.8)

the component-wise average of the tensors in the two endpoints of the edge. Recall that we have already done some spatial smoothing of the structure tensor in (3.5) corresponding to the integration scale 𝜌, and thus we expect the tensors 𝑀(π‘Ž) and 𝑀(𝑏) to be similar for edges 𝑒 of reasonably short length.

We also remark that, using the Rayleigh principle, it is easy to conclude that the eigenvalues of the tensor approximation 𝑀(𝑒_{π‘Žπ‘}) are bounded below and above by the smallest and largest eigenvalues of 𝑀(π‘Ž) and 𝑀(𝑏).

We have now almost arrived at our final curve length approximation, but we need a way to calculate the inter-line distance Ξ”πœŒ, which will be provided by the following lemma.

Lemma 4.1. For each family of lines given by an angle parameter πœ™ in the uniform grid of size 𝛿 we have the relation

    δ² = ‖𝑒‖ Ξ”πœŒ. (4.9)

Proof. Consider a line β„“ intersecting the point in the grid given by the indices (𝑝, π‘ž) ∈ β„€Β². The distance Ξ”πœŒ from this line β„“ to the neighboring lines can then be calculated as a minimum over the distances to all other grid points.

The lines are split into edges 𝑒 = (𝛿𝑠, 𝛿𝑑)^𝑇 where 𝑠, 𝑑 ∈ β„€ are coprime, such that 𝑒 does not intersect any grid points other than its two endpoints.

  • 4.1. DISCRETIZATION 37

    We then calculate the minimal distance to a grid point not on the line β„“ as

    Ξ”πœŒ = min(𝑝′,π‘žβ€²)βˆˆπ’’\β„“

    {βŸ¨π›Ώ[𝑝 βˆ’ 𝑝′, π‘ž βˆ’ π‘žβ€²], π‘’βŸ‚

    β€–π‘’βŸ‚β€–βŸ©}

    = min(𝑝′,π‘žβ€²)βˆˆπ’’\β„“

    {𝛿2 β‹… 𝑑(𝑝 βˆ’ 𝑝′) βˆ’ 𝑠(π‘ž βˆ’ π‘žβ€²)‖𝑒‖ } .(4.10)

Since 𝑠 and 𝑑 are coprime, there exist π‘Ž, 𝑏 ∈ β„€ such that π‘Žπ‘‘ βˆ’ 𝑏𝑠 = 1, and since Ξ”πœŒ cannot be zero, we obtain

    Ξ”πœŒ = 𝛿2

    ‖𝑒‖. (4.11)

    A visual argument for the same result can be seen in Figure 4.3.
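As a complement to the proof, the relation of Lemma 4.1 can also be checked numerically. The following is a hypothetical brute-force sketch in Python, assuming grid size 𝛿 = 1; the function name and the sampled coprime pairs are ours:

```python
# Numerical check of Lemma 4.1 (delta^2 = ||e|| * Delta_rho) with delta = 1.
# For an edge e = (s, t) with s, t coprime, Delta_rho is the minimal distance
# from the line through the origin with direction e to a grid point off the line.

from math import gcd, hypot, isclose

def min_line_distance(s: int, t: int, search: int = 20) -> float:
    """Brute-force min of |t*p - s*q| / ||e|| over grid points (p, q) off the line."""
    norm = hypot(s, t)
    best = float("inf")
    for p in range(-search, search + 1):
        for q in range(-search, search + 1):
            d = abs(t * p - s * q) / norm
            if d > 0:
                best = min(best, d)
    return best

for s, t in [(1, 0), (1, 1), (1, 2), (2, 3), (3, 5)]:
    assert gcd(s, t) == 1
    assert isclose(min_line_distance(s, t) * hypot(s, t), 1.0)  # delta^2 = 1
```

The brute force finds the same minimum that the BΓ©zout argument π‘Žπ‘‘ βˆ’ 𝑏𝑠 = 1 guarantees analytically.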

    Inserting Ξ”πœŒ = 𝛿2/ ‖𝑒‖ and the tensor approximation of (4.8) into the curvelength approximation of (4.7) we obtain

    |𝐢|𝑀 β‰ˆ βˆ‘π‘’βˆ©πΆ

    det 𝑀(𝑒) ‖𝑒‖2 𝛿2 Ξ”πœ™2 (𝑒𝑇 β‹… 𝑀(𝑒) β‹… 𝑒)3/2

    , (4.12)

where the sum is over all edges crossing the curve.

The curve length we initially wanted to calculate was the level set perimeter

    Per𝐴({𝑒 > πœ†}; Ξ©) in (4.5). To find edges that crosses this boundary curve, weidentify the edges that have one terminal inside the level set, and the other outside.Thus we rewrite the sum over 𝑒 ∩ 𝐢 such that

    Per𝐴({𝑒 > πœ†}; Ξ©) β‰ˆ βˆ‘π‘’π‘Žπ‘

    βˆ£π‘’πœ†π‘Ž βˆ’ π‘’πœ†π‘ ∣det 𝑀(π‘’π‘Žπ‘) β€–π‘’π‘Žπ‘β€–2 𝛿2 Ξ”πœ™2 (π‘’π‘‡π‘Žπ‘ β‹… 𝑀(π‘’π‘Žπ‘) β‹… π‘’π‘Žπ‘)

    3/2 . (4.13)

The absolute value |𝑒^πœ†_π‘Ž βˆ’ 𝑒^πœ†_𝑏| is 1 if one of π‘Ž and 𝑏 lies inside the level set and the other lies outside, and 0 otherwise. In other words, the absolute value is one if 𝑒_{π‘Žπ‘} crosses the perimeter of {𝑒 > πœ†} an odd number of times, and zero otherwise.

    Thus we have arrived at our final discretization, which takes the form

    𝐹(𝑒) = βˆ‘πœ†

    βˆ‘π‘₯

    𝐹 πœ†π‘₯ (π‘’πœ†π‘₯) + 𝛽 βˆ‘πœ†

    βˆ‘(π‘₯,𝑦)

    𝐹 πœ†π‘₯,𝑦(π‘’πœ†π‘₯, π‘’πœ†π‘¦) =∢ 𝐹 πœ†(π‘’πœ†), (4.14)

    𝐹 πœ†π‘₯ (π‘’πœ†π‘₯) = (𝑁π‘₯(πœ† + 1) βˆ’ 𝑁π‘₯(πœ†)) β‹… π‘’πœ†π‘₯,

    𝐹 πœ†π‘₯,𝑦(π‘’πœ†π‘₯, π‘’πœ†π‘¦) = βˆ£π‘’πœ†π‘₯ βˆ’ π‘’πœ†π‘¦ ∣det 𝑀(𝑒π‘₯𝑦) βˆ₯𝑒π‘₯𝑦βˆ₯

    2 𝛿2 Ξ”πœ™2 (𝑒𝑇π‘₯𝑦 β‹… 𝑀(𝑒π‘₯𝑦) β‹… 𝑒π‘₯𝑦)

    3/2 .(4.15)

    Recall that 𝑁π‘₯(πœ†) = |πœ† βˆ’ 𝑓π‘₯|2.

  • 38 CHAPTER 4. DISCRETE FORMULATION

If we minimize 𝐹^πœ† to obtain 𝑒^πœ† for each level separately, it is obvious that we will also minimize the sum over all 𝐹^πœ†. However, it is not guaranteed that the obtained thresholded images 𝑒^πœ† can be combined to make an output image 𝑒. They were defined as 𝑒^πœ† = πœ’_{𝑒>πœ†}, so we need them to be monotonically decreasing in increasing level values, i.e.

    π‘’πœ†π‘₯ β‰₯ π‘’πœ‡π‘₯, βˆ€πœ† ≀ πœ‡, βˆ€π‘₯ ∈ 𝒒. (4.16)

Later we will present two graph cut algorithms that find the thresholded images minimizing each level, while guaranteeing that they meet this requirement.
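The combination of the thresholded images can be illustrated with a hypothetical Python sketch (the one-dimensional "image" is made up): since 𝑒π‘₯ = #{πœ† : 𝑒^πœ†_π‘₯ = 1}, summing the binary images recovers the levels.

```python
# Sketch: reconstructing u from monotone thresholded images u^lam = chi_{u > lam}.

L = 4
u = [3, 0, 2, 1]                                     # a made-up 1-D "image"
thresholded = [[1 if ux > lam else 0 for ux in u]    # u^lam for lam = 0, ..., L-2
               for lam in range(L - 1)]

# Monotonicity (4.16): u^lam_x >= u^mu_x for lam <= mu.
for lam in range(L - 2):
    assert all(a >= b for a, b in zip(thresholded[lam], thresholded[lam + 1]))

# Summing the binary images over lam recovers the original levels.
reconstructed = [sum(col) for col in zip(*thresholded)]
assert reconstructed == u
```

If monotonicity failed, the sum would still produce some image, but it would no longer be the image whose level sets are the computed ones, which is why the requirement matters.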

    Consistency

Consistency relates to whether a solution to the continuous problem fits in the discretized equation; in other words, whether the discretized equation approximates the continuous one.

It is obvious that the discretization of the fidelity term in (4.2) is consistent. The sum is a midpoint rule approximation of the integral. As the grid is refined and 𝛿 β†’ 0, the sum will converge to the integral.

For the regularization term we will argue that for a differentiable curve 𝐢, the discretization of our domain Ξ© and the set of lines β„’ gives a discrete Cauchy–Crofton formula that is consistent with the continuous one. We will show that for an increasingly refined discrete domain 𝒒, there exists a choice for β„’_𝐷 that leads to a consistent Cauchy–Crofton formula. For convenience we will use a neighborhood representation of β„’_𝐷 similar to the one in Figure 4.1a.

If we consider the edges 𝑒 of each family separately, the curve length approximation in (4.12) can be written

    |𝐢|_𝑀 = ∫_𝜈 ∫_𝜌 βˆ‘_{π‘₯ ∈ β„“_{𝜈,𝜌} ∩ 𝐢} det 𝑀(π‘₯) / (2 (𝜈^𝑇 𝑀(π‘₯) 𝜈)^{3/2}) π‘‘πœŒ π‘‘πœˆ
        β‰ˆ βˆ‘_𝜈 βˆ‘_𝜌 βˆ‘_{𝑒_{𝜈,𝜌} ∩ 𝐢} det 𝑀(𝑒_{𝜈,𝜌}) ‖𝑒_{𝜈,𝜌}β€–Β³ / (2 (𝑒_{𝜈,𝜌}^𝑇 𝑀(𝑒_{𝜈,𝜌}) 𝑒_{𝜈,𝜌})^{3/2}) Ξ”πœŒ Ξ”πœˆ. (4.17)

As described in the construction of this formula, there are four main approximations used. First, there is the fact that we do not consider the actual intersection points, but only whether an edge crosses the curve or not. Second, we have the tensor, which is averaged as in (4.8). And then we have the discretizations of our two line parameters 𝜈 and 𝜌.

It is intuitive that if sup ‖𝑒‖ β†’ 0, the number of times the differentiable curve 𝐢 can cross a given edge decreases. We will not prove convergence, but rather assume that the special cases where it might not work are negligible.

  • 4.1. DISCRETIZATION 39

[Figure 4.4: The discretization in the 𝜌 dimension can be regarded as a midpoint rule approximation of the integral, since the difference Ξ”πœŒ is the same for all lines in one line family.]

[Figure 4.5: We showed that the maximal angle difference Ξ”πœ™_π‘˜ goes to zero. The discretization in the πœ™ dimension can be viewed as a rectangle approximation rule of the integral, as the summand is evaluated at πœ™_π‘˜, somewhere inside the interval Ξ”πœ™_π‘˜.]

Further, if sup ‖𝑒‖ β†’ 0, it is obvious that the tensor average in (4.8) converges to the tensor in the intersection point.

Consider now the discretization in 𝜌. For each πœ™ parameter, our discretization in the 𝜌 dimension can be regarded as a midpoint rule as shown in Figure 4.4. Thus if sup Ξ”πœŒ β†’ 0, this part of the discretization is consistent.

The discretization in the πœ™ dimension can also be regarded as a version of the rectangle method, although not the midpoint rule. As shown in Figure 4.5, the circle is split into intervals

    [πœ™π‘˜βˆ’1 + πœ™π‘˜2 ,πœ™π‘˜ + πœ™π‘˜+1

    2 ] (4.18)

    of length Ξ”πœ™π‘˜ = (πœ™π‘˜+1 + πœ™π‘˜βˆ’1)/2. The summand is evaluated at πœ™π‘˜, somewhereinside the interval. Thus if sup Ξ”πœ™π‘˜ β†’ 0, this discretization is also consistent.

To show that all these properties can be fulfilled, we look at a particular neighborhood stencil construction. Consider a square centered around a grid point with

  • 40 CHAPTER 4. DISCRETE FORMULATION

[Figure 4.6: To show that we have a consistent discretization of the Cauchy–Crofton integral formula, we construct a discrete set of lines β„’_𝐷 such that the length of the edges ‖𝑒‖, the angle differences Ξ”πœ™ (here π‘Ž and 𝑏) and the distance between lines Ξ”πœŒ go to zero as 𝛿 β†’ 0.]

side lengths βˆšπ›Ώ as shown in Figure 4.6. As 𝛿 goes to zero, the size of this square will go to zero. Inside this square we can fit a square of 𝑛² = ⌊1/βˆšπ›ΏβŒ‹Β² grid points. This means that the number of grid points 𝑛 along the outer edge of this square goes to infinity.

We include all grid points inside the square in our neighborhood, except for multiple points that lie on the same line from the origin. If two or more grid points lie on the same line, we include only the one closest to the origin. This implies that for each grid point along the outer edge of this square, we include in our neighborhood a grid point having the same angle πœ™ to the π‘₯-axis.
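This inclusion rule amounts to keeping exactly the coprime offsets, since on each ray from the origin the closest grid point is the primitive one. A hypothetical Python sketch (the function name is ours; 𝑛 = 5 matches the example below):

```python
# Sketch of the stencil: all grid offsets in an n x n square of grid points,
# keeping only the offset closest to the origin in each direction,
# i.e. offsets (s, t) with gcd(|s|, |t|) = 1.

from math import gcd

def stencil(n: int):
    half = n // 2
    offsets = []
    for s in range(-half, half + 1):
        for t in range(-half, half + 1):
            if (s, t) != (0, 0) and gcd(abs(s), abs(t)) == 1:
                offsets.append((s, t))
    return offsets

print(len(stencil(5)))  # number of neighborhood edges for n = 5: 16
```

For 𝑛 = 5 the multiples (0, Β±2), (Β±2, 0) and (Β±2, Β±2) are dropped, leaving the familiar 16-neighborhood.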

This construction can be seen in Figure 4.6 for 𝑛 = 5. The maximal angle between two lines πœ™_π‘˜ βˆ’ πœ™_{π‘˜βˆ’1} will always occur when one of πœ™_π‘˜ and πœ™_{π‘˜βˆ’1} is horizontal or vertical, shown in Figure 4.6 as the angle π‘Ž. Thus the largest Ξ”πœ™_π‘˜ will be when πœ™_π‘˜ = π‘šπœ‹/2 for π‘š ∈ β„€, so around vertical and horizontal edges. The supremum can then be calculated to be

    sup Ξ”πœ™π‘˜ = 2 β‹… supπœ™π‘˜+1 βˆ’ πœ™π‘˜

    2 = arctan1/𝑛𝑛/2 = arctan

    2𝑛2 β†’ 0. (4.19)

Further, we see that the edge length will be bounded by half of the diagonal of the square, such that

    ‖𝑒‖ ≀ βˆšπ›Ώ/2 β†’ 0. (4.20)And finally we know from Lemma 4.1 that for each line family 𝛿2 = Ξ”πœŒ ‖𝑒‖ and‖𝑒‖ β‰₯ 𝛿. Thus for the inter-line distance Ξ”πœŒπ‘˜ we have

    sup Ξ”πœŒπ‘˜ = sup𝛿2‖𝑒‖ ≀

    𝛿2𝛿 = 𝛿 β†’ 0. (4.21)

  • 4.2. GRAPH CUT APPROACH 41

Hence the approximation has been shown to be equivalent to well-known, consistent integral approximations, where the summand converges to the integrand, and the differences Ξ”πœ™ and Ξ”πœŒ go to zero. Thus the perimeter approximation in (4.5) is consistent with the continuous formulation in Theorem 3.8.

Note that as we will work with digital images of fixed resolution, we do not really have the chance to refine our discretization. We do, however, have to take these things into account when creating our neighborhood stencil, to make sure that we get a reasonable approximation of the perimeter lengths.

4.2 Graph cut approach

The discretization we arrived at in (4.15) can be minimized using graph cuts. For each level πœ†, a minimum graph cut is found to produce the corresponding level set {𝑒 > πœ†}. These are then combined to form the final restored image 𝑒.

In this section we will look at how these graphs are constructed such that their minimum cuts correspond to the minimizers of the functional 𝐹^πœ†. The description is taken with some small adjustments from my project work [1], and is included here for completeness. An implementation of the described approach can be found in Appendix A.

4.2.1 Graphs

Using the notation of [18] we will denote a directed graph as 𝐺 = (𝑉, 𝐸) where 𝑉 is a finite set of vertices, and 𝐸 is a binary relation on 𝑉. If (𝑒, 𝑣) ∈ 𝐸 we say that there is an edge from 𝑒 to 𝑣 in the graph 𝐺.

We introduce the non-negative capacity function 𝑐 ∢ 𝑉 Γ— 𝑉 β†’ [0, ∞). Only edges (𝑒, 𝑣) ∈ 𝐸 can have a positive capacity 𝑐(𝑒, 𝑣) = π‘ž > 0, which means that it is possible to send a flow of at most π‘ž units from 𝑒 to 𝑣. For convenience we will let 𝑐(𝑒, 𝑣) = 0 for any pair (𝑒, 𝑣) βˆ‰ 𝐸, and we do not allow self-loops in our graph. When a directed graph 𝐺 is equipped with a capacity function 𝑐, one might call it a capacitated graph or a flow network, but as all our graphs will be capacitated from this point, we will just call them graphs and write 𝐺 = (𝑉, 𝐸, 𝑐).
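As a hypothetical illustration of this definition (a minimal sketch, not the implementation of Appendix A), a capacitated graph can be represented as:

```python
# Minimal dict-based sketch of a capacitated directed graph G = (V, E, c).
# The class and method names are ours, not from the thesis.

class FlowNetwork:
    def __init__(self):
        self.capacity = {}            # (u, v) -> c(u, v) >= 0

    def add_edge(self, u, v, cap):
        if u == v:
            raise ValueError("self-loops are not allowed")
        if cap < 0:
            raise ValueError("capacities must be non-negative")
        self.capacity[(u, v)] = self.capacity.get((u, v), 0) + cap

    def c(self, u, v):
        # c(u, v) = 0 for any pair (u, v) not in E, by convention
        return self.capacity.get((u, v), 0)

G = FlowNetwork()
G.add_edge("s", "a", 3)
G.add_edge("a", "t", 2)
print(G.c("s", "a"), G.c("a", "s"))  # prints: 3 0
```

Note that the capacity is directional: an edge from 𝑒 to 𝑣 says nothing about the reverse pair (𝑣, 𝑒), which keeps the default capacity 0.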

There are two special vertices in the graph, the source 𝑠 and the sink 𝑑. Contrary to other vertices, which can neither produce nor receive excess flow, the source can produce and the sink can receive an unlimited amount of flow. The most basic problem in graph flow theory is

