Valentina, Giorgetti. "Ill-Posed Problems in Computer Vision." Doctoral thesis, 2022. http://hdl.handle.net/2158/1274371.
Abstract:
The visual reconstruction problems have often an ill-posed nature. In this thesis we deal with
analyzing and solving three kinds of visual reconstruction problems: Blind Source Separation,
Demosaicing and Deblurring.
The demosaicing problem is related to the acquisition of RGB color images by means of CCD
digital cameras. In the RGB model, each pixel of a digital color image is associated to a triple
of numbers, which indicate the light intensity of the red, green and blue channel, respectively.
However, most cameras use a single sensor, associated with a color filter that allows only the
measure at each pixel of the reflectance of the scene at one of the three colors, according to
a given scheme or pattern, called Color Filter Array (CFA). For this reason, at each pixel, the
other two missing colors should be estimated. Different CFA’s are proposed for the acquisition.
The most common is the Bayer pattern. In this scheme,
the numbers of pixels in which the green color is sampled are double with respect to those
associated with the red and blue channels, because of the higher sensibility of the human eye
to the green wavelengths. If we decompose the acquired image into three channels, we obtain
three downsampled grayscale images, so that demosaicing could be interpreted as interpolating
grayscale images from sparse data. In most cameras, demosaicing is a part of the processing
required to obtain a visible images. The camera’s built-in-firmware is substantially based on fast
local interpolation algorithms.
The heuristic approaches, which do not try to solve an optimization problem defined in math-
ematical terms, are widely used in the literature. These methods, in general, are very fast. Our
proposed technique is of heuristic kind. In general, the heuristic techniques consist of filtering
operations, which are formulated by means of suitable observations on color images. The non-
adaptive algorithms, among which bilinear and bicubic interpolation, yield satisfactory results
in smooth regions of an image, but they can fail in textured or edge areas. Edge-directed in-
terpolation is an adaptive approach, where, by analyzing the area around each pixel, we choose
the possible interpolation direction. In practice, the interpolation direction is chosen to avoid
interpolating across the edges.
The algorithm here presented consists of three steps. The first two ones are initialization
steps, while the third one is an iterative steps. In the first one, the missing valued in the green
component are determined, in particular a weighted average-type technique is used. The weights
are determined in an edge-directed approach, in which we consider also the possible edges in the
red and blue components. In the second step, we determine the missing values in the red and
blue components. In this case we use two alternative techniques, according to the position of
the involved pixel in the Bayer pattern. In the first technique, the missing value is determined
by imposing that the second derivative of the intensity value of the red/blue channel is equal
to the second derivative of the intensity values of the green channel. This is done according to
the proposed approaches in the AP algorithm and the regularization algorithm.
In particular, a constraint is imposed, to get the derivatives of all channels similar as soon
as possible. At the third step, all values of the three channels are recursively updated, by means
of a constant-hue-based technique. In particular, we assume the constant color difference. The
technique we propose at this step is similar to that used by W. T. Freeman. Indeed, even
here a median filter is employed, in order to correct small spurious imperfections. We repeat
iteratively the third step. However, to avoid increasing excessively the computational cost, we
experimentally estimate that only four iterations are necessary to obtain an accurate demosaicing.
We call our technique as Local Edge Preserving (LEP) algorithm. The results related to this
technique have been published in A. Boccuto, I. Gerace, V. Giorgetti and M. Rinaldi, A Fast Algorithm for the Demosaicing Problem Concerning the Bayer Pattern. The Open Signal Processing Journal 6 (2019), 1–14.
In this thesis, we also propose an algorithm for image demosaicing that does not work within
the framework of the regularization approaches and is suited, in a natural way, to deal with noisy
data. More precisely, we propose an algorithm for joint demosaicing and denoising. Regular-
ization requires the adoption of constraints for the solution. The constraints we consider are
intra-channel and inter-channel local correlation. With respect to the intra-channel correlation,
we assume the intensity of each channel to be locally regular, i.e. piecewise smooth, so that also
noise can be removed. We describe this constraint through stabilizers that are functions discour-
aging intensity discontinuities of first, second and third order in a selective way, so that those
associated to truly edges in the scene are left to emerge. This allows to describe scenes even
very complex. Indeed, first order local smoothness characterizes images consisting of constant
patches, second order local smoothness describes patches whose pixels have values varying lin-
early, while third order local smoothness is used to represent images made up of quadratic-valued
patches. As per the inter-channel correlation, we enforce it in correspondence with the intensity
discontinuities, by means of constraints that promote their amplitude in the three channels to be
equal almost everywhere.
Note that all these constraints are by no means biased in favor of one of the three channels,
nor the geometry of the sampling pattern is in any way exploited. Thus, the method we propose
is completely independent of the CFA considered, although, in the experimental result section,
we present its application to images mosaiced through the Bayer CFA.
All the above constraints, including the data fidelity term, are merged in a non-convex en-
ergy function, whose minimizer is taken as our desired solution. The optimization is performed
through an iterative deterministic algorithm entailing the minimization in a sequence of a family
of approximating functions that, starting with a first componentwise convex function, gradually
converges to the original energy.
Our regularization approach can produce image solutions that exhibit reliable discontinuities
of both the intensity and the gradients, despite the necessary smoothness constraints. There-
fore, we propose an edge-preserving regularization approach, which means that the significant
discontinuities in the reconstructed image are geometrically consistent. In the very first works
proposing edge-preserving regularization, the image discontinuities were often represented by
means of extra, explicit variables, the so-called “line processes”. In that way, it was
relatively easy to formulate in terms of constraints the various properties required by significant
discontinuities. Nevertheless, the use of explicit line variables entails large computational costs.
Thus, so-called “duality theorems” were derived to demonstrate the edge-preserving
properties of suitable stabilizers, without introducing extra variables. In particular,
we developed duality theorems to determine the properties required for a stabilizer to implicitly
manage lines with the desired regularity features. In this work, we choose a suitable family of
approximations with the peculiarity that each function satisfies the conditions required for an im-
plicit treatment of geometrically significant edges, as expressed in the duality theorems.
This allows a better adherence of the approximations to the ideal energy function, with a
consequent better coherence with the properties required for the desired solution.
In this thesis we also study a Blind Source Separation (BSS) problem. These topics have been
widely investigated since the end of the last century, and have various applications.
In particular, we analyze the digital reconstruction of degraded documents. We observe that
weathering, powder, humidity, seeping of ink, mold and light transmission can determine the
degradation of the paper and the ink of written text. Some of the consequences in damaged
documents are, for instance, stains, noise, transparency of writing on the verso side and on the
close pages, unfocused or overlapping characters, and so on. Historically, the first techniques
of restoration for degraded documents were manual, and they led to a material restoration. Re-
cently, thanks to the diffusion of scanners and software for reconstruction of images, videos,
texts, photographs and films, several new techniques were used in the recovery and restoration
of deteriorated material, like for instance digital or virtual restoration. Digital imaging for doc-
uments is very important, because it allows to have digital achieves, to make always possible
the accessibility and the readability. The Digital Document Restoration consists of a set of pro-
cesses finalized to the visual and aesthetic improvement of a virtual reconstruction of a corrupted
document, without risk of deterioration.
We deal with show-through and bleed-through effects. The show-through is a front-to-back
interference, caused by the transparency of the paper and the scanning process, and by means of
which the text in the recto side of the document can appear also in the verso side, and conversely.
The bleed-through is an intrinsic front-to-back physical deterioration caused by ink seeping, and
its effect is similar to that of show-through. The physical model for the show-through distortion,
is very complex, because there are the spreading of light in the paper, the features of the paper, the
reflectance of the verso and the transmittance parameters. Sharma gave a mathematical
model was first analyzed and then further approximated so to become easier to handle. This
model describes the observed recto and verso images as mixtures of the two uncorrupted texts.
Locally, we consider a classical linear and stationary recto-verso model developed for
this purpose, and are concerned with the problem of estimating both the
ideal source images of the recto and the verso of the document and the mixture matrix producing
the bleed-through or show-through effects. This problem is ill-posed in the sense of Hadamard.
In fact, as the estimated mixture matrix varies, the corresponding estimated
sources are in general different, and thus infinitely many solutions exist. Many techniques to
solve this ill-posed inverse problem have been proposed in the literature. Among them, the
Independent Component Analysis (ICA) methods are based on the assumption that the sources
are mutually independent. The best-known ICA technique is the so-called FastICA,
which by means of a fixed point iteration finds an orthogonal rotation of the prewhitened data that maximizes a measure of non-Gaussianity of the rotated components. The FastICA algorithm is a parameter-free and extremely fast procedure, but ICA is not a viable approach in our setting, as for the problem we consider there is a clear correlation
among the sources. On the other hand, several techniques for ill-posed inverse problems require
that the estimated sources are only mutually uncorrelated. In this case, the estimated sources
are determined via a linear transformation of the data, which is obtained by imposing either an
orthogonality condition, as in Principal Component Analysis (PCA),
or an orthonormality condition, as in Whitening (W) and Symmetric Whitening (SW) techniques.
These approaches all require only a single and very fast processing step.
In [49, 156] it is observed that the results obtained by means of the SW method are substantially
equivalent to those produced by an ICA technique in the symmetric mixing case.
Here we assume that the sum of all rows of the mixing matrix is equal to one, since we expect
the color of the background of the source to be the same as that of the data. In our setting, we
change the variables of the data so that high and low light intensities correspond to presence
and absence of text in the document, respectively, and we impose a nonnegativity constraint on
the estimated sources. We define the overlapping matrix of both
the observed data and the ideal sources, a quantity related to the cross-correlation between the
signals. From the overlapping matrix we can deduce the overlapping level, which measures the
similarity between the front and the back of the document.
In order to obtain an accurate estimate of the sources, it is necessary to determine a correct
source overlapping level. To this aim, we propose the following iterative procedure. At each it-
eration, given the current source overlapping level, we estimate the mixture matrix that produces
the sources with the lowest possible source overlapping level among those having light intensity
in the desired range. This mixture matrix is computed by means of a suitable symmetric factor-
ization of the data overlapping matrix. We then use the estimated sources to update the source
overlapping level, and iterate the procedure until a fixed point is reached. At the fixed point,
the corresponding source overlapping level is the smallest one that allows to estimate the ideal
recto and verso sides with the desired properties. We consider this level as an adequate estimate
of the ideal source overlapping level. Thus, by means of this technique, we can estimate not
only the ideal sources and the mixture matrix, but also the source overlapping level, a value that
indicates the correlation between the ideal sources. Therefore, our method can be classified as
a Correlated Component Analysis (CCA) technique. We refer to
this method as the Minimum Amount of Text Overlapping in Document Separation (MATODS)
algorithm. Similarly to the FastICA technique, the MATODS algorithm is a parameter-free and
extremely fast procedure. We use the MATODS algorithm to solve the non-stationary and locally
linear model we propose, and in particular we present an extension of this technique that fits this
model, which we call the Not Invariant for Translation MATODS (NIT-MATODS) algorithm.
The related results have been published in A. Boccuto, I. Gerace and V. Giorgetti, A Blind Source
Separation Technique for Document Restoration. SIAM J. Imaging Sci. 12 (2) (2019), 1135–1162.
In this thesis we modify the MATODS algorithm to deal with the derivatives of the images of
the original sources. In this case, we assume that the overlapping level is equal to zero. By means
of our experimental results, we show that the proposed technique improves the results obtained
by MATODS in terms both of accuracy of the estimates and of computational costs. We refer
to this method as the Zero Edge Overlapping in Document Separation (ZEODS) algorithm. The
obtained results are published in A. Boccuto, I. Gerace, V. Giorgetti and G. Valenti, A Blind Source Separation Technique for Document Restoration Based on Edge
Estimation. http://viXra.org/abs/2201.0141 (2022).
In [148], Sharma gave a mathematical model was first analyzed and then further approxi-
mated so to become easier to handle. This model describes the observed recto and verso images
as mixtures of the two uncorrupted texts.
Now we analyze in detail the iterative technique to solve such a model, in which the
sources, the blur operators and the interference level are computed separately at every step, until
a fixed point is found. In this work, in particular, we deal with determining the interference level,
by fixing the blur operators and the ideal sources. To this aim, we use a GNC-type technique. In forthcoming papers, the steps
about finding the blur operators and the ideal sources will be treated. The results concerning such
a technique have been published in A. Boccuto, I. Gerace and V. Giorgetti, Blind Source Separation in Document Restoration: an Interference Level Estimation. http://viXra.org/abs/2201.0050 (2022).
The problem of restoring images consists of estimating the original image, starting from the
observed image and the supposed blur. In our model, we suppose to know the blur mask. In
general, this problem is ill–conditioned and/or ill–posed in the Hadamard sense.
Thanks to known regularization techniques, it is possible to reduce this
problem to a well–posed problem, whose solution is the minimum of the so-called primal energy
function, which consists of the sum of two terms. The former indicates the faithfulness of the
solution to the data, and the latter is in connection with the regularity properties of the solution. In order to obtain more realistic restored images, the discontinuities in the
intensity field is considered. Indeed, in images of real scenes, there are some dis-
continuities in correspondence with edges of several objects. To deal with such discontinuities,
we consider some line variables. It is possible to minimize a priori the primal
energy function in these variables, to determine a dual energy function,
which treats implicitly discontinuities. Indeed, minimizing the dual energy function is more
computationally efficient than minimizing directly the primal energy function. In general, the
dual energy function has a quadratic term, related to the faithfulness with the data, and a not
necessarily convex addend, the regularization term. In order to link these two kinds of energy
functions, some suitable duality theorems are used.
In order to improve the quality of the reconstructed images, it is possible to consider a dual
energy function which implicitly treats Boolean line variables. The proposed duality theorems
can be used even with such a function. However, the related dual energy function is not neces-
sarily convex. So, to minimize it, we use a GNC-type technique, which considers as first convex
approximation the proposed convex dual energy function.
It is possible to verify experimentally that the more expensive minimization is the first one,
because the other ones just start with a good approximation of the solution. Hence, when we
minimize the first convex approximation, we will approximate every block of the blur operator
by matrices whose product can be computed by a suitable fast discrete transform. As every block
is a symmetric Toeplitz matrix, we deal with determining a class of matrices easy to handle from
the computational point of view, which yield a good approximation of the Toeplitz matrices.
Toeplitz-type linear systems arise from numerical approximation of differential equations.
Moreover, in restoration of blurred images, it is often dealt with Toeplitz matrices.
Thus, in this thesis we investigate a particular class, which is a sum of two families
of simultaneously diagonalizable real matrices, whose elements we call β-matrices. Such a class
includes both circulant and reverse circulant matrices. Symmetric circulant matrices have several
applications to ordinary and partial differential equations, images and
signal restoration, graph theory. Reverse
circulant matrices have different applications, for instance in exponential data fitting and signal
processing. The obtained results have been published in A. Boccuto, I. Gerace and V. Giorgetti, Image deblurring: a class of matrices approximating Toeplitz matrices http://viXra.org/abs/2201.0155 (2022).
The thesis is structured as follows. In Chapter 1 we deal with the demosaicing problem,
proposing a fast technique which locally estimates the edges. In Chapter 2 we treat the same
problem, by giving a regularization technique for solving it. In Chapter 3 we consider the BSS
problem for ancient documents, proposing a technique which uses symmetric factorizations. In
Chapter 4 we modify the technique illustrated in the previous chapter, by introducing disconti-
nuities. In Chapter 5 we deal with the BSS problem, by giving a regularization technique, and
in particular we study the estimates of the interference levels. In Chapter 6 we treat the prob-
lem of image deblurring, and in particular we analyze how symmetric Toeplitz operators can be
approximated in the proposed GNC technique.