Theses: « Visual image reconstruction »

1

Duraisamy, Prakash. « 3D Reconstruction Using Lidar and Visual Images ». Thesis, University of North Texas, 2012. https://digital.library.unt.edu/ark:/67531/metadc177193/.

Abstract:

In this research, multi-perspective image registration using LiDAR and visual images was considered. 2D-3D image registration is a difficult task because it requires the extraction of different semantic features from each modality. This problem is solved in three parts. The first step involves detection and extraction of common features from each of the data sets. The second step consists of associating the common features between the two different modalities. Traditional methods use lines or orthogonal corners as common features. The third step consists of building the projection matrix. Many existing methods use a global positioning system (GPS) or an inertial navigation system (INS) for an initial estimate of the camera pose. However, the approach discussed herein does not use GPS, INS, or any such device for the initial estimate; hence the model can be used in places like the lunar surface or Mars, where GPS or INS are not available. A variation of the method is also described, which does not require strong features from both images but rather uses intensity gradients in the image. This can be useful when one image does not have strong features (such as lines) or when there are too many extraneous features.
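
As an illustration of the third step (building the projection matrix), here is a minimal sketch that assumes a standard pinhole model P = K[R | t]. The intrinsics `K`, the pose `R`, `t`, and the helper names are hypothetical values chosen for this example; the thesis estimates the pose from matched features, which this sketch does not attempt.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def projection_matrix(K, R, t):
    """Build the 3x4 matrix P = K [R | t] (assumed pinhole camera model)."""
    Rt = [R[i] + [t[i]] for i in range(3)]  # append t as a fourth column
    return mat_mul(K, Rt)


def project(P, point):
    """Project a 3D point (e.g. a LiDAR return) to pixel coordinates."""
    X = list(point) + [1.0]  # homogeneous coordinates
    u, v, w = (sum(P[i][j] * X[j] for j in range(4)) for i in range(3))
    return (u / w, v / w)


# Hypothetical camera: 800 px focal length, principal point (320, 240),
# identity rotation, scene origin 5 units in front of the camera.
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 5.0]

P = projection_matrix(K, R, t)
centre = project(P, (0.0, 0.0, 0.0))  # lands on the principal point (320, 240)
offset = project(P, (1.0, 0.0, 0.0))  # shifted 160 px to the right: (480, 240)
```

Once P is known, every 3D point can be mapped into the image in this way, which is what makes the 2D-3D feature association checkable.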

2

He, Peng. « Image-based reconstruction and visual hull from imprecise input ». Thesis, Imperial College London, 2012. http://hdl.handle.net/10044/1/10005.

Abstract:

Image-based reconstruction is a series of computer vision processes that takes 2D images of a scene as input and outputs an approximate geometric shape of the scene. It has vast applications in industrial design, manufacturing, gaming, filming, heritage protection and many other areas. The visual hull of a polyhedral (or polygonal) scene in R3 (or R2) is the best 3D (or 2D) shape that one can retrieve from its silhouettes. It has great advantages in obstacle avoidance, robotic navigation, 3D model acquisition and human motion tracking. A 3D visual hull is bounded by planes and quadratic surfaces. Classical image-based reconstruction and visual hull methods fail to maintain exactness and robustness when the input is imprecise. In the solid domain of 3D objects in R3, geometric shapes with imprecision are well modelled and carefully studied. Each partial geometric object is defined by two disjoint open sets: interior and exterior. The interior (respectively, exterior) is an open set that contains all the points definitely known to be inside (respectively, outside) the object. Partial objects, ordered by subset inclusion, form a continuous Scott domain in which each object approximates the target object at a certain level of precision. We study image-based reconstruction and the visual hull in the solid domain, which allows the notions of the partial polyhedron and the partial visual hull. They capture the imprecision in the input polyhedral scenes and output exact information about which points are definitely inside or outside the reconstructed scene and the visual hull. The partial image-based reconstruction and the partial visual hull algorithms maintain the same computational complexity as the corresponding classical methods. The outputs of the algorithms are partial objects or partial visual hulls which converge to their classical counterparts as the input converges to an exact value. 
For the image-based reconstruction and the 2D visual hull, we show that their construction processes with imprecise input are Hausdorff and Scott continuous. For the 3D visual hull algorithm, we show the Hausdorff and Scott continuity of the domain-theoretic construction in the solid domain of the projective 3-space P3. Furthermore, we prove the computability of the image-based reconstruction and of the 2D and 3D visual hulls.

3

Grauman, Kristen Lorraine 1979. « A statistical image-based shape model for visual hull reconstruction and 3D structure inference ». Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/87347.

Abstract:

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003.
Includes bibliographical references (p. 69-72).
by Kristen Lorraine Grauman.
S.M.

4

Ozcelik, Furkan. « Déchiffrer le langage visuel du cerveau : reconstruction d'images naturelles à l'aide de modèles génératifs profonds à partir de signaux IRMf ». Electronic Thesis or Diss., Université de Toulouse (2023-....), 2024. http://www.theses.fr/2024TLSES073.

Abstract:

The great minds of humanity have always been curious about the nature of mind, brain, and consciousness. Through physical and thought experiments, they tried to tackle challenging questions about visual perception. As neuroimaging techniques were developed, neural encoding and decoding techniques provided a deeper understanding of how we process visual information. Advancements in artificial intelligence and deep learning have also influenced neuroscientific research. With the emergence of deep generative models like Variational Autoencoders (VAE), Generative Adversarial Networks (GAN) and Latent Diffusion Models (LDM), researchers also used these models in neural decoding tasks such as visual reconstruction of perceived stimuli from neuroimaging data. The current thesis provides two frameworks for reconstructing perceived stimuli from neuroimaging data, particularly fMRI data, using deep generative models. These frameworks focus on different aspects of the visual reconstruction task than their predecessors, and hence they may bring valuable outcomes for the studies that will follow. The first study of the thesis (described in Chapter 2) utilizes a particular generative model called IC-GAN to capture both semantic and realistic aspects of the visual reconstruction. The second study (described in Chapter 3) brings a new perspective on visual reconstruction by fusing decoded information from different modalities (e.g. text and image) using recent latent diffusion models. These studies achieve state-of-the-art results on their benchmarks by exhibiting high-fidelity reconstructions of different attributes of the stimuli. In both of our studies, we propose region-of-interest (ROI) analyses to understand the functional properties of specific visual regions using our neural decoding models. Statistical relations between ROIs and decoded latent features show that while early visual areas carry more information about low-level features (which focus on layout and orientation of objects), higher visual areas are more informative about high-level semantic features. We also observed that ROI-optimal images generated with these visual reconstruction frameworks are able to capture functional selectivity properties of the ROIs that have been examined in many prior studies in neuroscientific research. Our thesis attempts to bring valuable insights for future studies in neural decoding, visual reconstruction, and neuroscientific exploration using deep learning models by providing the results of two visual reconstruction frameworks and ROI analyses. The findings and contributions of the thesis may help researchers working in cognitive neuroscience and have implications for brain-computer-interface applications.

5

Anliot, Manne. « Volume Estimation of Airbags : A Visual Hull Approach ». Thesis, Linköping University, Department of Electrical Engineering, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-421.

Abstract:

This thesis presents a complete and fully automatic method for estimating the volume of an airbag, through all stages of its inflation, with multiple synchronized high-speed cameras.

Using recorded contours of the inflating airbag, its visual hull is reconstructed with a novel method: The intersections of all back-projected contours are first identified with an accelerated epipolar algorithm. These intersections, together with additional points sampled from concave surface regions of the visual hull, are then Delaunay triangulated to a connected set of tetrahedra. Finally, the visual hull is extracted by carving away the tetrahedra that are classified as inconsistent with the contours, according to a voting procedure.

The volume of an airbag's visual hull is always larger than the airbag's real volume. By projecting a known synthetic model of the airbag into the cameras, this volume offset is computed, and an accurate estimate of the real airbag volume is extracted.

Even though volume estimates can be computed for all camera setups, the cameras should be specially posed to achieve optimal results. Such poses are uniquely found for different airbag models with a separate, fully automatic, simulated annealing algorithm.

Satisfactory results are presented for both synthetic and real-world data.
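
The claim that a visual hull overestimates the true volume, and the idea of calibrating that offset on a known synthetic model, can be illustrated with a small sketch. Everything here is an assumption made for the example rather than the thesis's method: the object is a unit sphere, the three cameras are orthographic and axis-aligned, volumes are counted on a voxel grid instead of carved tetrahedra, and `measured_hull_volume` is a made-up number.

```python
def in_sphere(x, y, z, r):
    """True if the point lies inside the synthetic model (a sphere)."""
    return x * x + y * y + z * z <= r * r


def in_visual_hull(x, y, z, r):
    """True if the point projects inside all three silhouettes.

    Assumed setup: orthographic views along the x, y and z axes, so each
    silhouette of the sphere is a disc of radius r.
    """
    return (y * y + z * z <= r * r and
            x * x + z * z <= r * r and
            x * x + y * y <= r * r)


def voxel_volume(predicate, r, n):
    """Estimate a volume by counting voxel centres that satisfy predicate."""
    step = 2.0 * r / n
    centres = [-r + (i + 0.5) * step for i in range(n)]
    count = sum(1 for x in centres for y in centres for z in centres
                if predicate(x, y, z, r))
    return count * step ** 3


radius = 1.0
model_volume = voxel_volume(in_sphere, radius, 40)
hull_volume = voxel_volume(in_visual_hull, radius, 40)

# Ratio learned on the synthetic model, then applied to a measured hull.
correction = model_volume / hull_volume
measured_hull_volume = 9.0  # hypothetical measurement, arbitrary units
estimated_volume = correction * measured_hull_volume
```

With only three views the hull is noticeably larger than the sphere, so `correction` is below 1; adding views would shrink the hull and push the ratio towards 1.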

6

Naouai, Mohamed. « Localisation et reconstruction du réseau routier par vectorisation d'image THR et approximation des contraintes de type "NURBS" ». Phd thesis, Université de Strasbourg, 2013. http://tel.archives-ouvertes.fr/tel-00994333.

Abstract:

This thesis aims to build a system for extracting road networks in urban areas from very-high-resolution satellite images. In this context, we proposed two road localization methods. The first approach is based on converting the image to a vector format. Its originality lies in the use of a geometric method to move to a vector representation of the original image, and in a logical formalism based on a set of perceptual criteria that filters out useless information and extracts linear structures. In the second approach, we proposed an algorithm based on wavelet theory that particularly emphasizes the multi-resolution and multi-direction aspects. We therefore propose a road localization approach that uses the multi-directional frequency information from the Log-Gabor wavelet transform. In the localization step, we presented two road detectors that exploit radiometric, geometric, and frequency information. However, this information does not yield an exact and precise result. To remedy this problem, a tracking algorithm is necessary. We propose reconstructing road networks with NURBS curves. This approach is based on a set of landmark points identified in the localization phase. It introduces a new concept, which we call NURBSC, based on the geometric constraints of the shapes to be approximated. We connect the identified road segments to obtain continuous paths that follow the roads.

7

Féraud, Thomas. « Rejeu de chemin et localisation monoculaire : application du Visual SLAM sur carte peu dense en environnement extérieur contraint ». Phd thesis, Université Blaise Pascal - Clermont-Ferrand II, 2011. http://tel.archives-ouvertes.fr/tel-00697028.

Abstract:

In outdoor mobile robotics, localization and perception are at the heart of any implementation. The work carried out in this thesis therefore aims to make existing localization processes more robust without notably increasing their complexity. The problem places a robot in a potentially dangerous environment, with the goal of following a trajectory established as safe using a map that is as simple as possible. In addition, strong constraints are imposed both on the implementation (an inexpensive, undetectable system) and on the result (real-time execution and localization that stays at all times within a tolerance of 10 cm around the reference trajectory). The exteroceptive sensor chosen for this project is a camera, while the vehicle pose is estimated at each instant by an extended Kalman filter. The main estimation problems lie in the non-linearity of the observation models, and the contributions provide some solutions: - an exact method for propagating uncertainties from world space to sensor (camera) space; - a method for detecting the main cases of Kalman filter divergence when computing the update phase; - a method for correcting the Kalman gain. The project had two objectives: to build a localization function that meets the strong constraints mentioned above, and to allow a vehicle to leave the reference trajectory temporarily when the operator takes control, and then resume the normal course of its mission as close as possible to the reference trajectory. This second part involves a broader framework in which, in addition to localization, the environment must be mapped. This problem, known by the acronym SLAM (Simultaneous Localization And Mapping), connects to the last two contributions of this thesis: - a method for initializing the points that will make up the SLAM map; - a method for maintaining consistency between the reference map and the SLAM map. Results on real data, supporting each of the contributions, are presented and illustrate the achievement of the two main objectives.
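
The abstract above attributes the main estimation difficulties to non-linear observation models in an extended Kalman filter. As a rough illustration of the textbook update phase only, here is a scalar sketch. The range-to-landmark observation model, the noise values, and the function name `ekf_update` are assumptions for this example; it does not reproduce the thesis's exact uncertainty propagation, divergence detection, or gain correction.

```python
import math


def ekf_update(x_prior, p_prior, z, landmark_offset, r_noise):
    """One scalar extended-Kalman-filter update.

    Assumed observation model: the sensor measures the distance
    h(x) = sqrt(x^2 + landmark_offset^2) to a landmark, which is
    non-linear in the state x (position along a track).
    """
    predicted = math.hypot(x_prior, landmark_offset)  # h(x_prior)
    h_jac = x_prior / predicted                       # dh/dx at x_prior
    innovation = z - predicted                        # measurement residual
    s = h_jac * p_prior * h_jac + r_noise             # innovation variance
    gain = p_prior * h_jac / s                        # Kalman gain
    x_post = x_prior + gain * innovation
    p_post = (1.0 - gain * h_jac) * p_prior
    return x_post, p_post


# Hypothetical scenario: true position 2.0, landmark 1.0 to the side,
# noise-free measurement, prior guess 1.5 with variance 1.0.
true_x = 2.0
z = math.hypot(true_x, 1.0)
x_post, p_post = ekf_update(1.5, 1.0, z, 1.0, 0.01)
```

Because the Jacobian is evaluated at the prior estimate, a poor prior gives a poor linearization; that is the general kind of failure that divergence checks and gain corrections are meant to catch.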

8

North, Peter R. J. « The reconstruction of visual appearance by combining stereo surfaces ». Thesis, University of Sussex, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.362837.

9

Ebrahimi, Shahin. « Contribution to automatic adjustments of vertebrae landmarks on x-ray images for 3D reconstruction and quantification of clinical indices ». Thesis, Paris, ENSAM, 2017. http://www.theses.fr/2017ENAM0050/document.

Abstract:

Exploitation of spine radiographs, in particular for 3D spine shape reconstruction of scoliotic patients, is a prerequisite for personalized modelling. Current methods, even though robust enough to be used in clinical routine, still rely on tedious manual adjustments. In this context, this PhD thesis aims at automated detection of specific vertebral landmarks in spine radiographs, enabling automated adjustments. In the first part, we developed an original Random Forest based framework for vertebra corner localization that was applied to sagittal radiographs of both the cervical and lumbar spine regions. A rigorous evaluation confirms the robustness and high accuracy of the proposed method. In the second part, we developed an algorithm for the clinically important task of pedicle localization in the thoracolumbar region on frontal radiographs. The proposed algorithm compares favourably to similar methods from the literature while relying on less manual supervision. The last part of this PhD tackled the scarcely studied task of joint detection, identification and segmentation of the spinous processes of cervical vertebrae in sagittal radiographs, again with high precision. All three algorithmic solutions were designed around a generic framework exploiting dedicated visual feature descriptors and multi-class Random Forest classifiers, proposing a novel solution whose computational and manual supervision burdens are aimed at translation into clinical use. Overall, the presented frameworks show great potential for integration into current spine 3D reconstruction frameworks that are used in daily clinical routine.

10

Haouchine, Nazim. « Image-guided simulation for augmented reality during hepatic surgery ». Thesis, Lille 1, 2015. http://www.theses.fr/2015LIL10009/document.

Abstract:

The main objective of this thesis is to provide surgeons with tools for pre- and intra-operative decision support during minimally invasive hepatic surgery. These interventions are usually based on laparoscopic techniques or, more recently, flexible endoscopy. During such operations, the surgeon tries to remove an often large number of liver tumors while preserving the functional role of the liver. This involves defining an optimal hepatectomy, i.e. one that ensures a post-operative liver volume of at least 55% of the original liver and preserves the hepatic vasculature as well as possible. Although intervention planning can now be considered on the basis of preoperative patient-specific data, the significant movements of the liver and its deformations during surgery make this planning very difficult to use in practice. The work proposed in this thesis aims to provide augmented reality tools that can be used in intra-operative conditions to visualize the position of tumors and hepatic vascular networks at any time.

11

Koehler, Ana Luiza Goulart. « Retraçando os becos de Porto Alegre : visualizando a cidade invisível ». Biblioteca Digital de Teses e Dissertações da UFRGS, 2015. http://hdl.handle.net/10183/139940.

Abstract:

The present study aims to recover the images and the form of the city of Porto Alegre in the past, and also to bring to light its “invisible” spaces: the city of the alleyways, seen as dens of poverty, criminality and disease in the heart of the city center. To this end, data retrieved from sources such as texts, photographs, maps and municipal documents are analysed in the light of the theory of Cultural History, providing the basis for the visual reconstruction of these lost city spaces in the form of drawings and sketches.

12

Kröber, Cindy, Kristina Friedrichs and Nicole Filz. « HistStadt4D – A four dimensional access to history ». TUDpress, 2016. https://tud.qucosa.de/id/qucosa%3A33991.

Abstract:

Purpose – We propose a multidisciplinary approach based on an extensive database which provides digitized photographic material from the end of the 19th century up to recent times. Thus a large amount of photographic evidence will be exploited, structured and enriched by additional sources to serve as a foundation for an application relying on 3D visualizations. The application addresses scholars as well as the general public and will provide different kinds of information and tools for research and knowledge transfer. Design/methodology/approach – The method applied will be diachronic: the virtual model may show one point in urban history depicting a certain state of past Dresden and also its development through the various eras. In addition, the method works in a dualistic mode: on the one hand, the physical development of the urban area will be explored and presented in detail; on the other hand, the analysis of the pictures will give profound insights into the specific perception of the urban space. Originality/value – This methodology aims to make large repositories more accessible and proactive in information-seeking. Using a 3D application as an access point to media repositories, research tools and functionalities which can improve the scientific handling of the data will be considered. How should the data and information be processed to meet the researcher’s needs? Which information can be retrieved from the visual media? What needs to be considered to ensure scientific standards and motivation while working with the image repositories? Users of the virtual archives can benefit extensively from effective searching functions and tools which work not only content- and theme-based but also location-based. Practical implications – The outcomes of the research will be presented in a 4D browser and available in an Augmented Reality presentation. 
The design will comply with the requirements of the field of application, whether aiming at a scientific, educative or touristic purpose. The paper itself considers three different approaches to the topic highlighting the multidisciplinary strategy and opportunities of the project. The first one considers research questions from art history. The second one reflects on concepts from information science, photogrammetry and computer vision for visualizations and the third one introduces an interaction concept for an AR application for the Zwinger in Dresden.

13

« Locally Adaptive Stereo Vision Based 3D Visual Reconstruction ». Doctoral diss., 2017. http://hdl.handle.net/2286/R.I.44195.

Abstract:

Using stereo vision for 3D reconstruction and depth estimation has become a popular and promising research area, as it has a simple setup with passive cameras and a relatively efficient processing procedure. The work in this dissertation focuses on locally adaptive stereo vision methods and applications to different imaging setups and image scenes. Solder ball height and substrate coplanarity inspection is essential to the detection of potential connectivity issues in semiconductor units. Current ball height and substrate coplanarity inspection tools are expensive and slow, which makes them difficult to use in a real-time manufacturing setting. In this dissertation, an automatic, stereo vision based, in-line ball height and coplanarity inspection method is presented. The proposed method includes an imaging setup together with a computer vision algorithm for reliable, in-line ball height measurement. The imaging setup and calibration, ball height estimation and substrate coplanarity calculation are presented with novel stereo vision methods. The results of the proposed method are evaluated in a measurement capability analysis (MCA) procedure and compared with the ground truth obtained by an existing laser scanning tool and an existing confocal inspection tool. The proposed system outperforms existing inspection tools in terms of accuracy and stability. In a rectified stereo vision system, stereo matching methods can be categorized into global methods and local methods. Local stereo methods are more suitable for real-time processing purposes, with competitive accuracy as compared with global methods. This work proposes a stereo matching method based on sparse locally adaptive cost aggregation. In order to reduce outlier disparity values that correspond to mismatches, a novel sparse disparity subset selection method is proposed by assigning a significance status to candidate disparity values, and selecting the significant disparity values adaptively. 
An adaptive guided filtering method using the disparity subset for refined cost aggregation and disparity calculation is demonstrated. The proposed stereo matching algorithm is tested on the Middlebury and the KITTI stereo evaluation benchmark images. A performance analysis of the proposed method in terms of the ℓ0 norm of the disparity subset is presented to demonstrate the achieved efficiency and accuracy.
Dissertation/Thesis
Doctoral Dissertation Electrical Engineering 2017
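
To make the notion of a local stereo method concrete, here is a minimal baseline sketch on a single rectified scanline: a sum-of-absolute-differences cost aggregated over a fixed box window, followed by winner-take-all disparity selection. This is a generic textbook baseline, not the dissertation's sparse locally adaptive aggregation or guided filtering; the function names and the synthetic rows are assumptions for the example.

```python
def sad_cost(left, right, x, d, radius):
    """Aggregate absolute differences over a box window centred on x.

    Assumes rectified rows, so pixel x in the left row corresponds to
    pixel x - d in the right row for disparity d.
    """
    return sum(abs(left[x + k] - right[x + k - d])
               for k in range(-radius, radius + 1))


def scanline_disparity(left, right, max_disp, radius):
    """Winner-take-all disparity for every pixel with a full window."""
    n = len(left)
    disparities = {}
    for x in range(radius + max_disp, n - radius):
        costs = [sad_cost(left, right, x, d, radius)
                 for d in range(max_disp + 1)]
        disparities[x] = costs.index(min(costs))  # lowest aggregated cost
    return disparities


# Synthetic rows: the right row is the left row shifted by 3 pixels.
true_disp = 3
left_row = [i * i for i in range(30)]
right_row = [left_row[j + true_disp] if j + true_disp < len(left_row) else 0
             for j in range(len(left_row))]

result = scanline_disparity(left_row, right_row, max_disp=6, radius=2)
```

A fixed window treats every neighbour equally, which blurs disparity across object edges; adaptive aggregation schemes weight or select neighbours to address that.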

14

Grauman, Kristen. « A Statistical Image-Based Shape Model for Visual Hull Reconstruction and 3D Structure Inference ». 2003. http://hdl.handle.net/1721.1/7104.

Abstract:

We present a statistical image-based shape + structure model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

15

Khwaja, Asim. « Exploring the visual pathway and its applications to image reconstruction, contrast enhancement and object recognition ». PhD thesis, 2010. http://hdl.handle.net/1885/150688.

Abstract:

The natural world is filled with perfectly working, functional systems that are robust, accurate, and adaptable. This work takes inspiration from them and presents a biologically inspired approach to computer vision, in particular to image reconstruction, contrast enhancement and object recognition. The first half of this thesis takes an exploratory approach, using image reconstruction as an example, towards understanding the visual pathway from the retina to the primary visual cortex (V1), investigating redundancy reduction, information preservation and contrast enhancement. The retina, having approximately 130 million cells, is forced to be selective about the incoming information. Programmed for concision, the primate eye encodes information sparsely, yet remains information-preserving by encoding only contrast. By reconstructing an image from its contrast map pairs using gradient descent least squares error minimization, this work has shown that information is preserved across the optic nerve channel despite sparsification of the input image presented to the photoreceptors. By mimicking the irregularities of the eye's receptive fields, it has been shown that the neural architecture along the visual pathway is robust and fault tolerant against irregularities, a general characteristic of the entire nervous system. Using non-linear and asymmetric gain control with the on- and off-centre contrast map pairs, it has been shown that the mean luminance of an image can be controlled and the aforesaid reconstruction can be used for straightforward enhancement, thus reducing contrast enhancement to a scaling operation over the contrast domain. This has further been successfully applied to colour image contrast enhancement using a number of different models, including the neuro-physiologically proven representation of colour opponency, in the form of colour opponent contrast maps.
With the above work serving as a pre-processing stage, the second half of the thesis approaches the subject of object recognition, improving upon prior work in Sparse Representation Classification (SRC). Sparseness is a key feature of the brain's internal representation, whereby it achieves its robustness and adaptability. This work replaces the mean square error measure for similarity comparison of images with a perceptually compatible structural error measure, and the conventional sparsifiers of the original SRC algorithm with a genetic algorithm. This has resulted in an improved recognition rate, owing in large part to a more effective similarity comparison and improved sparseness of the solution. The approaches to the troika of reconstruction, contrast enhancement and object recognition strengthen both the premise and the belief that biologically inspired vision is dually meritorious and warrants greater appreciation and study by the Computer Vision community at large, rather than being discounted as is often done. The hope is that this work proves a seed for future endeavours.
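
The reconstruction from on- and off-centre contrast map pairs by gradient descent least squares minimization, mentioned in the first half of this abstract, can be illustrated with a minimal sketch. It is not the thesis's implementation: the 1-D circular signal, the centre-minus-surround operator `contrast`, the `surround_weight` of 0.5, the step size and the iteration count are all assumptions chosen so that the toy problem is well conditioned.

```python
def contrast(signal, surround_weight=0.5):
    """Centre-minus-surround response on a circular 1-D signal.

    Hypothetical stand-in for a retinal contrast operator: each sample
    minus a weighted 3-tap local mean.
    """
    n = len(signal)
    out = []
    for i in range(n):
        surround = (signal[i - 1] + signal[i] + signal[(i + 1) % n]) / 3.0
        out.append(signal[i] - surround_weight * surround)
    return out


def split_on_off(c):
    """Split a signed contrast map into non-negative on/off maps."""
    on = [max(v, 0.0) for v in c]
    off = [max(-v, 0.0) for v in c]
    return on, off


def reconstruct(on, off, steps=300, lr=0.7, surround_weight=0.5):
    """Minimise 0.5 * ||A x - c||^2 by gradient descent, with c = on - off."""
    target = [p - q for p, q in zip(on, off)]
    x = [0.0] * len(target)
    for _ in range(steps):
        response = contrast(x, surround_weight)
        residual = [a - b for a, b in zip(response, target)]
        # The assumed operator A is symmetric, so A^T r equals A r.
        grad = contrast(residual, surround_weight)
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x


signal = [0.2, 0.8, 0.5, 0.1, 0.9, 0.4, 0.3, 0.7]
on_map, off_map = split_on_off(contrast(signal))
recovered = reconstruct(on_map, off_map)
print(max(abs(a - b) for a, b in zip(signal, recovered)))
```

In this toy setting the operator is invertible, so the signal is recovered up to numerical error. A pure centre-surround operator with zero response to uniform input would lose the mean luminance, which is one reason a separate gain control, as described in the abstract, becomes relevant.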

16

Barnard, Gerrit. « High quality coding and reconstruction for transmission of single video images ». Diss., 1990. http://hdl.handle.net/2263/29164.

Abstract:

Please read the abstract in the section 00front of this document. Copyright 1990, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. Please cite as follows: Barnard, G 1990, High quality coding and reconstruction for transmission of single video images, MEng dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://upetd.up.ac.za/thesis/available/etd-10312007-110001/>
Dissertation (M Eng (Electronic Engineering))--University of Pretoria, 2007.
Electrical, Electronic and Computer Engineering
unrestricted

17

Chang, Yao-wen, and 張耀文. « Retinotopic Mapping Using Multi-Focal Functional MRI : Visual Image Reconstruction of Brain Activities and its Optimization method ». Thesis, 2012. http://ndltd.ncl.edu.tw/handle/80867299933012723434.

Abstract:

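Master's
National Taiwan University of Science and Technology (國立臺灣科技大學)
Department of Electrical Engineering (電機工程系)
100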
This thesis describes a study exploiting multi-focal functional MRI (fMRI) for retinotopic mapping, or retinotopy, in the primary visual cortex. We tried to reconstruct the visual image according to the retinotopy and the brain activities obtained by fMRI. The multi-focal method divides the visual field into several blocks, and each block has its own paradigm for the visual experiment. Using this method, researchers have shown that they are able to distinguish the brain areas corresponding to each block simultaneously. Besides visual fMRI, this method is also applied to electrophysiological analysis of the visual system. In this study, we performed a visual fMRI experiment using a specific pattern after multi-focal retinotopy. We then attempted to reconstruct the visual image by combining the results of visual fMRI and retinotopy. The study applied a general linear model to analyze the fMRI signal and produced a t value to justify the existence of stimulus-related brain activities. However, judging the “existence” required selecting a threshold of the t value. We empirically found that the accuracy of the reconstructed visual image largely depended on the threshold selection. Therefore, this study proposed an approach to find the optimal t threshold according to a receiver operating characteristic analysis. The results obtained with 5 volunteers using the optimized t thresholds demonstrated an average accuracy of 80%. In conclusion, we successfully reconstructed the visual image by the fMRI technique. Compared to previous investigations, we regard the contribution of this thesis as the optimization method for visual image reconstruction. This method leads to a completely automatic reconstruction procedure and takes visual reconstruction a step forward.
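
The threshold selection by receiver operating characteristic (ROC) analysis described in this abstract can be sketched as follows. The abstract does not say which optimality criterion was used, so the sketch assumes Youden's J (true positive rate minus false positive rate); the function names, the candidate thresholds and the sample t values are hypothetical and only illustrate the idea.

```python
def roc_points(t_values, stimulated, thresholds):
    """Return (threshold, false positive rate, true positive rate) triples.

    t_values:   one t statistic per visual-field block
    stimulated: 1 if the block was actually stimulated, else 0
    Assumes both stimulated and unstimulated blocks are present.
    """
    positives = sum(stimulated)
    negatives = len(stimulated) - positives
    points = []
    for th in thresholds:
        tp = sum(1 for t, s in zip(t_values, stimulated) if s and t >= th)
        fp = sum(1 for t, s in zip(t_values, stimulated) if not s and t >= th)
        points.append((th, fp / negatives, tp / positives))
    return points


def best_threshold(t_values, stimulated, thresholds):
    """Pick the threshold maximising Youden's J = TPR - FPR (assumed criterion)."""
    points = roc_points(t_values, stimulated, thresholds)
    return max(points, key=lambda p: p[2] - p[1])[0]


def reconstruct_pattern(t_values, threshold):
    """Mark a block as 'on' in the reconstructed image when its t value passes."""
    return [1 if t >= threshold else 0 for t in t_values]


# Hypothetical t values for eight blocks and the pattern actually shown.
t_values = [5.1, 0.3, 4.2, 1.0, 3.8, 0.7, 2.9, 1.5]
shown = [1, 0, 1, 0, 1, 0, 1, 0]
th = best_threshold(t_values, shown, [0.5, 1.0, 2.0, 3.0, 4.0, 5.0])
print(th, reconstruct_pattern(t_values, th))
```

Choosing the threshold from the ROC curve rather than by hand is what makes the procedure automatic; other criteria, such as the point closest to the top-left corner of the ROC plot, would fit the same structure.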

18

Xiang, Xiaoyu. « Machine Learning and Deep Learning Approaches to Print Defect Detection, Face Set Recognition, Face Alignment, and Visual Enhancement in Space and Time ». Thesis, 2021.

Abstract:

This research covers machine learning and deep learning approaches to print defect detection, face set recognition and face alignment, and visual enhancement in space and time. The thesis consists of six parts, corresponding to six projects:

In Chapter 1, the first project focuses on the detection of local printing defects, including gray spots and solid spots. We propose a coarse-to-fine method to detect local defects in a block-wise manner and aggregate the block-wise attributes to generate the feature vector of the whole test page for a further ranking task. In the detection part, we first select candidate regions by thresholding a single feature. Then more detailed features of the candidate blocks are calculated and sent to a decision tree previously trained on our training dataset. The final result is given by the decision tree model to control the false alarm rate while maintaining the required miss rate.

Chapter 2 introduces face set recognition and Chapter 3 is about face alignment. In order to reduce the computational complexity of comparing face sets, we propose a deep neural network that can compute and aggregate the face feature vectors with different weights. As for face alignment, our goal is to solve the jittering of landmark locations when applied to video. We propose metrics and corresponding methods around this goal.

In recent years, mobile photography has become increasingly prevalent in our lives with social media due to its high portability and convenience. However, many challenges still exist in distributing high-quality mobile images and videos under the limits of data capacity, hardware storage, and network bandwidth. Therefore, we have been exploring enhancement techniques to improve image and video quality, considering both effectiveness and efficiency for a wide variety of applications, including WhatsApp, Portal, TikTok, and even the printing industry. Chapter 4 introduces single image super-resolution to handle real-world images with various degradations, and its influence on several downstream high-level computer vision tasks. Next, Chapter 5 studies headshot image restoration with multiple references, which is an application of visual enhancement under more specific scenarios. Finally, as a step towards temporal domain enhancement, the Zooming SlowMo framework for fast and accurate space-time video super-resolution is introduced in Chapter 6.

19

Wu, Yi-Jong, and 吳宜蓉. « The Culture Reflection Under Imperial Ching Dynasty-- the Compiling Visual Field in Taiwan Gazetteers and the Reconstruction of the Image of Taiwanese People in the “Folk Custom Category” ». Thesis, 2011. http://ndltd.ncl.edu.tw/handle/76840189615048932567.

Abstract:

Master's
Tamkang University (淡江大學)
Department of History, In-service Master's Program (歷史學系碩士在職專班)
99
Taiwan was incorporated into the operation system of the Chinese Qing Empire in the twenty-third year of the Kangxi reign (AD 1684). This giant empire was about to become acquainted with the tiny island located off its southeast coast. A huge imperial machine was ready to mesh its heavy wheels with the pinion of the small island, and to activate the driven wheel by giving the island the rhythms and tempos that membership in the empire required. The question is how the two were to mesh ("chimera"): how could the tiny island be made properly familiar with the rhythms of this giant and sophisticated operation system? The Qing empire therefore used Taiwan gazetteers (fangzhi 方志) to understand this unfamiliar island, which it had never encountered before, in order to control the people and consolidate its political power. This was not only necessary but also a relatively considerate and wise policy. However, although the Taiwan gazetteers (fangzhi 方志), acting as the eye of the empire, were seemingly objective, rigorous, and systematic narratives, they were also written under the influence of the empire's inherent values and ideology. Everything was placed in a specific classification and category, and the connotations of the narratives were also shaped by the compilation process and by how the compilers illustrated them. In other words, by the time readers encountered the text, the bias of subjective values had already confined them within a set of conventions and limitations. Therefore, I believe that these selections and subjective value tendencies were already deeply rooted in the minds of the writers and editors of the Qing Dynasty Taiwan gazetteers before they carried out their field investigations of local customs (cǎi fēng wen su 采風問俗). They had also decided the narrative structure and direction within the official normative framework of Qing Dynasty gazetteers (fangzhi 方志). This thesis focuses on the writers and editors of the Qing Dynasty Taiwan gazetteers and on the compiling visual field in the Taiwan gazetteers.
In addition to organizing the areas, the compilation years, and the list of writers and editors, the thesis analyzes how they documented Taiwan, how they illustrated good morals, and how they reported the facts from their own observations and perspectives. It then shows how they presented the gazetteers according to their own value tendencies and the existing cultural framework that the empire had established.

20

Chang, Ter Hsin, and 張德歆. « Juan Net Wanderings – A Spiritual Exploration of the Reconstruction and Stacking of Visual Images in Different States of Consciousness ». Thesis, 2012. http://ndltd.ncl.edu.tw/handle/57250746638173029795.

Abstract:

Master's
Huafan University (華梵大學)
Department of Fine Arts, Master's Program (美術學系碩士班)
101
This paper primarily focuses on paintings created by the author between 2010 and 2012, investigating Juan Net’s influence on the various manifestations of the environment’s and people’s expressions of imagination and meaning. The paper also explores the spirit behind the author’s methods and modes of painting and delves into the reconstruction and stacking under different states of consciousness that arise from visual reflection. When psychological and visual experiences are embodied in one’s creations, the creative spirit is released, and meaning and personality are revealed. During the process of creation, what the author hopes to reveal is the visual experience that cannot be entirely experienced through physical faculties and is instead brought to completion by psychological ones during our journeys in space and time. Personal experiences are combined with the principles of “Juan” and subjectively infused into the content of the work. Through the style of a narrative documentary, various journeys are revealed in an attempt to shed light on the content of the experience. The works focus on emotions of entanglement or confusion that are unrelated to pain or joy. From the standpoint of an observer, simultaneous feelings of distrust and self-righteous emotions might arise, allowing us to probe into “the reality in unreality.” When faced with this predicament, the author asks whether it is possible to experience the work’s innermost authenticity, and uses this opportunity to explore the author’s own flow of experience registration through image recording. The main principles behind the creations are the use of both concrete and formless images in daily life and of twisting and crisscrossing lines, with the resulting blocks filled with color. The unique characteristics of both are highlighted, creating an interwoven and penetrating image.
Generally speaking, these works cannot be regarded as reproductions of natural objects; instead, they contain narratives illustrating how the inner world influences people and our surroundings. The entire creative process, then, becomes a documentation of a journey. Some of what occurs can be controlled, while the rest arises unexpectedly, rendering images that appear all the more profoundly uncertain and unique. The methods used in creating these pieces can be defined by three philosophies: I. a vanishing image (of the past), II. a visible image (of the present), and the experimental, III. an imagined image (of the present or future). Parts of the creations were also produced while blindfolded. To further extend the concepts discussed above, the psychological stimuli resulting from physical experience are used to reveal the intangibility of our “Juan-like” world. The author also attempted to take on the perspective of the observer looking at the completed creations, and mindfully reflected upon the important role of experience and impressions with regard to a painting.
