Theses on the topic "3D saliency"


Consult the 17 best theses for your research on the topic "3D saliency".


1

Zhao, Yitian. « Detections and applications of saliency on 3D surfaces by using retinex theory ». Thesis, Aberystwyth University, 2013. http://hdl.handle.net/2160/83baa3e3-fe5c-4e1d-a3d8-e63d95bed13e.

Full text
Abstract:
Unlike traditional 2D images, which are projections of the real world onto a two-dimensional surface, 3D images express the geometry of the objects of interest directly in terms of a set of points, a mesh, or a surface composed of points with three-dimensional coordinates. The size or shape of this 3D information of an object may be computed almost directly from its three-dimensional representation. 3D imaging geometry essentially simulates human binocular vision, and enables a direct acquisition of the depth information from the camera to the object of interest. It finds a variety of applications ranging from reverse engineering, urban planning and simulation to computer games. With the evolution in recent years of more modern technologies and devices, there has been enormous growth in the number of 3D models/3D images and their availability to various communities. Examples include the National Design Repository, which stores 3D computer-aided design (CAD) models for tens of thousands of mechanical parts; and the Princeton Shape Benchmark (PSB) with 36,000 everyday objects represented as polygonal surface models. Most of the latest scanners can generate a huge number of data points within a limited time (a matter of minutes). Even a single scan might contain millions of points, which often leads to expensive computation and storage. The development of relevant software has not matched that of 3D hardware. As the complexity of these data points has increased, the digital representation of real-world objects has become more accurate, but there is a trade-off between the degree of accuracy and the cost of processing and storage of these models. Therefore, reduction of information content or simplification of the 3D data points is useful for efficient processing, and necessary for visualization in some cases. In the course of a thorough review of the relevant literature, we found that the existing simplification algorithms perform inadequately, especially at a very high simplification rate. In recent years, the notion of human visual perception has been explored with a view to aiding simplification. With a view to retaining the important surface features and details, the selection of samples is now guided both by geometric properties and by the visual attention properties of the surface. Thus, as criteria for simplification or interest point detection, salient regions and non-salient regions can be processed separately, preserving more vertices or facets from salient regions, while selecting fewer vertices or facets from non-salient regions (in our proposed interest point detection method, points are selected only from salient regions). The estimation of the perceptual properties/saliency of the target object is thus a very important pre-processing step for simplifying highly complicated 3D models. In this dissertation, a surface smoothing method and two novel saliency detection methods on 3D models are proposed. The acquired data usually contains imaging noise, due to low reflection or specular reflection, occlusion and depth discontinuity. Sometimes a rough surface is generated due to the rapid changes of orientation and vertex locations of reconstructed surfaces caused by noise introduced in the process of surface scanning, image registration and integration. Hence, an extended non-local means filter has been proposed for the case of a 3D surface. To the best of our knowledge, there is no previous work on non-local means filtering of meshes with B-spline optimization.
As we know, the non-local means filter takes advantage of the high degree of redundancy of any natural image. For a given pixel, the restored gray value is obtained by the weighted average of the gray values of all pixels in the image; each weight is proportional to the similarity between the local neighborhood of the pixel being processed and the neighborhood corresponding to the other image pixels. With this filter, a smoother version can be robustly obtained, since it defines the similarity between patches of pixels, rather than between the individual pixels themselves. However, when extending the 2D non-local means filter to the processing of a 3D mesh, a problem arises in the determination of the similarity neighborhood. 2D images usually have a regular structure, which in most cases is not true for a mesh due to variations of sampling density in the range scanning process. In this work, the B-spline is employed to determine the similarity neighborhood, which in turn generates the control net for the input mesh. The advantage of using B-spline surfaces is that the underlying control net is topologically similar to the image grid structure. The first saliency detection approach adapts Retinex from a 2D image enhancement technique to the analysis of geometry or shape variation in 3D models. Retinex investigates the theory behind the constancy of color. It explains from a psychological perspective why the colors perceived by human beings are relatively stable, usually irrespective of illumination conditions. Retinex has also been imported into the computer vision field, in which the captured data are often unsatisfactory due to low contrast - either locally or globally - caused by too weak or too strong illumination, or even shadow. Retinex is extended here to enhance 3D shape information and aid analysis of global shape and local geometrical details. Normally, human perception and objective information with respect to vision are not in agreement. The human brain interprets an image of a 3D shape differently from how photo-sensors or scanners may sense it, by consciously correcting brightness and removing noise, shadows, glare, or reflections. After the application of Retinex, the 3D shape, component or surface may be represented more faithfully to the original, simulating the effect of human visual systems. After using Retinex to enhance the surface, a random center-surround saliency detection is proposed. The main structure of our saliency system is based on the general layout of psychological attention models, and it improves and extends the concept of mesh saliency, integrated for more accurate detection of the importance/saliency of points. While the first saliency detection approach is powerful for the characterization of the importance/saliency of points, it may be affected by imaging noise or depth discontinuity, leading to the salient regions being only partially detected. To overcome this shortcoming, a second method is proposed that measures similarity based on patches, rather than individual points. This saliency detection approach is an extension of the first saliency detection method. Based on observations from studies of biological vision, we know that the human vision system is sensitive to contrast in the visual signal. It is widely believed that human cortical cells may be hard-wired to respond preferentially to high-contrast stimuli in their receptive fields.
Therefore, if a specific contrast for the 3D surface is generated, it may also be used to illustrate the differences in geometry or topology that make the local details or global shape distinctive. In this study, by combining the Retinex-based Importance Feature and the Relative Distance, a weighted dissimilarity map is obtained to generate the 'surface contrast'. The dissimilarity map is estimated as the sum of differences between the geometric invariants of the points inside two patches, inversely proportional to their Euclidean distance. Subsequently, the global nature of salient regions is captured by considering the symmetric surround saliency. As noted above, humans pay more attention to those image regions that contrast strongly with their neighbors. To determine the region-based saliency, a region-growing segmentation is employed to segment the surface. The results show that the proposed approach has the ability to locate the distinctive regions faithfully. In order to validate the proposed saliency detection methods, the detected salient regions have been applied to simplification and interest point detection. A large number of experiments based on real data captured by a Minolta Vivid 700 range camera show that more details have been retained in the process of surface simplification, and the detected interest points are more repeatable, which is useful for the representation of the geometry and detail of the object of interest.
In addition, the comparative studies also show that the proposed techniques outperform the state-of-the-art methods and have clear advantages.
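For readers unfamiliar with the filter the abstract refers to, the classical 2D non-local means restoration it paraphrases can be written as follows (the standard Buades-style formulation, given here for context rather than taken from the thesis):

```latex
NL[u](p) = \frac{1}{C(p)} \sum_{q \in \Omega} w(p, q)\, u(q),
\qquad
w(p, q) = \exp\!\left( -\frac{\lVert u(\mathcal{N}_p) - u(\mathcal{N}_q) \rVert_{2,a}^{2}}{h^{2}} \right),
\qquad
C(p) = \sum_{q \in \Omega} w(p, q)
```

Here u(N_p) is the patch of gray values around pixel p, a is the standard deviation of the Gaussian kernel weighting the patch distance, and h controls the decay of the weights. The thesis transfers this patch-based averaging to meshes, where the B-spline control net plays the role of the regular image grid when defining the similarity neighborhoods.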
2

Wang, Junle. « From 2D to stereoscopic-3D visual saliency : revisiting psychophysical methods and computational modeling ». Nantes, 2012. http://www.theses.fr/2012NANT2072.

Full text
Abstract:
Visual attention is one of the most important mechanisms deployed by the human visual system to reduce the amount of information that our brain needs to process in order to apprehend the content of a scene. An increasing amount of effort is being dedicated to the study of visual attention, particularly to its computational modeling. In this thesis, we present studies focusing on several aspects of this research. Our work can be divided into two main parts. The first part concerns the ground truths used in studies related to visual attention; the second part contains studies related to the modeling of visual attention for stereoscopic 3D (S-3D) viewing conditions. In the first part, our work starts with assessing the reliability of fixation density maps (FDM) from different eye-tracking databases. We then quantitatively identify the similarities and differences between fixation density maps and visual importance maps, which are the two most widely used ground truths for attention-related applications. Next, to address the lack of ground truth in the field of 3D visual attention modeling, we conduct a binocular eye-tracking experiment to create a new eye-tracking database for S-3D images. In the second part, we start by examining the impact of depth on visual attention in S-3D viewing conditions. We first quantify the so-called "depth bias" involved in viewing synthetic S-3D content on a planar stereoscopic display. We then extend our study from synthetic stimuli to natural-content S-3D images. We propose a depth-saliency-based model of 3D visual attention, which relies on the depth contrast of the scene. Two different ways of applying depth information in an S-3D visual attention model are also compared in our study. Next, we study the difference in center bias between 2D and S-3D viewing conditions, and further integrate the center bias into S-3D visual attention modeling. Finally, based on the assumption that visual attention combined with blur can improve the Quality of Experience of 3D-TV, we study the influence of blur on depth perception and the relationship between blur and binocular disparity.
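As a purely schematic illustration of how a depth-contrast term can be combined with a 2D saliency map, here is a minimal sketch of my own; it is not the model proposed in the thesis, which also compares two ways of using depth and integrates a center-bias term:

```python
import numpy as np

def depth_saliency(depth, eps=1e-9):
    """Depth-contrast map: how far each pixel's depth departs from the scene mean.

    depth : per-pixel depth map (larger = farther); output normalised to [0, 1].
    """
    contrast = np.abs(depth - depth.mean())
    return contrast / (contrast.max() + eps)

def combine_with_2d(sal_2d, sal_depth, alpha=0.5):
    """Convex combination of a 2D saliency map and a depth-saliency map."""
    return alpha * sal_2d + (1.0 - alpha) * sal_depth
```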
3

Munaretti, Rodrigo Barni. « Perceptual guidance in mesh processing and rendering using mesh saliency ». Biblioteca Digital de Teses e Dissertações da UFRGS, 2007. http://hdl.handle.net/10183/12673.

Full text
Abstract:
Considerations of perceptual information are quickly gaining importance in mesh representation, analysis and display research. User studies, eye tracking and other techniques are able to provide ever more useful insights for many user-centric systems, which form the bulk of computer graphics applications. In this work we build upon the concept of Mesh Saliency, an automatic measure of visual importance for triangle meshes based on models of low-level human visual attention, improving, extending and integrating it with different applications. We extend the concept of Mesh Saliency to encompass deformable objects, showing how a vertex-level saliency map can be constructed that accurately captures the regions of high perceptual importance over a range of mesh poses or deformations. We define multi-pose saliency as a multi-scale aggregate of curvature values over a locally stable vertex neighborhood, together with deformations of that neighborhood over multiple poses. We replace the Euclidean distance with geodesic distance, thereby providing superior estimates of the local neighborhood. Results show that multi-pose saliency generates more visually appealing mesh simplifications when compared to single-pose mesh saliency. We also apply Mesh Saliency to the problem of mesh segmentation and view-dependent rendering, introducing a segmentation technique that partitions an object into a set of face clusters, each encompassing a group of locally interesting features. Mesh Saliency is incorporated into a propagative mesh clustering framework, guiding cluster seed selection and triangle propagation costs and leading to a convergence of face clusters around perceptually important features. We compare our technique with different fully automatic segmentation algorithms, showing that it provides similar or better segmentation without the need for user input. Since the proposed clustering algorithm is especially suitable for multi-resolution rendering, we illustrate an application of our clustering results through a saliency-guided view-dependent rendering system, achieving significant framerate increases with little loss of visual detail.
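For orientation, the multi-scale curvature aggregation that mesh saliency builds on can be sketched as follows. This is a simplified single-pose version with Euclidean (not geodesic) neighborhoods and an assumed precomputed per-vertex curvature array, so it only illustrates the flavor of the measure, not the multi-pose method of the thesis:

```python
import numpy as np
from scipy.spatial import cKDTree

def gaussian_weighted_curvature(vertices, curvature, sigma):
    """Mean curvature averaged over a Gaussian-weighted Euclidean neighborhood."""
    tree = cKDTree(vertices)
    out = np.empty(len(vertices))
    for i, v in enumerate(vertices):
        idx = tree.query_ball_point(v, r=2.5 * sigma)
        d2 = np.sum((vertices[idx] - v) ** 2, axis=1)
        w = np.exp(-d2 / (2 * sigma ** 2))
        out[i] = np.sum(w * curvature[idx]) / np.sum(w)
    return out

def mesh_saliency(vertices, curvature, sigma):
    """Single-scale saliency: |G(curv, sigma) - G(curv, 2*sigma)| per vertex."""
    fine = gaussian_weighted_curvature(vertices, curvature, sigma)
    coarse = gaussian_weighted_curvature(vertices, curvature, 2.0 * sigma)
    return np.abs(fine - coarse)
```

The multi-pose extension described above would aggregate such values over a set of deformed poses and replace the Euclidean radius search with geodesic neighborhoods.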
4

Joubert, Deon. « Saliency grouped landmarks for use in vision-based simultaneous localisation and mapping ». Diss., University of Pretoria, 2013. http://hdl.handle.net/2263/40834.

Full text
Abstract:
The effective application of mobile robotics requires that robots be able to perform tasks with an extended degree of autonomy. Simultaneous localisation and mapping (SLAM) aids automation by providing a robot with the means of exploring an unknown environment while being able to position itself within this environment. Vision-based SLAM benefits from the large amounts of data produced by cameras but requires intensive processing of these data to obtain useful information. In this dissertation it is proposed that, as the saliency content of an image distils a large amount of the information present, it can be used to benefit vision-based SLAM implementations. The proposal is investigated by developing a new landmark for use in SLAM. Image keypoints are grouped together according to the saliency content of an image to form the new landmark. A SLAM system utilising this new landmark is implemented in order to demonstrate the viability of using the landmark. The landmark extraction, data filtering and data association routines necessary to make use of the landmark are discussed in detail. A Microsoft Kinect is used to obtain video images as well as 3D information of a viewed scene. The system is evaluated using computer simulations and real-world datasets from indoor structured environments. The datasets used are both newly generated and freely available benchmarking ones.
Dissertation (MEng)--University of Pretoria, 2013.
Electrical, Electronic and Computer Engineering
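As an illustration of the general idea in the abstract above of grouping keypoints by the saliency content of an image, here is a toy sketch with assumed inputs; it is not the landmark-extraction routine developed in the dissertation:

```python
import numpy as np
from scipy import ndimage

def group_keypoints_by_saliency(keypoints_xy, saliency, threshold=0.6):
    """Group 2D keypoints that fall inside the same salient blob.

    keypoints_xy : (N, 2) array of (x, y) pixel coordinates
    saliency     : 2D saliency map normalised to [0, 1]
    Returns a dict mapping blob label -> indices of the grouped keypoints.
    """
    # Connected components of the thresholded saliency map define the blobs.
    labels, _ = ndimage.label(saliency >= threshold)
    groups = {}
    for i, (x, y) in enumerate(keypoints_xy):
        lbl = labels[int(round(y)), int(round(x))]
        if lbl > 0:  # keypoint lies inside a salient region
            groups.setdefault(lbl, []).append(i)
    return groups
```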
5

Fraihat, Hossam. « Contribution à la perception visuelle multi-résolution de l’environnement 3D : application à la robotique autonome ». Thesis, Paris Est, 2017. http://www.theses.fr/2017PESC1065/document.

Full text
Abstract:
The research work carried out within the framework of this thesis concerns the development of a system for perception and saliency detection in a 3D environment, taking advantage of a pseudo-3D representation. Our contribution and the resulting concept derive from the hypothesis that the depth of an object with respect to the robot is an important factor in saliency detection. On this basis, a saliency-based vision system for the 3D environment was proposed, designed and validated on a platform consisting of a robot equipped with a pseudo-3D sensor. The implementation of the aforementioned concept and its design were first validated on the pseudo-3D KINECT vision system. Then, in a second step, the concept and the algorithms were extended to the aforementioned robotic platform. The main contributions of the present thesis can be summarized as follows: A) a state of the art on the various sensors for acquiring depth information, as well as the different methods for 2D and pseudo-3D saliency detection; B) the study of a pseudo-3D visual saliency system, built around a robust algorithm for detecting salient objects in the 3D environment; C) the implementation of a depth estimation system in centimeters for the Pepper robot; D) the implementation of the proposed concepts and methods on the aforementioned platform. The studies and experimental validations carried out confirmed that the proposed approaches make it possible to increase the autonomy of robots in a real 3D environment.
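To make the underlying hypothesis concrete, here is a toy depth-weighted saliency sketch; it is my own illustration with a hypothetical metric depth input, not the pseudo-3D pipeline implemented on the KINECT or Pepper platforms:

```python
import numpy as np

def proximity_weighted_saliency(sal_2d, depth_m, d_min=0.4, d_max=4.0):
    """Boost 2D saliency for objects close to the robot.

    sal_2d  : 2D saliency map in [0, 1]
    depth_m : per-pixel depth in metres (same shape); invalid pixels <= 0
    """
    depth = np.clip(depth_m, d_min, d_max)
    # Linear proximity weight: 1 at d_min (nearest), 0 at d_max (farthest).
    proximity = (d_max - depth) / (d_max - d_min)
    proximity[depth_m <= 0] = 0.0  # ignore pixels with no depth reading
    return sal_2d * proximity
```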
6

El Haje, Noura. « A heterogeneous data-based proposal for procedural 3D cities visualization and generalization ». Thesis, Toulouse 3, 2018. http://www.theses.fr/2018TOU30238.

Full text
Abstract:
This thesis project was born from a collaboration between the research team VORTEX / Visual Objects: from Reality to Expression (now REVA: Real Expression Artificial Life) at IRIT, the Institute of Research in Computer Science of Toulouse, on the one hand, and education professionals, companies and public entities on the other. The SCOLA collaborative project is essentially an online learning platform based on the use of serious games in schools. It helps users acquire and track predefined skills. This platform provides teachers with a new flexible tool that creates pedagogical scenarios and personalizes student records. Several contributions were assigned to IRIT. One of these is to propose a solution for the automatic creation of 3D environments to be integrated into the game scenario. This solution aims to spare 3D graphic artists from manually modeling detailed and large 3D environments, which can be very expensive and time-consuming. Various applications and prototypes have been developed to allow the user to generalize and visualize their own virtual world, primarily from a set of rules. Consequently, there is no single representation scheme for the virtual world, due to the heterogeneity and diversity of 3D content design, especially city models. This constraint led us to rely heavily, in our project, on real 3D urban data instead of custom data predefined by the game designer. Advances in computer graphics, high computing capabilities and Web technologies have revolutionized data reconstruction and visualization techniques. These techniques are applied in a variety of areas, from video games and simulations to movies that use procedurally generated spaces and character animations. Although modern computer games do not have the same hardware and memory restrictions as older games, procedural generation is frequently used to create unique games, maps, levels, characters or other random facets in each game. Currently, the trend is shifting towards GIS (Geographic Information Systems) to create urban worlds, especially after their successful implementation around the world in support of many application areas. GIS are more specifically dedicated to applications such as simulation, disaster management and urban planning, with more limited use in games; for example, the latest version of the game "Minecraft" offers maps built from real-world city geodata.[...]
7

Ben Salah, Imeen. « Extraction d'un graphe de navigabilité à partir d'un nuage de points 3D enrichis ». Thesis, Normandie, 2019. http://www.theses.fr/2019NORMR070/document.

Full text
Abstract:
Cameras have become increasingly common in vehicles, smartphones and advanced driver assistance systems (ADAS). The areas of application of these cameras in the world of intelligent transportation systems are becoming more and more varied: pedestrian detection, lane departure warning, navigation... Vision-based navigation has reached a certain maturity in recent years through the use of advanced technologies. Vision-based navigation systems have the considerable advantage of being able to use directly the visual information already present in the environment, without having to adapt any element of the infrastructure. In addition, unlike systems using GPS, they can be used outdoors as well as indoors without any loss of precision. For these reasons, vision-based systems are a good option, as they provide very rich and precise information about the environment that can be used for navigation. A major area of research currently focuses on mapping, which is an essential step for navigation. This step raises a substantial memory-management problem for such systems because of the huge amount of information collected by each sensor. Indeed, the memory space required to accommodate the map of a small city is measured in tens of GB, or even thousands when one wants to cover large areas. This makes it impossible to integrate such a map into a mobile system such as a smartphone, a vehicle, a bicycle or a robot. The challenge is therefore to develop new algorithms that minimize the size of the memory needed to operate this vision-based localization system. It is in this context that our project consists in developing a new system able to summarize a 3D map built from the visual information collected by several sensors. The summary is a set of spherical views that keeps the same level of visibility in all directions. It also guarantees, at a lower cost, a good level of precision and speed during navigation. The summary map of the environment consists of geometric, photometric and semantic information.
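Selecting a small set of spherical views that together preserve visibility is, in spirit, a coverage problem; the following greedy sketch is my own toy formulation of that flavor of problem, not the summarization algorithm developed in the thesis:

```python
import numpy as np

def greedy_view_selection(visibility, max_views):
    """Pick spherical views greedily so as to cover as many map points as possible.

    visibility : boolean matrix of shape (n_views, n_points);
                 visibility[v, p] is True if point p is visible from view v
    max_views  : number of views to keep in the summary
    Returns the indices of the selected views.
    """
    covered = np.zeros(visibility.shape[1], dtype=bool)
    selected = []
    for _ in range(max_views):
        gains = (visibility & ~covered).sum(axis=1)  # new points each view would add
        best = int(np.argmax(gains))
        if gains[best] == 0:
            break  # every point already covered
        selected.append(best)
        covered |= visibility[best]
    return selected
```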
8

Walter, Nicolas. « Détection de primitives par une approche discrète et non linéaire : application à la détection et la caractérisation de points d'intérêt dans les maillages 3D ». Phd thesis, Université de Bourgogne, 2010. http://tel.archives-ouvertes.fr/tel-00808216.

Full text
Abstract:
This manuscript is dedicated to the detection and characterization of interest points in meshes. We first show the limitations of the curvature measure on sharp contours, the measure usually used in the field of mesh analysis. We then present a generalization of the SUSAN operator to meshes, named SUSAN-3D. The proposed saliency measure quantifies the local variations of the surface and directly classifies the analyzed points into five categories: salient, ridge, flat, valley and hollow. The meshes considered are manifold, with or without boundaries, and can be regular or irregular, dense or sparse, noisy or noise-free. We then study the performance of SUSAN-3D by comparing it with two curvature operators: Meyer's operator and Stokely's operator. Two methods for comparing the saliency and curvature measures are proposed and applied to two types of objects: spheres and cubes. Spheres allow accuracy to be studied on differentiable surfaces, and cubes on two types of non-differentiable contours: edges and corners. Through these studies we show the advantages of our method, namely a high repeatability of the measure, a low sensitivity to noise and the ability to analyze sparse surfaces. Finally, we present a multi-scale extension and an automatic determination of the analysis scales, which make SUSAN-3D a generic and autonomous operator for mesh analysis and characterization.
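For context, the 2D SUSAN principle counts how many neighbors resemble the nucleus pixel (the USAN area) and classifies the nucleus accordingly. A very rough mesh analogue based on vertex normals is sketched below; it is only an illustration of that principle, not the SUSAN-3D operator, which distinguishes the five classes above from the local surface variation:

```python
import numpy as np

def usan_like_measure(vertex_normal, neighbor_normals, angle_thresh_deg=20.0):
    """Fraction of neighbours whose normal is close to the centre vertex's normal.

    A high fraction suggests a flat area, a low fraction an edge or a salient
    point, mirroring the 'Univalue Segment Assimilating Nucleus' idea of SUSAN.
    """
    cos_thresh = np.cos(np.deg2rad(angle_thresh_deg))
    cos_sim = neighbor_normals @ vertex_normal  # normals assumed unit length
    return np.mean(cos_sim >= cos_thresh)

def classify_vertex(usan_fraction):
    """Coarse three-way classification from the USAN-like fraction."""
    if usan_fraction > 0.8:
        return "flat"
    if usan_fraction > 0.4:
        return "edge-like (ridge or valley)"
    return "salient (corner-like)"
```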
9

El Sayed, Abdul Rahman. « Traitement des objets 3D et images par les méthodes numériques sur graphes ». Thesis, Normandie, 2018. http://www.theses.fr/2018NORMLH19/document.

Full text
Abstract:
Skin detection consists in detecting the pixels corresponding to human skin in a color image. Faces constitute an important category of stimulus because of the wealth of information they convey: before recognizing any person, it is essential to locate and recognize their face. Most security and biometrics applications rely on the detection of skin regions, such as face detection, 3D adult object filtering and gesture recognition. In addition, saliency detection on 3D meshes is an important pre-processing step for many computer vision applications. 3D object segmentation based on salient regions has been widely used in many computer vision applications such as 3D shape matching, object alignment, 3D point cloud smoothing, web image search, content-based image indexing, video segmentation, and face detection and recognition. Skin detection is a very difficult task, for various reasons generally related to the variability of the shape and color to be detected (different hues from one person to another, arbitrary orientations and sizes, lighting conditions), and especially for web images captured under different lighting conditions. There are several well-known approaches to skin detection: approaches based on geometry and feature extraction, motion-based approaches (background subtraction, difference between two consecutive images, optical flow computation) and color-based approaches. In this thesis, we propose numerical optimization methods for the detection of skin-colored regions and salient regions on 3D meshes and 3D point clouds using a weighted graph. Based on these methods, we propose 3D face detection approaches using linear programming and data mining. Furthermore, we adapted our proposed methods to solve the problems of 3D point cloud simplification and 3D object matching. We show the robustness and efficiency of our proposed methods through various experimental results. Finally, we show the stability and robustness of our methods with respect to noise.
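By way of comparison with the graph-based formulation proposed in the thesis, the classic color-based approach it mentions is often implemented with a fixed chrominance box in YCbCr. A minimal sketch of that common heuristic (not the thesis method) is:

```python
import numpy as np

def skin_mask_ycbcr(image_rgb):
    """Binary skin mask from a widely used YCbCr chrominance box (heuristic)."""
    rgb = image_rgb.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # RGB -> Cb/Cr (ITU-R BT.601, full range)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Commonly reported chrominance bounds for skin pixels
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```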
10

Ricci, Thomas. « Individuazione di punti salienti in dati 3D mediante rappresentazioni strutturate ». Master's thesis, Alma Mater Studiorum - Università di Bologna, 2012. http://amslaurea.unibo.it/3968/.

Full text
Abstract:
This thesis belongs to the research area of 3D data processing, and in particular 3D Object Recognition. It first outlines an overview of the main structured representations of 3D data, which are a necessary prerequisite for implementing 3D data processing algorithms efficiently, and then presents a new 3D Keypoint Detection algorithm that was developed and proposed by the Computer Vision Laboratory of the University of Bologna, where I carried out my thesis work.
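As a small example of why structured representations matter for 3D processing, a spatial index such as a k-d tree turns neighborhood queries from linear scans into fast tree searches (a generic illustration, not the representation studied in the thesis):

```python
import numpy as np
from scipy.spatial import cKDTree

# Random point cloud standing in for real 3D data.
points = np.random.rand(10_000, 3)
tree = cKDTree(points)

# Radius search: indices of all points within 0.05 units of a query point.
neighbors = tree.query_ball_point(points[0], r=0.05)

# k-nearest-neighbour search: distances and indices of the 8 closest points.
dists, idx = tree.query(points[0], k=8)
```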
11

Khaustova, Darya. « Objective assessment of stereoscopic video quality of 3DTV ». Thesis, Rennes 1, 2015. http://www.theses.fr/2015REN1S021/document.

Full text
Abstract:
The minimum requirement for any 3D (stereoscopic imaging) system is to guarantee the visual comfort of viewers. Visual comfort is one of the three primary perceptual attributes of 3D QoE, which can be linked directly to the technical parameters of a 3D system. Therefore, the goal of this thesis is to characterize objectively the impact of these parameters on human perception for stereoscopic quality monitoring. The first part of the thesis investigates whether the visual attention of viewers should be taken into account when designing an objective 3D quality metric. First, visual attention in 2D and 3D is compared using simple test patterns. The conclusions of this first experiment are then validated using complex stimuli with crossed and uncrossed disparities. In addition, we explore the impact of visual discomfort caused by excessive disparities on visual attention. The second part of the thesis is dedicated to the design of an objective model of 3D video QoE, based on human perceptual thresholds and an acceptability level. Additionally, we explore the possibility of using the proposed model as a new subjective scale. For the validation of the proposed model, subjective experiments are conducted with fully controlled still and moving stereoscopic images containing different types of view asymmetries. The performance is evaluated by comparing objective predictions with subjective scores for various levels of view discrepancy that might provoke visual discomfort.
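The kind of threshold-based categorization such a model performs can be sketched as follows; this is only a schematic of the general idea, since the actual thresholds, artifact types and category structure of the proposed model are defined in the thesis:

```python
def comfort_category(asymmetry, visibility_threshold, annoyance_threshold):
    """Map a measured view asymmetry to a coarse comfort category.

    The two thresholds are assumed to come from psychophysical experiments;
    their values here are placeholders, not the thesis's numbers.
    """
    if asymmetry < visibility_threshold:
        return "imperceptible"
    if asymmetry < annoyance_threshold:
        return "perceptible but acceptable"
    return "likely to cause visual discomfort"
```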
12

Charton, Jerome. « Etude de caractéristiques saillantes sur des maillages 3D par estimation des normales et des courbures discrètes ». Thesis, Bordeaux, 2014. http://www.theses.fr/2014BORD0333/document.

Full text
Abstract:
With the aim of improving and automating the object reproduction chain from acquisition to 3D printing, we sought to characterize saliency on 3D objects modeled by a 3D mesh structure. To this end, we reviewed the state of the art of methods for estimating differential properties, namely normals and curvatures, on discrete surfaces represented as 3D meshes. To compare the behavior of the different methods, we used a set of classic criteria in the field: accuracy, convergence and robustness with respect to variations of the neighborhood. For this, we established a test protocol emphasizing these qualities. From this first comparison, it emerged that all the existing methods have shortcomings with respect to these criteria. In order to obtain a more reliable and accurate estimation of the differential properties, we developed two new estimators.
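Among the classical estimators such a survey covers, the PCA (covariance) normal estimator is perhaps the most common. A compact sketch of that generic baseline (not one of the two new estimators contributed by the thesis):

```python
import numpy as np

def pca_normal(neighborhood):
    """Estimate a surface normal as the eigenvector of the local covariance
    matrix associated with the smallest eigenvalue (classical PCA estimator).

    neighborhood : (k, 3) array of points around the vertex of interest
    """
    centered = neighborhood - neighborhood.mean(axis=0)
    cov = centered.T @ centered / len(neighborhood)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, 0]  # direction of least variance ~ surface normal
```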
13

Pinto, Carlos Henrique Villa. « Construção e aplicação de atlas de pontos salientes 3D na inicialização de modelos geométricos deformáveis em imagens de ressonância magnética ». Universidade Federal de São Carlos, 2016. https://repositorio.ufscar.br/handle/ufscar/7861.

Full text
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Magnetic resonance (MR) imaging has become an indispensable tool for the diagnosis and study of various diseases and syndromes of the central nervous system, such as Alzheimer's disease (AD). In order to perform the precise diagnosis of a disease, as well as the evolutionary monitoring of a certain treatment, the neuroradiologist often needs to measure and assess volume and shape changes in certain brain structures along a series of MR images. For that, the previous delineation of the structures of interest is necessary. In general, this task is done manually, with limited help from a computer, and therefore suffers from several problems. For this reason, many researchers have turned their efforts towards the development of automatic techniques for the segmentation of brain structures in MR images. Among the various approaches proposed in the literature, techniques based on deformable models and anatomical atlases are among those presenting the best results. However, one of the main difficulties in applying geometric deformable models is the initial positioning of the model. Thus, this research aimed to develop an atlas of 3D salient points (automatically detected from a set of MR images) and to investigate the applicability of such an atlas in guiding the initial positioning of geometric deformable models representing brain structures, with the purpose of helping the automatic segmentation of such structures in MR images. The processing pipeline included the use of a 3D salient point detector based on the phase congruency measure, an adaptation of the shape contexts technique to create point descriptors, and the estimation of a B-spline transform to map pairs of matching points. The results, evaluated using the Jaccard and Dice metrics before and after the model initializations, showed a significant gain in tests involving synthetically deformed images of normal patients, but for images of clinical patients with AD the gain was marginal and can still be improved in future research. Some ways to achieve such improvements are discussed in this work.
FAPESP: 2015/02232-1
CAPES: 2014/11988-0
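The pipeline described in the abstract above estimates a B-spline transform from matched salient points. As a much simpler stand-in that conveys the idea of fitting a spatial mapping to point correspondences, here is a least-squares 3D affine fit (my own illustrative substitute, not the B-spline registration used in the work):

```python
import numpy as np

def fit_affine_3d(src, dst):
    """Least-squares 3D affine transform mapping src points onto dst points.

    src and dst are (N, 3) arrays of matched salient points.
    """
    ones = np.ones((len(src), 1))
    A = np.hstack([src, ones])                   # (N, 4) homogeneous source points
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)  # (4, 3) affine parameters
    return X

def apply_affine_3d(X, pts):
    """Apply the fitted affine transform to an (M, 3) array of points."""
    return np.hstack([pts, np.ones((len(pts), 1))]) @ X
```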
14

Mohamodhosen, Bibi Safoorah Bilquis. « Optimisation topologique de dispositifs électromagnétiques ». Thesis, Ecole centrale de Lille, 2017. http://www.theses.fr/2017ECLI0028/document.

Full text
Abstract:
Topology Optimisation (TO) is a fast-growing topic that has been sparking the interest of many researchers in the electromagnetics community over the past two decades. Its attractiveness lies in its ability to find innovative structures without any a priori layout. This thesis work addresses the TO of electromagnetic devices by elaborating on various aspects of the subject. First of all, a TO tool is developed and tested, based on the 'home-made' tools available at the L2EP. As TO requires a finite element (FE) tool and an optimisation tool working together, the two are coupled. Furthermore, an original TO methodology based on the Density Method is developed and tested. An academic cubic test case is used to carry out all the tests and to validate the tools and methodology. An approach is also developed to take into account the nonlinear behaviour of ferromagnetic materials with our TO tools. Afterwards, the methodology is applied to a 3D electromagnet, which represents a more realistic test case. This test case also serves to compare the results obtained with linear and nonlinear behaviour of the materials used. Various topologies are presented for different problem formulations. Subsequently, the methodology is applied to a more complex electromagnetic device: a Salient Pole Synchronous Generator. This example shows how the definition of the optimisation problem can strongly affect TO results. Some topologies are presented and their feasibility is discussed.
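The Density Method referred to above is typically instantiated with a SIMP-style power-law interpolation of the material property in each cell; a generic form for a magnetic problem (given for orientation, not necessarily the exact interpolation used in the thesis) is:

```latex
\mu(\rho_e) \;=\; \mu_{\mathrm{air}} \;+\; \rho_e^{\,p}\,\bigl(\mu_{\mathrm{iron}} - \mu_{\mathrm{air}}\bigr),
\qquad \rho_e \in [0, 1],\; p > 1
```

where rho_e is the density variable of element e and the exponent p penalizes intermediate densities, pushing the optimized layout toward a clear air/iron distribution.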
15

Ferreira, Lino Miguel Moreira. « Methods for Flexible Representation and Coding of 2D and 3D Visual Information ». Doctoral thesis, 2016. http://hdl.handle.net/10316/31011.

Full text
Abstract:
Doctoral thesis in Electrical and Computer Engineering, in the speciality of Telecommunications, presented to the Department of Electrical and Computer Engineering of the Faculty of Sciences and Technology of the University of Coimbra.
Nowadays there is a great diversity and quantity of image and video content used in multimedia services and applications, which requires efficient and flexible management tools for different purposes, such as adaptation, indexing, searching and browsing. However, the existing representation formats are mostly agnostic with regard to the visual content conveyed by the digital signals. As a consequence, access to and processing of visual information based on user-driven parameters is rather limited, and the most efficient solutions for adaptation and for matching heterogeneous constraints in communication systems cannot easily be achieved. In this context, the research work carried out in this Thesis contributes to advancing state-of-the-art methods capable of providing different types of additional flexibility in the representation of visual information. The Thesis starts with a review of the basic concepts used in the representation of visual information in both raw and coded formats, followed by a review of visual saliency computation methods for 2D/3D video. A comprehensive study of temporal segmentation and video summarisation methods for 2D/3D video is then presented, together with an overview of video retargeting methods covering both non-content-aware and content-aware approaches. Coding schemes able to cope with flexible representation of visual content are also described: after a brief review of basic video coding concepts, the study focuses on scalable and Region-of-Interest (ROI) video coding.
This work proposes two methods for computing visual saliency maps for 3D video. They are based on the fusion of four intermediate saliency maps (spatio-temporal, depth and face saliency), followed by a centre-bias weighting function that models the human tendency to gaze at objects located in the centre of the visual scene. The proposed methods were evaluated on publicly available datasets containing several videos and the corresponding fixation density maps obtained from eye-tracking experiments; the results show that they perform better than the other state-of-the-art methods considered. Building on the output of these saliency computation methods, a spatio-temporal retargeting method based on salient regions was developed and evaluated. It resizes the original video to the specific display size of the target device, and comparisons against state-of-the-art methods show that it achieves competitive results. Flexible representation of visual information in the temporal domain was also investigated in the field of video summarisation, where a computational framework is proposed to obtain compact versions of video sequences (video summaries) according to meaningful criteria. The framework is composed of two modules, temporal segmentation and key-frame extraction; it supports various video types and formats, and several meaningful criteria, such as visual saliency, can be used to segment the original video and to select the key-frames. Using different performance metrics and publicly available databases, the results demonstrate that the proposed framework outperforms similar state-of-the-art methods.
Overall, the topics investigated in this Thesis and the performance results obtained from simulations demonstrate the validity of the work done and provide good insight for further research on these topics.
FCT - SFRH/BD/37510/2007
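As an illustration of the fusion-plus-centre-bias idea summarised above, here is a minimal, self-contained Python sketch. The Gaussian centre-bias parameters, the min-max normalisation and the uniform fusion weights are placeholder assumptions, not the weighting function or fusion rule used in the thesis.

import numpy as np

def centre_bias(height, width, sigma_ratio=0.3):
    # Gaussian weighting that favours the centre of the frame.
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    sy, sx = sigma_ratio * height, sigma_ratio * width
    return np.exp(-(((ys - cy) ** 2) / (2 * sy ** 2) + ((xs - cx) ** 2) / (2 * sx ** 2)))

def fuse_saliency_maps(maps, weights=None):
    # maps: list of 2-D arrays of identical shape (e.g. spatio-temporal, depth, face).
    # Each map is min-max normalised, weighted and summed, then modulated by the centre bias.
    if weights is None:
        weights = [1.0 / len(maps)] * len(maps)
    norm = lambda m: (m - m.min()) / (m.max() - m.min() + 1e-8)
    fused = sum(w * norm(m) for w, m in zip(weights, maps))
    fused = fused * centre_bias(*fused.shape)
    return norm(fused)

A fused map of this kind can then drive the retargeting and key-frame selection steps by indicating which regions or frames carry the most visually salient content.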
Styles APA, Harvard, Vancouver, ISO, etc.
16

He, Yu-Dai, et 何育岱. « Fast Iterative 3D Mesh Segmentation Using Part-Salience ». Thesis, 2015. http://ndltd.ncl.edu.tw/handle/12166282192805697713.

Texte intégral
Résumé :
Master's thesis
National Chin-Yi University of Technology
Department of Electronic Engineering
103
As graphics hardware and the associated technologies have greatly improved in recent years, related applications such as computer games, computer animation, 3D vision and virtual reality have shown explosive growth. Mesh segmentation, an important 3D mesh analysis technique, has therefore been intensively studied. We propose a novel hierarchical part-type mesh segmentation technique that uses salient features and iterative cuts to derive a hierarchical part-type segmented model from a 3D mesh. Drawing on the concept of part salience from cognitive science, our work jointly considers the extent of protrusion, the strength of the boundary, and the relative size of the parts. We propose a new formula for estimating protrusion, which helps to find initial features on the input mesh. By applying region growing from the two farthest features and computing the boundary strength, a cut that maximises the boundary strength is applied to one part in each iteration. Furthermore, most earlier studies use shortest-path algorithms to find the farthest features, and only a few recent works have considered part salience. Since computing shortest paths among feature points is time-consuming, we propose a simple metric for estimating the farthest features that eliminates the need for shortest-path calculations. To prevent over-segmentation, a threshold on the segmented parts that considers both relative size and part salience is applied. Our experimental results show that the new approach is successful.
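As a rough illustration of two of the ingredients mentioned above, the Python sketch below shows one generic way to score protrusion and to pick a farthest feature pair without shortest-path computations. The mean-distance protrusion proxy and the straight-line farthest-pair metric are assumptions chosen for illustration; they are not the formula or metric proposed in the thesis.

import numpy as np

def protrusion_proxy(vertices):
    # vertices: (N, 3) array of mesh vertex coordinates.
    # Score each vertex by its mean Euclidean distance to all other vertices,
    # normalised to [0, 1]; tips of protruding parts tend to score high.
    diffs = vertices[:, None, :] - vertices[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)            # (N, N) pairwise distances
    mean_dist = dists.mean(axis=1)
    return (mean_dist - mean_dist.min()) / (mean_dist.max() - mean_dist.min() + 1e-8)

def farthest_pair(vertices, candidate_idx):
    # Among candidate feature vertices, pick the pair with the largest
    # straight-line distance, avoiding any shortest-path computation on the mesh.
    best_pair, best_d = None, -1.0
    for i in candidate_idx:
        for j in candidate_idx:
            d = np.linalg.norm(vertices[i] - vertices[j])
            if d > best_d:
                best_pair, best_d = (i, j), d
    return best_pair

The two seeds returned by farthest_pair would then serve as starting points for region growing, with the cut placed where the boundary strength is maximal.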
Styles APA, Harvard, Vancouver, ISO, etc.
17

Hu, Gang. « A Generic Gesture Recognition Approach based on Visual Perception ». 2012. http://hdl.handle.net/10222/15095.

Texte intégral
Résumé :
Recent developments in hardware have allowed computer vision technologies to analyze complex human activities in real time. High-quality computer algorithms for human activity interpretation are required by many emerging applications, such as patient behavior analysis, surveillance, gesture-controlled video games, and other human-computer interface systems. Despite the great efforts made over the past decades, providing a generic gesture recognition solution that can facilitate the development of different gesture-based applications remains a challenging task. Human vision is able to perceive scenes continuously, recognize objects and grasp motion semantics effortlessly. Neuroscientists and psychologists have tried to understand and explain how exactly the visual system works. Some theories and hypotheses on visual perception, such as visual attention and the Gestalt laws of perceptual organization (PO), have been established and shed light on the fundamental mechanisms of human visual perception. In this dissertation, inspired by those visual attention models, we attempt to model and integrate important visual perception findings into a generic gesture recognition framework, which is the fundamental component of full-tier human activity understanding tasks. Our approach handles these challenging tasks by: (1) organizing the complex visual information into a hierarchical structure comprising low-level feature, object (human body), and 4D spatiotemporal layers; (2) extracting bottom-up shape-based visual salience entities at each layer according to PO grouping laws; (3) building shape-based hierarchical salience maps in favor of high-level tasks for visual feature selection, by manipulating attention conditions derived from top-down knowledge about gestures and body structures; and (4) modeling gesture representations by a set of perceptual gesture salience entities (PGSEs) that provide qualitative gesture descriptions in 4D space for recognition tasks. Unlike other existing approaches, our gesture representation method encodes both extrinsic and intrinsic properties and reflects the way humans perceive the visual world, so as to reduce the semantic gap. Experimental results show that our approach outperforms the others and has great potential for real-time applications.
PhD Thesis
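The following minimal Python sketch only illustrates, under simplifying assumptions, the generic notion of gating a bottom-up salience map with top-down knowledge (here reduced to a single spatial prior); it does not reproduce the hierarchical shape-based salience maps or the PGSE representation described above.

import numpy as np

def modulate_salience(bottom_up, top_down_prior):
    # Multiplicative gating of a bottom-up salience map (H, W) by a top-down
    # spatial prior of the same shape; both are min-max normalised first.
    norm = lambda m: (m - m.min()) / (m.max() - m.min() + 1e-8)
    return norm(norm(bottom_up) * norm(top_down_prior))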
Styles APA, Harvard, Vancouver, ISO, etc.
