Dissertations on the topic "Estimation de poses humaines"
Browse the top 17 dissertations for research on the topic "Estimation de poses humaines".
Benzine, Abdallah. "Estimation de poses 3D multi-personnes à partir d'images RGB." Thesis, Sorbonne université, 2020. http://www.theses.fr/2020SORUS103.
3D human pose estimation from monocular RGB images is the process of locating human joints in an image or a sequence of images. It provides rich geometric and motion information about the human body. Most existing 3D pose estimation approaches assume that the image contains only one person, fully visible. Such a scenario is not realistic. In real-life conditions, several people interact. They then tend to occlude each other, which makes 3D pose estimation even more ambiguous and complex. The work carried out during this thesis focused on single-shot estimation of multi-person 3D poses from monocular RGB images. We first proposed a bottom-up approach for predicting multi-person 3D poses that first predicts the 3D coordinates of all the joints present in the image and then uses a grouping process to predict full 3D skeletons. In order to be robust in cases where the people in the image are numerous and far away from the camera, we developed PandaNet, which is based on an anchor representation and integrates a process for ignoring anchors ambiguously associated with ground truths, as well as an automatic weighting of losses. Finally, PandaNet is completed with an Absolute Distance Estimation Module (ADEM). The combination of these two models, called Absolute PandaNet, allows the prediction of absolute human 3D poses expressed in the camera frame.
Toony, Razieh. "Calibration-free Pedestrian Partial Pose Estimation Using a High-mounted Kinect." Master's thesis, Université Laval, 2015. http://hdl.handle.net/20.500.11794/26420.
The application of human behavior analysis has undergone rapid development during the last decades, from entertainment systems to professional ones such as Human Robot Interaction (HRI), Advanced Driver Assistance System (ADAS), Pedestrian Protection System (PPS), etc. This thesis addresses the problem of recognizing pedestrians and estimating their body orientation in 3D, based on the fact that estimating a person’s orientation is beneficial in determining their behavior. In this thesis, a new method is proposed for detecting pedestrians and estimating their orientation, in which the results of a pedestrian detection module and an orientation estimation module are integrated sequentially. For pedestrian detection, a cascade classifier is designed to draw a bounding box around the detected pedestrian. Following this, the extracted regions are given to a discrete orientation classifier to estimate the pedestrian’s body orientation. This classification is based on a coarse, rasterized depth image simulating a top-view virtual camera, and uses a support vector machine classifier that was trained to distinguish 10 orientations (30-degree increments). In order to test the performance of our approach, a new benchmark database containing 764 point-cloud sets for body-orientation classification was captured. For this benchmark, a Kinect recorded the point clouds of 30 participants and a marker-based motion capture system (Vicon) provided the ground truth on their orientation. Finally, we demonstrated the improvements brought by our system, as it detected pedestrians with an accuracy of 95.29% and estimated the body orientation with an accuracy of 88.88%. We hope it can provide a new foundation for future research.
Carbonera, Luvizon Diogo. "Apprentissage automatique pour la reconnaissance d'action humaine et l'estimation de pose à partir de l'information 3D." Thesis, Cergy-Pontoise, 2019. http://www.theses.fr/2019CERG1015.
3D human action recognition is a challenging task due to the complexity of human movements and to the variety of poses and actions performed by distinct subjects. Recent technologies based on depth sensors can provide 3D human skeletons with low computational cost, which is useful information for action recognition. However, such low-cost sensors are restricted to controlled environments and frequently output noisy data. Meanwhile, convolutional neural networks (CNN) have shown significant improvements on both action recognition and 3D human pose estimation from RGB images. Despite being closely related problems, the two tasks are frequently handled separately in the literature. In this work, we analyze the problem of 3D human action recognition in two scenarios: first, we explore spatial and temporal features from human skeletons, which are aggregated by a shallow metric learning approach. In the second scenario, we not only show that precise 3D poses are beneficial to action recognition, but also that both tasks can be efficiently performed by a single deep neural network and still achieve state-of-the-art results. Additionally, we demonstrate that end-to-end optimization using poses as an intermediate constraint leads to significantly higher accuracy on the action task than separate learning. Finally, we propose a new scalable architecture for simultaneous real-time 3D pose estimation and action recognition, which offers a range of performance vs. speed trade-offs with a single multimodal and multitask training procedure.
Dogan, Emre. "Human pose estimation and action recognition by multi-robot systems." Thesis, Lyon, 2017. http://www.theses.fr/2017LYSEI060/document.
Estimating human pose and recognizing human activities are important steps in many applications, such as human-computer interfaces (HCI), health care, smart conferencing, robotics, security surveillance, etc. Despite the ongoing effort in the domain, these tasks remain unsolved, particularly in unconstrained and non-cooperative environments. Pose estimation and activity recognition face many challenges under these conditions, such as occlusion or self-occlusion, variations in clothing, background clutter, the deformable nature of the human body and the diversity of human behaviors during activities. Using depth imagery has been a popular solution to address appearance- and background-related challenges, but it has a restricted application area due to its hardware limitations and fails to handle the remaining problems. Specifically, we considered action recognition scenarios where the position of the recording device is not fixed, and which consequently require a method that is not affected by the viewpoint. As a second problem, we tackled the human pose estimation task in particular settings where multiple visual sensors are available and allowed to collaborate. In this thesis, we addressed these two related problems separately. In the first part, we focused on indoor action recognition from videos and considered complex activities. To this end, we explored several methodologies and eventually introduced a 3D spatio-temporal representation for a video sequence that is viewpoint-independent. More specifically, we captured the movement of the person over time using a depth sensor and encoded it in 3D to represent the performed action with a single structure. A 3D feature descriptor was then employed to build a codebook and classify the actions with the bag-of-words approach. As for the second part, we concentrated on articulated pose estimation, which is often an intermediate step for activity recognition.
Our motivation was to incorporate information from multiple sources and views and fuse them early in the pipeline to overcome the problem of self-occlusion, and eventually obtain robust estimations. To achieve this, we proposed a multi-view flexible mixture of parts model inspired by the classical pictorial structures methodology. In addition to the single-view appearance of the human body and its kinematic priors, we demonstrated that geometrical constraints and appearance-consistency parameters are effective for boosting the coherence between the viewpoints in a multi-view setting. Both methods that we proposed were evaluated on public benchmarks and showed that using view-independent representations and integrating information from multiple viewpoints improve the performance of action recognition and pose estimation tasks, respectively.
Fathollahi, Ghezelghieh Mona. "Estimation of Human Poses Categories and Physical Object Properties from Motion Trajectories." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6835.
Tokunaga, Daniel Makoto. "Local pose estimation of feature points for object based augmented reality." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/3/3141/tde-22092016-110832/.
The use of real objects as a means of connecting real and virtual information is a key aspect of augmented reality. A central issue for such a connection is the estimation of the object's visuo-spatial information or, in other words, the detection of the object's pose. Different objects can behave differently when used in interactions: they may not only change position but also be bent or deformed. Traditional research solves such detection problems using different approaches, depending on the type of object. Additionally, some research relies only on the positional information of the feature points, simplifying the object's information. In this work, pose detection of different objects is explored by collecting more information from the observed feature points and, in turn, obtaining the local poses of those points, poses that are not explored in other research. This concept of local pose detection is applied in two capture environments, leading to two innovative approaches: one based on RGB-D cameras, and another based on RGB cameras and machine learning methods. In the RGB-D-based approach, the orientation and the surface around the feature point are used to obtain the point's normal. The local pose is obtained from this information. This approach makes it possible to obtain not only the poses of rigid objects but also the approximate pose of deformable objects. The RGB-based approach, on the other hand, explores machine learning applied to changes in local appearance. Unlike other work based on RGB cameras, this approach replaces complex nonlinear solvers with a fast and robust method, making it possible to obtain the local rotations of the feature points as well as the full pose (with 6 degrees of freedom) of rigid objects, with a much lower computational demand for real-time calculations.
Both approaches show that collecting local poses can generate information for detecting the poses of different types of objects.
Liebelt, Jörg. "Détection de classes d'objets et estimation de leurs poses à partir de modèles 3D synthétiques." Grenoble, 2010. https://theses.hal.science/tel-00553343.
This dissertation aims at extending object class detection and pose estimation tasks on single 2D images by a 3D model-based approach. The work describes learning, detection and estimation steps adapted to the use of synthetically rendered data with known 3D geometry. Most existing approaches recognize object classes for a particular viewpoint or combine classifiers for a few discrete views. By using existing CAD models and rendering techniques from the domain of computer graphics which are parameterized to reproduce some variations commonly found in real images, we propose instead to build 3D representations of object classes which make it possible to handle viewpoint changes and intra-class variability. These 3D representations are derived in two different ways: either as an unsupervised filtering process of pose and class discriminant local features on purely synthetic training data, or as a part model which discriminatively learns the object class appearance from an annotated database of real images and builds a generative representation of 3D geometry from a database of synthetic CAD models. During detection, we introduce a 3D voting scheme which reinforces geometric coherence by means of a robust pose estimation, and we propose an alternative probabilistic pose estimation method which evaluates the likelihood of groups of 2D part detections with respect to a full 3D geometry. Both detection methods yield approximate 3D bounding boxes in addition to 2D localizations; these initializations are subsequently improved by a registration scheme aligning arbitrary 3D models to optical and Synthetic Aperture Radar (SAR) images in order to disambiguate and prune 2D detections and to handle occlusions. The work is evaluated on several standard benchmark datasets and it is shown to achieve state-of-the-art performance for 2D detection in addition to providing 3D pose estimations from single images.
Blanc, Beyne Thibault. "Estimation de posture 3D à partir de données imprécises et incomplètes : application à l'analyse d'activité d'opérateurs humains dans un centre de tri." Thesis, Toulouse, INPT, 2020. http://www.theses.fr/2020INPT0106.
In the context of studying stress and ergonomics at work for the prevention of musculoskeletal disorders, the company Ebhys wants to develop a tool for analyzing the activity of human operators in a waste sorting center by measuring ergonomic indicators. To cope with the uncontrolled environment of the sorting center, these indicators are measured from depth images. An ergonomic study allows us to define the indicators to be measured. These indicators are zones of movement of the operator’s hands and zones of angulation of certain joints of the upper body. They are therefore indicators that can be obtained from an analysis of the operator’s 3D pose. The software for calculating the indicators is thus composed of three parts: the first segments the operator from the rest of the scene to ease the 3D pose estimation, the second estimates the operator’s 3D pose, and the third uses the operator’s 3D pose to compute the ergonomic indicators. First of all, we propose an algorithm that extracts the operator from the rest of the depth image. To do this, we use a first automatic segmentation based on static background removal and selection of a moving element given its position and size. This first segmentation allows us to train a neural network that improves the results. This neural network is trained using the segmentations obtained from the first automatic segmentation, from which the best-quality samples are automatically selected during training. Next, we build a neural network model to estimate the operator’s 3D pose. We propose a study that allows us to find a light and optimal model for 3D pose estimation on synthetic depth images, which we generate numerically. However, although this network performs very well on synthetic depth images, it is not directly applicable to the real depth images that we acquired in an industrial context.
To overcome this issue, we finally build a module that allows us to transform the synthetic depth images into more realistic depth images. This image-to-image translation model modifies the style of the depth image without changing its content, keeping the 3D pose of the operator from the synthetic source image unchanged on the translated realistic depth frames. These more realistic depth images are then used to re-train the 3D pose estimation neural network, to finally obtain a convincing 3D pose estimation on the depth images acquired in real conditions and to compute the ergonomic indicators.
Gourjon, Géraud. "L'estimation du mélange génétique dans les populations humaines." Thesis, Aix-Marseille 2, 2010. http://www.theses.fr/2010AIX20686/document.
Different methods have been developed to estimate the genetic admixture contributions of parental populations to a hybrid one. Most of these methods are implemented in different software programs that provide estimates of variable accuracy. A full comparison between the ADMIX (weighted least squares), ADMIX95 (gene identity), Admix 2.0 (coalescent-based), Mistura (maximum-likelihood), LEA (likelihood-based) and LEADMIX (maximum-likelihood) software programs has been carried out, both at the “intra” level (tests of each software program) and at the “inter” level (comparisons between them). We tested all of these programs on a real human population data set, using four kinds of markers, autosomal (blood groups and KIR genes) and uniparental (mtDNA and Y-chromosome). We demonstrated that the accuracy of the results depends not only on the method itself but also on the choice of loci and of parental populations. We consider that admixture contribution rates obtained from human population data sets should not be treated as accurate values but rather as indicative results, and we suggest using an “Admixture Indicative Interval” as a measurement of admixture.
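As a rough illustration of the least-squares family of estimators mentioned in this abstract, the sketch below fits a single admixture proportion m under the simple linear model hybrid = m * parent1 + (1 - m) * parent2 at each allele. This is plain unweighted least squares, not the weighted scheme of ADMIX or any other program named above; the function name and the example frequencies are hypothetical.

```python
def admixture_least_squares(hybrid, parent1, parent2):
    """Estimate m, the contribution of parent1 to the hybrid population.

    Assumed model (illustrative only): for every allele,
        hybrid = m * parent1 + (1 - m) * parent2
    The unweighted least-squares solution is
        m = sum((h - p2) * (p1 - p2)) / sum((p1 - p2) ** 2)
    """
    numerator = 0.0
    denominator = 0.0
    for h, p1, p2 in zip(hybrid, parent1, parent2):
        diff = p1 - p2
        numerator += (h - p2) * diff
        denominator += diff * diff
    if denominator == 0.0:
        # Identical parental frequencies carry no information about m.
        raise ValueError("parental populations have identical frequencies")
    return numerator / denominator


# Hypothetical allele frequencies, constructed so that m is exactly 0.25.
parent1 = [0.8, 0.3, 0.5]
parent2 = [0.2, 0.7, 0.1]
hybrid = [0.25 * a + 0.75 * b for a, b in zip(parent1, parent2)]
estimate = admixture_least_squares(hybrid, parent1, parent2)
```

With real data the estimate can fall outside the interval [0, 1] or shift noticeably when loci or parental populations are changed, which is consistent with the abstract's advice to read such values as indicative rather than exact.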
Zvénigorosky-Durel, Vincent. "Etude des parentés génétiques dans les populations humaines anciennes : estimation de la fiabilité et de l'efficacité des méthodes d'analyse." Thesis, Toulouse 3, 2018. http://www.theses.fr/2018TOU30260/document.
The study of genetic kinship allows anthropology to identify the place of an individual within the structures in which they evolve: a biological family, a social group, a population. The application of classical probabilistic methods (that were established to solve cases in legal medicine, such as Likelihood Ratios, or LR) to STR data from archaeological material has permitted the discovery of numerous parental links which together constitute genealogies both simple and complex. Our continued practice of these methods has however led us to identify limits to the interpretation of STR data, especially in cases of complex, distant or inbred kinship. The first part of the present work consists of estimating the reliability and the efficacy of the LR method in four situations: a large modern population with significant allelic diversity, a large modern population with poor allelic diversity, a large ancient population and a small ancient population. Recent publications use the more numerous markers analysed using Next-Generation Sequencing (NGS) to implement new strategies in the detection of kinship, especially based on the analysis of chromosome segments shared due to common ancestry (IBD, "Identity-by-Descent", segments). These methods have permitted more reliable estimation of kinship probabilities in ancient material. They are nevertheless ill-suited to certain situations that are typical of ancient DNA studies: they were not conceived to function using single pairs of isolated individuals and they depend, like classical methods, on the estimation of allelic diversity in the population. We therefore propose the quantification of the reliability and efficiency of the IBD segment method using NGS data, focusing on the estimation of the quality of results in different situations with populations of different sizes and different sets of more or less heterogeneous samples.[...]
Neverova, Natalia. "Deep learning for human motion analysis." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSEI029/document.
The research goal of this work is to develop learning methods advancing automatic analysis and interpretation of human motion from different perspectives and based on various sources of information, such as images, video, depth, mocap data, audio and inertial sensors. For this purpose, we propose several deep neural models and associated training algorithms for supervised classification and semi-supervised feature learning, as well as modelling of temporal dependencies, and show their efficiency on a set of fundamental tasks, including detection, classification, parameter estimation and user verification. First, we present a method for human action and gesture spotting and classification based on multi-scale and multi-modal deep learning from visual signals (such as video, depth and mocap data). Key to our technique is a training strategy which exploits, first, careful initialization of individual modalities and, second, gradual fusion involving random dropping of separate channels (dubbed ModDrop) for learning cross-modality correlations while preserving the uniqueness of each modality-specific representation. Moving forward from 1-to-N mapping to continuous evaluation of gesture parameters, we address the problem of hand pose estimation and present a new method for regression on depth images, based on semi-supervised learning using convolutional deep neural networks, where raw depth data is fused with an intermediate representation in the form of a segmentation of the hand into parts. In separate but related work, we explore convolutional temporal models for human authentication based on motion patterns. In this project, the data is captured by inertial sensors (such as accelerometers and gyroscopes) built into mobile devices. We propose an optimized shift-invariant dense convolutional mechanism and incorporate the discriminatively trained dynamic features in a probabilistic generative framework taking into account temporal characteristics.
Our results demonstrate that human kinematics convey important information about user identity and can serve as a valuable component of multi-modal authentication systems.
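The "random dropping of separate channels" that this abstract calls ModDrop can be pictured with the minimal sketch below. It only illustrates the general idea of zeroing whole modality inputs at random during training; the drop probability, the point in the network where dropping happens and the dictionary-based data layout are assumptions made for this example, not details taken from the thesis.

```python
import random


def mod_drop(modalities, drop_prob, rng):
    """Return a copy of `modalities` where each modality is independently
    replaced by zeros with probability `drop_prob`.

    `modalities` maps a modality name (e.g. "video", "depth") to a flat
    list of feature values. This layout is hypothetical.
    """
    dropped = {}
    for name, features in modalities.items():
        if rng.random() < drop_prob:
            # Drop the whole channel so the model must rely on the others.
            dropped[name] = [0.0] * len(features)
        else:
            dropped[name] = list(features)
    return dropped


# Hypothetical training sample with three modalities.
sample = {"video": [0.2, 0.9], "depth": [1.5, 0.3], "mocap": [0.7, 0.1]}
rng = random.Random(0)
augmented = mod_drop(sample, drop_prob=0.5, rng=rng)
```

In a real training loop such a function would be applied to each sample or mini-batch, and dropping would be disabled at test time.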
Martinez, Francis. "Tout est dans le regard : reconnaissance visuelle du comportement humain en vue subjective." Phd thesis, Université Pierre et Marie Curie - Paris VI, 2013. http://tel.archives-ouvertes.fr/tel-01001816.
Assaad, Aziz. "Pollution anthropique de cours d'eau : caractérisation spatio-temporelle et estimation des flux." Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0054/document.
The Water Framework Directive demands a return to good condition for rivers in Europe. These rivers receive different types of pollution related to the various economic activities of the populations settled along their banks. Particular types of pollution are often studied in isolation: pollution due to agricultural pesticides, fertilizers and livestock waste in rural areas, pollution due to a specific industry (steel, paper mill, etc.), more or less well treated domestic pollution, etc. But in many cases, we are dealing with a mixture of pollutants. In the case of the Moselle, the pollution generated by human activities in the French part of the Moselle watershed impacts surface water quality downstream and therefore the Rhine. Our goal is to characterize the state of some tributaries of the Moselle (Madon, Meurthe, Vologne and Fensch) with respect to anthropogenic pressures and to propose a strategy for calculating the flux of pollutants along these rivers. In this context, sampling campaigns with a dense spatial network of stations have been organized. In addition to the usual parameters characterizing water quality (conductivity, pH, dissolved organic carbon, ammonia nitrogen, nitrate, etc.), particular attention has been given to the optical properties (UV-visible absorbance, synchronous fluorescence) of dissolved organic matter in order to understand its origin. Synchronous fluorescence spectra were studied by deconvolution or by principal component analysis. A method based on synchronous fluorescence spectroscopy has been developed to detect the presence of optical brighteners. Finally, a methodology has been developed in the Madon watershed in order to calculate, from geographic data, the mean daily pollution flux at each sampling station for each sampling period.
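For orientation, a daily pollutant flux is commonly obtained as concentration multiplied by river discharge. The sketch below shows only that generic unit conversion; how the thesis derives discharge at each station from geographic data is not stated in the abstract, and the function name and the example values are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def daily_flux_kg(concentration_mg_per_l, discharge_m3_per_s):
    """Mass flux in kg/day from a concentration and a discharge.

    1 mg/L equals 1 g/m^3, so concentration * discharge is in g/s.
    """
    grams_per_second = concentration_mg_per_l * discharge_m3_per_s
    return grams_per_second * SECONDS_PER_DAY / 1000.0


# Hypothetical station: 2 mg/L of nitrate at a discharge of 5 m^3/s.
flux = daily_flux_kg(2.0, 5.0)  # 864 kg/day
```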
Baradel, Fabien. "Structured deep learning for video analysis." Thesis, Lyon, 2020. http://www.theses.fr/2020LYSEI045.
With the massive increase of video content on the Internet and beyond, the automatic understanding of visual content could impact many different application fields such as robotics, health care, content search or filtering. The goal of this thesis is to provide methodological contributions in Computer Vision and Machine Learning for automatic content understanding from videos. We focus on two problems, namely fine-grained human action recognition and visual reasoning from object-level interactions. In the first part of this manuscript, we tackle the problem of fine-grained human action recognition. We introduce two different trained attention mechanisms on the visual content from articulated human pose. The first method is able to automatically draw attention to important pre-selected points of the video conditioned on learned features extracted from the articulated human pose. We show that such a mechanism improves performance on the final task and provides a good way to visualize the most discriminative parts of the visual content. The second method goes beyond pose-based human action recognition. We develop a method able to automatically identify unstructured feature clouds of interest in the video using contextual information. Furthermore, we introduce a learned distributed system for aggregating the features in a recurrent manner and taking decisions in a distributed way. We demonstrate that we can achieve better performance than previously obtained, without using articulated pose information at test time. In the second part of this thesis, we investigate video representations from an object-level perspective. Given a set of detected persons and objects in the scene, we develop a method which learns to infer the important object interactions through space and time using the video-level annotation only. This makes it possible to identify important objects and object interactions for a given action, as well as potential dataset bias.
Finally, in a third part, we go beyond the task of classification and supervised learning from visual content by tackling causality in interactions, in particular the problem of counterfactual learning. We introduce a new benchmark, namely CoPhy, where, after watching a video, the task is to predict the outcome after modifying the initial stage of the video. We develop a method based on object-level interactions able to infer object properties without supervision as well as future object locations after the intervention.
Liebelt, Joerg. "Détection de Classes d'Objets et Estimation de leurs Poses à partir de Modèles 3D Synthétiques." Phd thesis, 2010. http://tel.archives-ouvertes.fr/tel-00553343.
Irazi, Caribert. "Estimation des pertes humaines dues aux guerres civiles au Burundi, au Mozambique et en Ouganda, entre 1971 et 1992." Thèse, 2005. http://hdl.handle.net/1866/17574.
Deslauriers, Pierre-Luc. "Une estimation de la contribution relative de l'éducation des filles et des garçons sur la croissance économique des pays pauvres." Mémoire, 2008. http://www.archipel.uqam.ca/1446/1/M10579.pdf.