Theses on the topic "Reconnaissance faciale automatisée"
See the top 22 theses (undergraduate or doctoral) for research on the topic "Reconnaissance faciale automatisée".
Maalej, Ahmed. "Reconnaissance d'Expressions Faciale 3D Basée sur l'Analyse de Forme et l'Apprentissage Automatique". PhD thesis, Université des Sciences et Technologie de Lille - Lille I, 2012. http://tel.archives-ouvertes.fr/tel-00726298.
Abdat, Faiza. "Reconnaissance automatique des émotions par données multimodales : expressions faciales et des signaux physiologiques". Thesis, Metz, 2010. http://www.theses.fr/2010METZ035S/document.
This thesis presents a generic method for automatic emotion recognition using a bimodal system based on facial expressions and physiological signals. This data processing approach extracts information more effectively and is more reliable than a single modality. The proposed algorithm for facial expression recognition is based on the variation of facial muscle distances relative to the neutral state and on classification with Support Vector Machines (SVM). Emotion recognition from physiological signals is based on the classification of statistical parameters by the same classifier. To obtain a more reliable recognition system, we combined facial expressions and physiological signals. Directly combining such information is not trivial given the differences in characteristics (such as frequency, amplitude, variation, and dimensionality). To remedy this, we merged the information at different levels of implementation. For feature-level fusion, we tested the mutual information approach for selecting the most relevant features and principal component analysis for reducing their dimensionality. For decision-level fusion, we implemented two methods: the first based on a voting process and the second based on dynamic Bayesian networks. The best results were obtained with feature fusion based on Principal Component Analysis. These methods were tested on a database developed in our laboratory from healthy subjects, with emotions induced using IAPS pictures. A self-assessment step was applied to all subjects in order to improve the annotation of the images used for induction. The results show good performance even in the presence of variability among individuals and variability of the emotional state over several days.
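The voting-based decision-level fusion mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not specify the voting rule, so the weighted scheme below is only one generic possibility; the function name `weighted_vote`, the emotion labels, and the confidence weights are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(decisions):
    """Fuse classifier decisions given as (label, weight) pairs.

    Hypothetical helper: returns the label with the largest total weight.
    If several labels tie, the one seen first in `decisions` wins.
    """
    totals = defaultdict(float)
    for label, weight in decisions:
        totals[label] += weight
    return max(totals, key=totals.get)


# Assumed example: two facial-expression classifiers and one
# physiological-signal classifier, each with a made-up confidence weight.
decisions = [("joy", 0.6), ("fear", 0.9), ("joy", 0.5)]
fused = weighted_vote(decisions)
```

Here the two "joy" votes together outweigh the single, more confident "fear" vote, which is the behaviour a voting scheme adds over simply trusting the most confident classifier.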
Abdat, Faiza. "Reconnaissance automatique des émotions par données multimodales : expressions faciales et des signaux physiologiques". Electronic Thesis or Diss., Metz, 2010. http://www.theses.fr/2010METZ035S.
Al Chanti, Dawood. "Analyse Automatique des Macro et Micro Expressions Faciales : Détection et Reconnaissance par Machine Learning". Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAT058.
Facial expression analysis is an important problem in many biometric tasks, such as face recognition, face animation, affective computing and human-computer interfaces. In this thesis, we aim to analyze facial expressions using images and video sequences. We divide the problem into three main parts. First, we study Macro Facial Expressions for Emotion Recognition and propose three different levels of feature representation: low-level features through a Bag of Visual Words model, mid-level features through Sparse Representation, and hierarchical features through a Deep Learning based method. The objective is to find the most effective and efficient representation that contains distinctive information about expressions and that overcomes various challenges arising from: 1) intrinsic factors such as appearance and expressiveness variability and 2) extrinsic factors such as illumination, pose, scale and imaging parameters, e.g., resolution, focus, imaging, noise. Then, we incorporate the time dimension to extract spatio-temporal features, with the objective of describing subtle feature deformations to discriminate ambiguous classes. Second, we direct our research toward transfer learning, where we aim at Adapting Facial Expression Category Models to New Domains and Tasks. We therefore study domain adaptation and zero-shot learning to develop a method that solves the two tasks jointly. Our method is suitable for unlabelled target datasets coming from data distributions different from the source domain and for unlabelled target datasets with different label distributions but sharing the same context as the source domain. Therefore, to permit knowledge transfer between domains and tasks, we use Euclidean learning and Convolutional Neural Networks to design a mapping function that maps the visual information coming from facial expressions into a semantic space coming from a Natural Language model that encodes the visual attribute description or uses the label information. The consistency between the two subspaces is maximized by aligning them using the visual feature distribution. Third, we study Micro Facial Expression Detection. We propose an algorithm to spot micro-expression segments, including the onset and offset frames, and to spatially pinpoint in each image the regions involved in the micro-facial muscle movements. The problem is formulated as Anomaly Detection because micro-expressions occur infrequently and thus yield little data compared to natural facial behaviours. First, we propose a deep Recurrent Convolutional Auto-Encoder to capture spatial and motion feature changes of natural facial behaviours. Then, a statistical model based on a Gaussian Mixture Model is learned to estimate the probability density function of normal facial behaviours while associating a discriminating score to spot micro-expressions. Finally, an adaptive thresholding technique for distinguishing micro-expressions from natural facial behaviour is proposed. Our algorithms are tested on deliberate and spontaneous facial expression benchmarks.
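The last step of the micro-expression pipeline described in this abstract (thresholding per-frame anomaly scores to recover onset and offset frames) can be sketched as follows in Python. The abstract does not state which adaptive threshold is used, so the mean-plus-k-standard-deviations rule here is an assumption, as are the function name `spot_segments`, the parameter `k`, and the synthetic scores.

```python
from statistics import mean, pstdev


def spot_segments(scores, k=2.0):
    """Return (threshold, segments) for a list of per-frame anomaly scores.

    Assumed rule: a frame is anomalous when its score exceeds
    mean + k * standard deviation of the whole sequence.
    Each segment is an (onset_frame, offset_frame) pair, inclusive.
    """
    threshold = mean(scores) + k * pstdev(scores)
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i  # onset: first frame above the threshold
        elif score <= threshold and start is not None:
            segments.append((start, i - 1))  # offset: last frame above it
            start = None
    if start is not None:  # sequence ends while still above the threshold
        segments.append((start, len(scores) - 1))
    return threshold, segments


# Synthetic scores: mostly "normal" frames with one short burst.
scores = [0.1] * 20 + [0.9, 1.0, 0.9] + [0.1] * 20
threshold, segments = spot_segments(scores)
```

Because the threshold is derived from the sequence itself, it adapts to the overall score level of each video rather than relying on a fixed global cut-off.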
Ouzar, Yassine. "Reconnaissance automatique sans contact de l'état affectif de la personne par fusion physio-visuelle à partir de vidéo du visage". Electronic Thesis or Diss., Université de Lorraine, 2023. http://www.theses.fr/2023LORR0076.
Human affective state recognition remains a challenging topic due to the complexity of emotions, which involve experiential, behavioral, and physiological elements. Since it is difficult to describe emotion comprehensively in terms of a single modality, recent studies have focused on fusion strategies that exploit the complementarity of multimodal signals using artificial intelligence approaches. The main objective is to study the feasibility of a physio-visual fusion for the recognition of the affective state of the person (emotions/stress) from facial videos. The fusion of facial expressions and physiological signals makes it possible to take advantage of each modality. Facial expressions are easy to acquire and provide an external view of the affective state, while physiological signals improve reliability and address the problem of falsified facial expressions. The research developed in this thesis lies at the intersection of artificial intelligence, affective computing, and biomedical engineering. Our contribution focuses on two points. First, we propose a new end-to-end approach for instantaneous pulse rate estimation directly from facial video recordings using the principle of imaging photoplethysmography (iPPG). This method is based on a deep spatio-temporal network (X-iPPGNet) that learns the iPPG concept from scratch, without incorporating prior knowledge or going through manual iPPG signal extraction. The second contribution focuses on a physio-visual fusion for spontaneous emotion and stress recognition from facial videos. The proposed model includes two pipelines to extract the features of each modality. The physiological pipeline is common to both the emotion and stress recognition systems. It is based on MTTS-CAN, a recent method for estimating the iPPG signal, while two distinct neural models were used to predict the person's emotions and stress from the visual information contained in the video (e.g. facial expressions): a spatio-temporal network combining the Squeeze-Excitation module and the Xception architecture for estimating the emotional state, and a transfer learning approach for estimating the stress level. This approach reduces development effort and overcomes the lack of data. A fusion of physiological and facial features is then performed to predict the emotional or stress states.
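To make the notion of pulse rate estimation from an iPPG signal concrete, here is a minimal Python sketch. It is not the thesis's method: X-iPPGNet is a deep network operating on video, whereas this is a classical zero-crossing estimate applied to an already-extracted, synthetic pulse signal. The function name `pulse_rate_bpm`, the 30 frames-per-second rate, and the 1.25 Hz test signal are assumptions.

```python
import math


def pulse_rate_bpm(signal, fs):
    """Estimate pulse rate (beats per minute) from rising zero crossings.

    `signal` is a 1-D pulse waveform sampled at `fs` Hz. The mean is removed
    first; each rising zero crossing is then counted as one beat.
    """
    offset = sum(signal) / len(signal)
    centered = [x - offset for x in signal]
    crossings = [
        i for i in range(1, len(centered))
        if centered[i - 1] < 0 <= centered[i]
    ]
    if len(crossings) < 2:
        raise ValueError("need at least two beats to estimate a rate")
    beats = len(crossings) - 1  # number of full beat-to-beat intervals
    seconds = (crossings[-1] - crossings[0]) / fs
    return 60.0 * beats / seconds


# Synthetic 8-second signal: a clean 1.25 Hz sinusoid (75 beats per minute).
fs = 30.0
signal = [math.sin(2 * math.pi * 1.25 * n / fs + 0.5) for n in range(240)]
bpm = pulse_rate_bpm(signal, fs)
```

A real iPPG signal is far noisier than this sinusoid, which is one reason learned, end-to-end estimators such as the one described in the abstract are attractive.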
Alashkar, Taleb. "3D dynamic facial sequences analysis for face recognition and emotion detection". Thesis, Lille 1, 2015. http://www.theses.fr/2015LIL10109/document.
In this thesis, we have investigated the problems of identity recognition and emotion detection from 3D facial shape animations (called 4D faces). In particular, we have studied the role of facial shape dynamics in revealing human identity and spontaneously exhibited emotion. To this end, we have adopted a comprehensive geometric framework for analyzing 3D faces and their dynamics across time. A sequence of 3D faces is first split into an indexed collection of short-term sub-sequences, each represented as a matrix (subspace) lying on a special matrix manifold called the Grassmann manifold (the set of k-dimensional linear subspaces). The geometry of the underlying space is used to effectively compare the 3D sub-sequences, compute statistical summaries (e.g. the sample mean) and densely quantify the divergence between subspaces. Two different representations have been proposed to address the problems of face recognition and emotion detection. They are respectively (1) a dictionary (of subspaces) representation associated with Dictionary Learning and Sparse Coding techniques and (2) a time-parameterized curve (trajectory) representation on the underlying space associated with the Structured-Output SVM classifier for early emotion detection. Experimental evaluations conducted on the publicly available BU-4DFE, BU4D-Spontaneous and Cam3D Kinect datasets illustrate the effectiveness of these representations and of the algorithmic solutions for identity recognition and emotion detection proposed in this thesis.
Moufidi, Abderrazzaq. "Machine Learning-Based Multimodal integration for Short Utterance-Based Biometrics Identification and Engagement Detection". Electronic Thesis or Diss., Angers, 2024. http://www.theses.fr/2024ANGE0026.
The rapid advancement and democratization of technology have led to an abundance of sensors. Consequently, the integration of these diverse modalities presents an advantage for numerous real-life applications, such as biometric recognition and engagement detection. In the field of multimodality, researchers have developed various fusion architectures, ranging from early and hybrid to late fusion approaches. However, these architectures may have limitations with short utterances and brief video segments, necessitating a paradigm shift towards multimodal machine learning techniques that promise precision and efficiency for short-duration data analysis. In this thesis, we rely on multimodal integration to tackle these challenges, ranging from supervised biometric identification to unsupervised student engagement detection. This PhD began with a first contribution on the integration of the multiscale Wavelet Scattering Transform with the x-vectors architecture, through which we enhanced the accuracy of speaker identification in scenarios involving short utterances. Building on the benefits of multimodality, a late fusion architecture combining lip depth videos and audio signals further improved identification accuracy on short utterances, using effective and less computationally demanding methods to extract spatiotemporal features. Among the challenges facing biometrics is the emerging threat of deepfakes. Therefore, we focused on developing deepfake detection methods based on shallow learning and on a fine-tuned version of our previous late fusion architecture, applied to RGB lip videos and audio. By employing hand-crafted anomaly detection methods for both audio and visual modalities, the study demonstrated robust detection capabilities across various datasets and conditions, emphasizing the importance of multimodal approaches in countering evolving deepfake techniques. Expanding to educational contexts, the dissertation explores multimodal student engagement detection in classrooms. Using low-cost sensors to capture heart rate signals and facial expressions, the study developed a reproducible dataset and pipeline for identifying significant moments, accounting for cultural nuances. The analysis of facial expressions using a Vision Transformer (ViT) fused with heart rate signal processing, validated through expert observations, showcased the potential for real-time monitoring to enhance educational outcomes through timely interventions.
Allaert, Benjamin. "Analyse des expressions faciales dans un flux vidéo". Thesis, Lille 1, 2018. http://www.theses.fr/2018LIL1I021/document.
Facial expression recognition has attracted great interest over the past decade in a wide range of application areas, such as human behavior analysis, e-health and marketing. In this thesis we explore a new approach that takes a step toward in-the-wild expression recognition. Special attention has been paid to encoding both small and large facial expression amplitudes, and to analyzing facial expressions in the presence of varying head pose. The first challenge addressed concerns varying facial expression amplitudes. We propose an innovative motion descriptor called LMP. This descriptor takes into account the mechanical deformation properties of facial skin. When extracting motion information from the face, the approach deals with inconsistencies and noise caused by face characteristics. The main originality of our approach is that it handles both micro and macro expression recognition within the same facial recognition framework. The second challenge addressed concerns large head pose variations. In facial expression analysis, the face registration step must ensure that minimal deformation appears. Registration techniques must be used with care in the presence of unconstrained head pose, as facial texture transformations apply. Hence, it is valuable to estimate the impact of alignment-induced noise on the global recognition performance. For this, we propose a new database, called SNaP-2DFe, which makes it possible to study the impact of head pose and intra-facial occlusions on expression recognition approaches. We show that face registration approaches do not seem adequate for preserving the features encoding facial expression deformations.
Deramgozin, Mohammadmahdi. "Développement de modèles de reconnaissance des expressions faciales à base d’apprentissage profond pour les applications embarquées". Electronic Thesis or Diss., Université de Lorraine, 2023. http://www.theses.fr/2023LORR0286.
The field of Facial Emotion Recognition (FER) is pivotal in advancing human-machine interactions and finds essential applications in healthcare for conditions like depression and anxiety. Leveraging Convolutional Neural Networks (CNNs), this thesis presents a progression of models aimed at optimizing emotion detection and interpretation. The initial model is resource-frugal but competes favorably with state-of-the-art solutions, making it a strong candidate for embedded systems constrained in computational and memory resources. To capture the complexity and ambiguity of human emotions, the research work presented in this thesis enhances this CNN-based foundational model by incorporating facial Action Units (AUs). This approach not only refines emotion detection but also provides interpretability by identifying specific AUs tied to each emotion. Further sophistication is achieved by introducing neural attention mechanisms (both spatial and channel-based), improving the model's focus on salient facial features. This makes the CNN-based model well adapted to real-world scenarios, such as partially obscured or subtle facial expressions. Based on the previous results, we finally propose an optimized yet computationally efficient CNN model that is ideal for resource-limited environments like embedded systems. While it provides a robust solution for FER, this research also identifies perspectives for future work, such as real-time applications and advanced techniques for model interpretability.
Ruiz Hernandez, John Alexander. "Analyse faciale avec dérivées Gaussiennes". PhD thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00646718.
Ruiz Hernandez, John Alexander. "Analyse faciale avec dérivées Gaussiennes". Thesis, Grenoble, 2011. http://www.theses.fr/2011GRENM039/document.
In this thesis, we propose to model facial images using Gaussian derivatives computed with a half-octave Gaussian pyramid. Gaussian derivatives have shown high versatility in object recognition and image analysis; nevertheless, few state-of-the-art approaches use Gaussian derivatives to extract important information from facial images. Motivated by this and by the large number of applications in facial analysis, security systems and biometrics, we propose, for the first time, to use a single image representation, the Gaussian scale space computed with a half-octave pyramid. We show that this image representation can be used to perform different facial analysis tasks without loss of performance compared with other state-of-the-art approaches that use more complicated image representations. It is also well known that using a single image representation can be convenient in real-world applications where memory capacity is limited by hardware constraints. To support our assumptions, we solve three different facial analysis tasks: face detection, face recognition and age estimation. For face detection, we propose a cascade of classifiers using Gaussian derivatives. Specifically, we propose to use Gaussian derivatives up to the fourth order; experiments using different derivative orders have shown that fourth-order Gaussian derivatives provide important information for face detection and recognition. In addition, to improve detection speed with Gaussian derivatives, we develop a new cascade architecture that takes into account the computational cost of each Gaussian derivative order to choose its best position in the cascade. Finally, to solve the face recognition and age estimation problems, we propose a tensorial model based on Gaussian derivatives. This tensorial model preserves the 3-D structure of the feature space and does not break the natural structure of the data when a vectorization process is applied. Each of the methods proposed in the thesis is discussed and validated with a set of well-defined experiments. All our results are compared with the latest state-of-the-art results in face detection, recognition and age estimation, giving comparable or superior results.
Grossard, Charline. "Evaluation et rééducation des expressions faciales émotionnelles chez l’enfant avec TSA : le projet JEMImE Serious games to teach social interactions and emotions to individuals with autism spectrum disorders (ASD) Children facial expression production : influence of age, gender, emotion subtype, elicitation condition and culture". Thesis, Sorbonne université, 2019. http://www.theses.fr/2019SORUS625.
Autism spectrum disorder (ASD) is characterized by difficulties in social skills, such as emotion recognition and production. Several studies have focused on the recognition of emotional facial expressions (EFE), but few have addressed their production, either in typical children or in children with ASD. Nowadays, information and communication technologies are used to work on social skills in ASD, but few studies using these technologies focus on EFE production. In a literature review, we found only 4 games addressing EFE production. Our final goal was to create the serious game JEMImE to work on EFE production with children with ASD using automatic feedback. We first created a dataset of EFE from typical children and children with ASD to train an EFE recognition algorithm and to study their production skills. Several factors modulate these skills, such as age, type of emotion or culture. We observed that human judges and the algorithm assess the quality of the EFE of children with ASD as poorer than the EFE of typical children. Also, the EFE recognition algorithm needs more features to classify their EFE. We then integrated the algorithm into JEMImE to give the child real-time visual feedback to correct his/her productions. A pilot study including 23 children with ASD showed that children are able to adapt their productions thanks to the feedback given by the algorithm and indicated an overall good subjective experience with JEMImE. The beta version of JEMImE shows promising potential and encourages further development of the game in order to offer longer game exposure to children with ASD and thus allow a reliable assessment of the effect of this training on their production of EFE.
Maalej, Ahmed. "3D Facial Expressions Recognition Using Shape Analysis and Machine Learning". Thesis, Lille 1, 2012. http://www.theses.fr/2012LIL10025/document.
Facial expression recognition is a challenging task that has received growing interest within the research community, impacting important applications in fields related to human-machine interaction (HMI). Toward building human-like, emotionally intelligent HMI devices, scientists are trying to include the essence of the human emotional state in such systems. The recent development of 3D acquisition sensors has made 3D data more available, and this kind of data alleviates the problems inherent in 2D data, such as illumination, pose and scale variations as well as low resolution. Several 3D facial databases are publicly available for researchers in the field of face and facial expression recognition to validate and evaluate their approaches. This thesis deals with the facial expression recognition (FER) problem and proposes an approach based on shape analysis to handle both static and dynamic FER tasks. Our approach includes the following steps: first, a curve-based representation of the 3D face model is proposed to describe facial features. Then, once these curves are extracted, their shape information is quantified using a Riemannian framework. We end up with similarity scores between different local facial shapes, which constitute feature vectors associated with each facial surface. Afterwards, these features are used as inputs to machine learning and classification algorithms to recognize expressions. Extensive experiments are carried out to validate our approach, and the results are presented and compared with those of related work.
Baccouche, Moez. "Apprentissage neuronal de caractéristiques spatio-temporelles pour la classification automatique de séquences vidéo". PhD thesis, INSA de Lyon, 2013. http://tel.archives-ouvertes.fr/tel-00932662.
Yang, Yu-Fang. "Contribution des caractéristiques diagnostiques dans la reconnaissance des expressions faciales émotionnelles : une approche neurocognitive alliant oculométrie et électroencéphalographie". Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS099/document.
Proficient recognition of facial expression is crucial for social interaction. Behaviour, event-related potentials (ERPs), and eye-tracking techniques can be used to investigate the underlying brain mechanisms supporting this seemingly effortless processing of facial expression. Facial expression recognition involves not only the extraction of expressive information from diagnostic facial features, known as part-based processing, but also the integration of featural information, known as configural processing. Despite the critical role of diagnostic features in emotion recognition and extensive research in this area, it is still not known how the brain decodes configural information in terms of emotion recognition. The complexity of facial information integration becomes evident when comparing performance between healthy subjects and individuals with schizophrenia, because those patients tend to process featural information on emotional faces. Different ways of examining faces may affect the social-cognitive ability to recognize emotions. Therefore, this thesis investigates the role of diagnostic features and face configuration in the recognition of facial expression. In addition to behavior, we examined both the spatiotemporal dynamics of fixations using eye-tracking and early neurocognitive sensitivity to faces as indexed by the P100 and N170 ERP components. To address these questions, we built a new set of sketch face stimuli by transforming photographed faces from the Radboud Faces Database, removing facial texture and retaining only the diagnostic features (e.g., eyes, nose, mouth), with neutral and four facial expressions: anger, sadness, fear, happiness. Sketch faces supposedly impair configural processing in comparison with photographed faces, resulting in increased sensitivity to diagnostic features through part-based processing. The direct comparison of neurocognitive measures between sketch and photographed faces expressing basic emotions has never been tested. In this thesis, we examined (i) eye fixations as a function of stimulus type, and (ii) the neuroelectric response to experimental manipulations such as face inversion and deconfiguration. These methods aimed to reveal which type of face processing drives emotion recognition and to establish neurocognitive markers of the processing of emotional sketch and photographed faces. Overall, the behavioral results showed that sketch faces convey sufficient expressive information (content of diagnostic features), as photographed faces do, for emotion recognition. There was a clear emotion recognition advantage for happy expressions as compared to other emotions. In contrast, recognizing sad and angry faces was more difficult. Concomitantly, the eye-tracking results showed that participants employed more part-based processing on sketch and photographed faces during the second fixation. Extracting information from the eyes is needed when the expression conveys more complex emotional information and when stimuli are impoverished (e.g., sketch). Using electroencephalography (EEG), the P100 and N170 components were used to study the effects of stimulus type (sketch, photographed), orientation (inverted, upright), and deconfiguration, and their possible interactions. The results also suggest that sketch faces evoked more part-based processing. The cues conveyed by diagnostic features might have been subjected to early processing, likely driven by low-level information during the P100 time window, followed by a later decoding of facial structure and its emotional content in the N170 time window. In sum, this thesis helps elucidate elements of the debate about configural and part-based face processing for emotion recognition, and extends our current understanding of the role of diagnostic features and configural information during neurocognitive processing of facial expressions of emotion.
Ballihi, Lahoucine. "Biométrie faciale 3D par apprentissage des caractéristiques géométriques : Application à la reconnaissance des visages et à la classification du genre". PhD thesis, Université des Sciences et Technologie de Lille - Lille I, 2012. http://tel.archives-ouvertes.fr/tel-00726299.
Dagnes, Nicole. "3D human face analysis for recognition applications and motion capture". Thesis, Compiègne, 2020. http://www.theses.fr/2020COMP2542.
This thesis is intended as a geometrical study of the three-dimensional facial surface. Its aim is to provide an application framework in which entities from Differential Geometry are used as facial descriptors in face analysis applications, such as face recognition (FR) and facial expression recognition (FER). Indeed, although every face is unique, all faces are similar and their morphological features are the same for all mankind. Hence, extracting suitable features is essential for face analysis. All the facial features proposed in this study are based only on the geometrical properties of the facial surface. These geometrical descriptors and the related entities have then been applied to the description of the facial surface in pattern recognition contexts. The final goal of this research is to show that Differential Geometry is a comprehensive tool for face analysis and that geometrical features are suitable for describing and comparing faces and, more generally, for extracting relevant information for human face analysis in different practical application fields. Finally, since face analysis has also gained great attention for clinical applications in recent decades, this work focuses on the analysis of musculoskeletal disorders by proposing an objective quantification of facial movements to support maxillofacial surgery and facial motion rehabilitation. At present, different methods are employed for evaluating facial muscle function. This research work investigates the 3D motion capture system, adopting the Technology, Sport and Health platform, located in the Innovation Centre of the University of Technology of Compiègne, in the Biomechanics and Bioengineering Laboratory (BMBI).
Peyrard, Clément. "Single image super-resolution based on neural networks for text and face recognition". Thesis, Lyon, 2017. http://www.theses.fr/2017LYSEI083/document.
This thesis focuses on super-resolution (SR) methods for improving automatic recognition systems (Optical Character Recognition, face recognition) in realistic contexts. SR methods generate high-resolution images from low-resolution ones. Unlike upsampling methods such as interpolation, they restore high spatial frequencies and compensate for artefacts such as blur or jagged edges. In particular, example-based approaches learn and model the relationship between the low- and high-resolution spaces from pairs of low- and high-resolution images. Artificial Neural Networks are among the most efficient systems for addressing this problem. This work demonstrates the value of SR methods based on neural networks for improving automatic recognition systems. By adapting the data, it is possible to train such Machine Learning algorithms to produce high-resolution images. Convolutional Neural Networks are especially efficient because they are trained to extract relevant non-linear features while simultaneously learning the mapping between the low- and high-resolution spaces. On document text images, the proposed method improves OCR accuracy by 7.85 points compared with simple interpolation. The creation of an annotated image dataset and the organisation of an international competition (ICDAR2015) highlighted the interest in and relevance of such approaches. Moreover, if a priori knowledge is available, a suitable network architecture can exploit it. For facial images, face features are critical for automatic recognition. A two-step method is proposed in which image resolution is first improved, followed by specialised models that focus on the essential features. The performance of an off-the-shelf face verification system improves by 6.91 to 8.15 points. Finally, to address the variability of real-world low-resolution images, deep neural networks can absorb the diversity of the blurring kernels that characterise low-resolution images. With a single model, high-resolution images are produced with natural image statistics, without any knowledge of the actual observation model of the low-resolution image.
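The abstract compares learned super-resolution against "simple interpolation". For orientation, here is a minimal sketch of such a baseline, assuming bilinear interpolation with aligned corners (the abstract does not say which interpolation was used). The function name `bilinear_upsample` and the list-of-rows image format are choices made for this example.

```python
def bilinear_upsample(image, factor):
    """Upsample a 2D grayscale image (list of rows) by an integer factor.

    Uses bilinear interpolation with aligned corners: the four corner
    pixels of the output equal the four corner pixels of the input.
    """
    height, width = len(image), len(image[0])
    out_h, out_w = height * factor, width * factor
    result = []
    for i in range(out_h):
        # Map the output row index back to a fractional source row.
        src_y = i * (height - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(src_y)
        y1 = min(y0 + 1, height - 1)
        wy = src_y - y0
        row = []
        for j in range(out_w):
            src_x = j * (width - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(src_x)
            x1 = min(x0 + 1, width - 1)
            wx = src_x - x0
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        result.append(row)
    return result


if __name__ == "__main__":
    low_res = [[0, 10], [20, 30]]
    for row in bilinear_upsample(low_res, 2):
        print(row)
```

Every output pixel is a weighted average of at most four input pixels, so the result never leaves the input intensity range and contains no detail that was absent from the input. That is the limitation the abstract contrasts with example-based SR, which learns to restore high spatial frequencies.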
Baklouti, Malek. "Localisation du visage et extraction des éléments faciaux, pour la conception d'un mode d'interaction homme-machine". Versailles-St Quentin en Yvelines, 2009. http://www.theses.fr/2009VERS0035.
This work deals with Human-Machine Interfaces for assistive robotic systems. Assistive systems should be endowed with interfaces specifically designed for disabled people, so that they can control the system in the most natural and least tiring way. This is the primary concern of this work. More precisely, we were interested in developing a vision-based interface driven by the user's head movements. The problem was tackled incrementally according to the acquisition system used: first a monocular camera, then a stereoscopic one. With a monocular camera, we proposed a new approach for learning faces using a committee of neural networks generated with the well-known AdaBoost algorithm. We proposed training the neural networks on reduced-space Haar-like features instead of on the image pixels themselves. In the second part, we tackle fine head pose estimation using a stereo vision approach. The framework can be broken down into two parts: the first estimates the 3D point set from the stereoscopic acquisition, and the second aligns a Candide-1 model with that 3D point set. Once aligned, the transformation matrix of the Candide model gives the head pose parameters.
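To make "Haar-like features" concrete, the sketch below shows the usual way such a feature is evaluated: a summed-area table (integral image) gives any rectangle sum in four lookups, and a two-rectangle feature is the difference between two adjacent rectangle sums. This is the generic textbook construction. The abstract does not say which feature set or which dimensionality reduction the thesis uses, and all function names here are chosen for this example.

```python
def integral_image(image):
    """Summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of image[r][c] for all r < y and c < x.
    """
    height, width = len(image), len(image[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half.

    The width is expected to be even so that both halves have equal area.
    """
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


if __name__ == "__main__":
    patch = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
    table = integral_image(patch)
    print(haar_two_rect_horizontal(table, 0, 0, 2, 4))
```

A vector of such responses, computed at several positions and scales, would then replace the raw pixels as classifier input.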
Morabit, Safaa El. "New Artificial Intelligence techniques for Computer vision based medical diagnosis". Electronic Thesis or Diss., Valenciennes, Université Polytechnique Hauts-de-France, 2023. http://www.theses.fr/2023UPHF0013.
The ability to feel pain is crucial for life, since it serves as an early warning system for potential harm to the body. The majority of pain evaluations rely on patient reports. Patients who are unable to express their own pain must instead rely on third-party reports of their suffering. Due to potential observer bias, pain reports may contain inaccuracies. In addition, it would be impossible for people to keep watch around the clock. In order to better manage pain, especially in noncommunicative patients, automatic pain detection technologies might be implemented to aid human caregivers and complement their service. Facial expressions are used by all observer-based pain assessment systems because they are a reliable indicator of pain and can be interpreted from a distance. Taking into consideration that pain generally generates spontaneous facial behavior, these facial expressions could be used to detect the presence of pain. In this thesis, we analyze facial expressions of pain in order to address pain estimation. First, we present a thorough analysis of the problem by comparing numerous common CNN (Convolutional Neural Network) architectures, such as MobileNet, GoogleNet, ResNeXt-50, ResNet18, and DenseNet-161. We employ these networks in two distinct modes: standalone and feature extraction. In standalone mode, the models (i.e., networks) are used to estimate pain directly. In feature extractor mode, "values" from the middle layer are extracted and fed into classifiers such as Support Vector Regression (SVR) and Random Forest Regression (RFR). CNNs have achieved significant results in image classification, and recent studies have demonstrated the effectiveness of Transformers in computer vision. Transformer-based architectures are therefore proposed in the second part of this thesis. Two distinct Transformer-based frameworks are presented to address two distinct pain problems: pain detection (pain vs. no pain) and the distinction between genuine and posed pain. The architecture for binary identification of facial pain is based on data-efficient image transformers (DeiT). Two datasets, UNBC-McMaster shoulder pain and BioVid heat pain, were used to fine-tune and assess the trained model. For detecting genuine and simulated pain from facial expressions, the proposed architecture is built on Vision Transformers (ViT). To distinguish between genuine and posed pain, the model must pay particular attention to the subtle changes in facial expressions over time. The employed approach takes the sequential aspect into account and captures the variations in facial expressions. Experiments on the publicly accessible BioVid Heat Pain Database demonstrate the efficacy of our strategy.
Mercier, Hugo. "Modélisation et suivi des déformations faciales : applications à la description des expressions du visage dans le contexte de la langue des signes". Phd thesis, Université Paul Sabatier - Toulouse III, 2007. http://tel.archives-ouvertes.fr/tel-00185084.
The Active Appearance Models (AAM) formalism is used here to model the face in terms of the displacements of a number of points of interest and in terms of texture variations. When combined with an optimization method, this formalism makes it possible to find the coordinates of the points of interest on a face. We use an optimization method known as "inverse compositional", which allows an efficient implementation and yields accurate results.
In the context of sign language, out-of-plane rotations and occlusions by the hands are frequent. It is therefore necessary to develop methods that are robust to these conditions. For this purpose there is a robust variant of the AAM optimization methods that can handle a possibly noisy input image.
We extended this variant so that occlusions can be detected automatically, assuming that the behaviour of the algorithm in the non-occluded case is known.
The output of the algorithm then consists of the 2D coordinates of each point of interest of the model in each image of a video sequence, possibly associated with a confidence score. These raw data can then be used in several applications.
As a first application, we propose describing an expressive video sequence at each instant as a combination of unit deformations activated at different intensities. Another original application consists in processing a video so as to prevent the identification of a face without disturbing the recognition of its expressions.
Dahmani, Sara. "Synthèse audiovisuelle de la parole expressive : modélisation des émotions par apprentissage profond". Electronic Thesis or Diss., Université de Lorraine, 2020. http://www.theses.fr/2020LORR0137.
The work of this thesis concerns the modeling of emotions for expressive audiovisual text-to-speech synthesis. Today, text-to-speech synthesis systems produce good-quality results; however, audiovisual synthesis remains an open issue, and expressive synthesis is even less studied. In this thesis, we present an emotion modeling method that is malleable and flexible and allows us to mix emotions as we mix shades on a palette of colors. In the first part, we present and study two expressive corpora that we have built. The recording strategy and the expressive content of these corpora are analyzed to validate their use for audiovisual speech synthesis. In the second part, we present two neural architectures for speech synthesis. We used these two architectures to model three aspects of speech: 1) the duration of sounds, 2) the acoustic modality and 3) the visual modality. First, we use a fully connected architecture. This architecture allowed us to study the behavior of neural networks when dealing with different contextual and linguistic descriptors. We were also able to analyze, with objective measures, the network's ability to model emotions. The second neural architecture proposed is a variational auto-encoder. This architecture is able to learn a latent representation of emotions without using emotion labels. After analyzing the latent space of emotions, we present a procedure for structuring it in order to move from a discrete representation of emotions to a continuous one. We were able to validate, through perceptual experiments, the ability of our system to generate emotions, nuances of emotions and mixtures of emotions for expressive audiovisual text-to-speech synthesis.
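One simple way to picture "mixing emotions as we mix shades on a palette" in a continuous latent space is a weighted average of latent vectors. The sketch below shows only that idea. The latent codes, their dimensionality and the function `mix_latents` are hypothetical, and the abstract does not describe the thesis's actual procedure for structuring the latent space.

```python
def mix_latents(latents, weights):
    """Return the convex combination of latent vectors with the given weights.

    Weights must be non-negative and are normalised to sum to 1, so the
    result stays inside the region spanned by the input vectors.
    """
    if not latents or len(latents) != len(weights):
        raise ValueError("need one weight per latent vector")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    dim = len(latents[0])
    return [
        sum((w / total) * z[k] for z, w in zip(latents, weights))
        for k in range(dim)
    ]


if __name__ == "__main__":
    # Hypothetical 2-D latent codes standing in for two emotion clusters.
    joy = [1.0, 0.0]
    sadness = [0.0, 1.0]
    print(mix_latents([joy, sadness], [3, 1]))
```

In a variational auto-encoder, a vector mixed in this way would be passed to the decoder to synthesise the blended expression. Whether the blend is perceived as intended is an empirical question, which is what the perceptual experiments mentioned in the abstract address.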