Theses on the topic "Robust Representations"

The top 50 works (undergraduate or doctoral theses) for research on the topic "Robust Representations".

1

Tran, Thi Quynh Nhi. "Robust and comprehensive joint image-text representations". Thesis, Paris, CNAM, 2017. http://www.theses.fr/2017CNAM1096/document.

Abstract:
This thesis investigates the joint modeling of the visual and textual content of multimedia documents to address cross-modal problems. Such tasks require the ability to match information across modalities. A generally adopted solution is a common representation space, obtained for example by Canonical Correlation Analysis or its kernelized extension, in which images and text are both represented by vectors of the same type and can be compared directly. Nevertheless, such a joint space still suffers from several deficiencies that may hinder the performance of cross-modal tasks. An important contribution of this thesis is therefore to identify two major limitations of such a space. The first limitation concerns information that is poorly represented on the common space yet very significant for a retrieval task. The second limitation consists in a separation between modalities on the common space, which leads to coarse cross-modal matching. To deal with the first limitation concerning poorly represented data, we put forward a model which first identifies such information and then finds ways to combine it with data that is relatively well represented on the joint space. Evaluations on text illustration tasks show that by appropriately identifying and taking such information into account, the results of cross-modal retrieval can be strongly improved. The major work in this thesis aims to cope with the separation between modalities on the joint space to enhance the performance of cross-modal tasks. We propose two representation methods for bi-modal or uni-modal documents that aggregate information from both the visual and textual modalities projected on the joint space. Specifically, for uni-modal documents we suggest a completion process relying on an auxiliary dataset to find the corresponding information in the absent modality and then use such information to build a final bi-modal representation for a uni-modal document. Evaluations show that our approaches achieve state-of-the-art results on several standard and challenging datasets for cross-modal retrieval or bi-modal and cross-modal classification.
2

Tran, Thi Quynh Nhi. "Robust and comprehensive joint image-text representations". Electronic Thesis or Diss., Paris, CNAM, 2017. http://www.theses.fr/2017CNAM1096.

3

Tran, Brandon Vanhuy. "Building and using robust representations in image classification". Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/127912.

Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Mathematics, May, 2020
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 115-131).
One of the major appeals of the deep learning paradigm is the ability to learn high-level feature representations of complex data. These learned representations obviate manual data pre-processing, and are versatile enough to generalize across tasks. However, they are not yet capable of fully capturing abstract, meaningful features of the data. For instance, the pervasiveness of adversarial examples--small perturbations of correctly classified inputs causing model misclassification--is a prominent indication of such shortcomings. The goal of this thesis is to work towards building learned representations that are more robust and human-aligned. To achieve this, we turn to adversarial (or robust) training, an optimization technique for training networks less prone to adversarial inputs. Typically, robust training is studied purely in the context of machine learning security (as a safeguard against adversarial examples)--in contrast, we will cast it as a means of enforcing an additional prior onto the model. Specifically, it has been noticed that, in a similar manner to the well-known convolutional or recurrent priors, the robust prior serves as a "bias" that restricts the features models can use in classification--it does not allow for any features that change upon small perturbations. We find that the addition of this simple prior enables a number of downstream applications, from feature visualization and manipulation to input interpolation and image synthesis. Most importantly, robust training provides a simple way of interpreting and understanding model decisions. Besides diagnosing incorrect classification, this also has consequences in the so-called "data poisoning" setting, where an adversary corrupts training samples with the hope of causing misbehaviour in the resulting model. We find that in many cases, the prior arising from robust training significantly helps in detecting data poisoning.
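To make the notion of an adversarial example in this abstract concrete, here is a minimal sketch in Python. It assumes a toy linear classifier and a single signed-gradient step (in the style of the fast gradient sign method); the abstract does not say which attack or model the thesis uses, so the weights `w`, bias `b`, input `x`, budget `eps`, and the helper `signed_gradient_step` are all hypothetical.

```python
import numpy as np

# Hypothetical linear classifier: predict +1 if w.x + b > 0, otherwise -1.
w = np.array([1.0, -2.0, 0.5])
b = 0.1

def predict(x):
    """Return the predicted label (+1 or -1) of the toy linear model."""
    return 1 if w @ x + b > 0 else -1

def signed_gradient_step(x, grad, eps):
    """Move every coordinate of x by eps in the direction that increases the loss."""
    return x + eps * np.sign(grad)

x = np.array([0.2, 0.1, 0.3])  # a correctly classified input with true label +1
y = 1

# For the margin loss -y * (w.x + b), the gradient with respect to x is -y * w.
grad = -y * w
x_adv = signed_gradient_step(x, grad, eps=0.1)

print(predict(x), predict(x_adv))  # the label flips although no coordinate moved by more than eps
```

Adversarial (robust) training, as described in the abstract, would fit the model on such perturbed inputs so that features that change under small perturbations stop being useful to it.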
4

Parekh, Sanjeel. "Learning representations for robust audio-visual scene analysis". Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLT015/document.

Abstract:
The goal of this thesis is to design algorithms that enable robust detection of objects and events in videos through joint audio-visual analysis. This is motivated by humans' remarkable ability to meaningfully integrate auditory and visual characteristics for perception in noisy scenarios. To this end, we identify two kinds of natural associations between the modalities in recordings made using a single microphone and camera, namely motion-audio correlation and appearance-audio co-occurrence. For the former, we use audio source separation as the primary application and propose two novel methods within the popular non-negative matrix factorization (NMF) framework. The central idea is to utilize the temporal correlation between audio and motion for objects/actions where the sound-producing motion is visible. The first proposed method focuses on soft coupling between audio and motion representations capturing temporal variations, while the second is based on cross-modal regression. We segregate several challenging audio mixtures of string instruments into their constituent sources using these approaches. To identify and extract many commonly encountered objects, we leverage appearance-audio co-occurrence in large datasets. This complementary association mechanism is particularly useful for objects where motion-based correlations are not visible or available. The problem is dealt with in a weakly supervised setting wherein we design a representation learning framework for robust AV event classification, visual object localization, audio event detection and source separation. We extensively test the proposed ideas on publicly available datasets. The experiments demonstrate several intuitive multimodal phenomena that humans utilize on a regular basis for robust scene understanding.
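For readers unfamiliar with the non-negative matrix factorization framework this abstract builds on, the following is a minimal sketch of plain NMF with the classical multiplicative updates for the Frobenius objective. It is not the thesis's coupled audio-motion method; the function `nmf`, its parameters, and the random test matrix are assumptions made for illustration. In an audio setting, `V` would typically be a magnitude spectrogram, the columns of `W` spectral templates, and the rows of `H` their activations over time.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps  # strictly positive initialisation
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic non-negative matrix of exact rank 2 (stand-in for a spectrogram).
rng = np.random.default_rng(1)
V = rng.random((6, 2)) @ rng.random((2, 8))

W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

The coupling with motion features that the abstract describes would add further terms to this objective; those terms are not reproduced here.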
5

Parekh, Sanjeel. "Learning representations for robust audio-visual scene analysis". Electronic Thesis or Diss., Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLT015.

6

Herdtweck, Christian [author], and Heinrich Bülthoff [academic supervisor]. "Learning Data-Driven Representations for Robust Monocular Computer Vision Applications". Tübingen: Universitätsbibliothek Tübingen, 2014. http://d-nb.info/1162897317/34.

7

Xu, Guanglin. "Optimization under uncertainty: conic programming representations, relaxations, and approximations". Diss., University of Iowa, 2017. https://ir.uiowa.edu/etd/5881.

Abstract:
In practice, the presence of uncertain parameters in optimization problems introduces new challenges in modeling and solvability to operations research. There are three main paradigms proposed for optimization problems under uncertainty. These include stochastic programming, robust optimization, and sensitivity analysis. In this thesis, we examine, improve, and combine the latter two paradigms in several relevant models and applications. In the second chapter, we study a two-stage adjustable robust linear optimization problem in which the right-hand sides are uncertain and belong to a compact, convex, and tractable uncertainty set. Under standard and simple assumptions, we reformulate the two-stage problem as a copositive optimization program, which in turn leads to a class of tractable semidefinite-based approximations that are at least as strong as the affine policy, which is a well-studied tractable approximation in the literature. We examine our approach over several examples from the literature and the results demonstrate that our tractable approximations significantly improve on the affine policy. In particular, our approach recovers the optimal values of a class of instances of increasing size for which the affine policy admits an arbitrarily large gap. In the third chapter, we leverage the concept of robust optimization to conduct sensitivity analysis of the optimal value of linear programming (LP). In particular, we propose a framework for sensitivity analysis of LP problems, allowing for simultaneous perturbations in the objective coefficients and right-hand sides, where the perturbations are modeled in a compact, convex, and tractable uncertainty set. This framework unifies and extends multiple approaches for LP sensitivity analysis in the literature and has close ties to worst-case LP and two-stage adjustable linear programming. We define the best-case and worst-case LP optimal values over the uncertainty set. As the concept aligns well with the general spirit of robust optimization, we denote our approach as robust sensitivity analysis. While the best-case and worst-case optimal values are difficult to compute in general, we prove that they equal the optimal values of two separate, but related, copositive programs. We then develop tight, tractable conic relaxations to provide bounds on the best-case and worst-case optimal values, respectively. We also develop techniques to assess the quality of the bounds, and we validate our approach computationally on several examples from, and inspired by, the literature. We find that the bounds are very strong in practice and, in particular, are at least as strong as known results for specific cases from the literature. In the fourth chapter of this thesis, we study the expected optimal value of a mixed 0-1 programming problem with uncertain objective coefficients following a joint distribution. We assume that the true distribution is not known exactly, but a set of independent samples can be observed. Using the Wasserstein metric, we construct an ambiguity set centered at the empirical distribution from the observed samples and containing all distributions that could have generated the observed samples with a high confidence. The problem of interest is to investigate the bound on the expected optimal value over the Wasserstein ambiguity set. Under standard assumptions, we reformulate the problem into a copositive programming problem, which naturally leads to a tractable semidefinite-based approximation. We compare our approach with a moment-based approach from the literature for two applications. The numerical results illustrate the effectiveness of our approach. Finally, we conclude the thesis with remarks on some interesting open questions in the field of optimization under uncertainty. In particular, we point out some interesting topics that can potentially be studied by copositive programming techniques.
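The best-case and worst-case optimal values mentioned in this abstract can be written down roughly as follows. This is only one possible formalization, assuming a standard-form LP; the symbols $A$, $b$, $c$, $\mathcal{U}$, $p^{-}$ and $p^{+}$ are notation chosen here and may differ from the thesis.

```latex
% LP optimal value as a function of the perturbed data (b, c)
p(b, c) = \min_{x} \left\{ c^{\top} x \;:\; A x = b,\ x \ge 0 \right\}

% best-case and worst-case values over a compact, convex uncertainty set U
p^{-} = \min_{(b, c) \in \mathcal{U}} p(b, c),
\qquad
p^{+} = \max_{(b, c) \in \mathcal{U}} p(b, c)
```

Under this reading, $p^{-} \le p(b, c) \le p^{+}$ for every $(b, c) \in \mathcal{U}$, so the conic relaxations described in the abstract bound the interval in which the nominal optimal value can move.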
8

Barbano, Carlo Alberto Maria. "Collateral-Free Learning of Deep Representations : From Natural Images to Biomedical Applications". Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAT038.

Abstract:
Deep Learning (DL) has become one of the predominant tools for solving a variety of tasks, often with superior performance compared to previous state-of-the-art methods. DL models are often able to learn meaningful and abstract representations of the underlying data. However, it has been shown that they might also learn additional features, which are not necessarily relevant or required for the desired task. This could pose a number of issues, as this additional information can contain bias, noise, or sensitive information that should not be taken into account by the model (e.g. gender, race, age). We refer to this information as collateral. The presence of collateral information translates into practical issues when deploying DL-based pipelines, especially if they involve private users' data. Learning robust representations that are free of collateral information can be highly relevant for a variety of fields and applications, like medical applications and decision support systems. In this thesis, we introduce the concept of Collateral Learning, which refers to all those instances in which a model learns more information than intended. The aim of Collateral Learning is to bridge the gap between different fields in DL, such as robustness, debiasing, generalization in medical imaging, and privacy preservation. We propose different methods for achieving robust representations free of collateral information. Some of our contributions are based on regularization techniques, while others are represented by novel loss functions. In the first part of the thesis, we lay the foundations of our work by developing techniques for robust representation learning on natural images. We focus on one of the most important instances of Collateral Learning, namely biased data. Specifically, we focus on Contrastive Learning (CL), and we propose a unified metric learning framework that allows us both to easily analyze existing loss functions and to derive novel ones. Here, we propose a novel supervised contrastive loss function, ε-SupInfoNCE, and two debiasing regularization techniques, EnD and FairKL, that achieve state-of-the-art performance on a number of standard vision classification and debiasing benchmarks. In the second part of the thesis, we focus on Collateral Learning in medical imaging, specifically on neuroimaging and chest X-ray images. For neuroimaging, we present a novel contrastive learning approach for brain age estimation. Our approach achieves state-of-the-art results on the OpenBHB dataset for age regression and shows increased robustness to the site effect. We also leverage this method to detect unhealthy brain aging patterns, showing promising results in the classification of brain conditions such as Mild Cognitive Impairment (MCI) and Alzheimer's Disease (AD). For chest X-ray (CXR) images, we target Covid-19 classification, showing how Collateral Learning can effectively hinder the reliability of such models. To tackle this issue, we propose a transfer learning approach that, combined with our regularization techniques, shows promising results on an original multi-site CXR dataset. Finally, we provide some hints about Collateral Learning and privacy preservation in DL models. We show that some of our proposed methods can be effective in preventing certain information from being learned by the model, thus avoiding potential data leakage.
9

Terzi, Matteo. "Learning interpretable representations for classification, anomaly detection, human gesture and action recognition". Doctoral thesis, Università degli studi di Padova, 2019. http://hdl.handle.net/11577/3423183.

Abstract:
The goal of this thesis is to provide algorithms and models for classification, gesture recognition and anomaly detection with a partial focus on human activity. In applications where humans are involved, it is of paramount importance to provide robust and understandable algorithms and models. One way to meet this requirement is to use relatively simple and robust approaches, especially when devices are resource-constrained. The second approach, when a large amount of data is present, is to adopt complex algorithms and models and make them robust and interpretable from a human-like point of view. This motivates our thesis, which is divided into two parts. The first part of this thesis is devoted to the development of parsimonious algorithms for action/gesture recognition in human-centric applications such as sports and anomaly detection for artificial pancreas. The data sources employed for the validation of our approaches consist of a collection of time-series data coming from sensors, such as accelerometers or glycemic sensors. The main challenge in this context is to discard (i.e. being invariant to) many nuisance factors that make the recognition task difficult, especially where many different users are involved. Moreover, in some cases, data cannot be easily labelled, making supervised approaches not viable. Thus, we present the mathematical tools and the background with a focus on the recognition problems and then we derive novel methods for: (i) gesture/action recognition using sparse representations for a sport application; (ii) gesture/action recognition using a symbolic representation and its extension to the multivariate case; (iii) model-free and unsupervised anomaly detection for detecting faults on artificial pancreas. These algorithms are well-suited to be deployed in resource-constrained devices, such as wearables. In the second part, we investigate the feasibility of deep learning frameworks where human interpretation is crucial.
Standard deep learning models are not robust and, unfortunately, literature approaches that ensure robustness are typically detrimental to accuracy. However, real-world applications often require a minimum level of accuracy to be employed. In view of this, after reviewing some results present in the recent literature, we formulate a new algorithm able to semantically trade off accuracy and robustness, where a cost-sensitive classification problem is provided and a given threshold of accuracy is required. In addition, we provide a link between robustness to input perturbations and interpretability guided by a physical minimum energy principle: in fact, leveraging optimal transport tools, we show that robust training is connected to the optimal transport problem. Thanks to these theoretical insights we develop a new algorithm that provides robust, interpretable and more transferable representations.
10

山本, 有作, and Yusaku Yamamoto. "密行列固有値解法の最近の発展(I) : Multiple Relatively Robust Representationsアルゴリズム". 日本応用数理学会, 2005. http://hdl.handle.net/2237/10838.

11

Huang, Weilin. "Robust facial representation for recognition". Thesis, University of Manchester, 2013. https://www.research.manchester.ac.uk/portal/en/theses/robust-facial-representation-for-recognition(ee2f295c-7b1a-4966-bd12-17edba43b2b4).html.

Abstract:
One of the main challenges in face recognition lies in robust representation of facial images in unconstrained real-world environments, where face appearances of the same person often vary significantly. This thesis investigates both holistic and local feature based representations, and develops several novel representation models in an effort to mitigate within-person variations and enhance discriminative power. The work first focuses on feature extraction of high-dimensional holistic representation based on intensities. Several linear and nonlinear dimensionality reduction methods are systematically compared. One of the key findings is that linear PCA has comparable performance to the most recent nonlinear methods for extracting low-dimensional facial features. Extensive experiments are conducted and results are presented to support the findings, together with a quantitative measure of nonlinearity showing theoretical insights. Following these findings, a robust framework combining an automatic outlier detector and a nearest subspace classifier is presented. The detector computes the corrupted regions of face images by measuring their reconstructive capabilities, while the classifier models face data by multiple linear subspaces.
12

Drapeau, Samuel. "Risk preferences and their robust representation". Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II, 2010. http://dx.doi.org/10.18452/16135.

Abstract:
Ziel dieser Dissertation ist es, den Begriff des Risikos unter den Aspekten seiner Quantifizierung durch robuste Darstellungen zu untersuchen. In einem ersten Teil wird Risiko anhand Kontext-Invarianter Merkmale betrachtet: Diversifizierung und Monotonie. Wir führen die drei Schlüsselkonzepte, Risikoordnung, Risikomaß und Risikoakzeptanzfamilen ein, und studieren deren eins-zu-eins Beziehung. Unser Hauptresultat stellt eine eindeutige duale robuste Darstellung jedes unterhalbstetigen Risikomaßes auf topologischen Vektorräumen her. Wir zeigen auch automatische Stetigkeitsergebnisse und robuste Darstellungen für Risikomaße auf diversen Arten von konvexen Mengen. Diese Herangehensweise lässt bei der Wahl der konvexen Menge viel Spielraum, und erlaubt damit eine Vielfalt von Interpretationen von Risiko: Modellrisiko im Falle von Zufallsvariablen, Verteilungsrisiko im Falle von Lotterien, Abdiskontierungsrisiko im Falle von Konsumströmen... Diverse Beispiele sind dann in diesen verschiedenen Situationen explizit berechnet (Sicherheitsäquivalent, ökonomischer Risikoindex, VaR für Lotterien, "variational preferences"...). Im zweiten Teil, betrachten wir Präferenzordnungen, die möglicherweise zusätzliche Informationen benötigen, um ausgedrückt zu werden. Hierzu führen wir einen axiomatischen Rahmen in Form von bedingten Präferenzordungen ein, die lokal mit der Information kompatibel sind. Dies erlaubt die Konstruktion einer bedingten numerischen Darstellung. Wir erhalten eine bedingte Variante der von Neumann und Morgenstern Darstellung für messbare stochastische Kerne und erweitern dieses Ergebnis zur einer bedingten Version der "variational preferences". Abschließend, klären wir das Zusammenpiel zwischen Modellrisiko und Verteilungsrisiko auf der axiomatischen Ebene.
The goal of this thesis is the conceptual study of risk and its quantification via robust representations. We concentrate in a first part on context invariant features related to this notion: diversification and monotonicity. We introduce and study the general properties of three key concepts, risk order, risk measure and risk acceptance family, and their one-to-one relations. Our main result is a uniquely characterized dual robust representation of lower semicontinuous risk orders on topological vector spaces. We also provide automatic continuity and robust representation results on specific convex sets. This approach allows multiple interpretations of risk depending on the setting: model risk in the case of random variables, distributional risk in the case of lotteries, discounting risk in the case of consumption streams... Various explicit computations in those different settings are then treated (economic index of riskiness, certainty equivalent, VaR on lotteries, variational preferences...). In the second part, we consider preferences which might require additional information in order to be expressed. We provide a mathematical framework for this idea in terms of preorders, called conditional preference orders, which are locally compatible with the available information. This allows us to construct conditional numerical representations of conditional preferences. We obtain a conditional version of the von Neumann and Morgenstern representation for measurable stochastic kernels and then extend it to a conditional version of the variational preferences. We finally clarify the interplay between model risk and distributional risk on the axiomatic level.
13

Lee, Chia-ying (Chia-ying Jackie). "Closed-loop auditory-based representation for robust speech recognition". Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/60176.

Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010.
Includes bibliographical references (p. 93-96).
A closed-loop auditory-based speech feature extraction algorithm is presented to address the problem of unseen noise for robust speech recognition. This closed-loop model is inspired by the possible role of the medial olivocochlear (MOC) efferent system of the human auditory periphery, which has been suggested in [6, 13, 42] to be important for human speech intelligibility in noisy environments. We propose that instead of using a fixed filter bank, the filters used in a feature extraction algorithm should be more flexible to adapt dynamically to different types of background noise. Therefore, in the closed-loop model, a feedback mechanism is designed to regulate the operating points of filters in the filter bank based on the background noise. The model is tested on a dataset created from the TIDigits database. In this dataset, five kinds of noise are added to synthesize noisy speech. Compared with the standard MFCC extraction algorithm, the proposed closed-loop form of feature extraction algorithm provides 9.7%, 9.1% and 11.4% absolute word error rate reduction on average for three kinds of filter banks respectively.
by Chia-ying Lee.
S.M.
14

Siméoni, Oriane. "Robust image representation for classification, retrieval and object discovery". Thesis, Rennes 1, 2020. https://ged.univ-rennes1.fr/nuxeo/site/esupversions/415eb65b-d5f7-4be7-85e6-c2ecb2aba4dc.

Abstract:
Les réseaux de neurones convolutifs (CNNs) ont été exploités avec succès pour la résolution de tâches dans le domaine de la vision par ordinateur tels que la classification, la segmentation d'image, la détection d'objets dans une image ou la recherche d'images dans une base de données. Typiquement, un réseau est entraîné spécifiquement pour une tâche et l'entraînement nécessite une très grande quantité d'images annotées. Dans cette thèse, nous proposons des solutions pour extraire le maximum d'information avec un minimum de supervision. D'abord, nous nous concentrons sur la tâche de classification en examinant le processus d'apprentissage actif dans le contexte de l'apprentissage profond. Nous montrons qu'en combinant l'apprentissage actif aux techniques d'apprentissage semi-supervisé et non supervisé, il est possible d'améliorer significativement les résultats. Ensuite, nous étudions la tâche de recherche d'images dans une base de données et nous exploitons les informations de localisation spatiale disponible directement dans les cartes d'activation produites par les CNNs. En première approche, nous proposons de représenter une image par une collection de caractéristiques locales, détectées dans les cartes, qui sont peu coûteuses en terme de mémoire et assez robustes pour effectuer une mise en correspondance spatiale. Alternativement, nous découvrons dans les cartes d'activation les objets d'intérêts des images d'une base de données et nous structurons leurs représentations dans un graphe de plus proches voisins. En utilisant la mesure de centralité du graphe, nous sommes capable de construire une carte de saillance, par image, qui met en lumière les objets qui se répètent et nous permet de construire une représentation globale qui exclue les objets non pertinents et d'arrière-plan
Neural network representations have proved to be relevant for many computer vision tasks such as image classification, object detection, segmentation or instance-level image retrieval. A network is trained for one particular task and requires a large amount of labeled data. We propose in this thesis solutions to extract the most information with the least supervision. First focusing on the classification task, we examine the active learning process in the context of deep learning and show that combining it with semi-supervised and unsupervised techniques greatly boosts results. We then investigate the image retrieval task, and in particular we exploit the spatial localization information available "for free" in CNN feature maps. We first propose to represent an image by a collection of affine local features detected within activation maps, which are memory-efficient and robust enough to perform spatial matching. Then, again extracting information from feature maps, we discover objects of interest in images of a dataset and gather their representations in a nearest neighbor graph. Using the centrality measure on the graph, we are able to construct a saliency map per image which focuses on the repeating objects and allows us to compute a global representation excluding clutter and background.
15

Althaus, Philipp. "Indoor Navigation for Mobile Robots : Control and Representations". Doctoral thesis, KTH, Numerical Analysis and Computer Science, NADA, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3644.

Abstract:

This thesis deals with various aspects of indoor navigation for mobile robots. For a system that moves around in a household or office environment, two major problems must be tackled. First, an appropriate control scheme has to be designed in order to navigate the platform. Second, the form of representations of the environment must be chosen.

Behaviour based approaches have become the dominant methodologies for designing control schemes for robot navigation. One of them is the dynamical systems approach, which is based on the mathematical theory of nonlinear dynamics. It provides a sound theoretical framework for both behaviour design and behaviour coordination. In the work presented in this thesis, the approach has been used for the first time to construct a navigation system for realistic tasks in large-scale real-world environments. In particular, the coordination scheme was exploited in order to combine continuous sensory signals and discrete events for decision making processes. In addition, this coordination framework assures a continuous control signal at all times and permits the robot to deal with unexpected events.

In order to act in the real world, the control system makes use of representations of the environment. On the one hand, local geometrical representations parameterise the behaviours. On the other hand, context information and a predefined world model enable the coordination scheme to switch between subtasks. These representations constitute symbols, on the basis of which the system makes decisions. These symbols must be anchored in the real world, requiring the capability of relating to sensory data. A general framework for these anchoring processes in hybrid deliberative architectures is proposed. A distinction of anchoring on two different levels of abstraction reduces the complexity of the problem significantly.

A topological map was chosen as a world model. Through the advanced behaviour coordination system and a proper choice of representations, the complexity of this map can be kept at a minimum. This allows the development of simple algorithms for automatic map acquisition. When the robot is guided through the environment, it creates such a map of the area online. The resulting map is precise enough for subsequent use in navigation.

In addition, initial studies on navigation in human-robot interaction tasks are presented. These kinds of tasks pose different constraints on a robotic system than, for example, delivery missions. It is shown that the methods developed in this thesis can easily be applied to interactive navigation. Results show a personal robot maintaining formations with a group of persons during social interaction.

Keywords: mobile robots, robot navigation, indoor navigation, behaviour based robotics, hybrid deliberative systems, dynamical systems approach, topological maps, symbol anchoring, autonomous mapping, human-robot interaction

16

Nielsen, Casper Falkenberg. "A robust framework for medical image segmentation through adaptable class-specific representation". Thesis, Middlesex University, 2002. http://eprints.mdx.ac.uk/13507/.

Abstract:
Medical image segmentation is an increasingly important component in virtual pathology, diagnostic imaging and computer-assisted surgery. Better hardware for image acquisition and a variety of advanced visualisation methods have paved the way for the development of computer based tools for medical image analysis and interpretation. The routine use of medical imaging scans of multiple modalities has been growing over the last decades and data sets such as the Visible Human Project have introduced a new modality in the form of colour cryo section data. These developments have given rise to an increasing need for better automatic and semiautomatic segmentation methods. The work presented in this thesis concerns the development of a new framework for robust semi-automatic segmentation of medical imaging data of multiple modalities. Following the specification of a set of conceptual and technical requirements, the framework known as ACSR (Adaptable Class-Specific Representation) is developed in the first case for 2D colour cryo section segmentation. This is achieved through the development of a novel algorithm for adaptable class-specific sampling of point neighbourhoods, known as the PGA (Path Growing Algorithm), combined with Learning Vector Quantization. The framework is extended to accommodate 3D volume segmentation of cryo section data and subsequently segmentation of single and multi-channel greyscale MRI data. For the latter the issues of inhomogeneity and noise are specifically addressed. Evaluation is based on comparison with previously published results on standard simulated and real data sets, using visual presentation, ground truth comparison and human observer experiments. ACSR provides the user with a simple and intuitive visual initialisation process followed by a fully automatic segmentation.
Results on both cryo section and MRI data compare favourably to existing methods, demonstrating robustness both to common artefacts and multiple user initialisations. Further developments into specific clinical applications are discussed in the future work section.
17

Laforgue, Pierre. "Deep kernel representation learning for complex data and reliability issues". Thesis, Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT006.

Abstract:
Cette thèse débute par l'étude d'architectures profondes à noyaux pour les données complexes. L'une des clefs du succès des algorithmes d'apprentissage profond est la capacité des réseaux de neurones à extraire des représentations pertinentes. Cependant, les raisons théoriques de ce succès nous sont encore largement inconnues, et ces approches sont presque exclusivement réservées aux données vectorielles. D'autre part, les méthodes à noyaux engendrent des espaces fonctionnels étudiés de longue date, les Espaces de Hilbert à Noyau Reproduisant (Reproducing Kernel Hilbert Spaces, RKHSs), dont la complexité est facilement contrôlée par le noyau ou la pénalisation, tout en autorisant les prédictions dans les espaces structurés complexes via les RKHSs à valeurs vectorielles (vv-RKHSs).L'architecture proposée consiste à remplacer les blocs élémentaires des réseaux usuels par des fonctions appartenant à des vv-RKHSs. Bien que très différents à première vue, les espaces fonctionnels ainsi définis sont en réalité très similaires, ne différant que par l'ordre dans lequel les fonctions linéaires/non-linéaires sont appliquées. En plus du contrôle théorique sur les couches, considérer des fonctions à noyau permet de traiter des données structurées, en entrée comme en sortie, étendant le champ d'application des réseaux aux données complexes. Nous conclurons cette partie en montrant que ces architectures admettent la plupart du temps une paramétrisation finie-dimensionnelle, ouvrant la voie à des méthodes d'optimisation efficaces pour une large gamme de fonctions de perte.La seconde partie de cette thèse étudie des alternatives à la moyenne empirique comme substitut de l'espérance dans le cadre de la Minimisation du Risque Empirique (Empirical Risk Minimization, ERM). En effet, l'ERM suppose de manière implicite que la moyenne empirique est un bon estimateur. Cependant, dans de nombreux cas pratiques (e.g. 
données à queue lourde, présence d'anomalies, biais de sélection), ce n'est pas le cas.La Médiane-des-Moyennes (Median-of-Means, MoM) est un estimateur robuste de l'espérance construit comme suit: des moyennes empiriques sont calculées sur des sous-échantillons disjoints de l'échantillon initial, puis est choisie la médiane de ces moyennes. Nous proposons et analysons deux extensions de MoM, via des sous-échantillons aléatoires et/ou pour les U-statistiques. Par construction, les estimateurs MoM présentent des propriétés de robustesse, qui sont exploitées plus avant pour la construction de méthodes d'apprentissage robustes. Il est ainsi prouvé que la minimisation d'un estimateur MoM (aléatoire) est robuste aux anomalies, tandis que les méthodes de tournoi MoM sont étendues au cas de l'apprentissage sur les paires.Enfin, nous proposons une méthode d'apprentissage permettant de résister au biais de sélection. Si les données d'entraînement proviennent d'échantillons biaisés, la connaissance des fonctions de biais permet une repondération non-triviale des observations, afin de construire un estimateur non biaisé du risque. Nous avons alors démontré des garanties non-asymptotiques vérifiées par les minimiseurs de ce dernier, tout en supportant empiriquement l'analyse
The first part of this thesis aims at exploring deep kernel architectures for complex data. One of the known keys to the success of deep learning algorithms is the ability of neural networks to extract meaningful internal representations. However, the theoretical understanding of why these compositional architectures are so successful remains limited, and deep approaches are almost exclusively restricted to vectorial data. On the other hand, kernel methods provide functional spaces whose geometry is well studied and understood. Their complexity can be easily controlled by the choice of kernel or penalization. In addition, vector-valued kernel methods can be used to predict kernelized data. This makes it possible to make predictions in complex structured spaces, as soon as a kernel can be defined on them. The deep kernel architecture we propose consists in replacing the basic neural mappings by functions from vector-valued Reproducing Kernel Hilbert Spaces (vv-RKHSs). Although very different at first glance, the two functional spaces are actually very similar, and differ only by the order in which linear/nonlinear functions are applied. Apart from gaining understanding and theoretical control on layers, considering kernel mappings allows for dealing with structured data, both in input and output, broadening the applicability scope of networks. We finally present works that ensure a finite-dimensional parametrization of the model, opening the door to efficient optimization procedures for a wide range of losses. The second part of this thesis investigates alternatives to the sample mean as substitutes for the expectation in the Empirical Risk Minimization (ERM) paradigm. Indeed, ERM implicitly assumes that the empirical mean is a good estimate of the expectation. However, in many practical use cases (e.g. heavy-tailed distributions, presence of outliers, biased training data), this is not the case. The Median-of-Means (MoM) is a robust mean estimator constructed as follows: the original dataset is split into disjoint blocks, empirical means on each block are computed, and the median of these means is finally returned. We propose two extensions of MoM, to randomized blocks and/or U-statistics, with provable guarantees. By construction, MoM-like estimators exhibit interesting robustness properties. This is further exploited by the design of robust learning strategies. The (randomized) MoM minimizers are shown to be robust to outliers, while MoM tournament procedures are extended to the pairwise setting. We close this thesis by proposing an ERM procedure tailored to the sample bias issue. If training data comes from several biased samples, blindly computing the empirical mean yields a biased estimate of the risk. Alternatively, from the knowledge of the biasing functions, it is possible to reweight observations so as to build an unbiased estimate of the test distribution. We then derive non-asymptotic guarantees for the minimizers of the debiased risk estimate thus created. The soundness of the approach is also empirically endorsed.
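As an illustration of the Median-of-Means construction described in this abstract (split into disjoint blocks, average each block, take the median of the block means), the following is a minimal sketch and not the thesis's own implementation. The function name `median_of_means`, the parameter `n_blocks`, the choice of contiguous equal-size blocks, and the toy sample are all assumptions made for the example.

```python
import statistics


def median_of_means(values, n_blocks):
    """Median-of-Means estimate of the mean of `values` (hypothetical helper).

    Assumption: blocks are contiguous and of equal size; any leftover
    points at the end of `values` are simply dropped.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    block_size = len(values) // n_blocks
    # Empirical mean of each disjoint block.
    block_means = [
        statistics.fmean(values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    # The estimator is the median of the block means.
    return statistics.median(block_means)


# Toy sample with a single gross outlier (made-up numbers).
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]
print(statistics.fmean(sample))             # plain mean, dominated by the outlier
print(median_of_means(sample, n_blocks=3))  # only one block mean is contaminated
```

With three blocks, the outlier affects a single block mean, so the median of the block means stays close to the bulk of the data. In practice the data would usually be shuffled before blocking; that step is omitted here to keep the sketch deterministic.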
18

Wolter, Diedrich. "Spatial representation and reasoning for robot mapping: a shape-based approach". Berlin : Springer, 2008. http://www.myilibrary.com?id=186085.

19

Dondrup, Christian. "Human-robot spatial interaction using probabilistic qualitative representations". Thesis, University of Lincoln, 2016. http://eprints.lincoln.ac.uk/28665/.

Abstract:
Current human-aware navigation approaches use a predominantly metric representation of the interaction which makes them susceptible to changes in the environment. In order to accomplish reliable navigation in ever-changing human populated environments, the presented work aims to abstract from the underlying metric representation by using Qualitative Spatial Relations (QSR), namely the Qualitative Trajectory Calculus (QTC), for Human-Robot Spatial Interaction (HRSI). So far, this form of representing HRSI has been used to analyse different types of interactions online. This work extends this representation to be able to classify the interaction type online using incrementally updated QTC state chains, create a belief about the state of the world, and transform this high-level descriptor into low-level movement commands. By using QSRs the system becomes invariant to change in the environment, which is essential for any form of long-term deployment of a robot, but most importantly also allows the transfer of knowledge between similar encounters in different environments to facilitate interaction learning. To create a robust qualitative representation of the interaction, the essence of the movement of the human in relation to the robot and vice-versa is encoded in two new variants of QTC especially designed for HRSI and evaluated in several user studies. To enable interaction learning and facilitate reasoning, they are employed in a probabilistic framework using Hidden Markov Models (HMMs) for online classification and evaluation of their appropriateness for the task of human-aware navigation. In order to create a system for an autonomous robot, a perception pipeline for the detection and tracking of humans in the vicinity of the robot is described which serves as an enabling technology to create incrementally updated QTC state chains in real-time using the robot's sensors.
Using this framework, the abstraction and generalisability of the QTC based framework is tested by using data from a different study for the classification of automatically generated state chains which shows the benefits of using such a high-level description language. The detriment of using qualitative states to encode interaction is the severe loss of information that would be necessary to generate behaviour from it. To overcome this issue, so-called Velocity Costmaps are introduced which restrict the sampling space of a reactive local planner to only allow the generation of trajectories that correspond to the desired QTC state. This results in a flexible and agile behaviour generation that is able to produce inherently safe paths. In order to classify the current interaction type online and predict the current state for action selection, the HMMs are evolved into a particle filter especially designed to work with QSRs of any kind. This online belief generation is the basis for a flexible action selection process that is based on data acquired using Learning from Demonstration (LfD) to encode human judgement into the used model. Thereby, the generated behaviour is not only sociable but also legible and ensures a high experienced comfort as shown in the experiments conducted. LfD itself is a rather underused approach when it comes to human-aware navigation but is facilitated by the qualitative model and allows exploitation of expert knowledge for model generation. Hence, the presented work bridges the gap between the speed and flexibility of a sampling based reactive approach by using the particle filter and fast action selection, and the legibility of deliberative planners by using high-level information based on expert knowledge about the unfolding of an interaction.
20

Oliveira, José Ricardo Marques de. "World representation for an autonomous driving robot". Master's thesis, Universidade de Aveiro, 2009. http://hdl.handle.net/10773/2121.

Abstract:
Master's degree in Computer and Telematics Engineering
Condução autónoma constitui a deslocação de um agente, robô ou veículo, de um qualquer ponto no espaço para um outro, sem qualquer intervenção humana, por forma a atingir objectivos pré-estabelecidos. Para conduzir de forma autónoma, usando planeamento de trajectória, é crucial que o agente consiga representar abstractamente tanto o conhecimento a priori acerca do mundo, como a informação que este vai adquirindo à medida que avança. Para alcançar este propósito, desenvolveu-se um sistema para ser usado na pista da Competição de Condução Autónoma do Festival Nacional de Robótica. Este sistema caracteriza-se por ser flexível e modular. Tais características permitem não são a adição componentes na pista acima referida, mas também a fácil expansão do suporte a outros tipos de pistas ou circuitos. Concluiu-se, pois, que o modelo de representação mais adequado para o sistema que se pretendia desenvolver seria um modelo híbrido, na medida em que, ao nível global tal representação seria topológica e ao nível local métrica. Ou seja, dividindo a pista em secções, estas são a base para a representação topológica, sendo depois cada secção mapeada internamente de forma métrica. Ao integrar o trabalho desta dissertação com o sistema global lograva-se alcançar um sistema de Condução Autónoma susceptível de planear a curto e médio prazo, com vista a melhorar o desempenho dos robôs usados no projecto, relativamente à solução anteriormente usada, que era baseada num sistema reactivo com alguma memória e noção de estado, mas sem planeamento de trajectória. ABSTRACT: Autonomous driving is the movement of an agent, robot or vehicle, from some point in space to another one, without any human intervention, in order to achieve predetermined goals. To drive autonomously using trajectory planning, it is vital to have an abstraction of the knowledge about the world, be it a priori or information that the agent acquires during the driving. 
For this, we developed a system capable of abstractly represent, not only the track for the Autonomous Driving Competition of the Portuguese Robotics Open, but also, tracks with similar characteristics. The system was developed in a exible and modular manner, in order to allow the addition of new elements to the stated track and the easy expansion to support other types of tracks and circuits. The conclusion was that the most appropriate representation model for the system we were trying to develop was an hybrid model, in that, at a global level the representation would be topological and at a local level it would be metrical. In other words, dividing the track into sections, these are the basis for the topological representation, being each of the sections then mapped internally using a metrical representation. Integrating the work of this dissertation in the global system, one hoped to achieve a Autonomous Driving system capable of short and medium term planning, with the goal of improve the performance of the ROTA project robots, comparatively with the previous solution, which was based in a reactive system with some memory and to some degree stateful.
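As a rough illustration of the hybrid idea described in this abstract (a topological graph of track sections, each carrying its own local metric map), here is a minimal Python sketch. It is not the dissertation's implementation: the class names, the occupancy-grid encoding, the section names and the breadth-first route search are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Section:
    """One track section: a node of the topological layer with a local metric grid."""
    name: str
    # Local occupancy grid in the section's own frame: 0 = free, 1 = occupied.
    grid: list
    neighbours: list = field(default_factory=list)


class HybridMap:
    """Hypothetical hybrid map: topological between sections, metric inside them."""

    def __init__(self):
        self.sections = {}

    def add_section(self, name, rows, cols):
        self.sections[name] = Section(name, [[0] * cols for _ in range(rows)])

    def connect(self, a, b):
        # Topological edge: the robot can drive from section a into section b.
        self.sections[a].neighbours.append(b)

    def mark_obstacle(self, name, row, col):
        # Metric update, local to a single section.
        self.sections[name].grid[row][col] = 1

    def route(self, start, goal):
        """Breadth-first search over sections; returns a list of names or None."""
        queue = deque([[start]])
        visited = {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for nxt in self.sections[path[-1]].neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append(path + [nxt])
        return None


track = HybridMap()
for section_name in ("straight", "curve", "crossing"):
    track.add_section(section_name, rows=4, cols=4)
track.connect("straight", "curve")
track.connect("curve", "crossing")
track.mark_obstacle("curve", 1, 2)
print(track.route("straight", "crossing"))  # ['straight', 'curve', 'crossing']
```

Planning at the section level keeps the global search small, while obstacle details stay confined to the grid of the section where they were observed.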
21

Ko, W. Y. Albert, and 高永賢. "The design of a representation and analysis method for modular self-reconfigurable robots". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2003. http://hub.hku.hk/bib/B29513807.

22

Li, Wing Yin (Cherry). "Narrative and representation in Robert Schumann's Waldszenen, Op. 82". Thesis, University of British Columbia, 2009. http://hdl.handle.net/2429/11994.

Abstract:
Robert Schumann's music is replete with literary references and extramusical indications. His devotion to literature and his adaptation of the narrative strategies of the early Romantics in his compositions have prompted many investigations of literary influences on Schumann's music. Many of his early piano cycles are inspired by the literature of the Romantics, and in particular by the novels of Jean Paul Richter. However, it has sometimes been suggested that Schumann discarded the narrative strategies of Jean Paul in his late compositions, some of which were written for musical education and music-making in the home. My goal, in this dissertation, is to demonstrate that Jean Paul's narrative devices remained relevant in Schumann's late works. This study examines the aspects of narrative and representation that permeate the Waldszenen cycle. The first aspect is large-cycle coherence, an effect that is achieved through innovative associational means -- including motivic and tonal cross-references-- and through more traditional hierarchical means, such as tonal departure and return and the use of programmatic titles that suggest a complete forest journey. The second aspect is the manipulation of formal conventions, which is accomplished through problematic closure, problematic recapitulation, and ambiguous formal function. The third aspect is the use of intertextual allusions to Schumann's earlier works. The last aspect of representation in Waldszenen is the use of three musical topics - fantasy, pastoral, and hunt - in association with their corresponding Romantic literary genres - Kunstmärchen, idyll, and hunting tale and song.
23

NGUYEN, DONG HAI PHUONG. "Toward Robots with Peripersonal Space Representation for Adaptive Behaviors". Doctoral thesis, Università degli studi di Genova, 2019. http://hdl.handle.net/11567/942472.

Abstract:
The ability to adapt and act autonomously in unstructured, human-oriented environments is vital for the next generation of robots, which aim to cooperate safely with humans. While this adaptability is natural and feasible for humans, it is still very complex and challenging for robots. Observations and findings from psychology and neuroscience with respect to the development of the human sensorimotor system can inform the development of novel approaches to adaptive robotics. Among these is the formation of the representation of space closely surrounding the body, the Peripersonal Space (PPS), from multisensory sources like vision, hearing, touch and proprioception, which helps to facilitate human activities within their surroundings. Taking inspiration from the virtual safety margin formed by the PPS representation in humans, this thesis first constructs an equivalent model of the safety zone for each body part of the iCub humanoid robot. This PPS layer serves as a distributed collision predictor, which translates visually detected objects approaching a robot’s body parts (e.g., arm, hand) into the probabilities of a collision between those objects and body parts. This leads to adaptive avoidance behaviors in the robot via an optimization-based reactive controller. Notably, this visual reactive control pipeline can also seamlessly incorporate tactile input to guarantee safety in both pre- and post-collision phases in physical Human-Robot Interaction (pHRI). Concurrently, the controller is also able to take into account multiple targets (of manipulation reaching tasks) generated by a multiple Cartesian point planner.
All components, namely the PPS, the multi-target motion planner (for manipulation reaching tasks), the reaching-with-avoidance controller and the human-centred visual perception, are combined harmoniously to form a hybrid control framework designed to provide safety for robots’ interactions in a cluttered environment shared with human partners. Later, motivated by the development of manipulation skills in infants, in which multisensory integration is thought to play an important role, a learning framework is proposed to allow a robot to learn the processes of forming sensory representations, namely visuomotor and visuotactile, from its own motor activities in the environment. Both multisensory integration models are constructed with Deep Neural Networks (DNNs) in such a way that their outputs are represented in motor space to facilitate the robot’s subsequent actions.
24

Mesgarani, Nima. "Representation of speech in the primary auditory cortex and its implications for robust speech processing". College Park, Md.: University of Maryland, 2008. http://hdl.handle.net/1903/8586.

Abstract:
Thesis (Ph. D.) -- University of Maryland, College Park, 2008.
Thesis research directed by: Dept. of Electrical and Computer Engineering. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
25

Schlenoff, Craig. "Inferring intentions through state representations in cooperative human-robot environments". Thesis, Dijon, 2014. http://www.theses.fr/2014DIJOS064/document.

Abstract:
Humans and robots working safely and seamlessly together in a cooperative environment is one of the future goals of the robotics community. When humans and robots can work together in the same space, a whole class of tasks becomes amenable to automation, ranging from collaborative assembly to parts and material handling to delivery. Proposed standards exist for collaborative human-robot safety, but they focus on limiting the approach distances and contact forces between the human and the robot. These standards focus on reactive processes based only on current sensor readings. They do not consider future states or task-relevant information. A key enabler for human-robot safety in cooperative environments involves the field of intention recognition, in which the robot attempts to understand the intention of an agent (the human) by recognizing some or all of their actions to help predict the human’s future actions. We present an approach to inferring the intention of an agent in the environment via the recognition and representation of state information. This approach to intention recognition differs from many ontology-based intention recognition approaches in the literature, which primarily focus on activity (as opposed to state) recognition and then use a form of abduction to provide explanations for observations. We infer detailed state relationships using observations based on Region Connection Calculus 8 (RCC-8) and then infer the overall state relationships that are true at a given time. Once a sequence of state relationships has been determined, we use a Bayesian approach to associate those states with likely overall intentions to determine the next possible action (and associated state) that is likely to occur. We compare the output of the Intention Recognition Algorithm to those of an experiment involving human subjects attempting to recognize the same intentions in a manufacturing kitting domain.
The results show that the Intention Recognition Algorithm, in almost every case, performed as well as, if not better than, a human performing the same activity.
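To make the RCC-8 vocabulary mentioned in this abstract concrete, the sketch below classifies the relation between two discs into one of the eight base relations (DC, EC, PO, TPP, NTPP, TPPi, NTPPi, EQ). Modelling regions as discs, the function name `rcc8` and the tolerance `eps` are assumptions made for illustration; the thesis does not state how its regions are represented.

```python
import math


def rcc8(c1, r1, c2, r2, eps=1e-9):
    """Return the RCC-8 relation of disc 1 to disc 2.

    c1 and c2 are (x, y) centres; r1 and r2 are positive radii.
    """
    d = math.dist(c1, c2)
    if d < eps and abs(r1 - r2) < eps:
        return "EQ"        # identical regions
    if d > r1 + r2 + eps:
        return "DC"        # disconnected
    if abs(d - (r1 + r2)) <= eps:
        return "EC"        # externally connected: boundaries touch
    if r1 < r2:
        if abs(d + r1 - r2) <= eps:
            return "TPP"   # disc 1 inside disc 2, touching its boundary
        if d + r1 < r2:
            return "NTPP"  # disc 1 strictly inside disc 2
    if r2 < r1:
        if abs(d + r2 - r1) <= eps:
            return "TPPi"  # inverse of TPP: disc 2 inside disc 1, touching
        if d + r2 < r1:
            return "NTPPi" # inverse of NTPP
    return "PO"            # partial overlap


# Hypothetical kitting example: a part resting well inside a kit tray.
print(rcc8((0.0, 0.0), 1.0, (0.0, 0.0), 3.0))  # NTPP
```

A sequence of such relations between, say, a gripper, a part and a tray is the kind of state information that could then be fed into a probabilistic intention model.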
26

Hafidi, Hakim. "Robust machine learning for Graphs/Networks". Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAT004.

Abstract:
This thesis addresses advancements in graph representation learning, focusing on the challenges and opportunities presented by Graph Neural Networks (GNNs). It highlights the significance of graphs in representing complex systems and the necessity of learning node embeddings that capture both node features and graph structure. The study identifies key issues in GNNs, such as their dependence on high-quality labeled data, inconsistent performance across various datasets, and susceptibility to adversarial attacks. To tackle these challenges, the thesis introduces several innovative approaches. Firstly, it employs contrastive learning for node representation, enabling self-supervised learning that reduces reliance on labeled data. Secondly, a Bayesian-based classifier is proposed for node classification, which considers the graph’s structure to enhance accuracy. Lastly, the thesis addresses the vulnerability of GNNs to adversarial attacks by assessing the robustness of the proposed classifier and introducing effective defense mechanisms. These contributions aim to improve both the performance and resilience of GNNs in graph representation learning.
27

McNeill, Dean K. "Adaptive visual representations for autonomous mobile robots using competitive learning algorithms". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp02/NQ35045.pdf.

28

Glover, Arren John. "Developing grounded representations for robots through the principles of sensorimotor coordination". Thesis, Queensland University of Technology, 2014. https://eprints.qut.edu.au/71763/1/Arren_Glover_Thesis.pdf.

Abstract:
Robots currently recognise and use objects through algorithms that are hand-coded or specifically trained. Such robots can operate in known, structured environments but cannot learn to recognise or use novel objects as they appear. This thesis demonstrates that a robot can develop meaningful object representations by learning the fundamental relationship between action and change in sensory state; the robot learns sensorimotor coordination. Methods based on Markov Decision Processes are experimentally validated on a mobile robot capable of gripping objects, and it is found that object recognition and manipulation can be learnt as an emergent property of sensorimotor coordination.
29

Wallgrün, Jan Oliver. "Hierarchical Voronoi graphs spatial representation and reasoning for mobile robots". Berlin Heidelberg Springer, 2008. http://d-nb.info/99728210X/04.

30

Cosgun, Akansel. "Navigation behavior design and representations for a people aware mobile robot system". Diss., Georgia Institute of Technology, 2016. http://hdl.handle.net/1853/54944.

Abstract:
There are millions of robots in operation around the world today, and almost all of them operate on factory floors in isolation from people. However, it is now becoming clear that robots can provide much more value assisting people in daily tasks in human environments. Perhaps the most fundamental capability for a mobile robot is navigating from one location to another. Advances in mapping and motion planning research in the past decades made indoor navigation a commodity for mobile robots. Yet, questions remain on how the robots should move around humans. This thesis advocates the use of semantic maps and spatial rules of engagement to enable non-expert users to effortlessly interact with and control a mobile robot. A core concept explored in this thesis is the Tour Scenario, where the task is to familiarize a mobile robot with a new environment after it is first shipped and unpacked in a home or office setting. During the tour, the robot follows the user and creates a semantic representation of the environment. The user labels objects, landmarks and locations by performing pointing gestures and using the robot's user interface. The spatial semantic information is meaningful to humans, as it allows providing commands to the robot such as "bring me a cup from the kitchen table". While the robot is navigating towards the goal, it should not treat nearby humans as obstacles and should move in a socially acceptable manner. Three main navigation behaviors are studied in this work. The first behavior is point-to-point navigation. The navigation planner presented in this thesis borrows ideas from human-human spatial interactions, and takes into account personal spaces as well as reactions of people who are in close proximity to the trajectory of the robot. The second navigation behavior is person following. After the description of a basic following behavior, a user study on person following for telepresence robots is presented.
Additionally, situation awareness for person following is demonstrated, where the robot facilitates tasks by predicting the intent of the user and utilizing the semantic map. The third behavior is person guidance. A tour-guide robot is presented with a particular application for visually impaired users.
31

Sjöö, Kristoffer. "Functional understanding of space : Representing spatial knowledge using concepts grounded in an agent's purpose". Doctoral thesis, KTH, Datorseende och robotik, CVAP, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-48400.

Abstract:
This thesis examines the role of function in representations of space by robots - that is, dealing directly and explicitly with those aspects of space and objects in space that serve some purpose for the robot. It is suggested that taking function into account helps increase the generality and robustness of solutions in an unpredictable and complex world, and the suggestion is affirmed by several instantiations of functionally conceived spatial models. These include perceptual models for the "on" and "in" relations based on support and containment; context-sensitive segmentation of 2-D maps into regions distinguished by functional criteria; and, learned predictive models of the causal relationships between objects in physics simulation. Practical application of these models is also demonstrated in the context of object search on a mobile robotic platform.
32

Wu, Jianxin. "Visual place categorization". Diss., Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/29784.

Abstract:
Thesis (Ph.D)--Computing, Georgia Institute of Technology, 2010.
Committee Chair: Rehg, James M.; Committee Member: Christensen, Henrik; Committee Member: Dellaert, Frank; Committee Member: Essa, Irfan; Committee Member: Malik, Jitendra. Part of the SMARTech Electronic Thesis and Dissertation Collection.
33

Ivan, Vladimir. "Topology based representations for motion synthesis and planning". Thesis, University of Edinburgh, 2015. http://hdl.handle.net/1842/10520.

Abstract:
Robot motion can be described in several alternative representations, including joint configuration or end-effector spaces. These representations are often used for manipulation or navigation tasks but they are not suitable for tasks that involve close interaction with the environment. In these scenarios, collisions and relative poses of the robot and its surroundings create a complex planning space. To deal with this complexity, we exploit several representations that capture the state of the interaction, rather than the state of the robot. Borrowing notions of topology invariances and homotopy classes, we design task spaces based on winding numbers and writhe for synthesizing winding motion, and electro-static fields for planning reaching and grasping motion. Our experiments show that these representations capture the motion, preserving its qualitative properties, while generalising over finer geometrical detail. Based on the same motivation, we utilise a scale and rotation invariant representation for locally preserving distances, called interaction mesh. The interaction mesh allows for transferring motion between robots of different scales (motion re-targeting), between humans and robots (teleoperation) and between different environments (motion adaptation). To estimate the state of the environment we employ real-time sensing techniques utilizing dense stereo tracking, magnetic tracking sensors and inertial measurement units. We combine and exploit these representations for synthesis and generalization of motion in dynamic environments. This method is most beneficial for problems where direct planning in joint space is extremely hard, whereas local optimal control exploiting the topology and metric of these novel representations can efficiently compute optimal trajectories. We formulate this approach in the framework of optimal control as an approximate inference problem. This allows for consistent combination of multiple task spaces (e.g.
end-effector, joint space and the abstract task spaces we investigate in this thesis). Motion generalization to novel situations and kinematics is similarly performed by projecting motion from abstract representations to joint configuration space. This technique, based on operational space control, allows us to adapt the motion in real time. This process of real-time re-mapping generates robust motion, thus reducing the amount of re-planning. We have implemented our approach as a part of an open source project called the Extensible Optimisation library (EXOTica). This software allows for defining motion synthesis problems by combining task representations and presenting this problem to various motion planners using a common interface. Using EXOTica, we perform comparisons between different representations and different planners to validate that these representations truly improve the motion planning.
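The winding number named in this abstract can be illustrated with a small planar example. The sketch below counts how many times a closed 2D polyline turns around a point by summing signed angle increments. It is only a generic textbook computation under the assumption of a planar curve; the thesis's own task-space formulation (and EXOTica's API) is not reproduced here, and the function name is hypothetical.

```python
import math


def winding_number(curve, point):
    """Number of counter-clockwise turns a closed 2D polyline makes around point.

    curve is a list of (x, y) vertices; the last vertex is joined back to the
    first. The point must not lie on the curve.
    """
    px, py = point
    total = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:] + curve[:1]):
        a0 = math.atan2(y0 - py, x0 - px)
        a1 = math.atan2(y1 - py, x1 - px)
        delta = a1 - a0
        # Wrap each step into (-pi, pi] so it measures the shorter rotation.
        if delta > math.pi:
            delta -= 2 * math.pi
        elif delta <= -math.pi:
            delta += 2 * math.pi
        total += delta
    return round(total / (2 * math.pi))


square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]  # counter-clockwise loop
print(winding_number(square, (0, 0)))  # 1
print(winding_number(square, (5, 5)))  # 0
```

Because the result changes only when the curve crosses the point, it stays constant under small geometric perturbations, which is the kind of qualitative invariance the abstract attributes to topology-based representations.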
34

Sundvall, Denise, and Sara Harila. "Rise of The Robots : En innehållsanalys om representation av virtuella influencers". Thesis, Luleå tekniska universitet, Institutionen för konst, kommunikation och lärande, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-73567.

Abstract:
This study examines the representation of virtual influencers and their interaction with the audience. The aim of the study is to gain a deeper understanding of this new phenomenon, which is currently growing on social media, above all on Instagram. The theoretical starting points underlying the study are hyperreality, identity and personas, semiotics, and representation. The methods used to analyse the material are a qualitative content analysis and a semiotic analysis. In the qualitative content analysis (of the audience's comments), the following themes were identified: praise/criticism, politics, and reality. In the semiotic analysis (of images and captions from the virtual influencers), three other themes were identified: lifestyle, politics, and relationships. The results showed that how the audience received the material varied greatly depending on the influencer, the images, and the captions. The semiotic analysis also revealed a large difference between the virtual influencers studied.
35

Huang, Di. "Robust face recognition based on three dimensional data". Phd thesis, Ecole Centrale de Lyon, 2011. http://tel.archives-ouvertes.fr/tel-00693158.

Abstract:
The face is one of the best biometrics for person identification and verification related applications, because it is natural, non-intrusive, and socially well accepted. Unfortunately, all human faces are similar to each other and hence offer low distinctiveness as compared with other biometrics, e.g., fingerprints and irises. Furthermore, when employing facial texture images, intra-class variations due to factors as diverse as illumination and pose changes are usually greater than inter-class ones, making 2D face recognition far from reliable in real conditions. Recently, 3D face data have been extensively investigated by the research community to deal with the unsolved issues in 2D face recognition, i.e., illumination and pose changes. This Ph.D. thesis is dedicated to robust face recognition based on three-dimensional data, including 3D shape-only face recognition, textured 3D face recognition, and asymmetric 3D-2D face recognition. In 3D shape-only face recognition, since 3D face data, such as facial point clouds and facial scans, are theoretically insensitive to lighting variations and generally allow easy pose correction using an ICP-based registration step, the key problem mainly lies in how to represent 3D facial surfaces accurately and achieve matching that is robust to facial expression changes. In this thesis, we design an effective and efficient approach to 3D shape-only face recognition. For facial description, we propose a novel geometric representation based on extended Local Binary Pattern (eLBP) depth maps, which can comprehensively describe local geometry changes of 3D facial surfaces; a SIFT-based local matching process, further improved by facial component and configuration constraints, is proposed to associate keypoints between corresponding facial representations of different facial scans belonging to the same subject. Evaluated on the FRGC v2.0 and Gavab databases, the proposed approach proves its effectiveness.
Furthermore, due to the use of local matching, it does not require registration for nearly frontal facial scans and only needs a coarse alignment for the ones with severe pose variations, in contrast to most of the related tasks that are based on a time-consuming fine registration step. Considering that most of the current 3D imaging systems deliver 3D face models along with their aligned texture counterpart, a major trend in the literature is to adopt both the 3D shape and 2D texture based modalities, arguing that the joint use of both cues generally provides more accurate and robust performance than using either modality alone. Two important factors in this issue are facial representation on both types of data as well as result fusion. In this thesis, we propose a biological vision-based facial representation, named Oriented Gradient Maps (OGMs), which can be applied to both facial range and texture images. The OGMs simulate the response of complex neurons to gradient information within a given neighborhood and have properties of being highly distinctive and robust to affine illumination and geometric transformations. The previously proposed matching process is then adopted to calculate similarity measurements between probe and gallery faces. Because the biological vision-based facial representation produces an OGM for each quantized orientation of facial range and texture images, we finally use a score level fusion strategy that optimizes weights by a genetic algorithm in a learning process. The experimental results achieved on the FRGC v2.0 and 3DTEC datasets display the effectiveness of the proposed biological vision-based facial description and the optimized weighted sum fusion. [...]
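For readers unfamiliar with Local Binary Patterns, the following sketch computes the basic 8-neighbour LBP code over a small depth map. This is the standard operator only, not the extended eLBP variant proposed in the thesis; the bit ordering (clockwise from the top-left neighbour) and the function names are assumptions chosen for the example.

```python
def lbp_code(patch):
    """8-bit LBP code of the centre pixel of a 3x3 patch (a list of three rows)."""
    centre = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbour at least as deep as the centre sets its bit.
        if patch[r][c] >= centre:
            code |= 1 << bit
    return code


def lbp_map(depth):
    """LBP code for every interior pixel of a 2D depth map (a list of rows)."""
    rows, cols = len(depth), len(depth[0])
    return [
        [lbp_code([row[c - 1:c + 2] for row in depth[r - 1:r + 2]])
         for c in range(1, cols - 1)]
        for r in range(1, rows - 1)
    ]


depth_patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 6, 3],
]
print(lbp_code(depth_patch))  # 42
```

Each code summarises only the sign of the local depth differences, which is why such descriptors tolerate uniform shifts in depth.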
36

Liemhetcharat, Somchaya. "Representation, Planning, and Learning of Dynamic Ad Hoc Robot Teams". Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/304.

Abstract:
Forming an effective multi-robot team to perform a task is a key problem in many domains. The performance of a multi-robot team depends on the robots the team is composed of, where each robot has different capabilities. Team performance has previously been modeled as the sum of single-robot capabilities, and these capabilities are assumed to be known. Is team performance just the sum of single-robot capabilities? This thesis is motivated by instances where agents perform differently depending on their teammates, i.e., there is synergy in the team. For example, in human sports teams, a well-trained team performs better than an all-star team composed of top players from around the world. This thesis introduces a novel model of team synergy, the Synergy Graph model, where the performance of a team depends on each robot’s individual capabilities and a task-based relationship among them. Robots are capable of learning to collaborate and improving team performance over time, and this thesis explores how such robots are represented in the Synergy Graph model. This thesis contributes a novel algorithm that allocates training instances for the robots to improve, so as to form an effective multi-robot team. The goal of team formation is the optimal selection of a subset of robots to perform the task, and this thesis contributes team formation algorithms that use a Synergy Graph to form an effective multi-robot team with high performance. In particular, the performance of a team is modeled with a Normal distribution to represent the nondeterminism of the robots’ actions in a dynamic world, and this thesis introduces the concept of a δ-optimal team that trades off risk versus reward. Further, robots may fail from time to time, and this thesis considers the formation of a robust multi-robot team that attains high performance even if failures occur.
This thesis considers ad hoc teams, where the robots of the team have not collaborated together, and so their capabilities and synergy are initially unknown. This thesis contributes a novel learning algorithm that uses observations of team performance to learn a Synergy Graph that models the capabilities and synergy of the team. Further, new robots may become available, and this thesis introduces an algorithm that iteratively updates a Synergy Graph with new robots.
37

Tan, Chee Khoon. "Fuzzy spatial representation and sensory integration for mobile robot task". Thesis, University of Nottingham, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.409387.

38

Twardon, Lukas. "Bimanual Interaction with Clothes. Topology, Geometry, and Policy Representations in Robots". Bielefeld : Universitätsbibliothek Bielefeld, 2019. http://d-nb.info/1200097610/34.

39

Wolter, Diedrich. "Spatial representation and reasoning for robot mapping : a shape-based approach". Berlin Heidelberg Springer, 2006. http://d-nb.info/989966941/34.

40

Garg, Sourav. "Robust visual place recognition under simultaneous variations in viewpoint and appearance". Thesis, Queensland University of Technology, 2019. https://eprints.qut.edu.au/134410/1/Sourav%20Garg%20Thesis.pdf.

Abstract:
This thesis explores the problem of visual place recognition and localization for a mobile robot, particularly dealing with the challenges of simultaneous variations in scene appearance and camera viewpoint. The proposed methods draw inspiration from humans and make use of semantic cues to represent places. This approach enables effective place recognition from similar or opposing viewpoints, despite variations in scene appearance caused by different times of day or seasons. The research contributions presented in the thesis advance visual place recognition techniques, making them more useful for deployment in a wide range of robotic and autonomous vehicle scenarios.
41

Vasudevan, Shrihari. "Spatial cognition for mobile robots : a hierarchical probabilistic concept-oriented representation of space". Zürich : ETH, 2008. http://e-collection.ethbib.ethz.ch/show?type=diss&nr=17612.

42

Brook, James. "Robert Wilson and an aesthetic of human behaviour in the performing body". Thesis, University of Gloucestershire, 2013. http://eprints.glos.ac.uk/2836/.

Abstract:
This practice-based research investigates movement and gesture in relation to the theatre work of Robert Wilson. A group of performers was established to explore Wilson’s construction of a code of movement during a series of over fifty workshops and films including: a feature film Oedipus; a live performance Two Sides to an Envelope; and a theatre production The Mansion’s Third Unbridled View. The creation of an embodied experience for the spectator, perceived through the senses, is central to Wilson’s theatre. Integral to this are the relationships between drama and image, and time and space. Wilson’s images, in which the body is presented in attitudes of stillness and repetition, are created through these transitional structures. Taking these structures as a starting point for my own performative work, the research led to an abstracted form of natural behaviour, where the movements and arrangements of bodies defined specific movement forms. Subsequently, the relationship between movement and images in Wilson’s theatre was reconsidered through Deleuze’s analysis of the cinematic image. Deleuze identifies subjectivity with the ‘semi-subjective image’, in which traces of the camera’s movements are imprinted in the film. In films made to register these movements, images of moving bodies evincing a sense of time passing were also created. This led to my discovery of film as a direct embodiment of performance, rather than as a form of documentation. Critical to these films, the theatre production, performances, and workshops was the relationship between images and continuous motion predicated upon Wilson’s idea of space as the horizontal and time as the vertical. This idea enabled me to consider Wilson’s theatre and video works in relation to Bergson’s philosophy concerning duration.
The research discovered new ways of interpreting Wilson’s aesthetic through Bergson’s idea that motion is an indivisible process which can also be perceived in relation to the position of bodies in space. Through this understanding, an original performance language was created based on the relationship between stasis and motion, and the interplay between the immersive, semiotic and instrumental modes of gestural communication.
43

Dantam, Neil Thomas. "A linguistic method for robot verification programming and control". Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/54284.

Abstract:
There are many competing techniques for specifying robot policies, each having advantages in different circumstances. To unify these techniques in a single framework, we use formal language as an intermediate representation for robot behavior. This links previously disparate techniques such as temporal logics and learning from demonstration, and it links data-driven approaches such as semantic mapping with formal discrete event and hybrid systems models. These formal models enable system verification -- a crucial point for physical robots. We introduce a set of rewrite rules for hybrid systems and apply it to automatically build a hybrid model for mobile manipulation from a semantic map. In the manipulation domain, we develop a new workspace interpolation method that provides direct, non-stop motion through multiple waypoints, and we introduce a filtering technique for online camera registration to avoid static calibration and handle changing camera positions. To handle concurrent communication with embedded robot hardware, we develop a new real-time interprocess communication system which offers lower latency than Linux sockets. Finally, we consider how time constraints affect the execution of systems modeled hierarchically using context-free grammars. Based on these constraints, we modify the LL(1) parser generation algorithm to operate in real-time with bounded memory use.
44

Yuan, Fang [Verfasser]. "Interactive acquisition of spatial representations with mobile robots / Fang Yuan. Technische Fakultät - AG Angewandte Informatik". Bielefeld : Universitätsbibliothek Bielefeld, Hochschulschriften, 2011. http://d-nb.info/101799630X/34.

45

Harris, John Steven. "Of Rauschenberg, policy and representation at the Vancouver Art Gallery : a partial history 1966-1983". Thesis, University of British Columbia, 1985. http://hdl.handle.net/2429/25419.

Abstract:
My thesis examines the policy of the Vancouver Art Gallery (VAG) as it affected the representation of art in its community in the 1960s and '70s. It was begun in order to understand what determined the changes in policy as they were experienced during this period, which saw an enormous expansion in the activities of the Gallery. To some extent the expansion was realized by means of increased cultural expenditure by the federal government, but this only made programmes possible, it did not carry them out. During the 1960s the Vancouver Art Gallery gained a measure of international recognition for its innovative programming, which depended to a degree on the redefinition of its relationship to the local, whether that signified its traditional patronage, Vancouver artists or the "man in the street". VAG's new outreach programme was not unique, but it was contemporary with developments in other locations. Given the popular and critical success of his policy, VAG director Tony Emery pushed it to the relative exclusion of the more traditional type of gallery programme, in this manner angering VAG's "more conservative" audience. With the first indications of a fiscal crisis in the 1970s, the government began reining in public expenditure, including that on the arts. There was first a freeze on funding to the larger arts institutions, which by now included the Gallery, and then the slow withering of government support. VAG's experiments in programming, which had been made possible through this support, became expendable, and there was soon a re-orientation towards more traditional programmes, accompanied by another redefinition of the Gallery's audience. The Gallery's structure, policy and programme were gradually transformed to fit an increasingly corporate model or paradigm in order to secure the extra funds it needed to remain solvent. 
A crucial aspect of this change was the plan to move the Gallery into larger quarters, which would be more attractive to donors and collectors, and which would allow prestigious exhibitions to be brought into the city. The thesis undertakes to examine the vagaries of Gallery policy with the aid of the current literature on museums and government cultural policy, and with government and Gallery documents. The other major section examines the formation of the reputation of Robert Rauschenberg, as it bears on the reception of a group of his works exhibited at VAG in 1978. Rauschenberg was an artist in frequent contact with Vancouver through exhibitions of his work at a private gallery, and the consolidation of his reputation following the 1976 retrospective of his work by the Smithsonian made his work apt for the promotion of VAG. Rauschenberg's use-value for VAG depends on a particular reading of his work which had become generalized after 1963, and reinforced in 1976, which was appropriate to the new Gallery role promoted by VAG's paladins. This interpretation, which was developed by Alan Solomon in 1963, fixed Rauschenberg's works as celebrations of a way of looking at one's environment and of what was looked at. Solomon's reading became the accepted one, but by an examination of the reception of Rauschenberg's art prior to 1963, and by an analysis of two of his works, I argue that it is neither the only possibility nor even the most accurate one. In the 1970s, critics conflated Rauschenberg's earlier and later work within the context of Solomon's interpretation, which has hardly been expanded upon. They have usually tried to establish an identity of the earlier and later work, based upon Solomon's reading, where I am trying to establish their difference. 
An analysis of two of the works which appeared in the 1978 Works from Captiva exhibition at VAG indicates the differences with the earlier work and the susceptibility of their iconography to the new role the Gallery was attempting to promote.
Arts, Faculty of
Art History, Visual Art and Theory, Department of
Graduate
46

Topp, Elin Anna. "Human-Robot Interaction and Mapping with a Service Robot : Human Augmented Mapping". Doctoral thesis, Stockholm : School of computer science and communication, KTH, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4899.

47

Montavon, Grégoire [Verfasser], Klaus-Robert [Akademischer Betreuer] Müller, Yoshua [Akademischer Betreuer] Bengio and Léon [Akademischer Betreuer] Bottou. "On layer-wise representations in deep neural networks / Grégoire Montavon. Gutachter: Klaus-Robert Müller ; Yoshua Bengio ; Léon Bottou. Betreuer: Klaus-Robert Müller". Berlin : Technische Universität Berlin, 2013. http://d-nb.info/1065665458/34.

48

Wolter, Diedrich [Verfasser]. "Spatial representation and reasoning for robot mapping : a shape-based approach / Diedrich Wolter". Berlin, 2008. http://d-nb.info/989966941/34.

49

Dayoub, Feras. "An adaptive spherical view representation for mobile robot navigation in non-stationary environments". Thesis, University of Lincoln, 2011. https://eprints.qut.edu.au/105983/1/Dayoub_PhD_Thesis_2011.pdf.

50

Stening, John. "Exploring Internal Simulations of Perception in a Mobile Robot using Abstractions". Thesis, University of Skövde, School of Humanities and Informatics, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-907.

Abstract:

This thesis investigates the possibilities of explaining higher cognition as internal simulations of perception and action at an abstract level. Relatively recent findings in both neuroscience and psychology indicate that both perception and action can be internally simulated by activating sensory and motor areas in the brain in the absence of sensory input and without any resulting overt behavior. An investigation was conducted in order to test the hypothesis that perception can be simulated in a mobile robot using abstractions. The results of this investigation showed that this was indeed the case but that the accuracy was limited. The simulations allowed the robot to anticipate long chains of future situations but were not good enough to support any overt behavior. To further improve the results there is a need for better training techniques and/or a more complex architecture.