Dissertations / Theses on the topic 'Domain Adversarial Learning'

Consult the top 15 dissertations / theses for your research on the topic 'Domain Adversarial Learning.'

1

Bejiga, Mesay Belete. "Adversarial approaches to remote sensing image analysis." Doctoral thesis, Università degli studi di Trento, 2020. http://hdl.handle.net/11572/257100.

Abstract:
Recent advances in generative modeling, in particular the unsupervised learning of data distributions, are attributed to the invention of models with new learning algorithms. Among the methods proposed, generative adversarial networks (GANs) have been shown to be the most efficient approaches for estimating data distributions. The core idea of GANs is the adversarial training of two deep neural networks, called the generator and the discriminator, to learn an implicit approximation of the true data distribution. The distribution is approximated through the weights of the generator network, and interaction with the distribution is through the process of sampling. GANs have been found to be useful in applications such as image-to-image translation, in-painting, and text-to-image synthesis. In this thesis, we propose to capitalize on the power of GANs for different remote sensing problems. The first problem is a new research track for the remote sensing community that aims to generate remote sensing images from text descriptions. More specifically, we focus on exploiting ancient text descriptions of geographical areas, inherited from previous civilizations, and converting them into the equivalent remote sensing images. The proposed method is composed of a text encoder and an image synthesis module. The text encoder is tasked with converting a text description into a vector. To this end, we explore two encoding schemes: a multilabel encoder and a doc2vec encoder. The multilabel encoder takes into account the presence or absence of objects in the encoding process, whereas the doc2vec method encodes additional information available in the text. The encoded vectors are then used as conditional information for a GAN and guide the synthesis process. We collected satellite images and ancient text descriptions for training in order to evaluate the efficacy of the proposed method. The qualitative and quantitative results obtained suggest that the doc2vec encoder-based model yields better images in terms of semantic agreement with the input description. In addition, we present open research areas that we believe are important to further advance this new research area.
The second problem we want to address is the issue of semi-supervised domain adaptation. The goal of domain adaptation is to learn a generic classifier for multiple related problems, thereby reducing the cost of labeling. To that end, we propose two methods. The first method uses GANs in the context of image-to-image translation to adapt source domain images into target domain images and train a classifier using the adapted images. We evaluated the proposed method on two remote sensing datasets. Though we have not explored this avenue extensively due to computational challenges, the results obtained show that the proposed method is promising and worth exploring in the future. The second domain adaptation strategy borrows the adversarial property of GANs to learn a new representation space where the domain discrepancy is negligible and the new features are discriminative enough. The method is composed of feature extractor, class predictor, and domain classifier blocks. Contrary to traditional methods that perform representation and classifier learning in separate stages, this method combines both into a single stage, thereby learning a new representation of the input data that is domain invariant and discriminative. After training, the classifier is used to predict both source and target domain labels. We apply this method to large-scale land cover classification and cross-sensor hyperspectral classification problems. The experimental results show that the proposed method provides a performance gain of up to 40%, which indicates the efficacy of the method.
2

Rahman, Mohammad Mahfujur. "Deep domain adaptation and generalisation." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/205619/1/Mohammad%20Mahfujur_Rahman_Thesis.pdf.

Abstract:
This thesis addresses a critical problem in computer vision of dealing with dataset bias between source and target environments. Variations in image data can arise from multiple factors including contrasts in picture quality (shading, brightness, colour, resolution, and occlusion), diverse backgrounds, distinct circumstances, changes in camera viewpoint, and implicit heterogeneity of the samples themselves. This research developed strategies to address this domain shift problem for the object recognition task. Several domain adaptation and generalization approaches based on deep neural networks were introduced to improve poor performance due to domain shift or domain bias.
3

Gustafsson, Fredrik, and Erik Linder-Norén. "Automotive 3D Object Detection Without Target Domain Annotations." Thesis, Linköpings universitet, Datorseende, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-148585.

Abstract:
In this thesis we study a perception problem in the context of autonomous driving. Specifically, we study the computer vision problem of 3D object detection, in which objects should be detected from various sensor data and their position in the 3D world should be estimated. We also study the application of Generative Adversarial Networks in domain adaptation techniques, aiming to improve the 3D object detection model's ability to transfer between different domains. The state-of-the-art Frustum-PointNet architecture for LiDAR-based 3D object detection was implemented and found to closely match its reported performance when trained and evaluated on the KITTI dataset. The architecture was also found to transfer reasonably well from the synthetic SYN dataset to KITTI, and is thus believed to be usable in a semi-automatic 3D bounding box annotation process. The Frustum-PointNet architecture was also extended to explicitly utilize image features, which surprisingly degraded its detection performance. Furthermore, an image-only 3D object detection model was designed and implemented, which was found to compare quite favourably with current state-of-the-art in terms of detection performance. Additionally, the PixelDA approach was adopted and successfully applied to the MNIST to MNIST-M domain adaptation problem, which validated the idea that unsupervised domain adaptation using Generative Adversarial Networks can improve the performance of a task network for a dataset lacking ground truth annotations. Surprisingly, the approach did however not significantly improve upon the performance of the image-based 3D object detection models when trained on the SYN dataset and evaluated on KITTI.
4

Brandt, Carl-Simon, Jonathan Kleivard, and Andreas Turesson. "Convolutional, adversarial and random forest-based DGA detection : Comparative study for DGA detection with different machine learning algorithms." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-20103.

Abstract:
Malware is becoming more intelligent as static methods for blocking communication with Command and Control (C&C) servers are becoming obsolete. Domain Generation Algorithms (DGAs) are a common evasion technique that generates pseudo-random domain names to communicate with C&C servers in a way that is difficult to detect using handcrafted methods. Trying to detect DGAs by looking at the domain name is a broad and efficient approach to detecting malware-infected hosts. This gives us the possibility of detecting a wider assortment of malware compared to other techniques, even without knowledge of the malware's existence. Our study compared the effectiveness of three different machine learning classifiers, a Convolutional Neural Network (CNN), a Generative Adversarial Network (GAN) and a Random Forest (RF), when recognizing patterns and identifying these pseudo-random domains. The results indicate that the CNN differed significantly from the GAN and the RF. It achieved 97.46% accuracy in the final evaluation, while the RF achieved 93.89% and the GAN achieved 60.39%. In the future, network traffic (efficiency) could be a key component to examine, as productivity may be harmed if the network is overburdened by domain identification using machine learning algorithms.
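As an illustration of what "looking at the domain name" can mean for a classifier such as a Random Forest, the sketch below computes a few simple character-level features. The abstract does not state which features the thesis used, so the feature set and the names `char_entropy` and `features` are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def char_entropy(label):
    """Shannon entropy (bits per character) of a domain label."""
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def features(domain):
    """Hypothetical feature vector: length, digit ratio, character entropy."""
    label = domain.lower().split(".")[0]   # leftmost label only (an assumption)
    digits = sum(ch.isdigit() for ch in label)
    return [len(label), digits / max(len(label), 1), char_entropy(label)]
```

Pseudo-random names tend to have a flatter character distribution, and therefore higher entropy, than dictionary-based names, which is why entropy is a common hand-crafted baseline feature for this kind of task.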
5

Marzinotto, Gabriel. "Semantic frame based analysis using machine learning techniques : improving the cross-domain generalization of semantic parsers." Electronic Thesis or Diss., Aix-Marseille, 2019. http://www.theses.fr/2019AIXM0483.

Abstract:
Making semantic parsers robust to lexical and stylistic variations is a real challenge with many industrial applications. Nowadays, semantic parsing requires domain-specific training corpora to ensure acceptable performance on a given domain. Transfer learning techniques are widely studied and adopted to address this lack of robustness, and the most common strategy is the use of pre-trained word representations. However, the best parsers still show significant performance degradation under domain shift, evidencing the need for supplementary transfer learning strategies to achieve robustness. This work proposes a new benchmark to study the domain dependence problem in semantic parsing. We use this benchmark to evaluate classical transfer learning techniques and to propose and evaluate new techniques based on adversarial learning. All these techniques are tested on state-of-the-art semantic parsers. We claim that adversarial learning approaches can improve the generalization capacities of models. We test this hypothesis on different semantic representation schemes, languages and corpora, providing experimental results to support our hypothesis.
6

Ackerman, Wesley. "Semantic-Driven Unsupervised Image-to-Image Translation for Distinct Image Domains." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8684.

Abstract:
We expand the scope of image-to-image translation to include more distinct image domains, where the image sets have analogous structures, but may not share object types between them. Semantic-Driven Unsupervised Image-to-Image Translation for Distinct Image Domains (SUNIT) is built to more successfully translate images in this setting, where content from one domain is not found in the other. Our method trains an image translation model by learning encodings for semantic segmentations of images. These segmentations are translated between image domains to learn meaningful mappings between the structures in the two domains. The translated segmentations are then used as the basis for image generation. Beginning image generation with encoded segmentation information helps maintain the original structure of the image. We qualitatively and quantitatively show that SUNIT improves image translation outcomes, especially for image translation tasks where the image domains are very distinct.
7

Tsai, Jen-Chieh, and 蔡仁傑. "Deep Adversarial Learning and Domain Adaptation." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/3848u8.

Abstract:
Master's degree
National Chiao Tung University
Department of Electrical Engineering
Academic year 105 (ROC calendar)
Deep learning has been developing rapidly in both theory and applications where a large amount of labeled data is available for supervised training. However, in practice, it is time-consuming to collect a large set of labeled data. In the real world, we may only observe a limited set of labeled data together with unlabeled data. How to perform data augmentation and improve model regularization is a crucial research topic. Recently, adversarial learning has been found to generate or synthesize realistic data without the mixing problem of traditional models based on Markov chains. This thesis deals with the generation of new training samples based on deep adversarial learning. Our goal is to carry out the adversarial generation of new samples and apply it to defect classification in a manufacturing process. To improve system performance, we introduce additional latent codes and maximize the mutual information between generated samples and latent codes to build a conditional generative adversarial model. This model is capable of generating a variety of samples within the same class. We evaluate the performance of this unsupervised model by detecting defect conditions in copper foil images from a production process. On the other hand, transfer learning provides an alternative way to handle the problem of insufficient labeled data, one in which data generation is not required. Transfer learning involves several issues owing to different setups. The issue we are mainly concerned with is domain adaptation. Domain adaptation aims to adapt a model from a source domain to a target domain by learning a shared representation that allows knowledge transfer across domains. Traditional domain adaptation methods are specialized to learn the shared representation for distribution matching between the source domain and the target domain, where the individual information in both domains is missing.
In this thesis, we present a deep hybrid adversarial learning framework which captures the shared information and the individual information simultaneously. Our idea is to estimate the shared feature, which is informative for classification, and the individual feature, which contains the domain-specific information. We use an adaptation network to extract the shared feature and a separation network to extract the individual feature. Both the adaptation and the separation networks are seen as adversarial networks. Hybrid adversarial learning is incorporated in the separation network as well as the adaptation network, according to minimax optimization over the separation loss and the domain discrepancy, respectively. The idea in the adaptation network is that we want to extract a shared feature such that an optimal discriminator cannot tell which domain the feature comes from. The idea in the separation network is that we want to extract features, including shared and individual features, which can be separated even by a bad discriminator. In other words, the features have to be good enough to force the discriminator to classify them correctly. For the experiments on the generative adversarial model, we investigate different unsupervised learning methods for defect detection in copper foil images. In general, defect detection requires very high accuracy, but the defect rate is usually relatively low, which means the sets of images with and without defects are very unbalanced. We generate defective images, conditioned on different classes, to balance the training data between defective and non-defective images. For the experiments on the domain adaptation problem, we evaluate the proposed method on different tasks and show the merit of the proposed adversarial domain separation and adaptation in the tasks of sentiment classification and image recognition.
8

Wei, Kai-Ya, and 魏凱亞. "Generative Adversarial Guided Learning for Domain Adaptation." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/zt2car.

Abstract:
Master's degree
National Tsing Hua University
Department of Computer Science
Academic year 106 (ROC calendar)
This thesis focuses on unsupervised domain adaptation problem, which aims to learn a classification model on an unlabelled target domain by referring to a fully-labelled source domain. Our goal is twofold: bridging the gap between source-target domains, and deriving a discriminative model for the target domain. We propose a Generative Adversarial Guided Learning (GAGL) model to tackle the task. To minimize the source-target domain shift, we adopt the idea of domain adversarial training to build a classification network. Next, to derive a target discriminative classifier, we propose to include a generative network to guide the classifier so as to push its decision boundaries away from high density area of target domain. The proposed GAGL model is an end-to-end framework and thus can simultaneously learn the classification model and refine its decision boundary under the guidance of the generator. Our experimental results show that the proposed GAGL model not only outperforms the baseline domain adversarial model but also achieves competitive results with state-of-the-art methods on standard benchmarks.
9

Pereira, João Afonso Pinto. "Fingerprint Anti Spoofing - Domain Adaptation and Adversarial Learning." Master's thesis, 2020. https://hdl.handle.net/10216/128390.

10

Pereira, João Afonso Pinto. "Fingerprint Anti Spoofing - Domain Adaptation and Adversarial Learning." Dissertação, 2020. https://hdl.handle.net/10216/128390.

11

Chen, Tseng-Hung, and 陳增鴻. "Generating Cross-domain Visual Description via Adversarial Learning." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/r8k45f.

Abstract:
Master's degree
National Tsing Hua University
Department of Electrical Engineering
Academic year 105 (ROC calendar)
Impressive image captioning results are achieved in domains with plenty of training image and sentence pairs (e.g., MSCOCO). However, transferring to a target domain with significant domain shifts but no paired training data (referred to as cross-domain image captioning) remains largely unexplored. We propose a novel adversarial training procedure to leverage unpaired data in the target domain. Two critic networks are introduced to guide the captioner, namely domain critic and multi-modal critic. The domain critic assesses whether the generated sentences are indistinguishable from sentences in the target domain. The multi-modal critic assesses whether an image and its generated sentence are a valid pair. During training, the critics and captioner act as adversaries -- captioner aims to generate indistinguishable sentences, whereas critics aim at distinguishing them. The assessment improves the captioner through policy gradient updates. During inference, we further propose a novel critic-based planning method to select high-quality sentences without additional supervision (e.g., tags). To evaluate, we use MSCOCO as the source domain and four other datasets (CUB-200-2011, Oxford-102, TGIF, and Flickr30k) as the target domains. Our method consistently performs well on all datasets. Utilizing the learned critic during inference further boosts the overall performance in CUB-200 and Oxford-102. Furthermore, we extend our method to the task of video captioning. We observe improvements for the adaptation between large-scale video captioning datasets such as MSR-VTT, M-VAD and MPII-MD.
12

"Generalized Domain Adaptation for Visual Domains." Master's thesis, 2020. http://hdl.handle.net/2286/R.I.57226.

Abstract:
Humans have a great ability to recognize objects in different environments irrespective of their variations. However, the same does not apply to machine learning models which are unable to generalize to images of objects from different domains. The generalization of these models to new data is constrained by the domain gap. Many factors such as image background, image resolution, color, camera perspective and variations in the objects are responsible for the domain gap between the training data (source domain) and testing data (target domain). Domain adaptation algorithms aim to overcome the domain gap between the source and target domains and learn robust models that can perform well across both the domains. This thesis provides solutions for the standard problem of unsupervised domain adaptation (UDA) and the more generic problem of generalized domain adaptation (GDA). The contributions of this thesis are as follows. (1) Certain and Consistent Domain Adaptation model for closed-set unsupervised domain adaptation by aligning the features of the source and target domain using deep neural networks. (2) A multi-adversarial deep learning model for generalized domain adaptation. (3) A gating model that detects out-of-distribution samples for generalized domain adaptation. The models were tested across multiple computer vision datasets for domain adaptation. The dissertation concludes with a discussion on the proposed approaches and future directions for research in closed set and generalized domain adaptation.
Dissertation/Thesis
Masters Thesis Computer Science 2020
13

Ganin, Iaroslav. "Natural image processing and synthesis using deep learning." Thèse, 2019. http://hdl.handle.net/1866/23437.

Abstract:
In the present thesis, we study how deep neural networks can be applied to various tasks in computer vision. Computer vision is an interdisciplinary field that deals with the understanding of digital images and video. Traditionally, the problems arising in this domain were tackled using heavily hand-engineered ad hoc methods. Until recently, a typical computer vision system consisted of a sequence of independent modules which barely talked to each other. Such an approach is quite reasonable in the case of limited data, as it takes major advantage of the researcher's domain expertise. This strength turns into a weakness if some of the input scenarios are overlooked in the algorithm design process. With the rapidly increasing volumes and varieties of data and the advent of cheaper and faster computational resources, end-to-end deep neural networks have become an appealing alternative to the traditional computer vision pipelines. We demonstrate this in a series of research articles, each of which considers a particular task of either image analysis or synthesis and presents a solution based on a "deep" backbone.
In the first article, we deal with a classic low-level vision problem of edge detection. Inspired by a top-performing non-neural approach, we take a step towards building an end-to-end system by combining feature extraction and description in a single convolutional network. The resulting fully data-driven method matches or surpasses the detection quality of the existing conventional approaches in the settings for which they were designed, while being significantly more usable in out-of-domain situations. In our second article, we introduce a custom architecture for image manipulation based on the idea that most of the pixels in the output image can be directly copied from the input. This technique has several significant advantages over the naive black-box neural approach. It retains the level of detail of the original images, does not introduce artifacts due to insufficient capacity of the underlying neural network, and simplifies the training process, to name a few. We demonstrate the efficiency of the proposed architecture on the challenging gaze correction task, where our system achieves excellent results. In the third article, we slightly diverge from pure computer vision and study a more general problem of domain adaptation. There, we introduce a novel training-time algorithm (i.e., adaptation is attained by using an auxiliary objective in addition to the main one). We seek to extract features that maximally confuse a dedicated network called the domain classifier while being useful for the task at hand. The domain classifier is learned simultaneously with the features and attempts to tell whether those features are coming from the source or the target domain. The proposed technique is easy to implement, yet results in superior performance on all the standard benchmarks. Finally, the fourth article presents a new kind of generative model for image data. Unlike conventional neural network based approaches, our system, dubbed SPIRAL, describes images in terms of concise low-level programs executed by off-the-shelf rendering software used by humans to create visual content. Among other things, this allows SPIRAL not to waste its capacity on the minutiae of datasets and to focus more on the global structure. The latent space of our model is easily interpretable by design and provides means for predictable image manipulation. We test our approach on several popular datasets and demonstrate its power and flexibility.
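The idea of features that confuse a domain classifier trained alongside them can be sketched with a gradient reversal step. The abstract does not spell out the mechanism, so this is a minimal illustration under stated assumptions: a one-weight feature extractor `w`, a one-weight domain classifier `v`, and a reversal strength `lam` are all hypothetical names, and the main task loss is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_step(w, v, x, d, lam):
    """One-sample gradients of the domain loss with a gradient reversal step.

    w: feature-extractor weight, v: domain-classifier weight,
    x: input, d: domain label (0 = source, 1 = target), lam: reversal strength.
    """
    f = w * x                      # feature
    p = sigmoid(v * f)             # predicted P(domain = 1)
    loss = -(d * math.log(p) + (1 - d) * math.log(1 - p))
    dlogit = p - d                 # d(loss)/d(logit) for sigmoid + cross-entropy
    grad_v = dlogit * f            # domain classifier: ordinary gradient
    grad_f = dlogit * v            # gradient arriving at the feature
    grad_w = -lam * grad_f * x     # reversal: sign flipped, scaled by lam
    return loss, grad_w, grad_v


loss, grad_w, grad_v = domain_step(w=1.0, v=1.0, x=2.0, d=1, lam=0.5)
w_new = 1.0 - 0.1 * grad_w         # gradient-descent step on the extractor
loss_after, _, _ = domain_step(w=w_new, v=1.0, x=2.0, d=1, lam=0.5)
```

Because the extractor receives the negated gradient, an ordinary descent step on `w` raises the domain loss while the same step on `v` would lower it, so a single optimizer can drive both sides of the adversarial game.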
14

Serdyuk, Dmitriy. "Advances in deep learning methods for speech recognition and understanding." Thesis, 2020. http://hdl.handle.net/1866/24803.

Abstract:
This work presents several studies in speech recognition and spoken language understanding. Semantic understanding of spoken language is an important subfield of artificial intelligence. Speech processing has long interested researchers because speech is one of the defining characteristics of a human being. With the development of artificial neural networks, the field has progressed rapidly in terms of both accuracy and human perception. Another important milestone was reached with the development of end-to-end approaches. Such approaches allow all parts of the model to co-adapt, which increases performance and simplifies the training procedure. End-to-end models became feasible with the growing amount of available data and computational resources and, most importantly, with many novel architectural developments. Nevertheless, traditional (non-end-to-end) approaches remain relevant for speech processing because of challenging data in noisy environments, accented speech, and the wide variety of dialects. In the first work, we explore hybrid speech recognition in noisy environments. We propose to treat recognition under unseen noise conditions as a domain adaptation task. For this, we use adversarial domain adaptation, a technique that was novel at the time. In a nutshell, this prior work proposed training features so that they are discriminative for the primary task but non-discriminative for a secondary task, which is constructed to be domain recognition. The trained features are thus invariant to the domain at hand. In our work, we adopt this technique and modify it for noisy speech recognition. In the second work, we develop a general method for regularizing generative recurrent networks. Recurrent networks are known to have frequent difficulty staying on track when generating long outputs. While bidirectional networks can be used for better sequence aggregation in feature learning, they are not applicable to the generative case. We developed a way to improve the consistency of long sequences generated with recurrent networks by constructing a model similar to a bidirectional network. The key insight is to use a soft L2 loss between the forward and backward generative recurrent networks. We provide an experimental evaluation on a multitude of tasks and datasets, including speech recognition, image captioning, and language modeling. In the third paper, we investigate the possibility of developing an end-to-end intent recognizer for spoken language understanding. Semantic spoken language understanding is an important step toward human-like artificial intelligence. End-to-end approaches have shown high performance on tasks including machine translation and speech recognition. We draw inspiration from prior work to develop an end-to-end system for intent recognition.
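The L2 coupling between forward and backward generative networks mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the time alignment of the two state sequences, the averaging over time steps, and the default weight of 0.1 are all assumptions made for illustration, and the abstract does not say whether a learned mapping is applied to the states before they are compared.

```python
from typing import Sequence

Vector = Sequence[float]


def twin_l2_penalty(forward_states: Sequence[Vector],
                    backward_states: Sequence[Vector]) -> float:
    """Mean squared Euclidean distance between time-aligned hidden states.

    Hypothetical helper: assumes both sequences are already aligned so that
    index t refers to the same position in the generated output.
    """
    if len(forward_states) != len(backward_states):
        raise ValueError("both networks must cover the same number of time steps")
    if not forward_states:
        return 0.0
    total = 0.0
    for h_fwd, h_bwd in zip(forward_states, backward_states):
        if len(h_fwd) != len(h_bwd):
            raise ValueError("hidden states must have the same dimension")
        # Squared L2 distance between the two states at this time step.
        total += sum((a - b) ** 2 for a, b in zip(h_fwd, h_bwd))
    return total / len(forward_states)


def regularized_loss(generation_loss: float,
                     forward_states: Sequence[Vector],
                     backward_states: Sequence[Vector],
                     weight: float = 0.1) -> float:
    """Generation loss plus the weighted penalty (the weight is an assumed value)."""
    return generation_loss + weight * twin_l2_penalty(forward_states, backward_states)
```

In a sketch like this, a larger weight pulls the forward network's states more strongly toward those of the backward network; the value that works in practice would have to be tuned per task.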
15

Harkreader, Robert Chandler. "Playing Hide-and-Seek with Spammers: Detecting Evasive Adversaries in the Online Social Network Domain." Thesis, 2012. http://hdl.handle.net/1969.1/ETD-TAMU-2012-08-11479.

Abstract:
Online Social Networks (OSNs) have seen an enormous boost in popularity in recent years. Along with this popularity have come tribulations such as privacy concerns, spam, phishing, and malware. Many recent works have focused on automatically detecting these unwanted behaviors in OSNs so that they may be removed. These works have developed state-of-the-art detection schemes that use machine learning techniques to automatically classify OSN accounts as spam or non-spam. In this work, these detection schemes are recreated and tested on new data. This analysis makes clear that spammers are beginning to evade even these detectors. The evasion tactics used by spammers are identified and analyzed. A new detection scheme that is robust against these evasion tactics is then built upon the previous ones. Next, the difficulty of evading the existing detectors and the new detector is formalized and compared. This work lays a foundation for future researchers, so that those who would like to protect innocent internet users from spam and malicious content can overcome the advances of those who would prey on these users for a meager dollar.