Dissertations on the topic "Adaptation de domaines"
Cite a source in APA, MLA, Chicago, Harvard and many other styles
Browse the top 50 dissertations (master's and doctoral theses) on the research topic "Adaptation de domaines".
Next to every source in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a .pdf and read its abstract online, whenever one is present in the metadata.
Browse dissertations from many scientific fields and compile a correct bibliography.
Fernandes Montesuma, Eduardo. "Multi-Source Domain Adaptation through Wasserstein Barycenters". Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG045.
Machine learning systems work under the assumption that training and test conditions are uniform, i.e., they do not change. However, this hypothesis is seldom met in practice, so the system is trained with data that is no longer representative of the data on which it will be tested. This corresponds to a shift in the probability measure generating the data, a scenario known in the literature as distributional shift between two domains: a source and a target. A straightforward generalization of this problem arises when the training data itself exhibits shifts of its own; in this case, one considers Multi-Source Domain Adaptation (MSDA). In this context, optimal transport is a useful field of mathematics: it serves as a toolbox for comparing and manipulating probability measures. This thesis studies the contributions of optimal transport to multi-source domain adaptation. We do so through Wasserstein barycenters, an object that defines a weighted average, in the space of probability measures, of the multiple domains in MSDA. Based on this concept, we propose: (i) a novel notion of barycenter for measures equipped with labels, (ii) a novel dictionary learning problem over empirical probability measures, and (iii) new tools for domain adaptation through the optimal transport of Gaussian mixture models. Through our methods, we improve domain adaptation performance in comparison with previous optimal transport-based methods on image and cross-domain fault diagnosis benchmarks. Our work opens an interesting research direction on learning the barycentric hull of probability measures.
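For readers new to the central object of this abstract: in one dimension, the Wasserstein-2 barycenter of Gaussian measures has a closed form, so the averaging of domains can be sketched in a few lines. The toy example below is ours, not code from the thesis; all domain statistics and weights are invented.

```python
import numpy as np

# W2 barycenter of 1-D Gaussians N(m_i, s_i^2) with weights w_i summing to 1:
# the barycenter is again Gaussian, with mean sum(w_i*m_i) and std sum(w_i*s_i).
domains = [(0.0, 1.0), (2.0, 0.5), (5.0, 2.0)]   # (mean, std) per source domain
weights = np.array([0.5, 0.3, 0.2])              # barycentric coordinates

means = np.array([m for m, _ in domains])
stds = np.array([s for _, s in domains])
bary_mean = float(weights @ means)               # weighted average of the means
bary_std = float(weights @ stds)                 # weighted average of the stds (1-D case only)

print(f"barycenter ~ N({bary_mean:.2f}, {bary_std:.2f}^2)")
```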
Lévesque-Gravel, Anick. "Adaptation de la formule de Schwarz-Christoffel aux domaines multiplement connexes". Master's thesis, Université Laval, 2015. http://hdl.handle.net/20.500.11794/26169.
The Schwarz-Christoffel formula gives a conformal map between a polygonal domain and a disk. However, this formula applies only to simply connected domains. Recently, Darren Crowdy obtained a generalization of the formula to multiply connected domains, which maps circular domains onto polygonal domains. This thesis presents a proof of the formula developed by Crowdy. To do so, it defines the Schottky-Klein function as well as the modified Green's function, and introduces the canonical domains.
Meftah, Sara. "Neural Transfer Learning for Domain Adaptation in Natural Language Processing". Thesis, université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG021.
Recent approaches based on end-to-end deep neural networks have revolutionised Natural Language Processing (NLP), achieving remarkable results in several tasks and languages. Nevertheless, these approaches are limited by their "gluttony" in terms of annotated data, since they rely on a supervised training paradigm, i.e. training from scratch on large amounts of annotated data. Therefore, there is a wide gap between the capabilities of NLP technologies for high-resource languages and the long tail of low-resourced languages. Moreover, NLP researchers have focused much of their effort on training NLP models on the news domain, due to the availability of training data. However, many research works have highlighted that models trained on news fail to work efficiently on out-of-domain data, due to their lack of robustness against domain shifts. This thesis presents a study of transfer learning approaches, through which we propose different methods to take benefit of the knowledge pre-learned on a high-resourced domain to enhance the performance of neural NLP models in low-resourced settings. Precisely, we apply our approaches to transfer from the news domain to the social media domain. Indeed, despite the importance of its valuable content for a variety of applications (e.g. public security, health monitoring, or trend detection), this domain is still poor in terms of annotated data. We present different contributions. First, we propose two methods to transfer the knowledge encoded in the neural representations of a source model, pretrained on large labelled datasets from the source domain, to a target model, further adapted by fine-tuning on a few annotated examples from the target domain. The first transfers contextualised, supervisedly pretrained representations, while the second transfers pretrained weights used to initialise the target model's parameters. Second, we perform a series of analyses to spot the limits of the above-mentioned methods. We find that even if the proposed transfer learning approach enhances performance on the social media domain, a hidden negative transfer may mitigate the final gain brought by transfer learning. In addition, an interpretive analysis of the pretrained model shows that pretrained neurons may be biased by what they have learned from the source domain, and thus struggle to learn uncommon target-specific patterns. Third, stemming from our analysis, we propose a new adaptation scheme which augments the target model with normalised, weighted and randomly initialised neurons that beget better adaptation while maintaining the valuable source knowledge. Finally, we propose a model that, in addition to the knowledge pre-learned from the high-resource source domain, takes advantage of various supervised NLP tasks.
Marchand, Morgane. "Domaines et fouille d'opinion : une étude des marqueurs multi-polaires au niveau du texte". Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112026/document.
In this thesis, we study the adaptation of a text-level opinion classifier across domains. However, people express their opinions differently depending on the subject of the conversation. The same word in two different domains can refer to different objects or carry another connotation. If these words are not detected, they lead to classification errors. We call these words or bigrams "multi-polarity markers": their presence in a text signals a polarity which differs according to the domain of the text. Their study is the subject of this thesis. These markers are detected using a chi-squared test if labels exist in both targeted domains. We also propose a semi-supervised detection method for the case where labels exist in only one domain, using a collection of automatically filtered pivot words in order to ensure a stable polarity across domains. We have also checked the linguistic interest of the selected words with a manual evaluation campaign. The validated words can be: a word of context, a word giving an opinion, a word explaining an opinion, or a word referring to the evaluated object. Our study also shows that the causes of the changing polarity are of three kinds: changing meaning, changing object, or changing use. Finally, we have studied the influence of multi-polarity markers on opinion classification at the text level in three different cases: adaptation of a source domain to a target domain, multi-domain corpora, and open-domain corpora. The results of our experiments show that the potential improvement is bigger when the initial transfer was difficult. In the favorable cases, we improve accuracy by up to five points.
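The marker-detection step lends itself to a compact illustration. The sketch below is our own toy example with invented counts and totals, not the thesis's exact procedure: a chi-squared test per domain measures each word's association with polarity, and a word whose significant association flips across domains is flagged as a multi-polarity marker.

```python
from scipy.stats import chi2_contingency

# Occurrences of each word in positive/negative documents, per domain
# (invented counts; 100 positive and 100 negative documents per domain).
counts = {
    "unpredictable": {"movies": (90, 10), "cars": (5, 70)},
    "great":         {"movies": (80, 20), "cars": (75, 25)},
}

def polarity(pos, neg, total_pos=100, total_neg=100, alpha=0.05):
    """Return +1/-1 if the word is significantly associated with one polarity,
    0 if the chi-squared test finds no significant association."""
    table = [[pos, neg], [total_pos - pos, total_neg - neg]]
    _, p_value, _, _ = chi2_contingency(table)
    if p_value >= alpha:
        return 0
    return 1 if pos / total_pos > neg / total_neg else -1

for word, by_domain in counts.items():
    signs = {domain: polarity(*c) for domain, c in by_domain.items()}
    # Significant in every domain, but with opposite signs: a multi-polarity marker.
    if 0 not in signs.values() and len(set(signs.values())) > 1:
        print(word, "is a multi-polarity marker:", signs)
```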
Passerieux, Emilie. "Corrélation entre l'organisation spatiale du perimysium et des domaines subcellulaires des fibres musculaires squelettiques : implication dans la transmission latérale des forces et conséquences possibles sur les adaptations du muscle à l'exercice physique". Bordeaux 2, 2006. http://www.theses.fr/2006BOR21358.
We investigated the possibility that the perimysium, a component of intramuscular connective tissue, is involved in muscular adaptation mechanisms. We demonstrated in the bovine skeletal Flexor carpi radialis muscle that (i) the perimysium transmits the forces of muscular contraction from myofibers to the tendons, and (ii) the spatial distribution of the perimysium in muscle corresponds directly to the distribution of integrins (associated with the presence of satellite cells at the surface of myofibers) and to the distribution of myonuclei, subsarcolemmal mitochondria, and myosin inside myofibers. We concluded that the perimysium-myofiber relationship reflects the existence of a mechanosensor system explaining short- and long-term muscle adaptations. These investigations were essential for detecting the way myofibers adapt.
Delaforge, Elise. "Dynamique structurale et fonctionnelle du domaine C-terminal de la protéine PB2 du virus de la grippe A". Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAV037/document.
The ability of avian influenza viruses to cross the species barrier and become dangerously pathogenic to mammalian hosts represents a major threat for human health. In birds, viral replication is carried out in the intestine at 40°C, while in humans it occurs in the cooler respiratory tract at 33°C. It has been shown that temperature adaptation of the influenza virus occurs through numerous mutations in the viral polymerase, in particular in the C-terminal domain 627-NLS of the PB2 protein. This domain has been shown to participate in host adaptation and is involved in importin alpha binding, and it is therefore required for entry of the viral polymerase into the nucleus [Tarendeau et al., 2008]. Crystallographic structures are available for 627-NLS and the importin alpha/NLS complex; however, a steric clash between importin alpha and the 627 domain becomes apparent when superimposing the NLS domain of the two structures, indicating that another conformation of 627-NLS is required for binding to importin alpha [Boivin and Hart, 2011]. Here we investigate the molecular basis of inter-species adaptation by studying the structure and dynamics of human and avian 627-NLS. We have identified two conformations of 627-NLS in slow exchange (10-100 s⁻¹), corresponding to apparently open and closed conformations of the two domains. We show that the equilibrium between the closed and open conformations is strongly temperature dependent. We propose that the open conformation of 627-NLS is the only conformation compatible with binding to importin alpha, and that the equilibrium between closed and open conformations may act as a molecular thermostat, controlling the efficiency of viral replication in the different species. The kinetics and domain dynamics of this important conformational behaviour and of the interaction between 627-NLS and importin alpha have been characterized using nuclear magnetic resonance chemical shifts, paramagnetic relaxation enhancement, spin relaxation and chemical exchange saturation transfer, in combination with X-ray and neutron small-angle scattering and Förster resonance energy transfer. We have also determined the affinities of various evolutionary mutants of 627-NLS for importin alpha, and of avian and human 627-NLS for different isoforms of importin alpha, showing that the observed affinities are consistent with the preferred interactions seen in vivo.
Lopez, Rémy. "Adaptation des méthodes “statistical energy analysis” (sea) aux problèmes d'électromagnétisme en cavités". Toulouse 3, 2006. http://www.theses.fr/2006TOU30045.
Modeling electromagnetic phenomena by deterministic methods requires subdividing the volume under study into discrete elements whose size is of the order of a tenth of the wavelength, so the demand for computer resources grows significantly with increasing frequency. Moreover, given the complexity of the problems and the uncertainties on the input data, it becomes illusory to make a deterministic calculation for each studied variable. New methods, called energetic methods, were developed to study systems that are large compared to the wavelength. They make it possible to estimate statistically the value of the field inside a system. One of these methods, Statistical Energy Analysis (SEA), developed in acoustics, is transposed here to electromagnetism. SEA describes the exchanges of energy between the different subsystems of a structure. The energy of each subsystem depends on the concepts of resonant modes, losses, and coupling. The parameters linked to these concepts are assessed by analytical formulae and numerical simulations. An automatic substructuring method is also presented. The results obtained seem to confirm the interest of this method.
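For readers unfamiliar with SEA: at steady state, the method reduces to a linear power balance P = ωLE between the injected powers and the subsystem energies, where L collects loss and coupling factors. The sketch below is our illustration with invented coefficients, not material from the thesis.

```python
import numpy as np

omega = 2 * np.pi * 1e9            # angular frequency (1 GHz), illustrative
eta1, eta2 = 1e-3, 2e-3            # internal loss factors (invented)
eta12, eta21 = 5e-4, 4e-4          # coupling loss factors (invented)

# SEA power balance P = omega * L * E for two coupled cavities: diagonal terms
# collect damping plus outgoing coupling, off-diagonal terms incoming coupling.
L = np.array([[eta1 + eta12, -eta21],
              [-eta12,       eta2 + eta21]])
P = np.array([1.0, 0.0])           # only cavity 1 is driven (1 W injected)

E = np.linalg.solve(omega * L, P)  # steady-state subsystem energies
print("subsystem energies [J]:", E)
```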
Sidibe, Mamadou Gouro. "Métrologie des services audiovisuels dans un contexte multi-opérateurs et multi-domaines réseaux". Versailles-St Quentin en Yvelines, 2010. http://www.theses.fr/2010VERS0068.
Access to multimedia services over heterogeneous networks and terminals is of increasing market interest, while providing end-to-end (E2E) Quality of Service (QoS) guarantees remains a challenge. Solving this issue requires deploying new E2E management architectures that include components monitoring the network QoS (NQoS) parameters as well as the user's Quality of Experience (QoE). In this thesis, we first propose an E2E Integrated QoS Management Supervisor for efficient provisioning, monitoring and adaptation of video services using the MPEG-21 standard. We then propose a novel QoE-aware monitoring solution for large-scale service connectivity and user-perceived quality monitoring over heterogeneous networks. The solution specifies a scalable cross-layer monitoring architecture comprising four types of QoS monitoring agents operating at the node, network, application and service levels. It also specifies the related intra/inter-domain signalling protocols.
Rouquet, Géraldine. "Etude du rôle de l'opéron métabolique frz dans la virulence d'escherichia coli et dans son adaptation aux conditions environnementales". Thesis, Tours, 2010. http://www.theses.fr/2010TOUR4008.
The metabolic frz operon codes for three subunits of a PTS transporter of the fructose subfamily, a transcriptional activator of PTS systems of the MgA family (FrzR), two type II ketose-1,6-bisphosphate aldolases, a sugar-specific kinase (ROK family), and a protein of the cupin superfamily. It is highly associated with extra-intestinal pathogenic Escherichia coli strains. We proved that frz promotes bacterial fitness under stressful conditions (such as oxygen restriction, late stationary growth phase, or growth in serum or in the intestinal tract). Furthermore, we showed that frz is involved in the adherence to and internalization of E. coli in several eukaryotic cell types by regulating the expression of type 1 fimbriae. The FrzR activator is involved in these phenotypes. Microarray experiments allowed the identification of several genes under the control of the frz system. Our data suggest that frz codes for a sensor of the environment allowing E. coli to adapt to a fluctuating environment by regulating virulence and host-adaptation genes. A regulation model is presented.
Alqasir, Hiba. "Apprentissage profond pour l'analyse de scènes de remontées mécaniques : amélioration de la généralisation dans un contexte multi-domaines". Thesis, Lyon, 2020. http://www.theses.fr/2020LYSES045.
This thesis presents our work on chairlift safety using deep learning techniques as part of the Mivao project, which aims to develop a computer vision system that acquires images of the chairlift boarding station, analyzes the crucial elements, and detects dangerous situations. In this scenario, we have different chairlifts spread over different ski resorts, with a high diversity of acquisition conditions and geometries; thus, each chairlift is considered a domain. When the system is installed on a new chairlift, the objective is to perform an accurate and reliable scene analysis, given the lack of labeled data for this new domain (chairlift). In this context, we mainly concentrate on the chairlift safety bar and propose to classify each image into two categories, depending on whether the safety bar is closed (safe) or open (unsafe). It is thus an image classification problem with three specific features: (i) the image category depends on a small detail (the safety bar) in a cluttered background, (ii) manual annotations are not easy to obtain, and (iii) a classifier trained on some chairlifts should provide good results on a new one (generalization). To guide the classifier towards the important regions of the images, we propose two solutions: object detection and Siamese networks. Furthermore, we analyze the generalization property of these two approaches. Our solutions are motivated by the need to minimize human annotation efforts while improving the accuracy of the chairlift safety problem. However, these contributions are not necessarily limited to this specific application context, and they may be applied to other problems in a multi-domain context.
Ciobanu, Oana Alexandra. "Méthode de décomposition de domaine avec adaptation de maillage en espace-temps pour les équations d'Euler et de Navier-Stokes". Thesis, Paris 13, 2014. http://www.theses.fr/2014PA132052/document.
Numerical simulations of ever more complex fluid dynamics phenomena, especially unsteady phenomena, require solving systems of equations with many degrees of freedom. In their original form, these multi-scale aerodynamic problems are difficult to solve, costly in CPU time, and do not allow simulations over large time scales. An implicit formulation similar to the Schwarz method, with simple block parallelisation and explicit coupling, is no longer sufficient; more robust domain decomposition methods must be designed to make the most of existing hardware. The main aim of this study was to build a finite-volume CFD code, parallel in space and in time, for steady and unsteady problems modelled by the Euler and Navier-Stokes equations, based on the Schwarz method, that improves consistency, accelerates convergence, and decreases computational cost. First, we studied discretisations and numerical schemes for steady and unsteady Euler and Navier-Stokes problems. Second, we proposed an adaptive time-space domain decomposition method that allows local time stepping in each subdomain. Third, we focused on the implementation of different parallel computing strategies (OpenMP, MPI, GPU). Numerical results illustrate the efficiency of the method.
El Boukkouri, Hicham. "Domain adaptation of word embeddings through the exploitation of in-domain corpora and knowledge bases". Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG086.
There are, at the basis of most NLP systems, numerical representations that enable the machine to process, interact with and, to some extent, understand human language. These "word embeddings" come in different flavours but can generally be categorised into two distinct groups: on one hand, static embeddings, which learn and assign a single definitive representation to each word; and on the other, contextual embeddings, which instead learn to generate word representations on the fly, according to the current context. In both cases, training these models requires a large amount of text. This often leads NLP practitioners to compile and merge texts from multiple sources, often mixing different styles and domains (e.g. encyclopaedias, news articles, scientific articles, etc.), in order to produce corpora that are sufficiently large for training good representations. These so-called "general domain" corpora are today the basis on which most word embeddings are trained, greatly limiting their use in more specific areas. In fact, "specialized domains" like the medical domain usually manifest enough lexical, semantic and stylistic idiosyncrasies (e.g. the use of acronyms and technical terms) that general-purpose word embeddings are unable to encode them effectively out of the box. In this thesis, we explore how different kinds of resources may be leveraged to train domain-specific representations or to further specialise pre-existing ones. Specifically, we first investigate how in-domain corpora can be used for this purpose. In particular, we show that both corpus size and domain similarity play an important role in this process, and we propose a way to leverage a small corpus from the target domain to achieve improved results in low-resource settings. Then, we address the case of BERT-like models and observe that the general-domain vocabularies of these models may not be suited for specialized domains. However, we show evidence that models trained using such vocabularies can be on par with fully specialized systems using in-domain vocabularies, which leads us to accept re-training general-domain models as an effective approach for constructing domain-specific systems. We also propose CharacterBERT, a variant of BERT that is able to produce word-level, open-vocabulary representations by consulting a word's characters. We show evidence that this architecture leads to improved performance in the medical domain while being more robust to misspellings. Finally, we investigate how external resources in the form of knowledge bases may be leveraged to specialise existing representations. In this context, we propose a simple approach that consists in constructing dense representations of these knowledge bases and then combining these knowledge vectors with the target word embeddings. We generalise this approach and propose Knowledge Injection Modules, small neural layers that incorporate external representations into the hidden states of a Transformer-based model. Overall, we show that these approaches can lead to improved results; however, we intuit that this final performance ultimately depends on whether the knowledge relevant to the target task is available in the input resource. All in all, our work shows evidence that both in-domain corpora and knowledge bases may be used to construct better word embeddings for specialized domains. In order to facilitate future research on similar topics, we open-source our code and share pre-trained models whenever appropriate.
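To make the knowledge-injection idea concrete, here is a minimal sketch of a layer in the spirit described above. The module name, dimensions, and the residual design are our assumptions, not the thesis's exact architecture.

```python
import torch
import torch.nn as nn

class KnowledgeInjection(nn.Module):
    """Project a concatenated (hidden state, knowledge vector) pair back to the
    hidden size, with a residual connection: a minimal stand-in for the
    'Knowledge Injection Module' idea described in the abstract."""
    def __init__(self, hidden_dim: int, kb_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim + kb_dim, hidden_dim)

    def forward(self, hidden: torch.Tensor, kb: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([hidden, kb], dim=-1)      # fuse token state with KB vector
        return hidden + torch.tanh(self.proj(fused))  # residual update of the hidden state

# Toy usage: a batch of 2 tokens with 8-dim hidden states and 4-dim KB vectors.
layer = KnowledgeInjection(hidden_dim=8, kb_dim=4)
out = layer(torch.randn(2, 8), torch.randn(2, 4))
print(out.shape)  # torch.Size([2, 8])
```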
Sandu, Oana. "Domain adaptation for summarizing conversations". Thesis, University of British Columbia, 2011. http://hdl.handle.net/2429/33932.
Htike, Kyaw Kyaw. "Domain adaptation for pedestrian detection". Thesis, University of Leeds, 2014. http://etheses.whiterose.ac.uk/7290/.
Rahman, MD Hafizur. "Domain adaptation for speaker verification". Thesis, Queensland University of Technology, 2018. https://eprints.qut.edu.au/116511/1/MD%20Hafizur_Rahman_Thesis.pdf.
Rahman, Mohammad Mahfujur. "Deep domain adaptation and generalisation". Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/205619/1/Mohammad%20Mahfujur_Rahman_Thesis.pdf.
Rubino, Raphaël. "Traduction automatique statistique et adaptation à un domaine spécialisé". Phd thesis, Université d'Avignon, 2011. http://tel.archives-ouvertes.fr/tel-00879945.
Cardace, Adriano. "Learning Features Across Tasks and Domains". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20050/.
Saporta, Antoine. "Domain Adaptation for Urban Scene Segmentation". Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS115.
This thesis tackles some of the scientific challenges of perception systems based on neural networks for autonomous vehicles. This dissertation discusses domain adaptation, a class of tools aiming at minimizing the need for labeled data. Domain adaptation allows generalization to so-called target data that share structure with the labeled so-called source data allowing supervision, but nevertheless follow a different statistical distribution. First, we study the introduction of privileged information in the source data, for instance depth labels. The proposed strategy, BerMuDA, bases its domain adaptation on a multimodal representation obtained by bilinear fusion, modeling complex interactions between segmentation and depth. Next, we examine self-supervised learning strategies in domain adaptation, relying on selecting predictions on the unlabeled target data to serve as pseudo-labels. We propose two new selection criteria: first, an entropic criterion with ESL; then, with ConDA, using an estimate of the true class probability. Finally, the extension of adaptation scenarios to several target domains, as well as to a continual learning framework, is proposed. Two approaches are presented to extend traditional adversarial methods to multi-target domain adaptation: Multi-Dis. and MTKT. In a continual learning setting in which the target domains are discovered sequentially and without rehearsal, the proposed CTKT approach adapts MTKT to this new problem to tackle catastrophic forgetting.
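As a rough illustration of the entropy-based selection idea behind ESL (our sketch, with invented shapes and threshold; the thesis operates on full segmentation maps):

```python
import numpy as np

def select_pseudo_labels(probs: np.ndarray, max_entropy: float = 0.5):
    """probs: (n_pixels, n_classes) softmax outputs on unlabeled target data.
    Returns pseudo-labels and a boolean mask of the confident predictions."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    mask = entropy < max_entropy          # keep only low-entropy (confident) predictions
    return probs.argmax(axis=1), mask

rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels, mask = select_pseudo_labels(probs)
print(labels, mask)                       # pseudo-labels, and which ones to trust
```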
Xu, Jiaolong. "Domain adaptation of deformable part-based models". Doctoral thesis, Universitat Autònoma de Barcelona, 2015. http://hdl.handle.net/10803/290266.
On-board pedestrian detection is crucial for Advanced Driver Assistance Systems (ADAS). An accurate classification is fundamental for vision-based pedestrian detection. The underlying assumption for learning classifiers is that the training set and the deployment environment (testing) follow the same probability distribution regarding the features used by the classifiers. However, in practice, there are different reasons that can break this constancy assumption. Accordingly, reusing existing classifiers by adapting them from the previous training environment (source domain) to the new testing one (target domain) is an approach with increasing acceptance in the computer vision community. In this thesis we focus on the domain adaptation of deformable part-based models (DPMs) for pedestrian detection. As a proof of concept, we use a computer-graphics-based synthetic dataset, i.e. a virtual world, as the source domain, and adapt the virtual-world trained DPM detector to various real-world datasets. We start by exploiting the maximum detection accuracy of the virtual-world trained DPM. Even so, when operating on various real-world datasets, the virtual-world trained detector still suffers from accuracy degradation due to the domain gap between the virtual and real worlds. We then focus on the domain adaptation of DPM. In the first step, we consider single-source, single-target domain adaptation and propose two batch learning methods, namely A-SSVM and SA-SSVM. Later, we further consider leveraging multiple target (sub-)domains for progressive domain adaptation and propose a hierarchical adaptive structured SVM (HA-SSVM) for optimization. Finally, we extend HA-SSVM to the challenging online domain adaptation problem, aiming at making the detector adapt to the target domain automatically, without any human intervention. None of the methods proposed in this thesis require revisiting the source-domain data. The evaluations are done on the Caltech pedestrian detection benchmark. Results show that SA-SSVM slightly outperforms A-SSVM and avoids accuracy drops as high as 15 points when compared with a non-adapted detector. The hierarchical model learned by HA-SSVM further boosts the domain adaptation performance. Finally, the online domain adaptation method has demonstrated that it can achieve accuracy comparable to the batch-learned models while not requiring manually labelled target-domain examples. Domain adaptation for pedestrian detection is of paramount importance and a relatively unexplored area. We humbly hope the work in this thesis can provide foundations for future work in this area.
Shahabuddin, Sharmeen. "Compressed Domain Spatial Adaptation of H264 Videos". Thesis, University of Ottawa (Canada), 2010. http://hdl.handle.net/10393/28787.
Herndon, Nic. "Domain adaptation algorithms for biological sequence classification". Diss., Kansas State University, 2016. http://hdl.handle.net/2097/35242.
The large volume of data generated in recent years has created opportunities for discoveries in various fields. In biology, next-generation sequencing technologies determine, faster and more cheaply, the exact order of nucleotides present within a DNA or RNA fragment. This large volume of data requires the use of automated tools to extract information and generate knowledge. Machine learning classification algorithms provide an automated means to annotate data but require some of these data to be manually labeled by human experts, a process that is costly and time-consuming. An alternative to labeling data is to use existing labeled data from a related domain, the source domain, if any such data are available, to train a classifier for the domain of interest, the target domain. However, the classification accuracy usually decreases for the domain of interest as the distance between the source and target domains increases. Another alternative is to label some data and complement it with abundant unlabeled data from the same domain, and train a semi-supervised classifier, although the unlabeled data can mislead such a classifier. In this work another alternative is considered, domain adaptation, in which the goal is to train an accurate classifier for a domain with limited labeled data and abundant unlabeled data, the target domain, by leveraging labeled data from a related domain, the source domain. Several domain adaptation classifiers are proposed, derived from a supervised discriminative classifier (logistic regression) or a supervised generative classifier (naïve Bayes), and some of the factors that influence their accuracy are studied: the features, the data used from the source domain, how to incorporate the unlabeled data, and how to combine all available data. The proposed approaches were evaluated on two biological problems: protein localization and ab initio splice site prediction. The former is motivated by the fact that predicting where a protein is localized provides an indication of its function, whereas the latter is an essential step in gene prediction.
Shu, Le. "Graph and Subspace Learning for Domain Adaptation". Diss., Temple University Libraries, 2015. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/363757.
In many practical problems, given that the instances in training and test may be drawn from different distributions, traditional supervised learning cannot achieve good performance on the new domain. Domain adaptation algorithms are therefore designed to bridge the distribution gap between training (source) data and test (target) data. In this thesis, I propose two graph learning and two subspace learning methods for domain adaptation. Graph learning methods use a graph to model pairwise relations between instances and then minimize the domain discrepancy based on the graphs directly. Our first effort is to propose a novel locality-preserving projection method for the domain adaptation task, which can find a linear mapping preserving the intrinsic structure of both the source and target domains. We first construct two graphs encoding the neighborhood information for the source and target domains separately. We then find linear projection coefficients which have the property of locality preserving for each graph. Instead of combining the two objective terms under a compatibility assumption and requiring the user to decide the importance of each objective function, we propose a multi-objective formulation of this problem and solve it simultaneously using Pareto optimization. Pareto optimization allows multiple objectives to compete with each other in deciding the optimal trade-off. We use generalized eigen-decomposition to find the Pareto frontier, which captures all possible good linear projection coefficients that are preferred by one or more objectives. The second effort is to directly improve the pairwise similarities between instances in the same domain as well as in different domains. We propose a novel method to solve the domain adaptation task in a transductive setting. The proposed method bridges the distribution gap between the source and target domains through affinity learning. It exploits the existence of a subset of data points in the target domain that distribute similarly to the data points in the source domain. These data points act as a bridge that facilitates the propagation of data similarities across domains. We also propose to control the relative importance of intra- and inter-domain similarities to boost the similarity propagation. In our approach, we first construct the similarity matrix encoding both the intra- and inter-domain similarities. We then learn the true similarities among data points in the joint manifold using graph diffusion. We demonstrate that, with improved similarities between source and target data, spectral embedding provides a better data representation, which boosts the prediction accuracy. Subspace learning methods aim to find a new coordinate system in which the domain discrepancy is minimized. In this thesis, we refer to subspace-based methods as those which model the domain shift between two subspaces directly. Our first effort here is to propose a novel linear subspace learning approach for domain adaptation. Our key observation is that in many real-world problems, such as image classification with blurred test images or cross-domain text classification, the domain shift can be modeled by a linear transformation between the source and target domains (intrinsically, a linear transformation between the two subspaces underlying the source and target data). Motivated by this observation, our method explicitly aligns the data in the two domains using a linear transformation while simultaneously finding a subspace which preserves the most data variance.
With explicit data alignment, the subspace learning is formulated as the minimization of a PCA-like objective with two variables: the basis vectors of the common subspace and the linear transformation between the two domains. We show that the optimization can be solved efficiently using an iterative algorithm based on alternating minimization, and we prove its convergence to a local optimum. Our method can also integrate the label information of the source data, which further improves the robustness of the subspace learning and yields better prediction. Existing subspace-based domain adaptation methods assume that the data lie in a single low-dimensional subspace. This assumption is too strong in many real-world applications, especially considering that the domain could be a mixture of latent domains with significant inner-domain variations that should not be neglected. In our second approach, the key idea is to assume that the data lie in a union of multiple low-dimensional subspaces, which relaxes the common assumption above. We propose a novel two-step subspace-based domain adaptation algorithm: in the subspace discovery step, we cluster the source and target data using a subspace clustering algorithm and estimate the subspace for each cluster using principal component analysis; in the domain adaptation step, we propose a novel multiple subspace alignment (Multi-SA) algorithm, in which we identify one common subspace that aligns well with both the source and target subspaces and, therefore, best preserves the variance of both domains. To solve this alignment problem jointly for multiple subspaces, we formulate it as an optimization problem that minimizes a weighted sum of multiple alignment costs. A higher weight is assigned to a source subspace if its label distribution is closer, as measured by KL divergence, to the overall label distribution. By putting more weight on those subspaces, the learned common subspace is able to preserve the distinctive information.
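For context, the classic single-subspace alignment baseline that this line of work extends can be sketched in a few lines of numpy. This is a simplified illustration in the style of Fernando et al., not the dissertation's joint or multi-subspace formulations; data and dimensions are invented.

```python
import numpy as np

def subspace_align(Xs, Xt, d=10):
    """Align the source PCA basis to the target PCA basis and return both
    domains projected into the shared d-dimensional space."""
    # Column-orthonormal bases spanning the top-d principal directions.
    Us = np.linalg.svd(Xs - Xs.mean(0), full_matrices=False)[2][:d].T
    Ut = np.linalg.svd(Xt - Xt.mean(0), full_matrices=False)[2][:d].T
    M = Us.T @ Ut                  # optimal linear map between the two subspaces
    return Xs @ Us @ M, Xt @ Ut    # aligned source and target projections

rng = np.random.default_rng(1)
Zs, Zt = subspace_align(rng.normal(size=(100, 50)), rng.normal(size=(80, 50)), d=5)
print(Zs.shape, Zt.shape)          # (100, 5) (80, 5)
```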
Di Bella, Laura. "Women's adaptation to STEM domains: generalised effects on judgement and cognition". Thesis, University of Kent, 2013. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.654096.
Selvaggi, Kevin. "Synthetic-to-Real Domain Adaptation for Autonomous Driving". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020.
Sopova, Oleksandra. "Domain adaptation for classifying disaster-related Twitter data". Kansas State University, 2017. http://hdl.handle.net/2097/35388.
Machine learning is the subfield of artificial intelligence that gives computers the ability to learn without being explicitly programmed, as it was defined by Arthur Samuel, the American pioneer in the field of computer gaming and artificial intelligence who was born in Emporia, Kansas. Supervised machine learning is focused on building predictive models given labeled training data. Data may come from a variety of sources, for instance, social media networks. In our research, we use Twitter data, specifically user-generated tweets about disasters such as floods, hurricanes, terrorist attacks, etc., to build classifiers that could help disaster management teams identify useful information. A supervised classifier trained on data (training data) from a particular domain (i.e. disaster) is expected to give accurate predictions on unseen data (testing data) from the same domain, assuming that the training and test data have similar characteristics. Labeled data is not easily available for a current target disaster. However, labeled data from a prior source disaster is presumably available and can be used to learn a supervised classifier for the target disaster. Unfortunately, the source disaster data and the target disaster data may not share the same characteristics, and the classifier learned from the source may not perform well on the target. Domain adaptation techniques, which use unlabeled target data in addition to labeled source data, can be used to address this problem. We study single-source and multi-source domain adaptation techniques, using a Naïve Bayes classifier. Experimental results on Twitter datasets corresponding to six disasters show that domain adaptation techniques improve the overall performance as compared to basic supervised learning classifiers. Domain adaptation is crucial for many machine learning applications, as it enables the use of unlabeled data in domains where labeled data is not available.
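A minimal sketch of how unlabeled target data can enter a Naïve Bayes pipeline is a self-training loop, shown below with synthetic counts. This is our toy baseline; the thesis evaluates more elaborate single- and multi-source variants.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def self_train(Xs, ys, Xt, rounds=3, confidence=0.9):
    """Train on labeled source data, then iteratively add confidently
    pseudo-labeled target examples: a simple domain adaptation baseline."""
    clf = MultinomialNB().fit(Xs, ys)
    X_aug, y_aug = Xs, ys
    for _ in range(rounds):
        proba = clf.predict_proba(Xt)
        keep = proba.max(axis=1) >= confidence     # confident target tweets only
        if not keep.any():
            break
        y_new = clf.classes_[proba[keep].argmax(axis=1)]
        X_aug = np.vstack([X_aug, Xt[keep]])
        y_aug = np.concatenate([y_aug, y_new])
        clf = MultinomialNB().fit(X_aug, y_aug)    # retrain on source + pseudo-labels
    return clf

rng = np.random.default_rng(2)
Xs = rng.integers(0, 5, size=(40, 30))   # source bag-of-words counts (invented)
ys = rng.integers(0, 2, size=40)
Xt = rng.integers(0, 5, size=(25, 30))   # unlabeled target counts (invented)
print(self_train(Xs, ys, Xt).predict(Xt[:3]))
```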
Thornström, Johan. "Domain Adaptation of Unreal Images for Image Classification". Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-165758.
Shah, Darsh J. (Darsh Jaidip). "Multi-source domain adaptation with mixture of experts". Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/121741.
We propose a mixture-of-experts approach for unsupervised domain adaptation from multiple sources. The key idea is to explicitly capture the relationship between a target example and different source domains. This relationship, expressed by a point-to-set metric, determines how to combine predictors trained on various domains. The metric is learned in an unsupervised fashion using meta-training. Experimental results on sentiment analysis and part-of-speech tagging demonstrate that our approach consistently outperforms multiple baselines and can robustly handle negative transfer.
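The point-to-set idea can be conveyed with a small sketch (ours: the real metric is learned via meta-training rather than fixed, and summarizing each source domain by its centroid is a simplification):

```python
import numpy as np

def moe_predict(x, experts, centroids, temperature=1.0):
    """Combine per-source-domain experts with weights derived from a
    point-to-set distance between x and each source domain's examples."""
    dists = np.array([np.linalg.norm(x - c) for c in centroids])
    weights = np.exp(-dists / temperature)        # closer domains get larger weights
    weights /= weights.sum()
    preds = np.array([e(x) for e in experts])     # each expert returns P(positive)
    return float(weights @ preds)

centroids = [np.zeros(4), np.ones(4) * 3]         # invented domain summaries
experts = [lambda x: 0.9, lambda x: 0.2]          # toy fixed-output experts
print(moe_predict(np.zeros(4), experts, centroids))  # dominated by expert 0
```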
Yang, Baoyao. "Distribution alignment for unsupervised domain adaptation: cross-domain feature learning and synthesis". HKBU Institutional Repository, 2018. https://repository.hkbu.edu.hk/etd_oa/556.
Roy, Subhankar. "Learning to Adapt Neural Networks Across Visual Domains". Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/354343.
Hatzichristou, Chryse, and Diether Hopf. "School adaptation of Greek children after remigration: age differences in multiple domains". Universität Potsdam, 1995. http://opus.kobv.de/ubp/volltexte/2009/1687/.
Palm Myllylä, Johannes. "Domain Adaptation for Hypernym Discovery via Automatic Collection of Domain-Specific Training Data". Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157693.
Donati, Lorenzo. "Domain Adaptation through Deep Neural Networks for Health Informatics". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14888/.
Margineanu, Elena. "Institutional adaptation in environmental domain: the case of Moldova". Thesis, Тернопіль: Вектор, 2020. http://er.nau.edu.ua/handle/NAU/41786.
Manamasa, Krishna Himaja. "Domain adaptation from 3D synthetic images to real images". Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-19303.
Liu, Ye. "Application of Convolutional Deep Belief Networks to Domain Adaptation". The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1397728737.
Chinea Ríos, Mara. "Advanced techniques for domain adaptation in Statistical Machine Translation". Doctoral thesis, Universitat Politècnica de València, 2019. http://hdl.handle.net/10251/117611.
Statistical Machine Translation is a subfield of computational linguistics that investigates how to use computers in the process of translating a text from one human language to another, and it is the most popular approach used to build such automatic translation systems. The quality of these systems depends to a large extent on the translation examples used during the training and adaptation of the models. The datasets employed are obtained from a wide variety of sources, and in many cases the most suitable data for a specific domain may not be available. Given this data-scarcity problem, the main idea for solving it is to find the datasets best suited to training or adapting a translation system. In this sense, this thesis proposes a set of data-selection techniques that identify the bilingual data most relevant to a task within a large collection of data. As a first step, the data-selection techniques are applied to improve the translation quality of systems under the phrase-based paradigm. These techniques are based on the concept of continuous representations of words or sentences in a vector space. The experimental results show that the techniques used are effective for different languages and domains. The Neural Machine Translation paradigm was also applied in this thesis. Within this paradigm, we investigate how the data-selection techniques previously validated in the phrase-based paradigm can be applied, focusing on two different tasks. On the one hand, we investigate how to increase the translation quality of the system by increasing the size of the training set. On the other hand, the data-selection method is used to create a synthetic dataset. The experiments were carried out for different domains, and the translation results obtained are convincing for both tasks. Finally, it should be noted that the techniques developed and presented throughout this thesis can easily be implemented within a real translation scenario.
Chinea Ríos, M. (2019). Advanced techniques for domain adaptation in Statistical Machine Translation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/117611
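The data-selection idea at the heart of this thesis is easy to sketch with continuous sentence representations. In the toy example below, the vectors, sizes, and cosine criterion are our illustrative assumptions, not the thesis's exact method.

```python
import numpy as np

def select_sentences(pool_vecs, in_domain_vecs, k=1000):
    """Rank a pool of sentence vectors by cosine similarity to the centroid of
    a small in-domain corpus and keep the top k candidates."""
    centroid = in_domain_vecs.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    norms = np.linalg.norm(pool_vecs, axis=1)
    sims = (pool_vecs @ centroid) / np.maximum(norms, 1e-12)
    return np.argsort(-sims)[:k]             # indices of the most in-domain sentences

rng = np.random.default_rng(3)
pool = rng.normal(size=(500, 100))           # e.g., averaged word embeddings per sentence
seed = rng.normal(loc=0.3, size=(50, 100))   # small in-domain sample
print(select_sentences(pool, seed, k=5))
```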
Radhakrishnan, Saieshwar. "Domain Adaptation of IMU sensors using Generative Adversarial Networks". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286821.
Autonomous vehicles rely on sensors to build a picture of their surroundings. On a heavy truck, sensors are mounted in multiple places, for example on the cab, the chassis and the trailer, to increase the field of view and reduce blind spots. Sensors typically perform best when they are stationary relative to the ground, so the large and rapid movements that are common on a truck can lead to degraded performance, erroneous data and, in the worst case, sensor failure. There is therefore a strong need to validate sensor data before it is used for critical decision-making. This thesis proposes domain adaptation as one strategy for cross-validating Inertial Measurement Unit (IMU) sensors. The proposed Generative Adversarial Network (GAN) based framework predicts an IMU sensor's data by implicitly learning the internal dynamics from other IMU sensors mounted on the truck. This prediction model, combined with other sensor-fusion strategies, can be used by the control system to validate the IMU sensors in real time. Using data collected from real-world experiments, it is shown that the proposed framework can convert raw IMU sequences between domains with high accuracy. A further comparison between Long Short-Term Memory (LSTM) and WaveNet-based architectures demonstrates the superiority of WaveNets in terms of performance and computational efficiency.
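A minimal adversarial training step for this kind of sequence-to-sequence sensor translation might look as follows. This is a sketch under our own assumptions about architecture and window size; the thesis's LSTM and WaveNet models are more elaborate.

```python
import torch
import torch.nn as nn

# Generator maps a source IMU window (6 channels) to a target-style IMU window.
G = nn.Sequential(nn.Conv1d(6, 32, 5, padding=2), nn.ReLU(),
                  nn.Conv1d(32, 6, 5, padding=2))
# Discriminator scores whether a window looks like real target-sensor data.
D = nn.Sequential(nn.Conv1d(6, 32, 5, padding=2), nn.ReLU(),
                  nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

src = torch.randn(8, 6, 128)   # batch of source-IMU windows (placeholder data)
tgt = torch.randn(8, 6, 128)   # unpaired target-IMU windows (placeholder data)

# Discriminator step: real target windows vs. generated ones.
fake = G(src).detach()
loss_d = bce(D(tgt), torch.ones(8, 1)) + bce(D(fake), torch.zeros(8, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: try to fool the discriminator.
loss_g = bce(D(G(src)), torch.ones(8, 1))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
print(float(loss_d), float(loss_g))
```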
Zhang, Xinwen. "Multi-modality Medical Image Segmentation with Unsupervised Domain Adaptation". Thesis, The University of Sydney, 2022. https://hdl.handle.net/2123/29776.
Tian, Tian. "Domain Adaptation and Model Combination for the Annotation of Multi-source, Multi-domain Texts". Thesis, Paris 3, 2019. http://www.theses.fr/2019PA030003.
The increasing mass of User-Generated Content (UGC) on the Internet means that people are now willing to comment, edit or share their opinions on different topics. This content is now the main resource for sentiment analysis on the Internet. Due to abbreviations, noise, spelling errors and all the other problems with UGC, traditional Natural Language Processing (NLP) tools, including named entity recognizers and part-of-speech (POS) taggers, perform poorly when compared to their usual results on canonical text (Ritter et al., 2011). This thesis deals with Named Entity Recognition (NER) on User-Generated Content (UGC). We have created an evaluation dataset including multi-domain and multi-source texts. We then developed a Conditional Random Fields (CRFs) model trained on User-Generated Content (UGC). In order to improve NER results in this context, we first developed a POS tagger on UGC and used the predicted POS tags as a feature in the CRFs model. To turn UGC into canonical text, we also developed a normalization model using neural networks to propose a correct form for Non-Standard Words (NSW) in the UGC.
Vázquez Bermúdez, David. "Domain Adaptation of Virtual and Real Worlds for Pedestrian Detection". Doctoral thesis, Universitat Autònoma de Barcelona, 2013. http://hdl.handle.net/10803/125977.
Pedestrian detection is of paramount interest for many applications, e.g. Advanced Driver Assistance Systems, surveillance and media. The most promising pedestrian detectors rely on appearance-based classifiers trained with annotated samples. However, the required annotation step represents an intensive and subjective task when it has to be done by people. It is therefore worth minimizing human intervention in this task by using computational tools like realistic virtual worlds, where precise and rich annotations of visual information can be automatically generated. Nevertheless, the use of this kind of data raises the following question: can a pedestrian appearance model learnt with virtual-world data work successfully for pedestrian detection in real-world scenarios? To answer this question, we conducted different experiments that suggest that classifiers based on virtual-world data can perform well in real-world environments. However, we also found that in some cases these classifiers can suffer the so-called dataset shift problem, as real-world based classifiers do. Accordingly, we designed a domain adaptation framework, V-AYLA, in which we explored different techniques to collect a few pedestrian samples from the target domain (real world) and combine them with many samples of the source domain (virtual world) in order to train a domain-adapted pedestrian classifier. V-AYLA reports the same detection performance as the one obtained by training with human-provided pedestrian annotations and testing with real-world images from the same domain. Ideally, we would like to adapt our system without any human intervention. Therefore, as a first proof of concept we proposed the use of an unsupervised domain adaptation technique that avoids human intervention during the adaptation process. To the best of our knowledge, this is the first work that demonstrates adaptation of virtual and real worlds for developing an object detector. We also assessed a different strategy to avoid dataset shift, which consists in collecting real-world samples and retraining with them, but in such a way that no bounding boxes of real-world pedestrians have to be provided. We show that the generated classifier is competitive with respect to its counterpart trained with samples collected by manually annotating pedestrian bounding boxes. The results presented in this thesis not only end with a proposal for adapting a virtual-world pedestrian detector to the real world, but go further by pointing out a new methodology that would allow the system to adapt to different situations, which we hope will provide the foundations for future research in this unexplored area.
Ruberg, Anders. "Frequency Domain Link Adaptation for OFDM-based Cellular Packet Data". Thesis, Linköping University, Department of Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-6328.
Testo completoIn order to be competitive with emerging mobile systems and to satisfy the ever-growing demand for higher data rates, the 3G consortium, 3rd Generation Partnership Project (3GPP), is currently developing concepts for a long term evolution (LTE) of the 3G standard. The LTE concept at Ericsson is based on Orthogonal Frequency Division Multiplexing (OFDM) as the downlink air interface. OFDM enables the use of frequency domain link adaptation to select the most appropriate transmission parameters according to current channel conditions, in order to maximize the throughput and maintain the delay at a desired level. The purpose of this thesis work is to study, implement and evaluate different link adaptation algorithms. The main focus is on modulation adaptation, where the differences in performance between time domain and frequency domain adaptation are investigated. The simulations in this thesis are performed with a simulator developed at Ericsson. Simulations generally show that cell throughput is improved by an average of 3% when using frequency domain modulation adaptation. When using the implemented frequency domain power allocation algorithm, a gain of 23-36% on average is seen in the users' 5th-percentile throughput. It should be noted that the simulations use a realistic web traffic model, which makes channel quality estimation (CQE) difficult. The CQE has a great impact on the performance of frequency domain adaptation. Throughput improvements are expected when using an improved CQE or interference avoidance schemes. The gains with frequency domain adaptation shown in this thesis work may be too small to motivate the extra signalling overhead required. The complexity of the implemented frequency domain power allocation algorithm is also very high compared to the performance enhancement seen.
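Frequency domain modulation adaptation of the kind studied here amounts to picking, per subband, the highest-order modulation whose SNR requirement the current channel estimate meets. A minimal sketch follows; the threshold table and subband SNR values are illustrative assumptions, not figures from the thesis.

```python
# Minimal sketch of per-subband modulation adaptation for OFDM.
# Pick the highest-order scheme whose SNR threshold the subband meets.
import numpy as np

# Illustrative (modulation, bits per symbol, required SNR in dB) table.
SCHEMES = [("QPSK", 2, 7.0), ("16QAM", 4, 14.0), ("64QAM", 6, 20.0)]

def adapt(subband_snr_db):
    """Return (scheme, bits/symbol) per subband from the estimated SNR."""
    choices = []
    for snr in subband_snr_db:
        best = ("none", 0)                    # channel too poor to transmit on
        for name, bits, threshold in SCHEMES:
            if snr >= threshold:
                best = (name, bits)           # keep the highest order that fits
        choices.append(best)
    return choices

snr_estimates = np.array([5.2, 9.8, 16.4, 23.1])  # one CQE value per subband
for band, (scheme, bits) in enumerate(adapt(snr_estimates)):
    print(f"subband {band}: {scheme} ({bits} bits/symbol)")
```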
XIAO, MIN. "Generalized Domain Adaptation for Sequence Labeling in Natural Language Processing". Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/391382.
Testo completoPh.D.
Sequence labeling tasks have been widely studied in the natural language processing area, such as part-of-speech tagging, syntactic chunking, and dependency parsing. Most of those systems are developed on a large amount of labeled training data via supervised learning. However, manually collecting labeled training data is too time-consuming and expensive. As an alternative, to alleviate the issue of label scarcity, domain adaptation has recently been proposed to train a statistical machine learning model in a target domain, where there is not enough labeled training data, by exploiting existing free labeled training data in a different but related source domain. The natural language processing community has witnessed the success of domain adaptation in a variety of sequence labeling tasks. Though the labeled training data in the source domain are available and free, they are not drawn from the same distribution as, and can be very different from, the test data in the target domain. Thus, simply applying naive supervised machine learning algorithms without considering domain differences may not fulfill the purpose. In this dissertation, we developed several novel representation learning approaches to address domain adaptation for sequence labeling in natural language processing. Those representation learning techniques aim to induce latent generalizable features to bridge domain divergence and enable cross-domain prediction. We first tackle a semi-supervised domain adaptation scenario where the target domain has a small amount of labeled training data and propose a distributed representation learning approach based on a probabilistic neural language model. We then relax the assumption of the availability of labeled training data in the target domain and study an unsupervised domain adaptation scenario where the target domain has only unlabeled training data, and propose a task-informative representation learning approach based on dynamic dependency networks. Both works are developed in the setting where different domains contain sentences in different genres. We then extend and generalize domain adaptation to a more challenging scenario where different domains contain sentences in different languages, and propose two cross-lingual representation learning approaches, one based on deep neural networks with auxiliary bilingual word pairs and the other based on annotation projection with auxiliary parallel sentences. All four specific learning scenarios are extensively evaluated with different sequence labeling tasks. The empirical results demonstrate the effectiveness of those generalized domain adaptation techniques for sequence labeling in natural language processing.
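A simple instance of inducing generalizable features to bridge domain divergence, not the dissertation's specific models (which use a probabilistic neural language model and dynamic dependency networks), is to learn one word-embedding space over pooled unlabeled text from both domains and feed it to a tagger. The gensim library and the toy two-sentence corpus below are assumptions.

```python
# Sketch: induce domain-general word features from pooled unlabeled text,
# then use them as the input representation of a sequence labeler.
from gensim.models import Word2Vec

source_sents = [["the", "acquisition", "boosted", "shares"]]   # e.g. newswire
target_sents = [["dr", "smith", "prescribed", "aspirin"]]      # e.g. biomedical

# Train one embedding space over both domains so features transfer.
model = Word2Vec(
    sentences=source_sents + target_sents,
    vector_size=50, window=3, min_count=1, epochs=20, seed=1,
)

def embed(tokens):
    """Latent features shared across domains, one vector per token."""
    return [model.wv[t] for t in tokens]

features = embed(target_sents[0])  # input to a tagger trained on source labels
print(len(features), features[0].shape)
```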
Xu, Brian(Brian W. ). "Combating fake news with adversarial domain adaptation and neural models". Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/121689.
Testo completoThesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Factually incorrect claims on the web and in social media can cause considerable damage to individuals and societies by misleading them. As we enter an era where it is easier than ever to disseminate "fake news" and other dubious claims, automatic fact checking becomes an essential tool to help people discern fact from fiction. In this thesis, we focus on two main tasks: fact checking, which involves classifying an input claim with respect to its veracity, and stance detection, which involves determining the perspective of a document with respect to a claim. For the fact checking task, we present Bidirectional Long Short Term Memory (Bi-LSTM) and Convolutional Neural Network (CNN) based models and conduct our experiments on the LIAR dataset [Wang, 2017], a recently released fact-checking benchmark. Our model outperforms the state-of-the-art baseline on this dataset. For the stance detection task, we present bag of words (BOW) and CNN based models in hierarchical schemes. These architectures are then supplemented with an adversarial domain adaptation technique, which helps the models overcome dataset size limitations. We test the performance of these models on the Fake News Challenge (FNC) [Pomerleau and Rao, 2017], the Fact Extraction and VERification (FEVER) [Thorne et al., 2018], and the Stanford Natural Language Inference (SNLI) [Bowman et al., 2015] datasets. Our experiments yielded a model with state-of-the-art performance on FNC target data by using FEVER source data coupled with adversarial domain adaptation [Xu et al., 2018].
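Adversarial domain adaptation of this kind is commonly implemented with a gradient reversal layer that trains a feature encoder to fool a domain classifier. The sketch below is a generic DANN-style illustration under that assumption, not the thesis code; the layer sizes and toy batch are made up.

```python
# Sketch: gradient reversal layer (GRL) for adversarial domain adaptation.
# The encoder is trained so source/target features become indistinguishable.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)          # identity in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) gradients flowing back into the encoder.
        return -ctx.lambd * grad_output, None

encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU())
task_head = nn.Linear(128, 4)       # e.g. stance classes on source data
domain_head = nn.Linear(128, 2)     # source-vs-target discriminator

x = torch.randn(8, 300)             # toy batch of text features
h = encoder(x)
stance_logits = task_head(h)                             # task loss here
domain_logits = domain_head(GradReverse.apply(h, 1.0))   # adversarial loss here
```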
Peyrache, Jean-Philippe. "Nouvelles approches itératives avec garanties théoriques pour l'adaptation de domaine non supervisée". Thesis, Saint-Etienne, 2014. http://www.theses.fr/2014STET4023/document.
Testo completoDuring the past few years, interest in Machine Learning has grown in various domains such as image recognition and medical data analysis. However, a limitation of the classical PAC framework has recently been highlighted, leading to the emergence of a new research axis: Domain Adaptation (DA), in which learning data are considered as coming from a distribution (the source) different from the one (the target) from which the test data are generated. The first theoretical works concluded that a good performance on the target domain can be obtained by minimizing at the same time the source error and a divergence term between the two distributions. Three main categories of approaches are derived from this idea: reweighting, reprojection and self-labeling. In this thesis work, we propose two contributions. The first one is a reprojection approach based on boosting theory and designed for numerical data. It offers interesting theoretical guarantees and also seems able to obtain good generalization performances. Our second contribution consists, first, of a framework that fills the gap left by the lack of theoretical results for self-labeling methods, by introducing necessary conditions ensuring the good behavior of this kind of algorithm. Within this framework, we then propose a new approach that uses the theory of (epsilon, gamma, tau)-good similarity functions to circumvent the limitations due to the use of kernel theory in the specific context of structured data.
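Self-labeling methods of the kind analyzed in this thesis iteratively move confidently predicted target examples into the training set. The generic pseudo-labeling loop below illustrates the behavior such theoretical conditions aim to guarantee; the scikit-learn classifier, the confidence threshold and the toy data are assumptions, not the thesis algorithm.

```python
# Generic self-labeling loop for unsupervised domain adaptation:
# repeatedly add confidently pseudo-labeled target points to the train set.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_label(X_src, y_src, X_tgt, threshold=0.9, iterations=5):
    X_train, y_train = X_src.copy(), y_src.copy()
    remaining = X_tgt.copy()
    clf = LogisticRegression(max_iter=1000)
    for _ in range(iterations):
        clf.fit(X_train, y_train)
        if len(remaining) == 0:
            break
        proba = clf.predict_proba(remaining)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        # Promote confident target predictions to (pseudo-)labeled data.
        preds = clf.predict(remaining)
        X_train = np.vstack([X_train, remaining[confident]])
        y_train = np.concatenate([y_train, preds[confident]])
        remaining = remaining[~confident]
    return clf.fit(X_train, y_train)

rng = np.random.default_rng(0)
X_src = rng.normal(0, 1, (100, 2)); y_src = (X_src[:, 0] > 0).astype(int)
X_tgt = rng.normal(0.5, 1, (50, 2))          # shifted target distribution
model = self_label(X_src, y_src, X_tgt)
```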
Panareda, Busto Pau [Verfasser]. "Domain Adaptation for Image Recognition and Viewpoint Estimation / Pau Panareda Busto". Bonn : Universitäts- und Landesbibliothek Bonn, 2020. http://d-nb.info/1219140449/34.
Testo completo
Nyströmer, Carl. "Musical Instrument Activity Detection using Self-Supervised Learning and Domain Adaptation". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280810.
Testo completoWith the ever-growing media and music catalogues, tools to search and navigate them are required. For more complex search queries, metadata is needed, but manually annotating the enormous amounts of new data is impossible. This thesis investigates the automatic annotation of instrument activity in music, focusing on the lack of annotated data for instrument activity detection models. Two methods to work around this data shortage are proposed and investigated. The first method builds on self-supervised learning based on automatic annotation and random mixing of different instrument tracks. The second method uses domain adaptation by training models on synthesized MIDI files to detect instruments in recorded music. The self-supervised method outperformed the baseline and indicates that deep learning models can learn instrument recognition even though the audio mixes lack musical structure. The domain adaptation models trained only on synthesized MIDI data performed worse than the baseline, but using MIDI data together with recorded music improved the results. A hybrid model that combined both self-supervised learning and domain adaptation, using both synthesized MIDI data and recorded music, gave the best results overall.
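The random-mixing idea can be made concrete as follows: single-instrument stems carry their own labels, so mixing them yields labeled multi-instrument training audio with no manual annotation. This is a toy numpy sketch under assumed waveforms and class names, not the thesis pipeline.

```python
# Sketch: self-supervised training pairs from randomly mixed instrument stems.
# Each stem is labeled by construction, so the mix needs no manual annotation.
import numpy as np

rng = np.random.default_rng(0)
SR, CLASSES = 16000, ["guitar", "piano", "drums", "bass"]

# Toy one-second stems: one waveform per instrument class.
stems = {c: rng.normal(0, 0.1, SR).astype(np.float32) for c in CLASSES}

def random_mix(n_instruments=2):
    """Mix n random stems; the multi-hot target comes for free."""
    picked = rng.choice(len(CLASSES), size=n_instruments, replace=False)
    gains = rng.uniform(0.5, 1.0, size=n_instruments)
    mix = sum(g * stems[CLASSES[i]] for g, i in zip(gains, picked))
    target = np.zeros(len(CLASSES), dtype=np.float32)
    target[picked] = 1.0
    return mix, target

audio, labels = random_mix()
print(audio.shape, dict(zip(CLASSES, labels)))
```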
Sedinkina, Marina [Verfasser], e Hinrich [Akademischer Betreuer] Schütze. "Domain adaptation in Natural Language Processing / Marina Sedinkina ; Betreuer: Hinrich Schütze". München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2021. http://d-nb.info/1233966936/34.
Testo completo
Zen, Gloria. "Understanding Visual Information: from Unsupervised Discovery to Minimal Effort Domain Adaptation". Doctoral thesis, Università degli studi di Trento, 2015. https://hdl.handle.net/11572/368625.
Testo completo
Zen, Gloria. "Understanding Visual Information: from Unsupervised Discovery to Minimal Effort Domain Adaptation". Doctoral thesis, University of Trento, 2015. http://eprints-phd.biblio.unitn.it/1443/1/GZen_final_thesis.pdf.
Testo completo