Dissertations / Theses: 'Multi-view machine learning'

1

Labroski, Aleksandar. "Multi-view versus single-view machine learning for disease diagnosis in primary healthcare." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235533.

Abstract:

The work presented in this report considers and compares two machine learning approaches to the problem of disease diagnosis prediction in primary healthcare: single-view and multi-view machine learning. In particular, the problem of disease diagnosis prediction refers to predicting a (possible) diagnosis for a given patient based on her past medical history. The problem area is extensive, especially considering that there are over 14,400 unique possible diagnoses (grouped into 22 high-level categories) that can be considered as prediction targets. The approach taken in this work considers the high-level categories as prediction targets and attempts to use the two machine learning techniques to get close to an optimal solution. The multi-view machine learning paradigm was chosen as an approach that can improve the predictive performance of classifiers in settings where we have multiple heterogeneous data sources (different views of the same data), which is exactly the case here. In order to compare the single-view and multi-view machine learning paradigms (based on the concept of supervised learning), several experiments are devised which explore the possible solution space under each paradigm. The work closely touches on other machine learning concepts such as ensemble learning, stacked generalization and dimensionality reduction-based learning. As we shall see, the results show that multi-view stacked generalization is a powerful paradigm that can significantly improve predictive performance in a supervised learning setting. The performance of the different models was evaluated using F1 scores, and we observed an average increase of 0.04 and a maximum increase of 0.114 F1 score points. The findings also show that multi-view stacked ensemble learning is particularly well suited as a noise reduction technique and works well in cases where the feature data is expected to contain a notable amount of noise. This can be very beneficial and of interest to projects where the features are not manually chosen by domain experts.
Arbetet som presenteras i denna rapport beaktar och jämför två olika metoder för maskininlärning för att lösa problemet med prognos för sjukdomsdiagnos i primärvården: single-view och multi-view maskininlärning. I synnerhet avser problemet med sjukdomsdiagnos prediktion av en (möjlig) diagnos för en given patient, baserat på dennes tidigare medicinska historia. Problemområdet är omfattande, i synnerhet med tanke på att det finns över 14 400 unika möjliga diagnoser (grupperade i 22 högkvalitativa kategorier) som kan betraktas som förutsägbara. Tillvägagångssättet i detta arbete betraktar kategorierna i hög-nivå och försöker använda de två olika maskininlärningsteknikerna för att komma nära en optimal lösning på problemet. Multi-view maskininlärningsparadigmet valdes som ett tillvägagångssätt som kan förbättra prediktiv prestanda för klassifikationer i inställningar där vi har flera heterogena datakällor (olika visningar av samma data), vilket är det exakta fallet här. För att jämföra single-view och multi-view maskininlärning paradigmerna (baserat på begreppet övervakat lärande), är flera olika experiment utformade som undersöker det möjliga lösningsutrymmet under varje paradigm. Arbetet berör noga andra koncept för maskininlärning, som ensembleinlärning, samlad generalisering och dimensioneringsreduktionsbaserat lärande. Som vi kan se visar resultaten att multi-view samlad generalisering är ett kraftfullt paradigm som kan förbättra den prediktiva prestandan avsevärt i en övervakad inlärningsinställning. De olika modellernas prestanda utvärderades med hjälp av F1-poäng och vi har kunnat observera en genomsnittlig ökning av prestanda på 0,04 och en maximal ökning av 0.114 F1 poäng. Resultaten visar också att tillvägagångssättet för multi-view stacked ensemblelärande är särskilt väl lämpat som en brusreduceringsteknik och fungerar bra i fall där funktionsdata förväntas innehålla en anmärkningsvärd mängd brus. 
Detta kan vara mycket fördelaktigt och av intresse för projekt där funktioner inte manuellt väljs av domänexperter.
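
As a rough illustration of the multi-view stacked generalization idea described in this abstract, the sketch below trains one base learner per view and fits a meta-learner on their out-of-fold predictions. Everything in it is an assumption made for illustration: the nearest-centroid base learner, the accuracy-weighted vote used as meta-learner, and the names `fit_multiview_stack` and `predict_multiview_stack` are hypothetical and are not taken from the thesis, which does not state its models here.

```python
class NearestCentroid:
    """Tiny base learner (illustrative): predict the class with the closest mean vector."""

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        self.centroids = {
            c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()
        }
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids, key=lambda c: dist(row, self.centroids[c]))
                for row in X]


class WeightedVote:
    """Meta-learner (illustrative): weight each view by its out-of-fold accuracy."""

    def fit(self, Z, y):
        n_views = len(Z[0])
        self.weights = [
            sum(1 for z, label in zip(Z, y) if z[v] == label) / len(y)
            for v in range(n_views)
        ]
        return self

    def predict(self, Z):
        out = []
        for z in Z:
            votes = {}
            for pred, w in zip(z, self.weights):
                votes[pred] = votes.get(pred, 0.0) + w
            out.append(max(votes, key=votes.get))
        return out


def fit_multiview_stack(views, y, n_folds=2):
    """views: list of feature matrices (one per view), all row-aligned with y."""
    n = len(y)
    oof = [[None] * len(views) for _ in range(n)]  # out-of-fold base predictions
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        for v, X in enumerate(views):
            model = NearestCentroid().fit([X[i] for i in train_idx],
                                          [y[i] for i in train_idx])
            for i, p in zip(test_idx, model.predict([X[i] for i in test_idx])):
                oof[i][v] = p
    meta = WeightedVote().fit(oof, y)              # meta-learner sees only held-out predictions
    base = [NearestCentroid().fit(X, y) for X in views]  # refit base learners on all data
    return base, meta


def predict_multiview_stack(base, meta, views):
    per_view = [m.predict(X) for m, X in zip(base, views)]
    return meta.predict([list(z) for z in zip(*per_view)])


# Toy data: view_a is informative, view_b is mostly noise.
y = [0, 0, 0, 0, 1, 1, 1, 1]
view_a = [[0.0], [0.1], [0.2], [0.1], [1.0], [0.9], [1.1], [1.0]]
view_b = [[0.5], [0.1], [0.9], [0.3], [0.4], [0.8], [0.2], [0.6]]
base, meta = fit_multiview_stack([view_a, view_b], y)
predictions = predict_multiview_stack(base, meta, [[[0.05], [0.95]], [[0.7], [0.2]]])
```

Because the meta-learner is fitted on held-out predictions, a view that does not generalize receives a low weight, which is one simple way to picture the noise-reduction behaviour the abstract reports.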

2

Byun, Byungki. "On discriminative semi-supervised incremental learning with a multi-view perspective for image concept modeling." Thesis, Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/43597.

Abstract:

This dissertation presents the development of a semi-supervised incremental learning framework with a multi-view perspective for image concept modeling. For reliable image concept characterization, having a large number of labeled images is crucial. However, the size of the training set is often limited by the cost of generating concept labels for objects in a large quantity of images. To address this issue, we propose to incrementally incorporate unlabeled samples into the learning process to enhance concept models originally learned from a small number of labeled samples. To tackle the sub-optimality problem of conventional techniques, the proposed incremental learning framework selects unlabeled samples using an expected error reduction function that measures the contribution of each unlabeled sample by its ability to increase modeling accuracy. To improve the convergence property of the proposed incremental learning framework, we further propose a multi-view learning approach that makes use of multiple image features, such as color and texture, when including unlabeled samples. For robustness to mismatches between training and testing conditions, a discriminative learning algorithm, namely a kernelized maximal-figure-of-merit (kMFoM) learning approach, is also developed. Combining the individual techniques, we conduct a set of experiments on various image concept modeling problems, such as handwritten digit recognition, object recognition, and image spam detection, to highlight the effectiveness of the proposed framework.

3

Zantedeschi, Valentina. "A Unified View of Local Learning : Theory and Algorithms for Enhancing Linear Models." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSES055/document.

Abstract:

Dans le domaine de l'apprentissage machine, les caractéristiques des données varient généralement dans l'espace des entrées : la distribution globale pourrait être multimodale et contenir des non-linéarités. Afin d'obtenir de bonnes performances, l'algorithme d'apprentissage devrait alors être capable de capturer et de s'adapter à ces changements. Même si les modèles linéaires ne parviennent pas à décrire des distributions complexes, ils sont réputés pour leur passage à l'échelle, en entraînement et en test, aux grands ensembles de données en termes de nombre d'exemples et de nombre de fonctionnalités. Plusieurs méthodes ont été proposées pour tirer parti du passage à l'échelle et de la simplicité des hypothèses linéaires afin de construire des modèles aux grandes capacités discriminatoires. Ces méthodes améliorent les modèles linéaires, dans le sens où elles renforcent leur expressivité grâce à différentes techniques. Cette thèse porte sur l'amélioration des approches d'apprentissage locales, une famille de techniques qui infère des modèles en capturant les caractéristiques locales de l'espace dans lequel les observations sont intégrées.L'hypothèse fondatrice de ces techniques est que le modèle appris doit se comporter de manière cohérente sur des exemples qui sont proches, ce qui implique que ses résultats doivent aussi changer de façon continue dans l'espace des entrées. La localité peut être définie sur la base de critères spatiaux (par exemple, la proximité en fonction d'une métrique choisie) ou d'autres relations fournies, telles que l'association à la même catégorie d'exemples ou un attribut commun. On sait que les approches locales d'apprentissage sont efficaces pour capturer des distributions complexes de données, évitant de recourir à la sélection d'un modèle spécifique pour la tâche. 
Cependant, les techniques de pointe souffrent de trois inconvénients majeurs :ils mémorisent facilement l'ensemble d'entraînement, ce qui se traduit par des performances médiocres sur de nouvelles données ; leurs prédictions manquent de continuité dans des endroits particuliers de l'espace ; elles évoluent mal avec la taille des ensembles des données. Les contributions de cette thèse examinent les problèmes susmentionnés dans deux directions : nous proposons d'introduire des informations secondaires dans la formulation du problème pour renforcer la continuité de la prédiction et atténuer le phénomène de la mémorisation ; nous fournissons une nouvelle représentation de l'ensemble de données qui tient compte de ses spécificités locales et améliore son évolutivité. Des études approfondies sont menées pour mettre en évidence l'efficacité de ces contributions pour confirmer le bien-fondé de leurs intuitions. Nous étudions empiriquement les performances des méthodes proposées tant sur des jeux de données synthétiques que sur des tâches réelles, en termes de précision et de temps d'exécution, et les comparons aux résultats de l'état de l'art. Nous analysons également nos approches d'un point de vue théorique, en étudiant leurs complexités de calcul et de mémoire et en dérivant des bornes de généralisation serrées
In the field of machine learning, data characteristics usually vary over the input space: the overall distribution might be multi-modal and contain non-linearities. In order to achieve good performance, the learning algorithm should then be able to capture and adapt to these changes. Even though linear models fail to describe complex distributions, they are renowned for their scalability, at training and at testing, to datasets that are large in terms of both number of examples and number of features. Several methods have been proposed to take advantage of the scalability and simplicity of linear hypotheses to build models with great discriminatory capabilities. These methods empower linear models, in the sense that they enhance their expressive power through different techniques. This dissertation focuses on enhancing local learning approaches, a family of techniques that infer models by capturing the local characteristics of the space in which the observations are embedded. The founding assumption of these techniques is that the learned model should behave consistently on examples that are close, implying that its results should also change smoothly over the space. Locality can be defined by spatial criteria (e.g. closeness according to a selected metric) or by other provided relations, such as association with the same category of examples or a shared attribute. Local learning approaches are known to be effective in capturing complex distributions of the data without resorting to the selection of a task-specific model. However, state-of-the-art techniques suffer from three major drawbacks: they easily memorize the training set, resulting in poor performance on unseen data; their predictions lack smoothness in particular locations of the space; and they scale poorly with the size of the datasets. The contributions of this dissertation address these pitfalls in two directions: we propose to introduce side information in the problem formulation to enforce smoothness in prediction and attenuate the memorization phenomenon; and we provide a new representation of the dataset which takes into account its local specificities and improves scalability. Thorough studies are conducted to highlight the effectiveness of these contributions and to confirm the soundness of the intuitions behind them. We empirically study the performance of the proposed methods on both toy and real tasks, in terms of accuracy and execution time, and compare it to state-of-the-art results. We also analyze our approaches from a theoretical standpoint, by studying their computational and memory complexities and by deriving tight generalization bounds.

4

Xie, Zhiyuan. "Effect of Enhancement on Convolutional Neural Network Based Multi-view Object Classification." University of Dayton / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1522937516903222.

5

Seifi, Farid. "Improving Classification and Attribute Clustering: An Iterative Semi-supervised Approach." Thesis, Université d'Ottawa / University of Ottawa, 2015. http://hdl.handle.net/10393/32140.

Abstract:

This thesis proposes a novel approach to attribute clustering. It exploits the strength of semi-supervised learning to improve the quality of attribute clustering, particularly when labeled data is limited. The significance of this work derives in part from the broad, and increasingly important, usage of attribute clustering to address outstanding problems within the machine learning community. This form of clustering has also been shown to have strong practical applications, being usable in heavyweight industrial applications. Although researchers have focused on supervised and unsupervised attribute clustering in recent years, semi-supervised attribute clustering has not received substantial attention. In this research, we propose an innovative two-step iterative semi-supervised attribute clustering framework. In each iteration, this new framework uses the result of attribute clustering to improve a classifier. It then uses the classifier to augment the training data used by attribute clustering in the next iteration. This iterative framework outputs an improved classifier and an improved attribute clustering at the same time. It gives more accurate clusters of attributes, which better fit the real relations between attributes. In this study we propose two new usages of attribute clustering to improve classification: solving the automatic view definition problem for multi-view learning, and improving missing attribute-value handling at induction and prediction time. The application of these two new usages of attribute clustering in our proposed semi-supervised attribute clustering framework is evaluated using real-world data sets from different domains.

6

Li, Rui [Verfasser], Burkhard [Akademischer Betreuer] [Gutachter] Rost, and Stefan [Gutachter] Kramer. "Data Mining and Machine Learning Methods for High-dimensional Patient Data in Dementia Research: Voxel Features Mining, Subgroup Discovery and Multi-view Learning / Rui Li ; Gutachter: Burkhard Rost, Stefan Kramer ; Betreuer: Burkhard Rost." München : Universitätsbibliothek der TU München, 2017. http://d-nb.info/1125018224/34.

7

Soares, Matheus Victor Brum. "Aprendizado de máquina parcialmente supervisionado multidescrição para realimentação de relevância em recuperação de informação na WEB." Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-03092009-135403/.

Abstract:

Atualmente, o meio mais comum de busca de informações é a WEB. Assim, é importante procurar métodos eficientes para recuperar essa informação. As máquinas de busca na WEB usualmente utilizam palavras-chaves para expressar uma busca. Porém, não é trivial caracterizar a informação desejada. Usuários diferentes com necessidades diferentes podem estar interessados em informações relacionadas, mas distintas, ao realizar a mesma busca. O processo de realimentação de relevância torna possível a participação ativa do usuário no processo de busca. A idéia geral desse processo consiste em, após o usuário realizar uma busca na WEB permitir que indique, dentre os sites encontrados, quais deles considera relevantes e não relevantes. A opinião do usuário pode então ser considerada para reordenar os dados, de forma que os sites relevantes para o usuário sejam retornados mais facilmente. Nesse contexto, e considerando que, na grande maioria dos casos, uma consulta retorna um número muito grande de sites WEB que a satisfazem, das quais o usuário é responsável por indicar um pequeno número de sites relevantes e não relevantes, tem-se o cenário ideal para utilizar aprendizado parcialmente supervisionado, pois essa classe de algoritmos de aprendizado requer um número pequeno de exemplos rotulados e um grande número de exemplos não-rotulados. Assim, partindo da hipótese que a utilização de aprendizado parcialmente supervisionado é apropriada para induzir um classificador que pode ser utilizado como um filtro de realimentação de relevância para buscas na WEB, o objetivo deste trabalho consiste em explorar algoritmos de aprendizado parcialmente supervisionado, mais especificamente, aqueles que utilizam multidescrição de dados, para auxiliar na recuperação de sites na WEB. Para avaliar esta hipótese foi projetada e desenvolvida uma ferramenta denominada C-SEARCH que realiza esta reordenação dos sites a partir da indicação do usuário. 
Experimentos mostram que, em casos que buscas genéricas, que o resultado possui um bom diferencial entre sites relevantes e irrelevantes, o sistema consegue obter melhores resultados para o usuário
As the WEB is nowadays the most common source of information, it is very important to find reliable and efficient methods to retrieve this information. However, the WEB is a highly volatile and heterogeneous information source, so keyword-based querying may not be the best approach when little information is given. This is because different users with different needs may want distinct information, although related to the same keyword query. The process of relevance feedback makes it possible for the user to interact actively with the search engine. The main idea is that, after an initial search in the WEB, the process enables the user to indicate, among the retrieved sites, a small number of the ones considered relevant or irrelevant according to the information he/she requires. The user's preferences can then be used to rearrange the sites returned in the initial search, so that relevant sites are ranked first. As in most cases a search returns a large number of WEB sites which fit the keyword query, this is an ideal situation in which to use partially supervised machine learning algorithms. This kind of learning algorithm requires a small number of labeled examples and a large number of unlabeled examples. Thus, based on the assumption that partially supervised learning is appropriate for inducing a classifier that can be used as a filter for relevance feedback in WEB information retrieval, the aim of this work is to explore the use of a partially supervised machine learning algorithm, more specifically one that uses multi-description data, in order to assist the WEB search. To this end, a computational tool called C-SEARCH, which reorders the search results using the user's feedback, has been implemented. Experimental results show that in cases where the keyword query is generic and there is a clear distinction between relevant and irrelevant sites, which is recognized by the user, the system can achieve good results.

8

Koco, Sokol. "Méthodes ensembliste pour des problèmes de classification multi-vues et multi-classes avec déséquilibres." Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM4101/document.

Abstract:

De nos jours, dans plusieurs domaines, tels que la bio-informatique ou le multimédia, les données peuvent être représentées par plusieurs ensembles d'attributs, appelés des vues. Pour une tâche de classification donnée, nous distinguons deux types de vues : les vues fortes sont celles adaptées à la tâche, les vues faibles sont adaptées à une (petite) partie de la tâche ; en classification multi-classes, chaque vue peut s'avérer forte pour reconnaître une classe, et faible pour reconnaître d’autres classes : une telle vue est dite déséquilibrée. Les travaux présentés dans cette thèse s'inscrivent dans le cadre de l'apprentissage supervisé et ont pour but de traiter les questions d'apprentissage multi-vue dans le cas des vues fortes, faibles et déséquilibrées. La première contribution de cette thèse est un algorithme d'apprentissage multi-vues théoriquement fondé sur le cadre de boosting multi-classes utilisé par AdaBoost.MM. La seconde partie de cette thèse concerne la mise en place d'un cadre général pour les méthodes d'apprentissage de classes déséquilibrées (certaines classes sont plus représentées que les autres). Dans la troisième partie, nous traitons le problème des vues déséquilibrées en combinant notre approche des classes déséquilibrées et la coopération entre les vues mise en place pour appréhender la classification multi-vues. Afin de tester les méthodes sur des données réelles, nous nous intéressons au problème de classification d'appels téléphoniques, qui a fait l'objet du projet ANR DECODA. Ainsi chaque partie traite différentes facettes du problème
Nowadays, in many fields, such as bioinformatics or multimedia, data may be described using different sets of features, also called views. For a given classification task, we distinguish two types of views: strong views, which are suited to the task, and weak views, which are suited to a (small) part of the task. In multi-class learning, a view can be strong with respect to some (few) classes and weak for the rest of the classes: these are imbalanced views. The work presented in this thesis falls within the supervised learning setting and aims to address the problem of multi-view learning under strong, weak and imbalanced views, grouped under the notion of uneven views. The first contribution of this thesis is a multi-view learning algorithm based on the same framework as AdaBoost.MM. The second part of this thesis proposes a unifying framework for supervised methods for imbalanced classes (some of the classes are more represented than others). In the third part of this thesis, we tackle the uneven views problem by combining the imbalanced classes framework with the between-views cooperation used to take advantage of the multiple views. In order to test the proposed methods on real-world data, we consider the task of phone call classification, which is the subject of the ANR DECODA project. Each part of this thesis deals with different aspects of the problem.

9

Matsubara, Edson Takashi. "O algoritmo de aprendizado semi-supervisionado co-training e sua aplicação na rotulação de documentos." Universidade de São Paulo, 2004. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-19082004-092311/.

Abstract:

Em Aprendizado de Máquina, a abordagem supervisionada normalmente necessita de um número significativo de exemplos de treinamento para a indução de classificadores precisos. Entretanto, a rotulação de dados é freqüentemente realizada manualmente, o que torna esse processo demorado e caro. Por outro lado, exemplos não-rotulados são facilmente obtidos se comparados a exemplos rotulados. Isso é particularmente verdade para tarefas de classificação de textos que envolvem fontes de dados on-line tais como páginas de internet, email e artigos científicos. A classificação de textos tem grande importância dado o grande volume de textos disponível on-line. Aprendizado semi-supervisionado, uma área de pesquisa relativamente nova em Aprendizado de Máquina, representa a junção do aprendizado supervisionado e não-supervisionado, e tem o potencial de reduzir a necessidade de dados rotulados quando somente um pequeno conjunto de exemplos rotulados está disponível. Este trabalho descreve o algoritmo de aprendizado semi-supervisionado co-training, que necessita de duas descrições de cada exemplo. Deve ser observado que as duas descrições necessárias para co-training podem ser facilmente obtidas de documentos textuais por meio de pré-processamento. Neste trabalho, várias extensões do algoritmo co-training foram implementadas. Ainda mais, foi implementado um ambiente computacional para o pré-processamento de textos, denominado PreTexT, com o objetivo de utilizar co-training em problemas de classificação de textos. Os resultados experimentais foram obtidos utilizando três conjuntos de dados. Dois conjuntos de dados estão relacionados com classificação de textos e o outro com classificação de páginas de internet. Os resultados, que variam de excelentes a ruins, mostram que co-training, similarmente a outros algoritmos de aprendizado semi-supervisionado, é afetado de maneira bastante complexa pelos diferentes aspectos na indução dos modelos.
In Machine Learning, the supervised approach usually requires a large number of labeled training examples to learn accurately. However, labeling is often performed manually, making this process costly and time-consuming. By contrast, unlabeled examples are often inexpensive and easier to obtain than labeled examples. This is especially true for text classification tasks involving on-line data sources, such as web pages, email and scientific papers. Text classification is of great practical importance today given the massive volume of on-line text available. Semi-supervised learning, a relatively new area in Machine Learning, represents a blend of supervised and unsupervised learning, and has the potential to reduce the need for expensive labeled data whenever only a small set of labeled examples is available. This work describes the semi-supervised learning algorithm co-training, which requires the description of each example to be partitioned into two distinct views. It should be observed that the two different views required by co-training can be easily obtained from textual documents through pre-processing. In this work, several extensions of the co-training algorithm have been implemented. Furthermore, we have also implemented a computational environment for text pre-processing, called PreTexT, in order to apply the co-training algorithm to text classification problems. Experimental results using co-training on three data sets are described. Two data sets are related to text classification and the other one to web-page classification. The results, which range from excellent to poor, show that co-training, similarly to other semi-supervised learning algorithms, is affected by modelling assumptions in a rather complicated way.
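
The co-training loop described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the implementation from the thesis: the nearest-centroid classifier, the margin used as a confidence score, and the names `centroid_fit`, `centroid_predict` and `co_train` are all hypothetical choices made for brevity.

```python
def centroid_fit(X, y):
    """Return {label: mean vector} for the labeled rows."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()}


def centroid_predict(centroids, row):
    """Return (label, confidence); confidence is the gap between the two nearest centroids."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, c)), label)
        for label, c in centroids.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(view1, view2, labels, rounds=5, per_round=1):
    """labels[i] is a class label, or None for an unlabeled example.

    In each round, a classifier trained on one view labels its most
    confident unlabeled examples; those labels join the shared pool,
    so the classifier on the other view benefits from them.
    """
    labels = list(labels)
    for _ in range(rounds):
        for view in (view1, view2):
            known = [i for i, l in enumerate(labels) if l is not None]
            unknown = [i for i, l in enumerate(labels) if l is None]
            if not unknown:
                return labels
            model = centroid_fit([view[i] for i in known],
                                 [labels[i] for i in known])
            scored = sorted(
                ((centroid_predict(model, view[i]), i) for i in unknown),
                key=lambda item: -item[0][1],
            )
            for (label, _margin), i in scored[:per_round]:
                labels[i] = label
    return labels


# Toy example: two numeric views of six examples, only the first two labeled.
view1 = [[0.0], [1.0], [0.1], [0.9], [0.2], [0.8]]
view2 = [[5.0], [9.0], [5.5], [8.5], [6.0], [8.0]]
final_labels = co_train(view1, view2, [0, 1, None, None, None, None])
```

On data where one view is uninformative, the confident-but-wrong labels it contributes propagate to the other view, which is consistent with the abstract's remark that results range from excellent to poor.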

10

Twinanda, Andru Putra. "Vision-based approaches for surgical activity recognition using laparoscopic and RBGD videos." Thesis, Strasbourg, 2017. http://www.theses.fr/2017STRAD005/document.

Abstract:

Cette thèse a pour objectif la conception de méthodes pour la reconnaissance automatique des activités chirurgicales. Cette reconnaissance est un élément clé pour le développement de systèmes réactifs au contexte clinique et pour des applications comme l’assistance automatique lors de chirurgies complexes. Nous abordons ce problème en utilisant des méthodes de Vision puisque l’utilisation de caméras permet de percevoir l’environnement sans perturber la chirurgie. Deux types de vidéos sont utilisées : des vidéos laparoscopiques et des vidéos multi-vues RGBD. Nous avons d’abord étudié les résultats obtenus avec les méthodes de l’état de l’art, puis nous avons proposé des nouvelles approches basées sur le « Deep learning ». Nous avons aussi généré de larges jeux de données constitués d’enregistrements de chirurgies. Les résultats montrent que nos méthodes permettent d’obtenir des meilleures performances pour la reconnaissance automatique d’activités chirurgicales que l’état de l’art
The main objective of this thesis is to address the problem of activity recognition in the operating room (OR). Activity recognition is an essential component in the development of context-aware systems, which will enable various applications, such as automated assistance during difficult procedures. Here, we focus on vision-based approaches, since cameras are a common source of information for observing the OR without disrupting the surgical workflow. Specifically, we propose to use two complementary video types: laparoscopic and OR-scene RGBD videos. We investigate how state-of-the-art computer vision approaches perform on these videos and propose novel deep learning approaches to carry out the tasks. To evaluate our proposed approaches, we generate large datasets of recordings of real surgeries. The results demonstrate that the proposed approaches outperform the state-of-the-art methods in performing surgical activity recognition on these new datasets.

11

Sublemontier, Jacques-Henri. "Classification non supervisée : de la multiplicité des données à la multiplicité des analyses." Phd thesis, Université d'Orléans, 2012. http://tel.archives-ouvertes.fr/tel-00801555.

Abstract:

Unsupervised classification (clustering) is a major problem at the boundary of several communities within Artificial Intelligence, Data Analysis and Cognitive Science. It aims to formalize and mechanize the cognitive task of classification, in order to automate it and make it applicable to a large number of objects (or individuals) to be classified. More application-oriented work is concerned with automatically organizing large sets of objects into groups that share common characteristics. This thesis proposes unsupervised classification methods that are applicable when several sources of information are available to complement and guide the search for one or more classifications of the data. For multi-view unsupervised classification, the first contribution proposes a mechanism that searches for local classifications suited to the data in each representation, together with a consensus among them. For semi-supervised classification, the second contribution proposes using external knowledge about the data to guide and improve the search for a classification of objects by any data partitioning algorithm. Finally, the third and last contribution proposes a collaborative environment that makes it possible to pursue, as desired, the objectives of consensus and of alternatives for the classification of single-representation or multi-representation objects. This last contribution thus addresses the various problems of multiplicity of data and of analyses in the context of unsupervised classification, and offers, within a single unifying platform, a solution to very active and current problems in Data Mining and in Knowledge Extraction and Management.

12

Braga, Ígor Assis. "Aprendizado semissupervisionado multidescrição em classificação de textos." Universidade de São Paulo, 2010. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-02062010-160019/.

Abstract:

Semi-supervised learning algorithms learn from a combination of labeled and unlabeled data. Thus, they can be applied in domains where few labeled examples and a vast amount of unlabeled examples are available. Furthermore, semi-supervised learning algorithms may achieve better performance than supervised learning algorithms trained on the same few labeled examples. A powerful approach to semi-supervised learning, called multi-view learning, can be used whenever the training examples are described by two or more disjoint sets of attributes. Text classification is a domain in which semi-supervised learning algorithms have shown some success. However, multi-view semi-supervised learning has not yet been well explored in this domain, despite the possibility of describing textual documents in a myriad of ways. The aim of this work is to analyze the effectiveness of multi-view semi-supervised learning in text classification, using unigrams and bigrams as two distinct descriptions of text documents. To this end, we initially consider the widely adopted CO-TRAINING multi-view algorithm and propose some modifications to it in order to deal with the problem of contention points. We also propose the COAL algorithm, which further improves CO-TRAINING by incorporating active learning as a way of dealing with contention points. A thorough experimental evaluation of these algorithms was conducted on real text data sets. The results show that the COAL algorithm, using unigrams as one description of text documents and bigrams as another, achieves significantly better performance than a single-view semi-supervised algorithm. Taking into account the good results obtained by COAL, we conclude that the use of unigrams and bigrams as two distinct descriptions of text documents can be very effective.
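
The two descriptions mentioned in the abstract can be illustrated with a minimal sketch. It only shows how one document might be turned into a unigram view and a bigram view; it is not the thesis's CO-TRAINING or COAL implementation. The function names `ngrams` and `two_views`, and the whitespace tokenization, are assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def two_views(text):
    """Build two descriptions of one document (hypothetical helper).

    View 1 counts unigrams, view 2 counts bigrams. Tokenization is a
    plain lowercase whitespace split, which is an assumption.
    """
    tokens = text.lower().split()
    unigram_view = Counter(ngrams(tokens, 1))
    bigram_view = Counter(ngrams(tokens, 2))
    return unigram_view, bigram_view


uni, bi = two_views("The cat sat on the mat")
print(uni[("the",)])        # count of the unigram "the"
print(bi[("the", "cat")])   # count of the bigram "the cat"
```

Because unigram keys have length one and bigram keys have length two, the two attribute sets are disjoint, which matches the condition the abstract states for multi-view learning.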

13

Robbeloth, Michael Christopher. "Recognition of Incomplete Objects based on Synthesis of Views Using a Geometric Based Local-Global Graphs." Wright State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=wright1557509373174391.

14

Liu, Fang. "Efficient Online Learning with Bandit Feedback." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1587680990430268.

15

Tumati, Saini. "A Combined Approach to Handle Multi-class Imbalanced Data and to Adapt Concept Drifts using Machine Learning." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1623240328088387.

16

Patel, Neel R. "CHARACTERIZING GLOBAL REGULATORY PATTERNS OF TRANSCRIPTION FACTORS ON SYSTEMS-WIDE SCALE USING MULTI-OMICS DATASETS AND MACHINE LEARNING." Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case1626284802198267.

17

Walton, Ashley E. "Multi-scaled assessment for predicting pain experience in adolescents with Sickle Cell Disease." University of Cincinnati / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1522332374293073.

18

Bisig, Caleb R. "Modular Decentralized Genetic Fuzzy Control for Multi-UAV Slung Payloads." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1617106491512366.

19

Chintalapati, Veera Venkata Tarun Kartik. "Multi-Vehicle Path Following and Adversarial Agent Detection in Constrained Environments." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1613751238253121.

20

Maguluri, Naga Sai Nikhil. "Multi-Class Classification of Textual Data: Detection and Mitigation of Cheating in Massively Multiplayer Online Role Playing Games." Wright State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=wright1494248022049882.

21

Hulbert, Sarah Marie. "Biophysical Approaches for the Multi-System Analysis of Neural Control of Movement and Neurologic Rehabilitation." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1534678369235538.

22

Li, Yichao. "Algorithmic Methods for Multi-Omics Biomarker Discovery." Ohio University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1541609328071533.

23

Curnalia, James W. "The Impact of Training Epoch Size on the Accuracy of Collaborative Filtering Models in GraphChi Utilizing a Multi-Cyclic Training Regimen." Youngstown State University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=ysu1370016838.

24

Partin, Michael. "Scalable, Pluggable, and Fault Tolerant Multi-Modal Situational Awareness Data Stream Management Systems." Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1567073723628721.

25

Taslimitehrani, Vahid. "Contrast Pattern Aided Regression and Classification." Wright State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=wright1459377694.

26

Siddiqui, Mohammad Faridul Haque. "A Multi-modal Emotion Recognition Framework Through The Fusion Of Speech With Visible And Infrared Images." University of Toledo / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1556459232937498.

27

Liu, Yuzhou. "Deep CASA for Robust Pitch Tracking and Speaker Separation." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1566179636974186.

28

Karvir, Hrishikesh. "Design and Validation of a Sensor Integration and Feature Fusion Test-Bed for Image-Based Pattern Recognition Applications." Wright State University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=wright1291753291.

29

OLAOYE, ISRAEL A. "WATER QUALITY MODELING OF THE OLD WOMAN CREEK WATERSHED, OHIO, UNDER THE INFLUENCE OF CLIMATE CHANGE TO YEAR 2100." Kent State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=kent1605955492844115.

30

Fallahtafti, Alireza. "Developing Risk-Minimizing Vehicle Routing Problem for Transportation of Valuables: Models and Algorithms." Ohio University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1627568962315484.

31

Jin, Gaole. "On surrogate supervision multi-view learning." Thesis, 2012. http://hdl.handle.net/1957/37997.

Abstract:

Data can be represented in multiple views. Traditional multi-view learning methods (e.g., co-training and multi-task learning) focus on improving learning performance using information from the auxiliary view, even though information from the target view is sufficient for the learning task. This work, however, addresses a semi-supervised case of multi-view learning, surrogate supervision multi-view learning, in which labels are available only on limited views and a classifier must be obtained on the target view, where labels are missing. In surrogate supervision multi-view learning, one cannot obtain a classifier without information from the auxiliary view. To solve this challenging problem, we propose discriminative and generative approaches.
Graduation date: 2013

32

"Multi-view machine learning for integration of brain imaging and (epi)genomics data." Tulane University, 2021.

33

Åleskog, Christoffer. "Graph-based Multi-view Clustering for Continuous Pattern Mining." Thesis, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-21850.

Abstract:

Background. In many smart monitoring applications, such as smart healthcare, smart buildings and autonomous cars, data are collected from multiple sources and contain information about different perspectives (views) of the monitored phenomenon, physical object or system. In addition, in many of those applications relevant labelled data are scarce or even non-existent. Inspired by this, in this thesis we propose a novel algorithm for multi-view stream clustering. The algorithm can be applied to continuous pattern mining and labeling of streaming data. Objectives. The main objective of this thesis is to develop and implement a novel multi-view stream clustering algorithm. In addition, the potential of the proposed algorithm is studied and evaluated on two datasets: one synthetic and one real-world. The experiments compare the new algorithm's performance with that of a single-view clustering algorithm and of an algorithm that does not transfer knowledge between chunks. Finally, the obtained results are analyzed, discussed and interpreted. Methods. Initially, we study the state-of-the-art multi-view (stream) clustering algorithms. Then we develop our multi-view clustering algorithm for streaming data by implementing a transfer-of-knowledge feature. We present and explain the developed algorithm in detail, motivating each choice made during the algorithm design phase. Finally, the algorithm configuration, the experimental setup and the datasets chosen for the experiments are presented and motivated. Results. Different configurations of the proposed algorithm have been studied and evaluated under different experimental scenarios on the two datasets. The proposed multi-view clustering algorithm has demonstrated higher performance on the synthetic data than on the real-world dataset, mainly because of the poor quality of the real-world data used. Conclusions. The proposed algorithm has demonstrated higher performance on the synthetic dataset than on the real-world dataset. It can generate high-quality clustering solutions with respect to the evaluation metrics used. In addition, the transfer-of-knowledge feature has been shown to have a positive effect on the algorithm's performance. A further study of the proposed algorithm on richer and more suitable datasets, e.g., data collected from numerous sensors monitoring some phenomenon, is planned as future work.

34

Kuo, Wei-Yuan, and 郭瑋元. "A Real-time Basketball Action Recognition based on Machine Learning Algorithm in Multi-View Environment." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/et723c.

Abstract:

Master's
National Central University
Department of Communication Engineering
105
Human action recognition has been an important research topic in computer vision and computer graphics. It is widely used in entertainment, sports, medical applications and surveillance systems. Traditional motion capture equipment is usually not affordable for ordinary developers. With the reasonable price of the Kinect camera, low-cost human motion recognition becomes possible. In this paper, we use multiple Kinect sensors and the Kinect SDK as the tools to build our human action recognition system, which addresses the cost of action recognition equipment. Using multiple Kinect cameras mitigates judging and correction errors (such as self-occlusion and image noise), and using a machine learning method to classify our features yields higher recognition performance. Our method also includes basketball detection to handle the case in which the subject has no ball, which makes the results more reasonable. Overall, the system achieves an action recognition rate of more than 90% in real-time use for the three trained behaviors: right-hand dribble, left-hand dribble, and shooting.

35

(9187466), Bharath Kumar Comandur Jagannathan Raghunathan. "Semantic Labeling of Large Geographic Areas Using Multi-Date and Multi-View Satellite Images and Noisy OpenStreetMap Labels." Thesis, 2020.

Abstract:

This dissertation addresses the problem of how to design a convolutional neural network (CNN) for giving semantic labels to the points on the ground given the satellite image coverage over the area and, for the ground truth, given the noisy labels in OpenStreetMap (OSM). This problem is made challenging by the fact that (1) most of the images are likely to have been recorded from off-nadir viewpoints for the area of interest on the ground; (2) the user-supplied labels in OSM are frequently inaccurate and, not uncommonly, entirely missing; and (3) the size of the area covered on the ground must be large enough to possess any engineering utility. As this dissertation demonstrates, solving this problem requires that we first construct a DSM (Digital Surface Model) from a stereo fusion of the available images, and subsequently use the DSM to map the individual pixels in the satellite images to points on the ground. That creates an association between the pixels in the images and the noisy labels in OSM. The CNN-based solution we present yields a 4-8% improvement in the per-class segmentation IoU (Intersection over Union) scores compared to the traditional approaches that use the views independently of one another. The system we present is end-to-end automated, which facilitates comparing the classifiers trained directly on true orthophotos vis-à-vis first training them on the off-nadir images and subsequently translating the predicted labels to geographical coordinates. This work also presents, for arguably the first time, an in-depth discussion of large-area image alignment and DSM construction using tens of true multi-date and multi-view WorldView-3 satellite images on a distributed OpenStack cloud computing platform.
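
For readers unfamiliar with the per-class IoU metric quoted above, the following is a minimal sketch of how it is commonly computed from flat lists of predicted and ground-truth labels. The function name `per_class_iou` and the toy label lists are illustrative assumptions and are not taken from the dissertation.

```python
def per_class_iou(pred, truth, num_classes):
    """Intersection over Union for each class label (hypothetical helper).

    pred and truth are equal-length sequences of integer class labels.
    A class that appears in neither sequence gets None.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        ious.append(inter / union if union else None)
    return ious


pred = [0, 0, 1, 1, 2]
truth = [0, 1, 1, 1, 2]
print(per_class_iou(pred, truth, 3))
```

In this toy case class 0 has one shared pixel out of two in the union, class 1 has two out of three, and class 2 matches exactly.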

36

(10157291), Yi-Yu Lai. "Relational Representation Learning Incorporating Textual Communication for Social Networks." Thesis, 2021.

Abstract:

Representation learning (RL) for social networks facilitates real-world tasks such as visualization, link prediction and friend recommendation. Many methods have been proposed in this area to learn continuous low-dimensional embedding of nodes, edges or relations in social and information networks. However, most previous network RL methods neglect social signals, such as textual communication between users (nodes). Unlike more typical binary features on edges, such as post likes and retweet actions, social signals are more varied and contain ambiguous information. This makes it more challenging to incorporate them into RL methods, but the ability to quantify social signals should allow RL methods to better capture the implicit relationships among real people in social networks. Second, most previous work in network RL has focused on learning from homogeneous networks (i.e., single type of node, edge, role, and direction) and thus, most existing RL methods cannot capture the heterogeneous nature of relationships in social networks. Based on these identified gaps, this thesis aims to study the feasibility of incorporating heterogeneous information, e.g., texts, attributes, multiple relations and edge types (directions), to learn more accurate, fine-grained network representations.

In this dissertation, we discuss a preliminary study and outline three major works that aim to incorporate textual interactions to improve relational representation learning. The preliminary study learns a joint representation that captures the textual similarity in content between interacting nodes. The promising results motivate us to pursue broader research on using social signals for representation learning. The first major component aims to learn explicit node and relation embeddings in social networks. Traditional knowledge graph (KG) completion models learn latent representations of entities and relations by interpreting them as translations operating on the embedding of the entities. However, existing approaches do not consider textual communications between users, which contain valuable information to provide meaning and context for social relationships. We propose a novel approach that incorporates textual interactions between each pair of users to improve representation learning of both users and relationships. The second major component focuses on analyzing how users interact with each other via natural language content. Although the data is interconnected and dependent, previous research has primarily focused on modeling the social network behavior separately from the textual content. In this work, we model the data in a holistic way, taking into account the connections between the social behavior of users and the content generated when they interact, by learning a joint embedding over user characteristics and user language. In the third major component, we consider the task of learning edge representations in social networks. Edge representations are especially beneficial as we need to describe or explain the relationships, activities, and interactions among users. However, previous work in this area lacks well-defined edge representations and ignores the relational signals over multiple views of social networks, which typically contain multi-view contexts (due to multiple edge types) that need to be considered when learning the representation. We propose a new methodology that captures asymmetry in multiple views by learning well-defined edge representations and incorporates textual communications to identify multiple sources of social signals that moderate the impact of different views between users.

37

JANG, JIAN JIA-CHING, and 張簡嘉慶. "Multi-View Face Recognition by Convolutionary Extreme Learning Machines." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/88fbcj.

Abstract:

Master's
National University of Kaohsiung
Department of Electrical Engineering, Master's and Doctoral Program
105
An Artificial Neural Network (ANN) is one of the methods for implementing the core learning engine for face image recognition. However, the difficulty of determining effective network architectures and learning weights makes ANNs hard to realize in practical applications. The Extreme Learning Machine (ELM) is an improved version of the ANN that employs a simpler network architecture and training process to implement learning systems efficiently. This study integrates ELM with several enhancements for effective face image recognition. First, convolution is used to extract the features of face images. Second, pooling is used to reduce the very high dimension of the feature vectors of face images. With convolution and pooling, features and models for face image recognition can be obtained with less training time. Furthermore, most face recognition systems detect only the frontal face image as the target for recognition. In practical applications, an incorrect camera capturing angle may result in the loss or corruption of some image features and hence affect the recognition accuracy. In this study, multi-view face images from different capturing angles are extracted for training multi-view face recognition models. A variety of kernels and pooling methods are tested and compared. The performance of face recognition using single-view and multi-view methods is also compared and discussed. The experimental results show that our method improves the training performance of ELM for face recognition with satisfactory recognition accuracy.
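
The dimension-reduction role of pooling described above can be shown with a small sketch. It assumes non-overlapping max pooling over a feature map stored as a list of lists. The abstract says several pooling methods were compared, so max pooling and the function name `max_pool2d` are illustrative choices rather than the thesis's stated configuration.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling over a 2-D list of numbers.

    Each size x size block is replaced by its maximum, so a 4x4 map
    becomes 2x2 when size is 2. Rows and columns that do not fill a
    whole block are dropped (an assumption of this sketch).
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + dr][c + dc]
                for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


fmap = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(max_pool2d(fmap))  # 16 values reduced to 4
```

Reducing the feature vector this way shrinks the input to the later learning stage, which is consistent with the shorter training time the abstract reports.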
