Dissertations / Theses on the topic 'Neural Network Embeddings'

Consult the top 50 dissertations / theses for your research on the topic 'Neural Network Embeddings.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Embretsén, Niklas. "Representing Voices Using Convolutional Neural Network Embeddings." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-261415.

Abstract:
In today’s society, services centered around voices are gaining popularity. Being able to provide users with voices they like, to obtain and sustain their attention, is important for enhancing the overall experience of the service. Finding an efficient way of representing voices such that similarity comparisons can be performed is therefore of great use. In the field of Natural Language Processing, great progress has been made using embeddings from Deep Learning models to represent words in an unsupervised fashion, and these representations managed to capture the semantics of the words. This thesis sets out to explore whether such embeddings can be found for audio data as well, more specifically voices of audiobook narrators, that capture similarities between different voices. For this, two different Convolutional Neural Networks are developed and evaluated, trained on spectrogram representations of the voices. One performs regular classification, while the other uses pairwise relationships and a Kullback–Leibler divergence based loss function, in an attempt to minimize and maximize the difference of the output between similar and dissimilar pairs of samples. From these models, the embeddings used to represent each sample are extracted from the different layers of the fully connected part of the network during the evaluation. Both an objective and a subjective evaluation are performed. The objective evaluation first investigates whether the found embeddings are distinct for the different narrators, as well as whether the embeddings encode information about gender. The regular classification model is then further evaluated through a user test, as it achieved an order of magnitude better results during the objective evaluation. The user test evaluates whether the found embeddings capture information based on perceived similarity. It is concluded that the proposed approach has the potential to be used for representing voices in a way that encodes similarity, although more extensive testing, research and evaluation have to be performed to know for sure. For future work, it is proposed to perform more sophisticated pre-processing of the data and to collect and include data about relationships between voices during the training of the models.
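The pairwise objective described above can be sketched in a few lines. The following is a minimal illustration in PyTorch (an assumed framework; the thesis does not name one), where a symmetric Kullback–Leibler term is minimized for same-narrator pairs and pushed above a margin for different-narrator pairs. The margin value and tensor shapes are placeholders, not the thesis's settings.

```python
import torch
import torch.nn.functional as F

def symmetric_kl(logits_a, logits_b):
    """Symmetric KL divergence between the softmax outputs of two samples."""
    log_p = F.log_softmax(logits_a, dim=1)
    log_q = F.log_softmax(logits_b, dim=1)
    kl_pq = F.kl_div(log_q, log_p.exp(), reduction="batchmean")  # KL(P || Q)
    kl_qp = F.kl_div(log_p, log_q.exp(), reduction="batchmean")  # KL(Q || P)
    return kl_pq + kl_qp

def pairwise_loss(logits_a, logits_b, same_narrator, margin=2.0):
    """Minimize divergence for similar pairs, push dissimilar pairs past a margin."""
    d = symmetric_kl(logits_a, logits_b)
    if same_narrator:
        return d
    return torch.clamp(margin - d, min=0.0)

# Toy usage: two batches of CNN outputs over 10 narrator classes.
a, b = torch.randn(4, 10), torch.randn(4, 10)
print(pairwise_loss(a, b, same_narrator=False))
```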
2

Bopaiah, Jeevith. "A recurrent neural network architecture for biomedical event trigger classification." UKnowledge, 2018. https://uknowledge.uky.edu/cs_etds/73.

Abstract:
A “biomedical event” is a broad term used to describe the roles and interactions between entities (such as proteins, genes and cells) in a biological system. The task of biomedical event extraction aims at identifying and extracting these events from unstructured texts. An important component in the early stage of the task is biomedical trigger classification, which involves identifying and classifying words/phrases that indicate an event. In this thesis, we present our work on biomedical trigger classification developed using the multi-level event extraction dataset. We restrict the scope of our classification to 19 biomedical event types grouped under four broad categories: Anatomical, Molecular, General and Planned. While most of the existing approaches are based on traditional machine learning algorithms which require extensive feature engineering, our model relies on neural networks to implicitly learn important features directly from the text. We use natural language processing techniques to transform the text into vectorized inputs that can be used in a neural network architecture. To the best of our knowledge, this is the first time neural attention strategies have been explored in the area of biomedical trigger classification. Our best results were obtained from an ensemble of 50 models, which produced a micro F-score of 79.82%, an improvement of 1.3% over the previous best score.
3

PALUMBO, ENRICO. "Knowledge Graph Embeddings for Recommender Systems." Doctoral thesis, Politecnico di Torino, 2020. http://hdl.handle.net/11583/2850588.

4

Pettersson, Fredrik. "Optimizing Deep Neural Networks for Classification of Short Texts." Thesis, Luleå tekniska universitet, Datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-76811.

Abstract:
This master's thesis investigates how a state-of-the-art (SOTA) deep neural network (NN) model can be created for a specific natural language processing (NLP) dataset, the effects of using different dimensionality reduction techniques on common pre-trained word embeddings, and how well this model generalizes to a secondary dataset. The research is motivated by two factors. One is that the construction of a machine learning (ML) text classification (TC) model is typically done around a specific dataset and often requires a lot of manual intervention; it is therefore hard to know exactly what procedures to implement for a specific dataset and how the result will be affected. The other is that, if the dimensionality of pre-trained embedding vectors can be lowered without losing accuracy, and thus saving execution time, other techniques can be used during the time saved to achieve even higher accuracy. A handful of deep neural network architectures are used, namely a convolutional neural network (CNN), a long short-term memory network (LSTM) and a bidirectional LSTM (Bi-LSTM) architecture. These architectures are combined with four different word embeddings: GoogleNews-vectors-negative300, glove.840B.300d, paragram_300_sl999 and wiki-news-300d-1M. Three main experiments are conducted in this thesis. In the first experiment, a top-performing TC model is created for a recent NLP competition held at Kaggle.com. Each implemented procedure is benchmarked on how it affects the accuracy and execution time of the model. In the second experiment, principal component analysis (PCA) and random projection (RP) are applied to the pre-trained word embeddings used in the top-performing model to investigate how accuracy and execution time are affected when creating lower-dimensional embedding vectors. In the third experiment, the same model is benchmarked on a separate dataset (Sentiment140) to investigate how well it generalizes to other data and how each implemented procedure affects the accuracy compared to the original dataset. The first experiment results in a bidirectional LSTM model and a combination of three embeddings concatenated together: glove, paragram and wiki-news. The model gives predictions with an F1 score of 71%, good enough to reach 9th place out of 1,401 participating teams in the competition. In the second experiment, using PCA improves execution time by 13% while lowering the dimensionality of the embeddings by 66% and losing only half a percent of F1 accuracy. RP gives a constant accuracy of 66-67% regardless of the projected dimensions, compared to over 70% when using PCA. In the third experiment, the model gains around 12% accuracy from the initial to the final benchmarks, compared to 19% on the competition dataset. The best accuracy achieved on the Sentiment140 dataset is 86%, higher than the 71% achieved on the Quora dataset.
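To make the second experiment's PCA step concrete, the sketch below reduces 300-dimensional vectors to 100 dimensions (the roughly 66% reduction reported above) with scikit-learn; the random matrix merely stands in for a real pre-trained table such as glove.840B.300d.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical embedding matrix: one 300-dim vector per vocabulary word.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10000, 300)).astype(np.float32)

# Project the 300-dim vectors down to 100 dims, a 66% reduction.
pca = PCA(n_components=100)
reduced = pca.fit_transform(embeddings)

print(reduced.shape)                         # (10000, 100)
print(pca.explained_variance_ratio_.sum())   # fraction of variance retained
```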
5

Revanur, Vandan, and Ayodeji Ayibiowu. "Automatic Generation of Descriptive Features for Predicting Vehicle Faults." Thesis, Högskolan i Halmstad, CAISR Centrum för tillämpade intelligenta system (IS-lab), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-42885.

Abstract:
Predictive Maintenance (PM) has been increasingly adopted in the automotive industry in recent decades, alongside conventional approaches such as Preventive Maintenance and Diagnostic/Corrective Maintenance, since it offers the advantage of proactively estimating a failure before its actual occurrence while adapting to the present status of the vehicle, in turn allowing flexible maintenance schedules for efficient repair or replacement of faulty components. PM necessitates the storage and analysis of large amounts of sensor data. This requirement can be a challenge when deploying this method on board vehicles due to the limited storage and computational power of the vehicle's hardware. Hence, this thesis seeks to obtain low-dimensional descriptive features from high-dimensional data using Representation Learning. This low-dimensional representation is used for predicting vehicle faults, specifically Turbocharger-related failures. Since the Logged Vehicle Data (LVD) was the basis of all the data utilized in this thesis, it allowed for the evaluation of large populations of trucks without requiring additional measuring devices and facilities. A gradual-degradation methodology is adopted for describing vehicle condition, which allows modeling the malfunction/failure as a continuous process rather than a discrete flip from a healthy to an unhealthy state. This approach eliminates the challenge of data imbalance between healthy and unhealthy samples. Two important hypotheses are presented. Firstly, Parallel Stacked Classical Autoencoders produce better representations than individual Autoencoders. Secondly, employing Learned Embeddings on Categorical Variables improves the performance of the dimensionality reduction. Based on these hypotheses, a model architecture is proposed and developed on the LVD. The model is shown to achieve good performance, close to the previous state-of-the-art research. This thesis, finally, illustrates the potential of applying parallel stacked architectures with Learned Embeddings for the categorical features, and a combination of feature selection and extraction for numerical features, to predict the Remaining Useful Life (RUL) of a vehicle, in the context of the Turbocharger. A performance improvement of 21.68% with respect to the Mean Absolute Error (MAE) loss with an 80.42% reduction in the size of data was observed.
6

Murugan, Srikala. "Determining Event Outcomes from Social Media." Thesis, University of North Texas, 2020. https://digital.library.unt.edu/ark:/67531/metadc1703427/.

Abstract:
An event is something that happens at a time and location. Events include major life events such as graduating college or getting married, and also simple day-to-day activities such as commuting to work or eating lunch. Most work on event extraction detects events and the entities involved in events. For example, cooking events will usually involve a cook, some utensils and appliances, and a final product. In this work, we target the task of determining whether events result in their expected outcomes. Specifically, we target cooking and baking events, and characterize event outcomes into two categories. First, we distinguish whether something edible resulted from the event. Second, if something edible resulted, we distinguish between perfect, partial and alternative outcomes. The main contributions of this thesis are a corpus of 4,000 tweets annotated with event outcome information and experimental results showing that the task can be automated. The corpus includes tweets that have only text as well as tweets that have text and an image.
7

De, Vine Lance. "Analogical frames by constraint satisfaction." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/198036/1/Lance_De%20Vine_Thesis.pdf.

Abstract:
This research develops a new and efficient constraint satisfaction approach to the unsupervised discovery of linguistic analogies. It shows that systems of analogies can be discovered with high confidence in natural language text by a computer program without human input. The discovery of analogies is useful for many applications such as the construction of linguistic resources, natural language processing and the automation of inference and reasoning.
8

Horn, Franziska [Verfasser], Klaus-Robert [Akademischer Betreuer] [Gutachter] Müller, Alan [Gutachter] Akbik, and Ziawasch [Gutachter] Abedjan. "Similarity encoder: A neural network architecture for learning similarity preserving embeddings / Franziska Horn ; Gutachter: Klaus-Robert Müller, Alan Akbik, Ziawasch Abedjan ; Betreuer: Klaus-Robert Müller." Berlin : Technische Universität Berlin, 2020. http://d-nb.info/1210998386/34.

9

Šůstek, Martin. "Word2vec modely s přidanou kontextovou informací." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2017. http://www.nusl.cz/ntk/nusl-363837.

Abstract:
This thesis is concerned with the explanation of word2vec models. Even though word2vec was introduced recently (2013), many researchers have already tried to extend, understand, or at least use the model, because it provides surprisingly rich semantic information. This information is encoded in an N-dimensional vector representation and can be recalled by performing algebraic operations on the vectors. In addition, I suggest model modifications in order to obtain different word representations. To achieve that, I use public picture datasets. The thesis also includes parts dedicated to a word2vec extension based on convolutional neural networks.
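The "algebraic operations" mentioned above are the familiar word2vec vector arithmetic, e.g. king - man + woman ≈ queen. A minimal sketch with gensim follows (an assumed library choice; the toy corpus is far too small to produce meaningful analogies and only demonstrates the API).

```python
from gensim.models import Word2Vec

# Toy corpus; a real model would be trained on a large text collection.
sentences = [
    ["king", "rules", "the", "kingdom"],
    ["queen", "rules", "the", "kingdom"],
    ["man", "walks"],
    ["woman", "walks"],
]
model = Word2Vec(sentences, vector_size=50, min_count=1, epochs=50, seed=1)

# Vector arithmetic: king - man + woman should land near queen,
# but only approximately and only with enough training data.
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```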
10

Sarti, Paolo. "Embeddings for text classification with recurrent neural networks." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Abstract:
The importance of automatic methods for classifying and extracting information from texts has grown significantly in recent years, owing to the ever-increasing production of this kind of data, especially through web platforms. This has led to the development of new algorithms for analyzing unstructured text. Embedding techniques, which map words or variable-length passages of text to fixed-size vectors while preserving semantic-similarity relations, have been a major advance for the field of Natural Language Processing. In addition, advances in Deep Learning have significantly improved text classification, thanks to refinements of recurrent neural network architectures capable of processing variable-length sequences. The goal of this work was to build a prototype that uses these techniques to classify documents and extract passages of text. The reference domain consisted of administrative documents drafted by notaries. For classification, LSTM recurrent networks and two kinds of embedding were used: word-level and sentence-level. The first technique performed better on the test set of documents, reaching 98.8% accuracy, while the second stopped at 96.7%. The extraction of relevant passages was framed as a multi-class classification problem at the level of individual sentences, using word embeddings and LSTM recurrent networks. Overall accuracy reached 85.5% on the test set, though with uneven results across the individual classes. However, confusion between the classes representing the information to be extracted was low. The predictive models were integrated into a prototype, which also made it possible to verify qualitatively the good performance of the employed techniques.
11

Barhoumi, Amira. "Une approche neuronale pour l’analyse d’opinions en arabe." Thesis, Le Mans, 2020. http://www.theses.fr/2020LEMA1022.

Abstract:
My thesis is part of Arabic sentiment analysis. Its aim is to determine the global polarity of a given textual statement written in MSA or dialectal Arabic. This research area has been the subject of numerous studies dealing with Indo-European languages, in particular English. One of the difficulties confronting this thesis is the processing of Arabic. In fact, Arabic is a morphologically rich language, which implies greater sparsity; we want to overcome this problem by producing, in a completely automatic way, new Arabic-specific embeddings. Our study focuses on the use of a neural approach to improve polarity detection, using embeddings. These embeddings have proved fundamental in various natural language processing (NLP) tasks. Our contribution in this thesis concerns several axes. First, we begin with a preliminary study of the various existing pre-trained word embedding resources in Arabic. These embeddings consider words as space-separated units in order to capture semantic and syntactic similarities in the embedding space. Second, we focus on the specificity of the Arabic language. We propose Arabic-specific embeddings that take into account the agglutination and morphological richness of Arabic. These specific embeddings have been used, alone and in combination, as input to neural networks, providing an improvement in terms of classification performance. Finally, we evaluate embeddings with intrinsic and extrinsic methods specific to the sentiment analysis task. For intrinsic embedding evaluation, we propose a new protocol introducing the notion of sentiment stability in the embedding space. We also propose a qualitative extrinsic analysis of our embeddings by using visualisation methods.
12

Bekkouch, Imad Eddine Ibrahim. "Auxiliary learning & Adversarial training pour les études des manuscrits médiévaux." Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUL014.

Abstract:
This thesis is at the intersection of musicology and artificial intelligence, aiming to leverage AI to help musicologists with repetitive work, such as object searching in the museum's manuscripts. We annotated four new datasets for medieval manuscript studies: AMIMO, AnnMusiconis, AnnVihuelas, and MMSD. In the second part, we improve object detectors' performance using Transfer Learning techniques and Few-Shot Object Detection. In the third part, we discuss a powerful approach to Domain Adaptation, auxiliary learning, where we train the model on the target task and an extra task that allows for better stabilization of the model and reduces over-fitting. Finally, we discuss self-supervised learning, which does not use extra meta-data, by leveraging the adversarial learning approach, forcing the model to extract domain-independent features.
13

Chen, Beichen. "Stylometric Embeddings for Book Similarities." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-303125.

Abstract:
Stylometry is the field of research aimed at defining features for quantifying writing style, and the most studied question in stylometry has been authorship attribution: given a set of texts with known authorship, we are asked to determine the author of a new, unseen document. In this study, a number of lexical and syntactic stylometric feature sets were extracted for two datasets, a smaller one containing 27 books by 25 authors and a larger one containing 11,063 books by 316 authors. Neural networks were used to transform the features into embeddings, after which the nearest-neighbor method was used to attribute texts to their closest neighbor. The smaller dataset achieved an accuracy of 91.25% using frequencies of the 50 most common function words, dependency relations, and part-of-speech (POS) tags as features, and the larger dataset achieved 69.18% accuracy using a similar feature set with the 100 most common function words. In addition to performing author attribution, a user test showed the potential of the model to generate author similarities and hence be useful in an applied setting for recommending books to readers based on author style.
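To make the attribution pipeline concrete, here is a minimal nearest-neighbor sketch over function-word frequencies using scikit-learn. It deliberately omits the neural embedding step the thesis adds on top, and the toy texts, author names and 20-word function-word list are hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.neighbors import NearestNeighbors

# Hypothetical mini-corpus: two texts with known authors, one unknown text.
known_texts = ["the and of to in that it was his he said then",
               "a I you not be this but on with as for at by"]
known_authors = ["Author A", "Author B"]
unknown_text = "the and of to it was in that his he came and went"

# Frequencies of common function words as the stylometric feature set
# (the thesis used the 50 or 100 most common function words).
function_words = ["the", "and", "of", "to", "in", "that", "it", "was",
                  "his", "he", "a", "i", "you", "not", "be", "this",
                  "but", "on", "with", "as"]
vec = CountVectorizer(vocabulary=function_words, token_pattern=r"(?u)\b\w+\b")
X = vec.transform(known_texts).toarray().astype(float)
X /= X.sum(axis=1, keepdims=True)            # relative frequencies

q = vec.transform([unknown_text]).toarray().astype(float)
q /= q.sum(axis=1, keepdims=True)

# Attribute the unknown text to its nearest neighbor in feature space.
nn = NearestNeighbors(n_neighbors=1).fit(X)
_, idx = nn.kneighbors(q)
print("Attributed to:", known_authors[idx[0][0]])
```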
14

Fong, Vivian Lin. "Software Requirements Classification Using Word Embeddings and Convolutional Neural Networks." DigitalCommons@CalPoly, 2018. https://digitalcommons.calpoly.edu/theses/1851.

Abstract:
Software requirements classification, the practice of categorizing requirements by their type or purpose, can improve organization and transparency in the requirements engineering process and thus promote requirement fulfillment and software project completion. Automating requirements classification is a prominent area of research, as automation can alleviate the tedium of manual labeling and reduce its need for domain expertise. This thesis explores the application of deep learning techniques to software requirements classification, specifically the use of word embeddings for document representation when training a convolutional neural network (CNN). As past research endeavors mainly utilize information retrieval and traditional machine learning techniques, we examine the potential of deep learning on this particular task. With the support of learning libraries such as TensorFlow and Scikit-Learn and word embedding models such as word2vec and fastText, we build a Python system that trains and validates configurations of Naïve Bayes and CNN requirements classifiers. Applying our system to a suite of experiments on two well-studied requirements datasets, we recreate or establish the Naïve Bayes baselines and evaluate the impact of CNNs equipped with word embeddings trained from scratch versus word embeddings pre-trained on Big Data.
15

Okuno, Akifumi. "Studies on Neural Network-Based Graph Embedding and Its Extensions." Kyoto University, 2020. http://hdl.handle.net/2433/259075.

16

Gilljam, Daniel, and Mario Youssef. "Jämförelse av artificiella neurala nätverksalgoritmer för klassificering av omdömen." Thesis, KTH, Hälsoinformatik och logistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230660.

Abstract:
With a large amount of data in the form of customer reviews, it can be time-consuming to manually go through each review and decide whether its sentiment is positive or negative. This thesis was carried out to automatically classify customer reviews as positive or negative, which was handled using machine learning. Three different deep neural networks were tested on larger and smaller datasets and compared with the help of two different frameworks, TensorFlow and Keras. Different embedding methods were also tested with the neural networks. The best combination of neural network, framework and embedding method was a Convolutional Neural Network (CNN) that used the word embedding method Word2Vec, was written in the Keras framework and gave an accuracy of approximately 88.87% with a deviation of approximately 0.4%. The CNN scored the best result in all tests, ahead of the other two neural networks, the Recurrent Neural Network (RNN) and the Convolutional Recurrent Neural Network (CRNN).
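A compact Keras text CNN of the kind compared above might look as follows. This is an illustrative sketch rather than the thesis's exact architecture: the layer sizes are guesses, the dummy arrays stand in for tokenized reviews, and in the thesis the Embedding layer would be initialized with Word2Vec vectors instead of being trained from scratch.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, embed_dim, max_len = 20000, 300, 100

# A small text CNN; the Embedding layer would normally be initialized
# with pre-trained Word2Vec vectors rather than trained from scratch.
model = keras.Sequential([
    layers.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, embed_dim),
    layers.Conv1D(128, kernel_size=5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # positive vs. negative review
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Dummy integer sequences standing in for tokenized reviews.
x = np.random.randint(0, vocab_size, size=(32, max_len))
y = np.random.randint(0, 2, size=(32, 1))
model.fit(x, y, epochs=1, verbose=0)
```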
17

Nguyen, Gia Hung. "Modèles neuronaux pour la recherche d'information : approches dirigées par les ressources sémantiques." Thesis, Toulouse 3, 2018. http://www.theses.fr/2018TOU30233.

Abstract:
In this thesis, we focus on bridging the semantic gap between document and query representations, and hence on improving matching performance. We propose to combine relational semantics from knowledge resources with distributed semantics of the corpus inferred by neural models. Our contributions consist of two main aspects: (1) improving distributed representations of text for IR tasks. We propose two models that integrate relational semantics into the distributed representations: a) an offline model that combines two types of pre-trained representations to obtain a hybrid representation of the document; b) an online model that jointly learns distributed representations of documents, concepts and words. To better integrate relational semantics from knowledge resources, we propose two approaches to inject these relational constraints, one based on the regularization of the objective function, the other based on instances in the training text. (2) Exploiting neural networks for semantic matching of documents. We propose a neural model for document-query matching. Our neural model relies on: a) a representation of raw data that models the relational semantics of text by jointly considering objects and relations expressed in a knowledge resource, and b) an end-to-end neural architecture that learns query-document relevance by leveraging the distributional and relational semantics of documents and queries.
18

Adewumi, Oluwatosin. "Word Vector Representations using Shallow Neural Networks." Licentiate thesis, Luleå tekniska universitet, EISLAB, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-83578.

Abstract:
This work highlights some important factors for consideration when developing word vector representations and data-driven conversational systems. The neural network methods for creating word embeddings have gained more prominence than their older, count-based counterparts. However, there are still challenges, such as prolonged training time and the need for more data, especially with deep neural networks. Shallow neural networks appear to have the advantage of less complexity; however, they also face challenges, such as sub-optimal combinations of hyper-parameters, which produce sub-optimal models. This work, therefore, investigates the following research questions: "How importantly do hyper-parameters influence word embeddings’ performance?" and "What factors are important for developing ethical and robust conversational systems?" In answering the questions, various experiments were conducted using different datasets in different studies. The first study investigates, empirically, various hyper-parameter combinations for creating word vectors and their impact on a few natural language processing (NLP) downstream tasks: named entity recognition (NER) and sentiment analysis (SA). The study shows that the optimal performance of embeddings for downstream NLP tasks depends on the task at hand. It also shows that certain combinations give strong performance across the tasks chosen for the study. Furthermore, it shows that reasonably smaller corpora are sufficient or even produce better models in some cases, and take less time to train and load. This is important, especially now that environmental considerations play a prominent role in ethical research. Subsequent studies build on the findings of the first and explore the hyper-parameter combinations for Swedish and English embeddings for the downstream NER task. The second study presents the new Swedish analogy test set for the evaluation of Swedish embeddings. Furthermore, it shows that character n-grams are useful for Swedish, a morphologically rich language. The third study shows that broad coverage of topics in a corpus appears to be important for producing better embeddings and that noise may be helpful in certain instances, though it is generally harmful. Hence, a relatively smaller corpus can show better performance than a larger one, as demonstrated in the work with the smaller Swedish Wikipedia corpus against the Swedish Gigaword. The final study argues, in answering the second question and from the point of view of the philosophy of science, that the near-elimination of unwanted bias in training data and the use of fora like peer review, conferences, and journals to provide the necessary avenues for criticism and feedback are instrumental for the development of ethical and robust conversational systems.
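A hyper-parameter sweep of the kind run in the first study can be sketched with gensim (an assumed library choice); the grid values and the two-sentence corpus below are placeholders.

```python
from itertools import product
from gensim.models import Word2Vec

corpus = [["this", "is", "a", "tiny", "corpus"],
          ["a", "real", "study", "uses", "wikipedia", "or", "gigaword"]]

# A minimal grid over three of the varied hyper-parameters:
# architecture (CBOW vs. skip-gram), context window and vector size.
for sg, window, size in product([0, 1], [2, 5], [50, 100]):
    model = Word2Vec(corpus, sg=sg, window=window, vector_size=size,
                     min_count=1, epochs=10, seed=7)
    # A real study would evaluate each model on downstream tasks
    # (NER, sentiment analysis, analogy tests) at this point.
    print(f"sg={sg} window={window} size={size} -> vocab={len(model.wv)}")
```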
19

Pottorff, Robert Thomas. "Video Prediction with Invertible Linear Embeddings." BYU ScholarsArchive, 2019. https://scholarsarchive.byu.edu/etd/7577.

Abstract:
Using recently popularized invertible neural networks, we predict future video frames from complex dynamic scenes. Our invertible linear embedding (ILE) demonstrates successful learning, prediction and latent state inference. In contrast to other approaches, ILE does not use any explicit reconstruction loss or simplistic pixel-space assumptions. Instead, it leverages invertibility to optimize the likelihood of image sequences exactly, albeit indirectly. Experiments and comparisons against state-of-the-art methods on synthetic and natural image sequences demonstrate the robustness of our approach, and a discussion of future work explores the opportunities our method might provide to other fields in which the accurate analysis and forecasting of non-linear dynamic systems is essential.
20

Wang, Run Fen. "Semantic Text Matching Using Convolutional Neural Networks." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-362134.

Abstract:
Semantic text matching is a fundamental task for many applications in Natural Language Processing (NLP). Traditional methods using term frequency-inverse document frequency (TF-IDF) to match exact words in documents have one strong drawback: TF-IDF is unable to capture semantic relations between closely related words, which leads to disappointing matching results. Neural networks have recently been used for various applications in NLP, and have achieved state-of-the-art performance on many tasks. Recurrent Neural Networks (RNN) have been tested on text classification and text matching, but did not gain any remarkable results, since RNNs work more effectively on short texts than on long documents. In this paper, Convolutional Neural Networks (CNN) are applied to match texts in a semantic aspect. Word embedding representations of two texts are used as inputs to the CNN construction to extract the semantic features between the two texts, and a score is given as the output of how certain the CNN model is that they match. The results show that after some tuning of the parameters, the CNN model could produce accuracy, precision, recall and F1-scores all over 80%. This is a great improvement over the previous TF-IDF results, and further improvements could be made by using dynamic word vectors, better pre-processing of the data, generating larger and more feature-rich datasets and further tuning of the parameters.
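One plausible realization of such a model is a two-input network with a shared convolutional encoder whose pooled features are merged and scored. The Keras sketch below is an interpretation under that assumption, not the thesis's exact construction; layer sizes are invented.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, embed_dim, max_len = 20000, 300, 50

# Shared encoder: word embeddings -> convolution -> max pooling.
inp = keras.Input(shape=(max_len,))
x = layers.Embedding(vocab_size, embed_dim)(inp)
x = layers.Conv1D(128, kernel_size=3, activation="relu")(x)
x = layers.GlobalMaxPooling1D()(x)
encoder = keras.Model(inp, x)

# Encode both texts with the same weights, merge, and score the match.
text_a = keras.Input(shape=(max_len,))
text_b = keras.Input(shape=(max_len,))
merged = layers.concatenate([encoder(text_a), encoder(text_b)])
hidden = layers.Dense(64, activation="relu")(merged)
score = layers.Dense(1, activation="sigmoid")(hidden)   # match certainty
model = keras.Model([text_a, text_b], score)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Dummy token sequences for a batch of two text pairs.
pair = [np.random.randint(0, vocab_size, size=(2, max_len)) for _ in range(2)]
print(model.predict(pair, verbose=0))
```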
21

Le, Khanh Duc. "A Study of Face Embedding in Face Recognition." DigitalCommons@CalPoly, 2019. https://digitalcommons.calpoly.edu/theses/1989.

Abstract:
Face Recognition has been a long-standing topic in the computer vision and pattern recognition field because of its wide and important applications in our daily lives, such as surveillance systems, access control, and so on. The current modern face recognition model, which keeps only a couple of images per person in the database, can now recognize a face with high accuracy. Moreover, the model does not need to be retrained every time a new person is added to the database. Using the face dataset from Digital Democracy, this thesis explores the capability of this model by comparing it with a standard convolutional neural network with respect to pose variations and training set sizes. First, we compare different types of pose to see their effect on the accuracy of the algorithm. Second, we train the system using different numbers of training images per person to see how many training samples are actually needed to maintain a reasonable accuracy. Finally, to push the limit, we train the model using only a single image per person, with the help of a face generation technique to synthesize more faces. The performance obtained by this integration is found to be competitive with the previous results, which are trained on multiple images.
22

Gupta, Parth Alokkumar. "Cross-view Embeddings for Information Retrieval." Doctoral thesis, Universitat Politècnica de València, 2017. http://hdl.handle.net/10251/78457.

Abstract:
In this dissertation, we deal with the cross-view tasks related to information retrieval using embedding methods. We study existing methodologies and propose new methods to overcome their limitations. We formally introduce the concept of mixed-script IR, which deals with the challenges faced by an IR system when a language is written in different scripts because of various technological and sociological factors. Mixed-script terms are represented by a small and finite feature space comprised of character n-grams. We propose the cross-view autoencoder (CAE) to model such terms in an abstract space, and CAE provides the state-of-the-art performance. We study a wide variety of models for cross-language information retrieval (CLIR) and propose a model based on compositional neural networks (XCNN) which overcomes the limitations of the existing methods and achieves the best results for many CLIR tasks such as ad-hoc retrieval, parallel sentence retrieval and cross-language plagiarism detection. We empirically test the proposed models for these tasks on publicly available datasets and present the results with analyses. In this dissertation, we also explore an effective method to incorporate contextual similarity for lexical selection in machine translation. Concretely, we investigate a feature based on context available in the source sentence calculated using deep autoencoders. The proposed feature exhibits statistically significant improvements over the strong baselines for English-to-Spanish and English-to-Hindi translation tasks. Finally, we explore the methods to evaluate the quality of autoencoder-generated representations of text data and analyse their architectural properties. For this, we propose two metrics based on the reconstruction capabilities of the autoencoders: structure preservation index (SPI) and similarity accumulation index (SAI). We also introduce a concept of critical bottleneck dimensionality (CBD), below which the structural information is lost, and present analyses linking CBD and language perplexity.
23

Stein, Roger Alan. "An analysis of hierarchical text classification using word embeddings." Universidade do Vale do Rio dos Sinos, 2018. http://www.repositorio.jesuita.org.br/handle/UNISINOS/7624.

Abstract:
Efficient distributed numerical word representation models (word embeddings) combined with modern machine learning algorithms have recently yielded considerable improvement on automatic document classification tasks. However, the effectiveness of such techniques has not yet been assessed for hierarchical text classification (HTC). This study investigates the application of those models and algorithms to this specific problem by means of experimentation and analysis. Classification models were trained with prominent machine learning algorithm implementations—fastText, XGBoost, and Keras’ CNN—and notable word embedding generation methods—GloVe, word2vec, and fastText—with publicly available data, and evaluated with measures specifically appropriate for the hierarchical context. FastText achieved an LCA-F1 of 0.871 on a single-labeled version of the RCV1 dataset. The results analysis indicates that using word embeddings is a very promising approach for HTC.
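For a flavor of the fastText side of these experiments, the sketch below trains a tiny supervised model with the fasttext Python package. The two-line training file, the flattened RCV1-style label paths and the hyper-parameter values are invented for illustration.

```python
import fasttext

# Hypothetical training file in fastText's supervised format: one example
# per line, labels marked with the __label__ prefix; hierarchical categories
# can be encoded as flattened paths such as __label__CCAT.C15.
with open("rcv1_train.txt", "w") as f:
    f.write("__label__CCAT.C15 company reports quarterly earnings growth\n")
    f.write("__label__MCAT.M11 equity markets rallied on the news\n")

model = fasttext.train_supervised(input="rcv1_train.txt", epoch=25, dim=100)
print(model.predict("shares climbed after the earnings report"))
```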
24

Droh, Erik. "T-Distributed Stochastic Neighbor Embedding Data Preprocessing Impact on Image Classification using Deep Convolutional Neural Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-237422.

Abstract:
Image classification in machine learning encompasses the task of identifying objects in an image. The technique has applications in various areas such as e-commerce, social media and security surveillance. In this report, the author explores the impact of using t-Distributed Stochastic Neighbor Embedding (t-SNE) on data as a preprocessing step when classifying multiple classes of clothing with a state-of-the-art Deep Convolutional Neural Network (DCNN). The t-SNE algorithm uses dimensionality reduction and groups similar objects close to each other in three-dimensional space. Extracting this information in the form of a positional coordinate gives us a new parameter which could help with the classification process, since the features it uses can be different from those of the DCNN. Therefore, three slightly different DCNN models receive different input and are compared. The first, benchmark model only receives pixel values; the second and third receive pixel values together with the positional coordinates from the t-SNE preprocessing for each data point, but with different hyperparameter values in the preprocessing step. The Fashion-MNIST dataset used contains 10 different clothing classes, normalized and gray-scaled for ease of use. The dataset contains 70,000 images in total. Results show minimal change in classification accuracy when using a low-density map with a higher learning rate as the data size increases, while a denser map and lower learning rate yield a significant accuracy increase of 4.4% when using a small data set. This is evidence that the method can be used to boost results when data is limited.
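A minimal sketch of the preprocessing idea, assuming scikit-learn's t-SNE and a random array standing in for flattened Fashion-MNIST images; the perplexity and learning-rate values are placeholders for the two map-density settings compared in the thesis.

import numpy as np
from sklearn.manifold import TSNE

X = np.random.rand(1000, 28 * 28)   # stand-in for flattened Fashion-MNIST pixels

# perplexity and learning_rate control how dense the resulting map is,
# mirroring the two preprocessing variants compared in the abstract
coords = TSNE(n_components=3, perplexity=30.0,
              learning_rate=200.0).fit_transform(X)

# each sample becomes pixels + its positional coordinate in t-SNE space;
# this augmented matrix is what the second and third DCNN models receive
X_aug = np.hstack([X, coords])
print(X_aug.shape)   # (1000, 787)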
26

Boroş, Emanuela. "Neural Methods for Event Extraction." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS302/document.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
With the increasing amount of data and the exploding number of data sources, the extraction of information about events, whether from the perspective of acquiring knowledge or from a more directly operational perspective, becomes a more and more obvious need. This extraction nevertheless comes up against a recurring difficulty: most of the information is present in documents in textual form, thus unstructured and difficult for a machine to grasp. From the point of view of Natural Language Processing (NLP), the extraction of events from texts is the most complex form of Information Extraction (IE), which more generally encompasses the extraction of named entities and of the relationships that bind them in texts. The event extraction task can be represented as a complex combination of relations linked to a set of empirical observations from texts. Compared to relations involving only two entities, there is therefore a new dimension that often requires going beyond the scope of the sentence, which constitutes an additional difficulty. In practice, an event is described by a trigger and a set of participants in that event whose values are text excerpts. While IE research has benefited significantly from manually annotated datasets used to learn patterns for text analysis, the availability of these resources remains a significant problem. These datasets are often obtained through the sustained efforts of research communities, potentially complemented by crowdsourcing. In addition, many machine-learning-based IE approaches rely on the ability to extract large sets of manually defined features from text using sophisticated NLP tools. As a result, adaptation to a new domain is an additional challenge. This thesis presents several strategies for improving the performance of an Event Extraction (EE) system using neural approaches that exploit morphological, syntactic, and semantic properties of word embeddings. These have the advantage of not requiring a priori modeling of domain knowledge and of automatically generating a much larger set of features from which to learn a model. More specifically, we propose different deep learning models for two sub-tasks related to EE: event detection, and argument detection and classification. Event Detection (ED) is considered an important subtask of event extraction since the detection of arguments depends very directly on its outcome. ED specifically involves identifying instances of events in texts and classifying them into specific event types. Classically, the same event may appear as different expressions, and these expressions may themselves represent different events in different contexts, hence the difficulty of the task. The detection of the arguments is based on the detection of the expression considered as triggering the event and ensures the recognition of the participants of the event. Among the difficulties to take into account, an argument can be common to several events, and it does not necessarily coincide with an easily recognizable named entity. As a preliminary to the introduction of our proposed models, we begin by presenting in detail a state-of-the-art model which constitutes the baseline. In-depth experiments are conducted on the use of different types of word embeddings and on the influence of the model's hyperparameters, using the ACE 2005 evaluation framework, a standard for this task. We then propose two new models to improve an event detection system. One increases the context taken into account when predicting an event instance by using a sentential context, while the other exploits the internal structure of words by taking advantage of seemingly less obvious but essentially important morphological knowledge. Finally, we reconsider the detection of arguments as higher-order relation extraction and analyze the dependence of this detection on the ED task.
27

Yap, Han Lun. "Constrained measurement systems of low-dimensional signals." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/47716.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The object of this thesis is the study of constrained measurement systems of signals having low-dimensional structure using analytic tools from Compressed Sensing (CS). Realistic measurement systems usually have architectural constraints that make them differ from their idealized, well-studied counterparts. Nonetheless, these measurement systems can exploit structure in the signals that they measure. Signals considered in this research have low-dimensional structure and can be broken down into two types: static or dynamic. Static signals are either sparse in a specified basis or lying on a low-dimensional manifold (called manifold-modeled signals). Dynamic signals, exemplified as states of a dynamical system, either lie on a low-dimensional manifold or have converged onto a low-dimensional attractor. In CS, the Restricted Isometry Property (RIP) of a measurement system ensures that distances between all signals of a certain sparsity are preserved. This stable embedding ensures that sparse signals can be distinguished one from another by their measurements and therefore be robustly recovered. Moreover, signal-processing and data-inference algorithms can be performed directly on the measurements instead of requiring a prior signal recovery step. Taking inspiration from the RIP, this research analyzes conditions on realistic, constrained measurement systems (of the signals described above) such that they are stable embeddings of the signals that they measure. Specifically, this thesis focuses on four different types of measurement systems. First, we study the concentration of measure and the RIP of random block diagonal matrices that represent measurement systems constrained to make local measurements. Second, we study the stable embedding of manifold-modeled signals by existing CS matrices. The third part of this thesis deals with measurement systems of dynamical systems that produce time series observations. While Takens' embedding result ensures that this time series output can be an embedding of the dynamical systems' states, our research establishes that a stronger stable embedding result is possible under certain conditions. The final part of this thesis is the application of CS ideas to the study of the short-term memory of neural networks. In particular, we show that the nodes of a recurrent neural network can be a stable embedding of sparse input sequences.
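For reference, the Restricted Isometry Property invoked throughout this abstract has a standard formulation: a measurement matrix $\Phi$ satisfies the RIP of order $s$ with constant $\delta_s \in (0,1)$ if, for every $s$-sparse signal $x$,

$$(1 - \delta_s)\,\lVert x \rVert_2^2 \;\le\; \lVert \Phi x \rVert_2^2 \;\le\; (1 + \delta_s)\,\lVert x \rVert_2^2,$$

which is exactly the near-isometric (stable) embedding of sparse signals that the thesis extends to block-diagonal matrices, manifold-modeled signals and dynamical systems.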
28

Kilinc, Ismail Ozsel. "Graph-based Latent Embedding, Annotation and Representation Learning in Neural Networks for Semi-supervised and Unsupervised Settings." Scholar Commons, 2017. https://scholarcommons.usf.edu/etd/7415.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Machine learning has been immensely successful in supervised learning, with outstanding examples in major industrial applications such as voice and image recognition. Following these developments, the most recent research has now begun to focus primarily on algorithms which can exploit very large sets of unlabeled examples to reduce the amount of manually labeled data required for existing models to perform well. In this dissertation, we propose graph-based latent embedding/annotation/representation learning techniques in neural networks tailored for semi-supervised and unsupervised learning problems. Specifically, we propose a novel regularization technique called Graph-based Activity Regularization (GAR) and a novel output layer modification called Auto-clustering Output Layer (ACOL), which can be used separately or collaboratively to develop scalable and efficient learning frameworks for semi-supervised and unsupervised settings. First, using the GAR technique alone, we develop a framework providing an effective and scalable graph-based solution for semi-supervised settings in which there exists a large number of observations but only a small subset with ground-truth labels. The proposed approach is natural for the classification framework on neural networks, as it requires no additional task such as calculating a reconstruction error (as in autoencoder-based methods) or implementing a zero-sum game mechanism (as in adversarial-training-based methods). We demonstrate that GAR effectively and accurately propagates the available labels to unlabeled examples. Our results show performance comparable with state-of-the-art generative approaches for this setting using an easier-to-train framework. Second, we explore a different type of semi-supervised setting where a coarse level of labeling is available for all the observations, but the model has to learn a fine, deeper level of latent annotations for each one. Problems in this setting are likely to be encountered in many domains such as text categorization, protein function prediction and image classification, as well as in exploratory scientific studies such as medical and genomics research. We consider this setting as simultaneously performed supervised classification (per the available coarse labels) and unsupervised clustering (within each one of the coarse labels) and propose a novel framework combining GAR with ACOL, which enables the network to perform concurrent classification and clustering. We demonstrate how the coarse label supervision impacts performance and how the classification task actually helps propagate useful clustering information between sub-classes. Comparative tests on the most popular image datasets rigorously demonstrate the effectiveness and competitiveness of the proposed approach. The third and final setup builds on the prior framework to unlock fully unsupervised learning, where we propose to substitute real, yet unavailable, parent-class information with pseudo class labels. In this novel unsupervised clustering approach the network can exploit hidden information indirectly introduced through a pseudo classification objective. We train an ACOL network through this pseudo supervision together with an unsupervised objective based on GAR, and ultimately obtain a k-means friendly latent representation. Furthermore, we demonstrate how the chosen transformation type impacts performance and helps propagate the latent information that is useful in revealing unknown clusters.
Our results show state-of-the-art performance for unsupervised clustering tasks on MNIST, SVHN and USPS datasets with the highest accuracies reported to date in the literature.
29

Sandvick, Joshua. "Machine Translation Through the Creation of a Common Embedding Space." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1531420294211248.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Mendolia, Isabella. "Deep neural networks leveraging different arrangements of molecular fingerprints to define a novel embedding for virtual screening procedure." Doctoral thesis, Università degli Studi di Palermo, 2022. https://hdl.handle.net/10447/554696.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Bader, Sebastian. "Neural-Symbolic Integration." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2009. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-25468.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In this thesis, we discuss different techniques to bridge the gap between two different approaches to artificial intelligence: the symbolic and the connectionist paradigm. Both approaches have quite contrasting advantages and disadvantages. Research in the area of neural-symbolic integration aims at bridging the gap between them. Starting from a human readable logic program, we construct connectionist systems, which behave equivalently. Afterwards, those systems can be trained, and later the refined knowledge be extracted.
32

Fano, Elena. "A comparative study of word embedding methods for early risk prediction on the Internet." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-385052.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
We built a system to participate in the eRisk 2019 T1 Shared Task. The aim of the task was to evaluate systems for early risk prediction on the internet, in particular to identify users suffering from eating disorders as accurately and quickly as possible given their history of Reddit posts in chronological order. In the controlled settings of this task, we also evaluated the performance of three different word representation methods: random indexing, GloVe, and ELMo. We discuss our system's performance, also in the light of the scores obtained by other teams in the shared task. Our results show that our two-step learning approach was quite successful, and we obtained good scores on the early risk prediction metric ERDE across the board. Contrary to our expectations, we did not observe a clear-cut advantage of contextualized ELMo vectors over the commonly used and much more lightweight GloVe vectors. Our best model in terms of F1 score turned out to be a model with GloVe vectors as input to the text classifier and a multi-layer perceptron as user classifier. The best ERDE scores were obtained by the model with ELMo vectors and a multi-layer perceptron. The model with random indexing vectors hit a good balance between precision and recall in the early processing stages but was eventually surpassed by the models with GloVe and ELMo vectors. We put forward some possible explanations for the observed results, as well as proposing some improvements to our system.
33

Vukotic, Verdran. "Deep Neural Architectures for Automatic Representation Learning from Multimedia Multimodal Data." Thesis, Rennes, INSA, 2017. http://www.theses.fr/2017ISAR0015/document.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In this dissertation, the thesis that deep neural networks are suited to the analysis of visual, textual, and fused visual-textual content is discussed. This work evaluates the ability of deep neural networks to learn automatic multimodal representations in either unsupervised or supervised manners and brings the following main contributions: 1) Recurrent neural networks for spoken language understanding (slot filling): different architectures are compared for this task with the aim of modeling both the input context and output label dependencies. 2) Action prediction from single images: we propose an architecture that allows us to predict human actions from a single image. The architecture is evaluated on videos, by utilizing solely one frame as input. 3) Bidirectional multimodal encoders: the main contribution of this thesis is a neural architecture that translates from one modality to the other, and conversely, and offers an improved multimodal representation space where the initially disjoint representations can be translated and fused. This enables improved multimodal fusion of multiple modalities. The architecture was extensively studied and evaluated in international benchmarks within the task of video hyperlinking, where it defined the state of the art. 4) Generative adversarial networks for multimodal fusion: continuing on the topic of multimodal fusion, we evaluate the possibility of using conditional generative adversarial networks to learn multimodal representations; in addition to providing multimodal representations, generative adversarial networks permit visualizing the learned model directly in the image domain.
34

CONTINO, Salvatore. "Study and identification of new molecular descriptors, finalized to the development of Virtual Screening techniques through the use of deep neural networks." Doctoral thesis, Università degli Studi di Palermo, 2022. https://hdl.handle.net/10447/554714.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Prencipe, Michele Pio. "Elaborazione del Linguaggio Naturale con Metodi Probabilistici e Reti Neurali." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24312/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Natural language processing (NLP) is the process by which a machine attempts to learn the information contained in typical human speech or writing. The procedure is made particularly complex by the numerous ambiguities typical of spoken language or text: irony, metaphors, spelling errors, and so on. Thanks to Deep Learning, which has enabled the development of neural networks, the state of the art in NLP has been reached through the introduction of architectures such as Encoder-Decoder, Transformers and attention mechanisms. Neural networks, particularly recurrent ones and those with memory, lend themselves very well to NLP tasks, both because of their ability to learn from the large amounts of data available and because they can focus particularly well on the context of each input word or on the sentiment analysis of a sentence. This work analyzes the main techniques for teaching natural language to the electronic computer; everything is described with examples and snippets of Python code. For a complete view of the field, the textbook "Hands-On Machine Learning with Scikit-Learn, Keras and Tensorflow" by Aurélien Géron is taken as a reference, together with the related bibliography.
36

Bahceci, Oktay. "Deep Neural Networks for Context Aware Personalized Music Recommendation : A Vector of Curation." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-210252.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Information Filtering and Recommender Systems have been implemented in various ways by various entities since the dawn of the Internet, and state-of-the-art approaches rely on Machine Learning and Deep Learning in order to create accurate and personalized recommendations for users in a given context. These models require large amounts of data with a variety of features such as time, location and user data in order to find correlations and patterns that classical models such as matrix factorization and collaborative filtering cannot. This thesis researches, implements and compares a variety of models, with a primary focus on Machine Learning and Deep Learning, for the task of music recommendation, and does so successfully by representing recommendation as an extreme multi-class classification task with 100,000 distinct labels. Across fourteen different experiments, all implemented models successfully learn features such as time, location, user attributes and previous listening history in order to create context-aware personalized music predictions, and solve the cold-start problem by using user demographic information. The best model captures the intended label within its top-100 list of recommended items for more than a third of the unseen data in an offline evaluation, when evaluating on randomly selected examples from the unseen following week.
37

Rubio, Romano Antonio. "Fashion discovery : a computer vision approach." Doctoral thesis, TDX (Tesis Doctorals en Xarxa), 2021. http://hdl.handle.net/10803/672423.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Performing semantic interpretation of fashion images is undeniably one of the most challenging domains for computer vision. Subtle variations in color and shape might confer different meanings or interpretations to an image. Not only is it a domain tightly coupled with human understanding, but also with scene interpretation and context. Being able to extract fashion-specific information from images and interpret that information in a proper manner can be useful in many situations and can help understand the underlying information in an image. Fashion is also one of the most important businesses around the world, with an estimated value of 3 trillion dollars and a constantly growing online market, which increases the utility of image-based algorithms to search, classify or recommend garments. This doctoral thesis aims to solve specific problems related to the treatment of fashion e-commerce data, from low-level pure pixel information to high-level abstract conclusions about the garments appearing in an image, taking advantage of the multi-modality of the available data to develop some of the solutions. The contributions include:
- A new superpixel extraction method focused on improving the annotation process for clothing images.
- The construction of an image and text embedding for fashion data.
- The application of this embedding space to the task of retrieving the main product in an image showing a complete outfit.
In summary, fashion is a complex computer vision and machine learning problem at many levels, and developing specific algorithms that are able to capture essential information from pictures and text is not trivial. In order to solve some of the challenges it poses, and taking into account that this is an Industrial Ph.D., we contribute a variety of solutions that can boost the performance of many tasks useful for the fashion e-commerce industry.
38

Drezga, Irislav. "A generalized ANN-based model for short-term load forecasting." Diss., This resource online, 1996. http://scholar.lib.vt.edu/theses/available/etd-06062008-151653/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Valgimigli, Lorenzo. "Job Recommendation Based on Deep Learning Methods for Natural Language Processing." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20467/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The search for ever more efficient solutions, from the time humans lived in caves until today, has driven us to develop ever more sophisticated and precise tools. With the arrival of the computer, another great step forward was made: those complex problems that can be expressed formally could now be solved quickly. Conversely, there are problems, more or less simple, that are difficult to state with any formality, such as understanding a sentence. For this type of problem, Artificial Intelligence and Artificial Neural Networks were born, and subsequently, thanks to the arrival of Big Data, Deep Neural Networks. Today they are widely employed in various sectors such as banking and healthcare. These new technologies continue to be studied intensively and to impress with the ever better results they achieve. One field in which they are applied, and which has received recent attention, is Job Recommendation. It comprises the whole set of technologies and tools used to help a worker find a job and a company find the best candidates for its open positions. This field relies heavily, particularly in small organizations, on qualified staff who search for candidates using some Machine Learning models to simplify the search. This work investigates how the new Deep Neural Network technologies, which have established themselves in various fields such as Natural Language Processing, can also help in the field of Job Recommendation. The idea is to take the best models from NLP tasks and try, with appropriate modifications, to apply them in a new field with new rules, and finally to evaluate the results obtained and how these models can be applied in practice.
40

Costa, Pablo Botton da. "Um analisador sintático neural multilíngue baseado em transições." Universidade Federal de São Carlos, 2017. https://repositorio.ufscar.br/handle/ufscar/9065.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Dependency parsing consists in inducing a model capable of extracting the correct dependency tree from an input natural-language sentence. Nowadays, multilingual techniques are being used more and more in Natural Language Processing (NLP) (BROWN et al., 1995; COHEN; DAS; SMITH, 2011), especially in the dependency parsing task. Intuitively, a multilingual parser can be seen as a vector of different parsers, each individually trained on one language. However, this approach can be very costly in terms of processing time and resources. As an alternative, many parsing techniques have been developed to address this problem (MCDONALD; PETROV; HALL, 2011; TACKSTROM; MCDONALD; USZKOREIT, 2012; TITOV; HENDERSON, 2007), but all of them depend on word alignment (TACKSTROM; MCDONALD; USZKOREIT, 2012) or word clustering, which increases the complexity, since it is difficult to induce alignments between words and syntactic resources (TSARFATY et al., 2013; BOHNET et al., 2013a). A simple solution proposed recently (NIVRE et al., 2016a) uses a universally annotated corpus in order to reduce the complexity associated with the construction of a multilingual parser. In this context, this work presents a universal model for dependency parsing: the NNParser. Our model is a modification of Chen and Manning (2014) with a greedier and more accurate model for capturing distributional representations (MIKOLOV et al., 2011). The NNParser reached 93.08% UAS on the English Penn Treebank (WSJ) and better results than the state-of-the-art Stack LSTM parser for Portuguese (87.93% vs. 86.2% LAS) and Spanish (86.95% vs. 85.7% LAS) on the Universal Dependencies corpus.
41

Huddlestone, Grant E. "Implementation and evaluation of two prediction techniques for the Lorenz time series." Thesis, Stellenbosch : Stellenbosch University, 2003. http://hdl.handle.net/10019.1/53459.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Thesis (MSc), Stellenbosch University, 2003.
This thesis implements and evaluates two prediction techniques used to forecast deterministic chaotic time series. For a large number of such techniques, the reconstruction of the phase space attractor associated with the time series is required. Embedding is presented as the means of reconstructing the attractor from limited data. Methods for obtaining the minimal embedding dimension and the optimal time delay from the false nearest neighbour heuristic and the average mutual information method are discussed. The first prediction algorithm discussed is based on work by Sauer, and includes applying the singular value decomposition to data obtained from the embedding of the time series being predicted. The second prediction algorithm is based on neural networks. A specific architecture suited to the prediction of deterministic chaotic time series, namely the time-dependent neural network architecture, is discussed and implemented. Adaptations of the backpropagation training algorithm for use with time-dependent neural networks are also presented. Both algorithms are evaluated by means of predictions made for the well-known Lorenz time series. Different embedding and algorithm-specific parameters are used to obtain the predicted time series. Actual values corresponding to the predictions are obtained from the Lorenz time series, which aid in evaluating the prediction accuracies. The predicted time series are evaluated in terms of two criteria: prediction accuracy and qualitative behavioural accuracy. Behavioural accuracy refers to the ability of the algorithm to simulate qualitative features of the time series being predicted. It is shown that for both algorithms a choice of embedding dimension greater than the minimum embedding dimension obtained from the false nearest neighbour heuristic produces greater prediction accuracy. For the neural network algorithm, values of the embedding dimension greater than the minimum embedding dimension satisfy the behavioural criterion adequately, as expected. Sauer's algorithm has the greatest behavioural accuracy for embedding dimensions smaller than the minimal embedding dimension. In terms of the time delay, it is shown that both algorithms have the greatest prediction accuracy for values of the time delay in a small interval around the optimal time delay. The neural network algorithm is shown to have the greatest behavioural accuracy for time delays close to the optimal time delay, while Sauer's algorithm has the best behavioural accuracy for small values of the time delay. Matlab code is presented for both algorithms.
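Both algorithms start from the time-delay embedding described above; the following Python sketch shows how the embedded vectors are formed, with a synthetic stand-in for the Lorenz series and illustrative values for the embedding dimension m and delay tau.

import numpy as np

def delay_embed(x, m, tau):
    # returns the matrix of delay vectors [x[t], x[t+tau], ..., x[t+(m-1)*tau]]
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau: i * tau + n] for i in range(m)])

x = np.sin(0.1 * np.arange(2000))      # placeholder for the Lorenz x-series
vectors = delay_embed(x, m=3, tau=17)  # m, tau as given by FNN / mutual info
print(vectors.shape)                   # (1966, 3)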
42

Sabo, Jozef. "Aplikace metody učení bez učitele na hledání podobných grafů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2021. http://www.nusl.cz/ntk/nusl-445517.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The goal of this master's thesis, carried out in cooperation with the company Avast, was to design a system that can extract knowledge from a database of graphs. The graphs used for data mining describe the behaviour of computer systems and are anonymously inserted into the company's database from the systems of users of the company's products. Each graph in the database can be assigned one of two labels: clean or malware (malicious). The task of the proposed unsupervised learning system is to find clusters of graphs in the database in which the classes of graphs do not mix. Graph clusters containing only one class of graphs can be interpreted as different types of clean or malware graphs, and they are a useful source for further analysis of the graphs. To evaluate the quality of the clusters, a custom metric, named monochromaticity, was designed. The metric evaluates cluster quality based on how much the clean and malware graphs are mixed within the clusters. The best results for the metric were obtained when vector representations of graphs were created by a deep learning model (a variational graph autoencoder with two relational graph convolution operators) and the parameterless MeanShift method was used for clustering the vectors.
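A hedged sketch of the evaluation pipeline: cluster the learned graph vectors with scikit-learn's MeanShift and score each cluster by its majority-class fraction. The purity-style score below is a plausible reading of the custom monochromaticity metric, not the thesis' exact definition, and the data are synthetic placeholders.

import numpy as np
from sklearn.cluster import MeanShift

embeddings = np.random.rand(500, 32)        # stand-in for learned graph vectors
labels = np.random.randint(0, 2, size=500)  # 0 = clean, 1 = malware

cluster_ids = MeanShift().fit_predict(embeddings)

purities = []
for c in np.unique(cluster_ids):
    member = labels[cluster_ids == c]
    # fraction of the majority class within this cluster
    purities.append(max(np.mean(member == 0), np.mean(member == 1)))
print(np.mean(purities))   # 1.0 would mean perfectly monochromatic clusters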
43

Callin, Jimmy. "Word Representations and Machine Learning Models for Implicit Sense Classification in Shallow Discourse Parsing." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-325876.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
CoNLL 2015 featured a shared task on shallow discourse parsing. In 2016, the efforts continued with an increasing focus on sense classification. In the case of implicit sense classification, there was an interesting mix of traditional and modern machine learning classifiers using word representation models. In this thesis, we explore the performance of a number of these models, and investigate how they perform using a variety of word representation models. We show that there are large performance differences between word representation models for certain machine learning classifiers, while others are more robust to the choice of word representation model. We also show that with the right choice of word representation model, simple and traditional machine learning classifiers can reach competitive scores even when compared with modern neural network approaches.
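The "traditional classifier over pretrained word vectors" setup found competitive here can be sketched as follows; the tiny lookup table stands in for real GloVe vectors, and the two-document training set and sense labels are purely illustrative.

import numpy as np
from sklearn.linear_model import LogisticRegression

glove = {  # hypothetical 2-d stand-ins for real GloVe entries
    "good": np.array([0.2, 0.8]),
    "bad":  np.array([0.9, 0.1]),
    "film": np.array([0.5, 0.5]),
}

def doc_vector(tokens):
    # average the vectors of known tokens; zero vector if none are known
    vecs = [glove[t] for t in tokens if t in glove]
    return np.mean(vecs, axis=0) if vecs else np.zeros(2)

X = np.array([doc_vector(["good", "film"]), doc_vector(["bad", "film"])])
y = np.array([1, 0])   # toy implicit-sense labels

clf = LogisticRegression().fit(X, y)
print(clf.predict([doc_vector(["good"])]))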
44

Russo, Nicholas A. "DiSH: Democracy in State Houses." DigitalCommons@CalPoly, 2019. https://digitalcommons.calpoly.edu/theses/1967.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In our current political climate, state-level legislators have become increasingly important. Due to cuts in funding and a growing focus at the national level, public oversight of these legislators has drastically decreased. This makes it difficult for citizens and activists to understand the relationships and commonalities between legislators. This thesis provides three contributions to address this issue. First, we created a dataset containing over 1200 features focused on a legislator's activity on bills. Second, we created embeddings that represent a legislator's level of activity and engagement for a given bill, using a custom model called Democracy2Vec. Third, we provide a case study focused on the 2015-2016 California State Legislature, with our results verified by a political expert. Our results show that our embeddings can explain relationships between legislators and how they will likely act during the legislative process.
45

Mrkšić, Nikola. "Data-driven language understanding for spoken dialogue systems." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/276689.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Spoken dialogue systems provide a natural conversational interface to computer applications. In recent years, the substantial improvements in the performance of speech recognition engines have helped shift the research focus to the next component of the dialogue system pipeline: the one in charge of language understanding. The role of this module is to translate user inputs into accurate representations of the user goal in the form that can be used by the system to interact with the underlying application. The challenges include the modelling of linguistic variation, speech recognition errors and the effects of dialogue context. Recently, the focus of language understanding research has moved to making use of word embeddings induced from large textual corpora using unsupervised methods. The work presented in this thesis demonstrates how these methods can be adapted to overcome the limitations of language understanding pipelines currently used in spoken dialogue systems. The thesis starts with a discussion of the pros and cons of language understanding models used in modern dialogue systems. Most models in use today are based on the delexicalisation paradigm, where exact string matching supplemented by a list of domain-specific rephrasings is used to recognise users' intents and update the system's internal belief state. This is followed by an attempt to use pretrained word vector collections to automatically induce domain-specific semantic lexicons, which are typically hand-crafted to handle lexical variation and account for a plethora of system failure modes. The results highlight the deficiencies of distributional word vectors which must be overcome to make them useful for downstream language understanding models. The thesis next shifts focus to overcoming the language understanding models' dependency on semantic lexicons. To achieve that, the proposed Neural Belief Tracking (NBT) model forsakes the use of standard one-hot n-gram representations used in Natural Language Processing in favour of distributed representations of user utterances, dialogue context and domain ontologies. The NBT model makes use of external lexical knowledge embedded in semantically specialised word vectors, obviating the need for domain-specific semantic lexicons. Subsequent work focuses on semantic specialisation, presenting an efficient method for injecting external lexical knowledge into word vector spaces. The proposed Attract-Repel algorithm boosts the semantic content of existing word vectors while simultaneously inducing high-quality cross-lingual word vector spaces. Finally, NBT models powered by specialised cross-lingual word vectors are used to train multilingual belief tracking models. These models operate across many languages at once, providing an efficient method for bootstrapping language understanding models for lower-resource languages with limited training data.
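A toy sketch in the spirit of the semantic specialisation described above: synonym pairs are pulled together and antonym pairs pushed apart up to a margin, while a regulariser keeps every vector near its distributional original. This is a simplified re-enactment of the idea, not the published Attract-Repel algorithm; all vectors and constraint pairs are invented.

import numpy as np

vectors = {"cheap": np.array([0.9, 0.1]),
           "inexpensive": np.array([0.2, 0.8]),
           "pricey": np.array([0.6, 0.4])}
original = {w: v.copy() for w, v in vectors.items()}
synonyms = [("cheap", "inexpensive")]   # attract constraints
antonyms = [("cheap", "pricey")]        # repel constraints

lr, reg = 0.1, 0.05
for _ in range(100):
    for a, b in synonyms:               # pull synonyms together
        gap = vectors[a] - vectors[b]
        vectors[a] -= lr * gap
        vectors[b] += lr * gap
    for a, b in antonyms:               # push antonyms apart, up to a margin
        gap = vectors[a] - vectors[b]
        if np.linalg.norm(gap) < 1.0:
            vectors[a] += lr * gap
            vectors[b] -= lr * gap
    for w in vectors:                   # stay close to the original estimate
        vectors[w] -= reg * (vectors[w] - original[w])

print(vectors["cheap"], vectors["inexpensive"])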
46

Sawaya, Antonio. "Financial time series analysis : Chaos and neurodynamics approach." Thesis, Högskolan Dalarna, Datateknik, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:du-4810.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This work aims at combining the postulates of Chaos theory with the classification and predictive capabilities of Artificial Neural Networks in the field of financial time series prediction. Chaos theory provides valuable qualitative and quantitative tools for deciding on the predictability of a chaotic system. Quantitative measurements based on Chaos theory are used to decide a priori whether a time series, or a portion of a time series, is predictable, while qualitative tools based on Chaos theory are used to provide further observations and analysis of predictability in cases where the measurements give negative answers. Phase space reconstruction is achieved by time-delay embedding, resulting in multiple embedded vectors. The cognitive approach suggested is inspired by the ability of some chartists to predict the direction of an index by looking at the price time series. Thus, in this work, the calculation of the embedding dimension and the separation in Takens' embedding theorem for phase space reconstruction is not limited to False Nearest Neighbor, Differential Entropy or any other specific method; rather, this work is interested in all embedding dimensions and separations, regarded as different ways of looking at a time series by different chartists, based on their expectations. Prior to the prediction, the embedded vectors of the phase space are classified with Fuzzy-ART; then, for each class, a backpropagation Neural Network is trained to predict the last element of each vector, with all previous elements of a vector used as features.
47

Yedroudj, Mehdi. "Steganalysis and steganography by deep learning." Thesis, Montpellier, 2019. http://www.theses.fr/2019MONTS095.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Image steganography is the art of secret communication, with the aim of exchanging a hidden message. Image steganalysis, on the other hand, attempts to detect the presence of a hidden message by searching for artefacts within an image. For about ten years, the classic approach to steganalysis was to use an ensemble classifier fed with hand-crafted features. In recent years, studies have shown that well-designed convolutional neural networks (CNNs) can achieve superior performance compared to conventional machine-learning approaches. The subject of this thesis is the use of deep learning techniques for image steganography and steganalysis in the spatial domain. The first contribution is a fast and very effective convolutional neural network for steganalysis, named Yedroudj-Net. Compared to modern deep-learning-based steganalysis methods, Yedroudj-Net achieves state-of-the-art detection results, but also takes less time to converge, allowing the use of a large training set. Moreover, Yedroudj-Net can easily be improved with well-known add-ons. Among these add-ons, we have evaluated data augmentation and the use of an ensemble of CNNs; both increase our CNN's performance. The second contribution is the application of deep learning techniques for steganography, i.e. the embedding. Among the existing techniques, we focus on the 3-player game approach. We propose an embedding algorithm that automatically learns how to hide a message secretly. Our proposed steganography system is based on the use of generative adversarial networks. The training of this steganographic system is conducted using three neural networks that compete against each other: the embedder, the extractor, and the steganalyzer. For the steganalyzer we use Yedroudj-Net, for its affordable size and for the fact that its training does not require the use of any tricks that could increase the computational time. This second contribution defines a research direction, giving initial elements of reflection along with promising first results.
48

Malik, Muhammad Hamza. "Information extraction and mapping for KG construction with learned concepts from scientific documents : Experimentation with relations data for development of concept learner." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-285572.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Systematic review of research manuscripts is a common procedure in which research studies pertaining to a particular field or domain are classified and structured in a methodological way. This process involves, among other steps, an extensive review and consolidation of scientific metrics and attributes of the manuscripts, such as citations, type of manuscript, or venue of publication. The extraction and mapping of relevant publication data is evidently a very laborious task if performed manually. Automating these systematic mapping steps is intended to reduce the human effort required and can therefore potentially shorten the process. The objective of this thesis is to automate the data extraction and mapping steps of systematic reviews. The manual process is replaced by novel graph modelling techniques for effective knowledge representation, as well as novel machine learning techniques that aim to learn these representations. This characterises the publications on the basis of certain sub-properties and qualities that give the reviewer a quick high-level overview of each research study. The final model is a concept learner that predicts these sub-properties and, in addition, addresses the inherent concept drift of novel manuscripts over time. Different models were developed and explored for the concept learner. The results show that: (1) graph reasoning techniques, which leverage the expressive power of modern graph databases, are very effective at capturing the extracted knowledge in a so-called knowledge graph, which allows us to form concepts that can be learned using standard machine learning techniques such as logistic regression, decision trees, and neural networks; (2) neural network models and ensemble models outperformed the other standard machine learning techniques on the evaluation metrics; (3) the concept learner is able to detect and avoid concept drift by monitoring F1 scores and retraining the model.
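A sketch of how the concept-learner comparison and the drift check from results (2) and (3) might look is given below, assuming a hypothetical feature matrix X of publication attributes extracted from the knowledge graph and binary concept labels y; the placeholder data, models and threshold are illustrative, not the thesis's pipeline.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import f1_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 20))          # placeholder graph-derived features
    y = (X[:, 0] + X[:, 1] > 0).astype(int)  # placeholder concept labels

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(),
        "random_forest_ensemble": RandomForestClassifier(),
        "neural_network": MLPClassifier(max_iter=1000),
    }
    for name, model in models.items():       # compare learners on held-out F1
        model.fit(X_tr, y_tr)
        print(name, f1_score(y_te, model.predict(X_te)))

    def check_drift_and_retrain(model, X_new, y_new, threshold=0.8):
        """Retrain when the F1 score on newly arrived manuscripts drops."""
        if f1_score(y_new, model.predict(X_new)) < threshold:
            model.fit(np.vstack([X_tr, X_new]), np.hstack([y_tr, y_new]))
        return model

The retraining rule mirrors result (3): when the F1 score on incoming manuscripts falls below a threshold, the learner is refit on the combined old and new data.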
49

Lachmann, Tim, and Johan Sabel. "Distributionella representationer av ord för effektiv informationssökning : Algoritmer för sökning i kundsupportforum." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209695.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
As the amount of information in society increases, so does the need for more refined methods of searching for and managing information. Extracting relevant data from internal company systems becomes a more complex task as larger amounts of information must be handled and more communication moves to digital platforms. Methods for vector-based word embedding have made great progress in recent years; notably, Google's Word2vec model showed groundbreaking results in 2013, significantly outperforming older methods. We implement a search engine that uses word embeddings based on Word2vec and similar models, intended for use at the IT company Kundo and for the product Kundo Forum. The results demonstrate the potential to improve information retrieval recall by a significant margin without diminishing precision. Coupled with the primary subject of information retrieval, we also analyse the implications an improved search engine has from a market and product development perspective.
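Embedding-based retrieval of the kind described here can be sketched in a few lines with gensim. The forum posts, query and hyperparameters below are hypothetical toy values; this is an illustration of the technique, not Kundo's implementation.

    import numpy as np
    from gensim.models import Word2Vec

    posts = [
        "how do i reset my password",
        "payment failed when ordering",
        "forgot password cannot log in",
    ]
    tokenized = [p.split() for p in posts]
    model = Word2Vec(tokenized, vector_size=50, min_count=1, seed=0)

    def embed(tokens):
        """Represent a text as the average of its in-vocabulary word vectors."""
        vecs = [model.wv[t] for t in tokens if t in model.wv]
        return np.mean(vecs, axis=0) if vecs else np.zeros(model.vector_size)

    def search(query, k=2):
        """Rank posts by cosine similarity between query and post embeddings."""
        q = embed(query.split())
        sims = [float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d) + 1e-9))
                for d in (embed(t) for t in tokenized)]
        return sorted(zip(sims, posts), reverse=True)[:k]

    print(search("password reset"))

Because matching happens in embedding space rather than on exact terms, a query can retrieve posts that share no words with it, which is the mechanism behind the recall improvement the abstract reports.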
50

Suta, Adin. "Multilabel text classification of public procurements using deep learning intent detection." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-252558.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Textual data is one of the most widespread forms of data, and the amount of such data available in the world is increasing at a rapid rate. Text can be understood as either a sequence of characters or a sequence of words, where the latter approach is the most common. With the breakthroughs in applied artificial intelligence in recent years, more and more tasks are aided by automatic processing of text in various applications. The models introduced here rely on deep-learning sequence processing of text to produce a regression-based algorithm that classifies what the input text refers to. We investigate and compare the performance of several model architectures along with different hyperparameters. The data set was provided by e-Avrop, a Swedish company that hosts a web platform for posting and bidding on public procurements. It consists of the titles and descriptions of Swedish public procurements posted on the e-Avrop website, along with the respective category or categories of each text. For texts described by several categories (the multi-label case), we suggest a deep-learning sequence-processing regression algorithm in which a set of deep learning classifiers is used. Each model pairs one of the labels with the text input, producing a set of text–label observation pairs. The goal is to investigate whether these classifiers can express different levels of intent, an intent that should in theory be imposed by the different training sets used by each of the individual deep learning classifiers.
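The per-label setup described above can be sketched as one small Keras sequence model per category. The procurement texts, label matrix and layer sizes below are hypothetical placeholders; this illustrates the multi-label decomposition, not the thesis's or e-Avrop's actual model.

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers

    texts = np.array(["road maintenance services",
                      "school catering contract",
                      "it consulting framework agreement"])
    labels = np.array([[1, 0], [0, 1], [0, 0]])  # placeholder: 2 categories

    vectorizer = layers.TextVectorization(max_tokens=1000,
                                          output_sequence_length=16)
    vectorizer.adapt(texts)
    X = vectorizer(texts)

    def build_binary_classifier():
        """One sequence model per category: embed -> LSTM -> sigmoid."""
        return tf.keras.Sequential([
            layers.Embedding(input_dim=1000, output_dim=32),
            layers.LSTM(32),
            layers.Dense(1, activation="sigmoid"),
        ])

    classifiers = []
    for k in range(labels.shape[1]):             # one classifier per label
        clf = build_binary_classifier()
        clf.compile(optimizer="adam", loss="binary_crossentropy")
        clf.fit(X, labels[:, k], epochs=2, verbose=0)
        classifiers.append(clf)

    # Predicted probability per category for a new procurement text.
    new_text = vectorizer(np.array(["catering for municipal schools"]))
    print([float(clf.predict(new_text, verbose=0)[0, 0]) for clf in classifiers])

Training one binary classifier per category means each model sees a different text–label pairing of the same corpus, which is the mechanism by which the label-specific "intent" discussed in the abstract would be imposed.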
