Theses on the topic "Network data representation"
The top 50 theses (master's and doctoral) for research on the topic "Network data representation".
Lim, Chong-U. "Modeling player self-representation in multiplayer online games using social network data". Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/82409.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 101-105).
Game players express values related to self-expression through various means such as avatar customization, gameplay style, and interactions with other players. Multiplayer online games are now often integrated with social networks that provide social contexts in which player-to-player interactions take place, such as conversation and trading of virtual items. Building upon a theoretical framework based in machine learning and cognitive science, I present results from a novel approach to modeling and analyzing player values in terms of both preferences in avatar customization and patterns in social network use. To facilitate this work, I developed the Steam-Player-Preference Analyzer (Steam-PPA) system, which performs advanced data collection on publicly available social networking profile information. The primary contribution of this thesis is the AIR Toolkit Status Performance Classifier (AIR-SPC), which uses machine learning techniques including k-means clustering, natural language processing (NLP), and support vector machines (SVM) to perform inference on the data. As an initial case study, I use Steam-PPA to collect gameplay and avatar customization information from players in the popular, and commercially successful, multi-player first-person-shooter game Team Fortress 2 (TF2). Next, I use AIR-SPC to analyze the information from profiles on the social network Steam. The upshot is that I use social networking information to predict the likelihood of players customizing their profile in several ways associated with the monetary values of their avatars. In this manner I have developed a computational model of aspects of players' digital social identity capable of predicting specific values in terms of preferences exhibited within a virtual game-world.
by Chong-U Lim.
S.M.
Lee, John Boaz T. "Deep Learning on Graph-structured Data". Digital WPI, 2019. https://digitalcommons.wpi.edu/etd-dissertations/570.
Azorin, Raphael. "Traffic representations for network measurements". Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS141.
Measurements are essential to operate and manage computer networks, as they are critical for analyzing performance and establishing diagnoses. In particular, per-flow monitoring consists of computing metrics that characterize the individual data streams traversing the network. To develop relevant traffic representations, operators need to select suitable flow characteristics and carefully weigh their cost of extraction against their expressiveness for the downstream tasks considered. In this thesis, we propose novel methodologies to extract appropriate traffic representations. In particular, we posit that Machine Learning can enhance measurement systems, thanks to its ability to learn patterns from data, in order to provide predictions of pertinent traffic characteristics. The first contribution of this thesis is a framework for sketch-based measurement systems to exploit the skewed nature of network traffic. Specifically, we propose a novel data structure representation that leverages sketches' under-utilization, reducing the memory footprint of per-flow measurements by storing only relevant counters. The second contribution is a Machine Learning-assisted monitoring system that integrates a lightweight traffic classifier. In particular, we segregate large and small flows in the data plane, before processing them separately with dedicated data structures for various use cases. The last contributions address the design of a unified Deep Learning measurement pipeline that extracts rich representations from traffic data for network analysis. We first draw from recent advances in sequence modeling to learn representations from both numerical and categorical traffic data. These representations serve as input to solve complex networking tasks such as clickstream identification and mobile terminal movement prediction in WLAN. Finally, we present an empirical study of task affinity to assess when two tasks would benefit from being learned together.
SURANO, FRANCESCO VINCENZO. "Unveiling human interactions : approaches and techniques toward the discovery and representation of interactions in networks". Doctoral thesis, Politecnico di Torino, 2023. https://hdl.handle.net/11583/2975708.
Woodbury, Nathan Scott. "Representation and Reconstruction of Linear, Time-Invariant Networks". BYU ScholarsArchive, 2019. https://scholarsarchive.byu.edu/etd/7402.
Martignano, Anna. "Real-time Anomaly Detection on Financial Data". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281832.
This work presents an investigation of applications of Network Representation Learning (NRL) in the financial industry. NRL methods enable data-driven condensation of graph structures into low-dimensional, manageable vectors. These vectors can then be used in other machine learning tasks. More specifically, NRL methods can facilitate the handling of, and information extraction from, computationally intensive large-scale graphs in the financial sector, for example anomaly detection among financial transactions. Working with data of this type is complicated by the fact that transaction graphs are dynamic and constantly changing. In addition, the nodes, i.e. the transaction points, can differ widely from one another, or in other words originate from different distributions. In this work, Graph Convolutional Networks (ConvGNN) were judged to be the most suitable solution for the applications mentioned, which target the detection of anomalies in transactions. GraphSAGE was used as the starting point for the experiments in two variants: a dynamic version in which the weights are updated as new transaction sequences are fed in, and a variant designed specifically for bipartite graphs. These variants were evaluated on real datasets with anomaly detection as the end goal.
GARBARINO, DAVIDE. "Acknowledging the structured nature of real-world data with graphs embeddings and probabilistic inference methods". Doctoral thesis, Università degli studi di Genova, 2022. http://hdl.handle.net/11567/1092453.
RANDAZZO, VINCENZO. "Novel neural approaches to data topology analysis and telemedicine". Doctoral thesis, Politecnico di Torino, 2020. http://hdl.handle.net/11583/2850610.
Lucke, Helmut. "On the representation of temporal data for connectionist word recognition". Thesis, University of Cambridge, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.239520.
Cori, Marcel. "Modèles pour la représentation et l'interrogation de données textuelles et de connaissances". Paris 7, 1987. http://www.theses.fr/1987PA077047.
Amir, Mohammad. "Semantically-enriched and semi-Autonomous collaboration framework for the Web of Things. Design, implementation and evaluation of a multi-party collaboration framework with semantic annotation and representation of sensors in the Web of Things and a case study on disaster management". Thesis, University of Bradford, 2015. http://hdl.handle.net/10454/14363.
Molter, Colin. "Storing information through complex dynamics in recurrent neural networks". Doctoral thesis, Universite Libre de Bruxelles, 2005. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/211039.
In this thesis, it is shown experimentally that the more information is to be stored in robust cyclic attractors, the more chaos appears as a background regime, erratically itinerating among brief appearances of these attractors. Chaos does not appear to be the cause but the consequence of the learning. However, it appears as a helpful consequence that widens the net's encoding capacity. To learn the information to be stored, an unsupervised Hebbian learning algorithm is introduced. By leaving the semantics of the attractors to be associated with the feeding data unprescribed, promising results have been obtained in terms of storage capacity.
Doctorate in Applied Sciences
info:eu-repo/semantics/nonPublished
Popescu, Alexandru. "Towards multi-dimensional data representation and routing in Cognitive Radio Networks". Licentiate thesis, Karlskrona : Blekinge Institute of Technology, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-00495.
Vukotic, Verdran. "Deep Neural Architectures for Automatic Representation Learning from Multimedia Multimodal Data". Thesis, Rennes, INSA, 2017. http://www.theses.fr/2017ISAR0015/document.
In this dissertation, the thesis that deep neural networks are suited for analysis of visual, textual and fused visual and textual content is discussed. This work evaluates the ability of deep neural networks to learn automatic multimodal representations in either unsupervised or supervised manners and brings the following main contributions: 1) Recurrent neural networks for spoken language understanding (slot filling): different architectures are compared for this task with the aim of modeling both the input context and output label dependencies. 2) Action prediction from single images: we propose an architecture that allows us to predict human actions from a single image. The architecture is evaluated on videos, by utilizing solely one frame as input. 3) Bidirectional multimodal encoders: the main contribution of this thesis consists of a neural architecture that translates from one modality to the other and conversely, and offers an improved multimodal representation space where the initially disjoint representations can be translated and fused. This enables improved fusion of multiple modalities. The architecture was extensively studied and evaluated in international benchmarks within the task of video hyperlinking, where it defined the state of the art. 4) Generative adversarial networks for multimodal fusion: continuing on the topic of multimodal fusion, we evaluate the possibility of using conditional generative adversarial networks to learn multimodal representations. In addition to providing multimodal representations, generative adversarial networks make it possible to visualize the learned model directly in the image domain.
Hiselius, Leo. "Stimulus representation in anisotropically connected spiking neural networks". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300153.
Biological neural networks are a central object of study in computational neuroscience, and recent studies have also shown their potential applicability in artificial intelligence and robotics [1]. They can be formulated in many different ways, and a well-known and widely studied example is the liquid state machine from 2004 [2]. In 2019, a new and simple connectivity rule was presented in the Spreizer network [3]. The connections in the Spreizer network are governed by a type of gradient noise called Perlin noise, and as such they are anisotropic but correlated. The spike data generated by the Spreizer network may be functionally significant, for example for motor control or stimulus separation. In 2020, Michaelis et al. showed that the spike data was relevant for motor control [4]. In this master's thesis we ask whether the spike data is functionally relevant for stimulus representation. We investigate how stimuli from the MNIST handwritten digits dataset are represented in the spatiotemporal activity sequences generated in the Spreizer network, and whether this representation is sufficient for separation. Furthermore, we consider how the parameters governing the local connectivity structure affect representation and separation. We show that (1) the Spreizer network separates stimuli in an initial phase after stimulus onset and (2) the separation decreases over time as the activity from different stimuli becomes uniform.
Marinakis, Dimitrios. "Inferring environmental representations through limited sensory data with applications to sensor network self-calibration". Thesis, McGill University, 2009. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=66780.
This thesis addresses the problem of using sparse sensing to automatically infer a representation of the environment, that is, a map, which can be used in the self-calibration of intelligent systems such as sensor networks. The information recovered by such a process allows typical applications such as data collection and navigation to proceed without manual input from a human technician. Simplifying the large-scale deployment of sensor networks and other intelligent systems will effectively reduce their cost and improve their widespread availability, and will thus facilitate their practical application to tasks such as monitoring carbon emissions and greenhouse gases. In our research we focus on algorithms and techniques for recovering two types of information about the immediate environment: topological information, which indicates the physical connectivity between regions of interest from the point of view of a navigating agent; and a probability distribution function (PDF) that describes the position of the elements of the intelligent system. We consider situations in which data are collected from systems composed of: several fixed network elements; fixed network elements augmented with a mobile robot; a mobile robot only. Our approaches are, for the most part, based on statistical methods that employ stochastic sampling techniques to provide approximate solutions to problems for which computing an exact or optimal solution remains intractable. Numerical simulations and experiments carried out on hardware suggest that this research holds promise for current, practical applications in the field of sensor network self-calibration.
Sarker, Bishnu. "On Graph-Based Approaches for Protein Function Annotation and Knowledge Discovery". Electronic Thesis or Diss., Université de Lorraine, 2021. http://www.theses.fr/2021LORR0094.
Due to the recent advancement in genomic sequencing technologies, the number of protein entries in public databases is growing exponentially. It is important to harness this huge amount of data to describe living things at the molecular level, which is essential for understanding human disease processes and accelerating drug discovery. A prerequisite, however, is that all of these proteins be annotated with functional properties such as Enzyme Commission (EC) numbers and Gene Ontology (GO) terms. Today, only a small fraction of the proteins is functionally annotated and reviewed by expert curators because it is expensive, slow and time-consuming. Developing automatic protein function annotation tools is the way forward to reduce the gap between the annotated and unannotated proteins and to predict reliable annotations for unknown proteins. Many tools of this type already exist, but none of them are fully satisfactory. We observed that only a few consider graph-based approaches and the domain composition of proteins. Indeed, domains are conserved regions across protein sequences of the same family. In this thesis, we design and evaluate graph-based approaches to perform automatic protein function annotation and we explore the impact of domain architecture on protein functions. The first part is dedicated to protein function annotation using a domain similarity graph and a neighborhood-based label propagation technique. We present GrAPFI (Graph-based Automatic Protein Function Inference) for automatically annotating proteins with enzymatic functions (EC numbers) and GO terms from a protein-domain similarity graph. We validate the performance of GrAPFI using six reference proteomes from UniprotKB/SwissProt and compare GrAPFI results with state-of-the-art EC prediction approaches. We find that GrAPFI achieves better accuracy and comparable or better coverage. The second part of the dissertation deals with learning representations for biological entities.
At the beginning, we focus on a neural network-based word embedding technique. We formulate the annotation task as a text classification task. We build a corpus of proteins as sentences composed of their respective domains and learn fixed-dimensional vector representations for proteins. Then, we focus on learning representations from a heterogeneous biological network. We build a knowledge graph integrating different sources of information related to proteins and their functions. We formulate the problem of function annotation as a link prediction task between proteins and GO terms. We propose Prot-A-GAN, a machine-learning model inspired by Generative Adversarial Networks (GANs), to learn vector representations of biological entities from the protein knowledge graph. We observe that Prot-A-GAN yields promising results in associating appropriate functions with query proteins. In conclusion, this thesis revisits the crucial problem of large-scale automatic protein function annotation in the light of innovative techniques of artificial intelligence. It opens up wide perspectives, in particular for the use of knowledge graphs, which are today available in many fields other than protein annotation thanks to the progress of data science.
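The neighborhood-based label propagation mentioned in the first part of the abstract above can be illustrated with a minimal sketch. This is not GrAPFI itself: the abstract does not give the similarity measure or scoring rule, so the Jaccard similarity over domain sets, the function names, and the toy protein and domain identifiers below are all assumptions made for illustration.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two collections of domain identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propagate_labels(domains, labels, query):
    """Score candidate labels for `query` from annotated neighbors.

    Assumed scheme (hypothetical): every annotated protein that shares
    domains with the query votes for its own labels, weighted by the
    Jaccard similarity of the two domain sets. Scores are normalized
    so that they sum to 1.
    """
    scores = defaultdict(float)
    for protein, protein_labels in labels.items():
        if protein == query:
            continue
        similarity = jaccard(domains[query], domains[protein])
        if similarity == 0.0:
            continue  # no edge in the similarity graph
        for label in protein_labels:
            scores[label] += similarity
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()} if total else {}


# Toy data: identifiers are made up for the example.
domains = {
    "P1": ["D1", "D2"],
    "P2": ["D1", "D2"],
    "P3": ["D3"],
    "Q": ["D1", "D2"],
}
labels = {"P1": ["EC:1.1.1.1"], "P2": ["EC:1.1.1.1"], "P3": ["EC:2.7.1.1"]}
predicted = propagate_labels(domains, labels, "Q")
```

In this toy case the query shares its full domain set with P1 and P2 and nothing with P3, so only the label carried by P1 and P2 receives a score.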
Banerjee, Torsha. "Energy Efficient Data Representation and Aggregation with Event Region Detection in Wireless Sensor Networks". University of Cincinnati / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1196187013.
Wu, Zutao. "Kmer-based sequence representations for fast retrieval and comparison". Thesis, Queensland University of Technology, 2017. https://eprints.qut.edu.au/103083/1/Zutao_Wu_Thesis.pdf.
McCormack, Daniel Keith Raymond. "An investigation into the representation of data for the neural implementation of a handwritten static signature verification system". Thesis, Cardiff University, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.338970.
Yadappanavar, Vinay M. (Vinay Muralidhara) 1976. "Time-dependent networks : data representation techniques and shortest path algorithms with applications to transportation problems". Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/8813.
Includes bibliographical references (leaves 139-141).
In this thesis, we develop methods for the following problems: the representation of discrete-time dynamic data, and the computation of fastest paths in continuous-time dynamic networks. We apply these methods to the following application problems: storage and communication of discrete-time dynamic transportation network data, and computation of fastest paths in traffic networks with signalized intersections. These problems are at the heart of real-time management of transportation networks equipped with information technologies. We propose a representation method (called the bit-stream representation) for nondecreasing discrete-time dynamic functions as a stream of 0 and 1 bits. We show that this representation consumes 12 times less memory than the classical representation for such data, where the function value at each time instant is stored as an L-bit integer. We exploit this representation to efficiently store and represent travel-time data in discrete-time dynamic transportation networks. Since the bit-stream representation requires less memory, it also reduces communication time for applications involving communication of such data. We adapt a classical dynamic one-to-all fastest path algorithm to work on bit-streams and show that this leads to savings of up to 16 times in overall communication and computation times. This holds the potential to impact the development of efficient high-performance computer implementations of dynamic shortest path algorithms in time-dependent networks. We model travel times in dynamic networks using piece-wise linear functions. We consider the one-to-all fastest path problem in a class of continuous-time dynamic networks. We present two algorithms: Algorithm OR, which is based on a conceptual algorithm known in the literature; and Algorithm IOT-C, which is developed in this thesis. We implement the two algorithms, and show that Algorithm IOT-C outperforms Algorithm OR by a factor of two.
We study the application problem of computing fastest paths in traffic networks with signalized intersections. We use a piece-wise linear link travel-time dynamic network model to address this problem, and demonstrate that this model is more accurate than discrete-time models proposed in the literature. Some of the implemented algorithms are applied to solve variants of the one-to-all fastest path problem in traffic networks with signalized intersections, and we study the computational performance of these implementations.
by Vinay M. Yadappanavar.
S.M.
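The idea of storing a nondecreasing discrete-time function as a stream of 0 and 1 bits, described in the abstract above, can be sketched as follows. The abstract does not specify the exact encoding used in the thesis, so the scheme below (one `1` per unit of increase, one `0` per time step) and the function names are assumptions chosen for illustration only.

```python
def encode_bitstream(values):
    """Encode a nondecreasing integer sequence as (start value, bit string).

    Assumed scheme (hypothetical): for each time step after the first,
    write one '1' per unit of increase since the previous step, then a
    single '0' to close the step.
    """
    if not values:
        raise ValueError("values must be non-empty")
    chunks = []
    previous = values[0]
    for value in values[1:]:
        if value < previous:
            raise ValueError("sequence must be nondecreasing")
        chunks.append("1" * (value - previous) + "0")
        previous = value
    return values[0], "".join(chunks)


def decode_bitstream(start, bits):
    """Rebuild the sequence: '1' increments the value, '0' emits it."""
    values = [start]
    current = start
    for bit in bits:
        if bit == "1":
            current += 1
        else:
            values.append(current)
    return values


# Toy nondecreasing function sampled at five time instants.
samples = [3, 3, 5, 6, 6]
start, bits = encode_bitstream(samples)
restored = decode_bitstream(start, bits)
```

Under this assumed scheme the stream length is the number of time steps plus the total increase of the function, so it is compact when the function grows slowly relative to the width of a fixed-size integer.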
Lloyd, James Robert. "Representation, learning, description and criticism of probabilistic models with applications to networks, functions and relational data". Thesis, University of Cambridge, 2015. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.709264.
Varol, Gül. "Learning human body and human action representations from visual data". Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE029.
The focus of visual content is often people. Automatic analysis of people from visual data is therefore of great importance for numerous applications in content search, autonomous driving, surveillance, health care, and entertainment. The goal of this thesis is to learn visual representations for human understanding. Particular emphasis is given to two closely related areas of computer vision: human body analysis and human action recognition. In summary, our contributions are the following: (i) we generate photo-realistic synthetic data for people that allows training CNNs for human body analysis, (ii) we propose a multi-task architecture to recover a volumetric body shape from a single image, (iii) we study the benefits of long-term temporal convolutions for human action recognition using 3D CNNs, (iv) we incorporate similarity training in multi-view videos to design view-independent representations for action recognition.
Kilingar, Nanda Gopala. "Generation and data-driven upscaling of open foam representational volume elements". Doctoral thesis, Universite Libre de Bruxelles, 2021. https://dipot.ulb.ac.be/dspace/bitstream/2013/313595/4/toc.pdf.
In this work, a generator of representative volume elements (RVEs) based on the distance fields of an aggregate of arbitrarily shaped inclusions is developed for open-cell foam materials. When the inclusions are spherical, the tessellation of the aggregate results in morphologies similar to physical foam samples in terms of the ratios of faces per pore and edges per face, as well as the strut length distribution, among other properties. Functions that combine the distance fields can be used to obtain tessellations with the variations required in the strut geometries and to extract these open foam morphologies. It is also possible to replace the aggregate of inclusions with a predefined set of inclusions extracted directly from tomographic images. The use of discrete level-set functions leads to strong discontinuities in the derivatives of the distance fields. An approach based on multiple level sets is presented that can appropriately capture the sharp edges of open-foam struts from the resulting distance fields. Such an approach can circumvent the discontinuities in the distance fields that could lead to spurious stress concentrations in an analysis of material behavior. The individual pores are then extracted as inclusion surfaces on the basis of these combinations of the distance functions and their modifications. These surfaces can be joined to obtain the final geometry of the open foam morphologies. The physical attributes of the extracted geometries are compared with experimental data. A statistical comparison describing the various characteristics is presented.
The study is extended to morphologies extracted using tomographic images. Using mesh optimization tools, triangulations of the surfaces can be obtained, merged, and developed into finite element (FE) models. The models are ready for use in a multiscale study to obtain the homogenized behavior of the material. Upscaling can help assess the practical applications of these models by comparing them with experimental data from physical samples. The material behavior of the RVEs is also compared with experimental observations. To increase the computational efficiency of the study, a surrogate model based on a neural network is presented. This model can replace the micro-scale boundary value problem in a multiscale analysis. The neural networks are built using modules specifically designed to predict history-dependent behavior and are called recurrent neural networks (RNNs). The surrogate models are trained to account for the randomness of the loading that a complex material undergoes during an analysis of material behavior.
Doctorate in Engineering Sciences and Technology
info:eu-repo/semantics/nonPublished
Araújo, Bilzã Marques de. "Rotulação de indivíduos representativos no aprendizado semissupervisionado baseado em redes: caracterização, realce, ganho e filosofia". Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-16122015-151236/.
Semi-supervised learning (SSL) is the name given to the machine learning paradigm that considers both labeled and unlabeled data. Although often defined as a middle ground between unsupervised and supervised machine learning, this paradigm is usually applied to predictive or descriptive tasks. In the classification task, for example, the goal is to label the unlabeled data according to the labels of the labeled data. In this case, while the unlabeled data describe the data distributions and mediate the label propagation, the labeled data seed the label propagation and guide it to stability. However, data are generally generated unlabeled, and labeling them requires domain specialists to do so by hand. Difficulties in visualizing huge amounts of data, as well as the cost of the specialists' involvement, are challenges which may constrain the performance of the labeling task. Therefore, the automatic highlighting of good candidates to label by hand, henceforth called representative individuals, is a high-value task, which may result in a good tradeoff between the cost of the specialist and the machine learning performance. Among the SSL approaches in the literature, our study is focused on the network-based approach, where datasets are represented relationally, through the graph abstraction. Thus, the current study aims to explore and exploit the influence of the labeled data on SSL performance, that is, the proper characterization of representative nodes, how the network structure may enhance them, the SSL performance gain due to labeling them by hand, and related philosophical aspects. Concerning the characterization, criteria for characterizing central nodes were studied on networks with well-defined modular structures. Counterintuitively, highly connected nodes (hubs) are not very representative. Less connected nodes placed in low-connectivity neighborhoods are, though.
Being strictly local, this characterization is scalable to huge volumes of data. In networks with homogeneous degree distribution, Girvan-Newman (GN) networks, nodes with high clustering coefficient also emerge as representative. On the other hand, in networks with inhomogeneous degree distribution, Lancichinetti-Fortunato-Radicchi (LFR) networks, nodes with high betweenness stand out. Nodes with high clustering coefficient in GN networks typically lie in almost-clique motifs; nodes with high betweenness in LFR networks are highly connected nodes, which lie on community borders. In both cases, the highlighted nodes are outstanding regularizers. Besides that, unified approaches to characterize representative nodes were studied because different criteria stand out for different networks. Graph construction from vector-based datasets, which is crucial for highlighting representative nodes and ensuring good SSL performance, was also studied. The method called AdaRadius was introduced and presents advantages such as adaptability to data with variable density, low dependency on parameter settings, and reasonable computational cost on both pool-based and incremental data. The resulting networks are sparse but connected and allow the semi-supervised classification to take great advantage of the manual labeling of representative nodes. Lastly, the validation of graph construction methods for SSL was studied, and a validation measure called graph-labels Katz coherence was proposed. Summing up, the discussed results support the validity of selecting representative individuals to seed the semi-supervised classification, which is the central assumption of the current thesis. Analogies may be found in several real-world network problems, such as epidemiology, rumors and information spreading, resilience, lethality, grandmother cells, and network evolution and self-organization.
Jagtap, Surabhi. "Multilayer Graph Embeddings for Omics Data Integration in Bioinformatics". Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPAST014.
Biological systems are composed of interacting bio-molecules at different molecular levels. With the advent of high-throughput technologies, omics data at their respective molecular level can be easily obtained. These huge, complex multi-omics data can be useful to provide insights into the flow of information at multiple levels, unraveling the mechanisms underlying the biological condition of interest. Integration of different omics data types is often expected to elucidate potential causative changes that lead to specific phenotypes, or targeted treatments. With the recent advances in network science, we choose to handle this integration issue by representing omics data through networks. In this thesis, we have developed three models, namely BraneExp, BraneNet, and BraneMF, for learning node embeddings from multilayer biological networks generated with omics data. We aim to tackle various challenging problems arising in multi-omics data integration, developing expressive and scalable methods capable of leveraging rich structural semantics of real-world networks.
Finfando, Filip. "Indoor scene verification : Evaluation of indoor scene representations for the purpose of location verification". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288856.
With human vision, it is fairly easy to judge whether two images taken in the same indoor space were really taken at exactly the same place, even if one has never been there. This is possible thanks to many factors, such as spatial properties (window shapes, room shapes), common patterns (floors, walls), or the presence of particular objects (furniture, lighting). Changes in camera placement, lighting, furniture placement, or digital alterations of the image (e.g., a watermark) affect this ability only minimally. Traditional methods for measuring the perceptual similarity of images have had difficulty reproducing this skill. This thesis defines Indoor Scene Verification (ISV) as the task of determining whether two indoor images were taken in the same space or not. The study examines the main perceptual identity functions by introducing two new datasets designed specifically for this purpose. Perceptual hash, ORB, FaceNet, and NetVLAD were identified as potential baselines. The results show that NetVLAD delivers the best results on both datasets, so it was chosen as the baseline to be improved upon. Three experiments investigate the effect of using different datasets, changing the structure of the neural network, and introducing a new loss function. Quantitative analysis of AUC values shows that switching from VGG16 to MobileNetV2 allows improvements over the baseline solutions.
Mohammadi, Samin. "Analysis of user popularity pattern and engagement prediction in online social networks". Thesis, Evry, Institut national des télécommunications, 2018. http://www.theses.fr/2018TELE0019/document.
Social media now affects almost every aspect of human life. The most significant change in people's behavior since the emergence of Online Social Networks (OSNs) is in how, and how widely, they communicate. Having more connections on OSNs brings people more attention and visibility, which is referred to as popularity on social media. Depending on the type of social network, popularity is measured by the number of followers, friends, retweets, likes, and other metrics that are used to calculate engagement. Studying the popularity behavior of users and published content on social media and predicting its future status are important research directions that benefit applications such as recommender systems, content delivery networks, advertising campaigns, and election result prediction. This thesis analyzes the popularity behavior of OSN users and their published posts in order to, first, identify the popularity trends of users and posts and, second, predict their future popularity and the engagement level for posts published by users. To this end, i) the popularity evolution of OSN users is studied using a dataset of 8K Facebook professional users collected by an advanced crawler. The collected dataset includes around 38 million snapshots of users' popularity values and 64 million published posts over a period of 4 years. Clustering temporal sequences of users' popularity values led to the identification of distinct and interesting popularity evolution patterns. The identified clusters are characterized by analyzing the users' business sector (called category), their activity level, and the effect of external events. Then ii) the thesis focuses on predicting user engagement with the posts published by users on OSNs. A novel prediction model is proposed that takes advantage of Point-wise Mutual Information (PMI) and predicts users' future reactions to newly published posts.
Finally, iii) the proposed model is extended to benefit from representation learning and to predict users' future engagement with each other's posts. The proposed prediction approach extracts user embeddings from their reaction history instead of using conventional feature extraction methods. The results show that the proposed model outperforms conventional learning methods available in the literature. The models proposed in this thesis not only improve reaction prediction by exploiting representation learning features instead of hand-crafted features, but could also help news agencies, advertising campaigns, content providers in CDNs, and recommender systems take advantage of more accurate predictions to improve their user services.
Mohammadi, Samin. "Analysis of user popularity pattern and engagement prediction in online social networks". Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2018. http://www.theses.fr/2018TELE0019.
Soares, Telma Woerle de Lima. "Estruturas de dados eficientes para algoritmos evolutivos aplicados a projeto de redes". Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-28052009-163303/.
Network design problems (NDPs) are very important because they arise in many applications in engineering and the sciences. To overcome the limitations of traditional algorithms for NDPs that involve complex real-world networks (in general, modeled by large-scale complete or sparse graphs), heuristics such as evolutionary algorithms (EAs) have been investigated. Recent research has shown that appropriate data structures can improve EA performance when applied to NDPs. One of these data structures is the Node-depth Encoding (NDE). In general, EAs with NDE have produced relevant results for large-scale NDPs. This thesis investigates the development of a new representation, based on NDE, called Node-depth-degree Encoding (NDDE). The NDDE consists of improvements to the NDE operators and new reproduction operators that enable the recombination of solutions. We developed a recombination operator that works with both non-complete and complete graphs, called EHR (Evolutionary History Recombination Operator). We also developed two other operators that work only with complete graphs, named NOX and NPBX. These improvements keep the computational complexity of the operators relatively low in order to improve EA performance. The analysis of representation properties showed that NDDE is a redundant representation, and for this reason we proposed some strategies to avoid the redundancy. This analysis also showed that EHR has a low running time and no bias, whereas NOX and NPBX are biased toward star-like trees. Applying an EA that uses the NDDE to classic NDPs, such as the optimal communication spanning tree, degree-constrained minimum spanning tree, and one-max tree problems, showed that the larger the instance, the better the performance in comparison with other EAs applied to NDPs in the literature.
An EA using the NDE with EHR was applied to a real-world NDP, the reconfiguration of energy distribution systems. The results showed that EHR significantly decreases the convergence time of the EA.
Kim, Pilho. "E-model event-based graph data model theory and implementation /". Diss., Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/29608.
Committee Chair: Madisetti, Vijay; Committee Member: Jayant, Nikil; Committee Member: Lee, Chin-Hui; Committee Member: Ramachandran, Umakishore; Committee Member: Yalamanchili, Sudhakar. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Machens, Anna. "Processus épidémiques sur réseaux dynamiques". Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM4066/document.
This thesis provides insights into questions concerning dynamic epidemic processes on data-driven temporal networks. In particular, we investigate the influence of data representations on the outcome of epidemic processes, shedding some light on how much detail the data representation needs and how this depends on the spreading parameters. By introducing an improvement to the contact matrix representation, we provide a data representation that could in the future be integrated into multi-scale epidemic models in order to improve the accuracy of predictions and the corresponding immunization strategies. We also point out some of the ways in which dynamic processes are influenced by temporal properties of the data.
Nguyen, Thanh Hai. "Some contributions to deep learning for metagenomics". Electronic Thesis or Diss., Sorbonne université, 2018. http://www.theses.fr/2018SORUS102.
Metagenomic data from the human microbiome is a novel source of data for improving diagnosis and prognosis in human diseases. However, making predictions based on individual bacteria abundance is a challenge, since the number of features is much larger than the number of samples. Hence, we face difficulties related to high-dimensional data processing, as well as to the high complexity of heterogeneous data. Machine learning has achieved great results on important metagenomics problems linked to OTU clustering, binning, taxonomic assignment, etc. The contribution of this PhD thesis is twofold: 1) a feature selection framework for efficient heterogeneous biomedical signature extraction, and 2) a novel deep learning approach for predicting diseases using artificial image representations. The first contribution is an efficient feature selection approach based on the visualization capabilities of Self-Organizing Maps for heterogeneous data fusion. The framework is efficient on a real, heterogeneous dataset containing metadata, adipose tissue gene data, and gut flora metagenomic data, with reasonable classification accuracy compared to state-of-the-art methods. The second contribution is a method to visualize metagenomic data using a simple fill-up method as well as various state-of-the-art dimensionality reduction approaches. The new metagenomic data representation can be treated as synthetic images and used as a novel dataset for an efficient deep learning method such as Convolutional Neural Networks. The results show that the proposed methods either match state-of-the-art predictive performance or outperform it on rich public metagenomic benchmarks.
Fonseca, Eduardo. "Training sound event classifiers using different types of supervision". Doctoral thesis, Universitat Pompeu Fabra, 2021. http://hdl.handle.net/10803/673067.
Interest in the automatic recognition of sound events has grown in recent years, driven by new applications in fields such as healthcare, smart homes, and urban planning. When this thesis began, research on sound event classification focused mainly on supervised learning with small datasets, often carefully annotated with vocabularies limited to specific domains (such as urban or domestic). However, such datasets do not allow training classifiers capable of recognizing the hundreds of sound events that occur in our environment, such as kettle whistles, bird sounds, passing cars, or different alarms. At the same time, websites such as Freesound or YouTube host large amounts of environmental sound data, which can be useful for training classifiers with a larger vocabulary, particularly with deep learning methods that require large amounts of data. To advance the state of the art in sound event classification, this thesis investigates several aspects of dataset creation, as well as supervised and unsupervised learning, to train large-vocabulary sound event classifiers using different types of supervision in novel and alternative ways. Specifically, we focus on supervised learning with clean and noisy labels, as well as self-supervised representation learning from unlabeled data. The first part of this thesis focuses on the creation of FSD50K, a dataset with more than 100 hours of audio manually labeled using 200 sound event classes. We present a detailed description of the creation process and a comprehensive characterization of the dataset.
In addition, we explore architectural modifications to increase shift invariance in CNNs, improving robustness to time/frequency shifts in the input spectrograms. In the second part, we focus on training sound event classifiers with noisy labels. First, we propose a dataset that supports the investigation of real label noise. Then, we explore network-agnostic methods to mitigate the effect of label noise during training, including regularization techniques, noise-robust loss functions, and strategies to reject noisy labeled examples. We also develop a teacher-student method to address the problem of missing labels in sound event datasets. In the third part, we propose algorithms to learn audio representations from unlabeled data. In particular, we develop self-supervised contrastive learning methods, where representations are learned by comparing pairs of examples computed via data augmentation and automatic sound separation methods. Finally, we report on the organization of two DCASE Challenge Tasks on automatic audio tagging from noisy labels. By providing datasets as well as state-of-the-art methods and audio representations, this thesis contributes to the advancement of open sound event research and to the transition from traditional supervised learning with clean labels to other learning strategies less dependent on costly annotation efforts.
"A new data structure and algorithm for spatial network representation". 2003. http://library.cuhk.edu.hk/record=b5891646.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2003.
Includes bibliographical references (leaves 92-96).
Abstracts in English and Chinese.
Abstract in English
Abstract in Chinese
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 --- Introduction
Chapter 1.1 --- Introduction
Chapter 1.2 --- Motivation
Chapter 1.3 --- Purposes of this Research
Chapter 1.4 --- Contribution of this Research
Chapter 1.5 --- Outline of the Thesis
Chapter 2 --- Literature Review And Research Issues
Chapter 2.1 --- Introduction
Chapter 2.2 --- Spatial Access Methods
Chapter 2.2.1 --- R-Tree
Chapter 2.2.2 --- R*-Tree
Chapter 2.2.3 --- R+-Tree
Chapter 2.3 --- Spatial Network Analysis
Chapter 2.4 --- Nearest Neighbor Queries
Chapter 2.5 --- Summary
Chapter 3 --- Data Preparation
Chapter 3.1 --- Introduction (XML, GML), XML indexing
Chapter 3.2 --- Spatial data from Lands Department
Chapter 3.3 --- Graph representation for Road Network data
Chapter 3.4 --- Summary
Chapter 4 --- XML Indexing for Spatial Data
Chapter 4.1 --- Introduction
Chapter 4.2 --- STR Packed R-Tree
Chapter 4.2.1 --- Implementation
Chapter 4.2.2 --- Experimental Result
Chapter 4.3 --- Summary
Chapter 5 --- Spatial Network
Chapter 5.1 --- Introduction
Chapter 5.2 --- CCAM: Connectivity-Clustered Access Method
Chapter 5.3 --- Shortest Path in Spatial Network
Chapter 5.4 --- A New Algorithm Specially for Partitioning /Clustering Network
Chapter 5.5 --- A New Simple heuristic for Shortest Path Problem for Spatial Network
Chapter 5.6 --- Summary
Chapter 6 --- Nearest Neighbor Queries
Chapter 6.1 --- Introduction
Chapter 6.2 --- Modified Algorithm for Nearest Neighbor Queries
Chapter 6.3 --- Summary
Chapter 7 --- Conclusion and Future Work
Chapter 7.1 --- Conclusion
Chapter 7.2 --- Future Work
Appendix --- Space Driven Algorithm
Chapter A.1 --- Introduction
Chapter A.2 --- Fixed Grid
Chapter A.3 --- Z-curve
Chapter A.4 --- Hilbert curve
Chapter A.5 --- Conclusion
Bibliography
Dumoulin, Vincent. "Representation Learning for Visual Data". Thèse, 2018. http://hdl.handle.net/1866/21140.
(10157291), Yi-Yu Lai. "Relational Representation Learning Incorporating Textual Communication for Social Networks". Thesis, 2021.
Pulabaigari, Viswanath. "Pattern Synthesis Techniques And Compact Data Representation Schemes For Efficient Nearest Neighbor Classification". Thesis, 2005. https://etd.iisc.ac.in/handle/2005/1560.
Pulabaigari, Viswanath. "Pattern Synthesis Techniques And Compact Data Representation Schemes For Efficient Nearest Neighbor Classification". Thesis, 2005. http://etd.iisc.ernet.in/handle/2005/1560.
(11197824), Kiirthanaa Gangadharan. "Deep Transferable Intelligence for Wearable Big Data Pattern Detection". Thesis, 2021.
(10723926), Adefolarin Alaba Bolaji. "Community Detection of Anomaly in Large-Scale Network Dissertation - Adefolarin Bolaji .pdf". Thesis, 2021.
The detection of anomalies in real-world networks is applicable in different domains; applications include, but are not limited to, credit card fraud detection, malware identification and classification, cancer detection from diagnostic reports, abnormal traffic detection, and identification of fake media posts. Much ongoing research provides tools for analyzing labeled and unlabeled data; however, finding anomalies and patterns in large-scale datasets remains challenging because of rapid changes in the threat landscape.
In this study, I implemented a novel and robust solution that combines data science and cybersecurity to solve complex network security problems. I used a Long Short-Term Memory (LSTM) model, the Louvain algorithm, and the PageRank algorithm to identify and group anomalies in large-scale real-world networks. The network has billions of packets. The developed model used different visualization techniques to provide further insight into how the anomalies in the network are related.
Mean absolute error (MAE) and root mean square error (RMSE) were used to validate the anomaly detection models; the values obtained were 5.1813e-04 and 1e-03, respectively. The low loss in the training phase was consistent with the low RMSE (loss: 5.1812e-04, mean absolute error: 5.1813e-04, validation loss: 3.9858e-04, validation mean absolute error: 3.9858e-04). The community detection yielded an overall modularity value of 0.914, which indicates the existence of very strong communities among the anomalies. The largest sub-community of the anomalies connects 10.42% of the total anomaly nodes.
The broader aim and impact of this study was to provide sophisticated, AI-assisted countermeasures to cyber-threats in large-scale networks. To close the gaps created by the shortage of skilled and experienced cybersecurity specialists and analysts, solutions based on out-of-the-box thinking are needed; this research aimed to yield one such solution. It was built to detect specific and collaborating threat actors in large networks and to help speed up how the activities of anomalies in any given large-scale network can be curtailed in time.
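The two validation measures named in the abstract above, MAE and RMSE, have standard definitions that can be sketched in a few lines. The example below is only an illustration on made-up values; the names `mae`, `rmse`, `actual`, and `predicted` are assumptions and have no connection to the dissertation's data or code.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: square root of the average squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

# Hypothetical values: three exact predictions and one that is off by 2.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 2.0]

print("MAE :", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data and grows faster when a few errors are large, which is one reason the two measures are often reported together.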
(11073474), Bin Zhang. "Data-driven Uncertainty Analysis in Neural Networks with Applications to Manufacturing Process Monitoring". Thesis, 2021.
Artificial neural networks, including deep neural networks, play a central role in data-driven science due to their superior learning capacity and adaptability to different tasks and data structures. However, although quantitative uncertainty analysis is essential for training and deploying reliable data-driven models, the uncertainties in neural networks are often overlooked or underestimated in many studies, mainly due to the lack of a high-fidelity and computationally efficient uncertainty quantification approach. In this work, a novel uncertainty analysis scheme is developed. The Gaussian mixture model is used to characterize the probability distributions of uncertainties in arbitrary forms, which yields higher fidelity than the presumed distribution forms, like Gaussian, when the underlying uncertainty is multimodal, and is more compact and efficient than large-scale Monte Carlo sampling. The fidelity of the Gaussian mixture is refined through adaptive scheduling of the width of each Gaussian component based on the active assessment of the factors that could deteriorate the uncertainty representation quality, such as the nonlinearity of activation functions in the neural network.
Following this idea, an adaptive Gaussian mixture scheme of nonlinear uncertainty propagation is proposed to effectively propagate the probability distributions of uncertainties through layers in deep neural networks or through time in recurrent neural networks. An adaptive Gaussian mixture filter (AGMF) is then designed based on this uncertainty propagation scheme. By approximating the dynamics of a highly nonlinear system with a feedforward neural network, the adaptive Gaussian mixture refinement is applied at both the state prediction and Bayesian update steps to closely track the distribution of unmeasurable states. As a result, this new AGMF exhibits state-of-the-art accuracy with a reasonable computational cost on highly nonlinear state estimation problems subject to high magnitudes of uncertainties. Next, a probabilistic neural network with Gaussian-mixture-distributed parameters (GM-PNN) is developed. The adaptive Gaussian mixture scheme is extended to refine intermediate layer states and ensure the fidelity of both linear and nonlinear transformations within the network so that the predictive distribution of output target can be inferred directly without sampling or approximation of integration. The derivatives of the loss function with respect to all the probabilistic parameters in this network are derived explicitly, and therefore, the GM-PNN can be easily trained with any backpropagation method to address practical data-driven problems subject to uncertainties.
The GM-PNN is applied to two data-driven condition monitoring schemes of manufacturing processes. For tool wear monitoring in the turning process, a systematic feature normalization and selection scheme is proposed for the engineering of optimal feature sets extracted from sensor signals. The predictive tool wear models are established using two methods, one is a type-2 fuzzy network for interval-type uncertainty quantification and the other is the GM-PNN for probabilistic uncertainty quantification. For porosity monitoring in laser additive manufacturing processes, convolutional neural network (CNN) is used to directly learn patterns from melt-pool patterns to predict porosity. The classical CNN models without consideration of uncertainty are compared with the CNN models in which GM-PNN is embedded as an uncertainty quantification module. For both monitoring schemes, experimental results show that the GM-PNN not only achieves higher prediction accuracies of process conditions than the classical models but also provides more effective uncertainty quantification to facilitate the process-level decision-making in the manufacturing environment.
Based on the developed uncertainty analysis methods and their proven successes in practical applications, some directions for future studies are suggested. Closed-loop control systems may be synthesized by combining the AGMF with data-driven controller design. The AGMF can also be extended from a state estimator to the parameter estimation problems in data-driven models. In addition, the GM-PNN scheme may be expanded to directly build more complicated models like convolutional or recurrent neural networks.
Higgins, Stefan. "Imagining information: the uses of storytelling". Thesis, 2020. http://hdl.handle.net/1828/12555.
Graduate
2021-06-20
"An effective information representation for opinion-oriented applications". 2013. http://library.cuhk.edu.hk/record=b5549839.
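This thesis answers the following research questions arising from the inadequate representation of subjective information: 1. Since the single word is no longer the basic semantic unit for subjective information, is there an effective way to represent it? 2. Since subjective information combines opinion information and relevance information, how can a new representation capture the association between the two? 3. How can subjective information be quantified so that documents can be retrieved and analyzed? 4. How can the new subjective information representation be applied in opinion-oriented applications?
Because the results of opinion retrieval directly affect the performance of other opinion-oriented applications, this thesis starts from the problem of opinion retrieval. First, we propose a sentence-based approach to analyze the limitations of the bag-of-words representation. On this basis, we define a semantically rich representation of subjective information, the word pair, which consists of a sentiment word and its associated target word appearing in the same sentence. We then propose a series of methods to describe and capture two kinds of contextual information: 1) intra-opinion information: we give three methods for extracting word pairs to capture the association between opinion and topic; 2) inter-opinion information: we propose a weighting method to measure the degree of relatedness between word pairs, thereby capturing the relations among word pairs. Finally, we integrate intra-opinion and inter-opinion information and propose the latent sentimental association model to solve the opinion retrieval problem. Experimental results on standard datasets show that the word-pair representation describes subjective information effectively and that the latent sentimental association model captures associations between words, thereby using contextual information to improve opinion retrieval.
In addition, we apply word pairs to opinion summarization and opinion question answering; evaluation on standard datasets shows that the word-pair representation of subjective information is equally effective for other opinion-oriented applications.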
Users are increasingly interested in expressing their opinions about products, films, and politics through online tools such as forums, blogs, and Facebook. These opinions can not only help users make decisions, e.g., whether to buy a product, but also provide valuable feedback on business and social events. Today, research on opinion-oriented applications (OOAs), including opinion retrieval, opinion summarization, and opinion question answering, is attracting much attention. The difference between fact-based and opinion-oriented applications lies in the users' information need. The former requires objective information and the latter subjective information, which consists of opinions or comments expressed on a specific target. To meet the need for subjective information, both opinionatedness and relevance, together with the association between them, should be taken into account. Existing systems represent documents as bags of words. However, this representation fails to distinguish opinionatedness from relevance. Moreover, because word order is ignored, word associations are lost. For this reason, the bag-of-words representation is ineffective for subjective information and seriously affects the performance of OOAs.
In this thesis, we try to answer the following challenging questions that arise in subjective information representation. Since the word is no longer the basic semantic unit, how should subjective information be represented? Subjective information is a combination of opinionatedness and relevance, so how should the association between them be modeled? How should subjective information be measured for the purposes of document ranking, retrieval, and analysis? How can opinion-oriented applications benefit from subjective information?
We start by solving the problem of opinion retrieval, whose results directly influence the performance of other opinion-oriented applications. We first present a sentence-based approach to analyze the limitations of the bag-of-words representation and define a semantically richer representation for subjective information, namely the word pair. A word pair is constructed from a sentiment word and its associated target co-occurring in a sentence. We then propose techniques to capture two kinds of contextual information. 1) Intra-opinion information: three methods are proposed to extract word pairs. 2) Inter-opinion information: a weighting scheme is presented to measure the weight of an individual word pair. Finally, we devise an algorithm that integrates both intra-opinion and inter-opinion information into a latent sentimental association model for opinion retrieval. The evaluation on three benchmark datasets suggests that the word pair is effective and that the latent sentimental association retrieval model provides insight into word associations, so that opinion retrieval benefits from the pairwise representation. We also apply word pairs to opinion summarization and opinion question answering. The evaluation on two benchmark datasets shows that word pairs perform effectively in these applications.
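The word-pair idea described above (a sentiment word paired with a target that co-occurs in the same sentence, then weighted) can be sketched as follows. The table of contents for this thesis names a PMI-based weighting scheme but the record does not give its formula, so the standard sentence-level pointwise mutual information is used here as an assumption. The lexicons, sentences, and the helper `extract_pairs` are hypothetical illustrations, not the thesis's three extraction methods.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical lexicons; the thesis builds its own sentiment and topic-term lexicons.
SENTIMENT = {"great", "poor"}
TARGETS = {"battery", "screen"}

# Hypothetical toy corpus, one sentence per string.
sentences = [
    "the battery is great",
    "great battery life",
    "the screen is poor",
    "great screen",
]

def extract_pairs(sentence):
    """Pair every sentiment word with every target word in the same sentence."""
    tokens = set(sentence.lower().split())
    return set(product(tokens & SENTIMENT, tokens & TARGETS))

word_count = Counter()   # number of sentences containing each word
pair_count = Counter()   # number of sentences containing each (sentiment, target) pair
for s in sentences:
    word_count.update(set(s.lower().split()))
    pair_count.update(extract_pairs(s))

n = len(sentences)

# Standard PMI over sentence-level co-occurrence (an assumed weighting):
# PMI(s, t) = log2( P(s, t) / (P(s) * P(t)) )
pmi = {
    (s, t): math.log2(c * n / (word_count[s] * word_count[t]))
    for (s, t), c in pair_count.items()
}

for pair, weight in sorted(pmi.items()):
    print(pair, round(weight, 3))
```

In this toy corpus the pair ("poor", "screen") receives the highest weight because "poor" never appears without "screen", while ("great", "screen") is weighted below zero because "great" more often accompanies "battery". A pair that never co-occurs in a sentence is simply absent from the result.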
Detailed summary in vernacular field only.
Li, Binyang.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2013.
Includes bibliographical references (leaves [96]-103).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstract also in Chinese.
Abstract --- p.ii
Abstract in Chinese --- p.iv
Acknowledgements --- p.vi
Contents --- p.viii
List of Tables --- p.xi
List of Figures --- p.xiii
Chapter 1. --- Introduction --- p.1
Chapter 1.1. --- Problem and Challenges --- p.3
Chapter 1.1.1 --- Subjective Information Representation --- p.3
Chapter 1.1.2 --- Associative Information in an Opinion Expression --- p.4
Chapter 1.1.3 --- Opinion Expression Measurement --- p.5
Chapter 1.1.4 --- Applications of Subjective Information Representation to Different OOAs --- p.6
Chapter 1.2. --- Contributions --- p.6
Chapter 1.3. --- Chapter Summary --- p.7
Chapter 2. --- Pairwise Representation --- p.9
Chapter 2.1 --- Related Works on Opinion Retrieval --- p.10
Chapter 2.1.1 --- Opinion Retrieval Models --- p.10
Chapter 2.1.2 --- Lexicon-based Opinion Identification --- p.12
Chapter 2.2 --- Sentence-based Approach for Opinion Retrieval --- p.13
Chapter 2.2.1 --- The Limitations of Document-based Approaches for Opinion Retrieval --- p.13
Chapter 2.2.2 --- Sentence-based Approach for Opinion Retrieval --- p.16
Chapter 2.2.3 --- Evaluation and Results --- p.21
Chapter 2.2.4 --- Summary --- p.26
Chapter 2.3 --- Pairwise Representation --- p.28
Chapter 2.3.1 --- Definition of Word Pair --- p.28
Chapter 2.3.2 --- Sentiment Lexicon Construction --- p.29
Chapter 2.3.3 --- Topic Term Lexicon Construction --- p.30
Chapter 2.3.4 --- Word Pair Construction --- p.31
Chapter 2.4 --- Graph-based Model for Opinion Retrieval --- p.33
Chapter 2.4.1 --- HITS Model for Opinion Retrieval --- p.34
Chapter 2.4.2 --- PageRank Model for Opinion Retrieval --- p.37
Chapter 2.4.3 --- Evaluation and Results --- p.40
Chapter 2.5 --- Chapter Summary --- p.50
Chapter 3. --- Pairwise Representation Measurement --- p.51
Chapter 3.1 --- Word Pair Weighting Scheme --- p.52
Chapter 3.1.1 --- PMI-based Weighting Scheme --- p.52
Chapter 3.1.2 --- Evaluation and Results --- p.56
Chapter 3.1.3 --- Summary --- p.60
Chapter 3.2 --- Latent Sentimental Association --- p.61
Chapter 3.2.1 --- Problem Formulation --- p.61
Chapter 3.2.2 --- LSA Integrated Generative Model --- p.62
Chapter 3.2.3 --- Modeling the Dependency between Q and d --- p.64
Chapter 3.2.4 --- Modeling the Dependency between O and d --- p.67
Chapter 3.3 --- Parameter Estimation --- p.67
Chapter 3.3.1 --- Estimating P(Q --- p.67
Chapter 3.3.2 --- Estimating MI(Q,O --- p.69
Chapter 3.4 --- Evaluation and Results --- p.69
Chapter 3.5 --- Chapter Summary --- p.72
Chapter 4. --- Pairwise Representation in Opinion-oriented Application --- p.75
Chapter 4.1. --- Opinion Questioning and Answering --- p.76
Chapter 4.1.1 --- Problem Statement --- p.76
Chapter 4.1.2 --- Existing Solution --- p.78
Chapter 4.1.3 --- A Word Pair based Approach for Sentence Ranking --- p.79
Chapter 4.1.4 --- Answer Generation --- p.82
Chapter 4.1.5 --- Evaluation and Results --- p.82
Chapter 4.2. --- Opinion Summarization --- p.86
Chapter 4.2.1 --- Problem Statement --- p.86
Chapter 4.2.2 --- Existing Solution --- p.87
Chapter 4.2.3 --- Sentence Ranking --- p.88
Chapter 4.2.4 --- Summary Generation --- p.88
Chapter 4.2.5 --- Evaluation and Results --- p.89
Chapter 4.3. --- Chapter Summary --- p.91
Chapter 5. --- Conclusions and Future Works --- p.93
Bibliography --- p.97
Wang, Fang. "From Line Drawings to Human Actions: Deep Neural Networks for Visual Data Representation". PhD thesis, 2017. http://hdl.handle.net/1885/135765.
Cunha, Adriana Monteiro e. "Neural networks for 2D representations of cell expression". Master's thesis, 2020. http://hdl.handle.net/10316/93918.
The recent advances in transcriptome sequencing technologies have led to an increase in gene expression studies, with significant impact in the fields of cellular biology and medicine. Typically, work based on this type of data resorts to feature reduction techniques to combat the problems arising from the curse of dimensionality and from data extraction (such as dropout events, noise, etc.), especially in projects involving classification tasks. This dissertation presents a novel dimensionality reduction model inspired by deep neural networks, the Supervised Autoencoder, which combines the architecture of traditional autoencoders with a SoftMax classification layer, so that the latent space maximizes the separability of different classes. To account for the recurring dropout events in this type of dataset, a Dropout layer was used during training, improving the model's robustness. The present study focuses particularly on two-dimensional reductions to ease the visualisation of information. In addition to an analysis of the effect of label usage in the feature reduction process (prior to potential classification tasks), the possibility of inferring new similarity patterns between samples through the latent space was explored. The model was validated with three datasets, comparing its results with those of Principal Component Analysis and the equivalent simple autoencoder, as well as by analysing the heatmap of the complete gene expression data grouped by hierarchical clustering of the reduced features. The results show that the model is capable of meaningful representations of the original data that ease the classification task compared to those resulting from state-of-the-art techniques. However, it was not possible to use those features to draw new parallels between samples.
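One common way to write the training objective of an autoencoder coupled with a SoftMax classification layer, as described in the abstract above, is a weighted sum of a reconstruction term and a cross-entropy term. The formula below is only a sketch under that assumption: the encoder f, decoder g, classifier parameters W and b, and the trade-off weight λ are illustrative symbols, not notation taken from the dissertation.

```latex
% z = f(x) is the latent code (two-dimensional in the setting described above)
\mathcal{L}(x, y)
  = \underbrace{\lVert x - g(f(x)) \rVert_2^2}_{\text{reconstruction}}
  \; + \; \lambda \,
    \underbrace{\Bigl( - \sum_{c} y_c \,
      \log \operatorname{softmax}\bigl( W f(x) + b \bigr)_c \Bigr)}_{\text{classification}}
```

With λ = 0 this reduces to a plain autoencoder, while larger values of λ push the latent space towards class separability at the expense of reconstruction quality.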
Other - Project funded by the Fundação para a Ciência e Tecnologia: D4 - Deep Drug Discovery and Deployment (CENTRO-01-0145-FEDER-029266)
Kang, Yao-Wen, and 康耀文. "Adaptive Data Representation to Decrease Analog Variation Error of ReRAM Crossbar Accelerator for Neural Networks". Thesis, 2019. http://ndltd.ncl.edu.tw/handle/85rtfs.
National Taiwan University (國立臺灣大學)
Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所)
107
Current deep neural network computations incur intensive memory accesses, which limits the performance of the current Von Neumann architecture. To bridge the performance gap, the Processing-In-Memory (PIM) architecture is widely advocated, and crossbar accelerators with Resistive Random-Access Memory (ReRAM) are one of the intensively studied solutions. However, due to the programming variation of ReRAM, crossbar accelerators suffer from serious accuracy issues. To improve accuracy, we propose an adaptive data representation strategy to minimize the analog variation errors caused by the programming variation of ReRAM. The proposed strategy was evaluated in a series of intensive experiments based on data collected from real ReRAM chips, and the results show that it can improve accuracy by around 20% for MNIST, which is close to the ideal case, and by 40% for CIFAR10.
Sakhala, Nidhi Nandkishor. "Generation of cyber attack data using generative techniques". Thesis, 2019.
Attacks make up a considerably smaller share of day-to-day traffic in connected networks than genuine traffic. Yet the consequences of these attacks are disastrous. It is very important to identify whether the network is being attacked and to block these attempts in order to protect the network system. Failure to block these attacks can lead to loss of confidential information and reputation, as well as financial loss. One strategy for identifying these attacks is to use machine learning algorithms that learn to recognize attacks from previous examples. But since the number of attacks is small, it is difficult to train these algorithms. This study aims to use generative techniques to create new attack samples that can be used to train machine learning based intrusion detection systems to identify more attacks. Two metrics are used to verify that the training has improved, and a binary classifier is used to perform a two-sample test for verifying the generated attacks.
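The classifier-based two-sample test mentioned at the end of the abstract can be illustrated with a small sketch: a binary classifier is trained to tell real samples from generated ones, and a held-out accuracy close to 0.5 suggests the two samples are hard to distinguish. Everything specific here is an assumption for illustration, including the function name `classifier_two_sample_test`, the one-dimensional features, and the nearest-centroid classifier; the study itself does not state which classifier or features it uses.

```python
import random


def classifier_two_sample_test(real, generated, seed=0):
    """Return held-out accuracy of a classifier separating real from generated.

    Hypothetical sketch: samples are scalars and the classifier is a
    nearest-centroid rule. Accuracy near 0.5 means the classifier cannot
    tell the two samples apart.
    """
    rng = random.Random(seed)
    data = [(x, 0) for x in real] + [(x, 1) for x in generated]
    rng.shuffle(data)
    split = len(data) // 2
    train, test = data[:split], data[split:]

    # Fit: one centroid (mean) per class, estimated on the training half
    centroids = {}
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        centroids[label] = sum(values) / len(values)

    # Predict the class whose centroid is closer, then score on the test half
    correct = sum(
        1
        for x, y in test
        if min((0, 1), key=lambda c: abs(x - centroids[c])) == y
    )
    return correct / len(test)


rng = random.Random(1)
real = [rng.gauss(0, 1) for _ in range(1000)]
similar = [rng.gauss(0, 1) for _ in range(1000)]     # same distribution
different = [rng.gauss(10, 1) for _ in range(1000)]  # clearly shifted

print("similar:", classifier_two_sample_test(real, similar))
print("different:", classifier_two_sample_test(real, different))
```

In practice a stronger classifier and multivariate traffic features would be used; the point of the sketch is only the decision rule, namely that accuracy well above chance is evidence that the generated samples differ from the real ones.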
Daneshmand, Amir. "Parallel and Decentralized Algorithms for Big-data Optimization over Networks". Thesis, 2021.
Recent decades have witnessed a deluge of data generated by heterogeneous sources, e.g., social networks, streaming, marketing services, etc., which has naturally created a surge of interest in the theory and applications of large-scale convex and non-convex optimization. For example, real-world instances of statistical learning problems such as deep learning and recommendation systems can generate sheer volumes of spatially/temporally diverse data (up to petabytes of data in commercial applications) with millions of decision variables to be optimized. Such problems are often referred to as Big-data problems. Solving them with standard optimization methods demands an intractable amount of centralized storage and computational resources, which is infeasible; overcoming this limitation is the foremost purpose of the parallel and decentralized algorithms developed in this thesis.
This thesis consists of two parts: (I) Distributed Nonconvex Optimization and (II) Distributed Convex Optimization.
In Part (I), we start by studying a winning paradigm in big-data optimization, the Block Coordinate Descent (BCD) algorithm, which ceases to be effective when problem dimensions grow overwhelmingly. In particular, we consider a general family of constrained non-convex composite large-scale problems defined on multicore computing machines equipped with shared memory. We design a hybrid deterministic/random parallel algorithm that solves such problems efficiently by combining Successive Convex Approximation (SCA) with greedy/random dimensionality reduction techniques. We provide theoretical and empirical results showing the efficacy of the proposed scheme on huge-scale problems. The next step is to broaden the network setting to general mesh networks modeled as directed graphs, and to propose a class of gradient-tracking based algorithms with global convergence guarantees to critical points of the problem. We further explore the geometry of the landscape of the non-convex problems to establish second-order guarantees, strengthening our convergence results from local optimal solutions to global optimal solutions for a wide range of Machine Learning problems.
In Part (II), we focus on a family of distributed convex optimization problems defined over meshed networks. Relevant state-of-the-art algorithms often consider limited problem settings, with pessimistic communication complexities relative to the complexity of their centralized variants. This raises an important question: can one achieve the rate of centralized first-order methods over networks and, moreover, can one improve upon their communication costs by using higher-order local solvers? To answer these questions, we propose an algorithm that utilizes surrogate objective functions in local solvers (hence going beyond first-order realms, such as proximal-gradient), coupled with a perturbed (push-sum) consensus mechanism that aims to track locally the gradient of the central objective function. The algorithm is proved to match the convergence rate of its centralized counterparts, up to multiplicative network factors. When considering, in particular, Empirical Risk Minimization (ERM) problems with statistically homogeneous data across the agents, our algorithm employing high-order surrogates provably achieves faster rates than those achievable by first-order methods. Such improvements are made without exchanging any Hessian matrices over the network.
Finally, we focus on the ill-conditioning issue that impacts the efficiency of decentralized first-order methods over networks, rendering them impractical in terms of both computation and communication cost. A natural solution is to develop distributed second-order methods, but their need for Hessian information incurs substantial communication overhead on the network. To work around such exorbitant communication costs, we propose a “statistically informed” preconditioned cubic regularized Newton method which provably improves upon the rates of first-order methods. The proposed scheme does not require communication of Hessian information in the network, and yet achieves the iteration complexity of centralized second-order methods up to the statistical precision. In addition, the (second-order) approximate nature of the surrogate functions used improves upon the per-iteration computational cost of our earlier proposed scheme in this setting.
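The gradient-tracking idea referred to in the abstract above can be sketched in its simplest textbook form: each agent mixes its iterate with its neighbours' and steps along a local variable that tracks the network-wide average gradient. This is a deliberately simplified illustration, not the thesis's algorithm: it assumes an undirected network with a doubly stochastic mixing matrix (the thesis treats directed graphs with push-sum consensus and surrogate-based local solvers), scalar quadratic local costs f_i(x) = 0.5 (x - b_i)^2, and a hypothetical function name `gradient_tracking`.

```python
def gradient_tracking(targets, weights, step=0.1, iterations=200):
    """Decentralized gradient tracking on f_i(x) = 0.5 * (x - targets[i])**2.

    weights must be a doubly stochastic mixing matrix (assumption).
    The minimizer of the sum of the f_i is the mean of the targets.
    """
    n = len(targets)

    def grad(i, x):
        # Gradient of agent i's local quadratic cost
        return x - targets[i]

    x = [0.0] * n
    # y[i] tracks the average gradient; initialized with the local gradient
    y = [grad(i, x[i]) for i in range(n)]
    for _ in range(iterations):
        # Consensus step on the iterates, then a move along the tracked gradient
        x_new = [
            sum(weights[i][j] * x[j] for j in range(n)) - step * y[i]
            for i in range(n)
        ]
        # Consensus step on the trackers plus the change in the local gradient
        y = [
            sum(weights[i][j] * y[j] for j in range(n))
            + grad(i, x_new[i])
            - grad(i, x[i])
            for i in range(n)
        ]
        x = x_new
    return x


mixing = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
print(gradient_tracking([1.0, 2.0, 6.0], mixing))
```

Only the scalars x and y are exchanged between neighbours in this sketch, which mirrors the abstract's emphasis on avoiding the communication of Hessian information.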
Hudeček, Ján. "Dynamické sociální sítě a jejich analýza". Master's thesis, 2021. http://www.nusl.cz/ntk/nusl-438020.