Dissertations / Theses on the topic 'Unsupervised anomaly detection'
Consult the top 50 dissertations / theses for your research on the topic 'Unsupervised anomaly detection.'
Mazel, Johan. "Unsupervised network anomaly detection." Thesis, Toulouse, INSA, 2011. http://www.theses.fr/2011ISAT0024/document.
Anomaly detection has become a vital component of any network in today’s Internet. Ranging from non-malicious unexpected events such as flash crowds and failures to network attacks such as denial-of-service attacks and network scans, network traffic anomalies can have serious detrimental effects on the performance and integrity of the network. The continuous emergence of new anomalies and attacks creates a continuous challenge to cope with events that put network integrity at risk. Moreover, the inherently polymorphic nature of traffic, caused among other things by a rapidly changing protocol landscape, complicates the task of anomaly detection systems. In fact, most network anomaly detection systems proposed so far employ knowledge-dependent techniques, using either signature-based misuse detection or anomaly detection relying on supervised learning. However, both approaches have major limitations: the former fails to detect and characterize unknown anomalies (leaving the network unprotected for long periods), and the latter requires training on labeled normal traffic, a difficult and expensive stage that needs to be repeated regularly to follow the evolution of network traffic. Such limitations impose a serious bottleneck on the problem described above.
We introduce an unsupervised approach to detect and characterize network anomalies without relying on signatures, statistical training, or labeled traffic, which represents a significant step towards the autonomy of networks. Unsupervised detection is accomplished by means of robust data-clustering techniques, combining Sub-Space clustering with Evidence Accumulation or Inter-Clustering Results Association, to blindly identify anomalies in traffic flows. The results of several unsupervised detections are also correlated to improve detection robustness. The correlation results are further used, along with other anomaly characteristics, to build an anomaly hierarchy in terms of dangerousness. Characterization is then achieved by building efficient filtering rules to describe a detected anomaly. The detection and characterization performance and the sensitivity to parameters are evaluated over a substantial subset of the MAWI repository, which contains real network traffic traces.
Our work shows that unsupervised learning techniques allow anomaly detection systems to isolate anomalous traffic without any previous knowledge. We think that this contribution constitutes a great step towards autonomous network anomaly detection.
This PhD thesis has been funded through the ECODE project by the European Commission under the Framework Programme 7. The goal of this project is to develop, implement, and validate experimentally a cognitive routing system that meets the challenges experienced by the Internet in terms of manageability and security, availability and accountability, as well as routing system scalability and quality. The concerned use case inside the ECODE project is network anomaly
Joshi, Vineet. "Unsupervised Anomaly Detection in Numerical Datasets." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1427799744.
Di, Felice Marco. "Unsupervised anomaly detection in HPC systems." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019.
Forstén, Andreas. "Unsupervised Anomaly Detection in Receipt Data." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-215161.
With the advances made in data handling and computing power comes the possibility of automating tasks that are not necessarily performed by humans. This study was carried out in collaboration with a company that digitizes companies' receipts. We investigate the possibility of automating the search for anomalous receipt data, which could relieve auditors. We study both anomalous user behaviors and individual receipts. The results indicate that automation is possible, which could reduce the need for human inspection of receipts.
Cheng, Leon. "Unsupervised topic discovery by anomaly detection." Thesis, Monterey, California: Naval Postgraduate School, 2013. http://hdl.handle.net/10945/37599.
With the vast amount of information and public comment available online, it is of increasing interest to understand what is being said and what topics are trending online. Government agencies, for example, want to know what policies concern the public without having to look through thousands of comments manually. Topic detection provides automatic identification of topics in documents based on the information content and enhances many natural language processing tasks, including text summarization and information retrieval. Unsupervised topic detection, however, has always been a difficult task. Methods such as Latent Dirichlet Allocation (LDA) convert documents from word space into document space (weighted sums over topic space), but do not perform any form of classification, nor do they address the relation of generated topics with actual human-level topics. In this thesis we attempt a novel way of unsupervised topic detection and classification by performing LDA and then clustering. We propose variations to the popular K-Means clustering algorithm to optimize the choice of centroids, and we perform experiments using Facebook data and the New York Times (NYT) corpus. Although the results were poor for the Facebook data, our method performed acceptably with the NYT data. The new clustering algorithms also performed slightly and consistently better than the normal K-Means algorithm.
Putina, Andrian. "Unsupervised anomaly detection : methods and applications." Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAT012.
An anomaly (also known as an outlier) is an instance that significantly deviates from the rest of the input data, defined by Hawkins as 'an observation, which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism'. Anomaly detection (also known as outlier or novelty detection) is thus the field of machine learning and data mining whose purpose is to identify those instances whose features appear to be inconsistent with the remainder of the dataset. In many applications, correctly distinguishing the set of anomalous data points (outliers) from the set of normal ones (inliers) proves to be very important. A first application is data cleaning, i.e., identifying noisy and erroneous measurements in a dataset before applying learning algorithms. However, with the explosive growth of the volume of data that can be collected from various sources, e.g., card transactions, internet connections, and temperature measurements, anomaly detection becomes a crucial stand-alone task for the continuous monitoring of systems. In this context, anomaly detection can be used to detect ongoing intrusion attacks, faulty sensor networks, or cancerous masses.
The thesis first proposes a batch tree-based approach for unsupervised anomaly detection, called 'Random Histogram Forest (RHF)'. The algorithm solves the curse-of-dimensionality problem by using the fourth central moment (a.k.a. kurtosis) in the model construction, while boasting linear running time. A stream-based anomaly detection engine called 'ODS', which leverages DenStream, an unsupervised clustering technique, is presented next. Finally, an automated anomaly detection engine, which alleviates the human effort required when dealing with several algorithms and hyper-parameters, is presented as the last contribution.
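The abstract mentions that RHF uses the fourth central moment (kurtosis) during model construction. As a minimal sketch of that statistic only, and not of the RHF algorithm itself, the Python below computes a per-feature kurtosis. The helper name `kurtosis` and the toy data are hypothetical, and how RHF actually uses such scores is not stated in the abstract.

```python
from statistics import fmean

def kurtosis(values):
    """Population kurtosis: fourth central moment divided by squared variance."""
    mean = fmean(values)
    m2 = fmean((v - mean) ** 2 for v in values)  # second central moment (variance)
    m4 = fmean((v - mean) ** 4 for v in values)  # fourth central moment
    if m2 == 0:
        return 0.0  # constant feature: treated as 0 in this sketch (an assumption)
    return m4 / (m2 ** 2)

# Toy data (hypothetical): feature "a" is evenly spread,
# feature "b" is tightly grouped except for one extreme value.
features = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "b": [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 9.0],
}
scores = {name: kurtosis(column) for name, column in features.items()}
```

A feature whose mass sits in the tails, such as "b" here, receives a higher kurtosis than an evenly spread one, which is why the statistic is a plausible signal for where outliers may be hiding.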
Audibert, Julien. "Unsupervised anomaly detection in time-series." Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS358.
Anomaly detection in multivariate time series is a major issue in many fields. The increasing complexity of systems and the explosion of the amount of data have made its automation indispensable. This thesis proposes an unsupervised method for anomaly detection in multivariate time series called USAD. However, deep neural network methods suffer from a limitation in their ability to extract features from the data since they only rely on local information. To improve the performance of these methods, this thesis presents a feature engineering strategy that introduces non-local information. Finally, this thesis proposes a comparison of sixteen time series anomaly detection methods to understand whether the explosion in complexity of neural network methods proposed in the current literature is really necessary.
Dani, Mohamed Cherif. "Unsupervised anomaly detection for aircraft health monitoring system." Thesis, Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCB258.
The limits of technical and fundamental knowledge are a daily challenge for industry. Updating this knowledge is important for a competitive industry and for the reliability and maintainability of its systems. Thanks to these machines and systems, the growth of data in both volume and generation frequency is a real phenomenon. Within Airbus, for example, thanks to thousands of sensors, aircraft generate hundreds of megabytes of data per flight. These data are today exploited on the ground to improve safety and health monitoring through failure, incident, and change detection. In theory, these changes, incidents, and failures are known as anomalies. An anomaly is a deviation from the normal behavior of the data; others define it as behavior that does not conform to normal behavior. Whatever the definition, the anomaly detection process is very important for the proper functioning of the aircraft. Currently, anomaly detection is provided by several pieces of health monitoring equipment. One of these is the Aircraft Health Monitoring System (ACMS), which continuously records the data of each sensor and also monitors these sensors to detect anomalies and incidents using triggers and predefined conditions (the exceedance approach). These predefined conditions are programmed by airlines and system designers according to prior knowledge (physical, mechanical, etc.). However, several constraints limit the ACMS's anomaly detection potential. One example is the limitation of expert knowledge, a classic problem in many domains, since the triggers are designed only for the targeted anomalies. Moreover, the triggers do not cover all system conditions. In other words, if a new behavior (a new condition) appears in a sensor after a maintenance action, a part change, etc., the predefined conditions will not detect anything and may in many cases generate false alarms.
Another constraint is that the triggers (predefined conditions) are static: they are unable to adapt their properties to each new condition. Further limitations are discussed in later chapters. The principal objective of this thesis is to detect anomalies and changes in the ACMS data in order to improve the health monitoring function of the ACMS. The work consists of two main stages. The first is univariate anomaly detection, where we use unsupervised learning to process each sensor independently, since we have no prior knowledge of the system and no documentation or labeled classes are available. The second stage is multivariate anomaly detection, based on density clustering, where the objective is to filter the anomalies detected in the first stage (false alarms) and to detect suspicious behaviors (groups of anomalies). The anomalies detected in both the univariate and multivariate stages can be potential triggers or can be used to update the existing triggers. We also propose a generic concept of anomaly detection based on univariate and multivariate anomaly detection and, finally, a new concept for validating anomalies within Airbus.
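The abstract contrasts static, predefined exceedance triggers with unsupervised univariate detection. A minimal sketch of that contrast follows, assuming a fixed upper limit for the trigger and a median/MAD robust score as the unsupervised stand-in. The abstract does not name the univariate method the thesis uses, so the functions, threshold, and readings below are all hypothetical.

```python
from statistics import median

def exceedance_trigger(series, upper_limit):
    """Static trigger: flag indices whose value exceeds a predefined limit."""
    return [i for i, v in enumerate(series) if v > upper_limit]

def mad_outliers(series, threshold=3.5):
    """Flag indices whose robust z-score (median/MAD based) exceeds threshold."""
    med = median(series)
    mad = median(abs(v - med) for v in series)  # median absolute deviation
    if mad == 0:
        return []  # no spread: nothing can be scored in this sketch
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return [i for i, v in enumerate(series)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical sensor readings: level near 50 with an abnormal dip at index 5.
readings = [50.1, 49.8, 50.3, 50.0, 49.9, 42.0, 50.2, 50.1]
static_hits = exceedance_trigger(readings, upper_limit=60.0)  # misses the dip
learned_hits = mad_outliers(readings)                         # flags index 5
```

The trigger only fires on the condition it was designed for (a value above the limit), whereas the data-driven score flags the dip without any predefined condition, which is the gap the abstract describes.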
Dani, Mohamed Cherif. "Unsupervised anomaly detection for aircraft health monitoring system." Electronic Thesis or Diss., Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCB258.
Sarossy, George. "Anomaly detection in Network data with unsupervised learning methods." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-55096.
Lindgren, Erik, and Niklas Allard. "Exploring unsupervised anomaly detection in Bill of Materials structures." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160262.
Vendramin, Nicoló. "Unsupervised Anomaly Detection on Multi-Process Event Time Series." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254885.
Establishing whether observed data are anomalous or not is an important task that has been widely investigated in the literature, and the problem becomes even more complex when combined with high-dimensional representations and multiple sources that independently generate the patterns to be analyzed. The work presented in this thesis employs a data-driven pipeline for the definition of a recurrent auto-encoder architecture to analyze, in an unsupervised fashion, high-dimensional event time series generated by multiple and variable processes interacting with a system. In light of the above problem, the work investigates whether or not it is possible to use a single model to analyze patterns produced by different sources. The analysis of log files that record events of interaction between users and the radio network infrastructure is used as a case study for the given problem. The investigation aims to verify the performance of a single machine learning model applied to the learning of multiple patterns developed over time by different sources. The work proposes a pipeline for dealing with the complex representation of the data sources and for the definition and tuning of the anomaly detection model, which is not based on domain-specific knowledge and can therefore be adapted to different problem settings. The model has been implemented in four different variants that have been evaluated on both normal and anomalous data, gathered partly from real network cells and partly from the simulation of anomalous behaviors. The empirical results show the applicability of the model for the detection of anomalous sequences and events in the proposed framework, with F1 scores above 80%, varying depending on the specific threshold setting. In addition, their deeper interpretation gives insights into the differences between the variants of the model and thus into their limitations and strengths.
Granlund, Oskar. "Unsupervised anomaly detection on log-based time series data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-265534.
Because the number of connected devices has steadily increased and the demands on the availability, authenticity, and integrity of applications are high, this thesis focuses on unsupervised anomaly detection in data centers. It evaluates how suitable open, state-of-the-art anomaly detection methods are for finding anomalous patterns and trends in log-based data streams. The methods used in this project are Principal Component Analysis (PCA), LogCluster, and Hierarchical Temporal Memory. They are evaluated by F-score on a dataset from an Apache access log taken from a production environment. The data was selected to represent a normal state in which few or no abnormal events occurred. 0.5% of the data points were transformed into anomalies, based on the average occurrence of each log sequence matching a certain pattern. Principal Component Analysis showed the best results, with an F-score ranging from 0.4 to 0.56. LogCluster came second, while the two methods based on Hierarchical Temporal Memory did not perform well at all. The results showed that PCA can find about 50% of the injected anomalies, which can be used to improve the confidentiality, availability, and integrity of applications.
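The evaluation described in this abstract scores detectors by F-score against injected anomalies. As a rough illustration of how such a score is computed from two sets of indices, here is a minimal Python sketch. The function name and the index sets are hypothetical and are not taken from the thesis.

```python
def f_score(injected, detected, beta=1.0):
    """F-beta score of detected anomaly indices against injected ones."""
    injected, detected = set(injected), set(detected)
    true_positives = len(injected & detected)
    if true_positives == 0:
        return 0.0  # no correct detections: precision and recall are both 0
    precision = true_positives / len(detected)  # share of alarms that are real
    recall = true_positives / len(injected)     # share of anomalies that were found
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical run: 4 injected anomalies, 3 alarms raised, 2 of them correct.
injected_idx = [3, 17, 42, 88]
detected_idx = [3, 42, 50]
f1 = f_score(injected_idx, detected_idx)  # precision 2/3, recall 1/2
```

In this toy run the detector finds half of the injected anomalies (recall 0.5) and the F1 score comes out at 4/7, roughly 0.57, which shows how a recall near 50% can sit alongside an F-score in the range the abstract reports.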
Leto, Kevin. "Anomaly detection in HPC systems." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019.
Beach, David J. "Anomaly Detection with Advanced Nonlinear Dimensionality Reduction." Digital WPI, 2020. https://digitalcommons.wpi.edu/etd-theses/1378.
Fröjdholm, Hampus. "Learning from 3D generated synthetic data for unsupervised anomaly detection." Thesis, Uppsala universitet, Avdelningen för visuell information och interaktion, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-443243.
Azmoudeh, Fard Simon. "Anomaly Detection in Networks using Autoencoder and Unsupervised Learning Methods." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-55097.
Wu, Xinheng. "A Deep Unsupervised Anomaly Detection Model for Automated Tumor Segmentation." Thesis, The University of Sydney, 2020. https://hdl.handle.net/2123/22502.
Larsson, Frans. "Algorithmic trading surveillance : Identifying deviating behavior with unsupervised anomaly detection." Thesis, Uppsala universitet, Matematiska institutionen, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-389941.
Haddad, Josef, and Carl Piehl. "Unsupervised anomaly detection in time series with recurrent neural networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-259655.
Artificial neural networks (ANN) have been applied to many problems. However, most ANN models do not attempt to mimic the brain in detail. One example of an ANN that is restricted to mimicking the brain is Hierarchical Temporal Memory (HTM). This study applies HTM and Long Short-Term Memory (LSTM) to anomaly detection problems in time series in order to investigate their strengths and weaknesses for this problem. The anomalies in this study are limited to point anomalies, and the time series are univariate. Pre-existing implementations that use these networks for unsupervised anomaly detection in time series are used in this study. We primarily use our own synthetic time series to examine how the networks handle noise and how they handle different properties that a time series can have. Our results show that both networks can handle noise, and the performance difference in noise robustness was not large enough to distinguish the models. LSTM performed better than HTM at detecting point anomalies in our synthetic time series that follow a sine curve, but which network performs best overall remains undecided.
Sreenivasulu, Ajay. "Evaluation of cluster based Anomaly detection." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18053.
Fockstedt, Jonas, and Ema Krcic. "Unsupervised anomaly detection for structured data - Finding similarities between retail products." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-44756.
Renström, Martin, and Timothy Holmsten. "Fraud Detection on Unlabeled Data with Unsupervised Machine Learning." Thesis, KTH, Hälsoinformatik och logistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230592.
A common problem with user interactions in a system is the risk of fraud. For a system that handles datasets of credit card transactions, an example could be a person using another's identity for card purchases; in systems that handle advertising, it could be automated software simulating interactions. These attacks are often disguised as normal interactions and can therefore be hard to detect. In datasets without correctly labeled data, it is especially difficult to develop an algorithm that can distinguish whether an interaction is anomalous or not. This thesis explores the topic of detecting anomalies in datasets without specific data indicating fraud. Three prototypes of neural networks were used in this study; they were trained and evaluated on two datasets that contained both fraudulent and non-fraudulent data. The first prototype, which served as a baseline, was a simple three-layer autoencoder; the second was a novel autoencoder named the stacked autoencoder; and the third was a variational autoencoder. In this study, the proposed stacked autoencoder gave the best results for recall, accuracy, and NPV in the tests designed to resemble a real-world scenario.
Merrill, Nicholas Swede. "Modified Kernel Principal Component Analysis and Autoencoder Approaches to Unsupervised Anomaly Detection." Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/98659.
Master of Science
Anomaly detection is the task of identifying examples that differ from the normal or expected pattern. The challenge of unsupervised anomaly detection is distinguishing normal and anomalous data without the use of labeled examples to demonstrate their differences. This thesis addresses shortcomings in two anomaly detection algorithms, Kernel Principal Component Analysis (KPCA) and Autoencoders (AE), and proposes new solutions to apply them in the unsupervised setting. Ultimately, the two modified methods, Unsupervised Ensemble KPCA (UE-KPCA) and the Modified Training and Scoring AE (MTS-AE), demonstrate improved detection performance and reliability compared to many baseline algorithms across a number of benchmark datasets.
Tjhai, Gina C. "Anomaly-based correlation of IDS alarms." Thesis, University of Plymouth, 2011. http://hdl.handle.net/10026.1/308.
Vidmark, Anton. "CONSTRUCTING AND VARYING DATA MODELS FOR UNSUPERVISED ANOMALY DETECTION ON LOG DATA: Data modelling and domain knowledge’s impact on anomaly detection and explainability." Thesis, Umeå universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-163544.
Alaverdyan, Zaruhi. "Unsupervised representation learning for anomaly detection on neuroimaging. Application to epilepsy lesion detection on brain MRI." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSEI005/document.
This work represents one attempt to develop a computer-aided diagnosis (CAD) system for epilepsy lesion detection based on neuroimaging data, in particular T1-weighted and FLAIR MR sequences. Given the complexity of the task and the lack of a representative voxel-level labeled data set, the adopted approach, first introduced in Azami et al., 2016, consists in casting the lesion detection task as a per-voxel outlier detection problem. The system is based on training a one-class SVM model for each voxel in the brain on a set of healthy controls, so as to model the normality of the voxel. The main focus of this work is to design representation learning mechanisms that capture the most discriminant information from multimodality imaging. Manual features, designed to mimic the characteristics of certain epilepsy lesions, such as focal cortical dysplasia (FCD), on neuroimaging data, are tailored to individual pathologies and cannot discriminate a large range of epilepsy lesions. Such features reflect the known characteristics of lesion appearance; however, they might not be the optimal ones for the task at hand. Our first contribution consists in proposing various unsupervised neural architectures as potential feature-extracting mechanisms and, ultimately, introducing a novel configuration of siamese networks to be plugged into the outlier detection context. The proposed system, evaluated on a set of T1-weighted MRIs of epilepsy patients, showed promising performance but also room for improvement. To this end, we considered extending the CAD system to accommodate multimodality data, which offers complementary information on the problem at hand. Our second contribution, therefore, consists in proposing strategies to combine representations of different imaging modalities into a single framework for anomaly detection. The extended system showed a significant improvement on the task of epilepsy lesion detection on T1-weighted and FLAIR MR images.
Our last contribution focuses on the integration of PET data into the system. Given the small number of available PET images, we attempt to synthesize PET data from the corresponding MRI acquisitions. Finally, we show an improved performance of the system when trained on the mixture of synthesized and real images.
Lindroth, Henriksson Amelia. "Unsupervised Anomaly Detection on Time Series Data: An Implementation on Electricity Consumption Series." Thesis, KTH, Matematisk statistik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301731.
The digitalization of the electricity industry, the introduction of smart grids, and increased regulation of electricity metering have resulted in large amounts of electricity data. This data creates a unique opportunity to analyze and understand the electricity consumption of buildings in order to make it more efficient. An important initial step in the analysis of this data is to identify possible anomalies. In this thesis, four different machine learning methods for detecting anomalies in electricity consumption series are tested: density-based spatial clustering of applications with noise (DBSCAN), local outlier factor (LOF), isolation forest (iForest), and one-class support vector machine (OC-SVM). To evaluate the methods, synthetic anomalies were introduced into the electricity consumption series, and the four methods were then evaluated for two anomaly types: point anomalies and collective anomalies. In addition to the electricity consumption data, variables describing previous electricity consumption, outdoor temperature, and time characteristics were also included in the models. The results indicate that adding the temperature variable and the lag variables generally impaired the models' performance, while introducing the time variables improved it. Of the four methods, OC-SVM proved best at detecting point anomalies, while LOF was best at detecting collective anomalies. In an attempt to improve the models' detection ability, the same experiments were carried out after the electricity consumption series had been detrended and deseasonalized. The models did not perform better on the adjusted series than on the unadjusted ones.
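Of the four methods named in this abstract, the local outlier factor (LOF) is compact enough to sketch directly. The Python below is a simplified, standard-library version for illustration only: it takes exactly k neighbours per point (ties at the k-distance are not expanded, unlike the full definition) and assumes no more than k duplicate points. The toy 2-D points are hypothetical stand-ins for feature pairs and are not data from the thesis.

```python
import math

def lof_scores(points, k=3):
    """Simplified local outlier factor; scores well above 1 suggest outliers."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_distance = [], []
    for i in range(n):
        # Indices of the other points, nearest first
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])  # distance to the k-th neighbour

    # Local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average neighbour density relative to the point's own density
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical data: five points in a tight group and one far away (index 5).
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (1.2, 1.0), (5.0, 5.0)]
scores = lof_scores(points, k=3)
most_anomalous = scores.index(max(scores))
```

Points inside the group get scores close to 1 because their density matches that of their neighbours, while the isolated point scores far higher. A production analysis would more likely rely on an established library implementation than on this sketch.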
Jernbäcker, Carl. "Unsupervised real-time anomaly detection on streaming data for large-scale application deployments." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-262681.
Anomaly detection is the classification of data points that do not follow the known pattern; previous studies required extensive human interaction, with either labeling or sorting of normal and abnormal data. In this thesis, we want to go one step further and apply machine learning techniques to time series data to gain a deeper understanding of the properties of a given data point without any sorting or labeling. This thesis presents a method that can successfully find anomalies in both real and synthetic datasets. The method uses a combination of three algorithms from different disciplines: Hierarchical Temporal Memory and Restricted Boltzmann Machines from machine learning, and Autoregressive Integrated Moving Average from regression. Each algorithm specializes in finding a certain type of anomaly. The combination finds all anomalies with little or no delay between the occurrence of an anomaly and its detection.
Bracci, Lorenzo, and Amirhossein Namazi. "EVALUATION OF UNSUPERVISED MACHINE LEARNING MODELS FOR ANOMALY DETECTION IN TIME SERIES SENSOR DATA." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-299734.
With the development of the Internet of Things and the digitalization of society, time-series data can be recorded in more and more places, for example through proximity sensors on cars, temperature sensors in manufacturing plants, and motion sensors in smart homes. Society's ever-increasing dependence on these devices creates a need to detect unusual behavior, which may be caused by a sensor malfunction or may indicate an unusual event. Such unusual behavior is commonly called an anomaly. Detecting anomalous behavior relies on advanced techniques that combine mathematics and computer science, often called machine learning. To help machines learn valuable patterns, human supervision is often needed, which in this case would correspond to recordings that a person has already classified as anomalies or normal points. Unfortunately, labeling data is time-consuming, especially for the large datasets produced by sensor recordings. This thesis therefore evaluates techniques that require no supervision to perform anomaly detection. Several machine learning models are trained on different datasets to better understand which techniques work better under different requirements, e.g. a smaller dataset or stricter requirements on inference time. Of the evaluated models, OCSVM achieved the best overall performance, with an accuracy of 85%, and K-means was the fastest model, with an inference time of 0.04 milliseconds. In addition, LSTM-based models showed the greatest potential for improvement with larger datasets.
Mathur, Nitin O. "Application of Autoencoder Ensembles in Anomaly and Intrusion Detection using Time-Based Analysis." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin161374876195402.
Hanna, Peter, and Erik Swartling. "Anomaly Detection in Time Series Data using Unsupervised Machine Learning Methods: A Clustering-Based Approach." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273630.
For many companies in the manufacturing industry, finding faults in products is a fundamental task in the production process. Since machine learning methods have proven to offer useful techniques for finding faults in products, they are a popular choice among companies that want to further improve the production process. In some industries, fault detection is closely tied to anomaly detection in various measurements. The aim of this thesis is to construct unsupervised machine learning models for identifying anomalies in time-series data. More specifically, the data consists of high-frequency measurements of pumps via current and voltage readings. The measurements comprise five phases: the startup phase, three load phases, and the shutdown phase. The machine learning methods are based on clustering techniques; the methods used are the DBSCAN and LOF algorithms. Various dimensionality reduction techniques were also applied, and after constructing five models, one for each phase, it can be concluded that the models succeeded in identifying anomalies in the given dataset.
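The idea of fitting one clustering model per operating phase and treating DBSCAN noise points as anomalies can be sketched as follows. This is only an illustration under stated assumptions: the phase names, the two-dimensional feature tuples, and the values of `eps` and `min_pts` are hypothetical, and the thesis's own features and dimensionality reduction steps are not reproduced.

```python
import math
from collections import deque


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one label per point, with -1 marking noise."""
    n = len(points)
    labels = [None] * n  # None = not yet visited

    def region(i):
        # All points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = region(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical, already scaled features per phase (e.g. two summary statistics).
grid = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
phases = {
    "startup": grid + [(3.0, 3.0)],  # one measurement far from the rest
    "shutdown": grid,
}

# One model per phase; noise points are reported as anomalies.
anomalies = {
    phase: [i for i, label in enumerate(dbscan(pts, eps=0.25, min_pts=3)) if label == -1]
    for phase, pts in phases.items()
}
```

Running a separate model per phase keeps each phase's normal operating region compact, so a reading that is ordinary during a load phase can still stand out during startup.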
Sivaramakrishnan, Jayaram. "Unsupervised probabilistic and kernel regression methods for anomaly detection and parameter margin prediction of industrial design." PhD thesis, Murdoch University, 2021. https://researchrepository.murdoch.edu.au/id/eprint/62536/.
Minarini, Francesco. "Anomaly detection prototype for log-based predictive maintenance at INFN-CNAF tier-1." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/19304/.
Labonne, Maxime. "Anomaly-based network intrusion detection using machine learning." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAS011.
In recent years, hacking has become an industry unto itself, increasing the number and diversity of cyber attacks. Threats on computer networks range from malware to denial-of-service attacks, phishing and social engineering. An effective cyber security plan can no longer rely solely on antivirus software and firewalls to counter these threats: it must include several layers of defence. Network-based Intrusion Detection Systems (IDSs) are a complementary means of enhancing security, with the ability to monitor packets from OSI layer 2 (Data link) to layer 7 (Application). Intrusion detection techniques are traditionally divided into two categories: signature-based (or misuse) detection and anomaly detection. Most IDSs in use today rely on signature-based detection; however, they can only detect known attacks. IDSs using anomaly detection are able to detect unknown attacks, but are unfortunately less accurate, which generates a large number of false alarms. In this context, the creation of precise anomaly-based IDSs is of great value for identifying attacks that are still unknown. In this thesis, machine learning models are studied to create IDSs that can be deployed in real computer networks. Firstly, a three-step optimization method is proposed to improve the quality of detection: 1/ data augmentation to rebalance the dataset, 2/ parameter optimization to improve the model performance and 3/ ensemble learning to combine the results of the best models. Flows detected as attacks can be analyzed to generate signatures to feed signature-based IDS databases. However, this method has the disadvantage of requiring labelled datasets, which are rarely available in real-life situations. Transfer learning is therefore studied in order to train machine learning models on large labelled datasets, then fine-tune them on benign traffic of the network to be monitored.
This method also has flaws, since the models learn from already known attacks and therefore do not actually perform anomaly detection. Thus, a new solution based on unsupervised learning is proposed. It uses network protocol header analysis to model normal traffic behavior. Detected anomalies are then aggregated into attacks or ignored when isolated. Finally, the detection of network congestion is studied. The bandwidth utilization between different links is predicted in order to correct issues before they occur.
Pierrau, Magnus. "Evaluating Unsupervised Methods for Out-of-Distribution Detection on Semantically Similar Image Data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302583.
The term "out-of-distribution detection" (OOD detection) refers to methods used to detect data that deviates from the underlying data distribution used to train a machine learning model. This is an important topic, since artificial neural networks have previously been shown to be prone to producing arbitrarily confident predictions, even on data that deviates from the underlying training distribution. Previous work has produced many well-performing OOD detection methods, but these have often been evaluated on data that is semantically dissimilar to the training data, and therefore do not necessarily reflect the methods' ability under more challenging conditions. This work evaluates and compares six unsupervised OOD detection methods under challenging conditions, in the form of classification of semantically similar image data using deep neural networks. It shows that the results of all methods vary markedly across datasets and that no single model is consistently superior to the others. The work finds promising results for a method that uses deep neural network ensembles, but overall all models perform worse than reported in previous work, where less challenging data was used to evaluate the methods.
Kommineni, Sri Sai Manoj, and Akhila Dindi. "Automating Log Analysis." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-21175.
ABUKMEIL, MOHANAD. "UNSUPERVISED GENERATIVE MODELS FOR DATA ANALYSIS AND EXPLAINABLE ARTIFICIAL INTELLIGENCE." Doctoral thesis, Università degli Studi di Milano, 2022. http://hdl.handle.net/2434/889159.
Avdic, Adnan, and Albin Ekholm. "Anomaly Detection in an e-Transaction System using Data Driven Machine Learning Models : An unsupervised learning approach in time-series data." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-18421.
Baur, Christoph [Verfasser], Nassir [Akademischer Betreuer] Navab, Nassir [Gutachter] Navab, and Ben [Gutachter] Glocker. "Anomaly Detection in Brain MRI: From Supervised to Unsupervised Deep Learning / Christoph Baur ; Gutachter: Nassir Navab, Ben Glocker ; Betreuer: Nassir Navab." München : Universitätsbibliothek der TU München, 2021. http://d-nb.info/1236343115/34.
Manovi, Livia. "Machine Learning Unsupervised Methods in the Design of an On-board Health Monitoring System for Satellite Applications." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.
Dalvi, Aditi. "Performance of One-class Support Vector Machine (SVM) in Detection of Anomalies in the Bridge Data." University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin150478019017791.
Ait, Saada Mira. "Unsupervised learning from textual data with neural text representations." Electronic Thesis or Diss., Université Paris Cité, 2023. http://www.theses.fr/2023UNIP7122.
The digital era generates enormous amounts of unstructured data such as images and documents, requiring specific processing methods to extract value from them. Textual data presents an additional challenge as it does not contain numerical values. Word embeddings are techniques that transform text into numerical data, enabling machine learning algorithms to process them. Unsupervised tasks are a major challenge in the industry as they allow value creation from large amounts of data without requiring costly manual labeling. In this thesis, we explore the use of Transformer models for unsupervised tasks such as clustering, anomaly detection, and data visualization. We also propose methodologies to better exploit multi-layer Transformer models in an unsupervised context to improve the quality and robustness of document clustering while avoiding the choice of which layer to use and the number of classes. Additionally, we investigate Transformer language models and their application to clustering in more depth, examining in particular transfer learning methods that involve fine-tuning pre-trained models on a different task to improve their quality for future tasks. We demonstrate through an empirical study that post-processing methods based on dimensionality reduction are more advantageous than fine-tuning strategies proposed in the literature. Finally, we propose a framework for detecting text anomalies in French adapted to two cases: one where the data concerns a specific topic and the other where the data has multiple sub-topics. In both cases, we obtain results superior to the state of the art with significantly lower computation time.
Formato, Lorenzo. "IDENTIFICAZIONE DI GUASTI TRAMITE ALGORITMI DI CLASSIFICAZIONE & CLUSTERING per applicazioni di Manutenzione Predittiva in Scenari di Industria 4.0." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/23028/.
Hamid, Muhammad Raffay. "A computational framework for unsupervised analysis of everyday human activities." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/24765.
Committee Chair: Aaron Bobick; Committee Member: Charles Isbell; Committee Member: David Hogg; Committee Member: Irfan Essa; Committee Member: James Rehg
Boniol, Paul. "Detection of anomalies and identification of their precursors in large data series collections." Electronic Thesis or Diss., Université Paris Cité, 2021. http://www.theses.fr/2021UNIP5206.
Extensive collections of data series are becoming a reality in a large number of scientific and social domains. There is, therefore, a growing interest in and need for efficient techniques to analyze and process these data, in fields such as finance, environmental sciences, astrophysics, neurosciences, and engineering. Informally, a data series is an ordered sequence of points or values. Once these series are collected and available, users often need to query them. These queries can be simple, such as the selection of a time interval, but also complex, such as similarity search or the detection of anomalies, which are often synonymous with a malfunction of the system under study or with a sudden, unusual and likely undesired evolution. This last type of analysis represents a crucial problem for applications in a wide range of domains, all sharing the same objective: to detect anomalies as soon as possible to avoid critical events. Therefore, in this thesis, we address the following three objectives: (i) retrospective unsupervised subsequence anomaly detection in data series; (ii) unsupervised detection of anomalies in data streams; (iii) classification explanation of known anomalies in data series in order to identify possible precursors. This manuscript first presents the industrial context that motivated this thesis, fundamental definitions, a taxonomy of data series, and state-of-the-art anomaly detection methods. We then present our contributions along the three axes mentioned above. First, we describe two original solutions, NormA (which aims to build a weighted set of subsequences that represent the different behaviors of the data series) and Series2Graph (which transforms the data series into a directed graph), for the task of unsupervised detection of anomalous subsequences in static data series. Secondly, we present the SAND method (inspired by NormA) for unsupervised detection of anomalous subsequences in data streams.
Thirdly, we address the problem of the supervised identification of precursors. We subdivide this task into two generic problems: the supervised classification of time series and the explanation of this classification's results by identifying discriminative subsequences. Finally, we illustrate the applicability and interest of our developments through an application concerning the identification of undesirable vibration precursors occurring in water supply pumps in the French nuclear power plants of EDF.
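To make the notion of a subsequence anomaly concrete, here is a minimal sketch of a generic baseline: each sliding window is scored by the distance to its nearest non-overlapping window (a discord-style score). This is explicitly not NormA, Series2Graph or SAND, whose details are not given in the abstract; the series, the window length `m` and the function name are assumptions made for illustration.

```python
import math


def subsequence_scores(series, m):
    """Score each length-m window by the distance to its nearest non-overlapping window."""
    windows = [series[i:i + m] for i in range(len(series) - m + 1)]
    scores = []
    for i, a in enumerate(windows):
        best = math.inf
        for j, b in enumerate(windows):
            if abs(i - j) < m:
                continue  # skip overlapping windows (trivial matches)
            best = min(best, math.dist(a, b))
        scores.append(best)
    return scores


# Hypothetical periodic series with one flattened stretch at positions 40..43.
series = [0, 1, 2, 3, 4, 3, 2, 1] * 10
series[40:44] = [4, 4, 4, 4]

scores = subsequence_scores(series, m=8)
peak = max(range(len(scores)), key=scores.__getitem__)
```

Windows that repeat elsewhere in the series score zero, while any window overlapping the altered stretch has no exact match and scores higher. The quadratic cost of this brute-force comparison is one reason scalable methods are needed for large collections.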
OLIVEIRA, Paulo César de. "Abordagem semi-supervisionada para detecção de módulos de software defeituosos." Universidade Federal de Pernambuco, 2015. https://repositorio.ufpe.br/handle/123456789/19990.
Made available in DSpace on 2017-07-24T12:11:04Z (GMT). Previous issue date: 2015-08-31
As market competition intensifies, high-quality applications are required to automate services. To ensure software quality, testing with the aim of finding failures early is essential in the development life cycle. The goal of software testing is to find failures that can be fixed and, consequently, to increase the quality of the software under development. As software grows, more tests are needed to prevent or find defects. However, the more tests are created and executed, the more human and infrastructure resources are required. Moreover, the time available for testing activities is usually insufficient, which allows defects to escape. Companies increasingly seek cheaper and more effective ways to detect software defects. In recent years, many researchers have sought mechanisms to predict software defects automatically. Machine learning techniques have been a research target as a way of finding defects in software modules. Many supervised approaches have been used for this purpose, but labeling software modules as defective or non-defective in order to train a classifier is very costly and can make the use of machine learning impractical. In this context, this work analyzes and compares unsupervised and semi-supervised approaches for detecting defective software modules. To this end, unsupervised (anomaly detection) methods and semi-supervised methods based on the AutoMLP and Naive Bayes classifiers were used. To evaluate and compare these methods, NASA datasets available in the PROMISE Software Engineering Repository were used.
Boussik, Amine. "Apprentissage profond non-supervisé : Application à la détection de situations anormales dans l’environnement du train autonome." Electronic Thesis or Diss., Valenciennes, Université Polytechnique Hauts-de-France, 2023. http://www.theses.fr/2023UPHF0040.
The thesis addresses the challenges of monitoring the environment and detecting anomalies, especially obstacles, for an autonomous freight train. Although rail transport has traditionally been under human supervision, autonomous trains offer potential advantages in terms of costs, time, and safety. However, their operation in complex environments poses significant safety concerns. Instead of a supervised approach that requires costly and limited annotated data, this research adopts an unsupervised technique, using unlabeled data to detect anomalies based on methods capable of identifying atypical behaviors. Two environmental surveillance models are presented: the first, based on a convolutional autoencoder (CAE), is dedicated to identifying obstacles on the main track; the second, an advanced version incorporating the vision transformer (ViT), focuses on overall environmental surveillance. Both employ unsupervised learning techniques for anomaly detection. The results show that the proposed method offers relevant insights for monitoring the environment of the autonomous freight train, with the potential to enhance its reliability and safety. The use of unsupervised techniques thus demonstrates their utility and relevance in an application context for the autonomous train.
Cherdo, Yann. "Détection d'anomalie non supervisée sur les séries temporelle à faible coût énergétique utilisant les SNNs." Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4018.
In the context of predictive maintenance at the car manufacturer Renault, this thesis aims to provide low-power solutions for unsupervised anomaly detection on time series. With the recent evolution of cars, more and more data are produced and need to be processed by machine learning algorithms. This processing can be performed in the cloud or directly at the edge, inside the car. In the latter case, network bandwidth, cloud service costs, data privacy management effort and data loss can be reduced. Embedding a machine learning model inside a car is challenging as it requires frugal models due to memory and processing constraints. To this end, we study the use of spiking neural networks (SNNs) for anomaly detection, prediction and classification on time series. The performance and energy costs of SNN models are evaluated in an edge scenario using generic hardware models that consider all calculation and memory costs. To leverage the sparsity of SNNs as much as possible, we propose a model with trainable sparse connections that consumes half the energy of its non-sparse version. This model is evaluated on public anomaly detection benchmarks, a real use case of anomaly detection from Renault Alpine cars, weather forecasts and the Google Speech Commands dataset. We also compare its performance with other existing SNN and non-spiking models. We conclude that, for some use cases, spiking models can provide state-of-the-art performance while consuming 2 to 8 times less energy. Yet, further studies should be undertaken to evaluate these models once embedded in a car. Inspired by neuroscience, we argue that other bio-inspired properties such as attention, sparsity, hierarchy or neural assembly dynamics could be exploited to get even better energy efficiency and performance with spiking models. Finally, we end this thesis with an essay dealing with cognitive neuroscience, philosophy and artificial intelligence.
Diving into conceptual difficulties linked to consciousness and considering the deterministic mechanisms of memory, we argue that consciousness and the self could be constitutively independent from memory. The aim of this essay is to question the nature of humans by contrast with that of machines and AI.
Lu, Wei. "Unsupervised anomaly detection framework for multiple-connection based network intrusions." Thesis, 2005. http://hdl.handle.net/1828/1949.