Journal articles on the topic 'Semi- and unsupervised learning'

Consult the top 50 journal articles for your research on the topic 'Semi- and unsupervised learning.'


1

Gao Huang, Shiji Song, Jatinder N. D. Gupta, and Cheng Wu. "Semi-Supervised and Unsupervised Extreme Learning Machines." IEEE Transactions on Cybernetics 44, no. 12 (December 2014): 2405–17. http://dx.doi.org/10.1109/tcyb.2014.2307349.

2

C A Padmanabha Reddy, Y., P. Viswanath, and B. Eswara Reddy. "Semi-supervised learning: a brief review." International Journal of Engineering & Technology 7, no. 1.8 (February 9, 2018): 81. http://dx.doi.org/10.14419/ijet.v7i1.8.9977.

Abstract:
Most application domains suffer from a lack of sufficient labeled data, whereas unlabeled data is available cheaply. Obtaining labeled instances is difficult because experienced domain experts are required to label the unlabeled data patterns. Semi-supervised learning addresses this problem and acts as a halfway point between supervised and unsupervised learning. This paper reviews a few semi-supervised learning (SSL) techniques, such as self-training, co-training, multi-view learning, and TSVM methods. Traditionally, SSL is classified into semi-supervised classification and semi-supervised clustering, which achieve better accuracy than traditional supervised and unsupervised learning techniques. The paper also addresses the issue of scalability and the applications of semi-supervised learning.
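As a concrete illustration of the self-training technique listed in this abstract, the following is a minimal, hedged sketch of a generic self-training loop. It is not taken from the cited paper; the array names (X_lab, y_lab, X_unlab), the confidence threshold, and the logistic-regression base learner are assumptions made for the example.

```python
# Illustrative self-training loop: train on labeled data, pseudo-label the most
# confident unlabeled points, and retrain. Assumes numeric arrays X_lab, y_lab,
# X_unlab (hypothetical names); any probabilistic classifier could replace the base learner.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, threshold=0.95, max_rounds=10):
    X_lab, y_lab, X_unlab = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    for _ in range(max_rounds):
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)
        pred = clf.classes_[proba.argmax(axis=1)]
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        # Promote high-confidence pseudo-labels into the labeled pool and retrain.
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, pred[confident]])
        X_unlab = X_unlab[~confident]
        clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    return clf
```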
3

Hui, Binyuan, Pengfei Zhu, and Qinghua Hu. "Collaborative Graph Convolutional Networks: Unsupervised Learning Meets Semi-Supervised Learning." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4215–22. http://dx.doi.org/10.1609/aaai.v34i04.5843.

Abstract:
Graph convolutional networks (GCNs) have achieved promising performance in attributed graph clustering and semi-supervised node classification because they are capable of modeling complex graph structure and jointly learning both the features and the relations of nodes. Inspired by the success of unsupervised learning in the training of deep models, we wonder whether graph-based unsupervised learning can collaboratively boost the performance of semi-supervised learning. In this paper, we propose a multi-task graph learning model, called collaborative graph convolutional networks (CGCN). CGCN is composed of an attributed graph clustering network and a semi-supervised node classification network. As Gaussian mixture models can effectively discover the inherent complex data distributions, a new end-to-end attributed graph clustering network is designed by combining a variational graph auto-encoder with Gaussian mixture models (GMM-VGAE) rather than the classic k-means. If the pseudo-label of an unlabeled sample assigned by GMM-VGAE is consistent with the prediction of the semi-supervised GCN, it is selected to further boost the performance of semi-supervised learning with the help of the pseudo-labels. Extensive experiments on benchmark graph datasets validate the superiority of our proposed GMM-VGAE over state-of-the-art attributed graph clustering networks. The performance of node classification is greatly improved by our proposed CGCN, which verifies that graph-based unsupervised learning can be well exploited to enhance the performance of semi-supervised learning.
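The pseudo-label selection rule described in this abstract (keep an unlabeled sample only when the clustering assignment and the classifier prediction agree) can be illustrated with a small, hedged sketch; the array names are placeholders and this is not the CGCN code.

```python
# Keep an unlabeled sample's pseudo-label only when the clustering model and the
# semi-supervised classifier agree on its class (illustrative of the selection rule).
import numpy as np

def select_agreeing_pseudo_labels(cluster_labels, classifier_preds):
    cluster_labels = np.asarray(cluster_labels)
    classifier_preds = np.asarray(classifier_preds)
    agree = cluster_labels == classifier_preds           # boolean mask of agreements
    return np.flatnonzero(agree), cluster_labels[agree]  # indices and accepted labels

# Example: samples 0, 2 and 3 agree and would be promoted as pseudo-labels.
idx, labels = select_agreeing_pseudo_labels([0, 1, 2, 1], [0, 2, 2, 1])
```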
4

Guo, Wenbin, and Juan Zhang. "Semi-supervised learning for raindrop removal on a single image." Journal of Intelligent & Fuzzy Systems 42, no. 4 (March 4, 2022): 4041–49. http://dx.doi.org/10.3233/jifs-212342.

Abstract:
This article proposes a network that is mainly used to process a single image polluted by raindrops in rainy weather in order to obtain a clean image without raindrops. Most existing methods rely on paired images, that is, the rainy image and the real rain-free image of the same scene. However, in many cases, paired images are difficult to obtain, which makes it impossible to apply a raindrop-removal network in many scenarios. Therefore, this article proposes a semi-supervised rain-removal network that applies to unpaired images. The model contains two parts: a supervised network and an unsupervised network. After the model is trained, the unsupervised network does not require paired images and can produce a clean image without raindrops. In particular, our network can be trained on both paired and unpaired samples. The experimental results show that the best results are achieved not only with the supervised rain-removal network but also with the unsupervised rain-removal network.
5

Niu, Gang, Bo Dai, Makoto Yamada, and Masashi Sugiyama. "Information-Theoretic Semi-Supervised Metric Learning via Entropy Regularization." Neural Computation 26, no. 8 (August 2014): 1717–62. http://dx.doi.org/10.1162/neco_a_00614.

Abstract:
We propose a general information-theoretic approach to semi-supervised metric learning called SERAPH (SEmi-supervised metRic leArning Paradigm with Hypersparsity) that does not rely on the manifold assumption. Given the probability parameterized by a Mahalanobis distance, we maximize its entropy on labeled data and minimize its entropy on unlabeled data following entropy regularization. For metric learning, entropy regularization improves manifold regularization by considering the dissimilarity information of unlabeled data in the unsupervised part, and hence it allows the supervised and unsupervised parts to be integrated in a natural and meaningful way. Moreover, we regularize SERAPH by trace-norm regularization to encourage low-dimensional projections associated with the distance metric. The nonconvex optimization problem of SERAPH could be solved efficiently and stably by either a gradient projection algorithm or an EM-like iterative algorithm whose M-step is convex. Experiments demonstrate that SERAPH compares favorably with many well-known metric learning methods, and the learned Mahalanobis distance possesses high discriminability even under noisy environments.
6

Zhang, Ziji, Peng Zhang, Peineng Wang, Jawaad Sheriff, Danny Bluestein, and Yuefan Deng. "Rapid analysis of streaming platelet images by semi-unsupervised learning." Computerized Medical Imaging and Graphics 89 (April 2021): 101895. http://dx.doi.org/10.1016/j.compmedimag.2021.101895.

7

Akdemir, Deniz, and Jean-Luc Jannink. "Ensemble learning with trees and rules: Supervised, semi-supervised, unsupervised." Intelligent Data Analysis 18, no. 5 (July 16, 2014): 857–72. http://dx.doi.org/10.3233/ida-140672.

8

Shutova, Ekaterina, Lin Sun, Elkin Darío Gutiérrez, Patricia Lichtenstein, and Srini Narayanan. "Multilingual Metaphor Processing: Experiments with Semi-Supervised and Unsupervised Learning." Computational Linguistics 43, no. 1 (April 2017): 71–123. http://dx.doi.org/10.1162/coli_a_00275.

Abstract:
Highly frequent in language and communication, metaphor represents a significant challenge for Natural Language Processing (NLP) applications. Computational work on metaphor has traditionally evolved around the use of hand-coded knowledge, making the systems hard to scale. Recent years have witnessed a rise in statistical approaches to metaphor processing. However, these approaches often require extensive human annotation effort and are predominantly evaluated within a limited domain. In contrast, we experiment with weakly supervised and unsupervised techniques—with little or no annotation—to generalize higher-level mechanisms of metaphor from distributional properties of concepts. We investigate different levels and types of supervision (learning from linguistic examples vs. learning from a given set of metaphorical mappings vs. learning without annotation) in flat and hierarchical, unconstrained and constrained clustering settings. Our aim is to identify the optimal type of supervision for a learning algorithm that discovers patterns of metaphorical association from text. In order to investigate the scalability and adaptability of our models, we applied them to data in three languages from different language groups—English, Spanish, and Russian—achieving state-of-the-art results with little supervision. Finally, we demonstrate that statistical methods can facilitate and scale up cross-linguistic research on metaphor.
9

Weinlichová, Jana, and Jiří Fejfar. "Usage of self-organizing neural networks in evaluation of consumer behaviour." Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis 58, no. 6 (2010): 625–32. http://dx.doi.org/10.11118/actaun201058060625.

Abstract:
This article deals with the evaluation of consumer data by artificial intelligence methods. The methodological part describes learning algorithms for Kohonen maps based on the principles of supervised, unsupervised and semi-supervised learning. The principles of supervised and unsupervised learning are compared, and on the basis of their constraints an advantage of semi-supervised learning is pointed out. Three algorithms for semi-supervised learning are described: label propagation, self-training and co-training. In particular, the use of co-training in Kohonen map learning seems to be a promising direction for further research. In the concrete application of a Kohonen neural network to consumer expenses, the unsupervised learning method, self-organization, has been chosen, so the features of the data are evaluated by the clustering method known as Kohonen maps. The input data represent consumer expenses of households in countries of the European Union and are characterised by a 12-dimensional vector according to commodity classification. The data are evaluated over several years, so we can see their distribution, similarity or dissimilarity, and also their evolution. In the article we discuss other uses of this method for this type of data and compare our results with results obtained by hierarchical cluster analysis.
10

Yamkovyi, Klym. "DEVELOPMENT AND COMPARATIVE ANALYSIS OF SEMI-SUPERVISED LEARNING ALGORITHMS ON A SMALL AMOUNT OF LABELED DATA." Bulletin of National Technical University "KhPI". Series: System Analysis, Control and Information Technologies, no. 1 (5) (July 12, 2021): 98–103. http://dx.doi.org/10.20998/2079-0023.2021.01.16.

Abstract:
The paper is dedicated to the development and comparative experimental analysis of semi-supervised learning approaches based on a mix of unsupervised and supervised approaches for classifying datasets with a small amount of labeled data, namely, identifying to which of a set of categories a new observation belongs using a training set of observations whose category membership is known. Semi-supervised learning is an approach to machine learning that combines a small amount of labeled data with a large amount of unlabeled data during training. Unlabeled data, when used in combination with a small quantity of labeled data, can produce a significant improvement in learning accuracy. The goal is to develop and analyze semi-supervised methods and to compare their accuracy and robustness on different synthetic datasets. The first proposed approach is based on the unsupervised K-medoids method, also known as the Partitioning Around Medoids algorithm; however, unlike K-medoids, the proposed algorithm first calculates medoids using only the labeled data and then processes the unlabeled samples by assigning the label of the nearest medoid. Another proposed approach is a mix of the supervised K-nearest-neighbor method and unsupervised K-means; this learning algorithm thus uses information about both the nearest points and the classes' centers of mass. The methods have been implemented in the Python programming language and experimentally investigated for solving classification problems on datasets with different distributions and spatial characteristics, generated using the scikit-learn library. The developed approaches were compared by their average accuracy on all these datasets. It was shown that even small amounts of labeled data make semi-supervised learning usable, and the proposed modifications improve accuracy and algorithm performance, as demonstrated in the experiments; as the amount of available label information increases, the accuracy of the algorithms grows. Thus, the developed algorithms use a distance metric that takes the available label information into account. Keywords: unsupervised learning, supervised learning, semi-supervised learning, clustering, distance, distance function, nearest neighbor, medoid, center of mass.
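The nearest-medoid idea described above can be sketched in a few lines. This is an illustrative reconstruction of the stated description, not the author's implementation; the function and variable names are hypothetical.

```python
# Semi-supervised nearest-medoid labeling: compute one medoid per class from the
# labeled points only, then give each unlabeled point the class of its nearest medoid.
import numpy as np

def class_medoid(points):
    # Medoid = the member point minimizing total distance to the other members.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return points[d.sum(axis=1).argmin()]

def nearest_medoid_labels(X_lab, y_lab, X_unlab):
    classes = np.unique(y_lab)
    medoids = np.stack([class_medoid(X_lab[y_lab == c]) for c in classes])
    dist = np.linalg.norm(X_unlab[:, None, :] - medoids[None, :, :], axis=-1)
    return classes[dist.argmin(axis=1)]
```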
11

Goernitz, N., M. Kloft, K. Rieck, and U. Brefeld. "Toward Supervised Anomaly Detection." Journal of Artificial Intelligence Research 46 (February 20, 2013): 235–62. http://dx.doi.org/10.1613/jair.3623.

Abstract:
Anomaly detection is being regarded as an unsupervised learning task as anomalies stem from adversarial or unlikely events with unknown distributions. However, the predictive performance of purely unsupervised anomaly detection often fails to match the required detection rates in many tasks, and there exists a need for labeled data to guide the model generation. Our first contribution shows that classical semi-supervised approaches, originating from a supervised classifier, are inappropriate and hardly detect new and unknown anomalies. We argue that semi-supervised anomaly detection needs to be grounded in the unsupervised learning paradigm, and we devise a novel algorithm that meets this requirement. Although the problem is intrinsically non-convex, we further show that it has a convex equivalent under relatively mild assumptions. Additionally, we propose an active learning strategy to automatically filter candidates for labeling. In an empirical study on network intrusion detection data, we observe that the proposed learning methodology requires much less labeled data than the state of the art, while achieving higher detection accuracies.
12

Wang, Yintong, Jiandong Wang, Haiyan Chen, and Bo Sun. "Semi-Supervised Local Fisher Discriminant Analysis Based on Reconstruction Probability Class." International Journal of Pattern Recognition and Artificial Intelligence 29, no. 02 (February 27, 2015): 1550007. http://dx.doi.org/10.1142/s021800141550007x.

Abstract:
Fisher discriminant analysis (FDA) is a classic supervised dimensionality reduction method in statistical pattern recognition. FDA can maximize the scatter between different classes while minimizing the scatter within each class. As FDA only utilizes the labeled data and ignores the unlabeled data in its analysis process, it cannot be used to solve unsupervised learning problems, and its performance is also very poor in dealing with semi-supervised learning problems in some cases. Recently, several semi-supervised learning methods have been proposed as extensions of FDA. Most of these methods solve the semi-supervised problem by using a tradeoff parameter that controls the ratio of the supervised and unsupervised components. In this paper, we propose a general semi-supervised dimensionality learning idea for partially labeled data, namely the reconstruction probability class of labeled and unlabeled data. Using this probability class to optimize the Fisher criterion function, we propose a novel Semi-Supervised Local Fisher Discriminant Analysis (S2LFDA) method. Experimental results on real-world datasets demonstrate its effectiveness compared to existing related methods.
13

Schmarje, Lars, Monty Santarossa, Simon-Martin Schroder, and Reinhard Koch. "A Survey on Semi-, Self- and Unsupervised Learning for Image Classification." IEEE Access 9 (2021): 82146–68. http://dx.doi.org/10.1109/access.2021.3084358.

14

dos Santos Ferreira, Alessandro, Daniel Matte Freitas, Gercina Gonçalves da Silva, Hemerson Pistori, and Marcelo Theophilo Folhes. "Unsupervised deep learning and semi-automatic data labeling in weed discrimination." Computers and Electronics in Agriculture 165 (October 2019): 104963. http://dx.doi.org/10.1016/j.compag.2019.104963.

15

Shen, Bin, and Olzhas Makhambetov. "Hierarchical Semi-Supervised Factorization for Learning the Semantics." Journal of Advanced Computational Intelligence and Intelligent Informatics 18, no. 3 (May 20, 2014): 366–74. http://dx.doi.org/10.20965/jaciii.2014.p0366.

Abstract:
Most semi-supervised learning methods are based on extending existing supervised or unsupervised techniques by incorporating additional information from unlabeled or labeled data. Unlabeled instances help in learning statistical models that fully describe the global property of our data, whereas labeled instances make learned knowledge more human-interpretable. In this paper we present a novel way of extending conventional non-negative matrix factorization (NMF) and probabilistic latent semantic analysis (pLSA) to semi-supervised versions by incorporating label information for learning semantics. The proposed algorithm consists of two steps, first acquiring prior bases representing some classes from labeled data and second utilizing them to guide the learning of final bases that are semantically interpretable.
16

Chang, Xiaobin, Yongxin Yang, Tao Xiang, and Timothy M. Hospedales. "Disjoint Label Space Transfer Learning with Common Factorised Space." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3288–95. http://dx.doi.org/10.1609/aaai.v33i01.33013288.

Abstract:
In this paper, a unified approach is presented to transfer learning that addresses several source and target domain label-space and annotation assumptions with a single model. It is particularly effective in handling a challenging case, where source and target label-spaces are disjoint, and outperforms alternatives in both unsupervised and semi-supervised settings. The key ingredient is a common representation termed Common Factorised Space. It is shared between source and target domains, and trained with an unsupervised factorisation loss and a graph-based loss. With a wide range of experiments, we demonstrate the flexibility, relevance and efficacy of our method, both in the challenging cases with disjoint label spaces, and in the more conventional cases such as unsupervised domain adaptation, where the source and target domains share the same label-sets.
17

Zhou, Rong Fu, Luan Yang, Li Hua Wang, and Quan Jiang Sun. "User Oriented Semi-Supervised Document Clustering." Applied Mechanics and Materials 644-650 (September 2014): 1523–26. http://dx.doi.org/10.4028/www.scientific.net/amm.644-650.1523.

Abstract:
In many text mining applications, documents need to be clustered according to the demands of users. However, traditional document clustering based on unsupervised learning is not able to meet this demand. In this paper, a new clustering approach that focuses on this problem is proposed. The main contributions include: (1) expressing user requirements as topics with multiple attributes; (2) annotating topic semantics with an ontology, calculating the dissimilarity between topic semantics, and building a dissimilarity matrix. Experiments show that the new approach is effective.
18

Stausberg, J., and D. Nasseh. "Evaluation of a Binary Semi-supervised Classification Technique for Probabilistic Record Linkage." Methods of Information in Medicine 55, no. 02 (2016): 136–43. http://dx.doi.org/10.3414/me14-01-0087.

Abstract:
Background: The process of merging data of different data sources is referred to as record linkage. A medical environment with increased preconditions on privacy protection demands the transformation of clear-text attributes like first name or date of birth into one-way encrypted pseudonyms. When performing an automated or privacy-preserving record linkage there might be the need of a binary classification deciding whether two records should be classified as the same entity. The classification is the final of the four main phases of the record linkage process: preprocessing, indexing, matching and classification. The choice of binary classification techniques in dependence of project specifications, in particular data quality, has not been extensively studied yet. Objectives: The aim of this work is the introduction and evaluation of an automatable semi-supervised binary classification system applied within the field of record linkage, capable of competing with or even surpassing advanced automated techniques of the domain of unsupervised classification. Methods: This work describes the rationale leading to the model and the final implementation of an automatable semi-supervised binary classification system, and the comparison of its classification performance to an advanced active learning approach out of the domain of unsupervised learning. The performance of both systems has been measured on a broad variety of artificial test sets (n = 400), based on real patient data, with distinct and unique characteristics. Results: While the classification performance for both methods, measured as F-measure, was relatively close on test sets with maximum defined data quality, 0.996 for semi-supervised classification and 0.993 for unsupervised classification, it incrementally diverged for test sets of worse data quality, dropping to 0.964 for semi-supervised classification and 0.803 for unsupervised classification. Conclusions: Aside from supplying a viable model for semi-supervised classification for automated probabilistic record linkage, the tests conducted on a large number of test sets suggest that semi-supervised techniques might generally be capable of outperforming unsupervised techniques, especially on data with lower levels of data quality.
19

Turowski, Krzysztof, Jithin K. Sreedharan, and Wojciech Szpankowski. "Temporal Ordered Clustering in Dynamic Networks: Unsupervised and Semi-Supervised Learning Algorithms." IEEE Transactions on Network Science and Engineering 8, no. 2 (April 1, 2021): 1426–42. http://dx.doi.org/10.1109/tnse.2021.3058376.

20

Shi, Chuqiao, Michael Cao, David Muller, and Yimo Han. "Rapid and Semi-Automated Analysis of 4D-STEM data via Unsupervised Learning." Microscopy and Microanalysis 27, S1 (July 30, 2021): 58–59. http://dx.doi.org/10.1017/s1431927621000805.

21

L. Lima, Alexsander, Stanley W. F. Rezende, Diogo S. Rabelo, Quintiliano S. S. Nomelini, José Waldemar Silva, Roberto M. Finzi Neto, Carlos A. Gallo, and José dos Reis V. Moura Jr. "Anomaly Detection Applied to ISHM for Thickness Reduction Analysis in Controlled Environments." International Journal of Advanced Engineering Research and Science 9, no. 12 (2022): 426–32. http://dx.doi.org/10.22161/ijaers.912.46.

Abstract:
In this work, three machine learning approaches were evaluated for detecting anomalies in impedance-based structural health monitoring (ISHM) of a specimen in a controlled environment. Supervised, unsupervised and semi-supervised algorithms were chosen and compared with regard to detecting anomalies in an aluminum beam with a failure induced by surface machining on one of its faces. After applying the algorithms, it was found that, of the three types of learning, the supervised and semi-supervised ones achieved the best accuracy in detecting anomalies, whereas the unsupervised model did not obtain good results for the conditions investigated. Thus, this comparison of techniques can be an important step toward implementing real anomaly-detection ISHM systems.
22

Ye, Han-Jia, Xin-Chun Li, and De-Chuan Zhan. "Task Cooperation for Semi-Supervised Few-Shot Learning." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 10682–90. http://dx.doi.org/10.1609/aaai.v35i12.17277.

Abstract:
Training a model with limited data is an essential task for machine learning and visual recognition. Few-shot learning approaches meta-learn a task-level inductive bias from SEEN class few-shot tasks, and the meta-model is expected to facilitate the few-shot learning with UNSEEN classes. Inspired by the idea that unlabeled data can be utilized to smooth the model space in traditional semi-supervised learning, we propose TAsk COoperation (TACO) which takes advantage of unsupervised tasks to smooth the meta-model space. Specifically, we couple the labeled support set in a few-shot task with easily-collected unlabeled instances, prediction agreement on which encodes the relationship between tasks. The learned smooth meta-model promotes the generalization ability on supervised UNSEEN few-shot tasks. The state-of-the-art few-shot classification results on MiniImageNet and TieredImageNet verify the superiority of TACO to leverage unlabeled data and task relationship in meta-learning.
23

Han, Chang Hee, Misuk Kim, and Jin Tae Kwak. "Semi-supervised learning for an improved diagnosis of COVID-19 in CT images." PLOS ONE 16, no. 4 (April 1, 2021): e0249450. http://dx.doi.org/10.1371/journal.pone.0249450.

Abstract:
Coronavirus disease 2019 (COVID-19) has spread all over the world. Although the real-time reverse-transcription polymerase chain reaction (RT-PCR) test has been used as a primary diagnostic tool for COVID-19, CT-based diagnostic tools have been suggested to improve diagnostic accuracy and reliability. Herein we propose a semi-supervised deep neural network for an improved detection of COVID-19. The proposed method utilizes CT images in a supervised and unsupervised manner to improve the accuracy and robustness of COVID-19 diagnosis. Both labeled and unlabeled CT images are employed. Labeled CT images are used for supervised learning. Unlabeled CT images are utilized for unsupervised learning in a way that the feature representations are invariant to perturbations in CT images. To systematically evaluate the proposed method, two COVID-19 CT datasets and three public CT datasets with no COVID-19 CT images are employed. In distinguishing COVID-19 from non-COVID-19 CT images, the proposed method achieves an overall accuracy of 99.83%, sensitivity of 0.9286, specificity of 0.9832, and positive predictive value (PPV) of 0.9192. The results are consistent between the COVID-19 challenge dataset and the public CT datasets. For discriminating between COVID-19 and common pneumonia CT images, the proposed method obtains 97.32% accuracy, 0.9971 sensitivity, 0.9598 specificity, and 0.9326 PPV. Moreover, the comparative experiments with respect to supervised learning and training strategies demonstrate that the proposed method is able to improve the diagnostic accuracy and robustness without exhaustive labeling. The proposed semi-supervised method, exploiting both supervised and unsupervised learning, facilitates an accurate and reliable diagnosis for COVID-19, leading to improved patient care and management.
24

AL-Shaeli, Intisar, Lsmail Sharhan Hburi, and Ammar A. Majeed. "Reconfigurable intelligent surface passive beamforming enhancement using unsupervised learning." International Journal of Electrical and Computer Engineering (IJECE) 13, no. 1 (February 1, 2023): 493. http://dx.doi.org/10.11591/ijece.v13i1.pp493-501.

Abstract:
Reconfigurable intelligent surfaces (RIS) are a wireless technology that has the potential to improve cellular communication systems significantly. This paper considers enhancing the RIS beamforming in a RIS-aided multiuser multi-input multi-output (MIMO) system to enhance user throughput in cellular networks. The study offers an unsupervised/deep neural network (U/DNN) that simultaneously optimizes the intelligent surface beamforming with less complexity to overcome the non-convex sum-rate problem difficulty. The numerical outcomes comparing the suggested approach to the near-optimal iterative semi-definite programming strategy indicate that the proposed method retains most performance (more than 95% of the optimal throughput value when the number of antennas is 4 and the RIS has 30 elements) while drastically reducing system computing complexity.
25

Soto, P. J., J. D. Bermudez, P. N. Happ, and R. Q. Feitosa. "A COMPARATIVE ANALYSIS OF UNSUPERVISED AND SEMI-SUPERVISED REPRESENTATION LEARNING FOR REMOTE SENSING IMAGE CATEGORIZATION." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-2/W7 (September 16, 2019): 167–73. http://dx.doi.org/10.5194/isprs-annals-iv-2-w7-167-2019.

Abstract:
This work aims at investigating unsupervised and semi-supervised representation learning methods based on generative adversarial networks for remote sensing scene classification. The work introduces a novel approach, which consists in a semi-supervised extension of a prior unsupervised method, known as MARTA-GAN. The proposed approach was compared experimentally with two baselines upon two public datasets, UC-MERCED and NWPU-RESISC45. The experiments assessed the performance of each approach under different amounts of labeled data. The impact of fine-tuning was also investigated. The proposed method delivered in our analysis the best overall accuracy under scarce labeled samples, both in terms of absolute value and in terms of variability across multiple runs.
26

van Engelen, Jesper E., and Holger H. Hoos. "A survey on semi-supervised learning." Machine Learning 109, no. 2 (November 15, 2019): 373–440. http://dx.doi.org/10.1007/s10994-019-05855-6.

Abstract:
Semi-supervised learning is the branch of machine learning concerned with using labelled as well as unlabelled data to perform certain learning tasks. Conceptually situated between supervised and unsupervised learning, it permits harnessing the large amounts of unlabelled data available in many use cases in combination with typically smaller sets of labelled data. In recent years, research in this area has followed the general trends observed in machine learning, with much attention directed at neural network-based models and generative learning. The literature on the topic has also expanded in volume and scope, now encompassing a broad spectrum of theory, algorithms and applications. However, no recent surveys exist to collect and organize this knowledge, impeding the ability of researchers and engineers alike to utilize it. Filling this void, we present an up-to-date overview of semi-supervised learning methods, covering earlier work as well as more recent advances. We focus primarily on semi-supervised classification, where the large majority of semi-supervised learning research takes place. Our survey aims to provide researchers and practitioners new to the field as well as more advanced readers with a solid understanding of the main approaches and algorithms developed over the past two decades, with an emphasis on the most prominent and currently relevant work. Furthermore, we propose a new taxonomy of semi-supervised classification algorithms, which sheds light on the different conceptual and methodological approaches for incorporating unlabelled data into the training process. Lastly, we show how the fundamental assumptions underlying most semi-supervised learning algorithms are closely connected to each other, and how they relate to the well-known semi-supervised clustering assumption.
27

Zhao, Guo Zhen, and Wan Li Zuo. "Semi-Supervised Word Sense Disambiguation via Context Weighting." Advanced Materials Research 1049-1050 (October 2014): 1327–38. http://dx.doi.org/10.4028/www.scientific.net/amr.1049-1050.1327.

Abstract:
Word sense disambiguation, as a central research topic in natural language processing, can promote the development of many applications such as information retrieval, speech synthesis, machine translation, summarization and question answering. Previous approaches can be grouped into three categories: supervised, unsupervised and knowledge-based. The accuracy of supervised methods is the highest, but they suffer from the knowledge acquisition bottleneck. Unsupervised methods can avoid the knowledge acquisition bottleneck, but their effect is not satisfactory. With the build-up of large-scale knowledge, the knowledge-based approach has attracted more and more attention. This paper introduces a new context weighting method and, based on it, proposes a novel semi-supervised approach for word sense disambiguation. The significant contribution of our method is that thesaurus and machine learning techniques are integrated in word sense disambiguation. Compared with the state of the art on the test data of the English all-words disambiguation task in Senseval-3, our method yields obvious improvements over existing methods in noun, adjective and verb disambiguation.
28

Kannan, K. Gokul, and T. R. Ganesh Babu. "Semi Supervised Generative Adversarial Network for Automated Glaucoma Diagnosis with Stacked Discriminator Models." Journal of Medical Imaging and Health Informatics 11, no. 5 (May 1, 2021): 1334–40. http://dx.doi.org/10.1166/jmihi.2021.3787.

Abstract:
A Generative Adversarial Network (GAN) is a neural network architecture widely used in many computer vision applications such as super-resolution image generation, art creation and image-to-image translation. A conventional GAN model consists of two sub-models: a generative model and a discriminative model. The former generates new samples based on an unsupervised learning task, and the latter classifies them as real or fake. Though GAN is most commonly used for training generative models, it can be used for developing a classifier model. The main objective is to extend the effectiveness of GAN into semi-supervised learning, i.e., for the classification of fundus images to diagnose glaucoma. The discriminator model in the conventional GAN is improved via transfer learning to predict n + 1 classes by training the model for both supervised classification (n classes) and unsupervised classification (fake or real). Both models share all feature extraction layers and differ in the output layers. Thus any update in one of the models will impact both models. Results show that the semi-supervised GAN performs better than a standalone Convolutional Neural Network (CNN) model.
29

Leordeanu, Marius, Mihai Cristian Pîrvu, Dragos Costea, Alina E. Marcu, Emil Slusanschi, and Rahul Sukthankar. "Semi-Supervised Learning for Multi-Task Scene Understanding by Neural Graph Consensus." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 3 (May 18, 2021): 1882–92. http://dx.doi.org/10.1609/aaai.v35i3.16283.

Abstract:
We address the challenging problem of semi-supervised learning in the context of multiple visual interpretations of the world by finding consensus in a graph of neural networks. Each graph node is a scene interpretation layer, while each edge is a deep net that transforms one layer at one node into another from a different node. During the supervised phase edge networks are trained independently. During the next unsupervised stage edge nets are trained on the pseudo-ground truth provided by consensus among multiple paths that reach the nets' start and end nodes. These paths act as ensemble teachers for any given edge and strong consensus is used for high-confidence supervisory signal. The unsupervised learning process is repeated over several generations, in which each edge becomes a "student" and also part of different ensemble "teachers" for training other students. By optimizing such consensus between different paths, the graph reaches consistency and robustness over multiple interpretations and generations, in the face of unknown labels. We give theoretical justifications of the proposed idea and validate it on a large dataset. We show how prediction of different representations such as depth, semantic segmentation, surface normals and pose from RGB input could be effectively learned through self-supervised consensus in our graph. We also compare to state-of-the-art methods for multi-task and semi-supervised learning and show superior performance.
30

Verstraete, David Benjamin, Enrique López Droguett, Viviana Meruane, Mohammad Modarres, and Andrés Ferrada. "Deep semi-supervised generative adversarial fault diagnostics of rolling element bearings." Structural Health Monitoring 19, no. 2 (May 23, 2019): 390–411. http://dx.doi.org/10.1177/1475921719850576.

Abstract:
With the availability of cheaper multisensor suites, one has access to massive and multidimensional datasets that can and should be used for fault diagnosis. However, from a time, resource, engineering, and computational perspective, it is often cost prohibitive to label all the data streaming into a database in the context of big machinery data, that is, massive multidimensional data. Therefore, this article proposes both a fully unsupervised and a semi-supervised deep learning enabled generative adversarial network-based methodology for fault diagnostics. Two public datasets of vibration data from rolling element bearings are used to evaluate the performance of the proposed methodology for fault diagnostics. The results indicate that the proposed methodology is a promising approach for both unsupervised and semi-supervised fault diagnostics.
31

Weinstein, Ben G., Sergio Marconi, Stephanie Bohlman, Alina Zare, and Ethan White. "Individual Tree-Crown Detection in RGB Imagery Using Semi-Supervised Deep Learning Neural Networks." Remote Sensing 11, no. 11 (June 1, 2019): 1309. http://dx.doi.org/10.3390/rs11111309.

Abstract:
Remote sensing can transform the speed, scale, and cost of biodiversity and forestry surveys. Data acquisition currently outpaces the ability to identify individual organisms in high resolution imagery. We outline an approach for identifying tree-crowns in RGB imagery while using a semi-supervised deep learning detection network. Individual crown delineation has been a long-standing challenge in remote sensing and available algorithms produce mixed results. We show that deep learning models can leverage existing Light Detection and Ranging (LIDAR)-based unsupervised delineation to generate trees that are used for training an initial RGB crown detection model. Despite limitations in the original unsupervised detection approach, this noisy training data may contain information from which the neural network can learn initial tree features. We then refine the initial model using a small number of higher-quality hand-annotated RGB images. We validate our proposed approach while using an open-canopy site in the National Ecological Observation Network. Our results show that a model using 434,551 self-generated trees with the addition of 2848 hand-annotated trees yields accurate predictions in natural landscapes. Using an intersection-over-union threshold of 0.5, the full model had an average tree crown recall of 0.69, with a precision of 0.61 for the visually-annotated data. The model had an average tree detection rate of 0.82 for the field collected stems. The addition of a small number of hand-annotated trees improved the performance over the initial self-supervised model. This semi-supervised deep learning approach demonstrates that remote sensing can overcome a lack of labeled training data by generating noisy data for initial training using unsupervised methods and retraining the resulting models with high quality labeled data.
32

Zhao, Mingbo, Yue Zhang, Zhao Zhang, Jiao Liu, and Weijian Kong. "ALG: Adaptive low-rank graph regularization for scalable semi-supervised and unsupervised learning." Neurocomputing 370 (December 2019): 16–27. http://dx.doi.org/10.1016/j.neucom.2019.08.036.

33

Morette, N., L. C. Castro Heredia, Thierry Ditchi, A. Rodrigo Mor, and Y. Oussar. "Partial discharges and noise classification under HVDC using unsupervised and semi-supervised learning." International Journal of Electrical Power & Energy Systems 121 (October 2020): 106129. http://dx.doi.org/10.1016/j.ijepes.2020.106129.

34

Tian, Song, Jian She Song, Shu Bing Tian, and Wei Gong. "Change Detection in SAR Images Based on Semi-Supervised Learning." Applied Mechanics and Materials 596 (July 2014): 484–89. http://dx.doi.org/10.4028/www.scientific.net/amm.596.484.

Abstract:
Support Vector Machine (SVM) is a supervised approach that needs large numbers of labeled samples. However, it is difficult to obtain such samples for change detection in SAR images, and the available labeled samples are very limited. This paper proposes a semi-supervised support vector machine (S3VM) for unsupervised SAR image change detection. A K-means clustering method is used to obtain an image threshold; automatically selected offsets are introduced to build a pseudo-training set and an unlabeled set; finally, the semi-supervised support vector machine separates the change and non-change classes based on their statistical characteristics. The experimental results show that, even without noise reduction and with a reduced number of samples, the proposed algorithm maintains good classification and generalization and a more stable detection accuracy.
35

Khang, Tran Dinh, Manh-Kien Tran, and Michael Fowler. "A Novel Semi-Supervised Fuzzy C-Means Clustering Algorithm Using Multiple Fuzzification Coefficients." Algorithms 14, no. 9 (August 29, 2021): 258. http://dx.doi.org/10.3390/a14090258.

Abstract:
Clustering is an unsupervised machine learning method with many practical applications that has gathered extensive research interest. It is a technique of dividing data elements into clusters such that elements in the same cluster are similar. Clustering belongs to the group of unsupervised machine learning techniques, meaning that there is no information about the labels of the elements. However, when knowledge of data points is known in advance, it will be beneficial to use a semi-supervised algorithm. Within many clustering techniques available, fuzzy C-means clustering (FCM) is a common one. To make the FCM algorithm a semi-supervised method, it was proposed in the literature to use an auxiliary matrix to adjust the membership grade of the elements to force them into certain clusters during the computation. In this study, instead of using the auxiliary matrix, we proposed to use multiple fuzzification coefficients to implement the semi-supervision component. After deriving the proposed semi-supervised fuzzy C-means clustering algorithm with multiple fuzzification coefficients (sSMC-FCM), we demonstrated the convergence of the algorithm and validated the efficiency of the method through a numerical example.
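For readers unfamiliar with the baseline being extended here, the following is a minimal sketch of standard fuzzy C-means with a single fixed fuzzification coefficient m; the cited sSMC-FCM instead uses multiple coefficients to inject supervision. This sketch is illustrative only, and the function name and arguments are assumptions.

```python
# Minimal sketch of standard fuzzy C-means with one fixed fuzzification coefficient m,
# the baseline that the paper above extends with multiple, per-element coefficients.
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), n_clusters))
    u /= u.sum(axis=1, keepdims=True)                  # random initial memberships
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]   # membership-weighted centers
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1) + eps
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1))
        ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
        u = 1.0 / ratio.sum(axis=2)
    return centers, u
```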
36

Zhang, Xu, Huan Zhang, Xinyue Zhang, Xinyue Zhang, Cheng Zhen, Tianguo Yuan, and Jiande Wu. "Improving Semi-Supervised Image Classification by Assigning Different Weights to Correctly and Incorrectly Classified Samples." Applied Sciences 12, no. 23 (November 22, 2022): 11915. http://dx.doi.org/10.3390/app122311915.

Abstract:
Semi-supervised deep learning, a model that aims to effectively use unlabeled data to help learn sample features from labeled data, is a recent hot topic. To effectively use unlabeled data, a new semi-supervised learning model based on a consistency strategy is proposed. In the supervised part with labeled samples, the image generation model first generates some artificial images to complement the limited number of labeled samples. Secondly, the sample label mapping, as the “benchmark”, is compared to the corresponding sample features in the network as an additional loss to complement the original supervisory loss, aiming to better correct the model parameters. Finally, the original supervised loss is changed so that the network parameters are determined by the characteristics of each correctly classified sample. In the unsupervised part, the actual unsupervised loss is altered so that the model does not “treat all samples equally” and can focus more on the characteristics of misclassified samples. A total of 40 labeled samples from the CIFAR-10 and SVHN datasets were used to train the semi-supervised model achieving accuracies of 93.25% and 96.83%, respectively, demonstrating the effectiveness of the proposed semi-supervised model.
37

Tang, Xinyu, Milad Nasr, Saeed Mahloujifar, Virat Shejwalkar, Liwei Song, Amir Houmansadr, and Prateek Mittal. "Machine Learning with Differentially Private Labels: Mechanisms and Frameworks." Proceedings on Privacy Enhancing Technologies 2022, no. 4 (October 2022): 332–50. http://dx.doi.org/10.56553/popets-2022-0112.

Abstract:
Label differential privacy is a relaxation of differential privacy for machine learning scenarios where the labels are the only sensitive information that needs to be protected in the training data. For example, imagine a survey from a participant in a university class about their vaccination status. Some attributes of the students are publicly available but their vaccination status is sensitive information and must remain private. Now if we want to train a model that predicts whether a student has received vaccination using only their public information, we can use label-DP. Recent works on label-DP use different ways of adding noise to the labels in order to obtain label-DP models. In this work, we present novel techniques for training models with label-DP guarantees by leveraging unsupervised learning and semi-supervised learning, enabling us to inject less noise while obtaining the same privacy, therefore achieving a better utility-privacy trade-off. We first introduce a framework that starts with an unsupervised classifier f0 and dataset D with noisy label set Y, reduces the noise in Y using f0, and then trains a new model f using the less noisy dataset. Our noise reduction strategy uses the model f0 to remove the noisy labels that are incorrect with high probability. Then we use semi-supervised learning to train a model using the remaining labels. We instantiate this framework with multiple ways of obtaining the noisy labels and also the base classifier. As an alternative way to reduce the noise, we explore the effect of using unsupervised learning: we only add noise to a majority voting step for associating the learned clusters with a cluster label (as opposed to adding noise to individual labels); the reduced sensitivity enables us to add less noise. Our experiments show that these techniques can significantly outperform the prior works on label-DP.
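A hedged sketch of the noisy majority-vote idea mentioned above: add Laplace noise to the per-class vote counts of a learned cluster before taking the argmax. It is illustrative only; the function name, noise scale and calibration are assumptions, not the authors' mechanism or privacy accounting.

```python
# Toy sketch: label a learned cluster by a majority vote over the labels of its
# members, with Laplace noise added to the vote counts before the argmax. The noise
# scale here is illustrative; the exact calibration depends on the privacy analysis.
import numpy as np

def noisy_cluster_label(labels_in_cluster, n_classes, epsilon, seed=None):
    rng = np.random.default_rng(seed)
    counts = np.bincount(labels_in_cluster, minlength=n_classes).astype(float)
    counts += rng.laplace(scale=1.0 / epsilon, size=n_classes)  # Laplace mechanism
    return int(np.argmax(counts))

# Example: a cluster whose members mostly carry label 2.
label = noisy_cluster_label(np.array([2, 2, 1, 2, 0, 2]), n_classes=3, epsilon=1.0)
```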
38

Singhania, Dipika, Rahul Rahaman, and Angela Yao. "Iterative Contrast-Classify for Semi-supervised Temporal Action Segmentation." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 2 (June 28, 2022): 2262–70. http://dx.doi.org/10.1609/aaai.v36i2.20124.

Abstract:
Temporal action segmentation classifies the action of each frame in (long) video sequences. Due to the high cost of frame-wise labeling, we propose the first semi-supervised method for temporal action segmentation. Our method hinges on unsupervised representation learning, which, for temporal action segmentation, poses unique challenges. Actions in untrimmed videos vary in length and have unknown labels and start/end times. Ordering of actions across videos may also vary. We propose a novel way to learn frame-wise representations from temporal convolutional networks (TCNs) by clustering input features with added time-proximity conditions and multi-resolution similarity. By merging representation learning with conventional supervised learning, we develop an "Iterative Contrast-Classify (ICC)'' semi-supervised learning scheme. With more labelled data, ICC progressively improves in performance; ICC semi-supervised learning, with 40% labelled videos, performs similarly to fully-supervised counterparts. Our ICC improves MoF by {+1.8, +5.6, +2.5}% on Breakfast, 50Salads, and GTEA respectively for 100% labelled videos.
39

Li, Shu, Wei Wang, Wen-Tao Li, and Pan Chen. "Multi-View Representation Learning with Manifold Smoothness." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 10 (May 18, 2021): 8447–54. http://dx.doi.org/10.1609/aaai.v35i10.17026.

Abstract:
Multi-view representation learning attempts to learn a representation from multiple views and most existing methods are unsupervised. However, representation learned only from unlabeled data may not be discriminative enough for further applications (e.g., clustering and classification). For this reason, semi-supervised methods which could use unlabeled data along with the labeled data for multi-view representation learning need to be developed. Manifold information plays an important role in semi-supervised learning, but it has not been considered for multi-view representation learning. In this paper, we introduce the manifold smoothness into multi-view representation learning and propose MvDGAT which learns the representation and the intrinsic manifold simultaneously with graph attention network. Experiments conducted on real-world datasets reveal that our MvDGAT can achieve better performance than state-of-the-art methods.
40

Taherkhani, Fariborz, Hadi Kazemi, and Nasser M. Nasrabadi. "Matrix Completion for Graph-Based Deep Semi-Supervised Learning." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 5058–65. http://dx.doi.org/10.1609/aaai.v33i01.33015058.

Abstract:
Convolutional Neural Networks (CNNs) have provided promising achievements for image classification problems. However, training a CNN model relies on a large number of labeled data. Considering the vast amount of unlabeled data available on the web, it is important to make use of these data in conjunction with a small set of labeled data to train a deep learning model. In this paper, we introduce a new iterative Graph-based Semi-Supervised Learning (GSSL) method to train a CNN-based classifier using a large amount of unlabeled data and a small amount of labeled data. In this method, we first construct a similarity graph in which the nodes represent the CNN features corresponding to data points (labeled and unlabeled) while the edges tend to connect the data points with the same class label. In this graph, the missing label of unsupervised nodes is predicted by using a matrix completion method based on rank minimization criterion. In the next step, we use the constructed graph to calculate triplet regularization loss which is added to the supervised loss obtained by initially labeled data to update the CNN network parameters.
41

Solatidehkordi, Zahra, and Imran Zualkernan. "Survey on Recent Trends in Medical Image Classification Using Semi-Supervised Learning." Applied Sciences 12, no. 23 (November 25, 2022): 12094. http://dx.doi.org/10.3390/app122312094.

Abstract:
Training machine learning and deep learning models for medical image classification is a challenging task due to a lack of large, high-quality labeled datasets. As the labeling of medical images requires considerable time and effort from medical experts, models need to be specifically designed to train on low amounts of labeled data. Therefore, an application of semi-supervised learning (SSL) methods provides one potential solution. SSL methods use a combination of a small number of labeled datasets with a much larger number of unlabeled datasets to achieve successful predictions by leveraging the information gained through unsupervised learning to improve the supervised model. This paper provides a comprehensive survey of the latest SSL methods proposed for medical image classification tasks.
42

Wang, Wei, Feiyu Chen, Yongxin Ge, Sheng Huang, Xiaohong Zhang, and Dan Yang. "Discriminative deep semi-nonnegative matrix factorization network with similarity maximization for unsupervised feature learning." Pattern Recognition Letters 149 (September 2021): 157–63. http://dx.doi.org/10.1016/j.patrec.2021.06.013.

43

Zhang, Nan, and Shifei Ding. "Unsupervised and semi-supervised extreme learning machine with wavelet kernel for high dimensional data." Memetic Computing 9, no. 2 (June 22, 2016): 129–39. http://dx.doi.org/10.1007/s12293-016-0198-x.

44

Wang, Zhiqiong, Luxuan Qu, Junchang Xin, Hongxu Yang, and Xiaosong Gao. "A unified distributed ELM framework with supervised, semi-supervised and unsupervised big data learning." Memetic Computing 11, no. 3 (July 13, 2018): 305–15. http://dx.doi.org/10.1007/s12293-018-0271-8.

45

Manian, Vidya, Estefanía Alfaro-Mejía, and Roger P. Tokars. "Hyperspectral Image Labeling and Classification Using an Ensemble Semi-Supervised Machine Learning Approach." Sensors 22, no. 4 (February 18, 2022): 1623. http://dx.doi.org/10.3390/s22041623.

Abstract:
Hyperspectral remote sensing has tremendous potential for monitoring land cover and water bodies from the rich spatial and spectral information contained in the images. It is a time and resource consuming task to obtain groundtruth data for these images by field sampling. A semi-supervised method for labeling and classification of hyperspectral images is presented. The unsupervised stage consists of image enhancement by feature extraction, followed by clustering for labeling and generating the groundtruth image. The supervised stage for classification consists of a preprocessing stage involving normalization, computation of principal components, and feature extraction. An ensemble of machine learning models takes the extracted features and groundtruth data from the unsupervised stage as input and a decision block then combines the output of the machines to label the image based on majority voting. The ensemble of machine learning methods includes support vector machines, gradient boosting, Gaussian classifier, and linear perceptron. Overall, the gradient boosting method gives the best performance for supervised classification of hyperspectral images. The presented ensemble method is useful for generating labeled data for hyperspectral images that do not have groundtruth information. It gives an overall accuracy of 93.74% for the Jasper hyperspectral image, 100% accuracy for the HSI2 Lake Erie images, and 99.92% for the classification of cyanobacteria or harmful algal blooms and surface scum. The method distinguishes well between blue green algae and surface scum. The full pipeline ensemble method for classifying Lake Erie images in a cloud server runs 24 times faster than a workstation.
APA, Harvard, Vancouver, ISO, and other styles
46

Kejriwal, Mayank. "Unsupervised DNF Blocking for Efficient Linking of Knowledge Graphs and Tables." Information 12, no. 3 (March 19, 2021): 134. http://dx.doi.org/10.3390/info12030134.

Full text
Abstract:
Entity Resolution (ER) is the problem of identifying co-referent entity pairs across datasets, including knowledge graphs (KGs). ER is an important prerequisite in many applied KG search and analytics pipelines, with a typical workflow comprising two steps. In the first ‘blocking’ step, entities are mapped to blocks. Blocking avoids comparing all possible pairs of entities, since in the second ‘similarity’ step only entities within the same block are paired and compared, allowing for significant computational savings with minimal loss of performance. Unfortunately, learning a blocking scheme in an unsupervised fashion is a non-trivial problem, and it has not been properly explored for the heterogeneous, semi-structured datasets that are prevalent in industrial and Web applications. This article presents an unsupervised algorithmic pipeline for learning Disjunctive Normal Form (DNF) blocking schemes on KGs, as well as on structurally heterogeneous tables that may not share a common schema. We evaluate the approach on six real-world dataset pairs and show that it is competitive with supervised and semi-supervised baselines.
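To make the blocking step concrete, here is a toy, hand-written blocking scheme (not the learned DNF schemes the paper is about); it only illustrates why blocking reduces the number of candidate comparisons.

```python
# Toy blocking example (hand-written; the paper learns DNF blocking schemes,
# this only shows how blocking cuts down the similarity step's workload).
from collections import defaultdict
from itertools import combinations

records = [
    {"id": 1, "name": "Jon Smith",  "city": "Boston"},
    {"id": 2, "name": "John Smith", "city": "Boston"},
    {"id": 3, "name": "Ann Lee",    "city": "Austin"},
    {"id": 4, "name": "Anne Lee",   "city": "Austin"},
]

def blocking_key(record):
    # Assumed toy scheme: first letter of the name plus the city.
    return record["name"][0].lower() + "|" + record["city"].lower()

blocks = defaultdict(list)
for r in records:
    blocks[blocking_key(r)].append(r)

# Only entities inside the same block are paired in the similarity step.
candidate_pairs = [
    (a["id"], b["id"])
    for block in blocks.values()
    for a, b in combinations(block, 2)
]
print(candidate_pairs)  # [(1, 2), (3, 4)] instead of all 6 possible pairs
```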
APA, Harvard, Vancouver, ISO, and other styles
47

Saghezchi, Firooz B., Georgios Mantas, Manuel A. Violas, A. Manuel de Oliveira Duarte, and Jonathan Rodriguez. "Machine Learning for DDoS Attack Detection in Industry 4.0 CPPSs." Electronics 11, no. 4 (February 16, 2022): 602. http://dx.doi.org/10.3390/electronics11040602.

Full text
Abstract:
The Fourth Industrial Revolution (Industry 4.0) has transformed factories into smart Cyber-Physical Production Systems (CPPSs), where man, product, and machine are fully interconnected across the whole supply chain. Although this digitalization brings enormous advantages through customized, transparent, and agile manufacturing, it introduces a significant number of new attack vectors (e.g., through vulnerable Internet-of-Things (IoT) nodes) that attackers can leverage to launch sophisticated Distributed Denial-of-Service (DDoS) attacks threatening the availability of the production line, business services, or even human lives. In this article, we adopt a Machine Learning (ML) approach to network anomaly detection and construct different data-driven models to detect DDoS attacks on Industry 4.0 CPPSs. Existing techniques use data that is either artificially synthesized or collected from Information Technology (IT) networks or small-scale lab testbeds. To address this limitation, we use network traffic data captured from a real-world semiconductor production factory. We extract 45 bidirectional network flow features and construct several labeled datasets for training and testing ML models. We investigate 11 different supervised, unsupervised, and semi-supervised algorithms and assess their performance through extensive simulations. The results show that, in terms of detection performance, supervised algorithms outperform both unsupervised and semi-supervised ones. In particular, the Decision Tree model attains an accuracy of 0.999 while confining the False Positive Rate to 0.001.
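A minimal sketch of the supervised side of such a flow-based detector, with synthetic data standing in for the 45 real flow features and with hypothetical tree settings, might look as follows.

```python
# Minimal sketch of flow-based DDoS detection with a decision tree
# (synthetic stand-in for the 45 bidirectional flow features; not the paper's data).
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary problem: class 1 = attack flow, class 0 = benign flow.
X, y = make_classification(n_samples=5000, n_features=45, n_informative=10,
                           weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

clf = DecisionTreeClassifier(max_depth=10, random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print("accuracy:", accuracy_score(y_test, y_pred))
print("false positive rate:", fp / (fp + tn))
```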
APA, Harvard, Vancouver, ISO, and other styles
48

Ali Humayun, Mohammad, Ibrahim Hameed, Syed Muslim Shah, Sohaib Hassan Khan, Irfan Zafar, Saad Bin Ahmed, and Junaid Shuja. "Regularized Urdu Speech Recognition with Semi-Supervised Deep Learning." Applied Sciences 9, no. 9 (May 13, 2019): 1956. http://dx.doi.org/10.3390/app9091956.

Full text
Abstract:
Automatic Speech Recognition (ASR) has achieved the best results for English with end-to-end, neural-network-based supervised models. These supervised models need huge amounts of labeled speech data for good generalization, which can be quite a challenge to obtain for low-resource languages like Urdu. Most models proposed for Urdu ASR are based on Hidden Markov Models (HMMs). This paper proposes an end-to-end neural network model for Urdu ASR, regularized with dropout, ensemble averaging, and Maxout units. Dropout and ensemble averaging are techniques that average over multiple neural network models, while Maxout units adapt their activation functions. Because labeled data is limited, Semi-Supervised Learning (SSL) techniques are also incorporated to improve model generalization. Speech features are transformed into a lower-dimensional manifold using an unsupervised dimensionality-reduction technique called Locally Linear Embedding (LLE). The transformed data, along with the higher-dimensional features, are used to train the neural networks. The proposed model also utilizes label-propagation-based self-training of the initially trained models and achieves a Word Error Rate (WER) 4% lower than the benchmark reported on the same Urdu corpus using HMMs. The decrease in WER after incorporating SSL is more significant with an increased validation data size.
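Both semi-supervised ingredients mentioned here, LLE and label propagation, are available in scikit-learn; the sketch below wires them together on toy data and only illustrates the general recipe, not the paper's ASR model.

```python
# Illustrative combination of LLE dimensionality reduction with label propagation
# (toy digits data; the paper applies these ideas to Urdu speech features).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.semi_supervised import LabelPropagation

X, y = load_digits(return_X_y=True)

# Embed the features into a lower-dimensional manifold.
X_low = LocallyLinearEmbedding(n_neighbors=10, n_components=8).fit_transform(X)

# Hide most labels: -1 marks an unlabeled example for LabelPropagation.
rng = np.random.default_rng(0)
y_semi = y.copy()
unlabeled = rng.random(len(y)) > 0.1        # keep roughly 10% of the labels
y_semi[unlabeled] = -1

model = LabelPropagation(kernel="knn", n_neighbors=10).fit(X_low, y_semi)
acc = (model.transduction_[unlabeled] == y[unlabeled]).mean()
print("accuracy on originally unlabeled samples:", acc)
```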
APA, Harvard, Vancouver, ISO, and other styles
49

Başkaya, Osman, and David Jurgens. "Semi-supervised Learning with Induced Word Senses for State of the Art Word Sense Disambiguation." Journal of Artificial Intelligence Research 55 (April 22, 2016): 1025–58. http://dx.doi.org/10.1613/jair.4917.

Full text
Abstract:
Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words. While unsupervised techniques have been proposed to overcome this data sparsity problem, such techniques have not outperformed supervised methods. In this paper, we propose a new approach to building semi-supervised WSD systems that combines a small amount of sense-annotated data with information from Word Sense Induction, a fully-unsupervised technique that automatically learns the different senses of a word based on how it is used. In three experiments, we show how sense induction models may be effectively combined to ultimately produce high-performance semi-supervised WSD systems that exceed the performance of state-of-the-art supervised WSD techniques trained on the same sense-annotated data. We anticipate that our results and released software will also benefit evaluation practices for sense induction systems and those working in low-resource languages by demonstrating how to quickly produce accurate WSD systems with minimal annotation effort.
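The general recipe of combining induced senses with a small annotated set can be sketched as follows, with synthetic context vectors standing in for real WSD data; this is an assumed illustration, not the authors' system.

```python
# Sketch of the general recipe: augment a small sense-annotated training set with
# sense clusters induced from all (mostly unannotated) contexts of a word.
# Synthetic context vectors stand in for real WSD data; not the authors' system.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Each row plays the role of a context vector for one occurrence of an ambiguous
# word; y is its gold sense, assumed known for only a handful of occurrences.
X, y = make_blobs(n_samples=600, centers=3, n_features=20, random_state=0)

# Unsupervised sense induction: cluster all contexts, annotated or not.
induced = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
X_aug = np.hstack([X, np.eye(3)[induced]])   # append one-hot induced-sense features

# Train the supervised WSD classifier on only a small annotated subset.
labeled = np.arange(30)
clf = LogisticRegression(max_iter=1000).fit(X_aug[labeled], y[labeled])
print("accuracy on remaining occurrences:", clf.score(X_aug[30:], y[30:]))
```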
APA, Harvard, Vancouver, ISO, and other styles
50

Stepišnik, Tomaž, and Dragi Kocev. "Semi-supervised oblique predictive clustering trees." PeerJ Computer Science 7 (May 3, 2021): e506. http://dx.doi.org/10.7717/peerj-cs.506.

Full text
Abstract:
Semi-supervised learning combines supervised and unsupervised learning approaches to learn predictive models from both labeled and unlabeled data. It is most appropriate for problems where labeled examples are difficult to obtain but unlabeled examples are readily available (e.g., drug repurposing). Semi-supervised predictive clustering trees (SSL-PCTs) are a prominent method for semi-supervised learning that achieves good performance on various predictive modeling tasks, including structured output prediction tasks. Their main issue, however, is that the learning time scales quadratically with the number of features. In contrast to axis-parallel trees, which only use individual features to split the data, oblique predictive clustering trees (SPYCTs) use linear combinations of features. This makes the splits more flexible and expressive and often leads to better predictive performance. With a carefully designed criterion function, efficient optimization techniques can be used to learn oblique splits. In this paper, we propose semi-supervised oblique predictive clustering trees (SSL-SPYCTs). We adjust the split learning to take unlabeled examples into account while remaining efficient. The main advantage over SSL-PCTs is that the proposed method scales linearly with the number of features. The experimental evaluation confirms the theoretical computational advantage and shows that SSL-SPYCTs often outperform SSL-PCTs and supervised PCTs in both single-tree and ensemble settings. We also show that SSL-SPYCTs are better at producing meaningful feature importance scores than supervised SPYCTs when the amount of labeled data is limited.
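A minimal sketch of what a single oblique split looks like is given below; the split direction is learned here by ridge regression purely for illustration, not by the SSL-SPYCT criterion or tree-induction procedure.

```python
# Minimal sketch of a single oblique split: the split direction is a learned
# linear combination of features (here via ridge regression, purely for
# illustration), rather than a threshold on one feature as in axis-parallel trees.
# This shows only the split shape, not the SSL-SPYCT criterion or tree induction.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Ridge

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)           # learn split weights and bias
scores = X @ ridge.coef_ + ridge.intercept_  # linear combination of ALL features

left, right = X[scores <= 0.5], X[scores > 0.5]
print(len(left), "examples go to the left child,", len(right), "to the right")
```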
APA, Harvard, Vancouver, ISO, and other styles